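In real projects we may create a topic without setting an appropriate replication-factor. The Kafka cluster itself may be highly available, but that topic can still become unusable when a broker goes down. Once a topic is in use it cannot easily be deleted and recreated, so increasing the replication factor dynamically ends up being the practical choice.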
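Note: the server.properties shipped with Kafka 1.0 does not set default.replication.factor=x, so the broker falls back to its built-in default of 1. A topic created without an explicit --replication-factor (for example, an automatically created topic) therefore gets a replication factor of 1. We can set a sensible default in our own server.properties, for example default.replication.factor=3, to avoid adjusting it by hand later. For details see the official documentation: https://kafka.apache.org/documentation/#replication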
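Cause analysis:
Suppose we have 3 Kafka brokers: brokerA, brokerB, and brokerC.
- When the topic has 3 partitions and a replication-factor of 1, each broker typically holds one partition. If one broker goes down, the topic becomes unusable, because only two of its three partitions are still available.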
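- When the topic has 3 partitions and a replication-factor of 2, the partition replicas might be distributed like this:
brokerA: partition0, partition1
brokerB: partition1, partition2
brokerC: partition2, partition0
Each partition now has two replicas, so when one broker goes down the Kafka cluster can still assemble all three partitions of the topic. For example, when brokerA goes down, the three partitions can still be served by brokerB and brokerC together.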
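The reasoning above can be checked with a small simulation. This is only a sketch under simplifying assumptions: the placement dictionaries mirror the example layout in this article, the function names are made up for illustration, and a partition is treated as available as soon as any replica is on a live broker (real Kafka additionally requires an in-sync replica to take over as leader).

```python
# Hypothetical replica placement: partition id -> brokers holding a replica.
RF1 = {0: ["brokerA"], 1: ["brokerB"], 2: ["brokerC"]}
RF2 = {
    0: ["brokerA", "brokerC"],
    1: ["brokerA", "brokerB"],
    2: ["brokerB", "brokerC"],
}


def available_partitions(assignment, down):
    """Return the sorted ids of partitions that still have a live replica."""
    return sorted(
        p for p, brokers in assignment.items()
        if any(b not in down for b in brokers)
    )


def topic_usable(assignment, down):
    """The topic is fully usable only if every partition has a live replica."""
    return len(available_partitions(assignment, down)) == len(assignment)


if __name__ == "__main__":
    for name, assignment in (("RF=1", RF1), ("RF=2", RF2)):
        down = {"brokerA"}
        print(name, available_partitions(assignment, down),
              topic_usable(assignment, down))
```

With brokerA down, the replication-factor 1 layout loses partition 0, while the replication-factor 2 layout keeps all three partitions. Losing two brokers at once still breaks the replication-factor 2 layout, which is one reason to go to 3.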
How do we dynamically increase the replication-factor of an existing topic?
Many people will probably reach for the kafka-topics.sh script, so what actually happens?
[root@xszeree6p5z bin]# ./kafka-topics.sh --alter --topic yqtopic01 --zookeeper localhost:2181 --replication-factor 3
Option "[replication-factor]" can't be used with option"[alter]"
Option Description
------ -----------
--alter Alter the number of partitions, replica assignment, and/or configuration for the topic.
--config <String: name=value> A topic configuration override for the
As the output shows, kafka-topics.sh cannot be used to increase the replication-factor. The tool to use is kafka-reassign-partitions.sh, which is also in Kafka's bin directory.
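a, First we describe the desired replicas of the topic and save them as a JSON file.
For example, we want to set the replication factor of yqtopic01 to 3 (my Kafka cluster has 3 brokers, with ids 0, 1, and 2). The JSON file is named increase-replication-factor.json: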
{"version":1,
"partitions":[
{"topic":"yqtopic01","partition":0,"replicas":[0,1,2]},
{"topic":"yqtopic01","partition":1,"replicas":[0,1,2]},
{"topic":"yqtopic01","partition":2,"replicas":[0,1,2]}
]}
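For topics with many partitions, writing this file by hand gets tedious. Below is a minimal sketch that generates a plan in the same JSON format. The function name build_reassignment and the rotation of the replica list are my own additions, not part of Kafka or of the file above: the first broker in each replicas list is the preferred leader, so the sketch starts each partition at a different broker instead of listing [0,1,2] everywhere.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Build a plan in the format read by kafka-reassign-partitions.sh.

    Each partition's replica list starts at a different broker, so the
    first replica (the preferred leader) is spread across the brokers.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    partitions = []
    for p in range(num_partitions):
        replicas = [
            broker_ids[(p + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Same topic and broker ids as in the example above.
    plan = build_reassignment("yqtopic01", 3, [0, 1, 2], 3)
    print(json.dumps(plan, indent=2))
```

Redirect the printed JSON into increase-replication-factor.json to use it in the next step. If a partition's current leader should stay the preferred leader, keep its broker id first in that partition's list rather than relying on the rotation.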
b, Then run the script:
./kafka-reassign-partitions.sh --zookeeper 127.0.0.1:2181 --reassignment-json-file increase-replication-factor.json --execute
(Screenshot: output of running kafka-reassign-partitions.sh)
We can then check the topic's current replication factor by running:
kafka-topics.sh --describe --zookeeper localhost:2181 --topic yqtopic01
Summary
For everything covered here, the official documentation is the most authoritative source: https://kafka.apache.org/documentation/#basic_ops_increase_replication_factor
The relevant excerpt:
Increasing replication factor
Increasing the replication factor of an existing partition is easy. Just specify the extra replicas in the custom reassignment json file and use it with the --execute option to increase the replication factor of the specified partitions.
For instance, the following example increases the replication factor of partition 0 of topic foo from 1 to 3. Before increasing the replication factor, the partition’s only replica existed on broker 5. As part of increasing the replication factor, we will add more replicas on brokers 6 and 7.
The first step is to hand craft the custom reassignment plan in a json file:
> cat increase-replication-factor.json
{
"version":1,
"partitions":[{
"topic":"foo","partition":0,"replicas":[5,6,7]}]}
Then, use the json file with the --execute option to start the reassignment process:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
Current partition replica assignment
{
"version":1,
"partitions":[{
"topic":"foo","partition":0,"replicas":[5]}]}
Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions
{
"version":1,
"partitions":[{
"topic":"foo","partition":0,"replicas":[5,6,7]}]}
The --verify option can be used with the tool to check the status of the partition reassignment. Note that the same increase-replication-factor.json (used with the --execute option) should be used with the --verify option:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --verify
Status of partition reassignment:
Reassignment of partition [foo,0] completed successfully
You can also verify the increase in replication factor with the kafka-topics tool:
> bin/kafka-topics.sh --zookeeper localhost:2181 --topic foo --describe
Topic:foo PartitionCount:1 ReplicationFactor:3 Configs:
Topic: foo Partition: 0 Leader: 5 Replicas: 5,6,7 Isr: 5,6,7
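When there are many partitions, reading the describe output by eye is error-prone. The sketch below pulls the replication factor and the per-partition replica lists out of text shaped like the sample above. The function name parse_describe is hypothetical, and the regular expressions assume only the field labels visible in that sample (ReplicationFactor:, Partition:, Replicas:), so other Kafka versions may need adjustments.

```python
import re


def parse_describe(output):
    """Extract the replication factor and per-partition replica lists."""
    rf_match = re.search(r"ReplicationFactor:\s*(\d+)", output)
    replication_factor = int(rf_match.group(1)) if rf_match else None
    # "Partition:" does not match "PartitionCount:", so the summary line
    # is skipped and only the per-partition lines are picked up here.
    replicas = {
        int(partition): [int(b) for b in brokers.split(",")]
        for partition, brokers in re.findall(
            r"Partition:\s*(\d+).*?Replicas:\s*([\d,]+)", output)
    }
    return replication_factor, replicas


# Sample text copied from the describe output quoted above.
SAMPLE = (
    "Topic:foo PartitionCount:1 ReplicationFactor:3 Configs:\n"
    "Topic: foo Partition: 0 Leader: 5 Replicas: 5,6,7 Isr: 5,6,7\n"
)

if __name__ == "__main__":
    rf, replicas = parse_describe(SAMPLE)
    print(rf, replicas)
```

A quick check after a reassignment is to confirm that every replica list has the expected length, for example all(len(r) == 3 for r in replicas.values()).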