24_ Modifying the replication factor

After creating a topic, we can modify not only the number of partitions but also the replication factor (the number of replicas). There are many scenarios for doing so: perhaps the wrong replication factor was entered when the topic was created and needs to be corrected, or, after the topic has been running for a while, you want to raise the replication factor to improve fault tolerance and reliability.

The details of partition reassignment were covered in the preceding sections. Modifying the replication factor, the subject of this section, is likewise done through the kafka-reassign-partitions.sh script used for reassignment. Let's take a closer look at the project.json file used in the previous section's example:

{
    "version": 1,
    "partitions": [
        {
            "topic": "topic-throttle",
            "partition": 1,
            "replicas": [
                2,
                0
            ],
            "log_dirs": [
                "any",
                "any"
            ]
        },
        {
            "topic": "topic-throttle",
            "partition": 0,
            "replicas": [
                0,
                2
            ],
            "log_dirs": [
                "any",
                "any"
            ]
        },
        {
            "topic": "topic-throttle",
            "partition": 2,
            "replicas": [
                0,
                2
            ],
            "log_dirs": [
                "any",
                "any"
            ]
        }
    ]
}

Observe that each partition in the JSON has two replicas. We can add a replica ourselves; for example, the entry for partition 1 can be changed to the following:

{
    "topic": "topic-throttle",
    "partition": 1,
    "replicas": [
        2,
        1,
        0
    ],
    "log_dirs": [
        "any",
        "any",
        "any"
    ]
}
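This kind of edit can also be scripted instead of done by hand. Below is a minimal Python sketch, not part of the Kafka tooling: the helper name `add_replica` and the choice of which broker to append are illustrative assumptions. It adds one broker that is not yet a replica to each partition, together with a matching "any" entry in log_dirs:

```python
import json

def add_replica(plan, brokers):
    """Append one broker that is not yet a replica to every partition
    in a kafka-reassign-partitions.sh candidate plan, together with a
    matching "any" entry in log_dirs."""
    for p in plan["partitions"]:
        # Pick the first broker from the cluster that is missing here
        extra = next(b for b in brokers if b not in p["replicas"])
        p["replicas"].append(extra)
        p["log_dirs"].append("any")
    return plan

# The partition-1 entry from project.json in this example
plan = {
    "version": 1,
    "partitions": [
        {"topic": "topic-throttle", "partition": 1,
         "replicas": [2, 0], "log_dirs": ["any", "any"]},
    ],
}
print(json.dumps(add_replica(plan, brokers=[0, 1, 2]), indent=4))
```

The sketch produces [2,0,1] for partition 1 rather than the hand-edited [2,1,0]; only the first entry (the preferred leader) is order-sensitive, and it is left unchanged.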

We can likewise change the replica lists of the other partitions to [0,1,2], so that the replication factor of every partition grows from 2 to 3. Note that when adding a replica you must also add a matching "any" entry to log_dirs. Here log_dirs refers to Kafka's log directories and corresponds to the broker-side log.dir or log.dirs configuration; if you don't need to control which directory a replica lands in, simply set each entry to "any". We save the modified JSON as a new file, add.json. Before executing the kafka-reassign-partitions.sh script, the details of topic-throttle (replication factor 2) are as follows:

[root@node1 kafka_2.11-2.0.0]# bin/kafka-topics.sh --zookeeper localhost:2181/kafka --describe --topic topic-throttle
Topic:topic-throttle    PartitionCount:3    ReplicationFactor:2 Configs:
    Topic: topic-throttle   Partition: 0    Leader: 0   Replicas: 0,1   Isr: 0,1
    Topic: topic-throttle   Partition: 1    Leader: 1   Replicas: 1,2   Isr: 2,1
    Topic: topic-throttle   Partition: 2    Leader: 2   Replicas: 2,0   Isr: 2,0

Execute the kafka-reassign-partitions.sh script with the --execute option. The details are as follows:

[root@node1 kafka_2.11-2.0.0]# bin/kafka-reassign-partitions.sh --zookeeper localhost:2181/kafka --execute --reassignment-json-file add.json
Current partition replica assignment

{"version":1,"partitions":[{"topic":"topic-throttle","partition":2,"replicas":[2,0],"log_dirs":["any","any"]},{"topic":"topic-throttle","partition":1,"replicas":[1,2],"log_dirs":["any","any"]},{"topic":"topic-throttle","partition":0,"replicas":[0,1],"log_dirs":["any","any"]}]}

Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.

After execution, view the details of topic-throttle again:

[root@node1 kafka_2.11-2.0.0]# bin/kafka-topics.sh --zookeeper localhost:2181/kafka --describe --topic topic-throttle
Topic:topic-throttle    PartitionCount:3    ReplicationFactor:3 Configs:
    Topic: topic-throttle   Partition: 0    Leader: 0   Replicas: 0,1,2 Isr: 0,1,2
    Topic: topic-throttle   Partition: 1    Leader: 1   Replicas: 0,1,2 Isr: 2,1,0
    Topic: topic-throttle   Partition: 2    Leader: 2   Replicas: 0,1,2 Isr: 2,0,1

You can see that the replication factor has increased to 3.

Unlike the number of partitions, the number of replicas can also be reduced. This is easy to understand intuitively: the most direct way would be to shut down some brokers, but that approach is hardly proper. Instead, we can again reduce the replication factor of the partitions through the kafka-reassign-partitions.sh script. Modify the contents of the project.json file again, for example:

{"version":1,"partitions":[{"topic":"topic-throttle","partition":2,"replicas":[0],"log_dirs":["any"]},{"topic":"topic-throttle","partition":1,"replicas":[1],"log_dirs":["any"]},{"topic":"topic-throttle","partition":0,"replicas":[2],"log_dirs":["any"]}]}
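Shrinking the replica lists can be scripted in the same spirit. The sketch below is a naive assumption, not what the plan above does: it simply keeps the first `target` entries of each list (so each partition's current preferred leader survives), whereas the hand-written plan above deliberately spreads the single remaining replicas across brokers 0, 1, and 2 to balance load:

```python
import json

def shrink_replicas(plan, target):
    """Truncate every partition's replica list in a candidate plan to
    `target` entries, trimming log_dirs to match."""
    for p in plan["partitions"]:
        p["replicas"] = p["replicas"][:target]
        p["log_dirs"] = p["log_dirs"][:target]
    return plan

# One partition of the replication-factor-3 assignment from above
plan = {
    "version": 1,
    "partitions": [
        {"topic": "topic-throttle", "partition": 0,
         "replicas": [0, 1, 2], "log_dirs": ["any", "any", "any"]},
    ],
}
print(json.dumps(shrink_replicas(plan, target=1)))
```

With naive truncation every surviving replica would sit on the same few brokers, which is why balancing the remaining replicas by hand (or by program) is preferable.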

After executing the kafka-reassign-partitions.sh script again with --execute, the details of topic-throttle are as follows:

[root@node1 kafka_2.11-2.0.0]# bin/kafka-topics.sh --zookeeper localhost:2181/kafka --describe --topic topic-throttle
Topic:topic-throttle	PartitionCount:3	ReplicationFactor:1	Configs:
     Topic: topic-throttle	Partition: 0	Leader: 2	Replicas: 2	Isr: 2
     Topic: topic-throttle	Partition: 1	Leader: 1	Replicas: 1	Isr: 1
     Topic: topic-throttle	Partition: 2	Leader: 0	Replicas: 0	Isr: 0

You can see that the replication factor of topic-throttle has been reduced to 1.

Careful readers may have noticed that the candidate plans we fed to the kafka-reassign-partitions.sh script (--execute) were written by hand. When increasing the replication factor there were only three broker nodes in the example cluster, so each replica list merely had to be filled from 2 entries up to 3; and when reducing the replication factor to 1, the single remaining replica could simply be rotated across the brokers, so load imbalance was a minor concern. In real applications, however, you may face a cluster with dozens of broker nodes, and when changing the replication factor from 2 to 5, or from 4 to 3, how to assign the replicas reasonably becomes a key problem.

We can apply the partition replica assignment rules from section 17 to compute such a plan, but doing the calculation by hand rather than by program is genuinely cumbersome. The following shows how to compute an assignment plan programmatically (essentially by calling the corresponding method from section 17), as shown in code listing 24-1.

Code listing 24-1 Assignment plan calculation (Scala)
import kafka.admin.{AdminUtils, BrokerMetadata}

object ComputeReplicaDistribution {
  val partitions = 3
  val replicaFactor = 2

  def main(args: Array[String]): Unit = {
    // The three brokers of the example cluster, all on the same rack
    val brokerMetadatas = List(new BrokerMetadata(0, Option("rack1")),
      new BrokerMetadata(1, Option("rack1")),
      new BrokerMetadata(2, Option("rack1")))
    // Compute an assignment: 3 partitions, replication factor 2
    val replicaAssignment = AdminUtils.assignReplicasToBrokers(brokerMetadatas,
      partitions, replicaFactor)
    println(replicaAssignment)
  }
}

The code computes the assignment plan for a cluster of nodes [0, 1, 2] with 3 partitions, a replication factor of 2, and no distinct rack information. The program output is as follows:

Map(2 -> ArrayBuffer(0, 2), 1 -> ArrayBuffer(2, 1), 0 -> ArrayBuffer(1, 0))

Partition 2 is assigned [0,2], partition 1 is assigned [2,1], and partition 0 is assigned [1,0]. The corresponding candidate plan for changing the replication factor to 2 in this 3-node cluster is therefore:

{"version":1,"partitions":[{"topic":"topic-throttle","partition":2,"replicas":[0,2],"log_dirs":["any","any"]},{"topic":"topic-throttle","partition":1,"replicas":[2,1],"log_dirs":["any","any"]},{"topic":"topic-throttle","partition":0,"replicas":[1,0],"log_dirs":["any","any"]}]}
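For readers without the Kafka jars on the classpath, the rack-unaware branch of this calculation, plus the mechanical conversion of its result into a candidate plan, can be approximated in Python. This is a sketch, not AdminUtils itself: the real implementation randomizes the starting broker index and follower shift (which is why the Scala output above differs), while the sketch fixes both at 0 for reproducibility:

```python
import json

def assign_replicas(n_brokers, n_partitions, rf, start=0, shift=0):
    """Rack-unaware round-robin assignment: leaders rotate across
    brokers, and the follower shift grows each time the leader index
    wraps around, so replica sets vary between partitions."""
    assignment = {}
    for p in range(n_partitions):
        if p % n_brokers == 0:
            shift += 1
        first = (p + start) % n_brokers
        replicas = [first]
        for j in range(rf - 1):
            replicas.append(
                (first + 1 + (shift + j) % (n_brokers - 1)) % n_brokers)
        assignment[p] = replicas
    return assignment

def to_candidate_json(topic, assignment):
    """Turn a {partition: replicas} map into a candidate plan for
    kafka-reassign-partitions.sh (entry order does not matter)."""
    return json.dumps({
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": p, "replicas": r,
             "log_dirs": ["any"] * len(r)}
            for p, r in sorted(assignment.items())
        ],
    })

plan = assign_replicas(n_brokers=3, n_partitions=3, rf=2)
print(to_candidate_json("topic-throttle", plan))
```

Like the AdminUtils result, every partition gets two distinct replicas and the three leaders land on three different brokers; only the random offsets differ.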


Posted on Wed, 24 Nov 2021 20:22:05 -0500 by tisa