Understanding High-availability in Kafka

Harshit Sharma
5 min read · May 9, 2021

Kafka organizes message feeds into different topics. Each topic is divided into one or more partitions. Partitions are what give Kafka its concurrency and high availability.


When a topic is created, a replication factor is configured that specifies the number of replicas to be created for each partition in that topic. The log of every partition is copied to that many replicas. This mechanism gives Kafka its fault tolerance: when a server in the cluster fails, the affected partitions fail over automatically.

Partition leader and followers

Each partition in the Kafka cluster has one leader and one or more followers. The number of followers depends on the replication factor, which counts the leader partition as well.
All read and write operations for a partition are directed to the leader only. Follower partitions act like consumers of the leader: they replicate its log and maintain the same offsets and ordering as the leader partition.
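
You can see the leader and follower assignment for yourself with Kafka's Java AdminClient. The sketch below is only an illustration; the topic name orders and the localhost:9092 bootstrap address are placeholders for your own cluster.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

import java.util.Collections;
import java.util.Properties;

public class DescribeLeaders {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address; point it at your own cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "orders" is a hypothetical topic name used for illustration.
            TopicDescription description = admin.describeTopics(Collections.singletonList("orders"))
                    .all().get().get("orders");

            // Each partition reports its current leader, all of its replicas,
            // and the subset of replicas that are currently in sync (ISR).
            description.partitions().forEach(p ->
                    System.out.printf("partition=%d leader=%s replicas=%s isr=%s%n",
                            p.partition(), p.leader(), p.replicas(), p.isr()));
        }
    }
}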

Message commit in Kafka

Since a message may have to be written to more than one replica, when do we say that the message is committed? Is it committed as soon as the leader partition writes it, or only after all the replicas have finished writing it?

The answer is: it is configurable.

The producer’s acknowledgment setting defines when a message should be treated as committed. The acks value in the producer configuration can be set to the following values (a minimal producer sketch follows the list):

  • acks=0 The producer does not wait for an acknowledgment. There is no guarantee that the message was delivered to the Kafka broker at all.
  • acks=1 The producer gets an acknowledgment once the leader has received the message. There is no guarantee that it has been received by any of the followers.
  • acks=all The producer gets an acknowledgment only when all in-sync replicas have received the message. This setting provides the highest durability of the three.
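
Here is a minimal Java producer sketch with acks=all. It is an illustration only; the topic name orders, the key/value strings and the localhost:9092 address are placeholders.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class DurableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder bootstrap address; replace with your brokers.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // acks=all: the send is acknowledged only after every in-sync
        // replica has received the record. Use "1" or "0" to trade
        // durability for lower latency.
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // "orders" is a hypothetical topic used for illustration.
            producer.send(new ProducerRecord<>("orders", "order-42", "created"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            exception.printStackTrace();
                        } else {
                            System.out.printf("committed to partition %d at offset %d%n",
                                    metadata.partition(), metadata.offset());
                        }
                    });
        }
    }
}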

Replication factor and in-sync replicas

The replication factor determines the number of replicas created for a topic’s partitions. For example, if the replication factor is set to 3, then each partition of that topic has one leader and two follower partitions.

All the follower partitions fetch the data and offsets from the leader partition and try to stay up to date with it. There can still be some delay in replication, so some replicas may be more up to date with the leader than others.

If a failure happens while the replicas are not up to date, there is a chance of losing data. So how do we ensure that the replicas are in sync with the leader partitions? This comes down to the durability guarantee that any messaging system like Kafka offers. To make sure there is no data loss when the node hosting the leader partition fails, Kafka lets you commit a log entry only once it is also present on the replicas. This can increase latency, because Kafka waits for the replicas to write the message before committing it.

Kafka also provides a way to require that only some of the replicas confirm receipt of the data before the leader partition commits it. This is done with the replication factor and the min.insync.replicas setting when defining the topic.

Let’s take an example.

Suppose a topic has a replication factor of 3 and min.insync.replicas set to 2. This means that at least one more replica, along with the leader partition, must write the log before a record is committed. So, along with other parameters, the configuration will contain the following:

# Total number of replicas
default.replication.factor=3
# Minimum in-sync replicas
min.insync.replicas=2

This setting takes effect only if the producer uses acks=all.
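
The same settings can also be applied per topic at creation time. Below is a minimal AdminClient sketch that creates a hypothetical orders topic with a replication factor of 3 and min.insync.replicas=2; the names and the localhost:9092 address are placeholders.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class CreateDurableTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address; replace with your brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "orders" is a hypothetical topic: 3 partitions, 3 replicas each.
            NewTopic topic = new NewTopic("orders", 3, (short) 3)
                    // At least 2 replicas (leader included) must have the
                    // record before a write with acks=all is committed.
                    .configs(Map.of("min.insync.replicas", "2"));

            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}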

Availability and consistency trade-offs

What happens when all the broker nodes in the Kafka cluster die? In this case, there is nothing to do but wait for the replicas to come back up; at least one replica must be up before operations can resume. But what if a replica that was not in sync comes up first? Should operations resume, or should Kafka wait for an in-sync replica to come back?

There is an availability vs consistency trade-off here. By default Kafka prefers consistency and waits for an in-sync replica to come back before resuming operations, but you can override this behaviour:

# Allow replicas not in the ISR set to be elected as leader.
# This is a change from the default behaviour.
unclean.leader.election.enable=true
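
If you prefer availability for a particular topic, one way to change this switch at the topic level is through the AdminClient’s incrementalAlterConfigs API. The sketch below is an illustration only; the topic name orders and the localhost:9092 address are placeholders.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class PreferAvailability {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address; replace with your brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "orders" is a hypothetical topic used for illustration.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");

            // Allow an out-of-sync replica to become leader so the partition
            // comes back sooner, at the risk of losing unreplicated records.
            AlterConfigOp op = new AlterConfigOp(
                    new ConfigEntry("unclean.leader.election.enable", "true"),
                    AlterConfigOp.OpType.SET);

            Map<ConfigResource, Collection<AlterConfigOp>> change =
                    Map.of(topic, Collections.singletonList(op));
            admin.incrementalAlterConfigs(change).all().get();
        }
    }
}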

Electing leaders through the Controller

In the case of failures, the leader election process is critical to how long partitions stay unavailable. The election might take some time if many partitions need a new leader, and a single topic in Kafka may contain thousands of partitions, which can widen the window of unavailability even further.

Kafka has a provision for a controller node: one broker is chosen as the controller and is responsible for changing the leader for all affected partitions on the affected brokers. This makes the election process cheaper and faster.
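
If you are curious which broker currently holds the controller role, the AdminClient can tell you. A minimal sketch, assuming a placeholder localhost:9092 bootstrap address:

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

import java.util.Properties;

public class WhoIsController {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap address; replace with your brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // The broker currently acting as the cluster's controller.
            Node controller = admin.describeCluster().controller().get();
            System.out.printf("controller: id=%d host=%s:%d%n",
                    controller.id(), controller.host(), controller.port());
        }
    }
}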

Conclusion

  • Kafka allows you to configure the cluster for high availability. You can choose the number of replicas for each of your topics.
  • It also lets you specify how many replicas must be in sync with the leader partition. You may choose to require only a few in-sync replicas rather than all of them; it is a speed vs durability trade-off.
  • A Kafka producer can request acknowledgments such that a message is committed only after all in-sync replicas have acknowledged receipt.
  • When all the broker nodes in the Kafka cluster go down, Kafka by default resumes operations only when a replica that was in sync with the leader partition comes back up. So, by default, Kafka prefers consistency and durability. This behaviour can be altered through the topic configuration if availability is preferred over consistency; in that case, Kafka resumes operations as soon as any replica comes back up.
  • Kafka designates one of the broker nodes as the controller, which is responsible for electing partition leaders in case of failures. This reduces the unavailability window when failures happen.
