Understanding Message Delivery Semantics in Kafka

Figure: Kafka components

Kafka provides three message delivery semantics:
  1. At most once
  2. At least once
  3. Exactly once

Message delivery from Producer to Broker

Message delivery from the producer to the broker can be configured with any of the three semantics.

At most once (Producer to Broker)

The producer sends a message without waiting for an acknowledgement from the broker and does not retry on failure. If the write fails, the message is lost.

At least once (Producer to Broker)

The producer waits for an acknowledgement from the broker and retries if none arrives. If the acknowledgement was lost but the write actually succeeded, the retry writes the same message again and creates a duplicate.
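As a rough sketch of how these two behaviours map onto the Java producer configuration: acks and retries are standard producer settings, while the broker address, topic, and payload below are placeholder values.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerSemanticsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // At most once: fire and forget -- no acknowledgement, no retries, so a failed write is lost.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "false"); // acks=0 is incompatible with idempotence
        props.put(ProducerConfig.ACKS_CONFIG, "0");
        props.put(ProducerConfig.RETRIES_CONFIG, "0");

        // At least once: wait for the broker's acknowledgement and retry on failure.
        // A retry after a lost acknowledgement can write the same message twice.
        // props.put(ProducerConfig.ACKS_CONFIG, "all");
        // props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("payments", "order-42", "processed")); // hypothetical topic and payload
        }
    }
}
```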
Exactly once (Producer to Broker)

Exactly-once delivery between the producer and the broker relies on the idempotent producer (see the sketch after this list):
  1. The broker assigns each producer an ID.
  2. The producer sends a sequence number along with every message.
  3. The broker deduplicates messages using the producer ID and the sequence number.
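A minimal sketch of an idempotent producer with the Java client, assuming the same placeholder broker address and topic as above; setting enable.idempotence is what turns on the producer-ID and sequence-number mechanism.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IdempotentProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // The broker remembers the producer ID and the last sequence numbers it has written,
        // so a retried batch with an already-seen sequence number is dropped instead of duplicated.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // idempotence requires acks=all

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("payments", "order-42", "processed")); // hypothetical topic and payload
        }
    }
}
```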

Message consumption by Consumer from Broker

Kafka uses offsets to identify which messages a consumer has read, and the consumer controls the position of the offset in the log. Whether the offset is committed before or after a message is processed determines whether the semantics are at most once or at least once. When a consumer crashes and another consumer process takes over, messages can either be lost or be processed multiple times, depending on which arrangement is used.

At most once (Broker to Consumer)

The consumer commits the offset as soon as it reads a message and processes it afterwards. If it crashes after the commit but before processing, the message is never processed.

At least once (Broker to Consumer)

The consumer processes the message first and commits the offset afterwards. If it crashes after processing but before the commit, the consumer that takes over re-reads and re-processes the same message.
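The difference comes down to where the offset commit sits relative to the processing step. Below is a minimal sketch with the Java consumer and auto-commit disabled so the commit point is explicit; the broker address, group id, topic, and process() helper are placeholders.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerSemanticsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "payments-processor");      // hypothetical group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit manually to control the semantics

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("payments")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));

                // At most once: commit first, then process.
                // A crash after the commit but before processing loses these records.
                // consumer.commitSync();

                for (ConsumerRecord<String, String> record : records) {
                    process(record); // hypothetical business logic
                }

                // At least once: process first, then commit.
                // A crash after processing but before the commit reprocesses these records.
                consumer.commitSync();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```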
Exactly once (Broker to Consumer)

Two common situations call for exactly-once, transactional processing (see the sketch after this list):
  1. Kafka producers can publish to multiple topic partitions.
  2. Data may need to be transferred between different topics while it is being processed (event streaming).
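For the consume-transform-produce (event streaming) case, Kafka's transactional API lets the output records and the consumed offsets be committed atomically. This is only a sketch under assumed names: the input topic orders, the output topic orders-enriched, the group id, and the transactional.id are placeholders; read_committed keeps downstream consumers from seeing records of aborted transactions.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ExactlyOnceTransformSketch {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        cProps.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-transformer");      // hypothetical group id
        cProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        cProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");   // only see committed transactions

        Properties pProps = new Properties();
        pProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-transformer-1"); // hypothetical transactional id

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            consumer.subscribe(List.of("orders")); // hypothetical input topic
            producer.initTransactions();

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                if (records.isEmpty()) continue;

                producer.beginTransaction();
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                for (ConsumerRecord<String, String> record : records) {
                    // Transform and write to the output topic inside the transaction.
                    producer.send(new ProducerRecord<>("orders-enriched", record.key(), record.value().toUpperCase()));
                    offsets.put(new TopicPartition(record.topic(), record.partition()),
                                new OffsetAndMetadata(record.offset() + 1));
                }
                // Commit the consumed offsets as part of the same transaction,
                // so the output records and the offsets become visible atomically.
                producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                producer.commitTransaction();
            }
        }
    }
}
```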

Summary

  • To define the message delivery semantics in Kafka, three cases have to be considered: a) message delivery from the producer to the Kafka broker, b) message transfer between different topics within the Kafka cluster, and c) consumption of the messages by the consumers.
  • A producer can write to multiple topic partitions simultaneously.
  • Kafka supports transactional producers and consumers that provide exactly-once semantics for cases such as a producer writing to multiple topic partitions or data being transferred between topics.
  • Using producer IDs and per-message sequence numbers, exactly-once semantics can be achieved between the producer and the broker.
  • Coordination between the consumers and external systems is required to achieve exactly-once semantics when the events are processed by those external systems.
  • The consumer offset provides a way to establish this coordination between the consumer and the external system.
  • Storing the output and the offset together is a good way to deduplicate (see the sketch after this list).
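To illustrate that last bullet, the sketch below writes each result and the next offset to read in the same database transaction, and seeks back to the stored offset on restart; the JDBC URL, table names, and Postgres-style upsert statements are hypothetical, as are the topic and partition.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ExternalSinkSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // offsets live in the database, not in Kafka

        TopicPartition partition = new TopicPartition("payments", 0); // hypothetical topic and partition

        try (Connection db = DriverManager.getConnection("jdbc:postgresql://localhost/sink", "app", "secret"); // hypothetical database
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            db.setAutoCommit(false);
            consumer.assign(List.of(partition));

            // Resume from the offset that was stored alongside the last output.
            long next = 0L;
            try (PreparedStatement ps = db.prepareStatement("SELECT next_offset FROM sink_offsets WHERE topic_partition = ?")) {
                ps.setString(1, partition.toString());
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) next = rs.getLong(1);
                }
            }
            consumer.seek(partition, next);

            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    // Write the result and the offset in one database transaction,
                    // so a replayed record overwrites the same row instead of duplicating it.
                    try (PreparedStatement out = db.prepareStatement(
                             "INSERT INTO results(k, v) VALUES (?, ?) ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v");
                         PreparedStatement off = db.prepareStatement(
                             "INSERT INTO sink_offsets(topic_partition, next_offset) VALUES (?, ?) "
                             + "ON CONFLICT (topic_partition) DO UPDATE SET next_offset = EXCLUDED.next_offset")) {
                        out.setString(1, record.key());
                        out.setString(2, record.value());
                        out.executeUpdate();
                        off.setString(1, partition.toString());
                        off.setLong(2, record.offset() + 1);
                        off.executeUpdate();
                    }
                    db.commit();
                }
            }
        }
    }
}
```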
