Kafka and RabbitMQ

Tung Nguyen
Nov 11, 2023

“What is the difference between Kafka and a classic queue in RabbitMQ?” I asked myself this question on my first day working with Kafka and RabbitMQ. I guess many people have the same question, so I would like to share a brief overview of Kafka and RabbitMQ by mapping some of their concepts to things in real life. Before going into detail, let’s get to know the following entities.

JustForFun publishes magazines without a fixed schedule. It may publish only 1 magazine a day, but it may also publish more than 10 magazines a day at any time. The idea sounds weird, but it is what the CEO of JustForFun wants to do.

The Post Office receives magazines and delivers them to customers of JustForFun, including me.

The Magazine Store also gets magazines from JustForFun and keeps them on a shelf for buyers.

How does a classic queue in RabbitMQ work?

Basically, a classic queue in RabbitMQ distributes messages with a push mechanism: a publisher simply publishes messages to a queue or an exchange, and the RabbitMQ broker ensures that the messages are sent to registered consumers. We can map these concepts to the entities as below.

  • RabbitMQ is a Post Office
  • JustForFun is a publisher
  • I am a consumer

Because I love the magazines of JustForFun, I subscribed to new magazines from JustForFun with a contract that says whenever a new magazine is published, it will be sent to my home. As a matter of fact, JustForFun is good at publishing but not very good at logistics, and it mostly does not want to work directly with customers. This is why the Post Office is involved in the story, to make life easier. In the full picture, JustForFun just sends new magazines to a Post Office, and the Post Office makes sure that the magazines with my address are delivered to my mailbox. This is how a queue in RabbitMQ works in a nutshell.

Additionally, to confirm that I have received and read a magazine successfully, I have to submit an acknowledgement to the Post Office via an online form. The Post Office can then mark the delivery as done. Otherwise, it will eventually resend the magazine to me. This is similar to the acknowledgments basic.ack and basic.nack in RabbitMQ.
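The deliver-then-acknowledge loop can be sketched in plain Python. This is a toy Post Office, not the real RabbitMQ client API; the class and method names are mine:

```python
from collections import deque

class PostOffice:
    """Toy push-style broker: delivers messages and requeues any
    that the consumer rejects or never acknowledges."""

    def __init__(self):
        self.queue = deque()
        self.unacked = {}        # delivery_tag -> message
        self.next_tag = 0

    def publish(self, message):
        self.queue.append(message)

    def deliver(self):
        """Push one message to the consumer; it stays 'unacked'
        until the consumer confirms it (like basic.ack)."""
        message = self.queue.popleft()
        self.next_tag += 1
        self.unacked[self.next_tag] = message
        return self.next_tag, message

    def ack(self, delivery_tag):
        self.unacked.pop(delivery_tag)       # delivery is done

    def nack(self, delivery_tag):
        # like basic.nack with requeue: the broker will resend it
        self.queue.append(self.unacked.pop(delivery_tag))

broker = PostOffice()
broker.publish("magazine #1")
tag, msg = broker.deliver()
broker.nack(tag)                 # I did not manage to read it
tag, msg = broker.deliver()      # the Post Office resends the magazine
broker.ack(tag)                  # now the delivery is marked done
```

The key point is that the broker, not the consumer, tracks which deliveries are still outstanding.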

From my side, I can read at most 10 magazines a day, no more. If JustForFun publishes more than 10 magazines a day, I will be overloaded. To avoid this, I make a deal with the Post Office that it never leaves more than 10 unacknowledged magazines with me. This is how the prefetch count in RabbitMQ avoids overload at the consumer end.
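The same deal can be simulated by making the toy broker refuse to deliver while too many messages are unacknowledged (again a plain-Python sketch with my own names, not the real API; in the pika client this effect comes from a single call like channel.basic_qos(prefetch_count=10)):

```python
class PrefetchPostOffice:
    """Toy broker that never leaves more than `prefetch` magazines
    unacknowledged at the consumer (like a prefetch count)."""

    def __init__(self, prefetch):
        self.prefetch = prefetch
        self.queue = []
        self.unacked = set()
        self.next_tag = 0

    def publish(self, message):
        self.queue.append(message)

    def deliver(self):
        if not self.queue or len(self.unacked) >= self.prefetch:
            return None          # consumer is saturated: hold delivery
        self.next_tag += 1
        self.unacked.add(self.next_tag)
        return self.next_tag, self.queue.pop(0)

    def ack(self, delivery_tag):
        self.unacked.discard(delivery_tag)

office = PrefetchPostOffice(prefetch=10)
for i in range(12):
    office.publish(f"magazine #{i}")

delivered = []
while (delivery := office.deliver()) is not None:
    delivered.append(delivery)
# only 10 magazines arrive; the other 2 wait at the Post Office

office.ack(delivered[0][0])      # I confirm I read one magazine
more = office.deliver()          # now the 11th can be delivered
```

So backpressure is enforced by the broker: acknowledging one message frees a slot for the next.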

https://www.rabbitmq.com/tutorials/tutorial-two-go.html

In the winter, I am too lazy to fetch each magazine from the mailbox outside my house right after it is delivered, because the weather is cold and it is time-consuming. Ideally, I would pick up at least 5 magazines at a time, but I never know when the next one will arrive, so I give up on the idea. If I want to do the same thing with RabbitMQ, I can keep the messages in a buffer at the consumer end. It sounds good, but it is not always the right solution.
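A consumer-side buffer like that might look as follows (a minimal sketch, with invented names). It also shows the catch: if no more messages arrive, whatever is in the buffer just sits there:

```python
class BatchingConsumer:
    """Buffers deliveries and only processes them once `batch_size`
    messages have piled up -- like waiting for 5 magazines."""

    def __init__(self, batch_size, process):
        self.batch_size = batch_size
        self.process = process   # callback invoked per full batch
        self.buffer = []

    def on_message(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.process(self.buffer)
            self.buffer = []

processed = []
consumer = BatchingConsumer(5, processed.append)
for i in range(7):
    consumer.on_message(f"magazine #{i}")
# one full batch of 5 was processed; 2 magazines are stuck
# in the buffer, waiting for messages that may never come
```

In practice such a buffer usually also needs a flush timeout, and unacknowledged buffered messages are lost if the consumer crashes, which is why this is not always the right solution.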

How does Kafka work?

Unlike RabbitMQ, Kafka is based on a pull mechanism. In short, a publisher publishes messages to a Kafka topic, and consumers fetch messages from the topic in batches for processing. This means the broker does not push messages to consumers; it is up to each consumer to come and get them. We can map the concepts like this.

  • Kafka is a Magazine Store
  • JustForFun is a publisher
  • I am a consumer

Overall, JustForFun publishes magazines and sends them to a Magazine Store. Instead of subscribing to magazines from JustForFun, I go to the Magazine Store and grab magazines myself. Now the decision is mine, so I can decide how many magazines to take each time. Of course, I should not take more than 10 magazines a day. The number of magazines I take each time corresponds to configurations like fetch.min.bytes or max.poll.records in Kafka.
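The pull side can be sketched as an append-only log that consumers read from at their own pace (a toy in plain Python, not the real Kafka consumer API; the names are mine):

```python
class MagazineStore:
    """Toy pull-style log: consumers fetch batches themselves,
    starting from an offset they choose."""

    def __init__(self):
        self.log = []            # an append-only partition

    def append(self, message):
        self.log.append(message)

    def poll(self, offset, max_records):
        # the cap plays the role of something like max.poll.records
        return self.log[offset:offset + max_records]

store = MagazineStore()
for i in range(25):
    store.append(f"magazine #{i}")

offset = 0
batch = store.poll(offset, max_records=10)   # I grab at most 10 a day
offset += len(batch)                         # remember where I stopped
```

Note that fetching does not remove anything from the log: the store still has all 25 magazines on the shelf, and I only advance my own position.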

Because I get the magazines myself, I have to note the issue number of the last magazine I took so that I do not take it again next time. This is why a Kafka consumer commits its offset for each partition to a special topic, __consumer_offsets. Older versions of Kafka used ZooKeeper to store offsets.
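Taking note of the last issue can be sketched as a tiny offset store keyed by group, topic, and partition (a plain-Python toy standing in for __consumer_offsets; the function names are mine, not the real client API):

```python
# last committed position per (group, topic, partition)
committed = {}

def commit(group, topic, partition, offset):
    # Kafka convention: you commit the offset of the NEXT record to read
    committed[(group, topic, partition)] = offset

def resume_position(group, topic, partition):
    # after a restart, continue from the last commit, or from the
    # beginning if this group has never committed anything
    return committed.get((group, topic, partition), 0)

commit("tung", "magazines", 0, 18)   # the last issue I took was #17
# ...the consumer restarts...
next_offset = resume_position("tung", "magazines", 0)   # resumes at #18
```

Because offsets are tracked per group, another consumer group can read the same partition from the beginning without disturbing mine.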

Kafka: The Definitive Guide
Real-Time Data and Stream Processing at Scale

Suppose that I took a bunch of magazines from the store last week, but somehow I lost them and now I want to read them again. Luckily, I still have a chance to get them, because the store usually keeps magazines on its shelf for a month. Similarly, messages are stored durably in Kafka so that they can be replayed if needed. How long a message is kept in a topic can be configured by log.retention.ms, log.retention.minutes, or log.retention.hours. RabbitMQ streams have quite similar retention and offset characteristics.
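Time-based retention can be sketched like this (a toy log, not Kafka's actual segment-based cleanup; timestamps are passed in explicitly for clarity, and all names are mine):

```python
class RetentionLog:
    """Toy log with time-based retention: records older than the
    window are dropped, everything still on the 'shelf' can be
    re-read (replayed)."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.records = []        # (timestamp, message)

    def append(self, message, now):
        self.records.append((now, message))

    def expire(self, now):
        # like log.retention.ms: drop records older than the window
        self.records = [(t, m) for t, m in self.records
                        if now - t <= self.retention]

    def replay(self):
        return [m for _, m in self.records]

DAY = 24 * 3600
log = RetentionLog(retention_seconds=30 * DAY)   # keep ~a month
log.append("magazine #1", now=0)
log.append("magazine #2", now=40 * DAY)          # 40 days later
log.expire(now=40 * DAY)
# magazine #1 has aged off the shelf; #2 can still be replayed
```

The point is that expiry depends only on age, not on whether anyone has consumed the message, which is what makes replay possible in the first place.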

Yeah, that is pretty much the difference between Kafka and a classic queue in RabbitMQ. Now another question comes to my mind: “When should we use Kafka, and when RabbitMQ?” If you are interested in the topic, your input is more than welcome in the comments.

