Advantages of Kafka Message Compression

  1. The producer request is much smaller when it sends data to Kafka.
  2. Smaller requests are faster to transfer over the network, which means lower latency and better throughput.
  3. Disk utilization on the brokers also improves, because Kafka stores the messages in compressed format. The same disk can therefore hold more messages.

Apache Kafka – Message Compression

Kafka producers write data to topics, and topics are made of partitions. A producer automatically knows which broker and partition to write to based on your message, and if a Kafka broker in your cluster fails, the producer automatically recovers from it. This resilience is a big part of why Kafka is so widely used today. In a diagram, we would have a producer on the left-hand side sending data into each of the partitions of our topic.
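To make the "based on your message" part concrete, here is a minimal sketch of how a key can be mapped to a partition. The function name `choose_partition` is hypothetical, and CRC32 is only a stand-in hash chosen because it ships with Python. Kafka's default Java partitioner hashes the key with murmur2, so the partition numbers below will not match a real cluster.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index (illustrative only)."""
    # CRC32 is a stand-in hash; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key) % num_partitions


# The same key always lands on the same partition.
for key in [b"user_1", b"user_2", b"user_1"]:
    print(key, "->", choose_partition(key, num_partitions=3))
```

The point of the sketch is the property, not the exact hash: as long as the number of partitions stays the same, messages with the same key go to the same partition.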

 

Another important producer setting is message compression. Before looking at it, let's first understand the anatomy of a Kafka message.

Kafka Message Anatomy

Kafka messages are created by the producer, and the first fundamental part of a message is the key. The key can be null, and its type is binary. Binary means raw 0s and 1s, but the key can start out as a string or a number, and we'll see how a string or a number is converted into binary....
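As a rough illustration of turning a string or a number into binary, the sketch below serializes both with the Python standard library. The helper names `serialize_string` and `serialize_long` are hypothetical. The encodings (UTF-8 for strings, 8-byte big-endian for long integers) are assumed here to mirror what Kafka's built-in string and long serializers produce.

```python
import struct


def serialize_string(value: str) -> bytes:
    # Strings become bytes through a character encoding (UTF-8 here).
    return value.encode("utf-8")


def serialize_long(value: int) -> bytes:
    # ">q" packs a signed 64-bit integer in big-endian byte order.
    return struct.pack(">q", value)


key_from_string = serialize_string("user_123")
key_from_number = serialize_long(42)

print(key_from_string)       # b'user_123'
print(len(key_from_number))  # 8
```

The consumer has to apply the matching deserializer, since the broker only ever sees the bytes.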

Apache Kafka Message Compression

Producers usually send data in a text-based format. Most of the time, for example, producers send JSON data, and JSON is text. In that case, it's important to apply compression on the producer. JSON is very text-heavy and large in size, so we should compress it....
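To see why text-heavy JSON is a good candidate, here is a small sketch that compresses a batch of made-up JSON events with gzip from the Python standard library. The event fields are invented for illustration, and this only approximates what a producer does when it compresses a batch; actual ratios depend on your data and on the codec.

```python
import gzip
import json

# A batch of hypothetical, repetitive JSON events.
messages = [
    {"user_id": i, "event": "page_view", "url": "/products/42", "ok": True}
    for i in range(1000)
]

raw = "\n".join(json.dumps(m) for m in messages).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw:   {len(raw)} bytes")
print(f"gzip:  {len(compressed)} bytes")
print(f"ratio: {len(raw) / len(compressed):.1f}x")
```

The repeated field names are what make the batch shrink so much, which is also why compressing a batch of messages tends to work better than compressing them one at a time.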

Disadvantages of Kafka Message Compression

Compression is not free: producers must spend some CPU cycles to compress the data. Similarly, consumers must spend some CPU cycles to decompress it....
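The CPU cost can be observed with a quick measurement. This is only a sketch using gzip and `time.process_time` from the standard library on a made-up payload. The numbers vary by machine, codec, and data, so treat them as a way to compare options rather than as a benchmark.

```python
import gzip
import time

# Hypothetical payload: one small JSON line repeated many times.
payload = b'{"event": "page_view", "url": "/products/42"}\n' * 20000

# CPU time spent compressing (the producer side).
start = time.process_time()
compressed = gzip.compress(payload)
compress_cpu = time.process_time() - start

# CPU time spent decompressing (the consumer side).
start = time.process_time()
restored = gzip.decompress(compressed)
decompress_cpu = time.process_time() - start

print(f"compress:   {compress_cpu:.4f} s of CPU")
print(f"decompress: {decompress_cpu:.4f} s of CPU")
```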

Which Compression Type You Should Choose?

As we have discussed above, Kafka offers four main compression types: gzip, snappy, lz4, and zstd. Snappy or lz4 is usually recommended because both offer a good balance between speed and compression ratio. Gzip, on the other hand, gives a very high compression ratio, but it's not very fast. So choose one and test it. It's super simple: you just change one setting and everything works. There's no single algorithm that works for everyone, so try them against the kind of data you have and see which one works best for you. Finally, it is highly recommended to always use compression in production, especially if you have high throughput....
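The "one setting" is the producer's `compression.type` property. Below is a minimal sketch of building a producer configuration for each codec so they can be tried in turn. The helper `build_producer_config` and the broker address are assumptions made for illustration. The resulting dictionary uses standard property names and could be handed to a client library of your choice, for example `confluent_kafka.Producer(config)`, which is not imported here.

```python
# Compression types accepted by the producer setting, plus "none".
SUPPORTED_TYPES = {"none", "gzip", "snappy", "lz4", "zstd"}


def build_producer_config(compression_type: str) -> dict:
    """Return a producer config dict with the chosen compression type."""
    if compression_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported compression type: {compression_type}")
    return {
        "bootstrap.servers": "localhost:9092",  # assumed broker address
        "compression.type": compression_type,   # the one setting to change
    }


# Build one config per codec to test against your own data.
for codec in ["snappy", "lz4", "gzip", "zstd"]:
    print(build_producer_config(codec))
```

Switching codecs only changes this one value, so comparing them on a realistic workload is cheap.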
