Apache Kafka reaches 2.4 release with new partitioner in tow

The team behind the distributed streaming platform Apache Kafka has put the finishing touches on version 2.4, which brings features such as multiple consumer group management, an alternative partitioner, and support for optional tagged fields in its protocol.

Kafka was initially developed by social networking service LinkedIn and was open-sourced in January 2011. It joined the Apache Software Foundation’s incubator later that year and graduated in October 2012 as a fully-fledged Apache project.

In the current version, the Kafka team has added a so-called sticky partitioner. When a record specifies neither a partition nor a key, the regular partitioner spreads records across partitions in a round-robin fashion, which produces many small batches and can lead to higher latency. The new one instead sticks to a single partition until a batch is full and only then moves on to another.
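The difference between the two strategies can be illustrated with a toy model. The following is a minimal sketch and not Kafka's partitioner code: the class `PartitionerSketch` and its methods are hypothetical, the batch size is counted in records for simplicity, and the sticky variant advances to the next partition in order purely to keep the output deterministic, whereas Kafka's own choice of the next partition may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Kafka's partitioner implementation.
class PartitionerSketch {

    // Round-robin: every keyless record goes to the next partition in turn,
    // so each partition's batch fills up slowly.
    static List<Integer> roundRobin(int records, int partitions) {
        List<Integer> assignment = new ArrayList<>();
        for (int i = 0; i < records; i++) {
            assignment.add(i % partitions);
        }
        return assignment;
    }

    // Sticky: stay on one partition until batchSize records have been
    // assigned to it, then switch. Assumption: we simply advance by one
    // partition so that the result is deterministic.
    static List<Integer> sticky(int records, int partitions, int batchSize) {
        List<Integer> assignment = new ArrayList<>();
        int current = 0;
        int inBatch = 0;
        for (int i = 0; i < records; i++) {
            if (inBatch == batchSize) {
                current = (current + 1) % partitions;
                inBatch = 0;
            }
            assignment.add(current);
            inBatch++;
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Six keyless records, three partitions, batches of three records.
        System.out.println("round-robin: " + roundRobin(6, 3));
        System.out.println("sticky:      " + sticky(6, 3, 3));
    }
}
```

With six records, three partitions, and a batch size of three, the round-robin variant touches every partition twice, while the sticky variant fills one complete batch for partition 0 and then one for partition 1.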

Once a system has been upgraded to version 2.4, consumers can fetch data from the closest replica instead of the leader, which reduces cross-datacenter network costs. Speaking of consumers, their rebalance protocol now offers incremental cooperative rebalancing, which lets consumers retain partitions during a rebalance. Prior versions only knew an eager approach that revoked all assigned partitions before rebalancing and then reassigned them, which could take quite a while.
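On the consumer side, both features are switched on through configuration. The snippet below is a hedged sketch that only assembles the settings with plain `java.util.Properties` and does not create a Kafka client; the class name, bootstrap address, group name, and rack ID are placeholders. Fetching from a nearby replica also depends on the brokers being set up for it (a `broker.rack` value per broker and a rack-aware `replica.selector.class`), which is not shown here.

```java
import java.util.Properties;

// Hedged sketch: builds consumer settings only, no Kafka client is created.
class ConsumerConfigSketch {

    static Properties consumerProps(String rackId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("group.id", "example-group");           // hypothetical group name
        // Tells the brokers where this consumer runs so that a replica
        // in the same rack or datacenter can serve its fetches.
        props.setProperty("client.rack", rackId);
        // Opts in to incremental cooperative rebalancing.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        // "dc-east" is an illustrative rack ID, not a required value.
        consumerProps("dc-east").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Moving an existing consumer group from the eager to the cooperative protocol generally calls for a staged rollout rather than a single configuration flip, so the upgrade notes are worth consulting first.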

Kafka has also been fitted with functionality to describe, delete, and reset offsets for multiple consumer groups at a time. This, for example, allows querying several consumer groups without the need to start a new JVM for each one.

Additional new features include optional tagged fields in the Kafka serialisation format, support for dynamic application log levels in the Admin API, a metric to measure the number of tasks on a connector, and a new API for replica reassignment.

Meanwhile, Kafka Streams, a library for building applications that store input and output data in Kafka clusters, has received new classes to simplify the test interface, support for non-key joins on KTables, and a way of setting custom processor names with its DSL. A full list of changes can be found in the official release notes.