Apache Kafka 2.6 eases rebalancing pain, improves insight

Users waiting for the June-scheduled release of distributed event streaming platform Apache Kafka 2.6 finally have the means to update their system. The latest iteration is meant to improve aspects including quota management and rebalancing, and introduces Java 14 support.

Kafka is a LinkedIn-bred project that became open source in 2011 and graduated from the Apache Incubator in 2012. Since then it has been adopted by companies such as Cloudflare, Twitter, and Netflix, where it is mainly used for monitoring and analytics.

Administrators will be interested to learn that quota management via the admin client has been extended with a client quota API, which is meant to make the process less error-prone. Admins can now also trigger a rebalance from the consumer, and can see how many bytes the system is reading from and writing to disk, which should help them locate bottlenecks more quickly.

Focal points of the release include the Streams client library and the Connect framework. Kafka Streams was added to the project in 2016 to facilitate the writing of applications and microservices that store their input and output data in Kafka clusters. The current version, for example, tackles a problem in Streams task assignment, which did not take into account how long a task would need to catch up and so previously led to quite a bit of downtime after rebalancing.
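
The idea behind lag-aware assignment can be sketched in a few lines. The following is a deliberately simplified illustration, not the actual Streams assignor: the function name `pick_owner`, the per-instance lag map, and the threshold default are assumptions made for the example, and the real assignor also has to balance load across instances.

```python
from typing import Dict


def pick_owner(task_lags: Dict[str, int], acceptable_lag: int = 10_000) -> str:
    """Pick an instance for a stateful task based on how far behind each one is.

    task_lags maps a (hypothetical) instance name to the number of changelog
    records it still has to replay before it could serve the task.
    """
    # Instances that are close enough to caught up can take over at once.
    caught_up = {name: lag for name, lag in task_lags.items() if lag <= acceptable_lag}
    # If nobody is caught up, fall back to whichever instance is least behind.
    candidates = caught_up or task_lags
    return min(candidates, key=candidates.get)
```

Called with `{"a": 0, "b": 50_000}` the sketch keeps the task on `a`, since moving it to `b` would mean waiting for 50,000 records to be restored first.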

Version 2.6 has also come loaded with an emit-on-change processing option – so that updates that leave a record’s byte arrays unchanged can be dropped to ease traffic – and improvements for building resource-hungry exactly-once semantics applications.
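
To illustrate what emit-on-change means in practice, here is a minimal sketch in plain Python rather than the Streams API. The function name `emit_on_change` and the in-memory dictionary standing in for a state store are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, Tuple


def emit_on_change(updates: Iterable[Tuple[str, bytes]]) -> Iterator[Tuple[str, bytes]]:
    """Forward only those updates whose serialized value differs from the last one seen for the key."""
    last_seen: Dict[str, bytes] = {}
    for key, value in updates:
        if key in last_seen and last_seen[key] == value:
            # Byte-for-byte identical to the stored value: drop the no-op update.
            continue
        last_seen[key] = value
        yield key, value
```

Writing the same bytes to a key twice in a row results in a single downstream record, which is where the traffic saving comes from.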

To improve user insight into event streams, the Kafka team added new metrics to the Streams library. At client level, alive-stream-threads was the only new addition. Thread-level and task-level enhancements have been a bit more substantial: threads now report process-ratio, punctuate-ratio, commit-ratio, and poll-ratio, along with poll-records and process-records (each as avg and max), while tasks report active-process-ratio and standby-process-ratio.
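
As a rough illustration of how the thread-level ratio metrics can be read, the sketch below assumes each ratio is the fraction of a thread's time spent in the corresponding phase. The function name `thread_ratios` and the input format are hypothetical and not part of Kafka.

```python
from typing import Dict


def thread_ratios(time_ms: Dict[str, float]) -> Dict[str, float]:
    """Turn time spent per phase (in ms) into '<phase>-ratio' fractions of the total."""
    total = sum(time_ms.values())
    if total == 0:
        # No time recorded yet: report zero for every phase instead of dividing by zero.
        return {f"{phase}-ratio": 0.0 for phase in time_ms}
    return {f"{phase}-ratio": spent / total for phase, spent in time_ms.items()}
```

Under that assumption, a thread with a high poll-ratio and a low process-ratio is mostly waiting for input, whereas the reverse suggests that processing itself is the bottleneck.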

Given some rules in the connector configurations, the Kafka Connect platform will now create Kafka topics for source connectors that write records if the topics do not already exist. The new iteration also allows custom headers to be added to all Connect REST API responses, and gives sink connectors the ability to send records to the dead letter queue for error reporting.
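
A source connector configuration that opts into topic creation might look roughly like the sketch below, which builds the JSON body one would submit to the Connect REST API. The helper name `source_connector_config`, the file path, and the default values are placeholders chosen for the example. The two `topic.creation.default.*` properties are the rules in question, and the sketch assumes the worker has not disabled the feature via `topic.creation.enable`.

```python
import json
from typing import Any, Dict


def source_connector_config(name: str, topic: str,
                            partitions: int = 3,
                            replication_factor: int = 3) -> Dict[str, Any]:
    """Build a connector definition that lets Connect create the target topic if it is missing."""
    return {
        "name": name,
        "config": {
            # Example connector shipped with Kafka; file path and topic are placeholders.
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "file": "/tmp/input.txt",
            "topic": topic,
            # Defaults applied to any topic Connect has to create for this connector.
            "topic.creation.default.partitions": str(partitions),
            "topic.creation.default.replication.factor": str(replication_factor),
        },
    }


if __name__ == "__main__":
    print(json.dumps(source_connector_config("file-source", "demo-topic"), indent=2))
```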

Kafka 2.6 is the first version of the system able to work properly with Java 14, and uses TLS 1.3 by default when running on Java 11 or higher. Scala developers will find Scala 2.13 to be the new default, following a recommendation for production use.

There is not yet a scheduled release date for Kafka 3.0. However, work on items central to the next major release, such as the replacement of distributed coordination system ZooKeeper, has progressed in the lead-up to v2.6, and users are free to provide feedback on some upcoming features such as the implementation of the new Raft protocol.