Kafka takes on under-replication in 2.3 release

AI/ML

By Julia Schmidt

June 26, 2019

Kafka takes on under-replication in 2.3 release

Distributed streaming platform Apache Kafka has hit the 2.3 mark, claiming to improve aspects such as resilience and failure handling along the way.

Prevention and debugging of replication issues has gotten easier thanks to the introduction of the topic partition category AtMinIsr. It can be used to warn operations of partitions running the risk of becoming unavailable, since it’s set up between UnderReplicated and UnderMinIsr. An additional option –under-min-isr in the describe topics command informs users which topic partitions may need addressing for being below the set minimum number of data copies.

The design of ReplicaFetcher has been improved, so that – should a partition crash – the thread concerned focuses on the partitions still up instead of tracking the crashed one. This behaviour often led to an abrupt halt to replication for all partitions the thread was responsible for which in turn led to under-replicated partitions.

To make denial of service attacks harder, there is now a limit to the total number of connections that can be active on a broker. The priority also isn’t on new connection requests any longer but on already existing connections to make SocketServer processors more fair.

A more compliance related feature has landed in the form of max.compaction.lag.ms, which can be used to set a maximum amount of time for which a log segment can stay uncompacted. This helps to make sure data is deleted in a timely manner, should specific regulations be in place.

The current release also saw changes to Kafka Connect and Kafka Streams. Connect, a framework which integrates Kafka with other systems, won’t pause worker threads during rebalancing tasks anymore. Additional context in log messages should help in understanding a connectors’ behaviour over time.

To help with things like range queries, Kafka now also includes an implementation of an in-memory window (as well as session) store for Streams. Other than that timestamps are now written into the state store and flatTransformValues() has been added. This allows processor-API-aware computations “that return multiple records for each input record without changing the message key and causing a repartition” as the announcement states.

A complete list of changes, including a new API in the AdminClient which makes incremental config changes easier, can be found in the project’s release notes.

Sourcegraph coding assistant now supports Anthropic Claude 3 – though limited to 7K token input

Supabase moves out of beta, adds supports for Swift, plugs in Oriole storage engine

Go dev survey shows frustration with Python’s dominance of AI

AI coding: Hugging Face engineer extols benefits of open source models, but hard questions remain

.NET Smart Components experiment the "Visual Basic" of AI programming?

GitHub autofix progresses to public beta: insecure code corrected with AI, but only for enterprise

JetBrains bows to user pressure and unbundles AI Assistant in new IntelliJ IDEA beta

Hands On: Netlify AI-assisted deployment aims to reduce log-diving

Stack Overflow turns to Google for hosting and AI features, trusts in Gemini for tech answers

Employing your cloud data warehouse to scale up AI/ML

Rust-based Zed editor now open source – with built-in support for OpenAI and GitHub Copilot

AI assistance is leading to lower code quality, claim researchers

Kafka takes on under-replication in 2.3 release

ABOUT US

FOLLOW US