Prometheus team ploughs on, releases 2.29 with promtool and performance enhancements


Though summer months seem to slow down progress on many open-source projects, the Prometheus team is keeping busy. Its latest achievements can be found in the just-released version 2.29 of its monitoring project.

Prometheus’s command line tool promtool now provides the option of using feature flags in its unit tests, and has learned to validate service discovery files when checking configurations. Users have also gained the ability to adjust the block duration used for backfilling data. Previously, promtool always created two-hour blocks, which isn’t the most efficient approach when backfilling data across long time periods.
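To get a feel for why fixed two-hour blocks add up quickly, the block count can be estimated with simple arithmetic. The following Python sketch is purely illustrative: the function name blocks_needed is hypothetical, and it ignores how Prometheus aligns block boundaries, so real counts may differ slightly.

```python
import math
from datetime import timedelta


def blocks_needed(backfill_range: timedelta, block_duration: timedelta) -> int:
    """Estimate how many fixed-duration blocks cover a backfill range.

    Hypothetical helper for illustration only; boundary alignment is ignored.
    """
    if block_duration <= timedelta(0):
        raise ValueError("block_duration must be positive")
    # Dividing two timedeltas yields a float ratio; round up to whole blocks.
    return math.ceil(backfill_range / block_duration)


thirty_days = timedelta(days=30)
print(blocks_needed(thirty_days, timedelta(hours=2)))  # two-hour blocks
print(blocks_needed(thirty_days, timedelta(days=1)))   # one-day blocks
```

Under these assumptions, backfilling 30 days takes 360 two-hour blocks but only 30 one-day blocks, which leaves far less for later compaction to merge.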

Another enhancement can be found in promtool’s tsdb analyze command, which is now able to plot “a distribution of how full chunks are relative to the maximum capacity of 120 samples per chunk” so that it’s easier to see where compaction steps might be useful.
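What such a distribution expresses can be sketched in a few lines of Python. This is not promtool's implementation or output format: the function name, the bucket count, and the sample data are assumptions made for illustration, and only the 120-sample maximum comes from the quoted description.

```python
from collections import Counter

MAX_SAMPLES_PER_CHUNK = 120  # maximum capacity named in the quoted description


def fullness_histogram(samples_per_chunk, buckets=10):
    """Count chunks per fullness bucket (index 0 = emptiest, last = fullest).

    Hypothetical helper; promtool's real bucketing may differ.
    """
    hist = Counter()
    for n in samples_per_chunk:
        ratio = min(n / MAX_SAMPLES_PER_CHUNK, 1.0)
        # A completely full chunk (ratio 1.0) is placed in the last bucket.
        idx = min(int(ratio * buckets), buckets - 1)
        hist[idx] += 1
    return [hist[i] for i in range(buckets)]


# Made-up sample counts for four chunks.
for i, count in enumerate(fullness_histogram([120, 120, 60, 6])):
    print(f"{i * 10:3d}-{(i + 1) * 10:3d}% full: {'#' * count}")
```

A histogram skewed towards the lower buckets would hint at many sparsely filled chunks, which is the kind of situation where compaction is most likely to pay off.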

The project’s user interface keeps evolving as well: users can now select time ranges by dragging the mouse, and sort and filter items on the flags page. The team also tweaked the alerts display to no longer render collapsed details, which has been shown to improve performance.

Performance was also amongst the goals behind a new method to garbage-collect old series, and changes in the way write-ahead logs are decoded and data is appended in the time series database (TSDB). TSDB’s --storage.tsdb.allow-overlapping-blocks and --storage.tsdb.retention.size flags have been promoted to stable since the last release. Operators can use them to allow overlapping blocks (which enables vertical compaction and query merges) and to set the number of bytes that can be stored for blocks, respectively.
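The idea behind a size-based retention limit can be sketched conceptually in Python. This is not Prometheus's code: the Block type, the function name, and the assumption that the oldest blocks are dropped first are all illustrative.

```python
from typing import NamedTuple


class Block(NamedTuple):
    """Hypothetical stand-in for a TSDB block."""
    min_time: int     # start of the block's time range
    size_bytes: int   # on-disk size of the block


def blocks_to_delete(blocks, max_bytes):
    """Return the oldest blocks to drop so the kept ones fit in max_bytes."""
    newest_first = sorted(blocks, key=lambda b: b.min_time, reverse=True)
    total = 0
    doomed = []
    for block in newest_first:
        total += block.size_bytes
        # Once the running total exceeds the limit, this block and every
        # older one are marked for deletion.
        if total > max_bytes:
            doomed.append(block)
    return doomed


blocks = [Block(0, 50), Block(1, 40), Block(2, 30)]
print(blocks_to_delete(blocks, max_bytes=80))
```

With the made-up sizes above and a limit of 80 bytes, only the oldest block is selected, because the two newest blocks together stay under the limit.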

To keep service discovery current, the release adds a discovery mechanism for the Kuma service mesh. The component also learned additional meta labels, so that it’s easier to find EC2 instances in a certain availability zone, discover Hetzner servers whose labels have keys but no values, and simplify monitoring of DNS registrations on Azure.

Version 2.29 now logs when compaction fails because the total symbol size is too high, and simply skips that step in such a case. The release also allows start and end to be used as label names in PromQL queries again. Other fixes improve timestamp handling in the OpenMetrics parser, and get rid of a race condition between head GC and pending readers, details for which are available in the release notes.

Prometheus is a monitoring system and time series database which is licensed under Apache 2.0. It was initially developed at music distribution platform SoundCloud, but has since made its way into the open source community and the Cloud Native Computing Foundation. It graduated from the CNCF process in 2018 and is currently the second most popular graduate project the foundation has to offer, with container orchestrator Kubernetes being the main attention-grabber.