Prometheus fires up performance in 2.15 release

Version 2.15 of the CNCF-graduated monitoring tool Prometheus is now available, bringing users a new endpoint for exposing per-metric metadata and a couple of optimisations promising better performance.

Most enhancements in v2.15 concern the project’s time series database layer (TSDB). It now includes the size of the write-ahead log (WAL) in size-based retention calculations and decodes WAL records in a separate routine to reduce replay latency. The release notes also promise a lower memory footprint, both for loaded TSDB blocks and during compaction.
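
To see why counting the WAL matters, consider the following simplified model of size-based retention. It is a sketch and not the TSDB's actual code: the function name and the plain list of block sizes are assumptions made for illustration. Blocks are ordered newest first, and the running total starts at the WAL size rather than at zero.

```python
def blocks_beyond_size_retention(block_sizes, wal_size, max_bytes):
    """Return indices of blocks that no longer fit the size budget.

    Hypothetical helper: block_sizes is ordered newest first and all
    values are in bytes. A max_bytes of 0 or less disables the limit.
    """
    if max_bytes <= 0:
        return []
    total = wal_size  # the WAL counts towards the budget
    for index, size in enumerate(block_sizes):
        total += size
        if total > max_bytes:
            # This block and every older one exceed the budget.
            return list(range(index, len(block_sizes)))
    return []
```

With three 400-byte blocks and a 1000-byte limit, only the oldest block is dropped when the WAL is ignored, whereas a 300-byte WAL pushes the two oldest blocks over the limit.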

Since performance clearly is an important topic, each remote write queue now sports a prometheus_remote_storage_sent_bytes_total counter, which is meant to help track the bandwidth used by remote writes. Speaking of the latter, the query label on prometheus_remote_storage_* metrics has been replaced by the labels remote_name and url, while remote read requests can now make use of range hints and query grouping.
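
A byte counter turns into a bandwidth figure once two readings are compared. The sketch below shows that arithmetic under simple assumptions: the helper name is made up, samples are (unix seconds, counter value) pairs, and a falling value is treated as a counter reset. Prometheus' own rate calculation is more involved.

```python
def average_bytes_per_second(earlier, later):
    """Estimate throughput from two (unix_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = earlier, later
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    # A counter only goes up; a drop means the process restarted and
    # the counter began again from zero.
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / (t1 - t0)
```

Two readings of 1000 and 7000 bytes taken a minute apart would, for example, average out to 100 bytes per second.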

PromQL, Prometheus’ query language, got its parser reworked so that it no longer loses time creating goroutines and synchronising. The parser now uses a slice-based buffer instead of the former channel-based approach, which supposedly makes it up to seven times faster. It has also learnt to accept spaces after the opening square bracket of a subquery’s time range.
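
The difference between the two approaches can be illustrated with a toy example. This is a Python stand-in and not Prometheus' Go code: a thread plus queue plays the role of a goroutine plus channel, a plain list plays the role of the slice, and the tiny grammar is an assumption that covers only a sliver of PromQL. It makes no claim about the actual speed-up.

```python
import queue
import re
import threading

# Toy token pattern: identifiers, durations such as 5m, and punctuation.
# Leading whitespace is skipped, including after an opening bracket.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|\d+[smhdwy]|[\[\]\(\)\{\}:,=])")


def lex(text):
    """Yield the tokens of a small, made-up expression grammar."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        yield match.group(1)
        pos = match.end()


def tokens_via_queue(text):
    """Channel-style: a producer thread hands tokens over one by one."""
    handoff = queue.Queue()

    def producer():
        try:
            for token in lex(text):
                handoff.put(token)
        finally:
            handoff.put(None)  # sentinel so the consumer always stops

    threading.Thread(target=producer).start()
    return list(iter(handoff.get, None))


def tokens_via_list(text):
    """Buffer-style: lex into a list, with no thread or synchronisation."""
    return list(lex(text))
```

Both functions return the same tokens, but the second one avoids starting a thread and synchronising on every token. The toy lexer also treats `rate(foo[ 5m:1m])` and `rate(foo[5m:1m])` identically, which mirrors the whitespace fix for subqueries.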

Feature-wise, the Prometheus API now offers an endpoint that lets users access per-metric metadata. Devs who liked the React user interface introduced in v2.14 can get excited about the newly added pages for targets and the TSDB status, as well as some performance improvements.
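
Querying the new endpoint could look roughly like the sketch below. It assumes the endpoint lives at /api/v1/metadata, accepts optional metric and limit parameters, and wraps its result in the usual status/data envelope, as described in the Prometheus HTTP API documentation. The helper names and the server address in the usage note are placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def metadata_url(base_url, metric=None, limit=None):
    """Build the request URL (path and parameter names are assumptions)."""
    params = {}
    if metric is not None:
        params["metric"] = metric
    if limit is not None:
        params["limit"] = limit
    query = f"?{urlencode(params)}" if params else ""
    return f"{base_url.rstrip('/')}/api/v1/metadata{query}"


def parse_metadata(body):
    """Return the metric-name-to-metadata mapping from a JSON response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return payload["data"]


def fetch_metadata(base_url, metric=None, limit=None):
    """Fetch and parse metadata from a running Prometheus server."""
    with urlopen(metadata_url(base_url, metric, limit)) as response:
        return parse_metadata(response.read())
```

Calling `fetch_metadata("http://localhost:9090", metric="up")` against a local server should then return a dictionary keyed by metric name, with each entry listing the type, help text and unit reported for that metric.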

The Prometheus team also fixed some bugs: the alertmanager configuration should no longer miss targets with similar configurations, and the targets metadata API now works according to the specification.