Grafana Cloud steps up to on-call management, Traces completes enterprise stack; Also, Loki gets easier

Grafana Labs used its customer conference, ObservabilityCON, to announce updates to its commercial offerings and put a spotlight on some recent releases from its open source portfolio.

One of the new helpers the company announced is Grafana Enterprise Traces. The self-managed tracing service uses open source project Tempo as its foundation, but includes enterprise-grade features like access control and support. 

Another new addition is the recorded queries feature, which allows enterprise customers to export the results of non-time series queries. This lets them construct time series where that wasn’t officially an option before and recognise trends in their data, which could be helpful when reviewing processes, for example.
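To illustrate the idea behind recording query results, here is a minimal Python sketch. It is not Grafana's implementation: the `RecordedQuery` class, its `run_query` callback, and the `trend` helper are hypothetical names chosen for this example. It only shows how repeatedly storing the result of a query that returns a single value produces a time series that can be checked for a trend.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class RecordedQuery:
    """Hypothetical helper: stores one (timestamp, value) point per run."""

    # Assumption: the query returns a single number, e.g. a row count.
    run_query: Callable[[], float]
    points: List[Tuple[float, float]] = field(default_factory=list)

    def record(self, timestamp: float) -> None:
        """Run the query once and append its result as a new data point."""
        self.points.append((timestamp, self.run_query()))

    def trend(self) -> float:
        """Difference between the last and the first recorded value."""
        if len(self.points) < 2:
            return 0.0
        return self.points[-1][1] - self.points[0][1]


if __name__ == "__main__":
    # Stand-in for a non-time series query, e.g. "number of open tickets".
    results = iter([10.0, 12.0, 15.0])
    open_tickets = RecordedQuery(run_query=lambda: next(results))
    for ts in (0.0, 60.0, 120.0):
        open_tickets.record(ts)
    print(open_tickets.points)
    print(open_tickets.trend())
```

In a real setup the recording would run on a schedule and write to a metrics backend rather than an in-memory list.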

Since collecting data is always costly, the Grafana team also introduced a cardinality analysis tool that can be used for optimization purposes, while new query sharding capabilities in Grafana Enterprise Metrics and Cloud Metrics are meant to speed up query execution.
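As a rough illustration of what a cardinality analysis looks at, the following Python sketch counts the distinct values each label takes across a set of series. The function name `label_cardinality` and the sample label sets are assumptions made for this example and do not reflect how Grafana's tool works internally.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set


def label_cardinality(series: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count the distinct values seen for each label name."""
    values: Dict[str, Set[str]] = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


if __name__ == "__main__":
    # Made-up label sets; each dict stands for one stored series.
    sample = [
        {"job": "api", "instance": "10.0.0.1", "status": "200"},
        {"job": "api", "instance": "10.0.0.2", "status": "200"},
        {"job": "api", "instance": "10.0.0.3", "status": "500"},
    ]
    counts = label_cardinality(sample)
    # Labels with many distinct values contribute most to the series count.
    for name, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(name, count)
```

Labels with many distinct values are the usual candidates for dropping or aggregating when the goal is to store fewer series.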

ObservabilityCON also marked the kickoff of the public beta phase for Grafana OnCall. The on-call management tool is the result of an acquisition Grafana Labs made earlier this year, and is meant to be integrated into Grafana Cloud deployments. Amongst other things, it promises a unified interface to manage Grafana, Prometheus, and Alertmanager alerts, options to create and manage on-call schedules, and a UI to set up escalation chains.

Commercial offerings aside, Grafana Labs recently also released new versions of its logging system Loki and distributed tracing backend Tempo. Tempo is now available in v1.2, which provides users with an easy way of comparing current config values with the defaults, and an option to query backend blocks via the command line interface. There’s also a command to search a time range for a key/value pair and a runtime config handler, and the project can now check ingesters for traces.

Despite only being a minor release, Tempo 1.2 includes a couple of breaking changes that are due to information consolidation, name changes, and added support for partial results from failed block queries. According to the release notes, this “will likely have no impact on your deployment” but could lead to “a temporary read outage during deployment” of the new version.

Meanwhile the developers behind Loki put their focus on ease of use for the 2.4 release. The system no longer requires logs to be sent in strict chronological order, and comes fitted with a so-called simple scalable deployment mode. The latter bundles component microservices into read and write targets in order to offer a higher availability write path and scale read paths independently according to demand. 

Other than that, a consumer for getting Kafka logs into Loki was added, and recording rules were reworked to be more resilient so that they’re now part of the regular feature set. The team also worked to reduce Loki’s configuration and improve defaults, so users who haven’t set limits in their config files before should check the upgrade guide to avoid problems when switching to the new version.