DataDog has thrown a bone to container fans struggling to monitor large orchestrated Kubernetes clusters with the general availability of its Cluster Agent.
Previously, DataDog explained, its customers had to rely on a DataDog agent on each node that collected data both from the node's kubelet and from the cluster’s control plane. The latter, in particular, while giving plenty of visibility into the cluster, “put increasing load on the API server and etcd as the size of the cluster increased.”
Monitoring service DataDog’s answer is its Cluster Agent, which it describes as a streamlined, centralized approach to collecting cluster-level monitoring data, able to handle thousands of nodes.
The Cluster Agent acts as a proxy between the API server and the node-based agents. This alleviates the load on the API server and lets the node-based agents focus on node-level data, while the dedicated Cluster Agent collects cluster-level data from the master node and relays cluster-level metadata to the node-based agents.
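To see why a proxy eases the pressure, here is a toy Python sketch rather than anything resembling DataDog's actual implementation. The `ApiServer`, `ClusterAgent` and `poll` names are hypothetical, and the model assumes the proxy simply caches one response and serves it to every node-based agent.

```python
class ApiServer:
    """Stand-in for the Kubernetes API server; it only counts requests."""

    def __init__(self):
        self.requests = 0

    def get_cluster_metadata(self):
        self.requests += 1
        return {"services": ["web", "db"]}  # made-up metadata


class ClusterAgent:
    """Hypothetical proxy: asks the API server once, then serves a cached copy."""

    def __init__(self, api_server):
        self.api_server = api_server
        self._cache = None

    def get_cluster_metadata(self):
        if self._cache is None:
            self._cache = self.api_server.get_cluster_metadata()
        return self._cache


def poll(source, node_count):
    """Simulate one polling round by node_count node-based agents."""
    return [source.get_cluster_metadata() for _ in range(node_count)]


# Without a proxy, every node agent hits the API server itself.
direct = ApiServer()
poll(direct, 1000)

# With a proxy, the node agents talk to the Cluster Agent instead.
proxied = ApiServer()
poll(ClusterAgent(proxied), 1000)

print(direct.requests, proxied.requests)
```

In this simplified model the API server's request count stops growing with the number of nodes. A real proxy would also need to refresh its cache, which the sketch leaves out.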
This ties in with the announcement of support for horizontal pod autoscaling in Kubernetes, which allows users to autoscale applications in Kubernetes using any metric collected by DataDog – if the Cluster Agent is installed.
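For the curious, the scaling decision itself is simple arithmetic. The sketch below follows the replica formula given in the Kubernetes Horizontal Pod Autoscaler documentation, including its default 10 per cent tolerance. The function name and the sample numbers are illustrative assumptions, and the metric value stands in for whatever figure DataDog would supply.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count per the documented HPA rule:
    ceil(current_replicas * current_value / target_value),
    left unchanged when the ratio is within the tolerance of 1.0."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical example: 4 pods, metric at 300 per pod against a target of 100.
print(desired_replicas(4, 300.0, 100.0))
```

A metric running at three times its target triples the pod count, while one within 10 per cent of the target leaves the deployment alone.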
Unsurprisingly, DataDog can’t resist saying it had dogfooded the tech itself, and that after “deploying the Datadog Cluster Agent on a cluster with hundreds of nodes and more than 20,000 pods and endpoints, we observed a substantial reduction of the load on our API servers.”
DataDog has also announced a monitor uptime widget, which should smooth punters’ efforts to define and visualise their Service Level Objectives (SLOs) in the DataDog dashboard, and share them internally or externally, eg with customers, with those pretty graphics switching from green to red accordingly. Alerts can be set for a service as a whole or for individual components. The service is in beta for now, but you can request access here.