The hidden cost of dev stack diversity within an enterprise: ‘Engineering chaos’

A survey of over 100 enterprises reported by DevOps startup Earthly showed that the top CI/CD (Continuous Integration/Continuous Delivery) issue is not speed but rather that the diversity of development stacks enabled by containerization has made it impossible to enforce engineering and security policy across teams.

Earthly founder Vlad Ionescu reported the outcome of interviews with over 100 businesses, including DocuSign, Twilio, LinkedIn, Box, Morgan Stanley and Bank of New York Mellon. The interviews were conducted in the hope of validating the idea that CI/CD performance is critical to developer productivity, which would have helped the company market its open source build framework (also called Earthly), based on Docker’s BuildKit.

The top issue reported, though, was not CI/CD speed but rather the difficulty of managing teams with diverse dev stacks. Containerization of applications and microservices means that individual teams typically have a lot of freedom to use their preferred tools, programming languages and frameworks. Provided that the container runs correctly, often on Kubernetes, it will be compatible with the output of other teams. The consequence, though, is that enforcing policy company-wide becomes difficult.

“Within any given company, you’ll find a mix of programming languages, CI technologies, build scripts, packaging constructs, in-house scripts, adapters,” and more, said Ionescu. “Security teams complained about not having any visibility into the chaos. Engineering leadership complained about not being able to enforce high-quality engineering standards and not being able to understand the level of maturity of each app.”

Earthly Lunar – a solution for managing diverse CI/CD processes?

Earthly’s answer is the launch of a new CI/CD monitoring system called Lunar. This works by instrumenting any CI/CD pipeline to generate “metadata about how code is built, tested, scanned and deployed.” The metadata is then queried to assess compliance with policy. Lunar is an effort to get a standardized view of diverse DevOps processes, though it is unlikely to be a complete solution to the problems Ionescu identified.
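
To make the general idea concrete, here is a minimal sketch of querying pipeline metadata against policy. It is not Lunar's actual schema or API, which the announcement does not detail: the `BuildRecord` fields, the policy names, the `internal/` image prefix and the example repositories are all assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class BuildRecord:
    """Hypothetical metadata from one pipeline run (not Lunar's real schema)."""
    repo: str
    ci_system: str
    tests_ran: bool
    security_scan: bool
    base_image: str


# Each policy is a name plus a predicate over a record.
# The rules themselves are illustrative assumptions.
POLICIES = {
    "tests must run": lambda r: r.tests_ran,
    "security scan required": lambda r: r.security_scan,
    "approved base image": lambda r: r.base_image.startswith("internal/"),
}


def violations(records):
    """Return {repo: [failed policy names]} for repos that fail any policy."""
    report = {}
    for record in records:
        failed = [name for name, check in POLICIES.items() if not check(record)]
        if failed:
            report[record.repo] = failed
    return report


# Two teams with different CI systems, described by the same record shape.
records = [
    BuildRecord("payments", "jenkins", True, True, "internal/python:3.12"),
    BuildRecord("search", "github-actions", True, False, "ubuntu:22.04"),
]

print(violations(records))
```

The point of the sketch is that once every pipeline emits records in a common shape, the policy checks no longer need to know which CI system or build script produced them.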

One should be wary of a report that supports a product launch; but a discussion on Reddit showed some agreement with Ionescu’s diagnosis. “I’ll have to empathize with some of the problems they’ve outlined,” said one. Another remarked that there is no single DSL (domain specific language) to define configuration, deployment and monitoring of CI/CD. “You need to know Jenkins, Docker, Kubernetes, Helm, Terraform, Ansible, PromQL, etc., etc. Then the cloud provider will pull out the rug from under your feet once in a while; we are on the third iteration of GCP dashboard and alert definitions.”

The discussion highlighted, though, that the stack diversity issue is in part a consequence of having a proliferation of applications and microservices. Having multiple teams work on a single code repository is likely easier to manage than having each team work on its own repository, with more freedom to go its own way.

It is also true that giving teams freedom to use their preferred tools has benefits for their performance and ability to innovate.

According to Ionescu, whose career includes a brief spell at Google as a software engineer, neither Google nor Facebook suffers from these dev stack diversity issues, because they are among “companies who have invested heavily in common CI/CD infrastructure.” 

The further implication, perhaps, is that standardizing CI/CD across an enterprise is a possible solution to these issues – though not easy to achieve retrospectively.