Gitpod, which provides containerised development environments, has migrated its service away from Kubernetes in favour of a new home-grown platform called Flex, citing issues with complexity, resource management, and state management.
Co-founder and CTO Christian Weichel, along with engineer Alejandro de Brito Fontes, said that Kubernetes “seems like the obvious choice for building out remote, standardized and automated development environments,” and Gitpod has used it since the company was founded in 2020. Now, though, the authors reflect on “our journey of experiments, failures and dead-ends.”
The company does not dismiss Kubernetes as unsuitable for production applications, but claims that development environments are a special case because they are exceptionally stateful, have unpredictable resource usage, and require far-reaching permissions.
Some of the issues, though, are not unique to development environments. Complexity, for example, is cited as a “significant support load,” with the claim that teams underestimate the challenges. “Managed Kubernetes services help, but come with their own restrictions (e.g. Kernel versions on GKE) – state handling and storage remains unsolved,” said Weichel on Hacker News.
What is the replacement, Gitpod Flex, like? Perhaps surprisingly, the authors state that Flex uses a control plane “heavily inspired by Kubernetes.” The new platform simplifies the architecture and improves the security foundation, they claim.
The immediate consequence of the shift from what is now called Gitpod Classic (Kubernetes-based) to Flex is that users have fewer deployment options. Classic Gitpod would run on any Kubernetes. Flex at scale runs only on AWS (Amazon Web Services), though the company said that Azure and Google Cloud Platform support is planned. Flex is self-hosted, meaning that it runs in the customer’s own AWS account. That includes the infrastructure for “runners,” which are responsible for operational tasks such as running, scaling, and backing up development environments, and which communicate with the Gitpod Flex management plane.
There is also an option to run Flex via Gitpod Desktop, which runs on a Mac and enables the launch of development environments on the local machine.
In an online presentation, Weichel said that although Flex development environments are container-based, each engineer requires their own remote VM, whereas in the Classic architecture they would have a Kubernetes pod. “The first reaction might be, moving from pods to VMs, isn’t that moving back in time?” said Weichel, but he argued that there are “a lot of benefits … VMs offer a lot stronger security guarantees than any container ever will. It’s also around resource isolation.” He also said that VMs are the “core principle of how a lot of cloud systems are built,” making them the best unit of isolation.
Despite using a dedicated VM, the development environment itself still runs in a container. This is partly for standardization, with Gitpod now adopting the same devcontainer.json container specification used by Microsoft and others, which Weichel said was their “most upvoted issue.”
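For readers unfamiliar with the format, a devcontainer.json file is a small JSON document describing the container a development environment runs in. The Python sketch below builds a minimal one and prints it. The property names (name, image, forwardPorts, postCreateCommand) are part of the published Dev Container specification, but every value here, and the render_devcontainer helper, is an illustrative assumption and not taken from Gitpod’s documentation.

```python
import json

# Illustrative values only; none of these come from Gitpod's documentation.
devcontainer = {
    "name": "example-service",
    # A publicly available base image for dev containers.
    "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
    # Ports inside the container to make reachable from the developer's machine.
    "forwardPorts": [3000],
    # Command run once after the container is created.
    "postCreateCommand": "echo 'environment ready'",
}


def render_devcontainer(config: dict) -> str:
    """Serialize the config as it might appear in .devcontainer/devcontainer.json."""
    return json.dumps(config, indent=2)


print(render_devcontainer(devcontainer))
```

Because the file is plain JSON checked in alongside the code, the same definition can in principle be read by any tool that supports the specification.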
Multiple development containers can run in a single VM, but only for one developer. Weichel said that sharing a VM between several developers caused not only security issues, but also scaling problems when one developer logged off and another continued working.
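The ownership rule described here (many containers per VM, but one developer per VM) can be pictured with a short Python sketch. This is a toy model under stated assumptions, not Gitpod’s implementation: the DevVM class and its start_container method are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DevVM:
    """Hypothetical model of a VM dedicated to a single developer."""

    owner: str
    containers: List[str] = field(default_factory=list)

    def start_container(self, developer: str, name: str) -> None:
        # A VM is never shared: only its owner may start containers on it.
        if developer != self.owner:
            raise PermissionError(f"VM belongs to {self.owner}, not {developer}")
        self.containers.append(name)


vm = DevVM(owner="alice")
vm.start_container("alice", "backend")
vm.start_container("alice", "frontend")  # several containers, same developer
```

In this model a second developer would get a separate DevVM instead of a slot on an existing one, which matches the article’s point that the VM, not the container, is the unit of isolation.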
While Gitpod appears to have come up with a platform better optimized for its needs, the current restriction to AWS (and, for desktop, to Mac) is frustrating to some. That said, Flex is in free preview until early 2025, though users still bear potentially hefty AWS hosting fees. Classic will be “sunset by April 1, 2025,” according to its login page, so Gitpod customers will have little choice.