Orchestral manoeuvres in the Docker: A noob’s guide to microservices

Given the hype around microservices, it’s tempting to ask whether the task of managing them has been oversold too. Isn’t it just like managing a traditional piece of software? Well, no. Here’s why.

In the case of our trusted monolithic applications, you’ll more often than not find yourself dealing with a single binary to execute or a simple command to run in order to make the entire application live. Usually our monolith can be found on three or four dedicated machines behind a pair of load balancers. This is not the case with microservices.

Given that a decent-sized microservices-based application can span 20 to 30 services, you’ll need a lot more going on behind the scenes to keep everything running and processing traffic smoothly. The keyword here is orchestration.

The monolithic application contains a big bundle of tightly coupled code. These apps are easy to deploy and they’re even easier to understand – from the outside. They’re a black box with buttons on.

With microservices you’ll find 20-plus black boxes of different shapes and sizes, and all with different buttons.

Monoliths are dropped in one release cycle and contain a lot of moving parts – so a lot can go wrong in a single release. Small changes can result in big challenges where the software has been rolled out at large scale.

Each microservice is isolated and consists of a single unit of functionality. This means a change to a service is a very small deployment and greatly reduces what can go wrong.

One database is often all you’ll find powering a traditional application. With tightly coupled code often comes a relational database with tightly coupled relationships between data. This means a table schema change can knock out the entire application in the event of a bad deployment.

In an ideal microservices world, each service would have its own database to manage its own data model. A change to the schema is abstracted away via the service’s API: the only way another service can get access to the data.

Even with modern cloud providers like AWS and its AutoScaling Groups, scaling a monolith is a lot of work. The traditional application will take time to boot and require heavy resources that are expensive and slow to provision – if they’re available.

Microservices are not only born to be scaled, they were born because of a need to scale. They are incredibly quick to launch and have such low CPU and memory footprints that deployment times can often be measured in milliseconds.

If microservices are each meant to have their own database, how do we deploy that database? If each service has its own version number, how do we roll out a new version?

Welcome to managing microservices.

Containers, containers, containers

You need to deploy microservices as single units and, while alternatives exist, the mind goes to Docker. It’s not perfect and it doesn’t support every technology stack, but microservices need a container environment, and Docker has decent features, broad industry support (which bodes well for its safety and continued development) and millisecond deployment times.

You’ll want to use versioned containers for easier, rolling deployments. Docker images can be tagged with anything you like, which means you can tag them with version numbers. Using these numbers gets us what we need: version-controlled containers we can deploy and roll back if something fails to work as expected.
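As a rough illustration of why numeric tags help with rollbacks, here is a minimal Python sketch that picks the tag to fall back to. It assumes tags follow a plain "major.minor.patch" scheme, which is a convention chosen for this example rather than anything Docker enforces; the function names are hypothetical too.

```python
import re

# Assumed tag format for this sketch: plain versions such as "1.4.2".
# Docker itself accepts almost any string as a tag.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag):
    """Return (major, minor, patch) for a version-style tag, or None otherwise."""
    match = _VERSION.match(tag)
    return tuple(int(part) for part in match.groups()) if match else None


def rollback_target(tags, current):
    """Pick the highest versioned tag that is strictly older than `current`."""
    current_version = parse_version(current)
    if current_version is None:
        raise ValueError(f"current tag {current!r} is not a version number")
    older = [
        (version, tag)
        for tag in tags
        if (version := parse_version(tag)) is not None and version < current_version
    ]
    # Versions are compared numerically, so 1.9.3 sorts below 1.10.0.
    return max(older)[1] if older else None
```

Non-numeric tags such as "latest" are simply ignored here, which is one reason to avoid relying on them for deployments you may need to undo.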

Docker images are made from layers, each one a set of changes to the filesystem. Each new change to an image results in only the differences being deployed. This is what makes Docker containers so fast to work with: if your image is 500MB (not uncommon; consider Alpine Linux for smaller images) and you add 500KB of changes, redeploying across a cluster is simply a matter of downloading the 500KB change, and not the entire 500MB+500KB set of data.
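The arithmetic behind that saving can be sketched in a few lines of Python. This is a simplified model, not how Docker is implemented: layers are represented as hypothetical (digest, size) pairs, and a node is assumed to skip any layer whose digest it already holds.

```python
def layers_to_pull(image_layers, cached_digests):
    """Return the (digest, size) layers of an image that a node does not have yet."""
    return [
        (digest, size)
        for digest, size in image_layers
        if digest not in cached_digests
    ]


def pull_size(image_layers, cached_digests):
    """Total bytes a node must download to run the image."""
    return sum(size for _, size in layers_to_pull(image_layers, cached_digests))


# Hypothetical image: a 500MB base layer plus a 500KB layer of changes.
image = [("sha256:base", 500_000_000), ("sha256:change", 500_000)]

# A node that already holds the base layer only fetches the small change.
print(pull_size(image, cached_digests={"sha256:base"}))
# A brand-new node has to fetch everything.
print(pull_size(image, cached_digests=set()))
```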

A Dockerfile is all you’ll need to write to get your application and its dependencies wrapped up into a single, deployable unit. Instructions that change the filesystem (RUN, COPY and ADD) each introduce a new layer within the image, so be mindful of this when writing them.

Once you have a catalog of Docker images you’ll want to introduce a repository for storing them. This repository is what your orchestration engine will use to pull images and create containers. Without a central repository to work with, juggling Docker images and containers can be a lot harder.

With a Docker image repository set up, you’ll need something to keep your containers up and running. It’s a lot like spinning plates – you don’t want to be doing it yourself.

You need orchestration

An orchestration engine is aware of what it is you want and it knows how to get it for you. After it has deployed containers it’ll then manage them, producing new instances of a container to meet demand for its functionality or bringing up new containers when the previous instances fail due to a software or hardware issue.
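The core idea, comparing the state you asked for with the state that actually exists and then closing the gap, can be sketched as follows. This is a toy model in Python with hypothetical names (Container, reconcile); real engines such as Kubernetes do considerably more.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    healthy: bool


def reconcile(desired_replicas, running):
    """Compare desired state with observed state.

    Returns (to_start, to_stop): how many new containers to launch and
    the names of the containers that should be removed.
    """
    healthy = [c for c in running if c.healthy]
    failed = [c.name for c in running if not c.healthy]
    # Healthy containers beyond the desired count are surplus.
    surplus = [c.name for c in healthy[desired_replicas:]]
    to_start = max(desired_replicas - len(healthy), 0)
    return to_start, failed + surplus


# One of two containers has died and we want three: start two, remove the dead one.
observed = [Container("web-1", True), Container("web-2", False)]
print(reconcile(3, observed))
```

An engine would run something like this in a loop, which is why a crashed container gets replaced without anyone being paged.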

An orchestration engine will also deal with rolling deployments, aborting failed deployments and gradually shifting inbound traffic over to a newly deployed version. This means your deployment process will become more about writing code and pushing it to a git repository than about staying up until 23:00 and manually going through a runbook, hoping everything works.
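To make the "shift traffic gradually, abort if it goes wrong" idea concrete, here is a minimal Python sketch. The step percentages, the error threshold and the error_rate_at probe are all assumptions made for illustration; they do not correspond to any particular orchestration engine’s settings.

```python
def shift_traffic(steps, error_rate_at, max_error_rate=0.05):
    """Move traffic to a new version step by step, aborting on a bad step.

    steps          -- percentages of traffic sent to the new version, e.g. [10, 50, 100]
    error_rate_at  -- hypothetical probe: percentage -> observed error rate (0.0 to 1.0)
    max_error_rate -- assumed threshold above which the rollout is abandoned

    Returns (final_percent, history); final_percent is 0 after an abort,
    meaning all traffic is back on the old version.
    """
    history = []
    for percent in steps:
        observed = error_rate_at(percent)
        history.append((percent, observed))
        if observed > max_error_rate:
            return 0, history
    final = steps[-1] if steps else 0
    return final, history


# A healthy release reaches 100 percent of traffic.
print(shift_traffic([10, 50, 100], lambda percent: 0.01))
# A release that misbehaves under load is pulled back at the 50 percent step.
print(shift_traffic([10, 50, 100], lambda percent: 0.2 if percent >= 50 else 0.01))
```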

You’ll also want to be able to abstract away the infrastructure and not care about it. This isn’t such an obvious benefit of using an orchestration system, but – as with Kubernetes below – compute resources are introduced and taken away again simply, as and when the orchestration cluster needs them. You’re none the wiser (unless you want to be) as to what’s happening with your cluster’s EC2 instances, and that’s a good thing. You worry about code and building solutions, and Kubernetes will worry about whether or not you have enough compute resources to handle demand.

Finally, you’ll need automatic and transparent scalability. Kubernetes can be told what constitutes load for a given set of Pods (through its Horizontal Pod Autoscaler) and, as such, it can scale the service up and down with demand without you having to intervene. As I hinted at earlier, with the right configuration it’ll even scale your EC2 instances for you – inside an AutoScaling Group.
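For a feel of how such a scaling decision is made, the Kubernetes documentation describes the Horizontal Pod Autoscaler’s core calculation as the current replica count multiplied by the ratio of the observed metric to the target metric, rounded up. The Python sketch below reproduces only that formula plus a min/max clamp; the real autoscaler also applies tolerances and stabilisation windows, and the default bounds used here are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count suggested by ceil(current * observed / target), clamped.

    min_replicas and max_replicas are illustrative defaults, not Kubernetes defaults.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Four Pods averaging 90% CPU against a 60% target: scale up.
print(desired_replicas(4, 90, 60))
# Four Pods averaging 30% CPU against a 60% target: scale down.
print(desired_replicas(4, 30, 60))
```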

So, to recap:

Kubernetes

  • Supports the main cloud providers such as AWS and GCP
  • Lets you manage the state of your services using code (YAML in this case)
  • Comes with fast deployment times

Pods, Volumes and Services

  • Pods are used to deploy the containers to the cluster
  • Volumes are how you’ll manage persistent data across the cluster (using the services provided by a good cloud provider)
  • Services are how you’ll expose those pods to the network

Kops

  • Makes it easy to deploy a K8s cluster
  • Lets you manage the cluster going forward from a single tool
  • Offers the option to produce Terraform code for future expansion

Ultimately it pays to employ an ephemeral mindset towards your code and applications. They will – and should – come and go quickly as you make and release code changes. Docker lets you deploy an image as a container within an incredibly short space of time. That container can also be shut down and deleted just as fast.

But it’s not all plain sailing and with Docker comes a new set of problems. The issue shifts from managing all those microservices to managing all those containers. An orchestration engine like Kubernetes will take care of this for you. Sitting on AWS EC2 and taking advantage of ASGs, Kubernetes can scale not only your containers but also the compute resources needed to power them.

Kubernetes is complex and takes a lot of work to understand and deploy. Luckily, simple yet powerful tools like Kops exist to ease the deployment process, reducing it to a 30-minute job.

So, coming back to the opening proposition: is managing microservices different from managing traditional applications? Clearly, the answer is “yes”.

It’s a “yes” because microservices are a new way of thinking about application design, a way that introduces a complex set of mandatory requirements.

Fortunately, we have the tools to get up and running.