Reduce costs by 90% by moving from microservices to monolith: Amazon internal case study raises eyebrows

By Tim Anderson

May 5, 2023

An Amazon case study from the Prime Video team has caused some surprise and amusement in the developer community, thanks to its frank assessment of how to save money by moving from a microservices architecture to a monolith, and avoiding costly services such as AWS Step Functions and Lambda serverless functions.

The team needed a monitoring tool to identify quality issues in “every stream viewed by customers.” The tool therefore had to be highly scalable, as there are “thousands of concurrent streams.” The team initially created a solution with distributed components orchestrated by AWS Step Functions, a serverless orchestration service based on state machines and tasks. It turned out, though, that Step Functions was a bottleneck.

“Our service performed multiple state transitions for every second of the stream, so we quickly reached account limits. Besides that, AWS Step Functions charges users per state transition,” the paper stated. There was also a “cost problem” with the “high number of tier-1 calls to the S3 bucket” used for temporary storage of captured video frames.
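To see why per-transition billing bites at this scale, here is a rough back-of-envelope sketch. All the numbers are assumptions for illustration: the paper says only “thousands of concurrent streams” and “multiple state transitions for every second,” and the per-transition price below is a placeholder, not a figure taken from the paper.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # a 30-day month, for simplicity


def monthly_transition_cost(
    concurrent_streams: int,
    transitions_per_stream_second: float,
    price_per_transition: float,
    seconds: int = SECONDS_PER_MONTH,
) -> float:
    """Estimate the monthly bill when every state transition is charged.

    Every argument is a hypothetical input; none comes from the paper.
    """
    total_transitions = concurrent_streams * transitions_per_stream_second * seconds
    return total_transitions * price_per_transition


if __name__ == "__main__":
    # Assumed: 1,000 streams, 2 transitions per stream-second,
    # and an illustrative price of $0.000025 per transition.
    cost = monthly_transition_cost(1_000, 2, 0.000025)
    print(f"Estimated monthly orchestration cost: ${cost:,.0f}")
```

The point is the shape of the formula rather than the exact total: cost grows linearly with both the number of streams and the number of transitions per second, so a workload that transitions state continuously pays for orchestration continuously.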

The initial architecture for the Prime Video monitoring application proved too costly and scaled poorly

“We realized that distributed approach wasn’t bringing a lot of benefits in our specific use case, so we packed all the components into a single process,” the paper continued, eliminating the need for S3. “We also implemented orchestration that controls components within a single instance.” The solution now runs on EC2 (Elastic Compute Cloud) and ECS (Elastic Container Service), with “a lightweight orchestration layer to distribute customer requests.”
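As a loose illustration of what “all the components in a single process” can look like, the sketch below passes captured frames between components as in-memory objects, with a plain loop standing in for the orchestration. It is not Prime Video's code: the `Frame` type, the two detectors, and `monitor_stream` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Frame:
    """A captured video frame held in memory (hypothetical type)."""
    stream_id: str
    index: int
    pixels: bytes


# A detector inspects one frame and returns any issues it finds.
Detector = Callable[[Frame], List[str]]


def black_frame_detector(frame: Frame) -> List[str]:
    # Toy rule: a non-empty frame whose bytes are all zero counts as black.
    if frame.pixels and all(b == 0 for b in frame.pixels):
        return [f"black frame at {frame.index}"]
    return []


def empty_frame_detector(frame: Frame) -> List[str]:
    # Toy rule: a frame with no data at all is reported as empty.
    return [] if frame.pixels else [f"empty frame at {frame.index}"]


def monitor_stream(frames: Iterable[Frame], detectors: List[Detector]) -> List[str]:
    """In-process orchestration: each frame goes to every detector by method call."""
    issues: List[str] = []
    for frame in frames:
        for detect in detectors:
            issues.extend(detect(frame))
    return issues


if __name__ == "__main__":
    frames = [
        Frame("s1", 0, b"\x01\x02"),
        Frame("s1", 1, b"\x00\x00"),
        Frame("s1", 2, b""),
    ]
    print(monitor_stream(frames, [black_frame_detector, empty_frame_detector]))
```

In this arrangement a frame never leaves the process, so there is no intermediate storage to pay for and no per-step orchestration charge. The trade-off is that the components now scale together rather than independently.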

The paper concludes that “Microservices and serverless components are tools that do work at high scale, but whether to use them over monolith has to be made on a case-by-case basis. Moving our service to a monolith reduced our infrastructure cost by over 90%. It also increased our scaling capabilities.” There is also a reference to cost reduction via EC2 savings plans, suggesting that even internal AWS customers are billed on a similar model to the rest of us.

“I’m sort of gobsmacked this article exists,” said a comment on Hacker News. Elsewhere AWS frequently touts the benefits of microservices and serverless architecture as the best way to “modernize” applications. For example, under Reliability, the AWS “Well-architected framework” advises:

“Build highly scalable and reliable workloads using a service-oriented architecture (SOA) or a microservices architecture. Service-oriented architecture (SOA) is the practice of making software components reusable via service interfaces. Microservices architecture goes further to make components smaller and simpler.”

In an “AWS Prescriptive Guidance” document for modernizing .NET applications, the company cites benefits of microservices including faster innovation, high availability and reliability, increased agility and on-demand scalability, modern CI/CD (continuous integration and deployment) pipelines, and strong module boundaries, though it also cites “operational complexity” as a disadvantage.

The new paper, though, seems to confirm a couple of suspicions among developers. One is that AWS-recommended solutions may not be the most cost-effective, as they invariably involve using multiple costly services.

Another is that the merits of microservices versus monolithic applications are frequently overstated. David Heinemeier Hansson, creator of Ruby on Rails and an advocate for reducing use of cloud services, commented on the Amazon case study saying that it “really sums up so much of the microservices craze that was tearing through the tech industry for a while: IN THEORY. Now the real-world results of all this theory are finally in, and it’s clear that in practice, microservices pose perhaps the biggest siren song for needlessly complicating your system. And serverless only makes it worse.” According to Hansson, “replacing method calls and module separations with network invocations and service partitioning within a single, coherent team and application is madness in almost all cases.”

In 2020, Sam Newman, consultant and author of books including “Building Microservices” and “Monolith to Microservices,” told a developer conference that “microservices should not be the default choice.” In a comment to The Register, he added this advice for software architects: “Have you done some value chain analysis? Have you looked at where the bottlenecks are? Have you tried modularisation? Microservices should be a last resort.”

Newman noted on Twitter today of the AWS paper: “this article is really speaking more about pricing models of functions vs long-running VMs than anything. Still a totally logical architectural driver, but the learnings from this case study likely have a more narrow range of applicability as a result.” He added that “the reason that people don’t talk publicly about walking back their foray into microservices is that it can be viewed by some as ‘they got it wrong’. Changing your mind when the situation changes is totally sane.”

The paper is not necessarily bad news for AWS. On the one hand, it goes against what the cloud giant tends to say is best practice; but on the other, it is a refreshingly honest look at how to reduce cost with a simplified architecture, as well as a case study in willingness to change track. Unlike many promotional case studies, this one looks genuinely useful to AWS customers.
