The Agile and the Continuous: Database Drift … Neat film title but something to avoid

The Agile and the Continuous: Database Drift … Neat film title but something to avoid

In DevOps the talk is of development and operations, of continuous pipelines and agile updates, of rolling out builds. But it often overlooks something rather critical – the database.

Be it in development or production environment, the database is a critical piece of infrastructure. And everybody in an organisation wants a piece of it.

IT departments usually have a database release management system that enables database admins (DBAs) to review any changes that developers have requested to schemas and stored procedures.

According to Robert Reeves, CTO of Datical, which sells tools to automate database releases, problems occur when organisations quicken the release cycles and the database team gets out of sync. This is what can happen when DevOps – with its agile and continuous builds – is introduced.

“We worked with one vehicle manufacturer based in the US that had purchased Pivotal CloudFoundry,” he recalls. In theory, that tool offers fast deployment. Type cf push and the application name, and the software is out in production without the development team having to worry about the infrastructure.

“They couldn’t do that until the database was updated, because these things go hand in glove,” Reeves says. “They would have to wait between seven and twelve business days until the DBA got around to it.”

There are other potential problems, too. One is database drift. If you’re not enforcing all database changes through an automated system, then trigger-happy devs might change the QA or – heaven forfend – production schemas to get their code out the door.

A database out of step with a DevOps workflow can also leave DBAs scratching their heads when performance problems arise, warns Stephanie Herr, Redgate database DevOps product manager.

“We still see DBAs getting paged if there’s a performance problem in production,” she says. Without good visibility into who made which changes, cause analysis is difficult. “Those production DBAs need to address that, even though they may not have written the code in the last deployment that made the changes, or been aware of it.” Neither should we overlook privacy and security. The temptation is to grab a copy of the production environment for dev and test purposes.

Alex Yates, director of DevOps consulting firm DLM Consultants, frets that hackers are now actively targeting developers and development systems.

“The hackers stopped attacking production servers years ago because the developers are so much easier,” he says. “Whatever it is they have access to, now the hackers have access to.” A quick phishing email to the right developer – or access to a dev system using default passwords – and it’s game over for that production data.

What’s the reason the database is so frequently overlooked? Technical and cultural issues.

On the cultural side, databases are the purview of the DBA and, as we know, developers are from Mars, DBAs are from Venus.

Says Yates: “Old-school DBAs are at the very Ops end of the DevOps spectrum. They are incredibly risk-averse and security conscious because the consequences of screwing up the data are so significant. These folks need a good reason to trust the new shiny – and they tend to distrust developers because developers have a track record of breaking databases.”

On the tech side, databases traditionally haven’t fit very well in a typical DevOps scenario that treats servers as cattle – letting them die and swapping them as and when necessary.

“That’s all very good until you start talking about persistent data stores,” Yates points out. “I can’t kill it and deploy another because I’ll lose the data. As a result, either we need people to care about the health of the system as if it were a pet, or we need to come up with clever new ways to treat our transactional data like cattle in a reliable way.”

Both these can be tackled simultaneously.

Herr advises involving the DBA in development decisions early on. “If a developer is doing a big feature change that requires some data manipulation, it would be great if that developer could reach out to the DBA and ask for help with it,” she says. DBAs might push back at this. They very busy people, already tasked with managing growing volumes of data in increasingly complex environments. Nobody with this on their plate goes looking for extra work.

Reeves suggests making working with development part of the DBA’s job description. “We need to have a DBA in our product teams, or at least someone that might support a handful of product teams,” he says.

Automation is the oil that greases the DevOps wheels and it can also help to take some of the load off that DBA plate, according to Herr. Scripts and flags for a DBA that highlight significant implications for schemas, functions, stored procedures or security might help.

Flagging critical changes can help close potential gaps that Herr has seen herself in real-world situations. “I would have a script that was thousands of lines long, and the only time the production DBA saw it was when we sat down to do deployment,” she says. “They would glance through it, but they were trusting me.”

Use of scripts highlights another decision that DevOps teams must make when bringing databases into the environment: should DevOps teams use migration scripts to handle database changes?

The process of folding databases into the DevOps process begins by introducing the same kind of control mechanisms you’d use for software source code. “The goal is source control all the way down,” says Yates. “When I redeploy my thing I want to redeploy the entire thing from source control. If my thing has a database, my deployment needs to include the database. Otherwise, it won’t work.”

A traditional approach to database source control is to write migration scripts that make changes to the database. If you were to run each migration script committed in turn from scratch, you’d end up reproducing the most current version of your database schema.

The problem comes, however, when different developers start working on an application’s code and make changes to the database – each producing a script. That can create merge conflicts in the version control system. Which migration script is correct? Which should you run first?

You may have to pick over individual lines in each script to see what’s happening and manually stitch together a Frankenscript, according to Yates. The bigger the team and the less disciplined it is about checking in code, the stickier this problem will get. An alternative is to take a model (also known as a state)-based approach. This stores a snapshot of the data structure, describing the way that the database schema should be. We still need to generate the upgrade scripts that make the database look that way, and there are automated tools to handle this.

“Ninety per cent of the time it makes your job a lot easier,” he says. “In the source control world, you can have your nice declarative view of how you want the system to look. The problem is when you have a change that isn’t trivial to work out.”

Sometimes the software won’t know enough to generate a script well. If you split a full name column in a relational table into two columns representing first and last name, the software is likely to drop the full name column and create the two subsequent columns, losing your data in the process. It takes a human to script the change without losing the data.

The type of approach you choose to bring your database into a source-controlled, continuous integration world will depend on your team and technology. Large, complex database environments with lots of stored procedures and large development teams lend themselves to state-based database delivery. Conversely, simpler database environments with highly disciplined development teams that can avoid branch merge conflicts, are more migration-friendly.

There are also tools and techniques that can combine the best aspects of the two approaches. One example is Visual Studio SSDT, which lets you run pre-and post-deployment scripts.

Successful DevOps and continuous life cycles need to factor in their database and their organisation’s DBAs. That means having the same conversations with your DBAs as with other members of the DevOps cycle, and understanding how your database change process fits into your version control workflow.

Failure to do this, or leave it too late, could see the humble database become the speed bump to your grand agile plan.