it. Now that it is code, you can apply the
best code-management and software-engineering techniques to it.
Technique 2: Coding for Multiple Schema Versions
How can you upgrade a database schema
in a distributed computing environment?
Imagine a typical Web-based application that consists of many instances (replicas) of the same software running
behind a Web load balancer. Each instance receives its share of the HTTP
traffic. The instances all access the same database.
When the software is tightly coupled
to the database schema, it becomes
impossible to perform software upgrades that require a database schema
change. If you first change the schema,
the instances will all die or at least get
confused by the change; you could run
around upgrading the instances as fast
as possible, but you have already lost
the game because you suffer an outage.
Ah ha! Why not upgrade the instances first? Sadly, as you upgrade the instances’ software one by one, the newly
upgraded instances fail to start as they
detect the wrong schema. You will end
up with downtime until the schema is
changed to match the software.
The obvious solution is to defy the
laws of physics and change the database schema at the exact same time
as you upgrade the software on all the
instances. If you could do that, everything would be just fine.
Sadly, ACM has a policy against defying the laws of physics, as do most
employers. This is why the traditional
method is to shut down the entire application, upgrade everything, and then
bring it back online. It’s the best we can
do until our friends at IEEE figure out
how to pause time.
Whether you stop the world by defying physics or by scheduling downtime, you have introduced an even
bigger problem: You have made many
individual changes, but you don’t know
if any of them were successful until the
system is running again. You also don’t
know which of the accumulated changes caused things to break.
Such “big bang” changes are risky.
It is less risky to make and validate the
changes one at a time. If you make multiple changes all at once and there is a
problem, you have to start binary searching to determine which change was responsible.

Technique 1: Automated Schema Updates

In this technique, the application starts with schema version 1.
That value is stored in the database
(imagine a one-row table with a single
field that stores the value “1”). When the
application starts, it should know that
it is compatible with schema version 1,
and if it doesn’t find that version in the
database, it refuses to run.
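A minimal sketch of this startup check, assuming a one-row schema_version table (the table, column, and function names are illustrative, and SQLite stands in for the production database):

```python
import sqlite3

EXPECTED_SCHEMA_VERSION = 1  # the schema version this release was built for

def assert_schema_version(conn):
    # Read the hypothetical one-row version table described above; refuse
    # to run if the database is not at the version this software expects.
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    found = row[0] if row else None
    if found != EXPECTED_SCHEMA_VERSION:
        raise RuntimeError(
            f"refusing to start: need schema v{EXPECTED_SCHEMA_VERSION}, found v{found}"
        )
```

An instance that starts against the wrong schema fails fast and loudly, rather than limping along and corrupting data.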
To automate schema updating, however, the next release of the software
knows it requires version 2 of the schema, and knows the SQL command that
will upgrade a version 1 schema to version 2. On startup, it sees the version is
1, runs the appropriate schema upgrade
command, updates the version number
stored in the database to 2, and then proceeds to run the application.
Software that does this typically has
a table of SQL schema update commands. The command in array index n
upgrades the schema from version n–1
to n. Thus, no matter which version is
found, the software can bring the database to the required schema version.
In fact, if an uninitialized database is
found (for example, in a testing environment), it might loop through dozens of
schema changes until it gets to the newest version. Not every software release
requires a schema change; therefore,
separate version numbers are used for
schema and software.
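The table-of-commands loop can be sketched as follows; this is only an illustration, using SQLite's user_version pragma as the stored schema version and invented migration SQL (note that with zero-based indexing, MIGRATIONS[n] upgrades version n to n+1):

```python
import sqlite3

# Illustrative migration table. Applying every entry walks an
# uninitialized database (version 0) up to the newest version.
MIGRATIONS = [
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",  # v0 -> v1
    "ALTER TABLE users ADD COLUMN email TEXT",                 # v1 -> v2
]

def upgrade_to_latest(conn):
    # SQLite's user_version pragma stores the current schema version.
    (current,) = conn.execute("PRAGMA user_version").fetchone()
    # No matter which version is found, apply every missing migration.
    for version in range(current, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[version])
        conn.execute(f"PRAGMA user_version = {version + 1}")
    return len(MIGRATIONS)
```

Running this at startup is idempotent: a database already at the newest version passes through the loop untouched.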
There are open source and commercial systems that implement this process. Some are more sophisticated
than others, differing in which languages and database systems they support,
how they handle errors, and whether
they also support rollbacks. A Web search for “sql change
automation” will find many. I am most
familiar with the open source projects Mayflower for .NET code
and Goose for Go.
Schema modifications used to lock
the database for minutes and possibly
hours. This would cause applications
to time out and fail. Modern SQL databases have reduced or eliminated such
problems, thanks to lockless schema
updates and online reindexing features.
These features can be found in most recent
SQL products, including open source
products such as MariaDB, MySQL, and
PostgreSQL. Check the documentation
for details of what can and cannot be
done without interruption.
Once your software uses these techniques, adopting continuous integration (CI) becomes significantly easier.
Your automated testing environment
can include tests that build a database in
the old schema, upgrade it, and run the
new software release. Your schema upgrade process may be tested hundreds
of times before it goes into production.
This should bring new confidence to the
process, reduce the risk of schema upgrades, and reduce how much personal involvement DBAs need in each upgrade. They will
appreciate getting their weekends back.
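Such a CI check can be sketched as a small test; the v1 schema, the upgrade statement, and the queries below are all invented for illustration:

```python
import sqlite3

# Illustrative old schema and the upgrade the new release would apply.
V1_SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
UPGRADE_V1_TO_V2 = "ALTER TABLE users ADD COLUMN email TEXT"

def test_upgrade_from_v1():
    conn = sqlite3.connect(":memory:")
    # Build a database in the old schema with some representative data...
    conn.execute(V1_SCHEMA)
    conn.execute("INSERT INTO users (name) VALUES ('Ada')")
    # ...apply the upgrade...
    conn.execute(UPGRADE_V1_TO_V2)
    # ...and confirm the new release's queries work against the result.
    assert conn.execute("SELECT name, email FROM users").fetchone() == ("Ada", None)
```

Each commit reruns this, so a broken migration is caught long before a production deploy.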
My favorite part of this technique is
that your schema is now being treated
like code. Manual work at the console
has been eliminated and you have
gained the ability to do the process
over and over—in developer sandboxes, testing environments, user acceptance test (UAT) environments, and
production. You can run the process
multiple times, fixing and fine-tuning it.
The Five Phases of a Live Schema Change

1. The running code reads and writes the old schema, selecting just the fields that it needs from the table or view. This is the original state.
2. Expand: The schema is modified by adding any new fields but not removing any old ones. No code changes are made. If a rollback is needed, it’s painless because the new fields are not being used.
3. Code is modified to use the new schema fields and pushed into production. If a rollback is needed, it just reverts to phase 2. At this time any data conversion can be done while the system is live.
4. Contract: Code that references the old, now unused, fields is removed and pushed into production. If a rollback is needed, it just reverts to phase 3.
5. Old, now unused, fields are removed from the schema. In the unlikely event that a rollback is needed at this point, the database would simply revert to phase 4.
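The five phases can be illustrated with a live column rename; the table and column names are invented, and SQLite stands in for the production database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Phase 1: original state -- running code reads and writes the old field.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada')")

# Phase 2 (expand): add the new field without removing the old one.
# Nothing references full_name yet, so rolling back is painless.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

# Phase 3: new code writes full_name; this live data conversion
# backfills rows that were written by the old code.
conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")

# Phase 4 (contract): code that references the old field is removed.
# Phase 5: the old field itself is dropped, e.g.
#   ALTER TABLE users DROP COLUMN name
# (where the database supports it); each phase is independently
# rollback-safe because adjacent phases share a working schema.
```

Because every phase leaves both old and new code able to run, the upgrade never requires stopping the world.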