CONCURRENCY
To make this concrete, in a typical MVC (model-view-controller)
application, the view (typically implemented
in environments such as JavaScript, PHP, or Flash) and
the controller (typically implemented in environments
such as J2EE or Ruby on Rails) can consist purely of
sequential logic and still achieve high levels of concurrency, provided that the model (typically implemented
in terms of a database) allows for parallelism. Given that
most developers don’t write their own databases (and virtually no
one writes their own operating systems), it is possible to
build (and indeed, many have built) highly concurrent,
highly scalable MVC systems without explicitly creating a
single thread or acquiring a single lock; it is concurrency
by architecture instead of by implementation.
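As a rough illustration of concurrency by architecture, consider the following minimal sketch. Everything in it is hypothetical: `Model` stands in for a database, `serve` stands in for an application server, and `controller` and `view` are the only parts an application developer would write. The point is that the latter two contain no threads and no locks, yet requests are still handled in parallel.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Model:
    """Stand-in for a database: the only layer that deals with locking."""

    def __init__(self):
        self._lock = threading.Lock()
        self._visits = {}

    def add_visit(self, user):
        # The "database" serializes conflicting updates internally.
        with self._lock:
            self._visits[user] = self._visits.get(user, 0) + 1
            return self._visits[user]

    def visits(self, user):
        with self._lock:
            return self._visits.get(user, 0)


def view(user, count):
    """Purely sequential rendering logic."""
    return "%s has visited %d times" % (user, count)


def controller(model, user):
    """Purely sequential request logic: no threads, no locks."""
    return view(user, model.add_visit(user))


def serve(model, requests, workers=8):
    """Stand-in for an application server, which supplies the threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda user: controller(model, user), requests))
```

Calling `serve(Model(), ["alice", "bob", "alice"])` runs the sequential controller on several worker threads at once; correctness depends entirely on the model, which is the claim made above.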
ILLUMINATING THE BLACK ART
What if you are the one developing the operating system
or database or some other body of code that must be
explicitly parallelized? If you count yourself among the
relative few who need to write such code, you presumably do not need to be warned that writing multithreaded
code is hard. In fact, this domain’s reputation for difficulty has led some to conclude (mistakenly) that writing
multithreaded code is simply impossible: “No one knows
how to organize and maintain large systems that rely on
locking,” reads one recent (and typical) assertion.[5] Part
of the difficulty of writing scalable and correct multithreaded code is the scarcity of written wisdom from
experienced practitioners: oral tradition in lieu of formal
writing has left the domain shrouded in mystery. So in
the spirit of making this domain less mysterious for our
fellow practitioners (if not also to demonstrate that some
of us actually do know how to organize and maintain
large lock-based systems), we present our collective bag of
tricks for writing multithreaded code.
Know your cold paths from your hot paths. If there
is one piece of advice to dispense to those who must
develop parallel systems, it is to know which paths
through your code you want to be able to execute in
parallel (the hot paths) versus which paths can execute
sequentially without affecting performance (the cold
paths). In our experience, much of the software we
write is bone-cold in terms of concurrent execution: it
is executed only when initializing, in administrative
paths, when unloading, etc. Not only is it a waste of time
to make such cold paths execute with a high degree of
parallelism, but it is also dangerous: these paths are often
among the most difficult and error-prone to parallelize.
In cold paths, keep the locking as coarse-grained as
possible. Don’t hesitate to have one lock that covers a
wide range of rare activity in your subsystem. Conversely,
in hot paths—those that must execute concurrently to
deliver highest throughput—you must be much more
careful: locking strategies must be simple and fine-grained, and you must be careful to avoid activity that
can become a bottleneck. And what if you don’t know if a
given body of code will be the hot path in the system? In
the absence of data, err on the side of assuming that your
code is in a cold path and adopt a correspondingly coarse-grained locking strategy—but be prepared to be proven
wrong by the data.
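The contrast between the two locking styles can be sketched as follows. This is an illustration under stated assumptions, not a recommended design: the `Subsystem` class, its method names, and the choice of 16 buckets are all hypothetical. Cold paths (configuration) share a single coarse lock, while the hot path (counter updates) uses one lock per hash bucket so that threads touching different buckets do not contend. In CPython the global interpreter lock limits real parallelism, so the sketch shows the structure rather than a speedup.

```python
import threading


class Subsystem:
    """Hypothetical subsystem with a coarse cold-path lock and fine hot-path locks."""

    NBUCKETS = 16  # assumed value; in practice, size this from measured contention

    def __init__(self):
        # Cold paths: one lock covers all rare administrative activity.
        self._admin_lock = threading.Lock()
        self._config = {}
        # Hot path: one lock per bucket keeps each critical section small.
        self._bucket_locks = [threading.Lock() for _ in range(self.NBUCKETS)]
        self._buckets = [{} for _ in range(self.NBUCKETS)]

    def set_option(self, name, value):
        """Cold path: simple and coarse-grained."""
        with self._admin_lock:
            self._config[name] = value

    def get_option(self, name):
        with self._admin_lock:
            return self._config.get(name)

    def increment(self, key):
        """Hot path: lock only the bucket that holds this key."""
        i = hash(key) % self.NBUCKETS
        with self._bucket_locks[i]:
            bucket = self._buckets[i]
            bucket[key] = bucket.get(key, 0) + 1

    def count(self, key):
        i = hash(key) % self.NBUCKETS
        with self._bucket_locks[i]:
            return self._buckets[i].get(key, 0)


def run_demo(nthreads=4, per_thread=1000):
    """Drive the hot path from several threads and return the subsystem."""
    s = Subsystem()
    s.set_option("mode", "demo")

    def worker():
        for n in range(per_thread):
            s.increment("key%d" % (n % 8))

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return s
```

If measurement later shows that `set_option` is in fact hot, only that method's locking needs to be revisited; the coarse lock keeps the cold code easy to reason about in the meantime.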
Intuition is frequently wrong—be data intensive. In
our experience, many scalability problems can be attributed to a hot path that the developing engineer originally
believed (or hoped) to be a cold path. When cutting
new software from whole cloth, you will need some
intuition to reason about hot and cold paths—but once
your software is functional, even in prototype form, the
time for intuition has ended: your gut must defer to the
data. Gathering data on a concurrent system is a tough
problem in its own right. It requires you first to have a
machine that is sufficiently concurrent in its execution
to be able to highlight scalability problems. Once you
have the physical resources, it requires you to put load
on the system that resembles the load you expect to see
when your system is deployed into production. Once the
machine is loaded, you must have the infrastructure to be
able to dynamically instrument the system to get to the
root of any scalability problems.
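As a small example of letting data replace intuition, the sketch below wraps a lock so that it records how often callers had to wait and for how long. This is static instrumentation written into the program, a much cruder substitute for the dynamic instrumentation described above, and the `InstrumentedLock` class and `measure` function are hypothetical names introduced only for illustration.

```python
import threading
import time


class InstrumentedLock:
    """Hypothetical lock wrapper that records acquisition and wait statistics."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        # Counters are only updated while _lock is held, so they need
        # no additional protection.
        self.acquisitions = 0
        self.contended = 0
        self.wait_seconds = 0.0

    def __enter__(self):
        # Try a non-blocking acquire first so contention can be counted.
        if self._lock.acquire(blocking=False):
            waited, was_contended = 0.0, 0
        else:
            start = time.perf_counter()
            self._lock.acquire()
            waited, was_contended = time.perf_counter() - start, 1
        self.acquisitions += 1
        self.contended += was_contended
        self.wait_seconds += waited
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False

    def report(self):
        """Return a snapshot; call this after the worker threads have finished."""
        return {
            "name": self.name,
            "acquisitions": self.acquisitions,
            "contended": self.contended,
            "wait_seconds": self.wait_seconds,
        }


def measure(nthreads=4, iterations=50):
    """Put artificial load on one lock and return the instrumented lock."""
    lock = InstrumentedLock("shared")

    def worker():
        for _ in range(iterations):
            with lock:
                time.sleep(0.0001)  # stand-in for work done under the lock

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lock
```

A high ratio of `contended` to `acquisitions`, or a large `wait_seconds`, suggests that a path assumed to be cold is actually hot. The numbers are only meaningful when the load resembles what the system will see in production.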
The first of these problems has historically been acute:
there was a time when multiprocessors were so rare that
many software development shops simply didn’t have
access to one. Fortunately, with the rise of multicore
CPUs, this is no longer a problem: there is no longer any
excuse for not being able to find at least a two-processor
(dual-core) machine, and with only a little effort, most
will be able (as of this writing) to run their code on an
eight-processor (two-socket, quad-core) machine.
Even as the physical situation has improved, however,
the second of these problems—knowing how to put load
on the system—has worsened: production deployments
have become increasingly complicated, with loads that