a focus on immediately applicable
takeaways for practitioners running
distributed systems in the wild. As production deployments have increasingly adopted weak consistency models
such as eventual consistency, we have
learned several lessons about how to
reason about, program, and strengthen these weak models.
In summary, we will primarily focus
on three questions and some preliminary answers:
How eventual is eventual consistency? If the scores of system architects advocating eventual consistency
are any indication, eventual consistency seems to work “well enough” in
practice. How is this possible when it
provides such weak guarantees? New
prediction and measurement techniques
allow system architects to quantify the
behavior of real-world eventually consistent systems. When verified via measurement, these systems appear strongly
consistent most of the time.
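For intuition, consider a toy Monte Carlo sketch of this kind of quantification (the exponential delay distribution and its parameters are illustrative assumptions, not measurements of any real store): it estimates the probability that a read issued t seconds after a write observes that write.

```python
import random

def fraction_fresh(t, trials=100_000, mean_delay=0.01):
    """Fraction of reads issued t seconds after a write that see it."""
    fresh = 0
    for _ in range(trials):
        # Hypothetical replication delay before the replica serving
        # this read applies the write (assumed exponential here).
        delay = random.expovariate(1.0 / mean_delay)
        if delay <= t:
            fresh += 1
    return fresh / trials

for t in (0.001, 0.01, 0.1):
    print(f"P(fresh read at t={t * 1000:.0f}ms) ~ {fraction_fresh(t):.3f}")
```

Under these assumed delays, reads become overwhelmingly likely to be fresh within a few tens of milliseconds, which matches the observation that such systems appear strongly consistent most of the time.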
How should one program under
eventual consistency? How can system
architects cope with the lack of guarantees provided by eventual consistency?
How do they program without strong
ordering guarantees? New research
enables system architects to deal with
inconsistencies, either via external compensation outside of the system or by limiting themselves to data structures that
avoid inconsistencies altogether.
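As one illustration of the latter approach, here is a minimal sketch of a convergent data structure, a grow-only counter (the class and its interface are illustrative, not any particular system's API). Each replica increments only its own slot, and merging takes the element-wise maximum; because merge is commutative, associative, and idempotent, replicas agree no matter the order, or number of times, updates are delivered.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merge by element-wise max."""

    def __init__(self, replica_id, num_replicas):
        self.replica_id = replica_id
        self.counts = [0] * num_replicas

    def increment(self):
        # Each replica mutates only its own slot.
        self.counts[self.replica_id] += 1

    def value(self):
        return sum(self.counts)

    def merge(self, other):
        # Element-wise max is commutative, associative, and idempotent.
        self.counts = [max(a, b) for a, b in zip(self.counts, other.counts)]

# Two replicas accept increments independently, even while partitioned...
a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(); a.increment()
b.increment()
# ...and agree once they exchange state, in either order.
a.merge(b); b.merge(a)
assert a.value() == b.value() == 3
```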
Is it possible to provide stronger
guarantees than eventual consistency
without losing its benefits? In addition
to guaranteeing eventual consistency
and high availability, what other guarantees can be provided? Recent results
show that it is possible to achieve the
benefits of eventual consistency while
providing substantially stronger guarantees, including causality and several
ACID (atomicity, consistency, isolation,
durability) properties from traditional
database systems while still remaining
highly available.
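To make the causality guarantee concrete, the following sketch shows the comparison at the heart of many highly available causally consistent designs: vector clocks, which let a replica order causally related writes without coordinating with unreachable replicas. (The helper functions are illustrative, not drawn from any specific system.)

```python
def happens_before(v1, v2):
    """True if the event stamped v1 causally precedes the one stamped v2."""
    return all(a <= b for a, b in zip(v1, v2)) and v1 != v2

def concurrent(v1, v2):
    """Neither event causally precedes the other."""
    return not happens_before(v1, v2) and not happens_before(v2, v1)

# Replica 0 writes (clock [1,0]); replica 1 reads it, then writes [1,1].
# The second write causally depends on the first and must be applied after.
assert happens_before([1, 0], [1, 1])
# Two writes issued during a partition are concurrent; a replica may apply
# them in either order (or merge them) without blocking on coordination.
assert concurrent([2, 0], [1, 1])
```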
History and Concepts of Eventual Consistency
Brewer’s CAP theorem dictates that it is impossible to simultaneously achieve an always-on experience (availability) and to ensure users read the latest written version of a distributed database (consistency—as formally proven, a property known as “linearizability” [11]) in the presence of partial failure (partitions) [8]. CAP pithily summarizes trade-offs inherent in decades of distributed-system designs (for example, RFC 677 [14] from 1975) and shows that
maintaining an SSI (single-system
image) in a distributed system has a
cost [10]. If two processes (or groups of processes) within a distributed system cannot communicate (are partitioned)—either because of a network
failure or the failure of one of the components—then updates cannot be synchronously propagated to all processes without blocking. Under partitions,
an SSI system cannot safely complete
updates and hence presents unavailability to some or all of its users. Moreover, even without partitions, a system
that chooses availability over consistency enjoys benefits of low latency: if
a server can safely respond to a user’s
request when it is partitioned from all
other servers, then it can also respond
to a user’s request without contacting
other servers even when it is able to do
so [1]. (Note that you cannot “sacrifice” partition tolerance [12]! The choice is between consistency and availability.)
As services are increasingly replicated to provide fault tolerance (
ensuring services remain online despite
individual server failures) and capacity (to allow systems to scale with variable request rates), architects must
face these trade-offs among consistency, availability, and latency head-on. In a dynamic, partitionable Internet, services requiring guaranteed low latency
must often relax their expectations of
data consistency.
Eventual consistency as an available alternative. Given the CAP impossibility result, distributed-database
designers sought weaker consistency
models that would enable both availability and high performance. While
weak consistency has been studied
and deployed in various forms since
the 1970s [19], the eventual consistency
model has become prominent, particularly among emerging, highly scalable NoSQL stores.
One of the earliest definitions of
eventual consistency comes from a
1988 paper describing a group communication system [15] not unlike a shared
text editor such as Google Docs today:
“…changes made to one copy eventually migrate to all. If all update activity
stops, after a period of time all replicas
of the database will converge to be logically equivalent: each copy of the database will contain, in a predictable order,
the same documents; replicas of each
document will contain the same fields.”
Under eventual consistency, all servers eventually “converge” to the same
state; at some point in the future, servers are indistinguishable from one another.
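A last-writer-wins register is a simple way to see this convergence (the wall-clock timestamps here are a simplifying assumption; practical systems often use logical clocks): once update activity stops and replicas gossip their states, every copy holds the same value.

```python
import dataclasses

@dataclasses.dataclass
class LWWRegister:
    """Register that keeps whichever write carries the highest timestamp."""
    value: object = None
    timestamp: float = 0.0

    def write(self, value, timestamp):
        if timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp

    def merge(self, other):
        # Merging is just another write: the later timestamp wins.
        self.write(other.value, other.timestamp)

r1, r2, r3 = LWWRegister(), LWWRegister(), LWWRegister()
r1.write("draft", 1.0)
r2.write("final", 2.0)  # concurrent updates at different replicas
for a, b in [(r1, r2), (r2, r3), (r3, r1)]:  # one round of gossip
    a.merge(b); b.merge(a)
assert r1.value == r2.value == r3.value == "final"
```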