Another limitation of causal consistency also stems from the possibility of concurrent operations. Programmers must decide how to deal
with concurrent write operations to
the same data at different datacenters. A common strategy is the last-writer-wins rule in which one concurrent update overwrites the other. For
example, a social-network user can
have only one birthday. Some situations, however, require a more careful
approach. Consider a scenario where
Alice has two pending friend requests
being accepted concurrently at different datacenters. Each accepted friend
request should increase Alice’s friend
count by one. With the last-writer-wins
rule, however, one of the increments
will overwrite the other. Instead, the
two increments must be merged to
increase Alice’s total friend count by
two. With causally consistent storage
(as with eventually consistent storage), programmers must determine if
the last-writer-wins rule is sufficient,
or if they have to write a special function for merging concurrent updates.
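To make the distinction concrete, here is a minimal sketch in Python of the two strategies; the data representation (timestamped updates, an integer friend count) and the function names are illustrative assumptions, not details prescribed by any particular system.

from collections import namedtuple

# Illustrative sketch only: the update representation and function names
# are assumptions, not part of any particular storage system.
Update = namedtuple("Update", ["timestamp", "value"])

def last_writer_wins(a, b):
    # Resolve two concurrent writes by keeping the one with the later
    # timestamp; adequate for a single-valued field such as a birthday.
    return a if a.timestamp > b.timestamp else b

def merged_friend_count(base, increment_a, increment_b):
    # Merge two concurrent increments so both accepted friend requests
    # are reflected in the total, rather than one overwriting the other.
    return base + increment_a + increment_b

print(last_writer_wins(Update(5, "March 3"), Update(7, "March 4")).value)  # "March 4"
# Two friend requests accepted concurrently at different datacenters:
print(merged_friend_count(10, 1, 1))  # 12, not 11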
The final limitation of causal consistency is that it cannot see or enforce causality outside of the system. The classic
example is a cross-country phone call.
If Alice on the West Coast updates her
profile, calls Bob on the East Coast, and
then Bob updates his profile, the system will not see the causal relationship
between the two updates and will not
enforce any ordering between them.
Providing causal consistency. At a high level, our systems, COPS and Eiger, capture causality through a client library and then enforce the observed ordering when replicating writes to other datacenters. The ordering is enforced by delaying the application of a write until all causally previous operations have been applied. This delay is necessary only in remote datacenters; all causally previous operations have already been applied at the datacenter that accepts the write.
The client library that tracks causality
sits between the Web servers and the
storage tiers in each datacenter. (In
current implementations it is on the
Web servers.) Individual clients are
identified through a special actor_id
field in the API to the client library
that allows the operations of different
users on the same Web server to be
disentangled. For example, in a social
network the unique user ID could be
used as the actor_id.
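As a rough sketch of that interface shape (the put and get method names here are hypothetical; only the actor_id field comes from the description above):

class CausalClientLibrary:
    # Minimal sketch of the client-library interface; every call carries
    # an actor_id so that operations by different users on the same Web
    # server can be disentangled. Method names are hypothetical.
    def put(self, actor_id, key, value): ...
    def get(self, actor_id, key): ...

lib = CausalClientLibrary()
lib.put(actor_id="user:alice", key="wall:alice", value="Found my ring!")
lib.get(actor_id="user:bob", key="wall:alice")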
Let’s first describe an inefficient
system that provides causality and
then explain how to refine it to make
it efficient.
Our systems operate by tracking and enforcing the ordering only between write operations. Read operations establish causal links between write operations by different clients, but they are not replicated to other datacenters and thus do not need to have an ordering enforced on them. For example, in anomaly/regularity 1, Bob's read (Op3) of Alice's post (Op1) and comment (Op2) creates the causal link that orders Bob's later comment (Op4) after Alice's post and comment. A causal link between two write operations is called a dependency: the later operation depends on the earlier operation.
Figure 6 shows the relationship between the graph of causality and the graph of dependencies. A dependency is a small piece of metadata that uniquely identifies a write operation. It has two fields: a key, which is the data location that is updated by the write; and a timestamp, which is a globally unique logical timestamp assigned by the logical clock of the server in the datacenter where it was originally written. Figure 6 illustrates (a) a set of example operations; (b) the graph of causality between them; (c) the corresponding dependency graph; and (d) a table listing dependencies, with one-hop dependencies shown in bold.
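For illustration, a dependency might be encoded as follows; the (logical clock, server ID) pairing is just one plausible way to make the timestamp globally unique, not a detail taken from the text above.

from dataclasses import dataclass

@dataclass(frozen=True)
class Dependency:
    # Sketch of the dependency metadata described above.
    key: str          # the data location updated by the write
    timestamp: tuple  # e.g., (logical clock value, originating server ID)

# Example: a write to Alice's wall assigned logical time 12 by server "west-1".
dep = Dependency(key="wall:alice", timestamp=(12, "west-1"))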
In the initial design the client library tracks the full set of dependencies for each client. Tracking all dependencies for a client requires tracking three sets of write operations:
1. All of the client's previous write operations, because of the thread-of-execution rule.
2. All of the operations that wrote values it read, because of the reads-from rule.
3. All of the operations that the operations in 1 and 2 depend on, because of the transitivity rule.
Tracking the first set is straightforward: servers return the unique timestamp assigned to each write to the client library, which then adds a dependency on that write. Tracking the second set is also straightforward: servers return the timestamp of the write that wrote the value when they respond to reads, and the client library then adds a dependency on that write. The third set of operations is a bit trickier: it requires that every write carry with it all of its dependencies, that these dependencies be stored with the value and returned with reads of that value, and that they then be added to the reader's set of dependencies by the client library.
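Putting the three rules together, a client library along the lines of the interface sketched earlier might track dependencies as follows. The storage-tier responses (a timestamp returned on writes; a value, timestamp, and stored dependency set returned on reads) are hypothetical names chosen for illustration, and the Dependency record is the one sketched above.

class CausalClientLibrary:
    # Inefficient initial design: track and attach the full dependency set.
    def __init__(self, storage):
        self.storage = storage
        self.deps = {}  # actor_id -> set of Dependency

    def put(self, actor_id, key, value):
        context = self.deps.setdefault(actor_id, set())
        # Attach the client's full current dependency set to the write.
        ts = self.storage.put(key, value, dependencies=frozenset(context))
        # Rule 1 (thread-of-execution): depend on the client's own write.
        context.add(Dependency(key=key, timestamp=ts))
        return ts

    def get(self, actor_id, key):
        context = self.deps.setdefault(actor_id, set())
        value, ts, stored_deps = self.storage.get(key)
        # Rule 2 (reads-from): depend on the write that produced this value.
        context.add(Dependency(key=key, timestamp=ts))
        # Rule 3 (transitivity): inherit that write's own dependencies,
        # which were stored with the value and returned with the read.
        context.update(stored_deps)
        return value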
With the full set of dependencies
for each client stored in its client library, all of these dependencies can
be attached to each write operation
the client issues. Now when a server
in a remote datacenter receives a write
with its full set of dependencies, it
blocks the write and verifies that each dependency is satisfied. Blocking these
replicated write operations is acceptable because they are not client-facing
and do not block reads to whatever
data they update. Here, we have explicitly chosen to delay these write
Figure 7. Regularity 1 redux. (Figure shows Alice's posts "I've lost my wedding ring" and "Whew found it upstairs!" and Bob's reply "I'm glad to hear that" at the West Coast and East Coast datacenters, with time running downward.)
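To make the blocking step concrete, the sketch below models a remote datacenter that holds each replicated write until every dependency it carries has been applied locally. It is an illustrative in-memory model with hypothetical names, not the actual COPS or Eiger code.

import threading

class RemoteReplica:
    def __init__(self):
        self.data = {}       # key -> latest applied value
        self.applied = {}    # key -> set of applied timestamps
        self.cond = threading.Condition()

    def _satisfied(self, dep):
        return dep.timestamp in self.applied.get(dep.key, set())

    def apply_replicated_write(self, key, value, timestamp, dependencies):
        with self.cond:
            # Block this (non-client-facing) replicated write until all
            # causally previous operations have been applied locally.
            self.cond.wait_for(
                lambda: all(self._satisfied(d) for d in dependencies))
            # Safe to apply: readers can no longer observe the write
            # without also being able to observe everything it depends on.
            self.data[key] = value
            self.applied.setdefault(key, set()).add(timestamp)
            # Wake any replicated writes waiting on this one.
            self.cond.notify_all()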