the server and to use a unique name
generated in the server. The causal
history associated with the new version is the union of the causal history
of the client and the newly assigned
unique name.
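To make this rule concrete, here is a minimal sketch in Python (the class name Server, the method put, and the s1, s2 naming scheme are illustrative assumptions, not the article's code) of a server that registers each write under a fresh unique name and discards the versions the new write dominates:

```python
import itertools

class Server:
    def __init__(self, name):
        self.name = name
        self.counter = itertools.count(1)  # yields 1, 2, ... for names s1, s2, ...
        self.versions = []                 # list of (causal_history, value) pairs

    def put(self, client_history, value):
        # Register the write under a unique name generated in the server,
        # e.g. "s1" on server "s".
        dot = f"{self.name}{next(self.counter)}"
        # The new version's causal history is the union of the history the
        # client read and the newly assigned unique name.
        history = set(client_history) | {dot}
        # Discard versions whose history is included in the new one; the
        # rest are kept as concurrent siblings.
        self.versions = [(h, v) for h, v in self.versions if not h <= history]
        self.versions.append((history, value))
        return history
```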
Figure 7 shows an example where
clients A and B concurrently update
server S. When client B first writes its version, a new unique name, s1, is created (this action is marked in the figure) and merged with the causal history read by the client, {}, leading to the causal history {s1}.
When client A later writes its version,
the causal history assigned to this version is the causal history at the client,
{}, merged with the new unique name
s2, leading to {s2}. Using the normal
rules for checking for concurrent
updates, these two versions are concurrent. In the example, the system
keeps both concurrent updates. For
simplicity, the interactions of server T with its own clients are omitted, but as the figure shows, before receiving data from server S, server T held a single version reflecting the three updates it had managed—causal history {t1, t2, t3}—and afterward it holds two concurrent versions.
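Replaying the client side of this example with the hypothetical sketch above: both clients read the empty history, so their writes receive the unique names s1 and s2, and neither resulting history includes the other.

```python
S = Server("s")
history_B = set()               # client B read before any update: {}
history_A = set()               # client A read the same empty history
S.put(history_B, "B's value")   # first write  -> causal history {s1}
S.put(history_A, "A's value")   # second write -> causal history {s2}
# Neither {s1} <= {s2} nor {s2} <= {s1}, so the server keeps both versions.
assert [h for h, _ in S.versions] == [{"s1"}, {"s2"}]
```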
One important observation is that, in each node, the union of the causal histories of all versions includes all unique names generated up to the last known one: for example, in server S, after both clients send their new versions, all unique names generated in S are known. Thus, the causal past of any update can always be represented compactly as a vector, since it is the union of the causal histories of all versions known at some server when the client read the object. The combination of the causal past, represented as a vector, and the last event, kept outside the vector, is known as a dotted version vector.2,13 Figure 8 shows the previous example using this representation, which, as the system keeps running, eventually becomes much more compact than causal histories.
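A minimal sketch of this representation follows; the function names expand and concurrent are assumptions, and names are written as (server, counter) pairs rather than strings. A dotted version vector is a vector covering the contiguous causal past plus a single dot for the last event, and it can always be translated back to the causal history it stands for:

```python
def expand(dvv):
    """Translate a dotted version vector back into the causal history
    it encodes: every name covered by the vector, plus the dot."""
    vector, dot = dvv
    history = {(s, i) for s, n in vector.items() for i in range(1, n + 1)}
    history.add(dot)
    return history

def concurrent(dvv_a, dvv_b):
    """Two versions are concurrent when neither one's causal history
    contains the other's last event (its dot)."""
    (_, dot_a), (_, dot_b) = dvv_a, dvv_b
    return dot_a not in expand(dvv_b) and dot_b not in expand(dvv_a)

# The two versions of the example: an empty vector plus dots s1 and s2.
version_b = ({}, ("s", 1))     # encodes causal history {s1}
version_a = ({}, ("s", 2))     # encodes causal history {s2}
assert concurrent(version_a, version_b)
```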
Under the conditions stated earlier (clients communicate only with servers, and a new update overwrites all versions previously read), which are common in key-value stores where multiple clients interact with storage nodes through a get/put interface, dotted version vectors allow causality to be tracked between the written versions with vectors whose size is the number of servers.
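As a usage sketch of such a get/put interface (KVStore and its method shapes are assumptions for illustration, building on the Server class above): get returns the concurrent values together with a causal context, and put hands that context back so the write overwrites exactly the versions that were read. In a real store the context would be encoded as the compact vector-plus-dot form shown earlier rather than as an explicit set of names.

```python
class KVStore(Server):
    def get(self):
        """Return all concurrent values plus the causal context of the
        read: the union of the causal histories of the versions."""
        values = [v for _, v in self.versions]
        context = set().union(*(h for h, _ in self.versions))
        return values, context

store = KVStore("s")
_, ctx = store.get()                  # empty store: no values, empty context
store.put(ctx, "draft 1")             # write 1 -> history {s1}
values, ctx = store.get()             # read back: ["draft 1"], context {s1}
store.put(ctx, "draft 2")             # write 2 -> {s1, s2}, overwrites draft 1
assert store.get()[0] == ["draft 2"]
```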
Final Remarks
Tracking causality should not be ignored. It is important in the design of many distributed algorithms, and failing to respect causality can lead to strange behaviors for users, as reported by multiple authors.1,9
The mechanisms for tracking causality and the rules they use are often seen as complex,6,15 and their presentation is not always intuitive. Yet the most commonly used mechanisms for tracking causality—vector clocks and version vectors—are simply optimized representations of causal histories, which are easy to understand.
By building on the notion of causal histories, you can begin to see the logic behind these mechanisms, identify how they differ, and even consider possible optimizations. When confronted with an unfamiliar causality-tracking mechanism, or when trying
to design a new system that requires
it, readers should ask two simple
questions: Which events need tracking? How does the mechanism translate back to a simple causal history?
Without a simple mental image for
guidance, errors and misconceptions
become more common. Sometimes,
all you need is the right language.
Acknowledgments
We would like to thank Rodrigo Rodrigues, Marc Shapiro, Russell Brown,
Sean Cribbs, and Justin Sheehy for
their feedback. This work was partially supported by the EU FP7 SyncFree project (609551) and FCT/MCT projects UID/CEC/04516/2013 and UID/EEA/50014/2013.
Related articles
on queue.acm.org
The Inevitability of Reconfigurable Systems
Nick Tredennick, Brion Shimamoto
http://queue.acm.org/detail.cfm?id=957767
Abstraction in Hardware System Design
Rishiyur S. Nikhil
http://queue.acm.org/detail.cfm?id=2020861
Eventually Consistent: Not What You Were
Expecting?
Wojciech Golab, et al.
http://queue.acm.org/detail.cfm?id=2582994
References
1. Ajoux, P., Bronson, N., Kumar, S., Lloyd, W.,
Veeraraghavan, K. Challenges to adopting stronger
consistency at scale. In Proceedings of the 15th
Workshop on Hot Topics in Operating Systems, Kartause
Ittingen, Switzerland. Usenix Association, 2015.
2. Almeida, P. S., Baquero, C., Gonçalves, R., Preguiça,
N. M., Fonte, V. Scalable and accurate causality tracking
for eventually consistent stores. In Proceedings of the
Distributed Applications and Interoperable Systems,
held as part of the Ninth International Federated
Conference on Distributed Computing Techniques
(Berlin, Germany, 2014), 67–81.
3. Birman, K.P., Joseph, T.A. Reliable communication
in the presence of failures. ACM Transactions on
Computer Systems 5, 1 (1987), 47–76.
4. Charron-Bost, B. Concerning the size of logical clocks
in distributed systems. Information Processing Letters
39, 1 (1991), 11–16.
5. Fidge, C.J. Timestamps in message-passing systems
that preserve the partial ordering. Proceedings of the
11th Australian Computer Science Conference 10, 1
(1988), 56–66.
6. Fink, B. Why vector clocks are easy. Basho Blog, 2010; http://basho.com/posts/technical/why-vector-clocks-are-easy/.
7. Hoff, T. How League of Legends scaled chat
to 70 million players—it takes lots of minions.
High Scalability; http://highscalability.com/blog/2014/10/13/how-league-of-legends-scaled-chat-to-70-million-players-it-t.html.
8. Lamport, L. Time, clocks, and the ordering of events in
a distributed system. Communications of the ACM 21,
7 (1978), 558–565.
9. Lloyd, W., Freedman, M.J., Kaminsky, M., Andersen,
D.G. Don’t settle for eventual: Scalable causal
consistency for wide-area storage with COPS. In
Proceedings of the 23rd ACM Symposium on Operating
Systems Principles (New York, NY, 2011), 401–416.
10. Mattern, F. Virtual time and global states in distributed
systems. In Proceedings of the International
Workshop on Parallel and Distributed Algorithms
(Gers, France, 1988), 215–226.
11. Neville-Neil, G. Time is an illusion. acmqueue 13, 9 (2015), 57–72.
12. Parker, D.S. et al. Detection of mutual inconsistency in
distributed systems. IEEE Transactions on Software
Engineering 9, 3 (1983), 240–247.
13. Preguiça, N. M., Baquero, C., Almeida, P. S., Fonte, V.,
Gonçalves, R. Brief announcement: Efficient causality
tracking in distributed storage systems with dotted
version vectors. In ACM Symposium on Principles of
Distributed Computing. D. Kowalski and A. Panconesi,
Eds. (2012), 335–336.
14. Schwarz, R., Mattern, F. Detecting causal relationships
in distributed computations: in search of the Holy
Grail. Distributed Computing 7, 3 (1994), 149–174.
15. Sheehy, J. Why vector clocks are hard. Basho Blog, 2010; http://basho.com/posts/technical/why-vector-clocks-are-hard/.
16. Sheehy, J. There is no now. acmqueue 13, 3 (2015),
20–27.
Carlos Baquero (cbm@di.uminho.pt) is assistant
professor of computer science and senior researcher at
the High-Assurance Software Laboratory, Universidade
do Minho and INESC Tec. His research interests are
focused on distributed systems, in particular causality
tracking, data types for eventual consistency, and
distributed data aggregation.
Nuno Preguiça (nuno.preguica@fct.unl.pt) is associate
professor in the Department of Computer Science,
Faculty of Science and Technology, Universidade NOVA
de Lisboa, and leads the computer systems group at
NOVA Laboratory for Computer Science and Informatics.
His research interests are focused on the problems of
replicated data management and processing of large
amounts of information in distributed systems and mobile
computing settings.
Copyright held by authors.
Publication rights licensed to ACM.