The Scalable Commutativity Rule:
Designing Scalable Software for
By Austin T. Clements, M. Frans Kaashoek, Eddie Kohler, Robert T. Morris, and Nickolai Zeldovich
Developing software that scales on multicore processors
is an inexact science dominated by guesswork, measurement, and expensive cycles of redesign and reimplementa-tion. Current approaches are workload-driven and, hence,
can reveal scalability bottlenecks only for known workloads
and available software and hardware. This paper introduces
an interface-driven approach to building scalable software.
This approach is based on the scalable commutativity rule,
which, informally stated, says that whenever interface operations commute, they can be implemented in a way that
scales. We formalize this rule and prove it correct for any
machine on which conflict-free operations scale, such as
current cache-coherent multicore machines. The rule also
enables a better design process for scalable software: programmers can now reason about scalability from the earliest stages of interface definition through software design,
implementation, and evaluation.
Until the mid-2000s, continuously rising CPU clock
speeds made sequential software perform faster with
each new hardware generation. But higher clock speeds
require more power and generate more heat, and around
2005 clock speeds reached the thermal dissipation limits of a few square centimeters of silicon. CPU architects
have not significantly increased clock speeds since, but
the number of transistors that can be placed on a chip
has continued to rise. Architects now increase parallelism by putting more CPU cores on each chip. Total cycles
per second continues to grow exponentially, but software must scale—must take advantage of parallel CPU
resources—to benefit from this growth.
Unfortunately, scaling is still an untamed problem. Even
with careful engineering, software rarely achieves the holy
grail of linear scalability, where doubling hardware parallelism doubles software performance.
Engineering scalable systems software is particularly
challenging. Systems software, such as operating system
kernels and databases, presents services to applications
through well-defined interfaces. Designers rarely know
ahead of time how applications will use these interfaces,
and thus often cannot predict what bottlenecks to multicore
scalability will arise. Furthermore, scaling bottlenecks may
be a consequence of the definition of the interface itself;
such problems are particularly difficult to address once
many applications depend on the interface.
Lack of a principled way to reason about scalability
hampers all phases of systems software development:
defining an interface, implementing the interface, and
testing its scalability.
When defining an interface, developers lack a systematic way of deciding whether a given definition will allow
for scalable implementations. Demonstrating a scalability bottleneck requires a complete implementation and a
workload. By the time these are available, interface changes
may no longer be practical: many applications may rely on
the existing interface, and applications that trigger the bottleneck may not be important enough to warrant an interface change.
During design and implementation, developers lack a
systematic way to spot situations in which perfect scalability is achievable. This makes it hard to design an implementation to be scalable from the start. Instead, over time
developers must iteratively improve the software’s parallel
performance as specific workloads uncover bottlenecks,
often re-implementing the software multiple times.
While testing, developers lack a systematic way of
evaluating scalability. The state of the art for testing the
scalability of multicore software is to choose a workload,
plot performance at varying numbers of cores, and use
tools such as differential profiling13 to identify scalability bottlenecks exhibited by that workload. Each new
hardware model or workload, however, may expose new
This paper presents a new approach to designing scalable software that starts with the design of scalable software
interfaces. This approach makes it possible to reason about
multicore scalability before an implementation exists,
and even before the necessary hardware is available. It can
highlight inherent scalability problems, leading to better
interface designs. It sets a clear scaling target for the implementation of a scalable interface. Finally, it enables systematic testing of an implementation’s scalability.
At the core of our approach is what we call the scalable
commutativity rule: In any situation where several operations commute (meaning there is no way to distinguish their
execution order using the interface), there exists an implementation that is conflict-free during those operations
(meaning no core writes a cache line that was read or written
The original version of this paper was published in the
Proceedings of the 24th ACM Symposium on Operating
Systems Principles (SOSP’ 13).