incorporate a precise yet efficient race detection mechanism. In this context, false positives in race detection cannot be tolerated. The second contribution of our work is the
Goldilocks algorithm, a novel, precise, and general algorithm for detecting data races at runtime. In Elmas et al., 6
we presented an implementation of the Goldilocks algorithm in a Java Virtual Machine (JVM) called Kaffe. 19 Our
experiments with Goldilocks on benchmarks brought up
the new possibility that the overhead of post-deployment
precise race detection in a runtime may be tolerable. There
has been significant progress in the efficiency of precise race
detection since the Goldilocks runtime was first published
(see Flanagan and Freund, and Pozmiansky and Schuster, 7, 14
for example) and this idea appears viable today.
The Goldilocks algorithm is based on an intuitive, general representation for the happens-before relationship as
a generalized lockset (Goldilockset) for each variable. In
the traditional use of the term, a lockset for a shared variable x at a point in an execution contains a set of locks. A
thread can perform a race-free access to x at that point by
first acquiring a lock in this lockset. A Goldilockset is a generalization of a lockset. In Java, locks and volatile variables
are synchronization objects, and acquiring and releasing a
lock, as well as reading from and writing to a volatile variable, are synchronization operations. To reflect this, at each
point in an execution, the Goldilockset for a shared variable
x may contain thread ids, locks, and volatile variables. A
thread can perform a race-free access to x iff its thread id
is in the Goldilockset, or if it first acquires a lock that is in
the Goldilockset, or reads a volatile variable that is in the
Goldilockset. In other words, the Goldilockset indicates
the threads that have the ownership of x and the synchronization objects that protect access to x at that point. The
Goldilockset is updated during the execution as synchronization operations are performed. As a result, Goldilocksets
are a compact, intuitive way to precisely represent the
happens-before relationship. Thread-local variables, variables protected by different locks at different points of the
execution, and event-based synchronization with condition variables are all uniformly handled by Goldilocks.
Furthermore, Goldilocksets can easily be generalized to
handle other synchronization primitives such as software
transactions18 and adapted to handle memory models of
languages other than Java, such as C++MM. To facilitate
this, in this paper (differently from Elmas et al. 6) we present
Goldilocks on a generic memory model and then show
how the algorithm can be specialized to JMM.
1. 1. Related work
Dynamic race Detection: There are two approaches to
dynamic data-race detection, one based on locksets and the
other based on the happens-before relation. Eraser16 is a
well-known lockset-based algorithm for detecting race conditions dynamically by enforcing the locking discipline that
every shared variable is protected by a unique lock. In spite
of the numerous papers that refined the Eraser algorithm to
reduce the number of false alarms, there are still cases, such
as dynamically changing locksets, that cannot be handled
precisely. Precise lockset algorithms exist for Cilk programs, 3
86 communications of the acm | noVemBer 2010 | vol. 53 | no. 11
but they cannot handle concurrency patterns implemented
using volatile variables such as barrier synchronization.
There is a significant body of research on dynamic data-race detection based on computing the happens-before
relation4, 7, 14, 15, 17 using vector clocks. 12 Hybrid techniques14, 20
combine lockset and happens-before analysis. For example, RaceTrack20 uses a basic vector clock algorithm to
capture thread-local accesses to objects thereby eliminating unnecessary and imprecise applications of the Eraser
algorithm. Similarly, MultiRace14 presents djit+, a vector
clock algorithm with several optimizations to reduce the
number of checks at an access, including keeping distinct
vector clocks for reads and writes and using a lockset algorithm as a fast-path check. To the best of our knowledge,
Fast Track, 7 which builds on djit+, is the best-performing
vector clock-based algorithm in the literature. By exploiting some access patterns, Fast Track reduces the cost of
vector clock updates to O( 1) on average. We provide a qualitative comparison of the Goldilocks and Fast Track algorithms in Section 4. 3. Vector clock and Goldilocks are
both precise, but the generalized locksets in Goldilocks
provide an intuitive representation of how shared variables
are protected at each point the execution.
Concurrency-related exceptions: Since proposed first by
the authors in Elmas et al., 6 the idea that programming platforms should be able to guarantee sequential consistency
for all programs has gained significant support. Marino
et.al. 11 and Lucia et.al. 9 provide platforms with explicit memory model exceptions. Both platforms define stronger but
simpler contracts than JMM and C++MM, which enable efficient hardware implementations that support the memory
model exceptions.
2. a GeneRic memoRy moDeL
In this section, we present a generic memory model and
express the JMM as a special case of it. This generic model
allows a uniform treatment of the various synchronization
constructs in Java. We also believe that memory models at
different levels (e.g., the hardware level) and for different
languages (e.g., C++MM) can be expressed as instances of
this model. This allows Goldilocks to be applied in these
settings directly.
Variables and actions: Program variables are separated into
two categories: data variables (Data) and synchronization
variables (Sync). We use x, and o to refer to data and syn-
chronization variables, respectively. Threads in a program
execute actions from the following categories:
• Data variable accesses: read(t, x, v) by thread t reads the
current value v of a data variable x, and write(t, x, v) by
thread t writes the value v to x.