– Synchronization source actions: sync-source(t, o) by
thread t creates a synchronizes-with source by writing
to a synchronization variable o. Lock releases and volatile variable writes in Java are synchronization source
actions.
– Synchronization sink actions: sync-sink(t, o) by thread
t creates a synchronizes-with sink by reading from a
synchronization variable o. Lock acquires and volatile variable reads in Java are synchronization sink
actions.
Multithreaded executions: An execution E is represented by
a tuple E Tid, Act, W, →po .,→so ..
• Tid is the set of identifiers for threads involved in the
execution. Each newly forked thread is given a new
unique id from Tid.
• Act is the set of actions that occur in this execution. Act|t
is the set of actions performed by t ∈ Tid, and Act|x (resp.
Act|o) are the sets of actions performed on data variable x
(resp. synchronization variable o).
• W is a total function that maps each read(t, x, v) in Act to
a write(u, x, v) in Act. W is used to model the write-seen
relationship between a read of x and the write to x it
sees. In a race-free, sequentially consistent execution,
this is the last write before read(t, x, v). In order to make
the function W total, we assume an initial write for each
variable before any reads.
• →po t is the program order per thread t. For each thread t,
→po t is a total order over Act|t and gives in which order the
actions were issued to execute. This order is sometimes
referred to as the observed execution order.
• →so o is the synchronization order per synchronization
variable o ∈ Sync. For each o ∈ Sync, →so o is a total order
over Act|o.
Synchronizes-With and happens-Before: Given an execution with program and synchronization orders, we extract
two additional orders called the synchronizes-with (→sw ) and
happens-before (→hb ) orders. Data races are defined using
these orders.
A synchronization operation a1 by thread t1
synchronizes with a2 by thread t2, denoted a1→sw a2, if a1 is a
sync-source on some synchronization variable o, a2 is a
sync-sink on o, and a1→soo a2.
The happens-before partial order →hb on the execution E is
defined as the transitive closure of the program orders →po t for
all t ∈ Tid and the synchronizes-with order →sw .
In this paper, we focus only on well-formed executions, 10
which respect (i) the intra-thread semantics and (ii) the
semantics of the synchronization variables and operations.
In addition, well-formed executions satisfy two essential
requirements for data-race detection:
• Happens-before consistency: This property makes use of
the happens-before order to restrict the write-seen rela-
tionship. For example, for a read action a, a →hb W(a)
cannot happen, and W(a) cannot be overwritten by
another write action b such that W(a) →hb b →hb a.
• The union of the program orders for all t ∈ Tid and the
synchronization orders for all variables o ∈ Sync is a valid
partial order. During an execution, our data-race detection algorithm examines a linearization of this partial
order and identifies the happens-before edges between
data accesses.
Sequential Consistency: Sequential consistency is a prop-
erty that allows programmers to use an interleaving model
of execution where accesses from different threads are inter-
leaved into a total order, and every read sees the value of the
most recent write. Sequential consistency is widely consid-
ered to be the only simple-enough model with which writing
useful concurrent programs is practical. Formally, an execu-
tion E Tid, A, W, →po ., →so . is sequentially consistent if there
exists a total order →SC over Act satisfying the following:
• For every thread t ∈ Tid, →SC respects the program order
→po t, i.e., →po t ⊆ →SC .
Data races: Two data variable accesses are called conflicting
if they refer to the same shared data variable and at least one
of them is a write access.
One frequently used definition of a race condition involves
a program state in which two conflicting accesses by two different threads to a shared data variable are simultaneously
enabled. To distinguish this definition from others, let us refer
to this condition as a simultaneity race. The definition of a race
condition used in most work on dynamic race detection is
what we call a happens-before race and involves two conflicting
accesses not ordered by the happens before relationship, i.e.,
not separated by proper synchronization operations. For C++,
these two definitions of a race condition have been shown to
be equivalent. 2 This proof also generalizes to Java executions.
Formally, an execution E Tid, Act, W, →po , →so . contains a
happens-before race if there are two conflicting actions,
a, b ∈ Act|x accessing a data variable x, such that neither
a →hb b nor b →hb a holds. Conversely, the execution is race free
if every pair of conflicting accesses to a data variable are
ordered by happens-before.
The well-formedness of an execution guarantees that if
the execution has no race conditions, then it is sequentially
consistent. The Goldilocks runtime makes use of this and
the DataRaceException to guarantee for all programs
(whether racy or not) that every concurrent execution is
sequentially consistent at the byte-code level. This does not
restrict the Goldilocks runtime’s use as a debugging tool,
because, for the Java and C++ memory models, it has been
proven2, 10 that if a program has a racy execution, then it is
guaranteed to have at least one execution that is sequentially
consistent and racy. Thus, it is sufficient to restrict one’s
attention to looking for races in sequentially consistent executions only.
3. the GoLDiLocKs aLGoRithm
In this section, we describe our algorithm for detecting data
races in an execution E Tid, Act, W, →po ., →so .. For simplicity