pos(y)
figure 4. the synchronization event list.
pos(z)
head tail
pos(x)
pos(w)
each synchronization variable o.c When a thread performs a
synchronization action a, it must append a corresponding
event to the synchronization-event list atomically with the
event. In Kaffe, we make sure this is the case by modifying
the implementations of the Java synchronization actions.
In order to represent GLS(x), each variable x in the program is associated with two bits of information regarding
the most recent access to x: owner(x) stores the id of the
thread that most recently accessed x, and pos(x) points to
the last synchronization event in the list that was taken into
account when GLS(x) was last computed.
Figure 4 shows four variables pointing to entries in
the synchronization-event list. Figure 5 shows how the
Goldilockset GLS(x) is computed when x is accessed.
5(a): After each access to x by a thread ti, owner(x) is set to ti, and
pos(x) is set to point the tail of the synchronization event list.
5(b): Right before an access to x by thread tj, temporarily,
we represent GLS(x) explicitly. GLS(x) is initially {owner(x)}
and is updated by processing the synchronization events
between pos(x) (denoted by a1, …, an) and tail according to
the rules 1 and 2 of Figure 2. This process stops either when
head tail
figure 5. Lazy evaluation of Goldilockset GLS(x).
head tail tail pos(x) owner(x) = ti (a) After ti accesses x a1 a2 a2 a1 an an
head
pos(x)
owner(x) = ti
Events to consider
in lockset evaluation
(c) After ti accesses x
pos(x)
owner(x) = ti
c For Java, there is a total order on all synchronization operations, and the entries in the list are in this order.
tj is added to GLS(x) or the last event (an) is processed. In the
former case, no race is reported according to the rule 3 of
Figure 2. In the latter case, a race is reported since tj ∉ GLS(x)
after the evaluation.
5(c): After the check, owner(x) is set to tj and pos(x) is set to
the tail of the synchronization event list.
The implementation does not use any extra threads for
race detection. The algorithm is performed in a decentralized manner by instrumented threads of the program being
checked. For each data variable x, we use a unique lock to
make atomic the Goldilockset update and the race-freedom
check for each access to x and to serialize all the race-freedom
checks for x.
4. 2. Performance optimizations
Short-Circuit Checks: A cheap-to-evaluate sufficient condition for a happens-before edge between the last two
accesses to a variable can reduce race-detection overhead.
We make use of two such conditions, called short-circuit
checks, and bypass the traversal of the synchronization
event list when these checks succeed. In this case, the final
Goldilockset of the variable consists of the id of the thread
that accessed it last.
We employ two constant-time short-circuit checks. First,
when the last two accesses to a shared variable are performed by the same thread t, the happens-before relationship is guaranteed by the program order of t. This is detected
by checking whether owner(t), the last accessor thread, is the
same as the thread performing the current access.
In the second short-circuit check, we determine whether
the variable x is protected by the same lock during the last
two accesses to x. For this, we associate with each variable x
a lock alock(x), which is randomly selected among the locks
held by the most recent accessor thread. When a thread t
accesses x and if alock(x) is held by t, then that access is race
free.
Direct ownership Transfer: A sound but imprecise third
optimization is to consider only the subset of synchronization events executed by the current and last accessing
thread when examining the portion of the synchronization
event list between pos(x) and tail. This check is not constant
time, but we found that it succeeds often enough to improve
Goldilocks overhead.
garbage Collection: The synchronization events list is periodically garbage-collected when there are entries in the
beginning of the list that are not relevant for the Goldilockset
computation of any variable. This is the case when an entry
in the list is not reachable from pos(x) for any data variable x,
and is tracked by maintaining incremental reference counts
for each list entry.
Partially eager evaluation: Sometimes the synchronization
event list gets too long and it is not possible to garbage-collect the event list when variable x is accessed early in
an execution but is not used afterwards. We address this
problem by “partially eager” Goldilockset evaluation. We
move pos(x) forward towards the tail to a new position
poś(x), and partially evaluate a Goldilockset GLS(x) of x by
processing events (i.e., running ApplyLocksetRules on them)