between pos(x) and poś(x). During the next access the evalu-tion of GLS(x) starts from the stored Goldilockset, not from
{owner(x)}.
Sound Static race analysis: The runtime overhead of race
detection is directly related to the number of data variable
accesses checked and synchronization events that occur.
To reduce the number of accesses checked at runtime, we
use static analysis at compile time to determine accesses
that are guaranteed to be race free. While implementing
Goldilocks in Kaffe, we worked with two static analysis
tools for this purpose: Chord13 and RccJava. 1
4. 3. Race-detection overhead
At the time of the original Goldilocks work, the vector clock
algorithm12 was the only precise dynamic-race-detection
algorithm in the literature. The vector clock algorithm, for
an execution with n threads, requires for every thread and
synchronization variable a separate vector clock (VC) of
size n and performs O(n) operations (merging or comparing two VCs) whenever a synchronization operation or data
access happens. In preliminary research, compared to a
straightforward implementation of vector clocks, we found
Goldilocks overhead to be significantly less. 6
In Elmas et al., 6 we measured the overhead of the
Goldilocks implementation inside Kaffe on a set of widely
used Java benchmarks. This implementation required us to
run all programs in interpreted (not just-in-time compiled)
mode. We found that, with powerful static analysis tools
eliminating much of the monitoring, we were able to obtain
a slowdown of within approximately 2 for all benchmarks.
Without static elimination of some checks, overheads
remained high; some benchmarks experienced slowdowns
of over 15. The overhead results with static pre-elimination
were encouraging in that they showed precise race detection
to be a practical debugging tool, and they indicated that,
with further optimizations, post-deployment runtime race
detection to support DataRaceDetection could be viable.
Later work on FastTrack, 7 a dynamic race detector
based on vector clocks, is able to avoid worst-case performance of vector clocks much of the time using optimizations for common cases. Flanagan and Freund7 compare
a number of race-detection algorithms, including just-in-time compiled implementations of FastTrack and
Goldilocks in RoadRunner. FastTrack achieves
significantly better overheads than both implementations of Goldilocks. The low overheads achieved by
FastTrack provide further support that a practical
race-aware runtime for deployed programs supporting
a DataRaceException can be built. It is reported in
Flanagan and Freund7 that additional short-circuit checks
similar to ones we discussed above dramatically reduce the
runtime of Fast Track. Most of these checks can be incorporated into Goldilocks implementations as well.
5. concLusion
We have presented a race-aware runtime for Java incorporating a novel algorithm, Goldilocks, for precise dynamic race detection. The runtime provides a
DataRaceException, and thus ensures that executions
92 communications of the acm | noVemBer 2010 | vol. 53 | no. 11
remain sequentially consistent at the byte-code level.
Experiments with Goldilocks have demonstrated that the
runtime overhead of supporting a DataRaceException
can be made reasonable.
acknowledgments
We would like to thank Hans Boehm, Cormac Flanagan,
Steve Freund, and Madan Musuvathi for their critique of
this paper. This research was supported by the Software
Reliability Research Group at Microsoft Research,
Redmond, WA, by the Scientific and Technical Research
Council of Turkey (TUBITAK) under grant 104E058, and by
the Turkish Academy of Sciences (TUBA).
1. abadi, m, flanagan, c., freund, s.n.
types for safe locking: static race
detection for Java. ACM Trans.
Program. Lang. Syst. (2006).
2. Boehm, H.-J., adve, s.V. foundations
of the c++ concurrency memory
model. In Proceedings of the 2008
ACM SIGPLAN Conf. on Programming
Language Design and Implementation
(pLDI 2008)
3. cheng, g.-I., feng, m., Leiserson, c.e.,
randall, K.H., stark, a.f. Detecting
data races in cilk programs that use
locks. In ACM Symposium on Parallel
Algorithms and Architectures (spaa
1998).
4. christiaens, m. De Bosschere, K.
traDe: Data race detection for
Java. Proc. Intl. Conference on
Computational Science. V. alexandrov,
J. Dongarra, B. Juliano, r. renner, and
c. tan, eds. (Iccs 2001).
5. elmas, t., Qadeer, s., tasiran, s.
goldilocks: efficiently computing
the happens-before relation using
locksets. Technical Report MSR-
TR-2006–163, Microsoft Research
(2006).
6. elmas, t., Qadeer, s., tasiran, s.
goldilocks: a race and transaction-aware java runtime. In Proceedings of
the 2007 ACM SIGPLAN Conference
on Programming Language Design and
Implementation (pLDI 2007).
7. flanagan, c., freund, s.n. fasttrack:
efficient and precise dynamic race
detection. In Proceedings of the
2009 ACM SIGPLAN Conference on
Programming Language Design and
Implementation (pLDI 2009).
8. flanagan, c., freund, s.n. the
roadrunner dynamic analysis
framework for concurrent programs,
In ACM Workshop on Program
Analysis for Software Tools and
Engineering (paste 2010).
9. Lucia, B., ceze, L., strauss, K., Qadeer,
s., Boehm H.-J. conflict exceptions:
simplifying concurrent language
semantics with precise hardware
exceptions for data-races. In
Proceedings of the 37th International
Symposium on Computer Architecture
(Isca 2010).
10. manson, J., pugh, W., adve, s.V. the
Java memory model. In Proceedings
References
of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles
of Programming Languages (popL
2005).
11. marino, D., singh, a., millstein, t.,
musuvathi, m., narayanasamy, s.
Drfx: a simple and efficient memory
model for concurrent programming
languages. In Proceedings of the
2010 ACM SIGPLAN Conference on
Programming Language Design and
Implementation (pLDI 2010).
12. mattern, f. Virtual time and global
states of distributed systems. In
Proceedings of the International
Workshop on Parallel and Distributed
Algorithms (1988).
13. naik, m., aiken, a., Whaley, J. effective
static race detection for Java. In
Proceedings 2006 ACM SIGPLAN
Conference on Programming Language
Design and Implementation (pLDI
2006).
14. pozniansky, e., schuster, a. multirace:
efficient on-the-fly data race detection
in multithreaded c++ programs:
research articles. Concurr. Comput.
Pract. Exp. (2007).
15. ronsse, m., Bosschere, K.D. recplay:
a fully integrated practical record/
replay system. ACM Trans. Comput.
Syst. (1999).
16. savage, s., Burrows, m., nelson, g.,
sobalvarro, p., anderson, t. eraser:
a dynamic data race detector for
multithreaded programs. ACM Trans.
Comput. Syst. (1997).
17. schonberg, e. on-the-fly detection
of access anomalies. In Proceedings
of the ACM SIGPLAN Conference on
Programming Language Design and
Implementation (pLDI 1989).
18. shavit, n., touitou, D. software
transactional memory. In Symposium
on Principles of Distributed Computing
(1995).
19. Wilkinson, t. Kaffe: a JIt and
interpreting virtual machine to run
Java code. http://www.transvirtual.
com (1998).
20. yu, y., rodeheffer, t., chen, W.
racetrack: efficient detection of data
race conditions via adaptive tracking.
In Proceedings of the 20th ACM
Symposium on Operating Systems
Principles (sosp 2005).
Tayfun Elmas ( telmas@ku.edu.tr), Koç
university, Istanbul, turkey.
Shaz Qadeer ( qadeer@microsoft.com),
microsoft research, redmond, Wa.
Serdar Tasiran ( stasiran@ku.edu.tr), Koç
university, Istanbul, turkey.
© 2010 acm 0001-0782/10/1100 $10.00