worker threads and collect results, while md5-tree and
matmult-tree fork and join workers recursively in a binary
tree structure. The “embarrassingly parallel” md5-tree
performs and scales well with recursive work distribution, while matmult-tree suffers from the cost of large
cross-node data transfers using Determinator’s unopti-mized page copying protocol.
As Figure 10 shows, the shared-memory benchmarks
on Determinator perform comparably to nondeterministic, distributed-memory equivalents running on Linux.
Illustrating the benefits of its shared memory API, the
Determinator version of md5 is 63% of the size of the Linux
version ( 62 lines versus 99), which uses remote shells to
coordinate workers. Determinator’s matmult is 34% of the
size of the Linux version ( 90 lines versus 263), which passes
data using TCP.
5. 4. implementation complexity
To provide a feel for implementation complexity, Table
2 shows source code line counts for Determinator, as well
as its PIOS instructional subset, counting only lines containing semicolons. The entire system is less than 15,000
lines, about half of which is generic C and math library code
needed mainly for porting Unix applications easily.
6. CoNCLusioN
While Determinator is only a proof of concept, it shows
that operating systems can offer a pervasively, naturally
deterministic environment. Determinator avoids introducing data races in shared memory and file system access,
thread and process synchronization, and throughout its
API. Experiments suggest that such an environment can
figure 10. Deterministic shared-memory benchmarks versus
distributed-memory equivalents for Linux.
Speedup over Linux
2
1. 5
1
0.5
0
1 node
2 nodes
4 nodes
8 nodes
16 nodes
md5-tree
md5-circuit matmult-tree
Benchmark
table 2. implementation code size of the Determinator os and of
Pios, its instructional subset
Component
Determinator semicolons
2044
751
2952
6948
1797
14,492
Pios semicolons
1847
647
1079
394
1418
5385
efficiently run coarse-grained parallel applications, both
on multicore machines and across clusters, though running fine-grained applications efficiently may require
hardware evolution.
Acknowledgments
We thank Zhong Shao, Ramakrishna Gummadi, Frans
Kaashoek, Nickolai Zeldovich, Sam King, and the OSDI
reviewers for their valuable feedback. We thank ONR and
NSF for their support under grants N00014-09-10757 and
CNS-1017206.
References
1. amza, c. et al. treadMarks: Shared
memory computing on networks of
workstations. IEEE Computer 29, 2
(feb. 1996), 18–28.
2. artho, c., havelund, k., biere, a.
high-level data races. in Workshop
on Verification and Validation of
Enterprise Information Systems
(VVEIS) (apr. 2003), 82–93.
3. aviram, a., ford, b., Zhang, y. Workspace
consistency: a programming model
for shared memory parallelism. in
2nd Workshop on Determinism and
Correctness in Parallel Programming
(WoDet) (Mar. 2011).
4. aviram, a., hu, S., ford, b., gummadi,
r. determinating timing channels
in compute clouds. in ACM Cloud
Computing Security Workshop
(CCS W) (oct. 2010).
5. bellard, f. QeMU, a fast and Portable
dynamic translator, in USENIX
Annual Technical Conference,
anaheim, ca, april 10-15, 2005.
6. beltrametti, M., bobey, k., Zorbas, J.r.
the control mechanism for the Myrias
parallel computer system. Comput.
Architect. News 16, 4 (Sept. 1988),
21–30.
7. berenson, h. et al. a critique of anSi
SQl isolation levels. in SIGMOD
(June 1995).
8. bergan, t. et al. coredet: a compiler
and runtime system for deterministic
multithreaded execution. in 15th
International Conference on
Architectural Support for Programming
Languages and Operating Systems
(ASPLOS) (Mar. 2010).
9. bergan, t. et al. deterministic process
groups in doS. in 9th USENIX
Symposium on Operating Systems
Design and Implementation (OSDI)
(oct. 2010).
10. boyd-Wickizer, S. et al. corey: an
operating system for many cores. in
8th USENIX Symposium on Operating
Systems Design and Implementation
(OSDI) (dec. 2008).
11. castro, M., liskov, b. Practical
byzantine fault tolerance. in
3rd USENIX Symposium on
Operating Systems Design and
Implementation (OSDI) (feb. 1999),
173–186.
12. dunlap, g. W. et al. reVirt: enabling
intrusion analysis through virtual-machine logging and replay. in 5th
Amittai Aviram, Shu-Chun Weng, Sen
hu, and Bryan Ford, yale University,
department of computer Science, new
haven, ct.
USENIX Symposium on Operating
Systems Design and Implementation
(dec. 2002).
13. dunlap, g. W. et al. execution replay
for multiprocessor virtual machines.
in Virtual Execution Environments
(VEE) (Mar. 2008).
14. ford, b. PioS: Parallel instructional
operating system, 2010. http://zoo.
cs.yale.edu/classes/cs422/pios.
15. git: the fast version control system.
http://git-scm.com/.
16. haeberlen, a., kouznetsov, P.,
druschel, P. Peerreview: Practical
accountability for distributed
systems. in 21st ACM Symposium on
Operating Systems Principles (SOSP)
(oct. 2007).
17. kaashoek, f. et al. 6.828: operating
system engineering. http://pdos.csail.
mit.edu/6.828/.
18. kahn, g. the Semantics of a Simple
language for Parallel Programming",
in Information Processing '74:
Proceedings of the IFIP Congress
(1974), pp. 471-475.
19. leblanc, t.J., Mellor-crummey, J. M.
debugging parallel programs with
instant replay. IEEE Trans. Comput.
C- 36, 4 (apr. 1987), 471–482.
20. lee, e. the problem with threads.
Computer,
39, 5 (May 2006), 33–42.
21. Musuvathi, M. et al. finding and
reproducing heisenbugs in concurrent
programs. in 8th USENIX Symposium
on Operating Systems Design and
Implementation (OSDI) (berkeley,
california, USa, 2008), USeniX
association.
22. olszewski, M., ansel, J., amarasinghe,
S. kendo: efficient deterministic
multithreading in software. in
14th International Conference on
Architectural Support for Programming
Languages and Operating Systems
(ASPLOS) (Mar. 2009).
23. openMP architecture review
board. openMP application program
interface version 3.0, May 2008.
24. Parker, d. S. Jr. et al. detection of
mutual inconsistency in distributed
systems. IEEE Trans. Software Eng.
SE- 9, 3 (May 1983).
25. Wei, J., Pu, c. tocttoU vulnerabilities
in UniX-style file systems: an
anatomical study. in 4th USENIX
Conference on File and Storage
Technologies (FAST) (dec. 2005).