tions in TM systems. Recent software
optimizations have managed to accelerate STM performance by 2%–15%.
However, the gains are only a minor dent in the overheads we observed,
indicating the challenge that lies before
the community in making STM performance compelling. We
believe such analysis is a good practice
that should be extended to every piece
of system software, especially open
source.
Conclusion
Based on our results, we believe that the
road ahead for STM is quite challenging. Lowering the overheads of STM to
a point where it is generally appealing
is a difficult task, and significantly better results have to be demonstrated. If
we could stress a single direction for
further research, it is the elimination of
dynamically unnecessary read and write
barriers—possibly the single most powerful lever toward further reduction of
STM overheads. However, given the difficulty of similar problems explored by
the research community such as alias
analysis, escape analysis, and so on, this
may be an uphill battle. And because
the argument for TM hinges upon its
simplicity and productivity benefits, we
are deeply skeptical of any proposed solutions to performance problems that
require extra work by the programmer.
We observed that the TM programming model itself, whether implemented in hardware or software, introduces
complexities that limit the expected
productivity gains, thus reducing the
current incentive for migration to transactional programming, and the justification at present for anything more than
a small amount of hardware support.
Acknowledgments
We would like to thank Pratap Pattnaik
for his continuous support, Christoph
von Praun for numerous discussions and
work on benchmarks and runtimes,
and Rajesh Bordawekar for the B+tree
code implementation.
Călin Caşcaval (cascaval@us.ibm.com) is a research
staff member and manager of programming models and
tools for scalable systems at IBM T.J. Watson Research
Center, Yorktown Heights, NY.
Colin Blundell is a member of the Architecture
and Compilers Group, Department of Computer and
Information Science, University of Pennsylvania.
Maged Michael is a research staff member at
IBM T.J. Watson Research Center, Yorktown Heights, NY.
Trey Cain is a research staff member at IBM T.J. Watson
Research Center, Yorktown Heights, NY.
Peng Wu is a research staff member at IBM T.J. Watson
Research Center, Yorktown Heights, NY.
Stefanie Chiras is a manager in IBM's Systems and
Technology Group.
Siddhartha Chatterjee is director of the Austin Research
Laboratory, IBM Research, Austin, TX.