tions in TM systems. Recent software optimizations have managed to accelerate STM performance by 2%–15%. We believe such analysis is a good practice that should be extended to every piece of system software, especially open source. However, the gains are only a minor dent in the overheads we observed, indicating the challenge that lies before the community in making STM performance compelling.

Conclusion

Based on our results, we believe that the road ahead for STM is quite challenging. Lowering the overheads of STM to a point where it is generally appealing is a difficult task, and significantly better results have to be demonstrated. If we could stress a single direction for further research, it is the elimination of dynamically unnecessary read and write barriers, possibly the single most powerful lever toward further reduction of STM overheads. However, given the difficulty of similar problems explored by the research community, such as alias analysis and escape analysis, this may be an uphill battle. And because the argument for TM hinges upon its simplicity and productivity benefits, we are deeply skeptical of any proposed solutions to performance problems that require extra work by the programmer.
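To make the notion of a dynamically unnecessary barrier concrete, consider the following minimal sketch in Python. It is a toy model, not the interface of any STM system: the class `ToyTx`, its methods, and the dictionary-based heap are hypothetical names introduced only for illustration. The sketch assumes one possible source of unnecessary barriers, namely accesses to memory allocated inside the transaction itself, which no other thread can observe before commit. It performs no conflict detection or validation.

```python
class ToyTx:
    """Toy write-buffering transaction (illustrative only, single-threaded)."""

    def __init__(self, heap):
        self.heap = heap           # shared memory modeled as a dict: address -> value
        self.write_log = {}        # buffered writes to shared data, published at commit
        self.captured = set()      # addresses allocated inside this transaction
        self.barriers_run = 0      # accesses that paid for a full barrier
        self.barriers_skipped = 0  # accesses where the barrier was elided

    def alloc(self, addr, value=0):
        # Assumption: memory allocated inside the transaction is private
        # to it until commit, so it needs no logging or conflict tracking.
        self.captured.add(addr)
        self.heap[addr] = value

    def read(self, addr):
        if addr in self.captured:
            # Dynamically unnecessary barrier: access memory directly.
            self.barriers_skipped += 1
            return self.heap[addr]
        # Full read barrier: consult the write log before shared memory.
        self.barriers_run += 1
        if addr in self.write_log:
            return self.write_log[addr]
        return self.heap[addr]

    def write(self, addr, value):
        if addr in self.captured:
            # Dynamically unnecessary barrier: update in place.
            self.barriers_skipped += 1
            self.heap[addr] = value
            return
        # Full write barrier: buffer the update until commit.
        self.barriers_run += 1
        self.write_log[addr] = value

    def commit(self):
        # Publish buffered writes to shared memory.
        self.heap.update(self.write_log)
        self.write_log.clear()


if __name__ == "__main__":
    heap = {"shared": 10}
    tx = ToyTx(heap)
    tx.alloc("node", 0)                         # transaction-local allocation
    for _ in range(3):
        tx.write("node", tx.read("node") + 1)   # six accesses, all elided
    tx.write("shared", tx.read("shared") + tx.read("node"))
    tx.commit()
    print(heap["shared"], tx.barriers_run, tx.barriers_skipped)  # prints: 13 2 7
```

In this run, seven of the nine instrumented accesses touch transaction-local data and skip the barrier. The runtime check has a cost of its own, however. Removing it requires proving at compile time that an address is transaction-local, which is the kind of alias and escape analysis problem noted above.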

We observed that the TM programming model itself, whether implemented in hardware or software, introduces complexities that limit the expected productivity gains. This reduces the current incentive to migrate to transactional programming and, at present, the justification for anything more than a small amount of hardware support.

Acknowledgments

We would like to thank Pratap Pattnaik for his continuous support, Christoph von Praun for numerous discussions and for his work on benchmarks and runtimes, and Rajesh Bordawekar for the B+tree code implementation.

 


 

Călin Caşcaval (cascaval@us.ibm.com) is a research staff member and manager of programming models and tools for scalable systems at IBM TJ Watson Research Center, Yorktown Heights, NY.

Colin Blundell is a member of the Architecture and Compilers Group, Department of Computer and Information Science, University of Pennsylvania.

Maged Michael is a research staff member at IBM TJ Watson Research Center, Yorktown Heights, NY.

Trey Cain is a research staff member at IBM TJ Watson Research Center, Yorktown Heights, NY.

Peng Wu is a research staff member at IBM TJ Watson Research Center, Yorktown Heights, NY.

Stefanie Chiras is a manager in IBM's Systems and Technology Group.

Siddhartha Chatterjee is director of the Austin Research Laboratory, IBM Research, Austin, TX.
