Călin Caşcaval, Colin Blundell,
Maged Michael, Harold W. Cain,
Peng Wu, Stefanie Chiras,
and Siddhartha Chatterjee
The overhead imposed by STM may well overshadow its promise.
TM (transactional memory) [1] is a concurrency control paradigm that provides atomic
and isolated execution for regions of code. TM is considered by many researchers to be
one of the most promising solutions to address the problem of programming multicore
processors. Its most appealing feature is that most programmers only need to reason
locally about shared data accesses, mark the code region to be executed transactionally, and let the underlying system ensure the correct concurrent execution. This model
promises to provide the scalability of fine-grain locking, while avoiding common pitfalls
of lock composition such as deadlock. In this article we explore the performance of a
highly optimized STM and observe that the overall performance of TM is significantly
worse at low levels of parallelism, which is likely to limit the adoption of this programming paradigm.
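To make the programming model concrete, here is a minimal sketch of a toy STM, assuming a simple design with versioned variables, buffered writes, and validation at commit time. It is not the optimized STM evaluated in this article, and the names `TVar`, `Transaction`, `atomic`, and `transfer` are hypothetical, chosen only for illustration. The programmer marks a region by passing it to `atomic`; the runtime detects conflicts and retries.

```python
import threading

# Hypothetical toy STM: one global lock, held only while committing.
_commit_lock = threading.Lock()


class TVar:
    """A shared variable stored as one (version, value) tuple."""

    def __init__(self, value):
        self.state = (0, value)


class Transaction:
    """Tracks the reads and buffered writes of one transaction attempt."""

    def __init__(self):
        self.reads = {}   # TVar -> version seen at first read
        self.writes = {}  # TVar -> value to publish at commit

    def read(self, tvar):
        if tvar in self.writes:           # read our own buffered write
            return self.writes[tvar]
        version, value = tvar.state       # single attribute read
        self.reads.setdefault(tvar, version)
        return value

    def write(self, tvar, value):
        self.writes[tvar] = value         # invisible to others until commit


def atomic(body):
    """Run body(tx) as a transaction, retrying until it commits."""
    while True:
        tx = Transaction()
        result = body(tx)
        with _commit_lock:
            # Validate: every variable read must still be at the version seen.
            if all(tv.state[0] == ver for tv, ver in tx.reads.items()):
                for tv, val in tx.writes.items():
                    tv.state = (tv.state[0] + 1, val)
                return result
        # Validation failed: another transaction committed first, so retry.


def transfer(src, dst, amount):
    """Move amount between two accounts; no lock ordering to get wrong."""
    def body(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomic(body)
```

Note that `transfer(a, b, 1)` and `transfer(b, a, 1)` can run concurrently without the lock-ordering discipline that a two-lock version would need. The price is visible in the sketch as well: every shared read and write goes through bookkeeping, and a conflicting transaction repeats its work. A production STM would also have to prevent a running transaction from observing an inconsistent snapshot, which this simplified version only catches at commit.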
Different implementations of transactional memory systems make tradeoffs that
impact both performance and programmability. Larus and Rajwar [2] present an overview
of design tradeoffs for implementations of transactional memory systems. Here are some
of the design choices:
• STM (software-only TM) [3-9] is the focus of this article. While offering flexibility
and no hardware cost, it leads to overhead in excess of most users’ tolerance.
• HTM (hardware-only TM) [10-16] suffers from two major impediments: high
implementation and verification costs lead to design risks too large to justify for a niche
programming model; and hardware capacity constraints lead to significant performance