Figure 2. Scalability results for three STM runtimes on a quad-core Intel Xeon server: IBM, Intel STM v2, and Sun TL2.
[Figure: four panels (delaunay, kmeans, vacation, genome), each plotting scalability normalized to sequential (y-axis, 0 to 2.5) against the number of threads (x-axis, up to 8) for the Intel, IBM, and Sun TL2 runtimes.]
end users, the advantage of an STM is
that it offers an environment in which to transactionalize their
applications (that is, port them to TM) without incurring extra
hardware cost or waiting for such hardware to be developed.
Conversely, an STM entails nontrivial drawbacks with respect to performance and programming semantics:
˲ Overheads: In general, STM results
in higher sequential overheads than traditional shared-memory programming
or HTM. This is the result of the software
expansion of loads and stores to shared
mutable locations inside transactions
to tens of additional instructions that
constitute the STM implementation
(for example, the STM_READ code in
Figure 1c). Depending on the transactional characteristics of a workload,
these overheads can become a significant
obstacle to achieving good performance with STM.
The sequential overheads (that is, conflict-free overheads that are incurred regardless of the actions of other concurrent threads) must be outweighed by the
concurrency that
transactional memory enables.
˲ Semantics: In order to avoid incurring high STM overheads, non-transactional accesses (such as loads and stores
occurring outside transactions) are typically not expanded. This has the effect
of weakening—and hence complicating—the semantics of transactions,
which may require the programmer
to be more careful than when strong
transactional semantics are supported.
The following are some of the weakened
guarantees that are usually associated
with such STMs:
˲ Weak atomicity: Typically the STM
runtime libraries cannot detect conflicts
between transactions and non-transactional accesses. Thus, the semantics of
atomicity are weakened to allow undetected conflicts with non-transactional
accesses (referred to as weak atomicity³), or, equivalently, the burden is put on
the programmer to guarantee that no
such conflicts can possibly take place.
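The following sketch illustrates how such a conflict can go undetected. It uses a toy, single-threaded STM model in which the interleaving is written out by hand; the design (per-location version stamps checked at commit) and all names (ToySTM, Tx, plain_store, increment_with_interference) are assumptions made for illustration, not the mechanism of any particular runtime. The key assumption is that a non-transactional store is not expanded, so it updates memory without touching the version stamp that transactions validate against.

```python
# Toy model of weak atomicity (illustrative only; all names hypothetical).

class ToySTM:
    def __init__(self, size):
        self.memory = [0] * size
        self.versions = [0] * size   # bumped only by committing transactions
        self.clock = 0

    def plain_store(self, addr, value):
        # Non-transactional store: not instrumented, so no version bump.
        self.memory[addr] = value


class Tx:
    def __init__(self, stm):
        self.stm = stm
        self.start = stm.clock
        self.read_set = []
        self.write_set = {}

    def read(self, addr):
        if addr in self.write_set:
            return self.write_set[addr]
        self.read_set.append(addr)
        return self.stm.memory[addr]

    def write(self, addr, value):
        self.write_set[addr] = value

    def commit(self):
        # Abort if any location we read was committed to since we began.
        for addr in self.read_set:
            if self.stm.versions[addr] > self.start:
                return False
        self.stm.clock += 1
        for addr, value in self.write_set.items():
            self.stm.memory[addr] = value
            self.stm.versions[addr] = self.stm.clock
        return True


def increment_with_interference(kind):
    """Run 'x = x + 1' as a transaction while another access stores 100."""
    stm = ToySTM(1)
    tx = Tx(stm)
    seen = tx.read(0)                 # transaction reads x == 0
    if kind == "transactional":
        other = Tx(stm)               # interfering store inside a transaction
        other.write(0, 100)
        other.commit()
    else:
        stm.plain_store(0, 100)       # interfering store outside any transaction
    tx.write(0, seen + 1)
    committed = tx.commit()
    return committed, stm.memory[0]
```

When the interfering store is transactional, the first transaction fails validation and the value 100 survives. When it is a plain store, the transaction commits and silently overwrites 100 with 1, which is the undetected conflict that weak atomicity permits.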
˲ Privatization: Some STM designs
prohibit the seamless privatization of
memory locations, that is, the transition from being accessed transactionally to being accessed privately or,
more generally, non-transactionally (for example, under the protection of locks). For some STM designs, once
a location is accessed transactionally,
it must continue to be accessed transactionally. With some STM designs, the
programmer can ease the transition by
guaranteeing that the first access to the
privatized location (such as the first access after the
location is no longer accessible by other
threads) is transactional.
˲ Memory reclamation: Some STM
designs prohibit the seamless reclamation of the memory locations accessed
transactionally for arbitrary reuse, for example
through malloc and free. With such
STM designs, memory allocation and
deallocation for locations accessed
transactionally are handled differently
from other locations.
˲ Legacy binaries: STM needs to observe all memory activities of the transactional regions to ensure atomicity and
isolation. STMs that achieve this observation by code instrumentation gener-