be caught and returned back to the programmer as illegal;
˲ ˲ are general-purpose enough to
express important parallel algorithms and
patterns; and
˲ ˲ enable high and scalable
performance.
Many previous programmer-produc-tivity-driven efforts have sought to raise
the level of abstraction with threads; for
example, Cilk, 19 TBB, 25 OpenMP, 39 the
recent HPCS languages, 28 other high-level libraries, frameworks, and APIs
such as java.util.concurrent and the C++
boost libraries, as well as more domain-specific ones. While these solutions go
a long way toward easing the pain of
orchestrating parallelism, our memory-models driven argument is deeper—we
argue that, at least so far, it is not possible to provide reasonable semantics
for a language that allows data races,
an arguably more fundamental problem. In fact, all of these examples either
provide unclear models or suffer from
the same limitations as C++/Java. These
approaches, therefore, do not meet our
enforcement requirement. Similarly,
transactional memory provides a high-level mechanism for atomicity, but the
memory model in the presence of non-transactional code faces the same issues as described here. 38
At the heart of our agenda of disciplined models are the questions: What
is the appropriate discipline? How to
enforce it?
A near-term transition path is to
continue with data-race-free and focus
research on its enforcement. The ideal
solution is for the language to eliminate data races by design (for example,
Boyapati13); however, our semantics difficulties are avoided even with dynamic
techniques (for example, Elmas et al., 17
Flanagan and Freund, 18 or Lucia et al. 27)
that replace all data races with exceptions. (There are other dynamic data
race detection techniques, primarily for
debugging, but they do not guarantee
complete accuracy, as required here.)
A longer-term direction concerns
both the appropriate discipline and its
enforcement. A fundamental challenge
in debugging, testing, and reasoning
about threaded programs arises from
their inherent non-determinism—an
execution may exhibit one of many possible interleavings of its memory accesses. In contrast, many applications
Data-race-free
provides a simple
and consistent
model for threads
and shared
variables. We
are convinced
that it is the best
model today to
target during
initial software
development.
written for performance have deterministic outcomes and can be expressed
with deterministic algorithms. Writing
such programs using a deterministic
environment allows reasoning with sequential semantics (a memory model
much simpler than sequential consistency with threads).
A valuable discipline, therefore, is to
provide a guarantee of determinism by
default; when non-determinism is inherently required, it should be requested explicitly and should not interfere
with the deterministic guarantees for
the remaining program. 7 There is much
prior work in deterministic data parallel, functional, and actor languages. Our
focus is on general-purpose efforts that
continue use of widespread programming practices; for example, global
address space, imperative languages,
object-oriented programming, and
complex, pointer-based data structures.
Language-based approaches with
such goals include Jade34 and the recent
Deterministic Parallel Java (DPJ). 8 In
particular, DPJ proposes a region-based
type and effect system for determinis-tic-by-default semantics—“regions”
name disjoint partitions of the heap
and per-method effect annotations
summarize which regions are read and
written by each method. Coupled with
a disciplined parallel control structure,
the compiler can easily use the effect
summaries to ensure that there are no
unordered conflicting accesses and
the program is deterministic. Recent
results show that DPJ is applicable to a
range of applications and complex data
structures and provides performance
comparable to threads code. 8
There has also been much recent
progress in runtime methods for determinism. 4, 5, 16, 31
Both language and runtime approaches have pros and cons and still
require research before mainstream
adoption. A language-based approach
must establish that it is expressive
enough and does not incur undue programmer burden. For the former, the
new techniques are promising, but the
jury is still out. For the latter, DPJ is attempting to alleviate the burden by using a familiar base language (currently
Java) and providing semiautomatic
tools to infer the required programmer
annotations. 41 Further, language annotations such as DPJ’s read/write effect