commutativity arises from the fact that there may be several
concrete (memory) states that represent a single abstract
state. Exploiting this kind of ADT commutativity obviously
requires an object-oriented language.
We will refer to application-specific and ADT commutativity
as semantic commutativity. In contrast, traditional compile-time
parallelization techniques such as dependence analysis10 and
Diniz and Rinard’s commutativity analysis4 focus on concrete
commutativity in which all orders of performing operations
result in the same concrete state. Semantic commutativity is
more general, and it permits more interleavings of operations.
The programming abstractions introduced in this section permit programmers to highlight opportunities for
exploiting semantic commutativity. They can be added to
any sequential object-oriented language (the results in this
paper are from an implementation in C++). Figure 5(a) is a
conceptual picture of the Galois system. Application programs use two constructs called Galois set iterators, described
in Section 3. 1, for highlighting opportunities for exploiting
parallelism. Section 3. 2 describes the data structure library.
Data structures in which there are opportunities to exploit
ADT commutativity are implemented by catch-and-release
classes, while others are implemented in catch-and-keep
classes. The runtime system implements optimistic parallelization, and detects and recovers from potentially unsafe
accesses to shared objects, as explained in Section 3. 3.
3. 1. Galois set iterators
The Galois programming model is sequential and object-oriented; programs are written in an object-oriented language like C++ or Java extended with two constructs called
Galois set iterators.
• Unordered-set iterator: for each e in set s do B(e)
The loop body B(e) is executed for each element e of set S.
Since set elements are not ordered, this construct asserts
that in a serial execution of the loop, the iterations can be
executed in any order. There may be dependences between
the iterations, as in the case of Delaunay mesh refinement,
but any serial order of executing iterations is permitted.
When an iteration executes, it may add elements to S.
• ordered-set iterator: for each e in poset s do B(e)
This construct iterates over a partially ordered set
(Poset) S. It is similar to the set iterator above, except
that any execution order must respect the partial order
imposed by the Poset S.
figure 5. The Galois system.
main()
for each ..... {
....
.......
.......
.......
.......
......
}
Master
i1
i2
i3
Library
Runtime system
Catch-and-keep
classes
Catch-and-release
classes
Application
program
i4
i5
(a) Layered architecture
(b) Execution model
92 commuNicaTioNS of The acm | sEPTEMBER 2009 | VoL. 52 | no. 9
Note that new elements may be added to a set while
iterating over it, which is not allowed in conventional set
iterators in languages like SETL or Java. Figure 6 shows
the client code for Delaunay mesh generation. Instead of a
work-list of bad triangles, this code uses a set of bad triangles and a set iterator. Since set elements are not ordered,
the iterator is permitted to iterate over the set in any order.
Therefore, the Galois version makes evident the fact that
the bad triangles can be processed in any order, a fact that
is absent from the more conventional code of Figure 2. For
lack of space, we do not show the Galois version of agglomerative clustering, but it uses the ordered-set iterator in the
obvious way.
The parallel execution model is shown in Figure 5(b).
A master thread executes all code outside the Galois set
iterators. When it encounters a Galois set iterator, it enlists
worker threads to help execute iterations concurrently. The
assignment of iterations to threads is done dynamically,
but this policy can be changed by an expert programmer. 12
Threads synchronize by barrier synchronization at the end
of the iterator.
Given this execution model, the main technical problem
is to ensure that the parallel execution respects the sequential semantics of the iterators. For an unordered-set iterator,
this can be accomplished by ensuring that the execution of
each iteration has transactional semantics. These semantics
guarantee serializability; the parallel execution will behave
as if the iterations were executed serially in some order. For
the ordered-set iterator, we must also ensure that the iterations appear to execute in the order prescribed by the ordering on set elements. Guaranteeing sequential semantics is
a nontrivial problem because each iteration may read and
write objects in shared memory, so we must ensure that
these reads and writes are properly coordinated. Next, we
describe how this is accomplished.
3. 2. Galois library classes
The library has two kinds of classes: catch-and-keep and
catch-and-release. As in Java, a lock is automatically associated with each object, but the locking policy is different for
the two kinds of classes.
Catch-and-Keep Classes: Catch-and-keep classes are the
default, and they are implemented in Galois using a variation of the well-known two-phase locking policy. To invoke
a method on a catch-and-keep object, an iteration must
figure 6. Delaunay mesh refinement using set iterator.
1: Mesh m = /* read in initial mesh */
2: Set wl;
3: wl.add ( mesh.bad Triangles ( ) );
4: for each e in wl do {
5: if (e no longer in mesh) continue;
6: Cavity c = new Cavity (e);
7: c.expand ();
8: c.retriangulate ();
9: m.update ( c );
10: wl.add( c.badTriangles());
11: }