The Burroughs machines disappeared not because of any defect in
their architecture, but because of
IBM’s massive success in marketing
the 360 series systems. Moreover, in a
process reminiscent of Clayton Chris-tensen’s Innovator’s Dilemma, the
low-end assembler-language minicomputers, originally designed to run
laboratory instruments, grew up into
the minicomputer and then micro-computer market, untainted by any
notions of parallel programming.
With the introduction of RISC architectures in the early 1980s, much
of the research for high-performance
computers was rechanneled toward
exploiting RISC for fast chips. It
looked at the time that sophisticated
compilers could make up for missing
functions in the chips.
With two notable exceptions, most
of the projects exploring alternatives
to the “von Neumann architecture” expired and were not replaced with new
projects or other initiatives. One exception was Arvind’s Monsoon Project at
MIT, 10 which demonstrated that massive parallelism is readily identified in
the functional programming language
Haskell, and then readily mapped to a
shared memory multiprocessor. (
Functional languages generate all their values by evaluating functions without
side effects.)
The other project involved a group
at the Lawrence Livermore National
Laboratory studying scientific codes in
the functional language Sisal, a derivative of MIT’s Val language; Sisal programs were as efficient as Fortran programs and could be readily compiled
to massively parallel shared memory
supercomputers. 1, 2, 11
The current generations of supercomputers (and data warehouses) are
based on thousands of CPU chips running in parallel. Unlike the innovative
designs of the Burroughs systems, their
hardware architectures conform to the
conventional von Neumann machine.
Their operating systems are little more
than simple schedulers and message
passing protocols, with complex functions relegated to applications running
on separate host machines.
The point is clear: ideas for arranging multiple processors to work together in an integrated system have been
with us for 50 years. What’s new?
the parallel
architecture research
of the 1960s
and 1970s solved
many problems
that are being
encountered today.
Determinate computation
One of the holy grails of research in
parallel computation in the 1960s and
1970s was called “determinacy.” 7
Determinacy requires that a network of
parallel tasks in shared memory always
produces the same output for given
input regardless of the speeds of the
tasks. It should not be confused with
a similar word, “deterministic,” which
would require that the tasks be ordered
in the same sequence every time the
system runs.
A major result of this research was
the “determinacy theorem.” A task is
a basic computation that implements
a function from its inputs to outputs.
Two tasks are said to be in conflict if
either of them writes into memory
cells used by the other. In a system of
concurrent tasks, race conditions may
be present that make the final output
depend on the relative speeds or orders of task execution. Determinacy is
ensured if the system is constrained
so that every pair of conflicting tasks is
performed in the same order in every
run of the system. Then no data races
are possible. Note that atomicity and
mutual exclusion are not sufficient
for determinacy: they ensure only that
conflicting tasks are not concurrent,
but not that they always executed in
the same order.
A corollary of the determinacy theo-
rem is that the entire sequence of val-
ues written into each and every mem-
ory cell during any run of the system is
the same for the given input. This cor-
ollary also tells us that any system of
blocking tasks that communicates by
messages using FIFO queues (instead
of shared memory) is automatically
determinate because the message
queues always present the data items
in the same order to the tasks receiving
them.
functional Programming
and composability
Another holy grail for parallel system
has been modular composability. This
would mean that any parallel program
can be used, without change, as a component of a larger parallel program.
Three principles are needed to enable parallel program composability.
David Parnas wrote about two: information hiding and context independence. Information hiding means
a task’s internal memory cannot be
read or written by any other task. Context independence means no part of
a task can depend on values outside
the task’s internal memory or input-output memory. The third principle is
argument noninterference; it says that
a data object presented as input to two
concurrent modules cannot be modified by either.
Functional programming languages automatically satisfy these three
principles; their modules are thus
composable.
It is an open question how to structure composable parallel program
modules from different frameworks
when the modules implement nondeterminate behavior. Transaction
systems are an extreme case. Because
their parallel tasks may interfere in
the records they access, they use locking protocols to guarantee mutual exclusion. Transaction tasks cannot be
ordered by a fixed order—their nonde-terminacy is integral to their function.
For example, an airplane seat goes to
whichever task requested it first. The