tiple ways to create parallel activities
(such as parallel loops in which iterations could execute concurrently) and
synchronization operations for enforcing order between activities on different processors. The Parallel Computing Forum promulgated a standard
set of FORTRAN extensions for this
programming model18 to allow easy
programming while enabling access to
the performance advantages of shared
memory over other machines.
However, two sets of problems arose
with these architectures: First, scalability was limited by the inability to hide latency with conventional processors at large scale and by the
hardware overhead of implementing
cache coherence; and second, data
races were possible: if two parallel threads access
the same memory location, with at least
one of them writing to that location,
then a lack of proper synchronization
could lead to different results for different parallel schedules.
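The schedule dependence described above can be made concrete with a small sketch. It is an illustration only, not code from any system discussed here: the function names and the two-step read/write model of an increment are assumptions. Two unsynchronized "threads" each increment a shared counter, and the sketch enumerates every legal interleaving of their steps.

```python
from itertools import permutations

# Each simulated thread increments a shared counter in two separate steps:
# read the shared value into a private copy, then write back copy + 1.
STEPS = [(0, "read"), (0, "write"), (1, "read"), (1, "write")]


def run_schedule(schedule):
    """Execute one interleaving of the four steps and return the final counter."""
    shared = 0
    private = {}
    for thread_id, step in schedule:
        if step == "read":
            private[thread_id] = shared
        else:
            shared = private[thread_id] + 1
    return shared


def legal_schedules():
    """Yield every interleaving in which each thread reads before it writes."""
    for order in permutations(STEPS):
        if all(order.index((t, "read")) < order.index((t, "write")) for t in (0, 1)):
            yield order


# Collect the distinct final values over all schedules.
outcomes = sorted({run_schedule(s) for s in legal_schedules()})
print(outcomes)
```

Schedules in which one thread finishes before the other starts leave the counter at 2; schedules in which both threads read before either writes lose an update and leave it at 1. Proper synchronization would rule out the second kind.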
These problems led to development of distributed-memory MIMD
machines in which processors were
interconnected via networks, with
each processor having its own local
memory. This organization allowed the
machines to scale to around 1,000 processors by the early 1990s. The price
of this architectural simplicity was
software complexity; when two processors needed to share data, they
had to exchange messages, an
expensive operation. Reducing this
cost relied on correctly placing the
data to minimize the required communication, and placing the message-passing calls in the most appropriate program locations. By the time
the HPF effort had begun in the early
1990s, message-passing libraries (such
as PVM and PARMACS) were already
being used for programming MIMD
systems in connection with standard
sequential languages. A standardization activity for message passing began
soon thereafter, patterned after and
in close connection with the HPF effort.11 The resulting Message Passing
Interface (MPI)20 library provided efficient explicit control of locality and
communication. Despite the greater
complexity of this programming paradigm, it quickly became popular due to
its ability to exploit the performance of
distributed-memory computers.
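The flavor of explicit message passing can be sketched as follows. This is not MPI, PVM, or PARMACS code: it simulates two processors with Python threads and queues, and every name in it (`worker`, `exchange_demo`, the inbox and outbox queues) is hypothetical. Each simulated processor touches only its own local list and learns its neighbor's boundary value solely through an explicit send and receive.

```python
import queue
import threading


def worker(rank, local, inbox, outbox, result):
    """One simulated processor: owns `local`, sees remote data only via messages."""
    # Send the element adjacent to the neighbor's data, then wait for theirs.
    boundary = local[-1] if rank == 0 else local[0]
    outbox.put(boundary)   # explicit "send"
    remote = inbox.get()   # explicit, blocking "receive"
    # Once the remote value has arrived, the rest of the work is purely local.
    result[rank] = sum(local) + remote


def exchange_demo():
    """Run two simulated processors that swap one boundary value each."""
    to_p0, to_p1 = queue.Queue(), queue.Queue()
    data = [[1, 2, 3, 4], [5, 6, 7, 8]]  # each inner list is one local memory
    result = [None, None]
    threads = [
        threading.Thread(target=worker, args=(0, data[0], to_p0, to_p1, result)),
        threading.Thread(target=worker, args=(1, data[1], to_p1, to_p0, result)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


print(exchange_demo())
```

Even in this toy setting the programmer decides what to send, when to send it, and where the matching receive goes, which is the burden the text describes.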
The HPF goal was a common programming model that could execute
efficiently on all classes of parallel
machines. Developers had to determine
whether the advantages of shared
memory, and even of a single thread of control, could be simulated on a distributed-memory machine, and how parallelism
could be made to scale to hundreds or
thousands of processors. Such issues
were addressed through data-parallel
languages in which the large data structures of applications are part of a global
name space that can be laid out across
the memories of a distributed-memory
machine; such a layout is called a “data distribution.” The
data elements mapped to the local
memory of a processor in this way are
said to be “owned” by the processor.
Program execution is modeled by a
single thread of control, but the components of distributed data structures
can be operated on in parallel. The data
distribution controls how work is to be
allocated among the processors. Compilation techniques (such as those discussed later) allowed MIMD architectures to relax the lockstep semantics of
the SIMD model to improve efficiency.
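The relationship between a data distribution, ownership, and the allocation of work can be sketched in a few lines. The sketch assumes a simple block layout with ceiling-sized blocks and uses hypothetical helper names (`block_owner`, `owned_indices`, `data_parallel_add`); it is an illustration, not HPF syntax or the output of any HPF compiler. The element-wise addition is written as one logical operation, but each simulated processor updates only the elements it owns.

```python
def block_size(n, p):
    """Elements per processor for a block layout (ceiling of n / p)."""
    return -(-n // p)


def block_owner(i, n, p):
    """Rank of the processor that owns global index i."""
    return i // block_size(n, p)


def owned_indices(rank, n, p):
    """Global indices mapped to the local memory of processor `rank`."""
    b = block_size(n, p)
    return range(min(rank * b, n), min((rank + 1) * b, n))


def data_parallel_add(a, b, p):
    """Compute c = a + b, with each of p simulated processors doing its own share."""
    n = len(a)
    c = [None] * n
    work = {}
    for rank in range(p):  # conceptually, these iterations run in parallel
        idx = owned_indices(rank, n, p)
        work[rank] = list(idx)
        for i in idx:
            c[i] = a[i] + b[i]
    return c, work


a = list(range(10))
b = [10] * 10
c, work = data_parallel_add(a, b, 4)
print(c)
print(work)
```

With 10 elements on 4 processors, the first three processors own three elements each and the last owns one, so changing the distribution changes how the work is divided without changing the result.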
The HPF design was based largely on
experience with the language designs
and implementations of the early data-parallel languages. Here, we focus on
the data-parallel approach embodied
in HPF. To be sure, it was not universally embraced by either the compiler
research community or the application community. Other programming
paradigms, including functional and
dataflow languages, also had substantial intellectual merit, passionate user
communities, or both.
HPF Standardization Process
At the Supercomputing conference in
1991 in Albuquerque, NM, Ken Kennedy of Rice University and Geoffrey Fox
of Indiana University met with a number of commercial vendors to discuss
the possibility of standardizing the data-parallel versions of Fortran. These
vendors included Thinking Machines
(then producing SIMD machines), Intel and IBM (then producing distributed-memory MIMD machines), and
Digital Equipment Corp. (then interested in producing a cross-platform
Fortran system for both SIMD and
MIMD machines).
Kennedy and Fox organized a birds-