Building adaptable and more efficient programs
for the multi-core era is now within reach.
By Jason Ansel and Cy Chan
With the dawn of the multi-core era, programmers are being challenged to write code that performs well on an increasingly diverse array of architectures. A single program or library may be used on systems ranging in power from large servers with dozens or hundreds of cores to small single-core netbooks or mobile phones.
A program may need to run efficiently both on architectures with many simple cores and on
those with fewer monolithic cores. Some of the systems a program encounters might have
GPU coprocessors, while others might not. Looking forward, processor designs such as
asymmetric multi-core [3], with different types of cores on a single chip, will
present an even greater challenge for
programmers to utilize effectively.
PetaBricks is a language and compiler that allows the user to specify algorithmic choices at the language level. Using
this mechanism, PetaBricks programs
define not a single algorithmic path,
but a search space of possible paths.
This flexibility allows our compiler to
build programs that can automatically
adapt, with empirical autotuning, to
every architecture they encounter.
PETABRICKS LANGUAGE
The PetaBricks language provides a
framework for the programmer to describe multiple ways of solving a problem while allowing the autotuner to determine which of those ways is best for
the user’s situation. It provides both algorithmic flexibility (multiple algorithmic choices) and coarse-grained
code generation flexibility (synthesized
outer control flow).
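To make the idea of algorithmic choice concrete, here is a minimal sketch in plain Python, not PetaBricks syntax. It offers two interchangeable ways to sort a list and picks one by timing both on a sample input. The names `CHOICES`, `time_once`, and `autotune` are hypothetical, and the timing loop is only a simplified stand-in for empirical autotuning, not the actual PetaBricks autotuner.

```python
import random
import time


def insertion_sort(xs):
    """O(n^2) sort; often competitive on very small inputs."""
    out = []
    for x in xs:
        i = len(out)
        # Walk left past every element larger than x.
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def merge_sort(xs):
    """O(n log n) sort; usually better on large inputs."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


# Hypothetical "search space": every entry solves the same problem.
CHOICES = {"insertion_sort": insertion_sort, "merge_sort": merge_sort}


def time_once(fn, data):
    """Return the wall-clock seconds one call of fn(data) takes."""
    start = time.perf_counter()
    fn(data)
    return time.perf_counter() - start


def autotune(choices, sample_input, trials=3):
    """Time each choice on sample_input; return the name of the fastest."""
    best_name, best_time = None, float("inf")
    for name, fn in choices.items():
        elapsed = min(time_once(fn, sample_input) for _ in range(trials))
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name


if __name__ == "__main__":
    sample = [random.random() for _ in range(2000)]
    print("fastest choice on this machine:", autotune(CHOICES, sample))
```

Which choice wins depends on the machine and the input size, which is why the selection is made empirically rather than fixed in the source code.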
At the highest level, the programmer
can specify a transform, which takes
some number of inputs and produces
some number of outputs. In this respect, the PetaBricks transform is like
a function call in any common procedural language. The major difference
with PetaBricks is that we allow the
programmer to specify multiple pathways to convert the inputs to the outputs for each transform. Pathways are
specified in a dataflow manner using
a number of smaller building blocks
called rules, which encode both the
data dependencies of the rule and C++-like code that converts the rule’s inputs
to outputs. Dependencies are specified