specialization, the use of staging provides strong guarantees about the structure of the generated code. In this
case, we are guaranteed that specialization will happen for
each value of m (but never for n), statically evaluating tests
on values of m and inserting constants for all occurrences
of m in the generated code.
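As a minimal, hedged sketch (the function loop and its arguments are hypothetical, not taken from the earlier example, and we assume the usual staged arithmetic on Rep[Int]), this kind of guarantee can be read off the parameter types:

def loop(m: Int, n: Rep[Int]): Rep[Int] =
  if (m == 0) n               // test on m is decided while the generator runs
  else loop(m - 1, n + m)     // m - 1 and m appear only as constants in generated code

Because m is a plain Int, every test on it is evaluated at staging time and the recursion unfolds once per value of m, whereas only operations on the Rep[Int] argument n are emitted into the generated program.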
Other strong guarantees can be achieved by restricting the interface of function definitions. Being of type
Rep[A=>B], the result of lambda is a first-class value in
the generated code that can be stored or passed around
in arbitrary ways. However, we might want to avoid higher-order control flow in generated code for efficiency reasons,
or to simplify subsequent analysis passes. In this case, we
can define a new function constructor fundef as follows:
def fundef[A,B](f: Rep[A] => Rep[B]): Rep[A] => Rep[B] =
  (x: Rep[A]) => lambda(f).apply(x)
The use of fundef instead of lambda produces a
restricted function that can only be applied but not passed
around in the generated code (type Rep[A] => Rep[B]).
At the same time, the result of fundef is still a first-class
value in the code generator. If we do not expose lambda
and apply at all to client code, we obtain a guarantee that
each function call site unambiguously identifies the function definition being called and no closure objects will
need to be created at runtime.
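As a purely illustrative usage sketch (square and sumOfSquares are hypothetical names, again assuming the usual staged arithmetic on Rep[Int]), a function built with fundef only ever shows up at its call sites:

val square: Rep[Int] => Rep[Int] =
  fundef[Int,Int] { x: Rep[Int] => x * x }

def sumOfSquares(a: Rep[Int], b: Rep[Int]): Rep[Int] =
  square(a) + square(b)   // every call site names the definition of square directly

Since square has type Rep[Int] => Rep[Int] rather than Rep[Int => Int], it cannot be stored in generated data structures or passed along as a generated-code value, which is exactly what rules out closure creation at runtime.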
4. RELATED WORK
Static program generation approaches include C++ templates and Template Haskell [20]. Building on C++ templates, customizable generation approaches are possible through Expression Templates [23]. An example of runtime code generation in C++ is the TaskGraph framework. Active libraries were introduced by Veldhuizen [24] and telescoping languages by Kennedy et al. [11]. Specific toolkits using domain-specific code generation and optimization include FFTW [6], SPIRAL [17], and ATLAS [25].
This paper draws a lot of inspiration from the work of
Kiselyov et al. [12] on a staged FFT implementation. Performing
symbolic rewritings by defining operators on lifted expressions and performing common subexpression elimination
on the fly is also central to their approach. LMS takes these
ideas one step further by making them a central part of the
staging framework itself.
Immediately related works on embedding typed languages include those of Carette et al. [2] and Hofer et al. [8]. Lee et al. [13, 18] describe how LMS is used in the development of DSLs for high-performance parallel computing on heterogeneous platforms.
Multistage programming languages such as MetaML [22] and MetaOCaml [1] have been proposed as a disciplined approach
to building code generators. These languages provide three
syntactic annotations: brackets, escape, and run, which
together provide a syntactic quasi-quotation facility that
is similar to that found in Lisp but statically scoped and
statically typed.
Acknowledgments
The development of lightweight modular staging has benefited greatly from the work on Delite in collaboration
with the Stanford Pervasive Parallelism Lab, in particular
Arvind Sujeeth, Hassan Chafi, Kevin Brown, HyoukJoong
Lee and Kunle Olukotun. A number of members of the
authors’ group at EPFL have applied LMS to interesting
use cases and contributed new insights or valuable extensions to the framework.