Doi: 10.1145/2184319.2184345
Lightweight Modular Staging:
A Pragmatic Approach to
Runtime Code Generation
and Compiled DSLs
Abstract
Good software engineering practice demands generalization and abstraction, whereas high performance demands
specialization and concretization. These goals are at odds,
and compilers can only rarely translate expressive high-level programs to modern hardware platforms in a way that
makes best use of the available resources.
Generative programming is a promising alternative to
fully automatic translation. Instead of writing down the
target program directly, developers write a program generator, which produces the target program as its output.
The generator can be written in a high-level, generic style
and can still produce efficient, specialized target programs. In practice, however, developing high-quality program generators requires a very large effort that is often
hard to amortize.
We present lightweight modular staging (LMS), a generative programming approach that lowers this effort significantly. LMS seamlessly combines program generator logic
with the generated code in a single program, using only
types to distinguish the two stages of execution. Through
extensive use of component technology, LMS makes a reusable and extensible compiler framework available at the
library level, allowing programmers to tightly integrate
domain-specific abstractions and optimizations into the
generation process, with common generic optimizations
provided by the framework.
LMS is well suited to develop embedded domain-specific
languages (DSLs) and has been used to develop powerful
performance-oriented DSLs for demanding domains such
as machine learning, with code generation for heterogeneous platforms including GPUs. LMS has also been used to
generate SQL for embedded database queries and JavaScript
for web applications.
1. in TRoDuCTion
Building and managing complex software systems is
only possible by generalizing functionality and abstracting from particular use cases. Achieving performance,
on the other hand, requires concretizing configurations
and specializing code to its particular environment.
Generative programming can bridge this gap by translating away abstraction overhead and effectively specializing generic programs.
Generative programming can be broadly classified as
static or dynamic. Static code generation happens at compile time; a widely used example is the C++ template language or macro systems in other languages. Dynamic code
generation that takes place at program runtime brings
additional flexibility, because code can be specialized with
respect to parameters not yet available at compile time.
Many computations can naturally be separated into
stages distinguished by frequency of execution or availability of information. Staging transformations aim at
executing certain pieces of code less often or at a time
when performance is less critical. In the context of program generation, multistage programming (MSP,
staging for short) as established by Taha and Sheard22 allows
programmers to explicitly delay evaluation of a program
expression to a later stage (thus, staging an expression).
The present stage effectively acts as a code generator that
composes (and possibly executes) the program of the next
stage. A nice property of this approach is that generator
and generated code are expressed in a single program,
with the aim that programmers can construct a multistage
program from a naive implementation of the same algorithm by adding staging annotations in a selective way.
Basic mechanisms for composing program fragments at
runtime have existed for a long time in the form of Lisp’s
“code is data” model and its use of quasi-quotation: syntactic annotations to denote expressions that should remain
unevaluated and to mark holes within them, to be filled
in with expressions computed elsewhere. Dedicated MSP
languages such as MetaML22 and MetaOCaml1 add well-scoping and well-typing guarantees about the generated
code. Despite these advances, incorporating dynamic code
generation into larger systems in the form of self-optimizing
“active” libraries or adding compilation to embedded
domain-specific languages (DSLs) that are implemented as
libraries remains a significant challenge.
1. 1. Contributions
In this paper, we present lightweight modular staging
A previous version of this paper was published in the
Proceedings of the 9th International Conference on
Generative Programming and Component Engineering
(Oct. 10–13, Eindhoven , The Netherlands).