major LMS components on the way. Section 2. 1 discusses
representations of staged code, Section 2. 2 is concerned
with optimizations, Section 2. 3 is concerned with target
code generation, and Section 2. 4 concludes the example
by showing how the generated code can be integrated in
larger programs. Section 3 describes how additional features can be added, in particular functions and recursion.
Section 4 discusses related work. An earlier version of this
paper appeared at GPCE 2010. The original version contains a few additional examples and a more thorough discussion of how effectful statements are represented, while
the present version has been updated with some new material in Section 3.
2. An enD-To-enD eXAMPLe
In the same way as the simple power function shown above,
we can stage far more interesting and practically relevant
programs, for example, the fast Fourier transform (FFT).
A staged FFT, implemented in MetaOCaml, has been presented by Kiselyov et al.
12 Their work is a very good showcase for how staging allows one to transform a simple,
unoptimized algorithm into an efficient program generator. Achieving this in the context of MetaOCaml, however,
required restructuring the program into monadic style
and adding a front-end layer for performing symbolic
rewritings. Using our approach of just adding Rep types,
we can go from the naive textbook-algorithm to the staged
version (shown in Figure 1) by changing literally two lines
of the code:
trait FFT { this: Arith with Trig =>
case class Complex(re: Rep[Double], im:
...
}
All that is needed is adding the self-type annotation to
import arithmetic and trigonometric operations and changing the type of the real and imaginary components of complex numbers from Double to Rep[Double].
Merely changing the types will not provide us with the
desired optimizations yet. We see below how we can add
the transformations described by Kiselyov et al. to generate the same fixed-size FFT code, corresponding to the
famous FFT butterfly networks (see Figure 2). Despite the
seemingly naive algorithm, this staged code is free from
branches, intermediate data structures, and redundant
computations. The important point here is that we can
add these transformations without any further changes to
the code in Figure 1, just by mixing in the trait FFT with a
few others.
In the remainder of this section, we present the LMS
framework that is used to generate the code in Figure
2 from the algorithm in Figure 1. Before considering
specific optimizations, let us have a closer look at the
definition of Rep and the traits Arith and Trig. The definitions are given in Figure 3. In trait Base, the declaration type Rep[+T] defines an abstract type constructor
figure 1. ff T code. only the real and imaginary components of
complex numbers need to be staged.
trait FFT { this: Arith with Trig =>
case class Complex(re: Rep[Double], im: Rep[Double]) {
def +(that: Complex) =
Complex(
this.re + that.re,
this.im + that.im)
def *(that: Complex) = ...
}
def omega(k: Int, N: Int): Complex = {
val kth = − 2.0 k Math.Pi/N
Complex(cos(kth), sin(kth) )
}
def fft(xs: Array[Complex]): Array[Complex] = xs match {
case (x :: Nil) => xs
case _ =>
val N = xs.length // assume it’s a power of two
val (even0, odd0) = splitEvenOdd(xs)
val (even1, odd1) = (fft(even0), fft(odd0))
val (even2, odd2) = (even1 zip odd1 zipWithIndex)
map {
case ((x, y), k) =>
val z = omega(k, N) y
(x + z, x − z)
}.unzip;
even 2 ::: odd2
}
}
figure 2. Computation graph for size- 4 ff T. Autogenerated from
staged code in figure 1.
Sym(0)
Sym( 4)
Sym( 2)
Sym( 6)
Sym( 26)
Result(Array(...))
Sym( 3)
Sym( 7)
Sym( 1)
Sym( 5)
Sym( 8)
Plus(Sym(0),Sym( 4))
Sym( 12)
Plus(Sym( 2),Sym( 6))
Sym( 10)
Minus(Sym( 10),Sym( 4))
Sym( 14)
Minus(Sym( 2),Sym( 6))
Sym( 15)
Minus(Sym( 3),Sym( 7))
Sym( 13)
Plus(Sym( 3),Sym( 7))
Sym( 11)
Minus(Sym( 1),Sym( 5))
Sym( 9)
Plus(Sym( 1),Sym( 5))
Sym( 18)
Minus(Sym( 8),Sym( 12))
Sym( 16)
Plus(Sym( 8),Sym( 12))
Sym( 17)
Plus(Sym( 9),Sym( 13))
Sym( 19)
Minus(Sym( 9),Sym( 13))
Sym( 22)
Plus(Sym( 10),Sym( 15))
Sym( 24)
Minus(Sym( 10),Sym( 15))
Sym( 23)
Minus(Sym( 11),Sym( 14))
Sym( 25)
Plus(Sym( 11),Sym( 14))
(also called a higher kinded type) Rep which we take to
range over possible representations of staged expressions. Since Rep is abstract, no concrete representation
is defined yet; the declaration merely postulates the existence of some representation.
Trait Arith extends trait Base and contains only
abstract members. These postulate the existence of an