call. When the definition is known to be side-effect free,
to Atom will search the already encountered definitions
for a structurally equivalent one. If a matching previous
definition is found, its symbol will be returned, possibly
moving the definition to a parent scope to make it accessible. If the definition has side effects or it is seen for the first
time, it will be associated with a fresh symbol and saved for
future reference. This simple scheme provides a powerful
global value numbering (common subexpression elimination) optimization that effectively prevents generating
duplicate code. More complicated optimization schemes
can be added, too.
Since all operations in interface traits such as Arith
return Rep types, defining Rep[T] = Exp[T] in trait
BaseExp (see Figure 5) means that conversion to symbols
will take place within the constructor methods (e.g., cos or
infix_*). This fact is important because it establishes our
correspondence between the evaluation order of the program generator and the evaluation order of the generated
program: at the point where the generator calls toAtom, the
composite definition is turned into an atomic value, that is,
its evaluation is recorded now and played back later in the
same relative order with respect to others within the closest
reifyBlock invocation.
We observe that there are no concrete definition classes
provided by the trait Expressions. Providing meaningful
data types is the responsibility of other traits that implement
the interfaces defined previously (Base and its descendants).
For each interface trait, there is one corresponding core implementation trait. Shown in Figure 5, we have traits BaseExp,
ArithExp, and TrigExp for the functionality required
by the FFT example. The trait BaseExp installs atomic
figure 5. implementing the interface traits from figure 3 using the
expression types from figure 4.
trait BaseExp extends Base with Expressions {
type Rep[+T] = Exp[T]
}
trait ArithExp extends Arith with BaseExp {
implicit def unit(x: Double) = Const(x)
case class Plus(x: Exp[Double], y: Exp[Double])
extends Def[Double]
case class Times(x: Exp[Double], y: Exp[Double])
extends Def[Double]
def infix_+(x: Exp[Double], y: Exp[Double]) =
Plus(x, y)
def infix_*(x: Exp[Double], y: Exp[Double]) =
Times(x, y)
}
trait TrigExp extends Trig with BaseExp {
case class Sin(x: Exp[Double]) extends Def[Double]
case class Cos(x: Exp[Double]) extends Def[Double]
def sin(x: Exp[Double]) = Sin(x)
def cos(x: Exp[Double]) = Cos(x)
}
expressions as the representation of staged values by defining Rep[T] = Exp[T]. Traits ArithExp and TrigExp define
one definition class for each operation defined by Arith and
Trig, respectively, and implement the corresponding interface methods to create instances of those classes.
2. 2. implementing optimizations
Some profitable optimizations, such as the global value
numbering described above, are very generic. Other optimizations apply only to specific aspects of functionality, for example, particular implementations of constant
folding (or more generally, symbolic rewritings) such as
replacing computations like x
1.0 with x. Yet, other
optimizations are specific to the actual program being
staged. In the FFT case, Kiselyov et al.
12 describe a number
of rewritings that are particularly effective for the patterns
of code generated by the FFT algorithm but not as much as
for other programs.
What we want to achieve again is modularity, so that optimizations can be combined in a way that is most useful for a
given task. To implement a particular rewriting rule (whether
specific or generic), say, x
1.0 → x, we can provide a specialized implementation of infix_* (overriding the one in
trait ArithExp) that will test its arguments for a particular
pattern. How this can be done in a modular way is shown
by the traits ArithExpOpt and ArithExpOptFFT, which
implement some generic and program-specific optimizations (see Figure 6). Note that the use of x*y within the body
of infix_* will apply the optimization recursively.
figure 6. extending the implementation from figure 5 with generic
(top) and specific (bottom) optimizations (analog of TrigExp has been
omitted).
trait ArithExpOpt extends ArithExp {
override def infix_*(x:Exp[Double],y:Exp[Double])=
(x,y) match {
case (Const(x), Const(y)) => Const(x y)
case (x, Const( 1)) => x
case (Const( 1), y) => x
case _ =>
super.infix_*(x, y)
}
}
trait ArithExpOptFFT extends ArithExp {
override def infix_*(x:Exp[Double],y:Exp[Double]) =
(x,y) match {
case (Const(k), Def(Times(Const(l), y))) =>
Const(k l) y
case (x, Def(Times(Const(k), y))) =>
Const(k) (x y))
case (Def(Times(Const(k), x)), y) =>
Const(k) (x y))
...
case (x, Const(y)) => Times(Const(y), x)
case _ =>
super.infix_*(x, y)
}
}