through memory) and preserves cache coherence. SEED
adds architectural state, which must be maintained at context switches. Lastly, functional units (FUs) could be shared
with the GPP to save area (by adding bypass paths); this work
considers stand-alone FUs.
3.2. Dataflow execution model
SEED’s execution model closely resembles prior dataflow
architectures, but is restricted to loops and nested loops, and
adds compound instructions.
We use a running example to aid explanation: a simple
linked-list traversal where a conditional computation is
performed at each node. Figure 7(a) shows the original program, (b) the Von Neumann control flow graph (CFG) representation, and (c) SEED’s explicit-dataflow representation.
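Because Figure 7 is not reproduced in this text, the following is a minimal Python sketch of a loop with the shape described above: a linked-list walk with a conditional computation at each node. Only the identifiers a_next, n_val, and v2 come from the text; the node layout, the comparison, and the two branch computations are assumptions made purely for illustration and may differ from the original figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical list node: one value and a pointer to the next node."""
    val: int
    next: Optional["Node"] = None


def build_list(values):
    """Helper (not from the paper): build a linked list from a Python list."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def traverse(head, v2=0):
    """Walk the list and update v2 conditionally at each node."""
    a = head
    while a is not None:
        n_val = a.val         # load the node's value
        a_next = a.next       # load the address of the next node
        if n_val > 0:         # assumed comparison on n_val
            v2 = v2 + n_val   # assumed "if" computation
        else:
            v2 = v2 - 1       # assumed "else" computation
        a = a_next            # a_next feeds the next iteration
    return v2
```

In this sketch, a_next is the only value the address computation of the next iteration needs, which is the loop-carried dependence the running example relies on.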
Data-dependence: Similar to other dataflow representations, SEED programs follow the dataflow firing rule: instructions execute when their operands are ready. To initiate
computation, live-in values are sent from the host. During
dataflow execution, each instruction forwards its outputs to
dependent instructions, either in the same iteration (solid
line in Figure 7(c)), or in a subsequent iteration (dotted
line). For example, the a_next value loaded from memory
is passed on to the next iteration for address computation.
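The firing rule can be illustrated with a small token-driven interpreter. This is a sketch under stated assumptions, not SEED's hardware mechanism: the Instr class, the run function, and the (name, slot, value) token format are hypothetical, and the model ignores iteration tagging, so it assumes at most one outstanding token per operand slot.

```python
from collections import deque


class Instr:
    """Hypothetical dataflow instruction with a fixed number of operand slots."""

    def __init__(self, op, n_inputs, dests):
        self.op = op              # function applied once all operands arrive
        self.n_inputs = n_inputs  # number of operand slots
        self.dests = dests        # list of (instruction name, slot) consumers
        self.operands = {}        # slot -> value received so far


def run(instrs, live_ins):
    """Execute until no tokens remain.

    instrs:   dict mapping instruction name -> Instr
    live_ins: list of (name, slot, value) tokens sent by the host
    Returns the firing order and the last result of each instruction.
    """
    tokens = deque(live_ins)
    fired, results = [], {}
    while tokens:
        name, slot, value = tokens.popleft()
        ins = instrs[name]
        ins.operands[slot] = value
        # Firing rule: execute only when every operand is ready.
        if len(ins.operands) == ins.n_inputs:
            args = [ins.operands[i] for i in range(ins.n_inputs)]
            ins.operands = {}     # operands are consumed by firing
            out = ins.op(*args)
            fired.append(name)
            results[name] = out
            # Forward the output to every dependent instruction.
            for dest, dest_slot in ins.dests:
                tokens.append((dest, dest_slot, out))
    return fired, results


# Example graph: mul(add(x, y), z). The live-in for mul arrives first,
# but mul cannot fire until add has produced its other operand.
graph = {
    "add": Instr(lambda x, y: x + y, 2, [("mul", 0)]),
    "mul": Instr(lambda s, z: s * z, 2, []),
}
order, values = run(graph, [("mul", 1, 10), ("add", 0, 2), ("add", 1, 3)])
```

Here the arrival order of the live-in tokens does not determine the execution order; only operand readiness does.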
Control-flow strategy: Control dependencies between
instructions are converted into data dependencies. SEED
uses a switch instruction, which forwards values to one of two
possible destinations depending on the input control signal.
In the example, depending on the n_val comparison, v2
is forwarded to either the if or else branch. This strategy
enables control-equivalent regions to execute in parallel.
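The switch-based conversion of control flow into data flow can be sketched as follows. The NO_TOKEN sentinel, the function names, the comparison, and the branch computations are assumptions for illustration; only the idea of steering v2 to one of two destinations according to a control input comes from the text.

```python
# Sentinel meaning "no value was sent on this output" (hypothetical).
NO_TOKEN = object()


def switch(control, value):
    """Forward value to exactly one of two outputs: (true_out, false_out)."""
    return (value, NO_TOKEN) if control else (NO_TOKEN, value)


def conditional_update(n_val, v2):
    """One iteration's conditional computation expressed without branches
    in the consumer logic: each side fires only if it received a token."""
    control = n_val > 0                    # assumed comparison on n_val
    to_if, to_else = switch(control, v2)

    produced = []
    if to_if is not NO_TOKEN:              # "if" side has its operand
        produced.append(to_if + n_val)     # assumed "if" computation
    if to_else is not NO_TOKEN:            # "else" side has its operand
        produced.append(to_else - 1)       # assumed "else" computation

    # Exactly one side receives v2, so exactly one result is produced.
    assert len(produced) == 1
    return produced[0]
```

The two presence checks stand in for the firing rule: an instruction that never receives its operand simply never executes, so no program counter is needed to skip the untaken side.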
3.1. Von Neumann core integration
Adaptive execution: To adaptively apply explicit-dataflow specialization, we use a technique similar to big.LITTLE, except
that we restrict the entry points of specializable regions to
fully-inlined loops or nested loops. This simplifies integration with a different ISA. Targeting longer nested-loop
regions reduces the cost of configuration and GPP core
GPP integration: SEED uses the same cache hierarchy as
the GPP, which facilitates fast switching (no data-copying
[Figure 7 labels: Subgraphs 1 and 2 mapped to CFU 1; Subgraphs 3 and 4 mapped to CFU 2]
Figure 7. (a) Example C loop; (b) control flow graph (CFG); (c) SEED program representation.
[Figure 6 labels: Specialization Engine for Explicit-Dataflow (SEED); SEED Unit 1; SEED Unit 8]
Figure 6. High-level SEED integration and organization (IMU: instruction
management unit; CFU: compound functional unit; ODU: output