[Figure 4: bar chart. X-axis (modules): Tile, CCX arbiter, FPU, L1.5, L2, NoC router, Core, Chip bridge. Y-axis: coverage percentage (0 to 100). Series: Overall score, Cond, Toggle, FSM, Branch.]
Figure 4. Test suite coverage results by module (default OpenPiton
configuration).
[Figure 5: area chart. Labeled slices: L2 23.11%; L1.5 14.35%; FPU 7.74%; NoC Router0 0.70%; NoC Router1 0.83%; NoC Router2 0.86%; CCX Arbiter + Misc. Logic 0.82%.]
Figure 5. Tile area breakdown for FPGA PicoPiton.
Linux for SPARC (see footnote a), and OpenSolaris (and its successors).
Porting the OpenSPARC T1 hypervisor required changes
to fewer than 10 instructions, and a newer Debian Linux
distribution was modified with open source, readily
available, OpenSPARC T1-specific patches written as
part of Lockbox [3, 4].
OpenPiton provides additional stability on top of what is
inherited from OpenSPARC T1. The tool flow was updated to
modern tools and ported to modern Xilinx FPGAs. OpenPiton
is also used extensively for research internal to Princeton.
This means there is active support for OpenPiton, and the
code is constantly being improved and optimized, with regular
releases over the last several years. In addition, open-sourcing
OpenPiton has strengthened its stability as a community has
built up around it.
Validation. When designing large-scale processors, simulation of the
hardware design is a must. OpenPiton supports one open source and
multiple commercial Verilog simulators, which can simulate the
OpenPiton design at rates of up to tens or hundreds of kilohertz.
OpenPiton inherited and then extended the OpenSPARC T1's large test
suite, which comprises thousands of directed assembly tests,
randomized assembly test generators, and tests written in C. The suite
covers not only the core but also the memory system, I/O, the cache
coherence protocol, and more. Additionally, extensions such as
Execution Drafting (ExecD) (Section 4.1.1) have their own test suites.
When making research modifications to OpenPiton, researchers can rely
on an established test suite to ensure that their modifications did
not introduce any regressions. In addition, the OpenPiton
documentation details how to add new tests to validate modifications
and extend the existing test suite. Researchers can also use our
scripts to run large regressions in parallel (to offset the slow
execution of individual simulations), automatically produce pass/fail
reports and coverage reports (as shown in Figure 4), and run synthesis
to verify that synthesis-safe Verilog has been used. Our scripts
support the widely used SLURM job scheduler and integrate with Jenkins
for continuous integration testing.
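To make the parallel-regression idea concrete, the following is a minimal sketch, not OpenPiton's actual scripts. It assumes each test can be launched as a command whose zero exit code means "pass"; the names `run_test`, `run_regression`, and `format_report` are hypothetical, and the demo commands are stand-ins for real simulator invocations.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_test(name, command, timeout_s=600):
    """Run one test command; a zero exit code counts as a pass (assumption)."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout_s)
        return name, result.returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # A hung or unlaunchable simulation is reported as a failure.
        return name, False


def run_regression(tests, max_workers=4):
    """Run all tests concurrently and return a {name: passed} mapping."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_test, name, cmd) for name, cmd in tests.items()]
        return dict(future.result() for future in futures)


def format_report(results):
    """Build a plain-text pass/fail report with a summary line at the end."""
    lines = [f"{'PASS' if ok else 'FAIL'}  {name}" for name, ok in sorted(results.items())]
    lines.append(f"{sum(results.values())}/{len(results)} passed")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical stand-ins for simulator command lines.
    demo_tests = {
        "passing_test": [sys.executable, "-c", "raise SystemExit(0)"],
        "failing_test": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    print(format_report(run_regression(demo_tests, max_workers=2)))
```

Threads are sufficient here because each worker only waits on a child process. On a cluster, the same structure would instead submit one scheduler job per test and collect the exit codes afterwards.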
3.2. FPGA prototyping
OpenPiton can also be emulated on FPGA, which provides the
opportunity to prototype the design, emulated at tens of
megahertz, to improve throughput when running our test suite
or more complex code, such as an interactive operating system.
OpenPiton is actively supported on three Xilinx FPGA platforms:
Artix-7 (Digilent Nexys Video), Kintex-7 (Digilent Genesys 2),
and Virtex-7 (VC707 Evaluation Board). An external port is also
maintained for the Zynq-7000 (ZC706 Evaluation Board).
Figure 5 shows the area breakdown for a minimized
"PicoPiton" core, implemented for an Artix-7 FPGA
(Digilent Nexys 4 DDR).
OpenPiton designs have the same features as the Piton
processor, validating the feasibility of that particular design
(multicore functionality, etc.), and can include the chip
bridge to connect multiple FPGAs via an FPGA Mezzanine
Card (FMC) link. All of the FPGA prototypes feature a full
system (chip plus chipset), using the same codebase as the
chipset used to test the Piton processor.
OpenPiton on FPGA can load bare-metal programs over
a serial port and can boot full-stack multiuser Debian Linux
from an SD/SDHC card. Booting Debian on the Genesys 2
board running at 87.5 MHz takes less than 4 minutes (and
booting to a bash shell takes just one minute), compared
to 45 minutes for the original OpenSPARC T1, which relied
on a tethered MicroBlaze for its memory and I/O requests.
This boot time improvement, combined with our push-button
FPGA synthesis and implementation scripts, drastically
increases productivity when testing operating system
or hardware modifications.
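As a rough worked example of these figures, the sketch below computes the boot-time improvement and the number of clock cycles that fit in the quoted boot window. The 4-minute figure is an upper bound, so the speedup is a lower bound. The 100 kHz simulator rate is a hypothetical value picked from the "tens or hundreds of kilohertz" range quoted for Verilog simulation, and the comparison assumes the same cycle count in both settings, which is a simplification.

```python
FPGA_CLOCK_HZ = 87.5e6        # Genesys 2 clock quoted in the text
DEBIAN_BOOT_MIN = 4           # upper bound quoted in the text
OPENSPARC_T1_BOOT_MIN = 45    # original OpenSPARC T1 figure quoted in the text
ASSUMED_SIM_RATE_HZ = 100e3   # hypothetical simulator speed (assumption)


def boot_speedup(old_minutes, new_minutes):
    """Ratio of the old boot time to the new boot time."""
    return old_minutes / new_minutes


def cycles_elapsed(clock_hz, minutes):
    """Clock cycles that elapse in the given wall-clock time."""
    return clock_hz * minutes * 60


speedup = boot_speedup(OPENSPARC_T1_BOOT_MIN, DEBIAN_BOOT_MIN)
boot_cycles = cycles_elapsed(FPGA_CLOCK_HZ, DEBIAN_BOOT_MIN)
sim_hours = boot_cycles / ASSUMED_SIM_RATE_HZ / 3600

print(f"Boot speedup: at least {speedup:.2f}x")
print(f"Cycles in the boot window: at most {boot_cycles:.2e}")
print(f"Same cycle count at the assumed simulator rate: about {sim_hours:.1f} hours")
```

Under these assumptions, a workload that fits in a few minutes on the FPGA would take on the order of days in simulation, which is why FPGA emulation matters for operating-system-level testing.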
3.3. The Princeton Piton Processor
The Piton processor prototype [12, 13] was manufactured in
March 2015 on IBM's 32 nm SOI process with a target clock
frequency of 1 GHz. It features 25 tiles in a 5 × 5 mesh on a
6 mm × 6 mm (36 mm²) die. Each tile is two-way threaded
and includes three research projects: ExecD [11], CDR [8], and
MITTS [23], while an ORAM [7] controller was included at the chip
level. The Piton processor provides validation of OpenPiton
as a research platform and shows that ideas can be taken
from inception to silicon with OpenPiton.
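The chip-level numbers above imply a few derived quantities, sketched below. The per-tile area is only an upper bound, since the die also holds chip-level logic such as the ORAM controller and I/O. The hop-count helper assumes minimal (Manhattan-distance) routing on the mesh and an (x, y) tile coordinate scheme, both of which are illustrative assumptions rather than details taken from the text.

```python
MESH_WIDTH, MESH_HEIGHT = 5, 5   # 5 x 5 tile mesh quoted in the text
THREADS_PER_TILE = 2             # each tile is two-way threaded
DIE_SIDE_MM = 6.0                # 6 mm x 6 mm die

tile_count = MESH_WIDTH * MESH_HEIGHT
hardware_threads = tile_count * THREADS_PER_TILE
die_area_mm2 = DIE_SIDE_MM * DIE_SIDE_MM
# Upper bound only: chip-level logic and I/O also occupy die area.
max_area_per_tile_mm2 = die_area_mm2 / tile_count


def hop_distance(src, dst):
    """Hops between two (x, y) tiles, assuming minimal mesh routing."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


# Worst case is corner to corner across the mesh.
worst_case_hops = hop_distance((0, 0), (MESH_WIDTH - 1, MESH_HEIGHT - 1))

print(f"{tile_count} tiles, {hardware_threads} hardware threads")
print(f"Die area: {die_area_mm2:.0f} mm^2, at most {max_area_per_tile_mm2:.2f} mm^2 per tile")
print(f"Worst-case hop count under the routing assumption: {worst_case_hops}")
```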
With Piton, we also produced the first detailed power
and energy characterization of an open source manycore
design implemented in silicon [13]. This included
characterizing energy per instruction, NoC energy, voltage versus
frequency scaling, thermal characterization, and memory
system energy, among other properties. All of this was
done in our lab, running on the Piton processor with the
OpenPiton chipset implemented on FPGA. Performing
such a characterization yielded new insights into the
balance between recomputation and data movement,
[Footnote a: Linux for SPARC is hosted at https://oss.oracle.com/projects/linux-sparc/]