research highlights

DOI: 10.1145/3366341
To view the accompanying paper, visit doi.acm.org/10.1145/3366343

Technical Perspective
Bootstrapping a Future of Open Source, Specialized Hardware
By Michael B. Taylor
COMPUTER ARCHITECTURE IS currently
undergoing a radical and exciting
transition as the end of Moore’s Law
nears, and the burden of increasing
humanity’s ability to compute falls to
the creativity of computer architects
and their ability to fuse together the
application and the silicon. A case in
point is the recent explosion of deep
neural networks, which occurred as
a result of a drop in the cost of compute because of successful parallelization with GPGPUs (general-purpose
graphics processing units) and the
ability of cloud companies to gather
massive amounts of data to feed the
algorithms. As improvements in general-purpose architecture slow to a
standstill, we must specialize the architecture for the application in order
to overcome fundamental energy efficiency limits that prevent humanity’s
progress. This drive to specialize will
bring another wave of chips with neural-network-specific accelerators currently in development worldwide, but
also a host of other kinds of accelerators, each specialized for a particular
planet-scale purpose.
Organizations like Google, Microsoft, and Amazon are increasingly finding reasons to bypass the confines imposed by traditional silicon companies
by rolling their own silicon that is tailored to their own datacenter needs. In
this new world, a multicore processor
acts more as a caretaker of the accelerator than as the main act.
However, specialization brings challenges, primarily the high NRE (non-recurring engineering) costs and the long time-to-market of developing customized chips. Ultimately this NRE will limit the diversity of specialized chips that are economically feasible. For this reason, a new style of computer architecture research has emerged, which attacks the challenge of driving down the cost and time to market of developing these specialized hardware designs. A growing movement within academia is to train Ph.D. students with the skills necessary to succeed in this brave new world, learning not only how to perform research but also how to design and build chips. Both feeding into and out of this approach is the growth of an active open source movement that ultimately will provide many of the components that will be mixed and matched to create low-NRE designs.
The OpenPiton research, led by
Princeton professor David Wentzlaff,
is one of the watershed moments in
this fundamental shift toward the construction of an open source ecosystem
in computer architecture. OpenPiton is an open source distributed
cache-coherent manycore processor
implementation for cloud servers. Unlike most multicore implementations,
OpenPiton implements a new scalable, directory-based cache-coherence
protocol (known as P-Mesh) with three
levels of cache, including a distributed,
shared last-level L2 cache that scales
in size with the number of tiles. Cache
coherence is maintained using three
physical Networks on Chip (NoCs),
which can connect general-purpose
cores, accelerators, and other peripherals, and can be extended across multiple chips to build systems with up to
500 million cores.
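To make the directory-based approach concrete, the sketch below models, in Python, how a home directory might track which tiles hold a cache line, downgrading a writer when a new reader arrives and invalidating sharers when a tile requests ownership. It is a minimal illustration under simplifying assumptions: the names and the MSI-style states are hypothetical and do not reproduce P-Mesh's actual protocol states or NoC message formats.

# Minimal, hypothetical sketch of directory-based cache coherence.
# Illustrative only; not OpenPiton's actual P-Mesh implementation.
from dataclasses import dataclass, field

@dataclass
class DirectoryEntry:
    owner: int | None = None                         # tile holding the line in Modified state
    sharers: set[int] = field(default_factory=set)   # tiles holding read-only copies

class HomeDirectory:
    """Home node that serializes coherence requests for the lines it tracks."""

    def __init__(self) -> None:
        self.entries: dict[int, DirectoryEntry] = {}

    def _entry(self, addr: int) -> DirectoryEntry:
        return self.entries.setdefault(addr, DirectoryEntry())

    def read(self, tile: int, addr: int) -> list[str]:
        """A tile asks for a shared (read-only) copy of a line."""
        entry, msgs = self._entry(addr), []
        if entry.owner is not None:                   # downgrade a modified owner to sharer
            msgs.append(f"downgrade line {addr:#x} at tile {entry.owner}")
            entry.sharers.add(entry.owner)
            entry.owner = None
        entry.sharers.add(tile)
        msgs.append(f"send data for line {addr:#x} to tile {tile}")
        return msgs

    def write(self, tile: int, addr: int) -> list[str]:
        """A tile asks for exclusive (writable) ownership of a line."""
        entry, msgs = self._entry(addr), []
        for s in sorted(entry.sharers - {tile}):      # invalidate every other sharer
            msgs.append(f"invalidate line {addr:#x} at tile {s}")
        if entry.owner not in (None, tile):
            msgs.append(f"invalidate line {addr:#x} at tile {entry.owner}")
        entry.sharers.clear()
        entry.owner = tile
        msgs.append(f"grant ownership of line {addr:#x} to tile {tile}")
        return msgs

if __name__ == "__main__":
    home = HomeDirectory()
    for msg in home.read(0, 0x80) + home.read(1, 0x80) + home.write(2, 0x80):
        print(msg)

In a real design such as OpenPiton, these requests, responses, and invalidations would travel as messages over the separate physical NoCs described above; here they are simply printed for brevity.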
In contrast to existing Intel Xeon
processors, manycores are designed
to have low implementation complexity in order to minimize NRE. Manycore is clearly the future of low-NRE
general-purpose server architecture
and provides scalable general-purpose performance that can be coupled
with emerging classes of accelerators.
The OpenPiton work raises the bar not only by releasing the design as open source for use by others, but also by serving as a framework for micro-architectural exploration in Infrastructure-as-a-Service (IaaS) clouds. Much of this
micro-architectural work focuses on
how resources, whether inside a core,
in the cache, or in the on-chip or off-chip interconnect, are shared between different jobs running on the system. Other OpenPiton-related work has also explored issues in security and side-channel attacks.

While most computer architects use in-house simulators or from-scratch implementations to do their research, which results in questionable claims of validity and reproducibility, the Princeton team took an extremely clever approach: they leveraged an existing open source processor design, the OpenSPARC T1, and extended it into an entirely new scalable design. The team then integrated their research projects into this chip design and taped out many of them, so that all of these have a real-world physical realization in an advanced 25-core manycore processor in 32-nm technology. Finally, they released this effort as the open source OpenPiton scalable processor, which is the only real-world open source platform for both prototyping and experimenting with Linux-capable manycores.

I believe this work will unlock the next 20 years of progress in Linux-capable manycore research in academia, which has largely fizzled because of the lack of realistic, silicon-validated models to work with. At the same time, OpenPiton's scalable cache coherence implementation is licensed under the BSD license, which allows it to be freely mixed and matched. Indeed, work is already underway to retrofit open source RISC-V implementations like BlackParrot and Ariane. I expect OpenPiton's influence will grow across the community and enable larger and larger research projects that can truly deliver specialization across the stack.

Michael B. Taylor is a professor in the Paul Allen School of Computer Science and Engineering and the Department of Electrical Engineering at the University of Washington, Seattle, WA, USA.

Copyright held by author/owner.