software-hardware innovation inside
the SSD. Moreover, going beyond the
packaged SSD, because the two major
components inside the SSD are each
manufactured by multiple vendors,[d] it
is conceivable that SSDs could be custom
designed and provided in partnership
with component vendors[e] (just
like how today’s datacenter servers are
built and deployed), and that some of
the designs could even be contributed
back to the community (via forums like
the Open Compute Project,
https://www.opencompute.org). For
example, the industry
is already moving in this direction
with the introduction of the Open-Channel
SSD technology [2, 8],[f] which moves much of
the SSD firmware functionality out of
the black box and into the operating
system or userspace, giving applica-
tions better control over the device. In
an open source project called Denali[g]
in 2018, Microsoft proposed a scheme
[d] Several vendors manufacture each type of
component in flash SSDs. For example, flash
controllers are manufactured by Marvell, PMC
(acquired by Microsemi), Sandforce (acquired by
Seagate), and Indilinx (acquired by OCZ), and flash
memory is manufactured by Samsung, Toshiba,
and Micron.
[e] Many large-scale datacenter operators (such
as Google [19] and Baidu [16]) build their own SSDs
that are fully optimized for their own
application requirements.
[f] The Linux Open-Channel SSD subsystem was
introduced in Linux kernel version 4.4.
[g] https://bit.ly/2GCuIum
ular debugging tools (such as Micro-
soft Visual Studio) available to general
application developers. Worse, the de-
vice-side processing code—selection
and aggregation—had to be compiled
into the SSD firmware in the prototype,
meaning application developers would
need to worry about not only the target
application itself but also complex in-
ternal structures and algorithms in the
SSD firmware.
On top of this, the consequences
of an error can be quite severe:
corrupted data or an
unusable drive. Workaday application
programmers are unlikely to accept
the additional complexity, and cloud
providers are unlikely to let untrusted
code run in such a fragile environment.
Application developers need a flexible and general programming model
that makes it easy to run user code
written in a high-level programming
language (such as C/C++) inside an
SSD. The programming model must
also support the concurrent execution
of multiple in-SSD applications while
ensuring that malicious applications
do not adversely affect the overall SSD
operation or violate protection guarantees provided by the operating system and
file system.
In 2014, Seshadri et al. [20] proposed
Willow, an SSD that made programmability a central feature of the SSD
interface, allowing ordinary developers
to safely augment and extend the SSD
semantics with application-specific
functions without compromising file
system protections. In their model, host
and in-SSD applications communicate
via PCIe using a simple, generic—not
storage-centric—remote procedure call
(RPC) mechanism. In 2016, Gu et al. [7] explored a flow-based programming model in which an in-SSD application can be
constructed from tasks and data pipes
connecting the tasks. These programming models provide great flexibility in
terms of programmability but are still
far from “general purpose.” There is
a risk that existing large applications
might still need significant redesigns
to exploit each model’s capabilities, requiring much time and effort.
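To make the flow-based idea concrete, the following is a minimal sketch of an application built from tasks connected by data pipes, using the selection and aggregation operations mentioned earlier. It runs on an ordinary host and is not the interface proposed by Gu et al.; the task names, the `price` field, the sentinel convention, and the use of threads and queues as stand-ins for in-SSD tasks and pipes are all assumptions made for illustration.

```python
from queue import Queue
from threading import Thread

# Hypothetical convention: a unique sentinel object marks the end of a stream.
_END = object()

def read_task(pages, out_pipe):
    """Source task: emits records page by page, standing in for flash reads."""
    for page in pages:
        for record in page:
            out_pipe.put(record)
    out_pipe.put(_END)

def select_task(in_pipe, out_pipe, predicate):
    """Selection task: forwards only the records that satisfy the predicate."""
    while (record := in_pipe.get()) is not _END:
        if predicate(record):
            out_pipe.put(record)
    out_pipe.put(_END)

def aggregate_task(in_pipe, result):
    """Aggregation task: reduces the incoming stream to a count and a sum."""
    count, total = 0, 0
    while (record := in_pipe.get()) is not _END:
        count += 1
        total += record["price"]
    result["count"], result["total"] = count, total

def run_pipeline(pages, predicate):
    """Wire the three tasks together with two pipes and run them to completion."""
    pipe_a, pipe_b = Queue(), Queue()
    result = {}
    tasks = [
        Thread(target=read_task, args=(pages, pipe_a)),
        Thread(target=select_task, args=(pipe_a, pipe_b, predicate)),
        Thread(target=aggregate_task, args=(pipe_b, result)),
    ]
    for task in tasks:
        task.start()
    for task in tasks:
        task.join()
    return result

# Two "pages" of records; only the small aggregate would cross back to the host.
pages = [
    [{"price": 10}, {"price": 250}],
    [{"price": 300}, {"price": 5}],
]
summary = run_pipeline(pages, lambda r: r["price"] >= 100)
print(summary)
```

The point of the structure is that each task sees only its input and output pipes, so tasks can be rearranged or replaced without touching the others. An existing application, however, still has to be decomposed into such tasks, which is the redesign cost noted above.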
Fortunately, winds of change can
disrupt the industry and help applica-
tion developers explore SSD program-
ming in a better way, as illustrated in
Figure 3. The processing capabilities
available inside the SSD are increasingly
powerful, with abundant compute
and bandwidth resources. Emerging
SSDs include software-programmable
controllers with multi-core proces-
sors, built-in hardware accelerators
to offload compute-intensive tasks
from the processors, multiple GBs of
DRAM, and tens of independent chan-
nels to the underlying storage media,
allowing several GB/s of internal data
throughput. Even more interesting
and useful, programming SSDs is be-
coming easier, with the trend away
from proprietary architectures and
software runtimes and toward com-
modity operating systems (such as
Linux) running on top of general-
purpose processors (such as ARM and
RISC-V). This trend enables general
application developers to fully lever-
age existing tools, libraries, and exper-
tise, allowing them to focus on their
own core competencies rather than
spending many hours getting used to
the low-level, embedded development
process. This also allows application
developers to easily port large applica-
tions already running on host operat-
ing systems to the device with mini-
mal code changes.
All in all, the programmability evolution in SSDs presents a unique opportunity to embrace SSDs as a
first-class programmable platform
in cloud datacenters, enabling
Figure 3. Disruptive trends in the flash storage industry toward abundant resources and
increased ease of programmability inside the SSD.
[Figure 3 diagram. Vertical axis: resources inside the SSD, from frugal to abundant (CPU #cores/clock speed, hardware offload, DRAM, #flash channels and capacity). Horizontal axis: ease of programmability inside the SSD, from embedded CPU with proprietary firmware OS to general-purpose CPU with server-like OS (Linux). A disruptive trend leads from today's SSD to the programmable SSD.]