uals contributing patches to the open
source DPDK project as of release
17.05. While DPDK is network-centric,
it provides the basis for the SPDK storage-centric ecosystem. Other projects,
such as FD.IO (http://fd.io) and Seastar
(http://seastar-project.org), also use
DPDK. These domain specifics are not
discussed in this article.
Linux user space device enablers.
Linux kernel version 2.6 introduced
the User Space IO (UIO)a loadable module. UIO is the older of the two kernel-bypass mechanisms in Linux (VFIO being the other). It provides an API that
enables user space handling of legacy
INTx interrupts, but not message-signaled interrupts (MSI or MSI-X). UIO
also does not support DMA isolation
through the IOMMU. Even with
these limitations, UIO is well suited
for use in virtual machines, where direct IOMMU access is not available. In
these situations, a guest VM user space
process is not isolated from other processes in the same guest VM, but the
hypervisor itself can isolate the guest
VM from other VMs or host processes
using the IOMMU.
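The UIO programming model itself is small: the process waits for an interrupt with a blocking read() on the UIO file descriptor and re-enables it with a write(). A minimal sketch follows; the device path passed in is a placeholder, since the actual node (such as /dev/uio0) depends on which device the uio_pci_generic driver is bound to, and not every UIO driver supports interrupt control via write().

```c
/* Minimal UIO interrupt-wait sketch (illustrative; assumes a UIO device
 * node exists and its driver supports irq control via write()). */
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Block until the next legacy INTx interrupt on the given UIO node.
 * Returns the kernel's running interrupt count, or -1 on error
 * (for example, when no such UIO device exists). */
long uio_wait_irq(const char *dev)
{
    int fd = open(dev, O_RDWR);
    if (fd < 0)
        return -1;

    uint32_t enable = 1;
    /* Writing 1 (re-)enables the interrupt for drivers that support it. */
    if (write(fd, &enable, sizeof(enable)) != sizeof(enable)) {
        close(fd);
        return -1;
    }

    uint32_t count;
    /* read() blocks until an interrupt fires, then yields the event count. */
    ssize_t n = read(fd, &count, sizeof(count));
    close(fd);
    return n == (ssize_t)sizeof(count) ? (long)count : -1;
}
```

In a polling-oriented design this read() would typically be multiplexed with poll() or epoll rather than called synchronously.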
For bare-metal environments, VFIOb
is the preferred framework for Linux
kernel-bypass. It operates with the
Linux kernel’s IOMMU subsystem to
place devices into IOMMU groups.
User space processes can open these
IOMMU groups and register memory
with the IOMMU for DMA access us-
ing VFIO ioctls. VFIO also provides the
ability to allocate and manage mes-
sage-signaled interrupt vectors.
Data plane development kit. DPDK
(http://dpdk.org) was originally aimed
at accelerating network packet processing applications. The project was
initiated by Intel Corporation, but is
now under the purview of the open
source Linux Foundation. At the core
of DPDK is a set of polled-mode Ethernet drivers (PMDs). These PMDs bypass the kernel and, by doing so, can
process hundreds of millions of network packets per second on standard server hardware.
DPDK also provides libraries to aid
kernel-bypass application development. These libraries enable probing
for PCI devices (attached via UIO or
VFIO), allocation of huge-page memory, and data structures geared toward
polled-mode message-passing applications such as lockless rings and memory buffer pools with per-core caches.
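The lockless ring idea can be illustrated with a simplified single-producer/single-consumer ring. DPDK's rte_ring is considerably more general (multi-producer, multi-consumer, bulk operations); the structure below is a from-scratch sketch of the core idea, not DPDK code.

```c
/* Simplified single-producer/single-consumer lockless ring; a sketch of
 * the concept behind DPDK's rte_ring, not its actual implementation. */
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8              /* must be a power of two */

struct spsc_ring {
    void *slots[RING_SIZE];
    _Atomic unsigned head;       /* next slot the producer writes */
    _Atomic unsigned tail;       /* next slot the consumer reads  */
};

static bool ring_enqueue(struct spsc_ring *r, void *obj)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;            /* ring full */
    r->slots[head & (RING_SIZE - 1)] = obj;
    /* Release ordering publishes the slot before advancing head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

static bool ring_dequeue(struct spsc_ring *r, void **obj)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail == head)
        return false;            /* ring empty */
    *obj = r->slots[tail & (RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Because producer and consumer each own one index, no locks are needed; this is what makes such rings a good fit for core-to-core message passing in polled-mode designs.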
Figure 6 shows key components of the DPDK framework.
Storage performance development
kit. SPDK is based on the foundations
of DPDK. It was introduced by Intel
Corporation in 2015 with a focus on
enabling kernel-bypass storage and
storage-networking applications using
NVMe SSDs. While SPDK is primarily
driven by Intel, there are an increas-
ing number of companies using and
contributing to the effort. The project seeks broader collaboration, which
may require adopting a governance
structure similar to DPDK's. SPDK shows
good promise for filling the same role
for storage and storage networking as
DPDK has for packet processing.
SPDK’s NVMe polled-mode drivers
provide an API to kernel-bypass appli-
cations for both direct-attached NVMe
storage as well as remote storage us-
ing the NVMe over Fabrics protocol.
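The polled-mode pattern is simple to sketch: instead of sleeping on an interrupt, the application repeatedly checks completion entries that the device writes into host memory (in SPDK the analogous call is spdk_nvme_qpair_process_completions). The queue below is a mock, purely for illustration; a real NVMe driver checks a phase bit in device-written completion entries.

```c
/* Mock completion queue illustrating the polled-mode reaping pattern. */
#include <stdbool.h>
#include <stddef.h>

#define CQ_DEPTH 16

struct completion {
    bool done;                    /* stand-in for the NVMe phase bit */
    int  status;
};

struct cq {
    struct completion entries[CQ_DEPTH];
    size_t head;                  /* next entry to reap */
};

/* Reap up to max completed entries; returns how many were processed.
 * The caller simply invokes this in a loop instead of blocking. */
static int poll_completions(struct cq *q, int max)
{
    int reaped = 0;
    while (reaped < max) {
        struct completion *c = &q->entries[q->head % CQ_DEPTH];
        if (!c->done)
            break;                /* nothing new; poll again later */
        c->done = false;          /* consume the entry */
        q->head++;
        reaped++;
    }
    return reaped;
}
```

The trade-off is explicit: the polling core is always busy, but submission-to-completion latency avoids interrupt delivery and context-switch costs entirely.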
Figure 7 shows the SPDK framework’s
core elements as of press time. Using
SPDK, Walker22 shows a reduction in IO
submission/completion overhead by a
factor of 10, as measured with the SPDK
software overhead measurement tool.
To provide the reader with a better
understanding of the impact of legacy
IO, we present data from the ‘fio’ benchmarking tool (https://github.com/axboe/fio). Figure 8 shows performance
data for kernel-based IO (with Ext4
and raw block access) and SPDK. The
data compares throughput as a function
of the number of client threads. The
configuration uses a queue depth of 32
and an IO size of 4KiB. Sequential read,
sequential write, random read, random write, and 50:50
read-write workloads are examined.
The key takeaway is that SPDK requires only one thread to get over 90%
of the device’s maximum performance.
Figure 7. SPDK architecture. (The figure shows third-party modules, Logical Volumes, BlobFS, Linux Async IO, Ceph RBD, the Block Device Abstraction (BDEV) layer, and the core NVMe drivers.)