As we embark on a new era of storage performance,
the limitations of monolithic OS designs are
beginning to show. New memory technologies
(for example, 3D XPoint™ technology) are driving
multi-GB/s throughput and sub-microsecond access latencies. As the performance of these
devices approaches that of DRAM, the overhead
incurred by legacy IO stacks increasingly dominates end-to-end IO latency.
To address this concern, momentum is gathering
around new ecosystems that enable effective
construction of tailored and domain-specific IO
architectures. These ecosystems rely on bringing
both device control and data planes into user space,
so that they can be readily modified
and intensely optimized without jeopardizing system stability.
This article begins with a quantitative exploration of the need to shift
away from kernel-centric generalized
storage IO architectures. We then discuss the fundamentals of user space
(kernel-bypass) operation and the
potential gains that result. Following
this, we outline key considerations
for adopting this approach. Finally,
we briefly discuss software support for
NVDIMM-based hardware and how
this is positioned to integrate with a
user space philosophy.
Evolution of Storage IO
Since the release of the Intel 8237 in
the IBM PC platform (circa 1981), network and storage device IO has centered around the use of Direct Memory
Access (DMA). This enables the system
to transfer data between a device and
main memory without the CPU performing
the copy. Because DMA transfers can
target any part of main memory, and because drivers need to execute
privileged machine instructions (for
example, to mask interrupts), device
drivers of this era were well suited to
the kernel. While executing device
drivers in user space was in theory possible, it was unsafe because any misbehaving driver could easily jeopardize
the integrity of the whole system.
As virtualization technologies
evolved, the consequence of broad
Conventional storage software stacks are
unable to meet the needs of high-performance
Storage-Class Memory technology. It is time to
rethink 50-year-old architectures.
BY DANIEL WADDINGTON AND JIM HARRIS
• NVMe and memory-based storage
technologies are experiencing
exponential growth in performance
with aggressive parallelism and fast
new media. Traditional IO software
architectures are unable to sustain
these new levels of performance.
• IOMMU hardware is a key enabler
for realizing safe and maximally
performing user space device drivers
and storage IO stacks.
• Kernel-bypass strategies rely on
“asynchronous polling” whereby threads
actively check device completion queues.
Naive designs can lead to excessive busy-waiting and inefficient CPU utilization.
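
The busy-waiting concern in the last point can be illustrated with a small simulation. The following Python sketch is not taken from any real driver: `CompletionQueue`, `naive_poll`, `backoff_poll`, and the tick-based timing model are all hypothetical names and assumptions introduced for illustration. It compares a poller that checks the queue on every tick with one that doubles the gap between checks after each empty poll. Exponential backoff is only one possible mitigation, chosen here for brevity; it reduces wasted polls at the cost of noticing some completions later.

```python
from collections import deque


class CompletionQueue:
    """Hypothetical stand-in for a device completion queue (not a real driver API)."""

    def __init__(self, ready_ticks):
        # ready_ticks: simulated ticks at which successive commands complete.
        self._pending = deque(sorted(ready_ticks))
        self.empty_polls = 0  # polls that found nothing (wasted CPU work)

    def poll(self, now):
        """Non-blocking check: return how many commands completed by tick `now`."""
        done = 0
        while self._pending and self._pending[0] <= now:
            self._pending.popleft()
            done += 1
        if done == 0:
            self.empty_polls += 1
        return done

    def outstanding(self):
        """Number of commands that have not yet been reaped."""
        return len(self._pending)


def naive_poll(cq):
    """Check the queue on every tick until all commands are reaped."""
    now, completed = 0, 0
    while cq.outstanding():
        completed += cq.poll(now)
        now += 1
    return completed


def backoff_poll(cq, max_gap=8):
    """Double the gap between checks after an empty poll; reset to 1 on a hit."""
    now, gap, completed = 0, 1, 0
    while cq.outstanding():
        done = cq.poll(now)
        completed += done
        gap = 1 if done else min(gap * 2, max_gap)
        now += gap
    return completed


if __name__ == "__main__":
    ticks = [10, 20, 30]  # assumed completion times, in arbitrary ticks
    naive_cq, backoff_cq = CompletionQueue(ticks), CompletionQueue(ticks)
    naive_poll(naive_cq)
    backoff_poll(backoff_cq)
    print("empty polls, naive:  ", naive_cq.empty_polls)
    print("empty polls, backoff:", backoff_cq.empty_polls)
```

Both pollers reap every command; the difference lies in how many checks find the queue empty. In a real kernel-bypass stack the same trade-off appears as CPU cycles spent spinning versus added completion latency, and the right balance depends on the device and workload.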