Figure 2. Sora system architecture. All PHY and MAC execute in software on a commodity multi-core CPU.
[Diagram labels: RF front-end (RF, A/D, D/A); RCB; high-throughput, low-latency PCIe bus carrying digital samples at multiple Gbps; memory; multi-core CPU running the Sora soft-radio stack and applications.]
3.1. Hardware components
The hardware in the Sora architecture consists of a
new RCB with an interchangeable radio front-end (RF front-end). The radio front-end is a hardware module that receives
and/or transmits radio signals through an antenna. In the
Sora architecture, the RF front-end represents the well-defined interface between the digital and analog domains. It
contains analog-to-digital (A/D) and digital-to-analog (D/A)
converters, and necessary circuitry for radio transmission.
Since all signal processing is done in software, the RF front-end design can be rather generic. It can be implemented in a
self-contained module with a standard interface to the RCB.
Multiple wireless technologies defined on the same frequency band can use the same RF front-end hardware, and
the RCB can connect to different RF front-ends designed for
different frequency bands.
The RCB is a new PC interface board for establishing a high-throughput, low-latency path for transferring
high-fidelity digital signals between the RF front-end and
PC memory. To achieve the required system throughput
discussed in Section 2.1, the RCB uses a high-speed, low-latency bus such as PCIe. With a maximum throughput
of 64 Gbps (PCIe x32) and sub-microsecond latency, PCIe is
well suited for supporting multi-gigabit data rates for
wireless signals over a very wide band or over many MIMO
channels. Further, the PCIe interface is now common in
contemporary commodity PCs.
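
As a rough illustration of why a multi-gigabit bus is needed, the raw sample throughput can be estimated as sample rate × bits per I/Q component × 2 × number of channels. The sketch below uses hypothetical inputs: the 40 MS/s sampling rate and 16-bit I and Q components are assumptions for illustration, not values taken from this section. It compares the result with the 64 Gbps peak figure quoted above and ignores bus protocol overhead.

```python
# Back-of-the-envelope estimate of raw digital-sample throughput.
# All numeric inputs below are illustrative assumptions, not measurements.

PCIE_X32_PEAK_BPS = 64_000_000_000  # peak figure quoted in the text (64 Gbps)


def raw_sample_bps(sample_rate_hz: int, bits_per_component: int, channels: int = 1) -> int:
    """Bits per second for complex (I/Q) samples across `channels` radio chains."""
    return sample_rate_hz * bits_per_component * 2 * channels


def max_channels(sample_rate_hz: int, bits_per_component: int) -> int:
    """How many such channels fit under the peak bus rate, ignoring bus overhead."""
    return PCIE_X32_PEAK_BPS // raw_sample_bps(sample_rate_hz, bits_per_component)


if __name__ == "__main__":
    # Hypothetical front-end: 40 MS/s, 16-bit I and 16-bit Q per sample.
    one = raw_sample_bps(40_000_000, 16)
    four = raw_sample_bps(40_000_000, 16, channels=4)
    print(f"one channel:   {one / 1e9:.2f} Gbps")
    print(f"four channels: {four / 1e9:.2f} Gbps")
    print(f"channels under the peak rate: {max_channels(40_000_000, 16)}")
```

Integer arithmetic is used throughout so that the channel count is not affected by floating-point rounding.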
Another important role of the RCB is to bridge the synchronous data transmission at the RF front-end and the
asynchronous processing on the host CPU. The RCB uses
various buffers and queues, together with a large onboard
memory, to convert between synchronous and asynchronous streams and to smooth out bursty transfers between
the RCB and host memory. The large onboard memory also allows precomputed waveforms to be cached, which adds flexibility to software radio processing.
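
The bridging role of the buffers can be illustrated with a toy model. This is a sketch under stated assumptions, not the RCB's actual buffer design: the function name, the tick-based timing, and all parameters are hypothetical. The RF side enqueues one sample block per tick (synchronous), while the host drains the queue in bursts (asynchronous). The simulation reports the peak queue occupancy and how many blocks are lost when the buffer is too small.

```python
from collections import deque


def simulate(ticks: int, host_period: int, burst: int, capacity: int):
    """Toy model of a FIFO between a synchronous producer and a bursty consumer.

    The RF side enqueues one sample block on every tick.
    The host wakes every `host_period` ticks and drains up to `burst` blocks.
    Returns (peak_occupancy, dropped_blocks).
    """
    fifo = deque()
    peak = 0
    dropped = 0
    for t in range(ticks):
        # Synchronous side: a block arrives whether or not there is room.
        if len(fifo) < capacity:
            fifo.append(t)
        else:
            dropped += 1
        peak = max(peak, len(fifo))
        # Asynchronous side: the host only services the queue periodically.
        if (t + 1) % host_period == 0:
            for _ in range(min(burst, len(fifo))):
                fifo.popleft()
    return peak, dropped


if __name__ == "__main__":
    print(simulate(ticks=1000, host_period=8, burst=8, capacity=16))
    print(simulate(ticks=1000, host_period=8, burst=8, capacity=4))
```

In this model a buffer at least as deep as the host's service period absorbs the burstiness without loss, while a shallower buffer drops blocks even though the average drain rate matches the arrival rate.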
Finally, the RCB provides a low-latency control path for
software to control the RF front-end hardware and to ensure
it is properly synchronized with the host CPU. Section 5.1
describes our implementation of the RCB in more detail.
3.2. Sora software
Figure 3 illustrates Sora’s software architecture. The software components in Sora provide necessary system services
and programming support for implementing various wireless PHY and MAC protocols in a general-purpose operating
system. In addition to facilitating the interaction with the
RCB, Sora provides a set of techniques to greatly improve
the performance of PHY and MAC processing on GPPs. To
meet the processing and real-time requirements, these techniques make full use of various common features in existing
multi-core CPU architectures, including the extensive use of
LUTs, substantial data-parallelism with CPU SIMD extensions, the efficient partitioning of streamlined processing
over multiple cores, and exclusive dedication of cores for
software radio tasks. We describe these software techniques
in detail in the next section.
[Figure 3. Sora software architecture. Diagram labels: Applications; User mode; Kernel mode; Network Layer (TCP/IP); Sora soft radio stack; Wireless MAC; Wireless PHY; Sora supporting lib; Sora PHY Lib; Streamline Processing Support; Real-time Support (Core dedication); RCB Manager; DMA Memory; PC Bus; RCB.]
4. High-performance SDR software
4.1. Efficient PHY processing
In a memory-for-computation trade-off, Sora relies upon the
large-capacity, high-speed cache memory in GPPs to accelerate PHY processing with precalculated LUTs. Modern
CPU architectures usually have megabytes of L2
cache with a low (10–20 cycle) access latency. If we precalculate LUTs for a large portion of PHY algorithms, we can
greatly reduce the computational requirement for online
processing.
For example, the soft demapper algorithm used in demodulation needs to calculate the confidence level of each bit
contained in an incoming symbol. This task involves rather
complex computation proportional to the modulation density. More precisely, it conducts an extensive search for all
modulation points in a constellation graph and calculates
a ratio between the minimum of Euclidean distances to all
points representing one and the minimum of distances to
all points representing zero. In this case, we can precalculate the confidence levels for all possible incoming symbols
based on their I and Q values, and build LUTs to directly
map the input symbol to confidence level. Such LUTs are
not large. For example, in 802.11a/g with a 54 Mbps modulation rate (64-QAM), the size of the LUT for the soft demapper is only 1.5 KB.
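
The idea of replacing the per-symbol constellation search with a table lookup can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the text: the Gray mapping, the 8-bit input quantization, the `SCALE` factor, and the table layout are all hypothetical. The confidence metric is the common max-log form (difference of minimum squared distances) rather than the ratio described above. The sketch treats one axis (I or Q) of 64-QAM as 8-level PAM carrying 3 bits, which holds for Gray-coded square constellations.

```python
# Sketch: precompute soft-demapper outputs for one axis of Gray-coded 64-QAM.
# The mapping, quantization, and metric below are illustrative assumptions.

LEVELS = [-7, -5, -3, -1, 1, 3, 5, 7]  # 8-PAM amplitudes on one axis
GRAY = [0b000, 0b001, 0b011, 0b010, 0b110, 0b111, 0b101, 0b100]  # bits per level
BITS_PER_AXIS = 3
SCALE = 16.0  # hypothetical: quantized input q in [-128, 127] maps to amplitude q / SCALE


def soft_bits_direct(x: float):
    """Search all levels for each bit (the costly online computation).

    Returns one confidence per bit: min squared distance to the 'zero' levels
    minus min squared distance to the 'one' levels. Positive favors a one.
    """
    out = []
    for k in range(BITS_PER_AXIS):
        d1 = min((x - a) ** 2 for a, g in zip(LEVELS, GRAY) if ((g >> k) & 1) == 1)
        d0 = min((x - a) ** 2 for a, g in zip(LEVELS, GRAY) if ((g >> k) & 1) == 0)
        out.append(d0 - d1)
    return tuple(out)


def build_lut():
    """Precompute the confidences for every possible 8-bit quantized input."""
    return [soft_bits_direct(q / SCALE) for q in range(-128, 128)]


LUT = build_lut()


def soft_bits_lut(q: int):
    """Online path: a single table lookup indexed by the quantized sample."""
    return LUT[q + 128]


if __name__ == "__main__":
    i_sample, q_sample = 112, -48  # hypothetical quantized I and Q values
    print(soft_bits_lut(i_sample) + soft_bits_lut(q_sample))
```

The same table serves both the I and Q components, so a full symbol costs two lookups instead of a search over all 64 constellation points.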