Figure 5. Heterogeneous server configuration with 25BCE large cores and 1BCE small cores. (Diagram labels: arrival rate λ; short request; long request.)
Figure 6. Server configurations with 10BCE cores when dedicating (a) 10 resource units and (b) 70 resource units toward caching. (Diagram labels: cache size C = 10 in (a) and C = 70 in (b); arrival rate λ; service time Ts = 1/(µ√10).)
and fourth root of r. Beyond that point,
optimal design favors smaller cores.
Core Heterogeneity
The previous section explored the
trade-offs between powerful, brawny
cores and power-efficient, wimpy cores.
Neither type of core provides high efficiency across a wide range of QoS targets, raising several obvious questions,
including: Should an architect combine multiple core types in the same
system, as is already the norm in multi-core chips for mobile systems? How
should architects determine the size of
these cores? And at what ratio should
they use them? Determining the right
mix of large-versus-little cores, as well
as devising schedulers that take advantage of heterogeneous cores, especially in the presence of heterogeneous
load, has been a notably active topic of
research in computer architecture in
recent years [5, 9, 15]. Figure 4c shows the
QPS under various QoS targets for a set
of heterogeneous designs. In all cases,
the system has two core configurations: small cores with U = 1, benefiting
applications with relaxed QoS, and big
cores with U = 25, benefiting applications with strict QoS. The system also
receives two exponentially distributed
input request streams, one with short
and the other with long mean service times, and we design a simple heterogeneity-aware scheduler that routes
long requests to big cores and short requests to small cores. Requests are admitted to a single queue, as in Figure 5,
and the ratio of long-to-short requests
is, for now, 1:1. Figure 4c starts with all
big cores at the leftmost point of the
x-axis, explores the heterogeneous
space, and ends with all small cores at
the rightmost point.
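The design space described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the total budget of 100 resource units is hypothetical (this section does not state the budget), the function names are invented for the example, and core performance is assumed to scale with the square root of the resource units, following the service-time label Ts = 1/(µ√10) shown for 10BCE cores in Figure 6.

```python
import math


def core_speed(units: int) -> float:
    """Relative speed of a core built from `units` resource units.

    Assumes square-root scaling, as suggested by Ts = 1/(mu * sqrt(10))
    for 10BCE cores in Figure 6.
    """
    return math.sqrt(units)


def service_time(mu: float, units: int) -> float:
    """Mean service time on a core of the given size, Ts = 1 / (mu * sqrt(U))."""
    return 1.0 / (mu * core_speed(units))


def enumerate_mixes(budget: int, big_units: int = 25, small_units: int = 1):
    """Return (n_big, n_small) pairs, from all big cores to all small cores.

    `budget` is a hypothetical total number of resource units; resources
    not spent on big cores are spent on small cores.
    """
    mixes = []
    for n_big in range(budget // big_units, -1, -1):
        n_small = (budget - n_big * big_units) // small_units
        mixes.append((n_big, n_small))
    return mixes


def route(request_kind: str) -> str:
    """Heterogeneity-aware routing: long requests go to big cores, short to small."""
    return "big" if request_kind == "long" else "small"


if __name__ == "__main__":
    # Assumed budget of 100 resource units, U = 25 big cores, U = 1 small cores.
    for n_big, n_small in enumerate_mixes(100):
        print(f"{n_big} big cores, {n_small} small cores")
    print("long request ->", route("long"))
    print("short request ->", route("short"))
```

The sketch only enumerates configurations and routing decisions; evaluating QPS under a QoS target would additionally require a queueing model or simulation, which is not reproduced here.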
Finding 4. Figure 4c captures a surprising trend. For strict QoS targets,
like 1Ts, homogeneous systems with
all big cores achieve optimal performance. In contrast, for very relaxed QoS
targets, like 100Ts, using all small cores
achieves the best performance. However, for QoS targets in the middle (such
as 10Ts), heterogeneous systems, coupled with heterogeneity-aware schedulers, outperform their homogeneous
counterparts. This result is especially
true when the ratio of big to small cores
matches the ratio of long-to-short requests. Varying the request ratio affects