This design point matches the dual-core microprocessor on 45nm technology (Core2 Duo), integrating two cores of 25 million transistors each and 6MB of cache in a die area of about 100mm².
If this analysis is performed for future technologies, assuming (our best estimates) a modest frequency increase of 15% per generation, a 5% reduction in supply voltage, and a 25% reduction of
Figure 9. Three scenarios for integrating 150 million logic transistors into cores: (a) large-core homogeneous, six 25MT large cores, total throughput 6; (b) small-core homogeneous, thirty 5MT small cores, each delivering (5/25)^0.5 ≈ 0.45 of a large core's throughput by Pollack's Rule, total throughput 13; (c) heterogeneous, two 25MT large cores plus twenty 5MT small cores, total throughput 11.
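The figure's totals follow directly from Pollack's Rule as the article uses it: a core's throughput scales roughly as the square root of its transistor budget. A minimal sketch of the arithmetic (the helper name `pollack_perf` is mine, not from the article):

```python
import math

def pollack_perf(transistors_mt, baseline_mt=25):
    """Throughput of a core relative to a 25MT large core,
    per Pollack's Rule: performance ~ sqrt(transistor count)."""
    return math.sqrt(transistors_mt / baseline_mt)

# All three scenarios spend the same 150MT logic budget.
# (a) Large-core homogeneous: six 25MT cores
a = 6 * pollack_perf(25)                        # 6.0

# (b) Small-core homogeneous: thirty 5MT cores
b = 30 * pollack_perf(5)                        # 30 * 0.447 ≈ 13.4

# (c) Heterogeneous: two 25MT large cores + twenty 5MT small cores
c = 2 * pollack_perf(25) + 20 * pollack_perf(5)  # 2 + 8.9 ≈ 10.9

print(f"(a) {a:.1f}  (b) {b:.1f}  (c) {c:.1f}")
```

The small-core scenario wins on aggregate throughput, but only if the workload has enough parallelism to keep 30 cores busy; the heterogeneous option hedges by keeping two large cores for serial phases.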
Figure 10. A system-on-a-chip from Texas Instruments, integrating a C64x+ DSP with video accelerators (3525/3530 only), 2D/3D graphics (3515/3530 only), a display subsystem (LCD controller, video encoder), a camera interface with image pipe and parallel interface, an L3/L4 interconnect, program/data storage controllers (SDRC, GPMC, MMC/SD/SDIO), and peripherals for connectivity, timing, and serial I/O (USB 2.0 HS OTG, USB host x2, GP timers x12, WDT x2, McBSP x5, McSPI x4, I2C x3, UART x2, UART w/IrDA, HDQ/1-wire).
capacitance, then the results will be as they appear in Table 1. Note that over the next 10 years we expect increased total transistor count, following Moore's Law, but logic transistors increase by only 3x and cache transistors increase more than 10x. Applying Pollack's Rule, a single processor core with 150 million transistors will provide only about 2.5x microarchitecture performance improvement over today's 25-million-transistor core, well shy of our 30x goal, while 80MB of cache is probably more than enough for the cores (see Table 3).
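The projection arithmetic behind these claims is straightforward to reproduce. A sketch, assuming roughly five process generations in 10 years (my assumption; the per-generation scaling factors are the article's):

```python
import math

# Per-generation scaling assumptions from the article (best estimates):
FREQ_GAIN = 1.15   # +15% frequency per generation
VDD_SCALE = 0.95   # -5% supply voltage per generation
CAP_SCALE = 0.75   # -25% capacitance per generation

generations = 5    # assumed: ~10 years at ~2 years per generation

freq = FREQ_GAIN ** generations
vdd = VDD_SCALE ** generations
cap = CAP_SCALE ** generations

# Dynamic switching power per transistor scales as C * V^2 * f.
power_per_transistor = cap * vdd**2 * freq

print(f"after {generations} generations: frequency x{freq:.2f}, "
      f"power per transistor x{power_per_transistor:.2f}")

# Pollack's Rule: performance ~ sqrt(transistor count), so growing a
# single core from 25MT to 150MT buys only:
print(f"single-core gain: {math.sqrt(150 / 25):.2f}x")   # ≈ 2.45x
```

The sqrt(150/25) ≈ 2.45 result is the "about 2.5x" figure above: spending the entire 150MT logic budget on one giant core falls far short of the 30x goal, which motivates the multicore scenarios of Figure 9.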
The reality of a finite (essentially fixed) energy budget for a microprocessor must produce a qualitative shift in how chip architects think about architecture and implementation. First, energy efficiency is a key metric for these designs. Second, energy-proportional computing must be the ultimate goal for both hardware architecture and software-application design. While this ambition is noted in macro-scale computing in large-scale data centers,5 the idea of micro-scale energy-proportional computing in microprocessors is even more challenging. For microprocessors operating within a finite energy budget, energy efficiency corresponds directly to higher performance, so the quest for extreme energy efficiency is the ultimate driver for performance.
In the following sections, we outline key challenges and sketch potential approaches. In many cases, the
challenges are well known and the
subject of significant research over
many years. In all cases, they remain
critical but daunting for the future of
microprocessor performance:
Organizing the logic: Multiple cores
and customization. The historic measure of microprocessor capability is
the single-thread performance of a
traditional core. Many researchers
have observed that single-thread performance has already leveled off, with
only modest increases expected in the
coming decades. Multiple cores and
customization will be the major drivers for future microprocessor performance (total chip performance). Multiple cores can increase computational throughput (for example, four cores could yield anywhere from a 1x to 4x increase, depending on the available parallelism), and customization can reduce execution la-