performance data for each benchmark and processor.
Figure 2 shows an example of this data, plotting the power
versus performance characteristics for one of the 45 processor configurations, the stock i7 ( 45).
We organize our analysis into eight findings, as summarized
in Figure 1. The original paper contains additional analyses
and findings. We begin with broad trends. We show that
applications exhibit a large range of power and performance
characteristics that are not well summarized by a single
number. This section conducts a Pareto energy efficiency
analysis for all of the 45 nm processor configurations. Even
with this modest exploration of architectural features, the
results indicate that each workload prefers a different hardware configuration for energy efficiency.
4. 1. Power is application dependent
The nominal thermal design power (TDP) for a processor is
the amount of power the chip may dissipate without exceeding the maximum transistor junction temperature. Table 1
lists TDP for each processor. Because measuring real processor power is difficult and TDP is readily available, TDP is often
substituted for real measured power. Figure 3 shows that this
substitution is problematic. It plots measured power on a logarithmic scale for each benchmark on each stock processor
as a function of TDP and indicates TDP with an “✗.” TDP is
strictly higher than actual power. The gap between peak measured power and TDP varies from processor to processor and
TDP is up to a factor of four higher than measured power. The
variation among benchmarks is highest on the i7 ( 45) and i5
( 32), likely reflecting their advanced power management. For
example, on the i7 ( 45), measured power varies between 23 W
for 471.omnetpp and 89 W for fluidanimate! The smallest
Figure 2. Power/performance distribution on the i7 ( 45). each point
represents one of the 61 benchmarks. Power consumption is highly
variable among the benchmarks, spanning from 23 W to 89 W. the
wide spectrum of power responses from different applications
points to power saving opportunities in software.
3.00 4.00 5.00 6.00 7.00
Native non-scale Native scale
Java non-scale Java scale
108 CommuniCations oF the aCm | juLy2012 | voL. 55 | no. 7
variation between maximum and minimum is on the Atom
( 45) at 30%. This trend is not new. All the processors exhibit
a range of benchmark-specific power variation. TDP loosely
correlates with power consumption, but it does not provide a
good estimate for ( 1) maximum power consumption of individual processors, ( 2) comparing among processors, or ( 3)
approximating benchmark-specific power consumption.
finding: Power consumption is highly application dependent
and is poorly correlated to TDP.
Figure 2 plots power versus relative performance for each
benchmark on the i7 ( 45), which has eight hardware contexts
and is the most recent of the 45 nm processors. Native (red) and
managed (green) are differentiated by color, whereas scalable
(triangle) and non-scalable (circle) are differentiated by shape.
Unsurprisingly, the scalable benchmarks (triangles) tend to
perform the best and consume the most power. More unexpected is the range of power and performance characteristics of
the non-scalable benchmarks. Power is not strongly correlated
with performance across workload or benchmarks. The points
would form a straight line if the correlation were strong. For
example, the point on the bottom right of the figure achieves
almost the best relative performance and lowest power.
4. 2. historical overview
Figure 4(a) plots the average power and performance for
each processor in their stock configuration relative to the
reference performance, using a log/log scale. For example,
the i7 ( 45) points are the average of the workloads derived
from the points in Figure 2. Both graphs use the same color
for all of the experimental processors in the same family.
The shapes encode release age: a square is the oldest, the
diamond is next, and the triangle is the youngest, smallest
technology in the family.
Figure 3. measured power for each processor running 61 benchmarks.
each point represents measured power for one benchmark. the
“✗”s are the reported tDP for each processor. Power is application
dependent and does not strongly correlate with tDP.
Measured power (W) (log)
Atom ( 45)
TDP (W) (log)
C2D ( 65) C2Q ( 65)
C2D ( 45) AtomD ( 45)
i7 ( 45)
i5 ( 32)