tem must be adapted for energy awareness, and then for energy optimization.
Processors. Given the significant
fraction of power on contemporary
computing platforms attributed to
CPUs (and the early introduction of
power-management features on them
as a result), much progress has already
been made with operating-system
schedulers/thread dispatchers. Careless activation of hardware when there
is no useful work to be done must be
eliminated. Polling within the operating system (or within applications) is
an obvious example, but the use of a
high-frequency clock-tick interrupt
as the basis for timer events, timekeeping, and thread-scheduling can be
equally problematic. The objective is to
keep hardware quiescent until needed.
The “tickless” kernel project16 in Linux
introduced an initial implementation
of a dynamic tick. By reprogramming
the per-CPU periodic timer interrupt
to eliminate clock ticks during idle,
the average amount of time that a CPU
stays in its idle state after each idle state
entry can be improved by a factor of 10
or more. Beyond the very good ideas
that dynamic ticks and deferrable timers in Linux represent, the Tesla project in OpenSolaris is also considering
what the transition to a more broadly
event-based scheme for software development within the operating system
might imply.
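To make the effect of eliminating idle ticks concrete, the following is a minimal sketch, not the Linux implementation. It counts how often a CPU would be woken during one idle window under a fixed periodic tick and under a timer reprogrammed to fire only at pending deadlines. The function names, the 250Hz tick rate, and the timer deadlines are all hypothetical values chosen for illustration.

```python
def periodic_tick_wakeups(idle_ms, tick_period_ms):
    """Count wakeups in an idle window when a fixed clock tick always fires."""
    return idle_ms // tick_period_ms


def dynamic_tick_wakeups(idle_ms, timer_deadlines_ms):
    """Count wakeups when the timer fires only at pending deadlines in the window."""
    return sum(1 for t in timer_deadlines_ms if 0 <= t < idle_ms)


def mean_idle_residency_ms(idle_ms, wakeups):
    """Rough average idle stretch: treats the window as split into wakeups + 1 pieces."""
    return idle_ms / (wakeups + 1)


# Hypothetical workload: a 1-second idle window with two timers pending inside it.
IDLE_MS = 1000
TICK_PERIOD_MS = 4                 # assumed 250Hz tick, for illustration only
DEADLINES_MS = [200, 700, 1500]    # the last deadline falls outside the window

periodic = periodic_tick_wakeups(IDLE_MS, TICK_PERIOD_MS)
dynamic = dynamic_tick_wakeups(IDLE_MS, DEADLINES_MS)

print("wakeups with periodic tick:", periodic)
print("wakeups with dynamic tick:", dynamic)
print("mean idle stretch, periodic (ms):", mean_idle_residency_ms(IDLE_MS, periodic))
print("mean idle stretch, dynamic (ms):", mean_idle_residency_ms(IDLE_MS, dynamic))
```

The gain in this toy model depends entirely on the assumed tick rate and timer load, so it should not be read as reproducing the measured factor-of-10 improvement cited above.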
The confluence of features on modern processors—CMT (chip multi-threading), CMP (chip multiprocessor),
and NUMA (non-uniform memory access) for multiprocessor systems with
multiple sockets—invites a great deal
of new work to implement optimal-placement thread schedulers.6 Given
the ability to alter performance levels,
energy efficiency and the expected introduction of heterogeneous multicore
CPUs[h] will only add intrigue to this.7,15
[h] Heterogeneous here means a multicore CPU
in which cores of different performance levels
(different CPU microarchitectures) are put in
the same multicore package, and whose power-consumption
consequences are therefore very different.
Storage. Compared with CPUs, the
power consumed by a disk drive does
not seem especially large. A typical
3.5-in., 7200RPM commodity disk consumes about 7W to 8W, only about
10% of what a typical multicore CPU
consumes. Even higher-performance 10,000RPM spindles consume
only about 14W, and 15,000RPM drives
perhaps around 20W, so what is the
worry? The alarming relative rate of
growth in storage, mentioned earlier,
could quickly change the percentage
of total power that storage devices account for. Performance and reliability factors have already resulted in
the common application of multiple
spindles, even on desktop systems (to
implement a simple RAID solution). In
the data center, storage solutions are
scaling up much faster.
Low-end volume server boxes now
routinely house a dozen or more drives,
and one example 4U rack-mount storage array product from Sun accommodates 46 3.5-in. drives. A single
instance of the latter unit, if it used
10,000RPM or 15,000RPM industrial
drives, might therefore account for
1,088W to 1.6kW, a rather more significant energy-use picture.
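As a back-of-the-envelope aid, the sketch below multiplies a drive count by the approximate per-drive figures quoted in this section. It counts the drives alone and ignores controllers, fans, and power-supply losses, so it gives a lower bound and is not expected to match the range quoted above. The table and function names are illustrative assumptions.

```python
# Approximate per-drive power in watts (low, high), from the figures quoted in the text.
DRIVE_WATTS = {
    "7200RPM": (7, 8),
    "10000RPM": (14, 14),
    "15000RPM": (20, 20),
}


def drives_only_power(drive_count, drive_class):
    """Return (low, high) watts drawn by the drives alone in one enclosure."""
    low, high = DRIVE_WATTS[drive_class]
    return drive_count * low, drive_count * high


# The 46-drive enclosure mentioned above, populated with each class of drive.
for drive_class in DRIVE_WATTS:
    print(drive_class, drives_only_power(46, drive_class))
```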
Storage subsystems are now obviously on the radar of the energy attentive. There are at least two immediate
steps that can be taken to help improve
energy consumption by storage devices. The first is direct attention to energy
use in traditional disk-based storage.
Some of this work has been started by
the disk hardware vendors, who are beginning to introduce disk-drive power
states, and some has been started by
operating-system developers working
on contemporary file systems (such
as ZFS) and storage resource management. The second, particularly derived
from the recent introduction of large
inexpensive Flash memory devices, is
a more holistic look at the memory/
storage hierarchy. Flash memory fills
an important performance/capacity
gap between main memory devices and
disks,10,11 but also has tremendous energy-efficiency advantages over rotating
mechanical media.
Memory. Main memory, because of its
relatively low power requirement (say,
2W per DIMM), seems at first glance to
be of even less concern than disks. Its
average size on contemporary hardware
platforms, however, may be poised to
grow more rapidly. With hardware system
manufacturers’ focus primarily on
performance levels (to keep up with the
corresponding performance demands
of multicore CPUs), maintaining full
CPU-to-memory bandwidth is critical.
The consequence has been an evolution
from single- to dual-channel and
now triple-channel DIMMs along with
the corresponding DDR, DDR2, and
DDR3 SDRAM technologies. Although
reductions in the process feature size
(DDR3 is now on 50-nanometer technology)
have enabled clock frequency
to go up and power per DIMM to go
down somewhat, the desire for even
greater performance via an increase in
DIMMs per memory channel is still increasing
the total power consumed by
the memory system.
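The scaling argument can be illustrated with a short sketch that holds power per DIMM at the rough 2W figure given above and varies only the channel count and the number of DIMMs per channel. The configurations listed are hypothetical, and real per-DIMM power varies with memory generation, capacity, and load.

```python
WATTS_PER_DIMM = 2  # the text's rough "say, 2W per DIMM" figure


def memory_power_watts(channels, dimms_per_channel, watts_per_dimm=WATTS_PER_DIMM):
    """Total DIMM power for one memory controller, assuming equal power per DIMM."""
    return channels * dimms_per_channel * watts_per_dimm


# Hypothetical configurations: (channels, DIMMs per channel).
CONFIGS = [(1, 1), (2, 1), (3, 1), (3, 2), (3, 3)]

for channels, dimms_per_channel in CONFIGS:
    total = memory_power_watts(channels, dimms_per_channel)
    print(f"{channels} channel(s) x {dimms_per_channel} DIMM(s): {total}W")
```

Even if process improvements trim the per-DIMM figure somewhat, the total in this model grows with the product of the two counts.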