“as soon as possible.”ᵈ In this case, energy optimization is restricted to those
resources that can be deactivated, or
whose individual performance can be
reduced, without affecting the workload’s best possible completion time.
If a deadline later than the best
achievable deadline is specified, the
computation may take any length of
time up to this deadline, and the system can seek a more global energy
minimum for the task (or workload).
Deadlines might be considered “hard,”
in which case the system’s energy-optimizing resource allocator must somehow guarantee to meet them (raising
difficult implementation issues), or
“soft,” in which case a best effort
is sufficient.
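The deadline semantics described above can be illustrated with a minimal sketch. It assumes the system already has a predicted completion time and energy cost for each candidate hardware configuration; the `Config` type, the function names, and the numbers are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Config:
    """Hypothetical hardware configuration with predicted cost for one task."""
    name: str
    runtime_s: float  # predicted completion time
    energy_j: float   # predicted energy to completion


def effective_deadline(deadline_s: float, configs: Sequence[Config]) -> float:
    """Clamp a requested deadline to the shortest achievable completion time.

    Any deadline shorter than the best achievable one (including D = 0,
    meaning "as soon as possible") is treated as the best achievable one.
    """
    shortest = min(c.runtime_s for c in configs)
    return max(deadline_s, shortest)


def pick_config(deadline_s: float, configs: Sequence[Config]) -> Config:
    """Return the lowest-energy configuration that still meets the deadline."""
    limit = effective_deadline(deadline_s, configs)
    feasible = [c for c in configs if c.runtime_s <= limit]
    # Ties on energy are broken in favor of the faster configuration.
    return min(feasible, key=lambda c: (c.energy_j, c.runtime_s))


# Illustrative values only.
CONFIGS = [
    Config("all_on", runtime_s=10.0, energy_j=500.0),
    Config("idle_units_off", runtime_s=10.0, energy_j=420.0),
    Config("reduced_clock", runtime_s=25.0, energy_j=300.0),
]
```

With a deadline of 0, only the two configurations that tie for the best completion time are eligible, so `pick_config(0.0, CONFIGS)` selects `idle_units_off`. With a deadline of 30 seconds, the slower `reduced_clock` configuration becomes eligible and is selected because it uses the least energy.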
Services must operate at required
throughput. For online services, throughput may characterize the required performance level
better than a completion deadline does. Since services, in
their implementation, can ultimately
be decomposed into individual tasks
that do complete, we expect there to be
a technical analog between the two (although the most
suitable means of specifying the performance constraint might be different).
The System Must Be Responsive to Changing Demand
Real workloads are not static: the
amount of work provided and the resources required to achieve a given
performance level will vary as they run.
Dynamic response is an important
practical consideration related to service level.
Throughput (T) must be achievable
within latency (L). Specification of
the maximum latency within which
reserved hardware capacity can be
activated or its performance level increased seems a clear requirement, but
this must also be related to the performance needs of the task or workload in
question.
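One way to read the requirement that throughput (T) be achievable within latency (L) is as a check over reserved capacity. The sketch below is an assumption-laden illustration: the `Reserve` type and its fields are hypothetical, and it assumes that reserves can be activated in parallel and that their throughput contributions simply add.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Reserve:
    """Hypothetical unit of reserved hardware capacity."""
    name: str
    added_throughput: float      # extra throughput once active (e.g., TPS)
    activation_latency_s: float  # time needed to bring the unit online


def reachable_throughput(current: float, reserves: Sequence[Reserve],
                         max_latency_s: float) -> float:
    """Throughput reachable within max_latency_s.

    Assumes reserves activate in parallel, so every reserve whose own
    activation latency fits within the bound can contribute.
    """
    usable = (r.added_throughput for r in reserves
              if r.activation_latency_s <= max_latency_s)
    return current + sum(usable)


def meets_requirement(current: float, reserves: Sequence[Reserve],
                      target: float, max_latency_s: float) -> bool:
    """True if the target throughput T is achievable within latency L."""
    return reachable_throughput(current, reserves, max_latency_s) >= target


# Illustrative values only.
RESERVES = [
    Reserve("parked_core", added_throughput=400.0, activation_latency_s=0.001),
    Reserve("standby_node", added_throughput=2000.0, activation_latency_s=30.0),
]
```

Here a target of 1,300 is reachable within 0.1 seconds from a current throughput of 1,000, because the parked core alone suffices. A target of 2,000 is reachable only if the workload can wait for the standby node.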
Throughput is dependent on the
type of task. A metric such as TPS
(transactions per second) might be relevant for database system operation,
triangles per second for the rendering
component of an image-generation
subsystem, or corresponding measures for a filing service, I/O interconnect, or network interface. Interactive
use imposes real-time responsiveness
criteria, as does media delivery: computational, storage, and I/O capacity
are needed to sustain the required audio and
video delivery rates. A means by which
such diverse throughput requirements
might be handled in practice is suggested here.

d All values of deadline $D = t_i$ less than the shortest achievable deadline $t_o$ are equivalent to setting $D = t_o$ (that is, $\forall t_i < t_o: [D = t_i] \approx [D = t_o]$). We
can therefore denote maximum performance
by $D = 0$.

Instantaneous power must never exceed power limit (P). A maximum power
limit may be specified to respect practical limits on power availability (whether
to an individual system or to a data center as a whole). In some cases, exceeding
this limit briefly may be permissible.
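The power-limit constraint, including the case where brief excursions are permitted, might be checked against sampled power readings as in the following sketch. The function names, the fixed sampling period, and the idea of bounding the longest continuous excursion are assumptions made for illustration.

```python
from typing import Sequence


def longest_excursion_s(samples_w: Sequence[float], limit_w: float,
                        period_s: float) -> float:
    """Duration of the longest run of consecutive samples above the limit."""
    longest = run = 0
    for power in samples_w:
        run = run + 1 if power > limit_w else 0
        longest = max(longest, run)
    return longest * period_s


def within_power_limit(samples_w: Sequence[float], limit_w: float,
                       period_s: float, max_excursion_s: float = 0.0) -> bool:
    """True if power never exceeds the limit for longer than max_excursion_s.

    With the default of 0.0 the limit is strict: any sample above it fails.
    """
    return longest_excursion_s(samples_w, limit_w, period_s) <= max_excursion_s


# Illustrative readings in watts, sampled every 0.5 seconds.
SAMPLES = [90.0, 105.0, 110.0, 95.0, 101.0, 80.0]
```

For these readings and a 100 W limit, the longest excursion spans two consecutive samples (1.0 second), so the strict check fails while a policy tolerating excursions of up to 1.0 second passes.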
Combinations of such constraints
mean that over-constraint must be
expected in some circumstances, and
therefore a policy for constraint relaxation will also be required. A strict
precedence of the constraints might
be chosen or a more complex trade-off
made between them.
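The strict-precedence option mentioned above can be sketched as follows: constraints are listed from most to least important, and the least important one is dropped repeatedly until at least one candidate configuration is feasible. The names, the dictionary-based candidate representation, and the example thresholds are hypothetical; a more complex trade-off policy would replace the simple drop rule.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Dict[str, float]
Constraint = Tuple[str, Callable[[Candidate], bool]]


def relax_by_precedence(
    candidates: Sequence[Candidate],
    constraints: Sequence[Constraint],
) -> Tuple[List[Candidate], List[str]]:
    """Drop the lowest-precedence constraints until something is feasible.

    `constraints` is ordered from highest to lowest precedence.
    Returns the feasible candidates and the names of dropped constraints.
    """
    active = list(constraints)
    dropped: List[str] = []
    while True:
        feasible = [c for c in candidates
                    if all(check(c) for _, check in active)]
        if feasible or not active:
            return feasible, dropped
        name, _ = active.pop()  # remove the least important constraint
        dropped.append(name)


# Illustrative values only.
CANDIDATES = [
    {"power_w": 80.0, "throughput": 900.0, "latency_s": 0.5},
    {"power_w": 120.0, "throughput": 1500.0, "latency_s": 0.2},
]
CONSTRAINTS = [
    ("power", lambda c: c["power_w"] <= 100.0),
    ("throughput", lambda c: c["throughput"] >= 1200.0),
    ("latency", lambda c: c["latency_s"] <= 0.3),
]
```

In this over-constrained example no candidate satisfies all three constraints, so the latency and then the throughput constraints are dropped, leaving the 80 W candidate as the only one that respects the power limit.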
Approaching a Solution
Given this concept for energy-efficient
computing, how might such a system
be constructed? How would you expect
an energy-efficient system to operate?
A system has three principal aspects
that could solve this problem:
• It must be able to construct a power
model that allows the system to know
how and where power is consumed,
and how it can manipulate that power
(this component is the basis for enacting any form of power management).
• The system must have a means
for determining the performance requirements of tasks or the workload,
whether by observation or by some
more explicit means of communication. This is the constraints-determination and performance-assessment
component.
• Finally, the system must implement an energy optimizer: a means of
deciding an energy-efficient configuration of the hardware at all times while
operating. That optimization may be
relative (heuristically decided) or absolute (based on analytical techniques).
This is the capacity-planning and dynamic-provisioning component.
The first aspect is relatively straightforward to construct. The third is certainly immediately approachable,
especially where the optimization
technique(s) are based on heuristic
methods. The second consideration
is the most daunting. It represents an
important disruptive consequence of
energy-efficient computing and could
demand a more formal (programmatic) basis for communicating requirements of the workload to the system.
A description of the workload’s basic
provisioning needs, along with a way to
indicate both its performance requirements and present performance, seems
basic to this.
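To make the idea of a programmatic basis more concrete, the sketch below shows one possible shape for what a workload might report to the system: its basic provisioning needs, its required performance, and its present performance. The `WorkloadReport` type, its fields, the `advice` function, and the 5 percent tolerance are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class WorkloadReport:
    """Hypothetical description a workload hands to the system."""
    name: str
    provisioning: Dict[str, float]    # e.g., {"cores": 4, "memory_gib": 8}
    required_throughput: float        # in units chosen by the workload
    observed_throughput: float = 0.0  # present performance, same units

    def headroom(self) -> float:
        """Positive when exceeding the requirement, negative when short."""
        return self.observed_throughput - self.required_throughput


def advice(report: WorkloadReport, tolerance: float = 0.05) -> str:
    """Suggest a capacity action from required versus present performance."""
    margin = tolerance * report.required_throughput
    if report.headroom() < 0:
        return "add capacity"
    if report.headroom() > margin:
        return "may reduce capacity"
    return "hold"
```

A report with a required throughput of 1,000 and an observed throughput of 900 yields "add capacity", whereas an observed throughput of 1,200 indicates that capacity, and therefore power, could be reduced.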
Power model
In order to manage the system’s hardware for energy efficiency, the systemᵉ
must know the specific power details
of the physical devices under its control. Power-manageable components
must expose the controls that they
offer, such as their power and performance states (D-states and P-states,
respectively, in the ACPI architectural
model). To allow modeling of power
relative to performance and availability
(that is, relative to its activation responsiveness), however, the component interface must also describe at least the
following:
• The per-state power consumption
(for each inactive state) or power range
(for each active state).
• State-transition latency (time required to make each state transition).
• State-change energy (energy expended to change state).
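The three quantities listed above are enough to answer a basic power-management question: is an idle period long enough to justify entering a low-power state? The following sketch computes a break-even idle time. The `SleepOption` type and its fields are hypothetical, and it assumes that all energy spent during the enter and exit transitions is captured by the state-change energy figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepOption:
    """Hypothetical model of one low-power state of a component."""
    idle_power_w: float          # power when idle but still active
    sleep_power_w: float         # power in the low-power state
    transition_latency_s: float  # enter plus exit latency, combined
    transition_energy_j: float   # enter plus exit energy, combined

    def break_even_s(self) -> float:
        """Shortest idle period for which sleeping does not cost more.

        Staying idle for t seconds costs idle_power_w * t.
        Sleeping costs transition_energy_j plus sleep_power_w for the
        time left after the transitions, t - transition_latency_s.
        """
        saved_per_s = self.idle_power_w - self.sleep_power_w
        if saved_per_s <= 0:
            return float("inf")  # the state never saves energy
        t = (self.transition_energy_j
             - self.sleep_power_w * self.transition_latency_s) / saved_per_s
        # The idle period must at least cover the transitions themselves.
        return max(t, self.transition_latency_s)

    def worth_sleeping(self, idle_s: float) -> bool:
        """True if an idle period of idle_s seconds saves energy."""
        return idle_s > self.break_even_s()


# Illustrative values only.
OPTION = SleepOption(idle_power_w=10.0, sleep_power_w=1.0,
                     transition_latency_s=2.0, transition_energy_j=47.0)
```

With these illustrative numbers the break-even time is (47 - 1 × 2) / (10 - 1) = 5 seconds, so a 4-second idle period is better spent in the active idle state and a 10-second one in the low-power state.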
Once the system has such a power
model, consisting of all its power-manageable hardware, it has the basic foun-
e “The system” here most naturally suggests
the operating system, although it is clear that
this must include the hypervisor for virtualized systems. One can reasonably expect that
this concept will need to be broadened to include some aspects of the firmware and even
hardware components (on the low end) and
important runtimes, such as the Java Virtual
Machine, which have responsibility for, and/or
particular knowledge of, resource allocation.