FEBRUARY 2019 | VOL. 62 | NO. 2 | COMMUNICATIONS OF THE ACM 101
method used when analyzing individual agents in a large system
is intractable.1 First, we define key probability distributions
on population behavior. Second, we optimize each agent’s
strategy in response to the population rather than individual
competitors. Third, we find an equilibrium in which no
agent can perform better by deviating from her optimal
strategy. Thus, we reason about the population and neglect
individual agents because any one agent has little impact on
overall behavior in a large system.
The mean field analysis for the sprinting game focuses on
the sprint distribution, which characterizes the number of
agents who sprint when the system is not in recovery.
In equilibrium, the sprint distribution is stationary and
does not change across epochs. In any given epoch, some
agents complete a sprint and enter the cooling state while
others leave the cooling state and begin a sprint. Yet the
number of agents who sprint is unchanged in expectation.
The stationary distribution for the number of sprinters
translates into stationary distributions for the rack’s current draw and the probability of tripping the circuit
breaker. Given the tripping probability, which concisely
describes population dynamics, an agent can formulate
her best response and optimize her sprinting strategy to
maximize performance. We find an equilibrium by specifying an initial value for the tripping probability and
iterating.
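As an illustration of how a stationary sprint distribution yields a tripping probability, consider a simplified binomial model (our own assumption, not the article's exact derivation) in which each agent independently sprints with a fixed stationary probability and the breaker trips when simultaneous sprinters exceed its limit:

```python
from math import comb

def trip_probability(n_agents, p_sprint, breaker_limit):
    """Probability that more than `breaker_limit` of `n_agents` sprint
    at once, with each agent sprinting independently with probability
    p_sprint (a binomial tail)."""
    return sum(
        comb(n_agents, k) * p_sprint**k * (1 - p_sprint)**(n_agents - k)
        for k in range(breaker_limit + 1, n_agents + 1)
    )

# 100 agents, each sprinting a quarter of the time, against a breaker
# that tolerates up to 35 simultaneous sprinters.
p = trip_probability(100, 0.25, 35)
```

Because the tail probability grows with the per-agent sprint rate, this toy model captures the key feedback in the game: more aggressive strategies raise the tripping probability for everyone.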
• Optimize sprint strategy (§4.2). Given the probability of tripping the breaker Ptrip, each agent optimizes her sprinting strategy to maximize her performance. She sprints if performance gains from doing so exceed some threshold. Optimizing her strategy means setting her threshold uT.
• Characterize sprint distribution (§4.3). Given that each agent sprints according to her threshold uT, the game characterizes population behavior. It estimates the expected number of sprinters nS, calculates their demand for power, and updates the probability of tripping the breaker, producing a new estimate P′trip.
• Check for equilibrium. The game is in equilibrium if the new estimate P′trip matches the assumed Ptrip. Otherwise, iterate with the new probability of tripping the breaker.
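The three steps above can be sketched as a damped fixed-point iteration. The two functions passed in stand for the §4.2 and §4.3 computations; the toy closures at the bottom are illustrative assumptions, not the article's actual models:

```python
def find_equilibrium(best_response, induced_trip_prob, p_init=0.5,
                     tol=1e-6, max_iters=1000, damping=0.5):
    """Iterate the sprinting game's outer loop until the assumed and
    induced tripping probabilities agree.

    best_response(p_trip) -> sprint threshold u_T   (optimize strategy)
    induced_trip_prob(u_T) -> new tripping probability (sprint distribution)
    """
    p_trip = p_init
    for _ in range(max_iters):
        u_t = best_response(p_trip)        # step 1: optimize sprint strategy
        p_new = induced_trip_prob(u_t)     # step 2: characterize sprinters
        if abs(p_new - p_trip) < tol:      # step 3: check for equilibrium
            return p_trip, u_t
        # Damped update keeps the iteration stable.
        p_trip = (1 - damping) * p_trip + damping * p_new
    raise RuntimeError("no equilibrium within max_iters")

# Toy stand-ins: more tripping risk raises the threshold; a higher
# threshold means fewer sprinters and, in turn, a lower tripping risk.
p_star, u_star = find_equilibrium(
    best_response=lambda p: 1.0 + 4.0 * p,
    induced_trip_prob=lambda u: 0.8 / u,
)
```

At the fixed point, the returned pair is mutually consistent: the threshold is a best response to the tripping probability, and that probability is induced by agents using the threshold.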
4.2 Optimizing the sprint strategy
Sprinting defines a repeated game in which an agent acts in
the current epoch and encounters consequences of that
action in future epochs. An agent optimizes her sprinting
strategy accounting for the probability of tripping the circuit
breaker Ptrip, her utility from sprinting u, and her state. To
decide whether to sprint, each agent optimizes the following
Bellman equation.

V(u, A) = max{VS(u, A), V¬S(u, A)}    (1)
The equation quantifies value when an agent acts optimally
in every epoch. VS and V¬S are the expected values from sprinting and not sprinting, respectively. If VS(u, A) > V¬S(u, A),
then sprinting is optimal. The game solves the Bellman equation and identifies actions that maximize value.
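A minimal sketch of solving such a Bellman equation by value iteration, under simplified dynamics of our own choosing: utility u is the extra benefit of sprinting, drawn uniformly each active epoch, while non-sprint, cooling, and recovery epochs earn nothing extra.

```python
def solve_threshold(utilities, p_trip, p_c, p_r, delta=0.95,
                    tol=1e-9, max_iters=100_000):
    """Value iteration for a simplified sprinting Bellman equation.

    States: active (A), cooling (C), recovery (R). Sprinting earns the
    observed utility u now but leads to cooling, or to rack recovery
    with probability p_trip. Returns the sprint threshold u_T.
    """
    v_a = v_c = v_r = 0.0
    for _ in range(max_iters):
        # Discounted continuation values after sprinting vs. waiting.
        v_sprint = delta * ((1 - p_trip) * v_c + p_trip * v_r)
        v_wait = delta * v_a
        new_a = sum(max(u + v_sprint, v_wait) for u in utilities) / len(utilities)
        new_c = delta * (p_c * v_c + (1 - p_c) * v_a)  # 1/(1-p_c) cooling epochs
        new_r = delta * (p_r * v_r + (1 - p_r) * v_a)  # 1/(1-p_r) recovery epochs
        if max(abs(new_a - v_a), abs(new_c - v_c), abs(new_r - v_r)) < tol:
            break
        v_a, v_c, v_r = new_a, new_c, new_r
    # Sprint iff u + v_sprint > v_wait, i.e., u exceeds threshold u_T.
    return delta * v_a - delta * ((1 - p_trip) * v_c + p_trip * v_r)

# Threshold for an agent facing a 10% chance that her sprint trips
# the breaker, with a 2-epoch cooldown and a 10-epoch recovery.
u_t = solve_threshold([i / 10 for i in range(11)], p_trip=0.1,
                      p_c=0.5, p_r=0.9)
```

Riskier sprints demand larger utility gains, so the threshold rises with the tripping probability, which is what couples each agent's strategy to population behavior.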
Agents may sprint simultaneously, but they risk tripping the circuit breaker and triggering power emergencies that harm global performance.
The game considers N agents who run task-parallel applications on N chip multiprocessors. Each agent computes in
either normal or sprinting mode. The normal mode uses a
fraction of the cores at low frequency whereas sprints use all
cores at high frequency. Sprints rely on the executor to
increase task parallelism and exploit extra cores. In this article, we consider three cores at 1.2 GHz in normal mode and twelve cores at 2.7 GHz in a sprint.
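As a rough upper bound (a simplification of ours that ignores memory bottlenecks and limited task parallelism), nominal compute capacity scales with cores times frequency:

```python
# Normal mode: a fraction of the cores at low frequency.
normal_cores, normal_ghz = 3, 1.2     # 3.6 core-GHz
# Sprinting mode: all cores at high frequency.
sprint_cores, sprint_ghz = 12, 2.7    # 32.4 core-GHz

# Ceiling on the sprint speedup under this naive capacity model.
upper_bound_speedup = (sprint_cores * sprint_ghz) / (normal_cores * normal_ghz)
```

Under this crude model a sprint offers at most a 9x capacity increase, which is why sprint utility varies per epoch and agents must choose their sprints strategically.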
In any given epoch, an agent occupies one of three states—
active (A), chip cooling (C), and rack recovery (R)—according
to her actions and those of others in the rack. An agent’s
state determines whether she can sprint and how cooling and recovery constrain her actions.
Active (A) – Agent can safely sprint. An agent in the active
state operates her chip in normal mode by default. The
agent may decide to sprint by comparing benefits in the current epoch against benefits from deferring the sprint to a
future epoch. If the agent sprints, her state in the next epoch
is cooling.
Chip cooling (C) – Agent cannot sprint. After a sprint, an
agent remains in the cooling state until excess heat has been
dissipated. Cooling requires a number of epochs ∆tcool,
which depends on the chip’s thermal package. An agent in
the cooling state stays in this state with probability pc and
returns to the active state with probability 1 − pc. Probability
pc is defined so that 1/(1 − pc) = ∆tcool.
Rack recovery (R) – Agent cannot sprint. When multiple
chips sprint simultaneously, total current draw may trip the
circuit breaker, trigger a power emergency, and require supplemental current from batteries. After an emergency, all
agents remain in the recovery state until batteries recharge.
Recovery requires a number of epochs ∆trecover, which
depends on the power supply and battery capacity. Agents in
the recovery state stay in this state with probability pr and
return to the active state with probability 1 − pr. Probability
pr is defined so that 1/(1 − pr) = ∆trecover.
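The self-loop probabilities above encode geometric residence times: staying in a state with probability p each epoch yields an expected 1/(1 − p) epochs in that state. A short simulation (function names are our own) checks this construction:

```python
import random

def mean_residence_epochs(p_stay, trials=200_000, seed=1):
    """Simulate a state that is kept with probability p_stay each epoch
    and return the average number of epochs spent in it."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        epochs = 1                     # at least one epoch in the state
        while rng.random() < p_stay:   # stay with probability p_stay
            epochs += 1
        total += epochs
    return total / trials

# A cooling time of 4 epochs implies p_c = 1 - 1/4 = 0.75.
avg = mean_residence_epochs(1 - 1 / 4)
```

The same construction applies to recovery with pr, so a single memoryless parameter per state reproduces the desired cooling and recovery durations in expectation.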
4. GAME DYNAMICS AND STRATEGIES
Strategic agents decide whether or not to sprint to maximize their utilities. Sophisticated strategies produce several
desirable outcomes. Agents sprint during the epochs that
benefit most from additional cores and higher frequencies.
Moreover, agents consider other agents’ strategies because
the probability of triggering a power emergency and entering the recovery state increases with the number of
sprinters.
We analyze the game’s dynamics to optimize each agent’s
strategy for her performance. A comprehensive approach to optimizing strategies considers each agent—her state, utility, and history—to determine whether sprinting maximizes her performance given her competitors' strategies and the system state. In practice, however, this optimization is intractable for hundreds or thousands of agents.
4.1 Mean field equilibrium
The mean field equilibrium (MFE) is an approximation