perturbations, the perturbed instance is well-conditioned in the sense that each step of the simplex method makes significant progress traversing the boundary of the shadow.
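For a polytope given explicitly by its vertices, the shadow is a concrete object: project every vertex onto the chosen plane and take the 2D convex hull of the projections. The sketch below (plain Python, monotone-chain convex hull) is purely illustrative; in particular, the vertex-list representation is an assumption made for simplicity, since linear programs actually specify the feasible region by constraints.

```python
def shadow(vertices, u, w):
    """Project a polytope's vertices onto the plane spanned by u and w,
    then return the 2D convex hull of the projections -- the "shadow"
    whose boundary the shadow pivot rule traverses. Illustrative sketch.
    """
    def dot(x, y):
        return sum(a * b for a, b in zip(x, y))

    # Project each vertex to the plane; deduplicate and sort for the hull.
    pts = sorted({(dot(v, u), dot(v, w)) for v in vertices})
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Cross product of vectors o->a and o->b (orientation test).
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    # Andrew's monotone chain: build lower and upper hulls separately.
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # hull vertices in counterclockwise order
```

For example, projecting the unit cube onto the xy-plane yields the unit square as its shadow.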
Local search. A local search algorithm for an optimization problem maintains a feasible solution, and iteratively improves that solution via “local moves” for as long as possible, terminating with a locally optimal solution. Local search heuristics are ubiquitous in practice, in many different application domains. Many such heuristics have an exponential worst-case running time, despite always terminating quickly in practice (typically within a sub-quadratic number of iterations). Resolving this disparity is right in the wheelhouse of smoothed analysis. For example, Lloyd’s algorithm for the k-means problem can require an exponential number of iterations to converge in the worst case, but needs only an expected polynomial number of iterations in the smoothed case (see Arthur et al.7 and the references therein).o
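Lloyd’s algorithm alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points, stopping when the assignment no longer changes. A minimal sketch in plain Python follows; the random initialization and function name are illustrative choices, not taken from the analyses cited above.

```python
import random

def lloyd_kmeans(points, k, max_iters=1000):
    """One run of Lloyd's algorithm for k-means (illustrative sketch).

    points: list of equal-length tuples of floats.
    Terminates at a locally optimal assignment (or after max_iters).
    """
    centers = random.sample(points, k)  # arbitrary initial centers
    assignment = None
    for _ in range(max_iters):
        # Assignment step: nearest center (squared Euclidean distance).
        new_assignment = [
            min(range(k),
                key=lambda j: sum((p - c) ** 2
                                  for p, c in zip(pt, centers[j])))
            for pt in points
        ]
        if new_assignment == assignment:  # local optimum: no point moved
            break
        assignment = new_assignment
        # Update step: each center moves to the mean of its cluster.
        for j in range(k):
            cluster = [points[i] for i, a in enumerate(assignment) if a == j]
            if cluster:  # leave a center untouched if its cluster is empty
                dim = len(cluster[0])
                centers[j] = tuple(sum(p[d] for p in cluster) / len(cluster)
                                   for d in range(dim))
    return centers, assignment
```

Each iteration only ever decreases the k-means objective, yet the number of iterations before termination can be exponential in the worst case.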
Much remains to be done, however. For a concrete challenge problem, let’s revisit the maximum cut problem. The input is an undirected graph G = (V, E) with edge weights, and the goal is to partition V into two groups to maximize the total weight of the edges with one endpoint in each group. Consider a local search algorithm that modifies the current solution by moving a single vertex from one side to the other (known as the “flip neighborhood”), and performs such moves as long as they increase the sum of the weights of the edges crossing the cut. In the worst case, this local search algorithm can require an exponential number of iterations to converge. What about in the smoothed analysis model, where a small random perturbation is added
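The flip-neighborhood local search just described fits in a few lines. A minimal sketch, with vertices numbered 0..n−1 and edges given as (u, v, weight) triples; the representation and function name are illustrative:

```python
def local_search_max_cut(n, weighted_edges):
    """Flip-neighborhood local search for maximum cut (illustrative sketch).

    Starts from an arbitrary partition and, while moving some single
    vertex to the other side strictly increases the cut weight,
    performs such a move. Returns a locally optimal cut.
    """
    side = [0] * n  # side[v] in {0, 1}: which group vertex v is in
    improved = True
    while improved:
        improved = False
        for v in range(n):
            # Gain from flipping v = (weight of v's uncut incident edges)
            #                      - (weight of v's cut incident edges).
            gain = 0.0
            for (a, b, w) in weighted_edges:
                if v == a or v == b:
                    other = b if v == a else a
                    gain += w if side[v] == side[other] else -w
            if gain > 0:
                side[v] = 1 - side[v]  # flip; cut weight strictly increases
                improved = True
    cut_weight = sum(w for (a, b, w) in weighted_edges if side[a] != side[b])
    return side, cut_weight
```

Every flip strictly increases the cut weight, so the procedure terminates; the question is after how many flips.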
o An orthogonal issue with local search heuristics is the possibility of outputting a locally optimal solution that is much worse than a globally optimal one. Here, the gap between theory and practice is not as embarrassing; for many problems, local search algorithms really can produce pretty lousy solutions. For this reason, one generally invokes a local search algorithm many times with different starting points and returns the best of all of the locally optimal solutions found.
and non-clique vertices to decrease the degrees of the former back down to ≈ n/2. Thus the semi-random model forces us to develop smarter, more robust algorithms.m
For the semi-random planted clique model, Feige and Krauthgamer24 gave a polynomial-time algorithm that recovers the clique with high probability provided k = Ω(√n). The spectral algorithm by Alon et al.3 achieved this guarantee only in the standard planted clique model, and it does not provide any strong guarantees for the semi-random model. The algorithm of Feige and Krauthgamer24 instead uses a semidefinite programming relaxation of the problem. Their analysis shows that this relaxation is exact with high probability in the standard planted clique model (provided k = Ω(√n)), and uses the monotonicity properties of optimal mathematical programming solutions to argue this exactness cannot be sabotaged by any monotone adversary.
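To make the model concrete, here is a minimal sketch of sampling a semi-random planted clique instance. The particular adversary shown, which deletes clique-to-non-clique edges to push the clique vertices’ degrees back toward n/2, is just one hypothetical monotone adversary, not the worst one; the function name and representation are illustrative.

```python
import random

def planted_clique_instance(n, k, seed=None):
    """Sample a semi-random planted clique instance (illustrative sketch).

    Step 1 (nature): draw G(n, 1/2) and plant a clique on k random vertices.
    Step 2 (monotone adversary): delete edges outside the planted clique.
    This adversary removes up to k//2 outside edges per clique vertex,
    roughly cancelling the extra degree the planted clique contributes.
    """
    rng = random.Random(seed)
    clique = set(rng.sample(range(n), k))
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            if u in clique and v in clique:
                edges.add((u, v))          # planted edge: always present
            elif rng.random() < 0.5:
                edges.add((u, v))          # background G(n, 1/2) edge
    # Monotone adversary: may only delete edges NOT inside the clique.
    for u in sorted(clique):
        outside = [e for e in sorted(edges)
                   if u in e and not (e[0] in clique and e[1] in clique)]
        for e in outside[: k // 2]:
            edges.discard(e)
    return clique, edges
```

Note that the clique’s internal edges are untouchable by construction, which is exactly the monotonicity constraint on the adversary.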
Smoothed Analysis
Smoothed analysis is another example of a semi-random model, now with the order of operations reversed: an adversary goes first and chooses an arbitrary input, which is then perturbed slightly by nature. Smoothed analysis can be applied to any problem where “small perturbations” make sense, including most problems with real-valued inputs. It can be applied to any measure of algorithm performance, but has proven most effective for running time analyses.
Like other semi-random models, smoothed analysis has the benefit of potentially escaping worst-case inputs (especially if they are “isolated”), while avoiding overfitting a solution to a specific distributional assumption. There is also a plausible narrative about why real-world inputs are
m The extensively studied “stochastic block model” generalizes the planted clique model (for example, see Moore32), and is another fruitful playground for semi-random models. Here, the vertices of a graph are partitioned into groups, and the probability that an edge is present is a function of the groups that contain its endpoints. The responsibility of an algorithm in this model is to recover the (unknown) vertex partition. This goal becomes provably strictly harder in the presence of a monotone adversary.
captured by this framework: whatever problem you would like to solve, there are inevitable inaccuracies in its formulation (from measurement error, uncertainty, and so on).
The simplex method. Spielman and Teng38 developed the smoothed analysis framework with the specific goal of proving that bad inputs for the simplex method are exceedingly rare. Average-case analyses of the simplex method from the 1980s (for example, Borgwardt14) provide evidence for this thesis, but smoothed analysis provides more robust support for it.
The perturbation model in Spielman and Teng38 is: independently for each entry of the constraint matrix and right-hand side of the linear program, add a Gaussian (that is, normal) random variable with mean 0 and standard deviation σ.n The parameter σ interpolates between worst-case analysis (when σ = 0) and pure average-case analysis (as σ → ∞, the perturbation drowns out the original linear program). The main result states that the expected running time of the simplex method is polynomial as long as typical perturbations have magnitude at least an inverse polynomial function of the input size (which is small!).
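The perturbation step itself is simple to state in code. A minimal sketch (the function name is illustrative; the noise model is exactly the independent Gaussian perturbation just described):

```python
import random

def perturb_lp(A, b, sigma):
    """Spielman-Teng-style perturbation of a linear program (sketch).

    Independently adds a Gaussian with mean 0 and standard deviation
    sigma to every entry of the constraint matrix A (list of rows)
    and the right-hand side b. sigma = 0 recovers worst-case analysis;
    as sigma grows, the noise drowns out the adversary's instance.
    """
    A_pert = [[a_ij + random.gauss(0.0, sigma) for a_ij in row] for row in A]
    b_pert = [b_i + random.gauss(0.0, sigma) for b_i in b]
    return A_pert, b_pert
```

Smoothed analysis then bounds the expected running time of the simplex method on `perturb_lp(A, b, sigma)`, where the expectation is over the noise and the bound must hold for every choice of A and b.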
Theorem 3 (Spielman and Teng38). For every initial linear program, in expectation over the perturbation to the program, the running time of the simplex method is polynomial in the input size and in 1/σ.
The running time blow-up as σ → 0 is necessary because the worst-case running time of the simplex method is exponential. Several researchers have devised simpler analyses and better polynomial running times, most recently Dadush and Huiberts.17 All of these analyses are for a specific pivot rule, the “shadow pivot rule.” The idea is to project the high-dimensional feasible region of a linear program onto a plane (the “shadow”) and run the simplex method there. The hard part of proving Theorem 3 is showing that, with high probability over nature’s
n This perturbation results in a dense constraint matrix even if the original one was sparse, and for this reason Theorem 3 is not fully satisfactory. Extending this result to sparsity-preserving perturbations is an important open question.