Doi: 10.1145/1400181.1400203
technical Perspective
new Developments
in Graph Partitioning
By Éva Tardos
ARORA, RAO, AND VAZIRANI discuss the
most important developments in approximation algorithms over the last
two decades. Graph partitioning has
played a central role in algorithms research, both as an important problem
with many applications and as a fertile breeding ground for interesting
and deep algorithmic techniques. The
paper reviews several related papers,
starting with the breakthrough of an
improved O(√log n) approximation algorithm based on semidefinite programming and also including a faster combinatorial version using expander flows.
Graph partitioning is a natural algorithmic primitive in many settings,
including data clustering, divide-and-conquer approaches, parallel computing architectures and algorithms, and
packet routing in distributed networks.
A well-understood graph cut problem
is to partition the vertices into two sets
such that there are few edges crossing
between the two sets. A smallest cut
can be found in polynomial time via a
number of different methods, including traditional network flows. While
minimal cuts have many applications,
they are not useful primitives for the
divide-and-conquer type applications;
such cuts often result in extremely unbalanced partitions. Graph partitioning refers to a broad class of partitioning problems that seek cuts with few
crossing edges, where the two sides of
the partition are more balanced.
Finding an optimal balanced partition is intractable (NP-hard), so for the
last 40 years attention has been focused
on finding nearly optimal partitions;
this has resulted in the development
of deep techniques in approximation
algorithms. The first major breakthrough result appeared in a 1988 paper by Leighton and Rao: an efficient
algorithm that finds almost-balanced
partitions is an O(log n) approximation
algorithm; that is, where the number of
edges that cross the cut is no more than
O(log n) times more than the optimal,
where n is the number of vertices of the
graph. The authors used a linear programming relaxation of the balanced
partitioning problems. Solving the linear program results in a fractional cut,
where each edge is assigned a fractional extent to which it belongs to the cut.
The algorithm provides an interesting
technique that “rounds” the fractional
values to 0 or 1 and hence turns the fractional cut into a real cut. Since that paper, the linear programming relaxations
and rounding paradigm proved a powerful tool in approximation algorithms
with a wide range of applications.
The O(√log n) approximation algorithm presented here is a long-awaited
improvement in the graph partitioning guarantee. The paper builds on the
improved understanding of the Leighton and Rao technique gained over
the years. In 1994, Linial, London, and
Rabinovich brought to light an intriguing connection between graph partitioning problems and embeddings into
high-dimensional metric spaces. A cut
can be thought of as a simple embedding into a line: vertices on one side of
the cut are mapped to 0, and those on
the other side to 1, and any mapping
of the points to the one-dimensional
metric can be thought of as combinations of cuts (where the two sides of the
cut are the nodes mapped to points ≤α
and the nodes mapped to points ≥α for
some value α). Leighton and Rao’s linear program is a relaxation of this one-dimensional metric space to an arbitrary metric. While finding the best cut
and the best embedding into the one-dimensional metric are NP-complete,
finding the best embedding to an arbitrary metric space can be done in polynomial time. One can derive the O(log n)
approximation by mapping the metric
space obtained by the linear program
into the L1 metric space so that the distances are approximately preserved, and
then decomposing the L1 metric space
as a product of one-dimensional metric spaces. The new paper uses Euclid-
ian metrics and build on a semidefinite
programming technique introduced by
Goemans and Williamson that allows
them to optimally embed the nodes of
a graph into a high-dimensional Euclidian metric.
A well-known disadvantage of using
general-purpose linear programming
solvers and semidefinite programs in
approximation algorithms is that they
are slow. The authors here review the
faster O(√log n) approximation algorithm of Arora, Hazan, and Kale based
on flows and the multiplicative update
method. The connection of cut problems to certain multicommodity flow
problems pioneered by Shahrokhi and
Matula, who note that the edges congested by the flow are the ones good
to cut. There have since been great
developments in solving this class of
linear programs efficiently by using
variants of the multiplicative update
method. The flow problem naturally
related to balanced cut is that of sending one unit of flow between every pair
of nodes in the graph. Edges in a small
balanced cut will have to be congested
as O(n2) flow wants to cross the cut. Unfortunately, the best routing of flow can
have O(log n) more congestion in some
graphs than the congestion predicted
by small cuts, thus limiting the quality
of the approximation bound achieved.
To speed the computation, Leighton
and Rao replace this all-pairs flow with
a sparse random pairing, only sending
flow between O(n) randomly selected
source-sink pairs. If the graph of the
source-sink pairs is an expander, then
this smaller flow problem works just
as well in the algorithm. The authors
achieve their improved approximation
by showing an expander exists whose
flow can be routed with O(√log n) less
congestion than random expanders.
Arora, Rao, and Vazirani review an
impressive collection of results, bringing to bear a wide array of tools on an
important problem. Combining tools
from high-dimensional metric spaces
with ideas from expander flows, spar-sification, and multiplicative update
methods results in a simple and fast
method that achieves the improved
O(√log n) approximation guarantee.
Éva Tardos ( eva@cs.cornell.edu) is chair of and a
professor in the Department of Computer Science at
Cornell University, Ithaca, NY.