the surface of the unit sphere in an n-dimensional Euclidean space, where n is the number of graph vertices. Moreover, the “distance” between points in this space is defined
to be the square of the Euclidean distance. This means that
we draw the graph so that the sum of squares of the lengths
of the edges is as small as possible, while requiring that the
average squared distance over all pairs of points is a fixed
constant, say 1.
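To make this objective concrete, here is a toy sketch in Python (our own illustrative placement and graph, not the algorithm's actual optimization step): it measures the two quantities just described, the total squared edge length and the average squared distance over pairs, for four vertices of a 4-cycle placed on the unit sphere in 4-dimensional space.

```python
import itertools

# Toy instance: a 4-cycle graph whose n = 4 vertices are placed on the
# unit sphere in 4-dimensional space. This placement is illustrative,
# not the optimal embedding.
points = {
    0: (1.0, 0.0, 0.0, 0.0),
    1: (0.0, 1.0, 0.0, 0.0),
    2: (-1.0, 0.0, 0.0, 0.0),
    3: (0.0, -1.0, 0.0, 0.0),
}
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

def sq_dist(p, q):
    """The "distance" used by the embedding: the *square* of Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Objective: total squared length of the edges ...
edge_cost = sum(sq_dist(points[u], points[v]) for u, v in edges)

# ... subject to the average squared distance over all pairs being a
# fixed constant (here we simply measure it for this placement).
pairs = list(itertools.combinations(points, 2))
avg_sq = sum(sq_dist(points[u], points[v]) for u, v in pairs) / len(pairs)

print(edge_cost)  # 4 edges, each of squared length 2, so 8.0
print(avg_sq)
```

Rescaling all the points so that the average squared distance equals 1 divides the edge cost by the same factor, which is why the constraint can be fixed at 1 without loss of generality.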
There are some important additional constraints that
the geometric embedding must satisfy, which we have
suppressed in our simplified outline above. These are
described in Section 2, together with details about how a
good cut is recovered from the embedding (also see Figure
3 in Section 2). In that Section, we also explain basic facts
about the geometry of n-dimensional Euclidean space that
are necessary to understand the choice of parameters in
the algorithm. The proof of the main geometric theorem,
which yields the O(√log n) bound on the approximation
factor achieved by the algorithm, makes essential use of a
phenomenon called measure concentration, a cornerstone
of modern convex geometry. We sketch this proof in Section 6.
The main geometric theorem has led to progress on a
long-standing open question about how much distortion is
necessary to map the metric space ℓ₁ into a Euclidean metric space. This result and its relation to the main theorem
are described in Section 7.
1.2. Sketch of the flow-based approach
We now describe the flow-based approach that holds the
key to designing fast algorithms. In Section 3, we mention
why it is actually dual to the geometric approach. To visualize this approach, imagine that one sunny day in the San
Francisco Bay Area, each person decides to visit a friend.
The most congested roads in the resulting traffic nightmare
will provide a sparse cut of the city: most likely cutting the
bridges that separate the East Bay from San Francisco.*
More formally, in the 1988 approach of Leighton and
Rao14 (which gives an O(log n)-approximation to graph-partitioning problems), the flow routing problem one
solves is this: route a unit of flow between every pair of
vertices in the graph so that the congestion on the most-congested edge is as small as possible; i.e., route the traffic
so that the worst traffic jam is not too bad. A very efficient
algorithm for solving this problem can be derived by viewing the problem as a contest between two players, which
we can specify by two (dueling) subroutines. Imagine that a
traffic planner manages congestion by assigning high tolls
to congested edges (streets), while the flow player finds
the cheapest route along which to ship each unit of flow
(thereby avoiding congested streets). In successive rounds,
the players adjust the tolls and routes, respectively, to best
respond to the opponent's choices in the previous round.
Our goal is to achieve an equilibrium solution in which the
players' choices are best responses to each other. Such an
equilibrium corresponds to a solution that minimizes the
maximum congestion for the flow player. The fact that an
equilibrium exists is a consequence of linear programming
duality; indeed, this kind of two-player setting was the
form in which von Neumann originally formulated the
theory that lies at the foundation of operations research,
economics, and game theory.

† It may appear strange to pick a random partition of this geometric space
instead of optimizing this choice. Though some optimization is a good idea
in practice, one can come up with worst-case graphs where this fails to
provide substantial improvements over random partitions. A similar statement
holds for other algorithms in this paper that use random choices.

* Of course, unlike San Francisco's road system, a general graph cannot be
drawn in the Euclidean plane without having lots of edge crossings. So, a
more appropriate way of picturing flows in a general graph is to think of it
as a communications network in which certain vertex pairs (thought of as
edges of the "traffic graph") are exchanging a steady stream of packets.
Indeed, there is a simple strategy for the toll player that
makes the solutions quickly converge to a nearly optimal
solution: assign tolls to edges that are exponential in the
congestion on that edge. The procedure in Figure 2 is guaranteed to converge to a solution where the maximum congestion is at most (1 + ε) times optimal, provided μ > 1 is
sufficiently close to 1. The number of iterations (i.e., flow
reroutings) can also be shown to be proportional to the
flow crossing the congested cut.
But how do we convert the solution to the flow routing
problem into a sparse cut in the graph? The procedure is
very simple! Define the distance between a pair of vertices
by the minimum toll, over all paths, that must be paid to
travel between them. Now consider all the vertices within
distance R of an arbitrary node u. This defines a cut in the
graph. It was shown by Leighton and Rao14 that if the distance R is chosen at random, then with high probability the
cut is within an O(log n) factor of optimal. The
entire algorithm is illustrated in the figure below.
Here is another way to think about this procedure: we
may think of the tolls as defining a kind of abstract
“geometry” on the graph: a node is close to nodes connected
by low-toll edges, and far from nodes connected by high-toll edges. A good cut is found by randomly cutting this abstract space. This connection between flows and geometry
will become even stronger in Section 3.
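The cut-recovery step just described can be sketched in a few lines of Python. The toll-induced distances below are made-up numbers for illustration (in the algorithm they come from shortest paths under the tolls), and the names are our own:

```python
import random

# Hypothetical toll-induced distances from a chosen node u to every
# vertex. In the algorithm these are minimum tolls over all paths;
# here they are invented values for a five-vertex example.
dist_from_u = {'u': 0.0, 'a': 0.3, 'b': 0.7, 'c': 1.2, 'd': 2.0}

def ball_cut(dist, R):
    """One side of the cut: all vertices within toll-distance R of u."""
    return {v for v, d in dist.items() if d <= R}

# Choose the radius R at random, as in the Leighton-Rao analysis.
random.seed(0)
R = random.uniform(0, max(dist_from_u.values()))
print(sorted(ball_cut(dist_from_u, R)))
```

Every choice of R yields a cut; the point of the analysis is that a random R gives a sparse one with high probability.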
In our 2004 paper,6 we modified the above approach by
focusing upon the choice of the traffic pattern. Instead of
routing a unit of flow between every pair of vertices—i.e., a
traffic pattern that corresponds to a complete graph—one
can obtain much better cuts by carefully choosing the traffic
Figure 2: Parameter μ depends upon ε, which specifies the accuracy
with which we wish to approximate the congestion.

Route Paths and Cut
Input: G = (V, E). Maintain: d(·) on the edges of G; initially d(e) = 1.
1. Do until the maximum d(e) is n:
   (a) Choose a random (i, j) pair.
   (b) Find a shortest path, p, from i to j.
   (c) Multiply d(e) on each edge of p by μ.
2. Choose a value d randomly and an arbitrary node u, and return the set of
nodes within distance d of u.
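The steps in the figure can be rendered as a short, self-contained Python sketch. The example graph, the value μ = 1.1, the fixed random seed, and all function names here are our own assumptions; a real implementation would use a faster shortest-path routine.

```python
import heapq
import random

def shortest_path(adj, d, src, dst):
    """Dijkstra under the current edge lengths d; returns the path's edges."""
    dist, prev = {src: 0.0}, {}
    heap = [(0.0, src)]
    while heap:
        du, u = heapq.heappop(heap)
        if u == dst:
            break
        if du > dist.get(u, float('inf')):
            continue  # stale heap entry
        for v in adj[u]:
            nd = du + d[frozenset((u, v))]
            if nd < dist.get(v, float('inf')):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, v = [], dst
    while v != src:
        path.append(frozenset((v, prev[v])))
        v = prev[v]
    return path

def route_paths_and_cut(vertices, edges, mu=1.1, seed=0):
    rng = random.Random(seed)
    n = len(vertices)
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    d = {frozenset(e): 1.0 for e in edges}      # initially d(e) = 1
    while max(d.values()) < n:                  # 1. do until max d(e) is n
        i, j = rng.sample(vertices, 2)          # (a) random (i, j) pair
        for e in shortest_path(adj, d, i, j):   # (b) shortest path p
            d[e] *= mu                          # (c) multiply d(e) by mu
    # 2. random radius around an arbitrary node u, under toll distances.
    u = vertices[0]
    dist = {u: 0.0}
    heap = [(0.0, u)]
    while heap:
        du, x = heapq.heappop(heap)
        if du > dist.get(x, float('inf')):
            continue
        for y in adj[x]:
            nd = du + d[frozenset((x, y))]
            if nd < dist.get(y, float('inf')):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    r = rng.uniform(0, max(dist.values()))
    return {v for v in vertices if dist[v] <= r}

# Two triangles joined by a single "bridge" edge; a sparse cut should
# separate the triangles by cutting the bridge.
verts = [0, 1, 2, 3, 4, 5]
edge_list = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(route_paths_and_cut(verts, edge_list))
```

As the bridge edge is reused, its length grows exponentially in its congestion, so the toll distances across it dominate and the random ball around u tends to stop at the bridge.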