u, w, |u − u| 2 + |u − w| 2 ≥ |u − w| 2. The two-point embedding
remains viable since it satisfies this constraint. By contrast,
the spectral method described above fails to meet this condition, since any three distinct points on a line form an obtuse triangle. (See the accompanying box for a discussion
of the hypercube, which is a graph that provides an intuitive picture about the nature of this constraint.)
The conditions satisfied by the embedding are formally
described in Figure 4. An embedding satisfying these conditions in a sufficiently high-dimensional space can be
computed in polynomial time by a technique called SDP.
A semidefinite program is simply a linear program whose
variables are allowed to be certain quadratic functions. In
our case, the variables are squared distances between the
points in ℜd. Indeed, researchers have known about this
approach to graph partitioning in principle for many years
(more precisely since10, 15), but it was not known how to analyze the quality of cuts obtained. For a survey of uses of SDP
in computing approximate solutions to NP-hard problems,
see. 9
The specific conditions in Figure 4 were chosen to express the semidefinite relaxation of the graph
min-bisec-tion problem, where the two sides of the cut have to be of
equal size. In this case, a feasible solution is to map each
partition of an optimal bisection to two antipodal points.
Therefore, the optimal solution to the SDP provides a lower
bound on the cost of the optimal bisection. How do we recover a good bisection from this embedding? This is a little
more involved than the simple “random partition” technique described in the previous subsection.
The key to finding a good cut is a result from6 that any
embedding satisfying all the above conditions is similar to
a two-point embedding in the following sense: given such
an embedding {u , . . . , u }, we can efficiently find two dis-
1n
joint “almost antipodal” sets, S and T, each with Ω(n) points
that are at least ∆ = Ω(1/ log n ) apart. That is, every point
in S has distance at least ∆ from every node in T. Once the
sets S and T are identified, there is a very simple procedure
figure 4: criteria for geometric embedding.
let the ith vertex get mapped to the point u . the mapping must satisfy the
i
following conditions:
• the points lie on the unit sphere:
∀i
• the points are well spread:
2
u = 1
i
∑ 2 n ui−uj≥ 2 E i<j 2
• satisfy the triangle inequality:
222
∀i, j,k u −u +u −u ≥u −u
ij jk ik
• edges are short:
2
min∑ u − {ij}∈E i uj
eXAMPLE: THE HYPERCUBE
insisting that the embedding of the graph into ℜd satisfies the triangle inequality is quite a severe constraint. indeed, on the surface of a unit sphere
in d dimensions, the maximum number of distinct points that can satisfy
this condition is 2d. An extreme example is the d-dimensional hypercube,
which has exactly n = 2d points. the points are identified with vectors in
{− 1, + 1}d, and these vectors obey the triangle inequality since they do so
coordinate wise.
{− 1, 1, 1}
{ 1, 1, 1}
{ 1, 1, − 1}
{− 1, − 1, − 1}
{ 1, − 1, − 1}
Hypercube in 3 dimensions
log n
1
log n
Number of 1’s
log n
0
log n
0
log n -dimensional hypercube
each point of the d-dimensional hypercube is connected by an edge to
exactly d other points—those that differ from it in exactly one coordinate. the
optimal bisection cut corresponds to any one of the d dimensions: separate
the points according to whether that particular coordinate is − 1 or + 1. the
number of edges crossing this cut is exactly n/2 = 2d− 1, one for each 1-pair.
Another natural way of bisecting the d-dimensional hypercube is by a
level cut. imagine that we arrange the points on levels according to the sum
of the coordinates. there are exactly d + 1 levels where the jth level has exactly d points. the bisection cut separating the lowest d/2 levels from the j
remaining d/2 levels has Θ( d2d) edges (which is Θ( d ) = Θ( logn) factor worse than optimal bisection cut). this is because the middle level has
roughly 2d / d points, and each point has d/2 edges that cross the cut. the
interesting thing is to note that there was nothing special about the sum of
coordinates, and indeed we could start with any point u at the lowest level,
and then assign the rest of the points of the hypercube to levels according
to their distance from u. the number of distinct level cuts is 2d − 1 (choosing
u or its “antipodal” point leads to the same level sets).
our algorithm uses a random projection to define a cut. When run on the
hypercube, this corresponds to finding a level cut. thus, the algorithm fails
to discover anything close to one of the d optimal dimension cuts. in this
case, the algorithm’s answer is indeed logn factor away from optimal.