jection onto u⃗, into a path, i.e., we find a sequence of points x_1, …, x_l with the property that on average ⟨x_i − x_{i+1}, u⃗⟩ ≥ σ/2, and therefore ⟨x_1 − x_l, u⃗⟩ ≥ lσ/2. On the other hand, since |x_i − x_{i+1}|² ≤ ∆, the triangle inequality (which here holds for the squared Euclidean distances, as guaranteed by the SDP) implies that |x_1 − x_l|² ≤ l∆, and therefore |x_1 − x_l| ≤ √(l∆). Note that σ was chosen to be (roughly) the standard deviation of the projection length of a unit vector. Therefore, the projection length of x_1 − x_l is t = √(l/∆) many standard deviations from its mean. The probability that the projection is t standard deviations from its mean is at most e^{−t²/2}. So if the number of hops l in the path is Ω(√(log n)), then the probability of this event is polynomially small. We get a contradiction by applying the union bound to the n² pairs of points.
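To spell out the bound (a sketch in the notation above; the numerical constants are illustrative, and the setting ∆ = Θ(1/√(log n)) reflects roughly how the parameters are chosen in the analysis):

```latex
% Worked version of the union-bound calculation (sketch; constants
% illustrative). Endpoint projection vs. endpoint distance:
\[
  \langle x_1 - x_l,\ \vec{u}\rangle \ \ge\ \tfrac{1}{2} l\sigma,
  \qquad
  |x_1 - x_l| \ \le\ \sqrt{l\Delta}.
\]
% A vector v projects onto a random unit direction with standard
% deviation about \sigma|v|, so this projection is
\[
  t \ =\ \frac{l\sigma/2}{\sigma\sqrt{l\Delta}}
    \ =\ \tfrac{1}{2}\sqrt{l/\Delta}
\]
% standard deviations from the mean, an event of probability at most
\[
  e^{-t^2/2} \ =\ e^{-l/8\Delta} \ \le\ n^{-3}
  \quad\text{once}\quad l/\Delta \ \ge\ 24 \ln n .
\]
% With \Delta = \Theta(1/\sqrt{\log n}), l = \Omega(\sqrt{\log n}) hops
% suffice; a union bound over the n^2 pairs of endpoints then yields
% the contradiction.
```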
The main challenge then is to piece together such a path. To find a path for a direction u⃗, we clearly cannot limit ourselves to the matching in that direction. To understand how the matchings for different directions relate to each other, we need two new ideas. Let us first introduce some notation. Recall that each point u has a matched pair for half the directions (i.e., either for u⃗ or −u⃗). We say that u is ε-covered in these directions. And we say that u is (ε, δ)-covered if u is ε-covered for at least a δ measure of directions.
The first idea is simple: if u is ε-covered in direction u⃗, then it is almost ε-covered in every neighboring direction u⃗′. Quantitatively, if |u⃗ − u⃗′| ≤ γ, then u is (ε − 2γ)-covered in direction u⃗′. This fact follows from elementary geometry.
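The computation behind this is just Cauchy–Schwarz, together with the fact that all points lie in the unit ball (so |x − u| ≤ 2). In our notation, if x ε-covers u in direction u⃗:

```latex
\[
  \langle x-u,\ \vec{u}'\rangle
    \;=\; \langle x-u,\ \vec{u}\rangle + \langle x-u,\ \vec{u}'-\vec{u}\rangle
    \;\ge\; \varepsilon - |x-u|\,|\vec{u}'-\vec{u}|
    \;\ge\; \varepsilon - 2\gamma .
\]
```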
The second idea is to use measure concentration. Consider a set of points A on the surface of the unit ball in ℝ^d. If the measure of A is at least δ for some constant δ, then the γ-neighborhood of A (i.e., all points within distance at most γ from some point in A) has measure almost 1. Quantitatively, the measure is approximately 1 − e^{−t²/2} if γ = t/√d (i.e., t standard deviations).
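This concentration bound is easy to check empirically. The following sketch (our own illustration, not from the paper; the hemisphere is an arbitrary choice of a set A of measure 1/2) samples the sphere and compares the measure of the γ-neighborhood against the 1 − e^{−t²/2} lower bound:

```python
# Monte Carlo sanity check of measure concentration on the sphere.
# A = hemisphere {x : x_1 >= 0}, which has measure 1/2; its
# gamma-neighborhood is (essentially) {x : x_1 >= -gamma}.
import numpy as np

rng = np.random.default_rng(0)
d, t = 1000, 3.0
gamma = t / np.sqrt(d)          # t "standard deviations"

# Uniform points on S^{d-1}: normalized Gaussian vectors.
x = rng.standard_normal((20_000, d))
x /= np.linalg.norm(x, axis=1, keepdims=True)

measure = np.mean(x[:, 0] >= -gamma)
print(f"measure of gamma-neighborhood: {measure:.5f}")
print(f"lower bound 1 - exp(-t^2/2):   {1 - np.exp(-t**2 / 2):.5f}")
```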
Returning to our proof, we have a point u that is ε-covered for a set of directions A of measure δ. Moreover, the covering points are within √∆ of u, and so we are working with a ball of radius √∆ rather than 1. So, a γ = t√∆/√d neighborhood of A has measure 1 − e^{−t²/2}. So, u is also (ε/2, δ′)-covered for δ′ very close to 1.
It might seem that we are almost done by the following induction argument: u is covered in almost all directions. Pick a direction u⃗ at random and let u′ cover u in this direction. Now u′ is also almost covered in almost all directions. Surely, u′ has a good chance of being covered in direction u⃗, since it was chosen at random. Of course this argument, as it stands, is nonsense, because the choice of u′ is conditioned on the choice of u⃗. Here is a potential fix: for a random u⃗, with high probability u is covered both in direction u⃗ and in direction −u⃗. If x and y are the covering points in these two directions, then x − y has a projection on u⃗ that is twice as long. This time the problem is that most nodes u might yield the same pair of points x and y, thus making it impossible to continue with an induction.
For example, in a k-dimensional hypercube with n = 2^k points, every point is Θ(√(log n))-covered in almost all directions. However, for a particular direction, most points are covered by only a few points in the tails of the resulting Gaussian distribution.
To actually piece together the path, we perform a more delicate induction. For a random direction u⃗, node u is covered with high probability by some node x. Also, in direction −u⃗, with probability 1/2, u is matched to some node y (i.e., {u, y} formed a discarded pair). So with probability almost 1/2, y is now covered in direction u⃗ by x. Note that this time y takes this role only once for this direction, since {u, y} is a matched pair.
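The gain from this step can be written out explicitly. Here we assume, as in the matching step, that the discarded pair {u, y} was stretched by at least σ in direction −u⃗, and that x ε-covers u:

```latex
\[
  \langle x - y,\ \vec{u}\rangle
    \;=\; \langle x - u,\ \vec{u}\rangle + \langle u - y,\ \vec{u}\rangle
    \;\ge\; \varepsilon + \sigma ,
\]
% since \langle y - u, -\vec{u}\rangle \ge \sigma is the same as
% \langle u - y, \vec{u}\rangle \ge \sigma.
```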
Actually carrying out the induction requires some work to ensure that the boosting using measure concentration can be applied. Moreover, the measure of the covered set does indeed drop at each step, and further ideas are needed to allow the induction to continue for √(log n) steps.
7. IMPLICATIONS FOR METRIC SPACE EMBEDDINGS
It is well known that there are different ways of measuring distances; for instance, ℓ_1, ℓ_2, and ℓ_p, as well as more exotic methods such as the Kullback–Leibler divergence. It is an important question in functional analysis to understand the interrelationship among these different measures. One frequent method of quantifying this is to look at the minimum distortion required to realize one distance function with the other distance function.
Let (X_1, d_1) and (X_2, d_2) be two metric spaces, by which we mean that d_i is a distance function on X_i that is nonnegative and satisfies the triangle inequality. An embedding of X_1 into X_2 is a mapping f : X_1 → X_2. Its distortion is the minimum C such that

d_1(x_1, x_2) ≤ d_2(f(x_1), f(x_2)) ≤ C ⋅ d_1(x_1, x_2)   ∀ x_1, x_2 ∈ X_1   (1)
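To make the definition concrete, here is a small sketch (ours, not from the paper) that computes the distortion of a given embedding between two finite metric spaces. Since rescaling f does not change the essential quality of the embedding, the code rescales so that the left inequality of (1) is tight:

```python
# Distortion of an embedding f between two finite metric spaces
# (illustrative sketch). Points are indices; d1 and d2 are distance
# matrices; f maps indices of X1 to indices of X2.
import itertools
import numpy as np

def distortion(d1: np.ndarray, d2: np.ndarray, f: list[int]) -> float:
    """Smallest C for which some rescaling of f satisfies (1)."""
    ratios = [d2[f[i], f[j]] / d1[i, j]
              for i, j in itertools.combinations(range(len(f)), 2)]
    # Rescale so the minimum ratio is 1 (lower bound in (1) tight);
    # the distortion is then the spread between the extreme ratios.
    return max(ratios) / min(ratios)

# Example: the 4-cycle with its shortest-path metric, mapped to the
# corners of the unit square with Euclidean distances.
d_cycle = np.array([[0, 1, 2, 1],
                    [1, 0, 1, 2],
                    [2, 1, 0, 1],
                    [1, 2, 1, 0]], dtype=float)
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
d_euclid = np.linalg.norm(square[:, None] - square[None, :], axis=-1)
print(distortion(d_cycle, d_euclid, [0, 1, 2, 3]))  # sqrt(2) ~ 1.414
```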
The minimum distortion of X_1 into X_2 is the lowest distortion among all embeddings of X_1 into X_2. Though this notion had been studied extensively in functional analysis, its importance in algorithm design was realized only after the seminal 1994 paper of Linial, London, and Rabinovich.15 Many algorithmic applications of this idea have subsequently been discovered.
It was realized before our work that the accuracy of the approach to cut problems involving SDPs (see Section 2) is closely related to analyzing the minimum distortion parameter linking different metric spaces (see the survey17). Indeed, the ARV analysis can be viewed as an embedding with low "average" distortion, and subsequent work of Chawla, Gupta, and Räcke7 and of Arora, Lee, and Naor5 has built upon this observation. The final result is a near-resolution of an old problem in analysis: what is the minimum distortion required to embed an n-point ℓ_1 space (i.e., where the points are vectors and distance is defined using the ℓ_1 metric) into ℓ_2? It was known for a long time that this distortion is at least √(log n) and at most O(log n). The Arora–Lee–Naor paper shows that the lower bound is close to the truth: they give a new embedding whose distortion is O(√(log n) · log log n).
We note here a connection to a general demand version of the sparsest cut problem: given a set of pairs of vertices (s_1, t_1), …, (s_k, t_k), find the cut that minimizes the ratio of the number of cut edges to the number of pairs that are separated. An illustrative application is placing a few listening devices in a network so as to listen in on many pairs of communicating agents, minimizing the ratio of listening devices to compromised pairs of agents.
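In symbols, the objective just described (our rendering) is:

```latex
\[
  \min_{\emptyset \neq S \subsetneq V}\;
  \frac{|E(S,\bar S)|}{\bigl|\{\, i : |S \cap \{s_i, t_i\}| = 1 \,\}\bigr|}\,,
\]
% where E(S, \bar{S}) is the set of edges crossing the cut and the
% denominator counts the demand pairs (s_i, t_i) separated by S.
```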
The approximation ratio for the natural SDP relaxation