cost spanning tree of the graph and performs a depth-first traversal of this tree, one gets a "tour" that visits every node of the graph at least once and has a cost of at most twice the cost of the optimum travelling salesman tour. (This tour may visit some vertices more than once, but such extra visits can be short-circuited. The short-circuiting only shortens the tour, thanks to the triangle inequality.) Thus the travelling salesman problem with the triangle inequality (Δ-TSP) admits a polynomial time 2-approximation algorithm.
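To make the algorithm concrete, here is a minimal sketch of the tree-doubling heuristic just described, assuming the input is a complete graph given as a symmetric distance matrix satisfying the triangle inequality (the function and variable names are ours):

```python
import heapq

def tsp_double_tree(dist):
    """Sketch of the 2-approximation for metric TSP: build a minimum
    spanning tree, traverse it depth-first, and short-circuit repeat
    visits.  `dist` is a symmetric n x n matrix obeying the triangle
    inequality."""
    n = len(dist)
    in_tree = [False] * n
    children = [[] for _ in range(n)]
    heap = [(0, 0, 0)]  # (edge cost, vertex, parent)
    while heap:  # Prim's algorithm for a minimum spanning tree
        _, u, p = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        if u != p:
            children[p].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(heap, (dist[u][v], v, u))
    # Depth-first traversal; recording each vertex only on its first
    # visit is exactly the short-circuiting step.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    tour.append(0)  # close the tour
    return tour
```

The order in which vertices are first encountered is exactly the short-circuited tour, so its cost is at most twice the cost of the spanning tree, and hence at most twice the optimum.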
Does this imply that every optimization problem admits a 2-approximation algorithm? It turns out that not even an n^{1−ε}-approximation algorithm, for any positive ε, is known for graph coloring. On the other hand,
the knapsack problem has a (1 + ε)-approximation algorithm for every positive ε, while the same is not known for Δ-TSP. Thus, while the theory of NP-completeness managed to unify the study of optimization problems, the theory of "approximability" managed to fragment the picture. Until 1990, however, it was not generally known whether the inability to find better approximation algorithms was due to some inherent reason, or merely to our lack of innovation. This is where the PCP theory came to the rescue.
In their seminal work, Feige et al.15 came up with a startling result. They showed that the existence of a PCP verifier as in the PCP theorem (note that their work preceded the PCP theorem in the form stated here, though weaker variants were known) implied that if the independent set size in a graph could be approximated to within any constant factor, then NP would equal P! Given a PCP verifier V and an assertion A, they constructed, in polynomial time, a graph G = G_{V,A} with the property that every independent set in G corresponded to a potential "proof" of the truth of A, with the size of the independent set proportional to the probability with which the verifier would accept that "proof." Thus if A were true, then there would be a large independent set in the graph, of size, say, K. On the other hand, if A were false, every independent set would be of size at most (1 − ε)K. Thus the gap in the acceptance probability of the verifier turned into a gap in the size of the independent set. A 1/(1 − ε/2)-approximation algorithm would either return an independent set of size greater than (1 − ε/2)K, in which case A must be true, or an independent set of size at most (1 − ε)K, in which case we may conclude that A is false. Thus a 1/(1 − ε/2)-approximation algorithm for independent sets suffices to get an algorithm to decide the truth of assertions, which is an NP-complete task.
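This construction, and the decision procedure it enables, can be rendered schematically as follows. This is our own sketch against a made-up verifier interface: `queries(R)` lists the proof positions read on randomness R, `accepts(R, view)` is the verifier's accept predicate, and `approx_max_is` stands for the hypothesized 1/(1 − ε/2)-approximation algorithm; none of these names come from the original paper.

```python
from itertools import product

def fglss_graph(random_strings, queries, accepts):
    """Vertices are 'accepting views' (R, answers): a random string R
    plus one-bit answers to the queries made on R that lead the
    verifier to accept.  Two views are adjacent iff they answer some
    common query differently, so an independent set is a consistent
    partial proof, and its size counts the random strings on which
    the verifier accepts that proof."""
    vertices = []
    for R in random_strings:
        qs = queries(R)
        for bits in product([0, 1], repeat=len(qs)):
            view = dict(zip(qs, bits))
            if accepts(R, view):
                vertices.append((R, view))
    edges = [(i, j)
             for i, (_, v1) in enumerate(vertices)
             for j, (_, v2) in enumerate(vertices)
             if i < j and any(v1[q] != v2[q] for q in v1.keys() & v2.keys())]
    return vertices, edges

def decide_assertion(graph, K, eps, approx_max_is):
    """K is the number of random strings, i.e., the independent set
    size when the assertion is true.  approx_max_is is assumed to
    return an independent set of size >= (1 - eps/2) * optimum."""
    S = approx_max_is(graph)
    # True assertion:  |S| >= (1 - eps/2) * K > (1 - eps) * K.
    # False assertion: every independent set has size <= (1 - eps) * K.
    return len(S) > (1 - eps) * K
```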
The natural next question is whether the connection between independent set approximation and PCPs is an isolated one. After all, different problems do behave very differently with respect to their approximability, so there is no reason to believe that PCPs would also yield inapproximability results for other optimization problems. Fortunately, it turns out that PCPs do yield inapproximability results for many other optimization problems. The result of Feige et al. was followed shortly thereafter by that of Arora et al.3 who showed that for a broad collection of problems there were nontrivial limits to the constant factor to which they were approximable, unless NP = P. (In other words, for each problem under consideration they gave a constant α > 1 such that the existence of an α-factor approximation algorithm would imply NP = P.) This collection was the so-called MAX SNP-hard
problems. The class MAX SNP had been introduced earlier by Papadimitriou and Yannakakis30 and their work and subsequent works had shown that a varied collection of problems, including the MAX CUT problem in graphs, the Vertex Cover problem in graphs, Max 3SAT (an optimization version of 3SAT where the goal is to satisfy as many clauses as possible), Δ-TSP, Steiner trees in metric spaces, and the shortest superstring problem, were all MAX SNP-hard. Subsequently
more problems were added to this set by Lund and Yannakakis29 and Arora et al.1 The combined effect of these results was akin to that of Karp's work25 in NP-completeness. They suggested that the theory of PCPs was as central to the study of the approximability of optimization problems as NP-completeness was to their exact solvability. Over the years, many inapproximability results have been derived from the PCP machinery for a wide host of problems (see the surveys by Arora and Khot2,26 for further details). Indeed, the PCP machinery ended up yielding not only a first cut at the approximability of many problems, but in many cases a very tight analysis. Some notable results include the following:
˲ Håstad22 showed that Max 3SAT does not have an α-approximation algorithm for any α < 8/7. This is tight by a result of Karloff and Zwick24 that gives an 8/7-approximation algorithm. (The easy direction, for clauses with exactly three distinct literals, is sketched after this list.)
˲ Feige14 gave a tight inapproximability result for the Set Cover problem.
˲ Håstad21 showed that the clique size in n-vertex graphs cannot be approximated to within a factor of n^{1−ε} for any positive ε.
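The algorithmic side of the first bullet is easy to see when every clause has exactly three distinct literals (Karloff and Zwick's algorithm handles general 3SAT): a uniformly random assignment satisfies each such clause with probability 7/8, and the method of conditional expectations derandomizes this. Here is a sketch under conventions of our own, with a clause encoded as a tuple of nonzero integers whose sign indicates negation:

```python
def expected_satisfied(clauses, fixed):
    """Expected number of satisfied clauses when the variables in
    `fixed` (a dict var -> bool) are set and the rest are uniformly
    random."""
    total = 0.0
    for clause in clauses:
        unset, satisfied = 0, False
        for lit in clause:
            var, want = abs(lit), lit > 0
            if var in fixed:
                satisfied = satisfied or fixed[var] == want
            else:
                unset += 1
        # If no fixed literal satisfies the clause, it fails only when
        # every unset literal comes out wrong: probability 2**-unset.
        total += 1.0 if satisfied else 1.0 - 0.5 ** unset
    return total

def max_e3sat(clauses, num_vars):
    """Fix variables one by one, never letting the conditional
    expectation (initially 7/8 of the clause count) decrease."""
    fixed = {}
    for var in range(1, num_vars + 1):
        fixed[var] = (expected_satisfied(clauses, {**fixed, var: True}) >=
                      expected_satisfied(clauses, {**fixed, var: False}))
    return fixed
```

Because the conditional expectation starts at 7/8 of the clause count and never decreases, the final assignment satisfies at least a 7/8 fraction of all clauses, and hence at least 7/8 of the optimum.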
Again, we refer the reader to the surveys2,26 for more inapproximability results and further details.
Construction of PCPs: A Bird's-Eye View
We now give a very high-level view of the two contrasting approaches to the proof of the PCP theorem. We stress that this is not meant to give insight, but rather a sense of how the proofs are structured. To understand the two approaches, we find it useful to work with the notion of 2-prover one-round proof systems. While the notion is the same as the one defined informally in Section 2, here we define the verifier more formally and introduce the parameter corresponding to query complexity in this setting.
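Before the formal definition, a toy example may help fix the shape of such a verifier. The folklore two-prover test for graph 3-colorability, sketched below in our own notation, asks one short question of each prover; it is only an illustration, not the proof system used in the PCP theorem.

```python
import random

def two_prover_round(edges, prover1, prover2, seed=None):
    """One round of a toy 2-prover protocol for 3-colorability.
    prover1 maps an edge to claimed colors (in {0, 1, 2}) for its two
    endpoints; prover2 maps a single vertex to a claimed color.  The
    answers are a few bits long, as in the definition below."""
    rng = random.Random(seed)  # `seed` plays the role of the random string R
    u, v = rng.choice(edges)   # one question to prover 1 ...
    cu, cv = prover1((u, v))
    w = rng.choice([u, v])     # ... and one question to prover 2
    cw = prover2(w)
    # Accept iff prover 1 claims a properly colored edge and prover 2
    # agrees with prover 1 on the spot-checked endpoint.
    return cu != cv and cw == (cu if w == u else cv)
```

If the graph is 3-colorable, provers answering according to a single proper coloring are always accepted; if it is not, a short argument shows every pair of provers is rejected with probability at least 1/(2|E|), a gap, albeit a small one.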
Definition 4.1 (2IP verifier, answer size, gap). A 2IP verifier of answer size a(n) and gap ε(n) is a probabilistic algorithm V that, on input an assertion A ∈ {0, 1}*, picks a random string R ∈ {0, 1}*, makes one query each to two provers P_1, P_2 : {1, …, ℓ(n)} → {0, 1}*, and produces an accept/reject verdict, denoted V^{P_1,P_2}(A; R), subject to the following restrictions when A ∈ {0, 1}^n:
Running time: V always runs in time polynomial in n.
Answer size: The provers' answers are each a(n) bits long.
Prover length: The questions to the provers are in the range {1, …, ℓ(n)}, where ℓ(n) is a polynomial in n.
Completeness: If A is a true assertion, there exist provers P_1, P_2 such that V^{P_1,P_2}(A; R) always accepts.
Soundness, with gap ε(n): If A is not true, then for every pair of provers P_1, P_2, the probability, over the choice of the random-