compared to A. As a result, there is no hope that preconditioned Richardson’s iteration or any other preconditioned
method can lead to fast solvers.
The remedy to the problem is recursion. In a recursive
preconditioned method, the system in the preconditioner B
is not solved exactly but approximately, via a recursive invocation of the same iterative method. We now have to find a
preconditioner for B, and furthermore a preconditioner for
it and so on. This produces a multilevel hierarchy of progressively smaller graphs.
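In outline, and purely as an illustration (this is not the actual Rokhlin–Spielman–Teng procedure), a recursive preconditioned Richardson iteration might be organized as in the following sketch; the list `graphs`, the step size `tau`, and the use of a dense least-squares solve at the bottom level are all assumptions made here for concreteness, and the reduction of the residual to the smaller graph between levels is elided.

```python
import numpy as np

def recursive_solve(graphs, level, b, k, tau):
    """Illustrative sketch of a recursive preconditioned iteration.

    Each solve with the preconditioner B is replaced by a recursive,
    approximate solve on the next (smaller) graph in the hierarchy.
    In the full algorithm the residual is first reduced to the smaller
    graph and the correction lifted back; that step is elided here.
    """
    A = graphs[level]
    if level == len(graphs) - 1:
        # Smallest graph: solve (pseudo-)exactly; lstsq tolerates the
        # singular Laplacian.
        return np.linalg.lstsq(A, b, rcond=None)[0]
    x = np.zeros_like(b)
    for _ in range(k):                 # each visit here makes k visits one level down
        r = b - A @ x                  # work proportional to the edges of A
        # step size tau is a stand-in for the damping required for
        # convergence; its choice depends on the approximation quality
        x = x + tau * recursive_solve(graphs, level + 1, r, k, tau)
    return x
```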
Rokhlin, Spielman, and Teng19 analyzed a recursive
iterative method which moves between levels of the hierarchy as shown in Figure 5; for each visit at level i, the algorithm makes k visits to level i + 1. Every time the algorithm
returns to the ith level it performs matrix–vector multiplications with the graph Aᵢ, and other simpler operations; so
the work is proportional to the number of edges of Aᵢ. To
keep the total work as small as possible, that is, O(km), the
graphs in the hierarchy must get smaller sufficiently fast.
In particular, it is sufficient that the graph on level i + 1 is
smaller than the graph on level i by a factor of 1/(2k).
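To see that these numbers add up, here is an informal back-of-the-envelope count, not the formal analysis: suppose the top level performs k iterations and each visit to level i makes k visits to level i + 1, so that level i is visited at most k^(i+1) times, while the graph on level i has at most m/(2k)^i edges. Since the work per visit is proportional to the edge count,

```latex
\text{total work} \;\lesssim\; \sum_{i \ge 0} k^{\,i+1} \cdot \frac{m}{(2k)^{i}}
  \;=\; k\,m \sum_{i \ge 0} 2^{-i} \;=\; 2km \;=\; O(km).
```

The geometric decay in graph size exactly cancels the geometric growth in the number of visits.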
However, the algorithm must converge within the O(km)
time bound. To achieve this, the iterative method analyzed
within this recursive framework is Chebyshev iteration. It
requires only O(k) iterations when B is a k²-approximation
of A, compared with the O(k²) iterations required by
Richardson’s iteration. Using this fact,
Spielman and Teng arrived at design conditions that are
sufficient for a fast solver.20 It was actually shown that a
good algorithm for preconditioning extends to a good
solver. More specifically, assume that for some fixed value
C and any value of k, we have a fast algorithm that, given A,
produces a k²-approximation with n + C · m/k edges. Then
we automatically get a solver that runs in time O(k · m).
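For concreteness, the following is a minimal sketch of preconditioned Chebyshev iteration using the classical three-term recurrence; it is an illustration, not the solver of Spielman and Teng. Here `solve_B` stands for the (possibly recursive and approximate) preconditioner solve, and the eigenvalue bounds lmin = 1, lmax = k² encode the assumption that B is a k²-approximation of A.

```python
import numpy as np

def preconditioned_chebyshev(A, b, solve_B, k, num_iters):
    """Sketch of preconditioned Chebyshev iteration.

    Assumes B is a k^2-approximation of A, so the eigenvalues of the
    preconditioned operator lie in [1, k^2]; with these bounds, O(k)
    iterations reduce the error by a constant factor, versus O(k^2)
    for Richardson's iteration. Requires k > 1.
    """
    lmin, lmax = 1.0, float(k) ** 2     # spectral bounds from the approximation
    theta = (lmax + lmin) / 2.0         # center of the eigenvalue interval
    delta = (lmax - lmin) / 2.0         # half-width of the interval
    sigma = theta / delta
    rho = 1.0 / sigma
    x = np.zeros_like(b)
    r = b - A @ x                       # residual
    d = solve_B(r) / theta              # first search direction
    for _ in range(num_iters):          # num_iters = O(k) suffices
        x = x + d
        r = r - A @ d                   # one matrix-vector product per step
        rho_next = 1.0 / (2.0 * sigma - rho)
        d = rho_next * rho * d + (2.0 * rho_next / delta) * solve_B(r)
        rho = rho_next
    return x
```

One caveat when A is a graph Laplacian: the matrix is singular, so b must sum to zero and the iterates are understood up to an additive constant.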
Figure 5. The sequence of calls of a recursive iterative method. The matrix is fixed at each level.

Carefully checking the above statement, we realize that
there is a slight discrepancy. If m is close to n and k is large,
then n + C · m/k will be bigger than m, which contradicts our
promise of a multilevel hierarchy of progressively smaller
graphs. However, as observed by Vaidya,23 when m is almost
the same as n, the graph has several “tree-like” parts, and these
can be reduced via a “partial” Gaussian elimination that
runs in O(m) time. So whenever this case appears, it makes
sense to first run partial elimination. This will decrease the
vertex count n, leading to a much smaller instance on which
recursion is applicable.
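To make the partial elimination concrete, here is a minimal sketch in the spirit of Vaidya’s observation; the adjacency-dictionary representation is illustrative. Degree-1 vertices are simply removed, and a degree-2 vertex is replaced by a single edge whose weight combines the two incident weights like resistors in series; both rules are exactly what Gaussian elimination does to the Laplacian.

```python
from collections import deque

def partial_elimination(adj):
    """Greedily eliminate degree-1 and degree-2 vertices in O(m) time.

    adj: dict mapping each vertex to a dict {neighbor: weight}.
    Eliminating a degree-1 vertex simply removes it; eliminating a
    degree-2 vertex u with neighbors a, b and weights w1, w2 creates
    the edge (a, b) with weight w1*w2/(w1 + w2). A complete solver
    would also record each elimination for later back-substitution.
    """
    queue = deque(v for v in adj if len(adj[v]) <= 2)
    while queue:
        u = queue.popleft()
        if u not in adj or len(adj[u]) > 2:
            continue                      # skip stale queue entries
        if len(adj[u]) == 1:
            (a, _w) = next(iter(adj[u].items()))
            del adj[a][u]
            if len(adj[a]) <= 2:
                queue.append(a)
        elif len(adj[u]) == 2:
            (a, w1), (b, w2) = adj[u].items()
            del adj[a][u], adj[b][u]
            w = w1 * w2 / (w1 + w2)              # series rule
            adj[a][b] = adj[a].get(b, 0.0) + w   # merge parallel edges
            adj[b][a] = adj[a][b]
            for v in (a, b):
                if len(adj[v]) <= 2:
                    queue.append(v)
        del adj[u]
    return adj
```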
The multilevel analysis of Spielman and Teng is significant not only for its algorithmic value but also for its
conceptual reduction of the multilevel solver design problem to a well-defined two-level preconditioning problem,
allowing us to focus on the combinatorial component
of the solver.
4. THE COMBINATORIAL COMPONENT
Although graph theory has been used to speed up direct
methods, it took a paradigm-shifting idea of Pravin Vaidya
to initiate a systematic study of graph theory for iterative methods. In particular, Vaidya suggested the use of a
spanning tree of the graph A as a building base for the preconditioner B. A spanning tree of a graph is a connected
subgraph that touches every vertex and contains no cycles. The choice of a tree stems from
the observation that linear systems whose matrix is the
Laplacian of a tree can be solved in O(n) time via Gaussian
elimination. Adding a few edges of A back onto the tree
yields a preconditioner B that can only be better than
the tree, while still being relatively easy to solve. Vaidya’s
idea set forth two questions: (i) What is an appropriate
base tree? (ii) Which off-tree edges should be added into
the preconditioner?
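To see why trees make such convenient preconditioners, here is a minimal sketch of that O(n) elimination, under an illustrative representation: a rooted tree given by `parent` and `weight` arrays, with `order` listing parents before children. It exploits the fact that on a tree the flow each edge must carry is forced: it equals the total demand in the subtree below the edge.

```python
def solve_tree_laplacian(parent, weight, order, b):
    """Solve L x = b in O(n) when L is the Laplacian of a tree.

    parent[v]: parent of v in the rooted tree (parent[root] == root);
    weight[v]: weight of the edge (v, parent[v]);
    order: vertices listed so that every parent precedes its children;
    b: right-hand side with sum(b) == 0 (L is singular, so this is required).
    """
    n = len(b)
    flow = list(b)
    # Leaves-first sweep (Gaussian elimination of leaves): the flow on
    # edge (v, parent[v]) must equal the total demand in v's subtree.
    for v in reversed(order):
        p = parent[v]
        if p != v:
            flow[p] += flow[v]
    # Root-first sweep (back-substitution): recover the potentials,
    # grounding the root at 0 since L fixes them only up to a constant.
    x = [0.0] * n
    for v in order:
        p = parent[v]
        if p != v:
            x[v] = x[p] + flow[v] / weight[v]
    return x
```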
While these questions seem to be interrelated, we can
actually address them separately.
4.1. Low-stretch: the base spanning tree
The goal of finding a preconditioning tree B which is as similar as possible to the graph A led Vaidya to a natural idea:
use a tree which concentrates the maximum possible weight
from the total weight of the edges in A.
The maximum-weight spanning tree idea led to the first
non-trivial results, but does not suffice for our algorithm.
In fact, the weight measure cannot even distinguish among
the spanning trees of an unweighted graph, since they all have equal weight.
The key to finding a good tree to use as a building base
is the notion of stretch: For every edge (u, v) of the graph,
there is a unique “detour” path between u and v in a tree
T. The stretch of the edge with respect to T is equal to the
distortion caused by this detour, and in the unweighted
case, it is simply the length of the tree path. This notion
generalizes naturally to the weighted case, which we will
formalize in Section 4.3. The total stretch of a graph A
with respect to a tree T is the sum of the stretches of all the
off-tree edges. A low-stretch tree (LSST) is one for which
we have a good upper bound on the total stretch. So, at a
high level, an LSST has the property that it provides good
(on average) “detours” for edges of the graph. A concrete
example on a larger unweighted graph is given in Figure
6, where the tree on the right has lower total stretch and,
as it turns out, is a better base tree to add edges to.
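As a concrete companion to this definition, here is a naive sketch that computes the total stretch of an unweighted graph with respect to a spanning tree; the representations are illustrative, and it runs one BFS on the tree per off-tree edge (O(mn) overall), whereas far faster methods exist.

```python
from collections import deque

def tree_path_length(tree_adj, u, v):
    """Length of the unique u-v path in a tree, via BFS over tree edges."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        if w == v:
            return dist[w]
        for z in tree_adj[w]:
            if z not in dist:
                dist[z] = dist[w] + 1
                queue.append(z)
    raise ValueError("vertices lie in different components")

def total_stretch(edges, tree_adj, tree_edges):
    """Total stretch of an unweighted graph w.r.t. a spanning tree T.

    edges: all graph edges as (u, v) pairs;
    tree_adj: adjacency lists of T;
    tree_edges: set of frozenset({u, v}) for the edges of T.
    Sums the tree path length over the off-tree edges, matching the
    definition in the text.
    """
    return sum(tree_path_length(tree_adj, u, v)
               for (u, v) in edges
               if frozenset((u, v)) not in tree_edges)
```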