Our work also has close connections with the study of
stochastic algorithms for low-rank matrix approximation.
In this body of work, one is interested in sampling some
entries of a matrix in order to construct an approximate
factorization of this matrix. Typically, it is assumed that
one may sample any subset of entries but would like to
minimize the computational complexity involved in constructing an approximation. Pioneering work in this area appears in Frieze et al.15 and Liberty et al.,25 and an extensive survey of these methods can be found in Halko et al.19
While this body of work also uses similar foundational
theory of random matrices, our modeling assumptions are
fundamentally different. Here, we are primarily concerned
with the scenario where we have very limited control over
which entries of the matrix we can observe. In the examples
described in the introduction, one does not have access to
all of the entries of the matrix due to systemic constraints.
Surprisingly, our results demonstrate that low-rank matrices can be recovered exactly from almost all sufficiently
large subsets of entries. However, when we have the ability
to sample entries at will, the algorithmic recovery schemes
become considerably more efficient. We see our results
as complementary extremes of the sort of access one may
have to the entries of a matrix.
Indeed, when the sampling can be chosen in specially
designed patterns, the exact recovery problem becomes dramatically simpler. For example, suppose that M is generic
and that we precisely observe every entry in the first r rows
and columns of the matrix.
Write M in block form as
\[
M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix},
\]
with M11 an r × r matrix. In the special case that M11 is invertible and M has rank r, it is easy to verify that M22 = M21 M11^{-1} M12.
One can prove this identity by forming the SVD of M. That
is, if M is generic, the upper r × r block is invertible, and
we observe every entry in the first r rows and columns, we
can recover M. This result immediately generalizes to the
case where one observes r rows and r columns and the r ×
r matrix at the intersection of the observed rows and columns is invertible. Algorithms in the stochastic low-rank
matrix approximation literature are essentially no more
complicated than this simple algorithm. They use randomness to add numerical robustness and to guarantee that the
sampled entries span the row/column space of the matrix
to be acquired.
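For concreteness, here is a minimal numerical sketch of the block identity above (the NumPy code and variable names are ours and purely illustrative, not an algorithm from the cited literature):

```python
import numpy as np

# Illustrative sketch: recover a generic rank-r matrix M from its first
# r rows and columns, assuming the leading r x r block M11 is invertible,
# via the identity M22 = M21 M11^{-1} M12.
rng = np.random.default_rng(0)
n, r = 8, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # generic rank-r matrix

M11, M12 = M[:r, :r], M[:r, r:]             # observed: first r rows ...
M21 = M[r:, :r]                             # ... and first r columns
M22_hat = M21 @ np.linalg.solve(M11, M12)   # fill in the unobserved block

assert np.allclose(M22_hat, M[r:, r:])      # exact recovery of the hidden block
```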
3. RECENT ADVANCES IN LOW-RANK MODELING
Our original article announced the possibility of various refinements and extensions, and invited researchers to develop the new field of matrix completion. We are pleased to see that the area of low-rank modeling and matrix completion has been quite active, and the field is growing at a very fast pace. In fact, so many new and exciting results have been developed recently that it is unfortunately impossible to review them all here. Below, we survey selected progress that has occurred since our original submission.
3.1. Improvements and other approaches
The results discussed in Section 2.1 show that under suitable conditions, one can reconstruct an n × n matrix of rank r from a small number, m, of its sampled entries provided that m is on the order of n^{1.2} r log n, at least for moderate values of the rank. One would like to know whether
better results hold, in the sense that exact matrix recovery
would be guaranteed with a reduced number of measurements. In particular, recall that an n × n matrix of rank r
depends on (2n − r)r degrees of freedom; is it possible to
recover most low-rank matrices from on the order of nr
randomly selected entries? Can the sample size be merely
proportional to the true complexity of the low-rank object
we wish to recover?
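To make the gap concrete, consider an illustrative example with n = 10^5 and r = 10 (these numbers are ours and are meant only to fix ideas; logarithms are natural):
\[
\begin{aligned}
(2n - r)\,r &\approx 2 \times 10^{6} &&\text{(degrees of freedom)},\\
n r \log n &\approx 1.2 \times 10^{7}, \\
n^{1.2} r \log n &\approx 1.2 \times 10^{8}, \\
n^{2} &= 10^{10} &&\text{(total number of entries)}.
\end{aligned}
\]
Under the bound of Section 2.1, one would thus need to sample roughly 1% of the entries, whereas a sample size proportional to the number of degrees of freedom would amount to only about 0.02%.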
In this direction, we would like to emphasize that there
is nothing in the approach of our original paper that
stands in the way of stronger results. Our proof architecture requires bounding an infinite matrix series in the
operator norm. We develop a bound on the spectral norm
of each of the first four terms of this series and a general
argument to bound the remainder of the series in the full
version of this paper. Presumably, one could bound higher
order terms by the same techniques. Getting an appropriate bound on the fifth term would lower the exponent
of n from 6/5 to 7/6. The appropriate bound on the sixth
term would further lower the exponent to 8/7, and so on.
To obtain an optimal result, one would need to bound
O(log n) terms.
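To spell out the pattern (our own summary of the heuristic above, not a claim proved in the original paper): bounding the first k terms of the series in this way would be expected to yield a sample-size requirement of roughly
\[
m \gtrsim n^{\frac{k+2}{k+1}}\, r \log n,
\]
so that k = 4 gives the exponent 6/5, k = 5 gives 7/6, and so on; taking k on the order of log n would bring the exponent of n down to 1 + O(1/log n), that is, to within a constant factor of linear scaling in n.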
Following this main idea, Candès and Tao9 reduced the upper bound on the number of required measurements to O(nr log^{6}(n)) using a combinatorial argument to bound precisely this particular series. Their results
depend on some additional assumptions, including a
“strong incoherence condition” that is more restrictive than
the one defined in Section 2.1. However, they also show that no algorithm could succeed with high probability if fewer than Θ(nr log(n)) entries were observed.
An unexpected and clever method for approximating this infinite matrix series was invented in Gross et al.18 This
new approach used powerful large deviation bounds from
quantum information theory combined with an iterative
construction that circumvented much of the combinatorics
necessary for the proof in Candès and Tao.9 This approach
is dramatically simpler than previous approaches, and,
using this technique, it was shown that O(nr log^{2}(n)) entries were sufficient for exact matrix completion in Gross17 and Recht.29 In Recht,29 the leading constant was even upper bounded by 64.
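A representative bound of this type is the operator (noncommutative) Bernstein inequality, stated here in a standard form for orientation; the precise variants used in the cited papers differ in their details. If X_1, …, X_m are independent, zero-mean, self-adjoint d × d random matrices with ||X_k|| ≤ R almost surely, and σ² = ||Σ_k E[X_k²]||, then
\[
\mathbb{P}\left( \Bigl\| \sum_{k=1}^{m} X_k \Bigr\| \ge t \right) \le 2d \exp\!\left( \frac{-t^{2}/2}{\sigma^{2} + R t/3} \right).
\]
Bounds of this form control each step of the iterative construction directly, bypassing the term-by-term combinatorics of the series expansion.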
From a very different perspective, Keshavan et al.21 provided a non-convex algorithm for low-rank matrix recovery. Here, the authors analyze a gradient descent scheme over the Grassmannian manifold of subspaces.
Using some of the techniques developed in the full version
of this paper, the authors show that this nonconvex problem
is actually convex in a neighborhood of the true low-rank
matrix provided the number of observed entries exceeds