all components—that is, uncorrelated with the standard
basis—in order to minimize the number of observations
needed to recover a low-rank matrix. This motivates the following definition.

Definition 2.2. Let U be a subspace of $\mathbb{R}^n$ of dimension r and let $P_U$ be the orthogonal projection onto U. Then the coherence of U (vis-à-vis the standard basis $(e_i)$) is defined to be

$$\mu(U) \equiv \frac{n}{r} \max_{1 \le i \le n} \|P_U e_i\|^2.$$
for some b > 2, then the minimizer to the problem (2.3) is unique and equal to M with probability at least $1 - cn^{-b}$. For $r \le \mu_0^{-1} n^{1/5}$, this estimate can be improved to

$$m \ge C\,\mu_0\, n^{6/5} r\,(b \log n)$$

with the same probability of success.
Note that for any subspace, the smallest $\mu(U)$ can be is 1, achieved, for example, if U is spanned by vectors whose entries all have magnitude $1/\sqrt{n}$. The largest possible value for $\mu(U)$ is n/r, which would correspond to any subspace that contains a standard basis element. Matrices whose column and row spaces have low coherence are unlikely to vanish in too many entries and are our most likely candidates for matrices that are recoverable from a few samples. As we discuss below, subspaces sampled from the random orthogonal model (Definition 2.1) have nearly minimal coherence.
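These two extremes are easy to verify numerically: given an orthonormal basis of U, $\|P_U e_i\|^2$ is simply the squared norm of the i-th row of the basis matrix. A minimal numpy sketch (the helper name `coherence` is ours, not the paper's):

```python
import numpy as np

def coherence(Q):
    """mu(U) = (n/r) * max_i ||P_U e_i||^2, where the columns of Q form an
    orthonormal basis of U; ||P_U e_i||^2 is the squared norm of row i of Q."""
    n, r = Q.shape
    return (n / r) * (Q ** 2).sum(axis=1).max()

n, r = 100, 5

# Minimal coherence: a subspace spanned by a vector whose entries
# all have magnitude 1/sqrt(n).
flat = np.ones((n, 1)) / np.sqrt(n)
print(coherence(flat))    # approximately 1.0, the smallest possible value

# Maximal coherence: a subspace containing standard basis elements.
spiky = np.eye(n)[:, :r]
print(coherence(spiky))   # n/r = 20.0, the largest possible value
```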
To state our main result, we introduce two assumptions about an $n_1 \times n_2$, rank r matrix M whose SVD is given by (1.1) and with column and row spaces denoted by U and V, respectively.
Theorem 2.3, proven in the full version of this paper, asserts that if the coherence is low, few samples are required to recover M. For example, if $\mu_0$ is a small constant and the rank is not too large, then the recovery is exact with large probability provided that

$$m \ge C\, n^{6/5} r \log n. \qquad (2.5)$$
We give two illustrative examples of matrices with incoherent column and row spaces. This list is by no means exhaustive.
A0. The coherences obey $\max(\mu(U), \mu(V)) \le \mu_0$ for some $\mu_0 > 0$.

A1. The $n_1 \times n_2$ matrix $\sum_{1 \le k \le r} u_k v_k^*$ has a maximum entry bounded by $\mu_1 \sqrt{r/(n_1 n_2)}$ in absolute value for some $\mu_1 > 0$.
These definitions implicitly define two critical parameters, $\mu_0$ and $\mu_1$. These $\mu$'s may depend on r and $n_1$, $n_2$. Moreover, note that A1 always holds with $\mu_1 = \mu_0 \sqrt{r}$, since the (i, j)th entry of the matrix $\sum_{1 \le k \le r} u_k v_k^*$ is given by $\sum_{1 \le k \le r} u_k(i) v_k(j)$ and, by the Cauchy–Schwarz inequality,

$$\Big|\sum_{1 \le k \le r} u_k(i)\, v_k(j)\Big| \le \sqrt{\sum_{1 \le k \le r} |u_k(i)|^2}\;\sqrt{\sum_{1 \le k \le r} |v_k(j)|^2} \le \frac{\mu_0 r}{\sqrt{n_1 n_2}}.$$
1. The first example is the random orthogonal model (see Definition 2.1). For values of the rank r greater than log n, $\mu(U)$ and $\mu(V)$ are O(1) and $\mu_1 = O(\log n)$, both with very large probability. Hence, the recovery is exact on most sampling sets provided that $m \ge C n^{5/4} r \log n$. When $r \le n^{1/5}$, we can strengthen this bound to $m \ge C n^{6/5} r \log n$.
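The near-minimal coherence of the random orthogonal model is easy to observe empirically: orthonormalizing an i.i.d. Gaussian matrix by QR produces a uniformly (Haar) distributed subspace. A small sketch under these assumptions (helper names are ours):

```python
import numpy as np

def coherence(Q):
    # mu(U) = (n/r) * max_i ||P_U e_i||^2 for an orthonormal basis Q of U
    n, r = Q.shape
    return (n / r) * (Q ** 2).sum(axis=1).max()

rng = np.random.default_rng(0)
n, r = 2000, 20
# QR of an i.i.d. Gaussian matrix yields a uniformly distributed
# orthonormal r-frame, i.e., a draw from the random orthogonal model.
Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
mu = coherence(Q)
# mu always lies in [1, n/r]; for a random subspace it is a small
# constant, far below the worst case n/r = 100 here.
print(mu)
```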
2. The second example is more general and simply requires that the components of the singular vectors of M are small. Assume that the $u_j$'s and $v_j$'s obey

$$\max_{i,j} |\langle e_i, u_j \rangle|^2 \le \mu_B/n, \qquad \max_{i,j} |\langle e_i, v_j \rangle|^2 \le \mu_B/n \qquad (2.6)$$
Hence, for sufficiently small ranks, $\mu_1$ is comparable to $\mu_0$.
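Because the Cauchy–Schwarz argument above is deterministic, the relation $\mu_1 \le \mu_0 \sqrt{r}$ can be checked mechanically for any pair of orthonormal bases; the sketch below does so for a random pair (all helper names are ours):

```python
import numpy as np

def coherence(Q):
    # mu(U) = (n/r) * max_i ||P_U e_i||^2 for an orthonormal basis Q of U
    n, r = Q.shape
    return (n / r) * (Q ** 2).sum(axis=1).max()

rng = np.random.default_rng(1)
n1, n2, r = 300, 200, 5
U, _ = np.linalg.qr(rng.standard_normal((n1, r)))
V, _ = np.linalg.qr(rng.standard_normal((n2, r)))

mu0 = max(coherence(U), coherence(V))
# Smallest mu1 for which A1 holds: the largest entry of sum_k u_k v_k^*,
# measured in units of sqrt(r / (n1 * n2)).
E = U @ V.T
mu1 = np.abs(E).max() / np.sqrt(r / (n1 * n2))

# Cauchy-Schwarz guarantees mu1 <= mu0 * sqrt(r), whatever U and V are.
assert mu1 <= mu0 * np.sqrt(r) + 1e-9
```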
We say that a subspace U ⊂ $\mathbb{R}^n$ is incoherent with the standard basis if $\mu(U)$ is at most logarithmic in n. As we show in the full version of this paper, for larger ranks, both subspaces selected from the uniform distribution and spaces constructed as the span of singular vectors with bounded entries are not only incoherent with the standard basis but also obey A1 with high probability for values of $\mu_1$ at most logarithmic in $n_1$ and/or $n_2$.
We are now in a position to state our main result: if a
matrix has row and column spaces that are incoherent with
the standard basis, then nuclear norm minimization can
recover this matrix from a random sampling of a small number of entries.
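As a concrete illustration of this recovery phenomenon, the sketch below completes a random rank-2 matrix from roughly half of its entries. It uses singular value thresholding (the SVT algorithm of Cai, Candès, and Shen) as a lightweight stand-in for the semidefinite programming solvers discussed later; all parameter choices and names here are ours:

```python
import numpy as np

def svd_shrink(Y, tau):
    # Prox operator of tau * (nuclear norm): soft-threshold the singular values.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def svt_complete(M_obs, mask, tau, delta, iters=300):
    # SVT iteration: X_k = shrink(Y_{k-1}, tau); Y_k = Y_{k-1} + delta * P_Omega(M - X_k)
    Y = np.zeros_like(M_obs)
    for _ in range(iters):
        X = svd_shrink(Y, tau)
        Y += delta * mask * (M_obs - X)
    return X

rng = np.random.default_rng(0)
n, r, p = 50, 2, 0.5                   # size, rank, sampling fraction
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < p          # entries observed, sampled uniformly
# tau = 5n as in the SVT paper; delta = 1.0 is a conservative step size.
X = svt_complete(M * mask, mask, tau=5 * n, delta=1.0)
print(np.linalg.norm(X - M) / np.linalg.norm(M))
```

The printed relative error shrinks as the iteration count grows; recovery in the sense used in the experiments below corresponds to this error falling below a small threshold.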
for some value of $\mu_B = O(1)$. Then the maximum coherence is at most $\mu_B$ since $\mu(U) \le \mu_B$ and $\mu(V) \le \mu_B$. Further, we show in the full version of this paper that A1 holds most of the time with $\mu_1$ at most logarithmic in n. Thus, for matrices with singular vectors obeying (2.6), the recovery is exact provided that m obeys (2.5) for values of the rank not exceeding $\mu_B^{-1} n^{1/5}$.
Theorem 2.3. Let M be an $n_1 \times n_2$ matrix of rank r obeying A0 and A1 and put $n = \max(n_1, n_2)$. Suppose we observe m entries of M with locations sampled uniformly at random. Then there exist constants C, c such that if

$$m \ge C \max(\mu_1^2,\ \mu_0^{1/2}\mu_1,\ \mu_0 n^{1/4})\, nr\,(b \log n) \qquad (2.4)$$
2.2. Numerical validation
To demonstrate the practical applicability of the nuclear
norm heuristic for recovering low-rank matrices from their
entries, we conducted a series of numerical experiments
for a variety of matrix sizes n, ranks r, and numbers of entries m. For each (n, m, r) triple, we repeated the following procedure 50 times. We generated M, an n × n matrix of rank r, by sampling two n × r factors $M_L$ and $M_R$ with i.i.d. Gaussian entries and setting $M = M_L M_R^T$. We sampled a subset Ω of m entries uniformly at random. Then, the nuclear norm minimization problem was solved using the semidefinite programming solver SeDuMi.33 We declared M to be recovered if the solution returned by the solver, $X_{\mathrm{opt}}$, satisfied $\|X_{\mathrm{opt}} - M\|_F / \|M\|_F < 10^{-3}$. Figure 2 shows the results of
these experiments for n = 50. The x-axis corresponds to the
fraction of the entries of the matrix that are revealed to the
SDP solver. The y-axis corresponds to the ratio between the