found in the full version of this paper.
To see that our theorem provides conditions under which
( 2. 7) can be solved via nuclear norm minimization, note that
there exist unitary transformations f and G such that ej = ffj
and ej = Ggj for each j = 1, …, n. Hence,
Then, if the conditions of Theorem 2. 3 hold for the matrix
fXG T, it is immediate that nuclear norm minimization finds
the unique optimal solution of ( 2. 7) when we are provided a
large enough random collection of the inner products f Ti Mg j.
In other words, all that is needed is that the column and row
spaces of M be, respectively, incoherent with the bases (fi)
and (gi).
2. 4. Connections, alternatives, and prior art
Nuclear norm minimization is a recent heuristic introduced by Fazel14 and is an extension of the trace heuristic
often used in control theory; see, for example, Beck and
D’Andrea.
3 Indeed, when the matrix variable is symmetric
and positive semidefinite, the nuclear norm of X is the sum
of the (nonnegative) eigenvalues and thus equal to the trace
of X. Hence, for positive semidefinite unknowns, X, ( 2. 3)
becomes the semidefinite program
Even for the general matrix M, which may not be positive
definite or even symmetric, the nuclear norm heuristic can
be formulated in terms of semidefinite programming. The
program ( 2. 3) is equivalent to
with optimization variables X, W1, and W2 (see, e.g.,
Fazel14). There are many efficient algorithms and high-quality software packages available for solving these types
of problems.
one cannot hope to reconstruct x but assume now that the
object we wish to recover is known to be structured in the
sense that it is sparse (or approximately sparse). This means
that the unknown object depends upon a smaller number
of unknown parameters. Then, it has been shown that
1
minimization—minimizing the sum of the absolute values
of x, subject to the linear constraints y = Fx—allows recovery of sparse signals from remarkably few measurements.
8
If F is chosen randomly from a suitable distribution, then
with very high probability, all sparse signals with about k
nonzero entries can be recovered from on the order of k log
n measurements. For instance, if x is k-sparse in the Fourier
domain, that is, x is a superposition of k sinusoids, then
it can be perfectly recovered with high probability—by
1
minimization—from the knowledge of about k log n of its
entries sampled uniformly at random.
From this viewpoint, the results in this paper greatly
extend the theory of compressed sensing by showing that
other types of interesting objects or structures, beyond
sparse signals and images, can be recovered from a limited
set of measurements. Moreover, the techniques for proving our main results build upon ideas from the compressed
sensing literature together with powerful probabilistic tools
for bounding norms of operators between Banach spaces.
Also, our notion of coherence generalizes the concept
of the same name in compressive sensing. Notably, the
authors Candès and Romberg7 introduce the notion of
the coherence of a unitary transformation U; the coherence of U is simply proportional to maxj,k|Uj,k|
2. This
quantity plays a crucial role in determining the minimal
sampling rate necessary to recover a k-sparse signal by
1
minimization.
In Recht et al.,
30 the authors studied the nuclear norm
heuristic applied to a related problem where partial infor-
mation about a matrix M is available from m equations of
the form
( 2. 8)
Here, for each k, is an i.i.d. sequence of Gaussian
or Bernoulli random variables and the sequences {a(k)} are
also independent of each other (the sequences {a(k)} and
{bk} are available to the analyst). Building on the concept
of restricted isometry in the context of sparse signal recovery, Recht et al.
30 establish the first sufficient conditions
for which the nuclear norm heuristic returns the minimum rank element in the constraint set. The authors prove
that the heuristic succeeds with large probability whenever the number m of available measurements is greater
than a constant times 2nr log n for n × n matrices of rank
r. These results do not generalize to the matrix completion problem of interest to us in this paper. The measurements in ( 2. 8) give some information about all the entries
of M, whereas in our problem information about most of
the entries is simply not available. As a consequence, our
methods are quite different and require more involved
probabilistic analysis.