DOI: 10.1145/2184319.2184343
Exact Matrix Completion
via Convex Optimization
Abstract
Suppose that one observes an incomplete subset of entries
selected from a low-rank matrix. When is it possible to complete the matrix and recover the entries that have not been
seen? We demonstrate that in very general settings, one
can perfectly recover all of the missing entries from most
sufficiently large subsets by solving a convex programming
problem that finds the matrix with the minimum nuclear
norm agreeing with the observed entries. The techniques
used in this analysis draw upon parallels in the field of
compressed sensing, demonstrating that objects other
than signals and images can be perfectly reconstructed
from very limited information.
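The convex program described in the abstract can be illustrated numerically. The sketch below uses singular value thresholding (the SVT iteration of Cai, Candès, and Shen), a standard heuristic for nuclear-norm minimization rather than the exact program analyzed in this paper; the matrix sizes, sampling rate, and step-size parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 30, 2
# Synthetic rank-r matrix and a random set of observed entries (assumptions).
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
mask = rng.random((n, n)) < 0.5          # observe roughly half the entries

def svt_complete(M, mask, tau=None, delta=None, iters=500):
    """Approximate nuclear-norm minimization by singular value
    thresholding; a heuristic sketch, not the paper's analysis."""
    n1, n2 = M.shape
    if tau is None:
        tau = 5 * np.sqrt(n1 * n2)       # threshold suggested in the SVT paper
    if delta is None:
        delta = 1.2 / mask.mean()        # step size scaled by sampling fraction
    Y = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0)) @ Vt   # soft-threshold singular values
        Y += delta * mask * (M - X)                    # correct on observed entries only
    return X

X = svt_complete(M, mask)
err = np.linalg.norm(X - M) / np.linalg.norm(M)   # relative recovery error
```

With ample sampling relative to the degrees of freedom of a rank-2 matrix, the iteration drives the relative error close to zero, consistent with the exact-recovery phenomenon the paper establishes for the nuclear-norm program.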
1. INTRODUCTION
In many practical problems of interest, one would like to
recover a matrix from a sampling of its entries. As a motivating example, consider the task of inferring answers in
a partially filled out survey in which questions are asked
to a collection of individuals. Then we can form a matrix
where the rows index the individuals and the columns
index the questions. We collect data to fill out this table,
but unfortunately, many questions are left unanswered. Is
it possible to make an educated guess about what the missing answers should be? How can one make such a guess?
Formally, we may view this problem as follows. We are
interested in recovering a data matrix M with n1 rows and
n2 columns but have access to only m of its entries, where m
is much smaller than the total number of entries, n1n2. Can
one recover the matrix M from m of its entries? In general,
everyone would agree that this is impossible without some
additional information.
In many instances, however, the matrix we wish to
recover is known to be structured in the sense that it is low-rank or approximately low-rank. (We recall for completeness that a matrix has rank r if its rows or columns span an
r-dimensional space.) Consider the following two scenarios
as prototypical examples.
˲ The Netflix problem. In the area of recommender systems, users submit ratings on a subset of entries in a database, and the vendor provides recommendations based on the user's preferences.31 Because users only rate a few items, one would like to infer their preference for unrated items. A special instance of this problem is the now famous Netflix problem.24 Users (rows of the data matrix) are given the opportunity to rate movies (columns of the data matrix), but users typically rate only very few movies so that there are very few scattered observed entries of this data matrix. Yet, one would like to complete this matrix so that the vendor (here Netflix) might recommend titles that any particular user is likely to be willing to order. In this case, the data matrix of all user-ratings may be approximately low-rank, because only a few factors contribute to an individual's tastes or preferences.
Write the singular value decomposition of the rank-r matrix M as

M = UΣV^T, (1.1)

where V^T denotes the transpose of V. Σ is an r × r diagonal matrix with real, positive elements σk > 0. U is an n × r matrix with orthonormal columns u1, …, ur. That is, uk^T uk = 1 and ui^T uj = 0 if i ≠ j. V is also n × r with orthonormal columns v1, …, vr. The column space of M is spanned by the columns of U, and the row space is spanned by the columns of V.
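The properties of the factorization (1.1) can be checked numerically; a minimal sketch with NumPy, assuming a small random rank-r matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 8, 3                               # illustrative sizes (assumption)
# Build a rank-r matrix and take its reduced SVD.
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
U, s, Vt = np.linalg.svd(M)
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]     # keep only the r nonzero singular values

assert np.allclose(M, U @ np.diag(s) @ Vt)   # M = U Sigma V^T
assert np.allclose(U.T @ U, np.eye(r))       # columns of U are orthonormal
assert np.allclose(Vt @ Vt.T, np.eye(r))     # columns of V are orthonormal
```

Truncating to the first r singular triples reproduces M exactly (up to floating-point error) precisely because M has rank r.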
The number of degrees of freedom associated with a rank-r matrix M is r(2n − r). To see this, note that Σ has r nonzero entries, and U and V each have nr total entries. Since U and V each satisfy r(r + 1)/2 orthogonality constraints, the total number of degrees of freedom is r + 2nr − r(r + 1) = r(2n − r).
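The count above can be sanity-checked with concrete numbers; the sizes n = 1000 and r = 5 below are illustrative assumptions, not values from the paper:

```python
n, r = 1000, 5   # illustrative sizes (assumption)
# Entries of Sigma, plus entries of U and V, minus the r(r+1)/2
# orthogonality constraints on each of U and V:
dof = r + 2 * n * r - 2 * (r * (r + 1) // 2)
assert dof == r * (2 * n - r)    # the closed form from the text
print(dof, n * n)                # far fewer than the n^2 entries of the matrix
```

Here the closed form gives r(2n − r) = 9975 degrees of freedom against n² = 1,000,000 entries.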
Thus, when r is much smaller than n, there are significantly fewer degrees of freedom than the total number of entries in the matrix.
The original version of this paper was published in Foundations of Computational Mathematics 9, 6 (2009), 717–772.