intellectual excitement. “People like
us are motivated by the research, not
necessarily by the money,” says Chris
Volinsky, executive director of statistics research at AT&T Research and a
member of BellKor’s Pragmatic Chaos.
“Having an academic flavor to the competition has really helped it to sustain
energy for two-and-a-half years.”
Although Netflix is unique in publicly enlisting and rewarding outside
researchers, many other companies
are fine-tuning the choices their recommender systems present to customers. Some of their efforts, like those of
Amazon, L.L. Bean, and iTunes, are obvious to users. Other companies work
behind the scenes, quietly monitoring
and personalizing the experience of
each user. But either way, user satisfaction depends not just on new and improved algorithms, but also on individual human preferences, with all of their many quirks.
The Netflix Prize has brought a lot of
attention to the field, notes John Riedl,
a professor of computer science at the
University of Minnesota. However,
Riedl worries that the Netflix Prize puts “a little too much of the focus on the algorithmic side of things, whereas I think the real action is going to happen in how you build interfaces … that expose the information in more creative and interesting ways.”
Implicit and Explicit Information
To entice its customers to rate movies,
Netflix promises to show them other
movies they will enjoy. Netflix also encourages its customers to provide detailed information about their viewing
preferences. Unfortunately, this rich,
explicit feedback demands a level of
user effort that most Web sites can’t
hope for.
Instead, many companies rely on
implicit information about customer
preferences, such as their purchasing
history. However they get the feedback, researchers must make do with a sparse data set that reveals little about many customers' tastes for most products. A further critical challenge
for Web-based recommender systems
is generating accurate results in less
than a second. To maintain a rapid response as databases grow, researchers
must continually trade off effectiveness for speed.
One popular and efficient set of
methods, called k nearest neighbors,
searches for a handful of other customers (k of them) who have chosen the
same items as the current customer.
The system then recommends other
items chosen by these “neighbors.”
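To make the idea concrete, here is a minimal Python sketch of user-based k nearest neighbors, assuming a toy in-memory purchase history and Jaccard overlap as the similarity measure; the customer names, items, and metric are all illustrative, not any company's actual system.

```python
# A minimal user-based k-nearest-neighbors sketch (illustrative data).
from collections import Counter

purchases = {
    "alice": {"item1", "item2", "item3"},
    "bob":   {"item1", "item3", "item4"},
    "carol": {"item2", "item5"},
    "dave":  {"item1", "item3", "item5"},
}

def jaccard(a, b):
    """Overlap between two item sets (0 = disjoint, 1 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(user, k=2, n=3):
    # Rank the other customers by similarity to `user`, keep the top k...
    others = sorted(((jaccard(purchases[user], items), who)
                     for who, items in purchases.items() if who != user),
                    reverse=True)
    neighbors = [who for _, who in others[:k]]
    # ...then suggest items those neighbors chose that `user` has not.
    votes = Counter(item for who in neighbors
                    for item in purchases[who] - purchases[user])
    return [item for item, _ in votes.most_common(n)]

print(recommend("alice"))  # items alice's nearest neighbors chose
```

Real deployments replace this linear scan with precomputed neighbor lists or approximate indexes, which is one place the trade-off between effectiveness and speed shows up.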
In contrast, latent factor methods
search customers’ choices for patterns
that can explain them. Some factors
have obvious interpretations, such as
a user’s preference for horror films,
while other statistically important factors have no obvious interpretation.
One advantage of latent factor methods is that they can provide recommendations for a new product that has yet to generate much consumer data.
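In the same spirit, the sketch below shows one common latent factor technique, matrix factorization trained by stochastic gradient descent; the ratings, factor count, and learning rate are invented for illustration and are not Netflix's.

```python
# A minimal latent factor (matrix factorization) sketch on toy ratings.
import numpy as np

rng = np.random.default_rng(0)
# (user index, item index, star rating) triples -- illustrative only.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
n_users, n_items, n_factors = 3, 3, 2

P = 0.1 * rng.standard_normal((n_users, n_factors))  # user factors
Q = 0.1 * rng.standard_normal((n_items, n_factors))  # item factors

lr, reg = 0.01, 0.05  # learning rate and regularization strength
for epoch in range(200):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                 # error on this known rating
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# A rating the system never saw is predicted from the learned factors.
print(P[2] @ Q[0])
```

Each column of P and Q plays the role of one latent factor: a column that separates horror fans from horror films would be easy to interpret, while other columns may predict well with no obvious meaning.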
These algorithms all aim to solve the
generic problem of correlating preferences without invoking knowledge of
the domain they refer to, such as clothing, movies, and music. In principle,
notes Joseph Konstan, a professor of
computer science and engineering at
the University of Minnesota, as long as individuals' preferences remain constant, with enough opinions from a sufficient number of people, “you
don’t need to know anything about the
domain.” In practice, Konstan says,
limited data and changing tastes can
make domain-specific approaches
more effective.
One of the most sophisticated
domain-specific approaches is used
by Internet-radio company Pandora,
which employs dozens of trained musicologists to rate songs on hundreds
of attributes. “We are of the opinion
that to make a good music recommendation system you need to understand both the music and the listeners,” says Pandora Chief Operating
Officer Etienne Handman. Still, the
most enjoyed Pandora playlists, he
says, supplement the musicologists’
sophisticated ratings with statistical
information from users.
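A rough sketch of the content-based half of such a system: once analysts have scored every song on a fixed set of attributes, finding "more like this" reduces to comparing attribute vectors. The song names, attributes, and scores below are invented, and Pandora's actual attribute set and matching logic are far richer.

```python
# A minimal content-based matching sketch over analyst-scored attributes.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

# Each song scored on a few attributes (e.g., tempo, distortion, vocals).
songs = {
    "song_a": [0.9, 0.2, 0.7],
    "song_b": [0.8, 0.3, 0.6],
    "song_c": [0.1, 0.9, 0.2],
}

seed = songs["song_a"]
ranked = sorted(((cosine(seed, vec), name)
                 for name, vec in songs.items() if name != "song_a"),
                reverse=True)
print(ranked[0][1])  # the song whose attribute profile best matches the seed
```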
Measuring Effectiveness
To win the Netflix Prize, a team must beat Cinematch by 10% on a purely statistical measure: the root mean square error of the differences between predicted and actual ratings. Like content-based assessments, however, this ob-
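Written out (a standard formulation, not quoted from the contest rules), the measure over a test set $T$ of (user, item) pairs, with actual ratings $r_{ui}$ and predicted ratings $\hat{r}_{ui}$, is

$$\mathrm{RMSE} = \sqrt{\frac{1}{|T|} \sum_{(u,i) \in T} \left(\hat{r}_{ui} - r_{ui}\right)^2},$$

so a winning entry needed an RMSE no greater than 90% of Cinematch's score on the same data.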
Obituary
Rajeev Motwani, Google Mentor, Dies at 47
Rajeev Motwani, a professor of computer science at Stanford University who mentored many students and Silicon Valley entrepreneurs, including the founders of Google, died in an apparent accidental drowning on June 5. He was 47. Motwani was well known for his theoretical research on
randomized algorithms and his
contributions in data mining,
Web search, information
retrieval, streaming databases,
and robotics. He is the author
of two classic computer
science textbooks, Randomized
Algorithms, with Prabhakar Raghavan, and Introduction to Automata Theory, Languages, and Computation, with John
Hopcroft and Jeffrey Ullman.
Motwani served as director of graduate studies in the computer science department at Stanford and founded the Mining Data at Stanford (MIDAS) project. He was known as a friendly, well-respected professor who always made himself available to advise and mentor young entrepreneurs and students, including Google cofounders Sergey Brin and Larry Page when they were graduate students at Stanford in the mid-1990s.
“The Google founders used both his technical expertise and his understanding of how technology can transition into the real world. He was helpful in that regard,” recalls Stanford computer science professor Balaji Prabhakar, a friend of Motwani. “Many former students and Silicon Valley folks have sought out Rajeev's advice and input. He was a generous person who saw the potential in people and their ideas.”
Motwani was also an angel investor and technical advisor for many startup companies in Silicon Valley, and sat on the boards of numerous companies, including Google, Kaboodle, and Mimosa Systems.
Motwani's research earned him numerous awards, including the Gödel Prize and an Alfred P. Sloan Foundation Research Fellowship.
— Wylie Wong