Alexei Efros’ remote
image-cloning
technique employs
a relatively new
approach to modeling
in which simple
machine-learning
techniques are
applied to huge
databases.
More important, the remote cloning
works because it employs a relatively
new approach to modeling in which
very simple machine-learning techniques
are applied to huge databases.
Large amounts of data can overcome
weaknesses in algorithms, Efros says.
“The standard view in computer science
has been that the most important
thing is the algorithm, then you
have the representation, then you find
some data,” he says. “But it’s actually
the other way around: The most important
thing is the data. Then it’s the
representation, and only then comes
the algorithm.”
Efros and his colleagues are applying
this principle to a new technique
that finds visually similar images even
if they are quite different at the pixel
level and are not effectively matched by
conventional techniques. It uses a
statistical technique called a support vector
machine to estimate the relative
importance of different features in a
query image. “Our approach shows good
performance on a number of difficult
cross-domain visual tasks by, for example,
matching paintings or sketches to
real photographs,” he says.
The approach by Efros’ team could
create “a new age in visual expression,”
says Columbia’s Shree Nayar. Existing
image-editing tools use classical image
processing techniques, he says, “but
now you have the opportunity to use
lots of data and machine learning.”
Efros says his ultimate goal is to
create a “visual memex”—a model for
linking images not by categories, such
as car, person, or city, but by zeroing
in on what is unusual or unique in an
image. These visual characteristics
would become, in effect, hyperlinks.
In a YouTube video titled “Data-driven
Visual Similarity for Cross-domain Image
Matching,” Efros downloads 200
images from Flickr based on a simple
keyword search for “Medici Fountain
Paris.” From them he builds a “visual
memex graph” whose nodes are images
and parts of images and whose edges
are various types of associations,
such as visual similarity and context.
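As a rough illustration of the data structure described above, the following Python sketch stores images and image regions as nodes and typed, weighted associations as edges. The class name, method names, node identifiers, and similarity scores are all hypothetical; the article does not describe how the graph is actually implemented.

```python
class VisualMemexGraph:
    """Toy graph: nodes are images or image regions; edges are typed associations."""

    def __init__(self):
        self.kinds = {}  # node id -> "image" or "region"
        self.edges = {}  # node id -> list of (neighbor id, association, strength)

    def add_node(self, node_id, kind):
        self.kinds[node_id] = kind
        self.edges.setdefault(node_id, [])

    def link(self, a, b, association, strength=1.0):
        """Add an undirected edge; both endpoints must already be nodes."""
        for node_id in (a, b):
            if node_id not in self.kinds:
                raise KeyError(f"unknown node: {node_id}")
        self.edges[a].append((b, association, strength))
        self.edges[b].append((a, association, strength))

    def related(self, node_id, association=None):
        """Return neighbors, strongest first, optionally filtered by association type."""
        hits = [(neighbor, strength)
                for neighbor, assoc, strength in self.edges[node_id]
                if association is None or assoc == association]
        return [n for n, _ in sorted(hits, key=lambda hit: hit[1], reverse=True)]


# Hypothetical nodes: one whole photo and three fountain regions.
graph = VisualMemexGraph()
graph.add_node("photo_a", "image")
graph.add_node("photo_a/fountain", "region")
graph.add_node("photo_b/fountain", "region")
graph.add_node("photo_c/fountain", "region")

# A region is tied to its photo by context and to other regions by similarity.
graph.link("photo_a", "photo_a/fountain", "context")
graph.link("photo_a/fountain", "photo_b/fountain", "visual_similarity", 0.9)
graph.link("photo_a/fountain", "photo_c/fountain", "visual_similarity", 0.4)

print(graph.related("photo_a/fountain", "visual_similarity"))
# prints ['photo_b/fountain', 'photo_c/fountain']
```

Following an edge from one fountain region to another plays the role of the "hyperlink" in this sketch: the link goes from a distinctive part of one image to the matching part of another, not from one category label to another.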
He goes on to show how his algorithms
use the graph to find far better
matches to a test image of the fountain
than traditional searching techniques.
By zeroing in on relatively small but
unique elements in his test image, the
technique avoids the superficial but
incorrect matches based on similar
skies or foregrounds that occupy large
portions of the pictures but are
irrelevant to the fountain.
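The following toy Python sketch suggests why learned feature weights can beat raw similarity in a case like this. It assumes each image is reduced to three made-up feature values (sky, lawn, fountain) and trains a small linear support vector machine with the query as the only positive example against a pool of generic scenes that share its sky and lawn. That training setup, the feature values, and every function and variable name here are illustrative assumptions, not the published implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_exemplar_svm(query, negatives, epochs=300, lr=0.1, reg=0.01):
    """Fit a linear SVM with the query as the only positive example.

    Plain subgradient descent on the L2-regularized hinge loss.
    Returns (weights, bias); a larger weight marks a feature that
    sets the query apart from the negative pool.
    """
    weights = [0.0] * len(query)
    bias = 0.0
    samples = [(query, 1.0)] + [(x, -1.0) for x in negatives]
    for _ in range(epochs):
        for x, label in samples:
            margin = label * (dot(weights, x) + bias)
            # Regularization step: shrink every weight slightly.
            weights = [w * (1.0 - lr * reg) for w in weights]
            if margin < 1.0:
                # Hinge-loss step: push the boundary to classify x correctly.
                weights = [w + lr * label * xi for w, xi in zip(weights, x)]
                bias += lr * label
    return weights, bias


# Hypothetical feature vectors: [sky, lawn, fountain].
query = [0.9, 0.8, 0.6]            # photo of the fountain
negatives = [                      # generic scenes: same sky and lawn, no fountain
    [0.9, 0.8, 0.0],
    [0.9, 0.8, 0.1],
    [0.9, 0.8, 0.05],
    [0.9, 0.8, 0.0],
]
lookalike = [0.9, 0.8, 0.0]        # similar sky and lawn, but no fountain
painting = [0.4, 0.3, 0.6]         # different palette, same fountain

weights, bias = train_exemplar_svm(query, negatives)

# Raw distance is dominated by the large, uninformative regions.
print("distance to lookalike:", round(math.dist(query, lookalike), 3))
print("distance to painting: ", round(math.dist(query, painting), 3))

# The learned weights emphasize the one feature unique to the query.
print("weighted score, lookalike:", round(dot(weights, lookalike), 3))
print("weighted score, painting: ", round(dot(weights, painting), 3))
```

In this toy setup the lookalike photo is nearer to the query in plain Euclidean distance, but the learned weights score the painting higher because it shares the fountain, the only feature that distinguishes the query from the generic scenes.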
Further Reading
Cossairt, O., Miau, D., and Nayar, S.
Gigapixel computational imaging, IEEE
International Conference on Computational
Photography, Pittsburgh, PA, April 8–10, 2011.
Kee, E., Paris, S., Chen, S., and Wang, J.
Modeling and removing spatially-varying
optical blur, IEEE International Conference
on Computational Photography, Pittsburgh,
PA, April 8–10, 2011.
Nack, J.
Adobe demos refocusable images,
http://blogs.adobe.com/jnack/2010/09/adobe-demos-refocusable-images.html,
Sept. 25, 2010.
Shrivastava, A., Malisiewicz, T.,
Gupta, A., and Efros, A.
Data-driven visual similarity for cross-domain image matching, Proceedings of
the 2011 SIGGRAPH Asia Conference, Hong
Kong, Dec. 12–15, 2011.
Szeliski, R.
Computer Vision: Algorithms and Applications,
Springer, New York, NY, 2011.
Gary Anthes is a technology writer, editor, and
photographer based in Arlington, VA.
© 2012 ACM 0001-0782/12/06 $10.00