research highlights

DOI: 10.1145/2732218

Technical Perspective
Image Processing Goes Back to Basics
By Edward Adelson

To view the accompanying paper, visit doi.acm.org/10.1145/2723694

IN RECENT YEARS, the image sensors in digital cameras have improved in many ways. The increases in spatial resolution are well known. Equally important, but less obvious, are improvements in noise level and dynamic range. At this point digital cameras have gotten so good it is challenging to display the full richness of their image data. A low-noise imager can capture subtly varying detail that can only be seen by turning up the display contrast unnaturally high. A high dynamic range (HDR) imager presents the opposite problem: its data cannot be displayed without making the contrast unnaturally low. To convey visual information to a human observer, it is often necessary to present an image that is not physically correct, but which reveals all the visually important variations in color and intensity. A discipline known as computational photography has emerged at the intersection of photography, computer vision, and computer graphics, and the twin problems of detail enhancement and HDR range compression (also called tone mapping) have become recognized as important topics.

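To make the range-compression tradeoff concrete, consider a minimal sketch of a global tone-mapping operator (the log-compression curve below is an illustrative choice, not one taken from the accompanying paper). Because every pixel passes through the same monotonic curve, squeezing a very wide range into a display necessarily flattens local contrast everywhere.

```python
import numpy as np

def global_tonemap(hdr_luminance, eps=1e-6):
    """Map HDR luminance to [0, 1] with a single global log curve.

    Illustrative sketch: since every pixel is remapped by one
    shared curve, compressing a scene spanning several orders of
    magnitude leaves each local patch with unnaturally low contrast.
    """
    log_l = np.log(hdr_luminance + eps)
    return (log_l - log_l.min()) / (log_l.max() - log_l.min())
```
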
Given an individual image patch, it is not difficult to find display parameters that will effectively convey the local visual information. The problem is that this patch must coexist with all the other image patches around it, and these must join into a single, globally coherent image. Many techniques have been proposed to find an image that simultaneously displays everything clearly, while still looking like a natural image. In struggling to bring about a global compromise between all the local constraints, these techniques tend to introduce visually disturbing artifacts, such as halos around strong edges, or distortions of apparent contrast, sharpness, and position of local features.

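The halo artifact is easy to reproduce. The following sketch (a hypothetical 1-D example, not code from the paper) enhances detail with a linear base/detail decomposition; because the Gaussian base layer blurs across the step edge, the amplified residual overshoots on both sides of it.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Hypothetical scanline: a strong edge plus low-amplitude texture.
x = np.linspace(0.0, 1.0, 512)
scanline = np.where(x < 0.5, 0.2, 0.8) + 0.02 * np.sin(80 * np.pi * x)

# Linear base/detail decomposition: blur to get a "base" layer,
# then amplify the residual "detail" layer.
base = gaussian_filter1d(scanline, sigma=15)
detail = scanline - base
enhanced = base + 4.0 * detail

# The blur leaks across the step, so near it the detail layer
# contains a large edge residual; amplifying it creates bright
# and dark bands flanking the edge -- the classic halo.
print("overshoot beyond true range:", enhanced.max() - scanline.max())
```
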
Performance has improved through the use of increasingly sophisticated image processing techniques, which can manipulate information smoothly across multiple spatial scales, while preserving the integrity of sharp edges. Recent progress in “edge-aware” processing builds on a foundation of work in such topics as anisotropic diffusion, regularization, and sparse image coding. New classes of edge-aware filters have been devised, utilizing ideas from robust estimation. Novel forms of wavelet decomposition have been introduced, specifically to deal with the challenges of processing sharp edges within a multiscale representation. However, none of the methods has proven entirely satisfactory, and some of them are quite complex.

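One canonical member of this family is the bilateral filter, which can be read as a robust, outlier-rejecting local average: neighbors whose intensities differ by much more than the noise level, that is, neighbors on the far side of an edge, receive negligible weight. A minimal 1-D sketch (illustrative only, not the accompanying paper's method):

```python
import numpy as np

def bilateral_1d(signal, sigma_s=5.0, sigma_r=0.1, radius=15):
    # Edge-aware smoothing in the robust-estimation spirit: each
    # output sample is a weighted average of its neighbors, but
    # neighbors across a large intensity jump (an edge) receive
    # almost no weight, so the filter does not blur across edges.
    signal = np.asarray(signal, dtype=float)
    out = np.empty_like(signal)
    n = len(signal)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        idx = np.arange(lo, hi)
        window = signal[lo:hi]
        spatial = np.exp(-0.5 * ((idx - i) / sigma_s) ** 2)
        rng = np.exp(-0.5 * ((window - signal[i]) / sigma_r) ** 2)
        w = spatial * rng
        out[i] = np.sum(w * window) / np.sum(w)
    return out
```
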
In the following paper, Paris et al. made a surprising move. They chose to build a system on the Laplacian pyramid, which is a very simple multiscale representation that predates wavelets. It lacks an impressive mathematical pedigree, but is still widely used because of its simplicity and reliability; it serves as a basic building block for many image-processing schemes. At the same time, the Laplacian pyramid seems ill suited to any tasks involving specialized processing near edges. Its basis functions are smooth, overlapping, and non-oriented, whereas edges are sharply localized and oriented.

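For readers unfamiliar with the structure, here is a minimal numpy sketch of a Laplacian pyramid for grayscale images (the blur and resampling choices are simplified stand-ins for the classic small-kernel construction; reconstruction is exact by construction, because collapsing repeats the same upsampling calls used during analysis):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def build_pyramid(img, levels=4):
    """Laplacian pyramid: band-pass levels plus a low-pass residual."""
    img = np.asarray(img, dtype=float)
    bands = []
    current = img
    for _ in range(levels):
        low = gaussian_filter(current, sigma=1.0)   # blur
        down = low[::2, ::2]                        # decimate
        up = zoom(down, (current.shape[0] / down.shape[0],
                         current.shape[1] / down.shape[1]), order=1)
        bands.append(current - up)   # detail at this scale
        current = down
    bands.append(current)            # low-pass residual
    return bands

def collapse(bands):
    """Invert build_pyramid: upsample and add, coarse to fine."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = zoom(current, (band.shape[0] / current.shape[0],
                                 band.shape[1] / current.shape[1]), order=1)
        current = current + band
    return current

# Round trip: collapse(build_pyramid(img)) reproduces img exactly.
# Multiscale manipulations edit the band-pass levels in between.
```
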
The authors also eschew a wide range of modern techniques. Indeed, the most striking thing about the paper is what is missing: There are no statistical image models, no machine learning, no PDEs, no fancy wavelets, and no objective functions. Instead, the authors return to an old-fashioned style rarely seen today: carefully considering a problem at the level of pixels and patches, and specifying the requirements in the most direct possible way. It should be noted that these authors are fully capable of developing elaborate machinery when they need it, but they have chosen to avoid it here. They want to rethink the problem from the ground up, setting out basic principles about the behavior they desire with edges, textures, and smooth regions.

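One can get a feel for that style from a sketch of the kind of pointwise remapping such principles might translate into (the functional form and parameter names below are an illustrative guess, close in spirit to, but not necessarily identical to, what the authors specify): differences smaller than a threshold are treated as texture and can be amplified or smoothed, while larger differences are treated as edges and can be compressed or boosted, without either contaminating the other.

```python
import numpy as np

def remap(values, g, sigma_r=0.2, alpha=0.5, beta=1.0):
    """Pointwise remapping around a reference intensity g.

    Illustrative form only. Differences within sigma_r of g are
    treated as texture and reshaped by alpha (alpha < 1 enhances
    detail, alpha > 1 smooths); larger differences are treated as
    edges and scaled by beta (beta < 1 compresses the range,
    beta > 1 exaggerates it).
    """
    d = values - g
    mag, sgn = np.abs(d), np.sign(d)
    texture = sgn * sigma_r * (mag / sigma_r) ** alpha
    edge = sgn * (beta * (mag - sigma_r) + sigma_r)
    return g + np.where(mag <= sigma_r, texture, edge)
```

In a local Laplacian-style scheme, a function of this kind would be applied around each pyramid coefficient before the pyramid is collapsed; the point is that the desired behavior at edges, in texture, and in smooth areas is written down directly, with no objective function in sight.
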
Their new direction is quite unexpected. To make an analogy, it is almost as if some experts in 3D manufacturing decided to abandon their CAD systems and 3D printers in order to sculpt marble with a hammer and chisel. Sometimes the fancy tools get in the way, and the best thing is to get back in direct contact with the material.

The results in this case are stunning. The authors are able to achieve extreme levels of detail enhancement and HDR range compression. There are almost no visible artifacts. It is difficult to believe anyone can do much better, and in that sense one could say the problems have been solved.

So, is this paper the last word? No, because beautiful pictures are not enough. It is still important to situate the work intellectually within the greater worlds of image processing and computational photography. How do these techniques relate to the many other approaches to detail enhancement and HDR range compression? How can the insights from this paper be integrated into methods that are couched in other languages, such as wavelets or image statistics? More generally, what does this paper teach us about the underlying problems of edge-aware image processing? There is already progress on these questions, as noted in the revised version of the research that Paris et al. present here. We can expect more insights to follow, as people digest the results of this refreshingly original paper.

Edward Adelson (adelson@csail.mit.edu) is the John and Dorothy Wilson Professor of Vision Science in the Department of Brain and Cognitive Sciences at MIT, Cambridge, MA.

Copyright held by author.