contained in the matching scene are considered. Since we
expect our matching scenes to already be roughly aligned
with the incomplete image, we discourage spurious, distant
matches by multiplying the error at each placement by the
magnitude of the offset.
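This placement scoring can be sketched in a few lines. The sketch below assumes a single-channel image, a boolean validity mask, and a hypothetical penalty of the form 1 + |offset|; the paper does not specify the exact weighting, so the constants and function names here are illustrative only.

```python
import numpy as np

def placement_error(query, match, mask, offset):
    """Score one placement of a matching scene over the incomplete image.

    query  : HxW float array, the incomplete image (one channel for brevity)
    match  : HxW float array, the candidate scene already shifted by `offset`
    mask   : HxW bool array, True where query pixels are valid (outside the hole)
    offset : (dy, dx) translation applied to the scene

    Pixel-wise SSD over the valid context, multiplied by a factor that
    grows with the offset magnitude so distant placements are penalized.
    """
    ssd = float(np.sum(((query - match) ** 2)[mask]))
    dy, dx = offset
    penalty = 1.0 + np.hypot(dy, dx)  # assumed form: 1 + |offset|
    return ssd * penalty
```

Lower scores are better; the zero-offset placement of an otherwise identical match always wins.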
We composite each matching scene into the incomplete image at its best-scoring placement using a form of graph-cut seam finding15 and standard Poisson blending.17 Using a seam-finding operation to composite the images is arguably undesirable for hole filling, since a user might want to preserve all of the image data in the input image. Past image completion algorithms4 have treated the remaining valid pixels in an image as hard constraints which are not changed. We relax this, as in,25 and allow the seam-finding operation to remove valid pixels from the query image, but we discourage the cutting of too many pixels by adding a small cost for removing each pixel in the query image which increases with distance from the hole.
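The pixel-removal cost can be sketched as a per-pixel map fed into the graph cut. The constants and the Chebyshev distance metric below are assumptions for illustration, not the paper's actual values; a real implementation would use a fast distance transform rather than this brute-force loop.

```python
import numpy as np

def cut_costs(hole_mask, base=0.002, per_pixel=0.001):
    """Per-pixel cost for removing a valid query pixel during seam finding.

    hole_mask : HxW bool array, True inside the hole.
    Returns an HxW array: zero inside the hole (those pixels must come
    from the match anyway), and a small cost that grows with Chebyshev
    distance from the hole elsewhere.
    """
    H, W = hole_mask.shape
    hole_pts = np.argwhere(hole_mask)
    costs = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            if hole_mask[y, x]:
                continue
            # distance from this valid pixel to the nearest hole pixel
            d = np.abs(hole_pts - (y, x)).max(axis=1).min()
            costs[y, x] = base + per_pixel * d
    return costs
```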
When performing a seam-finding operation and gradient-domain fusion in sequence, it makes sense to optimize the seam such that it will minimize the artifacts left behind by the gradient-domain fusion. Jia et al.11 use iterative dynamic programming optimizations to find a seam which leaves minimum intensity difference between the two images after allowing some constant intensity offset. The intuition is that humans are not sensitive to relatively large shifts in color and intensity as long as the shifts are seamless and low frequency. Inspired by this, as well as by the fact that our scene matches tend to have colors similar to our query image, we propose a seam-finding heuristic which ignores intensity differences entirely and instead minimizes the difference in image gradients. This heuristic finds seams that avoid cutting high-frequency image content, for which Poisson blending would cause obvious artifacts.
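A minimal per-edge version of this gradient-difference cost, for a single channel, might look like the following. This is a sketch of the heuristic, not the paper's exact graph-cut energy; the function name and signature are hypothetical.

```python
import numpy as np

def seam_edge_cost(img_a, img_b, p, q):
    """Cost of placing the seam between neighboring pixels p and q.

    Instead of comparing intensities, compare the gradients of the two
    images along the p-q edge: if both images change similarly there,
    Poisson blending can hide the transition, so the cut is cheap.
    """
    (y1, x1), (y2, x2) = p, q
    grad_a = img_a[y2, x2] - img_a[y1, x1]  # gradient of image A along the edge
    grad_b = img_b[y2, x2] - img_b[y1, x1]  # gradient of image B along the edge
    return abs(grad_a - grad_b)
```

An edge crossing a strong gradient present in only one image is expensive, which is exactly the high-frequency content the text says the seam should avoid cutting.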
An advantage of our heuristic is that we can quickly find the optimal seam using a graph cut procedure.3 The details of our graph cut setup can be found in our original SIGGRAPH publication.9 A very similar metric was mentioned, but not demonstrated, in.1 As in,1 we allow the Poisson blending to operate on the entire image domain instead of correcting only one of the images. We use an existing Poisson solver.2
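Full-domain gradient fusion can be illustrated with a deliberately simple Jacobi-iteration Poisson solve. This is a pedagogical sketch only: it solves for an image whose gradients match a target field, with boundary values held fixed, whereas a production solver (like the one the text cites) would use far faster methods.

```python
import numpy as np

def poisson_blend(grad_x, grad_y, init, iters=3000):
    """Solve for an image whose gradients match a target gradient field.

    grad_x, grad_y : HxW arrays, grad_x[y, x] = I[y, x+1] - I[y, x]
                     (last column/row zero), the desired gradient field.
    init           : HxW array; its boundary values are held fixed.

    Plain Jacobi iteration on the discrete Poisson equation:
    laplacian(result) = divergence(target field).
    """
    img = init.astype(float).copy()
    for _ in range(iters):
        # divergence of the target gradient field on the interior
        div = (grad_x[1:-1, 1:-1] - grad_x[1:-1, :-2]
               + grad_y[1:-1, 1:-1] - grad_y[:-2, 1:-1])
        # Jacobi update: average of the four neighbors minus the divergence
        img[1:-1, 1:-1] = 0.25 * (img[1:-1, 2:] + img[1:-1, :-2]
                                  + img[2:, 1:-1] + img[:-2, 1:-1] - div)
    return img
```

Operating on the whole domain (rather than correcting only one image) means both sides of the seam can shift in intensity, which is what lets low-frequency color differences be absorbed invisibly.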
Finally, we assign each composite a score which is the sum
of the scene matching distance, the local context matching
distance, the local texture energy distance, and the cost of
the graph cut. All four components of the score are scaled to
contribute roughly equally. We present the user with the 20
composites with the lowest scores.
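The final ranking reduces to a weighted sum over the four terms followed by a sort. The equal weights below are placeholders; the actual scaling that brings the components into rough parity is dataset-dependent and not specified here, and the function names are hypothetical.

```python
def composite_score(scene_d, context_d, texture_d, cut_cost,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    """Final score for one composite: a weighted sum of the four terms
    (scene match, local context match, local texture energy, graph-cut cost)."""
    terms = (scene_d, context_d, texture_d, cut_cost)
    return sum(w * t for w, t in zip(weights, terms))

def rank_composites(candidates, k=20):
    """candidates: list of (name, scene_d, context_d, texture_d, cut_cost).
    Returns the names of the k lowest-scoring (best) composites."""
    scored = [(composite_score(*c[1:]), c[0]) for c in candidates]
    return [name for _, name in sorted(scored)[:k]]
```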
5. Results and Comparison
We test our algorithm on a set of 51 images with missing regions. All test images come from either the LabelMe database18 or our own personal photo collections, none of which are contained in our downloaded database. Images in the test set, about half a megapixel each, are higher resolution than the images used in most previous works4,6 and, likewise, the missing regions are quite large (56,000 pixels on average). The regions we removed from the photos all had semantic meaning, such as unwanted objects, storefronts, walls with graffiti, roads, etc. The test set is made freely available on the authors' Web page.
Image completion is an inherently underconstrained problem: there are many viable ways to fill a hole in an image. Previous approaches, which operate by reusing texture from the input image, can offer relatively few viable alternative completions (perhaps by changing parameters such as the patch size). While some such results might look slightly better than others, the semantic interpretation of the image is unlikely to change. On the other hand, our algorithm can offer a variety of semantically valid image completions for a given query image (Figure 4). After compositing the best-matching scenes, we present the user with the top 20 image completions (roughly equivalent to one page of image search results). In some cases, many of these completions are of acceptable quality and the user can select the completion(s) they find most compelling. In other cases, only a few or none of the results are acceptable. The quality of our results benefits from this very simple user interaction, and it
Figure 4: The algorithm presents to the user a set of alternative image completions for each input. Here, we show three such alternatives. (Panels: original, input, alternative completions.)