figure 1. manipulating images using our interactive tools. Left to right: (a) the original image; (b) a hole is marked (magenta) and we use line
constraints (red/green/blue) to improve the continuity of the roofline; (c) the hole is filled in; (d) user-supplied line constraints for retargeting;
(e) retargeting using constraints eliminates two columns automatically; and (f) user translates the roof upward using reshuffling.
(a) Original
(b) Hole + constraints
(c) Hole filled
(d) Constraints (e) Constrained retarget (f) Reshuffle
figure 2. Given an approximate match between patches with patch
distance D, these 2D histograms show peaked distributions of
the (x, y) coordinates where better matches are located. a better
initial match with lower distance D causes more peaking. this is
averaged over a dataset of more than 100 similar and dissimilar
natural image pairs. Note the center pixel is black, but this is not
visible at print resolution.
D = 50
D = 25
D = 10
advantage of these statistics. Specifically, we look at the
correlation between adjacent patches, and find that they
are highly correlated. For example, as shown in Figure 2,
given an approximate match between patches with patch
distance D, the locations of matches with patch distance
less than D are not uniformly distributed throughout the
image, but instead follow a peaked distribution. We can
also ask in the ground truth nearest neighbor matches,
for two patches that are horizontally or vertically adjacent, how far apart spatially are their matches? This is
visualized as a histogram in Figure 3 which again shows a
peaked distribution.
Such biased distributions allow us to devise an efficient
iterative search strategy for natural images—PatchMatch—
that focuses computational effort on regions most likely to
produce good matches. It converges for all images in the
limit, but converges extremely quickly for natural images
that follow its prior assumptions. We make several observations about the problem:
Dimensionality of offset space. First, although the
dimensionality of the patch space is large (m dimensions),
it is sparsely populated (O(M) patches). Many previous
methods have accelerated the nearest neighbor search by
attacking the dimensionality of the patch space using tree
structures (e.g., kd-tree), and dimensionality reduction
methods (e.g., PCA). In contrast, our algorithm searches in
the 2-D space of possible patch offsets, achieving greater
speed and memory efficiency.
natural structure of images. Second, the usual independent search for each pixel ignores the natural structure in
images. In patch-sampling synthesis algorithms, the output
figure 3. histogram showing a correlation between adjacent patches’
nearest neighbors. on the horizontal axis, we measure how far apart are
nearest neighbors of adjacent patches, in euclidean 2D distance. on the
vertical axis, we count the number of patches with a given distance. most
patches have zero euclidean distance, indicating perfect coherence.
14,000,000
12,000,000
10,000,000
8,000,000
6,000,000
4,000,000
2,000,000
0
0
5
10
15
20
25
30
35
40
typically contains large contiguous chunks of data from the
input (as observed by Ashikhmin1). Thus we can improve
efficiency by performing searches for adjacent pixels in an
interdependent manner.
the law of large numbers. Finally, whereas any one random choice of patch assignment is very unlikely to be a good
guess, some nontrivial fraction of a large field of random
assignments will likely be good guesses. As this field grows
larger, the chance that no patch will have a correct offset
becomes vanishingly small.
Based on these three observations we offer a randomized algorithm for computing approximate NNFs using
incremental updates (Section 3). The algorithm begins with
an initial guess, which may be derived from prior information or may simply be a random field. The iterative process
consists of two phases alternated at each patch:
propagation, in which coherence is used to disseminate good solutions to adjacent pixels in the field; and random search, in
which the current offset vector is perturbed by multiple
scales of random offsets. Theoretical and empirical tests
show the algorithm has good convergence properties for
tested imagery up to 2 MP, and our CPU implementation
shows speedups of 20–100 times versus kd-trees with PCA.
Tree methods search in time O(cmmM), and incur the “curse
of dimensionality:” cm is exponential in the patch dimension m [ 15]. In contrast, our algorithm takes time O(NmM),
where N is the number of iterations (typically 5 when
searching translations and 20 for rotations and scales). In