addition, unlike kd-trees, our algorithm uses substantially
less auxiliary memory.
2. ReLateD WoRk
Patch-based sampling methods have become a popular tool
for image and video synthesis and analysis. Applications
include texture synthesis, image and video completion,
summarization and retargeting, image recomposition and
editing, image stitching and collages, new view synthesis,
morphing, noise removal, super-resolution and more. We will
next review some of these applications and discuss the common search techniques that they use as well as their degree of
interactivity.
nearest neighbor search methods. Correspondence
searches can be classified as either local, where a search is
performed in a limited spatial window, or global, where all
possible displacements are considered. Correspondences
can also be classified as sparse, determined only at a subset
of key feature points, or dense, determined at every pixel or
on a dense grid in the input. For efficiency, many algorithms
only use local or sparse correspondences. Local search can
only identify small displacements, so multiresolution refinement is often used (e.g., in optical-flow3), but large motions
of small objects can be missed. Sparse keypoint14 correspondences are commonly used for alignment, 3D reconstruction, and object detection and recognition. These methods
work best on textured scenes at high resolution, but are less
effective in other cases.
Dense patch-based methods. Those methods that find
both dense and global matches have often had high time
cost in the matching stage. Moreover, whereas in texture
synthesis the texture example is usually a small image, in
other applications such as patch-based completion, retar-
geting and reshuffling, the input image is typically much
larger, making the search problem even more critical.
Various speedups for this search have been proposed, gen-
erally involving tree structures such as TSVQ23, kd-trees,
10, 24
and VP-trees,
12 each of which supports both exact and
approximate search (ANN). The FLANN method16 automati-
cally chooses which tree algorithm to use according to the
data. In synthesis applications, approximate search is often
used in conjunction with dimensionality reduction tech-
niques such as PCA,
10 because ANN methods are much more
time- and memory-efficient in low dimensions. Ashikhmin1
proposed a local propagation technique exploiting local
coherence in the synthesis process by limiting the search
space for a patch to the source locations of its neighbors in
the exemplar texture. The propagation step of our algorithm
is inspired by the same coherence assumption. The k-coher-
ence technique21 combines the propagation idea with a pre-
computation stage in which the k nearest neighbors of each
patch are cached, and later searches take advantage of these
precomputed sets. Although this accelerates the search
phase, k-coherence still requires a full nearest-neighbor
search for all pixels in the input. It assumes that the initial
offsets are close enough that only a small number of nearest
neighbors need to be searched. This may be true for small
pure texture inputs, but we found that for large complex
images our random search phase is required to escape local
minima. In this work we compare speed and memory usage
of our algorithm against kd-trees with dimensionality reduc-
tion, and we show that it is at least an order of magnitude
faster than the best competing combination (ANN+PCA)
and uses significantly less memory. Our algorithm also
provides more generality than kd-trees because it can be
applied with arbitrary nonlinear distance metrics, allows
searching over additional continuous domains such as rota-
tion and scale, and can be easily modified with constraints
in the image space to preserve structures and enable local
interactions. Locality sensitive hashing is an alternative to
tree structures that can be used to search image patches,
18
but it also requires substantial additional memory and a
precomputation step, unlike our algorithm.
matching across rotations and scales. When search
across a large range of scales and rotations is required, a
dense search was previously considered impractical due to
the high dimensionality of the search space. The common
way to deal with this case is via keypoint detectors.
14 These
detectors either find a local scale and orientation for each
keypoint or do an affine normalization. These approaches
are not always reliable due to image structure ambiguities
and noise. Our generalized matching algorithm can operate
on any common image descriptors (e.g., SIFT) and unlike
many of the above tree structures, supports any distance
function. Even while the algorithm naturally supports dense
global matching, it may also be constrained to only accept
matches in a local window if desired.