We have developed two mobile vision applications using OpenCV: one that stitches a panoramic image from several normal photographs, and another that stabilizes streaming video. The performance requirements are challenging. Our goal is real-time performance, where each frame should be processed within about 30 milliseconds, of which basic operations such as simply copying a 1280×720-pixel frame may take eight milliseconds. Consequently, the final design of an application and its underlying algorithm is to a large extent determined by this constraint.

Figure 8. Input images and the resulting panorama.
Figure 9. Panorama stitching pipeline.
Figure 10. Video stabilization input sequence.
Figure 11. Video stabilization pipeline.
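The per-frame budget is easy to sanity-check on any device. The sketch below times a full copy of a 1280×720 four-channel frame with NumPy; the ~8 ms figure quoted above was measured on the mobile target, so the absolute numbers on a desktop will differ, but the exercise shows how quickly even a memcpy eats into a 30 ms budget.

```python
import time
import numpy as np

# A 1280x720 RGBA frame, 8 bits per channel: about 3.7 MB.
frame = np.zeros((720, 1280, 4), dtype=np.uint8)

n = 100
t0 = time.perf_counter()
for _ in range(n):
    copy = frame.copy()  # full-frame memcpy
elapsed_ms = (time.perf_counter() - t0) * 1000 / n

budget_ms = 30  # per-frame real-time budget from the text
print(f"copy: {elapsed_ms:.3f} ms of a {budget_ms} ms budget")
```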
In both cases we were able to satisfy the time limits by using the GPU to optimize the applications' bottlenecks. Several geometric transformation functions, such as image resizing and various types of image warping, were ported to the GPU, resulting in a doubling of the application performance.
The results were not nearly as good when performing the same tasks with NEON and multithreading. One reason was that both applications deal with high-resolution four-channel images; as a result, the memory bus was overloaded and the CPU cores competed for the cache. In addition, we had to program bilinear interpolation manually on the CPU, whereas it is implemented in GPU hardware. We learned that the CPU does not work as well for full-frame geometric transformations, and the help of the GPU was invaluable. Let's consider both applications in more detail.
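The bilinear interpolation we had to hand-code on the CPU (and which the GPU's texture units provide for free) amounts to the following. This is an illustrative single-channel NumPy version, not the production NEON code:

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample a 2D image at fractional coordinates (x, y).

    Weights the four neighboring pixels by their distance to the
    sample point. Coordinates are assumed to lie inside the image.
    """
    h, w = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom
```

On the GPU this lookup is a single hardware texture fetch, which is why the warping and resizing kernels ported so well.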
Panorama stitching. In the panorama-stitching application our goal was to combine several ordinary images into a single panorama with a much larger field of view (FOV) than the input images.7 Figure 8 demonstrates the stitching of several detailed shots into a single high-resolution image.
Figure 9 shows the processing pipeline for the OpenCV panorama-stitching application. Porting to Tegra started with some algorithmic improvements, followed by NEON and multithreading optimization; yet after all these efforts the application was still not responsive enough and could not stitch and preview the resulting panorama at interactive speeds.
Among the top bottlenecks were image resizing and warping. The former is required because different algorithmic steps are performed at different resolutions; each input frame is resized about three times, depending on the algorithmic parameters. The type of warping needed depends on the desired projection mode (spherical, cylindrical, among others) and is performed before the final panorama blending.