vision technology becomes as ubiquitous as touch interfaces.
Computer vision is computationally expensive, however. Even an algorithm dedicated to solving a very specific problem, such as panorama stitching or face and smile detection, requires a lot of computing power. Many computer-vision scenarios must be executed in real time, which implies that the processing of a single frame should complete within 30–40 milliseconds (that is, at 25–30 frames per second). This is a very challenging requirement, especially for mobile and embedded computing architectures.
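To make that budget concrete, here is a minimal timing sketch in plain C++ (processFrame is a hypothetical stand-in for whatever vision pipeline is being run):

    #include <chrono>
    #include <cstdio>

    // Hypothetical stand-in for the per-frame vision pipeline.
    void processFrame() { /* ... detection, tracking, etc. ... */ }

    int main() {
        using clock = std::chrono::steady_clock;
        const double budgetMs = 1000.0 / 30.0;   // ~33 ms per frame at 30 fps

        for (int frame = 0; frame < 100; ++frame) {
            auto start = clock::now();
            processFrame();
            double elapsedMs = std::chrono::duration<double, std::milli>(
                                   clock::now() - start).count();
            if (elapsedMs > budgetMs)
                std::printf("frame %d over budget: %.2f ms\n", frame, elapsedMs);
        }
        return 0;
    }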
Often, it is possible to trade off quality for speed. For example, the panorama-stitching algorithm can find more
matches in source images and synthesize an image of higher quality, given
more computation time. To meet the
constraints of time and the computational budget, developers either
compromise on quality or invest more
time into optimizing the code for specific hardware architectures.
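As a concrete sketch of this trade-off (assuming OpenCV 4.x and its high-level cv::Stitcher API; the specific knob below is illustrative, not taken from the text), lowering the registration resolution makes stitching faster at the cost of fewer matches and lower panorama quality:

    #include <opencv2/opencv.hpp>
    #include <vector>

    int main() {
        std::vector<cv::Mat> images = {
            cv::imread("left.jpg"), cv::imread("right.jpg")
        };

        cv::Ptr<cv::Stitcher> stitcher =
            cv::Stitcher::create(cv::Stitcher::PANORAMA);

        // Trade quality for speed: match features on downscaled images.
        stitcher->setRegistrationResol(0.3);   // lower than the ~0.6 Mpx default

        cv::Mat pano;
        if (stitcher->stitch(images, pano) == cv::Stitcher::OK)
            cv::imwrite("pano.jpg", pano);
        return 0;
    }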
Vision and Heterogeneous Parallel Computing
In the past, an easy way to increase the
performance of a computing device
was to wait for the semiconductor processes to improve, which resulted in
an increase in the device's clock speed.
When the speed increased, all applications got faster without any changes to them or to the libraries they relied on. Unfortunately, those days are over.
As transistors get denser, they also
leak more current, and hence are less
energy efficient. Improving energy efficiency has become an important priority. The process improvements now allow for more transistors per area, and
there are two primary ways to put them
to good use. The first is via parallelization: creating more identical processing units instead of making the single
unit faster and more powerful. The
second is via specialization: building
domain-specific hardware accelerators
that can perform a particular class of
functions more efficiently. The concept of combining these two ideas—that is, running a CPU or CPUs together with various accelerators—is called heterogeneous parallel computing.

[Figure 1. Computer vision and GPU. "The same hardware boosts both!"]

[Figure 2. CPU versus GPU performance comparison.]
High-level computer-vision tasks
often contain subtasks that can be run
faster on special-purpose hardware
architectures than on the CPU, while
other subtasks are computed on the
CPU. The GPU (graphics processing
unit), for example, is an accelerator
that is now available on every desktop
computer, as well as on mobile devices
such as smartphones and tablets.
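A minimal sketch of such a split (assuming an OpenCV build with the CUDA modules enabled, i.e., the cv::cuda API; the division of labor here is illustrative): the CPU captures and decodes each frame, a per-pixel color conversion runs on the GPU, and the result comes back to the CPU:

    #include <opencv2/opencv.hpp>
    #include <opencv2/cudaimgproc.hpp>

    int main() {
        cv::VideoCapture cap(0);          // CPU subtask: capture and decode
        cv::Mat frame, gray;
        cv::cuda::GpuMat dFrame, dGray;   // buffers in GPU memory

        while (cap.read(frame)) {
            dFrame.upload(frame);                                   // CPU -> GPU
            cv::cuda::cvtColor(dFrame, dGray, cv::COLOR_BGR2GRAY);  // GPU subtask
            dGray.download(gray);                                   // GPU -> CPU
            cv::imshow("gray", gray);     // CPU subtask: display
            if (cv::waitKey(1) == 27) break;
        }
        return 0;
    }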
The first GPUs were fixed-function
pipelines specialized for accelerated
drawing of shapes on a computer
display, as illustrated in Figure 1. As
GPUs gained the ability to use color images as input for texture mapping, and to share their results back with the CPU rather than just sending them to the display, it became possible to use GPUs for simple image-processing tasks.
Making the fixed-function GPUs
partially programmable by adding
shaders was a big step forward. This
enabled programmers to write special
programs that were run by the GPU
on every three-dimensional point of
the surface and at every pixel rendered
onto the output canvas. This vastly expanded the GPU’s processing capability, and clever programmers began to
try general-purpose computing on a
GPU (GPGPU), harnessing the graphics accelerator for tasks for which it
was not originally designed. The GPU
became a useful tool for image processing and computer-vision tasks.
The graphics shaders, however,
did not provide access to many useful
hardware capabilities such as synchronization and atomic memory operations. Modern GPU computation languages such as CUDA, OpenCL, and
DirectCompute are explicitly designed
to support general-purpose computing
on graphics hardware. GPUs are still
not quite as flexible as CPUs, but they
perform parallel stream processing
much more efficiently, and an increasing number of nongraphics applications are being rewritten using the
GPU compute languages.
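As a sketch of what such code looks like (CUDA here; a hypothetical per-pixel grayscale kernel, not one taken from the text), the programmer launches one thread per output pixel:

    #include <cuda_runtime.h>

    // One thread per pixel: convert interleaved RGBA to 8-bit grayscale.
    __global__ void rgbaToGray(const uchar4* in, unsigned char* out,
                               int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        uchar4 p = in[y * width + x];
        out[y * width + x] =
            (unsigned char)(0.299f * p.x + 0.587f * p.y + 0.114f * p.z);
    }

    void convertOnGpu(const uchar4* dIn, unsigned char* dOut,
                      int width, int height)
    {
        dim3 block(16, 16);   // 256 threads per block
        dim3 grid((width  + block.x - 1) / block.x,
                  (height + block.y - 1) / block.y);
        rgbaToGray<<<grid, block>>>(dIn, dOut, width, height);
        cudaDeviceSynchronize();
    }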
Computer vision is one of the tasks that often map naturally to GPUs. This is not a coincidence, as computer vision solves the inverse of the