interactions of these materials with light is to use a different program for each material.
This situation differs sharply from that of
other high-performance tasks, such as video decode,
which does not inherently require programmable
hardware; one could design fixed-function hardware
sufficient to support the standard video formats without
any programmability at all. As a practical matter most
video-decode hardware does include some programmable units, but this is an implementation choice, not a
fundamental requirement. This need for programmability
by 3D graphics applications makes graphics architectures
uniquely well positioned to evolve into more general
high-throughput parallel computer architectures that
handle tasks beyond graphics.
LIMITS OF THE TRADITIONAL Z-BUFFER GRAPHICS PIPELINE
The Z-buffer graphics pipeline with programmable shading that is used as the basis of today’s graphics architectures makes certain fundamental approximations and
assumptions that impose a practical upper limit on the
image quality. For example, a Z buffer cannot efficiently
determine if two arbitrarily chosen points are visible from
each other, as is needed for many advanced visual effects.
A ray tracer, on the other hand, can efficiently make this
determination. For this reason, computer-generated movies use rendering techniques, such as ray-tracing algorithms and the Reyes (Renders Everything You Ever Saw)
algorithm [2], that are more sophisticated than the standard
Z-buffer graphics pipeline.
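
To make the point-to-point visibility query concrete, here is a minimal sketch of how a ray tracer might answer it. It assumes a toy scene made only of spheres, and the names segment_hits_sphere and mutually_visible are hypothetical, chosen for illustration. A production ray tracer would use an acceleration structure rather than the brute-force loop shown here.

```python
import math

def segment_hits_sphere(p, q, center, radius):
    """Return True if the segment from p to q intersects the sphere."""
    d = tuple(qi - pi for pi, qi in zip(p, q))        # segment direction
    f = tuple(pi - ci for pi, ci in zip(p, center))   # center -> p
    a = sum(x * x for x in d)
    if a == 0.0:
        # Degenerate segment: p and q coincide.
        return sum(x * x for x in f) <= radius * radius
    b = 2.0 * sum(fi * di for fi, di in zip(f, d))
    c = sum(x * x for x in f) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return False  # the infinite line misses the sphere
    root = math.sqrt(disc)
    t0 = (-b - root) / (2.0 * a)
    t1 = (-b + root) / (2.0 * a)
    # The line is inside the sphere for t in [t0, t1]; the segment is t in [0, 1].
    return t0 <= 1.0 and t1 >= 0.0

def mutually_visible(p, q, spheres):
    """True if no sphere blocks the straight segment between p and q."""
    return not any(segment_hits_sphere(p, q, c, r) for c, r in spheres)
```

The query takes two arbitrary endpoints, with no reference to a camera or an image grid, which is exactly the kind of question a Z buffer is not organized to answer.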
Over the past few years, it has become clear that the
next frontier for improved visual quality in realtime 3D
graphics will involve modeling lighting and complex
illumination effects more realistically (but not necessarily
photo-realistically) so as to produce images that are closer
in quality to those of computer-generated movies. These
effects include hard-edged shadows (from small lights),
soft-edged shadows (from large lights), reflections from
water, and approximations to more complex effects such
as diffuse lighting interactions that dominate most interior environments. There is also a desire to model effects
such as motion blur and to use higher-quality anti-aliasing techniques. Most of these effects are challenging to
produce with the traditional Z-buffer graphics pipeline.
Modern game engines (e.g., Unreal Engine 3, CryEngine 2)
have begun to support some of these effects using
today’s graphics hardware, but with significant limitations. For example, Unreal Engine 3 uses four different
shadow algorithms, because no one algorithm provides
an acceptable combination of performance and image
quality in all situations. This problem is a result of limitations on the visibility queries that are supported by the
traditional Z-buffer pipeline. Furthermore, it is common
for different effects such as shadows and partial transparency to be mutually incompatible (e.g., partially transparent objects cast shadows as if they were fully opaque
objects). This lack of algorithmic robustness and generality is a problem both for game-engine programmers
and for the artists who create the game content. These
limitations can also be viewed as violations of important principles of good system design such as abstraction (a capability should work for all relevant cases) and
orthogonality (different capabilities should not interact in
unexpected ways).
The underlying problem is that the traditional Z-buffer
graphics pipeline was designed to compute visibility (i.e.,
the first surface hit) for regularly spaced rays originating at a single point (see figure 3a), but effects such as
hard-edged shadows, soft-edged shadows, reflections, and
diffuse lighting interactions all require more general visibility computations. In particular, reflections and diffuse
lighting interactions require the ability to compute visible
surfaces efficiently along rays with a variety of origins
and directions (figure 3d). These types of visibility queries
cannot be performed efficiently with the traditional
graphics pipeline, but VLSI technology now provides
enough transistors to support more sophisticated realtime
visibility algorithms that can perform these queries
efficiently. These transistors, however, must be organized
into an architecture that can efficiently support the more
sophisticated visibility algorithms.
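
The restriction to regularly spaced rays from a single point can be seen in a minimal Z-buffer sketch. This is an illustration under simplifying assumptions, not a description of any real pipeline: it depth-tests individual point samples rather than rasterizing triangles, the camera sits at the origin looking down +z, and the name zbuffer_points and its parameters are hypothetical.

```python
def zbuffer_points(points, width, height, focal=1.0):
    """Depth-test point samples (x, y, z, surface_id) against a pixel grid.

    Each pixel corresponds to one ray from the camera at the origin, and the
    buffer keeps only the nearest sample that lands in that pixel.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    ids = [[None] * width for _ in range(height)]
    for x, y, z, surface_id in points:
        if z <= 0.0:
            continue  # behind the camera
        # Perspective projection onto the image plane, range [-1, 1).
        u = focal * x / z
        v = focal * y / z
        if not (-1.0 <= u < 1.0 and -1.0 <= v < 1.0):
            continue  # outside the view frustum
        col = int((u + 1.0) * 0.5 * width)
        row = int((v + 1.0) * 0.5 * height)
        if z < depth[row][col]:  # the Z-buffer test: keep the first surface hit
            depth[row][col] = z
            ids[row][col] = surface_id
    return depth, ids
```

Every entry in the result describes the first surface along a ray through the fixed camera position. A ray with any other origin or direction, such as a reflection ray, has no corresponding entry, so answering it this way would require rendering the scene again from a new viewpoint.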
Since the Z-buffer graphics pipeline is ill suited for
producing the desired effects, the natural solution is to
design graphics systems around more powerful visibility algorithms. Figure 3 provides an overview of some
of these algorithms. I believe that these more powerful
visibility algorithms will be gradually adopted over the
next few years in response to the inadequacies of the
standard Z buffer, although there is substantial debate in
the graphics community as to how rapidly this change
will occur. In particular, algorithms such as ray tracing