Figure 1: A simplified graphics pipeline.
orientation, and material properties of
object surfaces and the position and
characteristics of light sources. A scene
view is described by the location of a virtual camera. Graphics systems seek to
find the appropriate balance between
conflicting goals of enabling maximum
performance and maintaining an expressive but simple interface for describing graphics computations.
Real-time graphics APIs such as Direct3D
and OpenGL strike this balance
by representing the rendering computation as a graphics processing pipeline
that performs operations on four fundamental entities: vertices, primitives,
fragments, and pixels. Figure 1 provides
a block diagram of a simplified seven-stage graphics pipeline. Data flows between stages in streams of entities. This
pipeline contains fixed-function stages
(green) implementing API-specified
operations and three programmable
stages (red) whose behavior is defined
by application code. Figure 2 illustrates
the operation of key pipeline stages.
VG (vertex generation). Real-time
graphics APIs represent surfaces as
collections of simple geometric primitives (points, lines, or triangles). Each
primitive is defined by a set of vertices.
To initiate rendering, the application
provides the pipeline’s VG stage with a
list of vertex descriptors. From this list,
VG prefetches vertex data from memory
and constructs a stream of vertex data
records for subsequent processing. In
practice, each record contains the 3D
(x,y,z) scene position of the vertex plus
additional application-defined parameters such as surface color and normal
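To make the VG step concrete, the following is a minimal sketch of building vertex records from a flat memory buffer. The function name `generate_vertices`, the `stride` parameter, and the `attributes` descriptor format are all assumptions invented for this illustration; real APIs describe vertex layouts in their own ways.

```python
def generate_vertices(buffer, stride, count, attributes):
    """Build vertex records from a flat list of numbers.

    `attributes` maps a name to (offset, size) within each vertex's
    `stride`-long slice of `buffer` (a hypothetical descriptor format).
    """
    records = []
    for i in range(count):
        base = i * stride
        record = {}
        for name, (offset, size) in attributes.items():
            start = base + offset
            # Copy this attribute's values out of the shared buffer.
            record[name] = tuple(buffer[start:start + size])
        records.append(record)
    return records
```

Each returned record holds a position plus whatever extra parameters the application chose to store, which is the shape of data the later stages consume.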
VP (vertex processing). The behavior
of VP is application programmable. VP
operates on each vertex independently
and produces exactly one output vertex
record from each input record. One of
the most important operations of VP execution is computing the 2D output image (screen) projection of the 3D vertex
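As a rough illustration of the projection step described above, the sketch below maps a camera-space 3D position to 2D pixel coordinates with a simple pinhole model. The name `project_vertex`, the `focal_length` parameter, and the convention that the camera looks down the negative z axis are assumptions made for this example, not part of any graphics API.

```python
def project_vertex(position, focal_length, width, height):
    """Project a camera-space point (x, y, z) to pixel coordinates.

    Assumes a pinhole camera at the origin looking down -z (a convention
    chosen for this sketch).
    """
    x, y, z = position
    if z >= 0:
        raise ValueError("point is not in front of the camera")
    # Perspective divide: distant points move toward the image center.
    ndc_x = focal_length * x / -z
    ndc_y = focal_length * y / -z
    # Map normalized [-1, 1] coordinates to pixels (screen y points down).
    screen_x = (ndc_x + 1.0) * 0.5 * width
    screen_y = (1.0 - ndc_y) * 0.5 * height
    return screen_x, screen_y
```

Because the function reads one vertex and returns one result, it mirrors VP's one-in, one-out behavior.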
PG (primitive generation). PG uses
vertex topology data provided by the application to group vertices from VP into
an ordered stream of primitives (each
primitive record is the concatenation of
several VP output vertex records). Vertex
topology also defines the order of primitives in the output stream.
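The grouping performed by PG can be sketched as follows, assuming the topology is an index list in which each consecutive triple names one triangle. That encoding and the name `assemble_triangles` are assumptions for illustration; APIs support several topologies.

```python
def assemble_triangles(vertices, indices):
    """Group processed vertices into an ordered list of triangles.

    `indices` plays the role of vertex topology: each consecutive triple
    names the three vertices of one triangle (a hypothetical encoding).
    """
    if len(indices) % 3 != 0:
        raise ValueError("index count must be a multiple of 3")
    triangles = []
    for i in range(0, len(indices), 3):
        a, b, c = indices[i:i + 3]
        # A primitive record is the concatenation of its vertex records.
        triangles.append((vertices[a], vertices[b], vertices[c]))
    return triangles
```

The output order follows the index list, so the topology fixes the order of primitives as well as their membership. A vertex shared by two triangles is simply referenced twice.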
PP (primitive processing). PP operates
independently on each input primitive
to produce zero or more output primitives. Thus, the output of PP is a new
(potentially longer or shorter) ordered
stream of primitives. Like VP, PP operation is application programmable.
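The "zero or more outputs" behavior of PP might look like the sketch below. The shader shown is hypothetical: it drops degenerate 2D triangles, passes small ones through, and splits large ones into four. The names `primitive_shader`, `process_primitives`, and the `max_area` threshold are assumptions for this example.

```python
def signed_area(tri):
    """Signed area of a 2D triangle given as three (x, y) points."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return 0.5 * ((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))

def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def primitive_shader(tri, max_area=1.0):
    """Hypothetical shader that emits 0, 1, or 4 triangles per input."""
    area = abs(signed_area(tri))
    if area == 0:
        return []        # zero outputs: degenerate triangle is dropped
    if area <= max_area:
        return [tri]     # one output: passed through unchanged
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

def process_primitives(primitives, shader):
    """Run the shader on each primitive and concatenate outputs in order."""
    out = []
    for prim in primitives:
        out.extend(shader(prim))
    return out
```

The output stream may be longer or shorter than the input, but the outputs of earlier primitives always precede those of later ones.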
FG (fragment generation). FG samples
each primitive densely in screen space
(this process is called rasterization).
Each sample is manifest as a fragment
record in the FG output stream. Fragment records contain the output image
position of the surface sample, its distance from the virtual camera, as well as
values computed via interpolation of the
source primitive’s vertex parameters.
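A minimal rasterization sketch, assuming one sample per pixel taken at the pixel center and barycentric interpolation of vertex parameters, is shown below. The record layout (dicts with `pos`, `depth`, and a scalar `color`) and the function names are assumptions for this example; production rasterizers are far more elaborate.

```python
def edge(a, b, p):
    """Twice the signed area of the 2D triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(tri, width, height):
    """Return a list of fragments for one screen-space triangle.

    `tri` is three vertices, each a dict with screen "pos" (x, y), "depth",
    and a scalar "color" (a hypothetical record layout for this sketch).
    """
    v0, v1, v2 = tri
    area = edge(v0["pos"], v1["pos"], v2["pos"])
    if area == 0:
        return []  # degenerate triangle covers no samples
    fragments = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights of p with respect to the three vertices.
            w0 = edge(v1["pos"], v2["pos"], p) / area
            w1 = edge(v2["pos"], v0["pos"], p) / area
            w2 = edge(v0["pos"], v1["pos"], p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue  # sample lies outside the triangle
            fragments.append({
                "pos": (x, y),
                "depth": w0 * v0["depth"] + w1 * v1["depth"] + w2 * v2["depth"],
                "color": w0 * v0["color"] + w1 * v1["color"] + w2 * v2["color"],
            })
    return fragments
```

Dividing by the signed area makes the inside test work for either vertex winding. Testing every pixel of the image is wasteful; it is done here only to keep the sketch short.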
FP (fragment processing). FP simulates the interaction of light with scene
surfaces to determine surface color and
opacity at each fragment’s sample point.
To give surfaces realistic appearances,
FP computations make heavy use of filtered lookups into large, parameterized
1D, 2D, or 3D arrays called textures. FP is
an application-programmable stage.
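One common form of filtered texture lookup is bilinear interpolation, sketched below for a 2D texture of scalar values. The coordinate convention (u and v in [0, 1], mapped so that 0 and 1 land on the first and last texel) and the name `sample_bilinear` are assumptions for this example; GPUs implement filtering in hardware and offer other modes as well.

```python
def sample_bilinear(texture, u, v):
    """Bilinearly filtered lookup into a 2D texture.

    `texture` is a list of rows of floats. (u, v) must lie in [0, 1];
    0 maps to the first texel and 1 to the last (an assumed convention).
    """
    height, width = len(texture), len(texture[0])
    # Continuous texel coordinates.
    x = u * (width - 1)
    y = v * (height - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally along the two nearest rows, then vertically.
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

Each lookup reads up to four neighboring texels, which hints at why texture access dominates the memory traffic of fragment processing.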
PO (pixel operations). PO uses each
fragment’s screen position to calculate
and apply the fragment’s contribution
to output image pixel values. PO accounts for a sample’s distance from the
virtual camera and discards fragments
that are blocked from view by surfaces
closer to the camera. When fragments
from multiple primitives contribute to
the value of a single pixel, as is often the
case when semi-transparent surfaces
overlap, many rendering techniques
rely on PO to perform pixel updates
in the order defined by the primitives’
positions in the PP output stream. All
graphics APIs guarantee this behavior,
and PO is the only stage where the order
of entity processing is specified by the
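The depth test and ordered pixel update can be sketched as follows. This is a simplification under stated assumptions: smaller depth means closer to the camera, fragments carry a scalar `color` and an `alpha`, blending uses the standard "over" formula, and only fully opaque fragments update the depth buffer. The function name and record layout are invented for this example.

```python
def pixel_operations(fragments, width, height):
    """Apply fragments, in stream order, to color and depth buffers.

    Each fragment is a dict with "pos", "depth", "color", and "alpha"
    (a hypothetical layout). Smaller depth means closer to the camera.
    """
    color = [[0.0] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for frag in fragments:  # order matters: it follows primitive order
        x, y = frag["pos"]
        if frag["depth"] >= depth[y][x]:
            continue  # hidden behind a closer opaque surface: discard
        a = frag["alpha"]
        # "Over" blending of the fragment onto the current pixel value.
        color[y][x] = a * frag["color"] + (1.0 - a) * color[y][x]
        if a == 1.0:
            depth[y][x] = frag["depth"]  # only opaque fragments occlude
    return color
```

Swapping two semi-transparent fragments that land on the same pixel changes the final color, which is why the update order must be well defined.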
The behavior of application-programmable pipeline stages (VP, PP, FP) is
defined by shader functions (or shaders).
Graphics programmers express vertex,
primitive, and fragment shader functions in high-level shading languages
such as NVIDIA’s Cg, OpenGL’s GLSL,
or Microsoft’s HLSL. Shader source is
compiled into bytecode offline, then
transformed into a GPU-specific binary
by the graphics driver at runtime.
Shading languages support complex
data types and a rich set of control-flow
constructs, but they do not contain
primitives related to explicit parallel
execution. Thus, a shader definition is
a C-like function that serially computes
output-entity data records from a single
input entity. Each function invocation is
abstracted as an independent sequence
of control that executes in complete
isolation from the processing of other
As a convenience, in addition to data
records from stage input and output
streams, shader functions may access
(but not modify) large, globally shared
data buffers. Prior to pipeline execution, the application initializes these buffers to contain shader-specific parameters and textures.
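The execution model described above, in which one serial function is invoked independently per entity with read-only access to shared buffers, can be sketched in a few lines. The names `run_stage` and `vertex_shader` and the `scale` parameter are hypothetical; `MappingProxyType` is used here only to mimic the "access but not modify" rule at the top level of the buffer dictionary.

```python
from types import MappingProxyType

def run_stage(shader, stream, buffers):
    """Invoke `shader` once per input record, each call independent."""
    read_only = MappingProxyType(buffers)  # shaders may read, not modify
    return [shader(record, read_only) for record in stream]

def vertex_shader(vertex, buffers):
    """Hypothetical shader: scale a position by a shared parameter."""
    s = buffers["scale"]
    x, y, z = vertex
    return (x * s, y * s, z * s)
```

For example, `run_stage(vertex_shader, [(1, 2, 3)], {"scale": 2})` applies the shader to a one-element stream. Because no invocation can observe or affect another, an implementation is free to run the calls in any order or in parallel; the sequential list comprehension here is just the simplest choice.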
Characteristics and Challenges
Graphics pipeline execution is characterized by the following key properties.
Opportunities for parallel processing.
Graphics presents opportunities for
both task- (across pipeline stages) and