[Figure: Q-Q plots against three reference distributions: Uniform Distribution, Gaussian Distribution, and Fitted Mixture of 3 Gaussians. Each axis runs from 0% to 100%.]
Source: Stanford Visualization group; http://hci.stanford.edu/jheer/files/zoo/ex/stats/qqplot.html
Statistical Distributions: Figure 2c. Scatter plot matrix of automobile data.
[Figure: pairwise scatter plots of horsepower, weight, acceleration, and displacement. Legend: United States, European Union, Japan.]
Source: ggobi; http://hci.stanford.edu/jheer/files/zoo/ex/stats/splom.html
Statistical Distributions: Figure 2d. Parallel coordinates of automobile data.
[Figure: parallel axes with their labeled extremes: cylinders, 3 to 8; displacement, 68 to 455 cubic inch; weight, 1613 to 5140 lbs; horsepower, 46 to 230 hp; acceleration (0 to 60mph), from 8; mpg, 9 to 47 miles/gallon; year, 70 to 82.]
Source: ggobi; http://hci.stanford.edu/jheer/files/zoo/ex/stats/parallel.html
be more appropriate, and indeed we
see in the final plot that a fitted mixture
of three normal distributions provides
a better fit. Though powerful, the Q-Q
plot has one obvious limitation in that
its effective use requires that viewers
possess some statistical knowledge.
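The comparison behind a Q-Q plot can be sketched in a few lines: sort the sample, then pair each observed value with the quantile a reference distribution would place at the same rank. The sketch below is illustrative only. The function name `qq_points`, the default of a normal distribution fitted by sample mean and standard deviation, and the (i + 0.5) / n plotting positions are assumptions made for this example, not the method used to produce the figure.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample, dist=None):
    """Return (theoretical, observed) quantile pairs for a Q-Q plot."""
    xs = sorted(sample)
    n = len(xs)
    if dist is None:
        # Assumption: compare against a normal fitted by sample mean and stdev.
        dist = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1),
    # where the normal quantile function is finite.
    return [(dist.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]
```

If the sample follows the reference distribution, the returned points fall close to the diagonal; systematic bends away from it suggest that a different distribution is a better candidate.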
SPLOM (Scatter Plot Matrix). Other
visualization techniques attempt to
represent the relationships among
multiple variables. Multivariate data
occurs frequently and is notoriously
hard to represent, in part because of
the difficulty of mentally picturing data
in more than three dimensions. One
technique to overcome this problem is
to use small multiples of scatter plots
showing a set of pairwise relations
among variables, thus creating the
SPLOM (scatter plot matrix). A SPLOM enables visual inspection of correlations
between any pair of variables.
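As a rough sketch of what a SPLOM lays out, the code below enumerates every unordered pair of variables (one panel per pair) and attaches a Pearson correlation coefficient to each. The function names and the column-dictionary layout are hypothetical choices for this illustration; a real SPLOM draws the points rather than summarizing them with a single number.

```python
from itertools import combinations
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def splom_panels(columns):
    """Map each unordered pair of variable names to its correlation.

    columns: dict of variable name -> list of values (hypothetical layout).
    """
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }
```

For k variables this yields k(k-1)/2 distinct panels, which is why the matrix grows quickly as variables are added.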
In Figure 2c a scatter plot matrix is
used to visualize the attributes of a database of automobiles, showing the relationships among horsepower, weight,
acceleration, and displacement. Additionally, interaction techniques such
as brushing-and-linking—in which a
selection of points on one graph highlights the same points on all the other
graphs—can be used to explore patterns within the data.
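Brushing-and-linking works because every panel draws the same rows, so a selection can be stored as a set of row indices and reused everywhere. The following is a minimal sketch under that assumption; `brush` and `linked_highlight` are hypothetical names, and rows are assumed to be dictionaries keyed by attribute name.

```python
def brush(rows, x_field, y_field, x_range, y_range):
    """Return indices of rows whose point lies inside the brush rectangle
    drawn on the (x_field, y_field) panel."""
    (x0, x1), (y0, y1) = x_range, y_range
    return {
        i for i, r in enumerate(rows)
        if x0 <= r[x_field] <= x1 and y0 <= r[y_field] <= y1
    }

def linked_highlight(rows, selected, x_field, y_field):
    """Points for another panel, each flagged True if its row was brushed."""
    return [(r[x_field], r[y_field], i in selected) for i, r in enumerate(rows)]
```

A brush on the horsepower/weight panel, for instance, produces an index set that `linked_highlight` can then apply to the acceleration/displacement panel without recomputing anything about the original rectangle.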
Parallel Coordinates. As shown in
Figure 2d, parallel coordinates (||-coord)
take a different approach to visualizing multivariate data. Instead of
graphing every pair of variables in two
dimensions, we repeatedly plot the data
on parallel axes and then connect the
corresponding points with lines. Each
poly-line represents a single row in the
database, and line crossings between
dimensions often indicate inverse correlation. Reordering dimensions can
aid pattern-finding, as can interactive
querying to filter along one or more dimensions. Another advantage of parallel coordinates is that they are relatively
compact, so many variables can be
shown simultaneously.
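The construction can be sketched as follows: scale each dimension to a common range, turn every row into a poly-line across the ordered axes, and count how often segments cross between two adjacent axes. All names here are hypothetical, and the crossing count is only a rough proxy for the inverse correlation mentioned above.

```python
def normalize(columns):
    """Scale each dimension to [0, 1] so every axis has the same height."""
    scaled = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # constant column: avoid division by zero
        scaled[name] = [(v - lo) / span for v in values]
    return scaled

def polylines(columns, order):
    """One poly-line per row: a list of (axis index, scaled value) points."""
    scaled = normalize(columns)
    n = len(next(iter(columns.values())))
    return [
        [(axis, scaled[name][i]) for axis, name in enumerate(order)]
        for i in range(n)
    ]

def crossings(columns, left, right):
    """Count pairs of rows whose segments cross between two adjacent axes.

    Two segments cross exactly when the rows swap order between the axes.
    """
    a, b = columns[left], columns[right]
    n = len(a)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )
```

Comparing `crossings` for different adjacent pairs is one simple way to choose an axis ordering, since pairs with many crossings (inverse relationships) and pairs with few (direct relationships) both stand out once placed side by side.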
Maps
Although a map may seem a natural
way to visualize geographical data, map
design has a long and rich history.
Many maps are based upon a
cartographic projection: a mathematical
function that maps the 3D geometry
of the Earth to a 2D image. Other maps