Polaris is useful for performing the type of exploratory
data analysis advocated by statisticians such as Bertin [3] and
Cleveland [8]. We demonstrate the capabilities of Polaris as an
exploratory interface to multidimensional databases by considering the following scenario.
At Stanford, researchers developing Argus [9], a parallel graphics library, found that its performance had linear
speedup when using up to 31 processors, after which its performance diminished rapidly. Using Polaris, we recreate the
analysis they performed using a custom-built visualization tool.
Initially, the developers hypothesized that the diminishing performance was a result of too many remote memory
accesses, a common performance problem in parallel programs. They collected and visualized detailed memory
statistics to test this hypothesis. Figure 6(a) shows a visualization constructed to display this data. The visualization is
composed of two linked Polaris instances. One displays a
bird’s-eye view of multiple source code files, with each line of
code represented by a bar one pixel high; the other
displays the detailed source code. In both views, the hue of
each line of code encodes the number of cache misses suffered by that line. Upon seeing these displays, they could tell
that memory was in fact not the problem.
The developers next hypothesized that lock contention might be a problem, so they reran Argus and collected
detailed lock and scheduling information. The data is shown
in Figure 6(b) using a dashboard within Polaris to create a
composite visualization with two linked projections of the
same data. One projection shows a scatterplot of the start
cycle versus cycle duration for the lock events (requests and
holds). The second shows a histogram over time of initiated
lock events. The scatterplot shows that toward the end of the
run, lock events (both holds and requests)
were taking an unexpectedly long time. That observation, together with the histogram showing that the number of lock
requests peaked and then tailed off toward the end of the
run, indicated that this might be a fruitful area for further investigation.
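The two projections described above can be approximated outside Polaris with a few lines of Python. This is only a minimal sketch under stated assumptions: the `LockEvent` record, its field names, the bin width, and the sample events are all hypothetical and are not the Argus trace data or the Polaris implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LockEvent:
    """Hypothetical record for one lock event (not the Argus trace format)."""
    kind: str      # "request" or "hold"
    start: int     # start cycle
    duration: int  # duration in cycles


def scatter_points(events):
    """Return (start cycle, duration) pairs, the data behind the scatterplot."""
    return [(e.start, e.duration) for e in events]


def histogram_over_time(events, bin_width):
    """Count initiated lock events per time bin, keyed by the bin's start cycle."""
    counts = Counter((e.start // bin_width) * bin_width for e in events)
    return dict(sorted(counts.items()))


# Invented sample data, for illustration only.
events = [
    LockEvent("request", 10, 5),
    LockEvent("hold", 15, 20),
    LockEvent("request", 120, 8),
    LockEvent("hold", 130, 15),
    LockEvent("request", 160, 300),
    LockEvent("hold", 210, 380),
]

points = scatter_points(events)
histogram = histogram_over_time(events, bin_width=100)
```

Both functions read the same list of events, which mirrors the idea of two linked projections of one data set: the point list exposes unusually long events, and the binned counts show when events were initiated.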
A third visualization, shown in Figure 6(c), shows the
same data using Gantt charts to display both lock events and
process-scheduling events. This display shows that the long
lock requests correspond to descheduled periods for most
processes. One process, however, has a descheduled period
corresponding to a period during which the lock was held.
This behavior, which was due to a bug in the operating system, was the source of the performance issues.
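The pattern the Gantt charts reveal, a process descheduled while the lock is held, amounts to an interval-overlap test. The sketch below illustrates that test only; the `Interval` record, the half-open cycle ranges, the sample data, and the assumption that the held lock belongs to the descheduled process are ours, not details given by the Argus developers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical half-open interval [start, end) of cycles for one process."""
    process: int
    start: int
    end: int


def overlaps(a, b):
    """True if the two half-open intervals share at least one cycle."""
    return a.start < b.end and b.start < a.end


def descheduled_while_holding(holds, descheduled):
    """Return sorted ids of processes descheduled during one of their own lock holds."""
    return sorted({
        h.process
        for h in holds
        for d in descheduled
        if h.process == d.process and overlaps(h, d)
    })


# Invented sample data, for illustration only.
holds = [Interval(0, 100, 120), Interval(1, 120, 500)]
descheduled = [
    Interval(0, 200, 400),  # process 0 is descheduled after releasing the lock
    Interval(1, 150, 450),  # process 1 is descheduled while holding the lock
    Interval(2, 130, 480),  # process 2 never holds the lock
]

suspects = descheduled_while_holding(holds, descheduled)
```

The pairwise comparison is quadratic in the number of intervals, which is adequate for a sketch; a sweep over sorted endpoints would be the usual choice for large traces.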
This example illustrates several important points about
the exploratory process. Throughout the analysis, both the
data that users want to see and how they want to see it change
continually. Analysts first form hypotheses about the data
and then create new views to test those hypotheses. Certain
displays enable an understanding of overall trends, whereas
others show causal relationships. As the analysts better
understand the data, they may want to drill down into the visible dimensions or display entirely different dimensions.
Polaris supports this exploratory process through its
visual interface. By formally categorizing the types of
graphics, Polaris is able to provide a simple interface for rapidly generating a wide range of displays, allowing analysts to
focus on the analysis task rather than the interface.

Figure 6: A scenario demonstrating the use of Polaris to analyze the
performance of a parallel graphics library.