Polaris: A System for Query,
Analysis, and Visualization of
Multidimensional Databases
By Chris Stolte, Diane Tang, and Pat Hanrahan
During the last decade, multidimensional databases have
become common in the business and scientific worlds.
Analysis places significant demands on the interfaces to
these databases. It must be possible for analysts to easily
and incrementally change both the data and their views of it
as they cycle between hypothesis and experimentation.
In this paper, we address these demands by presenting
the Polaris formalism, a visual query language for precisely
describing a wide range of table-based graphical presentations of data. This language compiles into both the queries
and drawing commands necessary to generate the visualization, enabling us to design systems that closely integrate
analysis and visualization. Using the Polaris formalism, we
have built an interactive interface for exploring multidimensional databases that analysts can use to rapidly and incrementally build an expressive range of views of their data as
they engage in a cycle of visual analysis.
1. INTRODUCTION
Nowadays, structured databases are widely used. Corporations store every sales transaction in large data warehouses.
International research projects such as the Human Genome
Project and Digital Sky Survey are generating massive scientific databases. Organizations such as the United Nations
are making a wide range of global indicators on issues ranging from carbon emission to the adoption of technology
publicly available via the Internet.
Unfortunately, our ability to collect and store data has
rapidly exceeded our ability to analyze it. A major challenge
in computer science is how to extract meaning from data:
to discover structure, find patterns, and derive causal relationships. An analytical session cycles between hypothesis,
experiment, and discovery. Often the path of exploration is
unpredictable, and thus analysts need to be able to rapidly
change both what data they are viewing and how they are
viewing that data. This exploratory analysis process places
significant demands on the human–computer interfaces to
these databases. Few good tools exist.
In this paper, we present a formal approach to building visualization systems that addresses these demands.
The authors dedicate this article to the memory of Jim Gray,
whose pioneering work inspired this research.
The first contribution is the Polaris formalism, a declarative visual query language that specifies a wide range of 2D
graphic displays. The three key components of the formalism are (1) a table algebra that captures the structure of
tables and spatial encodings, (2) a graphic taxonomy that
results in an intuitive specification of graphic types, and (3)
a system for effective visual encoding. This language allows
for easily changing between different graphic displays as
well as adding or removing data.
The second main contribution is the combination of this
visual query language with the underlying database queries
needed. This allows us to combine visualization with
the underlying data transformations to support the analysis process.
The final contribution is the Polaris interface that allows
users to incrementally construct a visual specification by dragging fields onto “shelves” (see Figure 1). Each intermediate
specification is valid and corresponds to a graphical data display, giving the user quick visual feedback to support this analysis. This interface is built on top of the visual query language
that specifies both the data and graphical transformations
needed, thus combining statistical analysis and visualization.
Polaris enables visual analysis by allowing an analyst to answer
a question by composing a picture of what they want to see.
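As a rough illustration of this incremental construction, the sketch below models two shelves and shows that every intermediate state still yields a well-formed query. It is a minimal sketch under stated assumptions, not the Polaris implementation: the `VisualSpec` class, its `x_shelf` and `y_shelf` attributes, the `to_query` method, and the `sales` table with its `region` and `profit` fields are all hypothetical names chosen for the example, and the generated SQL only stands in for the queries a real specification would compile to.

```python
from dataclasses import dataclass, field


@dataclass
class VisualSpec:
    """Hypothetical stand-in for a shelf configuration (not the Polaris data model)."""
    table: str
    # Names of categorical fields dropped on the (assumed) x-axis shelf.
    x_shelf: list = field(default_factory=list)
    # (aggregate, field) pairs dropped on the (assumed) y-axis shelf.
    y_shelf: list = field(default_factory=list)

    def to_query(self) -> str:
        """Build an illustrative SQL string from whatever is on the shelves so far."""
        measures = [f"{agg}({name})" for agg, name in self.y_shelf]
        columns = self.x_shelf + measures
        select = ", ".join(columns) if columns else "*"
        query = f"SELECT {select} FROM {self.table}"
        # Group only when categorical fields and aggregates appear together.
        if self.x_shelf and measures:
            query += " GROUP BY " + ", ".join(self.x_shelf)
        return query


# Each drag onto a shelf produces a new, still-valid specification.
spec = VisualSpec(table="sales")
q0 = spec.to_query()                     # empty shelves
spec.x_shelf.append("region")
q1 = spec.to_query()                     # one categorical field
spec.y_shelf.append(("SUM", "profit"))
q2 = spec.to_query()                     # categorical field plus an aggregate
print(q0, q1, q2, sep="\n")
```

The point of the sketch is only that no intermediate state is an error: the user can stop after any drag and still get a display, which is what makes the quick visual feedback possible.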
It has been 6 years since this work was originally published.
In that time, the technology has been commercialized by
Tableau Software as Tableau Desktop and is currently in use
by thousands of companies and tens of thousands of users.
As a result, we have gained considerable experience that has
validated the effectiveness of the visual query language and
interface and resulted in extensions and revisions to both.
2. OVERVIEW
Polaris has been designed to support the interactive exploration of large multidimensional relational databases or data
cubes. Relational databases organize data into tables where
each row in a table corresponds to a basic entity or fact and
each column represents a property of that entity [18]. We refer
to a row in a relational table as a tuple or record, and a column as a field. A single database will contain many heterogeneous but interrelated tables.
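The terminology can be made concrete with a small in-memory example. This is a sketch for illustration only: the `Sale` record type, its three fields, and the sample values are invented here and do not come from any dataset used in the paper.

```python
from typing import NamedTuple


class Sale(NamedTuple):
    """One record type; each attribute is a field (a column of the table)."""
    region: str
    product: str
    profit: float


# A relational table: each element is one tuple (record) describing a single fact.
sales = [
    Sale("West", "Coffee", 120.0),
    Sale("East", "Tea", 80.0),
    Sale("West", "Tea", 45.5),
]

# The field names are shared by every record in the table.
fields = Sale._fields

# Reading one field across all records yields a column.
profit_column = [row.profit for row in sales]
print(fields, profit_column)
```

A real database would hold many such tables, related to one another through shared fields.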
A previous version of this paper was published in IEEE’s
Transactions on Visualization and Computer Graphics,
vol 8, issue 1 (Jan. 2002), pp. 52–65.