experiments on the Web [1]. These
experiments allow practitioners
to identify causal relationships
between changes in design and
changes in user-observable behavior on a potentially massive scale.
Even more recently, practitioners
are starting to identify usability
issues by mining query logs for
questions commonly asked by
users of certain applications [2].
This can help product teams discover large, real-world usability
issues while supplementing laboratory techniques that tend to focus
on smaller, more isolated problems.
Other companies use the data
more directly to modify their offerings. The online game company
Zynga creates games and studies
data on how its audience plays
them in order to update the games
immediately. “We’re an analytics company masquerading as a
games company,” said Ken Rudin,
a Zynga vice president in charge of
its data-analysis team. He continued, “We are totally disrupting the
traditional video games industry;
a huge portion of that disruption
is the ability to use data” [3].
Of course, big data analytics,
like any research method, has its
limits and pitfalls. Having big data
to work with does not guarantee that
the sample analysts need is sufficiently
representative of their entire user population
(bigger is not necessarily better); nor does it
mean they have the ground truth
about their users’ motivations or
needs from their behavior logs. For
instance, boyd and Crawford argue
that working with big data is still
subjective and that automated data
collection is not self-explanatory—
it requires selection and interpretation [4]. They point out that
the data sampling and cleaning
processes in particular are prone
to potential error and bias. So, the
challenge for HCI researchers is to
leverage the big data that will be
increasingly available, but to do so
judiciously.
51 COVER STORY
interactions May + June 2012
Here we report on the state of
the practice of big data analytics,
based on a series of interviews we
conducted with 16 analysts. While
the problems uncovered are pain
points for big data analysts (including
HCI practitioners), the opportunity for better user experience
around each of these areas is vast.
It is our hope that HCI researchers will not only turn their attention toward designs that improve
the big data research experience
but also cautiously embrace
the big data available
to them as a converging line of
evidence in their iterative design
work. The big data user experience
challenge will affect every one of
us. As Pat Hanrahan, a professor
at Stanford, recently said: “The
reason big data is impacting every
one of us is the data oozing out of
everything… It’s like electricity
flowing throughout an organization—everyone can tap into it on
command to answer the individual
questions their jobs demand” [5].
The Nature of Analytics Work
The term analytics (including its big
data form) is often used broadly
to cover any data-driven decision making. Here, we use the
term for two groups: corporate
analytics teams and academic
research scientists. In the corporate world, analytics teams
use their expertise in statistics,
data mining, machine learning,
and visualization to answer questions that corporate leaders pose.
They draw on data from corporate
sources (e.g., customer, sales, or
product-usage data), called business
information, sometimes in combination with data from public sources