some other purpose and hope it will suffice. Nor should we make assertions about invariants of human social action based on analyses of activities that take place in single-case, specific settings (e.g., on a particular social network) without explicitly acknowledging what that specific interface, interaction, application, service, or company brings to the table, how that affects adoption and use, and thus how the data collected is biased by those aspects.
These are exciting times! Let's bring a creative and yet critical eye to the collection—the design—of data to complement the focus on the analysis of data. Of course, those trained in experimental and survey methodologies are continually designing data by designing experiments and instruments to address core science questions. However, I do not see this kind of thinking commonly applied when designing interfaces and interactions. I do see a deep commitment to discoverability, usability, the support of tasks and activity flows, and to aesthetic appeal, but not a critical lens on the data consequences of design choices at the interface/interaction level. I'd like to see more of what Tim Brown, among others, has called "design thinking" applied to data capture (including application, service, and system instrumentation), to data management (including collation and summarization), to user/use models that utilize machine-learning techniques, and, of course, as has been invited elsewhere, to data visualization and analysis (including interpretation). Brown says that design thinking is neither art, nor science, nor religion. It is the capacity, ultimately, for "integrative thinking." In Brown's view, a design paradigm requires that the solution is "not locked away somewhere waiting to be discovered"; he advocates that we embrace "incongruous details" rather than smoothing or removing them [5]. In the incongruous details lie the insights.
1. Fisher, D., DeLine, R., Czerwinski, M., and Drucker, S. Interactions with big data analytics. interactions 19, 3 (May + June 2012), 50–59.
2. I note that there are hot debates about whether the word data should be treated as singular or plural. Following an extended discussion with trusted friends and editors, for this column I have elected to treat data as a collective noun.
3. Bowker, G. and Star, S.L. Sorting Things Out: Classification and Its Consequences. MIT Press, 2000.
4. Seife, C. Proofiness: The Dark Arts of Mathematical Deception. Viking Press, 2010. Another classic is Darrell Huff's How to Lie with Statistics, W. W. Norton & Company, 1993.
Some examples from Seife's book:
Disestimation occurs when too much meaning is assigned to a measurement, ignoring any uncertainties about the measurement and/or errors that could be present. In the 2008 Minnesota Senate race between Norm Coleman and Al Franken, errors in counting the votes were much larger than the number of votes that separated the candidates (estimated to be between 200 and 300). Seife concludes that flipping a coin would have been better than assuming any veracity in the measure—the number of votes—given these errors.
Potemkin numbers are statistics based on erroneous numbers and/or nonexistent calculations. Seife cites Justice Scalia's statement that 0.027 percent of convicted felons are wrongly imprisoned. This turned out to be based on an informal calculation, with rigorous studies suggesting that the actual number is between 3 and 5 percent.
Some "fruit salad" examples include "comparing apples and oranges," "cherry picking" data for rhetorical effect, and "apple polishing."
5. Brown, T. Change by Design: How Design Thinking Transforms Organizations and Inspires Innovation. HarperCollins, 2009.
September + October 2012
© 2012 ACM 1072-5520/12/09 $15.00