Data in the Wild: Some Reflections
Chee Siang Ang
University of Kent | csa8@kent.ac.uk
Ania Bobrowicz
University of Kent | a.bobrowicz@kent.ac.uk
Diane J. Schiano
djs.ux.consulting | dianejschiano@gmail.com
Bonnie Nardi
University of California, Irvine | nardi@ics.uci.edu
In recent years, the proliferation
of online services such as social
networking, gaming, Internet
fora, and chat rooms has provided
academic and corporate researchers
opportunities to acquire and
analyze large volumes of data on
human activity and social interaction
online. For example, massive
corpora from Facebook, Twitter,
and other user-generated data
sources are being harvested “in
the wild” on the Internet. Research
based on such “found data” is
increasingly common, as software
tools become sophisticated enough
to allow researchers to forage for
data at fairly low cost. Researchers
can now “not merely do more of
the same, but in some cases conduct
qualitatively new forms of
analysis” [1]. Novel computational
methods are being developed to
integrate multiple distinct and
often heterogeneous datasets (e.g.,
mobile location data and Twitter
feeds) in the hope that important
new relationships will emerge that
cannot be found using a single
data source. The growing tendency
to apply new machine learning
and other data-mining techniques
to search for emerging patterns
in existing datasets—rather than
generate new data for planned
hypothesis testing or qualitative
exploration—is transforming the
way research is being conducted.
March + April 2013