new to most computer scientists, but
they are not new to social scientists.
To me, then, this highlights an important path forward. Clearly, machine
learning is incredibly useful—and, in
particular, machine learning is useful
for social science. But we must treat
machine learning for social science
very differently from the way we treat
machine learning for, say, handwriting
recognition or playing chess. We cannot just apply machine learning methods in a black-box fashion, as if computational social science were simply
computer science plus social data. We
need transparency. We need to prioritize interpretability—even in predictive
contexts. We need to conduct rigorous,
detailed error analyses. We need to
represent uncertainty. But, most importantly, we need to work with social
scientists in order to understand the
ethical implications and consequences
of our modeling decisions.
1. Barocas, S. and Selbst, A.D. Big data’s disparate
impact. California Law Review 104 (2016), 671–732.
2. ben-Aaron, J. et al. Transparency by conformity:
A field experiment evaluating openness in local
governments. Public Administration Review 77, 1 (Jan.
3. Clauset, A., Arbesman, S., and Larremore, D.B.
Systematic inequality and hierarchy in faculty hiring
networks. Science Advances 1, 1 (Jan. 2015).
4. Gerrish, S. and Blei, D. How they vote: Issue-adjusted
models of legislative behavior. In Advances in Neural
Information Processing Systems Twenty Five (2012),
5. Hardt, M. How big data is unfair; http://bit.ly/1BBglLr.
6. Hopkins, D. J. and King, G. A method of automated
nonparametric content analysis for social science.
American Journal of Political Science 54, 1 (Jan.
7. Jackman, S. Bayesian Analysis for the Social Sciences.
8. Lauderdale, B.E. and Clark, T.S. Scaling politically
meaningful dimensions using texts and votes.
American Journal of Political Science 58, 3 (Mar.
9. Sharma, A., Hofman, J., and Watts, D. Estimating
the causal impact of recommendation systems from
observational data. In Proceedings of the Sixteenth
ACM Conference on Economics and Computation
10. Szegedy, C. et al. Going deeper with convolutions. In
Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (2015).
Hanna Wallach (firstname.lastname@example.org) is a Senior
Researcher at Microsoft Research and an Adjunct
Associate Professor at the University of Massachusetts
This article is based on an essay that appeared on Medium
—see http://bit.ly/13QlExf. This work was supported in
part by NSF grant #IIS-1320219. Any opinions, findings
and conclusions, or recommendations expressed in this
material are those of the author and do not necessarily
reflect those of the sponsor.
Copyright held by author.
sets, often collected and made available for no particular purpose other
than “machine learning research.” In
contrast, social scientists often use
data collected or curated in order to
answer specific questions. Because
this process is extremely labor intensive, these datasets have traditionally
been small scale.
But—and this is one of the driving
forces behind computational social science—thanks to the Internet, we now
have all kinds of opportunities to obtain large-scale, digitized datasets that
document a variety of social phenomena, many of which we had no way of
studying previously. For example, my
collaborator Bruce Desmarais and I
wanted to conduct a data-driven study
of local government communication
networks, focusing on how political actors at the local level communicate with
one another and with the general public. It turns out that most U.S. states
have sunshine laws that mimic the federal Freedom of Information Act. These
laws require local governments to archive textual records—including, in
many states, email—and disclose them
to the public upon request.
Desmarais and I therefore issued
public records requests to the 100
county governments in North Carolina,
requesting all non-private email messages sent and received by each county’s department managers during a randomly selected three-month time
frame. Out of curiosity, we also decided
to use the process of requesting these
email messages as an opportunity to
conduct a randomized field experiment
to test whether county governments are
more likely to fulfill a public records request when they are aware that their
peer governments have already fulfilled
the same request.
On average, we found that counties
that were informed that their peers
had already complied took fewer days
to acknowledge our request and were
more likely to actually fulfill it. And we
ended up with over half a million
email messages from 25 different
county governments. 2
Clearly, new opportunities like this
are great. But these kinds of opportunities also raise new challenges. Most
conspicuously, it is very tempting to
say, “Why not use these large-scale,
social datasets in combination with
the powerful predictive models developed by computer scientists?” However, unlike the datasets traditionally
used by computer scientists, these
new datasets are often about people
going about their everyday lives: their
attributes, their actions, and their interactions. Not only do these datasets
document social phenomena on a
massive scale, they often do so at the
granularity of individual people and
their second-to-second behavior. As
a result, they raise some complicated
ethical questions regarding privacy,
fairness, and accountability.
It is clear from the media that one of
the things that terrifies people the most
about machine learning is the use of
black-box predictive models in social
contexts, where it is possible to do more
harm than good. There is a great deal of
concern—and rightly so—that these
models will reinforce existing structural biases and marginalize historically
disadvantaged groups.
In addition, when datapoints are
humans, error analysis takes on a
whole new level of importance because
errors have real-world consequences
that involve people’s lives. It is not
enough for a model to be 95% accurate—we need to know who is affected
when there is a mistake, and in what
way. For example, there is a substantial
difference between a model that is 95%
accurate because of noise and one that
is 95% accurate because it performs
perfectly for white men, but achieves
only 50% accuracy when making predictions about women and minorities.
Even with large datasets, there is always proportionally less data available
about minorities, and statistical patterns that hold for the majority may be
invalid for a given minority group. As a
result, the usual machine learning objective of “good performance on average” may be detrimental to those in a
minority group. 1, 5
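To make the 95% example concrete, here is a minimal sketch of a disaggregated error analysis. The data are synthetic and purely illustrative: the 90/10 split between groups, the group labels, and the helper name accuracy_by_group are all assumptions made for this sketch, not anything drawn from a real study. It shows how a model can be perfect for the majority group and no better than a coin flip for the minority group while still reporting 95% accuracy overall.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return overall accuracy and a dict mapping each group to its accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    overall = sum(correct.values()) / sum(total.values())
    per_group = {g: correct[g] / total[g] for g in total}
    return overall, per_group


# Synthetic data (an assumption for illustration only):
# 90 majority-group datapoints, all predicted correctly, and
# 10 minority-group datapoints, only half predicted correctly.
groups = ["majority"] * 90 + ["minority"] * 10
y_true = [1] * 100
y_pred = [1] * 90 + [1] * 5 + [0] * 5

overall, per_group = accuracy_by_group(y_true, y_pred, groups)
print(overall)    # 0.95
print(per_group)  # {'majority': 1.0, 'minority': 0.5}
```

The overall figure is just the size-weighted average of the group figures (0.9 × 1.0 + 0.1 × 0.5 = 0.95), which is why a single aggregate number can hide exactly the kind of disparity described above.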
Thus, when we use machine learning to reason about social phenomena—and especially when we do so to
draw actionable conclusions—we
have to be exceptionally careful,
more so than when we use machine
learning in other contexts. But here is
the thing: these ethical challenges are
not entirely new. Sure, they may be