Engaging the Ethics of Data Science in Practice
Seeking more common ground between data scientists and their critics.
Critical commentary on data science has converged on a worrisome idea: that data scientists do not recognize their power and, thus, wield it carelessly. These criticisms channel legitimate concerns about data science into doubts about the ethical awareness of its practitioners. For these critics, carelessness and indifference explain much of the problem, to which only they can offer a solution.
Such a critique is not new. In the 1990s, Science and Technology Studies (STS) scholars challenged efforts by AI researchers to replicate human behaviors and organizational functions in software (for example, Collins3). The scholarship from the time was damning: expert systems routinely failed, critical researchers argued, because developers had impoverished understandings of the social worlds into which they intended to introduce their tools.6 At the end of the decade, however, Mark Ackerman reframed this as a social-technical gap between “what we know we must support socially and what we can support technically.” He argued that AI’s deficiencies did not reflect a lack of care on the part of researchers, but a profound challenge of dealing with the full complexity of the social world. Yet here we are again.
Our interviews with data scientists give us reason to think we can avoid this repetition. While practitioners were quick to point out that common criticisms of data science tend to lack technical specificity or rest on faulty understandings of the relevant techniques, they also expressed frustration that critics failed to account for the careful thinking and critical reflection that data scientists already do as part of their everyday work. This was more than resentment at being subject to outside judgment by non-experts. Instead, these data scientists felt that easy criticisms overlooked the kinds of routine deliberative activities that outsiders seem to have in mind when they talk about ethics.
Ethics in Practice
Data scientists engage in countless acts of implicit ethical deliberation while trying to make machines learn something useful, valuable, and reliable. For example, dealing with dirty and incomplete data is as much a moral as a practical concern. It requires making a series of small decisions that are often fraught, forcing reflection at each step. How was this data collected? Does it capture the entire population and full range of behavior that is of interest? The same is true for validating a model and settling on an acceptable error rate. What must a data scientist do to prove to herself that a model will indeed perform well when deployed? How do data scientists decide that a reported error rate is tolerable, and defendable? Eth-