This research was supported in
part by grants from the Defense Advanced Research Projects Agency
[#W911NF-16-057], National Science
Foundation [#IIS-1302448, #IIS-
1527490, and #IIS-1704932], and
Office of Naval Research [#N00014-
17-S-B001]. The article benefited substantially from comments by the anonymous reviewers and conversations
with Adnan Darwiche of the University
of California, Los Angeles.
1. Balke, A. and Pearl, J. Probabilistic evaluation of
counterfactual queries. In Proceedings of the 12th
National Conference on Artificial Intelligence (Seattle,
WA, July 31–Aug. 4). MIT Press, Menlo Park, CA,
2. Bareinboim, E. and Pearl, J. Causal inference by
surrogate experiments: z-identifiability. In Proceedings
of the 28th Conference on Uncertainty in Artificial
Intelligence, N. de Freitas and K. Murphy, Eds.
(Catalina Island, CA, Aug. 14–18). AUAI Press,
Corvallis, OR, 2012, 113–120.
3. Bareinboim, E. and Pearl, J. Causal inference and
the data-fusion problem. Proceedings of the National
Academy of Sciences 113, 27 (2016), 7345–7352.
4. Chen, Z. and Liu, B. Lifelong Machine Learning. Morgan
and Claypool Publishers, San Rafael, CA, 2016.
5. Darwiche, A. Human-Level Intelligence or Animal-Like
Abilities? Technical Report. Department of Computer
Science, University of California, Los Angeles, CA,
6. Graham, J. Missing Data: Analysis and Design
(Statistics for Social and Behavioral Sciences).
7. Halpern, J.Y. and Pearl, J. Causes and explanations:
A structural-model approach. Part I: Causes. British
Journal for the Philosophy of Science 56 (2005), 843–887.
8. Hutson, M. AI researchers allege that machine
learning is alchemy. Science (May 3, 2018); https://
9. Jaber, A., Zhang, J.J., and Bareinboim, E. Causal
identification under Markov equivalence. In
Proceedings of the 34th Conference on Uncertainty in
Artificial Intelligence, A. Globerson and R. Silva, Eds.
(Monterey, CA, Aug. 6–10). AUAI Press, Corvallis, OR,
10. Lake, B.M., Salakhutdinov, R., and Tenenbaum, J.B.
Human-level concept learning through probabilistic
program induction. Science 350, 6266 (Dec. 2015),
11. Marcus, G. Deep Learning: A Critical Appraisal.
Technical Report. Departments of Psychology and
Neural Science, New York University, New York, 2018;
12. Mohan, K. and Pearl, J. Graphical Models for
Processing Missing Data. Technical Report R-473.
Department of Computer Science, University of
California, Los Angeles, CA, 2018; forthcoming,
Journal of the American Statistical Association; http://ftp.
13. Mohan, K., Pearl, J., and Tian, J. Graphical models
for inference with missing data. In Advances in
Neural Information Processing Systems 26, C.J.C.
Burges, L. Bottou, M. Welling, Z. Ghahramani, and
K.Q. Weinberger, Eds. Curran Associates, Inc., Red
Hook, NY, 2013, 1277–1285; http://papers.nips.cc/
14. Morgan, S.L. and Winship, C. Counterfactuals and
Causal Inference: Methods and Principles for Social
Research (Analytical Methods for Social Research),
Second Edition. Cambridge University Press, New York.
15. Pearl, J. Probabilistic Reasoning in Intelligent
Systems. Morgan Kaufmann, San Mateo, CA, 1988.
16. Pearl, J. Comment: Graphical models, causality, and
intervention. Statistical Science 8, 3 (1993), 266–269.
17. Pearl, J. Causal diagrams for empirical research.
Biometrika 82, 4 (Dec. 1995), 669–710.
18. Pearl, J. Causality: Models, Reasoning, and Inference.
Cambridge University Press, New York, 2000; Second
Edition, 2009.
19. Pearl, J. Direct and indirect effects. In Proceedings
of the 17th Conference on Uncertainty in Artificial
Intelligence (Seattle, WA, Aug. 2–5). Morgan
Kaufmann, San Francisco, CA, 2001, 411–420.
20. Pearl, J. Causes of effects and effects of causes.
Journal of Sociological Methods and Research 44, 1
21. Pearl, J. Trygve Haavelmo and the emergence of
causal calculus. Econometric Theory 31, 1 (2015),
152–179; special issue on Haavelmo centennial.
22. Pearl, J. and Bareinboim, E. External validity: From
do-calculus to transportability across populations.
Statistical Science 29, 4 (2014), 579–595.
23. Pearl, J. and Mackenzie, D. The Book of Why: The New
Science of Cause and Effect. Basic Books, New York, 2018.
24. Peters, J., Janzing, D. and Schölkopf, B. Elements
of Causal Inference: Foundations and Learning
Algorithms. MIT Press, Cambridge, MA, 2017.
25. Porta, M. The deconstruction of paradoxes in
epidemiology. OUPblog, Oct. 17, 2014; https://blog.oup.
26. Ribeiro, M.T., Singh, S., and Guestrin, C. Why should I
trust you?: Explaining the predictions of any classifier.
In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining
(San Francisco, CA, Aug. 13–17). ACM Press, New
York, 2016, 1135–1144.
27. Robins, J.M. and Greenland, S. Identifiability and
exchangeability for direct and indirect effects.
Epidemiology 3, 2 (Mar. 1992), 143–155.
28. Rosenbaum, P. and Rubin, D. The central role of
propensity score in observational studies for causal
effects. Biometrika 70, 1 (Apr. 1983), 41–55.
29. Shimizu, S., Hoyer, P.O., Hyvärinen, A., and Kerminen,
A.J. A linear non-Gaussian acyclic model for causal
discovery. Journal of Machine Learning Research 7
(Oct. 2006), 2003–2030.
30. Shpitser, I. and Pearl, J. Complete identification
methods for the causal hierarchy. Journal of Machine
Learning Research 9 (2008), 1941–1979.
31. Spirtes, P., Glymour, C.N., and Scheines, R. Causation,
Prediction, and Search, Second Edition. MIT Press,
Cambridge, MA, 2000.
32. Tian, J. and Pearl, J. A general identification condition
for causal effects. In Proceedings of the 18th National
Conference on Artificial Intelligence (Edmonton, AB,
Canada, July 28–Aug. 1). AAAI Press/MIT Press,
Menlo Park, CA, 2002, 567–573.
33. van der Laan, M.J. and Rose, S. Targeted Learning:
Causal Inference for Observational and Experimental
Data. Springer, New York, 2011.
34. VanderWeele, T.J. Explanation in Causal Inference:
Methods for Mediation and Interaction. Oxford
University Press, New York, 2015.
35. Zhang, J. and Bareinboim, E. Transfer learning
in multi-armed bandits: A causal approach. In
Proceedings of the 26th International Joint Conference
on Artificial Intelligence (Melbourne, Australia,
Aug. 19–25). AAAI Press, Menlo Park, CA, 2017,
Judea Pearl ( firstname.lastname@example.org) is a professor of
computer science and statistics and director of the
Cognitive Systems Laboratory at the University of
California, Los Angeles, USA.
Copyright held by author.
based on the detection of “shocks,” or spontaneous local
changes in the environment that act like “nature’s
interventions,” and unveil causal directionality toward the
consequences of those shocks.
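A minimal sketch of why such asymmetries can reveal direction (hypothetical illustration, not code from the article): with a linear relation and non-Gaussian disturbances, the regression residual is statistically independent of the regressor only in the true causal direction — the idea underlying LiNGAM-style discovery (reference 29). The crude dependence score below (correlation of squared variables) is an assumption of this sketch, standing in for a proper independence test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
x = rng.uniform(-1, 1, n)          # cause, with non-Gaussian (uniform) distribution
y = x + rng.uniform(-1, 1, n)      # effect: y = x + independent non-Gaussian noise

def residual(target, predictor):
    """Least-squares residual of target regressed on predictor."""
    slope = np.cov(target, predictor)[0, 1] / np.var(predictor)
    return target - slope * predictor

def dependence(a, b):
    """Crude dependence score: |corr(a^2, b^2)|, near zero for
    independent, symmetrically distributed variables."""
    return abs(np.corrcoef(a**2, b**2)[0, 1])

forward = dependence(residual(y, x), x)   # true direction: residual ~ independent of x
backward = dependence(residual(x, y), y)  # reversed direction: residual depends on y

print(f"forward score {forward:.3f} < backward score {backward:.3f}")
assert forward < backward  # the asymmetry points from x toward y
```

Were the noise Gaussian, both scores would vanish and the direction would stay hidden, which is why the non-Gaussian "shocks" carry the directional information.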
I have argued that causal reasoning is an indispensable
component of human thought that should be formalized and
algorithmized toward achieving human-level machine
intelligence. I have explicated some of the impediments
toward that goal in the form of a three-level hierarchy and
shown that inference at levels 2 and 3 requires a causal
model of one’s environment. I have described seven cognitive
tasks that require tools from these two levels of inference
and demonstrated how they can be accomplished in the SCM
framework.
It is important for researchers to note that the models used
in accomplishing these tasks are structural (or conceptual)
and require no commitment to a particular form of the
distributions involved. On the other hand, the validity of
all inferences depends critically on the veracity of the
assumed structure. If the true structure differs from the one
assumed, and the data fits both equally well, substantial
errors may result that can sometimes be assessed through a
sensitivity analysis.
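The hazard described above can be made concrete with a small hypothetical sketch (not from the article): a linear-Gaussian model X → Y and its reversal Y → X are observationally equivalent — they reproduce the data equally well — yet they license different answers to the interventional query do(X).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)   # data actually generated by x -> y

# Both candidate structures fit the observed joint distribution equally well;
# the correlation alone cannot distinguish them.
r = np.corrcoef(x, y)[0, 1]

# Yet they imply different effects for the intervention do(X = x + 1):
slope_xy = np.cov(x, y)[0, 1] / np.var(x)
effect_if_x_causes_y = slope_xy * 1.0   # structure A (x -> y): y shifts by the slope
effect_if_y_causes_x = 0.0              # structure B (y -> x): do(X) leaves y untouched

print(f"observed corr {r:.2f}; implied effects: "
      f"{effect_if_x_causes_y:.2f} vs {effect_if_y_causes_x:.2f}")
```

The data cannot arbitrate between the two answers; only the assumed structure does, which is exactly why its veracity (or a sensitivity analysis over plausible structures) matters.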
It is also important for them to keep
in mind that the theoretical limitations
of model-free machine learning do not
apply to tasks of prediction, diagnosis,
and recognition, where interventions
and counterfactuals assume a secondary role.
However, the model-assisted methods by which these limitations are circumvented can nevertheless be transported to other machine learning tasks
where problems of opacity, robustness, explainability, and missing data
are critical. Moreover, given the transformative impact that
causal modeling has had on the social and health
sciences [14, 25, 34], it is only natural to expect a similar
transformation to sweep
through machine learning technology
once it is guided by provisional models of reality. I expect
this symbiosis to yield systems that communicate with users
in their native language of cause and effect and, leveraging
this capability, to become the dominant paradigm of
next-generation AI.