can be used both for readjusting
learned policies to circumvent environmental changes and for controlling
disparities between nonrepresentative
samples and a target population.3 It
can also be used in the context of reinforcement learning to evaluate policies
that invoke new actions, beyond those
used in training.35
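The reweighting that corrects for a nonrepresentative sample can be pictured with a small sketch. It assumes, hypothetically, that the study and target populations differ only in the distribution of a single covariate Z, so that the stratum-specific effects P(y | do(x), z) carry over unchanged. The strata names and all numbers are invented for illustration.

```python
from fractions import Fraction as F

# Hypothetical stratum-specific effects P(Y=1 | do(X=1), Z=z), assumed
# to be the same in the study population and in the target population.
effect_by_z = {"young": F(3, 10), "old": F(7, 10)}

# Hypothetical covariate distributions: the study over-samples the young.
p_z_study = {"young": F(4, 5), "old": F(1, 5)}
p_z_target = {"young": F(1, 2), "old": F(1, 2)}

def average_effect(effect_by_z, p_z):
    """Sum over z of P(y | do(x), z) * P(z)."""
    return sum(effect_by_z[z] * p_z[z] for z in p_z)

naive = average_effect(effect_by_z, p_z_study)         # what the study reports
transported = average_effect(effect_by_z, p_z_target)  # reweighted to the target
print(naive, transported)  # 19/50 1/2
```

The only design choice here is which distribution of Z does the averaging; whether such a reweighting is licensed at all depends on the causal assumptions about where the two populations differ.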
Tool 6. Recovering from missing
data. Problems due to missing data
plague every branch of experimental
science. Respondents do not answer
every item on a questionnaire, sensors
malfunction as weather conditions
worsen, and patients often drop out of
a clinical study for unknown reasons.
The rich literature on this problem is
wedded to a model-free paradigm of
associational analysis and, accordingly, is severely limited to situations
where “missingness” occurs at random; that is, independent of values
taken by other variables in the model.6
Using causal models of the missingness process, we can now formalize
the conditions under which causal
and probabilistic relationships can be
recovered from incomplete data and,
whenever the conditions are satisfied,
produce a consistent estimate of the
desired relationship.12,13
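A minimal sketch of how an assumed missingness model dictates the estimand: here X is always observed, Y is sometimes missing, and the missingness indicator R is assumed to depend on X only. The model and its numbers are hypothetical. This is deliberately the simplest case, one that older methods also cover; it only illustrates reading the estimand off the missingness model.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical model: R = 1 means Y was recorded; R depends on X only.
p_x1 = F(1, 2)
p_y1_given_x = {0: F(1, 5), 1: F(4, 5)}
p_r1_given_x = {0: F(9, 10), 1: F(3, 10)}

def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one equals value."""
    return p_one if value == 1 else 1 - p_one

# Full joint P(x, y, r); the analyst sees Y only in the rows where r == 1.
joint = {
    (x, y, r): bern(p_x1, x) * bern(p_y1_given_x[x], y) * bern(p_r1_given_x[x], r)
    for x, y, r in product((0, 1), repeat=3)
}

def prob(pred):
    """Probability of the event described by pred(x, y, r)."""
    return sum(p for key, p in joint.items() if pred(*key))

true_p_y1 = prob(lambda x, y, r: y == 1)  # not available to the analyst

# Complete-case estimate P(Y=1 | R=1), which ignores the missingness process.
complete_case = prob(lambda x, y, r: y == 1 and r == 1) / prob(lambda x, y, r: r == 1)

# Estimand licensed by the assumed model: sum_x P(Y=1 | x, R=1) * P(x).
recovered = sum(
    prob(lambda x, y, r, v=v: x == v and y == 1 and r == 1)
    / prob(lambda x, y, r, v=v: x == v and r == 1)
    * prob(lambda x, y, r, v=v: x == v)
    for v in (0, 1)
)
print(true_p_y1, complete_case, recovered)  # 1/2 7/20 1/2
```

The complete-case figure is off because records with X = 1 are both more likely to have Y = 1 and more likely to be missing; the model-based estimand uses only observable quantities and matches the true value.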
Tool 7. Causal discovery. The d-separation
criterion described earlier enables machines to detect and enumerate the testable implications of a given
causal model. This opens the possibility of inferring, with mild assumptions,
the set of models that are compatible
with the data and of representing this set
compactly. Systematic searches have
been developed that, in certain circumstances, can prune the set of compatible models significantly to the point
where causal queries can be estimated
directly from that set.9,18,24,31
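To make "testable implications" concrete, the following sketch checks d-separation in a small DAG using the standard ancestral moral graph construction. The function name `d_separated`, the dictionary encoding of the graph, and the example DAG are assumptions made for illustration; this is not one of the search procedures cited above.

```python
from itertools import combinations

def d_separated(parents, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG `parents`
    (a dict mapping each node to the list of its parents)."""
    given = set(given)
    # 1. Keep only x, y, the conditioning set, and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: link each node to its parents and "marry" co-parents.
    neighbors = {n: set() for n in keep}
    for child in keep:
        pars = list(parents.get(child, ()))
        for p in pars:
            neighbors[child].add(p)
            neighbors[p].add(child)
        for p, q in combinations(pars, 2):
            neighbors[p].add(q)
            neighbors[q].add(p)
    # 3. Drop the conditioning nodes and test whether x can still reach y.
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in given:
            continue
        seen.add(node)
        stack.extend(neighbors[node])
    return True

# Hypothetical DAG: a chain X -> Z -> Y plus a collider X -> W <- Y.
dag = {"Z": ["X"], "Y": ["Z"], "W": ["X", "Y"]}
print(d_separated(dag, "X", "Y"))               # False: the chain is open
print(d_separated(dag, "X", "Y", ["Z"]))        # True: Z blocks the chain
print(d_separated(dag, "X", "Y", ["Z", "W"]))   # False: W opens the collider
```

Each `True` result corresponds to a conditional independence that the model implies and that data could, in principle, refute.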
Alternatively, Shimizu et al.29
proposed a method for discovering causal directionality based on functional
decomposition.24 The idea is that in a
linear model X → Y with non-Gaussian
noise, P(y) is a convolution of two non-Gaussian distributions and would be,
figuratively speaking, “more Gaussian”
than P(x). The relation of “more Gaussian than” can be given a precise numerical measure and used to infer the directionality of certain arrows.
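One possible numerical reading of “more Gaussian than” is excess kurtosis, which is zero for a Gaussian. The sketch below simulates a hypothetical linear model with uniform cause and uniform noise and compares the two kurtosis values. It illustrates only the intuition; it is not the algorithm of Shimizu et al., and the choice of kurtosis, the uniform distributions, and the unit coefficient are all assumptions.

```python
import random

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

# Hypothetical model X -> Y: Y = X + noise, with X and noise both uniform.
x = [rng.uniform(-1, 1) for _ in range(n)]
noise = [rng.uniform(-1, 1) for _ in range(n)]
y = [xi + ei for xi, ei in zip(x, noise)]

kx = excess_kurtosis(x)  # theory: -1.2 for a uniform variable
ky = excess_kurtosis(y)  # theory: -0.6 for the sum of two such uniforms
print(kx, ky)
print("y is closer to Gaussian than x:", abs(ky) < abs(kx))
```

With noise that is much further from Gaussian than the cause, the comparison can come out differently, so the sketch should be read as an illustration of the idea rather than a decision rule.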
Tian and Pearl32 developed yet
another method of causal discovery
probability of the sentence is estimable
from experimental or observational
studies, or a combination thereof.1,18,30
Of special interest in causal discourse are counterfactual questions
concerning “causes of effects,” as opposed to “effects of causes.” For example, how likely is it that Joe’s swimming
exercise was a necessary (or sufficient)
cause of Joe’s death?7,20
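As a hedged illustration of how a “necessary cause” question becomes a formula, the probability of necessity (PN) can be written counterfactually, with x denoting the exposure, x' its absence, y the outcome, and y' its absence. The closed form in the second line holds only under two assumptions that the text above does not state: exogeneity (no confounding between X and Y) and monotonicity (exposure never prevents the outcome).

```latex
% PN: given that x and y both occurred, the probability that y would
% not have occurred had the exposure been absent (x').
\mathrm{PN} = P(Y_{x'} = y' \mid X = x,\ Y = y)

% Under exogeneity and monotonicity (assumptions, see lead-in):
\mathrm{PN} = \frac{P(y \mid x) - P(y \mid x')}{P(y \mid x)}
```

Without those assumptions the observational quantities generally yield bounds on PN rather than a single value.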
Tool 4. Mediation analysis and the
assessment of direct and indirect effects. Mediation analysis concerns the
mechanisms that transmit changes
from a cause to its effects. The identification of such an intermediate
mechanism is essential for generating explanations, and counterfactual
analysis must be invoked to facilitate
this identification. The logic of counterfactuals and their graphical representation have spawned algorithms for
estimating direct and indirect effects
from data or experiments.19,27,34 A typical query computable through these algorithms is: What fraction of the effect
of X on Y is mediated by variable Z?
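The query “what fraction of the effect of X on Y is mediated by Z?” can be sketched on a toy structural model. The linear equations and the coefficients a, b, and c below are hypothetical; the counterfactual definitions of the direct and indirect effects are spelled out in the comments.

```python
from fractions import Fraction as F

# Hypothetical linear structural model with made-up coefficients:
#   Z = a * X            (mediator)
#   Y = c * X + b * Z    (outcome)
a, b, c = F(1, 2), F(4, 5), F(3, 5)

def mediator(x):
    return a * x

def outcome(x, z):
    return c * x + b * z

x0, x1 = 0, 1  # reference and treatment values of X

# Total effect: change X and let Z respond naturally.
total = outcome(x1, mediator(x1)) - outcome(x0, mediator(x0))
# Natural direct effect: change X but hold Z at the value it takes under x0.
direct = outcome(x1, mediator(x0)) - outcome(x0, mediator(x0))
# Natural indirect effect: hold X at x0 but let Z respond as if X were x1.
indirect = outcome(x0, mediator(x1)) - outcome(x0, mediator(x0))

fraction_mediated = indirect / total
print(total, direct, indirect, fraction_mediated)  # 1 3/5 2/5 2/5
```

In this linear, interaction-free case the direct and indirect parts add up to the total effect; that need not hold when X and Z interact, which is why the counterfactual definitions are needed.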
Tool 5. Adaptability, external validity, and sample selection bias. The
validity of every experimental study is
challenged by disparities between the
experimental and the intended implementational setups. A machine trained
in one environment cannot be expected to perform well when environmental conditions change, unless the
changes are localized and identified.
This problem and its various manifestations are well recognized by AI
researchers, and enterprises such as
“domain adaptation,” “transfer learning,” “life-long learning,” and
“explainable AI”4 are just some of the subtasks
identified by researchers and funding
agencies in an attempt to alleviate the
general problem of robustness. Unfortunately, the problem of robustness,
in its broadest form, requires a causal
model of the environment and cannot
be properly addressed at the level of Association. Associations alone cannot
identify the mechanisms responsible
for the changes that occurred,22 the
reason being that surface changes in
observed associations do not uniquely
identify the underlying mechanism
responsible for the change. The do-calculus
discussed earlier now offers a
complete methodology for overcoming
bias due to environmental changes. It
Unlike the rules of geometry, mechanics, optics, or probabilities, the rules of cause and effect have been denied the benefits of mathematical analysis.