CPTs. Note, however, that the network
in Figure 11a is consistent with common perceptions of causal influences,
yet the one in Figure 11b violates these
perceptions due to edges A → E and A
→ B. Is there any significance to this
discrepancy? In other words, is there
some additional information that can
be extracted from one of these networks, which cannot be extracted from
the other? The answer is yes according
to a body of work on causal Bayesian
networks, which is concerned with a
key question:16,30 how can one characterize the additional information captured by a causal Bayesian network
and, hence, what queries can be answered only by Bayesian networks that
have a causal interpretation?
According to this body of work, only
causal networks are capable of updating probabilities based on interventions, as opposed to observations. To
give an example of this difference, consider Figure 11 again and suppose that
we want to compute the probabilities
of various events given that someone
has tampered with the alarm, causing
it to go off. This is an intervention, to be
contrasted with an observation, where
we know the alarm went off but without knowing the reason. In a causal
network, interventions are handled as
shown in Figure 12a: by simply adding a
new direct cause for the alarm variable.
This local fix, however, cannot be applied to the non-causal network in Figure 11b. If we apply it anyway, we obtain the network
in Figure 12b, which asserts the following (using d-separation): if we observe
the alarm did go off, then knowing it
was not tampered with is irrelevant to
whether a burglary or an earthquake
took place. This independence, which
is counterintuitive, does not hold in
the causal structure and represents one
example of what may go wrong when
using a non-causal structure to answer
questions about interventions.
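The difference between observing the alarm and intervening on it can be made concrete with a small computation. The following is a minimal sketch, not the article's code: the network structure (Burglary and Earthquake as causes of Alarm) follows the standard example, but the CPT numbers are illustrative assumptions. The intervention is handled by graph surgery: the alarm's CPT is replaced, cutting it off from its normal causes.

```python
from itertools import product

# Hypothetical CPTs for a tiny causal network: Burglary -> Alarm <- Earthquake.
# The numbers below are illustrative, not taken from the article.
P_B = {True: 0.01, False: 0.99}
P_E = {True: 0.02, False: 0.98}
# P(Alarm=True | Burglary, Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

def joint(b, e, a, intervene_alarm=None):
    """Joint probability of one configuration. If intervene_alarm is set,
    the alarm's CPT is replaced by the intervention (graph surgery),
    severing the alarm from its causes."""
    p = P_B[b] * P_E[e]
    if intervene_alarm is None:
        pa = P_A[(b, e)]
        p *= pa if a else 1 - pa
    else:
        p *= 1.0 if a == intervene_alarm else 0.0
    return p

def prob_burglary(given_alarm, intervene=False):
    """P(Burglary | Alarm) under observation, or P(Burglary | do(Alarm))
    under intervention, by brute-force summation."""
    iv = given_alarm if intervene else None
    num = den = 0.0
    for b, e in product([True, False], repeat=2):
        w = joint(b, e, given_alarm, intervene_alarm=iv)
        den += w
        if b:
            num += w
    return num / den

# Observing the alarm raises the probability of burglary well above its prior;
# forcing the alarm (tampering) leaves the burglary probability at its prior.
print(prob_burglary(True))
print(prob_burglary(True, intervene=True))
```

Under observation the burglary probability jumps from its 0.01 prior to about 0.58, while under the intervention it stays at 0.01: tampering with the alarm tells us nothing about burglaries, exactly the distinction a causal network preserves.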
Causal structures can also be used
to answer more sophisticated queries,
such as counterfactuals. For example,
the probability of “the patient would
have been alive had he not taken the
drug” requires reasoning about interventions (and sometimes might
even require functional information,
beyond standard causal Bayesian networks30). Other types of queries include ones for distinguishing between
direct and indirect causes and for determining the sufficiency and necessity of causation.30 Learning causal
Bayesian networks has also been
treated,16,30 although not as extensively as the learning of general Bayesian
networks.
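The remark that counterfactuals may require functional information beyond a standard causal Bayesian network can be illustrated with a structural model. The sketch below is an assumed toy model, not from the article: an exogenous variable U fixes each patient's "response type," which deterministically decides survival as a function of the drug. The counterfactual is then computed by the standard three steps of abduction (update U on the evidence), action (set the drug by intervention), and prediction.

```python
# Hypothetical functional model for "would the patient have been alive
# had he not taken the drug?" Response types and their prior are
# illustrative assumptions.
P_U = {'never': 0.1,    # dies either way
       'always': 0.6,   # survives either way
       'helped': 0.2,   # survives iff he takes the drug
       'hurt': 0.1}     # survives iff he does NOT take the drug

def alive(drug, u):
    """Structural equation: survival as a deterministic function of the
    drug decision and the exogenous response type u."""
    return {'never': False, 'always': True,
            'helped': drug, 'hurt': not drug}[u]

# Step 1 (abduction): condition on the evidence -- took the drug, died.
weights = {u: (p if not alive(True, u) else 0.0) for u, p in P_U.items()}
z = sum(weights.values())
posterior = {u: w / z for u, w in weights.items()}

# Steps 2-3 (action + prediction): evaluate survival under do(drug=False)
# using the updated distribution over response types.
p_cf = sum(p for u, p in posterior.items() if alive(False, u))
print(p_cf)  # P(alive had he not taken the drug | took drug, died)
```

Here the evidence leaves only the 'never' and 'hurt' types possible, so the counterfactual probability is 0.5. Note that the functional form of `alive` is essential: a causal Bayesian network with only CPTs would not determine this answer.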
Beyond Bayesian Networks
Viewed as graphical representations
of probability distributions, Bayesian
networks are only one of several
models that serve this purpose. In fact, in
areas such as statistics (and now also
in AI), Bayesian networks are studied
under the broader class of
probabilistic graphical models, which include
other instances such as Markov networks and chain graphs (for example,
Edwards11 and Koller and Friedman22).
Markov networks correspond to undirected graphs, and chain graphs have
both directed and undirected edges.
Both of these models can be interpreted as compact specifications of
probability distributions, yet their semantics tend to be less transparent
than those of Bayesian networks. For example,
both of these models include numeric
annotations, yet one cannot interpret
these numbers directly as probabilities even though the whole model can
be interpreted as a probability distribution. Figure 9b depicts a special
case of a Markov network, known as a
Markov random field (MRF), which is
typically used in vision applications.
Comparing this model to the Bayesian network in Figure 9a, one finds
that smoothness constraints between
two adjacent pixels Pi and Pj can now
be represented by a single undirected
edge Pi – Pj instead of two directed edges and an additional node, Pi → Cij ←
Pj. In this model, each edge is associated with a function f(Pi, Pj) over the
states of adjacent pixels. The values of
this function can be used to capture
the smoothness constraint for these
pixels, yet do not admit a direct probabilistic interpretation.
Bayesian networks are meant to
model probabilistic beliefs, yet the
interest in such beliefs is typically motivated by the need to make rational
decisions. Since such decisions are
often contemplated in the presence
of uncertainty, one needs to know
the likelihood and utilities associated
with various decision outcomes. A