Conditional independence


These pictures represent the probabilities of events A, B and C by the areas shaded red, blue and green respectively, with respect to the total area. In the first example A and B are conditionally independent given C and also given not C; in the second they are conditionally independent only given C, because \Pr(A \cap B \mid \mbox{not } C) \neq \Pr(A \mid \mbox{not } C)\Pr(B \mid \mbox{not } C).

In probability theory, two events A and B are conditionally independent given a third event C precisely if the occurrence or non-occurrence of A and the occurrence or non-occurrence of B are independent events in their conditional probability distribution given C. In the standard notation of probability theory,

\Pr(A \cap B \mid C) = \Pr(A \mid C)\Pr(B \mid C),

or equivalently (provided \Pr(B \mid C) > 0),

\Pr(A \mid B \cap C) = \Pr(A \mid C).
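
A short derivation of this equivalence (assuming \Pr(B \mid C) > 0, and using only the definition of conditional probability):

    \Pr(A \mid B \cap C) = \frac{\Pr(A \cap B \cap C)}{\Pr(B \cap C)} = \frac{\Pr(A \cap B \mid C)\,\Pr(C)}{\Pr(B \mid C)\,\Pr(C)} = \frac{\Pr(A \mid C)\Pr(B \mid C)}{\Pr(B \mid C)} = \Pr(A \mid C).

Conversely, multiplying \Pr(A \mid B \cap C) = \Pr(A \mid C) by \Pr(B \mid C) and using \Pr(A \mid B \cap C)\Pr(B \mid C) = \Pr(A \cap B \mid C) recovers the product form.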

Two random variables X and Y are conditionally independent given an event C if they are independent in their conditional probability distribution given C. Two random variables X and Y are conditionally independent given a third random variable W if, for any measurable set S of possible values of W, X and Y are conditionally independent given the event [W \in S].
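
For discrete random variables this definition can be checked directly from the joint probability table: X and Y are conditionally independent given W exactly when \Pr(X = x, Y = y \mid W = w) = \Pr(X = x \mid W = w)\Pr(Y = y \mid W = w) for every (x, y, w) with \Pr(W = w) > 0. A minimal Python sketch of such a check (illustrative only, not part of the article; the table layout p[x, y, w] is an assumption):

    import numpy as np

    def conditionally_independent(p, tol=1e-12):
        # p[x, y, w] holds Pr(X = x, Y = y, W = w); axes 0, 1, 2 index X, Y, W.
        p = np.asarray(p, dtype=float)
        for w in range(p.shape[2]):
            pw = p[:, :, w].sum()                    # Pr(W = w)
            if pw <= tol:
                continue                             # nothing to check on a null event
            joint = p[:, :, w] / pw                  # Pr(X = x, Y = y | W = w)
            px = joint.sum(axis=1, keepdims=True)    # Pr(X = x | W = w)
            py = joint.sum(axis=0, keepdims=True)    # Pr(Y = y | W = w)
            if not np.allclose(joint, px * py, atol=1e-9):
                return False
        return True

    # A table of the form Pr(W) Pr(X | W) Pr(Y | W) is conditionally independent
    # by construction, even though X and Y are typically dependent marginally.
    pw   = np.array([0.5, 0.5])
    px_w = np.array([[0.9, 0.1], [0.2, 0.8]])        # row w, column x
    py_w = np.array([[0.7, 0.3], [0.1, 0.9]])        # row w, column y
    p = np.einsum('w,wx,wy->xyw', pw, px_w, py_w)
    print(conditionally_independent(p))              # True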

Conditional independence of more than two events, or of more than two random variables, is defined analogously.


Uses in Bayesian statistics

Let p be the proportion of voters who will vote "yes" in an upcoming referendum. In taking an opinion poll, one chooses n voters randomly from the population. For i = 1, ..., n, let Xi = 1 or 0 according to whether the ith chosen voter will or will not vote "yes".

In a frequentist approach to statistical inference one would not attribute any probability distribution to p (unless the probabilities could be somehow interpreted as relative frequencies of occurrence of some event or as proportions of some population) and one would say that X1, ..., Xn are independent random variables.

By contrast, in a Bayesian approach to statistical inference, one would assign a probability distribution to p regardless of the non-existence of any such "frequency" interpretation, and one would construe the probabilities as degrees of belief that p is in any interval to which a probability is assigned. In that model, the random variables X1, ..., Xn are not independent, but they are conditionally independent given the value of p. In particular, if a large number of the Xs are observed to be equal to 1, that would imply a high conditional probability, given that observation, that p is near 1, and thus a high conditional probability, given that observation, that the next X to be observed will be equal to 1.
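
A small simulation sketch of this point (hypothetical, not part of the article; the uniform prior on p and the choice of six voters per poll are assumptions made for illustration): conditionally on p the draws are independent coin flips, but marginally an observed run of 1s raises the predictive probability that the next X equals 1.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical setup: a uniform prior on p, six voters per poll, many polls.
    n_polls, n_voters = 200_000, 6
    p = rng.uniform(0.0, 1.0, size=n_polls)

    # Given p, the Xi are independent Bernoulli(p) draws.
    X = rng.random((n_polls, n_voters)) < p[:, None]

    # Marginally the Xi are dependent: seeing the first five voters say "yes"
    # raises the probability that the sixth does, well above its overall rate.
    first_five_yes = X[:, :5].all(axis=1)
    print("Pr(X6 = 1)                     ~", X[:, 5].mean())
    print("Pr(X6 = 1 | X1 = ... = X5 = 1) ~", X[first_five_yes, 5].mean())
    # With a uniform prior these are about 1/2 and 6/7 (Laplace's rule of succession).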

Rules of conditional independence

A set of rules governing statements of conditional independence has been derived from the basic definition.[1][2] If we write X \perp Y \mid Z to mean that X is conditionally independent of Y given Z, then the following rules hold:

Symmetry: X \perp Y \mid Z  \implies Y \perp X \mid Z

Decomposition: Y,W \perp X  \mid Z  \implies Y \perp X \mid Z and  W \perp X \mid Z

Weak Union:  X \perp Y,W \mid Z \implies X \perp Y \mid Z,W

Contraction:  X \perp W \mid Z, Y and  X \perp Y \mid Z \implies X \perp W,Y\mid Z

If the joint probability of every configuration of X, Y, Z and W is strictly greater than zero (a strictly positive distribution), then the following rule also holds.

Intersection:  X \perp Y \mid Z, W and  X \perp W \mid Z, Y \implies X \perp Y, W \mid Z
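
As a numerical sanity check of decomposition and weak union (an illustrative sketch, not from the cited references; the variable shapes and the construction Pr(Z) Pr(X | Z) Pr(Y, W | Z) are assumptions): build a joint table in which X \perp Y,W \mid Z holds by construction, then verify the derived statements directly.

    import numpy as np

    rng = np.random.default_rng(1)

    def normalize(a, axes):
        return a / a.sum(axis=axes, keepdims=True)

    def cond_indep(q, tol=1e-9):
        # Check axis 0 independent of axis 1 given axis 2, for a three-axis joint table q.
        for c in range(q.shape[2]):
            pc = q[:, :, c].sum()
            if pc <= 1e-15:
                continue
            joint = q[:, :, c] / pc
            m0 = joint.sum(axis=1, keepdims=True)
            m1 = joint.sum(axis=0, keepdims=True)
            if not np.allclose(joint, m0 * m1, atol=tol):
                return False
        return True

    # Joint table Pr(X, Y, W, Z) = Pr(Z) Pr(X | Z) Pr(Y, W | Z), so that
    # X is independent of (Y, W) given Z by construction.  X, Y, W binary, Z ternary.
    pz    = normalize(rng.random(3), 0)              # Pr(Z)
    px_z  = normalize(rng.random((2, 3)), 0)         # Pr(X | Z), columns sum to 1
    pyw_z = normalize(rng.random((2, 2, 3)), (0, 1)) # Pr(Y, W | Z)
    p = np.einsum('z,xz,ywz->xywz', pz, px_z, pyw_z) # axes: x, y, w, z

    # Decomposition: X independent of Y given Z (marginalize out W, condition on Z alone).
    print(cond_indep(p.sum(axis=2)))                 # True

    # Weak union: X independent of Y given (Z, W) (merge the W and Z axes into one condition).
    print(cond_indep(p.reshape(2, 2, 2 * 3)))        # True

The positivity assumption in the intersection rule is genuinely needed: in the standard degenerate counterexample where X = Y = W (with Z trivial), X \perp Y \mid W and X \perp W \mid Y both hold, since conditioning on either copy determines X, yet X is clearly not independent of Y,W.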

References

  1. Dawid, A. P. (1979). "Conditional Independence in Statistical Theory". Journal of the Royal Statistical Society, Series B, 41, 1–31.
  2. Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
