Collider (epidemiology)

In statistics, a variable is termed a collider when it is the outcome of two or more variables, which may or may not themselves be correlated (see Figure 1). The name "collider" reflects the fact that, in graphical models, the arrowheads from the variables that lead into the collider appear to "collide" on the node that is the collider.[1]

Figure 1: SEM model of a Collider

A collider on a path blocks[2][3][4] the association between the variables that influence it, provided the collider itself is not conditioned on.

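Thus, in the example of Figure 1, "controlling" for the collider (by regression adjustment, stratification, or sample selection) biases the estimated correlation between the predictors X1 and X2, and can induce an association even when none exists (Berkson's paradox). Left uncontrolled, the collider "blocks" the association between its predictors; conditioning on it opens that path.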
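The effect can be illustrated with a small simulation. The following is a minimal sketch rather than a definitive analysis: it assumes X1 and X2 are independent standard normal variables and that the collider is their sum plus independent noise. The function names, the selection threshold of 1.0, and the sample size are all hypothetical choices made for illustration. Conditioning is done here by sample selection, keeping only cases with a high collider value.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_collider(n=20000, threshold=1.0, seed=0):
    """Return (correlation in full sample, correlation after selecting on the collider)."""
    rng = random.Random(seed)
    # Assumption: X1 and X2 are generated independently of each other.
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumption: the collider is caused by both X1 and X2, plus noise.
    collider = [a + b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

    r_all = pearson(x1, x2)

    # "Control" for the collider by keeping only cases above the threshold.
    kept = [(a, b) for a, b, c in zip(x1, x2, collider) if c > threshold]
    r_selected = pearson([a for a, _ in kept], [b for _, b in kept])
    return r_all, r_selected


r_all, r_selected = simulate_collider()
print(f"correlation of X1 and X2, full sample:       {r_all:+.3f}")
print(f"correlation of X1 and X2, collider > 1.0:    {r_selected:+.3f}")
```

Under these assumptions the full-sample correlation should be close to zero, while the correlation within the selected subsample should be clearly negative, even though X1 and X2 were generated independently.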

This is important in regression analyses that attempt to test causal theories: researchers who "control" for what they consider to be a background variable, such as education, may unwittingly induce false correlations among the variables of interest if that variable is in fact a collider, and thus risk supporting theories for which there is in fact no support.
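The same problem can be sketched for regression adjustment. The example below is illustrative only: it assumes the same hypothetical data-generating model (independent X1 and X2, with a collider equal to their sum plus noise) and compares the slope of X1 in a regression of X2 on X1 alone with its slope when the collider is added as a covariate. The helper functions are hypothetical and use the closed-form ordinary least squares solution for one and two predictors.

```python
import random


def _centered_sum(us, vs):
    """Sum of products of deviations from the mean."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs))


def slope_unadjusted(y, x):
    """OLS slope of x in the regression y ~ 1 + x."""
    return _centered_sum(x, y) / _centered_sum(x, x)


def slope_adjusted(y, x, z):
    """OLS slope of x in the regression y ~ 1 + x + z."""
    sxx, szz, sxz = _centered_sum(x, x), _centered_sum(z, z), _centered_sum(x, z)
    sxy, szy = _centered_sum(x, y), _centered_sum(z, y)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)


rng = random.Random(42)
n = 20000
# Assumption: X1 has no causal effect on X2; they are independent.
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Assumption: the "background" variable is actually caused by both.
collider = [a + b + rng.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

b_unadjusted = slope_unadjusted(x2, x1)
b_adjusted = slope_adjusted(x2, x1, collider)
print(f"slope of X1, X2 ~ X1:             {b_unadjusted:+.3f}")
print(f"slope of X1, X2 ~ X1 + collider:  {b_adjusted:+.3f}")
```

In this particular model the unadjusted slope should be near zero, whereas the population value of the adjusted slope is -0.5, so adding the collider as a "control" manufactures an apparent negative effect of X1 on X2.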

References

  1. Hernan, Miguel A; Robins, James M (2010), Causal inference, Chapman & Hall/CRC monographs on statistics & applied probability, CRC, p. 70, ISBN 1-4200-7616-7
  2. Greenland, Sander; Pearl, Judea; Robins, James M (January 1999), "Causal Diagrams for Epidemiologic Research", Epidemiology 10 (1): 37–48, doi:10.1097/00001648-199901000-00008, ISSN 1044-3983, OCLC 484244020
  3. Pearl, Judea (1986), "Fusion, Propagation and Structuring in Belief Networks", Artificial Intelligence 29 (3): 241–288, doi:10.1016/0004-3702(86)90072-x
  4. Pearl, Judea (1988), Probabilistic reasoning in intelligent systems: networks of plausible inference, Morgan Kaufmann