Statistical consistency

From Wikipedia, the free encyclopedia

A phylogenetic reconstruction algorithm is statistically consistent under an evolutionary model if, for every model tree, the probability that the method recovers the true tree from the sequences at the leaves approaches 1 as the sequence length approaches infinity.

More formally, a method Φ is statistically consistent under an evolutionary model M if

  • for all model-M trees (T, λ) with sequences S of length k at the leaves, where each edge e of T has a weight λ_e,
  • for all fixed f, g with 0 < f ≤ g, such that f ≤ λ_e ≤ g for all edges e,
  • and for all ε > 0,

there is a constant N that depends upon f, g, ε and the number of leaves of T, such that if k > N, then Pr[Φ(S) = T] > 1 − ε.
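
The definition can be illustrated, though not proved, by simulation. The following is a minimal sketch under assumptions that are not part of the definition above: M is taken to be a symmetric two-state model in which an edge of weight λ_e changes a site with probability (1 − exp(−2λ_e))/2, T is a single four-leaf tree with split ab|cd, and Φ is the four-point method applied to corrected distances. All names (simulate_leaves, four_point_method, recovery_rate) are hypothetical. The sketch estimates Pr[Φ(S) = T] by repeated sampling for several sequence lengths k.

```python
import math
import random

# Assumed quartet tree ab|cd: leaves a, b attach to internal node u,
# leaves c, d attach to internal node v, and edge "uv" joins u and v.
EDGES = ("a", "b", "c", "d", "uv")
SPLITS = (("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "d", "b", "c"))
TRUE_SPLIT = SPLITS[0]


def flip_probability(weight):
    """Per-site change probability on an edge (assumed two-state symmetric model)."""
    return 0.5 * (1.0 - math.exp(-2.0 * weight))


def simulate_leaves(k, weights, rng):
    """Draw k independent sites and return the four leaf sequences S."""
    p = {e: flip_probability(weights[e]) for e in EDGES}
    seqs = {leaf: [] for leaf in "abcd"}
    for _ in range(k):
        u = rng.randrange(2)                   # uniform state at node u
        v = u ^ (rng.random() < p["uv"])       # possibly flipped along edge uv
        for leaf, parent in (("a", u), ("b", u), ("c", v), ("d", v)):
            seqs[leaf].append(parent ^ (rng.random() < p[leaf]))
    return seqs


def corrected_distance(x, y):
    """Estimate the path weight between two leaves from their Hamming fraction."""
    h = sum(s != t for s, t in zip(x, y)) / len(x)
    return math.inf if h >= 0.5 else -0.5 * math.log(1.0 - 2.0 * h)


def four_point_method(seqs):
    """Return the split with the uniquely smallest distance sum, or None on a tie."""
    sums = [
        corrected_distance(seqs[w], seqs[x]) + corrected_distance(seqs[y], seqs[z])
        for w, x, y, z in SPLITS
    ]
    best = min(sums)
    return SPLITS[sums.index(best)] if sums.count(best) == 1 else None


def recovery_rate(k, weights, trials, rng):
    """Monte Carlo estimate of Pr[Phi(S) = T] for sequences of length k."""
    hits = sum(
        four_point_method(simulate_leaves(k, weights, rng)) == TRUE_SPLIT
        for _ in range(trials)
    )
    return hits / trials


rng = random.Random(0)
weights = {e: 0.2 for e in EDGES}  # every edge weight lies in [f, g] = [0.2, 0.2]
for k in (10, 100, 1000):
    print(k, recovery_rate(k, weights, trials=200, rng=rng))
```

Under these assumptions the estimated recovery rate should rise toward 1 as k grows. Ties are counted as failures so that the estimate is not biased toward the true tree. A simulation of this kind examines one tree and one choice of edge weights only, whereas the definition quantifies over all model trees with weights in [f, g].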

References

  1. J. Felsenstein. (1988). Phylogenies from molecular sequences: Inference and reliability. Annu. Rev. Genet. 22, pp. 521-565.