Fixation index

From Wikipedia, the free encyclopedia

The fixation index (FST) is a measure of population differentiation based on genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. It is a special case of F-statistics, a concept developed in the 1920s by Sewall Wright.

This statistic compares the genetic variability within and between populations and is frequently used in the field of population genetics.

Several definitions of FST have been used, all measuring different but related quantities. A common definition given by Hudson et al. (1992) is:

 F_{ST} = \frac{ \Pi_{Between} - \Pi_{Within} } { \Pi_{Between} }

where ΠBetween and ΠWithin represent the average number of pairwise differences between two individuals sampled from different populations (ΠBetween) or from the same population (ΠWithin). The average pairwise difference within a population can be calculated as the sum of the pairwise differences divided by the number of pairs. Note that when using this definition of FST, the value ΠWithin should be computed separately for each population and then averaged. Otherwise, random sampling of pairs within populations puts most of the weight on the population with the largest sample size, because that population contributes the most pairs.
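The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from Hudson et al.: it assumes two populations, each given as a list of at least two aligned sequences of equal length, and the function names (pairwise_diff, mean_within, mean_between, hudson_fst) and the toy sequences are hypothetical.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Average pairwise difference over all pairs within one population."""
    pairs = list(combinations(pop, 2))  # needs at least two sequences
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Average pairwise difference over all pairs spanning two populations."""
    pairs = list(product(pop1, pop2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def hudson_fst(pop1, pop2):
    """FST = (Pi_Between - Pi_Within) / Pi_Between for two populations."""
    # Pi_Within is computed per population and then averaged, so the
    # larger sample does not dominate the within-population term.
    pi_within = (mean_within(pop1) + mean_within(pop2)) / 2
    pi_between = mean_between(pop1, pop2)
    return (pi_between - pi_within) / pi_between


# Toy data (made up for illustration): unequal sample sizes on purpose.
pop_a = ["AAAA", "AAAT", "AAAA"]
pop_b = ["TTAA", "TTAT"]
fst = hudson_fst(pop_a, pop_b)
```

For these toy sequences ΠWithin is (2/3 + 1)/2 = 5/6 and ΠBetween is 15/6 = 2.5, so FST is 2/3. The sketch does not handle the case ΠBetween = 0, where the ratio is undefined.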

FST in humans

The International HapMap Project estimated FST for three human populations using SNP data. A more complex formula for FST was used in order to account for differences in sample size:

 F_{ST} = 1 - \frac{ \displaystyle\sum_{j} {n_j \choose 2} \displaystyle\sum_{i} 2 \frac{ n_{ij} } {n_{ij} - 1} x_{ij} (1 - x_{ij}) / \displaystyle\sum_{j} {n_j \choose 2} } { \displaystyle\sum_{i} 2 \frac{ n_{i} } {n_{i} - 1} x_{i} (1 - x_{i}) }

In the above equation, xij is the estimated frequency (proportion) of the minor allele at SNP i in population j, nij is the number of genotyped chromosomes at that position, and nj is the number of chromosomes analysed in that population. The lack of the j subscript in the denominator indicates that the statistics ni and xi are calculated across the combined data sets.
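As a rough illustration of how the terms of this formula fit together, the following Python sketch evaluates it for per-SNP counts and frequencies. It is not the HapMap Consortium's code. The function name hapmap_fst and its argument layout are hypothetical, and the sketch assumes that "calculated across the combined data sets" means ni is the sum of the nij over populations and xi is the count-weighted mean of the xij.

```python
from math import comb  # requires Python 3.8+


def hapmap_fst(n_pop, n, x):
    """Evaluate the sample-size-corrected FST formula.

    n_pop[j] : chromosomes analysed in population j (n_j)
    n[i][j]  : genotyped chromosomes at SNP i in population j (n_ij)
    x[i][j]  : frequency of the tracked allele at SNP i in population j (x_ij)
    """
    def het(count, freq):
        # Corrected heterozygosity term: 2 * n/(n-1) * x * (1-x)
        return 2 * count / (count - 1) * freq * (1 - freq)

    # Numerator: per-population sums over SNPs, weighted by C(n_j, 2).
    weights = [comb(nj, 2) for nj in n_pop]
    within = sum(
        w * sum(het(n[i][j], x[i][j]) for i in range(len(n)))
        for j, w in enumerate(weights)
    ) / sum(weights)

    # Denominator: the same term computed on the pooled data (assumption:
    # n_i is the total count, x_i the count-weighted mean frequency).
    total = 0.0
    for counts, freqs in zip(n, x):
        n_i = sum(counts)
        x_i = sum(c * f for c, f in zip(counts, freqs)) / n_i
        total += het(n_i, x_i)

    return 1 - within / total


# Toy check: one SNP fixed for different alleles in two populations.
fixed = hapmap_fst([10, 10], [[10, 10]], [[1.0, 0.0]])
# Toy check: one SNP with identical frequencies in both populations.
same = hapmap_fst([10, 10], [[10, 10]], [[0.5, 0.5]])
```

With a fixed difference the within-population term is zero, so the estimate is 1. With identical frequencies the estimate comes out slightly negative, which is expected behaviour for a sample-size-corrected estimator rather than a bug in the sketch.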

Across the autosomes, FST was estimated to be 0.12.

References

  • Estimation of levels of gene flow from DNA sequence data, R. R. Hudson, M. Slatkin and W. P. Maddison, Genetics 1992
  • Evolution and the Genetics of Populations Volume 2: the Theory of Gene Frequencies, pg 294–295, S. Wright, Univ. of Chicago Press, Chicago, 1969
  • A haplotype map of the human genome, The International HapMap Consortium, Nature 2005