GC-content

From Wikipedia, the free encyclopedia

The chemical structure of DNA showing both GC-pairs and AT-pairs

In genetics, guanine-cytosine content (GC-content) is a characteristic of the genome of any given organism, or of any other piece of DNA or RNA. Usually expressed as a percentage, it is the proportion of GC base pairs in the DNA molecule or genome sequence being investigated. G stands for guanine and C stands for cytosine. The remaining fraction of any DNA molecule consists of the bases A (adenine) and T (thymine), so calculating the GC-content also gives the AT-content (e.g. 58% GC-content = 42% AT-content). GC-pairs in DNA are connected by three hydrogen bonds, whereas AT-pairs are connected by two. This makes the GC-pair stronger and more resistant to denaturation at high temperatures, and thus GC-content tends to be greater in hyperthermophiles.
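As a minimal sketch of the definition above, the following Python computes GC-content and the complementary AT-content for a sequence given as a string. The function names gc_content and at_content are illustrative choices, and skipping any character other than A, C, G, T or U (for example N or gap symbols) is an assumption of this sketch rather than a fixed convention.

```python
def gc_content(seq: str) -> float:
    """Return the GC-content of a nucleotide sequence as a percentage."""
    # Assumption: only unambiguous bases are counted; N, gaps etc. are skipped.
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, T or U bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def at_content(seq: str) -> float:
    """Return the AT-content (AU-content for RNA) as a percentage."""
    # Whatever is not G or C must be A or T (or U), so the two sum to 100%.
    return 100.0 - gc_content(seq)


if __name__ == "__main__":
    example = "ATGCGCGCTA"  # 6 of the 10 bases are G or C
    print(gc_content(example))  # 60.0
    print(at_content(example))  # 40.0
```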

The GC-content is sometimes used to classify organisms in taxonomy. For example, the Actinobacteria are characterised as "high GC-content bacteria". In Streptomyces coelicolor it is 72%. The GC-content of yeast (Saccharomyces cerevisiae) is 38%, and that of another common model organism, thale cress (Arabidopsis thaliana), is 36%. Because of the nature of the genetic code, it is virtually impossible for an organism to have a genome with a GC-content approaching either 0% or 100%. A species with an extremely low GC-content is Plasmodium falciparum (about 20%), and it is common to refer to such examples as being AT-rich instead of GC-poor.

Within a long region of genomic sequence, genes are often characterised by having a higher GC-content than the background GC-content of the entire genome. In particular, the exons of a gene are characteristically GC-rich, whilst the introns are usually AT-rich (GC-poor). More generally, many studies have looked for (and found) patterns of GC-content variation throughout a genome sequence, encompassing both genes and the often long intergenic regions that separate them. The function and significance of such variation are unclear.
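One common way to look for this kind of variation along a sequence is a sliding-window scan. The sketch below is illustrative only: the function name windowed_gc and the default window and step sizes are assumptions, not values taken from any particular study.

```python
def windowed_gc(seq: str, window: int = 100, step: int = 50):
    """Return (start, GC%) pairs for windows sliding along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    results = []
    # Only full-length windows are reported; a trailing partial window is dropped.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        # Simplification: every character in the window counts towards the total.
        results.append((start, 100.0 * gc / window))
    return results


if __name__ == "__main__":
    # Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
    toy = "GC" * 10 + "AT" * 10
    for start, pct in windowed_gc(toy, window=20, step=20):
        print(start, pct)
```

Regions whose windowed value sits well above the genome-wide figure would be candidates for GC-rich features such as exons, although a real analysis would need a statistical threshold rather than inspection by eye.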

In PCR experiments, the GC-content of a primer is used to estimate its annealing temperature to the template DNA. A higher GC-content indicates a higher melting temperature.
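A widely used rule of thumb for short primers, often called the Wallace rule, illustrates the relationship: it assigns roughly 2 °C to each A or T and 4 °C to each G or C. The rule is not taken from this article and is only a rough approximation for oligonucleotides of about 14 to 20 bases. The function name wallace_tm is an illustrative choice.

```python
def wallace_tm(primer: str) -> int:
    """Estimate a primer's melting temperature (deg C) with the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is a rough approximation for short
    oligonucleotides only; it ignores salt and primer concentration.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer may contain only A, C, G and T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Two primers of equal length: the GC-rich one gets the higher estimate.
    print(wallace_tm("ATATATATAT"))  # 20
    print(wallace_tm("GCGCGCGCGC"))  # 40
```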

The GC-content can be measured by several means, but one of the simplest is to measure the melting temperature of the DNA double helix with a spectrophotometer. The absorbance of DNA at a wavelength of 260 nm increases fairly sharply when the double-stranded DNA is heated enough to separate into two single strands. Alternatively, if the DNA or RNA molecule under investigation has been sequenced, the GC-content can be calculated accurately by simple arithmetic.
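As a rough sketch of the spectrophotometric approach, the melting temperature can be read off a melting curve as the temperature at which the 260 nm absorbance is halfway between its lowest and highest values. The code below assumes a clean, rising curve sampled at increasing temperatures. The function name and the sample readings are hypothetical, and converting the result into a GC percentage would need an empirical calibration that is not covered here.

```python
def melting_temperature(temps, absorbances):
    """Estimate Tm as the midpoint of the absorbance rise in a melting curve."""
    if len(temps) != len(absorbances) or len(temps) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    low, high = min(absorbances), max(absorbances)
    if high == low:
        raise ValueError("absorbance does not change; no melting transition")
    half = (low + high) / 2.0
    points = list(zip(temps, absorbances))
    # Find the first pair of neighbouring readings that bracket the midpoint,
    # then interpolate linearly between them.
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 <= half <= a1 and a1 != a0:
            return t0 + (half - a0) * (t1 - t0) / (a1 - a0)
    raise ValueError("midpoint not bracketed by a rising pair of readings")


if __name__ == "__main__":
    # Hypothetical readings: temperature in deg C, absorbance at 260 nm.
    temps = [60, 70, 80, 90, 100]
    a260 = [1.00, 1.05, 1.15, 1.35, 1.40]
    print(melting_temperature(temps, a260))  # about 82.5
```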


External Links

Table with GC-content of all sequenced prokaryotes
