Conserved sequence

From Wikipedia, the free encyclopedia

Residues conserved among various G protein-coupled receptors are highlighted in green.

In biology, conserved sequences are similar or identical sequences that occur within nucleic acid sequences (such as RNA and DNA sequences), protein sequences, protein structures or polymeric carbohydrates across species (orthologous sequences) or within different molecules produced by the same organism (paralogous sequences). In the case of cross-species conservation, this indicates that a particular sequence may have been maintained by evolution despite speciation. The further back in the phylogenetic tree a particular conserved sequence occurs, the more highly conserved it is said to be. Since sequence information is normally transmitted from parents to progeny by genes, a conserved sequence implies that there is a conserved gene.

It is widely believed that a mutation in a "highly conserved" region leads to a non-viable life form, or to a form that is eliminated through natural selection.

Conserved nucleic acid sequences

Highly conserved DNA sequences are thought to have functional value, although the role of many highly conserved non-coding DNA sequences is not understood. Ultra-conserved elements or regions (UCEs or UCRs) that share 100% identity among human, mouse and rat were first described by Bejerano and colleagues in 2004.^[1] One study that deleted four highly conserved non-coding DNA sequences in mice yielded viable mice with no significant phenotypic differences; the authors described their findings as "unexpected".^[2] Many regions of the DNA, including highly conserved DNA sequences, consist of repeated sequence (DNA) elements. One possible explanation of this null result is that removing only one copy, or a subset of the copies, of a repeated sequence could preserve the phenotype, on the assumption that one copy is sufficient and the repetitions are superfluous to essential life processes; the paper did not specify whether the deleted sequences were repeated sequences. Although the biological function of most conserved sequences is still unknown, the expression of some transcripts derived from conserved sequences has been shown to be deregulated in human cancer tissues.^[3]
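The idea of segments that share 100% identity across several genomes can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length, and it is not the pipeline used in the cited studies. The function name `identical_runs`, the `min_len` threshold and the toy sequences are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def identical_runs(aligned: List[str], min_len: int, gap: str = "-") -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same non-gap character, keeping only runs of at
    least min_len columns."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None  # first column of the run currently being extended
    # Iterate one column past the end so that a run touching the last
    # column is closed by the same code path as any other run.
    for col in range(width + 1):
        same = (
            col < width
            and aligned[0][col] != gap
            and all(row[col] == aligned[0][col] for row in aligned)
        )
        if same and start is None:
            start = col
        elif not same and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    return runs


# Hypothetical toy alignment (not real genomic data).
toy_alignment = [
    "ACGTACGTTTGACA",
    "ACGTACGTATGACA",
    "ACGAACGTTTGACA",
]
runs = identical_runs(toy_alignment, min_len=4)
print(runs)
```

With the toy data, the three-column run at the start is discarded because it is shorter than `min_len`, while the two longer runs are reported. Real analyses work on whole-genome alignments and use much larger length thresholds.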

The TATA promoter sequence is an example of a highly conserved DNA sequence, being found in most eukaryotes.

Conserved protein sequences and structures

Highly conserved proteins are often required for basic cellular function, stability or reproduction. Conservation of protein sequences is indicated by the presence of identical amino acid residues at analogous parts of proteins. Conservation of protein structures is indicated by the presence of functionally equivalent, though not necessarily identical, amino acid residues and structures between analogous parts of proteins.

An example is the amino acid sequence alignment between two human zinc finger proteins, with GenBank accession numbers AAB24882 and AAB24881, carried out using the ClustalW sequence alignment program. Conserved amino acid sequences are marked by strings of asterisks (*) on the third line of the sequence alignment. The alignment shows that these two proteins contain a number of conserved amino acid sequences (represented by identical letters aligned between the two sequences).
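The asterisk convention can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the two fragments are hypothetical toy strings, not the AAB24882 and AAB24881 sequences, and the helper `conservation_line` is an invented name that marks identical columns only; it does not reproduce the other symbols that an alignment program may print.

```python
def conservation_line(seq_a: str, seq_b: str, gap: str = "-") -> str:
    """Return a marker line with '*' under every column where the two
    aligned sequences carry the same residue, and a space elsewhere."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "*" if a == b and a != gap else " "
        for a, b in zip(seq_a, seq_b)
    )


# Hypothetical, already-aligned toy fragments ('-' marks a gap).
toy_a = "CKECGKAF-SRS"
toy_b = "CEECGKAFNQSS"

marks = conservation_line(toy_a, toy_b)
print(toy_a)
print(toy_b)
print(marks)  # third line: asterisks under identical residues
```

Counting the asterisks and dividing by the alignment length gives a simple percent-identity figure for the pair.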

Conserved polymeric carbohydrate sequences

The monosaccharide sequence of the glycosaminoglycan heparin is conserved across a wide range of species.

Biological role of sequence conservation

Sequence similarities serve as evidence of structural and functional conservation, as well as of evolutionary relationships between the sequences. Consequently, comparative analysis is the primary means by which functional elements are identified.
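One simple way to make such a comparison quantitative is to score each column of a multiple alignment by its Shannon entropy, where 0 means that every sequence has the same character and larger values mean more variation. The sketch below is a minimal illustration of that general idea, not a method taken from the sources cited here. The function names and the toy alignment are hypothetical, and gaps are treated as an ordinary symbol for simplicity.

```python
import math
from collections import Counter
from typing import List


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of the characters in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def conservation_scores(aligned: List[str]) -> List[float]:
    """Return one entropy value per column; lower means more conserved."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_entropy("".join(row[i] for row in aligned))
        for i in range(width)
    ]


# Hypothetical toy alignment of four sequences (not real data).
toy_alignment = [
    "AAC",
    "AAG",
    "ACT",
    "ACA",
]
scores = conservation_scores(toy_alignment)
print(scores)
```

In the toy alignment the first column is fully conserved, the second is split evenly between two characters, and the third differs in every sequence, so the scores increase from left to right.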

Among the most highly conserved sequences are the active sites of enzymes and the binding sites of protein receptors.

Conserved non-coding sequences often harbor cis-regulatory elements, whose function constrains their evolution. Some deletions of highly conserved sequences in humans (hCONDELs) and other organisms have been suggested to be a potential cause of the anatomical and behavioral differences between humans and other mammals.^[4]^[5]

References

1. Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D (28 May 2004). "Ultraconserved elements in the human genome". Science 304 (5675): 1321–5. doi:10.1126/science.1098119. PMID 15131266.
2. Ahituv N, Zhu Y, Visel A, et al. (2007). "Deletion of ultraconserved elements yields viable mice". PLoS Biol. 5 (9): e234. doi:10.1371/journal.pbio.0050234. PMC 1964772. PMID 17803355.
3. Calin GA, Liu CG, Ferracin M, Hyslop T, Spizzo R, Sevignani C, Fabbri M, Cimmino A, Lee EJ, Wojcik SE, Shimizu M, Tili E, Rossi S, Taccioli C, Pichiorri F, Liu X, Zupo S, Herlea V, Gramantieri L, Lanza G, Alder H, Rassenti L, Volinia S, Schmittgen TD, Kipps TJ, Negrini M, Croce CM (September 2007). "Ultraconserved regions encoding ncRNAs are altered in human leukemias and carcinomas". Cancer Cell 12 (3): 215–29. doi:10.1016/j.ccr.2007.07.027. PMID 17785203.
4. McLean CY, et al. (10 March 2011). "Human-specific loss of regulatory DNA and the evolution of human-specific traits". Nature 471 (7337): 216–219. doi:10.1038/nature09774. PMC 3071156. PMID 21390129.
5. Gross L (September 2007). "Are "Ultraconserved" Genetic Elements Really Indispensable?". PLOS Biology 5 (9): e253. doi:10.1371/journal.pbio.0050253. PMC 1964769. PMID 20076686.