Conserved sequence
In biology, conserved sequences are similar or identical sequences that occur within nucleic acid sequences (such as RNA and DNA sequences), protein sequences, protein structures or polymeric carbohydrates, either across species (orthologous sequences) or within different molecules produced by the same organism (paralogous sequences). Cross-species conservation indicates that a particular sequence may have been maintained by evolution despite speciation. The further back up the phylogenetic tree a particular conserved sequence can be traced, the more highly conserved it is said to be. Since sequence information is normally transmitted from parents to progeny by genes, a conserved sequence implies that there is a conserved gene.
It is widely believed that mutation in a "highly conserved" region leads to a non-viable life form, or a form that is eliminated through natural selection.
Conserved nucleic acid sequences
Highly conserved DNA sequences are thought to have functional value, although the role of many highly conserved non-coding DNA sequences is not understood. Ultra-conserved elements or sequences (UCEs or UCRs) that share 100% identity among human, mouse and rat were first described by Bejerano and colleagues in 2004.[1] A 2007 study that eliminated four highly conserved non-coding DNA sequences in mice yielded viable mice with no significant phenotypic differences; the authors described their findings as "unexpected".[2] Many regions of DNA, including highly conserved DNA sequences, consist of repeated sequence elements. One possible explanation of this null result is that removing only one copy, or a subset, of a repeated sequence could preserve the phenotype, on the assumption that one copy is sufficient and the repetitions are superfluous to essential life processes; the paper did not specify whether the eliminated sequences were repeated sequences. Although the biological function of most conserved sequences is still unknown, transcripts derived from a few conserved sequences have been shown to be deregulated in human cancer tissues.[3]
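The "100% identity" criterion for ultra-conserved elements can be illustrated with a minimal Python sketch. It assumes the sequences have already been aligned to equal length with `-` as the gap character. The function name `ultraconserved_runs` and the toy sequences are hypothetical. The default `min_len` of 200 reflects the length threshold used in the Bejerano study, which the text above does not state, so treat it as an adjustable parameter.

```python
def ultraconserved_runs(aligned, min_len=200):
    """Return (start, end) column ranges in which all aligned sequences agree.

    `aligned` is a list of equal-length strings taken from a multiple
    alignment. A column containing a gap ('-') never counts as identical.
    `end` is exclusive, as in Python slicing.
    """
    if not aligned or len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    runs, start = [], None
    for i, column in enumerate(zip(*aligned)):
        identical = column[0] != "-" and len(set(column)) == 1
        if identical and start is None:
            start = i                      # a new identical run begins
        elif not identical and start is not None:
            if i - start >= min_len:       # keep only sufficiently long runs
                runs.append((start, i))
            start = None
    length = len(aligned[0])
    if start is not None and length - start >= min_len:
        runs.append((start, length))       # run extends to the alignment end
    return runs


# Toy data only (not real genomic sequence), with a short threshold.
human = "ACGTACGTTT"
mouse = "ACGTACGATT"
rat   = "ACGTACG-TT"
print(ultraconserved_runs([human, mouse, rat], min_len=5))  # [(0, 7)]
```

Real analyses work from whole-genome alignments rather than short strings, so this only shows the identity test itself.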
The TATA promoter sequence is an example of a highly conserved DNA sequence, being found in most eukaryotes.
Conserved protein sequences and structures
Highly conserved proteins are often required for basic cellular function, stability or reproduction. Conservation of protein sequences is indicated by the presence of identical amino acid residues at analogous parts of proteins. Conservation of protein structures is indicated by the presence of functionally equivalent, though not necessarily identical, amino acid residues and structures between analogous parts of proteins.
Shown below is an amino acid sequence alignment between two human zinc finger proteins, with GenBank accession numbers AAB24882 and AAB24881. The alignment was carried out using the ClustalW sequence alignment program. Conserved amino acid sequences are marked by strings of asterisks (*) on the third line of the sequence alignment. As the alignment shows, these two proteins contain a number of conserved amino acid sequences (represented by identical letters aligned between the two sequences).
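The way identical residues are flagged on a third line can be sketched in a few lines of Python. This is an illustration only: the function name `conservation_line` is hypothetical, the two short sequences are invented zinc-finger-like fragments rather than the AAB24882 and AAB24881 sequences, and only identical residues are marked (ClustalW output also uses other symbols for similar residues, which are not reproduced here).

```python
def conservation_line(seq_a, seq_b):
    """Build a marker line for a pairwise alignment.

    Puts '*' under each column where both sequences carry the same
    residue and a space elsewhere. Columns with a gap ('-') are never marked.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "*" if a == b and a != "-" else " "
        for a, b in zip(seq_a, seq_b)
    )


# Invented fragments, already aligned; '-' marks a gap.
seq_a = "CPECGKSFSQ-KSNLQKH"
seq_b = "CEECGKAFSRSKSHLTKH"
print(seq_a)
print(seq_b)
print(conservation_line(seq_a, seq_b))
```

Runs of consecutive asterisks in the printed third line correspond to conserved stretches shared by the two sequences.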
Conserved polymeric carbohydrate sequences
The monosaccharide sequence of the glycosaminoglycan heparin is conserved across a wide range of species.
Biological role of sequence conservation
Sequence similarities serve as evidence for structural and functional conservation, as well as of evolutionary relationships between the sequences. Consequently, comparative analysis is the primary means by which functional elements are identified.
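As a rough illustration of comparative analysis, the sketch below scans a pairwise alignment with a sliding window and reports the fraction of identical positions in each window; windows with high identity are candidates for conserved, possibly functional, regions. The function name `windowed_identity`, the window size, and the toy sequences are assumptions made for this example, and real comparative genomics tools use more elaborate scoring.

```python
def windowed_identity(seq_a, seq_b, window=10, step=1):
    """Yield (start, fraction_identical) for each window of an alignment.

    Both inputs must be aligned strings of equal length; gap columns
    ('-') count as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    matches = [a == b and a != "-" for a, b in zip(seq_a, seq_b)]
    for start in range(0, len(matches) - window + 1, step):
        yield start, sum(matches[start:start + window]) / window


# Toy alignment: the first half is identical, the second half differs.
for start, identity in windowed_identity("AAAAACCCCC", "AAAAAGGGGG",
                                         window=5, step=5):
    print(start, identity)
```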
Among the most highly conserved sequences are the active sites of enzymes and the binding sites of protein receptors.
Conserved non-coding sequences often harbor cis-regulatory elements which constrain evolution. Some deletions of highly conserved sequences in humans (hCONDELs) and other organisms have been suggested to be a potential cause of the anatomical and behavioral differences between humans and other mammals.[4][5]
See also
- Ultra-conserved element
- Sequence alignment
- Sequence alignment software
- ClustalW
- UCbase
References
- Bejerano, G; Pheasant, M, Makunin, I, Stephen, S, Kent, WJ, Mattick, JS, Haussler, D (2004-05-28). "Ultraconserved elements in the human genome". Science 304 (5675): 1321–5. doi:10.1126/science.1098119. PMID 15131266.
- Ahituv N, Zhu Y, Visel A, et al. (2007). "Deletion of ultraconserved elements yields viable mice". PLoS Biol. 5 (9): e234. doi:10.1371/journal.pbio.0050234. PMC 1964772. PMID 17803355.
- Calin, GA; Liu, CG, Ferracin, M, Hyslop, T, Spizzo, R, Sevignani, C, Fabbri, M, Cimmino, A, Lee, EJ, Wojcik, SE, Shimizu, M, Tili, E, Rossi, S, Taccioli, C, Pichiorri, F, Liu, X, Zupo, S, Herlea, V, Gramantieri, L, Lanza, G, Alder, H, Rassenti, L, Volinia, S, Schmittgen, TD, Kipps, TJ, Negrini, M, Croce, CM (September 2007). "Ultraconserved regions encoding ncRNAs are altered in human leukemias and carcinomas". Cancer Cell 12 (3): 215–29. doi:10.1016/j.ccr.2007.07.027. PMID 17785203.
- McLean, Cory Y.; et al. (10 March 2011). "Human-specific loss of regulatory DNA and the evolution of human-specific traits". Nature 471 (7337): 216–219. doi:10.1038/nature09774. PMC 3071156. PMID 21390129.
- Gross, Liza (September 2007). "Are "Ultraconserved" Genetic Elements Really Indispensable?". PLOS Biology 5 (9): e253. doi:10.1371/journal.pbio.0050253. PMC 1964769. PMID 20076686.
Further reading
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (December 1997). "The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Res. 25 (24): 4876–82. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791.