hCONDELs

hCONDELs are regions deleted from the human genome that contain sequences highly conserved in closely related species. Almost all of these deletions fall within non-coding regions. They represent a new class of regulatory sequences and may have played an important role in the development of the specific traits and behaviors that distinguish closely related organisms from each other.[1][2]

Nomenclature

The group of CONDELs belonging to a specific organism is named by prefixing "CONDEL" with the first letter of the organism's name. For instance, hCONDELs are the CONDELs found in humans, whereas mCONDELs and cCONDELs are mouse and chimpanzee CONDELs respectively.

Identification of CONDELs

The term hCONDEL was first used in the 2011 Nature article by McLean et al.[3] in a whole-genome comparison analysis.[4] The analysis first identified a subset of 37,251 human deletions (hDELs)[5] through pairwise comparisons of chimpanzee and macaque genomes.[6] Chimpanzee sequences highly conserved in other species were then identified by pairwise alignment of chimpanzee with macaque, mouse and chicken sequences using BLASTZ,[7] followed by multiple alignment of the pairwise alignments using MULTIZ.[8] The highly conserved chimpanzee sequences were searched against the human genome using BLAT to identify conserved regions not present in humans. This identified 583 deleted regions, which were then referred to as hCONDELs. Of these, 510 were validated computationally, and 39 of those were also validated by polymerase chain reaction (PCR).
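The final filtering step described above can be illustrated with a toy interval check. This is only a sketch under stated assumptions, not the pipeline used by McLean et al.: it does not run BLASTZ, MULTIZ or BLAT, the function name `conserved_in_deletions` and the coordinates are hypothetical, and it assumes that both the conserved elements and the human deletions are already expressed as half-open intervals in chimpanzee coordinates.

```python
from typing import List, Tuple

# (chromosome, start, end), half-open, in chimpanzee coordinates (assumed)
Interval = Tuple[str, int, int]


def conserved_in_deletions(
    conserved: List[Interval], deletions: List[Interval]
) -> List[Tuple[Interval, Interval]]:
    """Pair each conserved element with any human deletion that fully contains it."""
    hits = []
    for c_chrom, c_start, c_end in conserved:
        for d_chrom, d_start, d_end in deletions:
            same_chrom = c_chrom == d_chrom
            contained = d_start <= c_start and c_end <= d_end
            if same_chrom and contained:
                hits.append(((c_chrom, c_start, c_end), (d_chrom, d_start, d_end)))
    return hits


# Hypothetical toy data, not real genomic coordinates.
example_conserved = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
example_deletions = [("chr1", 50, 300), ("chr2", 150, 400)]

for element, deletion in conserved_in_deletions(example_conserved, example_deletions):
    print(element, "lies inside deleted region", deletion)
```

The nested loop is quadratic and is kept only for readability; a genome-scale comparison would normally use sorted intervals or an interval tree.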

Characteristics

hCONDELs cover approximately 0.14% of the chimpanzee genome. The genome-wide comparison method identified 583 predicted hCONDELs, of which 510 were subsequently validated. The remaining predictions are likely false positives. Of the hCONDELs confirmed through PCR, 88 percent were also shown to have been lost from the draft Neanderthal genome.[9] On average, an hCONDEL removes about 95 base pairs (bp) of highly conserved sequence from the human genome, while the median size of the 510 validated hCONDELs is about 2,804 bp, so the deletions are typically much larger than the conserved sequence they remove. Another noticeable characteristic of hCONDELs (and of other identified groups of CONDELs, such as those from mouse and chimpanzee) is that they tend to be skewed towards GC-poor regions.[10] Simulations show that hCONDELs are enriched near genes[11] involved in hormone receptor signaling and neural function, and near genes encoding fibronectin type III or CD80-like immunoglobulin C2-set domains.
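As a quick check on the counts quoted above, the following minimal sketch computes how many predicted regions were not validated and what fraction of the predictions survived validation. The variable and function names are illustrative only; the two input numbers are taken from the text.

```python
PREDICTED_HCONDELS = 583   # regions predicted by the genome-wide comparison
VALIDATED_HCONDELS = 510   # regions remaining after validation


def validation_summary(predicted: int, validated: int) -> tuple:
    """Return (number not validated, fraction validated)."""
    not_validated = predicted - validated
    fraction_validated = validated / predicted
    return not_validated, fraction_validated


not_validated, fraction = validation_summary(PREDICTED_HCONDELS, VALIDATED_HCONDELS)
print(f"{not_validated} predictions were not validated")
print(f"{fraction:.1%} of predictions were validated")
```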

Impact in Humans

Sialic acid loss

Of the 510 identified hCONDELs, only one has been shown to affect a protein-coding region: it removes a 92 bp sequence from a coding region[12] of the human genome. This deletion results in a frameshift mutation in the CMAH gene, which codes for the cytidine monophosphate-N-acetylneuraminic acid hydroxylase-like protein, an enzyme involved in the production of N-glycolylneuraminic acid, a type of sialic acid. Sialic acids are known to play a crucial part in cell signaling pathways and cell interaction processes. The loss of this gene is evident in the undetectable levels of this sialic acid in humans, although it is abundant in mouse, pig, chimpanzee and other mammalian tissues, and it may provide more insight into the history of human evolution.[13]
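The reason a 92 bp deletion causes a frameshift can be shown with simple arithmetic: codons are read three bases at a time, so removing a number of coding bases that is not a multiple of three shifts the reading frame of everything downstream. The sketch below assumes that all deleted bases fall within the coding sequence; the function name `causes_frameshift` is hypothetical.

```python
CODON_LENGTH = 3  # bases per codon


def causes_frameshift(deleted_coding_bases: int) -> bool:
    """True if deleting this many coding bases shifts the reading frame."""
    return deleted_coding_bases % CODON_LENGTH != 0


# 92 = 3 * 30 + 2, so two bases are left over and the frame shifts.
print(causes_frameshift(92))
# A 90 bp deletion would remove exactly 30 codons and keep the frame.
print(causes_frameshift(90))
```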

The mechanisms and timing of hCONDEL formation are not entirely understood. However, given that conserved non-coding sequences play a major developmental role through the regulation of genes,[1] their loss in hCONDELs is expected to have developmental consequences that can be observed in human-specific traits. In situ hybridization experiments by McLean et al.,[3] using mouse constructs fused to a basal promoter driving LacZ expression,[14] were carried out for hCONDELs near the androgen receptor (AR) locus and the growth arrest and DNA-damage-inducible protein GADD45 gamma (GADD45G) locus. The results suggest a role for deletions that affect regulatory sequences in humans.

Loss of whiskers and penile spines

An hCONDEL located near the androgen receptor gene locus may be responsible for the loss of whiskers and penile spines in humans compared with their close relatives, including chimpanzees. This 60.7 kb hCONDEL, located near the AR locus, has been found to remove a 5 kb sequence that acts as an enhancer[15] for the AR locus. Experiments using the mouse construct with LacZ expression showed that this hCONDEL region (the AR enhancer) drives expression in the mesenchyme of vibrissae follicles and in the mesodermal cells of penile organs.

Expansion of brain size

Many hCONDELs are located around genes expressed during cortical neurogenesis. A 3,181 bp hCONDEL located near the GADD45G gene removes a forebrain-specific p300 enhancer binding site. The removal of this region, which is known to function as a suppressor, specifically increases proliferation in the subventricular zone (SVZ) of the septum. The loss of this SVZ enhancer region in an hCONDEL may provide further insight into the DNA sequence changes that contributed to the evolution of the human brain[16] and of humans more generally.

References

  1. ^ a b Woolfe, A.; Goodson, M.; Goode, D. K.; Snell, P.; McEwen, G. K.; Vavouri, T.; Smith, S. F.; North, P. et al. (2005). "Highly Conserved Non-Coding Sequences Are Associated with Vertebrate Development". PLoS Biology 3 (1): e7. doi:10.1371/journal.pbio.0030007. PMC 526512. PMID 15630479. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=526512.
  2. ^ Dermitzakis, E. T.; Reymond, A.; Scamuffa, N.; Ucla, C.; Kirkness, E.; Rossier, C.; Antonarakis, S. E. (2003). "Evolutionary Discrimination of Mammalian Conserved Non-Genic Sequences (CNGs)". Science 302 (5647): 1033–1035. doi:10.1126/science.1087047. PMID 14526086.
  3. ^ a b McLean, C. Y.; Reno, P. L.; Pollen, A. A.; Bassan, A. I.; Capellini, T. D.; Guenther, C.; Indjeian, V. B.; Lim, X. et al. (2011). "Human-specific loss of regulatory DNA and the evolution of human-specific traits". Nature 471 (7337): 216–219. doi:10.1038/nature09774. PMC 3071156. PMID 21390129. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=3071156.
  4. ^ Chen, R.; Bouck, J. B.; Weinstock, G. M.; Gibbs, R. A. (2001). "Comparing Vertebrate Whole-Genome Shotgun Reads to the Human Genome". Genome Research 11 (11): 1807–1816. doi:10.1101/gr.203601. PMC 311156. PMID 11691844. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=311156.
  5. ^ Harris, R. A.; Rogers, J.; Milosavljevic, A. (2007). "Human-Specific Changes of Genome Structure Detected by Genomic Triangulation". Science 316 (5822): 235–237. doi:10.1126/science.1139477. PMID 17431168.
  6. ^ Gibbs, R. A.; Rogers, J.; Katze, M. G.; Bumgarner, R.; Weinstock, G. M.; Mardis, E. R.; Remington, K. A.; Strausberg, R. L. et al. (2007). "Evolutionary and Biomedical Insights from the Rhesus Macaque Genome". Science 316 (5822): 222–234. doi:10.1126/science.1139247. PMID 17431167.
  7. ^ Schwartz, S.; Kent, W. J.; Smit, A.; Zhang, Z.; Baertsch, R.; Hardison, R. C.; Haussler, D.; Miller, W. (2003). "Human–Mouse Alignments with BLASTZ". Genome Research 13 (1): 103–107. doi:10.1101/gr.809403. PMC 430961. PMID 12529312. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=430961.
  8. ^ Blanchette, M.; Kent, W. J.; Riemer, C.; Elnitski, L.; Smit, A. F.; Roskin, K. M.; Baertsch, R.; Rosenbloom, K. et al. (2004). "Aligning Multiple Genomic Sequences with the Threaded Blockset Aligner". Genome Research 14 (4): 708–715. doi:10.1101/gr.1933104. PMC 383317. PMID 15060014. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=383317.
  9. ^ Green, R. E. et al. (2010). "A Draft Sequence of the Neandertal Genome". Science 328: 710.
  10. ^ Musto, H.; Cacciò, S.; Rodríguez-Maseda, H.; Bernardi, G. (1997). "Compositional constraints in the extremely GC-poor genome of Plasmodium falciparum". Memorias do Instituto Oswaldo Cruz 92 (6): 835–841. PMID 9566216.
  11. ^ Levy, S.; Hannenhalli, S.; Workman, C. (2001). "Enrichment of regulatory signals in conserved non-coding genomic sequence". Bioinformatics 17 (10): 871–877. doi:10.1093/bioinformatics/17.10.871. PMID 11673231.
  12. ^ Suzuki, R.; Saitou, N. (2011). "Exploration for Functional Nucleotide Sequence Candidates within Coding Regions of Mammalian Genes". DNA Research 18 (3): 177–187. doi:10.1093/dnares/dsr010. PMC 3111233. PMID 21586532. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=3111233.
  13. ^ Chou, H.-H.; Takematsu, H.; Diaz, S.; Iber, J.; Nickerson, E.; Wright, K. L.; Muchmore, E. A.; Nelson, D. L. et al. (1998). "A mutation in human CMP-sialic acid hydroxylase occurred after the Homo-Pan divergence". Proceedings of the National Academy of Sciences 95 (20): 11751. doi:10.1073/pnas.95.20.11751.
  14. ^ Poulin, F.; Nobrega, M. A.; Plajzer-Frick, I.; Holt, A.; Afzal, V.; Rubin, E. M.; Pennacchio, L. A. (2005). "In vivo characterization of a vertebrate ultraconserved enhancer". Genomics 85 (6): 774–781. doi:10.1016/j.ygeno.2005.03.003. PMID 15885503.
  15. ^ Gotea, V.; Visel, A.; Westlund, J. M.; Nobrega, M. A.; Pennacchio, L. A.; Ovcharenko, I. (2010). "Homotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers". Genome Research 20 (5): 565–577. doi:10.1101/gr.104471.109. PMC 2860159. PMID 20363979. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2860159.
  16. ^ Hill, R. S.; Walsh, C. A. (2005). "Molecular insights into human brain evolution". Nature 437 (7055): 64–67. doi:10.1038/nature04103. PMID 16136130.