Genome

In modern molecular biology and genetics, the genome is the entirety of an organism's hereditary information. It is encoded either in DNA or, for many types of virus, in RNA. The genome includes both the genes and the non-coding sequences of the DNA/RNA.[1]

Origin of term

The term was coined in 1920 by Hans Winkler, Professor of Botany at the University of Hamburg, Germany. The related Greek verb γίνομαι means "I become, I am born, to come into being". The Oxford English Dictionary suggests that the name is a blend of the words gene and chromosome. A few related -ome words already existed, such as biome and rhizome, forming a vocabulary into which genome fits systematically.[2]

Overview

Some organisms carry multiple copies of each chromosome and are described as diploid, triploid, tetraploid and so on. In classical genetics, the gamete of a sexually reproducing organism (typically a eukaryote) has half the number of chromosomes of a somatic cell, and the genome is the full set of chromosomes in a gamete. In haploid organisms, including bacteria and archaea, in organelles such as mitochondria and chloroplasts, and in viruses, which similarly contain genes, the single chain or the set of circular and/or linear chains of DNA (or RNA for some viruses) likewise constitutes the genome. The term genome can be applied specifically to the information stored on a complete set of nuclear DNA (i.e., the "nuclear genome"), but it can also be applied to the information stored within organelles that contain their own DNA, as with the "mitochondrial genome" or the "chloroplast genome". Additionally, the genome can comprise nonchromosomal genetic elements such as viruses, plasmids, and transposable elements.[3]

When people say that the genome of a sexually reproducing species has been "sequenced", they are typically referring to a determination of the sequences of one set of autosomes and one of each type of sex chromosome, which together represent both of the possible sexes. Even in species that exist in only one sex, what is described as "a genome sequence" may be a composite read from the chromosomes of various individuals. In general use, the phrase "genetic makeup" is sometimes used to mean the genome of a particular individual or organism. The study of the global properties of genomes of related organisms is usually referred to as genomics, as distinct from genetics, which generally studies the properties of single genes or groups of genes.

Both the number of base pairs and the number of genes vary widely from one species to another, and there is only a rough correlation between the two (an observation known as the C-value paradox). At present, the highest known number of genes is around 60,000, for the protozoan that causes trichomoniasis (see List of sequenced eukaryotic genomes), almost three times as many as in the human genome.

An analogy to the human genome stored on DNA is that of instructions stored in a book:

Types

Most biological entities that are more complex than a virus sometimes or always carry additional genetic material besides that which resides in their chromosomes. In some contexts, such as sequencing the genome of a pathogenic microbe, "genome" is meant to include information stored on this auxiliary material, which is carried in plasmids. In such circumstances, "genome" describes all of the genes and all of the information on non-coding DNA that have the potential to be present.

In eukaryotes such as plants, protozoa and animals, however, "genome" typically refers only to the information on chromosomal DNA. So although these organisms contain chloroplasts and/or mitochondria that have their own DNA, the genetic information contained in the DNA of these organelles is not considered part of the genome. Instead, mitochondria are sometimes said to have their own genome, often referred to as the "mitochondrial genome". The DNA found within the chloroplast may be referred to as the "plastome".

Genomes and genetic variation

A genome does not capture the genetic diversity or the genetic polymorphism of a species. For example, the human genome sequence in principle could be determined from just half the information on the DNA of one cell from one individual. To learn what variations in genetic information underlie particular traits or diseases requires comparisons across individuals. This point explains the common usage of "genome" (which parallels a common usage of "gene") to refer not to the information in any particular DNA sequence, but to a whole family of sequences that share a biological context.

Although this concept may seem counterintuitive, it is the same concept that says there is no particular shape that is the shape of a cheetah. Cheetahs vary, and so do the sequences of their genomes. Yet both the individual animals and their sequences share commonalities, so one can learn something about cheetahs and "cheetah-ness" from a single example of either.
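The comparison across individuals described in this section can be illustrated with a minimal sketch. It assumes that the sequences have already been aligned to equal length, which real genome comparisons cannot take for granted; the function name and the toy sequences are hypothetical and are not taken from any sequencing project.

```python
def variant_positions(sequences):
    """Return the 0-based positions at which aligned sequences differ.

    Assumes every sequence has the same length (already aligned).
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    # A position is variable if more than one distinct base is observed there.
    return [
        i for i in range(length)
        if len({seq[i] for seq in sequences}) > 1
    ]


# Three toy "individuals": identical at positions 0 and 2, variable at 1 and 3.
individuals = ["ACGT", "ACGA", "ATGT"]
print(variant_positions(individuals))  # prints [1, 3]
```

A single sequence yields no variable positions at all, which mirrors the point that one genome sequence alone does not capture the polymorphism of a species.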

Sequencing and mapping

The Human Genome Project was organized to map and to sequence the human genome. Other genome projects include mouse, rice, the plant Arabidopsis thaliana, the puffer fish, and bacteria such as E. coli. In 1976, Walter Fiers at the University of Ghent (Belgium) was the first to establish the complete nucleotide sequence of a viral RNA-genome (bacteriophage MS2). The first DNA-genome project to be completed was that of phage Φ-X174, with only 5386 base pairs, which was sequenced by Fred Sanger in 1977. The first bacterial genome to be completed was that of Haemophilus influenzae, completed by a team at The Institute for Genomic Research in 1995. A few months later, the first eukaryotic genome was completed, with the 16 chromosomes of budding yeast Saccharomyces cerevisiae being released as a result of a European-led effort begun in the mid-1980s.

The development of new technologies has made it dramatically easier and cheaper to do sequencing, and the number of complete genome sequences is growing rapidly. Among many genome databases, the one maintained by the US National Institutes of Health is inclusive.[4]

These new technologies open up the prospect of personal genome sequencing as an important diagnostic tool. A major step toward that goal was the completion of the decipherment of the full genome of DNA pioneer James D. Watson in 2007.[5]

Whereas a genome sequence lists the order of every DNA base in a genome, a genome map identifies the landmarks. A genome map is less detailed than a genome sequence and aids in navigating around the genome. A fundamental step in the Human Genome Project was the release of a detailed genomic map by Jean Weissenbach and his team at the Genoscope in Paris.[6][7]

Comparison of different genome sizes

| Organism type | Organism | Genome size (base pairs) | Genome size (human-readable) | Mass (pg) | Note |
|---|---|---|---|---|---|
| Virus | Bacteriophage MS2 | 3,569 | 3.5kb | 0.000002 | First sequenced RNA-genome[8] |
| Virus | SV40 | 5,224 | 5.2kb | | [9] |
| Virus | Phage Φ-X174 | 5,386 | 5.4kb | | First sequenced DNA-genome[10] |
| Virus | HIV | 9,749 | 9.7kb | | [11] |
| Virus | Phage λ | 48,502 | 48kb | | |
| Virus | Mimivirus | 1,181,404 | 1.2Mb | | Largest known viral genome |
| Bacterium | Haemophilus influenzae | 1,830,000 | 1.8Mb | | First genome of a living organism sequenced, July 1995[12] |
| Bacterium | Carsonella ruddii | 159,662 | 160kb | | Smallest non-viral genome.[13] |
| Bacterium | Buchnera aphidicola | 600,000 | 600kb | | |
| Bacterium | Wigglesworthia glossinidia | 700,000 | 700kb | | |
| Bacterium | Escherichia coli | 4,600,000 | 4.6Mb | | [14] |
| Bacterium | Solibacter usitatus (strain Ellin 6076) | 9,970,000 | 10Mb | | Largest known bacterial genome |
| Amoeboid | Polychaos dubium ("Amoeba" dubia) | 670,000,000,000 | 670Gb | 737 | Largest known genome.[15] (Disputed [16]) |
| Plant | Arabidopsis thaliana | 157,000,000 | 157Mb | | First plant genome sequenced, December 2000.[17] |
| Plant | Genlisea margaretae | 63,400,000 | 63Mb | | Smallest recorded flowering plant genome, 2006.[17] |
| Plant | Fritillaria assyrica | 130,000,000,000 | 130Gb | | |
| Plant | Populus trichocarpa | 480,000,000 | 480Mb | | First tree genome sequenced, September 2006 |
| Plant | Paris japonica (Japanese-native, pale-petal) | 150,000,000,000 | 150Gb | 152.23 | Largest plant genome known |
| Moss | Physcomitrella patens | 480,000,000 | 480Mb | | First genome of a bryophyte sequenced, January 2008.[18] |
| Yeast | Saccharomyces cerevisiae | 12,100,000 | 12.1Mb | | First eukaryotic genome sequenced, 1996[19] |
| Fungus | Aspergillus nidulans | 30,000,000 | 30Mb | | |
| Nematode | Caenorhabditis elegans | 100,300,000 | 100Mb | | First multicellular animal genome sequenced, December 1998[20] |
| Nematode | Pratylenchus coffeae | 20,000,000 | 20Mb | | Smallest animal genome known[21] |
| Insect | Drosophila melanogaster (fruit fly) | 130,000,000 | 130Mb | | [22] |
| Insect | Bombyx mori (silk moth) | 530,000,000 | 530Mb | | |
| Insect | Apis mellifera (honey bee) | 236,000,000 | 236Mb | | |
| Insect | Solenopsis invicta (fire ant) | 480,000,000 | 480Mb | | [23] |
| Fish | Tetraodon nigroviridis (type of puffer fish) | 385,000,000 | 390Mb | | Smallest vertebrate genome known |
| Mammal | Homo sapiens | 3,200,000,000 | 3.2Gb | 3 | |
| Fish | Protopterus aethiopicus (marbled lungfish) | 130,000,000,000 | 130Gb | 143 | Largest vertebrate genome known |
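The relationship between the raw base-pair counts and the abbreviated sizes in the table can be sketched as follows. The function name and the rounding rule (one decimal place below 10 units, whole numbers otherwise) are assumptions made for illustration; the table itself does not round every row the same way, so a few rows may differ slightly from what this sketch produces.

```python
def format_genome_size(base_pairs: int) -> str:
    """Return an abbreviated size string such as '4.6Mb' for a base-pair count.

    The rounding rule is an illustrative assumption, not a standard.
    """
    units = [(1_000_000_000, "Gb"), (1_000_000, "Mb"), (1_000, "kb")]
    for factor, suffix in units:
        if base_pairs >= factor:
            value = base_pairs / factor
            # One decimal place for small values, none for larger ones.
            text = f"{value:.1f}" if value < 10 else f"{value:.0f}"
            return f"{text}{suffix}"
    # Fewer than 1,000 base pairs: report the raw count.
    return f"{base_pairs}bp"


print(format_genome_size(4_600_000))      # Escherichia coli
print(format_genome_size(3_200_000_000))  # Homo sapiens
```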

Note: If the 46 chromosomes of a single (diploid) human cell were connected end to end and straightened, the DNA would have a length of ~2 m and a width of ~2.4 nanometers.
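The ~2 m figure can be checked with a short calculation. The sketch below assumes a rise of about 0.34 nm per base pair, the usual textbook value for B-form DNA, which is not stated in this article. It takes the haploid human genome size of 3,200,000,000 base pairs from the table and doubles it for a diploid cell. The constant and function names are hypothetical.

```python
# Assumed rise per base pair for B-form DNA (textbook value, not from this article).
RISE_PER_BASE_PAIR_NM = 0.34

# Haploid human genome size in base pairs, as listed in the table.
HAPLOID_HUMAN_BP = 3_200_000_000


def dna_length_m(base_pairs: int, rise_nm: float = RISE_PER_BASE_PAIR_NM) -> float:
    """Return the end-to-end length in metres of a straightened DNA molecule."""
    return base_pairs * rise_nm * 1e-9  # convert nanometres to metres


# A diploid cell carries two copies of the haploid genome.
diploid_bp = 2 * HAPLOID_HUMAN_BP
length_m = dna_length_m(diploid_bp)
print(f"{length_m:.2f} m")  # prints 2.18 m
```

Under these assumptions the result is consistent with the approximate length of 2 m given in the note.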

Since genomes and their organisms are very complex, one research strategy is to reduce the number of genes in a genome to the bare minimum at which the organism in question still survives. Experimental work is being done on minimal genomes for single-celled organisms as well as for multicellular organisms (see Developmental biology). The work is carried out both in vivo and in silico.[24][25]

Genome evolution

Genomes are more than the sum of an organism's genes and have traits that may be measured and studied without reference to the details of any particular genes and their products. Researchers compare traits such as chromosome number (karyotype), genome size, gene order, codon usage bias, and GC-content to determine what mechanisms could have produced the great variety of genomes that exist today (for recent overviews, see Brown 2002; Saccone and Pesole 2003; Benfey and Protopapas 2004; Gibson and Muse 2004; Reese 2004; Gregory 2005).
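Of the traits listed above, GC-content is the simplest to compute directly from a sequence: it is the fraction of bases that are guanine (G) or cytosine (C). The following is a minimal sketch rather than a standard tool. The function name is hypothetical, and the choice to ignore ambiguous characters such as N is an assumption made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    # Ignore anything that is not A, C, G or T (for example the ambiguity code N).
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


print(gc_content("ATGC"))  # prints 0.5
```

Computing the value over a whole genome, or in sliding windows along it, gives one of the global properties that can be compared between species without reference to any particular gene.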

Duplications play a major role in shaping the genome. Duplications may range from extension of short tandem repeats, to duplication of a cluster of genes, and all the way to duplications of entire chromosomes or even entire genomes. Such duplications are probably fundamental to the creation of genetic novelty.

Horizontal gene transfer is invoked to explain how there is often extreme similarity between small portions of the genomes of two organisms that are otherwise very distantly related. Horizontal gene transfer seems to be common among many microbes. Also, eukaryotic cells seem to have experienced a transfer of some genetic material from their chloroplast and mitochondrial genomes to their nuclear chromosomes.

References

  1. ^ Ridley, M. (2006). Genome. New York, NY: Harper Perennial. ISBN 0-06-019497-9
  2. ^ Joshua Lederberg and Alexa T. McCray (2001). "'Ome Sweet 'Omics -- A Genealogical Treasury of Words". The Scientist 15 (7). http://lhncbc.nlm.nih.gov/lhc/docs/published/2001/pub2001047.pdf. 
  3. ^ Madigan M, Martinko J (editors) (2006). Brock Biology of Microorganisms (11th ed.). Prentice Hall. ISBN 0-13-144329-1. 
  4. ^ "Genome Home". 2010-12-08. http://www.ncbi.nlm.nih.gov/sites/entrez?db=Genome&itool=toolbar. Retrieved 2011-01-27. 
  5. ^ Wade, Nicholas (2007-05-31). "Genome of DNA Pioneer Is Deciphered". The New York Times. http://www.nytimes.com/2007/05/31/science/31cnd-gene.html?em&ex=1180843200&en=19e1d55639350b73&ei=5087%0A. Retrieved 2010-04-02. 
  6. ^ "What's a Genome?". Genomenewsnetwork.org. 2003-01-15. http://www.genomenewsnetwork.org/resources/whats_a_genome/Chp3_1.shtml. Retrieved 2011-01-27. 
  7. ^ NCBI_user_services (2004-03-29). "Mapping Factsheet". http://www.ncbi.nlm.nih.gov/About/primer/mapping.html. Retrieved 2011-01-27. 
  8. ^ Fiers W, et al. (1976). "Complete nucleotide-sequence of bacteriophage MS2-RNA - primary and secondary structure of replicase gene". Nature 260 (5551): 500–507. Bibcode 1976Natur.260..500F. doi:10.1038/260500a0. PMID 1264203. http://www.nature.com/nature/journal/v260/n5551/abs/260500a0.html. 
  9. ^ Fiers W, Contreras R, Haegemann G, Rogiers R, Van de Voorde A, Van Heuverswyn H, Van Herreweghe J, Volckaert G, Ysebaert M (1978). "Complete nucleotide sequence of SV40 DNA". Nature 273 (5658): 113–120. doi:10.1038/273113a0. PMID 205802. http://www.nature.com/nature/journal/v273/n5658/abs/273113a0.html. 
  10. ^ Sanger F, Air GM, Barrell BG, Brown NL, Coulson AR, Fiddes CA, Hutchison CA, Slocombe PM, Smith M (1977). "Nucleotide sequence of bacteriophage phi X174 DNA". Nature 265 (5596): 687–695. doi:10.1038/265687a0. PMID 870828. http://www.nature.com/nature/journal/v265/n5596/abs/265687a0.html. 
  11. ^ "Virology - Human Immunodeficiency Virus And Aids, Structure: The Genome And Proteins Of HIV". Pathmicro.med.sc.edu. 2010-07-01. http://pathmicro.med.sc.edu/lecture/hiv9.htm. Retrieved 2011-01-27. 
  12. ^ Fleischmann R, Adams M, White O, Clayton R, Kirkness E, Kerlavage A, Bult C, Tomb J, Dougherty B, Merrick J (1995). "Whole-genome random sequencing and assembly of Haemophilus influenzae Rd". Science 269 (5223): 496–512. doi:10.1126/science.7542800. PMID 7542800. http://www.sciencemag.org/cgi/content/abstract/269/5223/496. 
  13. ^ Nakabachi A, Yamashita A, Toh H, et al. (October 2006). "The 160-kilobase genome of the bacterial endosymbiont Carsonella". Science 314 (5797): 267. doi:10.1126/science.1134196. PMID 17038615. 
  14. ^ Frederick R. Blattner, Guy Plunkett III, et al. (1997). "The Complete Genome Sequence of Escherichia coli K-12". Science 277 (5331): 1453–1462. doi:10.1126/science.277.5331.1453. PMID 9278503. http://www.sciencemag.org/cgi/content/abstract/277/5331/1453. 
  15. ^ Parfrey LW, Lahr DJG, Katz LA (2008). "The Dynamic Nature of Eukaryotic Genomes". Molecular Biology and Evolution 25 (4): 787–94. doi:10.1093/molbev/msn032. PMC 2933061. PMID 18258610. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=2933061. 
  16. ^ ScienceShot: Biggest Genome Ever, comments: "The measurement for Amoeba dubia and other protozoa which have been reported to have very large genomes were made in the 1960s using a rough biochemical approach which is now considered to be an unreliable method for accurate genome size determinations."
  17. ^ a b Greilhuber J, Borsch T, Müller K, Worberg A, Porembski S, and Barthlott W (2006). "Smallest angiosperm genomes found in Lentibulariaceae, with chromosomes of bacterial size". Plant Biology 8 (6): 770–777. doi:10.1055/s-2006-924101. PMID 17203433. 
  18. ^ Lang D, Zimmer AD, Rensing SA, Reski R (October 2008). "Exploring plant biodiversity: the Physcomitrella genome and beyond". Trends Plant Sci 13 (10): 542–549. doi:10.1016/j.tplants.2008.07.002. PMID 18762443. 
  19. ^ "Saccharomyces Genome Database". Yeastgenome.org. http://www.yeastgenome.org/. Retrieved 2011-01-27. 
  20. ^ The C. elegans Sequencing Consortium (1998). "Genome sequence of the nematode C. elegans: a platform for investigating biology". Science 282 (5396): 2012–2018. doi:10.1126/science.282.5396.2012. PMID 9851916. http://www.sciencemag.org/cgi/content/abstract/282/5396/2012. 
  21. ^ Gregory TR (2005). "Animal Genome Size Database". http://www.genomesize.com. http://www.genomesize.com/statistics.php?stats=entire#stats_top. 
  22. ^ Adams MD, Celniker SE, Holt RA, et al. (2000). "The genome sequence of Drosophila melanogaster". Science 287 (5461): 2185–95. Bibcode 2000Sci...287.2185.. doi:10.1126/science.287.5461.2185. PMID 10731132. http://www.sciencemag.org/cgi/content/abstract/287/5461/2185. Retrieved 2007-05-25. 
  23. ^ Wurm Y et al. (2011). "The genome of the fire ant Solenopsis invicta". PNAS 108 (14): 5679–5684. doi:10.1073/pnas.1009690108. PMC 3078418. PMID 21282665. http://www.pnas.org/content/early/2011/01/24/1009690108.abstract. Retrieved 2011-02-01. 
  24. ^ Glass JI, Assad-Garcia N, Alperovich N, Yooseph S, Lewis MR, Maruf M, Hutchison CA 3rd, Smith HO, Venter JC (2006). "Essential genes of a minimal bacterium". Proc Natl Acad Sci USA 103 (2): 425–30. doi:10.1073/pnas.0510013103. PMC 1324956. PMID 16407165. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1324956. 
  25. ^ Forster AC, Church GM (2006). "Towards synthesis of a minimal cell". Mol Syst Biol. 2 (1): 45. doi:10.1038/msb4100090. PMC 1681520. PMID 16924266. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1681520. 
