Generic Model Organism Database
From Wikipedia, the free encyclopedia
Model Organism Databases (MODs) hold genomic and other information about important experimental organisms in the life sciences. Also called organism-specific databases, they capture the large volumes of data generated by modern biology. Behind every MOD is a software system designed to help manage the data within the MOD and to help users query and access those data. In the past, every MOD project developed its own software tools.
More recently, the Generic Model Organism Database (GMOD) Project began as an effort to create reusable software tools for developing MODs. GMOD is a loose federation of software applications (components) aimed at providing functionality that is needed by many or all model organism databases. Some of these software components are linked together by their use of a common database schema known as Chado. This project is funded by the US NIH and the USDA Agricultural Research Service.
Software
The full list of GMOD software components is found on the GMOD home page. These components include:
- GBrowse -- A flexible genome browser
- Pathway Tools -- A suite of tools for managing metabolic pathway information, browsing genomes, and analyzing high-throughput functional genomics data on pathway maps
- Apollo -- For viewing and editing genome annotations
- PubFetch -- Facilitates literature collection for curators' use
- Textpresso -- A text mining system for scientific literature
Chado makes extensive use of controlled vocabularies to type all entities in the database. For example, genes, transcripts, exons, transposable elements, and other features are all stored in a single feature table, and the type of each is given by a Sequence Ontology term. When a new data type comes along, the feature table requires no modification, only an update of the data in the database. The same is largely true of analysis data, which can also be stored in Chado.
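The typing scheme described above can be illustrated with a minimal sketch. It is not the actual Chado schema: the three tables below are a heavily simplified, hypothetical subset (the real tables carry many more columns and constraints), SQLite stands in for a production database so the example is self-contained, and the helper functions open_db, add_term, add_feature and features_of_type are invented for illustration. The vocabulary name 'sequence' and the term names are assumptions chosen to resemble Sequence Ontology usage.

```python
import sqlite3

# Simplified, hypothetical subset of Chado-style tables (NOT the real DDL).
SCHEMA = """
CREATE TABLE cv (
    cv_id INTEGER PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE
);
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    cv_id     INTEGER NOT NULL REFERENCES cv(cv_id),
    name      TEXT NOT NULL,
    UNIQUE (cv_id, name)
);
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    uniquename TEXT NOT NULL,
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
"""


def open_db():
    """Create an in-memory database with one vocabulary named 'sequence'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO cv (name) VALUES ('sequence')")
    return conn


def add_term(conn, name):
    """Register a new type as a row in cvterm and return its id."""
    cv_id = conn.execute(
        "SELECT cv_id FROM cv WHERE name = 'sequence'"
    ).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO cvterm (cv_id, name) VALUES (?, ?)", (cv_id, name)
    )
    return cur.lastrowid


def add_feature(conn, uniquename, type_name):
    """Store a feature whose type is looked up in the vocabulary."""
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (type_name,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown feature type: {type_name}")
    conn.execute(
        "INSERT INTO feature (uniquename, type_id) VALUES (?, ?)",
        (uniquename, row[0]),
    )


def features_of_type(conn, type_name):
    """Return the unique names of all features of the given type."""
    rows = conn.execute(
        "SELECT f.uniquename FROM feature f "
        "JOIN cvterm t ON t.cvterm_id = f.type_id "
        "WHERE t.name = ? ORDER BY f.uniquename",
        (type_name,),
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = open_db()
    add_term(conn, "gene")
    add_feature(conn, "geneA", "gene")
    # A new data type needs only a new vocabulary row, not a schema change.
    add_term(conn, "transposable_element")
    add_feature(conn, "te1", "transposable_element")
    print(features_of_type(conn, "transposable_element"))
```

In this sketch, supporting a new kind of feature is a data change (one more row in cvterm) rather than a change to the feature table itself, which is the property the paragraph above attributes to Chado.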
The existing core modules of Chado are:
- sequence - for sequences/features
- cv - for controlled vocabularies/ontologies
- general - currently just database cross-references (dbxrefs)
- organism - taxonomic data
- pub - publication and references
- companalysis - augments sequence module with computational analysis data
- map - non-sequence maps (PRELIMINARY SCHEMA)
- genetic - genetic and phenotypic data (IN DEVELOPMENT)
- expression - gene expression (PRELIMINARY SCHEMA)
Participating members
- WormBase[1]
- FlyBase[2]
- Mouse Genome Informatics[3]
- Gramene[4]
- Rat Genome Database[5]
- TAIR[6]
- EcoCyc[7]
- Saccharomyces Genome Database[8]
Related projects
- Open Biomedical Ontologies
- Ensembl
- BioPerl
- BioJava
- BioXML[9]
- Gene Ontology Software[10]
- DAS[11]
- The Genomics Unified Schema[12]
- Manatee: Manual Annotation Tool Etc, Etc...[13]
- Biocurator.org[14]
- SGD Lite[15]
See also
- List of Bioinformatics Software Projects
- Biological database
- Genome project
- Genomics
- Genome