Generic Model Organism Database

From Wikipedia, the free encyclopedia

Model Organism Databases (MODs) describe genomic and other information about important experimental organisms in the life sciences. Also called organism-specific databases, they capture the large volumes of data and information generated by modern biology. Behind every MOD is a software system designed to manage the data within the MOD and to help users query and access those data. In the past, every MOD project developed its own software tools.

More recently, the Generic Model Organism Database (GMOD) Project began as an effort to create reusable software tools for developing MODs. GMOD is a loose federation of software applications (components) that provide functionality needed by many or all model organism databases. Some of these components are linked together by their use of a common database schema known as Chado. The project is funded by the US National Institutes of Health (NIH) and the USDA Agricultural Research Service.

Software

The full list of GMOD software components is found on the GMOD home page. These components include:

GMOD Core (Chado database and tools)
- gmod-core : the Chado schema and tools to install it.
- XORT : a tool for loading and dumping chado-xml
- GMODTools : extracts data from a Chado database into common genome bulk formats (GFF, FASTA, etc.)
MOD website
- gmod-web/Turnkey : a generic web front end for browsing database contents.
Genome Editing and Visualization
- Apollo : a Java application for viewing and editing genome annotations
- GBrowse : a CGI application for displaying genome annotations
Comparative Genomics
- Synbrowse : a GBrowse-based synteny viewer
- CMap : a CGI application for displaying comparative maps
Literature curation
- PubSearch : a Java servlet web application for annotating genes from literature
- PubFetch : a tool to facilitate literature collection for curators' use
- Textpresso : a text mining system for scientific literature
Database querying tools
- BioMart : a query-oriented data management system
Biological Pathways
- Pathway Tools : tools for metabolic pathway information, and analysis of high-throughput functional genomics data

Chado database schema

Chado makes extensive use of controlled vocabularies to type all entities in the database. For example, a single feature table stores genes, transcripts, exons, transposable elements, etc., and the type of each is given by a term from the Sequence Ontology. When a new datatype comes along, the feature table requires no modification, only an update of the data in the database. The same is largely true of analysis data, which can also be stored in Chado.
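The idea of typing features through a controlled vocabulary can be illustrated with a minimal sketch. It assumes two heavily simplified tables, cvterm and feature, loosely modelled on Chado's tables of the same names; the column set, the helper functions add_feature and features_of_type, and the identifiers such as geneA and TE0001 are hypothetical and chosen only for illustration. It runs against an in-memory SQLite database rather than a real Chado installation.

```python
import sqlite3

# Simplified, hypothetical tables loosely modelled on Chado's cvterm and
# feature tables; the real schema has many more columns and constraints.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cvterm (
        cvterm_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL UNIQUE
    );
    CREATE TABLE feature (
        feature_id INTEGER PRIMARY KEY,
        uniquename TEXT NOT NULL,
        type_id    INTEGER NOT NULL REFERENCES cvterm (cvterm_id)
    );
""")


def add_feature(uniquename, type_name):
    """Store a feature, creating its type term on first use."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (type_name,))
    (type_id,) = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (type_name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO feature (uniquename, type_id) VALUES (?, ?)",
        (uniquename, type_id),
    )


def features_of_type(type_name):
    """Return the unique names of all features with the given type term."""
    rows = conn.execute(
        "SELECT f.uniquename FROM feature f "
        "JOIN cvterm t ON t.cvterm_id = f.type_id "
        "WHERE t.name = ? ORDER BY f.uniquename",
        (type_name,),
    )
    return [name for (name,) in rows]


add_feature("geneA", "gene")
add_feature("geneA-RA", "mRNA")
add_feature("geneA-RA:exon1", "exon")
# A new datatype is only new data: no ALTER TABLE is needed.
add_feature("TE0001", "transposable_element")
```

In this sketch, storing a transposable element touches only rows, never the table definitions, which is the property the paragraph above describes. A real deployment would load the type terms from the Sequence Ontology instead of creating them on demand.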

The existing core modules of Chado are:

sequence - for sequences/features
cv - for controlled vocabularies/ontologies
general - currently just database cross-references (dbxrefs)
organism - taxonomic data
pub - publication and references
companalysis - augments sequence module with computational analysis data
map - non-sequence maps (PRELIMINARY SCHEMA)
genetic - genetic and phenotypic data (IN DEVELOPMENT)
expression - gene expression (PRELIMINARY SCHEMA)

Participating databases

The following organism databases are contributing to and/or adopting GMOD components for model organism databases.

ANISEED	AntonosporaDB	ATIDB
BeeBase	BeetleBase	BGD
BioHealthBase	Bovine QTL Viewer	Cattle EST Gene Family Database
CGD	CGL	ChromDB
Chromosome 7 Annotation Project	CSHLmpd	Database of Genomic Variants
DictyBase	DroSpeGe	EcoCyc
FlyBase	Fungal Comparative Genomics	Fungal Telomere Browser
Gallus Genome Browser	GeneDB	GrainGenes
Gramene	HapMap	Human 2q33
Human Genome Segmental Duplication Database	IVDB	MAGI
Marine Biological Lab Organism Databases	MGI	Non-Human Segmental Duplication Database
OMAP	OryGenesDB	Oryza Chromosome 8
Pathway Tools	ParameciumDB	PeanutMap
PlantsDB	PlasmoDB	PseudoCAP
PossumBase	PUMAdb	RGD
SGD	SGD Lite	SmedDB
SOL Genomics Network	Soybase	Soybean Gbrowse Database
T1DBase	TAIR	TGD
TGI	TIGR	TIGR Rice Genome Browser
ToxoDB	TriAnnot BAC Viewer	VectorBase
wFleaBase	WormBase
XanthusBase	Xenbase

Related projects

Bioperl
BioJava
Ensembl
Gene Ontology Software
DAS
The Genomics Unified Schema
Manatee: Manual Annotation Tool Etc, Etc...
Biocurator.org
Open Biomedical Ontologies
The Sequence Ontology Project

