Bioinformatic Harvester

The Bioinformatic Harvester is a bioinformatic meta search engine for gene and protein-associated information. Harvester currently works for human, mouse and rat proteins. It crosslinks 16 popular bioinformatic resources and allows searches across them. A ranking system similar to Google's PageRank sorts the search results by relevance.

Databases supported by Bioinformatic Harvester
UniProt | SOURCE | SMART | SOSUI | PSORT | HomoloGene | GFP-cDNA | IPI | OMIM
NCBI-BLAST | Genome-Browser | Ensembl | RZPD | STRING | iHOP | Entrez

How Harvester works

Harvester collects information from protein and gene databases, along with information from so-called "prediction servers". A prediction server provides, for example, online sequence analysis for a single protein. Harvester's search index is based on the UniProt protein information collection. The UniProt collection consists of ~71,000 human (2006-04), ~62,000 mouse and ~15,000 rat protein information pages, which are curated and updated on a regular basis.

 A screenshot of the Harvester browser

Harvester collects two types of information:

A) Text-based information from the following databases:

  • UniProt, the world's largest protein database
  • SOURCE, a convenient gene information overview
  • Simple Modular Architecture Research Tool (SMART), identifies and annotates protein domains
  • SOSUI, predicts transmembrane domains
  • PSORT, predicts protein localisation
  • HomoloGene, compares proteins from different species
  • GFP-cDNA, protein localisation with fluorescence microscopy
  • International Protein Index (IPI)

B) Databases rich in graphical elements are not collected but crosslinked via iframes. Iframes are transparent windows within an HTML page. Because each iframe loads the linked database directly, its content is always up to date. Several such iframes are combined on a Harvester protein page. This method allows convenient comparison of information from several databases.

Currently, Harvester crosslinks the following graphics-rich servers via iframes:

  • NCBI-BLAST, an algorithm for comparing biological sequences (NCBI)
  • Genome Browser, working draft assemblies for genomes (UCSC)
  • Ensembl, automatic gene annotation (EMBL-EBI and Sanger Institute)
  • RZPD, German Resource Center for Genome Research in Berlin/Heidelberg
  • STRING, Search Tool for the Retrieval of Interacting Genes/Proteins (EMBL)
  • iHOP, Information Hyperlinked over Proteins, linked via gene/protein synonyms
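
The iframe approach described above can be illustrated with a minimal sketch. It is not Harvester's actual code: the function name `build_protein_page` and the example.org URL templates are hypothetical placeholders, and real servers use their own query formats.

```python
from html import escape
from urllib.parse import quote


def build_protein_page(protein_id, sources):
    """Return an HTML page that embeds one iframe per external source.

    `sources` maps a display name to a URL template containing "{id}".
    Both the names and the URL templates are hypothetical placeholders.
    """
    frames = []
    for name, template in sources.items():
        # Insert the (URL-quoted) protein identifier into the template.
        url = template.format(id=quote(protein_id))
        frames.append(
            f"<h2>{escape(name)}</h2>\n"
            f'<iframe src="{escape(url, quote=True)}" '
            'width="100%" height="400"></iframe>'
        )
    body = "\n".join(frames)
    return (
        f"<html><head><title>{escape(protein_id)}</title></head>\n"
        f"<body>\n{body}\n</body></html>"
    )


if __name__ == "__main__":
    # Placeholder URLs, used only for illustration.
    example_sources = {
        "STRING": "https://example.org/string?protein={id}",
        "iHOP": "https://example.org/ihop?protein={id}",
    }
    print(build_protein_page("Q9NPD3", example_sources))
```

Because the browser loads each iframe from the remote server when the page is viewed, the embedded content does not have to be copied or refreshed by the page that links to it.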

What one can find

Harvester accepts single words as well as combinations of different search terms.

Search Examples:

  • Gene-name: "golga3"
  • Gene-alias: "ADAP-S ADAS ADHAPS ADPS" (one gene name is sufficient)
  • Gene-Ontologies: "Enzyme linked receptor protein signaling pathway"
  • Unigene-Cluster: "Hs.449360"
  • GO-annotation: "intra-Golgi transport"
  • Molecular function: "protein kinase binding"
  • Protein: "Q9NPD3"
  • Protein domain: "SH2 sar"
  • Protein Localisation: "endoplasmic reticulum"
  • Chromosome: "2q31"
  • Disease relevant: use the word "diseaselink"
  • Combinations: "golgi diseaselink" (finds all golgi proteins associated with a disease)
  • mRNA: "AL136897"
  • Word: "Cancer"
  • Comment: "highly expressed in heart"
  • Author: "Merkel, Schmidt"
  • Publication or project: "cDNA sequencing project"
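
How a combined query such as "golgi diseaselink" narrows the results can be sketched with a tiny inverted index. This is an illustrative assumption about how such a search might work, not a description of Harvester's implementation; the record IDs, the record texts and the function names `build_index` and `search` are invented for the example.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercase word to the set of record IDs containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in text.lower().split():
            index[word].add(record_id)
    return index


def search(index, query):
    """Return the IDs of records that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the matches for the first word, then intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Toy records with invented IDs and text; "diseaselink" marks disease relevance.
records = {
    "P1": "golgi membrane protein diseaselink",
    "P2": "golgi transport protein",
    "P3": "nuclear kinase diseaselink",
}
index = build_index(records)
print(sorted(search(index, "golgi diseaselink")))  # ['P1']
```

Each additional search term can only shrink the result set, which is why adding "diseaselink" to "golgi" leaves only the disease-associated Golgi record.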

Literature

  • Liebel, U., Kindler, B., & Pepperkok, R. (2004) 'Harvester': a fast meta search engine of human protein resources. Bioinformatics. 2004 Aug 12;20(12):1962-3. Epub 2004 Feb 26.
  • Liebel, U., Kindler, B., & Pepperkok, R. (2005) Bioinformatic "Harvester": a search engine for genome-wide human, mouse, and rat protein resources. Methods Enzymol. 2005;404:19-26.
