User:Wikibm

Temporarily userfying per request at Wikipedia:Deletion review/Content review. Userfied on October 30, 2007.


Genomic databases, including complete genomes such as the human genome, have been set up to help biologists in their research. Most of these databases are publicly available for consultation.

We automatically retrieve new releases when major updates for these databanks become available. Between two major releases, minor updates and corrections are also retrieved and installed to keep the databases up to date. The databases and tools are accessible on the web server (http://genouest.org/) under the Banks item.

National Project Biomaj (BIOlogie Mise A Jour):

Biological knowledge, in the context of proteomics and genomics, is mainly based on transitive bioinformatics analyses that periodically compare newly produced data against a corpus of known information. This approach requires, on the one hand, accurate bioinformatics software, pipelines and interfaces and, on the other hand, numerous heterogeneous biological banks distributed around the world.

A data integration process is an essential preliminary step, and it represents a major challenge and bottleneck in bioinformatics. These biological data banks contain a mass of heterogeneous data (all in different formats) and are very bulky (terabytes). After retrieval, the banks must undergo various post-processing steps, more or less customised, before they can be used by bioinformatics software (blast, SRS, emboss, gcg, ...). The update frequency varies according to the source, from daily to several times per year. As the number of complete genomes grows, the number of other genomic data sources is also increasing rapidly. Moreover, the nature and the number of the banks are constantly evolving, and the data are cross-linked between sources. The maintenance task is therefore complex and heavy. A first challenge is to automate the process of updating the data banks for the administrator. Another significant challenge is quality of service: giving users a clear view of the integrity of the data (state, exact origin, ...) that make up their workspaces.
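The two challenges above, automated updating and a clear record of each bank's state and origin, can be illustrated with a minimal sketch. It is not Biomaj code: the `BankRecord` class, the functions `needs_update` and `install_release`, and the example bank name and URL are all hypothetical, and the download and post-processing steps are only marked by a comment.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class BankRecord:
    """Hypothetical provenance record for one locally installed bank."""
    name: str
    source_url: str                      # exact origin of the data
    release: Optional[str] = None        # release currently installed, if any
    installed_on: Optional[date] = None  # when that release was installed
    history: List[Tuple[str, date]] = field(default_factory=list)


def needs_update(record: BankRecord, remote_release: str) -> bool:
    """A bank needs updating when nothing is installed or the release label differs."""
    return record.release is None or record.release != remote_release


def install_release(record: BankRecord, remote_release: str, today: date) -> bool:
    """Record a new release if one is needed; return True when an update happened."""
    if not needs_update(record, remote_release):
        return False
    # A real updater would download and post-process the bank files here.
    if record.release is not None:
        # Keep the previous release in the history so its origin stays traceable.
        record.history.append((record.release, record.installed_on))
    record.release = remote_release
    record.installed_on = today
    return True
```

Calling `install_release` twice with the same release label leaves the record unchanged the second time, which is the behaviour an unattended, periodically scheduled updater would need.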

Biomaj is a joint development between three bioinformatics platforms: INRA Toulouse (David Allouche), INRA Jouy-en-Josas (Christophe Caron) and our platform, genouest.org. Biomaj is written using state-of-the-art technologies (Java, XML, ...) and is based on a configurable workflow engine. Post-processes are written for the usual formats (gcg, blast, srs, ...) and are easily customisable to users' needs. Biomaj is currently under heavy testing and will be released under an open-source licence in May 2007.
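The idea of a workflow engine driven by parameters, with post-processes chosen per bank, can be sketched as follows. This is an illustration only and does not reflect Biomaj's actual design or API: the `POST_PROCESSES` registry, the step names and the `run_workflow` function are hypothetical, and each step merely returns a log message instead of running a real tool.

```python
from typing import Callable, Dict, List

# Registry of hypothetical post-process steps, keyed by the name a configuration would use.
POST_PROCESSES: Dict[str, Callable[[str], str]] = {
    "uncompress": lambda bank: f"uncompressed {bank}",
    "blast_index": lambda bank: f"built blast index for {bank}",
    "srs_index": lambda bank: f"built srs index for {bank}",
}


def run_workflow(bank: str, steps: List[str]) -> List[str]:
    """Run the configured steps in order and return a log of what was done."""
    log = []
    for step in steps:
        if step not in POST_PROCESSES:
            # Fail early on a misconfigured step rather than silently skipping it.
            raise ValueError(f"unknown post-process: {step}")
        log.append(POST_PROCESSES[step](bank))
    return log
```

Under this assumption, customising the treatment of a bank amounts to changing its list of step names, for example `run_workflow("examplebank", ["uncompress", "blast_index"])`, without touching the engine itself.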