AGML

From Wikipedia, the free encyclopedia

Two-dimensional gel electrophoresis (2-DE) is used to separate complex protein mixtures, first according to isoelectric point (pI) and then according to molecular weight (MW). This separation enables the scientist to identify the proteins by their published pI and MW values, or to subject them to further separation or identification using mass spectrometry. 2-DE is a time-consuming procedure that produces copious amounts of raw data. Analyzing these data computationally and in context therefore requires meaningful data structures. Such data standards have been slow to emerge, both because of their complexity and because the integrative knowledge required to develop them is sparse. Annotated Gel Markup Language, or AGML, is an attempt to bridge this gap.
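
Identification of a spot by comparison with published pI and MW values can be illustrated with a minimal Python sketch. The reference table, the tolerance values, and the function name match_spot are hypothetical and chosen only for illustration; they do not come from AGML or from any published dataset, and in practice such matches are usually confirmed by mass spectrometry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceProtein:
    name: str
    pi: float      # isoelectric point
    mw_kda: float  # molecular weight in kilodaltons


# Hypothetical reference table (illustrative values, not curated data).
REFERENCES = [
    ReferenceProtein("protein_A", 5.2, 42.0),
    ReferenceProtein("protein_B", 6.8, 66.5),
    ReferenceProtein("protein_C", 5.3, 15.0),
]


def match_spot(pi, mw_kda, references=REFERENCES, pi_tol=0.2, mw_rel_tol=0.10):
    """Return names of references whose pI and MW both lie within tolerance.

    pi_tol is an absolute tolerance in pH units; mw_rel_tol is a relative
    tolerance on molecular weight. Both defaults are assumptions.
    """
    hits = []
    for ref in references:
        pi_ok = abs(ref.pi - pi) <= pi_tol
        mw_ok = abs(ref.mw_kda - mw_kda) <= mw_rel_tol * ref.mw_kda
        if pi_ok and mw_ok:
            hits.append(ref.name)
    return hits


if __name__ == "__main__":
    # A spot observed at pI 5.25 and 41 kDa.
    print(match_spot(5.25, 41.0))
```

Requiring both coordinates to match reflects the two separation dimensions of the gel: a spot close in pI but far in MW (protein_C in the table above) is not reported.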

Although two-dimensional gel electrophoresis (2-DE) is an effective and widely used method for screening the proteome, its data standardization has not yet matured to the level of microarray genomics data or mass spectrometry approaches. The trend toward identifying encompassing data standards has been expanding from genomics to transcriptomics, and more recently to proteomics. The relative success of genomic and transcriptomic data standardization has enabled the development of central repositories such as GenBank and Gene Expression Omnibus. An equivalent 2-DE-centric data structure would similarly have to balance raw data, basic feature detection results, a sufficient description of the experimental context and methods, and an overall structure that supports a diversity of uses, from central repositories to local data representation in LIMS.

Achieving such a balance can only be accomplished through several iterations involving bioinformaticians, bench molecular biologists, and the manufacturers of the equipment and commercial software from which the data is primarily generated. Such an encompassing data structure has been described, developed as the mature successor to the well-established and broadly used earlier version. A public repository, AGML Central, is configured with a suite of tools for conversion from a variety of popular formats, web-based visualization, and interoperation with other tools and repositories, and is particularly oriented toward mass spectrometry, with input/output for annotation and data analysis. Structured language is especially important in this field and would greatly ease the transition of 2-D gel electrophoresis into the semantic realm. This would enable more meaningful searching of the data and eventual integration and relationship formation using computational algorithms (adapted from Stanislaus et al. 2008; see http://www.biomedcentral.com/1471-2105/9/4/).

Annotated Gel Markup Language, or AGML, is an XML-based language that has been proposed for marking up data obtained by 2-D gel electrophoresis. The eXtensible Markup Language (XML) is particularly well suited to representing biological data and methods and is the usual choice for this purpose in most areas. XML syntax was therefore used to describe data acquired from 2-D gel electrophoresis and mass spectrometry. The goal of AGML is to enable proteomics research to move into a browsing mode of searching through existing databases.

The AGML Central project is part of a wider XML data model for representing 2-D gel electrophoresis data. The aim is to represent faithfully both the data and results and the experimental protocols and methods used to produce them. The major advantage is that, when analysing 2-D gel electrophoresis data stored in AGML (an XML data structure), all the pertinent information can be found in one data file. The AGML 2.0 data structure can store both gel and mass spectrometry data, as well as experimental methods through the use of MI2DG.
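
The idea of keeping data and methods together in one XML file can be sketched in Python as follows. The element and attribute names used here (gel, method, spot, massSpec and so on) are hypothetical stand-ins and are not taken from the AGML 2.0 schema; the sketch only shows how a single document can carry both the experimental method and the spot data, and how a program might read them back.

```python
import xml.etree.ElementTree as ET

# Hypothetical AGML-like document. Element and attribute names are
# illustrative assumptions, not the real AGML 2.0 schema.
DOC = """
<gel id="gel-001">
  <method>
    <firstDimension>isoelectric focusing</firstDimension>
    <secondDimension>SDS-PAGE</secondDimension>
  </method>
  <spots>
    <spot id="s1" pI="5.2" mw="42.0">
      <massSpec peptideCount="7"/>
    </spot>
    <spot id="s2" pI="6.8" mw="66.5"/>
  </spots>
</gel>
"""


def load_gel(xml_text):
    """Parse the sketch document into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    # Method description: one entry per child element of <method>.
    methods = {child.tag: child.text for child in root.find("method")}
    spots = []
    for spot in root.iter("spot"):
        spots.append({
            "id": spot.get("id"),
            "pI": float(spot.get("pI")),
            "mw": float(spot.get("mw")),
            # True when the spot carries attached mass spectrometry data.
            "has_ms": spot.find("massSpec") is not None,
        })
    return {"id": root.get("id"), "methods": methods, "spots": spots}


if __name__ == "__main__":
    gel = load_gel(DOC)
    print(gel["methods"])
    print([s["id"] for s in gel["spots"] if s["has_ms"]])
```

Because the method description and the spot list share one root element, a single parse is enough to answer questions that span both, such as which spots from a given separation protocol have mass spectrometry annotations.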

External links