User:Tim@/PPIP
From Wikipedia, the free encyclopedia
[edit] Help me make a page for Protein-Protein Interaction Prediction
[edit] To Do List
- 8) help?
[edit] Draft article begins here
[edit] The prediction of an interaction or binding between proteins.
(Computational multi-genome assays are of primary interest.)
Protein-protein interaction prediction is often referred to by the acronym PPIP.
Protein-protein interaction prediction is a branch of bioinformatics that seeks to use computational methods to predict complex protein-protein interactions. Understanding the functional interactions of proteins is an important research focus in biochemistry and a central concern of proteomics.
The promise of the Human Genome Project was that determining the human genetic code would lead to great advances in understanding human disease and in developing drugs and other therapies engineered to interact with specific biological processes. Since its completion, genome sequencing has become more accurate, less expensive, and quicker, and many sequenced genomes now await study. However, it has become clear that DNA sequence data alone is not enough. The human genome is estimated to encode between 40,000 and 100,000 proteins (the proteome), and in order to make use of the DNA sequence, protein sequences must be predicted and verified, protein structures predicted and verified, and protein-protein interactions predicted and verified.
Many methods exist to determine protein-protein interactions experimentally, including affinity chromatography, yeast two-hybrid screening, fluorescence resonance energy transfer (FRET), and surface plasmon resonance (SPR). However, with at least 40,000 unique protein sequences potentially available in every cell, the number of possible interactions is so large that physical testing can only be performed, at great expense, on a small subset of the most interesting proteins. Protein-protein interaction prediction seeks to use computational methods to investigate many possible interactions rapidly in order to identify those with the greatest potential for experimental investigation.
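To give a sense of scale, the number of candidate pairs grows quadratically with the number of proteins. The following is a minimal sketch that takes the 40,000 figure from the text and, as simplifying assumptions, ignores self-interactions and complexes of more than two proteins. The function name is illustrative only.

```python
from math import comb

def candidate_pairs(n_proteins: int) -> int:
    """Count unordered pairs of distinct proteins (no self-pairs)."""
    return comb(n_proteins, 2)

pairs = candidate_pairs(40_000)
print(f"{pairs:,}")  # 799,980,000
```

Even at this lower bound, close to 800 million pairwise tests would be needed to cover every combination experimentally.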
Knowledge is the first step, understanding the second, and helping the third. PPIP provides a means of gaining knowledge of how organisms work. Understanding the inner workings of organisms makes it possible to develop methods for repairing them when they malfunction and for helping them resist attack. PPIP has contributed to progress against genetic and infectious diseases. One such example is work on the MCF7 breast cancer cell line, in which insight into the tumour suppressor protein RSU1 was gained from the predicted BUB3-ZNF207 interaction.
[edit] Methods
[edit] Dynamics Method
- Simple brute force approach:
- The Dynamics Method performs PPIP using the same rules as the real system by simulating the dynamics of every force on every atom in two proteins of interest in order to predict first folding, and then interaction. It then does the same for every potential protein pair combination in the genome.
- Advantages and disadvantages
- Hypothetically accurate.
- Currently infeasible due to massive computational requirements.
[edit] Folding and Docking
- The unworkable Dynamics Method can be broken up into two smaller sub-problems, Folding and Docking, to avoid or minimise the computation of dynamics:
- The most effective Folding Prediction Method predicts protein folding structures using a reasonable amount of computational time by using statistical substitution, followed by tweaking. Statistical substitution involves folding a small number of amino acids or residues by using the previously observed statistically dominant folding configuration. Tweaking is similar to heating the structure in that it introduces small random changes and selects those that have the lowest energy states.
- Advantages and disadvantages
- Reasonable results for individual predictions.
- Accuracy improves as more folding conformations are verified.
- Too slow to run on a genome-wide scale.
- Not accurate with atypical structures.
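The "tweaking" step described above can be sketched as a random search that keeps only energy-lowering changes. This is a toy illustration, not a real folding method: the conformation is reduced to a list of angles, `toy_energy` is a hypothetical stand-in for a physical energy function, and the `preferred` values stand in for the statistically dominant configuration supplied by statistical substitution. Accepting only downhill moves is a greedy simplification; practical methods usually also accept some uphill moves to escape local minima.

```python
import math
import random

def toy_energy(angles, preferred):
    """Hypothetical energy: zero when every angle equals its preferred value."""
    return sum(1.0 - math.cos(a - p) for a, p in zip(angles, preferred))

def tweak(angles, preferred, steps=2000, max_step=0.2, seed=0):
    """Apply small random changes, keeping only those that lower the energy."""
    rng = random.Random(seed)
    current = list(angles)
    energy = toy_energy(current, preferred)
    for _ in range(steps):
        i = rng.randrange(len(current))           # pick one angle to perturb
        trial = list(current)
        trial[i] += rng.uniform(-max_step, max_step)
        trial_energy = toy_energy(trial, preferred)
        if trial_energy < energy:                 # keep only improvements
            current, energy = trial, trial_energy
    return current, energy

preferred = [-1.0, -0.8, 2.4, -1.1]  # invented "statistically dominant" angles
start = [0.0, 0.0, 0.0, 0.0]
final, final_energy = tweak(start, preferred)
print(toy_energy(start, preferred), final_energy)
```

The energy after tweaking is never higher than the starting energy, because every rejected change leaves the structure untouched.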
- Once protein folding has been successfully modeled, Protein Docking is the next logical step. To simplify the dynamics of docking, Binary docking methods find potentially active sites on a single folded protein structure and match them to active sites on a second protein using pattern recognition software or geometric hashing algorithms. Conserved domains are observed [52] and used to imply potential binding partners because surface complementarity between interacting protein sites is high.
- Advantages and disadvantages
- Multiple protein dockings are also being accurately predicted.
- Relies on folding information that is not available for much of any genome.
- Too slow for a genome-wide tool.
- Low reliability.
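Surface complementarity between candidate sites can be illustrated with a deliberately simple score. This sketch is not geometric hashing and not any published docking algorithm: each site is reduced to a hypothetical one-dimensional depth profile (positive numbers for bumps, negative for holes), and two sites score highly when each bump on one meets a hole of similar depth on the other. All site names and profiles are invented.

```python
def complementarity(site_a, site_b):
    """Score in [0, 1]; 1.0 when every bump on one site meets an equal hole on the other."""
    if len(site_a) != len(site_b):
        return 0.0
    mismatch = sum(abs(a + b) for a, b in zip(site_a, site_b))
    scale = sum(abs(a) + abs(b) for a, b in zip(site_a, site_b))
    return 1.0 if scale == 0 else 1.0 - mismatch / scale

def best_site_pairs(sites_a, sites_b, threshold=0.8):
    """Return (name_a, name_b, score) for site pairs at or above the threshold, best first."""
    hits = []
    for name_a, profile_a in sites_a.items():
        for name_b, profile_b in sites_b.items():
            score = complementarity(profile_a, profile_b)
            if score >= threshold:
                hits.append((name_a, name_b, score))
    return sorted(hits, key=lambda hit: -hit[2])

# Invented depth profiles for the candidate sites on two folded proteins.
sites_a = {"a1": [2, -1, 3], "a2": [0, 4, -2]}
sites_b = {"b1": [-2, 1, -3], "b2": [1, 1, 1]}
print(best_site_pairs(sites_a, sites_b))  # [('a1', 'b1', 1.0)]
```

A real docking method would compare three-dimensional surfaces under rotation and translation, which is where most of the computational cost arises.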
[edit] Sequence Method
- The Sequence Method is an attempt to avoid the modelling of folding and docking altogether by using direct pattern recognition of the binding sequences.
- Advantages and disadvantages
- Fast enough to be used as a genome-wide tool.
- Oversimplification is possible.
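A minimal sketch of direct pattern recognition on sequences follows. It assumes a table of motif pairs previously seen together in interacting proteins; the motifs and sequences here are invented for illustration and do not correspond to known binding sequences.

```python
def predict_by_motifs(seq_a, seq_b, motif_pairs):
    """Return the motif pairs found across the two sequences, in either orientation."""
    hits = []
    for m1, m2 in motif_pairs:
        if (m1 in seq_a and m2 in seq_b) or (m2 in seq_a and m1 in seq_b):
            hits.append((m1, m2))
    return hits

# Hypothetical motif pairs "observed" in known interacting proteins.
motif_pairs = [("KRK", "DED"), ("HHH", "CCC")]

print(predict_by_motifs("MAKRKLLG", "GGDEDAA", motif_pairs))  # [('KRK', 'DED')]
print(predict_by_motifs("MAKRKLLG", "GGAAAAA", motif_pairs))  # []
```

The sketch also shows where oversimplification enters: a substring match says nothing about whether the motif is exposed on the folded protein's surface.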
[edit] Graph Learning Method
- The Graph Learning Method improves on the sequence method and its problems by programming a computer to learn what attributes are important for PPIP by identifying patterns in observed interactions. It then uses these attribute patterns for PPIP.
- Advantages and disadvantages
- Fast genome wide tool.
- Good reliability
[edit] Vector Learning Method
- The Vector Learning Method is an alternative to the Graph Learning Method, and the two currently compete for the title of most efficient method; both machine learning methods probably have equal potential. A training set is mapped to an n-dimensional space in which successful combinations of residues (amino acids) occupy a region of the hyperspace. Each piece of the pattern, or residue attribute, is mapped to a separate dimension (“vectorization”). Unlike ordinary two-dimensional (latitude and longitude) city maps, protein pattern maps are most effective when using more than 20 dimensions. If a potential protein pair lies within the region identified as successful, an interaction is predicted.
- Advantages and disadvantages
- Fast genome wide tool.
- Good reliability
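The vectorization idea can be sketched with amino acid composition as the feature set: each protein becomes a 20-dimensional vector, and a pair becomes a 40-dimensional one. For brevity this sketch swaps in a nearest-centroid rule in place of a support vector machine or similar learner, so it illustrates the mapping into a high-dimensional space and not any particular published classifier. The training sequences are invented.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """20-dimensional vector: fraction of each standard amino acid in seq."""
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def pair_vector(seq_a, seq_b):
    """40-dimensional vector for a protein pair (concatenated compositions)."""
    return composition(seq_a) + composition(seq_b)

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def squared_distance(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def train(positive_pairs, negative_pairs):
    """Summarise each class by the centroid of its pair vectors."""
    pos = centroid([pair_vector(a, b) for a, b in positive_pairs])
    neg = centroid([pair_vector(a, b) for a, b in negative_pairs])
    return pos, neg

def predict(model, seq_a, seq_b):
    """Predict an interaction if the pair lies closer to the positive centroid."""
    pos, neg = model
    v = pair_vector(seq_a, seq_b)
    return squared_distance(v, pos) < squared_distance(v, neg)

# Invented training data: ordered pairs labelled interacting / non-interacting.
positives = [("KKKKRRAA", "DDDEEEAA"), ("KRKRKRGG", "EDEDEDGG")]
negatives = [("KKKKRRAA", "KRKRKRGG"), ("DDDEEEAA", "EDEDEDGG")]
model = train(positives, negatives)

print(predict(model, "RRKKRKAG", "EEDDEDAG"))  # True
print(predict(model, "KKRRKRAG", "RKRKKKAG"))  # False
```

Because the two compositions are concatenated in order, the pair vector depends on which protein is listed first; a real system would need to handle that symmetry explicitly.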
[edit] Evolutionary Method
- Because a large amount of work has been done on interactomes, the Evolutionary Method is becoming a practical speedup. It uses the data from PPIPs or experimentally verified interaction maps to infer protein interaction for evolutionarily related organisms.
- Advantages and disadvantages
- Relies on interactomes.
- Fast genome wide tool.
- Good reliability.
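Transferring interactions between evolutionarily related organisms can be sketched as a lookup through an ortholog table. The protein names and the ortholog mapping below are hypothetical, and the sketch assumes a simple one-to-one mapping, which real genomes often violate.

```python
def transfer_interactions(known_interactions, orthologs):
    """Infer interactions in a target species from those known in a related source species.

    known_interactions: iterable of (protein_a, protein_b) in the source species.
    orthologs: dict mapping a source protein to its ortholog in the target species.
    """
    inferred = set()
    for a, b in known_interactions:
        # Only pairs where both partners have an ortholog can be transferred.
        if a in orthologs and b in orthologs:
            inferred.add(frozenset((orthologs[a], orthologs[b])))
    return inferred

# Hypothetical source-species interactome and ortholog table.
known = [("srcA", "srcB"), ("srcB", "srcC"), ("srcC", "srcD")]
orthologs = {"srcA": "tgtA", "srcB": "tgtB", "srcC": "tgtC"}

for pair in sorted(sorted(p) for p in transfer_interactions(known, orthologs)):
    print(pair)
```

The `srcC`-`srcD` interaction is not transferred because `srcD` has no known ortholog, which illustrates the method's dependence on existing interactomes and homology data.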
[edit] Validation
Predictions must be validated experimentally. However, all experimental methods are costly and carry unavoidable error, producing both false negatives (FN) and false positives (FP). Choosing and understanding the best available methods of verification is therefore very important.
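One common way to summarise agreement between predictions and an experimental data set is to count true positives, false positives and false negatives and derive precision and recall. The sketch below uses invented protein names; note that, as stated above, the experimental set itself contains errors, so these figures measure agreement with the experiment and not absolute correctness.

```python
def confusion_counts(predicted, observed):
    """Return (TP, FP, FN) for two sets of interactions."""
    tp = len(predicted & observed)   # predicted and observed
    fp = len(predicted - observed)   # predicted but not observed
    fn = len(observed - predicted)   # observed but not predicted
    return tp, fp, fn

def precision_recall(predicted, observed):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn = confusion_counts(predicted, observed)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def pair(a, b):
    """Unordered protein pair, so that (A, B) and (B, A) compare equal."""
    return frozenset((a, b))

predicted = {pair("A", "B"), pair("A", "C"), pair("B", "D")}
observed = {pair("B", "A"), pair("B", "D"), pair("C", "D")}

print(confusion_counts(predicted, observed))  # (2, 1, 1)
print(precision_recall(predicted, observed))
```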
[edit] Significant results
Many new drugs and biological insights are developed starting with PPIP before moving on to experimental methods, saving time and millions of dollars in the process[citation needed].
PPIP produces results that need biological verification and further exploration before the results can be used to cure diseases with new drugs or understanding. The results are used heavily as a starting point for biological research where most of the metabolic pathway of interest is unknown[citation needed].
Interpreting the results of PPIP can be problematic because of the volume of data generated. The data are therefore often organised hierarchically, or as an interactome. Two approaches work best: the first is to display only one or two interaction links deep of the hierarchy at a time; the second is to make the highly interactive (hub) proteins the roots of the interaction trees, creating groupings of functionally and spatially related proteins.
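Both display strategies can be sketched on a small interaction graph: hubs are chosen by number of interaction partners, and a breadth-first search limited to a fixed depth yields the proteins to show around a chosen root. The interaction list and function names are invented for illustration.

```python
from collections import defaultdict, deque

def build_graph(interactions):
    """Undirected adjacency map from a list of (protein_a, protein_b) pairs."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def hubs(graph, top=1):
    """Proteins with the most interaction partners (ties broken by name)."""
    return sorted(graph, key=lambda p: (-len(graph[p]), p))[:top]

def neighbourhood(graph, root, max_depth=2):
    """Map each protein within max_depth links of root to its distance from root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue                      # do not expand beyond the display depth
        for nxt in sorted(graph[node]):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth

interactions = [("H", "A"), ("H", "B"), ("H", "C"), ("C", "D"), ("D", "E")]
graph = build_graph(interactions)
root = hubs(graph)[0]
print(root)                         # H
print(neighbourhood(graph, root))   # {'H': 0, 'A': 1, 'B': 1, 'C': 1, 'D': 2}
```

Protein `E` lies three links from the hub and is therefore left out of the two-link view.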
The main goal of proteomics is to predict the structures, interactions and functions of the proteins. Specific function is only found through interactions. The prediction of protein-protein interactions is of vital interest in proteomics.
[edit] References
- Biological Papers:
- Margulies, Marcel., Egholm, Michael., Altman, William E., Attiya, Said., Bader, Joel S., Bemben, Lisa A., Berka, Jan., Braverman, Michael S., Chen, Yi-Ju., Chen, Zhoutao., Dewell, Scott B., Du, Lei., Fierro, Joseph M. et al. (2005). Genome sequencing in microfabricated high-density picolitre reactors. Nature., 437, 376-380.
- Folding Papers:
- Tropsha, Alexander & Edelsbrunner, Herbert. Biogeometry Applications of Computational Geometry to Molecular Structure. School of Pharmacy, University of North Carolina.
- Bowie, James U. (2005). Solving the Membrane Protein-Folding Problem. Nature. 438, 581-589.
- Docking Papers:
- Ansari, Sam & Helms, Volkhard. (2005). Statistical Analysis of Predominantly Transient Protein-Protein Interfaces. Proteins: Structure, Function and Bioinformatics., 64, 344-355.
- Kim, Wan Kyu & Ison, Jon C. (2005). Survey of the Geometric Association of Domain-Domain Interfaces. Proteins., 61(4), 1075-1088.
- Mooney, Sean D., Liang, Mike Hsing-Ping., Deconde, Rob & Altman, Ross B. (2005). Structural Characterization of Proteins Using Residue Environments. Proteins: Structure, Function and Bioinformatics., 61, 741-747.
- Terashi, Genki., Takeda-Shitaka, Mayuko., Takaya, Daisuke., Komatsu, Katsuichiro & Umeyama, Hideaki. (2005). Searching for Protein-Protein Interaction Sites and Docking by Methods of Molecular Dynamics, Grid Scoring, and the Pairwise Interaction Potential of Amino Acid Residues. Proteins: Structure, Function and Bioinformatics., 60, 289-295.
- Sequence Papers:
- Chinnasamy, Arunkumar., Mittal, Ankush & Sung, Wing-Kin. (2005). Probabilistic prediction of protein–protein interactions from the protein sequences. Computers in Biology and Medicine., 1-12.
- Nanni, Loris. (2005). Hyperplanes for Predicting Protein-Protein Interactions. Neurocomputing., 69, 257-263.
- Chen, Xue-wen & Liu, Mei. (2005). Prediction of Protein-Protein Interactions Using Random Decision Forest Framework. Bioinformatics, 1-4.
- Han, Dong-soo., Kim, Hong-soo., Jang, Wong-Hyuk., Lee, Sung-Doke & Suh, Jung-Keun. (2004). PreSPI: a domain combination based prediction system for protein–protein interaction. Nucleic Acids Research., 32(21), 6312-6320.
- Vector Learning Papers:
- Ling Lo, Siaw., Cai Zhong, Cong., Chen, Yu Zong & Chung, Maxey C. M. (2005). Effect of training datasets on support vector machine prediction of protein-protein interactions. Proteomics., 5, 876-884.
- Gao, Qing-Bin & Wang, Zheng-Zhi. (2005). Using Nearest Feature Line and Tunable Nearest Neighbor methods for prediction of protein subcellular locations. Computational Biology and Chemistry., 29, 388-392.
- Webb-Robertson, Bobbie-Jo., Oehmen, Christopher & Matzke, Melissa. (2005). SVM-BALSA: Remote homology detection based on Bayesian sequence alignment. Computational Biology and Chemistry., 29, 440-443.
- Dubey, Anshul., Realff, Matthew J., Lee, Jay H. & Bommarius, Andreas S. (2005). Support vector machines for learning to identify the critical positions of a protein. Journal of Theoretical Biology., 234, 351-361.
- Review Papers:
- Uetz, Peter & Vollert, Carolina S. Protein-Protein Interactions. (2005) Encyclopedic References of Genomics and Proteomics in Molecular Medicine.
- Gomez, Manuel., Alonso-Allende, Ramón., Pazos, Florencio., Grana, Osvaldo., Juan, David & Valencia, Alfonso. (2004). Accessible Protein Interaction Data for Network Modeling. Structure of the information and available repositories. Structural Bioinformatics group.
- Wodak, Shoshana J. & Mendez, Raul. (2004). Prediction of Protein-Protein Interactions: the CAPRI Experiment, its evaluation and implications. Current Opinion in Structural Biology., 14, 242-249.
- Interactome Papers:
- Nabieva, Elena., Jim, Kam., Agarwal, Amit., Chazelle, Bernard & Singh, Mona. (2005). Whole-proteome Prediction of Protein Function Via Graph-Theoretic Analysis of Interaction Maps. Bioinformatics., 21, 302-310.
- Rhodes, David R., Tomlins, Scott A., Varambally, Sooryanarayana., Mahavisno, Vasudeva., Barrette, Terrence., Kalyana-Sundaram, Shanker., Ghosh, Debashis., Pandey, Akhilesh & Chinnaiyan, Arul M. (2005). Probabilistic model of the Human Protein-Protein Interaction Network. Nature Biotechnology., 23(8), 951-959.
[edit] See also
[edit] External links
- Online protein-protein interaction prediction services
- PubMed
- kernel machines
- Folding@Home