List of disorder prediction software
Computational methods exploit the sequence signatures of disorder to predict whether a protein is disordered, given its amino acid sequence. The table below, originally adapted from Ferron et al.[1] and since updated, shows the main features of software for disorder prediction. Note that different programs use different definitions of disorder.
Predictor | What is predicted | Based on | Generates and uses multiple sequence alignment? |
---|---|---|---|
SPINE-D[2] | Long and short disordered regions, reported as semi-disorder (0.4-0.7) or full disorder (0.7-1.0). Semi-disorder is semi-collapsed with some secondary structure. | A neural-network-based three-state predictor that uses both local and global features. Ranked in the top 5 by AUC in CASP 9. | Yes |
PONDR | All regions that are not rigid including random coils, partially unstructured regions, and molten globules | Local aa composition, flexibility, hydropathy, etc. | No |
GlobPlot | Regions with high propensity for globularity on the Russell/Linding scale (propensities for secondary structures and random coils) | Russell/Linding scale of disorder | No |
DisEMBL | LOOPS (regions devoid of regular secondary structure); HOT LOOPS (highly mobile loops); REMARK465 (regions lacking electron density in crystal structure) | Neural networks trained on X-ray structure data | No |
SEG | Low-complexity segments, that is, “simple sequences” or “compositionally biased regions” | Locally optimized low-complexity segments are produced at defined levels of stringency and then refined according to the equations of Wootton and Federhen | No |
Disopred2[3] | Regions devoid of ordered regular secondary structure | Cascaded support vector machine classifiers trained on PSI-BLAST profiles | Yes |
OnD-CRF[4] | The transition between structurally ordered and mobile or disordered amino acid intervals under native conditions | Conditional random fields (CRFs), which rely on features generated from the amino acid sequence and from secondary structure prediction | No |
NORSp | Regions with No Ordered Regular Secondary Structure (NORS). Most, but not all, are highly flexible. | Secondary structure and solvent accessibility | Yes |
FoldIndex[5] | Regions that have a low hydrophobicity and high net charge (either loops or unstructured regions) | Charge/hydropathy analyzed locally using a sliding window | No |
Charge/hydropathy method[6] | Fully unstructured domains (random coils) | Global sequence composition | No |
HCA (Hydrophobic Cluster Analysis) | Hydrophobic clusters, which tend to form secondary structure elements | Helical visualization of amino acid sequence | No |
PreLink | Regions that are expected to be unstructured in all conditions, regardless of the presence of a binding partner | Compositional bias and low hydrophobic cluster content. | No |
IUPred | Regions that lack a well-defined 3D-structure under native conditions | Energy resulting from inter-residue interactions, estimated from local amino acid composition | No |
RONN | Regions that lack a well-defined 3D structure under native conditions | Bio-basis function neural network trained on disordered proteins | No |
MD (Meta-Disorder predictor)[7] | Regions of different "types"; for example, unstructured loops and regions containing few stable intra-chain contacts | A neural-network based meta-predictor that uses different sources of information predominantly obtained from orthogonal approaches | Yes |
GeneSilico Metadisorder[8] | Regions that lack a well-defined 3D structure under native conditions (REMARK-465) | Meta method that uses other disorder predictors (such as RONN, IUPred, POODLE, and many more). A consensus is calculated from their outputs according to each method's accuracy (optimized using ANN, filtering and other techniques). Reported to have taken the first two places in the last CASP experiment (a blind test) at the time of writing. | Yes |
IUPforest-L | Long disordered regions in a set of proteins | Moreau-Broto auto-correlation function of amino acid indices (AAIs) | No |
MFDp[9] | Different types of disorder including random coils, unstructured regions, molten globules, and REMARK-465-based regions | An ensemble of three SVMs, specialized for the prediction of short, long and generic disordered regions, that combines three complementary disorder predictors with sequence, sequence profiles, predicted secondary structure, solvent accessibility, backbone dihedral torsion angles, residue flexibility and B-factors. MFDp (unofficially) secured 3rd place in the last CASP experiment. | Yes |
ESpritz | Disorder definitions include missing X-ray atoms (short), DisProt-style disorder (long), and NMR flexibility. A probability of disorder is supplied with two decision thresholds, which depend on a user-preferred false positive rate. | Bi-directional neural networks trained on diverse, high-quality data derived from the Protein Data Bank and DisProt. Compares extremely well with other CASP 9 servers. The method was designed to be very fast. | No |
CSpritz | Disorder definitions include missing X-ray atoms (short) and DisProt-style disorder (long). A probability of disorder is supplied with two decision thresholds, which depend on the false positive rate. Linear motifs within a disordered segment are determined by simple pattern matching from ELM. | Support vector machine and bi-directional neural networks trained on diverse, high-quality data derived from the Protein Data Bank and DisProt. Structural information is also supplied in the form of homologous templates. Compares extremely well with other CASP 9 servers. | Yes |
MeDor (Metaserver of Disorder)[10] | Regions of different "types". MeDor provides a unified view of multiple disorder predictors. | Meta method that uses other disorder predictors (like FoldIndex, DisEMBL REMARK465, IUPred, RONN ...) and provides additional features (like HCA plot, secondary structure prediction, transmembrane domains ...) that together help the user define regions involved in disorder | No |
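
The charge/hydropathy idea behind two of the rows above (the global charge/hydropathy method and the sliding-window FoldIndex) can be illustrated with a short Python sketch. This is not the code of either tool. It assumes the Kyte-Doolittle hydropathy scale rescaled to the range 0 to 1, a net charge that counts only K/R as +1 and D/E as -1, and the commonly quoted boundary coefficients 2.785 and 1.151, none of which are stated in the table. The function names and the default window of 51 residues are likewise illustrative choices.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not given in the table).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified charges: K/R count as +1, D/E as -1, everything else as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def mean_hydropathy(seq):
    """Mean hydropathy with each residue rescaled from [-4.5, 4.5] to [0, 1]."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Net charge per residue under the simplified charge model above."""
    return sum(CHARGE.get(aa, 0) for aa in seq) / len(seq)


def fold_score(seq):
    """Global score: positive suggests folded, negative suggests unfolded.

    The coefficients 2.785 and 1.151 are assumed boundary values.
    """
    return 2.785 * mean_hydropathy(seq) - abs(mean_net_charge(seq)) - 1.151


def windowed_fold_scores(seq, window=51):
    """Local variant: entry i is the score of seq[i:i + window].

    Sequences shorter than the window are scored once as a whole.
    """
    if not seq:
        return []
    if len(seq) < window:
        return [fold_score(seq)]
    return [fold_score(seq[i:i + window]) for i in range(len(seq) - window + 1)]


# Hypothetical toy sequences, chosen only to show the sign of the score.
print(fold_score("KEKEKEKEKE") < 0)   # charged, low hydropathy: True
print(fold_score("LIVLIVLIVL") > 0)   # strongly hydrophobic: True
```

Scoring the whole sequence once corresponds to the "global sequence composition" entry, while the windowed variant corresponds to the "analyzed locally using a sliding window" entry. Real predictors differ in details such as window handling at the sequence ends.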
References
- ↑ Ferron F, Longhi S, Canard B, Karlin D (October 2006). "A practical overview of protein disorder prediction methods". Proteins 65 (1): 1–14. doi:10.1002/prot.21075. PMID 16856179.
- ↑ Zhang T, Faraggi E, Xue B, Dunker K, Uversky VN, Zhou Y (February 2012). "SPINE-D: Accurate prediction of short and long disordered regions by a single neural-network based method". Journal of Biomolecular Structure and Dynamics 29 (4): 799–813. doi:10.1080/073911012010525022. PMC 3297974. PMID 22208280.
- ↑ Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT (March 2004). "Prediction and functional analysis of native disorder in proteins from the three kingdoms of life". J. Mol. Biol. 337 (3): 635–45. doi:10.1016/j.jmb.2004.02.002. PMID 15019783.
- ↑ Wang L, Sauer UH (June 2008). "OnD-CRF: predicting order and disorder in proteins using conditional random fields". Bioinformatics 24 (11): 1401–2. doi:10.1093/bioinformatics/btn132. PMC 2387219. PMID 18430742.
- ↑ Prilusky J, Felder CE, Zeev-Ben-Mordehai T, et al. (August 2005). "FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded". Bioinformatics 21 (16): 3435–8. doi:10.1093/bioinformatics/bti537. PMID 15955783.
- ↑ Uversky VN, Gillespie JR, Fink AL (November 2000). "Why are "natively unfolded" proteins unstructured under physiologic conditions?". Proteins 41 (3): 415–27. doi:10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7. PMID 11025552.
- ↑ Schlessinger A, Punta M, Yachdav G, Kajan L, Rost B (2009). Orgel, Joseph P. R. O., ed. "Improved disorder prediction by combination of orthogonal approaches". PLoS ONE 4 (2): e4433. doi:10.1371/journal.pone.0004433. PMC 2635965. PMID 19209228.
- ↑ Kozlowski, L. P.; Bujnicki, J. M. (2012). "MetaDisorder: A meta-server for the prediction of intrinsic disorder in proteins". BMC Bioinformatics 13: 111. doi:10.1186/1471-2105-13-111. PMC 3465245. PMID 22624656.
- ↑ Mizianty MJ, Stach W, Chen K, Kedarisetti KD, Disfani FM, Kurgan L (September 2010). "Improved sequence-based prediction of disordered regions with multilayer fusion of multiple information sources". Bioinformatics 26 (18): i489–96. doi:10.1093/bioinformatics/btq373. PMC 2935446. PMID 20823312.
- ↑ Lieutaud P, Canard B, Longhi S (September 2008). "MeDor: a metaserver for predicting protein disorder". BMC Genomics 9 (Suppl 2): S25. doi:10.1186/1471-2164-9-S2-S25. PMC 2559890. PMID 18831791.