Statistical potential

From Wikipedia, the free encyclopedia

In protein structure prediction, a statistical potential (also knowledge-based potential, empirical potential, or residue contact potential) is an energy function derived from empirical observations of how often pairs of amino acid residues are in contact in the native-state tertiary structures of proteins. In its simplest form, a statistical potential is formulated as an interaction matrix that assigns a weight or energy value to each possible contact pair of standard amino acids. The energy of a particular structural model is then the combined energy of all the residue-residue contacts (often defined as residues within 4 Å of each other) identified in the structure. The probabilities or weights are determined by statistical examination of the native contacts present in a database of structures drawn from the Protein Data Bank. According to the energy landscape hypothesis of protein folding, structures that closely resemble the native state should be distinguishably lower in free energy than those that diverge widely from it.
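As an illustration, the contact-sum energy described above can be sketched in Python. The three-residue interaction matrix and its values, the function name contact_energy, the use of one representative point per residue, and the rule of skipping sequence-adjacent residues are all assumptions made for this example; none of them is taken from a published potential.

```python
import math
from itertools import combinations

# Hypothetical toy interaction matrix (arbitrary energy units).
# A real potential covers all 20 standard amino acids and is fitted
# to contact statistics from known structures.
CONTACT_ENERGY = {
    frozenset(["LEU"]): -1.0,          # LEU-LEU
    frozenset(["LEU", "VAL"]): -0.8,
    frozenset(["VAL"]): -0.7,          # VAL-VAL
    frozenset(["LEU", "LYS"]): 0.3,
    frozenset(["VAL", "LYS"]): 0.2,
    frozenset(["LYS"]): 0.6,           # LYS-LYS
}


def contact_energy(residues, cutoff=4.0, min_separation=2):
    """Sum the pair energies of all residue-residue contacts in a model.

    residues: list of (name, (x, y, z)), one representative point per
    residue, coordinates in angstroms.
    cutoff: two residues count as a contact if they lie within this distance.
    min_separation: assumption of this sketch; pairs closer than this in
    sequence are skipped because chain neighbours are always close in space.
    """
    total = 0.0
    indexed = enumerate(residues)
    for (i, (name_i, xyz_i)), (j, (name_j, xyz_j)) in combinations(indexed, 2):
        if j - i < min_separation:
            continue
        if math.dist(xyz_i, xyz_j) <= cutoff:
            total += CONTACT_ENERGY[frozenset([name_i, name_j])]
    return total


# A compact model in which the two leucines touch, and an extended one
# with no contacts at all.
compact = [("LEU", (0.0, 0.0, 0.0)), ("LYS", (3.8, 0.0, 0.0)),
           ("VAL", (3.8, 3.8, 0.0)), ("LEU", (0.0, 3.8, 0.0))]
extended = [("LEU", (0.0, 0.0, 0.0)), ("LYS", (3.8, 0.0, 0.0)),
            ("VAL", (7.6, 0.0, 0.0)), ("LEU", (11.4, 0.0, 0.0))]

print(contact_energy(compact))
print(contact_energy(extended))
```

With these toy values the compact model scores lower than the extended one, because its single LEU-LEU contact contributes a favourable (negative) term.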

Statistical potentials are used as energy functions to assess ensembles of structural models produced by homology modeling or protein threading, that is, predictions of the tertiary structure adopted by a particular amino acid sequence, made by comparison with one or more homologous proteins whose structures have been experimentally determined. Many differently parameterized statistical potentials have been shown to successfully identify the native state structure from an ensemble of "decoy" or non-native structures.^[1]^[2]^[3] In response to criticism that statistical potentials capture only the tendency of hydrophobic amino acids to pack closely in the hydrophobic core of a globular protein, refinements have included the creation of two interaction matrices parameterized separately for residues in the core and those on the solvent-accessible surface of the protein.^[4]
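A minimal sketch of how an ensemble of models might be ranked with separate core and surface matrices is given below. The two-letter alphabet (H for hydrophobic, P for polar), the matrix values, the precomputed burial flags, the lattice-style coordinates, and the rule that the core matrix applies only when both residues are buried are all hypothetical choices made for illustration, not the parameterization of any cited study.

```python
import math
from itertools import combinations

# Hypothetical toy matrices (arbitrary units) over a two-letter alphabet.
CORE = {("H", "H"): -1.0, ("H", "P"): 0.5, ("P", "P"): 0.8}
SURFACE = {("H", "H"): -0.2, ("H", "P"): 0.0, ("P", "P"): -0.3}


def build(sequence, buried, lattice_path, spacing=3.8):
    """Turn lattice steps into a model: list of (type, buried, (x, y, z))."""
    return [(t, b, (x * spacing, y * spacing, 0.0))
            for t, b, (x, y) in zip(sequence, buried, lattice_path)]


def score(model, cutoff=4.0):
    """Sum contact energies, choosing the matrix by burial state."""
    total = 0.0
    for (i, (ti, bi, pi)), (j, (tj, bj, pj)) in combinations(enumerate(model), 2):
        # Skip chain neighbours and pairs that are not in contact.
        if j - i < 2 or math.dist(pi, pj) > cutoff:
            continue
        # Assumption of this sketch: core matrix only if both are buried.
        matrix = CORE if (bi and bj) else SURFACE
        total += matrix[tuple(sorted((ti, tj)))]
    return total


def rank_models(models):
    """Return model names ordered from lowest to highest energy."""
    return sorted(models, key=lambda name: score(models[name]))


SEQ = "HPPHP"
MODELS = {
    # Residues 0 and 3 (both H) are buried and in contact.
    "native_like": build(SEQ, [True, False, False, True, False],
                         [(0, 0), (1, 0), (1, 1), (0, 1), (0, 2)]),
    # Fully extended chain: no contacts.
    "extended": build(SEQ, [False] * 5,
                      [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]),
    # Residues 1 and 4 (both P) are buried and in contact.
    "misfolded": build(SEQ, [False, True, False, False, True],
                       [(-1, 0), (0, 0), (1, 0), (1, 1), (0, 1)]),
}

for name in rank_models(MODELS):
    print(name, score(MODELS[name]))
```

In this toy ensemble the native-like model ranks first because its buried hydrophobic contact is rewarded by the core matrix, while the misfolded model is penalized for burying a polar pair.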

The primary alternative method for assessing ensembles of models and identifying the lowest-energy structure among them relies on direct energy calculations, which are more computationally expensive than statistical potentials^[2] because they require the calculation of long-range electrostatic interactions.

References

1. Narang P, Bhushan K, Bose S, Jayaram B. (2006). Protein structure evaluation using an all-atom energy based empirical scoring function. J Biomol Struct Dyn 23(4):385-406.
2. Sippl MJ. (1993). Recognition of errors in three-dimensional structures of proteins. Proteins 17:355-62.
3. Bryant SH, Lawrence CE. (1993). An empirical energy function for threading protein sequence through the folding motif. Proteins 16(1):92-112.
4. Park K, Vendruscolo M, Domany E. (2000). Toward an energy function for the contact map representation of proteins. Proteins 40(2):237-48.