Scoring functions for docking

From Wikipedia, the free encyclopedia

In the fields of computational chemistry and molecular modelling, scoring functions are approximate mathematical methods used to predict the strength of the non-covalent interaction (also referred to as binding affinity) between two molecules. Most commonly one of the molecules is a small organic compound such as a drug and the second is the drug's biological target such as a protein receptor.[1] Scoring functions have also been developed to predict the strength of other types of intermolecular interactions, for example between two proteins[2] or between protein and DNA.[3]

Utility

Scoring functions are useful for:[4]

  • virtual screening of small-molecule databases to identify novel compounds that bind to a protein target of interest and are therefore useful starting points for drug discovery[5]
  • lead optimization of screening hits to improve their affinity and selectivity[6]
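
As a minimal sketch of the virtual screening use case, the snippet below ranks a small library by predicted score and keeps the best candidates. The compound names and score values are hypothetical, and the convention that a lower score means stronger predicted binding is an assumption; some scoring functions use the opposite sign.

```python
import heapq


def top_hits(scores, n):
    """Return the n (name, score) pairs with the lowest scores.

    Assumes lower score = stronger predicted binding (a convention, not a rule).
    """
    return heapq.nsmallest(n, scores.items(), key=lambda item: item[1])


# Hypothetical predicted scores in arbitrary units.
library_scores = {
    "cmpd_A": -7.2,
    "cmpd_B": -5.1,
    "cmpd_C": -9.4,
    "cmpd_D": -6.8,
}

hits = top_hits(library_scores, 2)
for name, score in hits:
    print(name, score)
```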

Prerequisites

Scoring functions are normally parameterized (or trained) against a data set of experimentally determined binding affinities between molecular species similar to those whose affinities one wishes to predict.
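
One simple way to picture this parameterization is an ordinary least-squares fit of term weights to known affinities. The sketch below assumes a linear scoring function with three hypothetical descriptors per complex (a constant term, a hydrogen-bond count and a rotatable-bond count). The "affinities" are synthetic values generated from made-up weights so that the fit can be checked; they are not experimental data, and real scoring functions may be trained quite differently.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_weights(descriptors, affinities):
    """Least-squares weights w minimizing sum_i (w . d_i - y_i)^2."""
    k = len(descriptors[0])
    # Normal equations: (D^T D) w = D^T y
    dtd = [[sum(d[i] * d[j] for d in descriptors) for j in range(k)]
           for i in range(k)]
    dty = [sum(d[i] * y for d, y in zip(descriptors, affinities))
           for i in range(k)]
    return solve_linear_system(dtd, dty)


# Hypothetical training set: (constant, hydrogen bonds, rotatable bonds).
descriptors = [(1, 2, 3), (1, 4, 1), (1, 0, 5), (1, 3, 2), (1, 1, 0)]
true_weights = [-1.0, -1.2, 0.3]  # made-up values used to synthesize the data
affinities = [sum(w * d for w, d in zip(true_weights, row))
              for row in descriptors]

weights = fit_weights(descriptors, affinities)
print(weights)
```

Because the synthetic data are noise-free, the fit recovers the generating weights up to rounding; with experimental affinities the residual error would indicate how well the chosen terms describe the training set.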

For predictions of affinities of ligands for proteins, the following must be known or predicted:

  • the tertiary structure (arrangement of atoms in three-dimensional space) of the protein. Protein structures may be determined by experimental techniques such as X-ray crystallography or solution-phase NMR methods, or predicted by homology modelling.
  • the three-dimensional shape (referred to as the "active" conformation) of the other binding partner
  • the relative orientation of the two binding partners (often referred to as the "binding mode") in the complex

The information above yields the three-dimensional structure of the complex. Based on this structure, the scoring function can then estimate the strength of the association between the two partners using one of the methods outlined below. Finally, the scoring function itself may be used to help predict both the binding mode and the active conformation of the small-molecule binding partner in the complex.
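
The last point, using the scoring function to help choose a binding mode, can be sketched as scoring each candidate pose and keeping the most favorable one. The scoring function here is a deliberately crude placeholder (a contact count with a clash penalty), and the coordinates, cutoffs and pose names are all hypothetical; it only illustrates the selection step, not any published scoring function.

```python
import math


def contact_score(protein_atoms, ligand_atoms, contact=4.0, clash=2.0):
    """Toy score: -1 per atom pair within `contact` angstroms,
    +10 per pair closer than `clash` angstroms. Lower is better."""
    score = 0.0
    for p in protein_atoms:
        for l in ligand_atoms:
            d = math.dist(p, l)
            if d < clash:
                score += 10.0
            elif d <= contact:
                score -= 1.0
    return score


def best_pose(protein_atoms, poses):
    """Return the name of the pose with the lowest score."""
    return min(poses, key=lambda name: contact_score(protein_atoms, poses[name]))


# Hypothetical coordinates in angstroms.
protein_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate_poses = {
    "near": [(1.5, 3.0, 0.0)],     # in contact with both protein atoms
    "far": [(10.0, 10.0, 10.0)],   # no contacts
    "clash": [(0.5, 0.0, 0.0)],    # overlaps a protein atom
}

print(best_pose(protein_atoms, candidate_poses))
```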

Classes

There are three general classes of scoring functions:

  • Force-field - affinities are estimated by summing the strength of intermolecular van der Waals and electrostatic interactions between all atoms of the two molecules in the complex. The intramolecular energies (also referred to as strain energy) of the two binding partners are also frequently included.
  • Empirical - based on counting the number of various types of interactions between the two binding partners.[7] These interactions may include, for example:
    • number of hydrogen bonds (favorable electrostatic contribution to affinity),
    • hydrophobic-hydrophobic contacts (favorable),
    • hydrophilic-hydrophobic contacts (unfavorable),
    • number of rotatable bonds immobilized in complex formation (unfavorable entropic contribution).
  • Knowledge-based - derived from statistical observations of intermolecular close contacts in large 3D databases (such as the Cambridge Structural Database or Protein Data Bank), which are used to derive "potentials of mean force". This method is founded on the assumption that close intermolecular interactions between certain types of atoms or functional groups that occur more frequently than one would expect from a random distribution are likely to be energetically favorable and therefore contribute favorably to binding affinity.[8]
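
The assumption behind the knowledge-based class is commonly expressed as an inverse Boltzmann relation, in which the energy assigned to a contact is -kT ln(observed / expected). The sketch below applies that relation to one atom-type pair across a few distance bins. The counts are hypothetical, the choice of reference ("expected") distribution is left as an input, and the handling of empty bins is an arbitrary simplification.

```python
import math

KT = 0.593  # kcal/mol, approximate value of kT at about 298 K


def potential_of_mean_force(observed, expected, kt=KT):
    """Per-bin energy -kT * ln(observed / expected) for one atom-type pair.

    Bins with a zero count are returned as None (no statistics available).
    """
    energies = []
    for n_obs, n_exp in zip(observed, expected):
        if n_obs == 0 or n_exp == 0:
            energies.append(None)
        else:
            energies.append(-kt * math.log(n_obs / n_exp))
    return energies


# Hypothetical contact counts per distance bin for one atom-type pair.
observed = [5, 40, 20, 10]   # counted in a structural database
expected = [10, 20, 20, 10]  # from a chosen random reference distribution

energies = potential_of_mean_force(observed, expected)
print(energies)
```

Bins in which contacts are over-represented relative to the reference receive a negative (favorable) energy, and under-represented bins receive a positive one.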

Finally, hybrid scoring functions have also been developed, in which terms from two or more of the above classes are combined into one function.
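
As an illustration of such a combination, the sketch below adds a force-field-style term (pairwise 12-6 Lennard-Jones plus Coulomb energy between protein and ligand atoms) to an empirical-style penalty per rotatable bond. The single Lennard-Jones parameter set for all atom pairs, the constant dielectric, the weights and the coordinates are all simplifying assumptions chosen for readability; intramolecular strain energy is omitted.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2)


def force_field_term(protein, ligand, epsilon=0.1, r_min=3.5, dielectric=4.0):
    """Sum Lennard-Jones and Coulomb energies over all protein-ligand atom pairs.

    Each atom is (x, y, z, partial_charge). One (epsilon, r_min) pair is used
    for every atom pair, which real force fields do not do.
    """
    energy = 0.0
    for px, py, pz, pq in protein:
        for lx, ly, lz, lq in ligand:
            r = math.dist((px, py, pz), (lx, ly, lz))
            ratio6 = (r_min / r) ** 6
            energy += epsilon * (ratio6 ** 2 - 2.0 * ratio6)         # van der Waals
            energy += COULOMB_CONSTANT * pq * lq / (dielectric * r)  # electrostatics
    return energy


def hybrid_score(protein, ligand, n_rotatable_bonds, w_ff=1.0, w_rot=0.3):
    """Weighted sum of a force-field term and an empirical rotatable-bond penalty."""
    return w_ff * force_field_term(protein, ligand) + w_rot * n_rotatable_bonds


# Hypothetical one-atom "protein" and "ligand" with opposite partial charges.
protein = [(0.0, 0.0, 0.0, -0.5)]
ligand = [(3.5, 0.0, 0.0, 0.5)]

ff = force_field_term(protein, ligand)
score = hybrid_score(protein, ligand, n_rotatable_bonds=2)
print(ff, score)
```

In a real hybrid function the weights would themselves be fitted against experimental affinities rather than set by hand.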

References

  1. ^ Jain AN (2006). "Scoring functions for protein-ligand docking". Curr. Protein Pept. Sci. 7 (5): 407–20. doi:10.2174/138920306778559395. PMID 17073693. 
  2. ^ Lensink MF, Méndez R, Wodak SJ (2007). "Docking and scoring protein complexes: CAPRI 3rd Edition". Proteins. doi:10.1002/prot.21804. PMID 17918726.
  3. ^ Robertson TA, Varani G (2007). "An all-atom, distance-dependent scoring function for the prediction of protein-DNA interactions from structure". Proteins 66 (2): 359–74. doi:10.1002/prot.21162. PMID 17078093. 
  4. ^ Rajamani R, Good AC (2007). "Ranking poses in structure-based lead discovery and optimization: current trends in scoring function development". Current Opinion in Drug Discovery & Development 10 (3): 308–15. PMID 17554857.
  5. ^ Seifert MH, Kraus J, Kramer B (2007). "Virtual high-throughput screening of molecular databases". Current Opinion in Drug Discovery & Development 10 (3): 298–307. PMID 17554856.
  6. ^ Joseph-McCarthy D, Baber JC, Feyfant E, Thompson DC, Humblet C (2007). "Lead optimization via high-throughput molecular docking". Current Opinion in Drug Discovery & Development 10 (3): 264–74. PMID 17554852.
  7. ^ Böhm HJ (1998). "Prediction of binding constants of protein ligands: a fast method for the prioritization of hits obtained from de novo design or 3D database search programs". J. Comput. Aided Mol. Des. 12 (4): 309–23. doi:10.1023/A:1007999920146. PMID 9777490. 
  8. ^ Muegge I (2006). "PMF scoring revisited". J. Med. Chem. 49 (20): 5895–902. doi:10.1021/jm050038s. PMID 17004705.