Binding Estimation after Refinement, a New Automated Procedure for the Refinement and Rescoring of Docked Ligands in Virtual Screening

Giulio Rastelli*, Gianluca Degliesposti, Alberto Del Rio and Miriam Sgobba

Dipartimento di Scienze Farmaceutiche, Università di Modena e Reggio Emilia, Via Campi 183, 41100 Modena, Italy

*Corresponding author: Giulio Rastelli

Binding estimation after refinement (BEAR) is a novel automated computational procedure suitable for correcting and overcoming limitations of docking procedures such as poor scoring function and the generation of unreasonable ligand conformations. BEAR makes use of molecular dynamics simulation followed by MM-PBSA and MM-GBSA binding free energy estimates as tools to refine and rescore the structures obtained from docking virtual screenings. As binding estimation after refinement relies on molecular dynamics, the entire procedure can be tailored to the needs of the end-user in terms of computational time and the desired accuracy of the results. In a validation test, binding estimation after refinement and rescoring resulted in a significant enrichment of known ligands among top scoring compounds compared with the original docking results. Binding estimation after refinement has direct and straightforward application in virtual screening for correcting both false-positive and false-negative hits, and should facilitate more reliable selection of biologically active molecules from compound databases.

Key words: AMBER, BEAR, binding free energy, docking, MM-GBSA, MM-PBSA, molecular dynamics, postdocking refinement, virtual screening

Received 20 October 2008, revised 29 December 2008 and accepted for publication 30 December 2008

Chemical and biological approaches to drug discovery have changed drastically in recent years due mainly to progressive advances in experimental and computational techniques.
In structure-based drug design, virtual screening represents a valuable computational approach for the rapid assessment of large libraries of chemical structures and is able to guide the selection and identification of new hits with biological activities. Such computational screening is usually achieved by using molecular docking software, focussing first on the identification of complementary orientations of molecules inside the active site of a target macromolecule, and then on scoring for the assessment of ligand binding strength (1–3). While these methods have been improved with respect to the accuracy and efficiency of the available algorithms, drawbacks and limitations still exist. For example, docking techniques still lack reliable simulation of the flexibility of both ligands and receptor. Another major drawback concerns the application of scoring functions that largely fail to estimate ligand binding energies in reasonable agreement with experiment. As a result, false-positive and false-negative hits still populate the results of docking screens performed with standard methods. Owing to these limitations, it is generally agreed that docking results need to be postprocessed with more accurate tools to refine docking orientations, filter out poor structures from the docking ensembles, and rank potential ligands to give better agreement with experimental results.

To this end, we developed binding estimation after refinement (BEAR), a new automated postdocking procedure for the conformational refinement of docking poses through molecular dynamics (MD) followed by accurate prediction of binding free energies using MM-PBSA and MM-GBSA (4–8). Binding estimation after refinement uses the AMBER software package (9,10) and combines, in the form of a script, different modules for MD and free energy evaluation. Binding estimation after refinement is fully automated and suitable as a complement to virtual screening tools, for the efficient refinement and rescoring of docking hit lists.
Binding estimation after refinement (BEAR)

The BEAR workflow is shown in Figure 1. The automated procedure requires preprocessing of the receptor and ligand structures obtained from docking screens. At this stage, hydrogen atoms are added to the receptor. Typically, all Lys and Arg residues are modelled as protonated while Asp and Glu residues are deprotonated, although different assignments are possible. In addition, atomic charges must be assigned to the ligands to evaluate electrostatic interactions and solvation energies in subsequent steps. Several methods are available for this purpose. In our procedure, we use ANTECHAMBER (11) for computing AM1-BCC (12) atomic charges for each docked ligand. Previous work has shown that AM1-BCC charges perform well in docking and MD applications (13). This task requires specification of the total net charge of the ligand in its appropriate protonation state.

Chem Biol Drug Des 2009; 73: 283–286. Research Article. © 2009 The Authors. Journal compilation © 2009 Blackwell Munksgaard. doi: 10.1111/j.1747-0285.2009.00780.x

Once the preprocessing step is accomplished, the database of docked ligands and the receptor structure are submitted to the automated BEAR procedure as shown in Figure 1. Binding estimation after refinement processes the database of ligands one-by-one in an iterative manner. For each iteration, a docked ligand and the receptor structure are combined to give a complex. This receptor–ligand complex is then submitted to a sequential three-step procedure that includes input file preparation, structure refinement and score calculation.

During input file preparation, ANTECHAMBER is used to assign Generalized Amber Force Field (GAFF) atom types (14) to the ligands. Because GAFF may lack some mandatory parameters for the calculation, the Parmcheck utility (9,10) is used to get missing parameters that are automatically stored with the input files.
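The charge-assignment and parameter-checking steps described above can be sketched as command lines assembled in Python. This is a minimal sketch under stated assumptions, not the authors' script: the mol2 file format, the file names and the helper functions `antechamber_cmd` and `parmchk_cmd` are hypothetical and chosen only for illustration, while the flags are standard ANTECHAMBER and parmchk options. The commands are only built here, not executed.

```python
from typing import List


def antechamber_cmd(ligand_in: str, ligand_out: str, net_charge: int) -> List[str]:
    """Build an antechamber call that assigns AM1-BCC charges and GAFF atom types.

    Assumes mol2 input and output; the net charge must match the ligand's
    protonation state, as the text requires.
    """
    return [
        "antechamber",
        "-i", ligand_in, "-fi", "mol2",    # input file and its format (assumed mol2)
        "-o", ligand_out, "-fo", "mol2",   # output file and its format
        "-c", "bcc",                       # AM1-BCC charge model
        "-nc", str(net_charge),            # total net charge of the ligand
        "-at", "gaff",                     # GAFF atom types
    ]


def parmchk_cmd(ligand_mol2: str, frcmod_out: str) -> List[str]:
    """Build a parmchk call that writes any missing GAFF parameters to a frcmod file."""
    return ["parmchk", "-i", ligand_mol2, "-f", "mol2", "-o", frcmod_out]


if __name__ == "__main__":
    # Hypothetical ligand names and net charges, one entry per docked ligand.
    ligands = {"lig001": 0, "lig002": -1}
    for name, charge in ligands.items():
        print(" ".join(antechamber_cmd(f"{name}.mol2", f"{name}_bcc.mol2", charge)))
        print(" ".join(parmchk_cmd(f"{name}_bcc.mol2", f"{name}.frcmod")))
```

Building the argument lists separately from running them keeps the per-ligand loop easy to inspect or log before any job is launched.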
The Leap module (9) is then used to assign ff03 (15) Amber atom types and charges to the receptor, and to create the topology and coordinate files that comprise the input files for the subsequent step.

The second step is carried out in three subtasks, namely: (i) structure minimization of the initial ligand–receptor complex; (ii) MD simulation and (iii) structure reminimization of the complex obtained after MD. All three structure refinement subtasks are performed with the Sander module of Amber (9,10). By default, energy minimization of ligand–receptor complexes is performed with a distance-dependent dielectric constant $\varepsilon = 4r$ without restraints, while in the MD simulation the ligand is allowed to move but the protein is fixed. This MD subtask is specifically devised to help overcome potentially high energy barriers between different conformations of the ligand in the target-binding site, and to help reduce false-positive hits provided that the orientation of the ligand assigned by docking is not too different (e.g. no head to tail 'flips') from the correct one. Finally, the structure of the complex is reminimized.

The third and last step of BEAR involves rescoring of the refined complexes. The minimized ligand–receptor complex is scored by estimating the binding free energy, which is calculated as the difference between the free energy of the complex and that of both receptor and ligand. Free energies of binding are calculated with the MM-PBSA and MM-GBSA algorithms as implemented in the AMBER package (9). Both MM-PBSA and MM-GBSA evaluate binding energies according to the equation

$\Delta G_{bind} = G_{comp} - G_{rec} - G_{lig}$

where each term is calculated as the sum of $E_{MM}$, the molecular mechanics contribution (expressed as the sum of internal, electrostatics and van der Waals contributions to binding in vacuo), and the polar ($G_{psolv}$) and non-polar ($G_{npsolv}$) contributions to solvation free energy.
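The decomposition above amounts to simple bookkeeping, which the following minimal sketch illustrates. The class `FreeEnergyTerms`, the function `binding_free_energy` and all numerical values are hypothetical and serve only to show the arithmetic; in practice the three terms for each species would come from the MM-PBSA or MM-GBSA output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreeEnergyTerms:
    """Free energy components of one species (complex, receptor or ligand), in kcal/mol."""
    e_mm: float      # molecular mechanics term: internal + electrostatic + van der Waals
    g_psolv: float   # polar contribution to the solvation free energy
    g_npsolv: float  # non-polar contribution to the solvation free energy

    def total(self) -> float:
        """G = E_MM + G_psolv + G_npsolv."""
        return self.e_mm + self.g_psolv + self.g_npsolv


def binding_free_energy(complex_: FreeEnergyTerms,
                        receptor: FreeEnergyTerms,
                        ligand: FreeEnergyTerms) -> float:
    """Delta G_bind = G_comp - G_rec - G_lig."""
    return complex_.total() - receptor.total() - ligand.total()


if __name__ == "__main__":
    # Made-up numbers, chosen only to make the arithmetic easy to follow.
    comp = FreeEnergyTerms(e_mm=-100.0, g_psolv=-50.0, g_npsolv=5.0)
    rec = FreeEnergyTerms(e_mm=-60.0, g_psolv=-40.0, g_npsolv=3.0)
    lig = FreeEnergyTerms(e_mm=-10.0, g_psolv=-20.0, g_npsolv=1.0)
    print(binding_free_energy(comp, rec, lig))
```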
$G_{psolv}$ is calculated by solving the Poisson-Boltzmann (PB) and Generalized-Born (GB) equations for the MM-PBSA and MM-GBSA methods, respectively, while $G_{npsolv}$ is calculated using the equation $G_{npsolv} = \gamma \cdot SASA + b$, where SASA is the solvent-accessible surface area calculated using the linear combinations of pairwise overlaps (LCPO) or Molsurf methods (16,17). Entropic contributions to binding via normal mode analyses are not evaluated as they usually have large error bars and require long simulation times which are impractical for virtual screenings (8,18).

Binding estimation after refinement was originally conceived with default settings that may be applied throughout or tailored by the user to reach the optimal compromise between the accuracy of results and the computational time needed for refinement and rescoring. An overview of the standard settings that we originally applied to the automated procedure is shown in Table 1.

Table 1: Default molecular dynamics and free energy evaluation parameters and settings

Minimization parameters
  Dielectric constant (ε): 4r
  Cut-off for non-bonded interactions: 12 Å
  Steps: 2000
Molecular dynamics parameters
  Simulation time: 100 ps
  Time step: 0.2 fs
  SHAKE: On
  Temperature: 300 K
MM-PBSA/GBSA parameters
  Grid spacing: 0.5 Å
  Internal dielectric constant: 1
  External dielectric constant: 80
  Solvent probe radius: 1.4 Å
  SASA calculation: LCPO or Molsurf
  γ constant: 0.0072 kcal/mol/Å²
  b constant: 0 kcal/mol

Figure 1: Workflow of the binding estimation after refinement (BEAR) computational procedure. Preprocessing steps (docked ligand database preparation and receptor preparation) feed the BEAR automated procedure: assembly of the complex, input preparation (Antechamber, Leap), structure refinement (Sander MM, Sander MD, Sander MM) and score calculation (MM-PBSA and MM-GBSA).

Virtual screening application

To benchmark the performance of BEAR in virtual screening, a validation study was devised by seeding the NCI Diversity set (see Note a) with 14 known active inhibitors of Plasmodium falciparum dihydrofolate reductase (PfDHFR) whose structures are related to pyrimethamine, cycloguanil and WR99210 (19). The compound library, containing 1720 molecules, was docked with Autodock 4 (20) into the crystal structure of PfDHFR (PDB ID 1J3I), using a grid of 60 × 60 × 60 points centered on the inhibitor WR99210 and a grid point spacing of 0.375 Å. For each molecule, 10 runs were carried out using the Lamarckian genetic algorithm with 150 individuals in the first population and 2.5 million energy evaluations. The lowest-energy orientation of the largest cluster found by Autodock was then used as input for the BEAR refinement and rescoring procedure as depicted in Figure 1, employing the default settings.

The enrichment curves obtained with Autodock and BEAR are shown in Figure 2A, in which the higher the percentage of known ligands found at a given percentage of the ranked database, the better the enrichment performance of the virtual screen. Compared with Autodock (blue line), we found that BEAR yielded strikingly better enrichment of known actives using either the MM-PBSA (red line) or MM-GBSA (green line) scoring function. The different rankings are shown in Figures 2B–D. In the original Autodock-generated docking results, the nanomolar inhibitors WR99210, pyrimethamine and cycloguanil are positioned at 20%, 37% and 43% of the ranked database, respectively (Figure 2B). In marked contrast, BEAR retrieves all the known inhibitors in the top positions of the ranked list according to MM-PBSA (Figure 2C) and MM-GBSA (Figure 2D), respectively. The application of BEAR clearly results in significant improvement of the ranking of known inhibitors.
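The enrichment curves discussed above follow from a simple counting procedure, sketched here for illustration. The function `enrichment_curve`, the compound names and the scores are hypothetical; the sketch assumes that a lower (more negative) score means a better rank, as is the case for docking energies and computed binding free energies.

```python
from typing import Dict, List, Set, Tuple


def enrichment_curve(scores: Dict[str, float],
                     actives: Set[str]) -> List[Tuple[float, float]]:
    """Return (% of ranked database, % of known actives found) for each rank.

    Compounds are ranked by ascending score, so the most negative score is rank 1.
    """
    ranked = sorted(scores, key=lambda name: scores[name])
    n_total = len(ranked)
    n_actives = sum(1 for name in ranked if name in actives)
    if n_actives == 0:
        raise ValueError("no known actives are present in the scored database")

    curve: List[Tuple[float, float]] = []
    found = 0
    for rank, name in enumerate(ranked, start=1):
        if name in actives:
            found += 1
        curve.append((100.0 * rank / n_total, 100.0 * found / n_actives))
    return curve


if __name__ == "__main__":
    # Toy data: four compounds, two of them labelled as known actives.
    toy_scores = {"a": -9.0, "b": -7.5, "c": -6.0, "d": -3.0}
    for pct_db, pct_found in enrichment_curve(toy_scores, {"a", "c"}):
        print(f"{pct_db:6.1f}% of database -> {pct_found:6.1f}% of actives")
```

Running the same function on the docking scores and on the rescored values gives curves that can be compared directly, since both are expressed as percentages of the same database.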
It is worth noting that while Autodock predicts the binding orientation of the active compounds in agreement with crystal structures, it fails to rank them amongst the best compounds. Because only a relatively small fraction of the best-ranked compounds is typically selected for experimental testing, the known active compounds would not have been selected for testing on the basis of the original Autodock scores.

Further analysis of the geometries of the compounds before and after the BEAR refinement shows that 90% of the molecules in the NCI Diversity set undergo a structural rearrangement with an average root mean square deviation (RMSD) of 1.6 ± 1.1 Å, highlighting the general importance of the geometry optimization subtasks (e.g. minimization and MD) before the final assessment of the binding free energy. The ability of BEAR to produce small as well as large structural changes is reflected in the relatively high standard deviation associated with the average RMSD value obtained.

Moreover, it is particularly significant that BEAR recognized interactions with key active site residues such as D54, I14 and I164, which are indeed considered critical for the inhibition of this enzyme (19,21). In fact, after BEAR refinement and rescoring, compounds forming hydrogen bonds with these residues (like the three known inhibitors discussed above but also other compounds present in the NCI Diversity database) had low RMSD values and favourable scores, while this was not observed with the Autodock scoring function alone. Visual inspection of the remaining 10% of the database, which is characterized by higher RMSD values, showed that most of these compounds have favourable Autodock scores but unlikely orientations, some of which are positioned outside of the binding site. In these cases, BEAR refinement resulted in high RMSD values and BEAR rescoring resulted in poor scores, i.e. the procedure allowed the detection of false-positive hits and moved these molecules down in the ranked list.

Another application of BEAR, made on diverse inhibitors of aldose reductase, gave striking agreement with their experimental activities (22). In that case, squared correlation coefficients of 0.80 and 0.73 between computed and experimental binding free energies were obtained using MM-PBSA and MM-GBSA, respectively, providing significant validation of the methodology. This result is particularly relevant in light of a recent benchmarking study of docking methods that classified aldose reductase as a target of 'intermediate' level of difficulty for docking because of a challenging combination of polar and hydrophobic complementarities between the enzyme and inhibitors (23).

Figure 2: (A) Enrichment curves showing the percentage of known PfDHFR inhibitors retrieved as a function of the percentage of the ranked database (NCI Diversity set; x-axis is in logarithmic scale). We compare Autodock alone (blue), postprocessing with BEAR/MM-PBSA (red) and postprocessing with BEAR/MM-GBSA (green). (B–D) Ranking of the known inhibitors seeded in the NCI Diversity set. The diagrams show the ranking obtained with Autodock alone (B), postprocessing with BEAR/MM-PBSA (C) and postprocessing with BEAR/MM-GBSA (D).

Closing Remarks

It is important to note that BEAR default settings can be easily changed or redefined depending on the particular application and the level of accuracy desired by the user.
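One way to picture such tailoring is a settings object whose defaults mirror Table 1 and whose fields can be overridden per screen. This is a hypothetical sketch, not the configuration format of the BEAR script: the class `BearSettings`, its field names and the overridden values (500 ps, 16 Å) are assumptions for illustration only. The method `nonpolar_solvation` applies the relation $G_{npsolv} = \gamma \cdot SASA + b$ with the tabulated constants.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BearSettings:
    """Default values copied from Table 1; field names are illustrative."""
    min_dielectric: str = "4r"                # distance-dependent dielectric
    nonbonded_cutoff_angstrom: float = 12.0
    min_steps: int = 2000
    md_time_ps: float = 100.0
    md_timestep_fs: float = 0.2               # value as listed in Table 1
    shake: bool = True
    temperature_k: float = 300.0
    pb_grid_spacing_angstrom: float = 0.5
    internal_dielectric: float = 1.0
    external_dielectric: float = 80.0
    probe_radius_angstrom: float = 1.4
    sasa_method: str = "LCPO"                 # Table 1 allows LCPO or Molsurf
    gamma: float = 0.0072                     # kcal/mol/A^2
    b: float = 0.0                            # kcal/mol

    def nonpolar_solvation(self, sasa_angstrom2: float) -> float:
        """G_npsolv = gamma * SASA + b, in kcal/mol."""
        return self.gamma * sasa_angstrom2 + self.b


DEFAULTS = BearSettings()

# Hypothetical customization: longer MD and a larger non-bonded cut-off.
CUSTOM = replace(DEFAULTS, md_time_ps=500.0, nonbonded_cutoff_angstrom=16.0)

if __name__ == "__main__":
    print(DEFAULTS)
    print(CUSTOM)
    print(DEFAULTS.nonpolar_solvation(1000.0))
```

Using an immutable object with `replace` keeps the published defaults intact while making every deviation from them explicit.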
For instance, in the default settings MD is restricted to the ligand alone. This choice clearly aims at reducing the computational cost so that the procedure can be applied to a large number of compounds. Flexible proteins and ligands that have high conformational degrees of freedom may represent a potential difficulty for BEAR with standard settings. In these cases, customizing the settings, e.g. selecting residues around the ligand that are allowed to move during MD or running longer MD simulations, is a possible solution. Additionally, the end-user can increase the cut-off value to account for better electrostatics treatment, extend MD simulations to enhance conformational sampling, add particularly important water molecules at the interface between ligand and receptor, and so forth.

Finally, while BEAR has proven to be a reliable procedure to refine and predict the free energy of binding, another positive feature is the speed of the calculation. The total average computational time required to process each ligand–receptor complex with default settings is approximately 8 min on a single core of a 2.4-GHz AMD Opteron CPU (Advanced Micro Devices, Sunnyvale, CA, USA). The possibility of running BEAR on large-scale computing facilities such as HPC clusters or grid platforms extends its applicability to large virtual screens, processing several thousands of compounds per day.

References

1. Rosenfeld R.J., Goodsell D.S., Musah R.A., Morris G.M., Goodin D.B., Olson A.J. (2003) Automated docking of ligands to an artificial active site: augmenting crystallographic analysis with computer modeling. J Comput Aided Mol Des;17:525–536.
2. Lorber D.M., Shoichet B.K. (1998) Flexible ligand docking using conformational ensembles. Protein Sci;7:938–950.
3. Rarey M., Kramer B., Lengauer T., Klebe G. (1996) A fast flexible docking method using an incremental construction algorithm. J Mol Biol;261:470–489.
4. Kollman P.A., Massova I., Reyes C., Kuhn B., Huo S., Chong L., Lee M., Lee T., Duan Y., Wang W., Donini O., Cieplak P., Srinivasan J., Case D.A., Cheatham T.E., III. (2000) Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models. Acc Chem Res;33:889–897.
5. Wang J., Morin P., Wang W., Kollman P.A. (2001) Use of MM-PBSA in reproducing the binding free energies to HIV-1 RT of TIBO derivatives and predicting the binding mode to HIV-1 RT of efavirenz by docking and MM-PBSA. J Am Chem Soc;123:5221–5230.
6. Kuhn B., Gerber P., Schulz-Gasch T., Stahl M. (2005) Validation and use of the MM-PBSA approach for drug discovery. J Med Chem;48:4040–4048.
7. Lyne P.D., Lamb M.L., Saeh J.C. (2006) Accurate prediction of the relative potencies of members of a series of kinase inhibitors using molecular docking and MM-GBSA scoring. J Med Chem;49:4805–4808.
8. Weis A., Katebzadeh K., Soderhjelm P., Nilsson I., Ryde U. (2006) Ligand affinities predicted with the MM/PBSA method: dependence on the simulation method and the force field. J Med Chem;49:6596–6606.
9. Case D.A., Darden T.A., Cheatham T.E., III, Simmerling C.L., Wang J., Duke R.E., Luo R. et al. (2006) AMBER 9. San Francisco, CA: University of California.
10. Case D.A., Cheatham T.E., III, Darden T., Gohlke H., Luo R., Merz K.M., Onufriev A., Simmerling C., Wang B., Woods R.J. (2005) The Amber biomolecular simulation programs. J Comput Chem;26:1668–1688.
11. Wang J., Wang W., Kollman P.A., Case D.A. (2006) Automatic atom type and bond type perception in molecular mechanical calculations. J Mol Graph Model;25:247–260.
12. Jakalian A., Jack D.B., Bayly C.I.J. (2002) Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J Comput Chem;23:1623–1641.
13. Wei B.Q., Baase W.A., Weaver L.H., Matthews B.W., Shoichet B.K. (2002) A model binding site for testing scoring functions in molecular docking. J Mol Biol;322:339–355.
14. Wang J., Wolf R.M., Caldwell J.W., Kollman P.A., Case D.A. (2004) Development and testing of a general AMBER force field. J Comput Chem;25:1157–1174.
15. Duan Y., Wu C., Chowdhury S., Lee M.C., Xiong G., Zhang W., Yang R., Cieplak P., Luo R., Lee T., Caldwell J., Wang J., Kollman P.A. (2003) A point-charge force field for molecular mechanics simulations of proteins based on condensed-phase quantum mechanical calculations. J Comput Chem;24:1999–2012.
16. Weiser J., Shenkin P.S., Still W.C. (1999) Approximate atomic surfaces from linear combinations of pairwise overlaps (LCPO). J Comput Chem;20:217–230.
17. Connolly M.L. (1983) Analytical molecular surface calculation. J Appl Cryst;16:548–558.
18. Brown S.P., Muchmore S.W. (2007) Rapid estimation of relative protein-ligand binding affinities using a high-throughput version of MM-PBSA. J Chem Inf Model;47:1493–1503.
19. Parenti M.D., Pacchioni S., Ferrari A.M., Rastelli G. (2004) Three-dimensional quantitative structure-activity relationships analysis of a set of Plasmodium falciparum dihydrofolate reductase inhibitors using a pharmacophore generation approach. J Med Chem;47:4258–4267.
20. Huey R., Morris G.M., Olson A.J., Goodsell D.S. (2007) A semiempirical free energy force field with charge-based desolvation. J Comput Chem;28:1145–1152.
21. Yuvaniyama J., Chitnumsub P., Kamchonwongpaisan S., Vanichtanankul J., Sirawaraporn W., Taylor P., Walkinshaw M.D., Yuthavong Y. (2003) Insights into antifolate resistance from malarial DHFR-TS structures. Nat Struct Biol;10:357–365.
22. Ferrari A.M., Degliesposti G., Sgobba M., Rastelli G. (2007) Validation of an automated procedure for the prediction of relative free energies of binding on a set of aldose reductase inhibitors. Bioorg Med Chem;15:7865–7877.
23. Huang N., Shoichet B.K., Irwin J.J. (2006) Benchmarking sets for molecular docking. J Med Chem;49:6789–6801.

Note

a. NCI Diversity set information.