Global Optimization of Clusters of Rigid Molecules

Phys. Chem. Chem. Phys., 2016, 18, 3003–3010. This journal is © the Owner Societies 2016.
Cite this: Phys. Chem. Chem. Phys., 2016, 18, 3003

Global optimization of clusters of rigid molecules using the artificial bee colony algorithm†

Jun Zhang* and Michael Dolg*

The global optimization of molecular clusters is an important topic encountered in many fields of chemistry. In our previous work (Phys. Chem. Chem. Phys., 2015, 17, 24173), we successfully applied the recently introduced artificial bee colony (ABC) algorithm to the global optimization of atomic clusters and introduced the corresponding software "ABCluster". In the present work, ABCluster was extended to the optimization of clusters of rigid molecules. Here "rigid" means that all internal degrees of freedom of the constituent molecules are frozen. The algorithm was benchmarked on TIP4P water clusters $(\mathrm{H_2O})_N$ ($N \le 20$), for which all global minima were successfully located. It was further applied to various clusters of different chemical nature: 10 microhydration clusters, 4 methanol microsolvation clusters, 4 nonpolar clusters and 2 ion–aromatic clusters. In all cases we obtained results consistent with previous experimental or theoretical studies.

1 Introduction

Molecular clusters are aggregates containing several molecules. They are gaining more and more attention among researchers due to their chemical importance. In experimental studies, clusters often exhibit an interesting size-dependence of their properties when going from a couple of molecules to the bulk substance [1]. Clusters can also reveal some local structural information of liquids [2]. Many metal clusters exhibit special catalytic and optical properties [3]. In theoretical studies, modelling solvation processes, e.g. calculating the hydration energy, requires one to introduce explicit water molecules into the inner hydration spheres as well as an implicit solvation model to guarantee sufficient accuracy [4], i.e. such studies involve the treatment of a solute–solvent cluster [5,6].

For theoretical studies of a molecular cluster, the first step is often to find the global minimum (GM) on its potential energy surface (PES), since this usually corresponds to its most stable structure at low temperature. However, for larger systems this is a difficult task. One can identify a local minimum (LM) on the PES by the zero gradient condition, but a robust condition for identifying a GM does not exist. Therefore, a deterministic search for the GM is usually impossible, and for such global optimization problems nondeterministic algorithms are more popular. These algorithms can find the true GM with a significant probability after a sufficient number of iterations.

For cluster optimization, there are two kinds of algorithms: biased and unbiased ones. The former class is designed for specific clusters, since it uses the known information on the GMs of small clusters as much as possible to search for those of the larger ones. The latter class can be applied to general clusters and makes no assumptions about what their GMs should look like. The biased algorithms are often more efficient for specific clusters than the unbiased ones: e.g. basin-hopping [7,8], an unbiased algorithm, can find reliable GMs for $(\mathrm{H_2O})_N$ when $N \le 21$ [9], but a biased algorithm designed specifically for water clusters can work well up to $N = 30$ [10]. Obviously, the biased algorithms are neither robust nor transferable. The unbiased algorithms can be classified as individual-based ones (starting the global optimization from a single cluster, e.g. simulated annealing [11], Monte Carlo minimization [12] and basin hopping [7,8]) and population-based ones (starting the global optimization from a set of clusters, e.g. differential evolution [13] and particle swarm optimization [14]). For a comprehensive discussion of these methods please refer to related reviews.
[15–17]

In our previous work [18], we introduced a recently proposed population-based algorithm, i.e. the "artificial bee colony" (ABC) algorithm, to the field of global optimization of clusters. The ABC algorithm requires only three parameters, so it is easy for non-experts in global optimization to learn and apply. It has been wrapped as a black box in the software ABCluster (ABC for clusters). We have demonstrated that ABCluster is very efficient in searching for the GMs of ionic and metal atomic clusters. In this work, we extend ABCluster to clusters formed by rigid molecules and show its excellent performance for these cases. For similar results as well as references for other optimization schemes and applications for atomic clusters, the reader is referred to ref. 18. ABCluster is now available on our group site.

Theoretical Chemistry, University of Cologne, Greinstr. 4, 50939 Cologne, Germany. Fax: +49 (0)221 470 6896; Tel: +49 (0)221 470 6893
† Electronic supplementary information (ESI) available: The force field parameters of the molecules considered in this work. See DOI: 10.1039/c5cp06313b
Received 17th October 2015, Accepted 9th December 2015. Published on 16 December 2015. DOI: 10.1039/c5cp06313b

2 Theory

2.1 The potential energy function

Searching for the GM of a molecular cluster is mathematically an unconstrained global optimization problem. The variables to be optimized are the atomic coordinates $X \equiv \{x_1, y_1, z_1, \ldots, x_M, y_M, z_M\}$, where $M$ is the number of atoms in the cluster, and the objective function is the potential energy function $U(X)$, which can be an empirical or first-principles one. In this work, we only consider clusters of rigid molecules, where "rigid" means that all internal degrees of freedom (DOF) of a molecule (bond lengths, bond angles and dihedral angles) are kept unchanged during the optimization. This approximation is not suitable for all cases, e.g. it is not suitable for molecules with soft DOF like those with long, rotatable side chains. However, it works quite well for most small and medium-sized molecules and can significantly reduce the number of DOF of the cluster (i.e. the dimension of the global optimization problem).

Within this approximation, each molecule can be described by a six-component external DOF $q$. In ABCluster we use the following $q$: the coordinates of its geometrical center $R \equiv \{X, Y, Z\}$ and three Euler angles $\Omega \equiv \{\alpha, \beta, \gamma\}$ relative to its pre-defined body-fixed coordinate system (see Fig. 1). Other choices of coordinates like the angle-axis representation [19] are possible, but their performance in the global optimization shows no significant difference [20]. Thus, for a cluster containing $N$ rigid molecules, the total number of DOF is $6N$, i.e.

$$Q \equiv \{q_1, \ldots, q_N\} \equiv \{R_1, \Omega_1, \ldots, R_N, \Omega_N\} \equiv \{X_1, Y_1, Z_1, \alpha_1, \beta_1, \gamma_1, \ldots, X_N, Y_N, Z_N, \alpha_N, \beta_N, \gamma_N\} \tag{1}$$

As the molecules are rigid, $U(X)$ is a function of $Q$. The elimination of the translational and rotational DOF of the whole cluster reduces $Q$ to $6N - 6$ coordinates. In principle, an "exact" solution to this global optimization problem requires an ergodic sampling over the $Q$ space. This is impossible for large clusters.
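As an illustration of the external DOF $q = \{X, Y, Z, \alpha, \beta, \gamma\}$, the following minimal sketch places a rigid molecule by rotating its body-fixed coordinates and then translating them. The z-y-z Euler convention and the function names `rotation_matrix_zyz`, `place_molecule` and `dof_count` are assumptions made for this example; the convention actually used in ABCluster is defined in the paper's Appendix, which is not reproduced here.

```python
import math

def rotation_matrix_zyz(alpha, beta, gamma):
    """Rotation matrix Rz(alpha) * Ry(beta) * Rz(gamma).

    The z-y-z convention is an assumption for illustration only.
    """
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
        [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
        [-sb * cg, sb * sg, cb],
    ]

def place_molecule(body_coords, q):
    """Rotate body-fixed atomic coordinates, then translate them.

    body_coords: list of (x, y, z) relative to the geometrical center.
    q: (X, Y, Z, alpha, beta, gamma), the six external DOF of one molecule.
    """
    x0, y0, z0, alpha, beta, gamma = q
    rot = rotation_matrix_zyz(alpha, beta, gamma)
    placed = []
    for x, y, z in body_coords:
        placed.append((
            rot[0][0] * x + rot[0][1] * y + rot[0][2] * z + x0,
            rot[1][0] * x + rot[1][1] * y + rot[1][2] * z + y0,
            rot[2][0] * x + rot[2][1] * y + rot[2][2] * z + z0,
        ))
    return placed

def dof_count(n_molecules):
    """Return (6N, 6N - 6): all external DOF and those left after removing
    the overall translation and rotation of the cluster."""
    return 6 * n_molecules, 6 * n_molecules - 6
```

Because the transformation is a rotation followed by a translation, all intramolecular distances are preserved, which is exactly what "rigid" means in this context.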
Fig. 1 The external DOF $q$ of a rigid molecule used in ABCluster. A molecule is first rotated by $\alpha$, $\beta$ and $\gamma$ as shown in the figure and then translated by $\{X, Y, Z\}$ to its final position in a cluster. For explicit expressions please refer to the Appendix.

The sampling difficulty has been discussed for atomic clusters in our previous work [18], where we pointed out that the number of LMs of a cluster of size $N$ increases exponentially with $N$ [15,21], leading to a rugged PES. For molecular clusters the difficulty manifests itself in an additional aspect. A molecule can have several directional interaction sites, e.g. H₂O has four (two H atoms and two lone electron pairs on O), CH₃OH has three (one H atom and two lone electron pairs on O), and C₆H₆ has two (the π electron system on each side of the molecular plane). For large $N$, the number of possible interaction network topologies in a cluster can be extremely large and depends strongly on the nature of its components. Thus, a converged global optimization may require a very long computation time. For biased algorithms, one could use graph theory approaches to accelerate the optimization for specific clusters (e.g. ref. 10). Since the algorithm in ABCluster is unbiased and designed for general rigid molecular clusters, we do not apply these approaches. To alleviate the ruggedness of the PES, we use a smoothed PES function $\tilde{U}$ rather than the original one [18]:

$$\tilde{U}(Q) = \min\{U(Q)\} \tag{2}$$

where "min" stands for performing a local minimization of $U$ starting from $Q$.
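The effect of the transformation (2) can be seen on a one-dimensional toy problem. The sketch below is only an illustration: the tilted double-well `toy_energy` is a hypothetical stand-in for $U$, and a plain fixed-step gradient descent replaces the L-BFGS local optimizer used in ABCluster. Every starting point inside one basin is mapped to the energy of that basin's minimum, so the smoothed function is piecewise constant.

```python
def toy_energy(x):
    """Hypothetical 1-D stand-in for U: a tilted double well."""
    return x ** 4 - 2.0 * x ** 2 + 0.3 * x

def toy_gradient(x):
    """Analytic derivative of toy_energy."""
    return 4.0 * x ** 3 - 4.0 * x + 0.3

def local_minimize(x, step=0.01, iterations=5000):
    """Fixed-step gradient descent, a simple substitute for L-BFGS."""
    for _ in range(iterations):
        x -= step * toy_gradient(x)
    return x

def smoothed_energy(x):
    """Eq. (2) for the toy problem: the energy of the local minimum reached from x."""
    return toy_energy(local_minimize(x))

# Two starting points in the right-hand basin give the same smoothed energy,
# although their raw energies differ.
right_a = smoothed_energy(0.5)
right_b = smoothed_energy(1.5)
# A starting point in the left-hand basin gives a different, lower plateau.
left = smoothed_energy(-0.5)
```

The barrier inside each basin disappears, but the step between the two plateaus remains, which mirrors the statement that (2) does not remove the barriers between funnels.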
The advantage of $\tilde{U}$ over $U$ is that the former removes the energetic barriers along the downhill movement towards a funnel, leading to a more efficient optimization [18,22]. It has been used in the pioneering work on the global optimization of proteins [23]. However, (2) cannot remove the barriers between the funnels. These barriers make the sampling of clusters with quite different interaction network topologies time-consuming, which is the bottleneck of the global optimization problem.

The potential energy function $U$ is essential in the description of a molecular cluster. In this work we only consider the two-body empirical potential function of the following form:

$$U(Q) = \sum_{I=1}^{N} \sum_{I<J}^{N} \sum_{i_I \in I} \sum_{j_J \in J} \left[ \frac{e^2}{4\pi\epsilon_0} \frac{q_{i_I} q_{j_J}}{r_{i_I j_J}} + 4\epsilon_{i_I j_J} \left( \left( \frac{\sigma_{i_I j_J}}{r_{i_I j_J}} \right)^{12} - \left( \frac{\sigma_{i_I j_J}}{r_{i_I j_J}} \right)^{6} \right) \right] \tag{3}$$

Here $I$ and $J$ are the indices of the molecules, and $i_I$ and $j_J$ are the indices of the atoms in molecules $I$ and $J$, respectively. $r_{i_I j_J}$ is the distance between atoms $i_I$ and $j_J$. Obviously, (3) only considers the intermolecular Coulomb and Lennard-Jones interactions. Although this form is simple, it is used in many modern force fields like CHARMM [24], OPLS [25] and AMBER [26] and has been tested in numerous studies, confirming its reliability. Using a more sophisticated $U$ is more expensive. Also, since the GM is very sensitive to the form and parameters of $U$ (e.g. the GM of (H₂O)₆ with the TIP4P and TIP5P force fields is the cage and ring isomer, respectively [9,20]), using different forms of potentials would cause confusion. Therefore, we decided to use only the simplest form (3) in this work. In practice one can first obtain a set of LMs with (3) and then study them further with, e.g., quantum chemical methods.

For the technical details of computing (2) and (3) by using the external DOF $q$ please refer to the Appendix.
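A minimal sketch of how the pairwise sum in (3) could be evaluated is given below. It works directly on Cartesian atomic positions rather than on the external DOF, and it assumes Lorentz–Berthelot combining rules for $\epsilon_{ij}$ and $\sigma_{ij}$; neither the combining rule nor the names `pair_energy` and `cluster_energy` come from the paper. The Coulomb prefactor is the value quoted in the footnote of Table 1.

```python
import math

# e^2 / (4 pi eps0) in Å kJ/mol, as quoted in footnote a of Table 1.
COULOMB_CONSTANT = 1389.506

def pair_energy(atom_a, atom_b):
    """Coulomb plus Lennard-Jones energy of one atom pair in kJ/mol.

    Each atom is (x, y, z, charge, epsilon, sigma) with lengths in Å.
    Lorentz-Berthelot mixing is an assumption made for this sketch.
    """
    r = math.dist(atom_a[:3], atom_b[:3])
    epsilon = math.sqrt(atom_a[4] * atom_b[4])
    sigma = 0.5 * (atom_a[5] + atom_b[5])
    sr6 = (sigma / r) ** 6
    coulomb = COULOMB_CONSTANT * atom_a[3] * atom_b[3] / r
    lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
    return coulomb + lennard_jones

def cluster_energy(molecules):
    """Sum pair energies over all atoms in distinct molecules I < J.

    molecules: list of molecules, each a list of atoms.
    Atom pairs inside the same rigid molecule are skipped, as in eq. (3).
    """
    total = 0.0
    for i in range(len(molecules)):
        for j in range(i + 1, len(molecules)):
            for atom_a in molecules[i]:
                for atom_b in molecules[j]:
                    total += pair_energy(atom_a, atom_b)
    return total
```

For two neutral Lennard-Jones sites separated by $2^{1/6}\sigma$ the function returns $-\epsilon$, the depth of the Lennard-Jones well, which is a convenient sanity check for the signs in (3).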
2.2 The artificial bee colony algorithm

The artificial bee colony (ABC) algorithm was proposed in 2005 by Karaboga [27]. It is a swarm-intelligence-based algorithm that models the foraging behavior of honey bee colonies. The bees want to find the best nectar as a food source and have developed an efficient methodology to accomplish this mission. In terms of the global optimization problem, a rigid molecular cluster with external DOF $Q$ is a nectar source, and its energy $U(Q)$ is the quality of the nectar. A lower energy implies a higher quality, i.e. a good solution. The ABC algorithm simulates the bees' methodology by introducing three kinds of bees: employed, onlooker and scout bees. In each search cycle, first the employed bees perform a coarse exploration of the $Q$ space, obtaining some trial solutions; then the onlooker bees search in the neighborhood of some "good" solutions; finally the scout bees examine the obtained solutions, discard the ones that contributed little to the improvement of the solutions during the past several cycles, and replace them by new random ones. The search cycles are repeated until some stopping criteria are satisfied, and the best solution obtained so far is assumed to correspond to the GM. The mechanism and performance of the ABC algorithm have been discussed in several papers [28–30]. In particular, its specific implementation in ABCluster has been discussed in detail in our previous work [18]. For the global optimization of rigid molecular clusters, the ABC algorithm is very similar to the one for atomic clusters [18]. Therefore we only briefly describe it here.

In the ABC algorithm, three parameters are needed: the size of the population of trial solutions SN, the scout limit $g_{\mathrm{limit}}$ and the maximum cycle number $g_{\mathrm{max}}$. The cluster is described by its external DOF $Q$, size $N$, an estimated length $L$, and the potential parameters. The global optimization then proceeds as follows:

(1) Initialize the population: $Q_1^1, \ldots, Q_{\mathrm{SN}}^1$. One can use random initial guesses, i.e. each component of $R$ and $\Omega$ is randomly taken from the range $[0, L]$ and $[0, 2\pi]$, respectively. Next, all the clusters are locally optimized using the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) algorithm [31]. The LM (or GM) character of the obtained structures is tacitly assumed here.

(2) Modelling employed bees: in cycle $g$, for each $Q_i^g$ ($i = 1, \ldots, \mathrm{SN}$), a new trial solution $V_i$ is generated by the trigonometric mutation operator [32]:

$$V_i = \frac{1}{3}\left(Q_{k_1}^g + Q_{k_2}^g + Q_{k_3}^g\right) + (p_2 - p_1)\left(Q_{k_1}^g - Q_{k_2}^g\right) + (p_3 - p_2)\left(Q_{k_2}^g - Q_{k_3}^g\right) + (p_1 - p_3)\left(Q_{k_3}^g - Q_{k_1}^g\right) \tag{4}$$

where $k_1$, $k_2$ and $k_3$ are random integers in $\{1, \ldots, \mathrm{SN}\}$ with $k_1 \neq k_2 \neq k_3 \neq i$, and

$$p_{k_m} = \frac{\tilde{U}(Q_{k_m})}{\tilde{U}(Q_{k_1}) + \tilde{U}(Q_{k_2}) + \tilde{U}(Q_{k_3})} \quad (m = 1, 2, 3) \tag{5}$$

$\tilde{U}$ is the smoothed potential energy function (2). The solution is updated with a greedy selection scheme:

$$Q_i^{g+1} = \begin{cases} V_i & \text{if } \tilde{U}(V_i) < \tilde{U}(Q_i^g) \\ Q_i^g & \text{otherwise} \end{cases} \tag{6}$$

(3) Modelling onlooker bees: SN times, a "good" solution $Q_k^g$ is selected by the tournament scheme [16] and a new trial solution $V_k$ is generated by the "ABC/current/2 + ABC/best/2" strategy:

$$V_k = \begin{cases} Q_k^g + F\left(Q_{k_1}^g + Q_{k_2}^g - Q_{k_3}^g - Q_{k_4}^g\right) & \text{if } \eta < 0.5 \\ Q_{\mathrm{best}}^g + F\left(Q_{k_1}^g + Q_{k_2}^g - Q_{k_3}^g - Q_{k_4}^g\right) & \text{otherwise} \end{cases} \tag{7}$$

where $k_1$, $k_2$, $k_3$ and $k_4$ are random integers in $\{1, \ldots, \mathrm{SN}\}$ with $k_1 \neq k_2 \neq k_3 \neq k_4 \neq k$. $F$ and $\eta$ are random numbers in $[0, 1]$. $Q_k^g$ is again updated with the greedy selection scheme (6).

(4) Modelling scout bees: now each $Q_i^g$ ($i = 1, \ldots, \mathrm{SN}$) is examined. A $Q_i^g$ which has not changed in the last $g_{\mathrm{limit}}$ cycles is replaced by a random trial solution $Q_i^{g+1}$, regardless of whether it is better than $Q_i^g$ or not.

(5) If $g \ge g_{\mathrm{max}}$, the algorithm is finished; otherwise go to step (2).
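Steps (1) to (5) can be sketched compactly as follows. This is a simplified illustration, not the ABCluster implementation: the objective `toy_energy` is a hypothetical quadratic bowl used in place of the smoothed cluster energy, no local optimization of the trial solutions is performed, the tournament is assumed to be binary, and the function name `abc_minimize` and its signature are inventions for this example.

```python
import random

def toy_energy(q):
    """Hypothetical stand-in for the smoothed energy: a shifted quadratic bowl."""
    return sum((x - 1.0) ** 2 for x in q) - 5.0

def abc_minimize(energy, dim, length, sn=20, g_limit=4, g_max=200, seed=0):
    """Sketch of the ABC cycle; sn must be at least 5."""
    rng = random.Random(seed)

    def random_solution():
        return [rng.uniform(0.0, length) for _ in range(dim)]

    # Step (1): random initial population.
    pop = [random_solution() for _ in range(sn)]
    fit = [energy(q) for q in pop]
    last_change = [0] * sn
    i_best = min(range(sn), key=lambda j: fit[j])
    best_q, best_e = pop[i_best], fit[i_best]

    def greedy(i, trial, g):
        # Eq. (6): accept the trial only if it lowers the energy.
        e = energy(trial)
        if e < fit[i]:
            pop[i], fit[i], last_change[i] = trial, e, g

    for g in range(1, g_max + 1):
        # Step (2): employed bees, trigonometric mutation, eqs. (4) and (5).
        for i in range(sn):
            k1, k2, k3 = rng.sample([k for k in range(sn) if k != i], 3)
            total = fit[k1] + fit[k2] + fit[k3]
            if total != 0.0:
                p1, p2, p3 = fit[k1] / total, fit[k2] / total, fit[k3] / total
            else:
                p1 = p2 = p3 = 1.0 / 3.0
            trial = [(a + b + c) / 3.0 + (p2 - p1) * (a - b)
                     + (p3 - p2) * (b - c) + (p1 - p3) * (c - a)
                     for a, b, c in zip(pop[k1], pop[k2], pop[k3])]
            greedy(i, trial, g)
        # Step (3): onlooker bees, binary tournament (assumed) and eq. (7).
        for _ in range(sn):
            a, b = rng.sample(range(sn), 2)
            k = a if fit[a] < fit[b] else b
            k1, k2, k3, k4 = rng.sample([j for j in range(sn) if j != k], 4)
            f, eta = rng.random(), rng.random()
            current_best = min(range(sn), key=lambda j: fit[j])
            base = pop[k] if eta < 0.5 else pop[current_best]
            trial = [x + f * (q1 + q2 - q3 - q4) for x, q1, q2, q3, q4
                     in zip(base, pop[k1], pop[k2], pop[k3], pop[k4])]
            greedy(k, trial, g)
        # Remember the best solution before scouts may overwrite it.
        i_best = min(range(sn), key=lambda j: fit[j])
        if fit[i_best] < best_e:
            best_q, best_e = pop[i_best], fit[i_best]
        # Step (4): scout bees replace solutions unchanged for g_limit cycles.
        for i in range(sn):
            if g - last_change[i] >= g_limit:
                pop[i] = random_solution()
                fit[i] = energy(pop[i])
                last_change[i] = g
    # Step (5): the loop ends when g reaches g_max.
    return best_q, best_e

best_q, best_e = abc_minimize(toy_energy, dim=2, length=4.0)
```

Because scout bees replace stale solutions unconditionally, the sketch stores the best solution found so far separately from the population; without that bookkeeping a scout could discard the current best.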
3 Applications

In this section, we examine the performance of the ABC algorithm in the global optimization of rigid molecular clusters. All the potential parameters in (3) were taken from the CHARMM36 force field [33] and the details can be found in the ESI.† All optimizations were performed by ABCluster [18]. The graphs of the clusters were rendered by CYLView [34].

3.1 Water clusters as benchmark

Water clusters are of fundamental importance in chemistry [35] and thus their GMs have attracted much attention from the scientific community. Water molecules can form complex hydrogen bond networks. For $(\mathrm{H_2O})_N$ in an ice-Ih structure, Pauling pointed out that the number of possible networks scales as $(3/2)^N$ [36]. This is a big challenge for a global optimization. The global minima of small water clusters are well documented in the literature (see ref. 10 and 37 and references therein). Therefore we take this system as a first benchmark of our algorithm. Here the water molecule is described by the TIP4P model [38]. The algorithm parameters and optimization results are given in Table 1. Some GMs are shown in Fig. 2.

Table 1 confirms that ABCluster successfully located the GMs for all $(\mathrm{H_2O})_N$ ($N$ = 5–20) clusters. The number of steps required for convergence increases rapidly for $N \ge 10$, reflecting the exponential scaling of the number of their LMs. It is observed that this quantity increases as $\exp(0.60N)$.
Interestingly, Takeuchi found a similar dependency of the number of local optimizations performed during a search for the GM of $(\mathrm{H_2O})_N$ clusters using another method, i.e. $\exp(0.63N)$ [10]. Therefore, a reliable search for the GM of (H₂O)₂₁ would already require more than $10^5$ steps. The optimization becomes more difficult for unbiased methods like basin hopping [20] and even for biased algorithms [10]. Nevertheless, this benchmark confirms the reliability of our algorithm. In principle, any GM can be found with a sufficiently large $g_{\mathrm{max}}$. In the remainder we apply the ABC algorithm to more complex systems.

3.2 Microhydration clusters

Next we examine the performance of ABCluster for some microhydration clusters, i.e. X$(\mathrm{H_2O})_N$. We chose $N = 20$ in most cases in order to model the behavior of the solute X in an amount of water sufficient to form at least a complete first hydration sphere. Each of the systems discussed in the following could be part of an independent project, but here they are merely used as examples to demonstrate the accuracy and robustness of our methodology. The optimization results are given in Table 2 and Fig. 3.

First we consider the three alkali cations Na⁺, K⁺ and Cs⁺. The subtle difference in the hydration properties of Na⁺ and K⁺ makes them play important but completely different roles in biological processes. An essential factor is their charge density. The water molecules interact more strongly with, and thus tend to be closer to, an X with a higher charge density such as Na⁺. This leads to an increased repulsion between the directly coordinating water molecules and to a preference for a smaller water coordination number (CN). Our optimization results (see Fig. 3) confirm this: in the GM of Na⁺(H₂O)₂₀, Na⁺ takes an off-center position with CN = 6, while in the GMs of K⁺(H₂O)₂₀ and Cs⁺(H₂O)₂₀ the cations are found in the center and exhibit larger CNs, forming a clathrate-like structure.
These observations are in agreement with previous studies on Li⁺ to Cs⁺ and Ca²⁺ [39–41].

Table 1 Benchmark for the water clusters (energy unit: kJ mol⁻¹) (a)

ABC algorithm parameters: SN = 60, g_limit = 4, g_max = 30000
Initial guess: random

  N    Step (b)   Energy         N    Step (b)   Energy
  5       1       -152.1371     13      304      -533.0679
  6       1       -197.8168     14      437      -583.0969
  7       1       -243.6168     15     1485      -628.4856
  8       1       -305.5747     16      783      -681.3157
  9       6       -344.4982     17     1940      -723.9389
 10      16       -391.0943     18     2221      -773.3718
 11     169       -431.5672     19     4285      -821.1843
 12      28       -492.9979     20    28054      -873.1465

(a) The reference GM energies (N = 5–20) are from ref. 9. Note that our energies are slightly larger in magnitude than theirs (e.g. for (H₂O)₁₀, -391.0943 vs. -391.0227). This is probably due to a slightly different precision of $e^2/(4\pi\epsilon_0)$ in (3); thus they are in fact identical (for this we use $e^2/(4\pi\epsilon_0)$ = 1389.506 Å kJ mol⁻¹).
(b) "Step" is the step at which the final energy is obtained.

Fig. 2 Some GMs of water clusters obtained by ABCluster.

Table 2 The global optimization of some microhydration clusters (energy unit: kJ mol⁻¹)

ABC algorithm parameters: SN = 60, g_limit = 4, g_max = 35000
Initial guess: random

 System               Step (a)   Energy
 Na⁺(H₂O)₂₀             2627     -1167.6197
 K⁺(H₂O)₂₀              1349     -1099.8384
 Cs⁺(H₂O)₂₀              103     -1050.8635
 Gmd⁺(H₂O)₂₀            2201     -1016.3934
 Cl⁻(H₂O)₂₀             3011     -1143.7146
 SO₄²⁻(H₂O)₂₀           2798     -1659.7028
 K⁺Cl⁻(H₂O)₂₀          14617     -1585.2550
 Mg²⁺SO₄²⁻(H₂O)₂₀         45     -3937.4321
 CH₄(H₂O)₂₀            30921      -879.6630
 Coronene–(H₂O)₁₀         18      -430.6415

(a) "Step" is the step at which the final energy is obtained.
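The exponential growth of the step count quoted in Section 3.1 can be checked roughly against Table 1 with a log-linear least-squares fit. The sketch below is only an illustration: restricting the fit to N = 10–20 is an assumption made here (the text states that the growth sets in for N ≥ 10), and the paper does not say how its exponent of 0.60 was obtained.

```python
import math

# Step counts for (H2O)_N, N = 10..20, copied from Table 1.
steps = {10: 16, 11: 169, 12: 28, 13: 304, 14: 437, 15: 1485,
         16: 783, 17: 1940, 18: 2221, 19: 4285, 20: 28054}

def fit_exponent(data):
    """Least-squares fit of ln(step) = slope * N + intercept."""
    xs = list(data.keys())
    ys = [math.log(v) for v in data.values()]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

slope, intercept = fit_exponent(steps)
```

With these eleven points the fitted slope comes out close to 0.6, in line with the exp(0.60N) behavior reported in the text, although the individual step counts scatter strongly because each run is stochastic.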