A Semi-Definite Programming-Based Underestimation Method for Global Optimization in Molecular Docking

Ioannis Ch. Paschalidis, Member, IEEE; Yang Shen; Sandor Vajda; Pirooz Vakili, Member, IEEE

(Paschalidis, Shen, and Vakili are with the Center for Information & Systems Engineering and the Dept. of Manufacturing Engineering, Boston University, 15 St. Mary's St., Brookline, MA 02446; e-mail: yannisp@bu.edu, yangshen@bu.edu, vakili@bu.edu; url: http://ionia.bu.edu/. Vajda is with the Department of Biomedical Engineering, Boston University; e-mail: vajda@bu.edu. Research partially supported by the NSF under CAREER award ANI-9983221 and grants DMI-0330171, ECS-0426453, CNS-0435312, DMI-0300359, and EEC-0088073, and by the ARO under the ODDR&E MURI2001 Program Grant DAAD19-01-1-0465 to the Center for Networked Communicating Control Systems.)

Abstract—The paper introduces a new global optimization method targeted at molecular docking problems, an important class of problems in computational biology. The search method is based on finding general convex quadratic underestimators to the binding energy function, which is funnel-like. Finding the optimal underestimator requires solving a semi-definite programming problem, hence the name Semi-Definite programming-based Underestimation (SDU). The optimal underestimator is used to bias sampling in the search region. A detailed comparison of SDU with a related method, the Convex Global Underestimator (CGU), a discussion of the convergence properties of SDU, and computational results of the application of SDU to a number of rigid protein-protein docking problems are provided.

Index Terms—Computational biology, global optimization, semi-definite programming, molecular docking.

I. INTRODUCTION

The solution of a number of important problems in computational biology rests on finding global minima of energy functions that are funnel-like. These are functions with multiple non-convex funnels and a huge number of local minima of lesser depth spread over the domain of the function. For example, protein folding is the problem of predicting the 3-dimensional native structure (or "conformation") of a protein from its 1-dimensional amino acid sequence. It is known that proteins, when they fold, can follow multiple paths on the energy landscape [1], which is funnel-like in shape. Similar energy funnels are also found in other problems such as protein-protein docking [2].

Global optimization methods such as simulated annealing and genetic algorithms have been applied in some of these areas, but they are very slow and easily trapped in kinetic moves. A number of recent approaches have attempted, with some success, to use the funnel-like shape to guide the global search to the vicinity of the global minimum. For example, the Semi-Global Simplex (SGS) algorithm uses simplex moves on surfaces spanned by the local minima rather than on the free energy itself [3]. The SmoothDock approach [4] uses the strategy of descending on the "smooth" components of the energy function, to which one slowly adds higher frequency components. Of most relevance to this paper is the Convex Global Underestimator (CGU) method, where convex quadratic underestimators are used to approximate the envelope spanned by the local minima of the energy function [5].
The vicinity of the minimizer of the underestimator is viewed as the potential location of the global minimum of the energy function. The problem of finding the optimal underestimator is formulated and solved as a Linear Programming (LP) problem.

It has been shown that CGU does not perform well in some cases and that its performance deteriorates as the dimension of the search space increases [3]. We contend that a critical reason for this poor performance is the restricted class of underestimators used in CGU. This restriction amounts to a lack of flexibility in capturing the overall shape of the energy funnels and hence an inability to locate promising regions in which to search for the global minimum.

We use the same strategy of using convex quadratic functions to underestimate the envelope spanned by the local minima of the energy function. However, we consider the class of general convex quadratic functions for underestimation. In this case, given a finite set of local minima, finding the optimal underestimator amounts to solving a Semi-Definite Programming (SDP) problem, hence the term Semi-Definite programming-based Underestimation (SDU). We show, theoretically as well as experimentally, that SDU outperforms CGU, often significantly. Using some preliminary experimental results, we show that SDU is a promising method for solving molecular docking problems.

The rest of the paper is organized as follows. Sec. II presents some background material on molecular docking. Sec. III presents our SDU method. Comparisons with CGU are in Sec. IV. SDU's convergence properties are discussed in Sec. V. Some results on docking proteins are discussed in Sec. VI. We end with conclusions in Sec. VII.

II. PRELIMINARIES

Next we review key properties of the free energy functions that are to be minimized in molecular docking problems. We start with their biophysical properties and then abstract characteristic mathematical properties that are important in the development of appropriate optimization strategies.

A. Biophysical Origin

Free energy evaluation models: At fixed temperature and pressure, a complex of two molecules adopts the conformation that corresponds to the lowest Gibbs free energy of the system that includes the component molecules and the solvent (usually water) surrounding them. Thus, in docking calculations the natural target function to minimize is an approximation of the Gibbs free energy, $G_{RL}$, of the receptor-ligand complex, or that of the binding free energy, $\Delta G$ [6]. In particular, $\Delta G = G_{RL} - G_R - G_L$, where $G_R$ and $G_L$ are the free energies of the (free) receptor and ligand, respectively, and both $G_R$ and $G_L$ are independent of the conformation of the complex; hence, minimizing $G_{RL}$ is equivalent to minimizing $\Delta G$.

We use free energy evaluation models that combine molecular mechanics with continuum electrostatics and empirical solvation terms. In the most general case the binding free energy is decomposed according to

$$\Delta G = \Delta E_{elec} + \Delta E_{vdw} + \Delta E_{int} + \Delta G^{*}_{des} - T\Delta S_{sc} + \Delta G_o, \qquad (1)$$

where $\Delta E_{elec}$ is the change in electrostatic energy upon binding, $\Delta G^{*}_{des}$ is the desolvation free energy, $\Delta E_{vdw}$ is the change in van der Waals energy, and $\Delta E_{int}$ is the change in internal energy due to any flexing/straining of the backbone and side chains. The entropy term, $-T\Delta S_{sc}$, accounts for the decrease in entropy experienced by the interface side chains upon binding. The term $\Delta G_o$ accounts for all other changes in the binding free energy that occur upon binding; it is considered to depend weakly on the conformation and will be treated as a constant (15 kcal/mol in this work). The internal (bonded) energy, $\Delta E_{int}$, is the sum of bond stretching, angle bending, torsional, and improper terms.
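For readers who prefer code to notation, the short sketch below simply assembles the sum in (1). It is purely schematic: it assumes the individual terms (in kcal/mol) have already been produced by some external energy model, and the field names are ours, not part of any particular docking package; only the constant $\Delta G_o = 15$ kcal/mol follows the text above.

```python
from dataclasses import dataclass

@dataclass
class BindingEnergyTerms:
    """Components of the binding free energy (kcal/mol), assumed precomputed elsewhere."""
    dE_elec: float    # change in electrostatic energy upon binding
    dE_vdw: float     # change in van der Waals energy
    dE_int: float     # change in internal (bonded) energy
    dG_des: float     # desolvation free energy
    TdS_sc: float     # T times the entropy change of the interface side chains
    dG_const: float = 15.0  # constant term dG_o used in this work

def binding_free_energy(t: BindingEnergyTerms) -> float:
    """Evaluate Eq. (1): dG = dE_elec + dE_vdw + dE_int + dG_des - T*dS_sc + dG_o."""
    return t.dE_elec + t.dE_vdw + t.dE_int + t.dG_des - t.TdS_sc + t.dG_const
```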
B. Mathematical Properties

Multi-frequency behavior: The free energy function can be regarded as the sum of three components with different frequencies. First, the sum of the electrostatic, desolvation, and entropic terms changes relatively slowly along any reaction path, and hence we define the "smooth" free energy, or the smooth component of the free energy, by

$$\Delta G_s = \Delta E_{elec} + \Delta G_{des} - T\Delta S_{sc} + \Delta G_o, \qquad (2)$$

where the desolvation free energy $\Delta G_{des}$ does not include the solvent-solute van der Waals term. $\Delta G_s$ is much less sensitive to structural perturbations than the terms $\Delta E_{vdw}$ and $\Delta E_{int}$. The internal energy $\Delta E_{int}$ changes with an intermediate frequency, and the frequency of change is very high for $\Delta E_{vdw}$.

In local minima in which the internal and van der Waals terms are close to zero, the free energy surface is essentially determined by the "smooth" free energy component $\Delta G_s$. However, an arbitrary pathway in the conformational space goes through non-native states at which $\Delta E_{vdw}$ and $\Delta E_{int}$ are high, resulting in the funnel-like shape shown in Fig. 1.

III. SDU: THE SEMI-DEFINITE UNDERESTIMATOR METHOD

In this section we introduce the SDU method. We first introduce some notational conventions we will be using.

Notational conventions: All vectors are assumed to be column vectors. We use lower case boldface letters to denote vectors, and for economy of space we write $x = (x_1, \ldots, x_n)$ for the column vector $x$. $x'$ denotes the transpose of $x$, $0$ the vector of all zeroes, $e$ the vector of all ones, and $e_i$ the $i$th unit vector. For any vector $x$ we write $\|x\|_1$ for the L1 norm, i.e., $\|x\|_1 = \sum_{i=1}^n |x_i|$, and $\|x\|$ for the Euclidean norm. We use upper case boldface letters to denote matrices. Specifically, we write $A = (A_{i,j})_{i,j=1}^n$ for the matrix with $(i,j)$th element equal to $A_{i,j}$. We denote by $\mathrm{diag}(x)$ the diagonal matrix with elements $x_1, \ldots, x_n$ on the main diagonal and zeroes elsewhere. We also denote by $\mathrm{diag}(A, B)$ the block diagonal matrix with matrices $A$ and $B$ on the main diagonal and zeroes elsewhere. We define

$$F \bullet Y \triangleq \sum_{i=1}^n \sum_{j=1}^n F_{i,j} Y_{i,j}. \qquad (3)$$

Finally, for any event $S$ we use $1\{S\}$ to denote the indicator function of this event, that is, $1\{S\}$ equals one when the event occurs and zero otherwise.

We are now prepared to describe the two key components of the SDU algorithm.

A. Constructing an Underestimator

Let us denote by $f: \mathbb{R}^n \to \mathbb{R}$ the function we seek to minimize and assume we have obtained a set of $K$ local minima $\phi_1, \ldots, \phi_K$ of $f(\cdot)$. Let the underestimator be defined by

$$U(\phi) \triangleq \phi' Q \phi + b'\phi + c, \qquad (4)$$

where $Q \in \mathbb{R}^{n \times n}$ is a positive semi-definite matrix, $b \in \mathbb{R}^n$, and $c$ is a scalar. The positive semi-definiteness of $Q$ guarantees the convexity of $U(\cdot)$.

Using the L1 norm as a distance metric, the problem of finding the tightest possible such underestimator $U(\cdot)$ can be formulated as follows:

$$\begin{aligned}
\min \quad & \sum_{j=1}^K \left( f(\phi_j) - c - \phi_j' Q \phi_j - b'\phi_j \right) \\
\text{s.t.} \quad & f(\phi_j) \ge c + \phi_j' Q \phi_j + b'\phi_j, \quad j = 1, \ldots, K, \\
& Q \succeq 0,
\end{aligned} \qquad (5)$$

where the decision variables are $Q$, $b$, and $c$, and "$\succeq 0$" denotes positive semi-definiteness.
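To make the fitting problem (5) concrete, here is a minimal sketch that solves it directly in the variables $Q$, $b$, $c$, assuming the cvxpy modeling package (with an SDP-capable solver such as SCS) is available; the function name and interface are ours. The paper instead rewrites (5) into the standard form (SDP-P) derived next, which is what an interior-point solver ultimately consumes; a modeling layer such as cvxpy performs a comparable reformulation internally.

```python
import numpy as np
import cvxpy as cp

def fit_underestimator(local_minima, f_values):
    """Solve (5): the tightest convex quadratic U(phi) = phi'Q phi + b'phi + c
    underestimating f at the given local minima, in the L1 sense."""
    Phi = np.asarray(local_minima, dtype=float)   # K x n array of local minima
    f = np.asarray(f_values, dtype=float)         # K-vector of f(phi_j)
    K, n = Phi.shape

    Q = cp.Variable((n, n), PSD=True)             # Q >= 0 guarantees U is convex
    b = cp.Variable(n)
    c = cp.Variable()

    U_vals = [cp.quad_form(Phi[j], Q) + b @ Phi[j] + c for j in range(K)]
    constraints = [U_vals[j] <= float(f[j]) for j in range(K)]     # f(phi_j) >= U(phi_j)
    gap = sum(float(f[j]) - U_vals[j] for j in range(K))           # L1 gap; each term is >= 0
    cp.Problem(cp.Minimize(gap), constraints).solve()

    Q_opt, b_opt, c_opt = Q.value, b.value, c.value
    # Predictive conformation: when Q is invertible, phi_P = -(1/2) Q^{-1} b.
    phi_P = -0.5 * np.linalg.solve(Q_opt, b_opt)
    return Q_opt, b_opt, c_opt, phi_P
```

The returned $\phi_P$ is the predictive conformation later used to bias the sampling step.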
Let vectors $b^+, b^- \ge 0$ and scalars $c^+, c^- \ge 0$ satisfy $b = b^+ - b^-$ and $c = c^+ - c^-$. Let $s = (s_1, \ldots, s_K)$ and let $Y$ be the block diagonal matrix given by

$$Y \triangleq \mathrm{diag}(Q, \mathrm{diag}(b^+), \mathrm{diag}(b^-), c^+, c^-, \mathrm{diag}(s)). \qquad (6)$$

Note that $Y \in \mathbb{R}^{(3n+K+2) \times (3n+K+2)}$. Let $F_0 \triangleq \mathrm{diag}(\mathrm{diag}(0), -\mathrm{diag}(e))$, where $0$ is the $(3n+2)$-dimensional zero vector and $e$ is the $K$-dimensional vector of ones. Also, for $j = 1, \ldots, K$ we define

$$F_j \triangleq \mathrm{diag}(\phi_j \phi_j', \mathrm{diag}(\phi_j), -\mathrm{diag}(\phi_j), 1, -1, \mathrm{diag}(e_j)).$$

In addition, let $E_{i,j}$ denote the $(3n+K+2) \times (3n+K+2)$ matrix with all elements equal to zero except the $(i,j)$th element, which equals 1. Then, (5) can be written as follows:

$$(\text{SDP-P}) \quad \begin{aligned}
\max \quad & F_0 \bullet Y \\
\text{s.t.} \quad & F_j \bullet Y = f(\phi_j), \quad j = 1, \ldots, K, \\
& E_{i,j} \bullet Y = 0, \quad j = 1, \ldots, i-1,\ i = n+1, \ldots, 3n+K+2, \\
& Y \succeq 0,
\end{aligned} \qquad (7)$$

where the decision variable is the matrix $Y$. Problem (SDP-P) in (7) is a Semi-Definite Programming (SDP) problem [7]. SDP problems can be solved efficiently (in polynomial time) using interior-point methods [7].

The dual to (SDP-P) in (7) is the problem

$$(\text{LMI-D}) \quad \begin{aligned}
\min \quad & \sum_{j=1}^K x_j f(\phi_j) \\
\text{s.t.} \quad & Z = \sum_{j=1}^K x_j F_j + \sum_{i=n+1}^{3n+K+2} \sum_{j=1}^{i-1} w_{i,j} E_{i,j} - F_0, \\
& Z \succeq 0,
\end{aligned} \qquad (8)$$

where the decision variables are the $x_j$'s and $w_{i,j}$'s. Problem (LMI-D) can be seen as the problem of minimizing a linear function subject to the Linear Matrix Inequality (LMI)

$$\sum_{j=1}^K x_j F_j + \sum_{i=n+1}^{3n+K+2} \sum_{j=1}^{i-1} w_{i,j} E_{i,j} - F_0 \succeq 0.$$

Our main result on underestimating a set of local minima is summarized in the following theorem.

Theorem III.1 Consider a function $f: \mathbb{R}^n \to \mathbb{R}$ and a set of local minima $\phi_1, \ldots, \phi_K$ of $f(\cdot)$. Let $(Q, b^+, b^-, c^+, c^-, s)$ form an optimal solution $Y$ of Problem (SDP-P) in (7), where $Y$ is defined as in (6). Let $U(\phi) \triangleq \phi' Q \phi + (b^+ - b^-)'\phi + (c^+ - c^-)$. Then $U(\cdot)$ satisfies $f(\phi_j) \ge U(\phi_j)$ for all $j = 1, \ldots, K$ while minimizing $\|(f(\phi_1), \ldots, f(\phi_K)) - (U(\phi_1), \ldots, U(\phi_K))\|_1$. Moreover, the dual to Problem (SDP-P) is the LMI problem (LMI-D) in (8).

Hereafter, we will say that a function $U(\cdot)$ satisfying the statement of Theorem III.1 underestimates $f(\cdot)$ at the points $\phi_1, \ldots, \phi_K$. Figure 1 illustrates such an underestimator.

Fig. 1. A funnel-like shaped function and a quadratic function underestimating the surface spanned by the local minima.

B. Sampling

Suppose we are seeking the native conformation in some region $B$. Using a set of local minima $\phi_1, \ldots, \phi_K \in B$ of $f(\cdot)$ we construct an underestimator $U(\cdot)$ as described in Section III-A. Let $\phi_P$ be the minimizer of $U(\cdot)$. Notice that the underestimator contains information on the location of the near-native energy valley. We are interested in sampling conformations such that conformations close to $\phi_P$ are more likely to be selected. In addition, conformations with high enough energy can be ignored. This can be achieved by using the following probability density function (pdf) in $B$:

$$g(\phi) = \frac{U(\phi) - U_{\max}}{\int_B (U(\phi) - U_{\max})\, d\phi} \triangleq \frac{U(\phi) - U_{\max}}{A}. \qquad (9)$$

In the expression above, $U_{\max} = \max_B U(\phi)$, and we introduced the normalizing constant $A$ to denote the integral in the denominator.

To generate random samples in $B$ using the above pdf we will use the so-called rejection method. In particular, let $h(\phi) = 1/V$ be the uniform pdf in $B$, where $V = \int_B d\phi$ is the volume of $B$. Notice that

$$g(\phi) \le \frac{V (U(\phi_P) - U_{\max})}{A}\, h(\phi), \quad \forall \phi \in B,$$

and set $R(\phi)$ equal to the ratio of the left hand side over the right hand side of the above, that is,

$$R(\phi) \triangleq \frac{U(\phi) - U_{\max}}{U(\phi_P) - U_{\max}}. \qquad (10)$$

In order to discard high energy conformations we are interested in sampling points in $B$ with associated probability density in some interval $[\zeta, 1]$, where $\zeta \in [0, 1)$. The algorithm in Fig. 2 outputs such a sample point. To see that, notice that in Step 1 we generate uniformly distributed samples in the set $\{(x, y) \mid x \in B,\ y \in [\zeta, 1]\}$. The rejection rule of Step 2 accepts samples that are uniformly distributed in $\{(x, y) \mid x \in B,\ \zeta \le y \le g(x)\}$. Thus, the output $\phi$ of the algorithm is distributed in $B$ according to $g(\cdot)$.

1) Generate uniformly distributed random variables $x_1 \in B$ and $x_2 \in [\zeta, 1]$.
2) If $x_2 \le R(x_1)$, stop and output $\phi = x_1$; otherwise, return to Step 1.

Fig. 2. An algorithm generating a sample in $B$ drawn from $g(\cdot)$ with associated density in $[\zeta, 1]$.

We finally note that the algorithm in Fig. 2 requires knowing $U_{\max}$. In many cases this is straightforward to obtain. Consider for instance the case where $B$ is a polyhedron. Then, since $U(\cdot)$ is convex, it achieves its maximum at an extreme point of the polyhedron $B$. Hence, it suffices to search over all extreme points, which in low-dimensional problems (e.g., rigid docking) are not that many. Alternatively, one can use an estimate of $U_{\max}$, e.g., $\max_i U(\phi_i)$.
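As an illustration, the sketch below implements the rejection rule of Fig. 2 for the common case where $B$ is a box; numpy is assumed, and all names are ours rather than the authors'. It draws $x_1$ uniformly in $B$ and $x_2$ uniformly in $[\zeta, 1]$ and accepts when $x_2 \le R(x_1)$, with $U_{\max}$ obtained by enumerating the vertices of the box (or supplied directly, e.g., as the estimate $\max_i U(\phi_i)$ mentioned above).

```python
import numpy as np

def sample_conformation(U, phi_P, lower, upper, zeta=0.0, U_max=None,
                        rng=None, max_tries=100000):
    """Rejection sampling of Fig. 2: return a point in the box B = [lower, upper]
    distributed according to g(.), restricted to density values in [zeta, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    if U_max is None:
        # For a convex U on a box, the maximum is attained at a vertex;
        # enumerating the 2^n vertices is cheap in low dimensions (rigid docking).
        n = lower.size
        vertices = [np.where([(k >> i) & 1 for i in range(n)], upper, lower)
                    for k in range(2 ** n)]
        U_max = max(U(v) for v in vertices)
    U_P = U(phi_P)                          # minimum value of the underestimator
    for _ in range(max_tries):
        x1 = rng.uniform(lower, upper)      # uniform candidate in B
        x2 = rng.uniform(zeta, 1.0)         # uniform height in [zeta, 1]
        R = (U(x1) - U_max) / (U_P - U_max)  # acceptance ratio of Eq. (10)
        if x2 <= R:
            return x1
    raise RuntimeError("no sample accepted; check zeta or the region B")
```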
C. The SDU Algorithm

We now have all the ingredients to present our SDU algorithm. The algorithm seeks a global minimum of the free energy function $f(\cdot)$ in some region $B$ of the conformational space; it is presented in Fig. 3. Throughout the algorithm we maintain a set $L$ of interesting local minima obtained so far, as well as the best such local minimum, denoted by $\phi_G$.

1) Initialization: Starting from $K$ ($K \ge 2n+1$) random points in $B$, perform local minimization to obtain $K$ local minima $\phi_1, \ldots, \phi_K$ of $f(\cdot)$. Set $L = \{\phi_1, \ldots, \phi_K\}$ and $\phi_G = \arg\min_{i=1,\ldots,K} f(\phi_i)$.
2) Underestimation: Solve Problem (SDP-P) in (7) to obtain the underestimator $U(\phi)$. Set the predictive conformation equal to a minimizer of $U(\cdot)$; that is, when $Q$ is invertible, $\phi_P = -\frac{1}{2} Q^{-1} b$.
3) Elimination: Discard unfavorable local minima from $L$; more specifically, set $L := L \setminus \{\phi \in L \mid R(\phi) < \zeta \text{ and } \phi \ne \phi_G\}$.
4) Focalization: Define a neighborhood $N(\phi_P) \subseteq B$ of $\phi_P$. Set $B := N(\phi_P)$.
5) Exploration:
   a) Start from $\phi_P$ and use local minimization to obtain a local minimum $\hat{\phi}_P$ of $f(\cdot)$. If $\hat{\phi}_P \in B$, set $L := L \cup \{\hat{\phi}_P\}$ and $\phi_G := \arg\min_{\phi \in \{\phi_G, \hat{\phi}_P\}} f(\phi)$.
   b) Obtain $m$ samples from the sampling algorithm of Fig. 2. Using these samples as starting points, perform local minimization to obtain $m$ local minima $x_1, \ldots, x_m$ of $f(\cdot)$. Set $L := L \cup \{x \mid x = x_1, \ldots, x_m,\ x \in B\}$ and $\phi_G := \arg\min_{\phi = \phi_G, x_1, \ldots, x_m;\ \phi \in B} f(\phi)$.
6) Termination: If $\|\phi_G - \phi_P\| < \epsilon$, then stop; otherwise go to Step 2.

Fig. 3. The SDU algorithm.

The evolution of the algorithm in Fig. 3 depends on the parameters $K$, $\zeta \in [0,1]$, $m$, and $\epsilon$, as well as on the way we define the neighborhood $N(\phi_P)$ in Step 4. These will be appropriately tuned in every problem instance.

A couple of remarks on the proposed SDU algorithm are in order. The algorithm combines exploration with focalization in energy-favorable regions of the conformational space (energy funnels). The exploration is in fact biased towards these energy-favorable funnels. This is motivated by the desire to avoid computationally expensive exploration in areas of the conformational space that are not likely to contain the native structure.

We should point out that we make no claims that the SDU algorithm will converge to the global minimum of $f(\cdot)$. In fact, it is straightforward to see that it will not find the global minimum if we do not use enough local minima to determine the underestimator or when $f(\cdot)$ is arbitrary and does not have a funnel-like shape. However, later in the paper we will provide arguments that guarantee convergence for funnel-like shaped functions under a suitable set of conditions.
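Putting the pieces together, the following is a bare-bones rendering of the loop in Fig. 3, not the authors' implementation. It assumes scipy is available and makes several simplifications: scipy.optimize.minimize stands in for the energy minimization, the neighborhood $N(\phi_P)$ is a fixed-radius box intersected with the current region, $U_{\max}$ is estimated by $\max_i U(\phi_i)$ as suggested above, and fit_underestimator and sample_conformation are the illustrative helpers sketched earlier.

```python
import numpy as np
from scipy.optimize import minimize

def sdu(f, lower, upper, K=20, m=10, zeta=0.2, eps=1e-3, radius=1.0,
        max_iters=50, rng=None):
    """Schematic SDU loop (Fig. 3): underestimate, eliminate, focalize, explore."""
    rng = np.random.default_rng() if rng is None else rng
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    local_min = lambda x0: minimize(f, x0, method="L-BFGS-B",
                                    bounds=list(zip(lower, upper))).x
    # Step 1: K local minima from random starting points in B (K >= 2n+1 in the paper).
    L = [local_min(rng.uniform(lower, upper)) for _ in range(K)]
    phi_G = min(L, key=f)
    for _ in range(max_iters):
        # Step 2: underestimation; phi_P is the predictive conformation.
        Q, b, c, phi_P = fit_underestimator(L, [f(p) for p in L])
        U = lambda x: x @ Q @ x + b @ x + c
        # Step 3: elimination of unfavorable local minima (always keep phi_G).
        U_max = max(U(p) for p in L)           # estimate of max_B U, as in the text
        R = lambda x: (U(x) - U_max) / (U(phi_P) - U_max)
        L = [p for p in L if R(p) >= zeta or np.allclose(p, phi_G)]
        # Step 4: focalization -- restrict B to a box neighborhood of phi_P.
        lower = np.maximum(lower, phi_P - radius)
        upper = np.minimum(upper, phi_P + radius)
        # Step 5: exploration from phi_P and from m biased samples.
        starts = [phi_P] + [sample_conformation(U, phi_P, lower, upper, zeta,
                                                U_max=U_max, rng=rng)
                            for _ in range(m)]
        for x in map(local_min, starts):
            L.append(x)
            if f(x) < f(phi_G):
                phi_G = x
        # Step 6: termination test.
        if np.linalg.norm(phi_G - phi_P) < eps:
            break
    return phi_G
```

In an actual docking code the local minimization would be the physics-based energy minimization, and $N(\phi_P)$ and the parameters $K$, $\zeta$, $m$, $\epsilon$ would be tuned per problem instance, as the text notes.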
IV. CGU AND ITS LIMITATIONS

The CGU algorithm [5] can be viewed as a special case of the SDU algorithm under the following key modifications:

1) Underestimation. In deriving the underestimator $U(\cdot)$, impose the constraint that the matrix $Q$ is a diagonal positive semi-definite matrix. Then the semi-definite constraint can be replaced by a non-negativity constraint on all diagonal entries. It follows that Problem (SDP-P) can be reformulated as a linear programming (LP) problem, which can be easily solved.
2) Sampling. Replace our biased sampling method with random (uniform) sampling in the neighborhood $N(\phi_P) \subseteq B$ of $\phi_P$.

We will argue that these two differences between CGU and SDU drastically affect the performance of the CGU algorithm. In particular, limiting the underestimator search to the class of canonical parabolas (diagonal $Q$) substantially reduces the efficiency and accuracy of CGU for general problems, where the surface spanned by the local minima is not typically aligned with the canonical coordinates defining the underestimating parabola. [3] reports many such cases where CGU performs poorly. Some attempts at addressing this limitation have been made in [8], but they are only able to handle very special cases.

We start our study of CGU's limitations by providing a simple example where CGU fails. Consider the function $f(\phi) = 100\phi_1^2 - 10\phi_1\phi_2 + \phi_2^2$, whose global minimum is at the origin. We use CGU to underestimate this function. In Fig. 4 we plot contours of the function $f(\cdot)$ and its resulting CGU underestimator $U_{CGU}(\cdot)$.

Fig. 4. CGU yields different results depending on the sample region.

More specifically, in Fig. 4(a) we randomly (and uniformly) sampled the region $[-1, 10] \times [-1, 10]$ to obtain a large set of points, which we used to construct the CGU underestimator. The underestimator $U_{CGU}(\cdot)$ has a global minimum (to be referred to as the prediction) at $(0, 10)$. Notice that CGU constrains its prediction within the sampling region. In Fig. 4(b) we performed the same experiment but used $[-1, 20] \times [-1, 20]$ as the sampling region, and CGU's prediction was $(0, 20)$. In both cases, CGU's prediction is at the boundary because the minimization of $U_{CGU}(\cdot)$ is constrained within the sampling region; unconstrained minimization produces an even worse result. It is evident that the prediction heavily depends on the initial sampling region, which in most cases is set arbitrarily. In the next subsection we analyze the CGU underestimating approach and compare it to the one we employ in SDU.
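To see how the diagonal restriction of modification 1 plays out in code, the sketch below (our illustration, again assuming cvxpy) restricts $Q$ to a diagonal matrix $\mathrm{diag}(d)$ with $d \ge 0$, so the semi-definite constraint collapses to elementwise non-negativity and the fit of (5) becomes a plain linear program, which is essentially the CGU construction.

```python
import numpy as np
import cvxpy as cp

def fit_cgu_underestimator(local_minima, f_values):
    """CGU-style fit of U(phi) = sum_i d_i*phi_i^2 + b'phi + c with d >= 0.
    With Q restricted to diag(d), positive semi-definiteness reduces to d >= 0,
    so the underestimation fit is a linear program rather than an SDP."""
    Phi = np.asarray(local_minima, dtype=float)   # K x n array of sample points
    f = np.asarray(f_values, dtype=float)
    n = Phi.shape[1]
    d = cp.Variable(n, nonneg=True)               # diagonal entries of Q
    b = cp.Variable(n)
    c = cp.Variable()
    U_vals = Phi**2 @ d + Phi @ b + c             # U evaluated at every sample point
    prob = cp.Problem(cp.Minimize(float(np.sum(f)) - cp.sum(U_vals)),  # L1 gap; f is constant
                      [U_vals <= f])                                   # underestimation
    prob.solve()                                   # an LP; any LP-capable solver works
    return d.value, b.value, c.value
```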
A. Comparing the CGU and SDU Underestimators

As we discussed in Section III, a quadratic underestimator will not be informative if either (i) $f(\cdot)$ is not funnel-like and the envelope of local minima cannot be well approximated by a convex quadratic, or (ii) we do not use a rich enough set of local minima in constructing $U(\cdot)$. In the following we wish to remove these two potential sources of poor performance in order to better assess the underestimating power of CGU and SDU. More specifically, we consider the "ideal" case of underestimating a convex quadratic given by $f(t) = t'\bar{Q}t + \bar{b}'t + \bar{c}$, where $\bar{Q} \succeq 0$. Further, we assume that an infinite number of sample points of $f(\cdot)$ in some compact sampling region $B$ is at our disposal when we construct the underestimator. The construction of the underestimator based on utilizing all sample points in $B$ can be formulated as the following (infinite-dimensional) optimization problem:

$$\begin{aligned}
\text{minimize} \quad & \int_{t \in B} (f(t) - U(t))\, dt \\
\text{subject to} \quad & f(t) \ge U(t), \quad t \in B,
\end{aligned} \qquad (11)$$

where the decision variables are the (yet unspecified) parameters defining $U(t)$.

Suppose first that we use the SDU underestimating approach and seek to construct a function $U(t) = t'Qt + b't + c$, where $Q \succeq 0$. Consider the problem in (11) for such a $U(t)$. The next proposition is immediate.

Proposition IV.1 SDU can underestimate $f(\cdot)$ exactly; in particular, $(Q, b, c) = (\bar{Q}, \bar{b}, \bar{c})$ is an optimal solution of (11).

We next consider the CGU underestimation approach. Specifically, we seek to construct a function $U(t) = t'Dt + b't + c$, where $D$ is a diagonal positive semi-definite matrix; namely, $D = \mathrm{diag}(d_1, \ldots, d_n)$ with $d_i \ge 0$ for $i = 1, \ldots, n$. For simplicity of the exposition, $B = B_1 \times \cdots \times B_n$, where $B_i = [l_i, u_i]$ and $u_i - l_i = T$ for all $i = 1, \ldots, n$. We denote $a(t) = (t_1^2, \ldots, t_n^2, t_1, \ldots, t_n, 1)$,

$$h = \Big( \int_{B_1} t_1^2\, dt_1 / T, \ldots, \int_{B_n} t_n^2\, dt_n / T,\ \int_{B_1} t_1\, dt_1 / T, \ldots, \int_{B_n} t_n\, dt_n / T,\ 1 \Big),$$

and $z = (d_1, \ldots, d_n, b_1, \ldots, b_n, c)$. In this case, the optimization problem in (11) is equivalent to the following problem:

$$(\text{LSIP-P}) \quad \begin{aligned}
\max \quad & h'z \\
\text{s.t.} \quad & a'(t) z \le f(t), \quad t \in B,
\end{aligned} \qquad (12)$$

where $z$ is the decision vector. Note that it involves an infinite number of constraints. A problem with such a structure is known as a Linear Semi-Infinite Programming (LSIP) problem [9]. Its dual can be formulated in measure space as follows:

$$(\text{LSIP-D}) \quad \begin{aligned}
\min \quad & \int_B f(t)\, d\mu(t) \\
\text{s.t.} \quad & \int_B a(t)\, d\mu(t) = h, \\
& \mu \in M^+(B),
\end{aligned} \qquad (13)$$

where $M^+(B)$ denotes the set of non-negative regular Borel measures on $B$.

It can be shown (we omit the details due to space limitations) that the underestimator obtained by solving (LSIP-P) in (12) is the limit (as $K \to \infty$) of the CGU underestimators derived based on the function values $f(t_1), \ldots, f(t_K)$ at a set of samples $t_1, \ldots, t_K$ from $B$. This is insightful because it suggests that when we use enough samples the quality of the CGU underestimator does not depend on sample selection but rather on the fundamental structure of the underestimator function. Our main result in this section is the following theorem; the proof is omitted in the interest of space.

Theorem IV.2 Let $f(t) = t'\bar{Q}t$, where $\bar{Q} \succeq 0$. Further, let $U^*(t) = t'D^*t + b^{*\prime}t + c^*$ be the optimal solution to (LSIP-P), i.e., the optimal CGU underestimator of $f(t)$. Then, in general, $b^* \ne 0$; in other words, the minimizer of the underestimator is different from the minimizer of $f(t)$.
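The contrast between Proposition IV.1 and Theorem IV.2 can be checked numerically with the two illustrative fitting routines sketched earlier. The toy experiment below (ours, not the paper's computation) densely samples $f(\phi) = 100\phi_1^2 - 10\phi_1\phi_2 + \phi_2^2$ over $[-1,10] \times [-1,10]$ as a stand-in for the "enough samples" regime and compares the resulting predictions; Proposition IV.1 suggests the full-quadratic fit can match $f$ exactly, so its minimizer should sit near the origin, while Theorem IV.2 predicts the diagonal fit generally has $b^* \ne 0$ and a displaced minimizer.

```python
import numpy as np

def example_quadratic(phi):
    """f(phi) = 100*phi1^2 - 10*phi1*phi2 + phi2^2; global minimum at the origin."""
    phi = np.asarray(phi, dtype=float)
    return 100.0 * phi[..., 0]**2 - 10.0 * phi[..., 0] * phi[..., 1] + phi[..., 1]**2

# Dense grid of sample points over the box [-1, 10] x [-1, 10].
grid = np.linspace(-1.0, 10.0, 15)
pts = np.array([(x1, x2) for x1 in grid for x2 in grid])
vals = example_quadratic(pts)

# Full convex quadratic (SDU-style) versus diagonal (CGU-style) underestimators.
Q_sdu, b_sdu, c_sdu, phi_P = fit_underestimator(pts, vals)
d_cgu, b_cgu, c_cgu = fit_cgu_underestimator(pts, vals)

# CGU's prediction restricted to the sampling region, as in the Fig. 4 discussion.
u_cgu = pts**2 @ d_cgu + pts @ b_cgu + c_cgu
print("SDU prediction:", phi_P)                  # expected near the origin (Prop. IV.1)
print("CGU prediction:", pts[np.argmin(u_cgu)])  # expected on the boundary (Thm. IV.2)
print("CGU linear term b*:", b_cgu)              # generally nonzero (Thm. IV.2)
```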
V. ON SDU'S CONVERGENCE

In this section we give the result that, under appropriate conditions, the SDU algorithm converges to the global minimum of the function $f(\cdot)$. Such (free energy) functions arising in molecular docking applications, as we have explained, possess key structural properties. Therefore, we will impose a set of structural assumptions on $f(\cdot)$ and the search region $B$ that reflect the properties of the free energy functions we seek to minimize. We denote by $\mathrm{epi}(f)$ the epigraph of $f(\cdot)$, which is defined as $\mathrm{epi}(f) = \{(\phi, w) \mid \phi \in B,\ f(\phi) \le w\}$. We also denote by $\mathrm{conv}(S)$ the convex hull of any set $S$.

Assumption A Assume that $f(\phi)$ satisfies the following set of conditions: (i) it is continuously differentiable; (ii) $f(\cdot)$ has a unique global minimum in $B$; (iii) $B$ is compact; (iv) for every local minimum $\phi$ of $f(\cdot)$ there exists an open set such that $\phi$ is the unique minimizer of $f(\cdot)$ within this set; (v) the extreme points of $\mathrm{conv}(\mathrm{epi}(f))$ lie on a quadratic function $\tilde{U}(\phi) = \phi'\tilde{Q}\phi + \tilde{b}'\phi + \tilde{c}$; (vi) $\tilde{U}(\phi)$ has a unique global minimum which is identical with the global minimum of $f(\cdot)$ in $B$.

For functions that satisfy Assumption A we will say that they have a funnel-like shape (see Fig. 1 for an illustration). As we argued in Section II, Assumption A is not overly restrictive for the free energy functions we are interested in minimizing.

The following theorem establishes that, given sufficient sampling of the search region $B$, the SDU underestimation procedure can locate the global minimum of $f(\cdot)$, which we denote by $\phi^*$. The proof is again omitted due to space limitations.

Theorem V.1 Let Assumption A prevail. Consider the SDU algorithm provided in Fig. 3 and assume that $B$ contains at least $(n+1)(n+2)/2$ local minima of $f(\cdot)$ which are extreme points of $\mathrm{conv}(\mathrm{epi}(f))$. Suppose that in Step 1 of the algorithm we obtain $K$ uniformly distributed samples in $B$ and for each one of those we perform local minimization to obtain $K$ local minima $\phi_1, \ldots, \phi_K$ of $f(\cdot)$. Then the global minimum $\phi_P$ of the underestimator $U(\cdot)$ obtained in Step 2 of the algorithm converges in probability to the global minimum $\phi^*$ of $f(\cdot)$ as $K \to \infty$; namely, $\lim_{K \to \infty} P[\phi_P = \phi^*] = 1$.