A Theoretical Analysis of Optimization by Gaussian Continuation

Hossein Mobahi and John W. Fisher III
Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT)
Abstract

Optimization via the continuation method is a widely used approach for solving nonconvex minimization problems. While this method generally does not provide a global minimum, empirically it often achieves a superior local minimum compared to alternative approaches such as gradient descent. However, theoretical analysis of this method is largely unavailable. Here, we provide a theoretical analysis that bounds the endpoint solution of the continuation method. The derived bound depends on a problem-specific characteristic that we refer to as the optimization complexity. We show that this characteristic can be analytically computed when the objective function is expressed in suitable basis functions. Our analysis combines elements of scale-space theory, regularization, and differential equations.

1 Introduction

Nonconvex energy minimization problems arise frequently in learning and inference tasks. For example, consider some fundamental tasks in computer vision: inference in image segmentation (Mumford and Shah 1989), image completion (Mobahi, Rao, and Ma 2009), and optical flow (Sun, Roth, and Black 2010), as well as learning of part-based models (Felzenszwalb et al. 2010) and dictionary learning (Mairal et al. 2009), all involve nonconvex objectives. In nonconvex optimization, computing the global minimum is generally intractable and, as such, heuristic methods are sought. These methods may not always find the global minimum, but often provide good suboptimal solutions. A popular heuristic is the so-called continuation method. It starts by solving an easy problem, and progressively changes it to the actual complex task. Each step in this progression is guided by the solution obtained in the previous step. This idea is very popular owing to its ease of implementation and often superior empirical performance

Copyright © 2015, Association for the Advancement of Artificial Intelligence. All rights reserved.
[Footnote 1: That is, finding much deeper local minima, if not the global minima.]

against alternatives such as gradient descent. Instances of this concept have been utilized by the artificial intelligence community for more than three decades. Examples include graduated nonconvexity (Blake and Zisserman 1987), mean field theory (Yuille 1987), deterministic annealing (Rose, Gurewitz, and Fox 1990), and optimization via scale-space (Witkin, Terzopoulos, and Kass 1987). It is widely used in various state-of-the-art solutions (see Section 2). Despite that, there exists no theoretical understanding of the method itself [Footnote 2: We note that prior application-tailored analysis is available, e.g. (Kosowsky and Yuille 1994). However, there is no general and application-independent result in the literature.]. For example, it is not clear which properties of the problem make its associated optimization easy or difficult for this approach.

This paper provides a bound on the objective value attained by the continuation method. The derived bound monotonically depends on a particular characteristic of the objective function; that is, a lower value of this characteristic guarantees attaining a lower objective value by the continuation. The characteristic reflects the complexity of the optimization task, and hence we refer to it as the optimization complexity. Importantly, we show that this complexity parameter is computable when the objective function is expressed in suitable basis functions such as Gaussian Radial Basis Functions (RBFs).

We provide a brief description of our main result here, while the complete statement is postponed to Theorem 7. Let $f(x)$ be a nonconvex function to be minimized and let $\hat{x}$ be the solution discovered by the continuation method. Let $f^\dagger$ be the minimum of the simplified objective function. Then $f(\hat{x}) \leq w_1 f^\dagger + w_2 \alpha$, where $w_1 \geq 0$ and $w_2 \geq 0$ are independent of $f$, and $\alpha$ is the optimization complexity of $f$. When $f$ can be expressed by Gaussian RBFs, $f(x) = \sum_{k=1}^{K} a_k \, e^{-\frac{\|x - x_k\|^2}{\delta^2}}$, then in Proposition 9 we show that its optimization complexity $\alpha$ is proportional to

$\sqrt{\sum_{j=1}^{K} \sum_{k=1}^{K} a_j a_k \, e^{-\frac{\|x_j - x_k\|^2}{\delta^2 + \epsilon^2}}}.$
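As an illustration, the double-sum quantity above can be evaluated directly from the RBF coefficients and centers. The sketch below is illustrative only: it drops the proportionality constant of Proposition 9 and takes the pairwise width $\delta^2 + \epsilon^2$ as in the display above.

```python
import numpy as np

def rbf_complexity(a, centers, delta, eps):
    """Proxy for the optimization complexity of f(x) = sum_k a_k exp(-|x-x_k|^2/delta^2):
    the square root of sum_{j,k} a_j a_k exp(-|x_j-x_k|^2 / (delta^2 + eps^2)),
    omitting the proportionality constant."""
    a = np.asarray(a, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise squared distances between all RBF centers.
    d2 = np.sum((centers[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
    s = np.sum(np.outer(a, a) * np.exp(-d2 / (delta**2 + eps**2)))
    return np.sqrt(max(s, 0.0))  # guard against tiny negative round-off

# Two well-separated bumps: the cross terms vanish, so the value
# approaches sqrt(a_1^2 + a_2^2).
alpha = rbf_complexity([1.0, -1.0], [[0.0], [100.0]], delta=1.0, eps=1.0)
```

With the centers far apart the interaction terms are negligible and `alpha` is about `sqrt(2)`; moving the centers together makes the signed coefficients interact, which changes the complexity.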
Our analysis here combines elements of scale-space theory (Loog, Duistermaat, and Florack 2001), differential equations (Widder 1975), and regularization theory (Girosi, Jones, and Poggio 1995). We clarify that optimization by continuation, which traces one particular solution, should not be confused with homotopy continuation in the context of finding all roots of a system of equations (footnote 3). Homotopy continuation has a rich theory for the latter problem (Morgan 2009; Sommese and Wampler 2005), but that is a very different problem from the optimization setup.

Throughout this article, we use $\triangleq$ for equality by definition, $x$ for scalars, $\boldsymbol{x}$ for vectors, and $X$ for sets. Denote a function by $f(.)$, its Fourier transform by $\hat{f}(.)$, and its complex conjugate by $f^*(.)$. We often denote the domain of the function by $X = \mathbb{R}^n$ and the domain of its Fourier transform by $\Omega = \mathbb{R}^n$. Let $k_\sigma(x)$, for $\sigma > 0$, denote the isotropic Gaussian kernel, $k_\sigma(x) \triangleq \frac{1}{(\sqrt{2\pi}\sigma)^n} e^{-\frac{\|x\|^2}{2\sigma^2}}$. Let $\|.\|$ indicate $\|.\|_2$, and $\mathbb{R}_{++} \triangleq \{x \in \mathbb{R} \,|\, x > 0\}$. Finally, given a function of the form $g : \mathbb{R}^n \times \mathbb{R}_{++} \to \mathbb{R}$, we write $\nabla g(x; t) \triangleq \nabla_x g(x; t)$, $\Delta g(x; t) \triangleq \Delta_x g(x; t)$, and $\dot{g}(x; t) \triangleq \frac{\partial}{\partial t} g(x; t)$.

2 Optimization by Continuation

Consider the problem of minimizing a nonconvex objective function. In optimization by continuation, a transformation of the nonconvex function to an easy-to-minimize function is considered. The method then progressively converts the easy problem back to the original function, while following the path of the minimizer. In this paper, we always choose the easier function to be convex, so that the minimizer of the easy problem can be found efficiently. This simple idea has been used with great success for various nonconvex problems. Classic examples include data clustering (Gold, Rangarajan, and Mjolsness 1994), graph matching (Gold and Rangarajan 1996; Zaslavskiy, Bach, and Vert 2009; Liu, Qiao, and Xu 2012), semi-supervised kernel machines (Sindhwani, Keerthi, and Chapelle 2006), multiple instance learning (Gehler and Chapelle 2007; Kim and Torre 2010), semi-supervised structured output (Dhillon et al. 2012), language modeling (Bengio et al. 2009), robot navigation (Pretto, Soatto, and Menegatti 2010), shape matching (Tirthapura et al.
1998), $\ell_0$ norm minimization (Trzasko and Manduca 2009), image deblurring (Boccuto et al. 2002), image denoising (Rangarajan and Chellappa 1990; Nikolova, Ng, and Tam 2010), template matching (Dufour, Miller, and Galatsanos 2002), pixel correspondence (Leordeanu and Hebert 2008), active contours (Cohen and Gorre 1995), Hough transform (Leich, Junghans, and Jentschel 2004), image matting (Price, Morse, and Cohen 2010), finding optimal parameters in computer programs (Chaudhuri and Solar-Lezama 2011), and seeking optimal proofs (Chaudhuri, Clochard, and Solar-Lezama 2014).

[Footnote 3: In principle, one may formulate the optimization problem as finding all roots of the gradient and then evaluating the objective at those points to choose the lowest. However, this is not practical as the number of stationary points can be abundant, e.g. exponential in the number of variables for polynomials.]

In fact, the growing interest in this method has made it one of the most favorable solutions for contemporary nonconvex minimization problems. Just within the past few years, the method has been utilized for low-rank matrix recovery (Malek-Mohammadi et al. 2014), error correction by $\ell_0$ recovery (Mohimani et al. 2010), super resolution (Coupe et al. 2013), photometric stereo (Wu and Tan 2013), image segmentation (Hong, Lu, and Sundaramoorthi 2013), face alignment (Saragih 2013), shape and illumination recovery (Barron 2013), 3D surface estimation (Balzer and Morwald 2012), and dense correspondence of images (Kim et al. 2013). The last two are in fact state-of-the-art solutions for their associated problems. In addition, it has recently been argued that some recent breakthroughs in the training of deep architectures (Hinton, Osindero, and Teh 2006; Erhan et al. 2009) have been made by algorithms that use some form of continuation for learning (Bengio 2009).

We now present a formal statement of optimization by the continuation method. Given an objective function $f : X \to \mathbb{R}$, where $X = \mathbb{R}^n$, consider an embedding of $f$ into a family of functions $g : X \times T \to \mathbb{R}$, where $T \triangleq [0, \infty)$, with the following properties. First, $g(x, 0) = f(x)$. Second, $g(x, t)$ is bounded below and is strictly convex in $x$ when $t$ tends to infinity (footnote 4). Third, $g(x, t)$ is continuously differentiable in $x$ and $t$.
Such an embedding $g$ is sometimes called a homotopy, as it continuously transforms one function to another. The conditions of strict convexity and boundedness from below for $g(., t)$ as $t \to \infty$ imply that there exists a unique minimizer of $g(., t)$ when $t \to \infty$. We call this minimizer $x_\infty$. Define the curve $x(t)$ for $t \geq 0$ as one with the following properties. First, $\lim_{t \to \infty} x(t) = x_\infty$. Second, $\forall t \geq 0:\ \nabla g(x(t), t) = 0$. Third, $x(t)$ is continuous in $t$. This curve simply sweeps a specific stationary path of $g$ originating at $x_\infty$, as the parameter $t$ progresses backward (see Figure 1). In general, such a curve neither needs to exist, nor to be unique. However, these conditions can be guaranteed by imposing the extra condition $\forall t \geq 0:\ \det \nabla^2 g(x; t) \neq 0$ (see e.g. Theorem 3 of Wu 1996). Throughout this paper, it is assumed that $x(t)$ exists. In practice, the continuation method is used as follows. First, $x_\infty$ is either derived analytically or

[Footnote 4: A rigorous definition of such asymptotic convexity is provided in the supplementary appendix.]

Figure 1: Plots show $g$ versus $x$ for each fixed time $t$.

Algorithm 1: Optimization by Continuation Method
1: Input: $f : X \to \mathbb{R}$, sequence $t_0 > t_1 > \dots > t_n = 0$.
2: $x_0$ = global minimizer of $g(x; t_0)$.
3: for $k = 1$ to $n$ do
4:   $x_k$ = local minimizer of $g(x; t_k)$, initialized at $x_{k-1}$.
5: end for
6: Output: $x_n$

approximated numerically by $\arg\min_x g(x; t)$ for large enough $t$. The latter can use standard convex optimization tools, as $g(x; t)$ approaches a convex function in $x$ for large $t$. Then, the stationary path $x(t)$ is numerically tracked until $t = 0$ (see Algorithm 1). As mentioned in the introduction, for a wide range of applications, the continuation solution $x(0)$ often provides a good local minimizer of $f(x)$, if not the global minimizer. Although this work only focuses on the use of homotopy continuation for nonconvex optimization, there is also interest in this method for convex optimization, e.g. to improve or guarantee the convergence rate (Xiao and Zhang 2012).

3 Analysis

Due to space limitations, only the statements of results are provided here. Full proofs are available in a supplementary appendix.

3.1 Path Independent Analysis

The first challenge we confront in developing a guarantee for the value of $g(x(0); 0)$ is that $g(.; 0)$ must be evaluated at the point $x(0)$.
However, we do not know $x(0)$ unless we actually run the continuation algorithm and see where it lands upon termination. This is obviously not an option for a theoretical analysis of the problem. Hence, the question is whether it is possible to say something about the value of $g(x(0); 0)$ without knowing the point $x(0)$. Here we prove that this is possible, and we derive an upper bound for $g(x(0); 0)$ without knowing the curve $x(t)$ itself. We do, however, require the value of $g$ at the initial point to be known. In addition, we require a global, curve-independent inequality relating $g(x; t)$ and $\dot{g}(x; t)$. Our result is stated in the following lemma.

Lemma 1 (Worst-Case Value of $g(x(t); t)$). Given a function $f : X \to \mathbb{R}$ and its associated homotopy map $g$, let $x^\dagger$ be a stationary point of $g(x; t^\dagger)$ w.r.t. $x$. Denote the curve of stationary points originating from $x^\dagger$ at $t^\dagger$ by $x(t)$, i.e. $\forall t \in [0, t^\dagger]:\ \nabla g(x(t), t) = 0$, and suppose this curve exists. Given continuous functions $a(t)$ and $b(t)$ such that $\forall t \in [0, t^\dagger]\ \forall x \in X:\ a(t)\, g(x; t) + b(t) \leq \dot{g}(x; t)$, the following inequality holds for any $t \in [0, t^\dagger]$,

$g(x(t); t) \leq g(x(t^\dagger); t^\dagger)\, e^{-\int_t^{t^\dagger} a(r)\, dr} - \int_t^{t^\dagger} b(s)\, e^{-\int_t^{s} a(r)\, dr}\, ds. \quad (1)$

The proof of this lemma essentially consists of applying a modified version of the differential form of Gronwall's inequality. The lemma determines our next challenge, which is finding the $a(t)$ and $b(t)$ for a given $f$. In order to do that, we need to be more explicit about the choice of the homotopy. Our following development relies on the Gaussian homotopy.

3.2 Gaussian Homotopy

The Gaussian homotopy $g : X \times T \to \mathbb{R}$ for a function $f : X \to \mathbb{R}$ is defined as the convolution of $f$ with $k_\sigma$,

$g(x; \sigma) \triangleq [f \star k_\sigma](x) = \int_X f(y)\, k_\sigma(x - y)\, dy. \quad (2)$

In order to emphasize that the homotopy parameter coincides with the standard deviation of the Gaussian, from here on we switch to the notation $g(x; \sigma)$ for the homotopy instead of the previously used $g(x; t)$. A well-known property of Gaussian convolution is that it obeys the heat equation (Widder 1975),

$\dot{g}(x; \sigma) = \sigma\, \Delta g(x; \sigma). \quad (3)$

This means that in Lemma 1, the condition $a(\sigma)\, g(x; \sigma) + b(\sigma) \leq \dot{g}(x; \sigma)$ can be replaced by $a(\sigma)\, g(x; \sigma) + b(\sigma) \leq \sigma\, \Delta g(x; \sigma)$. In order to find such $a(\sigma)$ and $b(\sigma)$, we first obtain a lower bound on $\Delta g(x; \sigma)$ in terms of $g(x; \sigma)$. Then, we set $a(\sigma)\, g(x; \sigma) + b(\sigma)$ to be smaller than that lower bound.
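For a Gaussian RBF objective, the convolution (2) can be carried out in closed form: convolving a bump $e^{-(x-c)^2/\delta^2}$ with $k_\sigma$ widens it to $e^{-(x-c)^2/(\delta^2 + 2\sigma^2)}$ and scales its amplitude by $\delta/\sqrt{\delta^2 + 2\sigma^2}$. This makes it easy to run Algorithm 1 end-to-end. The sketch below is a minimal 1-D illustration with arbitrary coefficients and step sizes, not the paper's procedure:

```python
import numpy as np

# f(x) = sum_k a_k exp(-(x - c_k)^2 / delta^2): three wells, the deepest near x = 1.
a = np.array([-1.0, -1.5, -0.8])
c = np.array([-3.0, 1.0, 4.0])
delta = 1.0

def g(x, sigma):
    # Closed-form Gaussian homotopy g(x; sigma) = [f * k_sigma](x).
    w2 = delta**2 + 2.0 * sigma**2
    return float(np.sum(a * (delta / np.sqrt(w2)) * np.exp(-(x - c) ** 2 / w2)))

def g_grad(x, sigma):
    # d/dx of the smoothed objective.
    w2 = delta**2 + 2.0 * sigma**2
    amp = a * (delta / np.sqrt(w2))
    return float(np.sum(amp * np.exp(-(x - c) ** 2 / w2) * (-2.0 * (x - c) / w2)))

def continuation(x0, sigmas, steps=500, lr=0.2):
    # Algorithm 1: local descent on g(.; sigma) for decreasing sigma,
    # warm-starting each stage at the previous minimizer.
    x = x0
    for s in sigmas:
        for _ in range(steps):
            x -= lr * g_grad(x, s)
    return x

x_hat = continuation(0.0, sigmas=[4.0, 2.0, 1.0, 0.5, 0.0])  # sigma = 0 recovers f
```

The heavily smoothed $g(.; 4)$ is nearly unimodal; tracking its minimizer down to $\sigma = 0$ lands near the deepest well at $x \approx 1$ rather than in one of the shallower neighboring wells.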
The Gaussian homotopy has useful properties in the context of the continuation method. First, it enjoys an optimality criterion in terms of the best convexification of $f(x)$ (Mobahi and Fisher III 2015). Second, for some complete basis functions, such as polynomials or Gaussian RBFs, Gaussian convolution has a closed-form expression. Finally, under mild conditions, a large enough bandwidth can make $g(x; \sigma)$ unimodal (Loog, Duistermaat, and Florack 2001) and hence easy to minimize. In fact, the example in Figure 1 is constructed by Gaussian convolution. Observe how the original function (bottom) gradually looks more like a convex function in the figure.

3.3 Lower Bounding $\Delta g$ as a Function of $g$

Here we want to relate $\Delta g(x; \sigma)$ to $g(x; \sigma)$. Since the differential operator $\Delta$ is only w.r.t. the variable $x$, we can simplify the notation by disregarding the dependency on $\sigma$. Hence, we work with $h(x) \triangleq g(x; \sigma)$ for some fixed $\sigma$, and the goal becomes lower bounding $\Delta h(x)$ as a function of $h(x)$. The lower bound must hold at any arbitrary point, say $x_0$. Remember, we want to bound $\Delta h(x_0)$ only as a function of the value $h(x_0)$ and not of $x_0$ itself. In other words, we do not know where $x_0$ is, but we are told what $h(x_0)$ is. We can pose this problem as the following functional optimization task, where $h_0 \triangleq h(x_0)$ is a known quantity,

$y^* \triangleq \inf_{f, x} \Delta f(x), \quad \text{s.t.}\ \ f(x) = h_0,\ \ f \equiv h. \quad (4)$

Then it follows (footnote 5) that $y^* \leq \Delta h(x_0)$. However, solving (4) is too idealistic due to the constraint $f \equiv h$ and the fact that $h(x)$ can be any complicated function. A more practical scenario is to constrain $f(x)$ to match $h(x)$ in terms of some signatures. These signatures must be easy to compute for $h(x)$ and must allow solving the associated functional optimization in $f$. A potentially useful signature for constraining the problem is the function's smoothness. We quantify the latter for a function $f(x)$ by $\int_\Omega \frac{|\hat{f}(\omega)|^2}{\hat{G}(\omega)}\, d\omega$, where $\hat{G}$ is a decreasing function called the stabilizer. This form essentially penalizes higher frequencies in $f$. Functional optimization involving this type of constraint has been studied in the realm of regularization theory in machine learning (Girosi, Jones, and Poggio 1995). Deeper mathematical details can be found in (Dyn et al. 1989; Dyn 1989; Madych and Nelson 1990).
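The smoothness signature can be evaluated numerically for a concrete function. For $h(x) = e^{-x^2}$ one has $\hat{h}(\omega) = \sqrt{\pi}\, e^{-\omega^2/4}$; pairing it with the illustrative Gaussian stabilizer $\hat{G}(\omega) = e^{-\omega^2/4}$ (chosen here only so the integral converges; these constants are assumptions, not the paper's) gives an integral with the closed form $\sqrt{\pi}$ after the $\frac{1}{2\pi}$ normalization:

```python
import numpy as np

# alpha^2 = (1/2pi) * integral |h_hat|^2 / G_hat for h(x) = exp(-x^2), n = 1.
w = np.linspace(-40.0, 40.0, 800001)      # fine grid; integrand is ~0 beyond |w| ~ 15
h_hat_sq = np.pi * np.exp(-w**2 / 2.0)    # |h_hat(w)|^2 with h_hat = sqrt(pi)*e^{-w^2/4}
G_hat = np.exp(-w**2 / 4.0)               # illustrative stabilizer (assumption)
alpha_sq = np.sum(h_hat_sq / G_hat) * (w[1] - w[0]) / (2.0 * np.pi)
# Closed form of the same integral:
# (1/2pi) * pi * integral e^{-w^2/4} dw = (1/2pi) * pi * 2*sqrt(pi) = sqrt(pi)
```

A stabilizer decaying faster than $|\hat{h}|^2$ would make the integral diverge, i.e. declare $h$ "infinitely complex"; the stabilizer therefore encodes how much high-frequency content is tolerated.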
The smoothness constraint plays a crucial role in our analysis. We denote it by $\alpha^2$ for brevity, where $\alpha^2 \triangleq \frac{1}{(2\pi)^n} \int_\Omega \frac{|\hat{h}(\omega)|^2}{\hat{G}(\omega)}\, d\omega$, and refer to this quantity as the optimization complexity. Hence, the ideal task (4) can be relaxed to the following,

[Footnote 5: If $h$ is a one-to-one map, $f(x) = h_0$ and $f \equiv h$ imply that $x = x_0$ and hence $y^* \leq \Delta h(x_0)$.]

$\tilde{y} \triangleq \inf_{f, x} \Delta f(x) \quad (5)$
$\text{s.t.}\ \ f(x) = h_0, \quad \int_\Omega \frac{|\hat{f}(\omega)|^2}{\hat{G}(\omega)}\, d\omega \leq (2\pi)^n \alpha^2.$

Since (5) is a relaxation of (4) (because the constraint $f \equiv h$ is replaced by the weaker constraint $\int_\Omega \frac{|\hat{f}(\omega)|^2}{\hat{G}(\omega)}\, d\omega \leq \int_\Omega \frac{|\hat{h}(\omega)|^2}{\hat{G}(\omega)}\, d\omega$), it follows that $\tilde{y} \leq y^*$. Since $y^* \leq \Delta h(x_0)$, we get $\tilde{y} \leq \Delta h(x_0)$, hence the desired lower bound. In the setting (5), we can indeed solve the associated functional optimization. The result is stated in the following lemma.

Lemma 2. Consider $f : X \to \mathbb{R}$ with well-defined Fourier transform. Let $\hat{G} : \Omega \to \mathbb{R}_{++}$ be any decreasing function. Suppose $f(x) = h_0$ and $\frac{1}{(2\pi)^n} \int_\Omega \frac{|\hat{f}(\omega)|^2}{\hat{G}(\omega)}\, d\omega \leq \alpha^2$ for given constants $h_0$ and $\alpha$. Then $\inf_{f, x} \Delta f(x) = c_1\, \Delta G(0) + c_2\, \Delta^2 G(0)$, where $(c_1, c_2)$ is the solution to the following system,

$c_1\, G(0) + c_2\, \Delta G(0) = h_0$
$c_1^2 \int_\Omega \hat{G}(\omega)\, d\omega + 2 c_1 c_2 \int_\Omega \|\omega\|^2\, \hat{G}(\omega)\, d\omega + c_2^2 \int_\Omega \|\omega\|^4\, \hat{G}(\omega)\, d\omega = (2\pi)^n \alpha^2. \quad (6)$

Here $\Delta^2$ means the application of the Laplace operator twice. The lemma is very general, working for any decreasing function $\hat{G} : \Omega \to \mathbb{R}_{++}$. An interesting choice for the stabilizer $\hat{G}$ is the Gaussian function (a familiar case in regularization theory due to Yuille (Yuille and Grzywacz 1989)). This leads to the following corollary.

Corollary 3. Consider $f : X \to \mathbb{R}$ with well-defined Fourier transform. Let $\hat{G}(\omega) \triangleq \epsilon^n e^{-\epsilon^2 \|\omega\|^2}$. Suppose $f(x) = h_0$ and $\frac{1}{(2\pi)^n} \int_\Omega \frac{|\hat{f}(\omega)|^2}{\hat{G}(\omega)}\, d\omega \leq \alpha^2$ for given constants $h_0$ and $\alpha$. Then $\inf_{f, x} \Delta f(x) \geq \frac{h_0 - \sqrt{h_0^2 + 2\alpha^2}}{\epsilon^2}$.

Example 1. Consider $h(x) = e^{-x^2}$, whose Fourier transform is $\hat{h}(\omega) = \sqrt{\pi}\, e^{-\omega^2/4}$. Choose a Gaussian stabilizer as in Corollary 3 and let $x_0 = 0$; obviously $h(x_0) = 1$. Computing $\alpha$ and applying Corollary 3 yields $\inf_{f, x} \Delta f(x) \geq -2$. We now show that the worst-case bound suggested by Corollary 3 is sharp for this example. Indeed, $h''(x) = (4x^2 - 2)\, e^{-x^2}$, which at $x_0 = 0$ becomes $h''(x_0) = -2$.

3.4 Extension to the Smoothed Objective

Corollary 3 applies to any function $f(x)$ that has a well-defined Fourier transform and any stabilizer of the form $\hat{G}(\omega)$. This includes any parameterized family of functions and stabilizers, as long as the parameters and $x$ are independent of each other.
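The sharpness computation in Example 1 is easy to verify directly, since $h''(x) = (4x^2 - 2)\, e^{-x^2}$ evaluates to $-2$ at $x_0 = 0$. A quick finite-difference check (illustrative only):

```python
import numpy as np

def h(x):
    return np.exp(-x**2)

def second_derivative(f, x, step=1e-5):
    # central-difference approximation of f''(x)
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2

analytic = (4.0 * 0.0**2 - 2.0) * np.exp(-0.0**2)  # h''(0) = -2
numeric = second_derivative(h, 0.0)
```

Both values agree, confirming that the one-dimensional Laplacian of $h$ at the maximizer attains the worst-case value.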
In particular, one can choose the parameter to be $\sigma$ and replace $f(x)$ by $g(x; \sigma)$ and $\hat{G}(\omega)$ by $\hat{G}(\omega; \sigma) \triangleq \epsilon^n(\sigma)\, e^{-\epsilon^2(\sigma) \|\omega\|^2}$. Note that $\sigma$ and $x$ are independent. This simple argument allows us to express Corollary 3 in the following parametric way.

Corollary 4. Consider $h : X \to \mathbb{R}$ with well-defined Fourier transform. Define $g(x; \sigma) \triangleq [h \star k_\sigma](x)$. Let $\hat{G}(\omega; \sigma) \triangleq \epsilon^n(\sigma)\, e^{-\epsilon^2(\sigma) \|\omega\|^2}$. Suppose $g(x^*; \sigma) = g_0(\sigma)$ and $\frac{1}{(2\pi)^n} \int_\Omega \frac{|\hat{g}(\omega; \sigma)|^2}{\hat{G}(\omega; \sigma)}\, d\omega \leq \alpha^2(\sigma)$ for given values $g_0(\sigma)$ and $\alpha(\sigma)$. Then $\inf_{g(.; \sigma), x^*} \Delta g(x^*; \sigma) \geq \frac{g_0(\sigma) - \sqrt{g_0^2(\sigma) + 2\alpha^2(\sigma)}}{\epsilon^2(\sigma)}$.

3.5 Choice of $\epsilon(\sigma)$

For the purpose of analysis, we restrict the choice of $\epsilon(\sigma) \geq 0$ as stated by the following proposition. This results in a monotonic $\alpha(\sigma)$, which greatly simplifies the analysis.

Proposition 5. Suppose the function $\epsilon(\sigma) \geq 0$ satisfies $0 \leq \epsilon(\sigma)\, \dot{\epsilon}(\sigma) \leq \sigma$. Then $\dot{\alpha}(\sigma) \leq 0$.

This choice can be further refined by the following proposition.

Proposition 6. The only form of $\epsilon(\sigma) \geq 0$ that satisfies $0 \leq \epsilon(\sigma)\, \dot{\epsilon}(\sigma) \leq \sigma$ is

$\epsilon(\sigma) = \beta \sqrt{\sigma^2 + \zeta}, \quad (7)$

for any $0 < \beta \leq 1$ and $\zeta \geq 0$.

3.6 Lower Bounding $\sigma \Delta g(x; \sigma)$ by $a(\sigma)\, g(x; \sigma) + b(\sigma)$

The goal of this section is finding continuous functions $a$ and $b$ such that $a(\sigma)\, g(x; \sigma) + b(\sigma) \leq \sigma\, \Delta g(x; \sigma)$. By manipulating Corollary 4, one can derive

$\Delta g(x_0; \sigma) \geq \frac{g(x_0; \sigma) - \sqrt{g^2(x_0; \sigma) + 2\alpha^2(\sigma)}}{\epsilon^2(\sigma)},$

where $\alpha^2(\sigma) \triangleq \frac{1}{(2\pi)^n\, \epsilon^n(\sigma)} \int_\Omega |\hat{g}(\omega; \sigma)|^2\, e^{\epsilon^2(\sigma) \|\omega\|^2}\, d\omega$. By multiplying both sides by $\sigma$ (remember $\sigma \geq 0$) and factorizing $\sqrt{2}\alpha(\sigma)$, the above inequality can be equivalently written as

$\sigma\, \Delta g(x_0; \sigma) \geq \frac{\sigma\, g(x_0; \sigma)}{\epsilon^2(\sigma)} - \frac{\sqrt{2}\, \sigma\, \alpha(\sigma)}{\epsilon^2(\sigma)} \sqrt{1 + \frac{g^2(x_0; \sigma)}{2\alpha^2(\sigma)}}.$

This inequality implies

$\sigma\, \Delta g(x_0; \sigma) \geq \frac{\sigma\, g(x_0; \sigma)}{\epsilon^2(\sigma)} - \frac{\sqrt{2}\, \sigma\, \alpha(\sigma)}{\epsilon^2(\sigma)} \left( 1 + \gamma\, \frac{g(x_0; \sigma)}{\sqrt{2}\alpha(\sigma)} \right),$

where $\gamma \in [1, 2]$ is any constant and we use the fact that $\forall u \in [0, \infty),\ \forall \gamma \in [1, 2]:\ \sqrt{1 + u^2} \leq 1 + \gamma u$, with $\frac{g(x_0; \sigma)}{\sqrt{2}\alpha(\sigma)}$ being $u$. The inequality now has the affine form $\sigma\, \Delta g(x_0; \sigma) \geq a(\sigma)\, g(x_0; \sigma) + b(\sigma)$, where

$a(\sigma) \triangleq \frac{(1 - \gamma)\, \sigma}{\epsilon^2(\sigma)}, \quad b(\sigma) \triangleq -\frac{\sqrt{2}\, \sigma\, \alpha(\sigma)}{\epsilon^2(\sigma)}. \quad (8)$

Note that the continuity of $\epsilon$ as stated in (7) implies continuity of $a$ and $b$.

3.7 Integrations and Final Bound

Theorem 7. Let $f : X \to \mathbb{R}$ be the objective function, and let the initial value $g(x(\sigma^\dagger); \sigma^\dagger)$ be given. Then for any $0 \leq \sigma \leq \sigma^\dagger$, and any constants $1 < \gamma \leq 2$, $0 < \beta \leq 1$, $\zeta \geq 0$, the following holds,

$g(x(\sigma); \sigma) \leq \left( \frac{\sigma^2 + \zeta}{\sigma^{\dagger 2} + \zeta} \right)^{p} g(x(\sigma^\dagger); \sigma^\dagger) + c\, \alpha(\sigma) \left( 1 - \left( \frac{\sigma^2 + \zeta}{\sigma^{\dagger 2} + \zeta} \right)^{p} \right), \quad (9)$

where $p \triangleq \frac{1 - \gamma}{2\beta^2}$ and $c \triangleq \frac{\sqrt{2}}{1 - \gamma}$.

The proof essentially combines (8) with the fact $\dot{g}(x; \sigma) = \sigma\, \Delta g(x; \sigma)$ (i.e. the heat equation) to obtain $\dot{g}(x; \sigma) \geq a(\sigma)\, g(x; \sigma) + b(\sigma)$, with $a(\sigma)$ and $b(\sigma)$ as in (8). This form is now amenable to Lemma 1.
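The family $\epsilon(\sigma) = \beta\sqrt{\sigma^2 + \zeta}$ satisfies the monotonicity condition of Proposition 5, since $\epsilon(\sigma)\, \dot{\epsilon}(\sigma) = \beta^2 \sigma$, which lies in $[0, \sigma]$ whenever $0 < \beta \leq 1$. A small numerical check (parameter values are arbitrary):

```python
import numpy as np

def eps(sigma, beta=0.8, zeta=0.5):
    return beta * np.sqrt(sigma**2 + zeta)

def eps_times_eps_dot(sigma, step=1e-6):
    # eps(sigma) * d(eps)/d(sigma), derivative taken by central differences
    d = (eps(sigma + step) - eps(sigma - step)) / (2.0 * step)
    return eps(sigma) * d

vals = [eps_times_eps_dot(s) for s in (0.0, 0.7, 1.3, 5.0)]
# Analytically each value equals beta^2 * sigma = 0.64 * sigma, inside [0, sigma].
```

The product being exactly $\beta^2\sigma$ (independent of $\zeta$) is what makes this the natural parametric family for the condition: $\zeta$ only shifts the bandwidth floor at $\sigma = 0$.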
Using the form of $\epsilon(\sigma)$ in (7), $\int a(r)\, dr$ can be computed analytically as $\frac{1 - \gamma}{2\beta^2} \log(\sigma^2 + \zeta)$. Finally, using Hölder's inequality $\|fg\|_1 \leq \|f\|_1 \|g\|_\infty$, we can separate $\alpha(\sigma)$ from the remainder of the integrand in the form of $\sup_s \alpha(s)$. The latter further simplifies to $\alpha(\sigma)$ due to the non-increasing property of $\alpha$ stated in Proposition 5.

We now discuss the role of the optimization complexity $\alpha(\sigma)$ in (9). For brevity, let $w_1(\sigma, \sigma^\dagger) \triangleq \left( \frac{\sigma^2 + \zeta}{\sigma^{\dagger 2} + \zeta} \right)^{p}$ and $w_2(\sigma, \sigma^\dagger) \triangleq c \left( 1 - \left( \frac{\sigma^2 + \zeta}{\sigma^{\dagger 2} + \zeta} \right)^{p} \right)$. Observe that $w_1$ and $w_2$ are independent of $f$, while $g$ and $\alpha$ depend on $f$. It can be proved that $w_2$ is nonnegative (Proposition 8), and obviously so is $\alpha(\sigma)$. Hence, a lower optimization complexity $\alpha(\sigma)$ results in a smaller objective value $g(x(\sigma); \sigma)$. Since the optimization complexity $\alpha$ depends on the objective function, it provides a way to quantify the hardness of the optimization task at hand. A practical consequence of our theorem is that one may determine the worst-case performance without running the algorithm. Importantly, the optimization complexity can be easily computed when $f$ is represented in some suitable basis form, in particular by Gaussian RBFs. This is the subject of the next section. Note that while our result holds for any choice of constants within the prescribed range, ideally they would be chosen to make the bound tight; that is, the negative and positive terms should respectively receive large and small weights.

Before ending this section, we present the following proposition, which formally proves that $w_2$ is nonnegative.

Proposition 8. Let $c \triangleq \frac{\sqrt{2}}{1 - \gamma}$ and $p \triangleq \frac{1 - \gamma}{2\beta^2}$ for any choice of $1 < \gamma \leq 2$ and $0 < \beta \leq 1$. Suppose $0 \leq \sigma \leq \sigma^\dagger$ and $\zeta \geq 0$. Then $c \left( 1 - \left( \frac{\sigma^2 + \zeta}{\sigma^{\dagger 2} + \zeta} \right)^{p} \right) \geq 0$.

4 Analytical Expression for $\alpha(\sigma)$

In order to utilize the presented theorem in practice for some given objective function $f$, we need to know its associated optimization complexity $\alpha(\sigma)$. That is, we must be able to compute $\int_\Omega \frac{|\hat{h}(\omega)|^2}{\hat{G}(\omega)}\, d\omega$ analytically. Is this possible, at least for a class of interesting functions? Here we show that it is, provided the function $f$ is represented in a suitable form. Specifically, we prove that the integrals in $\alpha(\sigma)$ can be computed analytically when $f$ is represented by Gaussian RBFs. Before proving this, we provide a brief description of the Gaussian RBF representation.
It is known that, under mild conditions, RBF functions are capable of univers
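To make the roles of $w_1$ and $w_2$ in Theorem 7 concrete, the sketch below evaluates both weights for one choice of constants, assuming the forms $p = \frac{1-\gamma}{2\beta^2}$ and $c = \frac{\sqrt{2}}{1-\gamma}$ (an assumption for illustration, not a verified reproduction of the paper's constants):

```python
def theorem7_weights(sigma, sigma_dag, zeta, gamma, beta):
    # Assumed closed forms for the constants (hypothetical; see lead-in):
    p = (1.0 - gamma) / (2.0 * beta**2)
    c = 2.0 ** 0.5 / (1.0 - gamma)
    w1 = ((sigma**2 + zeta) / (sigma_dag**2 + zeta)) ** p
    w2 = c * (1.0 - w1)
    return w1, w2

# With sigma <= sigma_dag and gamma > 1, p < 0 makes w1 >= 1, and c < 0 then
# makes w2 >= 0, consistent with the nonnegativity claimed in Proposition 8.
w1, w2 = theorem7_weights(sigma=0.5, sigma_dag=3.0, zeta=1.0, gamma=1.5, beta=0.9)
```

Both weights depend only on the schedule constants, not on $f$; the $f$-dependence of the bound enters solely through $g(x(\sigma^\dagger); \sigma^\dagger)$ and $\alpha(\sigma)$.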