A Functional Density-Based Nonparametric Approach for Statistical Calibration

Noslen Hernández (1), Rolando J. Biscay (2,3), Nathalie Villa-Vialaneix (4,5), and Isneri Talavera (1)

(1) Advanced Technology Application Centre, CENATAV, Cuba
(2) Institute of Mathematics, Physics and Cybernetics, Cuba
(3) Departamento de Estadística de la Universidad de Valparaíso, CIMFAV, Chile
(4) Institut de Mathématiques de Toulouse, Université de Toulouse, France
(5) IUT de Perpignan, Département STID, Carcassonne, France

Abstract. In this paper a new nonparametric functional method is introduced for predicting a scalar random variable Y from a functional random variable X. The resulting prediction has the form of a weighted average over the training data set, where the weights are determined by the conditional probability density of X given Y, which is assumed to be Gaussian. In this way, such a conditional probability density is incorporated as key information into the estimator. Contrary to some previous approaches, no assumption about the dimensionality of E(X | Y = y) is required. The new proposal is computationally simple and easy to implement. Its performance is shown through its application to both simulated and real data.

1 Introduction

The fast development of instrumental analysis equipment and modern measurement devices provides huge amounts of data in the form of high-resolution digitized functions. As a consequence, Functional Data Analysis (FDA) has become a growing research field. In the FDA setting, each individual is treated as a single entity described by a continuous real-valued function rather than by a finite-dimensional vector: functional data (FD) are then supposed to take values in an infinite-dimensional space, often particularized as a Hilbert space.

An extensive review of the methods developed for FD can be found in the monograph of Ramsay and Silverman [1]. In the case of functional regression, where one intends to estimate a random scalar variable Y from a functional variable X taking values in a functional space 𝒳, earlier works focused on linear methods such as the functional linear model with scalar response [2-8] or functional Partial Least Squares [9]. More recently, the problem has also been addressed nonparametrically with smoothing kernel estimates [10], multilayer perceptrons [11], and support vector regression [12,13]. An intermediate point of view between these two approaches is to use a semi-parametric approach, such as SIR (Sliced Inverse Regression [14]), which has been extended to functional data (FIR) in [15-17]. In this approach, the functional regression problem is addressed through the opposite regression problem, i.e., the estimation of E(X | Y = y), by assuming that this quantity belongs to a finite-dimensional subspace of 𝒳.

In this paper, a new functional regression method to estimate γ(X) = E(Y | X) is introduced that also relies on the inverse regression model X = F(Y) + e. Its main practical motivation arises from calibration problems in Chemometrics, specifically in spectroscopy, where some chemical variable Y (e.g., a concentration) needs to be predicted from a digitized function X (e.g., a spectrum). In this setting, the "inverse" model represents the physical data-generation process, in which the output spectrum X is determined by the input chemical concentration Y, and e is a functional random perturbation mainly due to the measurement procedure. The specific form of the conditional density of X given Y, which is assumed to be Gaussian, is incorporated as key information into the estimator. This regression estimate will be referred to as functional Density-Based Nonparametric Regression (DBNR). Unlike the FIR approach, few assumptions are required: in particular, γ does not need to be a function of a finite number of projections, nor does X have to follow an elliptical distribution (or any other given distribution). DBNR is computationally very easy to use.
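To fix ideas, the following minimal Python sketch (not from the paper; the grid size, the choice of F, and the noise level are illustrative assumptions) simulates data from such an inverse model, with curves sampled on a regular grid of [0, 1]:

    import numpy as np

    rng = np.random.default_rng(0)
    T = 100                                   # grid resolution (illustrative)
    t = np.linspace(0.0, 1.0, T)

    def F(y):
        # Illustrative choice of F: a smooth curve determined by the scalar y
        # (it resembles the first term of model M3 in Section 3.1)
        return np.sin(y) * np.sqrt(2.0) * np.cos(2.0 * np.pi * t)

    n = 300
    y_train = rng.uniform(0.0, 10.0, size=n)      # Y ~ U[0, 10], as in Section 3.1
    X_train = np.array([F(y) for y in y_train])
    X_train += rng.normal(scale=0.5, size=(n, T)) # perturbation e (white noise here,
                                                  # a simplification of the paper's
                                                  # correlated Gaussian process)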
This paper is organized as follows. Section 2 presents the functional Density-Based Nonparametric Regression method. Sections 3 and 4 illustrate the use of this approach on simulated and real data, respectively. Conclusions are given in Section 5.

2 Functional Density-Based Nonparametric Regression

2.1 Definition of DBNR in a General Setting

Let (X, Y) be a pair of random variables taking values in 𝒳 × ℝ, where (𝒳, ⟨·,·⟩) is a Hilbert space. Suppose also that n i.i.d. realizations of (X, Y) are given, denoted by (x_i, y_i)_{i=1,...,n}. The goal is to build, from (x_i, y_i)_i, a way to predict a new value of Y from a given (observed) value of X. This problem is usually addressed by the estimation of the regression function γ(x) = E(Y | X = x).

The functional density-based nonparametric regression implicitly supposes that the inverse model makes sense; this inverse model is

    X = F(Y) + \epsilon,    (1)

where ε is a random process (perturbation or noise) with zero mean, independent of Y, and y → F(y) is a function from ℝ into 𝒳. As stated in Section 1, this is a common background for calibration problems, among others.

Additionally, the following assumptions are made. First, there exists a probability measure P_0 on 𝒳 (not depending on y) such that the conditional probability measure of X given Y = y, say P(· | y), has a density f(· | y) with respect to P_0:

    P(A \mid y) = \int_A f(x \mid y) \, P_0(dx)

for any measurable set A in 𝒳. Furthermore, it is assumed that Y is a continuous random variable, i.e., that its distribution has a density f_Y(y) (with respect to the Lebesgue measure on ℝ).

Under these assumptions, the regression function is

    \gamma(x) = \frac{\int_{\mathbb{R}} f(x \mid y) \, f_Y(y) \, y \, dy}{f_X(x)},
    \quad \text{where} \quad
    f_X(x) = \int_{\mathbb{R}} f(x \mid y) \, f_Y(y) \, dy.

Hence, given an estimate f̂(x | y) of f(x | y), the following estimate of γ(x) can be constructed from the previous equation:

    \hat{\gamma}(x) = \frac{\sum_{i=1}^{n} \hat{f}(x \mid y_i) \, y_i}{\hat{f}_X(x)},
    \quad \text{where} \quad
    \hat{f}_X(x) = \sum_{i=1}^{n} \hat{f}(x \mid y_i).    (2)
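Equation (2) is simply a weighted average of the training responses, which makes it straightforward to code. A minimal sketch (the helper name density_hat is hypothetical; it stands for any estimate f̂(x | y)):

    import numpy as np

    def dbnr_predict(x, y_train, density_hat):
        # Equation (2): prediction as a weighted average of the training
        # responses y_i, the weight of y_i being the estimated conditional
        # density f_hat(x | y_i) evaluated at the new curve x.
        y_train = np.asarray(y_train, dtype=float)
        w = np.array([density_hat(x, yi) for yi in y_train])
        return float(np.sum(w * y_train) / np.sum(w))

With the Gaussian specification of Section 2.2 below, density_hat becomes the estimate of Equation (3).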
2.2 Specification in the Gaussian Case

The general estimation scheme given in Equation (2) is now specified for the case in which P(· | y) is a Gaussian measure on 𝒳 = L²[0, 1] for each y ∈ ℝ. P(· | y) is then supposed to have a mean function μ(· | y) ∈ 𝒳 (which is equal to F(y)(·) according to Equation (1)) and a covariance operator r (not depending on y), which is a Hilbert-Schmidt operator on the space 𝒳. Then there exists an eigenvalue decomposition of r, (φ_j, λ_j)_{j≥1}, such that (λ_j)_j is a decreasing sequence of positive real numbers, the (φ_j)_j take values in 𝒳, and

    r = \sum_{j} \lambda_j \, \varphi_j \otimes \varphi_j,
    \quad \text{where} \quad
    (\varphi_j \otimes \varphi_j)(h) = \langle \varphi_j, h \rangle \, \varphi_j \ \text{for any } h \in \mathcal{X}.

Denote by P_0 the Gaussian measure on 𝒳 with zero mean and covariance operator r. Assume that the following usual regularity condition holds: for each y ∈ ℝ,

    \sum_{j=1}^{\infty} \frac{\mu_j^2(y)}{\lambda_j} < \infty,
    \quad \text{with} \quad
    \mu_j(y) = \langle \mu(\cdot \mid y), \varphi_j \rangle.

Then P(· | y) and P_0 are equivalent Gaussian measures, and the density f(· | y) has the explicit form

    f(x \mid y) = \exp\left\{ \sum_{j=1}^{\infty} \frac{\mu_j(y)}{\lambda_j} \left( x_j - \frac{\mu_j(y)}{2} \right) \right\},

where x_j = ⟨x, φ_j⟩ for all j ≥ 1. This leads to the following estimation scheme for f(x | y):

1. Obtain an estimate μ̂(· | y) of t → μ(t | y) for all y ∈ ℝ. This may be carried out through any standard nonparametric regression from ℝ to ℝ based on the learning set (y_i, x_i(t))_{i=1,...,n}, e.g., a smoothing kernel method.

2. Obtain estimates (φ̂_j, λ̂_j)_j of the eigenfunctions and eigenvalues (φ_j, λ_j)_j of the covariance operator r from the empirical covariance of the residuals x_i − μ̂(· | y_i), i = 1, ..., n. Only the first p eigenvalues and eigenfunctions are estimated, where p = p(n) is a given integer smaller than n.

3. Estimate f(x | y) by

    \hat{f}(x \mid y) = \exp\left\{ \sum_{j=1}^{p} \frac{\hat{\mu}_j(y)}{\hat{\lambda}_j} \left( \hat{x}_j - \frac{\hat{\mu}_j(y)}{2} \right) \right\},    (3)

where μ̂_j(y) = ⟨μ̂(· | y), φ̂_j⟩ and x̂_j = ⟨x, φ̂_j⟩.

Finally, substituting (3) into (2) leads to an estimate γ̂(x) of γ(x). Under some technical assumptions, the consistency of the DBNR method can be proved: γ̂(x) → γ(x) in probability as n → ∞.
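As a rough illustration of how steps 1-3 might look on curves sampled at T equally spaced points of [0, 1], here is a sketch; the Gaussian smoothing kernel, the discretized inner product ⟨f, g⟩ ≈ (1/T) Σ_k f(t_k) g(t_k), and all function names are illustrative assumptions, not prescriptions of the paper:

    import numpy as np

    def mu_hat(y, y_train, X_train, h):
        # Step 1: smoothing-kernel estimate of the conditional mean curve
        # mu(.|y) (Gaussian kernel in y; any standard smoother would do).
        k = np.exp(-0.5 * ((y_train - y) / h) ** 2)
        return k @ X_train / k.sum()

    def eigensystem(y_train, X_train, h, p):
        # Step 2: p leading eigenpairs of the empirical covariance of the
        # residuals x_i - mu_hat(.|y_i); rows of X_train are curves on the grid.
        n, T = X_train.shape
        resid = X_train - np.array([mu_hat(yi, y_train, X_train, h) for yi in y_train])
        vals, vecs = np.linalg.eigh(resid.T @ resid / n)
        idx = np.argsort(vals)[::-1][:p]
        lam = vals[idx] / T                # eigenvalues of the discretized operator
        phi = vecs[:, idx].T * np.sqrt(T)  # eigenfunctions with unit L2([0,1]) norm
        return lam, phi

    def density_hat(x, y, y_train, X_train, h, lam, phi):
        # Step 3: plug-in evaluation of Equation (3).
        T = x.size
        mu_j = phi @ mu_hat(y, y_train, X_train, h) / T  # <mu_hat(.|y), phi_j>
        x_j = phi @ x / T                                # <x, phi_j>
        return np.exp(np.sum(mu_j / lam * (x_j - mu_j / 2)))

Passing density_hat (with its extra arguments bound, e.g., via functools.partial) to dbnr_predict above then yields the DBNR prediction γ̂(x) of Equation (2).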
3 A Simulation Study

The feasibility and the performance of the introduced nonparametric functional regression method are first explored through a simulation study. For comparison, results obtained with the functional Nadaraya-Watson kernel (NWK) estimator [10] are also shown.

3.1 Data Generation

The data were simulated in the following way. Values of the real random variable Y were drawn from a uniform distribution on the interval [0, 10]. Then X was generated according to 4 different models (settings):

    M1: X = Y e_1 + 2Y e_2 + 3Y e_5 + 4Y e_{10} + ε
    M2: X = (exp(Y)/exp(10)) e_1 + (Y^2/100) e_2 + (Y^3/1000) e_5 + log(Y + 1) e_{10} + ε
    M3: X = sin(Y) e_1 + log(Y + 1) e_5 + ε
    M4: X = α exp((Y/10) e_1) + ε

where (e_i)_{i≥1} is the trigonometric basis of 𝒳 = L²([0, 1]) (i.e., e_{2k−1}(t) = √2 cos(2πkt) and e_{2k}(t) = √2 sin(2πkt)), and ε is a Gaussian process, independent of Y, with zero mean and covariance operator

    \Gamma_e = \sum_{j \geq 1} \frac{1}{j} \, e_j \otimes e_j.

More precisely, ε was simulated by using a truncation of Γ_e,

    \Gamma_e(s, t) \approx \sum_{j=1}^{q} \frac{1}{j} \, e_j(t) \, e_j(s), \quad \text{with } q = 500.

A sample of size n_L = 300 was simulated for training and a sample of size n_T = 200 for testing. Figure 1 gives examples of X obtained under model M3 for three different values of y, together with the underlying (noise-free) function F(y)(·). In this example, the simulated data have a high level of noise, so the regression estimation is a rather hard statistical task.

3.2 Simulation Results

To apply the DBNR method, the discretized functions X were approximated by continuous functions using a functional basis expansion. Specifically, the data were approximated using 128 B-spline basis functions of order 4, as shown in Figure 1. The conditional mean μ(· | y) was estimated by kernel smoothing, in which the bandwidth parameter h was selected by 10-fold cross-validation minimizing the mean squared error (MSE) criterion. A similar procedure was used to select the parameter p (the number of eigenvalues and eigenfunctions used in (3)).

Finally, the DBNR performance was compared with that obtained by the functional NWK estimate with two kinds of metrics for the kernel: the usual L² norm and the PCA-based semi-metric (see [10] for further details about these methods). The resulting root mean squared errors (RMSE) are presented in Table 1.

Fig. 1. True function F(y)(·) (smooth continuous line), simulated data X (gray rough line), and approximation of X using B-splines (rough black line) under M3 for three different values of y (samples 44, 248, and 41, with Y = 3.7948, 1.9879, and 8.3812, respectively).

Table 1. RMSE for all the methods and all generating models

    Model   DBNR   NWK (PCA)   NWK (L2)
    M1      0.08   0.10        0.09
    M2      1.47   1.60        1.77
    M3      1.79   1.79        2.00
    M4      0.94   2.16        1.91

The results show that DBNR is a good alternative to common NWK methods: DBNR matches or outperforms the NWK methods in all the cases considered in this simulation study, which includes both a linear model (M1) and nonlinear models (M2-M4).

Figures 2 and 3 show how the method performs at each step of the estimation scheme (described in Section 2.2) for model M3. In particular, Figure 2 gives the result of the first step by displaying the true value and the estimate of F(y)(·) for various values of y (top), and the true value and the estimate of F(·)(t) for various values of t (bottom). The results are very satisfactory given that the data have a high level of noise (which is emphasized in the bottom of the figure): a minor estimation problem appears at the boundaries of F(·)(t), which is a known drawback of the kernel smoothing method. Also, these estimates are smoother than the estimates of F(y)(·): this can be explained by the fact that the kernel estimator is applied with respect to y and not with respect to t, but this aspect can be improved in the future.

Figure 3 shows the results of steps 2-3 of the estimation scheme: the estimated eigendecomposition of r is compared to the true one and, finally, the predicted values of Y are compared to the true ones, on both the training and test sets. The estimation of the eigendecomposition is, once again, very satisfactory given the high level of noise, and the comparison between training and test sets shows that the method does not overfit the data.
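For reference, the NWK benchmark of Table 1 replaces the density weights of Equation (2) by kernel weights computed from a distance between curves. A minimal sketch with the L² distance (the Gaussian kernel is an illustrative choice; the NWK (PCA) variant would instead use a semi-metric based on the first principal-component scores, see [10]):

    import numpy as np

    def nwk_predict(x, X_train, y_train, h):
        # Functional Nadaraya-Watson estimate: kernel-weighted average of y_i,
        # with weights driven by the (discretized) L2 distance between curves.
        dist = np.sqrt(np.mean((X_train - x) ** 2, axis=1))
        k = np.exp(-0.5 * (dist / h) ** 2)
        return float(k @ y_train / k.sum())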