EXACT DISTRIBUTION OF SQUARED WELSCH-KUH DISTANCE AND IDENTIFICATION OF INFLUENTIAL OBSERVATIONS

Journal of Reliability and Statistical Studies; ISSN (Print): 0974-8024, (Online): 2229-5666
Vol. 9, Issue 1 (June 2016): 69-91

G.S. David Sam Jayakumar (1) and A. Sulthan (2)
Jamal Institute of Management, Tiruchirappalli, India
E Mail: (1) samjaya77@gmail.com, (2) sulthan90@gmail.com

Received March 22, 2015
Modified March 30, 2016
Accepted April 27, 2016

Abstract
This paper proposes the exact distribution of the squared DFFITS, alias the squared Welsch-Kuh ($WK^2$) distance measure, used to evaluate the influential observations in a multiple linear regression analysis. The authors have explored the relationship between $WK^2$ and two independent F-ratios, and they have shown the derived density function of the $WK^2$ distance in a complicated series expression form involving the Gauss hyper-geometric function with two shape parameters p and n. Moreover, the mean and variance of the distribution are derived in terms of the shape parameters, and the authors have established the upper control limit of $WK^2$. Similarly, the critical points of the squared Welsch-Kuh ($WK^2$) distance measure are computed at the 5% and 1% significance levels for different sample sizes and varying numbers of predictors. Finally, the numerical example shows the identification of the influential observations, and the results extracted from the proposed approaches are more scientific and systematic, and their exactness outperforms Welsch-Kuh's traditional approach.

Key Words: Squared Welsch-Kuh Distance Measures, Influential Observation, Series Expression Form, Gauss Hyper-Geometric Function, Mean, Variance, Critical Points.

AMS Classification: 62H10
1. Introduction and Related Work

Studentized residuals and plots of the residuals were considered the most appropriate statistical devices for detecting potentially critical observations in the literature before the third quarter of the 20th century. Behnken and Draper (1972) clarified that the estimated variance of the residuals includes pertinent information beyond that provided by plots of residuals or studentized residuals. Similarly, they discussed the variances of residuals in several more complicated designs. Hoaglin and Welsch (1978) showed that the projection matrix known as the hat matrix contains this information and, together with the studentized residuals, provides a means of identifying exceptional data points. Cook (1977) was the first to establish a simple measure, $D_i$, that incorporates information from the X-space and Y-space and is used for assessing the influential observations in regression models. The problem of outliers or influential data in the multiple or multivariate linear regression setting has been thoroughly discussed with reference to parametric regression models by the pioneers, namely Cook (1977), Cook and Weisberg (1982), Belsley et al. (1980) and Chatterjee and Hadi (1988). In non-parametric regression models, diagnostic results are quite rare. Among them, Eubank (1985), Silverman (1985), Thomas (1991) and Kim (1996) studied residuals, leverages and several types of Cook's distance in smoothing splines, and Kim and Kim (1998 & 2001) proposed a type of Cook's distance in kernel density estimation and in local polynomial regression.

The phrase 'influence measures' has seen a great surge of research interest. Different measures have been developed to identify influential observations, from the early criteria of Cook to the present, and the definition of influence that appears most suitable is given by Belsley et al. (1980). Cook's statistical diagnostic measure is a simple, unifying and general approach for judging local influence in statistical models. As far as the influence measures in the literature are concerned, the procedures were designed to detect the influence of observations on a specific regression result. However, Hadi (1992) proposed a diagnostic measure called Hadi's influence function to identify the overall potential influence. It possesses several desirable properties that many of the frequently used diagnostics do not generally possess, such as invariance to location and scale in the response variable and invariance to non-singular transformations of the explanatory variables. It is an additive function of measures of leverage and of residual error, and it is monotonically increasing in the leverage values and in the squared residuals. Recently, Díaz-García and González-Farías (2004) modified the classical Cook's distance with a generalized Mahalanobis distance in the context of multivariate elliptical linear regression models, and they also established the exact distribution for identification of outlier data points. Considering the above reviews, the authors propose the exact distribution of the squared Welsch-Kuh distance ($WK^2$) to exactly identify the influential data points; it is discussed in the subsequent sections.

2. Relationship between Squared Welsch-Kuh distance ($WK^2$) and F-ratios

The multiple linear regression model with random error is given by

$$Y = X\beta + e \tag{1}$$

where $Y_{(n \times 1)}$ is the matrix of the dependent variable, $\beta_{(k \times 1)}$ is the vector of beta co-efficients or partial regression co-efficients and $e_{(n \times 1)}$ is the residual, which follows the normal distribution $N(0, \sigma_e^2 I_n)$. From (1), statisticians concentrate on and give importance to error diagnostics such as outlier detection, identification of leverage points and evaluation of influential observations. Several error diagnostic techniques proposed by statisticians exist in the literature, but DFFITS is an interesting technique based on the simple fact that the impact of the i-th observation on the predicted value can be measured by scaling the change in prediction at $x_i$ when the i-th observation is omitted, i.e.

$$\frac{\hat{y}_i - \hat{y}_{i(i)}}{\sigma\sqrt{h_{ii}}} = \frac{x_i\left(\hat{\beta} - \hat{\beta}_{(i)}\right)}{\sigma\sqrt{h_{ii}}} \tag{2}$$

Welsch and Kuh (1977), Welsch and Peters (1978) and Belsley, Kuh and Welsch (1980) suggested using $\hat{\sigma}_{(i)}^2$ as an estimate of $\sigma^2$ and called (2) DFFITS. For simplicity, they refer to (2) as the Welsch-Kuh distance ($WK_i$),

$$WK_i = \frac{x_i\left(\hat{\beta} - \hat{\beta}_{(i)}\right)}{\hat{\sigma}_{(i)}\sqrt{h_{ii}}} = R_i\sqrt{\frac{h_{ii}}{1 - h_{ii}}} \tag{3}$$

where $R_i$ is the absolute externally studentized residual, n is the sample size, and $h_{ii}$ is the hat value of the i-th observation, i.e. the i-th diagonal element of the hat matrix $H = X(X'X)^{-1}X'$. Welsch (1980) suggested $WK_i$ as a diagnostic tool and $2\sqrt{(p+1)/n}$ as a calibration point for observations. An observation whose $WK_i$ value exceeds this calibration point is treated as an influential observation, and it seems reasonable to nominate such points for special attention. The Welsch-Kuh distance measure can also be written in a squared alternative form as

$$WK_i^2 = R_i^2\,\frac{h_{ii}}{1 - h_{ii}} \tag{4}$$

Though the measure is scientific, the criterion $2\sqrt{(p+1)/n}$ used to detect the influential observation is not, and the authors believe that it is based on a rule of thumb approach. In order to overcome this rule of thumb approach, the authors attempt to make the approach more scientific by fixing a meaningful criterion as the calibration point. To identify the exact influential observations, we propose the exact distribution of the squared Welsch-Kuh distance measure. For this, we utilize the relationship among the squared Welsch-Kuh distance ($WK_i^2$), the externally studentized residual ($R_i$) and the hat elements ($h_{ii}$). The terms $R_i$ and $h_{ii}$ are independent because the computation of $R_i$ involves the error term $e_i \sim N(0, \sigma_e^2)$, while the $h_{ii}$ values involve only the set of predictors ($H = X(X'X)^{-1}X'$). Therefore, from the property of least squares, if $E(eX) = 0$, then $R_i$ and $h_{ii}$ are also uncorrelated and independent. Using this assumption, we already know that the externally studentized residual ($R_i$) exactly follows the t-distribution with n-p-2 degrees of freedom, and its squared form is given as

$$R_i^2 = \frac{\hat{e}_i^2}{s_{(i)}^2\left(1 - h_{ii}\right)} \sim F_{(1,\,n-p-2)} \tag{5}$$

From (5), the squared form of the externally studentized residual follows the F-distribution with (1, n-p-2) degrees of freedom. Similarly, we identify the distribution of $h_{ii}$ based on the relationship proposed by Belsley et al. (1980), who have shown that if the set of predictors follows a multivariate normal distribution with $(\mu_X, \Sigma_X)$, then

$$\frac{(n-p-1)\left(h_{ii} - 1/n\right)}{p\left(1 - h_{ii}\right)} \sim F_{(p,\,n-p-1)} \tag{6}$$

From (6), the ratio follows the F-distribution with (p, n-p-1) degrees of freedom, and it can be written in an alternative form as

$$h_{ii} = \frac{\dfrac{p}{n-p-1}F_{i(p,\,n-p-1)} + \dfrac{1}{n}}{1 + \dfrac{p}{n-p-1}F_{i(p,\,n-p-1)}} \tag{6a}$$

In order to derive the exact distribution of the squared Welsch-Kuh distance, without loss of generality, substituting (5) and (6a) in (4), we get $WK_i^2$ in terms of the two independent F-ratios with (1, n-p-2) and (p, n-p-1) degrees of freedom respectively, and the relationship is given as

$$WK_i^2 = \frac{n}{n-1}\left(\frac{p}{n-p-1}F_{i(p,\,n-p-1)} + \frac{1}{n}\right)F_{i(1,\,n-p-2)} \tag{7}$$

$$WK_i^2 = \frac{n(n-p-2)}{n-1}\left(\frac{p}{n-p-1}F_{i(p,\,n-p-1)} + \frac{1}{n}\right)\left(\frac{1}{n-p-2}F_{i(1,\,n-p-2)}\right) \tag{8}$$

From (8), it can be further simplified, and $WK_i^2$ is expressed in terms of two independent beta variables of kind-2, namely $\theta_{1i}$ and $\theta_{2i}$, by using the following facts:

$$\theta_{1i} = \frac{p}{n-p-1}F_{i(p,\,n-p-1)} \sim \beta_2\left(\frac{p}{2}, \frac{n-p-1}{2}\right) \tag{9}$$

$$\theta_{2i} = \frac{1}{n-p-2}F_{i(1,\,n-p-2)} \sim \beta_2\left(\frac{1}{2}, \frac{n-p-2}{2}\right) \tag{10}$$

Then, without loss of generality, (8) can be written as

$$WK_i^2 = \frac{n(n-p-2)}{n-1}\left(\theta_{1i} + \frac{1}{n}\right)\theta_{2i} \tag{11}$$

$$WK_i^2 = \frac{n-p-2}{n-1}\left(n\theta_{1i} + 1\right)\theta_{2i} \tag{12}$$

$$WK_i^2 = \alpha(p,n)\left(n\theta_{1i} + 1\right)\theta_{2i} \tag{13}$$

From (13), the authors have shown the squared Welsch-Kuh distance measure in terms of $\theta_{1i} \sim \beta_2\left(\frac{p}{2}, \frac{n-p-1}{2}\right)$ and $\theta_{2i} \sim \beta_2\left(\frac{1}{2}, \frac{n-p-2}{2}\right)$, which follow the beta distribution of kind-2 with two shape parameters p and n, and $\alpha(p,n) = (n-p-2)/(n-1)$ is a normalizing function which involves the shape parameters. Based on the relationship identified in (13), the authors have derived the distribution of the squared Welsch-Kuh distance, which is discussed in the next section.

3. Exact Distribution of Squared Welsch-Kuh distance

Using the technique of the two-dimensional Jacobian of transformation, the joint probability density function of the two beta variables of kind-2, namely $\theta_{1i}$ and $\theta_{2i}$, is transformed into the density function of the new random variables $WK_i^2$ and $u_i$. It is given as

$$f\left(WK_i^2, u_i\right) = f\left(\theta_{1i}, \theta_{2i}\right)\left|J\right| \tag{14}$$

From (14), we know that $\theta_{1i}$ and $\theta_{2i}$ are independent, so (14) can be rewritten as

$$f\left(WK_i^2, u_i\right) = f\left(\theta_{1i}\right) f\left(\theta_{2i}\right)\left|J\right| \tag{15}$$

Using the change of variable technique, substituting $\theta_{2i} = u_i$ in (13), we get

$$\theta_{1i} = \frac{1}{n}\left(\frac{WK_i^2}{\alpha(p,n)\,u_i} - 1\right) \tag{16}$$

Then partially differentiate (16), compute the Jacobian determinant and rewrite (15) as

$$f\left(WK_i^2, u_i\right) = f\left(\theta_{1i}\right) f\left(\theta_{2i}\right)\left|\frac{\partial\left(\theta_{1i}, \theta_{2i}\right)}{\partial\left(WK_i^2, u_i\right)}\right| \tag{17}$$

$$f\left(WK_i^2, u_i\right) = f\left(\theta_{1i}\right) f\left(\theta_{2i}\right)\begin{vmatrix} \dfrac{\partial\theta_{1i}}{\partial WK_i^2} & \dfrac{\partial\theta_{1i}}{\partial u_i} \\[2ex] \dfrac{\partial\theta_{2i}}{\partial WK_i^2} & \dfrac{\partial\theta_{2i}}{\partial u_i} \end{vmatrix} \tag{18}$$

From (15), we know that $\theta_{1i}$ and $\theta_{2i}$ are independent, and the density function of the joint distribution of $\theta_{1i}$ and $\theta_{2i}$ is then given as

$$f\left(\theta_{1i}, \theta_{2i}\right) = \frac{\theta_{1i}^{\frac{p}{2}-1}\left(1 + \theta_{1i}\right)^{-\left(\frac{p}{2} + \frac{n-p-1}{2}\right)}}{B\left(\frac{p}{2}, \frac{n-p-1}{2}\right)} \times \frac{\theta_{2i}^{\frac{1}{2}-1}\left(1 + \theta_{2i}\right)^{-\left(\frac{1}{2} + \frac{n-p-2}{2}\right)}}{B\left(\frac{1}{2}, \frac{n-p-2}{2}\right)} \tag{19}$$

where $0 \le \theta_{1i}, \theta_{2i} < \infty$, $n, p > 0$ and

$$\left|J\right| = \begin{vmatrix} \dfrac{\partial\theta_{1i}}{\partial WK_i^2} & \dfrac{\partial\theta_{1i}}{\partial u_i} \\[2ex] \dfrac{\partial\theta_{2i}}{\partial WK_i^2} & \dfrac{\partial\theta_{2i}}{\partial u_i} \end{vmatrix} = \begin{vmatrix} \dfrac{1}{n\,\alpha(p,n)\,u_i} & -\dfrac{WK_i^2}{n\,\alpha(p,n)\,u_i^2} \\[2ex] 0 & 1 \end{vmatrix} = \frac{1}{n\,\alpha(p,n)\,u_i} \tag{20}$$

Then, substituting (19) and (20) in (18) in terms of the substitution of $u_i$, we get the