A morphologically optimal strategy for classifier combination: multiple expert fusion as a tomographic process

David Windridge and Josef Kittler, Member, IEEE Computer Society

Abstract: We specify an analogy in which the various classifier combination methodologies are interpreted as the implicit reconstruction, by tomographic means, of the composite probability density function spanning the entirety of the pattern space, the process of feature selection in this scenario amounting to an extremely bandwidth-limited Radon transformation of the training data. This metaphor, once elaborated, immediately suggests techniques for improving the process, ultimately defining, in reconstructive terms, an optimal performance criterion for such combinatorial approaches.

Index Terms: Classifier combination, tomography, probability theory, feature selection.

1 INTRODUCTION

The potential for the misclassification errors arising from classification methods of varying distinction to be only partially overlapping has led to the realization that, in general, no one method of classification can circumscribe all aspects of a typical real-world classification problem, prompting the investigation of a variety of combinatorial methods in a bid to improve classification performance, e.g., [1], [2], [3], [4], [5], [6]. Generally, these methods have in common that they are based on intuitive techniques for the combination of disparate decision schemes (e.g., majority vote and weighted mean), rather than arising naturally from any underlying theoretical schematics. In particular, there has not as yet been any attempt to obtain a generically optimal mathematical solution to the problem; optimization typically taking place within the terms of the chosen combination strategy (e.g., [7] through [11]).
We shall set out to at least partially address this deficiency in the following paper by outlining an analogy with the apparently unrelated subject of tomographic reconstruction. By interpreting the combination of classifiers with distinct feature sets as the implicit reconstruction of the combined pattern space probability density function (PDF), we can begin to envisage the problem in geometric terms and, in consequence, propose a morphologically¹ optimal solution both to this and, ultimately, to the more general problem of nondistinct feature sets. The focus of this paper is therefore predominantly on theoretical development: algorithmic development of the idea for practical implementation in arbitrary spaces is dealt with more completely elsewhere [24].

We therefore commence the current paper with an outline of tomographic reconstruction theory and its generalization to the higher-dimensionality, low angular sample-rate pattern spaces appropriate to pattern recognition theory: later sections of the paper concern themselves with making the parallels with probability theory mathematically rigorous, the final sections then considering the generalization of the technique to the combination of classifiers with nondistinct feature sets and, hence, the universal application of the method (as illustrated in the appendix by a practical demonstration of the anticipated performance improvement).

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 25, NO. 3, MARCH 2003

The authors are with the Department of Electronic and Electrical Engineering, University of Surrey, Guildford, Surrey, GU2 7XH. E-mail: {D.Windridge, J.Kittler}@eim.surrey.ac.uk.

Manuscript received 21 Feb. 2001; revised 11 Apr. 2002; accepted 26 Sept. 2002. Recommended for acceptance by A. Del Bimbo. For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number 113658.

0162-8828/03/$17.00 © 2003 IEEE

¹ Thus, the proposed method can be considered optimal with regard to underlying pattern-space probability morphologies about which we have absolutely no prior constraint information (such as feature independence), other than that implied by the feature/classifier allocations imposed by the feature-selector.

2 CLASSIFIER COMBINATION, RADON TRANSFORMATION, TOMOGRAPHIC RECONSTRUCTION

In formalizing the framework of this tomographic metaphor for classifier combination, we shall commence by specifying as follows our prior assumptions in relation to conventional combinatorial schemes (generalizing at a later point to a less constricting set of assumptions):

1. We shall assume that the selection of features is decided through classifier preference, and that this is accomplished via the straight-forward omission of superfluous dimensions as appropriate (this might even be done on a class-by-class basis, if representative ability is the selection criterion).

2. For simplicity of demonstration, it shall be assumed at the outset that the set of classifiers operate on only one feature individually, and that these are distinct (though note that the former is not a prerequisite of the method). Evidence that the stronger of these two assumptions, the latter, is reasonably representative of the usual situation comes from [15] through [18], wherein features selected within a combinatorial context are consistently shown to favor the allocation of distinct feature sets among the constituent classifiers, presumably due to their divergent design philosophies: the wider implications of the alternative to this assumption are dealt with at a later point.

3. We shall consider that the construction of a classifier is the equivalent of estimating the PDFs $p(x_{\mathcal{N}(1,i)}, x_{\mathcal{N}(2,i)}, \ldots, x_{\mathcal{N}(k_i,i)} \mid \omega_i)\ \forall i$, where $\mathcal{N}(x,y)$ is the final set of feature dimensions passed from the feature selection algorithm for class $\omega_y$ (the cardinality of which, $k_i$, we will initially set to unity for every class identified by the feature selector, i.e., $k_i = 1\ \forall i$).

4. It is assumed (prior to setting out the feature selection algorithm most appropriate to our technique) that, in any reasonable feature selection regime, the total set of features employed by the various classifiers exhausts the classification information available in the pattern space (i.e., the remaining dimensions contribute only a stochastic noise component to the individual clusters).

Given assumption 3 above (that individual classifiers may be regarded as PDFs) and further, that pattern vectors corresponding to a particular class may be regarded as deriving from an n-dimensional probability distribution, then the process of feature selection may be envisaged as an integration over the dimensions redundant to that particular classification scheme (the discarding of superfluous dimensions being, in effect, the linear projection of a higher dimensional space onto a lower one, ultimately a one-dimensional space in the above framework). That is, for n-dimensional pattern data of class $i$:

$$ p(x_k \mid \omega_i) = \underbrace{\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty}}_{n-1} p(\vec{X} \mid \omega_i)\, dx_1 \ldots dx_{k-1}\, dx_{k+1} \ldots dx_n, \tag{1} $$

with $\vec{X} = (x_1, x_2, \ldots, x_n)$. (A visual apprehension of this projective behavior may be gained by comparison of the figures in the appendix, representing a two-to-one dimensional projection; indeed, in general, the appendix figures may serve as a useful visual reference throughout the following arguments.)

Because of condition 4 above (a good approximation when a range of classifiers is assumed), we shall consider that the pattern vector effectively terminates at index $j$, where $j \le n$ is the total number of features (and also classifiers, given condition 3).
That is, $\vec{X} = (x_1, x_2, \ldots, x_j)$ now represents the extent of the pattern vector dimensionality. In the integral analogy, the remaining dimensions that are integrated over in (1) serve to reduce the stochastic component of the joint PDF by virtue of the increased bin count attributable to each of the pattern vector indices.

Now, it is the basis of our thesis that we may regard (1) as the j-dimensional analogue of the Radon transform (essentially, the mathematical equivalent of the physical measurements taken within a tomographic imaging regime), an assertion that we shall make explicit in Section 3 after discussing a method for extending the inverse Radon transform to an arbitrarily large dimensionality. The conventional Radon transform, however, is defined in terms of the two-dimensional function $f(x,y)$ thus (following the formulation of Natterer [20]):

$$ R(\theta,s)[f(x,y)] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x,y)\,\delta(s - x\cos\theta - y\sin\theta)\,dx\,dy = g(\theta,s), \tag{2} $$

where $s$ may be regarded as a perpendicular distance to a line in $(x,y)$ space, and $\theta$ the angle that that line subtends in relation to the $x$ axis. $R(\theta,s)$ is then an integral over $f(x,y)$ along the line specified ($\delta$ being the Dirac delta function): refer, for example, to [21] and [22] for alternative formulations of the Radon integral.

As a first approximation to inverting the Radon transform and reconstructing the original data $f(x,y)$, we might apply the Hilbert space adjoint operator of $R(\theta,s)$, the so-called back-projection operator:

$$ R^*[R(\theta,s)](\vec{x}) = \int_S R(\theta, \vec{\theta}\cdot\vec{x})\,d\theta, \quad \text{with } \vec{x} = (x,y),\ \vec{\theta} = (\cos\theta, \sin\theta). \tag{3} $$

That is, the first stage of recovering the morphological information lost by Radon transformation at a particular point consists in summing over the angularly-distributed Radon transforms that intersect the point.
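
The projection (2) and the back-projection (3) can be sketched numerically for the two-angle case relevant here. The following is a minimal illustration rather than the authors' implementation: the 3x3 density `f`, the nearest-bin discretization, and the function names `radon_projection` and `back_project` are all assumptions made for the example.

```python
import math

def radon_projection(f, theta):
    """Discrete analogue of (2): accumulate f[x][y] into the bin
    s = round(x*cos(theta) + y*sin(theta))."""
    g = {}
    for x, row in enumerate(f):
        for y, value in enumerate(row):
            s = round(x * math.cos(theta) + y * math.sin(theta))
            g[s] = g.get(s, 0.0) + value
    return g

def back_project(projections, x, y):
    """Discrete analogue of (3): sum, over the sampled angles, the
    projection value on the line passing through the point (x, y)."""
    total = 0.0
    for theta, g in projections.items():
        s = round(x * math.cos(theta) + y * math.sin(theta))
        total += g.get(s, 0.0)
    return total

# Hypothetical discretized density f(x, y), indexed f[x][y]; entries sum to 1.
f = [
    [0.10, 0.05, 0.05],
    [0.05, 0.40, 0.05],
    [0.10, 0.00, 0.20],
]

# Only two angles are sampled: the two feature axes.
angles = [0.0, math.pi / 2]
projections = {theta: radon_projection(f, theta) for theta in angles}
```

At $\theta = 0$ the projection is the marginal over $y$, and at $\theta = \pi/2$ the marginal over $x$, which is the two-dimensional case of (1). Calling `back_project(projections, 1, 1)` then simply adds the two marginal values that pass through the point (1, 1).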

It will prove necessary throughout the following to gain a precise appreciation of how this operator acts:² consider, first, the following identity written in terms of the arbitrary function $v$, where $V = R^* v$:

$$
\begin{aligned}
\int_S \int_s v(\theta, \vec{x}\cdot\vec{\theta} - s)\, g(\theta,s)\, ds\, d\theta
&= \int_S \int_{\mathbb{R}^2} v(\theta, \vec{x}\cdot\vec{\theta} - \vec{x}'\cdot\vec{\theta})\, f(\vec{x}')\, dx'\, dy'\, d\theta \quad \text{(substituting (2) and eliminating } s\text{)} \\
&= \int_{\mathbb{R}^2} \left[ \int_S v(\theta, (\vec{x} - \vec{x}')\cdot\vec{\theta})\, d\theta \right] f(\vec{x}')\, dx'\, dy' \\
&= \int_{\mathbb{R}^2} V(\vec{x} - \vec{x}')\, f(\vec{x}')\, dx'\, dy' \quad \text{(via the definition of } V\ [= R^* v]\text{)} \\
&= V \star f \quad \text{(a 2D convolution)}.
\end{aligned}
\tag{4}
$$

The first term in the above may be symbolically written $R^*(v \star g)$, where it is understood that the convolution is with respect to the length variable and not the angular term in $g$. Hence, we have that $V \star f = R^*(v \star g)$.

² Those familiar with the morphology of tomographic artifacting in extremely angular resolution limited spaces may wish to skip to Section 3.

We may describe the relationship between $V$ and $v$ more explicitly via Fourier transformation: denoting by $f'(\vec{x})$ the convolution $V \star f$, and by $\mathcal{F}(\vec{k})[f']$ the Fourier transformation of $f'(\vec{x})$ with respect to the vector $\vec{k} = (k_1, k_2)$, we have, tautologically, that:

$$ f'(\vec{x}) = \mathcal{F}^{-1}\big[\mathcal{F}(\vec{k})[f']\big] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathcal{F}(k_1,k_2)[f']\, e^{i2\pi(xk_1 + yk_2)}\, dk_1\, dk_2. \tag{5} $$

Expressing this within polar coordinates (i.e., $k_1 = \omega\cos\theta$, $k_2 = \omega\sin\theta$), we have that:

$$
\begin{aligned}
f'(\vec{x}) &= \int_0^{2\pi}\int_0^{\infty} \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[f']\, e^{i2\pi(x\omega\cos\theta + y\omega\sin\theta)} \left| \frac{\partial k_1}{\partial\omega}\frac{\partial k_2}{\partial\theta} - \frac{\partial k_2}{\partial\omega}\frac{\partial k_1}{\partial\theta} \right| d\omega\, d\theta \\
&= \int_0^{2\pi}\int_0^{\infty} \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[f']\, \omega\, e^{i2\pi(x\omega\cos\theta + y\omega\sin\theta)}\, d\omega\, d\theta \\
&= \int_0^{\pi}\int_{-\infty}^{\infty} \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[f']\, |\omega|\, e^{i2\pi(x\omega\cos\theta + y\omega\sin\theta)}\, d\omega\, d\theta.
\end{aligned}
$$

Digressing briefly, we see that it is possible to treat (2) in a similar manner by rewriting it in terms of the coordinate system $\vec{s} = (s,t)$, and Fourier transforming with respect to the $s$ component, $\vec{s}$ being the coordinate system $\vec{x}$ rotated through an angle $\theta$ with respect to the $x$ axis. That is:

$$
\begin{aligned}
\mathcal{F}_s(\omega)[g(\theta,s)] &= \mathcal{F}_s(\omega)\left[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,\delta(s - x\cos\theta - y\sin\theta)\, dx\, dy \right] \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(t\cos\theta - s\sin\theta,\ t\sin\theta + s\cos\theta)\, e^{-i2\pi\omega t}\, dt\, ds.
\end{aligned}
$$

Transposing the right-hand side back into $(x,y)$ coordinates, we obtain:

$$
\begin{aligned}
\mathcal{F}_s(\omega)[g(\theta,s)] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, e^{-i2\pi\omega(x\cos\theta + y\sin\theta)} \left| \frac{\partial t}{\partial x}\frac{\partial s}{\partial y} - \frac{\partial s}{\partial x}\frac{\partial t}{\partial y} \right| dx\, dy \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\, e^{-i2\pi\omega(x\cos\theta + y\sin\theta)} \left( \cos^2\theta + \sin^2\theta \right) dx\, dy \\
&= \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[f(x,y)].
\end{aligned}
\tag{6}
$$

Now, we have from the convolution theorem that:

$$ \mathcal{F}(k_1,k_2)[f'(\vec{x})] = \mathcal{F}(k_1,k_2)[V \star f] = \mathcal{F}(k_1,k_2)[V]\ \mathcal{F}(k_1,k_2)[f]. $$

Therefore, substituting this result for $k_1 = \omega\cos\theta$, $k_2 = \omega\sin\theta$ into (5), we would have that:

$$
\begin{aligned}
f'(\vec{x}) &= \int_0^{\pi}\int_{-\infty}^{\infty} \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[f']\, |\omega|\, e^{i2\pi(x\omega\cos\theta + y\omega\sin\theta)}\, d\omega\, d\theta \\
&= \int_0^{\pi}\int_{-\infty}^{\infty} \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[f]\, |\omega|\, e^{i2\pi(x\omega\cos\theta + y\omega\sin\theta)}\, d\omega\, d\theta \\
&= \int_0^{\pi}\int_{-\infty}^{\infty} \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega|\, e^{i2\pi(x\omega\cos\theta + y\omega\sin\theta)}\, d\omega\, d\theta \\
&= \int_0^{\pi} \mathcal{F}^{-1}_{\omega}(\vec{x}\cdot\vec{\theta})\Big[ \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega| \Big]\, d\theta \\
&= R^*\Big[ \mathcal{F}^{-1}_{\omega}\big[ \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega| \big] \Big](\vec{x}) \quad \text{(from (3))}.
\end{aligned}
$$

Comparing with (4), we find the equivalence:

$$ R^*\big[v \star g(\theta,s)\big](\vec{x}) \equiv R^*\Big[ \mathcal{F}^{-1}_{\omega}\big[ \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega| \big] \Big](\vec{x}), $$

or:

$$ v \star g(\theta,s) = \mathcal{F}^{-1}_{\omega}\big[ \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega| \big]. $$

Fourier transforming both sides with respect to $\omega$ gives us:

$$
\begin{aligned}
\mathcal{F}(\omega)[v \star g(\theta,s)] &= \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega| \\
\Rightarrow\ \mathcal{F}(\omega)[v]\ \mathcal{F}(\omega)[g(\theta,s)] &= \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ \mathcal{F}_s(\omega)[g(\theta,s)]\, |\omega|.
\end{aligned}
$$

Hence, by cancelling $\mathcal{F}_s(\omega)[g(\theta,s)]$ in the above, we derive the explicit relationship between $V$ and $v$:

$$ \mathcal{F}(\omega)[v] = \mathcal{F}(\omega\cos\theta, \omega\sin\theta)[V]\ |\omega|. \tag{7} $$

The effect of the back-projection operator on the Radon transform of $f$ may then be appreciated, via a consideration of (4), by setting $v$ to be a Dirac delta function in $s$ (corresponding to an identity operation within the convolution). The $V$ corresponding to this $v$ may then be deduced by inserting the Fourier transform of the delta function (unity throughout f-space) into the above equation. Hence, we see that the effect of applying the back-projection operator to the Radon transformed $f$ function is the equivalent of convolving $f$ with the inverse Fourier-transformed remainder:

$$ f_{\text{recovered}}(x,y) = f_{\text{original}} \star \mathcal{F}^{-1}(\omega\cos\theta, \omega\sin\theta)\big[\,|\omega|^{-1}\big]. \tag{8} $$

In terms of the tomographic analogy, we retrieve a "blurred" version of the original data (see the contrast between the left and right-hand panels of the second figure in the appendix for an illustration of this). In fact, the object of tomography is exactly the reverse of this process: We seek to obtain a $v$ function such that it is $V$ that approaches the form of the delta function: that is, transforming the RHS of (4) into $f$ alone. In this instance, we may regard the $v$ function as a "filtering operator" that serves to remove morphology attributable to the sampling geometry rather than the original data, which is then, hence, applied to the Radon data at a stage prior to inversion via the back-projection operator.

We shall, in Section 3, set out to show that the summation method of classifier combination (which is representative of many more generalized combination approaches under certain conditions, such as very limited class information within the individual classifiers) is, in effect, the equivalent of applying the back-projection operator immediately to the classifier PDFs (which in our analogy are to be considered Radon transforms), without any attempt to apply prior filtering (i.e., setting $v$ to the delta function in (4)). It is then, via this observation, that we hope to improve on the combination process, presenting an optimal, or near optimal solution to the inversion problem by finding an appropriate filter, $v$, albeit in the context of probability theory.

Prior to setting out this correspondence however, we need first to extend the method to the $j$ dimensions required of our pattern vector. This is somewhat involved to achieve formally, and the reader is referred to [23] where it is demonstrated via a recursive, dimensionally incremental argument that the composite j-dimensional pattern space formed from nonoverlapping arbitrarily-dimensioned Radon transforms has the form of a summation over the various linear projections, and critically, that the appropriate filtering mechanism for larger dimensionalities is applied linearly to the constituent subspaces. In other words, the required methodology is precisely the rational generalization of (4) that we might expect.

It is also necessary, prior to precisely elucidating the relationship between classifier combination and tomography, to assess what conditions, if any, the very low number of angular Radon samples inherent in the class PDFs will impose upon the reconstructed pattern space, a discussion of which will therefore occupy the remainder of the section.
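
The blurring that results from back-projecting without a filter can be seen in a small discrete sketch. This is an illustration under assumed values only: the 5x5 grid, the single point mass at (2, 3), and the function names are hypothetical, and only the two axis-aligned projections are used.

```python
def axis_projections(f):
    """Project a 2D grid f[x][y] onto each axis by summing out the other."""
    over_y = [sum(row) for row in f]         # projection onto the x axis
    over_x = [sum(col) for col in zip(*f)]   # projection onto the y axis
    return over_y, over_x

def unfiltered_back_projection(over_y, over_x):
    """Sum the two projections passing through each grid point, with no
    filtering applied (v set to a delta function)."""
    return [[px + py for py in over_x] for px in over_y]

# Hypothetical test density: all probability mass at the point (2, 3).
n = 5
point = [[0.0] * n for _ in range(n)]
point[2][3] = 1.0

over_y, over_x = axis_projections(point)
recovered = unfiltered_back_projection(over_y, over_x)

# Count the grid cells to which the single point has been smeared.
nonzero_cells = sum(1 for row in recovered for value in row if value != 0.0)
```

The recovered grid peaks at the original location but also carries mass along the whole of row 2 and column 3, so a single point becomes a cross of nine cells. Streaks of this kind are attributable to the sampling geometry rather than to the data, which is the morphology a suitable filter $v$ is meant to remove.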

2.1 Sampling Issues

As a prelude to addressing the issue of sampling, we shall first need to discretize the original back-projection formula (4) thus (after Natterer, 1996 [20]):

$$ R^*(v \star g) \approx \frac{2\pi\rho}{pq} \overbrace{\sum_{j=0}^{p-1}}^{p=2} \sum_{l=-q}^{q} v_\Omega(\vec{x}\cdot\vec{\theta}_j - s_l)\, R(\vec{\theta}_j, s_l), \tag{9} $$

($2q+1$ being the number of colinear Radon samples within the discretised regime, $\rho$ the diameter of the reconstructive area, and the subscript $\Omega$ appended here to indicate some, as yet unspecified, bandwidth limitation³).

There are then two distinct aspects to the sampling issue as it relates to Radon transformation, namely, the linear and the rotational integrations within the discretised form of the inverse Radon transform. The first of these we can address in terms of the Nyquist criterion for sufficient sampling within the Fourier domain; following Natterer [20], we shall consider that the reconstructed pattern space is bandwidth limited (in the Fourier sense) to frequencies within a value $\Omega$. The Nyquist criterion states that this space may then be fully determined by linear sampling with a step-size of $\le \pi/\Omega$. In the nomenclature of (9), this step-size is implied by the relation between the width of the reconstructed space, $\rho$, and the total number of parallel Radon transforms, $q$, via the ratio $\rho/q$. The fact that the Radon transform and the pattern space have identical bandwidth limitations, as is implicitly considered to be the case in the above argument, is directly demonstrated by inspection of (6). The composite imposition upon the bandwidth arising from the equation of these step-sizes is then: $q \ge \frac{1}{\pi}\rho\,\Omega$.

However, we have also to consider what possible bandwidth limitations are imposed by the rotational sampling rate, which, given that Radon samples are obtained for only two angles per plane of reconstruction (the feature axes), would then appear, on intuitive grounds, to be the dominating factor of the two.
Formulating this precisely is less straightforward, and we adopt Natterer's [20] argument in terms of Bessel functions. Using Debye's representation of the asymptotic form of Hankel functions of the first kind as a method of relating angular integration to wavelength (or its nearest equivalent in Bessel terms), it is thus shown in [20] that the bandwidth of the Radon transform in terms of $\theta$ is, to a very good degree of approximation, $\rho\,\Omega$, with an angular step-size: $\pi/p$. The Nyquist criterion consequently imposes a restriction: $\pi/p \le \pi/(\rho\,\Omega)$, or: $p \ge \rho\,\Omega$.

Furthermore, since we have from above that $p = 2$, the bandwidth criterion owing to the angular sampling rate is thus: $\Omega \le 2/\rho$, contrasting with a bandwidth criterion derived from the linear sampling rate of: $\Omega \le q\pi/\rho$. Now, the number of points in a typical classifier-derived PDF will generally be in excess of the cardinality of the test data set from which it derived; being typically of the order of 1,000. This, and the corresponding bandwidth limitation, will clearly be so far in excess of the angular sampling limitations that we are, hence, justified in disregarding the number of linear sample-points as being of consequence to the recovered pattern space morphology, dominated as it is by the angular sampling rate. We have thus to consider the recoverable pattern-space as being of inherently few degrees of freedom in relation to the multidimensional feature spaces that exist within single classifiers (which would exhibit comparable angular and linear sample rates by the above arguments). Hence, we expect features to be distributed over differing classifiers by the feature selector only when the classifiers are morphologically disposed to represent differing subspaces of the composite pattern-space (see condition 1 of Section 2).
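
As a rough numerical illustration of how far apart the two criteria lie, the bounds can be evaluated directly. This is a sketch under stated assumptions: it reads the angular criterion as $p \ge \rho\,\Omega$ and the linear criterion as $q \ge \rho\,\Omega/\pi$, takes $\rho = 1$ in arbitrary units, and picks $q = 500$ so that $2q + 1$ is of the order of 1,000. The function names are hypothetical.

```python
import math

def angular_bandwidth_limit(p, rho):
    """Largest bandwidth Omega allowed by the angular criterion p >= rho * Omega."""
    return p / rho

def linear_bandwidth_limit(q, rho):
    """Largest bandwidth Omega allowed by the linear criterion q >= rho * Omega / pi."""
    return q * math.pi / rho

rho = 1.0   # assumed extent of the reconstruction region (arbitrary units)
p = 2       # two projection angles per plane: the feature axes
q = 500     # assumed, so that 2q + 1 is about 1,000 samples per PDF

omega_angular = angular_bandwidth_limit(p, rho)
omega_linear = linear_bandwidth_limit(q, rho)

# How many times looser the linear bound is than the angular one.
ratio = omega_linear / omega_angular
```

Under these assumed values the linear bound is several hundred times looser than the angular one, which is consistent with treating the angular sampling rate as the only constraint of consequence.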

3 CORRESPONDENCE WITH CLASSIFIER COMBINATION

Having thus obtained a form (or rather, a method) for n-dimensional inverse Radon transformation, and having established the limitations imposed on the methodology by the angular sample-rate, we are now in a position to make the correspondence with classifier combination theory more formally explicit. That is, we shall seek to encompass the various extant combinatorial decision theories within the tomographic framework that we have developed over the preceding sections, and show that they represent, within certain probabilistic bounds, an imperfect approximation to the unfiltered inverse Radon transformation.

³ In the interests of clarity, we shall consider this bandwidth limitation only in terms of the individual two-dimensional inverse Radon transforms from which the multidimensional inverse transform is progressively constructed, the n-dimensional inverse Radon transform being demonstrated to be decomposable in this fashion in [23].

We will first, however, demonstrate how we might explicitly substitute probabilistic terms into the n-dimensional inverse Radon transformation methodology via an illustration: We state without proof (refer to [23]) that the two-to-three-dimensional stage of the initially one-dimensional recursive inverse Radon transformation is of the form:

$$
\begin{aligned}
R^*(v \star g) ={}& A \sum_{l=-q}^{q} v_\Omega(\vec{x}\cdot\vec{\theta}_{\alpha\beta 0} - s_l)\, R(\vec{\theta}_{\alpha\beta 0}, s_l) \\
&+ A \sum_{l'=-q}^{q} v_\Omega(\vec{x}\cdot\vec{\theta}_{\alpha\beta 1} - s_{l'})\, R(\vec{\theta}_{\alpha\beta 1}, s_{l'}) \\
&+ A \sum_{l''=-q}^{q} v_\Omega(\vec{x}\cdot\vec{\theta}_{\beta\gamma 1} - s_{l''})\, R(\vec{\theta}_{\beta\gamma 1}, s_{l''}) \\
&\forall\, \alpha,\beta,\gamma : \alpha,\beta,\gamma \in I;\ \alpha \ne \beta \ne \gamma;\ 0 < \alpha,\beta,\gamma < n,
\end{aligned}
\tag{10}
$$

(where $A$ stands for the normalization prefactor, and the two-dimensional feature subspaces are denoted by the various concatenations of lowercase Greek letters, which individually represent the initial one-dimensional Radon transforms).

We have initially then to establish exactly what is meant in geometrical terms by the Radon forms upon which this equation is constructed. It is helpful in this endeavor to, at least initially, eliminate the complication of the prefiltering convolution represented by $v$. We do this by setting $v$ to a discretized form of the Dirac $\delta$ function throughout the summation, $\sum_{l=-q}^{q}$, that is:

$$ v_\Omega(s_{xyl}) = \frac{1}{\sum_{l=-q}^{q} \Delta s_{xyl}} = \frac{1}{2q} \quad \text{when } s_{xyl} = 0; \qquad v_\Omega(s_{xyl}) = 0 \quad \text{otherwise}. \tag{11} $$

Hence, the various summations only produce nonzero terms when: $\vec{x}\cdot\vec{\theta}_{\alpha\beta 0} = s_l$. Thus, without filtering, (10) commutes to the form:

$$
\begin{aligned}
R^*(v \star g) ={}& A\, R(\vec{\theta}_{\alpha\beta 0}, \vec{x}\cdot\vec{\theta}_{\alpha\beta 0}) + A\, R(\vec{\theta}_{\alpha\beta 1}, \vec{x}\cdot\vec{\theta}_{\alpha\beta 1}) + A\, R(\vec{\theta}_{\beta\gamma 1}, \vec{x}\cdot\vec{\theta}_{\beta\gamma 1}) \\
&\forall\, \alpha,\beta,\gamma : \alpha,\beta,\gamma \in I;\ \alpha \ne \beta \ne \gamma;\ 0 < \alpha,\beta,\gamma < n.
\end{aligned}
\tag{12}
$$

Now, because we are free to set the coordinate system as we choose, and, in having set $j$ to 2 in (10), consequently obtaining a perpendicularity between the Radon integral vectors, we shall find it convenient to express our geometry in terms of an orthogonal coordinate system, with axial direction vectors set parallel to the perpendicular Radon integrals.
Thus, we may legitimately make the equations: $\alpha = x_1$; $\beta = x_2$; $\gamma = x_3$.

Also, in having imposed this parallelism between the Radon integrals and coordinate axes, we find that the subscript $xyl$ comes to exhibit a redundancy of two variables, such that we may state the further consequent equivalences: $\vec{\theta}_{\alpha\beta 0} = \alpha$; $\vec{\theta}_{\alpha\beta 1} = \beta$; $\vec{\theta}_{\beta\gamma 1} = \gamma$. Thus, (12) now adopts the form:

$$ R^*(v \star g) = A\big[ R(\vec{\theta}_{x_1}, x_1) + R(\vec{\theta}_{x_2}, x_2) + R(\vec{\theta}_{x_3}, x_3) \big] $$

($A$ denoting the normalization). However, recall from (2) that:

$$ R(\theta,s)[f(x'_1,x'_2)] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x'_1,x'_2)\,\delta(s - x'_1\cos\theta - x'_2\sin\theta)\, dx'_1\, dx'_2 = g(\theta,s). $$

Now, we also have that: $\cos\theta_{x_2} = \sin\theta_{x_1} = 0$ and $\cos\theta_{x_1} = \sin\theta_{x_2} = 1$ ($\theta$ being measured in relation to the $x_1$ axis). Thus, for example, picking an ordinate at random:

$$
\begin{aligned}
R(\vec{\theta}_{x_1}, x_1) &= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x'_1,x'_2)\,\delta(x_1 - x'_1)\, dx'_1\, dx'_2 \\
&= \int_{-\infty}^{+\infty} f(x_1,x'_2)\, dx'_2 = \int_{-\infty}^{+\infty} f(x_1,x_2)\, dx_2,
\end{aligned}
\tag{13}
$$

and similarly for $x_2$, $x_3$.

Now, a rational extension of the nomenclature of (1) would allow us to write:

$$ p(x_1,x_2 \mid \omega_i) = \underbrace{\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty}}_{n-2} p(\vec{X} \mid \omega_i)\, dx_3 \ldots dx_n, \tag{14} $$

(and similarly, for the remaining pairs of basis vector combinations).

It is, naturally, still the case that:

$$ p(x_1 \mid \omega_i) = \underbrace{\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty}}_{n-1} p(\vec{X} \mid \omega_i)\, dx_2 \ldots dx_n = \int_{-\infty}^{+\infty} p(x_1,x_2 \mid \omega_i)\, dx_2. \tag{15} $$

Thus, by setting the equivalence $f(x_1,x_2) \equiv p(x_1,x_2 \mid \omega_i)$, we find by direct substitution into (13) that we can state that:

$$ R(\vec{\theta}_{x_1}, x_1) = \int_{-\infty}^{+\infty} f(x_1,x_2)\, dx_2 = p(x_1 \mid \omega_i), \tag{16} $$

and similarly for the remaining numeric subscripts.

Hence, in consequence, we may simply restate the unfiltered two-to-three-dimensional inverse Radon transformation in the more transparent form:

$$ R^*(v \star g) = A\big[ p(x_1 \mid \omega_i) + p(x_2 \mid \omega_i) + p(x_3 \mid \omega_i) \big]. $$

Moreover, we can go further and extend this approach to the recursive methodology of the n-dimensional inverse Radon transformation, in which case we find in the most general terms, that the unfiltered n-dimensional inverse Radon transformation will have the form (declining explicit calculation of the various normalizing constants corresponding to $A$ in the above, this being a relatively complex undertaking, and not in any case required in the context of the decision making schemes within which the method will ultimately be applied (see later)):