A STATISTICAL SIGNAL PROCESSING APPROACH TO IMAGE FUSION FOR CONCEALED WEAPON DETECTION*

J. Yang and R. S. Blum
ECE Department, Lehigh University
19 Memorial Drive West, Bethlehem, PA 18015-3084
jiy3@lehigh.edu, rblum@eecs.lehigh.edu

* This material is based on work supported by the U.S. Army Research Office under grant number DAAD19-00-1-0431. The content of the information does not necessarily reflect the position or the policy of the federal government, and no official endorsement should be inferred.

ABSTRACT

A statistical signal processing approach to multisensor image fusion is presented for concealed weapon detection (CWD). The approach is based on an image formation model in which the sensor images are described as the true scene corrupted by additive non-Gaussian distortion. The expectation-maximization (EM) algorithm is used to estimate the model parameters and the fused image. We demonstrate the effectiveness of this approach by applying it to the fusion of visual and non-visual images, with emphasis on CWD applications.

1. INTRODUCTION

The majority of the image fusion algorithms that have been proposed were not developed through a rigorous application of estimation theory. One exception is the work of Sharma [1], who proposed a Bayesian fusion method that is based on estimation theory and assumes all disturbances follow a Gaussian distribution. Since this is a rather limiting assumption, we present a generalization that allows both Gaussian and non-Gaussian disturbances, depending on which assumption best fits the observed data. Further, the fusion takes place in the multiscale transform (MST) domain [2-4], which appears to be a popular choice. In our examples we use the Laplacian pyramid transform [4], and we focus on concealed weapon detection (CWD) applications [5,6].

2. THE IMAGE FORMATION MODEL

We model every coefficient of the MST of each observed sensor image as

$$z_i(j) = \beta_i(j)\, s(j) + \varepsilon_i(j) \tag{1}$$

where i = 1, ..., q indexes the sensors; j denotes the coefficient location (for example, j is shorthand for j ≡ (x, y, m), where x, y are the pixel coordinates and m is the level of the pyramid); z_i(j) is the observed sensor image; s(j) is the true scene (which we hope to approximate using fusion); β_i(j) = ±1 or 0 is the sensor selectivity factor; and ε_i(j) is the random distortion. This model acknowledges that a given sensor may be able to "see" certain objects (β_i(j) = 1), may fail to "see" other objects (β_i(j) = 0), or may "see" certain objects with a polarity-reversed representation (β_i(j) = -1). The distortion is modeled using a K-term mixture of Gaussian probability density functions (pdfs) as

$$f_{\varepsilon_i(j)}(\varepsilon) = \sum_{k=1}^{K} \lambda_{k,i}(j)\, \frac{1}{\sqrt{2\pi\sigma_{k,i}^2(j)}} \exp\!\left(-\frac{\varepsilon^2}{2\sigma_{k,i}^2(j)}\right) \tag{2}$$
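To make the distortion model in (2) concrete, here is a minimal Python/numpy sketch that evaluates the K-term Gaussian mixture pdf. It is our own illustration, not code from the paper, and all function and variable names are invented:

```python
import numpy as np

def gaussian_mixture_pdf(eps, lam, sigma2):
    """Evaluate the K-term Gaussian mixture pdf of eq. (2).

    eps    : distortion value(s) at which to evaluate the density
    lam    : length-K array of mixing weights lambda_{k,i}(j), summing to 1
    sigma2 : length-K array of component variances sigma^2_{k,i}(j)
    """
    eps = np.asarray(eps, dtype=float)[..., None]        # broadcast over K
    comps = np.exp(-eps**2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
    return (lam * comps).sum(axis=-1)

# A two-term (K = 2) impulsive mixture: mostly low-variance noise plus
# occasional high-variance outliers, in the spirit of the paper's
# initialization (weights 0.8/0.2, variances spread by a factor gamma).
lam = np.array([0.8, 0.2])
sigma2 = np.array([1.0, 10.0])
print(gaussian_mixture_pdf([0.0, 3.0], lam, sigma2))
```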
3. FUSION WITH EM ALGORITHM

The expectation-maximization (EM) algorithm [7-9] is used to estimate the model parameters and to produce the fused image (using the final scene estimate). Using the EM algorithm we have developed a set of iterative equations that produce approximate maximum-likelihood estimates of the model parameters [9]. The iterative algorithm is run once for each image i and each coefficient j to obtain the estimates of s(j), β_i(j), and {λ_{1,i}(j), ..., λ_{K,i}(j); σ²_{1,i}(j), ..., σ²_{K,i}(j)}. These estimates can and will change as j changes.

In performing these estimates, the calculations employ the coefficients l = 1, ..., L in an L = h × h window around the coefficient j. In these calculations we assume that the parameters β_i(l) and {λ_{1,i}(l), ..., λ_{K,i}(l); σ²_{1,i}(l), ..., σ²_{K,i}(l)} are the same for each coefficient l = 1, ..., L in the window, to reduce the number of parameters we need to estimate. For this reason we drop the indices on these parameters; we set β_i(l) = β_i, for example. The window size L must be chosen carefully: large enough to allow good estimation, yet small enough to keep these assumptions reasonable.

Now consider parameter estimation for coefficient j in (1). We make computations for each coefficient l = 1, ..., L in the window around j, but we only keep the estimates for l = 1, the coefficient corresponding to index j. Let s'(l) denote the updated value of s(l), the fused image, with similar notation for the other quantities β'_i, λ'_{k,i}, and σ'²_{k,i}. The algorithm begins with current estimates s(l), β_i, λ_{k,i}, σ²_{k,i} of the parameters and produces updated estimates s'(l), β'_i, λ'_{k,i}, σ'²_{k,i} using the following procedure (a numpy sketch of the complete procedure, including initialization, appears at the end of this section).

1. First compute the conditional probabilities g_{k,i}[z_i(l)]:

$$g_{k,i}[z_i(l)] = \frac{\dfrac{\lambda_{k,i}}{\sqrt{2\pi\sigma_{k,i}^2}}\exp\!\left(-\dfrac{(z_i(l)-\beta_i s(l))^2}{2\sigma_{k,i}^2}\right)}{\displaystyle\sum_{p=1}^{K}\dfrac{\lambda_{p,i}}{\sqrt{2\pi\sigma_{p,i}^2}}\exp\!\left(-\dfrac{(z_i(l)-\beta_i s(l))^2}{2\sigma_{p,i}^2}\right)}, \qquad k=1,\dots,K;\; i=1,\dots,q;\; l=1,\dots,L \tag{3}$$

2. Update the parameter β_i: β'_i is chosen as the value from the set {0, -1, +1} that maximizes

$$Q=\sum_{i=1}^{q}\sum_{l=1}^{L}\sum_{k=1}^{K}\left[-\frac{\ln\sigma_{k,i}^{2}}{2}-\frac{(z_i(l)-\beta_i'\, s(l))^{2}}{2\sigma_{k,i}^{2}}\right]\cdot g_{k,i}[z_i(l)] \tag{4}$$

3. Update the true scene s(l) using the updated value β'_i:

$$s'(l)=\frac{\displaystyle\sum_{i=1}^{q}\sum_{k=1}^{K}\frac{\beta_i'\,z_i(l)}{\sigma_{k,i}^{2}}\,g_{k,i}[z_i(l)]}{\displaystyle\sum_{i=1}^{q}\sum_{k=1}^{K}\frac{(\beta_i')^{2}}{\sigma_{k,i}^{2}}\,g_{k,i}[z_i(l)]}, \qquad l=1,\dots,L \tag{5}$$

4. Update the distortion parameters λ_{k,i} and σ²_{k,i} using the updated values β'_i and s'(l):

$$\lambda_{k,i}'=\frac{1}{L}\sum_{l=1}^{L}g_{k,i}[z_i(l)], \qquad k=1,\dots,K;\; i=1,\dots,q \tag{6}$$

$$\sigma_{k,i}'^{2}=\frac{\displaystyle\sum_{l=1}^{L}\left(z_i(l)-\beta_i'\,s'(l)\right)^{2} g_{k,i}[z_i(l)]}{\displaystyle\sum_{l=1}^{L}g_{k,i}[z_i(l)]}, \qquad k=1,\dots,K;\; i=1,\dots,q \tag{7}$$

5. Repeat steps 1-4 until

$$\delta=\sum_{l=1}^{L}\left|s'(l)-s(l)\right|<\tau \tag{8}$$

where τ = 0.0001. The above procedure is derived from the SAGE version of the EM algorithm, similar to the development in [8]. The fused result is the final estimate s(1) when the algorithm converges. The procedure is repeated for each j and i in (1) that describes the sensor images.

Initial values for the parameters are required to begin the EM algorithm. A simple estimate of the true scene s(l) is a weighted average of the sensor image data, so the initial values of s(l) are given by

$$s(l)=\sum_{i=1}^{q}w_i\,z_i(l), \qquad l=1,\dots,L \tag{9}$$

where Σ_{i=1}^{q} w_i = 1. The simplest case uses an equal weight for each sensor image, that is, w_i = 1/q for i = 1, ..., q. A simple initialization for β_i is to assume that the true scene appears in each sensor image; hence β_i = 1 for i = 1, ..., q. To model the distortion in a robust way, the distortion is initialized as impulsive [9]: we set λ_{1,i} = 0.8 and λ_{2,i} = ... = λ_{K,i} = 0.2/(K-1) for i = 1, ..., q. Then we set σ²_{k,i} = γ σ²_{k-1,i} for i = 1, ..., q and k = 2, ..., K, where the value of σ²_{1,i} is chosen based on an estimate of the total variance σ²_i = Σ_{k=1}^{K} λ_{k,i} σ²_{k,i}, given by

$$\sigma_i^{2}=\frac{1}{L}\sum_{l=1}^{L}\left[z_i(l)-s(l)\right]^{2} \tag{10}$$

We choose γ = 10 so that the initial distortion model is fairly impulsive. This initialization scheme worked very well for the cases we have studied.
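The following is a minimal Python/numpy sketch of steps 1-5 together with the initialization of (9)-(10), for a single analysis window. It is our own illustration rather than the authors' implementation: the function and variable names are invented, and the small guards against division by zero are our additions.

```python
import numpy as np

def em_fuse_window(z, K=2, gamma=10.0, tau=1e-4, max_iter=50):
    """EM fusion for one analysis window, following eqs. (3)-(10).

    z : (q, L) array of MST coefficients, one row per sensor, for the
        L coefficients in the window around location j (index l = 1 first).
    Returns the fused estimate s(1), the coefficient kept for location j.
    """
    q, L = z.shape

    # Initialization, eqs. (9)-(10): equal-weight average, beta_i = 1,
    # impulsive mixture weights, geometrically spread variances (K >= 2).
    s = z.mean(axis=0)                                    # (L,)
    beta = np.ones(q)
    lam = np.full((q, K), 0.2 / max(K - 1, 1))
    lam[:, 0] = 0.8
    total_var = ((z - s) ** 2).mean(axis=1)               # (q,), eq. (10)
    spread = gamma ** np.arange(K)                        # sigma2_k = gamma^(k-1) * sigma2_1
    sigma2 = np.outer(total_var / (lam[0] * spread).sum(), spread)  # (q, K)

    for _ in range(max_iter):
        # Step 1, eq. (3): conditional probabilities g_{k,i}[z_i(l)].
        resid2 = (z - beta[:, None] * s) ** 2             # (q, L)
        comp = np.exp(-resid2[:, None, :] / (2 * sigma2[:, :, None])) \
               / np.sqrt(2 * np.pi * sigma2[:, :, None])  # (q, K, L)
        g = lam[:, :, None] * comp
        g /= g.sum(axis=1, keepdims=True)

        # Step 2, eq. (4): pick beta_i in {0, -1, +1} maximizing Q.
        # Q separates over sensors, so each beta_i is chosen independently.
        for i in range(q):
            def Q(b):
                r2 = (z[i] - b * s) ** 2                  # (L,)
                term = -0.5 * np.log(sigma2[i])[:, None] \
                       - r2 / (2 * sigma2[i][:, None])    # (K, L)
                return (g[i] * term).sum()
            beta[i] = max((0.0, -1.0, 1.0), key=Q)

        # Step 3, eq. (5): update the true scene s(l).
        w = g / sigma2[:, :, None]                        # (q, K, L)
        num = (w * (beta[:, None, None] * z[:, None, :])).sum(axis=(0, 1))
        den = (w * (beta[:, None, None] ** 2)).sum(axis=(0, 1))
        s_new = num / np.maximum(den, 1e-12)

        # Step 4, eqs. (6)-(7): update mixture weights and variances.
        lam = g.mean(axis=2)
        resid2 = (z - beta[:, None] * s_new) ** 2
        sigma2 = (g * resid2[:, None, :]).sum(axis=2) / np.maximum(g.sum(axis=2), 1e-12)
        sigma2 = np.maximum(sigma2, 1e-12)

        # Step 5, eq. (8): stop when the scene estimate has converged.
        delta = np.abs(s_new - s).sum()
        s = s_new
        if delta < tau:
            break

    return s[0]   # the estimate kept for coefficient j (l = 1)

# Toy usage: two sensors observing the same 5x5 window of coefficients
# with different noise levels; the fused value should track the truth.
rng = np.random.default_rng(0)
truth = rng.normal(size=25)
z = np.stack([truth + 0.1 * rng.normal(size=25),
              truth + 0.5 * rng.normal(size=25)])
print(em_fuse_window(z), truth[0])   # fused estimate vs. true coefficient
```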
We observed that in our experiments the algorithm generally converged in 3 to 5 iterations.

4. EXPERIMENTS AND RESULTS

We applied the EM fusion algorithm to the visual and MMW images shown in Fig. 1 (a) and (b) (the source images were obtained from Thermotex Corporation). The number of levels in the Laplacian pyramid was 5, the mixture model in (2) was employed with K = 2, and the local analysis window size was 5 × 5. Fig. 1 (c) shows the result of the EM fusion algorithm, while Fig. 1 (d), (e), and (f) show the results obtained by pixel averaging, selecting the maximum pixel, and the Laplacian pyramid fusion approach of [3], respectively. The fusion method from [3] chooses the maximum of the sensor pyramid coefficients for the high-pass pyramid coefficients and averages the sensor pyramid coefficients for the low-pass pyramid coefficients (see the sketch at the end of this section). From the comparison, the EM fusion algorithm performs better than these three fusion methods.

A second example of the EM fusion algorithm, for a CWD application, fused a visual and an IR image. Fig. 2 (a) and (b) show the visual and IR images, and Fig. 2 (c), (d), (e), and (f) show the fused results obtained by the EM fusion algorithm, pixel averaging, selecting the maximum pixel, and the algorithm from [3], respectively. In this example the EM fusion algorithm performs better than pixel averaging and selecting the maximum pixel, and has performance similar to the algorithm from [3].

The last example shows that the EM fusion algorithm is useful for applications besides CWD. Fig. 3 (a) and (b) are the long-wave and medium-wave images to be fused in an Autonomous Landing Guidance (ALG) application, and Fig. 3 (c) and (d) show the results obtained by the EM fusion algorithm and the algorithm from [3], respectively. This example shows that the EM fusion algorithm can handle objects with polarity-reversed features in the sensor images. Although space does not permit demonstrating it here, when the source images contain additive non-Gaussian noise the EM fusion algorithm is also able to attenuate the noise and produce an improved fused result.
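For reference, here is a minimal Python/numpy sketch of the baseline fusion rule from [3] used in the comparisons above. The pyramid layout (a list of arrays with the low-pass band last) and all names are our assumptions, and we interpret "maximum" as maximum magnitude, a common convention for high-pass coefficients that the paper does not spell out:

```python
import numpy as np

def fuse_pyramids(pyramids):
    """Baseline rule of [3]: max-select the high-pass bands, average
    the low-pass band.

    pyramids : one Laplacian pyramid per sensor; each pyramid is a list
        of numpy arrays [highpass_0, ..., highpass_{M-1}, lowpass].
    """
    fused = []
    n_levels = len(pyramids[0])
    for m in range(n_levels):
        bands = np.stack([p[m] for p in pyramids])        # (q, H, W)
        if m < n_levels - 1:
            # High-pass band: keep, per coefficient, the sensor value
            # with the largest magnitude.
            idx = np.abs(bands).argmax(axis=0)
            fused.append(np.take_along_axis(bands, idx[None], axis=0)[0])
        else:
            # Low-pass band: average across sensors.
            fused.append(bands.mean(axis=0))
    return fused
```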
5. DISCUSSION

We have presented a probabilistic image fusion method based on a statistical signal processing approach and have experimented with it for concealed weapon detection applications. The results showed the advantages of the EM-based approach in some cases. When we apply our algorithm to cases with multiple frames from a video sequence, we can expect an improved fused result over the cases considered in this paper: the multiple frames provide redundant information about the true scene, and this additional information is very helpful in the estimation. We also envision other improvements to the current fusion approach; if we extend the image formation model to allow possibly correlated Gaussian mixture distortion, the model should be closer to realistic sensor images and the estimation may improve.

6. REFERENCES

[1] R. K. Sharma, T. K. Leen, and M. Pavel, "Probabilistic image sensor fusion," Advances in Neural Information Processing Systems 11, The MIT Press, 1999.
[2] Z. Zhang and R. S. Blum, "A hybrid image registration technique for a digital camera image fusion application," Information Fusion, pp. 1-15, Jan. 2001.
[3] Z. Zhang and R. S. Blum, "A categorization and study of multiscale-decomposition-based image fusion schemes," Proceedings of the IEEE, pp. 1315-1328, Aug. 1999.
[4] P. J. Burt and E. Adelson, "The Laplacian pyramid as a compact image code," IEEE Trans. Communications, vol. COM-31, no. 4, pp. 532-540, 1983.
[5] D. D. Ferris Jr., R. W. McMillan, N. C. Currie, M. C. Wicks, and A. Slamani, "Sensors for military special operations and law enforcement applications," Proc. SPIE, vol. 3062, pp. 173-180, 1997.
[6] M. A. Slamani, L. Ramac, M. Uner, P. K. Varshney, D. D. Weiner, M. Alford, D. D. Ferris Jr., and V. Annicola, "Enhancement and fusion of data for concealed weapons detection," Proc. SPIE, vol. 3068, pp. 8-19, 1997.
[7] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. of the Royal Statistical Society, vol. 39, no. 1, pp. 1-38, 1977.
[8] R. S. Blum, R. J. Kozick, and B. M. Sadler, "An adaptive spatial diversity receiver for non-Gaussian interference and noise," IEEE Transactions on Signal Processing, pp. 2100-2112, Aug. 1999.
[9] R. A. Redner and H. F. Walker, "Mixture densities, maximum likelihood and the EM algorithm," SIAM Review, vol. 26, pp. 195-239, April 1984.

Fig. 1. Fusion result of visual and MMW images for CWD: (a) visual image; (b) MMW image; (c) EM fusion; (d) averaging; (e) selecting maximum; (f) Laplacian fusion.
Fig. 2. Fusion result of visual and IR images for CWD: (a) visual image; (b) IR image; (c) EM fusion; (d) averaging; (e) selecting maximum; (f) Laplacian fusion.
Fig. 3. Fusion result of long-wave and medium-wave images for ALG: (a) long wave; (b) medium wave; (c) EM fusion; (d) Laplacian fusion.