BAYESIAN REGULARIZATION OF DIFFUSION TENSOR IMAGES USING HIERARCHICAL MCMC AND LOOPY BELIEF PROPAGATION

Siming Wei†, Jing Hua‡, Jiajun Bu†, Chun Chen†, Yizhou Yu♦†

† College of Computer Science, Zhejiang University
‡ Wayne State University
♦ University of Illinois at Urbana-Champaign

ABSTRACT

Based on the theory of Markov random fields, a Bayesian regularization model for diffusion tensor images (DTI) is proposed in this paper. The low-degree parameterization of diffusion tensors in our model makes it less computationally intensive to obtain a maximum a posteriori (MAP) estimation. An approximate solution to the problem is achieved efficiently using hierarchical Markov chain Monte Carlo (HMCMC), and a loopy belief propagation algorithm is applied to a coarse grid to obtain a good initial solution for hierarchical MCMC. Experiments on synthetic and real data demonstrate the effectiveness of our methods.

Index Terms — Diffusion Tensor Images, Image Restoration, Bayesian Models, Markov Chain Monte Carlo

1. INTRODUCTION

Diffusion tensor imaging (DTI) enables the indirect inference of white matter microstructure by reconstructing local diffusion displacement probability density functions for water molecules from local measurements. When following a 3D Gaussian distribution, this probability density function can be described by a diffusion tensor: a 3 × 3 symmetric positive semidefinite matrix whose eigenvalues and eigenvectors characterize the underlying fiber orientation and anisotropy. A diffusion tensor is inherently related to the covariance matrix of the Gaussian distribution, and can be reconstructed from diffusion coefficients measured locally along several gradient directions. Due to the inherent noise of DTI measurements, the reconstructed tensors are inaccurate, giving rise to possibly erroneous results in the derived white matter fiber orientation, which in turn affects the accuracy of fiber tracking.
There is an extensive literature on different regularization techniques [1, 8, 10]. We would like to focus on the statistically sound Bayesian framework for tensor field regularization. Prior distributions in the Bayesian framework are often modeled using Markov random fields (MRFs) with pairwise interactions. Bayesian regularization by means of maximum a posteriori (MAP) estimation is well known in the statistical literature, initiated in the 1980s by Geman and Geman [4]. Bayesian regularization of the primary directions of diffusion tensors was developed by Poupon et al. [10]. Similar work for full diffusion tensors following a multivariate Gaussian distribution was proposed by Martin-Fernandez et al. [7]. The generalization of such work to Markov random tensor fields with more generic distributions was presented by Frandsen et al. [3]. It should be noted that the MAP estimation in previous work was achieved using either locally iterative optimization, which can easily be trapped in locally optimal solutions, or generic Markov chain Monte Carlo sampling [5], which needs a large number of iterations to converge.

In this paper, we introduce more advanced techniques for computing the MAP estimation of the tensor field. We first develop a Bayesian model based on a low-degree parameterization of diffusion tensors. This model makes MAP estimation less computationally intensive. We further develop a hierarchical MCMC technique that dynamically partitions the original state space into a hierarchy of nested smaller state spaces with increasing resolution. It is able to converge to better approximate solutions than conventional MCMC in a relatively small number of iterations. Loopy belief propagation (LBP) [9, 6], an efficient deterministic technique for MRF-based optimization, is applied to a coarse grid of diffusion tensors to quickly obtain a good initial solution for hierarchical MCMC.
Experimental results confirm that our revised Bayesian model as well as our MAP estimation techniques are both efficient and effective.

2. BASIC MODEL

Let W be a finite set of voxels in the white matter. We denote the diffusion tensor field reconstructed from measured diffusion coefficients by Σ = {Σ_w : w ∈ W}, where Σ_w is a 3 × 3 positive semidefinite matrix at voxel w. Let λ_{w1} ≥ λ_{w2} ≥ λ_{w3} ≥ 0 be the eigenvalues of Σ_w with corresponding orthonormal eigenvectors u_{w1}, u_{w2}, and u_{w3}. The fractional anisotropy (FA) index is defined as

    FA_w = sqrt( [ (1/2) Σ_{i=1}^{3} (λ_{wi} − λ̄_w)² ] / [ (1/3) Σ_{i=1}^{3} λ_{wi}² ] ),

and the normalized diffusion tensor at w is Σ̄_w = Σ_w / λ̄_w, where λ̄_w = (1/3)(λ_{w1} + λ_{w2} + λ_{w3}). Since most white matter voxels do not have neural fiber crossings, the regularized tensors should most likely be of "cigar" type, which means λ_{w2} = λ_{w3}. For this type of tensor, we define σ_w = λ_{w2}/λ_{w1} as the eigenratio of Σ̄_w. There is a one-to-one mapping between eigenratios and FA values, and both of their ranges are (0, 1]. The two quantities are closely related: the larger the eigenratio, the smaller the FA value. Moreover, a normalized cigar-type tensor at w is uniquely determined by the primary direction of the tensor, Md_w = u_{w1}, and the eigenratio σ_w, i.e., Σ̄_w = Σ̄_w(Md_w, σ_w). These two variables are crucial to fiber tracking. We further define the primary direction field Md = {Md_w : w ∈ W} and the eigenratio field σ = {σ_w : w ∈ W}.

The diffusion function at w is denoted f_w. For a given direction u on the unit sphere, f_w(u) = λ̄_w uᵀ Σ̄_w(Md_w, σ_w) u. Let 𝓕 = {f_w : w ∈ W} be the field of diffusion functions.
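The low-degree parameterization above (a unit primary direction plus an eigenratio) can be made concrete in a short sketch. This is an illustrative implementation of the definitions, not the authors' code; the function names and scaling are ours, assuming the normalized tensor is taken to have mean eigenvalue 1.

```python
import numpy as np

def cigar_tensor(direction, eigenratio):
    """Normalized cigar-type tensor Sigma_bar(Md, sigma) with eigenvalues
    proportional to (1, sigma, sigma), scaled so the mean eigenvalue is 1."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    lam1 = 3.0 / (1.0 + 2.0 * eigenratio)   # mean of (lam1, lam2, lam2) is 1
    lam2 = eigenratio * lam1
    # lam2*I + (lam1 - lam2)*d d^T has eigenvector d with eigenvalue lam1,
    # and eigenvalue lam2 on the plane orthogonal to d.
    return lam2 * np.eye(3) + (lam1 - lam2) * np.outer(d, d)

def fractional_anisotropy(tensor):
    """FA_w as defined in Section 2."""
    lam = np.linalg.eigvalsh(tensor)
    mean = lam.mean()
    return np.sqrt(0.5 * np.sum((lam - mean) ** 2) / (np.sum(lam ** 2) / 3.0))
```

As the text notes, FA decreases monotonically as the eigenratio grows: `cigar_tensor(d, 1.0)` is the identity (FA = 0), while the FA approaches 1 as the eigenratio approaches 0.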
Also denote the set of measured diffusion coefficients by F = {F_w(u_i) : i = 1, ..., k, w ∈ W}, where u_1, ..., u_k are the directions in which the signal intensity is measured. F is determined by the equation S_w(u_i) = S_{w0} · exp(−b F_w(u_i)) and estimated by least-squares approximation. Here S_{w0} is the signal intensity without any gradient, S_w(u_i) is the measured intensity in direction u_i, and b is the diffusion-encoding strength factor.

[978-1-4244-7993-1/10/$26.00 © 2010 IEEE. Proceedings of the 2010 IEEE 17th International Conference on Image Processing (ICIP 2010), September 26–29, 2010, Hong Kong.]

To reduce the noise level of a given data set, we aim to achieve the MAP estimation of p(𝓕 | F). According to Bayes' theorem, p(𝓕 | F) ∝ p(F | 𝓕) p(𝓕). Therefore, performing regularization is equivalent to solving the following optimization problem:

    arg max_{Md, σ} p(F | 𝓕) p(𝓕).    (1)

Our prior and likelihood models are revised versions of those in Frandsen et al. [3]. The prior distribution p(𝓕) is defined as

    p(𝓕) = (1/Z_α) exp( −α Σ_{w ∼ w'} g( ‖ Σ̄_w(Md_w, σ_w) − Σ̄_{w'}(Md_{w'}, σ_{w'}) ‖ ) ),    (2)

where Z_α is a normalizing constant, α > 0, ‖·‖ denotes the Frobenius norm of a matrix, and w ∼ w' means w and w' are direct neighbors. To choose an appropriate function g, outlier suppression should be taken into account, so we set

    g(x) = c − c · exp(−x² / K)

with constant parameters c and K.

A well-known assumption is that the raw signal intensity follows a Rician distribution. Thus the noise on the measured diffusion coefficients at each voxel is independently and normally distributed [3]. The covariance of these distributions may vary across voxels. We define this covariance as

    h_w = ( exp(2b λ̄_w) + 1 ) / (b · SNR_0)²,

where SNR_0 is the signal-to-noise ratio of the signal intensity without any gradient.
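With a single acquisition per direction, the least-squares fit of the signal model above reduces to a log-domain inversion, and the prior's penalty g is a bounded, outlier-suppressing function of the inter-voxel tensor difference. A minimal sketch of both, with our own function names and illustrative default parameters (c = 1, K = 3 are the values later used in Section 5):

```python
import numpy as np

def diffusion_coefficients(S, S0, b):
    """Invert S_w(u_i) = S_w0 * exp(-b * F_w(u_i)) for the diffusion
    coefficients F_w(u_i). With one measurement per direction this is
    exact; with repeated acquisitions it becomes the log-domain
    least-squares fit."""
    return np.log(S0 / np.asarray(S, dtype=float)) / b

def g(x, c=1.0, K=3.0):
    """Outlier-suppressing penalty from eq. (2): g(x) = c - c*exp(-x^2/K).
    Roughly quadratic near 0 but saturating at c, so large inter-voxel
    differences (genuine edges) are not over-penalized."""
    return c - c * np.exp(-np.asarray(x, dtype=float) ** 2 / K)
```

The saturation of g is what distinguishes this prior from a plain Gaussian MRF prior, which would smooth across fiber boundaries.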
This is a revised version of the function h(f_w(µ)) in [3]. Their choice of h may result in a preference for spherical (isotropic) tensors during MAP estimation, while ours, using the average eigenvalue, does not have such a bias. Therefore, at each voxel w and direction µ, we have

    p(F_w(µ) | 𝓕) = (1/√(2π h_w)) exp( −(F_w(µ) − f_w(µ))² / (2 h_w) ).    (3)

Then the likelihood p(F | 𝓕) is formulated as

    p(F | 𝓕) = Π_{w ∈ W} (1/√(2π h_w))^k exp( −(1/2) Σ_{i=1}^{k} (F_w(u_i) − f_w(u_i))² / h_w ).    (4)

Now we are ready to solve the optimization problem (1) under the above prior and likelihood models. Set

    E¹_w(Σ̄_w) = Σ_{i=1}^{k} [ ln(2π h_w) + (F_w(u_i) − λ̄_w u_iᵀ Σ̄_w u_i)² / h_w ]

and

    E²_{ww'}(Σ̄_w, Σ̄_{w'}) = 2α g( ‖ Σ̄_w − Σ̄_{w'} ‖ ).

Since Σ̄_w is uniquely determined by the primary direction and eigenratio, E¹_w is a function of both Md_w and σ_w. Similarly, E²_{ww'} is a function of Md_w, Md_{w'}, σ_w, and σ_{w'}. We also define

    E_Total = Σ_{w ∈ W} E¹_w + Σ_{w ∼ w'} E²_{ww'},

so E_Total is a function of Md and σ at all voxels. The optimization problem in (1) becomes

    arg min_{Md, σ} E_Total.    (5)

3. MULTILEVEL TENSOR REGULARIZATION

Markov chain Monte Carlo has traditionally been adopted for solving optimization problems such as (5). Each step of this method needs to sample a normalized cigar tensor from a proposal distribution. Sampling positive semidefinite matrices in the continuous tensor space is time-consuming, and MCMC has a slow convergence rate. Define the resolution index R_A of a set A with norm ‖·‖ as

    R_A = min{ ‖a − a'‖ : a ≠ a' and a, a' ∈ A }.    (6)

We say a set is of high resolution when its resolution index is small. A higher resolution of the tensor state space gives rise to slower convergence of the Markov chain during MCMC sampling. Our effort is to overcome this obstacle by means of a multilevel technique for MCMC.
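The objective in (5) decomposes into per-voxel data terms E¹_w and pairwise smoothness terms E²_{ww'}, which makes the energy cheap to evaluate and to update locally during sampling. A sketch of the two terms in our own vectorized formulation (array layout and names are ours):

```python
import numpy as np

def data_energy(Sigma_bar, lam_bar, F_meas, dirs, h_w):
    """E^1_w = sum_i [ ln(2*pi*h_w) + (F_w(u_i) - lam_bar * u_i^T Sigma_bar u_i)^2 / h_w ].
    dirs: (k, 3) unit gradient directions; F_meas: (k,) measured coefficients."""
    # Per-direction quadratic forms u_i^T Sigma_bar u_i in one einsum call.
    pred = lam_bar * np.einsum('ij,jk,ik->i', dirs, Sigma_bar, dirs)
    return np.sum(np.log(2.0 * np.pi * h_w) + (F_meas - pred) ** 2 / h_w)

def smoothness_energy(Sa, Sb, alpha=3.0, c=1.0, K=3.0):
    """E^2_ww' = 2 * alpha * g(||Sa - Sb||_F), with g(x) = c - c*exp(-x^2/K)."""
    x = np.linalg.norm(Sa - Sb)  # Frobenius norm
    return 2.0 * alpha * (c - c * np.exp(-x * x / K))
```

Because g saturates, the smoothness term is bounded by 2αc per neighbor pair, so a single proposal can only change the total energy by a bounded smoothness contribution plus its local data term.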
As stated in Section 2, a diffusion tensor can be uniquely defined by its primary direction and eigenratio. The key idea is that a tensor space at a low level is generated by a low-resolution primary direction space and a low-resolution eigenratio space. At level l, we denote the state space of primary directions by M^l and that of eigenratios by ER^l. The optimization problem at this level is then formulated as

    arg min_{Md, σ} E_Total,  ∀w ∈ W, Md_w ∈ M^l_w, σ_w ∈ ER^l_w.    (7)

Our multilevel coarse-to-fine regularization is summarized as follows:

1) At Level = 1, let ICOS be an icosahedron with vertices on a unit sphere, two of which lie on the z axis of the coordinate system. At any voxel w, M¹_w is set to {p : p is a vertex of ICOS and z(p) > 0}, and ER¹_w is defined as {1/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8}. The resolution indices of this level are R_{M¹_w} ≈ 1.05 and R_{ER¹_w} = 0.125. Perform the loopy belief propagation algorithm detailed in Section 4 to obtain a good initial primary direction and eigenratio at every voxel.

2) Level update: Level = Level + 1. Suppose the new level is l. At voxel w, suppose the current state is (Md_w, σ_w). Define a circle on the unit sphere,

    C_w = { u : u ∈ unit sphere S² and u ∈ B(Md_w, s · R_{M^{l−1}_w}) },

where B(Md_w, s · R_{M^{l−1}_w}) is a ball centered at Md_w with radius s · R_{M^{l−1}_w}. Here s is a scaling factor; it can be proven mathematically that s ≥ √3/3 is necessary. M^l_w and ER^l_w are defined as follows. Determine a set PT_w consisting of six points uniformly distributed on the circle C_w; then M^l_w = {Md_w} ∪ PT_w. ER^l_w is the set of seven points that divide [σ_w − (1/2) R_{ER^{l−1}_w}, σ_w + (1/2) R_{ER^{l−1}_w}] into eight uniform segments. Thus, the tensor state space for the current level l is

    T^l = { Σ̄_w(Md_w, σ_w) | Md_w ∈ M^l_w, σ_w ∈ ER^l_w, w ∈ W }.
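The level update in step 2 (six new candidate directions on a circle of Euclidean radius s·R around the current direction, plus a refined eigenratio interval) can be sketched as follows. This is our own illustrative geometry code; s = 0.6 is merely an example value satisfying the stated requirement s ≥ √3/3 ≈ 0.577.

```python
import numpy as np

def refine_directions(md, R_prev, s=0.6):
    """Current direction plus six points uniformly spaced on the circle
    {u in S^2 : ||u - md|| = s * R_prev} (step 2 of the multilevel scheme)."""
    md = np.asarray(md, dtype=float)
    md = md / np.linalg.norm(md)
    # Orthonormal basis (e1, e2) of the plane perpendicular to md.
    a = np.array([1.0, 0.0, 0.0]) if abs(md[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(md, a); e1 /= np.linalg.norm(e1)
    e2 = np.cross(md, e1)
    r = s * R_prev
    cos_t = 1.0 - r * r / 2.0        # chord length r <-> polar angle t
    sin_t = np.sqrt(max(0.0, 1.0 - cos_t ** 2))
    phis = np.linspace(0.0, 2.0 * np.pi, 6, endpoint=False)
    pts = [cos_t * md + sin_t * (np.cos(p) * e1 + np.sin(p) * e2) for p in phis]
    return [md] + pts

def refine_eigenratios(sigma, R_prev):
    """Seven points dividing [sigma - R_prev/2, sigma + R_prev/2] into
    eight uniform segments; the new resolution index is R_prev/8."""
    lo = sigma - 0.5 * R_prev
    return lo + (R_prev / 8.0) * np.arange(1, 8)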
3) Traverse the voxels at the current level in sequential order. When updating tensor Σ̄_w, randomly select a normalized diffusion tensor Σ̄'_w from the constructed tensor space, and use it to replace Σ̄_w with probability β, where

    β = exp( min{ E(Σ̄_w) − E(Σ̄'_w), 0 } ),
    E(Σ̄_w) = E¹_w(Σ̄_w) + Σ_{w ∼ w'} E²_{ww'}(Σ̄_w, Σ̄_{w'}).

4) If the stopping criteria (which could be a time limit, a number of sweeps, etc.) for the current level have been satisfied, go to 2) unless the highest level has been reached; otherwise, go to 3).

As seen in the above steps, the state space at every level always contains very few candidates, which accelerates the mixing of the Markov chain and saves both time and memory. In practice, we typically use 6–10 levels in the above multilevel regularization.

4. LOOPY BELIEF PROPAGATION

Quickly obtaining a good initial solution is of great importance to the multilevel regularization of Section 3. We chose to achieve this goal through belief propagation [9, 6, 2], which is widely used for MAP estimation in MRF problems. This algorithm is iterative and delivers messages among neighboring nodes in parallel during each iteration. Each message is a vector whose dimension is equal to the number of distinct states in a discrete state space. For a graph without loops, belief propagation guarantees the optimal solution in a finite number of iterations. For a graph with loops, the same algorithm, now called loopy belief propagation (LBP), converges to a good approximation of the optimal solution [6].

Parallel update of messages in BP leads to excessive memory usage, since all messages from one iteration need to be saved for the next iteration. To improve the space complexity, we reduce both the number of candidates in the state space and the number of messages. Since our prior distribution prefers similar primary directions at neighboring voxels, we take advantage of this to establish our coarsened LBP model.
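Step 3 of the multilevel scheme above is a Metropolis-style move over the current level's discrete candidates: a proposal that lowers the local energy E is always accepted, and an uphill move is accepted with probability exp(−ΔE). A minimal sketch of the acceptance rule (our own naming; the construction of the candidate tensor itself is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def accept_proposal(E_current, E_proposed):
    """Accept with probability beta = exp(min{E_current - E_proposed, 0}):
    certain acceptance when the energy decreases, exponentially damped
    acceptance when it increases."""
    beta = np.exp(min(E_current - E_proposed, 0.0))
    return rng.random() < beta
```

Because E(Σ̄_w) involves only the voxel's data term and the smoothness terms to its direct neighbors, ΔE can be computed locally without touching E_Total.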
To reduce the number of messages, we divide the voxel set into 2 × 2 × 2 blocks and only maintain messages among blocks. We denote the block set by G. For any block g ∈ G,

    E¹_{gG} = Σ_{w ∈ g} E¹_w + Σ_{w ∼ w'; w, w' ∈ g} E²_{ww'},
    E²_{gg'G} = Σ_{w ∼ w'; w ∈ g, w' ∈ g'} E²_{ww'}.    (8)

It is straightforward to prove that

    E_Total = Σ_{g ∈ G} E¹_{gG} + Σ_{g ∼ g'} E²_{gg'G}.    (9)

To reduce the size of the state space, we only optimize the primary directions of the tensors using LBP. Voxels in the same block have the same primary direction, but they still keep their original eigenratios computed from the raw data. From (9), we formulate the following new optimization problem,

    arg min_{Md_g} ( Σ_{g ∈ G} E¹_{gG}(Md_g) + Σ_{g ∼ g'} E²_{gg'G}(Md_g, Md_{g'}) ),    (10)

which can be solved approximately by LBP. An additional technique that halves both the computation time and the space cost is coloring the blocks black and white so that adjacent blocks have different colors. The messages from black and white blocks are then updated alternately in consecutive iterations.

Denote the blockwise messages by MG. The block-based LBP is summarized as follows:

1) Divide the dataset into 2 × 2 × 2 blocks, and color the blocks black and white. Initialize all messages MG⁰_{gg'} = 0 for g ∼ g'.

2) Update MG^t_{gg'} iteratively from t = 1 to T as follows, performing the update only on black blocks when t is even and on white blocks when t is odd:

    MG^t_{gg'} = min_{Md_g} ( E¹_{gG}(Md_g) + E²_{gg'G}(Md_g, Md_{g'}) + Σ_{k ≠ g', k ∼ g} MG^{t−1}_{kg} ).
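Over a discrete set of candidate directions, the message update in step 2 is a standard min-sum computation: for each receiver state, minimize over the sender's states the unary energy plus the pairwise energy plus the other incoming messages. A small self-contained sketch (our own array layout; the normalization at the end is a common stability device, not part of the paper's formulation):

```python
import numpy as np

def update_message(E1_g, E2_gg, incoming):
    """Min-sum message from block g to neighbor g':
    MG_{gg'}(x') = min_x [ E1_g(x) + E2_gg[x, x'] + sum of messages into g
    from neighbors k != g' ].
    E1_g: (n,) unary energies over g's candidate directions;
    E2_gg: (n, n') pairwise energies; incoming: list of (n,) arrays."""
    total = E1_g + sum(incoming)                   # (n,)
    msg = np.min(total[:, None] + E2_gg, axis=0)   # minimize over sender states
    return msg - msg.min()                         # shift so min is 0
```

Since each message is only an n-vector over block-level direction candidates, the black/white alternation in step 2 means only half the messages change per iteration.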
3) Compute the optimal state Md*_g for each block g:

    Md*_g = arg min_{Md_g} { E¹_{gG}(Md_g) + Σ_{k ∼ g} MG^{current}_{kg} },

where MG^{current}_{kg} is the latest version of the message propagated on the edge between k and g.

4) For each voxel w ∈ g, set Md*_g as its initial primary direction, and choose the initial eigenratio closest to the raw tensor's eigenratio from {1/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8}. Start the multilevel MCMC described in Section 3.

5. RESULTS

Our first experiment is based on synthetic data. Figure 1 contains three images corresponding to the initial, noisy, and regularized helical tensor fields, whose anisotropic elements are colored in red. Figure 1(b) shows the tensor field after Gaussian noise has been added to both the primary direction and the smaller eigenvalue of the normalized cigar-type tensors; the covariances of the noise were set to 0.05 radian and 0.1, respectively. We further generated synthetic DTIs of this noise-corrupted tensor field for use in our Bayesian regularization model. Figure 1(c) demonstrates that our regularization method can remove more than 97% of the noise, and is thus very effective.

We have also applied our regularization algorithm to noisy real DTI data. The resolution of the DTI is 256 × 256 × 40, and the diffusion-encoding strength factor is b = 1000. In our regularization model, we set α = 3.0, c = 1, and K = 3 (see eq. (2)). To verify our method's performance, we compare the results of tractography of well-known fibers from the original noisy data and from the data processed with our regularization method. Figures 3(a) and (b) show the fiber tracts that pass through a region of interest (ROI) defined on the center sagittal slice of the corpus callosum.
In the regularized DTI data, the extracted fiber bundles correctly pass through the corpus callosum ROI laterally, making a U-shaped structure, and finally end at the cortex dorsoventrally along both sides of the hemispheric cleft, as shown in Figure 3(b). In the original noisy data, the same fiber tracking procedure largely fails to determine the correct tracts, as shown in Figure 3(a). Figures 3(c) and (d) show the cingulum fiber tracts extracted from the noisy and regularized DTI data, respectively. By defining ROIs, the cingulum fiber tracts can be cleanly extracted from the regularized data, as shown in Figure 3(d). In the original noisy data, however, there are many short spurious fibers along the entire tract, as shown in Figure 3(c). These comparisons clearly demonstrate the effectiveness of our regularization method.

Fig. 1. (a) A synthetic tensor field with anisotropic elements shown in red. (b) Noise-corrupted version of the tensor field in (a). (c) Regularized version of the tensor field in (b).

We further investigated the running time and convergence behavior of our regularization method on the aforementioned real DTI data. A comparison of the convergence behavior of conventional MCMC and our hierarchical MCMC (HMCMC) is shown in Figure 2. Their initial tensor fields are installed by the same LBP algorithm. We maintain six levels in our HMCMC, and the numbers of sweeps at the levels are set to 100, 150, 50, 50, 20, and 20, respectively. The curve for HMCMC shows that the objective function (minus log probability density) drops rapidly during the first few sweeps of each level, and it eventually converges to a better approximate solution than conventional MCMC.

Fig. 2. Convergence behavior of MCMC and hierarchical MCMC. Minus log probability density is shown as a function of running time on an Intel Pentium D 3.0 GHz processor.

Table 1. Performance comparison between LBP and pure MCMC on an Intel Pentium D 3.0 GHz processor.

    # LBP iterations     5   10   15   20
    LBP run time (s)     9   18   28   37
    # MCMC sweeps       20   35   41   43
    MCMC run time (s)   54   93  109  114
    Savings (s)         45   75   81   77

Table 1 justifies the use of LBP to generate an initial solution. It compares the performance of LBP and pure MCMC at the first level. In pure MCMC, we initialize the tensor at each voxel w with the one in the discrete state space that minimizes E¹_w. In this comparison, the number of iterations (sweeps) and the time needed to reach the same value of the objective function are given in the same column. We can see that running LBP for 15 iterations saves more than 80 seconds.

6. CONCLUSION

In this paper, we have presented a Bayesian regularization model for DTIs solved using MAP estimation. This model introduces a low-degree parameterization of diffusion tensors that makes MAP estimation less expensive. The hierarchical MCMC algorithm we presented is a much improved version of MCMC, able to converge to better approximate solutions with lower energies. We also use LBP on a coarse grid to install a good initial solution for hierarchical MCMC. Experiments demonstrated the effectiveness of our methods.

Acknowledgments

We would like to thank the reviewers for their valuable comments. This work was partially supported by the National Science Foundation (IIS 09-14631) and the National Natural Science Foundation of China (60728204/F020404).

Fig. 3. The results of fiber tracking on the original noisy DTI dataset and the regularized one. (a) The tracked fiber tracts originating from an ROI defined on the center image slice of the corpus callosum in the noisy DTI dataset; (b) the tracked fiber tracts from the same ROI as in (a) in the regularized data; (c) the result of tracking the cingulum fibers in the noisy data; and (d) the result of tracking the cingulum fiber tracts in the regularized data.

References

[1] A.W. Anderson. Theoretical analysis of the effects of noise on diffusion tensor imaging. Magnetic Resonance in Medicine, 46:1174–1188, 2001.
[2] P.F. Felzenszwalb and D.P. Huttenlocher. Efficient belief propagation for early vision. International Journal of Computer Vision, 70(1), 2006.
[3] J. Frandsen, A. Hobolth, L. Østergaard, P. Vestergaard-Poulsen, and E.B. Vedel Jensen. Bayesian regularization of diffusion tensor images. Biostatistics, 8(4):784–799, 2007.
[4] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:721–741, 1984.
[5] W.R. Gilks, S. Richardson, and D.J. Spiegelhalter. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC, 1996.
[6] J.S. Yedidia, W.T. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. Technical report, Mitsubishi Electric Research Laboratories, MERL-TR-2001-22, 2002.
[7] M. Martín-Fernández, C.-F. Westin, and C. Alberola-López. 3D Bayesian regularization of diffusion tensor MRI using multivariate Gaussian Markov random fields. In MICCAI, Lecture Notes in Computer Science, volume 3216, pages 351–359, 2004.
[8] G.J.M. Parker, J.A. Schnabel, M.R. Symms, D.J. Werring, and G.J. Barker. Nonlinear smoothing for reduction of systematic and random errors in diffusion tensor imaging. Journal of Magnetic Resonance Imaging, 11:702–710, 2000.
[9] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco, 1988.
[10] C. Poupon, C.A. Clark, V. Frouin, J. Regis, I. Bloch, D. Le Bihan, and J.-F. Mangin. Regularization of diffusion-based directional maps for the tracking of brain white matter fascicles. NeuroImage, 12:184–195, 2000.