A Multidimensional Similarity Measure for Bilateral Adaptive Filtering of fMRI Data
J. Rydell, H. Knutsson and M. Borga
Department of Biomedical Engineering, Linköping University, Sweden
Center for Medical Image Science and Visualization (CMIV), Linköping University, Sweden
Abstract
In analysis of fMRI data, it is common to average neighboring voxels in order to obtain robust estimates of the correlations between voxel timeseries and the model of the signal expected to be present in activated regions. We have previously proposed a method where only voxels with similar correlation coefficients are averaged. In this paper we extend this idea, and present a novel method for analysis of fMRI data. In the proposed method, only voxels with similar correlation coefficients and similar timeseries are averaged. The proposed method is compared to our previous method and to two well-known filtering strategies, and is shown to have superior ability to discriminate between active and inactive voxels.
1 Introduction
Analysis of functional MRI data deals with the problem of detecting very weak signals in very noisy data. The common solution to this problem is to average the time series from neighboring pixels or voxels, and thereby enhance the signal to noise ratio [3]. In practice, this is done by convolving the slices (or volumes) with a fixed lowpass filter kernel, e.g. a Gaussian. The price to pay for this kind of noise reduction is loss of spatial resolution. Loss of spatial resolution means that the shape of activated regions cannot be accurately determined and, perhaps worse, that small activated regions may remain undetected.

In order to maintain high spatial resolution, the spatial lowpass filtering can be made adaptive and non-isotropic. This means that for each voxel, the size and shape of the local region in which the averaging is performed is data dependent. A method for adaptive spatial filtering based on canonical correlation analysis (CCA) has previously been suggested [2]. That method chooses the size and shape of the local averaging region, i.e. the resulting adaptive filter, such that the correlation between the averaged time series and the blood oxygen level dependent (BOLD) response model is maximized. This makes the method very sensitive. Indeed, the method is so sensitive that restrictions have to be imposed on the number of parameters in the adaptive filter and their ranges in order to maintain a reasonable selectivity. If given too much freedom, the method may find false signals in the noise, since the filter is optimized for making the filter output as similar to the BOLD response model as possible. Another problem with this method is that when the filter is centered in a non-activated voxel but close to an activated region, the filter will try to "reach in" to the activated region in order to pick up as much activation as possible. This makes the regions labeled as active larger than they should be, i.e. a growing of activated regions will occur.

We have previously proposed an alternative method for adaptive filtering [6]. That method is based on averaging of voxels which have similar correlation with the BOLD model, and has the advantage that edges between active and inactive regions are preserved. We here present an extension of this filtering scheme, where voxels to be averaged are not only required to have similar correlation with the BOLD model, but should also have similar timeseries. We also show that this modification provides a significant improvement of the detection performance.
2 Theory
When ordinary lowpass filtering is used for noise reduction, voxels that are spatially close to each other are treated as samples from one distribution, and a weighted average of the voxels in a neighborhood is used as an estimate of the true signal value in the center of that region. The weights are predetermined and based on the distance from the center of the neighborhood. Close to edges in an image, the voxel values are actually samples from two or more distributions, and using predetermined weights for averaging causes blurring of the edges. Bilateral filtering [5, 7] extends lowpass filtering by also considering the distance between the value of a certain voxel and that of the center voxel, thereby creating a different filter kernel in each neighborhood. This approach causes voxels from the other side of an edge to be treated as outliers, and thus their effect on the estimate of the true signal value is reduced or eliminated. An example of using lowpass filtering and bilateral filtering, respectively, of a noisy one-dimensional signal is shown in figure 1. The signal is a step function with additive Gaussian noise, and it is obvious that lowpass filtering causes blurring of the edge while it is preserved by bilateral filtering.

Figure 1: Noisy data before and after lowpass and bilateral filtering. (a) Noisy data. (b) After lowpass filtering. (c) After bilateral filtering.

The bilateral filter kernel in each neighborhood can be expressed as a product of two filter kernels: the spatial filter
$F_s$ and the range filter $F_r$. The spatial filter is based on spatial distance, and corresponds to the filter kernel used in lowpass filtering, while the range filter is based on the difference in image intensity. That is, given an image $I(x, y)$, the bilateral filter kernel $F(\Delta x, \Delta y)$ at image coordinates $(x, y)$ can be written

$$F(\Delta x, \Delta y) = F_s(\Delta x, \Delta y) \cdot F_r(\Delta x, \Delta y) \tag{1}$$

where $F_s(\Delta x, \Delta y)$ is an ordinary spatial filter kernel $g(\Delta x, \Delta y)$ and the range filter is defined as

$$F_r(\Delta x, \Delta y) = h\big(I(x + \Delta x, y + \Delta y) - I(x, y)\big) \tag{2}$$

A common choice of the filter kernels $g$ and $h$ is Gaussian functions.
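As an illustration of equations 1 and 2, the following minimal Python sketch compares a fixed Gaussian lowpass filter with a bilateral filter on a noisy one-dimensional step, in the spirit of figure 1. It is not the implementation used in this paper: the function names, the window radius, the noise level and the values of `sigma_s` and `sigma_r` are assumptions chosen for the example, and both kernels are assumed to be Gaussian and normalized to unit sum.

```python
import numpy as np


def gaussian(d, sigma):
    """Unnormalized Gaussian weight for a distance d."""
    return np.exp(-d**2 / (2.0 * sigma**2))


def lowpass_1d(signal, radius, sigma_s):
    """Fixed Gaussian lowpass filter: weights depend on position only."""
    offsets = np.arange(-radius, radius + 1)
    padded = np.pad(signal, radius, mode="edge")
    weights = gaussian(offsets, sigma_s)
    out = np.empty(len(signal))
    for i in range(len(signal)):
        window = padded[i:i + 2 * radius + 1]
        out[i] = np.sum(weights * window) / np.sum(weights)
    return out


def bilateral_1d(signal, radius, sigma_s, sigma_r):
    """Bilateral filter: a new kernel F = F_s * F_r in every neighborhood."""
    offsets = np.arange(-radius, radius + 1)
    padded = np.pad(signal, radius, mode="edge")
    out = np.empty(len(signal))
    for i in range(len(signal)):
        window = padded[i:i + 2 * radius + 1]
        f_s = gaussian(offsets, sigma_s)             # spatial kernel g
        f_r = gaussian(window - signal[i], sigma_r)  # range kernel h
        f = f_s * f_r
        out[i] = np.sum(f * window) / np.sum(f)
    return out


# Step function with additive Gaussian noise (assumed noise level).
rng = np.random.default_rng(0)
step = np.concatenate([np.zeros(50), np.ones(50)])
noisy = step + 0.1 * rng.standard_normal(step.size)

smooth = lowpass_1d(noisy, radius=10, sigma_s=4.0)
edge_preserved = bilateral_1d(noisy, radius=10, sigma_s=4.0, sigma_r=0.3)
```

With these settings, samples on the other side of the step receive a very small range weight, so the jump between samples 49 and 50 should stay close to the original step height in `edge_preserved`, while it is spread over many samples in `smooth`.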
3 Method
Godtliebsen et al. [4] have proposed using bilateral filtering of the raw fMRI data, with a time dimension in addition to the spatial and range dimensions described above. Our previous method is similar to bilateral filtering, but instead of basing the range filter on differences in image intensity, we base it on the difference in correlation between individual voxel timeseries and the BOLD model. Furthermore, instead of using the correlation coefficients directly, we use a mapping of the correlation. The reason for using this mapping is that the correlation coefficients are not readily comparable on a linear scale. The mapping is defined as

$$\Lambda(x, y) = \log \frac{1}{1 - \rho(x, y)^2} \tag{3}$$

where $\rho(x, y)$ is the correlation between the timeseries at coordinates $(x, y)$ and the BOLD model. Under certain conditions this measure, which is the logarithm of Wilks' lambda, is equivalent to mutual information.

Here we propose an extension of the previous method, where we use two range filters. One of these ($F_{r_1}$) is identical to the range filter described above, while the other ($F_{r_2}$) is based on the similarity between the intensity timeseries themselves. That is, two spatially close voxels are averaged if their individual correlations with the BOLD model are similar and their timeseries resemble each other.

Often, the BOLD model used in fMRI data analysis is a linear subspace model, i.e. a model with two or more temporal basis signals. The correlation between a timeseries and the model is then defined as the highest correlation between the signal and any linear combination of the model basis signals. The model basis can, for example, be generated by performing principal component analysis of a large number of simulated BOLD responses, generated by Buxton's balloon model [1]. We propose that such a subspace model is used, and use the angle between the projections of two timeseries onto the model subspace as a measure of similarity between the two timeseries. (In experiments with more than one stimulus, a linear subspace model can instead be based on the expected responses from each of the stimuli.) By measuring the angle in the signal subspace, large random variations that are due to the high noise levels in the data are disregarded. If the timeseries were directly compared to each other, any similarity would remain undetected because of the noise.

Simply combining the spatial and range filters would, at each coordinate
$(x, y)$, yield a filter

$$F(\Delta x, \Delta y) = F_s(\Delta x, \Delta y) \cdot F_{r_1}(\Delta x, \Delta y) \cdot F_{r_2}(\Delta x, \Delta y) \tag{4}$$

which averages over voxels that are close to each other, where the correlation with the BOLD model is similar, and where the projection of the signal onto the BOLD model basis functions is similar. However, this is generalized slightly by introducing the parameters $\alpha$, $\beta$ and $\gamma$ as follows:

$$F(\Delta x, \Delta y) = F_s(\Delta x, \Delta y)^{\alpha} \cdot F_{r_1}(\Delta x, \Delta y)^{\beta} \cdot F_{r_2}(\Delta x, \Delta y)^{\gamma} \tag{5}$$
These parameters can be used to tune the relative importance of the different filters. The parameters can even be variable, to accommodate different weightings of the filter kernels in different neighborhoods. This makes the proposed method very general. We do, however, here propose specific choices of the parameters.

As was mentioned in the last section, it is common to choose $F_s$ and $F_r$ to be Gaussian functions. Accordingly, we suggest that all of $F_s$, $F_{r_1}$ and $F_{r_2}$ are selected as such. Thus,

$$F_s(\Delta x, \Delta y) = \exp\left(-\frac{d_s(\Delta x, \Delta y)^2}{2\sigma_s^2}\right) \tag{6}$$

$$F_{r_1}(\Delta x, \Delta y) = \exp\left(-\frac{d_{r_1}(\Delta x, \Delta y)^2}{2\sigma_{r_1}^2}\right) \tag{7}$$

$$F_{r_2}(\Delta x, \Delta y) = \exp\left(-\frac{d_{r_2}(\Delta x, \Delta y)^2}{2\sigma_{r_2}^2}\right) \tag{8}$$
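where the distance measures are defined as

$$d_s(\Delta x, \Delta y) = \sqrt{\Delta x^2 + \Delta y^2} \tag{9}$$

$$d_{r_1}(\Delta x, \Delta y) = \Lambda(x, y) - \Lambda(x + \Delta x, y + \Delta y) \tag{10}$$

$$d_{r_2}(\Delta x, \Delta y) = \arccos\big(\hat{w}(x, y) \cdot \hat{w}(x + \Delta x, y + \Delta y)\big) \tag{11}$$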
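The two range distances can be illustrated with a short Python sketch for a pair of timeseries. The sketch rests on several assumptions that are not stated in the text: the timeseries and basis signals are mean-centered, the correlation with the subspace model is computed as the cosine of the angle between the timeseries and its orthogonal projection onto the subspace, and the projection direction is represented by the unit-norm coordinates of that projection in an orthonormal basis obtained by QR decomposition. The function names and the sinusoidal toy basis are hypothetical and stand in for a real BOLD model basis.

```python
import numpy as np


def subspace_projection(ts, basis):
    """Correlation with a linear subspace model and projection direction.

    ts:    timeseries, shape (T,)
    basis: model basis signals as columns, shape (T, K)
    Returns (rho, w_hat): the highest correlation between ts and any linear
    combination of the basis signals, and the unit-norm coordinates of the
    projection of ts in an orthonormal basis of the model subspace.
    """
    ts = ts - ts.mean()
    basis = basis - basis.mean(axis=0)
    q, _ = np.linalg.qr(basis)        # orthonormal basis of the subspace
    coords = q.T @ ts                 # coordinates of the projection
    norm = np.linalg.norm(coords)     # assumed non-zero in this sketch
    rho = norm / np.linalg.norm(ts)   # cosine of angle to the subspace
    return rho, coords / norm


def mapped_correlation(rho):
    """Equation 3: Lambda = log(1 / (1 - rho^2))."""
    return np.log(1.0 / (1.0 - rho**2))


def d_r1(lam_a, lam_b):
    """Equation 10: difference between mapped correlations."""
    return lam_a - lam_b


def d_r2(w_a, w_b):
    """Equation 11: angle between the two projection directions."""
    return np.arccos(np.clip(np.dot(w_a, w_b), -1.0, 1.0))


# Toy example: two noisy timeseries with similar responses in the subspace.
rng = np.random.default_rng(1)
t = np.arange(100)
basis = np.column_stack([np.sin(2 * np.pi * t / 20),
                         np.cos(2 * np.pi * t / 20)])
ts_a = basis @ np.array([1.0, 0.2]) + 0.5 * rng.standard_normal(t.size)
ts_b = basis @ np.array([0.9, 0.3]) + 0.5 * rng.standard_normal(t.size)

rho_a, w_a = subspace_projection(ts_a, basis)
rho_b, w_b = subspace_projection(ts_b, basis)
dist_r1 = d_r1(mapped_correlation(rho_a), mapped_correlation(rho_b))
dist_r2 = d_r2(w_a, w_b)
```

Because the angle is measured between the projections rather than between the raw timeseries, the noise components orthogonal to the model subspace do not contribute to `dist_r2`.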
The different $\sigma$ values are the standard deviations of the respective Gaussian functions and $\hat{w}(x, y)$ is the projection direction in the subspace model for the timeseries at coordinates $(x, y)$.

The values of the exponents $\alpha$, $\beta$ and $\gamma$ should be in the range from 0 to 1, where 0 means that the filter has no effect and 1 means that the filter has full effect. This implies that setting $\alpha = \beta = 1$ and $\gamma = 0$ yields our previous method as a special case. We propose that these parameters are used as weights for the different filters according to the certainties of their respective distance measures. The exact spatial distance is always known, and thus its certainty $\alpha = 1$. There is no good certainty estimate for the correlation, and thus we also propose that $\beta$ is constant, for example $\beta = 1$. However, the certainty of the projection onto the subspace model is related to our estimate of the correlation. The higher the correlation estimate, the more certain the projection direction is. Thus we select

$$\gamma(\Delta x, \Delta y) = \sqrt[4]{\rho(x, y)\,\rho(x + \Delta x, y + \Delta y)} \tag{12}$$

i.e. the square root of the geometric mean of the correlations in the two pixels under consideration. The choice of the square root is not of crucial importance, but it appears to provide a better weighting than using the geometric mean directly. Then, in regions where the correlation is high, the filter based on timeseries similarity will be important, while in other regions it will have little or no effect. This is an advantage in both active and inactive regions. In inactive regions, the correlation is low and the similarity between the timeseries is random. By ignoring the second range filter ($F_{r_2}$) in these regions, the final filter will average over larger areas, thus reducing the probability of finding spurious correlations in the noise. In these regions, $F_{r_1}$ precludes filters that would pick up signal from activated voxels. In active regions, on the other hand, the correlations are higher and thus $F_{r_2}$ has effect. This decreases the risk of extending the effective filter beyond the active region.

An example of the different filter kernels is shown in figure 2. Figure 2a shows where activity has been embedded in the noise in an artificial data set. In figure 2b, the spatial filter $F_s$ is shown. Figures 2c and d show the range filters $F_{r_1}$ and $F_{r_2}$ when they are located in the dashed square in figure 2a. In this case, the center pixel is located in an activated region. It is clear that the two range filters complement each other, excluding pixels outside of the activated region from the averaging. In figure 2e the resulting filter obtained by combining $F_s$, $F_{r_1}$ and $F_{r_2}$ according to equation 5 is shown. The coefficients in this filter are used as weights for averaging the timeseries in the marked region. As can be seen in the figure, the filter has almost zero weight for inactive pixels but large weights for spatially close pixels with activation similar to that of the center pixel. If the center pixel had been located beside the active region, a filter with large weights for inactive pixels and small weights for active pixels would instead have been obtained. Such a resulting filter, where the center pixel is just outside of the active part of the marked region in figure 2a, is shown in figure 2f.

Figure 2: Example of filter kernels based on the different distance measures, and final filter combined using equation 5. The resulting filter in figure e is used for weighting the timeseries in the region surrounded by the dashed line in figure a. (a) Activated locations. (b) Spatial filter $F_s$. (c) Range filter $F_{r_1}$. (d) Range filter $F_{r_2}$. (e) Resulting filter $F$. (f) Resulting filter with inactive center pixel.

When the filters $F(\Delta x, \Delta y)$ have been created at each coordinate $(x, y)$, they are used to filter the raw data in each timepoint. After this, each timeseries in the resulting data is analyzed separately to detect activation. It is important to notice that this is different from calculating the correlation in each pixel and then performing bilateral filtering of the correlation map.
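The construction of the filter at one coordinate and its application to the raw data can be sketched in Python as follows. This is a hedged illustration and not the implementation evaluated in this paper. It assumes that the maps of correlations `rho`, mapped correlations `lam` and unit projection directions `w_hat` have already been computed, that $\alpha = \beta = 1$ and $\gamma$ follows equation 12, that pixels outside the image receive zero weight, and that each kernel is normalized to unit sum before averaging. All function names, array layouts and parameter values are hypothetical.

```python
import numpy as np


def adaptive_kernel(lam, rho, w_hat, x, y, radius, sigma_s, sigma_r1, sigma_r2):
    """Kernel F(dx, dy) at (x, y): equation 5 with alpha = beta = 1.

    lam:   mapped correlations, shape (H, W)
    rho:   correlations with the BOLD model, shape (H, W)
    w_hat: unit projection directions, shape (H, W, K)
    """
    height, width = lam.shape
    size = 2 * radius + 1
    kernel = np.zeros((size, size))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            if not (0 <= yy < height and 0 <= xx < width):
                continue  # assumed: no weight outside the image
            d_s_sq = dx**2 + dy**2                      # squared equation 9
            d_r1 = lam[y, x] - lam[yy, xx]              # equation 10
            cos_angle = np.clip(np.dot(w_hat[y, x], w_hat[yy, xx]), -1.0, 1.0)
            d_r2 = np.arccos(cos_angle)                 # equation 11
            f_s = np.exp(-d_s_sq / (2 * sigma_s**2))
            f_r1 = np.exp(-d_r1**2 / (2 * sigma_r1**2))
            f_r2 = np.exp(-d_r2**2 / (2 * sigma_r2**2))
            gamma = abs(rho[y, x] * rho[yy, xx]) ** 0.25  # equation 12
            kernel[dy + radius, dx + radius] = f_s * f_r1 * f_r2**gamma
    return kernel / kernel.sum()


def filter_data(data, lam, rho, w_hat, radius, sigma_s, sigma_r1, sigma_r2):
    """Filter every timepoint of data, shape (H, W, T), with per-pixel kernels."""
    height, width, _ = data.shape
    padded = np.pad(data, ((radius, radius), (radius, radius), (0, 0)))
    out = np.empty(data.shape, dtype=float)
    for y in range(height):
        for x in range(width):
            k = adaptive_kernel(lam, rho, w_hat, x, y, radius,
                                sigma_s, sigma_r1, sigma_r2)
            patch = padded[y:y + 2 * radius + 1, x:x + 2 * radius + 1, :]
            out[y, x] = np.tensordot(k, patch, axes=([0, 1], [0, 1]))
    return out


# Toy maps: the left half of an 8 x 8 slice is "active".
active = np.zeros((8, 8), dtype=bool)
active[:, :4] = True
rho = np.where(active, 0.9, 0.1)
lam = np.log(1.0 / (1.0 - rho**2))
w_hat = np.where(active[..., None], np.array([1.0, 0.0]), np.array([0.0, 1.0]))

k = adaptive_kernel(lam, rho, w_hat, x=3, y=4, radius=2,
                    sigma_s=1.5, sigma_r1=0.5, sigma_r2=0.5)
filtered = filter_data(np.ones((8, 8, 20)), lam, rho, w_hat, radius=2,
                       sigma_s=1.5, sigma_r1=0.5, sigma_r2=0.5)
```

In this toy example the center pixel lies in the active half next to the boundary, so the columns of `k` that cover inactive pixels should carry almost no weight, which mirrors the behavior described for figure 2e.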
4 Results and discussion
The proposed method has been evaluated on both real and synthetic data. Figures 3b–e show correlation maps generated by analyzing simulated data using fixed lowpass filtering, adaptive filtering using CCA, adaptive filtering using our previous method and adaptive filtering using the proposed method, respectively. The areas where BOLD-like signals were embedded in the noise are shown in figure 3a. The signal to noise ratio of the simulated data is approximately 5 – 10 %. Brighter regions in figure 3a have higher SNR. The noise is Gaussian, with spatial autocorrelation similar to that found in real fMRI data.

Figure 3: Locations with simulated activity and active regions detected using the different analysis methods. (a) Locations with simulated activation. (b) Fixed lowpass filtering. (c) Adaptive filtering based on CCA. (d) Our previous method. (e) The proposed method.

In figure 4, receiver operating characteristic (ROC) curves, showing the sensitivity (ability to correctly classify active voxels) versus the specificity (ability to correctly classify inactive voxels) of the different methods, are displayed.

It is evident from the ROC curves that the methods based on bilateral filtering have superior ability to discriminate between active and inactive voxels in the simulated data. This is also supported by the correlation maps in figures 3d–e, which show sharper edges between active and inactive regions than the correlation maps generated by the CCA method and the method based on a fixed filter. This edge-preserving property is clearly an advantage of these methods. While the visible difference between the correlation map from our previous method and that from the proposed method is not very large, the ROC curves clearly show that the proposed inclusion of a second range filter, based on timeseries similarity, provides a further enhancement of the detection performance. This is to be expected, since the new range filter reduces the risk of creating too large filters.

Figure 5 shows activation detected in real data from a finger tapping task, overlaid on an anatomical image of the brain. The activation in the motor cortex is consistent with the task.
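For simulated data, where the locations of the embedded activity are known, points on an ROC curve can be computed by thresholding the correlation map at a range of levels. The Python sketch below shows one way to do this. It is an illustration only: the function name, the toy score map and the choice of thresholds are assumptions, and it is not the evaluation code used to produce figure 4.

```python
import numpy as np


def roc_points(scores, truth, thresholds):
    """Sensitivity and 1 - specificity for each detection threshold.

    scores: per-voxel correlation values (any shape)
    truth:  boolean array of the same shape, True where activity was embedded
    """
    scores = np.ravel(scores)
    truth = np.ravel(truth).astype(bool)
    sensitivity = []
    false_positive_rate = []
    for threshold in thresholds:
        detected = scores >= threshold
        true_pos = np.sum(detected & truth)
        false_pos = np.sum(detected & ~truth)
        sensitivity.append(true_pos / truth.sum())
        false_positive_rate.append(false_pos / (~truth).sum())
    return np.array(sensitivity), np.array(false_positive_rate)


# Toy score map: higher values inside an assumed active square.
rng = np.random.default_rng(2)
truth = np.zeros((20, 20), dtype=bool)
truth[5:10, 5:10] = True
scores = 0.5 * truth + 0.2 * rng.standard_normal(truth.shape)

thresholds = np.linspace(scores.min(), scores.max() + 1e-9, 50)
sens, fpr = roc_points(scores, truth, thresholds)
```

Plotting `sens` against `fpr` with a logarithmic horizontal axis gives a curve of the same kind as those in figure 4, where 1 − specificity is the fraction of inactive voxels that are wrongly detected.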
5 Conclusion
A new method for adaptive filtering of fMRI data has been presented and evaluated. The method, which is based on bilateral filtering, extends our previous fMRI analysis scheme. Experimental results have shown that the new method provides improved activation detection performance.

Figure 4: ROC curves for the different analysis methods. The proposed method provides the best detection performance. [Plot of sensitivity (0.3 to 1) versus 1 − specificity (logarithmic axis, $10^{-4}$ to $10^{0}$) for four methods: Adaptive (proposed), Adaptive (previous bilateral), Adaptive (CCA) and Fixed low-pass filter.]

Figure 5: Activation detected using the proposed method on real data from a finger tapping experiment. As expected, the detected activation resides in the motor cortex.
References
[1] R.B. Buxton, E.C. Wong, and L.R. Frank. Dynamics of blood flow and oxygenation changes during brain activation: the Balloon model. Magnetic Resonance in Medicine, 39(6):855–864, 1998.
[2] O. Friman, M. Borga, P. Lundberg, and H. Knutsson. Adaptive analysis of fMRI data. NeuroImage, 19(3):837–845, 2003.
[3] K.J. Friston, P. Jezzard, and R. Turner. Analysis of functional MRI timeseries. Human Brain Mapping, 1:153–171, 1994.
[4] F. Godtliebsen, C.K. Chu, S.H. Sørbye, and G. Torheim. An estimator for functional data with application to MRI. IEEE Transactions on Medical Imaging, 20(1):36–44, 2001.
[5] F. Godtliebsen, E. Spjøtvoll, and J.S. Marron. A nonlinear Gaussian filter applied to images with discontinuities. Nonparametric Statistics, 8:21–43, 1997.
[6] J. Rydell, H. Knutsson, and M. Borga. Correlation controlled adaptive filtering for fMRI data analysis. In Proceedings of the 13th Nordic-Baltic conference on biomedical engineering and medical physics (NBC'05), Umeå, Sweden, June 2005. NBC.
[7] C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In IEEE International Conference on Computer Vision 98, pages 839–846, Bombay, India, January 1998. IEEE.