A real time implementation and an evaluation of an optimal filtering technique for noise reduction in dual microphone hearing aids

Jean Baptiste Maj (1, 2), Liesbeth Royackers (1) and Jan Wouters (1)
(1) Lab. Exp. ORL, KULeuven, Kapucijnenvoer 33, 3000 Leuven
(2) ESAT-SISTA, KULeuven, Kasteelpark Arenberg 10, 3001 Leuven

ABSTRACT

A real time implementation and an evaluation of a Singular Value Decomposition (SVD) based optimal filtering technique [1] for noise reduction in a dual microphone BTE hearing aid is presented. A method to improve the performance of a Voice Activity Detector (VAD) is described and evaluated physically. This method is used in the real time implementation of the optimal filtering technique. A perceptual evaluation by normal hearing subjects is carried out for single and multiple jammer sound sources with speech weighted noise. The SVD-based technique can perform as well as an adaptive beamformer [2] strategy in a single noise scenario (i.e. the ideal scenario for the latter technique), and can outperform the beamformer technique in a multiple noise sources scenario.¹

1. INTRODUCTION

Noise reduction strategies are important in hearing aid devices to improve speech intelligibility in a noisy background [3]. Modern digital hearing aids using dual-microphone configurations in a single behind-the-ear (BTE) hearing aid make it possible to run more complex noise reduction algorithms. Recently, adaptive noise reduction algorithms have been developed and implemented in hearing aids. These algorithms can adapt to changing jammer sound directions and can track moving noise sources. In this study, an adaptive procedure using an SVD-based optimal filtering technique is evaluated perceptually. This strategy was assessed theoretically and physically in previous studies [1, 4, 5]. The optimal filtering strategy works without assumptions about the desired target direction; however, it needs a robust VAD.
In this paper, the SVD-based optimal filtering technique is presented and the real time implementation is described. Furthermore, a method to improve the performance of the VAD is introduced, and a physical evaluation is used to assess it. Finally, a perceptual evaluation with subjects is carried out by measuring the SNR-improvements of the SVD-based technique, and comparing these to the results obtained with an adaptive beamformer technique [2].

¹ The authors would like to acknowledge Marc Moonen (2) for his scientific contribution. They consider him to be a co-author of this paper; however, he cannot be listed due to conference restrictions. This study is supported by the Fund for Scientific Research - Flanders (Belgium) through the FWO projects 3.0168.95 ("Signal processing for improved speech intelligibility of hearing impaired"), G.0233.01 ("Signal processing and automatic patient fitting for advanced auditory prostheses"), and Cochlear (IWT project 20540), and was partially funded by the Belgian State, Prime Minister's Office - Federal Office for Scientific, Technical and Cultural Affairs - IUAP P4-02 (Modeling, Identification, Simulation and Control of Complex Systems) and the Concerted Research Action GOA-MEFISTO-666 (Mathematical Engineering for Information and Communication Systems Technology) of the Flemish Government. The scientific responsibility is assumed by its authors.

0-7803-8484-9/04/$20.00 ©2004 IEEE, ICASSP 2004

2. SVD-BASED OPTIMAL FILTERING TECHNIQUE

The SVD-based optimal filtering technique considered here reconstructs, in general, a speech signal $\mathbf{s}_k$ from noisy data $\mathbf{u}_k = \mathbf{s}_k + \mathbf{n}_k$ by means of an optimal filter $\mathbf{W}_{WF} \in \mathbb{R}^{N \times N}$, using $\hat{\mathbf{s}}_k = \mathbf{W}_{WF}^T \mathbf{u}_k$ at time $k$. Using a Minimum Mean Square Error (MMSE) criterion, the optimal filter $\mathbf{W}_{WF}$ is equal to:

$$\mathbf{W}_{WF} = E\{\mathbf{u}_k \mathbf{u}_k^T\}^{-1} \left( E\{\mathbf{u}_k \mathbf{u}_k^T\} - E\{\mathbf{n}_k \mathbf{n}_k^T\} \right) \tag{1}$$

Doclo and Moonen [1] use a useful simplification of formula (1), where $\mathbf{W}_{WF}$ is derived from the GSVD (generalized singular value decomposition) of the data matrices $\mathbf{U}_k \in \mathbb{R}^{p \times N}$ and $\mathbf{N}_k \in \mathbb{R}^{q \times N}$ (with $p$ and $q$ typically larger than $N$), such that $E\{\mathbf{u}_k \mathbf{u}_k^T\} \Rightarrow (\mathbf{U}_k^T \mathbf{U}_k)/p$ and $E\{\mathbf{n}_k \mathbf{n}_k^T\} \Rightarrow (\mathbf{N}_k^T \mathbf{N}_k)/q$. $\mathbf{u}_k$ is collected during speech-and-noise periods, while $\mathbf{n}_k$ is collected during noise periods. The GSVD of the matrices $\mathbf{U}_k$ and $\mathbf{N}_k$ is defined as

$$\begin{cases} \mathbf{U}_k = \mathbf{Y} \, \mathrm{diag}\{\sigma_i\} \, \mathbf{X}^T \\ \mathbf{N}_k = \mathbf{V} \, \mathrm{diag}\{\eta_i\} \, \mathbf{X}^T \end{cases} \tag{2}$$

where $\mathbf{Y} \in \mathbb{R}^{p \times N}$ and $\mathbf{V} \in \mathbb{R}^{q \times N}$ are orthogonal matrices, $\mathbf{X} \in \mathbb{R}^{N \times N}$ is an invertible matrix and $\sigma_i / \eta_i$ are the generalized singular values. By substituting the above formulas in (1), we obtain:

$$\mathbf{W}_{WF} = \mathbf{X}^{-T} \, \mathrm{diag}\left\{ 1 - \frac{p}{q} \frac{\eta_i^2}{\sigma_i^2} \right\} \mathbf{X}^T \tag{3}$$

By using a time constrained estimator, the energy of the signal distortion $\varepsilon_s^2$ is minimized under the constraint that the residual noise energy $\varepsilon_n^2$ stays under a threshold $\alpha$ [1]:

$$\min_{\mathbf{W}_{WF}} \varepsilon_s^2 \quad \text{subject to} \quad \varepsilon_n^2 \le \alpha, \quad \text{where } 0 \le \alpha \le 1 \tag{4}$$

Thus, the filter $\mathbf{W}_{WF}$ becomes:

$$\mathbf{W}_{WF} = \mathbf{X}^{-T} \, \mathrm{diag}\left\{ \frac{q \sigma_i^2 - p \eta_i^2}{q \sigma_i^2 + (\mu - 1) \, p \eta_i^2} \right\} \mathbf{X}^T \tag{5}$$

The speech distortion parameter $\mu \in [0, \infty)$ allows a trade-off between signal distortion and noise reduction. If $\mu = 1$, the original MMSE solution is obtained. More emphasis is put on the signal distortion when $\mu < 1$, at the expense of decreasing the noise reduction performance. The residual noise level is reduced when $\mu > 1$, at the expense of increasing speech distortion. With $\mu \to \infty$, all the emphasis is put on the noise reduction, without taking the signal distortion into account.

Fig. 1. Representation of the SVD-based optimal filtering technique. (The front and rear microphone signals are filtered by $\mathbf{w}_1^{SVD}$ and $\mathbf{w}_2^{SVD}$, respectively, and summed to form the output.)
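As an illustration of equations (1) and (5), the following is a minimal batch sketch in Python with NumPy, not the recursive GSVD implementation described in this paper. It relies on the fact that substituting (2) into $(\mathbf{R}_{uu} + (\mu - 1)\mathbf{R}_{nn})^{-1}(\mathbf{R}_{uu} - \mathbf{R}_{nn})$, with $\mathbf{R}_{uu} = \mathbf{U}_k^T \mathbf{U}_k / p$ and $\mathbf{R}_{nn} = \mathbf{N}_k^T \mathbf{N}_k / q$, gives the diagonal factor of (5). The function names (`sdw_wiener_filter`, `delay_rows`, `residual_noise`), the synthetic single-channel signals and the filter length of 8 are assumptions made for the example only.

```python
import numpy as np


def sdw_wiener_filter(speech_noise_mat, noise_mat, mu=1.0):
    """Filter matrix of equation (5), built from covariance estimates.

    speech_noise_mat: (p, n) rows collected in speech-and-noise periods (U_k).
    noise_mat:        (q, n) rows collected in noise periods (N_k).
    mu:               speech distortion parameter; mu = 1 gives equation (1).
    """
    p = speech_noise_mat.shape[0]
    q = noise_mat.shape[0]
    r_uu = speech_noise_mat.T @ speech_noise_mat / p  # estimate of E{u u^T}
    r_nn = noise_mat.T @ noise_mat / q                # estimate of E{n n^T}
    # Solve (R_uu + (mu - 1) R_nn) W = R_uu - R_nn instead of forming an inverse.
    return np.linalg.solve(r_uu + (mu - 1.0) * r_nn, r_uu - r_nn)


def delay_rows(x, n):
    """Stack rows [x(k), x(k-1), ..., x(k-n+1)] for successive k."""
    return np.array([x[k - n + 1:k + 1][::-1] for k in range(n - 1, len(x))])


def residual_noise(w_mat, noise_mat, col):
    """Mean energy of the noise-only data after filtering with one column."""
    return float(np.mean((noise_mat @ w_mat[:, col]) ** 2))


# Synthetic, single-channel illustration (hypothetical data, not hearing aid signals).
rng = np.random.default_rng(0)
n_taps, p, q = 8, 400, 300
speech = np.convolve(rng.standard_normal(p + n_taps), np.ones(4) / 2.0, mode="same")
noisy = speech + 0.5 * rng.standard_normal(p + n_taps)
noise_only = 0.5 * rng.standard_normal(q + n_taps)

U_mat = delay_rows(noisy, n_taps)
N_mat = delay_rows(noise_only, n_taps)

W_mmse = sdw_wiener_filter(U_mat, N_mat, mu=1.0)   # equation (1)
W_sdw = sdw_wiener_filter(U_mat, N_mat, mu=1.75)   # equation (5), mu > 1

mid = n_taps // 2  # one column of W is used as the actual filter
print("residual noise, mu = 1.00:", residual_noise(W_mmse, N_mat, mid))
print("residual noise, mu = 1.75:", residual_noise(W_sdw, N_mat, mid))
```

Raising `mu` above 1 cannot increase the residual noise energy of a given column, which is the trade-off described for $\mu > 1$. A real-time system would update the decomposition recursively rather than re-solving the full system for every new sample.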
In a two-microphone application, the vector $\mathbf{u}_k \in \mathbb{R}^{MN}$ (with $M = 2$ microphones) takes the form:

$$\mathbf{u}_k = \begin{bmatrix} \mathbf{u}_k^1 \\ \mathbf{u}_k^2 \end{bmatrix} \tag{6}$$

with

$$\mathbf{u}_k^j = \begin{bmatrix} u_j(k) & u_j(k-1) & \dots & u_j(k-N+1) \end{bmatrix}^T \tag{7}$$

where $j$ refers to the $j$-th microphone. The vector $\mathbf{n}_k$ is similarly defined. The computation of the optimal filter $\mathbf{W}_{WF}$ results in a $(2 \times N)$-taps estimator $\mathbf{w}_{WF}$ for the signal $\tilde{\mathbf{s}}_k$:

$$\tilde{\mathbf{s}}_k = \begin{bmatrix} \tilde{s}(k) & \tilde{s}(k+1) & \dots & \tilde{s}(k+p-1) \end{bmatrix}^T = \mathbf{U}_k \, \mathbf{w}_{WF} \tag{8}$$

where $\tilde{\mathbf{s}}_k$ is an estimate for the (delayed version of the) speech part of either the front microphone or the rear microphone, depending on the choice for $\mathbf{w}_{WF}$, which is one column of $\mathbf{W}_{WF}$. Maj et al. [5] showed that a good estimate of $\tilde{\mathbf{s}}_k$ is obtained by using the middle column of $\mathbf{W}_{WF}$ in the front microphone part. This filter $\mathbf{w}_{WF}$ (see figure 1) is a two-channel filter, where each microphone is filtered with an $N$-taps filter $\mathbf{w}_j^{SVD}$. In our experiments $N$ is 15.

$$\mathbf{w}_{WF} = \begin{bmatrix} \mathbf{w}_1^{SVD} \\ \mathbf{w}_2^{SVD} \end{bmatrix} \tag{9}$$

3. REAL TIME IMPLEMENTATION

The real time implementation of the SVD-based technique is illustrated in figure 2.

Fig. 2. Real time implementation of the SVD-based optimal filtering technique. (Step 1: VAD; Step 2: Gradient G; Step 3: GSVD update; Step 4: Computation of the filter $\mathbf{w}_{WF}$.)

Four steps are necessary to compute the filter coefficients in real time:

• Step 1: The VAD discriminates the speech-and-noise periods from the noise periods of the noisy speech signals. The VAD used in this study is based on the log-energy of the signal [2]. The log-energy of the signal is computed with an overlap method on 128 samples. The decision of the VAD is taken from the computation of two thresholds, namely Tspeech and Tnoise. Tspeech and Tnoise are computed from the statistics of the signal (the mean and the variance). The function Signal equals the log-energy when the energy of the signal increases, and drops with an exponential curve when the energy drops.
A function Offset preserves VAD = 1 during a number of samples when a noise period is detected. In this way, a speech-and-noise period is still identified when there is a silence in a word or a sentence. With these different thresholds, the VAD works as follows:
- if Signal > Tspeech, a speech-and-noise period is detected, VAD = 1.
- if Tnoise > Signal and Offset = 1, a noise period is detected but VAD = 1.
- if Tnoise > Signal and Offset = 0, a noise period is detected, VAD = 0.

• Step 2: Classification errors between the speech-and-noise periods and the noise periods occur with the VAD. If speech-and-noise periods are wrongly classified as noise periods, speech-and-noise vectors are added to the noise matrix ($\mathbf{N}_k$). In this case, the factor $F = 1 - \eta_i^2 / \sigma_i^2$ of the filter $\mathbf{W}_{WF}$ tends to be small ($\sigma_i^2 \to \eta_i^2$), resulting in signal cancellation. Since $F$ varies in time, the gradient $G$ of this factor can be measured during the processing:

$$G = \frac{\delta \left( \frac{1}{N} \sum_{i=1}^{N} \left( 1 - \eta_i^2 / \sigma_i^2 \right) \right)}{\delta t} \tag{10}$$

If the gradient $G$ is below a given threshold $\beta$, this indicates that the VAD is classifying speech-and-noise periods as noise periods. Then, a correction is made to the VAD and the decision made in Step 1 is modified. Otherwise, when $G > \beta$, the decision made in Step 1 is kept valid.

• Step 3: A recursive technique is used to approximate the SVD-based optimal filtering technique. This technique is based on a Jacobi-type GSVD-updating algorithm [6]. Recursive GSVD-updating algorithms use the decomposition of the GSVD at time $k-1$ to compute the decomposition at time $k$. Equation (2) at time $k-1$ can be rewritten as:

$$\begin{cases} \mathbf{U}_{k-1} = \mathbf{Y}_{k-1} \cdot \mathbf{R}_{U,k-1} \cdot \mathbf{X}_{k-1}^T \\ \mathbf{N}_{k-1} = \mathbf{V}_{k-1} \cdot \mathbf{R}_{N,k-1} \cdot \mathbf{X}_{k-1}^T \end{cases} \tag{11}$$

where $\mathbf{R}_{U,k-1} \in \mathbb{R}^{N \times N}$ and $\mathbf{R}_{N,k-1} \in \mathbb{R}^{N \times N}$ are upper triangular matrices having parallel rows and $\mathbf{X}_{k-1} \in \mathbb{R}^{N \times N}$ is an orthogonal matrix.
For the computation, only $\mathbf{R}_{U,k-1}$, $\mathbf{R}_{N,k-1}$ and $\mathbf{X}_{k-1}$ are stored. When a new data vector $\mathbf{u}_k$ (speech-and-noise) or $\mathbf{n}_k$ (noise) is present at time $k$, the GSVD of $\mathbf{U}_k$ and $\mathbf{N}_k$ needs to be recomputed as

$$\mathbf{U}_k = \begin{bmatrix} \lambda_s \cdot \mathbf{U}_{k-1} \\ \mathbf{u}_k^T \end{bmatrix} \quad \text{or} \quad \mathbf{N}_k = \begin{bmatrix} \lambda_n \cdot \mathbf{N}_{k-1} \\ \mathbf{n}_k^T \end{bmatrix} \tag{12}$$

where $\lambda_s$ and $\lambda_n$ are exponential weighting factors for the speech and noise matrices, respectively. For details on the updating scheme, the reader is referred to [6].

• Step 4: This step consists of computing the optimal filter $\mathbf{w}_{WF,k}$ after the update of the recursive GSVD-updating algorithm. Substituting formulae (11) into (1), the equation can be rewritten as:

$$\mathbf{W}_{WF,k} = \mathbf{X}_k \cdot \mathbf{R}_{U,k}^{-1} \cdot \mathrm{diag}\left\{ \frac{(1 - \lambda_n^2) \, (R_{U,k}^{ii})^2 - (1 - \lambda_s^2) \, (R_{N,k}^{ii})^2}{(1 - \lambda_n^2) \, (R_{U,k}^{ii})^2 + (\mu - 1) \, (1 - \lambda_s^2) \, (R_{N,k}^{ii})^2} \right\} \cdot \mathbf{R}_{U,k} \cdot \mathbf{X}_k^T \tag{13}$$

The factor $p/q$ is replaced by $(1 - \lambda_n^2) / (1 - \lambda_s^2)$. Only one column (the $i$-th column, $\mathbf{w}_{WF,k}^i$, of $\mathbf{W}_{WF,k}$) is computed, as the solution of the linear set by a back-substitution method. In our experiments, the speech distortion parameter $\mu$ is set to 1.75.

4. METHODS

4.1. Hearing aids

The hearing aid was a prototype based on a Cochlear Nucleus behind-the-ear headset housing. One hardware directional microphone (Microtronic 6001), as front microphone, and one omnidirectional microphone (Knowles FG-3452), as rear microphone, were mounted in an endfire array configuration. The hardware directional microphone had a cardioid spatial characteristic (null at 180°) in anechoic conditions. The distance between the front entry port and the back entry port of the hardware directional microphone was 1 cm. The distance between the front entry port of the hardware directional microphone and the omnidirectional microphone was 2.5 cm.

4.2. Physical evaluation

In general, several signals are available to the VAD, such as the signal of the omnidirectional microphone, the directional microphone or even the output of the noise reduction technique.
In this study, the behaviour of the VAD is evaluated when the VAD is connected to these different signals. When the VAD algorithm is connected to the omnidirectional microphone or the directional microphone, the signals are directly available. When the VAD is connected to the output of the strategy, the signals are only available after a first update of the adaptive filters. The SVD-based technique needs at least a noise period and a speech-and-noise period. To solve this initialization problem, the VAD is first connected to the directional microphone, and once several samples have been classified as speech-and-noise periods or noise periods, the optimal filters are updated. Only then is the VAD algorithm connected to the output of the SVD-based strategy. The performance of the VAD is evaluated by calculating the percentage of samples correctly detected by the VAD algorithm for speech-and-noise periods and noise periods of the signals. The percentage (Per) is calculated as:

$$Per = \frac{SN_{RealTime} \times 100}{SN_{Perfect}} \qquad Per = \frac{N_{RealTime} \times 100}{N_{Perfect}} \tag{14}$$

where $N_{Perfect}$ and $SN_{Perfect}$ are the numbers of samples which are known to be classified as noise periods ($N$) or speech-and-noise periods ($SN$) by the 'perfect' VAD. $N_{RealTime}$ and $SN_{RealTime}$ are the numbers of samples which are correctly classified as noise periods or speech-and-noise periods by the real time VAD. The speech signal (0°) and the noise signal (90°) are recorded with the hearing aid positioned on a dummy head. The signals are recorded during 90 seconds. In the calculation, the first 20 seconds of the signals are not taken into account. This is the time needed for the noise reduction algorithm to converge.

4.3. Perceptual evaluation

The perceptual evaluation was performed with ten normal hearing listeners by measuring the Speech Reception Threshold (SRT) of sentences in a stationary speech weighted noise, with an adaptive procedure [7].
The tests of the omnidirectional microphone and the adaptive beamformer [2] were carried out in two different noise scenarios in a moderately reverberant room ($T_{60} = 0.76$ s). In the first, the speech source was at an angle of 0° (in front of the mannequin) and the noise source at 90°. In the second, the speech source was at 45° and three independent noise sources were at 90°/180°/270°. The distance between the loudspeakers and the center of the mannequin was 1 meter. The SVD-based technique was compared to an adaptive beamformer technique, which was known to give significant improvements in speech intelligibility [2].

5. RESULTS

5.1. Physical evaluation

Figure 3 shows the percentage (Per) of samples correctly detected by the VAD algorithm for speech-and-noise periods and noise periods in a stationary speech weighted noise. The VAD algorithm correctly detected the noise-only periods when it was connected to the omnidirectional microphone, the directional microphone or the output of the noise reduction strategy (Per > 90 %). The detection performance for the speech-and-noise periods was clearly a function of the signal to which the VAD was connected. The performance of the VAD dropped significantly when it was linked to the omnidirectional or directional microphone for an SNR below 5 dB. When the VAD used the output signal of the SVD-based technique, the percentage of well-detected samples stayed above 90 % for an SNR above -5 dB. At an SNR of -10 dB, the scores were about 90 % with the optimal filtering technique. Connecting the VAD to the output of the noise reduction algorithm gave the best performance. In this study, the VAD was connected to the output of the noise reduction strategy for the real time implementation.

Fig. 3. Performance of the VAD when it is connected to the omnidirectional microphone, the directional microphone, or the output of the SVD-based technique. (Horizontal axis: SNR (dB) of the input signal (omnidirectional microphone); vertical axis: percentage correctly detected (Per); separate curves for noise periods and speech-and-noise periods for each of the three VAD connections.)

5.2. Perceptual evaluation

Fig. 4. SRT-improvements (in dB) of the SVD-based optimal filtering technique (SVD) and the adaptive beamformer (Beam) relative to the omnidirectional microphone for both jammer sound scenarios. (Horizontal axis: 1 Noise, 3 Noises; vertical axis: SNR-improvement (dB).)

Figure 4 shows the SRT-improvements (in dB) of the two noise reduction algorithms (SVD-based optimal filtering technique versus adaptive beamformer [2]) relative to the omnidirectional microphone, for both jammer sound scenarios. To compare the performance of the two noise reduction techniques, a statistical analysis (a paired comparison) was performed for the two noise scenarios. In the single jammer sound scenario, large SRT-improvements were obtained: 15.8 dB and 15.1 dB for the adaptive beamformer and the optimal filtering technique, respectively. There were no significant differences between the two strategies (p=0.103). This means that the SVD-based technique can perform as well as the adaptive beamformer when the noise scenario is optimal for the latter technique. Indeed, the desired target (speech at 0°) was in the look direction of the beamformer (angle 0°).
In the multiple noise scenario, the SVD-based technique was significantly better than the adaptive beamformer when a stationary speech weighted noise was present (p=0.005). SRT-improvements of 7.5 dB and 9.0 dB were obtained with the adaptive beamformer and the optimal filtering technique, respectively. The difference between the two strategies (1.5 dB) is important for hearing-aid users. In critical listening conditions (close to 50 % of speech understood by the listener), an improvement of 1 dB in SNR corresponds to an increase in speech understanding of about 15 per cent in everyday speech communication [3].

On the one hand, the SVD-based optimal filtering technique works without assumptions about the desired target direction; however, this strategy needs a robust VAD. On the other hand, the adaptive beamformer works with assumptions about the desired target direction and the characteristics of the microphones. When these assumptions are violated, the speech signal leaks into the noise reference. If the VAD then misclassifies the speech-and-noise periods, the adaptive filter takes the statistics of the desired signal into account, and target cancellation follows.

In the multiple jammer sound scenario, the noise reduction strategies did not achieve the same performance as in the single jammer sound scenario. The SRT-improvements decreased by about 8 dB. Theoretically, a signal processing strategy comprising N microphones can potentially separate up to N statistically independent sources. More specifically, a configuration with two microphones is optimal for the cancellation of one jammer sound. The directional microphone is important in adverse listening conditions.
In a diffuse listening environment (the jammer sources are not located in well defined directions), the adaptive effect of the noise reduction strategies falls back to the effect of the directional microphone [2].

SVD-based procedures are known to have a high computational complexity, but recent studies showed that the complexity problem can be controlled, making this approach attractive for practical systems. Recently, an LMS approach was found to have approximately the same computational cost as the adaptive beamformer [8].

6. CONCLUSIONS

A real time implementation and an evaluation of a Singular Value Decomposition (SVD) based optimal filtering technique for noise reduction in a dual microphone BTE hearing aid is presented. Connecting the VAD to the output of the noise reduction algorithm gives good performance for discriminating the speech-and-noise periods from the noise periods. Perceptual measurements showed that the optimal filtering technique is more robust than the adaptive beamformer in a multiple noise source scenario and can perform as well as the latter technique in a single jammer sound scene.

7. REFERENCES

[1] S Doclo and M Moonen, "GSVD-based optimal filtering for single and multiple speech enhancement," IEEE Transactions on Signal Processing, vol. 50, no. 9, pp. 2230–2244, 2002.
[2] J B Maj, J Wouters, and M Moonen, "Noise reduction results of an adaptive filtering technique for dual-microphone behind-the-ear hearing aids," Ear and Hearing, in revision, 2003.
[3] R Plomp, "Noise, amplification, and compression: considerations of three main issues in hearing aid design," Ear and Hearing, vol. 15, no. 1, pp. 2–12, 1994.
[4] J B Maj, M Moonen, and J Wouters, "Theoretical analysis of adaptive noise reduction algorithms for hearing aids," European Signal Processing Conference (EUSIPCO), Toulouse, France, September 3-6, 2002.
[5] J B Maj, J Wouters, and M Moonen, "SVD-based optimal filtering technique for noise reduction in hearing aids using two microphones," Journal on Applied Signal Processing, vol. 4, pp. 432–443, 2002.
[6] M Moonen, P Van Dooren, and J Vandewalle, "A singular value decomposition updating algorithm for subspace tracking," SIAM Journal of Matrix Anal. Application, vol. 13, no. 4, pp. 1015–1038, 1992.
[7] N Versfeld, L Daalder, J M Festen, and T Houtgast, "Extension of sentence materials for the measurement of the speech reception threshold," Journal of the Acoustical Society of America, vol. 107, no. 3, pp. 1671–1684, 2000.
[8] A Spriet, M Moonen, and J Wouters, "Spatially preprocessed, speech distortion weighted multi-channel Wiener filtering for noise reduction," submitted, 2003.