A REAL TIME IMPLEMENTATION AND AN EVALUATION OF AN OPTIMAL FILTERING TECHNIQUE FOR NOISE REDUCTION IN DUAL MICROPHONE HEARING AIDS

Jean Baptiste Maj¹,², Liesbeth Royackers¹ and Jan Wouters¹

¹ Lab. Exp. ORL, KULeuven, Kapucijnenvoer 33, 3000 Leuven
² ESAT-SISTA, KULeuven, Kasteelpark Arenberg 10, 3001 Leuven
Jean-Baptiste.Maj@uz.kuleuven.ac.be
ABSTRACT
A real time implementation and an evaluation of a Singular Value Decomposition (SVD) based optimal filtering technique [1] for noise reduction in a dual microphone BTE hearing aid is presented. A method to improve the performance of a Voice Activity Detector (VAD) is described and evaluated physically. This method is used in the real time implementation of the optimal filtering technique. A perceptual evaluation by normal hearing subjects is carried out for single and multiple jammer sound sources with speech weighted noise. The SVD-based technique can perform as well as an adaptive beamformer [2] strategy in a single noise scenario (i.e. the ideal scenario for the latter technique), and can outperform the beamformer technique in a multiple noise sources scenario.¹

1. INTRODUCTION

Noise reduction strategies are important in hearing aid devices to improve speech intelligibility in a noisy background [3]. Modern digital hearing aids using dual-microphone configurations in a single behind-the-ear (BTE) hearing aid allow more complex noise reduction algorithms to be processed. Recently, adaptive noise reduction algorithms have been developed and implemented in hearing aids. These algorithms can adapt to changing jammer sound directions and can track moving noise sources. In this study, an adaptive procedure using an SVD-based optimal filtering technique is evaluated perceptually. This strategy was assessed theoretically and physically in previous studies [1, 4, 5]. The optimal filtering strategy works without assumptions about the desired target direction; however, it needs a robust VAD. In this paper, the SVD-based optimal filtering technique is presented and the real time implementation is described. Furthermore, a method to improve the performance of the VAD is introduced, and a physical evaluation is used to assess this method. Finally, a perceptual evaluation with subjects is carried out by measuring the SNR-improvements of the SVD-based technique and comparing these to the results obtained with an adaptive beamformer technique [2].

¹ The authors would like to acknowledge Marc Moonen² for his scientific contribution. They consider him to be a co-author of this paper; however, he cannot be listed due to conference restrictions. This study is supported by the Fund for Scientific Research - Flanders (Belgium) through the FWO projects 3.0168.95 ("Signal processing for improved speech intelligibility of hearing impaired"), G.0233.01 ("Signal processing and automatic patient fitting for advanced auditory prostheses"), and Cochlear (IWT project 20540), and was partially funded by the Belgian State, Prime Minister's Office - Federal Office for Scientific, Technical and Cultural Affairs - IUAP P4-02 (Modeling, Identification, Simulation and Control of Complex Systems) and the Concerted Research Action GOA-MEFISTO-666 (Mathematical Engineering for Information and Communication Systems Technology) of the Flemish Government. The scientific responsibility is assumed by its authors.
2. SVD-BASED OPTIMAL FILTERING TECHNIQUE
The SVD-based optimal filtering technique considered here reconstructs, in general, a speech signal $s_k$ from noisy data $u_k = s_k + n_k$ by means of an optimal filter $W_{WF} \in \mathbb{R}^{N \times N}$ using $\hat{s}_k = W_{WF}^T u_k$ at time $k$. Using a Minimum Mean Square Error (MMSE) criterion, the optimal filter $W_{WF}$ is equal to:

$$ W_{WF} = E\{u_k \cdot u_k^T\}^{-1} \cdot \left( E\{u_k \cdot u_k^T\} - E\{n_k \cdot n_k^T\} \right) \qquad (1) $$

Doclo and Moonen [1] use an interesting and useful simplification of formula (1), where $W_{WF}$ is derived from the GSVD (generalized singular value decomposition) of the data matrices $U_k \in \mathbb{R}^{p \times N}$ and $N_k \in \mathbb{R}^{q \times N}$ (with $p$ and $q$ typically larger than $N$), such that $E\{u_k \cdot u_k^T\} \Rightarrow (U_k^T \cdot U_k)/p$ and $E\{n_k \cdot n_k^T\} \Rightarrow (N_k^T \cdot N_k)/q$.
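As an illustration of formula (1), the following minimal NumPy sketch estimates $W_{WF}$ from two data matrices, replacing the expectations by the sample averages $(U_k^T \cdot U_k)/p$ and $(N_k^T \cdot N_k)/q$. The function name `wiener_filter` and the synthetic toy signals are assumptions made for this example; they are not taken from the implementation described in this paper.

```python
import numpy as np

def wiener_filter(U, Nmat):
    """Estimate W_WF of formula (1) from data matrices.

    U    : (p, N) array, rows are noisy vectors u_k^T (speech-and-noise periods)
    Nmat : (q, N) array, rows are noise vectors n_k^T (noise periods)
    """
    p, q = U.shape[0], Nmat.shape[0]
    Ruu = U.T @ U / p          # sample estimate of E{u_k u_k^T}
    Rnn = Nmat.T @ Nmat / q    # sample estimate of E{n_k n_k^T}
    # Solve Ruu W = (Ruu - Rnn) instead of forming the inverse explicitly.
    return np.linalg.solve(Ruu, Ruu - Rnn)

# Hypothetical toy data: a component correlated across taps plus white noise.
rng = np.random.default_rng(0)
N, p, q = 4, 2000, 2000
speech = np.cumsum(rng.standard_normal((p, N)), axis=1)
U = speech + rng.standard_normal((p, N))
Nmat = rng.standard_normal((q, N))

W = wiener_filter(U, Nmat)
s_hat = U @ W                  # row k holds W^T u_k, written as u_k^T W
```

Solving the linear system rather than inverting the correlation matrix is a numerical choice of this sketch, not something prescribed by formula (1).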
$u_k$ is collected during speech-and-noise periods, while $n_k$ is collected during noise periods. The GSVD of the matrices $U_k$ and $N_k$ is defined as

$$ U_k = Y \cdot \mathrm{diag}\{\sigma_i\} \cdot X^T, \qquad N_k = V \cdot \mathrm{diag}\{\eta_i\} \cdot X^T \qquad (2) $$

where $Y \in \mathbb{R}^{p \times N}$ and $V \in \mathbb{R}^{q \times N}$ are orthogonal matrices, $X \in \mathbb{R}^{N \times N}$ is an invertible matrix and $\sigma_i / \eta_i$ are the generalized singular values. By substituting the above formulas in (1), we obtain:

$$ W_{WF} = X^{-T} \cdot \mathrm{diag}\left\{ 1 - \frac{p}{q} \frac{\eta_i^2}{\sigma_i^2} \right\} \cdot X^T \qquad (3) $$

By using a time constrained estimator, the energy of the signal distortion $\epsilon_s^2$ is minimized under the constraint that the residual noise energy $\epsilon_n^2$ stays under a threshold $\alpha$ [1].
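Formula (3) can be checked numerically against formula (1). NumPy does not provide a GSVD routine, so the sketch below swaps in a joint diagonalization of $U_k^T \cdot U_k$ and $N_k^T \cdot N_k$ (Cholesky whitening followed by a symmetric eigendecomposition). This gives an invertible $X$ with $U_k^T \cdot U_k = X \cdot \mathrm{diag}\{\sigma_i^2\} \cdot X^T$ and $N_k^T \cdot N_k = X \cdot \mathrm{diag}\{\eta_i^2\} \cdot X^T$, which is all that formula (3) uses. The normalization $\sigma_i = 1$, the function names and the random test data are assumptions of this example.

```python
import numpy as np

def joint_diagonalize(U, Nmat):
    """Return X, sigma2, eta2 with U^T U = X diag(sigma2) X^T and
    Nmat^T Nmat = X diag(eta2) X^T (a stand-in for the GSVD of formula (2))."""
    A = U.T @ U
    B = Nmat.T @ Nmat
    L = np.linalg.cholesky(A)          # A = L L^T
    Linv = np.linalg.inv(L)
    C = Linv @ B @ Linv.T              # whitened noise matrix (symmetric)
    eta2, Q = np.linalg.eigh(C)        # C = Q diag(eta2) Q^T
    X = L @ Q
    sigma2 = np.ones_like(eta2)        # normalization chosen for this sketch
    return X, sigma2, eta2

def gsvd_wiener_filter(U, Nmat):
    """Formula (3): W = X^{-T} diag(1 - (p/q) eta_i^2 / sigma_i^2) X^T."""
    p, q = U.shape[0], Nmat.shape[0]
    X, sigma2, eta2 = joint_diagonalize(U, Nmat)
    gains = 1.0 - (p / q) * eta2 / sigma2
    # X^{-T} M is obtained by solving X^T Z = M.
    return np.linalg.solve(X.T, gains[:, None] * X.T)

# Hypothetical random data with p != q.
rng = np.random.default_rng(0)
U = rng.standard_normal((300, 5)) @ rng.standard_normal((5, 5))
Nmat = rng.standard_normal((200, 5))
W = gsvd_wiener_filter(U, Nmat)
```

Only the ratios $\eta_i^2 / \sigma_i^2$ enter formula (3), so the particular scaling of $\sigma_i$ and $\eta_i$ does not affect the resulting filter.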
$$ \min_{W_{WF}} \epsilon_s^2 \quad \text{subject to} \quad \epsilon_n^2 \le \alpha, \quad \text{where } 0 \le \alpha \le 1 \qquad (4) $$

Thus, the filter $W_{WF}$ becomes:

$$ W_{WF} = X^{-T} \cdot \mathrm{diag}\left\{ \frac{q \cdot \sigma_i^2 - p \cdot \eta_i^2}{q \cdot \sigma_i^2 + (\mu - 1) \, p \cdot \eta_i^2} \right\} \cdot X^T \qquad (5) $$

The speech distortion parameter $\mu \in [0, \infty]$ allows a trade-off between signal distortion and noise reduction. If $\mu = 1$, the original MMSE solution is obtained. More emphasis is put on the signal distortion when $\mu < 1$, at the expense of decreasing the noise reduction performance. The residual noise level is reduced when $\mu > 1$, at the expense of increasing speech distortion. With $\mu \to \infty$, all the emphasis is put on the noise reduction, without taking the signal distortion into account.

Fig. 1. Representation of the SVD-based optimal filtering technique. (Block diagram: the front and rear microphone signals pass through the filters $w_{SVD_1}$ and $w_{SVD_2}$ and are summed to give the output.)

0-7803-8484-9/04/$20.00 ©2004 IEEE, ICASSP 2004

In a two microphone application, the vector $u_k \in \mathbb{R}^{MN}$ (here $M = 2$) takes the form:

$$ u_k = \begin{bmatrix} u_{1k} \\ u_{2k} \end{bmatrix} \qquad (6) $$

with

$$ u_{jk} = \begin{bmatrix} u_j(k) & u_j(k-1) & \dots & u_j(k-N+1) \end{bmatrix}^T \qquad (7) $$

where $j$ refers to the $j$-th microphone. The vector $n_k$ is similarly defined. The computation of the optimal filter $W_{WF}$ results in a $(2 \times N)$-taps estimator $w_{WF}$ for the signal $\tilde{s}_k$.

$$ \tilde{s}_k = \begin{bmatrix} \tilde{s}(k) \\ \tilde{s}(k+1) \\ \vdots \\ \tilde{s}(k+p-1) \end{bmatrix} = U_k \cdot w_{WF} \qquad (8) $$

where $\tilde{s}_k$ is an estimate for the (delayed version of the) speech part of either the front microphone or the rear microphone, depending on the choice for $w_{WF}$, which is one column of $W_{WF}$. Maj et al. [5] showed that a good estimate of $\tilde{s}_k$ is obtained by using the middle column of $W_{WF}$ in the front microphone part. This filter $w_{WF}$ (see figure 1) is a two-channel filter, where each microphone is filtered with an $N$-taps filter $w_{SVD_j}$. In our experiments $N$ will be 15.

$$ w_{WF} = \begin{bmatrix} w_{SVD_1} \\ w_{SVD_2} \end{bmatrix} \qquad (9) $$
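The two-microphone construction of formulas (6) to (9) can be sketched as follows. The example builds the stacked data matrix, computes the filter of formula (5) in its equivalent covariance form $(R_{uu} + (\mu - 1) R_{nn})^{-1} (R_{uu} - R_{nn})$, with $R_{uu} = (U_k^T \cdot U_k)/p$ and $R_{nn} = (N_k^T \cdot N_k)/q$, rather than through a GSVD, then selects the middle column of the front-microphone part and applies the two $N$-taps filters. Treating microphone 1 as the front microphone, the function names and the synthetic signals are assumptions of this example.

```python
import numpy as np

def stacked_data_matrix(front, rear, N):
    """Rows are u_k^T = [u_1(k)..u_1(k-N+1), u_2(k)..u_2(k-N+1)] (formulas 6-7)."""
    def delay_lines(x):
        # Row i holds x[k], x[k-1], ..., x[k-N+1] with k = i + N - 1.
        return np.lib.stride_tricks.sliding_window_view(x, N)[:, ::-1]
    return np.hstack([delay_lines(front), delay_lines(rear)])

def sdw_filter(U, Nmat, mu=1.0):
    """Covariance form of formula (5); mu = 1 gives formula (1)."""
    Ruu = U.T @ U / U.shape[0]
    Rnn = Nmat.T @ Nmat / Nmat.shape[0]
    return np.linalg.solve(Ruu + (mu - 1.0) * Rnn, Ruu - Rnn)

def two_channel_filter(W, N):
    """Middle column of the front-microphone part, split as in formula (9)."""
    w = W[:, N // 2]
    return w[:N], w[N:]

# Hypothetical signals: a sinusoidal "speech" source plus white sensor noise.
rng = np.random.default_rng(1)
N, T = 15, 4000
speech = np.sin(2 * np.pi * 0.01 * np.arange(T))
front = speech + 0.5 * rng.standard_normal(T)
rear = np.roll(speech, 1) + 0.5 * rng.standard_normal(T)

U = stacked_data_matrix(front, rear, N)                      # speech-and-noise
Nmat = stacked_data_matrix(0.5 * rng.standard_normal(T),
                           0.5 * rng.standard_normal(T), N)  # noise only

W = sdw_filter(U, Nmat, mu=1.75)
w1, w2 = two_channel_filter(W, N)
# Formula (8): U_k w_WF is the sum of the two filtered microphone signals.
out = np.convolve(front, w1, mode="valid") + np.convolve(rear, w2, mode="valid")
```

Because the selected column corresponds to the middle tap of the front-microphone delay line, `out` estimates the speech part of the front microphone delayed by `N // 2` samples, which matches the "delayed version" mentioned above.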
3. REAL TIME IMPLEMENTATION
The real time implementation of the SVD-based technique is illustrated in figure 2.

Fig. 2. Real time implementation of the SVD-based optimal filtering technique. (Flow chart: Step 1: VAD; Step 2: Gradient G; Step 3: GSVD update; Step 4: Computation of the filter $w_{WF}$.)

Four steps are necessary to compute the filter coefficients in real time:

• Step 1: The VAD discriminates the speech-and-noise periods from the noise periods of the noisy speech signals. The VAD used in this study is based on the log-energy of the signal [2]. The log-energy of the signal is computed with an overlap method on 128 samples. The decision of the VAD is taken from the computation of two thresholds, namely Tspeech and Tnoise, which are computed from the statistics of the signal (the mean and the variance). The function Signal equals the log-energy when the energy of the signal increases, and drops with an exponential curve when the energy drops. A function Offset preserves VAD=1 during a number of samples when a noise period is detected. In this way, a speech-and-noise period is still identified when there is a silence in a word or a sentence. With these different thresholds, the VAD works as follows:
- if Signal > Tspeech, a speech-and-noise period is detected, VAD=1.
- if Tnoise > Signal and Offset=1, a noise period is detected but VAD=1.
- if Tnoise > Signal and Offset=0, a noise period is detected, VAD=0.
• Step 2: Classification errors between the speech-and-noise periods and the noise periods occur with the VAD. If speech-and-noise periods are wrongly classified as noise periods, speech-and-noise vectors are added to the noise matrix ($N_k$). In this case, the factor $F = 1 - \eta_i^2 / \sigma_i^2$ of the filter $W_{WF}$ tends to be small ($\sigma_i^2 \to \eta_i^2$), resulting in signal cancellation. Since $F$ varies in time, the gradient $G$ of this factor can be measured during the processing:

$$ G = \frac{\delta \left( \frac{1}{N} \sum_{i=1}^{N} \left( 1 - \eta_i^2 / \sigma_i^2 \right) \right)}{\delta t} \qquad (10) $$

If the gradient $G$ is below a given threshold $\beta$, this means that speech-and-noise periods are being classified as noise periods by the VAD. Then, a correction is made to the VAD and the decision made in Step 1 is modified. Otherwise, when $G > \beta$, the decision made in Step 1 is kept valid.
• Step 3: A recursive technique is used to approximate the SVD-based optimal filtering technique. This technique is based on a Jacobi-type GSVD-updating algorithm [6]. Recursive GSVD-updating algorithms use the decomposition of the GSVD at time $k-1$ to compute the decomposition at time $k$. Equation (2) at time $k-1$ can be rewritten as:

$$ U_{k-1} = Y_{k-1} \cdot R_{U,k-1} \cdot X_{k-1}^T, \qquad N_{k-1} = V_{k-1} \cdot R_{N,k-1} \cdot X_{k-1}^T \qquad (11) $$

where $R_{U,k-1} \in \mathbb{R}^{N \times N}$ and $R_{N,k-1} \in \mathbb{R}^{N \times N}$ are upper triangular matrices having parallel rows and $X_{k-1} \in \mathbb{R}^{N \times N}$ is an orthogonal matrix. For the computation, only $R_{U,k-1}$, $R_{N,k-1}$ and $X_{k-1}$ are stored. When a new data vector $u_k$ (speech-and-noise) or $n_k$ (noise) is present at time $k$, the GSVD of $U_k$ and $N_k$ needs to be recomputed, with the new vector appended as a row:

$$ U_k = \begin{bmatrix} \lambda_s \cdot U_{k-1} \\ u_k^T \end{bmatrix} \quad \text{or} \quad N_k = \begin{bmatrix} \lambda_n \cdot N_{k-1} \\ n_k^T \end{bmatrix} \qquad (12) $$

where $\lambda_s$ and $\lambda_n$ are exponential weighting factors for the speech and noise matrices, respectively. For details on the updating scheme, the reader is referred to [6].
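• Step 4: This step consists of computing the optimal filter $w_{WF,k}$ after the update of the recursive GSVD-updating algorithm. Substituting formulae (11) into (1), the equation can be rewritten as:

$$ W_{WF,k} = X_k \cdot R_{U,k}^{-1} \cdot \mathrm{diag}\left\{ \frac{(1 - \lambda_s^2) \cdot (R_{U,k}^{ii})^2 - (1 - \lambda_n^2) \cdot (R_{N,k}^{ii})^2}{(1 - \lambda_s^2) \cdot (R_{U,k}^{ii})^2 + (\mu - 1) \cdot (1 - \lambda_n^2) \cdot (R_{N,k}^{ii})^2} \right\} \cdot R_{U,k} \cdot X_k^T \qquad (13) $$

The factor $p/q$ of formula (5) is replaced by $(1 - \lambda_n^2)/(1 - \lambda_s^2)$, since an exponentially weighted data matrix with weighting factor $\lambda$ has an effective length of $1/(1 - \lambda^2)$ rows. Only one column (the $i$-th column, $w_{WF,k}^i$, of $W_{WF,k}$) is computed, as the solution of the linear set by a back-substitution method. In our experiments, the speech distortion parameter $\mu$ is set to 1.75.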
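Steps 1 and 2 can be sketched as follows. The thresholds are passed in as arguments because their exact computation from the signal statistics is not given here. The frame hop, the decay constant, the hangover length, the behaviour between the two thresholds, and the rule that a corrected decision becomes VAD=1 are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def log_energy(x, frame=128, hop=64):
    """Log-energy over overlapping frames of 128 samples (hop is an assumption)."""
    starts = range(0, len(x) - frame + 1, hop)
    return np.array([np.log10(np.sum(x[s:s + frame] ** 2) + 1e-12) for s in starts])

def vad(energy, t_speech, t_noise, decay=0.9, hangover=5):
    """Step 1 decision logic with illustrative decay and hangover values."""
    decisions = np.zeros(len(energy), dtype=int)
    signal = energy[0]
    offset = 0     # remaining hangover frames; "Offset=1" means offset > 0
    state = 0
    for i, e in enumerate(energy):
        # Signal follows the log-energy upwards and decays exponentially downwards.
        signal = e if e > signal else decay * signal + (1 - decay) * e
        if signal > t_speech:
            state, offset = 1, hangover      # speech-and-noise period, VAD=1
        elif signal < t_noise:
            if offset > 0:
                offset -= 1                  # noise period detected, but VAD=1
                state = 1
            else:
                state = 0                    # noise period detected, VAD=0
        # Between the thresholds the previous decision is kept (assumption).
        decisions[i] = state
    return decisions

def mean_factor(sigma2, eta2):
    """(1/N) * sum_i (1 - eta_i^2 / sigma_i^2), the quantity differentiated in (10)."""
    return np.mean(1.0 - np.asarray(eta2) / np.asarray(sigma2))

def corrected_decision(vad_decision, f_now, f_prev, dt, beta):
    """Step 2: a finite-difference gradient G below beta overrides a noise decision."""
    g = (f_now - f_prev) / dt
    if vad_decision == 0 and g < beta:
        return 1
    return vad_decision

# Hypothetical log-energy track: silence, a burst of speech, silence again.
energy = np.array([-3.0] * 10 + [0.0] * 10 + [-3.0] * 20)
decisions = vad(energy, t_speech=-1.0, t_noise=-2.0, decay=0.5, hangover=3)
```

In this toy track the decision stays at 1 for a few frames after the burst ends, which is the purpose of the Offset function: short silences inside a word or sentence are not handed to the noise matrix.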
4. METHODS

4.1. Hearing aids

The hearing aid was a prototype based on a Cochlear Nucleus behind-the-ear headset housing. One hardware directional microphone (Microtronic 6001), as front microphone, and one omnidirectional microphone (Knowles FG-3452), as rear microphone, were mounted in an endfire array configuration. The hardware directional microphone had a cardioid spatial characteristic (null at 180°) in anechoic conditions. The distance between the front entry port and the back entry port of the hardware directional microphone was 1 cm. The distance between the front entry port of the hardware directional microphone and the omnidirectional microphone was 2.5 cm.
4.2. Physical evaluation
In general, several signals are available to the VAD, such as the signal of the omnidirectional microphone, the signal of the directional microphone or even the output of the noise reduction technique. In this study, the behaviour of the VAD is evaluated when the VAD is connected to these different signals. When the VAD algorithm is connected to the omnidirectional microphone or the directional microphone, the signals are directly available. When the VAD is connected to the output of the strategy, the signals are only available after a first update of the adaptive filters. The SVD-based technique needs at least a noise period and a speech-and-noise period. To solve this problem of initialization, the VAD is first connected to the directional microphone and, when several samples are classified as speech-and-noise periods or noise periods, the optimal filters are updated. Only then is the VAD algorithm connected to the output of the SVD-based strategy. The performance of the VAD is evaluated by calculating the percentage of samples correctly detected by the VAD algorithm for speech-and-noise periods and noise periods of the signals. The percentage (Per) is calculated as:

$$ Per = \frac{SN_{RealTime} \times 100}{SN_{Perfect}}, \qquad Per = \frac{N_{RealTime} \times 100}{N_{Perfect}} \qquad (14) $$

where $N_{Perfect}$ and $SN_{Perfect}$ are the numbers of samples which are known to be classified as noise periods ($N$) or speech-and-noise periods ($SN$) by the 'perfect' VAD. $N_{RealTime}$ and $SN_{RealTime}$ are the numbers of samples which are correctly classified as noise periods or speech-and-noise periods by the real time VAD. The speech signal (0°) and the noise signal (90°) are recorded with the hearing aid positioned on a dummy head. The signals are recorded during 90 seconds. In the calculation, the first 20 seconds of the signals are not taken into account. This is the time needed for the noise reduction algorithm to converge.
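Formula (14) amounts to two per-class hit rates. A minimal sketch is given below; the function name `vad_scores`, the `skip` argument and the toy label sequences are assumptions of this example, with 1 denoting a speech-and-noise sample and 0 a noise sample.

```python
import numpy as np

def vad_scores(reference, decisions, skip=0):
    """Return (Per for speech-and-noise periods, Per for noise periods), formula (14).

    reference : labels of the 'perfect' VAD (1 = speech-and-noise, 0 = noise)
    decisions : labels of the real time VAD
    skip      : number of initial samples left out (convergence time)
    """
    reference = np.asarray(reference, dtype=bool)[skip:]
    decisions = np.asarray(decisions, dtype=bool)[skip:]
    sn_perfect = np.count_nonzero(reference)
    n_perfect = np.count_nonzero(~reference)
    sn_realtime = np.count_nonzero(reference & decisions)    # correct SN samples
    n_realtime = np.count_nonzero(~reference & ~decisions)   # correct N samples
    return 100.0 * sn_realtime / sn_perfect, 100.0 * n_realtime / n_perfect

# Hypothetical labels for ten samples.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
decisions = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
per_sn, per_n = vad_scores(reference, decisions)
```

For a recording sampled at a rate `fs` (not specified here), leaving out the first 20 seconds corresponds to `skip = 20 * fs`.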
4.3. Perceptual evaluation
The perceptual evaluation was performed with ten normal hearing listeners by measuring the Speech Reception Threshold (SRT) of sentences in a stationary speech weighted noise, with an adaptive procedure [7]. The tests of the omnidirectional microphone and the adaptive beamformer [2] were carried out in two different noise scenarios in a moderately reverberant room ($T_{60} = 0.76$ s). In the first, the speech source was at an angle of 0° (in front of the mannequin) and the noise source at 90°; in the second, the speech source was at 45° and three independent noise sources were at 90°/180°/270°. The distance between the loudspeakers and the center of the mannequin was 1 meter. The SVD-based technique was compared to an adaptive beamformer technique, which was known to give significant improvements in speech intelligibility [2].
5. RESULTS

5.1. Physical evaluation

Figure 3 shows the percentage (Per) of samples correctly detected by the VAD algorithm for speech-and-noise periods and noise periods in a stationary speech weighted noise. The VAD algorithm correctly detected the noise-only periods when it was connected to the omnidirectional microphone, the directional microphone or the output of the noise reduction strategy (Per > 90). The detection performance for the speech-and-noise periods was clearly a function of the signal to which the VAD was connected. The performance of the VAD dropped significantly when it was linked to the omnidirectional or directional microphone for an SNR below 5 dB. When the VAD used the output signal of the SVD-based technique, the percentage of well-detected samples stayed above 90% for an SNR above -5 dB. At an SNR of -10 dB, the scores were about 90% with the optimal filtering technique. Connecting the VAD to the output of the noise reduction algorithm gave the best performance. In this study, the VAD was connected to the output of the noise reduction strategy for the real time implementation.
5.2. Perceptual evaluation
Figure 4 shows the SRT-improvements (in dB) of the two noise reduction algorithms (SVD-based optimal filtering technique versus adaptive beamformer [2]) relative to the omnidirectional microphone, for both jammer sound scenarios. To compare the performance of the noise reduction techniques with each other, a statistical analysis (a paired comparison) was performed for the two noise scenarios. In the single jammer sound scenario, important SRT-improvements were obtained: 15.8 dB and 15.1 dB for the adaptive beamformer and the optimal filtering technique, respectively. There were no significant differences between the two strategies (p=0.103). This means that the SVD-based technique can perform as well as the adaptive beamformer when the noise scenario is optimal for the latter technique. Indeed, the desired target (speech at 0°) was in the look direction of the beamformer (angle 0°). In the multiple noise scenario, the SVD-based technique was significantly better than the adaptive beamformer when a stationary speech weighted noise was present (p=0.005). SRT-improvements of 7.5 dB and 9.0 dB were obtained with the adaptive beamformer and the optimal filtering technique, respectively. The difference between the two strategies (1.5 dB) is important for hearing-aid users. In critical listening conditions (close to 50% of speech understood by the listener), an improvement of 1 dB in SNR corresponds to an increase of speech understanding of about 15 per cent in everyday speech communication [3].

Fig. 3. Performance of the VAD when it is connected to the omnidirectional microphone, the directional microphone, or the output of the SVD-based technique. (Plot: percentage correctly detected (Per) versus SNR (dB) of the input signal (omnidirectional microphone); curves for noise periods and for speech-and-noise periods with the VAD connected to the omni. mic., the dir. mic. and the SVD output.)

Fig. 4. SRT-improvements (in dB) of the SVD-based optimal filtering technique (SVD) and the adaptive beamformer (Beam) relative to the omnidirectional microphone for both jammer sound scenarios. (Bar chart: SNR-improvement (dB) for the "1 Noise" and "3 Noises" scenarios.)

On the one hand, the SVD-based optimal filtering technique works without assumptions about the desired target direction; however, this strategy needs a robust VAD. On the other hand, the adaptive beamformer works with assumptions about the desired target direction and the characteristics of the microphones. When these assumptions are violated, the speech signal leaks into the noise reference. If the VAD then misclassifies the speech-and-noise periods, the adaptive filter takes into account the statistics of the desired signal, which leads to target cancellation.

In the multiple jammer sound scenario, the noise reduction strategies did not achieve the same performance as in the single jammer sound scenario. The SRT-improvements decreased by about 6 to 8 dB (from 15.1 dB to 9.0 dB for the optimal filtering technique and from 15.8 dB to 7.5 dB for the adaptive beamformer). Theoretically, a signal processing strategy comprising N microphones can potentially separate up to N statistically independent sources. More specifically, a configuration with two microphones is optimal for the cancellation of one jammer sound. The directional microphone is important in adverse listening conditions. In a diffuse listening environment (the jammer sources are not located in well defined directions), the adaptive effect of the noise reduction strategies falls back to the effect of the directional microphone [2].

SVD-based procedures are known to have a high computational complexity, but recent studies showed that the complexity problem can be controlled, making this approach attractive for practical systems. Recently, an LMS approach was found to have approximately the same cost of calculation as the adaptive beamformer [8].
6. CONCLUSIONS
A real time implementation and an evaluation of a Singular Value Decomposition (SVD) based optimal filtering technique for noise reduction in a dual microphone BTE hearing aid is presented. Connecting the VAD to the output of the noise reduction algorithm gives a good performance for discriminating the speech-and-noise periods from the noise periods. Perceptual measurements showed that the optimal filtering technique is more robust than the adaptive beamformer in a multiple noise source scenario and can perform as well as the latter technique in a single jammer sound scene.
7. REFERENCES
[1] S Doclo and M Moonen, "GSVD-based optimal filtering for single and multiple speech enhancement," IEEE Transactions on Signal Processing, vol. 50, no. 9, pp. 2230–2244, 2002.
[2] J B Maj, J Wouters, and M Moonen, "Noise reduction results of an adaptive filtering technique for dual-microphone behind-the-ear hearing aids," Ear and Hearing, in revision, 2003.
[3] R Plomp, "Noise, amplification, and compression: considerations of three main issues in hearing aid design," Ear and Hearing, vol. 15, no. 1, pp. 2–12, 1994.
[4] J B Maj, M Moonen, and J Wouters, "Theoretical analysis of adaptive noise reduction algorithms for hearing aids," European Signal Processing Conference (EUSIPCO), Toulouse, France, September 3-6, 2002.
[5] J B Maj, J Wouters, and M Moonen, "SVD-based optimal filtering technique for noise reduction in hearing aids using two microphones," Journal on Applied Signal Processing, vol. 4, pp. 432–443, 2002.
[6] M Moonen, P Van Dooren, and J Vandewalle, "A singular value decomposition updating algorithm for subspace tracking," SIAM Journal of Matrix Anal. Application, vol. 13, no. 4, pp. 1015–1038, 1992.
[7] N Versfeld, L Daalder, J M Festen, and T Houtgast, "Extension of sentence materials for the measurement of the speech reception threshold," Journal of the Acoustical Society of America, vol. 107, no. 3, pp. 1671–1684, 2000.
[8] A Spriet, M Moonen, and J Wouters, "Spatially preprocessed, speech distortion weighted multi-channel Wiener filtering for noise reduction," submitted, 2003.