BLOCK-BASED MULTICHANNEL TRANSFORM-DOMAIN ADAPTIVE FILTERING
Sascha Spors, Herbert Buchner, and Karim Helwani
Deutsche Telekom Laboratories, Technische Universität Berlin, Ernst-Reuter-Platz 7, 10587 Berlin, Germany. Email: Sascha.Spors@telekom.de
ABSTRACT
Multichannel adaptive filtering is subject to specific problems emerging from spatio-temporal couplings in the input signals of the adaptive filter. Transform-domain adaptive filtering (TDAF) decouples the input signal of the adaptive filter by a suitably chosen transformation. In a previous paper, the authors have introduced a two-stage approach to multichannel TDAF. However, the approach presented there is based on a sample-by-sample update of the filter coefficients. In this paper we present a more practical block-based formulation of multichannel TDAF that is constructed from a combination of frequency-domain adaptive filtering for temporal decoupling and a unitary transform for spatial decoupling.
1. INTRODUCTION
Telecommunication systems with more than one acoustic transmission channel are being developed and increasingly used. These systems aim at providing additional spatial auditory cues to the listener in contrast to the single channel systems frequently used in the last decades. The spatial cues increase the naturalness of the communication and can facilitate, for instance, the recognition of speakers in a dialogue by their spatial position.

Acoustic echo cancelation (AEC) is required for full-duplex communication in a hands-free communication scenario. The application of AEC to such a scenario is illustrated in Fig. 1. The goal of AEC is to cancel the acoustic echo for the far-end, introduced by the couplings between the loudspeaker(s) and microphone(s) at the near-end. In the block diagram of Fig. 1, the echo produced by the acoustic couplings between the $P$ loudspeakers and the microphone in the near-end room is canceled for the far-end by subtracting the estimate $\hat{y}(n)$ of the microphone signal from the actual microphone signal $y(n)$. The signal $\hat{y}(n)$ is derived by filtering the loudspeaker signals $x_p(n)$ with finite-impulse response (FIR) filters that model the acoustic paths $h_p(n)$ from the loudspeakers to the microphones. The estimation of the acoustic paths $h_p(n)$ represents a multichannel identification problem. It is well known that this identification problem is typically ill-conditioned for the multichannel case if the far-end signals exhibit spatio-temporal correlations [1].

In advanced adaptation schemes, at least two fundamental approaches exist to cope with the far-end correlations: (1) decoupling of the convolution in the near-end room and (2) decoupling of the input (loudspeaker signal) covariance matrix. The first approach is applied in frequency-domain adaptive filtering (FDAF), while the second one is applied in transform-domain adaptive filtering (TDAF).

Figure 1: Block diagram of multichannel acoustic echo cancelation. (Diagram labels: far-end, near-end, $g_1(n), \dots, g_P(n)$, $x_1(n), \dots, x_P(n)$, $\hat{h}_1, \dots, \hat{h}_P$, $h_1(n), \dots, h_P(n)$, $y(n)$, $\hat{y}(n)$, $e(n)$.)

The authors have introduced a two-stage approach to multichannel TDAF in [2]. However, the approach presented there was based on a sample-by-sample update of the filter coefficients. Block-based adaptation algorithms are typically computationally less complex and therefore favorable. In this paper we present a block-based formulation of multichannel TDAF that is constructed from a combination of FDAF for temporal decoupling and TDAF for spatial decoupling.

We proceed as follows: The next section will introduce the fundamental problem of multichannel system identification. This will be followed by a brief review of TDAF and FDAF before we derive the block-based TDAF algorithm. Some results computed with the proposed algorithm will be shown before concluding the paper.
2. MULTICHANNEL SYSTEM IDENTIFICATION
The estimation of the acoustic paths $h_p(n)$ for $p = 1, 2, \dots, P$ represents a multichannel identification problem. The error $e(n)$ is given as

\[ e(n) = y(n) - \sum_{p=1}^{P} \hat{\mathbf{h}}_p^T \mathbf{x}_p(n), \tag{1} \]

where

\[ \hat{\mathbf{h}}_p = [\hat{h}_{p,0}, \hat{h}_{p,1}, \dots, \hat{h}_{p,L-1}]^T, \tag{2} \]

\[ \mathbf{x}_p(n) = [x_p(n), x_p(n-1), \dots, x_p(n-L+1)]^T, \tag{3} \]

with $\hat{h}_{p,l}$ denoting the $l$-th coefficient of the $p$-th channel, $L$ the filter length and $n$ the time instant. Under the assumption of minimizing the mean-square error (MSE) the filter coefficients can be found by solving the multichannel normal equation [1]

\[ \mathbf{R}_{xx} \hat{\mathbf{h}} = \mathbf{r}_{xy}, \tag{4} \]

where the $PL \times 1$ vector $\hat{\mathbf{h}}$ of estimated filter coefficients is given as $\hat{\mathbf{h}} = [\hat{\mathbf{h}}_1^T, \hat{\mathbf{h}}_2^T, \dots, \hat{\mathbf{h}}_P^T]^T$. The matrix $\mathbf{R}_{xx}$ denotes the covariance matrix of the input signals $\mathbf{x}(n)$ and $\mathbf{r}_{xy}$ the covariance vector between the input $\mathbf{x}(n)$ and the microphone signal $y(n)$. The $PL \times LP$ covariance matrix $\mathbf{R}_{xx}$ is defined as

\[ \mathbf{R}_{xx}(n) = \hat{E}\{\mathbf{x}(n) \mathbf{x}^T(n)\}, \tag{5} \]

where $\hat{E}\{\cdot\}$ denotes a suitable approximation of the expectation operator and the $PL \times 1$ vector $\mathbf{x}$ of input signals is given as $\mathbf{x}(n) = [\mathbf{x}_1^T, \mathbf{x}_2^T, \dots, \mathbf{x}_P^T]^T$. The $PL \times 1$ covariance vector is $\mathbf{r}_{xy}(n) = \hat{E}\{\mathbf{x}(n) y(n)\}$. The covariance matrix $\mathbf{R}_{xx}$ is composed from $L \times L$ sub-matrices that are given as $\mathbf{R}_{pq} = \hat{E}\{\mathbf{x}_p(n) \mathbf{x}_q^T(n)\}$ for $p, q = 1, 2, \dots, P$. Typically, these are assumed to be Toeplitz matrices.

The generalization to multiple microphones and consequently multiple-input multiple-output (MIMO) systems in the near-end room is straightforward. It can be shown that the resulting normal equation for the MIMO case can be decomposed into a series of independent multiple-input single-output (MISO) normal equations [1] for each microphone channel. Hence, the consideration of a MISO system in the near-end room is sufficient in the context of this work.

17th European Signal Processing Conference (EUSIPCO 2009), Glasgow, Scotland, August 24-28, 2009. © EURASIP, 2009.

The solution of the normal equation (4) is subject to numerical problems when the covariance matrix $\mathbf{R}_{xx}$ is ill-conditioned. It can be shown that this is the case when spatio-temporal correlations exist between the loudspeaker signals $x_p(n)$.
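The ill-conditioning described above can be reproduced numerically. The following is a minimal NumPy sketch, not the authors' implementation: it replaces the approximated expectation in Eq. (5) by a plain time average, and the names `stacked_input`, `normal_equation`, `simulate`, `mix` and `h_true`, as well as all signal parameters, are hypothetical choices made for illustration only.

```python
import numpy as np

def stacked_input(x, n, L):
    """Return x(n) = [x_1^T(n), ..., x_P^T(n)]^T for x of shape (P, N), cf. Eq. (3)."""
    # Each x_p(n) = [x_p(n), x_p(n-1), ..., x_p(n-L+1)]^T
    return np.concatenate([xp[n - L + 1:n + 1][::-1] for xp in x])

def normal_equation(x, y, L):
    """Time-average estimates of R_xx (Eq. 5) and r_xy (assumed estimator)."""
    P, N = x.shape
    R = np.zeros((P * L, P * L))
    r = np.zeros(P * L)
    for n in range(L - 1, N):
        xn = stacked_input(x, n, L)
        R += np.outer(xn, xn)
        r += xn * y[n]
    count = N - L + 1
    return R / count, r / count

rng = np.random.default_rng(0)
P, L, N = 2, 8, 4000                      # illustrative sizes
h_true = rng.standard_normal((P, L))      # hypothetical acoustic paths h_p

def simulate(mix):
    """Two loudspeaker signals; `mix` in [0, 1) sets their inter-channel correlation."""
    s = rng.standard_normal((P, N))
    x = np.vstack([s[0], mix * s[0] + (1.0 - mix) * s[1]])
    y = sum(np.convolve(x[p], h_true[p])[:N] for p in range(P))
    return x, y

# Uncorrelated channels: Eq. (4) is well conditioned and recovers the paths.
x, y = simulate(mix=0.0)
R, r = normal_equation(x, y, L)
h_hat = np.linalg.solve(R, r)
cond_uncorrelated = np.linalg.cond(R)

# Strongly correlated channels: the condition number of R_xx grows sharply.
x_c, y_c = simulate(mix=0.99)
R_c, _ = normal_equation(x_c, y_c, L)
cond_correlated = np.linalg.cond(R_c)
```

In this noise-free toy setup the solution `h_hat` matches `h_true` for uncorrelated channels, while the condition number for the correlated case is several orders of magnitude larger.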
3. A TWO-STAGE APPROACH TO MULTICHANNEL TDAF
Transform-domain adaptive filtering (TDAF) is a technique that performs the filter adaptation in a transform domain. In the ideal case, the far-end signals will be decorrelated by a suitably chosen transformation. The ideal transformation can be deduced from the covariance matrix $\mathbf{R}_{xx}(n)$ and is data-dependent in general. TDAF has originally been introduced for the single channel case [3]. In a previous paper [2] we have proposed multichannel TDAF, based on a two-step decoupling of the covariance matrix. The approach is briefly reviewed in the following.
3.1 Spatio-temporal decoupling
The spatio-temporal decoupling consists of two steps: (1) temporal decoupling using a discrete Fourier transform (DFT) based transformation and (2) a spatial decoupling using a unitary transform.

Assuming stationary signals $x_p(n)$ and the correlation method to estimate the covariance matrix, the sub-matrices $\mathbf{R}_{pq}$ exhibit Toeplitz structure [4]. These assumptions hold well for typical signals. For large block lengths ($L \to \infty$) the matrices $\mathbf{R}_{pq}$ become equivalent to circulant matrices [5]. Circulant matrices can be diagonalized by the DFT

\[ \mathbf{R}_{xx} = \mathbf{F} \underline{\mathbf{S}}_{xx} \mathbf{F}^H, \tag{6} \]

where $\mathbf{F}$ denotes a $PL \times LP$ block-diagonal matrix whose diagonal blocks are composed from $L \times L$ DFT matrices $\mathbf{F}_L$. Frequency domain quantities are underlined. The elements of the (normalized) DFT matrices $\mathbf{F}_L$ are given as $f_{nm} = 1/\sqrt{L} \cdot e^{-j 2 \pi nm / L}$ for $n, m = 0, 1, \dots, L-1$. The block-matrix $\underline{\mathbf{S}}_{xx}$ is composed from the $L \times L$ diagonal matrices

\[ \underline{\mathbf{S}}_{pq} = \mathrm{diag}\{ s_{pq}^{(0)}, s_{pq}^{(1)}, \dots, s_{pq}^{(L-1)} \}, \tag{7} \]

where the elements $s_{pq}^{(\nu)}$ for $\nu = 0, 1, \dots, L-1$ are given by the DFT of the first column of $\mathbf{R}_{pq}$. The frequency bin is denoted as $\nu$.

In order to achieve further spatial decoupling, the matrix $\underline{\mathbf{S}}_{xx}$ has to be reordered such that all spatial couplings for one frequency bin are combined into submatrices $\underline{\mathbf{S}}^{(\nu)}$. Formally, this can be reached by a suitably chosen permutation matrix $\mathbf{A}_L$. The submatrices $\underline{\mathbf{S}}^{(\nu)}$ can then be diagonalized by application of the spectral theorem. Combining all described steps, the covariance matrix $\mathbf{R}_{xx}$ can be expressed as

\[ \mathbf{R}_{xx} = \mathbf{F} \mathbf{A}_L \mathbf{U}_L \mathbf{T}_{xx} \mathbf{U}_L^H \mathbf{A}_L^T \mathbf{F}^H \tag{8} \]

in terms of the diagonal matrix $\mathbf{T}_{xx}$ which is composed from the spatio-temporal eigenvalues of $\mathbf{R}_{xx}$. These eigenvalues can be linked to the spatio-temporal correlation coefficients of the input signals $x_p(n)$ [2].

The $LP \times PL$ matrix $\mathbf{U}_L$ denotes a block-diagonal matrix composed from the $P \times P$ submatrices $\mathbf{U}^{(\nu)}$ constructed from the singular vectors of $\underline{\mathbf{S}}^{(\nu)}$. Note that the desired decoupling of the covariance matrix has been achieved by a set of suitably chosen unitary transforms. This favorable property is beneficial for mathematical rearrangements in the algorithm.
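The two decoupling steps can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the authors' code: it builds a covariance matrix whose sub-matrices are exactly circulant (the idealized large-$L$ case), uses a Hermitian eigendecomposition per frequency bin (for Hermitian positive semidefinite blocks the eigenvectors coincide with the singular vectors), and all sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
P, L, K = 2, 8, 20   # channels, block length, averaged realizations (illustrative)

# Covariance with exactly circulant sub-matrices R_pq (idealized assumption).
idx = (np.arange(L)[:, None] - np.arange(L)[None, :]) % L
R = np.zeros((P * L, P * L))
for _ in range(K):
    s = rng.standard_normal((P, L))
    s[1] += 0.8 * s[0]                        # introduce inter-channel correlation
    C = np.vstack([sp[idx] for sp in s])      # stacked circulant data matrices
    R += C @ C.T / (K * L)

# Step 1: temporal decoupling with the normalized DFT matrix F_L, Eq. (6).
n = np.arange(L)
F_L = np.exp(-2j * np.pi * np.outer(n, n) / L) / np.sqrt(L)
F = np.kron(np.eye(P), F_L)                   # block-diagonal PL x PL matrix
S = F.conj().T @ R @ F                        # every L x L block S_pq is diagonal

# Permutation A_L: collect the spatial couplings of one bin into a P x P block.
A = np.zeros((P * L, P * L))
for p in range(P):
    for nu in range(L):
        A[p * L + nu, nu * P + p] = 1.0
S_perm = A.T @ S @ A                          # block-diagonal with blocks S^(nu)

# Step 2: spatial decoupling by an eigendecomposition of every S^(nu).
U = np.zeros((P * L, P * L), dtype=complex)
t = np.zeros(P * L)
for nu in range(L):
    sl = slice(nu * P, (nu + 1) * P)
    w, V = np.linalg.eigh(S_perm[sl, sl])
    U[sl, sl] = V
    t[sl] = w                                 # spatio-temporal eigenvalues

# Eq. (8): rebuild R_xx from the unitary transforms and the diagonal T_xx.
T = np.diag(t)
R_rebuilt = F @ A @ U @ T @ U.conj().T @ A.T @ F.conj().T
```

For a measured (Toeplitz rather than circulant) covariance matrix the blocks of `S` would be only approximately diagonal, so the reconstruction in the last line would hold approximately.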
3.2 Multichannel TDAF
Introducing Eq. (8) into the normal equation (4) and exploiting the unitarity of the transform matrices yields the transformed normal equation

\[ \mathbf{T}_{xx} \underbrace{\mathbf{U}_L^H \mathbf{A}_L^T \mathbf{F}^H \hat{\mathbf{h}}}_{\underline{\hat{\mathbf{h}}}} = \underbrace{\mathbf{U}_L^H \mathbf{A}_L^T \mathbf{F}^H \mathbf{r}_{xy}}_{\mathbf{t}_{xy}}, \tag{9} \]

where $\underline{\hat{\mathbf{h}}}$ and $\mathbf{t}_{xy}$ denote the transformed vector of filter coefficients $\hat{\mathbf{h}}$ and the transformed covariance vector $\mathbf{r}_{xy}$, respectively. Since $\mathbf{T}_{xx}$ is diagonal, the normal equation (4) has been decomposed by the transformations into a series of scalar equations.

The solution of the normal equation (9) involves the inversion of the diagonal matrix $\mathbf{T}_{xx}$ containing the spatio-temporal eigenvalues of $\mathbf{R}_{xx}$. If one or more of these are zero or close to zero this will be subject to numerical problems. It was shown in [2] that these eigenvalues are linked to the spatio-temporal correlations in the far-end signals and that strong correlations lead to eigenvalues that are (close to) zero. One benefit of TDAF is that a regularization can be performed spatially and temporally frequency-bin selective.

The derived transformations have been applied straightforwardly to the recursive-least squares (RLS) algorithm in [2]. The formulation is based on a sample-by-sample update of the filter coefficients. However, for a practical implementation block-based algorithms are favorable. The presented two-step approach to multichannel TDAF allows the utilization of known frequency domain techniques like frequency domain adaptive filtering (FDAF) for the temporal decoupling. After a brief review of generalized FDAF in the next section, a combination of TDAF and FDAF will be developed in Section 5.
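Because $\mathbf{T}_{xx}$ is diagonal, Eq. (9) reduces to element-wise divisions, and a regularization constant can be chosen per eigenvalue. The sketch below illustrates this with a simple additive constant; this particular form of regularization, the threshold, and the function name `solve_transformed` are assumptions for illustration and are not the dynamic regularization scheme of [6].

```python
import numpy as np

def solve_transformed(t_xx, t_xy, delta=0.0):
    """Solve the decoupled normal equation T_xx h = t_xy element by element.

    t_xx  : 1-D array holding the diagonal of T_xx (spatio-temporal eigenvalues)
    t_xy  : 1-D array, transformed covariance vector
    delta : scalar or 1-D array of per-eigenvalue regularization constants
            (assumed additive form, chosen for illustration)
    """
    return t_xy / (t_xx + delta)

t_xx = np.array([2.0, 1.0, 1e-12, 0.5])          # third eigenvalue is close to zero
h_ref = np.array([1.0, -1.0, 0.5, 2.0])          # hypothetical transformed coefficients
noise = 1e-6 * np.array([1.0, -1.0, 1.0, -1.0])
t_xy = t_xx * h_ref + noise                      # slightly perturbed right-hand side

h_plain = solve_transformed(t_xx, t_xy)          # no regularization
delta = np.where(t_xx < 1e-3, 1e-2, 0.0)         # regularize only the weak component
h_reg = solve_transformed(t_xx, t_xy, delta)
```

Without regularization the tiny perturbation is amplified enormously in the component with the near-zero eigenvalue, whereas the selective constant suppresses that component and leaves the well-conditioned ones essentially untouched.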
4. FREQUENCY-DOMAIN ADAPTIVE FILTERING
This section presents a brief review of generalized FDAF [6, 7]. FDAF is essentially based on a block formulation of the identification problem. This block formulation is derived by combining $L$ consecutive samples into blocks, formulating the error signal (1) in terms of blocks and minimizing the error. For this purpose, the convolution operation in (1) is reformulated in terms of a matrix operation, where the input signals are combined into a matrix with Toeplitz structure. A Toeplitz matrix can be transformed into a circulant matrix by doubling its size. This concept is a fundamental building block of FDAF where the circulant matrix is then diagonalized by the DFT. This results in an overlap-save formulation of the convolution by incorporating window functions.

The concept of generalized multichannel FDAF is closely linked to TDAF in the sense that it also aims at temporal decoupling. It is well known that the Fourier transformation diagonalizes linear time-shift invariant systems. FDAF employs the DFT for temporal decoupling of the near-end system. It can be shown [6] that this also leads to an approximate temporal decoupling of the covariance matrix $\mathbf{R}_{xx}$. This is due to the fact that the DFT only approximately decouples the covariance matrix for the finite block size in practical implementations [5].
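The overlap-save idea mentioned above can be sketched in a few lines: embedding the length-$L$ convolution into a circular convolution of size $2L$ and discarding the first $L$ output samples yields the linear convolution for the current block. The function name `overlap_save_block` and the test signal are hypothetical; this is a generic illustration of overlap-save, not the complete FDAF algorithm.

```python
import numpy as np

def overlap_save_block(x_prev, x_cur, h):
    """One block of the linear convolution of x with h using size-2L DFTs.

    x_prev, x_cur : previous and current input block, each of length L
    h             : FIR filter of length L
    """
    L = len(h)
    X = np.fft.fft(np.concatenate([x_prev, x_cur]))    # DFT of 2L input samples
    H = np.fft.fft(np.concatenate([h, np.zeros(L)]))   # zero-padded filter
    y_circ = np.fft.ifft(X * H).real                   # circular convolution, length 2L
    return y_circ[L:]                                  # keep the last L alias-free samples

rng = np.random.default_rng(2)
L = 16
x = rng.standard_normal(4 * L)
h = rng.standard_normal(L)

# Reference: direct time-domain convolution.
y_ref = np.convolve(x, h)[:len(x)]

# Block-wise processing, starting at the second block (m = 1).
y_blocks = np.concatenate([
    overlap_save_block(x[(m - 1) * L:m * L], x[m * L:(m + 1) * L], h)
    for m in range(1, 4)
])
```

The block-wise result `y_blocks` agrees with `y_ref[L:]`; selecting the last $L$ samples corresponds to the window function in the overlap-save formulation.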
4.1 Algorithm
The time-domain block error signal $\mathbf{e}(m)$ for a block length of $L$ samples is defined as

\[ \mathbf{e}(m) = [e(mL), e(mL+1), \dots, e(mL+L-1)]^T, \tag{10} \]

where $m$ denotes the block index. The microphone signal $\mathbf{y}(m)$ is defined in a similar fashion as $\mathbf{e}(m)$. In order to derive an algorithm that requires only DFTs of size $2L$, the error and microphone signals are zero padded before transformation into the frequency domain

\[ \underline{\mathbf{e}}'(m) = \mathbf{F}_{2L} [\mathbf{0}_{1 \times L}, \mathbf{e}^T(m)]^T, \tag{11} \]

and similarly for the microphone signal. The loudspeaker signals in the frequency domain are given as

\[ \underline{\mathbf{X}}_p(m) = \mathrm{diag}\{ \mathbf{F}_{2L} [x_p(mL-L), \dots, x_p(mL+L-1)]^T \}, \tag{12a} \]

\[ \underline{\mathbf{X}}(m) = [\underline{\mathbf{X}}_1(m), \dots, \underline{\mathbf{X}}_P(m)]. \tag{12b} \]
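The data layout of Eqs. (10) to (12b) can be made concrete with a short NumPy sketch. It assumes that $\mathbf{F}_{2L}$ is normalized in the same way as $\mathbf{F}_L$ (hence `norm="ortho"`); the function names `block_error_freq` and `loudspeaker_freq` and the array shapes are hypothetical conventions chosen for this illustration.

```python
import numpy as np

def block_error_freq(e, m, L):
    """Eqs. (10)-(11): zero-padded block of e(n), transformed by a size-2L DFT."""
    e_m = e[m * L:(m + 1) * L]                                   # e(m), Eq. (10)
    return np.fft.fft(np.concatenate([np.zeros(L), e_m]), norm="ortho")

def loudspeaker_freq(x, m, L):
    """Eqs. (12a)-(12b): X(m) = [X_1(m), ..., X_P(m)] for x of shape (P, N), m >= 1."""
    blocks = []
    for xp in x:
        seg = xp[m * L - L:m * L + L]                            # 2L samples, 50 % overlap
        blocks.append(np.diag(np.fft.fft(seg, norm="ortho")))    # X_p(m), Eq. (12a)
    return np.hstack(blocks)                                     # 2L x 2LP, Eq. (12b)

rng = np.random.default_rng(3)
P, L = 2, 4
x = rng.standard_normal((P, 3 * L))    # hypothetical loudspeaker signals
e = rng.standard_normal(3 * L)         # hypothetical error signal

E_freq = block_error_freq(e, 1, L)     # vector of length 2L
X_freq = loudspeaker_freq(x, 1, L)     # matrix of size 2L x 2LP
```

Storing each $\underline{\mathbf{X}}_p(m)$ as a full diagonal matrix mirrors the notation; a practical implementation would more likely keep only the $2L$ diagonal entries per channel.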
Figure 3: Simulation results for the proposed multichannel TDAF algorithm for two different regularization strategies. (a) Echo return loss enhancement (ERLE) in dB over time in seconds. (b) Normalized misalignment (coefficient error norm in dB) over time in seconds. Both panels show curves for regularization strategy 1 and regularization strategy 2.

[...] height of 1.5 meters. The position of the loudspeakers is $[2.8, 5]$ m and $[3.2, 5]$ m, and of the microphone $[5, 2]$ m.

The signal of a male speaker was fed equally to both loudspeakers (phantom source stationary in center). The loudspeaker signals were pre-processed by a nonlinearity [1] in order to cope with the non-uniqueness problem. Noise with a level of approximately $-50$ dB with respect to the echo was added to the microphone signals, in order to simulate microphone and other noise sources at the near-end.

The algorithm was implemented in MATLAB, as depicted by Fig. 2. The filter length was chosen as $L = 4096$ at a sampling rate of $f_s = 44.1$ kHz. In order to illustrate the effect of selective regularization in the eigenspace, two regularization strategies have been implemented: (1) both spatial and temporal frequency-bin selective regularization and (2) only temporal frequency-bin selective regularization. The latter shows a similar performance as a straightforward implementation of FDAF. The dynamic regularization scheme introduced in [6] has been used for both strategies.

Figure 3(a) shows the echo return loss enhancement (ERLE) for the simulated scenario. It can be seen that the algorithm converges fast and provides a good amount of echo attenuation (in dB) that is bounded by the near-end noise. The spatio-temporal frequency-bin selective regularization shows better results than the temporal frequency-bin regularization. Figure 3(b) shows the normalized misalignment. Again the spatio-temporal frequency-bin selective regularization performs better. Note that the proposed TDAF algorithm, unlike multichannel FDAF, inherently provides the possibility for this beneficial regularization strategy.
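The two performance measures shown in Figure 3 can be computed as sketched below. The definitions used here are the commonly used ones (power ratio of microphone signal to residual error for ERLE, relative coefficient error norm for the misalignment) and are an assumption, since the exact definitions are not stated in this text; the function names and the frame-wise averaging are likewise hypothetical.

```python
import numpy as np

def erle_db(y, e, frame):
    """Frame-wise ERLE in dB: microphone (echo) power over residual error power."""
    n_frames = len(y) // frame
    y2 = np.square(y[:n_frames * frame]).reshape(n_frames, frame).mean(axis=1)
    e2 = np.square(e[:n_frames * frame]).reshape(n_frames, frame).mean(axis=1)
    return 10.0 * np.log10(y2 / e2)

def misalignment_db(h_true, h_hat):
    """Normalized misalignment (coefficient error norm) in dB."""
    return 20.0 * np.log10(np.linalg.norm(h_true - h_hat) / np.linalg.norm(h_true))

# Toy data: the residual error has one tenth of the echo amplitude.
y = np.ones(8)
e = 0.1 * np.ones(8)
erle = erle_db(y, e, frame=4)

# Toy data: the coefficient error is one tenth of the true coefficient norm.
mis = misalignment_db(np.array([1.0, 0.0]), np.array([0.9, 0.0]))
```

In this toy case the ERLE is about 20 dB in both frames and the misalignment is about -20 dB. Note that the misalignment requires the true impulse responses, which are available in a simulation but not in a real room.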
7. CONCLUSION
This paper presents a block-based reformulation of the sample-by-sample multichannel TDAF approach introduced in [2]. Its two-stage approach to spatio-temporal decoupling has been exploited in order to perform the temporal decoupling efficiently by the FDAF algorithm in combination with an eigenvalue decomposition to cope with the spatial couplings. In contrast to a sample-by-sample update the presented block-based approach benefits from the computational savings of the FDAF algorithm. The results show that the resulting algorithm performs well in a typical multichannel AEC scenario. One benefit of the proposed adaptation scheme, working in the eigenspace of the far-end signal covariance matrix, is the possibility of selective regularization in that eigenspace. The benefit of this regularization was demonstrated in Section 6.

The block-based TDAF algorithm is formally equivalent to wave-domain adaptive filtering (WDAF) developed by the authors in [7]. This link is quite interesting since WDAF is based on decoupling of the near-end system by a singular value decomposition (SVD). The only formal difference between the TDAF and the WDAF algorithm, besides the MISO/MIMO formulation, is the matrix $\mathbf{G}_U$ which accounts for the change of the eigenspace over time. Since for the derivation of WDAF (like for FDAF) it is assumed that the near-end room acoustics is time-invariant, this matrix does not show up there explicitly.

The formulations of WDAF and multichannel TDAF, as presented by the authors, are based on transforming (filtering) the far-end signals in order to overcome fundamental problems of the multichannel identification problem. The transformations are linked to the eigenspace of the near-end system or the covariance matrix of the far-end signals. The formulations of the algorithms are also applicable for generic MIMO filters. This opens up the potential to find efficient approximations of these transformations.

The basic concept of TDAF also has a strong relation to blind source separation (BSS). The BSS algorithms based on second-order statistics try to find a demixing system that diagonalizes the covariance matrix of the demixed signals. The transformation $\mathbf{U}$ can also be interpreted as a demixing system in this context, since the goal is to provide independent signals to the adaptive filters.

In the future, further work is planned on the detailed analysis of the properties of the presented multichannel TDAF algorithm.
REFERENCES
[1] J. Benesty, Y. Huang, and J. Chen, “Wiener and adaptive filters,” in Speech and Audio Processing in Adverse Environments, E. Haensler and G. Schmidt, Eds., pp. 103–120. Springer, 2008.
[2] S. Spors and H. Buchner, “Multichannel transform domain adaptive filtering: A two stage approach and illustration for acoustic echo cancellation,” in 11th International Workshop on Acoustic Echo and Noise Control (IWAENC), Seattle, USA, September 2008.
[3] S.S. Narayan, A.M. Peterson, and M.J. Narasimha, “Transform domain LMS algorithm,” in IEEE Transactions on Acoustics, Speech, and Signal Processing, June 1983, vol. ASSP-31.
[4] J.D. Markel and A.H. Gray, Linear Prediction of Speech, Springer-Verlag, Berlin, 1976.
[5] R.M. Gray, “On the asymptotic eigenvalue distribution of Toeplitz matrices,” IEEE Transactions on Information Theory, vol. IT-18, no. 6, pp. 725–730, Nov. 1972.
[6] H. Buchner, J. Benesty, and W. Kellermann, “Multichannel frequency-domain adaptive algorithms with application to acoustic echo cancellation,” in Adaptive Signal Processing: Application to Real-World Problems, J. Benesty and Y. Huang, Eds. Springer, 2003.
[7] H. Buchner and S. Spors, “A general derivation of wave-domain adaptive filtering and application to acoustic echo cancellation,” in Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, USA, October 2008.