Blind Separation of Anechoic Under-determined Speech Mixtures using Multiple Sensors

Rayan Saab (1), Özgür Yılmaz (2), Martin J. McKeown (3), Rafeef Abugharbieh (1)

(1) Department of Electrical and Computer Engineering, The University of British Columbia.
(2) Department of Mathematics, The University of British Columbia.
(3) Department of Medicine (Neurology), Pacific Parkinson's Research Centre, The University of British Columbia.

Abstract: This paper presents a novel technique for Blind Source Separation (BSS) of anechoic speech mixtures in the underdetermined case. A demixing algorithm that exploits the sparsity of the short time Fourier transform (STFT) of speech signals is proposed. The algorithm merges constrained optimization with ideas based on the degenerate unmixing estimation technique (DUET) [1]. Thus, the novelty in the proposed approach is twofold. First, the algorithm utilizes all available mixtures in the anechoic scenario, where both attenuations and arrival delays between sensors are considered. Second, it is demonstrated that $\ell^q$ minimization with $q < 1$ outperforms the standard choice of $q = 1$. Experimental results on both synthetic and real mixtures indicate significant performance gains over other BSS algorithms reported in the literature.

Keywords: blind source separation, sparse signal representation, DUET, time-frequency representations, Gabor expansion, underdetermined signal un-mixing, over-complete representations

I. INTRODUCTION

Over the last few years, BSS algorithms have been developed for a wide variety of models, ranging from instantaneous, anechoic, and echoic mixing on one hand to over-determined, even-determined and under-determined scenarios on the other hand.

For instantaneous mixing, especially in the even-determined case, a powerful tool that has found increasing use is independent component analysis (ICA).
First expressed in [2], then developed in an information maximization framework by Bell and Sejnowski [3], standard ICA assumes statistical independence of the sources and tries to extract $n$ sources from $n$ recorded mixtures. Lewicki and Sejnowski [4], and Lee et al. [5] expanded ICA into the instantaneous under-determined case, where there are more sources than available mixtures, by using a maximum a posteriori approach and exploiting sparsity. For an extensive overview of ICA see [6]. Other BSS approaches, e.g., [7], [8] and [9], assume sparsity of the sources in some transform domain as well as a linear mixing model to solve the BSS problem for instantaneous mixtures in the under-determined case. These approaches generally use constrained $\ell^1$ minimization for separation, assuming that this maximizes sparsity of the estimated sources in the transform domain.

Another set of algorithms that deal with anechoic under-determined mixing scenarios were proposed. Jourjine et al. [10] and Yılmaz and Rickard [1] developed an algorithm, called DUET, that exploits sparsity in the short time Fourier transform (STFT) domain and uses masking to extract multiple sources from only two mixtures. The assumption they refer to as W-Disjoint Orthogonality is that of only one source being active at every point in the time-frequency (TF) domain. Bofill [11] proposed another anechoic demixing algorithm for under-determined mixtures that extracts the attenuation coefficients using a scatter plot technique and the delays by maximizing a kernel function. After the amplitudes and delays are extracted, Bofill uses second order cone programming, a technique that can be used for $\ell^1$ minimization in the complex domain, to recover the sources in the TF plane from two mixtures.
Vielva et al. [12] considered the case of underdetermined instantaneous blind source separation where source densities are parametrized by a sparsity factor, and presented a maximum a posteriori method for separation, while [13] focused on the estimation of the mixing matrix for under-determined BSS under the sparsity assumption. A recent survey of available methods in blind source separation over the range of assumptions made and models used can be found in [14].

In this paper, a new BSS technique for extracting sources in an under-determined anechoic environment is proposed. In solving the problem of extracting a number of sources that exceeds the number of available mixtures, the 'standard' two-stage approach as formalized by Theis et al. [15] is adopted. In the first stage the mixing parameters are estimated by clustering feature vectors which are constructed from the Gabor coefficients (or short-time Fourier transform) of the mixtures. These parameters, as well as a sparsity assumption on the Gabor expansions of speech signals, are then used in the second stage to extract the sources. In particular, an $\ell^q$-minimization based algorithm, with $q < 1$, is used to estimate the sources in the STFT domain. Accordingly, the novel aspects of the proposed approach are:
• Generation of feature vectors incorporating both attenuation and delay parameters for an arbitrary number of mixtures in the underdetermined BSS case.
• Proposing the use of $\ell^q$ minimization with $q < 1$, which is shown to significantly improve the separation performance.
• Comparing the performance of source extraction based on $\ell^q$ minimization and $\ell^q$-basis-pursuit for values $0 \le q \le 1$ in the STFT domain, and illustrating that the best separation performance for speech is obtained for $0.1 \le q \le 0.4$.

Experiments conducted on both synthetic and real mixtures indicate significant performance gains over other BSS algorithms reported in the literature.

2006 IEEE International Symposium on Signal Processing and Information Technology. 0-7803-9754-1/06/$20.00 ©2006 IEEE

II. MIXING MODEL AND PROBLEM FORMULATION

Consider $n$ time domain sources $s_1(t), \dots, s_n(t)$ and $m$ mixtures $x_1(t), \dots, x_m(t)$ such that

$$x_i(t) = \sum_{j=1}^{n} a_{ij}\, s_j(t - \delta_{ij}), \quad i = 1, 2, \dots, m \tag{1}$$

where $m < n$, and $a_{ij} \in \mathbb{R}^+$ and $\delta_{ij} \in \mathbb{R}$ are the attenuation coefficients and time delays associated with the path from the $j$th source to the $i$th receiver, respectively. Equation (1) defines an anechoic mixing model. Without loss of generality, one can set $\delta_{1j} = 0$ and scale the source functions $s_j$ such that

$$\sum_{i=1}^{m} a_{ij}^2 = 1 \tag{2}$$

for $j = 1, \dots, n$.

By taking the STFT of $x_1, \dots, x_m$ with an appropriate window function, the mixing model (1) can be written as

$$\hat{\mathbf{x}}(\tau, \omega) = A(\omega)\, \hat{\mathbf{s}}(\tau, \omega), \tag{3}$$

where

$$\hat{\mathbf{x}} = [\hat{x}_1 \ \dots \ \hat{x}_m]^T, \quad \hat{\mathbf{s}} = [\hat{s}_1 \ \dots \ \hat{s}_n]^T, \tag{4}$$

$\hat{x}_i$ and $\hat{s}_j$ denote the STFT of $x_i$ and $s_j$, respectively, and

$$A(\omega) = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ a_{21} e^{-i\omega\delta_{21}} & \dots & a_{2n} e^{-i\omega\delta_{2n}} \\ \vdots & \ddots & \vdots \\ a_{m1} e^{-i\omega\delta_{m1}} & \dots & a_{mn} e^{-i\omega\delta_{mn}} \end{bmatrix}. \tag{5}$$

Note that, by (2), the column vectors of $A$ have unit norm.

In what follows we use the equivalent discrete form of the continuous STFT, i.e., the samples (Gabor coefficients) of $s$ on a sufficiently dense lattice in the TF plane, given by

$$\hat{s}_j[k, l] = \hat{s}_j(k\tau_0, l\omega_0) \tag{6}$$

where $\tau_0$ and $\omega_0$ are the time-frequency lattice parameters. Similar notation will be used for the mixing matrix $A$ and the Gabor coefficients of the mixtures $x_i$.

The following two sections describe the two stages of the proposed algorithm, i.e., the parameter estimation and the extraction of the original sources, both of which depend on the sparsity of the Gabor expansions of speech signals.

III. MIXING PARAMETERS RECOVERY

This section presents the method used to recover the mixing model parameters, i.e., the delay and attenuation coefficients.

A. Speech Sparsity of STFT Coefficients

Signal sparsity in certain transform domains facilitates mixing parameter estimation. Cardoso [16] noted that the accuracy with which the mixing parameters in a BSS model can be estimated depends on the non-Gaussianity of the sources. Furthermore, sparser sources allow better separation quality, e.g., [8]. Thus a transformation that yields a sparse representation of the data is desirable, both for estimating the mixing parameters accurately, and for separation. It was shown in [1] that Gabor expansions of speech signals are sparse. To further illustrate the sparsity exhibited by speech in the STFT domain, Figure 1 shows the average cumulative powers of the sorted Gabor (STFT) coefficients along with the average cumulative power of the time domain sources as well as of their Fourier (DFT) coefficients. The STFT with a 64 ms window size is the sparsest, capturing 98% of the total signal power with only approximately 9% of the coefficients.

[Figure 1: Cumulative power distribution for the STFT with various window sizes (32 ms, 64 ms, 128 ms), the time domain signal, and the frequency domain signal. Horizontal axis: percentage of points; vertical axis: percentage of the power.]
Fig. 1. Average cumulative power of 50 speech signals of 3 s each in the time domain, frequency (Fourier) domain, and TF domain for window sizes of 32 ms, 64 ms and 128 ms. The STFT with 32 ms and 64 ms window lengths yields significantly sparser representations of the data (more power represented in fewer coefficients).

B. Parameter Estimation

Consider the feature vectors at each TF point $[k, l]$ given by

$$F[k, l] := \left[ \frac{|\hat{x}_1[k, l]|}{\|\hat{\mathbf{x}}[k, l]\|} \ \cdots \ \frac{|\hat{x}_m[k, l]|}{\|\hat{\mathbf{x}}[k, l]\|} \quad \hat{\Delta}_{21}[k, l] \ \cdots \ \hat{\Delta}_{m1}[k, l] \right], \tag{7}$$

where $\|\cdot\|$ denotes the Euclidean norm and

$$\hat{\Delta}_{j1}[k, l] := -\frac{1}{l\omega_0} \angle \frac{\hat{x}_j[k, l]}{\hat{x}_1[k, l]}, \tag{8}$$

as in [1]. If only one source $s_J$ is nonzero at a TF point, the feature vector at that TF point will reduce to

$$F[k, l] = [\, a_{1J} \ \cdots \ a_{mJ} \quad \delta_{2J} \ \cdots \ \delta_{mJ} \,] := F_J.$$

Thus, the feature vectors calculated at any TF point $[k, l]$ at which source $J$ is the only active source will be identical, and equal to $F_J$.

Given the sparsity assumption for the sources in the TF domain, it can be expected that there will be many points with a single active source. A clustering approach in the feature space, such as k-means, can thus be used to estimate the delay and attenuation parameters of the mixing model. In summary, the proposed Parameter Estimation Algorithm is as follows:
1) Compute the mixture vector $\hat{\mathbf{x}}[k, l]$ at every TF point $[k, l]$.
2) At every TF point $[k, l]$, compute the corresponding feature vector $F[k, l]$, as in (7).
3) Perform some clustering algorithm (e.g., k-means) to find the $n$ cluster centers in the feature space. The cluster centers will yield preliminary estimates $\bar{a}_{ij}$ and $\bar{\delta}_{ij}$ of the mixing parameters $a_{ij}$ and $\delta_{ij}$, respectively.
4) Normalize the attenuation coefficients to obtain the final attenuation parameter estimates $\tilde{a}_{ij}$, i.e., $\tilde{a}_{ij} := \bar{a}_{ij} / \left( \sum_{i=1}^{m} \bar{a}_{ij}^2 \right)^{1/2}$. The final delay parameter estimates are given by $\tilde{\delta}_{ij} := \bar{\delta}_{ij}$.

IV. SOURCE SEPARATION

This section presents a method for extracting the original sources using the parameters estimated as described in Section III.

First the estimated mixing matrix $\tilde{A}[l]$ is constructed as

$$\tilde{A}[l] = \begin{bmatrix} \tilde{a}_{11} e^{-il\omega_0\tilde{\delta}_{11}} & \dots & \tilde{a}_{1n} e^{-il\omega_0\tilde{\delta}_{1n}} \\ \tilde{a}_{21} e^{-il\omega_0\tilde{\delta}_{21}} & \dots & \tilde{a}_{2n} e^{-il\omega_0\tilde{\delta}_{2n}} \\ \vdots & \ddots & \vdots \\ \tilde{a}_{m1} e^{-il\omega_0\tilde{\delta}_{m1}} & \dots & \tilde{a}_{mn} e^{-il\omega_0\tilde{\delta}_{mn}} \end{bmatrix} \tag{9}$$

where $\tilde{a}_{ij}$ and $\tilde{\delta}_{ij}$ are the estimated attenuation and delay parameters. Note that each column vector of $\tilde{A}[l]$ is a unit vector in $\mathbb{C}^m$.

The next step is to compute estimates $s^e_1, s^e_2, \dots, s^e_n$ of the original sources $s_1, s_2, \dots, s_n$ that satisfy

$$\tilde{A}[l]\, \hat{\mathbf{s}}^e[k, l] = \hat{\mathbf{x}}[k, l], \tag{10}$$

where $\hat{\mathbf{s}}^e = [\hat{s}^e_1, \dots, \hat{s}^e_n]^T$ is the vector of source estimates in the TF domain. At each TF point $[k, l]$, (10) provides $m$ equations (corresponding to the $m$ available mixtures) with $n > m$ unknowns $(\hat{s}^e_1, \dots, \hat{s}^e_n)$. Assuming that this system of equations is consistent, it has infinitely many solutions. To choose a reasonable estimate among these infinitely many solutions, the sparsity of the sources in the TF domain can be exploited.

A. Sparsity and $\ell^q$ minimization

To find, at each time-frequency point, the "sparsest" $\hat{\mathbf{s}}^e$ that solves (10), the problem can be formally stated as

$$\min_{\hat{\mathbf{s}}^e} \|\hat{\mathbf{s}}^e\|_{\text{sparse}} \quad \text{subject to} \quad \tilde{A}\, \hat{\mathbf{s}}^e = \hat{\mathbf{x}}, \tag{11}$$

where $\|\mathbf{x}\|_{\text{sparse}}$ denotes some measure of sparsity of the vector $\mathbf{x}$.

Given a vector $\mathbf{x} = (x_1, \dots, x_n) \in \mathbb{C}^n$, one measure of its sparsity is given by the number of the non-zero components of $\mathbf{x}$, commonly denoted by $\|\mathbf{x}\|_0$. Replacing $\|\mathbf{x}\|_{\text{sparse}}$ in (11) with $\|\mathbf{x}\|_0$ gives rise to the so-called $P_0$ problem, e.g., [17]. Solving $P_0$ is, in general, combinatorial and the solution is very sensitive to noise. More importantly, the sparsity model for the Gabor coefficients of speech signals essentially suggests that most of the coefficients are very small, however not identically zero. In this case, $P_0$ fails as it does not take into account the value of the components. Alternatively, one can consider

$$\|\mathbf{x}\|_q := \left( \sum_i |x_i|^q \right)^{1/q},$$

where $0 < q \le 1$, as a measure of sparsity.
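As a rough numerical illustration of this measure (a sketch, not part of the original experiments), the snippet below evaluates $\|\mathbf{x}\|_q$ for two hypothetical vectors with the same Euclidean norm: one with a single non-zero entry and one with its energy spread evenly. The function name `lq_measure` and both test vectors are assumptions made for the example.

```python
def lq_measure(x, q):
    """Return (sum_i |x_i|^q)^(1/q) for 0 < q <= 1 (real or complex entries)."""
    if not 0 < q <= 1:
        raise ValueError("q must satisfy 0 < q <= 1")
    return sum(abs(v) ** q for v in x) ** (1.0 / q)


# Two hypothetical vectors, both with Euclidean norm 1.
sparse = [1.0, 0.0, 0.0, 0.0]   # all energy in one component
spread = [0.5, 0.5, 0.5, 0.5]   # energy spread over every component

for q in (1.0, 0.5, 0.1):
    print(q, lq_measure(sparse, q), lq_measure(spread, q))
```

For these inputs the single-spike vector scores 1 for every $q$, while the evenly spread vector scores 2, 8 and roughly $5.2 \times 10^5$ for $q = 1$, $0.5$ and $0.1$, so the gap between the two vectors widens as $q$ decreases.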
Smaller values of $q$ simply indicate more emphasis on sparsity of $\mathbf{x}$, e.g., [18]. Motivated by this, the vector of source estimates $\hat{\mathbf{s}}^e$ can be computed by solving at each TF point $[k, l]$ the $P_q$ problem defined by replacing $\|\hat{\mathbf{s}}^e\|_{\text{sparse}}$ in (11) with $\|\hat{\mathbf{s}}^e\|_q$.

B. Solving $P_q$

The optimization problem $P_q$ is not convex, and thus computationally challenging. Under certain conditions, it can be shown that a near minimizer can be obtained by solving the convex $P_1$ problem [17], [19]. However, one would not want to impose any a priori conditions on the sparsity of the Gabor coefficients of the source vectors. Without such conditions, only local optimization algorithms are available in the literature [19]. On the other hand, we demonstrate here that the $P_q$ problem with $0 < q < 1$ can be solved in combinatorial time whenever the mixing matrix $A$ is real.

Theorem 1: Let $A = [\mathbf{a}_1 | \mathbf{a}_2 | \dots | \mathbf{a}_n]$ be an $m \times n$ matrix with $n > m$, $A_{ij} \in \mathbb{R}$, and the column vectors $\mathbf{a}_i$ have unit norms. Suppose $A$ is full rank. For $0 < q < 1$, the $P_q$ problem

$$\min_{\mathbf{s}} \|\mathbf{s}\|_q \quad \text{subject to} \quad A\mathbf{s} = \mathbf{x},$$

where $\mathbf{x} \in \mathbb{R}^m$, has a solution $\mathbf{s}^* = (s^*_1, \dots, s^*_n)$ which has $k \le m$ non-zero components. Moreover, if the non-zero components of $\mathbf{s}^*$ are $s^*_{i(j)}$, $j = 1, \dots, k$, then the corresponding column vectors $\{\mathbf{a}_{i(j)} : j = 1, \dots, k\}$ of $A$ are linearly independent.

The proof of this theorem is long, and therefore will be omitted in this paper.

Theorem 1 renders the $P_q$ problem computationally tractable, as it shows that there are only a finite number of solutions of $P_q$, and suggests a combinatorial algorithm for solving $P_q$. More precisely, let $\mathcal{A}$ be the set of all $m \times m$ invertible sub-matrices of $A$ ($\mathcal{A}$ is non-empty as $A$ is full rank). The solution of $P_q$ will then be given by the solution of

$$\min \|B^{-1} \mathbf{x}_B\|_q \quad \text{where } B \in \mathcal{A}. \tag{12}$$

Here, for $B = [\mathbf{a}_{i(1)} | \cdots | \mathbf{a}_{i(m)}]$, $\mathbf{x}_B := [x_{i(1)} \cdots x_{i(m)}]$.
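The combinatorial search suggested by Theorem 1 and (12) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses only the Python standard library, the helper names `solve_square` and `lq_basis_pursuit` are hypothetical, each square system is solved against the full $m$-dimensional mixture vector $\mathbf{x}$ (one reading of (12)), and sub-matrices whose pivot falls below an arbitrary tolerance are treated as singular. Because $t \mapsto t^{1/q}$ is increasing, comparing $\sum_i |c_i|^q$ selects the same sub-matrix as comparing the quasi-norm itself.

```python
from itertools import combinations


def solve_square(B, x, tol=1e-12):
    """Solve B c = x by Gaussian elimination with partial pivoting.

    Returns None when B is (numerically) singular.
    """
    m = len(B)
    M = [list(row) + [xi] for row, xi in zip(B, x)]  # augmented matrix [B | x]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))  # pivot row
        if abs(M[p][c]) < tol:
            return None
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        acc = M[r][m] - sum(M[r][k] * coeffs[k] for k in range(r + 1, m))
        coeffs[r] = acc / M[r][r]
    return coeffs


def lq_basis_pursuit(A, x, q):
    """Search all m-column subsets of A (m x n) for the solution of A s = x
    supported on m columns that minimizes sum_i |s_i|^q."""
    m, n = len(A), len(A[0])
    best_cost, best = float("inf"), None
    for cols in combinations(range(n), m):
        B = [[A[r][c] for c in cols] for r in range(m)]
        coeffs = solve_square(B, x)
        if coeffs is None:  # skip non-invertible sub-matrices
            continue
        cost = sum(abs(v) ** q for v in coeffs)
        if cost < best_cost:
            s = [0.0] * n
            for c, v in zip(cols, coeffs):
                s[c] = v
            best_cost, best = cost, s
    return best


# Toy example (made-up numbers): 3 unit-norm columns, 2 mixtures.
r = 2 ** -0.5
A = [[1.0, 0.0, r],
     [0.0, 1.0, r]]
x = [1.0, 0.9]
print(lq_basis_pursuit(A, x, q=0.5))
```

In this toy case the search selects the first and third columns, giving a solution close to $(0.1, 0, 0.9\sqrt{2})$. The routine also accepts complex entries, although the guarantee of Theorem 1 is stated only for real $A$. The cost grows with the number of $m$-column subsets, which is the price of the exhaustive search at each TF point.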
Note that $\#\mathcal{A} \le \binom{n}{m}$; thus (12) is a combinatorial problem in the case when the mixing matrix $A$ and the mixture $\mathbf{x}$ are real-valued.

Though Theorem 1 in general does not hold when the matrix $A$ is complex (a counterexample and discussion can be found in [20]), the goal of finding the solution with the smallest $\ell^q$ norm is to impose sparsity. We thus propose to extract the sources using an $\ell^q$-basis-pursuit approach, i.e., to find the best basis composed of a subset of columns of $A$ that minimizes the $\ell^q$ norm of the solution vector. As shown above, this is equivalent to solving the $P_q$ problem in the real-valued case. Moreover, in the complex-valued case, [20] demonstrated that for $q = 1$, the combinatorial solution is a good approximation of the true solution and can be obtained much faster.

Thus, the proposed separation algorithm can be summarized as follows. At each TF point $[k, l]$:
1) Construct the estimated mixing matrix $\tilde{A}[l]$ as in (9).
2) Solve the $\ell^q$-basis-pursuit problem with $A = \tilde{A}[l]$ as described above for some $0 < q < 1$ to find the estimated source vector $\hat{\mathbf{s}}^e[k, l]$.
3) Repeat steps 1 and 2 for all TF points and then reconstruct $s^e(t)$, the time domain estimate of the sources, from the estimated Gabor coefficients.

V. INTERFERENCE SUPPRESSION AND DISTORTION REDUCTION

A slight variation of the algorithm introduces a user-set parameter $\rho$ to refine the source estimates by increasing sparsity. At each TF point the estimates of the smallest sources whose combined power contribution is less than $100(1 - \rho)\%$ of the total power are set to zero. Thus, for $\rho = 0$ only the highest estimate is kept, and for $\rho = 1$ all estimates are kept. The idea here is to remove contributions due to noise or due to errors in estimating the mixing matrix.

VI. EXPERIMENTS AND RESULTS

To test the performance of the proposed algorithm, we use three measures defined in [21]: the Signal-to-Interference (SIR), Signal-to-Artifact (SAR) and Signal-to-Distortion (SDR) ratios. SIR measures the amount of interference due to other sources present in a certain estimated source, while SAR measures artifacts due to algorithmic effects such as forced or unnatural zeros in the STFT of sources, and SDR is an aggregate measure of distortion in an extracted source relative to the original.

The importance of the algorithm being able to use all the available mixtures is highlighted by performing demixing of 5 sources using a decreasing number of mixtures starting with the even-determined case, and comparing the separation performance against that of DUET. To generate the 5 mixtures, a mixing model composed of random attenuation parameters $\sim U(0.5, 1)$ and delay parameters $\sim U(-1, 1)$ was employed. The proposed BSS algorithm was first applied using 5 available mixtures and the experiment was repeated using 4, 3 and finally 2 mixtures. Figure 2 shows the SDR, SIR and SAR resulting from separation as a function of $q$ varying from 0 to 1 in steps of 0.1. As expected, separation performance drops as the number of mixtures used drops.

[Figure 2: three panels plotting (a) average SDR, (b) average SIR and (c) average SAR for estimated sources ($n = 5$, $\rho = 0.8$) against the quasi-norm $q$, with curves for $m = 5$, $m = 4$, $m = 3$, $m = 2$ and DUET.]
Fig. 2. Average SDR, SIR and SAR (over the five sources) obtained from demixing various numbers of simulated anechoic mixtures of 5 sources as a function of the norm with a preserved power parameter of 0.8. The horizontal line represents the results obtained using DUET. Across all results, the user estimates the existence of 6 sources.
Notably, even in the case of 2 mixtures, our proposed algorithm outperforms DUET, which is designed to deal with exactly 2 mixtures.

In a set of experiments to further assess the performance of the algorithm, an anechoic room mixing model [22] was used. The simulated scenario involved 3 microphones and several sources placed in the room. The setup involved extracting 4 underlying sources from the 3 mixtures, and experiments were repeated 60 times by varying both the speech sources and their locations in the room. The results are presented in Figure 3 along with the results obtained using DUET, for comparison.

Next, to provide an example on a real mixture, we test the algorithm using the mixtures posted on [23], which have 2 sources and 2 microphones. The microphones are placed 35 cm apart, and the sources are placed 60° to the left of the microphones and 2 m on the mid-perpendicular of the microphones, respectively [23], [24]. Table 1 shows that the proposed algorithm outperforms that of [24], for which the audio separation results can be found at [23].

[Figure 3: three panels plotting (a) average SDR, (b) average SIR and (c) average SAR for estimated sources ($n = 4$, $m = 3$) against the quasi-norm $q$, with curves for $\rho = 1$, $\rho = 0.8$, $\rho = 0.6$ and DUET.]
Fig. 3. Average SDR, SIR and SAR (over 4 sources in 60 experiments) obtained from demixing 3 mixtures when the user estimates the existence of 5 sources. Results are plotted as a function of $q$ for varying preserved power parameter. The horizontal line represents results obtained using DUET.

Table 1. Demixing performance (in dB) with 2 real mixtures of 2 sources, $\rho = 0.7$, $\hat{n} = 2$.

        SIR [24]   SIR (proposed)   SAR [24]   SAR (proposed)   SDR [24]   SDR (proposed)
s1      26.232     40.7632          4.5363     7.4011           4.4967     7.3987
s2      55.410     43.4322          5.6433     10.4101          5.6433     10.4077
mean    40.821     42.0977          5.0898     8.9056           5.0700     8.9032

VII. CONCLUSION

This paper presents a novel BSS algorithm for demixing under-determined, anechoic mixtures. The technique is capable of using all available mixtures, where both attenuations and arrival delays between sensors are considered. Moreover, the proposed technique improves the separation performance by incorporating $\ell^q$ minimization with $q < 1$ to enforce sparsity. By adopting a two-stage approach the proposed method combines the strengths of $\ell^q$ minimization and DUET. In the blind mixing model recovery stage, feature vectors are constructed and used to extract the parameters of the mixing model via clustering. This is followed by a blind source extraction stage based on $\ell^q$ minimization which performs the demixing at every TF point. Experimental results indicate that the proposed algorithm provides significant gains over other BSS techniques capable of using only two mixtures.

REFERENCES

[1] Ö. Yılmaz and S. Rickard, "Blind source separation of speech mixtures via time-frequency masking," IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830–1847, July 2004.
[2] C. Jutten, J. Herault, P. Comon, and E. Sorouchyari, "Blind separation of sources, parts I, II and III," Signal Processing, vol. 24, pp. 1–29, 1991.
[3] A. Bell and T. Sejnowski, "An information-maximization approach to blind separation and blind deconvolution," Neural Computation, vol. 7, pp. 1129–1159, 1995.
[4] M. Lewicki and T. Sejnowski, "Learning overcomplete representations," in Neural Computation, 2000, pp. 12:337–365.
[5] T.-W. Lee, M. Lewicki, M. Girolami, and T. Sejnowski, "Blind source separation of more sources than mixtures using overcomplete representations," IEEE Signal Proc. Letters, vol. 6, no. 4, pp. 87–90, April 1999.
[6] A. Hyvarinen and E. Oja, "Independent component analysis: Algorithms and applications," Neural Networks, vol. 13, no. 4-5, pp. 411–430, 2000.
[7] P. Bofill and M. Zibulevsky, "Blind separation of more sources than mixtures using sparsity of their short-time Fourier transform," in International Workshop on Independent Component Analysis and Blind Signal Separation (ICA), Helsinki, Finland, June 19–22 2000, pp. 87–92.
[8] M. Zibulevsky, B. Pearlmutter, P. Bofill, and P. Kisilev, "Blind Source Separation by Sparse Decomposition of a Signal Dictionary," in Independent Component Analysis: Principles and Practice, S. Roberts and R. Everson, Eds. Cambridge, 2001, ch. 7.
[9] Y. Li, A. Cichocki, S. Amari, S. Shishkin, J. Cao, and F. Gu, "Sparse representation and its applications in blind source separation," in Seventeenth Annual Conference on Neural Information Processing Systems (NIPS-2003), Vancouver, Dec. 2003. [Online]. Available: http://www.bsp.brain.riken.jp/publications/2003/NIPS03LiCiAmShiCaoGu.pdf
[10] A. Jourjine, S. Rickard, and O. Yılmaz, "Blind separation of disjoint orthogonal signals: Demixing N sources from 2 mixtures," in Proc. ICASSP2000, June 5-9, 2000, Istanbul, Turkey, June 2000.
[11] P. Bofill, "Underdetermined blind separation of delayed sound sources in the frequency domain," Neurocomputing, vol. 55, pp. 627–641, 2003.
[12] L. Vielva, D. Erdogmus, and J. Principe, "Underdetermined blind source separation using a probabilistic source sparsity model," in 2nd International Workshop on Independent Component Analysis and Blind Signal Separation, June 2000.
[13] D. Luengo, I. Santamaria, L. Vielva, and C. Pantaleón, "Underdetermined blind separation of sparse sources with instantaneous and convolutive mixtures," in IEEE XIII Workshop on Neural Networks for Signal Processing, 2003.
[14] P. O'Grady, B. Pearlmutter, and S. Rickard, "Survey of sparse and non-sparse methods in source separation," International Journal of Imaging Systems and Technology, vol. 15, no. 1, 2005.
[15] F. Theis and E. Lang, "Formalization of the two-step approach to overcomplete BSS," in Proc. 4th Intern. Conf. on Signal and Image Processing (SIP'02) (Hawaii), N. Younan, Ed., 2002.
[16] J. Cardoso, "Blind signal separation: Statistical principles," Proceedings of IEEE, Special Issue on Blind System Identification and Estimation, pp. 2009–2025, October 1998.
[17] D. Donoho and M. Elad, "Optimally sparse representation in general (nonorthogonal) dictionaries via $\ell^1$ minimization," in Proc. Natl. Acad. Sci. USA 100 (2003), 2197-2202.
[18] D. Donoho, "Sparse components of images and optimal atomic decompositions," Constructive Approximation, vol. 17, pp. 352–382, 2001.
[19] D. Malioutov, "A sparse signal reconstruction perspective for source localization with sensor arrays," Master's thesis, MIT, 2003.
[20] S. Winter, H. Sawada, and S. Makino, "On real and complex valued l1-norm minimization for overcomplete blind source separation," in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, October 2005.
[21] R. Gribonval, L. Benaroya, E. Vincent, and C. Fevotte, "Proposal for performance measurement in source separation," in Proceedings of 4th International Symposium on Independent Component Analysis and Blind Signal Separation (ICA2003), April 2003, pp. 763–768.
[22] S. Rickard, "Personal communication," 2005.
[23] [Online]. Available: http://medi.uni-oldenburg.de/demo/demoseparation.html
[24] J. Anemuller and B. Kollmeier, "Adaptive separation of acoustic sources for anechoic conditions: a constrained frequency domain approach," Speech Commun., vol. 39, no. 1-2, pp. 79–95, 2003.