A study of a delay magnification system inspired by the Ormia ochracea hearing system

Abstract: This paper studies the delay magnification mechanism of the parasitoid fly Ormia ochracea and proceeds by proposing and analyzing a simpler delay magnification system inspired by it. The proposed system, combined with a conventional cross-correlogram, has the potential to localize competing sound sources with ITDs in the microsecond range.

I. INTRODUCTION

Sound localization is a necessary preprocessing task for robust speech separation and noise reduction systems. The human auditory system employs binaural processing when dealing with sound localization [1]. Binaural processing employs inter-aural time differences (ITD) along with inter-aural level differences (ILD). Coincidence detection models have been proposed for the computation of the ITD in humans [2]. Low-power analog VLSI coincidence detection architectures have also been implemented [3-7]. However, when the distance between the acoustic sensors becomes very small, the available ITD and ILD values become challenging to use with coincidence detection systems, given that their delay lines can generally resolve minimum delays in the range of tens of microseconds.

Ormia ochracea (O²), a parasitoid fly, possesses a remarkable hearing system which copes with minute ITD values [8]. Although the distance between its ears is so small (approximately 520 µm) that the ITD and ILD values are also very small, its hearing system has been found to act as a multiplier of the ITD value. Recently, certain groups have developed and constructed low-noise miniaturized differential microphones based on the mechanical model of the Ormia ochracea hearing system [9-12]. Our study analyzes the delay magnification mechanism of the Ormia's ears based on the dynamic behavior proposed in [8]. As a result of our analysis, an Ormia-inspired signal processing scheme suitable for the magnification of the ITD in a miniature binaural sound localization system is proposed.
The paper is organized as follows: Part II introduces the mechanical model of the Ormia's hearing system. Part III briefly analyzes the operation of the natural O² system and proposes a new O² system inspired by it. Part IV presents cross-correlogram-based localization results from a system employing the O²-inspired system, while Part V discusses errors due to mismatches in the proposed system.

Manuscript received April 16, 2010. This work was supported by The Imperial Royal Thai scholarship in Biomedical Engineering.
M. Kongpoon is with the Biomedical Engineering Department, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK (e-mail: m.kongpoon07@imperial.ac.uk).
Y. N. Billeh was with the Biomedical Engineering Department, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK (e-mail: yazan.billeh06@imperial.ac.uk).
E. M. Drakakis is with the Biomedical Engineering Department, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK (e-mail: e.drakakis@imperial.ac.uk).

Figure 1. Mechanical model of the Ormia ochracea hearing system proposed in [8].

II. ORMIA OCHRACEA EARS' MECHANICAL MODEL

In [8] Miles and his colleagues proposed the mechanical model of Fig. 1 for the Ormia ochracea hearing system. The spring constants $k$ and $k_3$ and the damping constants $c$ and $c_3$ account for the stiffness and damping at the ends and at the middle of the intertympanal bridge of the fly's ears, respectively. The quantities $x_1$ and $x_2$ denote the displacements of the ear membranes, $s_1$ and $s_2$ are the forces applied at the membranes, and $m$ is the mass at each end.
The dynamic behavior of the model of Fig. 1 can be codified as:

$$
\begin{bmatrix} k+k_3 & k_3 \\ k_3 & k+k_3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
+
\begin{bmatrix} c+c_3 & c_3 \\ c_3 & c+c_3 \end{bmatrix}
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix}
+
\begin{bmatrix} m & 0 \\ 0 & m \end{bmatrix}
\begin{bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{bmatrix}
=
\begin{bmatrix} s_1 \\ s_2 \end{bmatrix}
\tag{1}
$$

It can be shown [8] that the dynamics of the Ormia ochracea hearing system can be expressed as the superposition of two vibration modes: the rocking mode (RM) and the translating mode (TM). The rocking mode responds to the difference between the input forces, while the translating mode responds to their sum.

III. ANALYSIS OF THE O² DELAY MAGNIFICATION SYSTEM

The study of Ormia's ears in [8] suggests the existence of a coupled mechanical structure which can magnify the ITD and ILD. This section explains the mechanism of the delay magnification process of the Ormia's ears based on the dynamic response (1) and proposes a simpler delay magnification structure.

A Study of a Delay Magnification System Inspired by the Ormia ochracea Hearing System. Metha Kongpoon, Yazan N. Billeh, and Emmanuel M. Drakakis, Member, IEEE. Proceedings of the 2010 3rd IEEE RAS & EMBS International Conference on Biomedical Robotics and Biomechatronics, The University of Tokyo, Tokyo, Japan, September 26-29, 2010. 978-1-4244-7709-8/10/$26.00 ©2010 IEEE.

Figure 2. Equivalent signal flow graph of the O² mechanical model. The system receives the inputs $s_1$ and $s_2$, which are scaled by the factors $1/k_r$ and $1/k_t$ and are processed by the two filters RM and TM. The two outputs of the system are denoted by $x_1$ and $x_2$, respectively.

TABLE I. THE PARAMETERS OF THE O² SYSTEM IN FIG. 2

Parameter | RM filter | TM filter
$Q_{0r/t}$ | $\sqrt{km}/c$ | $\sqrt{(k+2k_3)m}/(c+2c_3)$
$\omega_{0r/t}$ | $\sqrt{k/m}$ | $\sqrt{(k+2k_3)/m}$
$k_{r/t}$ | $2m\omega_{0r}^2$ | $2m\omega_{0t}^2$
$H_{r/t}(s)$ | $\omega_{0r}^2/[s^2+(\omega_{0r}/Q_{0r})s+\omega_{0r}^2]$ | $\omega_{0t}^2/[s^2+(\omega_{0t}/Q_{0t})s+\omega_{0t}^2]$

Figure 3. The ITD of the O² hearing system as a function of the frequency and the physical delay.
The parameters of the O² hearing system are: $k$ = 0.576 N/m, $k_3$ = 5.18 N/m, $c$ = 1.15×10⁻⁵ N s/m, $c_3$ = 2.88×10⁻⁵ N s/m and $m$ = 2.88×10⁻¹⁰ kg [8].

A. Detailed Analysis of the O² Delay Magnification System

In this subsection the equivalent signal flow graph of the mechanical model of the Ormia's ears shown in Fig. 1 is derived. The delay characteristic of the Ormia's ear membranes is then derived and explained from this signal flow graph.

To derive the equivalent signal flow graph of Fig. 1, we apply the Laplace transform to both sides of (1), which results in two equations. After subtracting and adding these two equations and rearranging them in the form of two transfer functions, the system codified by (1) can be expressed by means of the signal flow graph shown in Fig. 2. The quantities appearing in Fig. 2 are listed in Table I, with $k$, $k_3$, $c$, $c_3$ and $m$ illustrated in Fig. 1. The transfer functions $H_r(s)$ and $H_t(s)$ correspond to the rocking and the translating modes and are therefore termed the RM and TM filters, respectively. The signal flow graph that we derived in Fig. 2 is equivalent to the modal solution for (1) shown in [8].

We term the delay present between the two ear inputs the physical delay $\tau_{pd}$, and the delay appearing between the two outputs $x_1$ and $x_2$ of the O² system shown in Fig. 2 the ITD. The characteristic of the ITD resulting from the O² hearing system of Fig. 2 can be seen in Fig. 3. Fig. 3 is produced by varying the frequency and the physical delay of the inputs $s_1$ and $s_2$ in Fig. 2, which are assumed to be pure sinusoids of the same amplitude and frequency, with one delayed by $\tau_{pd}$ relative to the other. The frequency and $\tau_{pd}$ vary from 0 to 25 kHz and from 2.5 to 12 µs, respectively. The system parameters used to produce Fig. 3 are those of [8] listed above.
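One point of the surface in Fig. 3 can be sketched numerically from Table I and the signal flow graph of Fig. 2. The snippet below is a minimal sketch, not the authors' code. It assumes that $x_1$ is the sum and $x_2$ the difference of the TM and RM branch outputs, and it takes $k_3$ = 5.18 N/m, the value for which this sketch reproduces an ITD of roughly 50 µs at 1 kHz for a 2.5 µs physical delay. The function names are hypothetical.

```python
import cmath
import math

# Parameters attributed to [8]. K3 = 5.18 N/m is an assumption of this sketch:
# it is the value that yields an ITD close to 50 us at 1 kHz and 2.5 us.
K, K3 = 0.576, 5.18        # spring constants, N/m
C, C3 = 1.15e-5, 2.88e-5   # damping constants, N s/m
M = 2.88e-10               # mass at each end, kg


def modal_parameters():
    """Return (w0, Q0, k) for the RM filter and for the TM filter (Table I)."""
    w0r = math.sqrt(K / M)
    w0t = math.sqrt((K + 2 * K3) / M)
    q0r = math.sqrt(K * M) / C
    q0t = math.sqrt((K + 2 * K3) * M) / (C + 2 * C3)
    return (w0r, q0r, 2 * M * w0r ** 2), (w0t, q0t, 2 * M * w0t ** 2)


def second_order(s, w0, q0):
    """Second-order low-pass H(s) = w0^2 / (s^2 + (w0/Q0) s + w0^2)."""
    return w0 ** 2 / (s ** 2 + (w0 / q0) * s + w0 ** 2)


def o2_itd(freq_hz, tau_pd):
    """Delay between x1 and x2 for equal-amplitude sinusoidal inputs.

    s2 is s1 delayed by tau_pd; a positive result means x1 leads x2.
    """
    w = 2 * math.pi * freq_hz
    s = 1j * w
    (w0r, q0r, kr), (w0t, q0t, kt) = modal_parameters()
    s1 = 1.0                              # phasor of the first input
    s2 = cmath.exp(-1j * w * tau_pd)      # phasor of the delayed input
    rocking = second_order(s, w0r, q0r) / kr * (s1 - s2)
    translating = second_order(s, w0t, q0t) / kt * (s1 + s2)
    x1 = translating + rocking            # assumed output arithmetic
    x2 = translating - rocking
    return cmath.phase(x1 / x2) / w


if __name__ == "__main__":
    itd = o2_itd(1000.0, 2.5e-6)
    print(f"ITD = {itd * 1e6:.1f} us, magnification = {itd / 2.5e-6:.1f}")
```

With these values the rocking and translating resonances fall near 7.1 kHz and 31 kHz, and sweeping `freq_hz` and `tau_pd` over the ranges quoted above would trace out a surface of the kind shown in Fig. 3.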
From Fig. 3 it can be seen, for example, that for $\tau_{pd}$ = 2.5 µs and a frequency around 1 kHz the ITD is around 50 µs, which is 20 times the physical delay. It is worth noting that at low frequencies the ITD is approximately a linear function of the physical delay. We can derive the identity for the characteristic of the ITD in Fig. 3 from the phase response of the O² system outputs as:

$$
\mathrm{ITD}(\lambda,\varphi,\omega) = -\frac{1}{\omega}\cos^{-1}\!\left[\frac{1-\lambda^{2}}{\left(\lambda^{4}+2\lambda^{2}\cos(2\varphi)+1\right)^{1/2}}\right]
\tag{2}
$$

The derivation of (2) assumes sinusoidal incident sound waves of frequency $\omega$ rad/s with the same amplitudes arriving at both ears with the physical delay $\tau_{pd}$. In (2), $\lambda(\omega,\tau_{pd}) = |C(\omega,\tau_{pd})|/|D(\omega,\tau_{pd})|$, with $|C(\omega,\tau_{pd})|$ and $|D(\omega,\tau_{pd})|$ the amplitudes of the output signals from the RM and TM filters of Fig. 2, respectively. Thus $\lambda$ can be controlled internally through the gains of the RM and TM filters. The quantity $\varphi$ is the difference in phase provided by the RM and TM filters at their outputs. A useful feature to be noted from (2) is that, for a given frequency $\omega$, the ITD depends on $\varphi$ and $\lambda$ only.

It should be pointed out that the purpose of the input arithmetic blocks is to extract the information of the physical delay and modulate the magnitude of the input signals of the RM and TM filters. The RM and TM filters amplify the physical delay by means of their magnitude gains. Finally, the output arithmetic blocks are used to include the amplified physical delay back into the phase terms of the output signals.

B. The Modified O² System

From the previous subsection it should be clear that the ITD between the outputs of the Ormia's ears is controlled by phase differences and gain ratios between the RM and TM filters. This subsection investigates a simpler O² system which results in higher delay gain than that of the original system.
Now, bearing in mind Fig. 2, consider a modified O² system where the $1/k_r$ and $1/k_t$ blocks are removed and the RM and TM filters are reduced to simple gain elements whose gain magnitudes are denoted by $\nu$ and $\rho$, respectively.

Figure 4. The effect of increasing the delay gain $\beta$ and the frequency $\omega/2\pi$ on the ITD relation (6) when the input physical delay amounts to 5 µs. (a): the delay gain varies from 20 to 100. (b): the frequency varies from 0.1 to 2 kHz.

Figure 5. Tuning characteristics of the O² system. (a): The delay gain takes the values 20, 10 and 6.67, providing the same ITD for physical delay values of 5, 10 and 15 µs, respectively. (b): With the physical delay set to 5 µs, when the delay gain $\beta$ takes the values 20.18, 20.7 and 21.6, the same ITD value of 100 µs is achieved at frequencies of 0.5, 1 and 1.5 kHz, respectively.

Assuming $s_1(t) = \sin(\omega t)$, $s_2(t) = \sin(\omega(t-\tau_{pd}))$ and the delay gain $\beta = \nu/\rho$, the cross-correlation between the outputs of the O² system is computed by:

$$
c(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{+T} x_1(t)\,x_2(t+\tau)\,dt
\tag{3}
$$

The ITD value corresponds to the value of $\tau$ where the maximum value of the cross-correlation occurs. From (3) a closed form for the ITD can be computed:

$$
\mathrm{ITD} = -\frac{1}{\omega}\tan^{-1}\!\left[\frac{2\beta\sin(\omega\tau_{pd})}{1-\beta^{2}+(1+\beta^{2})\cos(\omega\tau_{pd})}\right]
\tag{4}
$$

Relation (4) reveals a complex relation between the physical delay and the ITD. To make the analysis simpler, we assume that $s_1(t) = s(t)$ and $s_2(t) = s(t+\tau_{pd})$ are the received narrow-band signals. Expanding $s(t+\tau_{pd})$ by means of a Taylor series around $t$, we have:

$$
s(t+\tau_{pd}) = s(t) + \tau_{pd}\,\dot{s}(t) + \frac{1}{2}\tau_{pd}^{2}\,\ddot{s}(t) + \dots
\tag{5}
$$

When $\tau_{pd}^{2}$ on the right-hand side of (5) is very small, the third term of (5) can be neglected.
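Relation (4) can be checked against a direct phasor computation of the two outputs of the gain-only system. The snippet below is a minimal sketch with hypothetical function names. It assumes, as in the original O² structure, that the RM gain $\nu$ acts on $s_1 - s_2$ and the TM gain $\rho$ acts on $s_1 + s_2$, and it uses `atan2` in place of the plain arctangent so that the result stays continuous when the denominator of (4) changes sign.

```python
import cmath
import math


def itd_closed_form(beta, tau_pd, freq_hz):
    """Relation (4); negative values mean that x2 lags x1."""
    w = 2 * math.pi * freq_hz
    theta = w * tau_pd
    num = 2 * beta * math.sin(theta)
    den = 1 - beta ** 2 + (1 + beta ** 2) * math.cos(theta)
    # atan2 instead of atan: an assumption of this sketch, see lead-in.
    return -math.atan2(num, den) / w


def itd_from_phasors(beta, tau_pd, freq_hz):
    """Phase of x2 relative to x1, divided by w, for the gain-only O2 system."""
    w = 2 * math.pi * freq_hz
    s1 = 1.0                            # phasor of sin(w t)
    s2 = cmath.exp(-1j * w * tau_pd)    # phasor of sin(w (t - tau_pd))
    nu, rho = beta, 1.0                 # RM and TM gains, beta = nu / rho
    x1 = rho * (s1 + s2) + nu * (s1 - s2)
    x2 = rho * (s1 + s2) - nu * (s1 - s2)
    return cmath.phase(x2 / x1) / w


if __name__ == "__main__":
    for beta, f in ((20.18, 500.0), (20.7, 1000.0), (21.6, 1500.0)):
        a = itd_closed_form(beta, 5e-6, f)
        b = itd_from_phasors(beta, 5e-6, f)
        print(f"beta={beta}, f={f:.0f} Hz: (4) -> {a * 1e6:.2f} us, phasors -> {b * 1e6:.2f} us")
```

For the three $(\beta, f)$ pairs of Fig. 5.b both routes should give an ITD magnitude close to 100 µs for a 5 µs physical delay.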
Replacing $s_2(t)$ with $s(t+\tau_{pd})$, applying the Laplace transform and rearranging $s_1(t)$ and $s_2(t)$, (4) can be approximated by (6):

$$
\mathrm{ITD} \approx -\frac{1}{\omega}\tan^{-1}\!\left[\frac{\beta\omega\tau_{pd}}{1+(\omega\tau_{pd}/2)^{2}-(\beta\omega\tau_{pd}/2)^{2}}\right]
\tag{6}
$$

Next, we analyze the effect of the $\beta$ and $\omega$ values on the resulting ITD as expressed by (6). At low $\omega$ and $\beta$ values where $\beta\omega\tau_{pd} \ll 1$, the ITD in (6) can be simplified to $\mathrm{ITD} \approx -\beta\tau_{pd}$ (since $\tan^{-1}(\beta\omega\tau_{pd}) \approx \beta\omega\tau_{pd}$ when $\beta\omega\tau_{pd} \ll 1$). This means that the magnitude of the ITD is approximately equal to the physical delay $\tau_{pd}$ multiplied by the delay gain $\beta$. For given physical delay and $\beta$ values, the ITD decreases monotonically with $\omega$ as shown in Fig. 4.a. Fig. 4.b illustrates the dependence of the ITD upon $\beta$ for various input frequencies.

Bearing in mind (6), we investigate the tuning properties of the proposed system. Assuming that $\omega\tau_{pd} \ll 1$, the term $(\omega\tau_{pd}/2)^{2}$ can be neglected with respect to the term $(\beta\omega\tau_{pd}/2)^{2}$ and the ITD value depends exclusively on $\beta\omega\tau_{pd}$. Consequently, so long as the product $\beta\omega\tau_{pd}$ preserves its value for different physical delay values $\tau_{pd}$ (this can be achieved by varying the value of $\beta$ accordingly), the same targeted ITD values can be attained. This is shown in Fig. 5.a. Conversely, for a given $\tau_{pd}$ value the same targeted ITD value can be achieved at different frequencies of interest by varying the $\beta$ value. This is shown in Fig. 5.b.

By comparing the delay characteristic of the ITD provided by the original O² system, which is comprised of two second-order filters (see Fig. 3), with the ITD of the simple O² system composed of delay gains (see Fig. 4.a), it can be seen that the simple O² system provides higher delay gain.

IV. CROSS-CORRELOGRAM OF THE O² SYSTEM

This section introduces the cross-correlogram as proposed in [13] and the cross-correlogram combined with the simple O² system proposed in section III.B.

A. The Conventional Cross-correlogram

The cross-correlogram [13] is an analysis tool that uses spectrum analysis by cochlea filterbanks and cross-correlation units to extract the ITD information from input speech signals. Its computation is equivalent to the coincidence detection model [2]. It consists of $M$ identical channels for each of the left and right cochlea filterbanks and the correlator units. In this analysis the cochlea filters are modeled by means of $N$th-order Differential All-Pole Gammatone Filters (DAPGF) [14] whose transfer function is given by:

$$
H_{DAPGF}(s) = \frac{\omega_0^{2N-1}\,s}{\left[s^{2}+(\omega_0/Q)\,s+\omega_0^{2}\right]^{N}}
\tag{7}
$$

where $\omega_0$ and $Q$ are the pole frequency and the quality factor, respectively. At each $m$th channel frequency, the outputs of the left and right cochlea filters ($l_m$ and $r_m$, respectively) are the inputs to the correlator unit, which consists of a pair of $T$ stages of $\tau$ µs delay units emerging from the left and right cochlea sides at the $m$th channel frequency. At the $j$th delay unit and $t$th time frame, the cross-correlation between the signals coming from the left and right cochlea filters using $K$ samples can be computed as:

$$
C(t,m,j) = \sum_{k=t-K}^{t} r_m(k-j)\times l_m\big(k-(T-j)\big)
\tag{8}
$$

The ITD corresponding to the $m$th frequency channel can be approximated by the location of the time lag along the $T$ delay stages where the maximum value of the cross-correlation takes place.

Figure 6. The cross-correlogram of a single sound source with 200 µs physical delay. The left and right filterbanks are composed of 200 6th-order DAPGFs each. Each correlator consists of a pair of 60 delay stages of 10 µs delay each.

Figure 7. The proposed O²-inspired system combined with the coincidence detection model.

Figure 8. a) The cross-correlogram including the O² system for the same sound source as the previous example, with a physical delay of -4 µs and a delay gain of 50 at low frequencies. The cross-correlation is composed of 32 stages of 10 µs of delay each. b) Re-mapping of the cross-correlogram of Fig. 8.a using the inverse relation codified by (9).

To illustrate the concept of the cross-correlogram, a real input speech signal from a male speaker is recorded from a microphone with a sampling rate of 44 kHz. This signal is assumed to be received from the first acoustic sensor. The second acoustic sensor is assumed to receive a delayed version of the same signal, delayed by the physical delay $\tau_{pd}$. In this example a sound source placed near the left acoustic sensor is assumed, which corresponds to a $\tau_{pd}$ value of -200 µs. Signals from the right and left sensors are windowed using a 45.5 ms rectangular window. These windowed signals are also interpolated using MATLAB spline functions in order to generate an input signal with enough time resolution; the time resolution is set to 0.5 µs. The interpolated signals are then passed to the left and right cochlea filterbanks, where the total number and the order of the DAPGFs ($M$ and $N$) at each side are set to 200 and 6, respectively. The center frequencies of the cochlea filters range from 0.08 to 5 kHz and are equally spaced on the ERB scale [15]. The cross-correlation between the outputs of the right and left DAPGFs is computed independently for each frequency channel using (8), where the total number of delay stages $T$ is 60 and each delay unit $\tau$ is equivalent to a 10 µs delay. The resulting cross-correlogram, computed by means of MATLAB, is shown in Fig. 6.

From Fig. 6 it can be seen that the main peak, where the maximum of the cross-correlation occurs (highlighted in red), takes place at -200 µs, which corresponds to the input physical delay. Besides the main peak, “ambiguous” peaks also appear on the left and right of the main peak. This phenomenon is known to be caused by the high frequency of the signal compared to $\tau_{pd}$ [16].

Figure 9. The cross-correlogram (top) and the pooled cross-correlogram (bottom) of a mixture of two sound sources with delays of -10 µs and 2 µs, respectively. a) Cross-correlogram without the O² system using 60 stages of 0.5 µs delay units. b) Re-mapped cross-correlogram including the O² system using 30 stages of 10 µs delay units.

In practice, delays of less than 10 µs are difficult to resolve [3-7] due to the restriction posed by the resolution of the delay units. When the physical delay is less than the resolution of the delay unit, it is not possible to extract the desired ITD from the cross-correlogram. From the analysis in the previous section, it should be clear that the O² system with simple delay gain elements can magnify a physical delay of less than 10 µs to the range of hundreds of microseconds. In what follows we study the cross-correlogram of a system which incorporates the O² unit.

B. The Modified O² System with the Cross-correlogram

This subsection investigates the simplified Ormia-inspired ITD magnification scheme of subsection III.B in conjunction with the conventional cross-correlogram technique of subsection IV.A.

Now consider the system in Fig. 7, where we include the proposed O² system as a preprocessing stage for the cross-correlogram. In Fig. 7, the O² system proposed in section III.B receives the outputs from both microphones and produces a pair of signals with magnified physical delay. These signals are filtered by the left and right cochlea filters. Pairs of outputs from left and right cochlea filters of the same frequency are cross-correlated in the coincidence detection model. Given that this cross-correlation corresponds to magnified delays, the resulting cross-correlogram is mapped back to the “correct” (i.e. non-magnified) one.
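The effect of placing the gain-only O² stage in front of a coarse correlator can be sketched for a single tone. This is a minimal illustration under stated assumptions, not the system of Fig. 7: the cochlea filterbank is omitted, a 500 Hz sinusoid stands in for speech, the O² stage uses a TM gain of 1 and an RM gain of $\beta$, and the peak is mapped back with the low-frequency approximation (magnified delay divided by $\beta$) rather than an exact inverse. All names are hypothetical.

```python
import math

FREQ_HZ = 500.0     # single test tone (assumption; the text uses speech)
TAU_PD = -4e-6      # physical delay in seconds, below the 10 us stage delay
BETA = 50.0         # delay gain
UNIT = 10e-6        # delay of one correlator stage, seconds
STAGES = 32         # delay stages on each side of zero lag
OMEGA = 2 * math.pi * FREQ_HZ


def s1(t):
    return math.sin(OMEGA * t)


def s2(t):
    return math.sin(OMEGA * (t - TAU_PD))


def o2_stage(first, second, beta):
    """Gain-only O2 stage: unit gain on the sum, gain beta on the difference."""
    def x1(t):
        return (first(t) + second(t)) + beta * (first(t) - second(t))

    def x2(t):
        return (first(t) + second(t)) - beta * (first(t) - second(t))

    return x1, x2


def peak_lag(left, right, unit=UNIT, stages=STAGES, window=10e-3, step=20e-6):
    """Lag (a multiple of `unit`) that maximises sum_t left(t) * right(t + lag)."""
    samples = round(window / step)   # 10 ms covers five whole periods at 500 Hz
    best_lag, best_value = 0.0, -math.inf
    for j in range(-stages, stages + 1):
        lag = j * unit
        value = sum(left(i * step) * right(i * step + lag) for i in range(samples))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag


if __name__ == "__main__":
    plain = peak_lag(s1, s2)
    x1, x2 = o2_stage(s1, s2, BETA)
    magnified = peak_lag(x1, x2)
    print(f"peak without O2 stage: {plain * 1e6:.0f} us")
    print(f"peak with O2 stage:    {magnified * 1e6:.0f} us")
    print(f"estimated delay:       {magnified / BETA * 1e6:.1f} us")
```

Without the O² stage the peak falls on the zero-lag stage, so the 4 µs delay is lost in the 10 µs grid. With the stage the peak moves many stages away from zero, and dividing by $\beta$ recovers a value close to the physical delay.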
To demonstrate the effect of the O² system on a conventional cross-correlogram, the physical delay $\tau_{pd}$ is set to -4 µs, which is less than the 10 µs resolution of the delay unit in the cross-correlator, the delay gain is set to 50, and the rest of the parameters are the same as in the previous example. The resulting cross-correlogram is illustrated in Fig. 8.a. From Fig. 8.a it is clear that the incorporation of the O² system allows for the production of a cross-correlogram even though the delay of the correlator units is longer than the physical delay. However, the cross-correlogram of the system including the O² system has a “modified” shape compared to the ideal cross-correlogram without the O² system. Bearing in mind Fig. 6, it can be seen that at low frequencies the ITD in Fig. 8.a is approximately equal to the product of the physical delay and the delay gain, and that the main peak is bent towards zero at high frequencies. This effect results from the gain frequency characteristic of the O² system as predicted by (6) and Fig. 4.a.

To make the interpretation of the modified cross-correlogram reliable, the inverse relation between the ITD measured from the cross-correlogram with the O² system and the approximated physical delay $\tau'_{pd}$ has to be employed. This inverse relation between the $i$th measured ITD from the cross-correlogram and the corresponding approximated physical delay $\tau'_{pd}(\omega)$ at fixed operating frequency $\omega$ and delay gain $\beta$ can be found from (6) as:

$$
\tau'_{pd}(\omega) =
\begin{cases}
\dfrac{\beta-\sqrt{\beta^{2}-4\alpha_i^{2}(1-\beta^{2})}}{\alpha_i\,\omega\,(1-\beta^{2})}, & \omega<\omega_\theta \\[2ex]
\dfrac{\beta+\sqrt{\beta^{2}-4\alpha_i^{2}(1-\beta^{2})}}{\alpha_i\,\omega\,(1-\beta^{2})}, & \omega>\omega_\theta
\end{cases}
\tag{9}
$$

where $\alpha_i = \tan(\omega\times\mathrm{ITD}_i(\omega))/2$ and $\omega_\theta = \pi/(2\times\mathrm{ITD}_i(\omega))$.

By applying (9) to the values of the cross-correlogram shown in Fig. 8.a, the cross-correlogram of Fig. 8.b can be constructed. From Fig. 8.b it can be observed that the main peak of the “corrected” cross-correlogram occurs around -4 µs, which coincides with the correct physical delay value.
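The re-mapping step can be sketched as a round trip: magnify a known physical delay with the approximation (6), then recover it with the inverse relation (9). The snippet below is a minimal sketch with hypothetical function names, and it rests on three assumptions. The magnified ITD is given the same sign as the physical delay, as observed in the cross-correlogram, so the leading minus sign of (6) is dropped. The factor $4\alpha_i^2$ under the square root is the one obtained by solving (6) for $\tau_{pd}$ with $\alpha_i = \tan(\omega\,\mathrm{ITD}_i)/2$. The branch test uses $|\mathrm{ITD}_i|$ so that negative delays are handled too.

```python
import math


def magnified_itd(beta, tau_pd, omega):
    """Approximation (6) with the sign convention of the cross-correlogram.

    atan2 keeps the phase continuous once the denominator turns negative,
    which is where the second branch of (9) takes over.
    """
    x = omega * tau_pd
    den = 1 + (x / 2) ** 2 - (beta * x / 2) ** 2
    return math.atan2(beta * x, den) / omega


def recovered_delay(itd, beta, omega):
    """Inverse relation (9): physical delay from a measured ITD (beta > 1)."""
    if itd == 0.0:
        return 0.0
    alpha = math.tan(omega * itd) / 2
    omega_theta = math.pi / (2 * abs(itd))   # abs(): assumption for negative ITDs
    root = math.sqrt(beta ** 2 - 4 * alpha ** 2 * (1 - beta ** 2))
    numerator = beta - root if omega < omega_theta else beta + root
    return numerator / (alpha * omega * (1 - beta ** 2))


if __name__ == "__main__":
    beta, tau_pd = 50.0, -4e-6
    for freq_hz in (500.0, 2000.0):
        omega = 2 * math.pi * freq_hz
        itd = magnified_itd(beta, tau_pd, omega)
        back = recovered_delay(itd, beta, omega)
        print(f"{freq_hz:.0f} Hz: ITD = {itd * 1e6:.1f} us, recovered = {back * 1e6:.3f} us")
```

The two test frequencies exercise both branches of (9): at 500 Hz the product $\omega \times |\mathrm{ITD}|$ stays below $\pi/2$, while at 2 kHz it exceeds it.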
C. Multiple Sound Sources Localization

In real-life situations a sound source of interest is practically always interfered with by other competing signals. This will affect its localization. The conventional cross-correlogram can locate more than one sound source when the overlap between the input spectra is so small that there are enough frequency channels dominated by each one