
Analyses for Localization
Steve Rogers
Institute for Scientific Research
5 June 2006

Introduction

Cross-correlation based time delay estimates may be used to calculate the distance of an acoustic source from spatially separated acoustic sensors. Several techniques are available, including generalized correlation, maximum likelihood (ML), and filter residual analyses. Generalized correlation based methods are very common, and several related techniques are explained in this study. The ML method of distance estimation is based on the time delay estimated using generalized cross-correlation (GCC) [1-6]. Filter residual methods use a high-pass filter to generate an output residual, which generally contains the information about the excitation source.

The GCC between signals generated by any pair of acoustic sensors can be computed by applying an inverse discrete Fourier transform to the cross power spectral density of the two signals. The MATLAB code for a typical GCC operation is shown below. The 'W' term is a filter which may be derived in a multitude of ways; eight approaches are developed and compared in this study. Four major steps are involved in computing the GCC: 1) calculate the cross power spectral density (XPSD) Pxy, 2) filter the XPSD with W to get R, 3) compute the inverse Fourier transform G of R to return to the time domain, and 4) normalize G.

To calculate the relative distance from the acoustic sensor pair, we must first calculate the relative time delay. The time axis associated with the computed GCC is constructed first, as shown below: the time domain resolution is calculated from the sampling rate Fs and the FFT data length N. The time delay tau is then averaged over the entire sample length. The relative distance dist is computed from tau and the velocity of the acoustic wave in the medium (for air it is ~12,000 in/sec). The relative distance for each pair of sensors may be used for localization of an acoustic source with n sensors.
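The MATLAB listings themselves are not reproduced in this text. As a rough illustration only, the four GCC steps and the conversion from time delay to relative distance described above might be sketched in Python as follows. The function names (dft, gcc, phat_weight, delay_and_distance), the direct O(N^2) DFT, the sign convention (a positive lag means x arrives later than y), and the use of the peak lag as the delay estimate are assumptions made for this sketch; they are not taken from GCC.m, and the study itself averages tau over the sample length.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def phat_weight(pxy_bin):
    """Phase transform weight 1/|Pxy| (hypothetical helper; zero for empty bins)."""
    mag = abs(pxy_bin)
    return 1.0 / mag if mag > 1e-12 else 0.0


def gcc(x, y, weight=None):
    """Generalized cross-correlation of two equal-length sequences."""
    if weight is None:
        weight = lambda pxy_bin: 1.0          # unfiltered case, W = 1
    spec_x, spec_y = dft(x), dft(y)
    # Step 1: cross power spectral density Pxy
    pxy = [a * b.conjugate() for a, b in zip(spec_x, spec_y)]
    # Step 2: apply the frequency domain filter W to get R
    r = [weight(p) * p for p in pxy]
    # Step 3: inverse transform back to the time (lag) domain
    g = [v.real for v in dft(r, inverse=True)]
    # Step 4: normalize so that the largest magnitude is 1
    peak = max(abs(v) for v in g) or 1.0
    return [v / peak for v in g]


def delay_and_distance(g, fs, speed):
    """Peak lag of g converted to a time delay (s) and a relative distance."""
    n = len(g)
    k = max(range(n), key=lambda i: g[i])
    lag = k if k <= n // 2 else k - n          # upper half holds negative lags
    tau = lag / fs                             # time resolution is 1/fs
    return tau, tau * speed
```

For example, if x is y delayed by three samples, gcc(x, y, phat_weight) peaks at lag 3; with Fs = 1.25e6 samples/sec and the 12,000 in/sec figure quoted above, delay_and_distance then returns tau = 2.4e-6 s and a relative distance of 0.0288 in.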
The number of combinations of sensor pairs resulting from an array of n sensors [7] is:

$\binom{n}{2} = \frac{n!}{2!\,(n-2)!}$

If at least 4 sensors (6 data points) are in the array, a good least squares location estimate is possible. This assumes that each sensor is within sensing range of the acoustic source.

The eight filter methods applied to the GCC approach are listed in the table below. They are implemented in the m-file GCC.m. The inputs to GCC.m include Pxx, Pyy, and Pxy. Pxx and Pyy are the power spectra computed from the DFTs (discrete Fourier transforms) of the time series from sensor x and sensor y, and Pxy is their cross power spectrum.

Filter method                 Frequency domain filter W    Comments
none                          1
Roth                          1 / Pxx
Smooth coherence transform    1 / (Pxx * Pyy)
Phase transform               1 / Pxy
Modified scot                 1 / Pxy^0.75
[label not recoverable]       Pxy / (1 − γxy)              γxy = Pxy / (Pxx * Pyy)
[label not recoverable]       Pxy / (Pxx * Pyy)
Maximum likelihood            Pxx Pyy / (Pxx + Pyy)
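As a small worked example of the pair count given above, the following Python sketch enumerates the sensor pairs for a hypothetical 4-sensor array. The function name sensor_pairs and the 1-based sensor numbering are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def sensor_pairs(n):
    """All unordered pairs of sensors numbered 1..n."""
    return list(combinations(range(1, n + 1), 2))


pairs = sensor_pairs(4)
# Each pair yields one relative distance, i.e. one data point
# for the least squares location estimate.
assert len(pairs) == comb(4, 2)
```

For n = 4 this gives the pairs (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), and (3, 4), which are the 6 data points mentioned in the text.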


Two filter residual methods are now presented. Both make use of linear predictive coding as a base [2]. The guiding assumption for both approaches is that the excitation source information may be extracted from the signal using linear prediction analysis [2]. Linear prediction as used in speech processing generates an all-pole filter based on a least squares fit to the input data. The linear prediction residual generation is shown in the figure below.
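As an illustration of the idea, a linear prediction residual can be sketched in Python with the autocorrelation method and the Levinson-Durbin recursion. This is one common way to fit the all-pole model and is an assumption here; the study does not state which fitting procedure was used. The function names are hypothetical, and the default order of 20 follows the filter order quoted in the Conclusions.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x (no normalization)."""
    n = len(x)
    return [sum(x[t] * x[t - k] for t in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Prediction error filter a[0..order], a[0] = 1, from autocorrelation r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break                              # degenerate (e.g. all-zero) input
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                         # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                     # updated prediction error energy
    return a


def lpc_residual(x, order=20):
    """Residual e[t] = x[t] + sum_j a[j] * x[t - j], i.e. x minus its prediction."""
    a = levinson_durbin(autocorr(x, order), order)
    return [
        sum(a[j] * x[t - j] for j in range(order + 1) if t - j >= 0)
        for t in range(len(x))
    ]
```

The residual is what remains after the predictable (low-frequency dominated) part of the signal has been removed, which is why the text describes the overall operation as high-pass filtering.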

The Hilbert envelope approach also involves a linear prediction filter and processing of the residuals as shown in the figure below.

The 'n' above designates the grouping of sensors 1 and 2 for each run. For run 1, n = 1, so that e1 and e2 are correlated. For run 2, n = 2, so that e3 and e4 are correlated. For run 3, n = 3, so that e5 and e6 are correlated.
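The Hilbert envelope of a residual sequence can be sketched in Python as the magnitude of its analytic signal. This definition, the direct O(N^2) DFT, and the function names are assumptions made for illustration; the processing chain in the study's figure may differ in detail.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(x):
    """Magnitude of the analytic signal of the real sequence x."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        gain[k] = 2.0
    analytic = dft([s * g for s, g in zip(spectrum, gain)], inverse=True)
    return [abs(z) for z in analytic]
```

In the approach described above, the envelopes of the two sensors' residuals would then be cross-correlated in place of the raw residuals.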

Analysis

The sampling rate was 1.25 MS/sec. Three different sample runs were taken with dual sensors at one relative position from the acoustic leak source. The file titles are listed as 6_800, which means that sensor 1 is 6 inches from the leak and sensor 2 is 8 inches from the leak. Note that in all runs the relative distance (true distance) is 2 inches. Sensor 1 is on the opposite side of the leak from sensor 2. The plots of the autocorrelations of the various methods along with estimated time delay and estimated distance are shown below.

In each case the calculation of the relative distance estimate is similar. The MATLAB scripts for each distance calculation are shown below.

The table below shows the performance of each method versus the target true distance of 2 inches.

Method              File 6_800, run1    File 6_800, run2    File 6_800, run3
GCC unfiltered      -0.65112            -1.1959             -0.85779
GCC roth            -0.94854            -1.1921              0.46686
GCC scot             0.086751            0.73345            -1.1263
GCC phat            -0.64934            -1.5644             -0.55578
GCC cps-m           -1.0754              0.5686             -0.67619
GCC ht               0.6958             -0.16744             0.057998
GCC ml              -0.39701            -0.30952            -0.15472
GCC wiener           0.8113             -0.052944           -1.4489
LPC                 -1.8009              1.5105             -1.3944
Hilbert envelope    -1.9231              1.5503             -1.3977

In these tests the last two methods listed in the table, the filter residual methods (LPC and Hilbert envelope), appear to be the most consistently close in magnitude to the target value of 2 inches; however, the sign change between runs is of concern. The guiding assumption was that the main information is contained in the residual, which is derived by high-pass filtering the sensor data.
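The comparison can be checked with simple arithmetic on the tabulated values. The Python sketch below ranks the methods by the mean deviation of the estimate's magnitude from 2 inches; this particular error measure, which ignores sign, is an assumption chosen for illustration and is not a metric used in the study.

```python
TRUE_DISTANCE = 2.0  # inches

# Estimated relative distances (run1, run2, run3) copied from the table.
estimates = {
    "GCC unfiltered": (-0.65112, -1.1959, -0.85779),
    "GCC roth": (-0.94854, -1.1921, 0.46686),
    "GCC scot": (0.086751, 0.73345, -1.1263),
    "GCC phat": (-0.64934, -1.5644, -0.55578),
    "GCC cps-m": (-1.0754, 0.5686, -0.67619),
    "GCC ht": (0.6958, -0.16744, 0.057998),
    "GCC ml": (-0.39701, -0.30952, -0.15472),
    "GCC wiener": (0.8113, -0.052944, -1.4489),
    "LPC": (-1.8009, 1.5105, -1.3944),
    "Hilbert envelope": (-1.9231, 1.5503, -1.3977),
}


def mean_magnitude_error(values, target=TRUE_DISTANCE):
    """Mean of | |estimate| - target | over the runs (sign is ignored)."""
    return sum(abs(abs(v) - target) for v in values) / len(values)


# Methods ordered from smallest to largest mean magnitude error.
ranking = sorted(estimates, key=lambda m: mean_magnitude_error(estimates[m]))
```

Under this measure the Hilbert envelope and LPC methods rank first and second, with mean magnitude errors of roughly 0.38 and 0.43 inches, while every GCC variant exceeds 1 inch.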

Conclusions

This study highlights one of the critical issues: improving the accuracy of the relative distance estimation. At this point, the superior distance estimation algorithm appears to be the one based on a residual derived by high-pass filtering the sensor data stream. For this study the low-pass filter was a 20th order linear prediction filter. In the model block diagram below, the block designated low-pass filter may be any of a number of different types of low-pass filters, only one of which has been addressed here. Other issues have to do with calculation of the estimated source location. These will also be investigated.

References

1. Moore, P., et al., An Impulsive Noise Source Position Locator, Final Report, University of Bath, Feb. 2002.
2. Raykar, V., et al., 'Speaker Localization Using Excitation Source Information in Speech,' IEEE Transactions on Speech and Audio Processing, 2005.
3. Varma, K., et al., 'Robust TDE-Based DOA Estimation for Compact Audio Arrays,' 2nd IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM2002), 4-6 August 2002, Rosslyn, VA.
4. Johansson, A., et al., 'Speaker Localization Using the Far-Field SRP-PHAT in Conference Telephony,' ISPACS 2002.
5. Knapp, C., et al., 'The Generalized Correlation Method for Estimation of Time Delay,' IEEE Transactions on Acoustics, Speech, and Signal Processing, v. ASSP-24, no. 4, p. 320-327, August 1976.
6. Brandstein, M. & Ward, D. (eds), Microphone Arrays, Springer, 2001, ISBN 3-540-41953-5.
7. Varma, K., Time-Delay Estimate Based Direction-of-Arrival Estimation for Speech in Reverberant Environments, MS thesis, Virginia Polytechnic Institute and State University, 2002.
