IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 4, MAY 2007
`Direction of Arrival Estimation Using the
`Parameterized Spatial Correlation Matrix
`
`Jacek Dmochowski, Jacob Benesty, Senior Member, IEEE, and Sofiène Affes, Senior Member, IEEE
`
`Abstract—The estimation of the direction-of-arrival (DOA) of
`one or more acoustic sources is an area that has generated much
`interest in recent years, with applications like automatic video
`camera steering and multiparty stereophonic teleconferencing
`entering the market. DOA estimation algorithms are hindered by
`the effects of background noise and reverberation. Methods based
`on the time-differences-of-arrival (TDOA) are commonly used
`to determine the azimuth angle of arrival of an acoustic source.
`TDOA-based methods compute each relative delay using only two
`microphones, even though additional microphones are usually
`available. This paper deals with DOA estimation based on spatial
`spectral estimation, and establishes the parameterized spatial cor-
`relation matrix as the framework for this class of DOA estimators.
`This matrix jointly takes into account all pairs of microphones,
`and is at the heart of several broadband spatial spectral estima-
`tors, including steered-response power (SRP) algorithms. This
`paper reviews and evaluates these broadband spatial spectral esti-
`mators, comparing their performance to TDOA-based locators. In
`addition, an eigenanalysis of the parameterized spatial correlation
`matrix is performed and reveals that such analysis allows one to
`estimate the channel attenuation from factors such as uncalibrated
`microphones. This estimate generalizes the broadband minimum
`variance spatial spectral estimator to more general signal models.
`A DOA estimator based on the multichannel cross correlation
`coefficient (MCCC) is also proposed. The performance of all
`proposed algorithms is included in the evaluation. It is shown that
`adding extra microphones helps combat the effects of background
`noise and reverberation. Furthermore, the link between accurate
`spatial spectral estimation and corresponding DOA estimation
`is investigated. The application of the minimum variance and
`MCCC methods to the spatial spectral estimation problem leads
`to better resolution than that of the commonly used fixed-weighted
`SRP spectrum. However, this increased spatial spectral resolution
`does not always translate to more accurate DOA estimation.
`
`Index Terms—Circular arrays, delay-and-sum beamforming
`(DSB), direction-of-arrival (DOA) estimation, linear spatial predic-
`tion, microphone arrays, multichannel cross correlation coefficient
`(MCCC), spatial correlation matrix, time delay estimation.
`
`I. INTRODUCTION
`
PROPAGATING signals contain much information about the sources that emit them. Indeed, the location of a signal source is of much interest in many applications, and there exists a large and increasing need to locate and track sound sources.
`
`Manuscript received September 6, 2006; revised November 8, 2006. The as-
`sociate editor coordinating the review of this manuscript and approving it for
`publication was Dr. Hiroshi Sawada.
`The authors are with the Institut National de la Recherche Scientifique-
`Énergie, Matériaux, et Télécommunications (INRS-EMT), Université du
`Québec, Montréal, QC H5A 1K6, Canada (e-mail: dmochow@emt.inrs.ca).
`Digital Object Identifier 10.1109/TASL.2006.889795
`
`For example, a signal-enhancing beamformer [1], [2] must con-
`tinuously monitor the position of the desired signal source in
`order to provide the desired directivity and interference sup-
`pression. This paper is concerned with estimating the direc-
`tion-of-arrival (DOA) of acoustic sources in the presence of sig-
`nificant levels of both noise and reverberation.
`The two major classes of broadband DOA estimation
`techniques are those based on the time-differences-of-arrival
`(TDOA) and spatial spectral estimators. The latter terminology
`arises from the fact that spatial frequency corresponds to the
`wavenumber vector, whose direction is that of the propagating
`signal. Therefore, by looking for peaks in the spatial spectrum,
`one is determining the DOAs of the dominant signal sources.
`The TDOA approach is based on the relationship between
`DOA and relative delays across the array. The problem of es-
`timating these relative delays is termed “time delay estimation”
`[3]. The generalized cross-correlation (GCC) approach of [4],
`[5] is the most popular time delay estimation technique. Alter-
`native methods of estimating the TDOA include phase regres-
`sion [6] and linear prediction preprocessing [7]. The resulting
`relative delays are then mapped to the DOA by an appropriate
`inverse function that takes into account array geometry.
`Even though multiple-microphone arrays are commonplace
`in time delay estimation algorithms, there has not emerged a
`clearly preferred way of combining the various measurements
`from multiple microphones. Notice that in the TDOA approach,
`the time delays are estimated using only two microphones at a
`time, even though one usually has several more sensor outputs at
`one’s disposal. The averaging of measurements from indepen-
`dent pairs of microphones is not an optimal way of combining
`the measurements, as each computed time delay is derived from
`only two microphones, and thus often contains significant levels
`of corrupting noise and interference. It is thus well known that
`current TDOA-based DOA estimation algorithms are plagued
`by the effects of both noise and especially reverberation.
`To that end, Griebel and Brandstein [8] map all “realizable”
`combinations of microphone-pair delays to the corresponding
`source locations, and maximize simultaneously the sum (across
`various microphone pairs) of cross-correlations across all pos-
`sible locations. This approach is notable, as it jointly maximizes
`the results of the cross-correlations between the various micro-
`phone pairs.
`The spatial spectral estimation problem is well defined in the
`narrowband signal community. There are three major methods:
`the steered conventional beamformer approach (also termed
`the “Bartlett” estimate), the minimum variance estimator (also
`termed the “Capon” or maximum-likelihood estimator), and
`the linear spatial predictive spectral estimator. Reference [9]
`1558-7916/$25.00 © 2007 IEEE
`provides an excellent overview of these approaches. These
`three approaches are unified in their use of the narrowband
`spatial correlation matrix, as outlined in the next section.
`The situation is more scattered in the broadband signal
`case. Various spectral estimators have been proposed, but there
`does not exist any common framework for organizing these
`approaches. The steered conventional beamformer approach
`applies to broadband signals. The delay-and-sum beamformer
`(DSB) is steered to all possible DOAs to determine the DOA
`which emits the most energy. An alternative formulation of
`this approach is termed the “steered-response power” (SRP)
`method, which exploits the fact that the DSB output power may
`be written as a sum of cross-correlations. The computational
`requirements of the SRP method are a hindrance to practical
`implementation [8]. A detailed treatment of steered-beam-
`former approaches to source localization is given in [10], and
`the statistical optimality of the approach is shown in [11]–[13].
`Krolik and Swingler develop a broadband minimum variance
`estimator based on the steered conventional beamformer [14],
`which may be viewed as an adaptive weighted SRP algorithm.
There have also been approaches that generalize narrowband localization algorithms (e.g., MUSIC [15]) to broadband signals through subband processing and subsequent combining (see [16], for example). A broadband linear spatial predictive approach to time delay estimation is outlined in [17] and [18].
`This approach, which is limited to linear array geometries,
`makes use of all the channels in a joint fashion via the time
`delay parameterized spatial correlation matrix.
`This paper attempts to unify broadband spatial spectral esti-
`mators into a single framework and compares their performance
`from a DOA estimation standpoint to TDOA-based algorithms.
`This unified framework is the azimuth parameterized spatial
`correlation matrix, which is at the heart of all broadband spa-
`tial spectral estimators.
`In addition, several new ideas are presented. First, due to
`the parametrization, well-known narrowband array processing
`notions [19] are applied to the DOA estimation problem, gen-
`eralizing these ideas to the broadband case. A DOA estimator
`based on the eigenanalysis of the parameterized spatial corre-
`lation matrix ensues. More importantly, it is shown that this
`eigenanalysis allows one to estimate the channel attenuation
`from factors such as uncalibrated microphones. The existing
`minimum variance approach to broadband spatial spectral esti-
`mation is reformulated in the context of a more general signal
`model which accounts for such attenuation factors. Further-
`more, the ideas of [17] and [18] are extended to more general
`array geometries (i.e., circular) via the azimuth parameterized
`spatial correlation matrix, resulting in a minimum entropy DOA
`estimator.
Circular arrays (see [20]–[22], for example) offer some advantages over their linear counterparts. A circular array provides spatial discrimination over the entire 360° azimuth range, which is particularly important for applications that require front-to-back signal enhancement, such as teleconferencing. Furthermore, a circular array geometry allows for more compact designs. While the contents of this paper apply generally to planar array geometries, the circular geometry is used throughout the simulation portion.
`
`Fig. 1. Circular array geometry.
`
`Section II presents the signal propagation model in planar
`(i.e., circular) arrays and serves as the foundation for the re-
`mainder of the paper. Section III reviews the role of the tradi-
`tional, nonparameterized spatial correlation matrix in narrow-
`band DOA estimation, and shows how the parameterized ver-
`sion of the spatial correlation matrix allows for generalization
`to broadband signals. Section IV describes the existing and pro-
`posed broadband spatial spectral estimators in terms of the pa-
`rameterized spatial correlation matrix. Section V outlines the
`simulation model employed throughout this paper and evaluates
`the performance of all spatial spectral estimators and TDOA-
`based methods in both reverberation- and noise-limited envi-
`ronments. Concluding statements are given in Section VI.
`The spatial spectral estimation approach to DOA estimation
`has limitations in certain reverberant environments. If an inter-
`fering signal or reflection arrives at the array with a higher en-
`ergy than the direct-path signal, the DOA estimate will be false,
`even though the spatial spectral estimate is accurate. Such situ-
`ations arise when the source is oriented towards a reflective bar-
`rier and away from the array. This problem is beyond the scope
`of this paper and is not addressed herein. Rather, the focus of
`this paper is on the evaluation of spatial spectral estimators in
`noisy and reverberant environments and on their application to
`DOA estimation.
`
II. SIGNAL MODEL

Assume a planar array of $N$ elements in a 2-D geometry, shown in Fig. 1 (i.e., circular geometry), whose outputs are denoted by $x_n(k)$, $n = 0, 1, \ldots, N-1$, where $k$ is the time index. Denoting the azimuth angle of arrival by $\theta$, propagation of the signal from a far-field source to microphone $n$ is modeled as:

$$x_n(k) = \alpha_n s[k - t - \tau_{n0}(\theta)] + v_n(k) \tag{1}$$

where $\alpha_n$, $n = 0, 1, \ldots, N-1$, are the attenuation factors due to channel effects, $t$ is the propagation time, in samples, from the unknown source $s(k)$ to microphone 0, $v_n(k)$ is an additive noise signal at the $n$th microphone, and $\tau_{n0}(\theta)$, $n = 0, 1, \ldots, N-1$, is the relative delay between microphones 0 and $n$. In matrix form, the array signal model becomes:

$$\begin{bmatrix} x_0(k) \\ x_1(k) \\ \vdots \\ x_{N-1}(k) \end{bmatrix} = \begin{bmatrix} \alpha_0 & 0 & \cdots & 0 \\ 0 & \alpha_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{N-1} \end{bmatrix} \begin{bmatrix} s(k - t) \\ s[k - t - \tau_{10}(\theta)] \\ \vdots \\ s[k - t - \tau_{N-1,0}(\theta)] \end{bmatrix} + \begin{bmatrix} v_0(k) \\ v_1(k) \\ \vdots \\ v_{N-1}(k) \end{bmatrix} \tag{2}$$

The function $\tau_{n0}(\theta)$ relates the angle of arrival to the relative delays between microphone elements 0 and $n$, and is derived for the case of an equispaced circular array in the following manner. When operating in the far-field, the time delay between microphone $n$ and the center of the array is given by [23]

$$\tau_n = \frac{d}{c} \cos(\theta - \psi_n), \quad n = 0, 1, \ldots, N-1 \tag{3}$$

where the azimuth angle (relative to the selected angle reference) of the $n$th microphone is denoted by $\psi_n = 2\pi n / N$, $n = 0, 1, \ldots, N-1$, $d$ denotes the array radius, and $c$ is the speed of signal propagation. It easily follows that

$$\tau_{n0}(\theta) = \tau_0 - \tau_n = \frac{d}{c} \left[ \cos(\theta - \psi_0) - \cos(\theta - \psi_n) \right] \tag{4}$$

It is also worth mentioning that the additive noise $v_n(k)$ may be temporally correlated with the desired signal $s(k)$. In that case, a reverberant environment is modeled. The anechoic environment is modeled by making the additive noise temporally uncorrelated with the source signal. In either case, the additive noise may be spatially correlated across the sensors.
It should also be stated that the signal model presented above makes use of the far-field assumption, in that the incoming wave is assumed to be planar, such that all sensors perceive the same DOA. An error is incurred if the signal source is actually located in the near-field; in that case, the relative delays are also a function of the range. In the most general case (i.e., a source in the near-field of a 3-D geometry), the function $\tau_{n0}$ takes three parameters: the azimuth, range, and elevation. This paper focuses on a specific subset of this general model: a source located in the far-field with only a slight elevation, such that a single parameter suffices. This is commonly the case in a teleconferencing environment. Nevertheless, the concepts of this paper, although presented in far-field planar context, easily generalize to the near-field spherical case by including the range and elevation in the forthcoming parametrization.

III. PARAMETERIZED SPATIAL CORRELATION MATRIX
In narrowband signal applications, a common space-time statistic is that of the spatial correlation matrix [19], which is given by

$$\mathbf{R}_a = E\left[ \mathbf{x}_a(k) \mathbf{x}_a^H(k) \right] \tag{5}$$

where

$$\mathbf{x}_a(k) = \begin{bmatrix} x_0(k) & x_1(k) & \cdots & x_{N-1}(k) \end{bmatrix}^T \tag{6}$$

the superscript $H$ denotes conjugate transpose, as complex signals are commonly used in narrowband applications, and $T$ denotes the transpose of a matrix or vector. To steer these array outputs to a particular DOA, one applies a complex weight to each sensor output, whose phase performs the steering, and then sums the sensor outputs to form the output beam. Now, if the input signal is no longer narrowband, each frequency requires its own complex weight to appropriately phase-shift the signal at that frequency. In the context of broadband spatial spectral estimation, the spatial correlation matrix may be computed at each temporal frequency, and the resulting spatial spectrum is now a function of the temporal frequency. For broadband applications, these narrowband estimates may be assimilated into a time-domain statistic, a procedure termed “focusing,” which is described in [24]. The resulting structure is termed a “focused covariance matrix.”
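The relative-delay function of Section II for an equispaced circular array can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `relative_delays`, the default speed of sound of 343 m/s, and the optional conversion to samples through a hypothetical sampling rate `fs` are assumptions introduced here.

```python
import numpy as np

def relative_delays(theta, n_mics, radius, c=343.0, fs=None):
    """Relative delays tau_{n0}(theta) of an equispaced circular array.

    theta  : assumed azimuth of arrival in radians
    n_mics : number of microphones N
    radius : array radius d in metres
    c      : speed of sound in m/s (assumed value, not given in the text)
    fs     : optional sampling rate; if given, delays are returned in samples
    """
    # Microphone azimuths psi_n = 2*pi*n/N for an equispaced circle.
    psi = 2.0 * np.pi * np.arange(n_mics) / n_mics
    # tau_{n0} = tau_0 - tau_n with tau_n = (d/c) * cos(theta - psi_n).
    tau = (radius / c) * (np.cos(theta - psi[0]) - np.cos(theta - psi))
    return tau if fs is None else tau * fs
```

By construction the delay of microphone 0 relative to itself is zero, and the remaining delays change sign as the assumed azimuth sweeps around the array.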
In this paper, broadband spatial spectral estimation is addressed in another manner. Instead of implementing the steering delays in the complex weighting at each sensor, the delays are actually implemented as a time-delay in the spatial correlation matrix, which is now parameterized. Thus, each microphone output is appropriately delayed before computing this parameterized spatial correlation matrix:

$$\mathbf{x}_a(k, \theta) = \begin{bmatrix} x_0(k) & x_1[k + \tau_{10}(\theta)] & \cdots & x_{N-1}[k + \tau_{N-1,0}(\theta)] \end{bmatrix}^T \tag{7}$$

and real signals are assumed from this point on. The delays are a function of the assumed azimuth DOA, which becomes the parameter. The parameterized spatial correlation matrix is formally written as

$$\mathbf{R}_a(\theta) = E\left[ \mathbf{x}_a(k, \theta) \mathbf{x}_a^T(k, \theta) \right] \tag{8}$$

$$= \begin{bmatrix} R_{00}(\theta) & R_{01}(\theta) & \cdots & R_{0,N-1}(\theta) \\ R_{10}(\theta) & R_{11}(\theta) & \cdots & R_{1,N-1}(\theta) \\ \vdots & \vdots & \ddots & \vdots \\ R_{N-1,0}(\theta) & R_{N-1,1}(\theta) & \cdots & R_{N-1,N-1}(\theta) \end{bmatrix} \tag{9}$$

with entries $R_{ij}(\theta) = E\{ x_i[k + \tau_{i0}(\theta)] \, x_j[k + \tau_{j0}(\theta)] \}$.
The matrix $\mathbf{R}_a(\theta)$ is not simply the array observation matrix, as is commonly used in narrowband beamforming models. Instead, it is a parameterized correlation matrix that represents the signal powers across the array emanating from azimuth $\theta$. Each off-diagonal entry in the matrix $\mathbf{R}_a(\theta)$ is a single cross-correlation term and a function of the azimuth angle $\theta$. Notice that the various microphone pairs are combined in a joint fashion, in that altering the steering angle $\theta$ affects all off-diagonal entries of $\mathbf{R}_a(\theta)$. This property allows for the more prudent combining of microphone measurements as compared to the ad hoc method of averaging independent pairs of cross-correlation results.
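A sample estimate of the parameterized spatial correlation matrix for one candidate azimuth might be computed as sketched below. The sketch assumes that the relative delays have already been rounded to whole samples and that the expectation is replaced by a time average over one frame; both choices, and the name `steered_correlation`, are illustrative assumptions and not part of the paper.

```python
import numpy as np

def steered_correlation(x, delays):
    """Sample estimate of R_a(theta) for one candidate azimuth.

    x      : (N, K) array holding K samples from each of N microphones
    delays : length-N integer delays tau_{n0}(theta) in samples
             (rounding to whole samples is an assumption of this sketch)
    """
    n_mics, n_samp = x.shape
    delays = np.asarray(delays, dtype=int)
    # Keep only time indices k for which every k + tau_{n0} is in range.
    lo = max(0, -int(delays.min()))
    hi = n_samp - max(0, int(delays.max()))
    k = np.arange(lo, hi)
    # Row n holds x_n[k + tau_{n0}], i.e. the time-aligned channel n.
    aligned = np.stack([x[n, k + delays[n]] for n in range(n_mics)])
    # Time average replaces the expectation E[x_a x_a^T].
    return aligned @ aligned.T / aligned.shape[1]
```

When the assumed delays match the true ones, the aligned channels are coherent and the off-diagonal entries approach the diagonal ones; with mismatched delays the off-diagonal entries shrink.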
This paper relates broadband spatial spectral estimators in terms of the parameterized spatial correlation matrix

$$S(\theta) = f\left[ \mathbf{R}_a(\theta) \right] \tag{10}$$

where $f[\cdot]$ is some estimation function, $\theta$ is the steered azimuth angle, and $S(\theta)$ is the estimate of the broadband spatial spectrum at azimuth angle $\theta$.
The DOA estimate follows directly from the spatial spectrum, in that peaks in the spectrum correspond to assumed source locations. For the case of a single source, which is the case throughout this paper, the estimate of the source’s DOA is given by

$$\hat{\theta} = \arg\max_{\theta} S(\theta) \tag{11}$$

where $\hat{\theta}$ is the DOA estimate.
Note that this broadband extension is not without caveats: care must be taken when spacing the microphones to ensure that spatial aliasing [2] does not result.
It is also important to point out that the GCC method is quite compatible with DOA estimation based on the parameterized spatial correlation matrix: the cross-correlation estimates that comprise the matrix may be computed in the frequency-domain using a GCC variant such as the phase transform (PHAT) [4]. This paper focuses on how to extract the DOA estimate from the parameterized spatial correlation matrix; the ideas presented are general in that they do not hinge on any particular method for computing the actual cross correlations.

IV. BROADBAND SPATIAL SPECTRAL ESTIMATORS
The following subsections detail the existing and proposed broadband spatial spectral estimation methods, relating each to the parameterized spatial correlation matrix.

A. Steered Conventional Beamforming and the SRP Algorithm
The aim of a DSB is to time-align the received signals in the array aperture, such that the desired signal is coherently summed, while signals from other directions are incoherently summed and thus attenuated. Using the model of Section II, the output of a DSB steered to an angle of arrival of $\theta$ is given as

$$y(k, \theta) = \sum_{n=0}^{N-1} h_n(\theta) \, x_n[k + \tau_{n0}(\theta)] \tag{12}$$

The delays $\tau_{n0}(\theta)$ steer the beamformer to the desired DOA, while the beamformer weights $h_n(\theta)$ help shape the beam accordingly. The weights here have been made dependent on the desired angle of arrival $\theta$, for a reason that will become apparent in future subsections. In (12), the received signals are delayed [or advanced, depending on the sign of $\tau_{n0}(\theta)$], by an amount that takes into account the array geometry, via the function $\tau_{n0}(\theta)$.
The estimate of the spatial spectral power at azimuth angle $\theta$ is given by the power of the beamformer output when steered to azimuth $\theta$. Therefore, to form the entire spectrum, one needs to steer the beam and compute the output power across the entire azimuth space.
The steered-beamformer spectral estimate is given by

$$S(\theta) = E\left[ y^2(k, \theta) \right] \tag{13}$$

Substitution of (12) into (13) leads to

$$S(\theta) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} h_i(\theta) h_j(\theta) \, E\left\{ x_i[k + \tau_{i0}(\theta)] \, x_j[k + \tau_{j0}(\theta)] \right\} \tag{14}$$

Expression (14) may be written more neatly in matrix notation as

$$S(\theta) = \mathbf{h}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}(\theta) \tag{15}$$

where

$$\mathbf{h}(\theta) = \begin{bmatrix} h_0(\theta) & h_1(\theta) & \cdots & h_{N-1}(\theta) \end{bmatrix}^T \tag{16}$$

The DOA estimate is thus given by

$$\hat{\theta} = \arg\max_{\theta} \mathbf{h}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}(\theta) \tag{17}$$

The maximization of a steered beamformer output power is equivalent to maximizing a quadratic of the beamformer weight vector with respect to the angle of arrival. Altering the angle affects the parameter in the quadratic form, namely, the parameterized spatial correlation matrix.
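The well-known SRP algorithm [10] follows directly from a special case of (17), where $\mathbf{h}(\theta) = \mathbf{1}_N$ for all $\theta$, and $\mathbf{1}_N$ is a vector of $N$ ones:

$$\hat{\theta}_{\mathrm{SRP}} = \arg\max_{\theta} \mathbf{1}_N^T \mathbf{R}_a(\theta) \mathbf{1}_N \tag{18}$$

For this special case of fixed unit weights, this means that the maximization of the power of a steered DSB is equivalent to the maximization of the sum of the entries of $\mathbf{R}_a(\theta)$.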
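The unit-weight special case can be illustrated with a short sketch. It assumes that the parameterized spatial correlation matrix has already been estimated on a grid of candidate azimuths and stacked into one array; the names `srp_spectrum` and `srp_doa`, and the grid representation itself, are assumptions made for illustration.

```python
import numpy as np

def srp_spectrum(r_stack):
    """Unit-weight steered-response power for each candidate azimuth.

    r_stack : (G, N, N) array, one parameterized correlation matrix
              per candidate azimuth on a grid of G angles.
    """
    # 1^T R 1 is simply the sum of all entries of R.
    return r_stack.sum(axis=(1, 2))

def srp_doa(thetas, r_stack):
    """Return the grid azimuth that maximizes the SRP spectrum."""
    return thetas[np.argmax(srp_spectrum(r_stack))]
```

The resolution of the resulting DOA estimate is limited by the spacing of the azimuth grid, which is a design choice outside the scope of this sketch.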
The SRP algorithm has garnered significant attention recently: see [10], [25], and [26]. In all of these implementations, the weighting of $\mathbf{h}(\theta) = \mathbf{1}_N$ is used, which is fixed with respect to both the data and the steering angle. Given the well-known classical results on the advantages of adaptive beamforming over fixed beamforming, it is therefore surprising that adaptive weighting schemes have not been investigated more in the context of DOA estimation based on the parameterized spatial correlation matrix (a fixed weighting scheme is proposed in [27]). Notice that from (15), this is an effectively “narrowband” weight selection, in that the pre-aligning of the microphones requires only the selection of a single weight per channel. Note, however, that this weight selection must be performed for all angles $\theta$. To that end, the following section presents one such adaptive weighting scheme, proposed by Krolik [14].
`
`
B. Minimum Variance
The minimum variance approach to spatial spectral estimation involves selecting weights that pass a signal [i.e., a broadband plane wave $s(k)$] propagating from azimuth $\theta$ with unity gain, while minimizing the total output power, given by $\mathbf{h}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}(\theta)$. The application of the minimum variance method to broadband spatial spectral estimation is given in [14].
The unity gain constraint proposed by [14] is

$$\mathbf{h}^T(\theta) \mathbf{1}_N = 1 \tag{19}$$

and the $\mathbf{1}_N$ vector follows from the fact that the signal is already time-aligned across the array before minimum variance processing. It is as if the signal is coming from the broadside of a linear array.
Using the method of Lagrange multipliers in conjunction with the cost function $\mathbf{h}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}(\theta)$, the minimum variance weights become

$$\mathbf{h}_{\mathrm{MV}}(\theta) = \frac{\mathbf{R}_a^{-1}(\theta) \mathbf{1}_N}{\mathbf{1}_N^T \mathbf{R}_a^{-1}(\theta) \mathbf{1}_N} \tag{20}$$

The resulting minimum variance spatial spectral estimate is found by substituting the weights of (20) into the cost function:

$$S_{\mathrm{MV}}(\theta) = \frac{1}{\mathbf{1}_N^T \mathbf{R}_a^{-1}(\theta) \mathbf{1}_N} \tag{21}$$

The broadband minimum variance DOA estimator is thus given by

$$\hat{\theta}_{\mathrm{MV}} = \arg\max_{\theta} \frac{1}{\mathbf{1}_N^T \mathbf{R}_a^{-1}(\theta) \mathbf{1}_N} \tag{22}$$

The next section presents a new idea: the eigenanalysis of the parameterized spatial correlation matrix.

C. Eigenanalysis of the Parameterized Spatial Correlation Matrix
Using the signal model of Section II, notice that when the steered azimuth matches the actual azimuth $\theta$, the parameterized spatial correlation matrix may be decomposed into signal and noise components in the following manner:

$$\mathbf{R}_a(\theta) = \sigma_s^2 \boldsymbol{\alpha} \boldsymbol{\alpha}^T + \mathbf{R}_v(\theta) \tag{23}$$

where

$$\sigma_s^2 = E\left[ s^2(k) \right] \tag{24}$$

is the signal power, $\boldsymbol{\alpha} = [\alpha_0 \ \alpha_1 \ \cdots \ \alpha_{N-1}]^T$ is the vector of attenuation factors, and

$$\mathbf{R}_v(\theta) = E\left[ \mathbf{v}_a(k, \theta) \mathbf{v}_a^T(k, \theta) \right] \tag{25}$$

with $\mathbf{v}_a(k, \theta)$ the vector of time-aligned noise signals, defined analogously to $\mathbf{x}_a(k, \theta)$.
Note that it has been implicitly assumed that the desired signal is wide-sense stationary, zero-mean, and temporally uncorrelated with the additive noise. Consider only the signal component of $\mathbf{R}_a(\theta)$. It may be easily shown that this rank-one matrix has a single nonzero eigenvalue, $\sigma_s^2 \boldsymbol{\alpha}^T \boldsymbol{\alpha}$, with corresponding eigenvector $\boldsymbol{\alpha}$. The vector of attenuation constants is generally unknown; however, from the above discussion, it is apparent that the vector $\boldsymbol{\alpha}$ may be estimated from the eigenanalysis of $\mathbf{R}_a(\theta)$.
To that end, consider another adaptive weight selection method, which follows from the ideas of narrowband beamforming [19]. This weight selection attempts to nontrivially maximize the output energy of the steered-beamformer for a given azimuth

$$\max_{\mathbf{h}(\theta)} \ \mathbf{h}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}(\theta) \tag{26}$$

subject to

$$\mathbf{h}^T(\theta) \mathbf{h}(\theta) = 1 \tag{27}$$

It is well known that the solution to the above constrained optimization is the vector that maximizes the Rayleigh quotient [2] $\mathbf{h}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}(\theta) / \mathbf{h}^T(\theta) \mathbf{h}(\theta)$, which is in turn given by the eigenvector corresponding to the maximum eigenvalue of $\mathbf{R}_a(\theta)$. The resulting spatial spectral estimate is given by

$$S_{\mathrm{E}}(\theta) = \mathbf{h}_{\max}^T(\theta) \mathbf{R}_a(\theta) \mathbf{h}_{\max}(\theta) = \lambda_{\max}(\theta) \tag{28}$$

where $\lambda_{\max}(\theta)$ is the maximum eigenvalue of $\mathbf{R}_a(\theta)$, and $\mathbf{h}_{\max}(\theta)$ is the corresponding eigenvector. The DOA estimation involves searching for the angle that produces the largest maximum eigenvalue of $\mathbf{R}_a(\theta)$:

$$\hat{\theta}_{\mathrm{E}} = \arg\max_{\theta} \lambda_{\max}(\theta) \tag{29}$$

In addition to producing another spatial spectrum estimate, the above eigenanalysis allows one to estimate:

$$\hat{\boldsymbol{\alpha}} = \mathbf{h}_{\max}(\hat{\theta}_{\mathrm{E}}) \tag{30}$$

Now that an estimate of the attenuation vector $\boldsymbol{\alpha}$ is available, the minimum variance method of [14] may be improved to reflect the presence of channel attenuation factors, which were omitted in the developments of Section IV-B.

D. Improved Minimum Variance
The broadband minimum variance spatial spectral estimation proposed by [14] assumes that the attenuation vector $\boldsymbol{\alpha}$ is equal to $\mathbf{1}_N$, or a scaled version of $\mathbf{1}_N$. In practice, it is not uncommon for this assumption to be violated by factors such as uncalibrated microphones. To that end, the unity gain constraint proposed by [14] is modified to reflect the more general signal model of Section II.
Taking into account the channel attenuation vector $\boldsymbol{\alpha}$, the proposed unity gain constraint is

$$\sum_{n=0}^{N-1} h_n(\theta) \alpha_n s(k - t) = s(k - t) \tag{31}$$

which may be simplified and written in vector notation as

$$\mathbf{h}^T(\theta) \boldsymbol{\alpha} = 1 \tag{32}$$

Therefore, the optimal minimum variance weights become

$$\mathbf{h}_{\mathrm{IMV}}(\theta) = \frac{\mathbf{R}_a^{-1}(\theta) \boldsymbol{\alpha}}{\boldsymbol{\alpha}^T \mathbf{R}_a^{-1}(\theta) \boldsymbol{\alpha}} \tag{33}$$

The resulting proposed minimum variance spatial spectral estimate is found by substituting the weights of (33) into the cost function

$$S_{\mathrm{IMV}}(\theta) = \frac{1}{\boldsymbol{\alpha}^T \mathbf{R}_a^{-1}(\theta) \boldsymbol{\alpha}} \tag{34}$$

The proposed broadband minimum variance DOA estimator is thus given by

$$\hat{\theta}_{\mathrm{IMV}} = \arg\max_{\theta} \frac{1}{\boldsymbol{\alpha}^T \mathbf{R}_a^{-1}(\theta) \boldsymbol{\alpha}} \tag{35}$$
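The minimum variance and eigenanalysis estimators of the preceding subsections can be sketched for a single candidate azimuth as follows. The function names are hypothetical, the matrix is assumed to be well conditioned enough to solve against directly, and the sign convention applied to the principal eigenvector (entries summing to a nonnegative value, motivated by positive attenuation factors) is an assumption of this sketch. Note also that an eigenvector fixes the attenuation vector only up to a scale factor.

```python
import numpy as np

def mv_weights(r, alpha=None):
    """Minimum variance weights R^{-1} a / (a^T R^{-1} a).

    alpha=None uses the all-ones constraint vector (classical case);
    otherwise alpha is the (estimated) attenuation vector.
    """
    a = np.ones(r.shape[0]) if alpha is None else np.asarray(alpha, dtype=float)
    r_inv_a = np.linalg.solve(r, a)  # avoids forming R^{-1} explicitly
    return r_inv_a / (a @ r_inv_a)

def mv_spectrum(r, alpha=None):
    """Minimum variance spectral value 1 / (a^T R^{-1} a) at one azimuth."""
    a = np.ones(r.shape[0]) if alpha is None else np.asarray(alpha, dtype=float)
    return 1.0 / (a @ np.linalg.solve(r, a))

def eig_spectrum(r):
    """Largest eigenvalue of R and its unit-norm eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(r)  # eigenvalues in ascending order
    h_max = eigvecs[:, -1]
    if h_max.sum() < 0:  # resolve the sign ambiguity (assumed convention)
        h_max = -h_max
    return eigvals[-1], h_max
```

For a matrix of the form signal power times the outer product of the attenuation vector plus white noise, `eig_spectrum` returns the attenuation vector normalized to unit length, and `mv_weights` with that vector satisfies the modified unity gain constraint.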
`
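E. Linear Spatial Prediction and the Multichannel Cross-Correlation Coefficient

Spatial spectral estimation using linear prediction is well defined for the case of narrowband signals, as the narrowband assumption allows one to write one of the microphone outputs as a complex-weighted linear combination of the other microphone outputs [2]. To extend this idea to the broadband case, the same method as that of the previous sections is used, in that the time delay is applied prior to computing the predictive coefficients. This concept was first presented in [17] and [18] in the context of time delay estimation; the approach was limited to linear array geometries, and yielded only a single relative delay. This section generalizes the idea to planar array geometries, transforming the problem from time delay estimation to DOA estimation.
The idea is to predict, using real predictive coefficients, the output of $x_0(k)$ using a linear combination of $x_n[k + \tau_{n0}(\theta)]$, $n = 1, 2, \ldots, N-1$. Using a spatial autoregressive (AR) model, the linear predictive framework is given by

$$x_0(k) = -\sum_{n=1}^{N-1} a_n(\theta) \, x_n[k + \tau_{n0}(\theta)] + e_0(k, \theta) \tag{36}$$

where $e_0(k, \theta)$ may be interpreted as either the spatially white noise that drives the AR model, or the prediction error. For each $\theta$ in the azimuth space, one finds the weight vector

$$\mathbf{a}(\theta) = \begin{bmatrix} a_0(\theta) & a_1(\theta) & \cdots & a_{N-1}(\theta) \end{bmatrix}^T \tag{37}$$

which minimizes the criterion

$$J(\theta) = E\left[ e_0^2(k, \theta) \right] = \mathbf{a}^T(\theta) \mathbf{R}_a(\theta) \mathbf{a}(\theta) \tag{38}$$

subject to the constraint

$$\mathbf{u}^T \mathbf{a}(\theta) = 1 \tag{39}$$

where

$$\mathbf{u} = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T \tag{40}$$

Using the method of Lagrange multipliers, the optimal predictive weights are given by

$$\mathbf{a}(\theta) = \frac{\mathbf{R}_a^{-1}(\theta) \mathbf{u}}{\mathbf{u}^T \mathbf{R}_a^{-1}(\theta) \mathbf{u}} \tag{41}$$

and the resulting minimum mean-squared error (mmse) is

$$J_{\min}(\theta) = \frac{1}{\mathbf{u}^T \mathbf{R}_a^{-1}(\theta) \mathbf{u}} \tag{42}$$

Note that both the optimal predictive coefficients and the mmse are a function of the steered angle $\theta$.
The classical approach to spectral estimation using linear prediction is to map the optimal predictive coefficients to an AR transfer function. However, it is well known that this method is very sensitive to the presence of additive noise in the observations [2]. This is because the AR model breaks down when additive noise is present. To that end, a more robust implementation of linear spatial prediction is proposed in [17] and [18]. The idea is to not estimate an AR spectrum, but rather to find the parameter (i.e., the angle $\theta$) that minimizes the prediction error.
In [17] and [18], the idea of linear spatial prediction was used to derive the (time delay parameterized) multichannel cross correlation coefficient (MCCC) in the context of linear array time delay estimation. These ideas are now extended to planar array geometries, and the azimuth angle-parameterized MCCC is presented as another broadband spatial spectral estimator.
The matrix $\mathbf{R}_a(\theta)$ may be factorized as [17], [18]:

$$\mathbf{R}_a(\theta) = \mathbf{D} \tilde{\mathbf{R}}_a(\theta) \mathbf{D} \tag{43}$$

where

$$\mathbf{D} = \begin{bmatrix} \sqrt{E[x_0^2(k)]} & 0 & \cdots & 0 \\ 0 & \sqrt{E[x_1^2(k)]} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{E[x_{N-1}^2(k)]} \end{bmatrix} \tag{44}$$

is a diagonal matrix,

$$\tilde{\mathbf{R}}_a(\theta) = \begin{bmatrix} 1 & \rho_{01}(\theta) & \cdots & \rho_{0,N-1}(\theta) \\ \rho_{10}(\theta) & 1 & \cdots & \rho_{1,N-1}(\theta) \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N-1,0}(\theta) & \rho_{N-1,1}(\theta) & \cdots & 1 \end{bmatrix} \tag{45}$$

is a symmetric matrix, and

$$\rho_{ij}(\theta) = \frac{E\left\{ x_i[k + \tau_{i0}(\theta)] \, x_j[k + \tau_{j0}(\theta)] \right\}}{\sqrt{E[x_i^2(k)] \, E[x_j^2(k)]}} \tag{46}$$

is the cross-correlation coefficient between $x_i[k + \tau_{i0}(\theta)]$ and $x_j[k + \tau_{j0}(\theta)]$.
The azimuth-angle dependent mmse may be written using (43) as

$$J_{\min}(\theta) = E\left[ x_0^2(k) \right] \frac{\det\left[ \tilde{\mathbf{R}}_a(\theta) \right]}{\det\left[ \tilde{\mathbf{R}}_{a,\mathrm{sub}}(\theta) \right]} \tag{47}$$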
`
`
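where $\tilde{\mathbf{R}}_{a,\mathrm{sub}}(\theta)$ is the submatrix formed by removing the first row and column from $\tilde{\mathbf{R}}_a(\theta)$, and “det” stands for “determinant.” It is shown in [17] and [18] that

$$\det\left[ \tilde{\mathbf{R}}_{a,\mathrm{sub}}(\theta) \right] \le 1 \tag{48}$$

and thus the following relationship is established:

$$J_{\min}(\theta) \ge E\left[ x_0^2(k) \right] \det\left[ \tilde{\mathbf{R}}_a(\theta) \right] \tag{49}$$

From this relationship, it is easily observed that minimizing the spatial prediction error corresponds to minimizing the quantity $\det[\tilde{\mathbf{R}}_a(\theta)]$. Notice that $\det[\tilde{\mathbf{R}}_a(\theta)] = 0$ when every entry of $\tilde{\mathbf{R}}_a(\theta)$ is equal to unity (i.e., perfectly correlated microphone signals). Conversely, in the case of mutually uncorrelated microphone outputs, $\det[\tilde{\mathbf{R}}_a(\theta)] = 1$. Putting all of this together, the azimuth angle parameterized MCCC is defined as

$$\rho_a^2(\theta) = 1 - \det\left[ \tilde{\mathbf{R}}_a(\theta) \right] \tag{50}$$

The MCCC broadband spatial spectral estimate is given by

$$S_{\mathrm{MCCC}}(\theta) = \frac{1}{\det\left[ \tilde{\mathbf{R}}_a(\theta) \right]} \tag{51}$$

from which the DOA estimation easily follows as

$$\hat{\theta}_{\mathrm{MCCC}} = \arg\max_{\theta} \frac{1}{\det\left[ \tilde{\mathbf{R}}_a(\theta) \right]} \tag{52}$$

It is interesting to note that even though the linear spatial predictive approach is used here to arrive at the azimuth parameterized MCCC estimator, maximizing the MCCC actually corresponds more closely to the minimization of the joint entropy of the received signals [28], assuming that the signals are jointly Gaussian distributed. This follows from the fact that for jointly Gaussian distributed $\mathbf{x}_a(k, \theta)$, the joint entropy of $\mathbf{x}_a(k, \theta)$ is, up to an additive constant, directly proportional to $\ln \det[\mathbf{R}_a(\theta)]$ [28].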
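The MCCC quantities can be illustrated for a single candidate azimuth with the sketch below. It normalizes a given correlation matrix by the square roots of its diagonal entries and evaluates the determinant of the result; the function names are hypothetical, and no regularization is applied, so the spectral value is undefined for perfectly correlated channels (zero determinant).

```python
import numpy as np

def normalized_correlation(r):
    """Normalize R so that its diagonal is one (matrix of correlation coefficients)."""
    d = np.sqrt(np.diag(r))
    return r / np.outer(d, d)

def mccc(r):
    """Return (MCCC, determinant of the normalized matrix) for one azimuth."""
    det = np.linalg.det(normalized_correlation(r))
    return 1.0 - det, det

def mccc_spectrum(r):
    """Spectral value 1/det of the normalized matrix; assumes det > 0."""
    return 1.0 / mccc(r)[1]
```

The determinant is invariant to per-channel gain changes, since those are absorbed by the diagonal normalization; this is one reason the measure is insensitive to microphone calibration.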
`
`V. SIMULATION EVALUATION
`
`A. Simulation Environment
The various broadband spatial spectral estimators are evaluated in a computer simulation. An equispaced circular array of three to ten omnidirectional microphones is employed as the spatial aperture. The radius of the array is chosen as the distance that fulfills the spatial aliasing equality for circular arrays. In other words, the array radius is made as large as possible without suffering from spatial aliasing [23]

$$d = \frac{c}{4 f_{\max} \sin(\pi / N)} \tag{53}$$

where $f_{\max}$ denotes the highest frequency of interest, and is chosen to be 4 kHz in the simulations. For a ten-element circular array, the array radius becomes 6.9 cm. The signal sources are omnidirectional point sources. This means that the direct-path component is stronger than any individual reflected component; as mentioned in the Introduction, it is beyond the scope of this paper to handle cases where due to source directivity and orientation, a reflected component contains more energy than the direct-path component.
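As a quick check of the radius rule, the sketch below evaluates the largest alias-free radius under the assumption that adjacent microphones, separated by the chord $2d\sin(\pi/N)$, may be at most half of the shortest wavelength apart. The speed of sound of 343 m/s is an assumed value that the text does not state, and the function name is hypothetical.

```python
import math

def max_alias_free_radius(n_mics, f_max, c=343.0):
    """Largest radius (m) for which adjacent microphones of an
    equispaced circular array are half a wavelength apart at f_max.

    c is an assumed speed of sound in m/s.
    """
    # Adjacent-microphone chord: 2 * d * sin(pi / N) = c / (2 * f_max).
    return c / (4.0 * f_max * math.sin(math.pi / n_mics))

radius_cm = 100.0 * max_alias_free_radius(10, 4000.0)
print(f"ten-element radius: {radius_cm:.1f} cm")
```

With ten elements and a 4-kHz upper frequency this evaluates to roughly 6.9 cm, in agreement with the value quoted for the simulations.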
`A reverberant acoustic environment is simulated using the
`image model method [29]. The simulated room is rectangular
`with plane reflective boundaries (walls, ceiling, and floor). Each
`boundary is characterized by a frequency-independent uniform
`reflection coefficient which does not vary with the angle of in-
`cidence of the source signal.
`The room dimensions in centimeters are (304.8, 457.2, 381).
`The circular array is located in the center of the room: the center
`of the array sits at (152.4, 228.6, 101.6). Two distinct scenarios
`are simulated, as described below.
`The speaker is immobile and situated at (254, 406.4, 101.6)
`and (254, 406.4, 152.4) in the first and second simulation
`scenarios, respectively. The immobility of the source means
`that the evaluation does not consider frames during which the
`source exhibits movement. The correct azimuth angle of arrival
`is 60 . The distance from the center of the array to the source
`is 204.7 cm.
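The stated geometry can be verified directly from the coordinates. The sketch below assumes that azimuth is measured counterclockwise from the positive x-axis in the horizontal plane; this convention, and the variable and function names, are illustrative assumptions.

```python
import math

# Coordinates in centimeters, as given in the text.
array_center = (152.4, 228.6, 101.6)
source_scenario_1 = (254.0, 406.4, 101.6)
source_scenario_2 = (254.0, 406.4, 152.4)


def azimuth_deg(center, source):
    """Azimuth of the source seen from the array center (assumed convention)."""
    return math.degrees(math.atan2(source[1] - center[1], source[0] - center[0]))


def distance_cm(center, source):
    """Straight-line distance between two points."""
    return math.sqrt(sum((s - c) ** 2 for s, c in zip(source, center)))


for name, src in (("scenario 1", source_scenario_1), ("scenario 2", source_scenario_2)):
    print(name, round(azimuth_deg(array_center, src), 2), round(distance_cm(array_center, src), 1))
```

With these coordinates the azimuth is about 60.3 degrees in both scenarios. The straight-line distance is about 204.8 cm in the first scenario. In the second scenario the source is elevated, so the straight-line distance is about 211 cm while the horizontal range is unchanged.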
`The SNR at the microphone elements is 0 dB. Here, SNR
`refers to spatially white sensor noise in the first scenario and
`spherically isotropic (diffuse) noise in the second scenario. The
`generation of spherically isotropic noise is performed by trans-
`forming a vector of uncorrelated Gaussian random variables into
`a vector of correlated (i.e., according to a given covariance ma-
`trix) Gaussian random variables by premultiplying the original
`(uncorrelated) vector with the Cholesky factorization [30] of
`the covariance matrix of a diffuse noise field [2]. The covari-
`ance matrix of the diffuse noise field is computed by averaging
`over the entire frequency range (300–4000 Hz). For the compu-
`tation of the SNR, the signal component includes reverberation.
`In terms of reverberation, three levels are simulated for each sce-
`nario: anechoic, moderately reverberant, and highly reverberant.
`The reverberation times are measured using the reverse-time in-
`tegrated impulse response method of [31]. The frequency-inde-
`pendent reflection coefficients of the walls and ceiling are ad-
`justed to achieve the desired level of reverberation: a 60-dB re-
`verberation decay time of 300 ms for the moderately reverberant
`case, and 600 ms for the highly reverberant case.
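The Cholesky-based generation of spatially correlated noise described above can be sketched as follows. The example assumes the usual sinc-shaped coherence of a spherically isotropic field, sin(2 pi f d / c) / (2 pi f d / c), averaged over a uniform grid from 300 to 4000 Hz; the exact covariance used in [2] may differ. The four-element array, its radius, the speed of sound, and all function names are hypothetical choices for illustration.

```python
import numpy as np


def diffuse_covariance(mic_xyz, freqs_hz, c=343.0):
    """Frequency-averaged coherence matrix of an assumed diffuse noise field."""
    # Pairwise microphone distances in meters.
    d = np.linalg.norm(mic_xyz[:, None, :] - mic_xyz[None, :, :], axis=-1)
    # np.sinc(x) = sin(pi x) / (pi x), so the argument is 2 f d / c.
    coherence = np.sinc(2.0 * freqs_hz[:, None, None] * d[None, :, :] / c)
    return coherence.mean(axis=0)


def correlated_gaussian(cov, num_samples, rng):
    """Premultiply uncorrelated Gaussian vectors by the Cholesky factor of cov."""
    chol = np.linalg.cholesky(cov)  # lower triangular, cov = chol @ chol.T
    white = rng.standard_normal((cov.shape[0], num_samples))
    return chol @ white


# Hypothetical four-element circular array in the horizontal plane.
num_mics, radius_m = 4, 0.0303
angles = 2.0 * np.pi * np.arange(num_mics) / num_mics
mics = np.stack(
    [radius_m * np.cos(angles), radius_m * np.sin(angles), np.zeros(num_mics)],
    axis=1,
)

cov = diffuse_covariance(mics, np.linspace(300.0, 4000.0, 200))
noise = correlated_gaussian(cov, 200_000, np.random.default_rng(0))
sample_cov = noise @ noise.T / noise.shape[1]
print(np.round(cov, 3))
print(np.round(sample_cov, 3))
```

The sample covariance of the generated noise should approach the target matrix as the number of samples grows. The resulting noise is temporally white, and its spatial correlation follows the frequency-averaged matrix.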
`In the first simulation scenario, the microphones are all per-
`fectly calibrated with unity gains. In the second simulation sce-
`nario, the presence of uncalibrated microphones is simulated,
`by setting
`,
`to a uniformly distributed random
`number over the range (0.2, 1).
`The source signal is convolved with the synthetic impulse re-
`sponses. Appropriately scaled temporally white Gaussian noise
`is then added at the microphones to achiev
