(19) Europäisches Patentamt / European Patent Office / Office européen des brevets

(11) EP 1 538 867 A1

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 08.06.2005 Bulletin 2005/23

(51) Int Cl.7: H04R 3/00, H04R 1/40

(21) Application number: 03022273.1

(22) Date of filing: 01.10.2003

(84) Designated Contracting States:
AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

Designated Extension States:
AL LT LV MK

(30) Priority: 30.06.2003 EP 03014846

(71) Applicant: Harman Becker Automotive Systems GmbH
76307 Karlsbad (DE)

(72) Inventor: Christoph, Markus
94315 Straubing (DE)

(74) Representative: Grünecker, Kinkeldey, Stockmair & Schwanhäusser Anwaltssozietät
Maximilianstrasse 58
80538 München (DE)

(54) Handsfree system for use in a vehicle
(57) The invention is directed to a handsfree system for use in a vehicle, comprising a microphone array with at least two microphones, a signal processing means, and an adaptive post-filter, the signal processing means comprising a beamformer having an input connected to the at least two microphones and an output connected to the input of the adaptive post-filter.

[Fig. 1]

Printed by Jouve, 75007 PARIS (FR)

Description

[0001] The invention is directed to a handsfree system for use in a vehicle comprising a microphone array with at least two microphones and a signal processing means.
[0002] For making telephone calls in a car, handsfree systems are used more and more since they provide increased comfort and reduce the risk of an accident, as the driver is distracted only marginally. Because of that, in many countries, handsfree devices are even required by law.
[0003] Usually, a handsfree system comprises a microphone that can be fastened to a user such as the driver.
[0004] Due to the relatively large distance between the speaker's mouth and the microphone, many handsfree devices today suffer from the drawback of poor speech quality. This is particularly due to the fact that in a car, a large ambient noise is usually present, interfering with the speech signal. The noise stems from different sources such as the motor, wind, or car radio.
[0005] However, common methods for noise reduction are often costly to implement and require a large amount of memory and computing power. In particular, a signal processed by conventional noise reduction systems has a relatively large delay time, which makes these systems unsuitable for real-time applications, i.e. telephone applications.
[0006] It is, therefore, the problem underlying the invention to overcome the above drawbacks and provide a handsfree system for use in a vehicle with improved speech quality.
[0007] This problem is solved by a handsfree system according to claim 1. Accordingly, the invention provides a handsfree system for use in a vehicle comprising a microphone array with at least two microphones, a signal processing means and an adaptive post-filter, the signal processing means comprising a beamformer having an input connected to the at least two microphones and an output connected to the input of the adaptive post-filter.
[0008] In the context of this invention, the term "connected" also includes the case that a filter or another signal processing means is provided along the signal path between two devices or means. A beamformer processes signals emanating from a microphone array to obtain a combined signal. A beamformer comprises a beamsteering means, which is responsible for time delay compensation of the different microphones, and a summing means. In its simplest form (delay-and-sum beamformer), beamforming only comprises delay compensation and summing of the compensated signals. Beamforming makes it possible to provide a specific directivity pattern for a microphone array. Usually, a beamformer can be implemented as a digital system with a plurality of digital filters using, for example, digital signal processors (DSP). A beamformer can be configured as an adaptive or a non-adaptive beamformer. Adaptive means that relevant parameters such as filter coefficients can be re-calculated during use of the system in order to adapt the beamformer to changing conditions. In the non-adaptive case, the system parameters are determined once by calibrating the beamformer and are then kept unchanged. In both the non-adaptive and the adaptive case, the beamforming can, in principle, be performed in the time domain or in the frequency domain.
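
The delay-and-sum case described above can be illustrated with a minimal sketch. It assumes integer sample delays that are already known (a practical system would use fractional delays or filters), and the function name `delay_and_sum` and the normalisation by the number of microphones are choices made for this illustration only, not taken from the application.

```python
def delay_and_sum(signals, delays):
    """Delay-and-sum beamformer with integer sample delays (illustrative).

    signals: list of equal-length sample lists, one per microphone.
    delays:  list of non-negative integer delays (in samples) that
             time-align each microphone signal to the desired source.
    """
    n_mics = len(signals)
    length = len(signals[0])
    out = [0.0] * length
    for x, d in zip(signals, delays):
        for k in range(d, length):
            # Beamsteering: shift the channel by d samples.
            # Summing: accumulate the compensated samples.
            out[k] += x[k - d]
    # Normalise so that a perfectly aligned source keeps its amplitude.
    return [v / n_mics for v in out]
```

For example, if a pulse reaches the second microphone two samples earlier than the first, delaying the second channel by two samples aligns both pulses, so they add coherently while uncorrelated noise would add incoherently.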
[0009] A handsfree system in accordance with the invention shows an excellent acoustic performance in a vehicular environment. Due to the beamformer, an improved directivity is obtained and, furthermore, speech signals are enhanced and ambient noise is reduced. The adaptive post-filter (responsible for filtering a signal after the beamforming) further reduces the noise in the signal.
[0010] According to a preferred embodiment, the adaptive post-filter can be a filter in the time domain. If the post-filtering is performed in the time domain, the delay time is reduced and the implementation is simplified.
[0011] According to a preferred embodiment, the adaptive post-filter can be a Wiener filter. It turns out that a Wiener filter is particularly suitable for filtering in a car environment.
[0012] In order to reduce spectral distortions of the filtered signal, preferably, the adaptive post-filter can be a linear-phase filter. Advantageously, the adaptive post-filter can be a linear-phase Wiener filter.
[0013] According to a preferred embodiment, the signal processing means can further comprise at least two adaptive filters having an input connected to the output of the beamsteering means and an output connected to the adaptive post-filter, wherein the at least two adaptive filters are configured to determine adaptive filter parameters for the adaptive post-filter.
[0014] In this way, background filters are provided for adaptively estimating the filter parameters for the adaptive post-filter.
[0015] Preferably, for each of the at least two microphones, an adaptive filter can be provided having an input connected to the output of the beamsteering means. Thus, for each output signal of the beamsteering means corresponding to a microphone, adaptive filter parameters can be determined for the adaptive post-filter. The actual filter parameters of the post-filter can be given, for example, by the filter parameters determined by one of the adaptive filters or by the mean of the filter parameters determined by several different adaptive filters.
[0016] Advantageously, an input of each of the at least two adaptive filters can be further connected to the output of the beamformer. This allows for an adaption of the respective filter parameters directly with respect to the beamformed signal.
[0017] According to a preferred embodiment, the signal processing means can further comprise a pre-emphasis filter, in particular, comprising a pre-whitening filter, having an input connected to an output of the adaptive post-filter and/or a pre-emphasis filter, in particular, comprising a pre-whitening filter, having an input connected to the output of the beamsteering means and an output connected to the at least two adaptive filters.
[0018] Such a pre-emphasis filter, on the one hand, emphasizes high frequencies and, on the other hand, attenuates low frequencies, which is particularly useful to reduce low frequency correlated noise. Preferably, the pre-emphasis filter can comprise a pre-whitening filter. A pre-whitening filter whitens the spectral distribution of a signal. The filter coefficients of such a pre-whitening filter can be determined using a linear predictive coding (LPC) analysis, for example, via an adaptive lattice predictor (ALP) algorithm.
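
The high-frequency emphasis can be illustrated with a minimal sketch of a fixed first-order pre-emphasis filter, y[k] = x[k] - alpha * x[k-1]. This is a simplification for illustration: it is not the LPC- or ALP-based pre-whitening filter mentioned above, and the function name `pre_emphasis` and the coefficient `alpha = 0.95` are assumptions rather than values from the application.

```python
def pre_emphasis(x, alpha=0.95):
    """First-order pre-emphasis y[k] = x[k] - alpha * x[k-1] (illustrative).

    With 0 < alpha < 1 the filter attenuates low frequencies and
    emphasizes high frequencies. alpha = 0.95 is an assumed value.
    """
    y = []
    prev = 0.0  # x[-1] is taken as zero
    for sample in x:
        y.append(sample - alpha * prev)
        prev = sample
    return y
```

A constant (zero-frequency) input is reduced to a factor of 1 - alpha after the first sample, whereas an input alternating in sign (the highest representable frequency) is amplified by a factor of 1 + alpha.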
[0019] According to a preferred embodiment of the above handsfree systems, the signal processing means can further comprise an inverse filter, particularly a warped inverse filter. These filters are especially useful to adjust the microphone transfer function and to match the microphones of the array in this way. Preferably, the beamformer can comprise at least one inverse filter, in particular, having an output for providing an inversely filtered signal to a summing means.
[0020] In order to overcome the matching problem, alternatively or additionally, matched microphones on the basis of silicone or paired microphones may be used.
[0021] The susceptibility of microphone arrays often increases with decreasing frequency. Due to this, a higher matching precision is preferred for low frequencies compared to high frequencies. A frequency-dependent adjustment of the microphone transfer functions with the use of warped filters reduces the required memory compared to the case of conventional FIR filters.
[0022] Preferably, each inverse filter can be an approximate inverse of a non-minimum phase filter. This results in an inverse filter which is both stable and has no phase error.
[0023] According to a preferred embodiment, an inverse filter may be combined with another filter of the handsfree system, for example, a filter of the beamformer. Such a combination in one filter results in a simplified implementation.
[0024] Preferably, the signal processing means of the above handsfree systems can comprise a non-adaptive post-filter having an input connected to an output of the adaptive post-filter. The non-adaptive post-filter may directly follow the adaptive post-filter. Such a filter is used to compensate for the ambient acoustics of a speaker. Thus, the non-adaptive post-filter may have the form of an inverse room filter.
[0025] In order to further reduce low frequency noise, according to a preferred embodiment, the signal processing means may further comprise an adaptive noise canceller (ANC), for electrical ANC implementations.
[0026] Preferably, the ANC can be connected to a non-acoustic sensor to determine a noise signal, for example, by using the tachometer of the vehicle. The ANC, advantageously, can have an output connected to the input of the beamformer and/or of the adaptive post-filter.
[0027] For a further improvement of the speech signal quality, the signal processing means of the previously described handsfree systems can comprise an acoustic echo canceller (AEC). Preferably, the AEC can comprise an echo shaping filter. In this way, a frequency-selective echo attenuation may be obtained. As in the case of an ANC, the AEC can have an output connected to the input of the beamformer and/or of the adaptive post-filter.
[0028] According to a preferred embodiment of all previously described handsfree systems, the beamformer can be a non-adaptive beamformer. By using a non-adaptive beamformer with fixed filters, the computing power required during operation of the system is reduced.
[0029] Preferably, the beamformer may be a superdirective beamformer, which further improves the acoustic performance.
[0030] Advantageously, the beamformer may be a regularized superdirective beamformer using a finite regularization parameter μ. The regularization parameter usually enters the equation for computing the filter coefficients or, alternatively, is inserted into the cross-power spectrum matrix or the coherence matrix. In contrast to the maximum superdirective beamformer (μ = 0), the regularized superdirective beamformer has reduced noise and is less sensitive to an imperfect matching of the microphones.
[0031] The finite regularization parameter μ, preferably, may depend on the frequency. This achieves an improved gain of the array compared to a regularized superdirective beamformer with fixed regularization parameter μ. According to a preferred embodiment, each superdirective filter may result from an iterative design based on a predetermined maximum susceptibility. This enables an optimal adjustment of the microphones, particularly with respect to the transfer function and the position of each microphone.
[0032] By using a predetermined maximum susceptibility, defective parameters of the microphone array can be taken into account to further improve the gain. The maximum susceptibility may be determined as a function of the error in the transfer characteristic of the microphones, the error in the microphone positions and a predetermined (required) maximum deviation in the directional diagram of the microphone array. The time-invariant impulse response of the filters will be determined iteratively only once; there is no adaption of the filter coefficients during operation.
[0033] According to a preferred embodiment, each superdirective filter can be a filter in the time domain. Filtering in the frequency domain is a possible alternative; it requires, however, a Fourier transform (FFT) and an inverse Fourier transform (IFFT), thus increasing the required memory.
[0034] Advantageously, the beamformer may have the structure of a generalized sidelobe canceller (GSC). In this way, at least one filter can be saved. The implementation in the GSC structure, however, is only possible in the frequency domain.
[0035] In order to obtain an optimal adaption of the handsfree system to a particular noise situation, according to a preferred embodiment, the beamformer can be a minimum variance distortionless response (MVDR) beamformer.
[0036] According to a preferred embodiment, the microphone array can comprise at least two microphones being arranged in endfire orientation with respect to a first position. An array in endfire orientation has a better directivity and is less sensitive to a mismatched propagation or delay time compensation. The first position can be the location of the driver's head, for example.
[0037] Preferably, the microphone array can comprise at least two microphones being arranged in endfire orientation with respect to a second position. Thus, the handsfree system of the invention has a good directivity in two directions. Speech signals coming from two different positions, for example, from the driver and the front seat passenger, can both be recorded in good quality.
[0038] According to a preferred embodiment, the signal processing means may comprise at least two beamformers. A first beamformer may be used for signals from a first position and a second beamformer may be used for signals from a second position. In this case, advantageously, the handsfree system may further comprise a voice activity detector (VAD) and/or a switch control means. The switch control and the VAD are used to determine how to combine the output of the at least two beamformers.
[0039] Advantageously, the handsfree system can comprise a residual echo suppression (RES) means and/or a dynamic volume control (DVC). A RES means serves for suppression of residual echoes, in particular, those present in the signal resulting from the adaptive post-filter. Thus, a residual echo suppression means can comprise an input connected to the output of the adaptive post-filter. Furthermore, a RES means can comprise an input for receiving a far end signal. A DVC is intended for dynamically adapting the output volume of a far end signal depending on the ambient noise level present in the vehicle.
[0040] According to a preferred embodiment, the at least two microphones in the first endfire orientation (endfire orientation with respect to a first position) and the at least two microphones in the second endfire orientation (endfire orientation with respect to a second position) can have a microphone in common. In this way, a microphone array consisting of only three microphones already provides an excellent directivity for use in a vehicular environment.
[0041] According to a preferred embodiment of all previously discussed handsfree systems, the microphone array may comprise at least two subarrays. Each subarray of microphones may be optimized for a specific frequency band, yielding an improved overall directivity.
[0042] To decrease the total number of microphones, preferably, at least two subarrays may have at least one microphone in common.
[0043] According to a preferred embodiment, the above handsfree systems may comprise a frame wherein each microphone of the microphone array is arranged in a predetermined, in particular fixed, position in or on the frame. This ensures that after manufacture of the frame with the microphones, the relative positions of the microphones are known. Such an array can be easily mounted in a vehicular cabin.
[0044] According to a preferred embodiment, at least one microphone may be a directional microphone. The use of directional microphones improves the array gain.
[0045] Preferably, at least one directional microphone may have a cardioid characteristic. This further improves the array gain. More preferably, the cardioid characteristic is a hyper-cardioid characteristic.
[0046] Advantageously, at least one directional microphone may be a differential microphone. This results in a microphone array with excellent directivity and small dimensions. In particular, the differential microphone may be a first order differential microphone.
[0047] The invention is further directed to a vehicle, particularly a car, comprising any of the above-described handsfree systems.
[0048] The invention is also directed to the use of any of the previously described handsfree systems in a vehicle, in particular, a car.
[0049] Furthermore, the invention provides a method for noise reduction in a vehicular handsfree system, comprising receiving input signals resulting from a microphone array with at least two microphones, processing the input signals by a beamformer to provide a beamformed signal, and adaptively filtering a signal resulting from the beamformed signal by an adaptive post-filter.
[0050] This method results in an excellent acoustic performance of a handsfree system in a vehicular environment.
[0051] According to a preferred embodiment, the adaptive filtering can be performed in the time domain. In this way, particularly the delay time is reduced.
[0052] Preferably, the method can further comprise

providing at least two adaptive filters, particularly Wiener filters, wherein

processing the input signals by a beamformer comprises beamsteering the input signals for providing beamsteered signals corresponding to one of the at least two microphones and summing the signals, and

adaptively filtering comprises receiving and processing at least one beamsteered signal by at least one of the at least two adaptive filters to determine adaptive filter parameters for the adaptive post-filter.

[0053] According to a preferred embodiment, adaptively filtering can further comprise receiving a signal resulting from the beamformed signal by at least one adaptive filter, and processing the beamsteered signal can comprise determining adaptive filter parameters using the at least one beamsteered signal and the signal resulting from the beamformed signal.
[0054] Preferably, for each beamsteered signal, an adaptive filter can be provided for determining adaptive filter parameters using the beamsteered signal and the signal resulting from the beamformed signal.
[0055] In order to reduce low frequency correlated noise, receiving at least one beamsteered signal by at least one of the at least two adaptive filters can comprise processing the at least one beamsteered signal by a pre-emphasis filter, in particular, comprising a pre-whitening filter.
[0056] According to an advantageous embodiment, the above methods can further comprise processing a signal resulting from the microphone array by an inverse filter, in particular, a warped inverse filter.
[0057] Preferably, the methods can further comprise non-adaptively filtering a signal resulting from the adaptively filtered signal and/or processing a signal resulting from the adaptively filtered signal by a pre-emphasis filter.
[0058] The above method, advantageously, can further comprise processing a signal resulting from the microphone array, particularly resulting from the beamformed signal, by an adaptive noise canceller (ANC) and/or an acoustic echo canceller (AEC) and/or a residual echo suppression (RES) means.
[0059] According to a preferred embodiment, the input signals can be processed by a non-adaptive and/or superdirective and/or minimum variance distortionless response (MVDR) beamformer.
[0060] The invention also provides a computer program product comprising one or more computer readable media having computer-executable instructions for performing the steps of the above described methods.
[0061] Additional features and advantages will be described with reference to the examples illustrated in the drawings:
`
Fig. 1 illustrates the structure of a handsfree system according to the invention with an adaptive post-filter in the time domain;

Fig. 2 shows the structure of a beamformer in the frequency domain;

Fig. 3 illustrates an FXLMS algorithm;

Fig. 4 shows the structure of a beamformer in the time domain;

Figs. 5A, 5B illustrate preferred embodiments of arrangements of the microphone array in a vehicle;

Figs. 6A, 6B illustrate preferred embodiments of arrangements of a microphone array in a mirror;

Fig. 7 shows a microphone array consisting of three subarrays;

Fig. 8 illustrates a superdirective beamformer in a GSC structure;

Fig. 9 illustrates a microphone array with two microphones in a noise field with a noise-free sector;

Fig. 10 shows the structure of a superdirective beamformer comprising four first order gradient microphones;

Fig. 11 illustrates the structure of a handsfree system with an electrical ANC;

Fig. 12 shows the structure of an ANC;

Fig. 13 shows the structure of an embodiment of a handsfree system according to the invention with an ANC and AEC;

Fig. 14 illustrates the structure of an AEC; and

Fig. 15 shows another embodiment of a handsfree system according to the invention.

[0062] An example of the handsfree system in accordance with the present invention is shown in Fig. 1. In the following, first, the general structure will be briefly described, and, then, the different components will be explained in more detail. In the figures, it is to be noted that the dotted lines encasing some elements simply serve for better understanding of the figures without necessarily implying any actual combination or separation of different elements.
[0063] The main components of the system are a microphone array, a beamformer and an adaptive post-filter in the time domain. The microphone array 101, in this example, comprises four microphones 102. Each microphone 102 yields an output signal $x_i[k]$. The microphone signals may be filtered by an optional high-pass filter 103.
[0064] Then, the signals are passed to a beamformer. This beamformer may be a conventional delay-and-sum beamformer. However, in the present example, a preferred superdirective beamformer is shown. Such a beamformer comprises beamsteering means 104 and filters 105. The output signals of the beamformer may be passed through optional inverse filters 106 and, then, are summed by summing means 107 to yield a resulting beamformed signal $x[k]$.
[0065] This signal is passed through an adaptive post-filter 108 in the time domain, which may be followed by an optional non-adaptive post-filter 109 and/or by an optional pre-emphasis filter (not shown). The adaption of the post-filter 108 is performed using a set of Wiener filters 110. The input signals of the Wiener filters 110 comprise, on the one hand, the individual signals resulting from the different microphones and, on the other hand, the summed signal $x[k]$. In the present example, the microphone signals are taken after the beamsteering. However, if the beamformer comprises further (superdirective) filters 105 as in the present case, it is also possible to take the microphone signals after this additional filtering. Before being presented to the Wiener filters, the microphone signals are passed through an optional pre-emphasis filter 111.
[0066] In the following, the functioning of a Wiener filter will be explained. A microphone signal $x[k]$ is the sum of the speech signal $s[k]$ and the noise $n[k]$. The microphone signal is filtered by an impulse response $w(i)$ to obtain a noise-reduced signal $\hat{s}[k]$. The aim is to minimize the mean square error between the undisturbed speech signal $s[k]$ and the output signal $\hat{s}[k]$:

$$E\{e^2[k]\} = E\{(s[k] - \hat{s}[k])^2\} = E\left\{\left(s[k] - \sum_{i=-\infty}^{\infty} w(i)\,x[k-i]\right)^2\right\} = \min.$$
`
[0067] In other words, the partial derivative of the mean square error with respect to the coefficients of the impulse response has to vanish, yielding the Wiener-Hopf equation:

$$\sum_{i=-\infty}^{\infty} w(i)\,r_{xx}(l-i) = r_{xs}(l),$$

wherein $r_{xx}(l)$ and $r_{xs}(l)$ are the auto-correlation function of the microphone signal and the cross-correlation function of the microphone signal and the undisturbed speech signal. One may assume that the speech signal and the noise are statistically independent, i.e. $r_{xs}(l) = r_{ss}(l)$, thus,

$$\sum_{i=-\infty}^{\infty} w(i)\,r_{xx}(l-i) = r_{ss}(l).$$

[0068] A transformation of this equation into the frequency domain yields the frequency response of the Wiener filter:

$$W(\omega) = \frac{\Phi_{ss}(\omega)}{\Phi_{xx}(\omega)} = \frac{\Phi_{ss}(\omega)}{\Phi_{ss}(\omega) + \Phi_{nn}(\omega)}.$$
`
[0069] In order to obtain a time-variant filter, the power spectral densities in the above equation may be replaced by the corresponding short-time estimated values that may be obtained, for example, by a recursive averaging:

$$W(\kappa,\nu) = \frac{E\{|S(\kappa,\nu)|^2\}}{E\{|X(\kappa,\nu)|^2\}},$$

wherein $S(\kappa,\nu)$ and $X(\kappa,\nu)$ are short-time spectra that may be determined, for example, with the help of DFT filter banks or an FFT. Here, $\kappa$ is the time index and $\nu$ the frequency index; $E\{\cdot\}$ represents the short-time average that may be obtained, for example, with the help of a first order IIR filter.
[0070] The short-time auto power spectral density of the speech signal in the numerator of the above equation is to be estimated in a suitable way. Appropriate estimation methods include spectral subtraction (estimating the auto power spectral density of the noise), a minimum mean square error short-time spectral amplitude (MMSE STSA) estimator, an MMSE log-SA estimator or a speech pause detector, for example.
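
The recursive averaging and the spectral-subtraction estimate mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the noise power spectral density is taken as already known, the smoothing constant `beta` and the gain `floor` are hypothetical parameters, and the function names are chosen for this sketch only.

```python
def smooth_psd(prev_psd, spectrum, beta=0.9):
    """First-order IIR (recursive) average of |X|^2 per frequency bin.

    prev_psd: previous short-time PSD estimate, one value per bin.
    spectrum: current short-time spectrum (complex values), one per bin.
    beta:     smoothing constant, an assumed value between 0 and 1.
    """
    return [beta * p + (1.0 - beta) * abs(x) ** 2
            for p, x in zip(prev_psd, spectrum)]


def wiener_gain(psd_x, psd_n, floor=0.0):
    """Wiener gain per bin, speech PSD estimated by spectral subtraction."""
    gains = []
    for px, pn in zip(psd_x, psd_n):
        ps = max(px - pn, 0.0)            # estimated speech PSD, clipped at 0
        g = ps / px if px > 0.0 else 0.0  # ratio of speech PSD to input PSD
        gains.append(max(g, floor))       # optional lower bound on the gain
    return gains
```

In each frame, `smooth_psd` would be applied to the current short-time spectrum and `wiener_gain` to the resulting estimate; bins dominated by noise receive a gain near zero, bins dominated by speech a gain near one.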
[0071] It is also possible to estimate the short-time auto power spectral density of the noise signal with the help of the coherence between two or more microphones. In a second step, the estimated short-time auto power spectral density of the noise signal may be used to estimate the absolute value of the most probable Fourier coefficient (using, for example, a spectral subtraction algorithm or an MMSE log-SA estimator) and to reconstruct the absolute value of the spectrum of the speech signal. For the multi-channel noise reduction, one assumes that the coherence or the cross power spectral density of the noise signals received by the microphones is vanishing. In the case of two microphones, for example, the microphone signals have the form:

$$x_1[k] = \sum_{i} h_1(i)\,s[k-i] + n_1[k] \quad\text{and}\quad x_2[k] = \sum_{i} h_2(i)\,s[k-i] + n_2[k],$$

wherein $h_1(i)$ and $h_2(i)$ are the impulse responses representing the acoustic transfer between the source of speech and the microphones. Both parts of the speech signal filtered in this way are superimposed with the uncorrelated noise signals $n_1[k]$ and $n_2[k]$.
[0072] Since $\Phi_{n_1 n_2}(\omega) = 0$ and assuming that the Fourier transforms ($H_1(\omega)$ and $H_2(\omega)$) of the impulse responses of the acoustic transfer ($h_1(i)$ and $h_2(i)$) obey $|H_1(\omega)| = |H_2(\omega)|$, one obtains for the short-time auto power spectral density

[Equation illegible in the source: $\Phi_{ss}(\kappa,\nu)$ expressed through $P\{\cdot\}$ applied to short-time averages $E\{\cdot\}$ of $X_1(\kappa,\nu)$ and $X_2(\kappa,\nu)$]

wherein

$$P\{x\} = \begin{cases} x, & x \geq 0 \\ 0, & x < 0. \end{cases}$$

[0073] The corresponding Wiener filter, then, has the form

[Equation illegible in the source: Wiener filter expressed through the power spectral densities $\Phi_{x_1 x_2}(\omega)$ and $\Phi_{x_1 x_1}(\omega)$]
`
[0074] In Fig. 1, the adaption of the post-filter $w(k,i)$ - $k$ being the time index and $i$ denoting the coefficient within the impulse response - is performed in the time domain, for example, with the help of the LMS algorithm. The background Wiener filters $w_1(k,i), \ldots, w_4(k,i)$ are to minimize the error signals $e_1[k], \ldots, e_4[k]$ such that, for example, the filter $w_1(k,i)$ tends towards the frequency response

[Equation illegible in the source: frequency response towards which the background Wiener filter converges, expressed through $S(\omega)$ and the transfer functions $H_1(\omega), \ldots, H_4(\omega)$, followed by a "wherein" definition]

[0075] The form of the other three Wiener filters is obtained by a cyclic permutation of the indices.
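
The time-domain LMS adaption of a background filter can be sketched as follows. This is a generic LMS update, not the specific signal routing of Fig. 1: which signal serves as filter input `x` and which as desired signal `d` is left open here, and the function name `lms`, the step size `mu` and the tap count are assumptions made for the illustration.

```python
import random


def lms(x, d, num_taps, mu):
    """Generic LMS adaption of an FIR filter (illustrative sketch).

    x:        input samples fed to the adaptive filter.
    d:        desired samples the filter output should approximate.
    num_taps: number of filter coefficients.
    mu:       step size (assumed; must be small enough for stability).
    Returns the final coefficients and the error signal e[k] = d[k] - y[k].
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input samples, newest first
    errors = []
    for xk, dk in zip(x, d):
        buf = [xk] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))        # filter output
        e = dk - y                                        # error signal
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]  # gradient step
        errors.append(e)
    return w, errors
```

As a usage example, feeding white noise as `x` and a version of it filtered by a short, fixed impulse response as `d` lets the coefficients converge towards that impulse response, while the error signal decays towards zero.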
[0076] It is to be understood that the system is not restricted to a particular number of Wiener filters 110. Furthermore, not every Wiener filter 110 always has to be used to determine the adaptive post-filter 108. For example, one may use only the Wiener filter which uses the microphone signal of the microphone proximal to the source of speech.
[0077] Preferably, however, the adaptive post-filter 108 is determined as

$$w[k,i] = \frac{1}{4}\left(w_1[k,i] + w_2[k,i] + w_3[k,i] + w_4[k,i]\right).$$

[0078] The filter is linear-phase if the filter coefficients satisfy

$$w[k,i] = w[k,L-i].$$

[0079] Using this symmetry condition, the filter coefficients of a linear-phase post-filter (with length L) can be obtained. Accordingly, the linear-phase post-filter has twice the length of one of the background filters 110 (with length L/2). Such a linear-phase filter only modifies the amplitude spectrum of the input signal of the filter without a frequency-dependent distortion of the phase spectrum.
[0080] The performance of the filter can be further improved by smoothing its frequency response. This can be achieved by weighting the filter coefficients with a window function.
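
One hypothetical way to obtain a symmetric coefficient set of twice the background filter length, and to weight it with a window function, is sketched below. The mirroring construction, the Hann-type window and the function names are assumptions made for illustration; the application does not specify how the symmetric coefficients are built or which window is used. With 0-based indexing of a list of length L, the symmetry reads w[i] = w[L-1-i].

```python
import math


def symmetric_extension(half):
    """Mirror L/2 coefficients into a symmetric filter of length L."""
    return half[::-1] + half


def apply_window(coeffs):
    """Weight the coefficients with a symmetric Hann-type window."""
    n = len(coeffs)
    return [c * 0.5 * (1.0 - math.cos(2.0 * math.pi * (i + 0.5) / n))
            for i, c in enumerate(coeffs)]
```

Because the window itself is symmetric about the centre of the filter, the weighted coefficients remain symmetric, so the linear-phase property is preserved while the frequency response is smoothed.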
[0081] The inverse filters 106 serve to compensate for the acoustic transfer function of the path between the source of speech and the microphones.
[0082] In Figure 2, the structure of a superdirective beamformer is shown. The beamformer shown in this figure performs the filtering in the frequency domain, in contrast to the case of Figure 1. If a beamformer in the frequency domain were used in Figure 1, an inverse Fourier transform would have to be performed on the signals before passing the signals to the Wiener filters 110 or the pre-emphasis filter 111.
`
[0083] In Figure 2, the microphone array consists of M microphones 102, each yielding a signal $x_i(t)$. The signals $x_i(t)$ are transferred to the frequency domain by fast Fourier transform (FFT) means 201, resulting in a signal $X_i(\omega)$. In general, the beamforming consists of a beamsteering and a filtering. The beamsteering is responsible for the propagation time compensation. The beamsteering is performed by a steering vector

$$d(\omega) = \left[a_0\,e^{-j2\pi f\tau_0},\; a_1\,e^{-j2\pi f\tau_1},\; \ldots,\; a_{M-1}\,e^{-j2\pi f\tau_{M-1}}\right]$$

with

$$a_n = \frac{\lVert q - p_{ref}\rVert}{\lVert q - p_n\rVert}$$

and

$$\tau_n = \frac{\lVert q - p_{ref}\rVert - \lVert q - p_n\rVert}{c},$$

wherein $p_{ref}$ denotes the position of a reference microphone, $p_n$ the position of microphone $n$, $q$ the position of the source of sound (for example, the speaker), $f$ the frequency and $c$ the velocity of sound. In the far field, one has

$$a_0 = a_1 = \ldots = a_{M-1} = 1.$$
`
[0084] According to a rule of thumb, one has the far field situation if the source of the useful signal is more than twice as far from the microphone array as the maximum dimension of the array. In Figure 2, a far field beamformer is shown since only a phase factor $e^{j\omega\tau_k}$, denoted by reference sign 202, is applied to the signals $X_k(\omega)$.
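
The steering quantities and the far field rule of thumb can be sketched as follows. The sketch assumes amplitude factors a_n = |q - p_ref| / |q - p_n|, delays tau_n = (|q - p_ref| - |q - p_n|) / c and a phase term exp(-j 2 pi f tau_n); the sign convention, the speed of sound of 343 m/s and the function names are assumptions of this illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def distance(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def steering_vector(mic_positions, ref_position, source, freq,
                    c=SPEED_OF_SOUND):
    """Steering vector entries a_n * exp(-j 2 pi f tau_n), one per microphone."""
    d_ref = distance(source, ref_position)
    vec = []
    for p in mic_positions:
        d_n = distance(source, p)
        a_n = d_ref / d_n            # amplitude compensation factor
        tau_n = (d_ref - d_n) / c    # delay relative to the reference
        vec.append(a_n * cmath.exp(-2j * math.pi * freq * tau_n))
    return vec


def is_far_field(source_distance, array_dimension):
    """Rule of thumb: source more than twice the array dimension away."""
    return source_distance > 2.0 * array_dimension
```

For the reference microphone itself the entry is exactly 1; for the other microphones the magnitude approaches 1 as the source moves away, which corresponds to the far field case in which only a phase factor remains.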
[0085] After the beamsteering, the signals are filtered by superdirective filters 203 that are filters in the frequency domain. The filtered signals are summed, yielding a signal $Y(\omega)$. After an inverse fast Fourier transform (IFFT) by means 204, the resulting signal $y[k]$ is obtained.
[0086] The optimal filter coefficients $A_i(\omega)$ may be computed according to

$$A_i(\omega) = \frac{\Gamma(\omega)^{-1}\,d(\omega)}{d(\omega)^H\,\Gamma(\omega)^{-1}\,d(\omega)},$$

wherein the superscript H denotes Hermitian transposing and $\Gamma(\omega)$ is the complex coherence matrix

$$\Gamma(\omega) = \begin{pmatrix} 1 & \Gamma_{X_1 X_2}(\omega) & \cdots & \Gamma_{X_1 X_M}(\omega) \\ \Gamma_{X_2 X_1}(\omega) & 1 & \cdots & \Gamma_{X_2 X_M}(\omega) \\ \vdots & \vdots & \ddots & \vdots \\ \Gamma_{X_M X_1}(\omega) & \Gamma_{X_M X_2}(\omega) & \cdots & 1 \end{pmatrix},$$

the entries of which are the coherence functions that are defined as the normalized cross-power spectral density of two signals

$$\Gamma_{X_i X_j}(\omega) = \frac{P_{X_i X_j}(\omega)}{\sqrt{P_{X_i X_i}(\omega)\,P_{X_j X_j}(\omega)}}.$$
`
[0087] Preferably, the beamsteering is separated from the filtering step, which reduces the steering vector in the design equation for the filter coefficients $A_i(\omega)$ to the unity vector

$$d(\omega) = (1, 1, \ldots, 1)^T.$$

(The superscript T denotes transposing.)
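
The design rule A = Γ⁻¹ d / (dᴴ Γ⁻¹ d) for a single frequency can be sketched as follows. The regularization is implemented here by adding μ to the diagonal of the coherence matrix, which is one possible reading of inserting the regularization parameter into the coherence matrix and is an assumption of this sketch; the function names and the small linear solver are likewise illustrative.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc
                          for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def superdirective_weights(coherence, steering, mu=0.0):
    """Filter coefficients for one frequency bin (illustrative sketch).

    coherence: M x M coherence matrix (list of lists, Hermitian).
    steering:  steering vector of length M.
    mu:        regularization added to the diagonal (assumed variant).
    """
    n = len(steering)
    reg = [[coherence[i][j] + (mu if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    num = solve(reg, steering)  # (Gamma + mu I)^-1 d
    den = sum(d.conjugate() * v for d, v in zip(steering, num))  # d^H (...) d
    return [v / den for v in num]
```

With an identity coherence matrix (uncorrelated noise) and the unity steering vector, the result is equal weights of 1/M, i.e. the delay-and-sum beamformer. For any Hermitian coherence matrix the weights satisfy the distortionless condition Aᴴ d = 1.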
`
[0088] In the case of an isotropic noise field in three dimensions (diffuse noise field), the coherence is given by
`
`Ty,x,(@)=of
`
`