`Park et al.
`
(54) SPEECH PROCESSING SYSTEM AND METHOD FOR ENHANCING A SPEECH
SIGNAL IN A NOISY ENVIRONMENT

(75) Inventors: Sangil Park; Ed F. Martinez, both of Austin, Tex.;
Dae-Hee Youn, Seoul, Rep. of Korea

(73) Assignee: Motorola Inc., Schaumburg, Ill.

(21) Appl. No.: 54,494
(22) Filed: Apr. 30, 1993
(51) Int. Cl.: G10L 3/02; G10L 9/00
(52) U.S. Cl.: 395/2.36; 395/2.12; 395/2.35; 395/2.37; 395/2.42
(58) Field of Search: 381/38, 47, 71-72; 395/2, 2.1, 2.34-2.36, 24,
2.42, 2.67, 2.74, 2.12, 2.7; 379/410; 364/724.19

(56) References Cited
`
`U.S. PATENT DOCUMENTS
4,184,049   1/1980   Crochiere et al.    395/2.38
4,335,276   6/1983   Bull et al.         179/1 SP
4,468,804   8/1984   Kates et al.        395/2.74
4,490,839   12/1984  Bunge               381/47
4,653,102   3/1987   Hansen              395/2
4,747,143   5/1988   Kroeger et al.      395/2.34
4,852,169   7/1989   Veeneman et al.     381/38
4,916,743   4/1990   Takizawa et al.     381/.45
4,952,931   8/1990   Serageldin et al.   395/2
5,083,310   1/1992   Drory               395/22
5,148,488   9/1992   Chen et al.         381/47
5,251,263   10/1993  Andrea et al.       381/71
5,260,896   11/1993  Iwasaki             364/724.19
5,293,425   3/1994   Oppenheim et al.    381/71
5,319,736   6/1994   Hunt                395/2.36
5,353,376   10/1994  Oh et al.           395/2.42
5,396,554   3/1995   Hirano et al.       379/410
`
`OTHER PUBLICATIONS
`
Kim et al., "Adaptive multichannel digital filter with
lattice-escalator hybrid structure", ICASSP 90, pp. 1413-1416
vol. 3, 3-6 Apr. 1990.
`
US005590241A
(11) Patent Number: 5,590,241
(45) Date of Patent: Dec. 31, 1996
`
Park; "170 MIPS Real-Time Adaptive Digital Filter Board";
Mota, Inc.; Oct. 4-8, 91st Convention; pp. 1-20 (1991).
Park; "Full-Duplex Hands-Free Mobile Phone System for a
Digital Cellular"; Engineering Society; pp. 1-12 (Mar. 1-5,
1993).
Boll, et al.; "Suppression of Acoustic Noise in Speech Using
Two Microphone Adaptive Noise Cancellation"; IEEE
Trans. on ASSP; vol. 28, No. 6, pp. 752-753 (1980).
Widrow, et al.; "Adaptive Noise Cancelling: Principles and
Applications"; Proceedings of The IEEE; vol. 63, No. 12,
pp. 1692-1716 (1975).
Milenkovic, et al.; "Two Point Acoustic Reconstruction of
Sustained Vowel Area Functions"; Elsevier Science Pub.
B.V., pp. 351-362 (1985).
Petek; "Two-Channel Speech Analysis Using Accelerometer
Output Signal As A Ref. Signal"; Dept. of Computer
Science; pp. 114-117 (1989).
Viswanathan, et al.; "Noise-Immune Speech Transduction
Using Multiple Sensors"; B. Beranek & Newman Inc.; IEEE
Internat'l Conf. on ASSP; vol. 2, pp. 712-715 (1985).
`(List continued on next page.)
`Primary Examiner-Tariq R. Hafiz
`Attorney, Agent, or Firm-Paul J. Polansky
(57) ABSTRACT
A speech processing system (30) operates in a noisy
environment (20) by performing adaptive prediction between
`inputs from two sensors positioned to transduce speech from
`a speaker, such as an accelerometer and a microphone. An
`adaptive filter (37) such as a finite impulse response (FIR)
`filter receives a digital accelerometer input signal, adjusts
`filter coefficients according to an estimation error signal, and
`provides an enhanced speech signal as an output. The
`estimation error signal is a difference between a digital
`microphone input signal and the enhanced speech signal. In
`one embodiment, the adaptive filter (37) selects a maximum
`one of a first predicted speech signal based on a relatively
`large smoothing parameter and a second predicted speech
`signal based on a relatively-small smoothing parameter, with
`which to normalize a predicted signal power. The predicted
`signal power is then used to adapt the filter coefficients.
`
`9 Claims, 2 Drawing Sheets
`
`
`
`
`
`
`
`
`
`
`
[Front-page figure: block diagram of speech processing system 30.
Legible labels: ACCELEROMETER, ACCELEROMETER INPUT SIGNAL,
MICROPHONE INPUT SIGNAL, ESTIMATION ERROR, ENHANCED SPEECH SIGNAL.]
`
`Exhibit 1008
`Page 01 of 10
`
`
`
`
`OTHER PUBLICATIONS
`
Viswanathan, et al.; "Multisensor Speech Input For
Enhanced Imm. To Acoustic Background Noise"; B.
Beranek & Newman Inc.; IEEE; pp. 193-196 (1984).
Krishnamurthy, et al.; "Two-Channel Speech Analysis";
IEEE Trans. on ASSP; vol. 34, No. 4, pp. 730-743 (1986).
Viswanathan, et al.; "Evaluation of Multisensor Speech
Input for Speech Recognition in High Ambient Noise"; BBN
Laboratories; pp. 1-4 (1986).
`
`
`
`
U.S. Patent    Dec. 31, 1996    Sheet 1 of 2    5,590,241

[FIG. 1: block diagram of speech processing system 30 in noisy
environment 20. Legible labels: NOISE SOURCE, MICROPHONE INPUT
SIGNAL, ACCELEROMETER INPUT SIGNAL, ADC, ADAPTIVE FILTER,
ESTIMATION, ENHANCED SPEECH.]
`
`
`
`
U.S. Patent    Dec. 31, 1996    Sheet 2 of 2    5,590,241

[Drawing sheet 2 of 2. Legible labels: UNVOICED, VOICED, x(k).]
`
`
`
SPEECH PROCESSING SYSTEM AND METHOD FOR ENHANCING A SPEECH
SIGNAL IN A NOISY ENVIRONMENT

FIELD OF THE INVENTION
This invention relates generally to signal processing systems,
and more particularly, to speech processing systems.

BACKGROUND OF THE INVENTION
In a typical speech processing system, a microphone is
used to recover the speech. The microphone produces an
analog signal corresponding to the acoustic vibrations it
receives. However, some environments are so noisy that the
microphone input signal cannot be understood. These
extremely noisy environments may also produce noise
which is constantly changing and thus is very difficult to
filter. Cellular telephones, cordless telephones, and mobile
radios are frequently used in environments with high noise
levels.
One technique for discerning speech in these extremely
noisy environments is to use two input sensors, such as two
microphones or a microphone and an accelerometer. The
inputs from the two sensors are filtered in analog filters,
weighted, and combined to produce an enhanced speech
signal. See, for example, Viswanathan et al., "Noise Immune
Speech Transduction Using Multiple Sensors," IEEE
International Conference on Acoustics, Speech, and Signal
Processing, vol. ICASSP-85, pp. 19.1.1-19.1.4, March 1985.
Another technique uses a first microphone placed in
proximity to the speaker, to recover a speech signal having a
large noise component. See S. Boll and D. Pulsipher,
"Suppression of Acoustic Noise in Speech Using Two
Microphone Adaptive Noise Cancellation," IEEE Transactions on
Acoustics, Speech and Signal Processing, vol. ASSP-28, no.
6, pp. 752-754, December 1980. A second microphone is
physically isolated from the speaker so as to recover
primarily the noise but not the speech. The noise component is
subtracted from the input of the first microphone in an
adaptive filter in order to recover the speech signal with an
improved signal-to-noise ratio (SNR). While both of these
techniques are able to improve the SNR in extremely noisy
environments, more improvement is desirable.
In addition, if adaptive filtering is used, it is impossible to
arrive at an optimum filter response using conventional
adaptive filtering techniques. The result is that the filter is
either sometimes over-responsive or sometimes
under-responsive. An adaptive filter with a least-mean-squares
(LMS) predictor, such as the filter used by Boll and
Pulsipher, has this problem. A known variant of the LMS
technique, the normalized LMS (NLMS) predictor, also has
this problem. The NLMS predictor is able to compensate for
large changes in signal power by normalizing filter
coefficients in relation to the magnitude of the expected signal
power. Thus, for example, the NLMS predictor can adapt the
filter coefficients at large signal power as accurately as at
low signal power. However, the responsiveness of the
NLMS predictor depends on the value of a smoothing
parameter $\beta$, which ranges from 0 to 1. There is a tradeoff
in filter responsiveness depending on the value of $\beta$. If
$\beta$ is too small, i.e. too much less than 1, then the filter is
over-responsive, leading to unstable response. If $\beta$ is too
large, i.e. too close to 1, however, the filter is
under-responsive and rapid changes in the input signal power are
reflected in the output only very slowly. Thus, both a speech
processing system which works well in extremely noisy
environments and an adaptive filter which has better
responsiveness are needed.

SUMMARY OF THE INVENTION
Accordingly, there is provided, in one form, a speech
processing system for enhancing speech signals in noisy
environments. The speech processing system has a first input
terminal for receiving a first digital input signal produced
from a first sensor, and a second input terminal for receiving
a second digital input signal produced from a second sensor.
The first and second digital input signals are characterized as
having correlated speech components. The speech
processing system also includes an adaptive filter and a summing
device. The adaptive filter has a signal input for receiving the
second digital input signal, a feedback input for receiving an
estimation error signal, and an output for providing an
enhanced speech signal. The summing device has a positive
input terminal for receiving the first digital input signal, a
negative input terminal for receiving the enhanced speech signal,
and an output terminal coupled to the feedback input
of the adaptive filter, for providing the estimation error
signal.
In another form, there is provided a method for enhancing
speech signals in a noisy environment. A speech signal x(k)
is provided to an input of an adaptive finite impulse response
(FIR) filter. A first signal power estimate y(k) at a sample
point k is computed by the formula
$y(k) = \beta_1 y(k-1) + (1-\beta_1) x^2(k)$. A second signal power
estimate z(k) at sample point k is computed by the formula
$z(k) = \beta_2 z(k-1) + (1-\beta_2) x^2(k)$. A value of $\beta_1$ is
chosen to be greater than a value of $\beta_2$. An overall signal
power estimate yz(k) at the sample point k is selected as a
maximum of the first signal power estimate y(k) and the second
signal power estimate z(k). A plurality of FIR filter coefficients
of the adaptive FIR filter are recursively updated according to a
normalized least-mean-squares (NLMS) prediction using the overall
signal power estimate yz(k) and an estimation error signal. An
output of the adaptive FIR filter is provided as an enhanced
speech signal.
These and other features and advantages will be more
clearly understood from the following detailed description
taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates in block diagram form a speech
processing system in accordance with the present invention.
FIG. 2 illustrates in block diagram form a source model
useful in analyzing the speech processing system of FIG. 1.
FIG. 3 illustrates in block diagram form a model
of the speech processing system of FIG. 1.
FIG. 4 illustrates in block diagram form an embodiment
of the adaptive filter of FIG. 1 in accordance with the present
invention.

DETAILED DESCRIPTION OF A PREFERRED
EMBODIMENT
FIG. 1 illustrates in block diagram form a speech
processing system 30 in accordance with the present invention.
Speech processing system 30 exists in a noisy environment
20. Noisy environment 20 includes a human being 21 acting
as a speech signal source, and a noise source 22. Noise
source 22 represents all environmental noise, such as air
flow and engine noise from an aircraft cockpit.
Speech processing system 30 includes a microphone 31,
an amplifier 32, an analog-to-digital converter (ADC) 33, an
accelerometer 34, an amplifier 35, an ADC 36, an adaptive
filter 37, and a summing device 38. Microphone 31 is a
conventional audio microphone such as a unidirectional
microphone, which is acoustically coupled to noisy
environment 20. Microphone 31 has an output which provides an
electrical signal to an input of amplifier 32. Amplifier 32 is
a conventional analog amplifier which amplifies the audio
signal to a level which may be converted to a digital signal.
Thus, amplifier 32 has an output connected to an input of
ADC 33. ADC 33 is a conventional analog-to-digital
converter such as a resistor/capacitor array converter, or
preferably, an oversampled analog-to-digital converter based on
a sigma-delta modulator. ADC 33 provides a digital signal,
labelled "MICROPHONE INPUT SIGNAL", at an output
thereof. MICROPHONE INPUT SIGNAL is responsive to
all sound received from noisy environment 20, including
that received from noise source 22, and thus in noisy
environment 20 has a relatively low signal-to-noise ratio
(SNR).
Accelerometer 34 is physically attached to the larynx area
of the neck of human being 21, and provides an electrical
output signal at an output thereof indicative of vibrations
present at the larynx area of the neck of human being 21.
Accelerometer 34 is preferably a conventional piezoelectric
accelerometer which produces an output correlated with the
speech component of the output of microphone 31. Since the
tissue between the larynx and the neck acts like a lowpass
filter, accelerometer 34 produces a signal which has
primarily low-frequency speech components. Also, accelerometer
34 is insensitive to the acoustic noise produced by noise
source 22. It should be noted that other types of
electromechanical transducers which produce outputs having signal
components highly correlated with the speech component of
microphone 31 may also be used.
Amplifier 35 has an input connected to accelerometer 34,
and an output coupled to an input of ADC 36. Amplifier 35
is also a conventional analog amplifier which amplifies the
audio signal provided by accelerometer 34 to a level which
may be converted to a digital signal. ADC 36 has an output
for providing a digital signal labelled "ACCELEROMETER
INPUT SIGNAL". ADC 36 is also a conventional
analog-to-digital converter such as a resistor/capacitor array
converter, or preferably, an oversampled analog-to-digital
converter based on a sigma-delta modulator.
Adaptive filter 37 has a signal input for receiving
ACCELEROMETER INPUT SIGNAL, an error input, and
an output for providing a signal labelled "ENHANCED
SPEECH SIGNAL". Summing device 38 has a positive
input terminal for receiving MICROPHONE INPUT SIGNAL,
a negative input terminal for receiving the
ENHANCED SPEECH SIGNAL, and an output terminal for
providing a signal labelled "ESTIMATION ERROR" to the
error input of adaptive filter 37.
Speech processing system 30 is able to provide a speech
signal which cancels a significant amount of the noise
produced by noise source 22 by performing adaptive
filtering on ACCELEROMETER INPUT SIGNAL based on the
error signal developed between the ENHANCED SPEECH
SIGNAL and the MICROPHONE INPUT SIGNAL. Thus,
speech processing system 30 adaptively filters the input of
one sensor based on the input of a second sensor which has
a correlated speech component. In addition, adaptive filter
37 recursively updates its coefficients to respond to the
changing noise characteristics of extremely noisy
environments, such as aircraft cockpits.
In a preferred form, ADC 33 and ADC 36 are hardware
converters integrated with a general-purpose digital signal
processor on a single integrated circuit. This digital signal
processor (not shown) performs the summing associated
with summing device 38, and the adaptive filtering
associated with adaptive filter 37, efficiently through software
using its instruction set. However, in other embodiments, the
functions of speech processing system 30 may be performed
by different combinations of hardware and software.
FIG. 2 illustrates in block diagram form a source model
40 useful in analyzing speech processing system 30 of FIG.
1. An excitation signal ex(k) is the result of two types of
sounds: unvoiced sounds, modeled as white noise, which
correspond to consonants; and voiced sounds, modeled as
a periodic impulse signal, which correspond to vowels. The
excitation signal ex(k) thus alternates between the two types
of sounds, represented by a switch 41 alternating between a
first source labelled "UNVOICED" and a second source
labelled "VOICED". A human vocal tract system 42 is
excited by signal ex(k) and provides a transfer function
labelled "V(z)" by which signal ex(k) becomes a speech
signal labelled "s(k)". A summing device 43 sums speech
signal s(k) with an environmental noise signal labelled "n(k)"
to provide a resulting signal labelled "d(k)" which is
received by the primary sensor, such as microphone 31.
Signal d(k) corresponds to the MICROPHONE INPUT
SIGNAL of FIG. 1. A transfer function 44 labelled "H(z)"
represents the transfer function between the vocal cord
vibration and the secondary sensor, such as accelerometer
34. A transfer function 45 labelled "T(z)" represents the path
transferring the speech signal to the secondary sensor. A
summing device 46 represents the combination of these
paths to the secondary sensor by summing the outputs of
transfer functions H(z) and T(z) to provide secondary signal
x(k). Secondary signal x(k) corresponds to the
ACCELEROMETER INPUT SIGNAL of FIG. 1.
The desired speech signal s(k) can be obtained by the
following z-domain equation:

$$S(z) = A(z)\,X(z) \tag{1}$$

where

$$A(z) = \frac{V(z)}{H(z) + V(z)\,T(z)} \tag{2}$$

To find time-domain signal s(k), it is necessary to find the
transfer function, A(z), when the reference input signal x(k)
and the noise-corrupted speech signal d(k) are given.
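The relation between S(z) and X(z) follows from the source model of FIG. 2. The short derivation below is a sketch under two assumptions drawn from the description of transfer functions 44 and 45: that H(z) acts on the excitation ex(k) and that T(z) acts on the speech signal s(k). The symbol E(z), denoting the z-transform of ex(k), is introduced here only for illustration.

```latex
\begin{align*}
S(z) &= V(z)\,E(z) \\
X(z) &= H(z)\,E(z) + T(z)\,S(z)
      = \bigl[H(z) + V(z)\,T(z)\bigr]\,E(z) \\
\frac{S(z)}{X(z)} &= \frac{V(z)}{H(z) + V(z)\,T(z)} \;\equiv\; A(z)
\end{align*}
```

Under these assumptions the excitation cancels, so s(k) can in principle be recovered from x(k) alone once A(z) is known.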
To find A(z), it is helpful to refer to FIG. 3, which
illustrates in block diagram form a model 50 of speech
processing system 30 of FIG. 1. Here, signal x(k) passes
through a first transfer function 51 labelled "A(z)" to provide
signal s(k), which is then modeled as being combined with
signal n(k) in a summing device 52 to provide signal d(k).
Signal x(k) is also passed through a transfer function 53
which implements an estimate of transfer function A(z),
labelled "Â(z)", to provide a speech estimate signal labelled
"y(k)", corresponding to the ENHANCED SPEECH SIGNAL
of FIG. 1. Signal y(k) is then subtracted from signal
d(k) in a summing device 54 to provide an error signal
labelled "e(k)", which corresponds to the ESTIMATION
ERROR of FIG. 1.
The desired response, d(k), is a speech signal with
additive noise, s(k)+n(k). The secondary input x(k) to adaptive
filter 37 (corresponding to ACCELEROMETER INPUT
SIGNAL) is generated from the excitation signal but the
characteristics are changed by transfer function A(z). Since
s(k) and x(k) are generated from the same source signal,
both signals are statistically closely correlated. Thus, when
s(k) and n(k) are uncorrelated, adaptive filter 37 tries to
minimize the mean-squared error by making y(k)≈s(k),
thereby making y(k) an approximation of signal s(k). During
adaptation, the mean-squared error is driven to a minimum
by eliminating the speech signal components, s(k), so that
the minimum mean-squared error produces the maximum
SNR for the estimated output y(k).
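The reasoning above can be made explicit by expanding the mean-squared error. The sketch below assumes, beyond what the text states, that n(k) is zero-mean and uncorrelated with x(k) as well as with s(k); since y(k) is a linear filtering of x(k), n(k) is then also uncorrelated with y(k).

```latex
\begin{align*}
e(k) &= d(k) - y(k) = \bigl[s(k) - y(k)\bigr] + n(k) \\
E\bigl[e^2(k)\bigr]
  &= E\bigl[(s(k) - y(k))^2\bigr]
   + 2\,E\bigl[n(k)\,(s(k) - y(k))\bigr]
   + E\bigl[n^2(k)\bigr] \\
  &= E\bigl[(s(k) - y(k))^2\bigr] + E\bigl[n^2(k)\bigr]
\end{align*}
```

The noise power does not depend on the filter coefficients, so minimizing the mean-squared error is equivalent to minimizing the mean-squared difference between y(k) and s(k).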
If adaptive filter 37 is implemented as a finite impulse
response (FIR) filter, the transfer function A(z) is the sum
from i=0 to i=N-1 of $c_i z^{-i}$, where N represents the
number of coefficients and $c_i$ represents the coefficient value
for the ith coefficient. An adaptive FIR filter recursively
finds the coefficients to minimize the mean-squared error.
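In the time domain, this sum of $c_i z^{-i}$ terms means the filter output is a weighted sum of the current and past input samples. A minimal Python sketch follows; the function name `fir_output` and the list layout (newest sample first) are illustrative assumptions, not part of the patent.

```python
def fir_output(coeffs, taps):
    """Return y(k) = sum over i of c_i * x(k - i).

    coeffs[i] is c_i; taps[i] is x(k - i), i.e. newest sample first.
    Both names are hypothetical, chosen for this sketch.
    """
    if len(coeffs) != len(taps):
        raise ValueError("need exactly one tap value per coefficient")
    return sum(c * x for c, x in zip(coeffs, taps))


# Three-coefficient example: 0.5*1.0 + 0.25*2.0 + 0.125*4.0
y_k = fir_output([0.5, 0.25, 0.125], [1.0, 2.0, 4.0])
print(y_k)
```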
`FIG. 4 illustrates in block diagram form an embodiment
`of adaptive filter 37 of FIG. 1 in accordance with the present
`invention. In FIG. 4, all signals are represented with generic
`signal processing designations corresponding to those used
`in FIGS. 2 and 3 to emphasize the fact that adaptive filter 37
`may be used in other systems besides speech processing
`system 30. Adaptive filter 37 is a tapped delay line (TDL)
`FIR filter having a plurality of taps at the outputs of delay
`elements, a plurality of variable multipliers for multiplying
`the value at the tap by a variable coefficient, a summing
`device 80 for summing together the outputs of the variable
`multipliers to provide the output signal y(k), and a predictor
`90 for recursively adjusting the values of the coefficients
`according to the error signal e(k).
`Illustrated in FIG. 4 are two delay elements 60 and 61.
`Delay element 60 has an input for receiving input signal
`x(k), and an output. Delay element 61 has an input con
`nected to the output of delay element 60, and an output. All
delay elements are connected in series with the input of first
delay element 60 receiving signal x(k), and each succeeding
delay element has an input connected to the output of a
previous delay element.
Signal x(k) and the output of each delay element provide
filter taps which are each provided to an input terminal of a
corresponding variable multiplier. In FIG. 4, three
representative variable multipliers 70, 71, and 72 are illustrated.
Variable multiplier 70 has an input for receiving signal x(k),
a coefficient input for receiving corresponding coefficient $c_0$,
and an output terminal connected to summing device 80.
Variable multiplier 71 has an input connected to the output
of delay element 60, a coefficient input for receiving
corresponding coefficient $c_1$, and an output terminal connected to
summing device 80. Variable multiplier 72 has an input
connected to the output of a last delay element in adaptive
FIR filter 37 (not shown), a coefficient input for receiving
corresponding coefficient $c_{N-1}$, and an output terminal
connected to summing device 80, where N is equal to the
number of filter taps. Thus, adaptive FIR filter 37 has N
variable multipliers and (N-1) delay elements.
Summing device 80 has inputs for receiving corresponding
ones of the outputs of the variable multipliers, and an
output for providing signal y(k). Predictor 90 has an input
for receiving signal e(k), and outputs connected to
coefficient inputs of corresponding ones of the variable
multipliers. Also illustrated in FIG. 4 is summing device 38 of FIG.
1, which has a positive input for receiving signal d(k), a
negative input connected to the output terminal of summing
device 80 for receiving signal y(k), and an output terminal
connected to the input terminal of predictor 90 for providing
signal e(k).
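The structure of FIG. 4 (delay elements, variable multipliers, summing device 80, and summing device 38) can be sketched in a few lines of Python. This is only an illustration of the signal flow: the class `TappedDelayLine` and the function `filter_step` are hypothetical names, and the coefficient update performed by predictor 90 is deliberately left out.

```python
from collections import deque


class TappedDelayLine:
    """Holds x(k), x(k-1), ..., x(k-N+1): N taps from N-1 delay elements."""

    def __init__(self, num_taps):
        self.taps = deque([0.0] * num_taps, maxlen=num_taps)

    def push(self, x_k):
        # The newest sample enters on the left; the oldest falls off the right.
        self.taps.appendleft(x_k)
        return list(self.taps)


def filter_step(tdl, coeffs, x_k, d_k):
    """One sample period: returns (y(k), e(k), taps)."""
    taps = tdl.push(x_k)
    y_k = sum(c * x for c, x in zip(coeffs, taps))  # multipliers + summing device 80
    e_k = d_k - y_k                                 # summing device 38
    return y_k, e_k, taps


tdl = TappedDelayLine(3)
coeffs = [1.0, 0.5, 0.0]
print(filter_step(tdl, coeffs, 2.0, 5.0))
print(filter_step(tdl, coeffs, 4.0, 5.0))
```

The error e(k) returned by each step is what a predictor would consume to adjust `coeffs` before the next sample.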
Adaptive FIR filter 37 differs from conventional adaptive
FIR filters in that predictor 90 is especially well-suited for
use in certain applications such as speech processing.
Adaptive filters are useful for input signals which are
continuously present, because after a short delay the filter
coefficients approach an optimum solution to noise cancellation
for the particular environment. However, speech is rarely
continuous. Instead, a typical speech pattern includes
periods of conversation followed by periods of silence. In noisy
environments, it is difficult for the adaptive filter to provide
a noise-free signal during a period of conversation which
immediately follows a long period of silence because of the
history in the adaptive FIR filter. To solve this problem,
predictor 90 provides coefficients $c_i$ based on a prediction of
output signal y(k) which adapts more quickly on the start of
a signal than upon the conclusion of a signal, or in other
words, has a fast attack and a slow release.
One conventional predictor is the so-called
least-mean-squares (LMS) predictor, in which a subsequent coefficient
$c_i(k+1)$ is defined as

$$c_i(k+1) = c_i(k) + \alpha\, e(k)\, x(k-i) \tag{3}$$

where i represents the ith coefficient, k represents the present
sample period, $c_i(k)$ is the present coefficient value of the ith
coefficient, $\alpha$ is the convergence parameter, e(k) is the error
signal at sample k, and x(k-i) is the input signal sample prior
to the current sample by i sample periods. Using a 50-tap
FIR filter (N=50) with the LMS predictor, $\alpha$=0.05, and for
an input signal with a 0 decibel (dB) SNR, speech
processing system 30 provides an output SNR of approximately
6.88 dB.
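The LMS recursion, in which each coefficient moves by the convergence parameter times e(k) times x(k-i), can be sketched as follows. The helper names (`lms_update`, `run_lms`), the two-tap test system, and the synthetic noise-free input are assumptions made for illustration; they are not the patent's 50-tap configuration and do not reproduce its SNR figure.

```python
import random


def lms_update(coeffs, taps, e_k, alpha):
    """One LMS step: c_i(k+1) = c_i(k) + alpha * e(k) * x(k - i)."""
    return [c + alpha * e_k * x for c, x in zip(coeffs, taps)]


def run_lms(x, d, num_taps, alpha):
    """Adapt an FIR filter so that its output tracks d(k) given input x(k)."""
    coeffs = [0.0] * num_taps
    taps = [0.0] * num_taps              # taps[i] holds x(k - i)
    for x_k, d_k in zip(x, d):
        taps = [x_k] + taps[:-1]         # shift the delay line
        y_k = sum(c * t for c, t in zip(coeffs, taps))
        e_k = d_k - y_k                  # estimation error
        coeffs = lms_update(coeffs, taps, e_k, alpha)
    return coeffs


# Hypothetical test system: d(k) = 0.6*x(k) - 0.3*x(k-1), no added noise.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
true_coeffs = [0.6, -0.3]
d = [true_coeffs[0] * x[k] + (true_coeffs[1] * x[k - 1] if k > 0 else 0.0)
     for k in range(len(x))]

estimated = run_lms(x, d, num_taps=2, alpha=0.05)
print(estimated)  # approaches [0.6, -0.3]
```

Because the update is proportional to the raw input samples, scaling x(k) changes the effective adaptation speed, which is the sensitivity to signal power that normalization is meant to remove.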
`In an adaptive filter using an LMS predictor, the adaptive
`adjustments to the coefficients are closely related to signal
power $x^2(k)$. Thus, when the input power is large (such as
`the beginning of a conversation), the adaptation will be fast
`but also too rough or unstable. Also when the power is low
`(during a quiet period), the adaptation will be too slow.
Another conventional predictor which may be used is the
normalized LMS (NLMS) predictor, defined as

$$c_i(k+1) = c_i(k) + \frac{\alpha}{\sigma_x^2(k)}\, e(k)\, x(k-i) \tag{4}$$

where $\alpha$ is normalized by the estimated signal power
$\sigma_x^2(k)$ and thus the adaptation rate is independent of the
input signal power. In real-time signal processing environments,
$\sigma_x^2(k)$ must be estimated, and the most common formula for
this estimation is

$$\sigma_x^2(k) = \beta\,\sigma_x^2(k-1) + (1-\beta)\,x^2(k) \tag{5}$$

where $\beta$ is the smoothing parameter equal to $(1-\alpha)$ which
has a typical value of 0.99. The NLMS predictor has been
applied to many adaptive filtering applications such as
full-duplex speakerphones and other speech processing
systems.
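A single NLMS step with a recursively smoothed power estimate might look like the sketch below. The function name `nlms_step`, the argument layout, and the small constant `eps` (added to avoid division by zero during silence) are assumptions for illustration and do not appear in the patent.

```python
def nlms_step(coeffs, taps, d_k, power, alpha=0.05, beta=0.99, eps=1e-8):
    """One NLMS sample period; returns (coeffs, power, y_k, e_k).

    taps[i] holds x(k - i); `power` is the previous power estimate.
    `eps` is a hypothetical safeguard, not part of the described method.
    """
    x_k = taps[0]
    # Smoothed power estimate: beta * previous + (1 - beta) * x(k)^2
    power = beta * power + (1.0 - beta) * x_k * x_k
    y_k = sum(c * t for c, t in zip(coeffs, taps))
    e_k = d_k - y_k
    step = alpha / (power + eps)         # convergence parameter normalized by power
    coeffs = [c + step * e_k * t for c, t in zip(coeffs, taps)]
    return coeffs, power, y_k, e_k


# One step starting from zero coefficients and a unit power estimate.
new_coeffs, new_power, y_k, e_k = nlms_step([0.0, 0.0], [1.0, 0.0], 1.0, power=1.0)
print(new_coeffs, new_power, y_k, e_k)
```

Note that if the power estimate starts near zero, the normalized step can be very large for the first few samples, so the choice of initial value matters in practice.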
However, the signal power estimation of equation 5 is a
lowpass filter type estimation, whose reaction depends on
the value of the smoothing parameter $\beta$. If $\beta$ is very close
to 1, for example 0.999, the filter will sometimes be
under-responsive and the output signal will react slowly to changes
in the input signal. In other words, the filter will have both
a slow attack and a slow release. On the other hand, if $\beta$ is
much less than 1, for example 0.8, then the filter will
sometimes be over-responsive and the output signal will
overreact to changes in the input signal. The resulting filter
will have both a fast attack and a fast release.
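The effect of the smoothing parameter can be seen by feeding a unit-amplitude step into the power estimator and counting how many samples it takes to reach half of the final value. The sketch below uses the two example values from the text (0.8 and 0.999); the function names and the half-level criterion are illustrative assumptions.

```python
def smoothed_power(x, beta):
    """Recursive power estimate: est(k) = beta*est(k-1) + (1-beta)*x(k)^2."""
    est, trace = 0.0, []
    for x_k in x:
        est = beta * est + (1.0 - beta) * x_k * x_k
        trace.append(est)
    return trace


def samples_to_reach(trace, level):
    """Index of the first sample at which the estimate reaches `level`."""
    for k, value in enumerate(trace):
        if value >= level:
            return k
    return None


step_input = [1.0] * 10000               # signal switches on at k = 0
fast = samples_to_reach(smoothed_power(step_input, 0.8), 0.5)
slow = samples_to_reach(smoothed_power(step_input, 0.999), 0.5)
print(fast, slow)  # the 0.999 estimator needs hundreds of samples, the 0.8 one only a few
```

The same time constants govern the decay when the signal stops, which is why a single smoothing parameter gives either fast attack with fast release or slow attack with slow release.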
In accordance with the present invention, predictor 90
combines a fast attack and a slow release so that at the
beginning of a conversation, the system will not diverge and
will not suffer from a "jumpy" or "unsteady" sound problem.
This allows the speech processing system to recognize the
start of speech soon after the end of a long silent period.
Mathematically, predictor 90 provides two estimates of
signal power designated y(k) and z(k), given by

$$y(k) = \beta_1\, y(k-1) + (1-\beta_1)\, x^2(k) \tag{6}$$

and

$$z(k) = \beta_2\, z(k-1) + (1-\beta_2)\, x^2(k) \tag{7}$$

wherein $\beta_2$ provides a relatively fast response in relation to
$\beta_1$, i.e. $\beta_2$ has a lower value than $\beta_1$. For example,
one set of values useful in speech processing system 30 would be
$\beta_1$=0.999 and $\beta_2$=0.9. When either y(k) or z(k) is greater
than or equal to a predetermined threshold, then the signal
power estimate yz(k) used to normalize the input signal in
predictor 90 is given by

$$yz(k) = \max\bigl[\,y(k),\; z(k)\,\bigr] \tag{8}$$

Thus, predictor 90 allows a fast attack by selecting the signal
power estimate that reacts the quickest to the start of speech,
and a slow release by selecting the signal power estimate
that lingers the longest in response to the end of a period of
speech.
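The fast-attack, slow-release behaviour can be illustrated by running both power estimates side by side and taking their maximum. The sketch below uses the example smoothing values 0.999 and 0.9 from the text; the function name `dual_power_estimate` and the synthetic burst-then-silence input are assumptions, and the predetermined threshold is omitted because the text does not specify what happens below it.

```python
def dual_power_estimate(x, beta1=0.999, beta2=0.9):
    """Return yz(k) = max(y(k), z(k)) for each sample of x.

    y(k) uses the larger smoothing value (slow), z(k) the smaller (fast).
    The threshold test described in the text is not modelled here.
    """
    y = z = 0.0
    trace = []
    for x_k in x:
        p = x_k * x_k
        y = beta1 * y + (1.0 - beta1) * p   # slow estimate
        z = beta2 * z + (1.0 - beta2) * p   # fast estimate
        trace.append(max(y, z))
    return trace


# 5000 samples of unit-amplitude "speech" followed by 1000 samples of silence.
signal = [1.0] * 5000 + [0.0] * 1000
yz = dual_power_estimate(signal)

attack = yz[19]      # 20 samples after the burst starts
release = yz[5099]   # 100 samples after the burst ends
print(attack, release)
```

Shortly after onset the maximum follows the fast estimate, and after the burst ends it follows the slow estimate, so the normalizing power rises quickly but decays gradually.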
`
`APPENDIX A illustrates an assembly language program
`which may be used to implement adaptive FIR filter 37 using
`fast attack and slow release predictor 90. The program
`includes mnemonics which are executable by a Motorola
`DSP 56000-family digital signal processor. However, it will
`be appreciated by those skilled in the art that the program
`may be modified to run on other general-purpose digital
`signal processors. Preferably, ADCs 33 and 36 are hardware
converters which are implemented along with a
general-purpose digital signal processor running this program on a
`single integrated circuit. However, these elements may be
`implemented discretely in other embodiments.
`While the invention has been described in the context of
`a preferred embodiment, it will be apparent to those skilled
`in the art that the present invention may be modified in
`numerous ways and may assume many embodiments other
`than that specifically set out and described above. For
`example, other types of sensors besides accelerometers may
`be used as the second sensor, as long as their signal
`component is highly-correlated with the signal component
`of the first sensor. In addition, adaptive FIR filter 37 may be
`used in other signal processing systems besides speech
`processing system 30. Also, the functional blocks illustrated
`herein may be practiced exclusively by hardware circuitry,
exclusively by a digital signal processor executing
appropriate instructions, or by some combination thereof.
Accordingly, it is intended by the appended claims to cover all
`modifications of the invention which fall within the true
`spirit and scope of the invention.
`
APPENDIX A

taps       equ   255               ; number of adaptive coefficients
beta1      equ   0.999             ; Smoothing parameter for slow release
alpha1     equ   1-beta1           ; Convergence parameter for slow release
beta2      equ   0.9               ; Smoothing parameter for fast attack
alpha2     equ   1-beta2           ; Convergence parameter for fast attack
threshold  equ   0.500             ; maximum magnitude for power estimation
Past_Mag1  ds    1                 ; Past value for y(k) fast estimation
Past_Mag2  ds    1                 ; Past value for z(k) fast estimation

           abs   b  #alpha1,y0     ; Find Magnitude |x(k)|, get Alpha
           move  b,x0              ; ready for multiply
           mpy   ...               ; Obtain b=Alpha |x(k)|
                                   ; Get smoothing parameter Beta
           move  x:Past_Mag1,x0    ; Get past magnitude y(k-1)
           mac   x0,y0,b           ; Find current smoothed magnitude y(k)
           move  b1,x:Past_Mag1    ; Save current smoothed magnitude
           move  a,b               ; get the original data sample
           abs   b  #alpha2,y0     ; Find Magnitude |x(k)|, get Alpha
           move  b,x0              ; ready for multiply
           mpy   ...               ; Obtain b=Alpha |x(k)|
                                   ; Get smoothing parameter Beta
           move  ...               ; get past magnitude z(k-1)
           mac   ...               ; Find current smoothed magnitude z(k)
           ...                     ; Bring back the y(k) value
           ...   x0,b              ; compare y(k) and z(k), save z(k)
           ...                     ; if x0>b, copy x0 to b (Maximum value)
           ...   b1,x:Past_Mag1    ; Save current smoothed magnitude
           ...   #threshold,x0     ; Get maximum threshold value
           ...   x0,b              ; compare between yz(k) and threshold
           ...   limit             ; if no limiting is found
           ...   b,y0              ; make a denominator
           ...   x0,b              ; make a numerator
           ...   #$fe,ccr          ; clear carry bit for division routine
           ...   24                ; get 24 bits of resolution
           ...   y0,b              ; divide routine the output is in B0
           ...   b0,x0             ; ready for scaling
           ...   a,y0              ; get original value
           ...   x0,y0,a           ; scaling
`
`
APPENDIX A-continued

limit
; LMS Algorithm Main Loop
           clr   a             x:(r0)-,x0
           move  x:(r0),b
           move  y0,y:(r4+n4)
           do    #taps,afloop
           mac   x0,y0,a       b,x:(r0)+n0
           move  x0,b
           macr  y0,y1,b       x:(r0)-,x0    y:(r4)-,y0
afloop
; Normalization of convergence parameter
           move  ...
           mpyr  ...
           mpy   ...
           clr   ...
; get current |x(k)|
; update yz(k)
; clear carry bit
; 24-bit resolution divide routine
`
`We claim:
`1. A method for enhancing speech signals in a noisy
`environment, comprising the steps of:
inputting a digital speech signal x(k) at an input of a
`plurality of successive delay elements whose outputs
`form a like plurality of taps of an adaptive finite
`impulse response (FIR) filter;
`inputting said digital speech signal and each of said
`plurality of taps to corresponding inputs of a plurality
`of variable multipliers;
computing a first signal power estimate y(k) at a sample
point k given by the formula
$y(k) = \beta_1 y(k-1) + (1-\beta_1) x^2(k)$;
computing a second signal power estimate z(k) at said
sample point k given by the formula
$z(k) = \beta_2 z(k-1) + (1-\beta_2) x^2(k)$;
choosing a value of $\beta_1$ greater than a value of $\beta_2$;
selecting an overall signal power estimate yz(k) at said
sample point k as a maximum one of said first signal
power estimate y(k) and said second signal power
estimate z(k);
`recursively updating a plurality of FIR filter coefficients
`corresponding to said plurality of variable multipliers
`according to a normalized least-mean-squares (NLMS)
`prediction using said overall signal power estimate
`yz(k) and an estimation error signal to provide updated
`values of said plurality of FIR filter coefficients;
`providing said updated values of said plurality of FIR
`filter coefficients to coefficient inputs of said plurality
`of variable multipliers; a