Itoh et al.

US005757937A
[11] Patent Number: 5,757,937
[45] Date of Patent: May 26, 1998

[54] ACOUSTIC NOISE SUPPRESSOR
[75] Inventors: Kenzo Itoh, Tokyo; Masahide Mizushima, Sayama, both of Japan
[73] Assignee: Nippon Telegraph and Telephone Corporation, Tokyo, Japan
[21] Appl. No.: 749,242
[22] Filed: Nov. 14, 1996
[30] Foreign Application Priority Data
Jan. 31, 1996 [JP] Japan ................ 8-0487.4
[51] Int. Cl. ................ H04B 15/00
[52] U.S. Cl. ................ 381/94.3; 704/233
[58] Field of Search ................ 381/94, 94.3; 704/233

[56] References Cited
U.S. PATENT DOCUMENTS
5,377,277 12/1994 Bisping ................ 381/94
5,479,517 12/1995 Linhard ................ 381/97
5,550,924 8/1996 Helf et al. ................ 381/94

Primary Examiner: Forester W. Isen
Attorney, Agent, or Firm: Pollock, Vande Sande & Priddy

[57] ABSTRACT
`
In an acoustic noise suppressor, a power spectrum component and a phase component are extracted from an input signal by a frequency analysis part, while at the same time a speech/non-speech identification part checks whether the input signal is a speech signal or noise. Only when the input signal is noise is its spectrum stored in a storage part. The stored spectrum is weighted by a psychoacoustic weighting function W(f), the weighted spectrum is subtracted from the power spectrum of the input signal, and the result is reconverted to a time-domain signal by inverse frequency analysis.
`
`11 Claims, 7 Drawing Sheets

[Front-page drawing: block diagram. Legible block labels: NOISE SPEC UPDATE/STORE; AVER LEVEL CAL; LOSS CONT COEF CAL; INV FREQ ANAL (FFT).]

`Page 1 of 16
`
`GOOGLE EXHIBIT 1014
`
`
`
[Sheet 1 of 7. FIG. 1 (PRIOR ART): block diagram; legible labels: SIG IDENT, INV FREQ ANAL (FFT). FIG. 4: graph; legible axis label: FREQUENCY (kHz).]
`
`
`
[Sheet 2 of 7. The drawing labels are not legible in this copy.]
`
`
`
`
`
`
`
[Drawing sheet. No legible labels.]
`
`
`
[Sheet 4 of 7. FIG. 5: no legible labels.]
`
`
`
[Sheet 5 of 7. FIG. 7: graph. A block diagram on the same sheet carries the partly legible labels S(f), SPECT SLOPE, IDENT, Pthnew, UD, 25, 25B and 25C, together with a garbled update rule relating Pthnew to Pthold and P.]
`
`
`
[Sheet 6 of 7. FIG. 9: flowchart with step labels S1, S3, S4 and S5; legible labels: START, OUTPUT CONT SIG S & UD, OUTPUT CONT SIG N, OUTPUT CONT SIG S, END. A graph on the same sheet has the axis label SIGNAL-TO-NOISE RATIO (dB).]
`
`
`
[Sheet 7 of 7. Legible labels: 102, 110, NOISE SUPPRESS.]
`
`
`
ACOUSTIC NOISE SUPPRESSOR

BACKGROUND OF THE INVENTION
The present invention relates to an acoustic noise suppressor which suppresses signals other than the speech signals or the like that are to be picked up (noise in this instance) in various acoustic noise environments, permitting efficient pickup of the target or desired signals alone.
Usually, a primary object of ordinary acoustic equipment is to pick up acoustic signals effectively and to reproduce their original sounds through a sound system. The basic components of the acoustic equipment are (1) a microphone which picks up acoustic signals and converts them to electric signals, (2) an amplifying part which amplifies the electric signals, and (3) an acoustic transducer, such as a loudspeaker or receiver, which reconverts the amplified electric signals into acoustic signals. The purpose of component (1) for picking up acoustic signals falls into two categories: to pick up all acoustic signals as faithfully as possible, and to effectively pick up only a target or desired signal.
The present invention concerns the second category, "to effectively pick up only a desired signal." While the acoustic components of this category include devices that pick up a desired signal with higher efficiency through the use of a plurality of microphones or the like, the present invention is directed to a device for suppressing noise other than the speech signal in an input signal that has already been picked up. For convenience of description, the desired signal will hereinafter be referred to as a speech signal and all other signals as noise.
For a wide variety of purposes, speech in a noisy environment is converted into an electric signal. The electric signal may be subjected to acoustic processing suited to a particular purpose to reproduce the speech (in a hearing aid or a loudspeaker system for conference use, for instance), transmitted over a telephone circuit, or recorded (on a magnetic tape or disc) so that the speech can be reproduced when necessary. When speech is converted into an electric signal for any of these purposes, background noise is also picked up by the microphone, and hence techniques for suppressing such noise are used to obtain the desired speech signal. For example, in a multi-microphone system (J. L. Flanagan, D. A. Berkley, G. W. Elko, et al., "Autodirective Microphone Systems," Acustica, Vol. 73, No. 2, pp. 58-71, 1991, and O. L. Frost, "An Algorithm for Linearly Constrained Adaptive Array Processing," Proc. IEEE, Vol. 60, No. 8, pp. 926-935, 1972, for instance), speech signals picked up by microphones placed at different positions are combined after being delayed so that their cross-correlation becomes maximum. The desired speech signals thereby add together, while the correlation of other sounds is made so small that they cancel each other. This method operates effectively for speech at specific positions but has the shortcoming that its effect diminishes sharply when the target speech source moves.
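The delay-and-combine idea described above can be illustrated with a minimal sketch for two channels. It is not taken from the cited literature: the function names, the two-microphone setup and the toy signals are assumptions made only for illustration.

```python
def cross_correlation(a, b, lag):
    """Sum of a[t] * b[t + lag] over the samples where both are defined."""
    total = 0.0
    for t in range(len(a)):
        u = t + lag
        if 0 <= u < len(b):
            total += a[t] * b[u]
    return total


def best_lag(a, b, max_lag):
    """Lag of b relative to a (in samples) that maximizes the cross-correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(a, b, lag))


def delay_and_sum(a, b, max_lag):
    """Align b to a at the best lag, then average the two channels."""
    lag = best_lag(a, b, max_lag)
    out = []
    for t in range(len(a)):
        u = t + lag
        shifted = b[u] if 0 <= u < len(b) else 0.0
        out.append(0.5 * (a[t] + shifted))
    return lag, out


# Toy signals (hypothetical): the same pulse reaches the second
# microphone 3 samples later than the first.
mic_a = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mic_b = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
lag, aligned = delay_and_sum(mic_a, mic_b, max_lag=5)
```

Because the lag is estimated for one source position, a moving talker changes the best lag from moment to moment, which is the shortcoming noted in the text.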
Another conventional method relies on the fact that actual background noise is mostly stationary noise, such as noise generated by air conditioners, refrigerators and car engines. According to this method, only the noise power spectrum is subtracted from the power spectrum of an input signal with background noise superimposed thereon, and the difference power spectrum is returned by an inverse FFT scheme to a time-domain signal to obtain a speech signal with the stationary noise suppressed (S. Boll, "Suppression of Acoustic Noise in Speech Using Spectral Subtraction," IEEE Trans. ASSP, Vol. 27, No. 2, pp. 113-120, 1979). A description will be given below of this method, since the present invention is also based on it.
FIG. 1 illustrates in block form the basic configuration of the prior art acoustic noise suppressor according to the above-mentioned literature. Reference numeral 11 denotes an input terminal, 12 is a signal discriminating part for determining whether the input signal is a speech signal or noise, 13 is a frequency analysis or FFT (Fast Fourier Transform) part for obtaining the power spectrum and phase information of the input signal, and 14 is a storage part. Reference numeral 15 denotes a switch which is controlled by the output from the signal discriminating part 12 and is closed only when the input signal is noise, so that the output from the frequency analysis part 13 is stored in the storage part 14. Reference numeral 16 denotes a subtraction part, 17 is an inverse frequency analysis or inverse FFT part, and 18 is an output terminal.
An input signal fed to the input terminal 11 is applied to the signal discriminating part 12 and the frequency analysis part 13. The signal discriminating part 12 discriminates between speech and noise through utilization of the frequency distribution characteristic of the signal level (R. J. McAulay and M. L. Malpass, "Speech Enhancement Using a Soft-Decision Noise Suppression Filter," IEEE Trans. ASSP, Vol. 28, No. 2, pp. 137-145, 1980). The frequency analysis part 13 makes a frequency analysis of the input signal for each analysis period (an analysis window) to obtain the power spectrum S(f) and phase information P(f) of the input signal. The frequency analysis mentioned herein means a discrete digital Fourier transform and is usually made by FFT processing. Only when the input signal discriminated by the signal discriminating part 12 is noise is the switch 15 connected to an N-side, through which the power spectrum characteristic Sn(f) of the noise of the analysis period obtained by the frequency analysis part 13 is stored in the storage part 14. When the input signal discriminated by the signal discriminating part 12 is "speech," the switch 15 is connected to an S-side, inhibiting the supply of the input signal power spectrum S(f) to the storage part 14. The input signal power spectrum S(f) is compared in level by the subtraction part 16 with the noise power spectrum Sn(f) stored in the storage part 14 for each corresponding frequency f. If the level of the input signal power spectrum S(f) is higher than the level of the noise power spectrum Sn(f), the noise spectrum multiplied by a constant α is subtracted from the input signal power spectrum S(f) as indicated by the following equation (1); if not, S(f) is replaced with zero or with the level n(f) of the corresponding frequency component of a predetermined low-level noise spectrum:

$$
S'(f) =
\begin{cases}
S(f) - \alpha\, S_n(f) & \text{if } S(f) > S_n(f) \\
0 \ \text{or}\ n(f) & \text{otherwise}
\end{cases}
\tag{1}
$$

where α is a subtraction coefficient and n(f) is low-level noise that is usually added to prevent the spectrum after subtraction from going negative. This processing provides the spectrum S'(f) with the noise component suppressed. The spectrum characteristic S'(f) is reconverted to a time-domain signal by inverse Fourier transform (inverse FFT, for instance) processing in the inverse frequency analysis part 17 through utilization of the phase information P(f) obtained by the fast Fourier transform processing in the frequency analysis part 13, and the time-domain signal thus obtained is provided to the output terminal 18. As the signal phase information P(f), the analysis result is usually employed intact.
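The per-frequency rule of Eq. (1) can be sketched as follows. This is a minimal illustration, not the implementation of the cited literature; the function name, the list-based spectra and the sample values are assumptions.

```python
def spectral_subtract(s, s_n, alpha=1.0, floor=None):
    """Apply the rule of Eq. (1) to each frequency bin.

    s     : input power spectrum S(f), one value per bin
    s_n   : stored noise power spectrum Sn(f)
    alpha : subtraction coefficient
    floor : optional low-level noise n(f); zero is used when omitted
    """
    out = []
    for k, (p, p_n) in enumerate(zip(s, s_n)):
        if p > p_n:
            # Input level exceeds the noise level: subtract the scaled noise.
            out.append(p - alpha * p_n)
        else:
            # Otherwise replace the bin with zero or the low-level noise.
            out.append(floor[k] if floor is not None else 0.0)
    return out


# Hypothetical four-bin spectra.
s = [4.0, 1.0, 9.0, 0.5]
s_n = [1.0, 2.0, 3.0, 0.5]
result = spectral_subtract(s, s_n, alpha=1.0)
```

The sketch follows the equation literally, so with a coefficient larger than 1 a bin only slightly above the noise level can still come out negative; a practical implementation would need to handle that case.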
With the above processing, a signal from which the frequency spectral component of the noise has been removed is provided at the output terminal 18. The above noise suppression method ideally suppresses noise when the noise power spectral characteristic is virtually stationary. Usually, however, noise characteristics in the natural world vary every moment even though they are "virtually stationary." Hence, a conventional noise suppressor such as described above suppresses noise to make it almost imperceptible, but some noise left unsuppressed is newly heard as a harsh grating sound (hereinafter referred to as residual noise). This has been a serious obstacle to the realization of an efficient noise suppressor.

SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a noise suppressor which permits efficient picking up of target or desired signals alone.
The acoustic noise suppressor according to the present invention comprises:
frequency analysis means for making a frequency analysis of an input signal for each fixed period to extract its power spectral component and phase component;
analysis/discrimination means for analyzing the input signal for each said period to see if it is a target signal or noise and for outputting the analysis result;
noise spectrum update/storage means for calculating an average noise power spectrum from the power spectrum of the input signal of the period during which the analysis result is indicative of noise and storing the average noise power spectrum;
psychoacoustically weighted subtraction means for weighting the average noise power spectrum by a psychoacoustic weighting function and for subtracting the weighted average noise power spectrum from the input signal power spectrum to obtain the difference power spectrum; and
inverse frequency analysis means for converting the difference power spectrum into a time-domain signal.
The acoustic noise suppressor of the present invention is characterized in that the average power spectral characteristic of noise, which is subtracted from the input signal power spectral characteristic, is assigned a psychoacoustic weight so as to minimize the magnitude of the residual noise that has been the most serious problem in the noise suppressor implemented by the aforementioned prior art method. To this end, the present invention newly uses a psychoacoustic weighting coefficient W(f) in place of the subtraction coefficient α in Eq. (1). The introduction of such a weighting coefficient permits significant reduction of the residual noise which is psychoacoustically displeasing.
In other words, the subtraction coefficient α in Eq. (1) is conventionally set at a value equal to or greater than 1.0 with a view to suppressing noise as much as possible. With a large value of this coefficient, noise can be drastically suppressed on the one hand, but on the other hand, the target signal component is also suppressed in many cases and there is a fear of "excessive suppression." The present invention uses the weighting coefficient W(f), which increases the amount of noise suppressed without significantly distorting the target signal, and hence it minimizes degradation of processed speech quality.
Furthermore, residual noise can be minimized by the above-described method, but depending on the kind and magnitude (signal-to-noise ratio) of the noise, the situation occasionally arises where the residual noise cannot be suppressed completely, and in many cases this residual noise becomes a harsh grating sound in periods during which no speech signals are present. As an approach to this problem, the noise suppressor of the present invention adopts loss control of the residual noise to suppress it during signal periods with substantially no speech signals.
The present invention discriminates between speech and noise, multiplies the noise spectral characteristic by a psychoacoustic weighting coefficient and subtracts the result from the input signal power spectrum, and hence the invention minimizes degradation of speech quality and drastically reduces the psychoacoustically displeasing residual noise.
In addition, loss control of the residual noise eliminates it almost completely.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating an example of a conventional noise suppressor;
FIG. 2 is a block diagram illustrating an embodiment of the noise suppressor according to the present invention;
FIG. 3 is a waveform diagram for explaining the operation in the FIG. 2 embodiment;
FIG. 4 is a graph showing an example of an average spectral characteristic of noise discriminated using a maximum autocorrelation coefficient Rmax;
FIG. 5 is a block diagram showing an example of the functional configuration of a noise spectrum update/storage part 33 in the FIG. 2 embodiment;
FIG. 6 is a block diagram showing an example of the functional configuration of a psychoacoustically weighted subtraction part 34 in the FIG. 2 embodiment;
FIG. 7 is a graph showing an example of a psychoacoustic weighting coefficient W(f);
FIG. 8 is a block diagram illustrating another example of the configuration of an analysis/discrimination part 20;
FIG. 9 is a flowchart showing a speech/non-speech identification algorithm which is performed by an identification part 25A in the FIG. 8 example;
FIG. 10 is a graph showing measured results of a speech identification success rate by a hearing-impaired person who used the noise suppressor of the present invention; and
FIG. 11 is a block diagram illustrating the noise suppressor of the present invention applied to a multi-microphone system.
DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 2 illustrates in block form an embodiment of the noise suppressor according to the present invention. Reference numeral 20 denotes an analysis/discrimination part, 30 is a weighted noise suppression part, and 40 is a loss control part. The analysis/discrimination part 20 comprises an LPC (Linear Predictive Coding) analysis part 22, an autocorrelation analysis part 23, a maximum value detecting part 24, and a speech/non-speech identification part 25. For each analysis period the analysis/discrimination part 20 outputs the result of a decision as to whether the input signal is a speech signal or noise, and effects ON/OFF control of switches 32 and 41 described later on.
The weighted noise suppression part 30 comprises a frequency analysis part (FFT) 31, a noise spectrum update/storage part 33, a psychoacoustically weighted subtraction part 34, and an inverse frequency analysis part 35. Each time it is supplied with the spectrum (noise spectrum) Sk(f) of a new period k from the frequency analysis part 31 via a switch 32, the noise spectrum update/storage part 33 performs a weighted addition of the newly supplied noise spectrum Sk(f) and the previously updated noise spectrum Sn,old(f) to obtain an averaged updated noise spectrum Sn,new(f). It holds this spectrum until the next updating and, at the same time, provides it as the noise spectrum Sn(f) for suppression use to the psychoacoustically weighted subtraction part 34. The psychoacoustically weighted subtraction part 34 multiplies the updated noise spectrum Sn(f) by the psychoacoustic weighting coefficient W(f) and subtracts the psychoacoustically weighted noise spectrum from the spectrum S(f) provided from the frequency analysis part 31, thereby suppressing noise. The noise-suppressed spectrum thus obtained is converted by the inverse frequency analysis part 35 into a time-domain signal.
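The weighted addition performed by the noise spectrum update/storage part can be sketched as a first-order recursive average. This is a minimal sketch under stated assumptions: the weight `beta` between 0 and 1, the zero initial spectrum and the frame values are all hypothetical, and the boolean flag stands in for the switch that passes only noise periods.

```python
def update_noise_spectrum(s_n_old, s_k, beta):
    """Weighted addition of the previous averaged noise spectrum and a new noise frame.

    s_n_old : previously updated noise spectrum, one value per bin
    s_k     : spectrum of the newly identified noise period
    beta    : weight given to the previous spectrum (assumed 0 < beta < 1)
    """
    return [beta * old + (1.0 - beta) * new for old, new in zip(s_n_old, s_k)]


# Hypothetical sequence of (spectrum, identified-as-noise) frames.
frames = [
    ([4.0, 8.0], True),
    ([100.0, 100.0], False),  # speech period: the stored spectrum is left untouched
    ([0.0, 4.0], True),
]

noise = [0.0, 0.0]  # assumed initial contents of the store
for spectrum, is_noise in frames:
    if is_noise:
        noise = update_noise_spectrum(noise, spectrum, beta=0.5)
```

Only the latest averaged spectrum has to be kept between frames, so memory use does not grow with the number of noise periods observed.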
The loss control part 40 comprises a switch 41, an averaged noise level storage part 42, an output signal calculation part 43, a loss control coefficient calculation part 44 and a convolution part 45. The loss control part 40 further reduces the residual noise left in the signal that has been noise-suppressed by the psychoacoustically weighted noise suppression part 30.
Next, the operation of the FIG. 2 embodiment of the present invention will be described in detail with reference to FIG. 3, which shows waveforms occurring at respective parts of the FIG. 2 embodiment. In this embodiment, as in the FIG. 1 prior art example, a check is made in the analysis/discrimination part 20 to see if the input signal is speech or noise for each fixed analysis period (analysis window range). The power spectrum of the noise period is then subtracted in the weighted noise suppression part 30 from the power spectrum of each signal period, and the difference power spectrum is converted into a time-domain signal through inverse Fourier transform processing, thereby obtaining a speech signal with stationary noise suppressed.
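The analysis, subtraction and resynthesis chain for a single analysis period can be sketched as below. It is a minimal sketch, not the embodiment itself: a naive DFT stands in for the FFT, the amplitude is assumed to be the square root of the power after subtraction, the subtraction is unweighted, and all names and sample values are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def analyze(x):
    """Return the power spectrum and the phase of one analysis period."""
    spec = dft(x)
    return [abs(c) ** 2 for c in spec], [cmath.phase(c) for c in spec]


def synthesize(power, phase):
    """Rebuild a time-domain frame from a power spectrum and the original phase."""
    spec = [cmath.rect(math.sqrt(max(p, 0.0)), ph) for p, ph in zip(power, phase)]
    return idft(spec)


def suppress_frame(x, noise_power):
    """Subtract a noise power spectrum bin by bin, then resynthesize with the input phase."""
    power, phase = analyze(x)
    cleaned = [p - n if p > n else 0.0 for p, n in zip(power, noise_power)]
    return synthesize(cleaned, phase)


frame = [0.0, 1.0, 0.0, -1.0, 0.5, 0.0, -0.5, 0.25]   # hypothetical samples
unchanged = suppress_frame(frame, [0.0] * len(frame))  # nothing to subtract
silenced = suppress_frame(frame, [1e9] * len(frame))   # noise dominates every bin
```

With a zero noise spectrum the frame is reproduced, and with a noise spectrum that exceeds the input in every bin the output is silent; real use lies between these two extremes.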
For example, an input signal x(t) (assumed to be a waveform sampled at discrete time t) from a microphone (not shown) is applied to the input terminal 11, and as in the prior art, its waveform for an 80-msec analysis period is Fourier-transformed (by FFT, for instance) in the frequency analysis part 31 at time intervals of, for example, 40 msec to obtain the power spectrum S(f) and phase information P(f) of the input signal. At the same time, the input signal x(t) is applied to the LPC analysis part 22, wherein its waveform for the 80-msec analysis period is LPC-analyzed every 40 msec to extract an LPC residual signal r(t) (hereinafter referred to simply as a residual signal in some cases). The human voice is produced by the resonance of the vibration of the vocal cords in the vocal tract, and hence it contains a pitch period component; its LPC residual signal r(t) contains pulse trains of the pitch period as shown on Row B in FIG. 3, and its frequency falls within the range of 50 to 300 Hz, though it differs among men, women, children and adults.
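How a residual signal can be extracted by linear prediction is sketched below. The text does not say which LPC method the analysis part uses; this sketch assumes the autocorrelation method with the Levinson-Durbin recursion, and the function names, the prediction order and the test signal are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (no normalization)."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag)) for lag in range(max_lag + 1)]


def lpc_coefficients(x, order):
    """Predictor coefficients a[1..order] via the Levinson-Durbin recursion.

    The prediction of x[t] is sum(a[j] * x[t - j]) for j = 1..order.
    """
    r = autocorr(x, order)
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_residual(x, coeffs):
    """Residual r(t) = x(t) minus its linear prediction from past samples."""
    res = []
    for t in range(len(x)):
        pred = sum(c * x[t - j - 1] for j, c in enumerate(coeffs) if t - j - 1 >= 0)
        res.append(x[t] - pred)
    return res


# Hypothetical test signal: a decaying exponential, which a first-order
# predictor models almost exactly, leaving a single pulse as the residual.
signal = [0.5 ** n for n in range(64)]
coeffs = lpc_coefficients(signal, order=2)
residual = lpc_residual(signal, coeffs)
```

For voiced speech the same procedure removes the vocal-tract resonances and leaves pulses at the pitch period, which is what the autocorrelation analysis then looks for.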
The residual signal r(t) is fed to the autocorrelation analysis part 23, wherein its autocorrelation function R(i) is obtained (Row C in FIG. 3). The autocorrelation function R(i) represents the degree of the periodicity of the residual signal. In the maximum value detecting part 24 the peak value (which is the maximum value and will hereinafter be identified by Rmax) of the autocorrelation function R(i) is calculated, and the peak value Rmax is used to identify the input signal in the speech/non-speech identification part 25. That is, the signal of each analysis period is decided to be a speech signal or noise, depending upon whether the peak value Rmax is larger or smaller than a predetermined threshold value Rmth. On Row D in FIG. 3 there are shown the results of signal discriminations made 40 msec behind the input signal waveform at time intervals of 40 msec, the speech signal being indicated by S and noise by N.
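The decision based on the autocorrelation peak can be sketched as follows. This is a minimal sketch under assumptions not stated in the text: an 8 kHz sampling rate, a normalized autocorrelation, a search over lags matching the 50 to 300 Hz pitch range, and a default threshold of 0.14 borrowed from the measurement example given later in the description.

```python
def normalized_autocorr(r, lag):
    """Autocorrelation of the residual at one lag, divided by its energy."""
    energy = sum(v * v for v in r)
    if energy == 0.0:
        return 0.0
    return sum(r[t] * r[t + lag] for t in range(len(r) - lag)) / energy


def classify_period(residual, sample_rate=8000, r_mth=0.14, f_lo=50.0, f_hi=300.0):
    """Return ('S' or 'N', Rmax) for one analysis period.

    Lags are searched over the pitch range f_lo..f_hi; the sampling rate and
    the default threshold are assumptions made for this sketch.
    """
    min_lag = int(sample_rate / f_hi)
    max_lag = min(int(sample_rate / f_lo), len(residual) - 1)
    r_max = max(normalized_autocorr(residual, lag) for lag in range(min_lag, max_lag + 1))
    return ("S" if r_max > r_mth else "N"), r_max


# 80 msec at the assumed 8 kHz rate is 640 samples.
pulse_train = [1.0 if t % 80 == 0 else 0.0 for t in range(640)]  # 100 Hz pitch pulses
single_click = [1.0] + [0.0] * 639                               # no periodicity

voiced_label, voiced_rmax = classify_period(pulse_train)
click_label, click_rmax = classify_period(single_click)
```

The pulse train is labeled S because its autocorrelation peaks at the pitch lag, while the isolated click has no peak at any nonzero lag and is labeled N.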
The maximum autocorrelation value Rmax is often used as a feature that well represents the degree of the periodicity of the signal waveform. That is, many noise signals have a random characteristic in the time or frequency domain, whereas speech signals are mostly voiced sounds, and these signals have periodicity based on the pitch period component. Accordingly, it is effective to identify a signal period with no periodicity as noise. Of course, the speech signal includes unvoiced consonants; hence, no accurate speech/non-speech identification can be achieved with the feature of periodicity alone. It is extremely difficult, however, to accurately detect unvoiced consonants of very low signal levels (p, t, k, s, h and f, for instance) in various kinds of environmental noise. To subtract the noise spectrum from the input signal spectrum, the noise suppressor of the present invention makes the speech/non-speech identification on the basis of the idea of identifying the signal period which is surely considered not to be a speech signal period, that is, the noise period, and calculating its long-time mean spectral feature.
In other words, it is sufficient only to calculate the average spectral feature of the signal surely considered to be a noise signal, and a typical noise spectral characteristic can be obtained by setting the threshold for the aforementioned peak value Rmax at a small value. For example, FIG. 4 shows the average spectral feature Sns(f) of the signal periods identified, using the peak value Rmax, as noise periods from noise signals picked up in a cafeteria. FIG. 4 also shows the average spectral characteristic Sno(f) obtained by extracting noise periods discriminated through visual inspection from the input signal waveform and frequency-analyzing them, and their difference characteristic |Sno(f) - Sns(f)|. The threshold value Rmth of the peak value Rmax was 0.14, the measurement time was 12 sec and the noise identification rate at this time was 77.8%. As will be seen from FIG. 4, the difference between the average spectral characteristics Sno(f) and Sns(f) is very small; with the peak value Rmax, the average noise spectral characteristic can be obtained with a considerably high degree of accuracy even from environmental sounds mixed with various kinds of noise, as in a cafeteria.
Turning back to FIG. 2, the frequency analysis part 31 calculates the power spectrum S(f) of the input signal x(t) while shifting the 80-msec analysis window in steps of 40 msec. Only when the input signal period is identified as a noise period by the speech/non-speech identification part 25 is the switch 32 closed, through which the spectrum S(f) at that time is stored as the noise spectrum Sk(f) in the noise spectrum update/storage part 33. As depicted in FIG. 5, the noise spectrum update/storage part 33 is made up of multipliers 33A and 33B, an adder 33C and a register 33D. The noise spectrum update/storage part 33 updates the noise spectrum by the following equation when the input signal of the analysis period k is decided to be noise N:
`
$$
S_{n,\mathrm{new}}(f) = \beta\, S_{n,\mathrm{old}}(f) + (1-\beta)\, S_k(f)
\tag{2}
$$

where Sn,new(f) is the newly updated noise spectrum, Sn,old(f) is the previously updated noise spectrum, Sk(f) is the input signal spectrum when the input signal of the analysis period k is identified as noise, and β is a weighting function. That is, when the input signal period is decided to be a noise period, the spectrum Sk(f) provided via the switch 32 from the frequency analysis part 31 to the multiplier 33A is multiplied by the weight (1-β), while at the same time the previously updated noise spectrum Sn,old(f) read out of the register 33D is fed to the multiplier 33B, where it is multiplied by β. These multiplication results are added together by the adder 33C to obtain the newly updated noise spectrum Sn,new(f). The updated noise spectrum Sn,new(f) thus obtained is used to update the contents of the register 33D.
The value of the weighting function β is suitably chosen in the range 0<β<1. With β=0, the frequency analysis result Sk(f) of the noise period is used intact as the noise spectrum for cancellation use, in which case a sharp change in the noise spectrum directly affects the cancellation result and makes speech hard to hear. Hence, it is undesirable for the value of the weighting function β to be zero. With the weighting function β set in the range 0<β<1, a weighted mean of the previously updated noise spectrum Sn,old(f) and the newly supplied spectrum Sk(f) is obtained, making it possible to provide a less sharp spectral change. The larger the value of the weighting function β, the stronger the influence of the past updated spectra carried in the previously updated spectrum Sn,old(f); therefore, the weighted mean in this instance has the same effect as an average of all noise spectra from the past to the present (the further back in time a spectrum lies, the less it is weighted). Accordingly, the updated noise spectrum Sn,new(f) will hereinafter be referred to also as an averaged noise spectrum. In the updating by Eq. (2), only the updated averaged noise spectrum Sn,new(f) needs to be stored; namely, there is no need to store a plurality of previous noise spectra.
The updated averaged noise spectrum Sn,new(f) from the noise spectrum update/storage part 33 will hereinafter be represented by Sn(f). The averaged noise spectrum Sn(f) is provided to the psychoacoustically weighted subtraction part 34. As shown in FIG. 6, the psychoacoustically weighted subtraction part 34 is made up of a comparison part 34A, a weight multiplication part 34B, a psychoacoustic weighting function storage part 34C, a subtractor 34D, an attenuator 34E and a selector 34F. In the weight multiplication part 34B the averaged noise spectrum Sn(f) is multiplied by a psychoacoustic weighting function W(f) from the psychoacoustic weighting function storage part 34C to obtain a psychoacoustically weighted noise spectrum W(f)Sn(f). The psychoacoustically weighted noise spectrum W(f)Sn(f) is provided to the subtractor 34D, wherein it is subtracted from the spectrum S(f) from the frequency analysis part 31 for each frequency. The subtraction result is provided to one input of the selector 34F, to the other input of which 0 or the averaged noise spectrum Sn(f) is provided as low-level noise n(f) after being attenuated by the attenuator 34E. The FIG. 6 embodiment shows the case where the low-level noise n(f) is fed to the other input of the selector 34F. The comparison part 34A compares, for each frequency, the level of the power spectrum S(f) from the frequency analysis part 31 and the level of the averaged noise spectrum Sn(f) from the noise spectrum update/storage part 33; the comparison part 34A applies, for example, a control signal sgn=1 or sgn=0 to a control terminal of the selector 34F for each frequency, depending upon whether the level of the power spectrum S(f) is higher or lower than the level of the averaged noise spectrum Sn(f). When supplied with the control signal sgn=1 at its control terminal for a given frequency, the selector 34F selects the output from the subtractor 34D and outputs it as a noise suppressing spectrum S'(f), and when supplied with the control signal sgn=0, it selects the output n(f) from the attenuator 34E and outputs it as the noise suppressing spectrum S'(f).
The above-described processing by the psychoacoustically weighted subtraction part 34 is expressed by the following equation:

$$
S'(f) =
\begin{cases}
S(f) - W(f)\, S_n(f) & \text{if } S(f) > S_n(f) \\
0 \ \text{or}\ n(f) & \text{otherwise}
\end{cases}
\tag{3}
$$

That is, when the level of the power spectrum S(f) from the frequency analysis part 31 at the frequency f is higher than the averaged noise power spectrum Sn(f) (for example, a speech spectrum contains a frequency component which satisfies this condition), the noise suppression is carried out by subtracting the level of the psychoacoustically weighted noise spectrum W(f)Sn(f) at the corresponding frequency f, and when the power spectrum S(f) is lower than Sn(f), the noise suppression is performed by forcibly making the noise suppressing spectrum S'(f) zero, for instance. Incidentally, even if the input signal is a speech signal, there is a possibility that the level of its power spectrum S(f) becomes lower than the level of the noise spectrum. Conversely, when the input signal period is a non-speech period and the noise is stationary, the condition S(f)<Sn(f) is almost always satisfied and the spectrum S'(f) is made, for example, zero over the entire frequency band. Accordingly, if the speech period and the noise period alternate frequently, a completely silent period and the speech period alternate, and speech may sometimes become hard to hear. To avoid this, when S(f)<Sn(f), the noise suppressing spectrum S'(f) is not made zero; instead, for example, white noise n(f) or the averaged noise spectrum Sn(f), obtained in the noise spectrum update/storage part 33 as described above with reference to FIG. 6, may be fed as a background noise spectrum Sn(f)/A=n(f) to the inverse frequency analysis part 35 after being attenuated down to such a low level that the noise is not grating. In the above, A indicates the amount of attenuation.
While the above-described processing by Eq. (3) is similar to the conventional processing by Eq. (1), the present invention entirely differs from the prior art in that the constant α in Eq. (1) is replaced with the psychoacoustic weighting function W(f) having a frequency characteristic. The psychoacoustic weighting function W(f) produces an effect of significantly suppressing the residual noise in the noise-suppressed signal as compared with that in the past, and this effect can be further enhanced by a scheme using the following equation (4). Replacing f in W(f) with i as each discrete frequency point, it is given by

[Equation (4) is not reproduced in this copy.]

where fc is a value corresponding to the frequency band of the input signal and B and K are predetermined values. The larger the values B and K, the more noise is suppressed. The psychoacoustic weighting function expressed by Eq. (4) is a straight line along which the weighting coefficient W(i) becomes smaller with an increase in frequency i, as shown in FIG. 7, for instance. The same effect is naturally obtained not only with the characteristic given by Eq. (4) but also with a weighting function that simulates an average characteristic of noise. In the case of splitting the weighting function characteristic W(f) into two frequency regions at a frequency f=fc/2, similar results can be obtained even if a desired distribution of the weighting function is chosen so that the average value of the weighting function in the lower frequency region is larger than in the higher frequency region, as expressed by the following equation:
`
[The equation is not reproduced in this copy.]

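The frequency-dependent subtraction of Eq. (3) with a weighting that falls along a straight line can be sketched as below. The exact form of the weighting equations is not available in this text, so the end-point parameters `w_low` and `w_high` are hypothetical stand-ins rather than the predetermined values B and K, and the spectra and the attenuation amount are made-up values.

```python
def linear_weighting(num_bins, w_low, w_high):
    """Straight-line W(i) running from w_low at bin 0 to w_high at the last bin."""
    if num_bins == 1:
        return [w_low]
    step = (w_high - w_low) / (num_bins - 1)
    return [w_low + step * i for i in range(num_bins)]


def lower_half_dominates(w):
    """True if the mean weight of the lower half exceeds that of the upper half."""
    half = len(w) // 2
    low, high = w[:half], w[half:]
    return sum(low) / len(low) > sum(high) / len(high)


def weighted_subtract(s, s_n, w, attenuation=None):
    """Eq. (3): subtract W(f)*Sn(f) where S(f) > Sn(f); otherwise output 0 or Sn(f)/A."""
    out = []
    for p, p_n, w_i in zip(s, s_n, w):
        if p > p_n:
            out.append(p - w_i * p_n)
        elif attenuation is None:
            out.append(0.0)
        else:
            out.append(p_n / attenuation)  # attenuated averaged noise used as n(f)
    return out


w = linear_weighting(5, w_low=2.0, w_high=1.0)
s = [10.0, 1.0, 8.0, 0.5, 6.0]    # hypothetical input power spectrum
s_n = [2.0, 2.0, 2.0, 2.0, 2.0]   # hypothetical averaged noise spectrum
cleaned = weighted_subtract(s, s_n, w, attenuation=4.0)
```

Bins below the noise level receive the attenuated noise instead of zero, so speech periods do not alternate with complete silence.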
Further, the predetermined values B and K may be fixed at certain values unique to each acoustic noise suppressor, but by adaptively changing them according to the kind and magnitude of noise, the noise suppression ef