`
United States Patent
Reuss et al.

(10) Patent No.: US 7,359,504 B1
(45) Date of Patent: Apr. 15, 2008

(54) METHOD AND APPARATUS FOR
REDUCING ECHO AND NOISE

(75) Inventors: Edward L. Reuss, Santa Cruz, CA
Sylam A. Weeks, Santa Cruz,

(73) Assignee: Plantronics, Inc., Santa Cruz, CA (US)
(*) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 868 days.
(21) Appl. No.: 10/309,695
(22) Filed: Dec. 3, 2002
(51) Int. Cl.
H04M 9/08 (2006.01)
(52) U.S. Cl. 379/406.02; 379/406.01
(58) Field of Classification Search 379/406.01-406.16
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,809,463 A * 9/1998 Gupta et al. 704/233
5,920,834 A 7/1999 Sih et al.
5,960,077 A * 9/1999 Ishii et al. 379/406.08
6,061,023 A 5/2000 Daniel et al.
6,151,397 A 11/2000 Jackson, Jr. II et al.
6,178,248 B1 1/2001 Marash
... omesburg
6,192,126 B1 2/2001 Koski
6,201,499 B1 3/2001 Hawkes et al.
6,236,862 B1 * 5/2001 Erten et al. 455/501
... East et al.
6,408,978 B1 6/2002 Premus
6,415,029 B1 7/2002 Piket et al.
6,420,975 B1 7/2002 DeLine et al.
6,707,912 B2 * 3/2004 Stephens et al. 379/406.08
* cited by examiner

Primary Examiner: Curtis A Kuntz
Assistant Examiner: Alexander Jamal
(74) Attorney, Agent, or Firm: Intellectual Property Law
Office of Thomas Chuang

(57) ABSTRACT

The present invention provides a solution to the needs
described above through a method and apparatus for reducing
echo and noise. The apparatus includes a microphone
array for receiving an audio signal, the audio signal including
a voice signal component and a noise signal component.
The apparatus further includes a voice processing path
having an input coupled to the microphone array and a noise
processing path having an input coupled to the microphone
array. The voice processing path is adapted to detect voice
signals and the noise processing path is adapted to detect
noise signals. A first echo controller is coupled to the voice
processing path and a second echo controller is coupled to
the noise processing path. A noise reducer is coupled to the
output of the first echo controller and second echo controller.

24 Claims, 4 Drawing Sheets
`
`
`Page 1
`
`IPR PETITION
`US RE48,371
`Sonos Ex. 1009
`
`
`
U.S. Patent    Apr. 15, 2008    Sheet 1 of 4    US 7,359,504 B1
`
`
`
`
`
`
U.S. Patent    Apr. 15, 2008    Sheet 2 of 4    US 7,359,504 B1
`
`
`
U.S. Patent    Apr. 15, 2008    Sheet 3 of 4    US 7,359,504 B1

Start

304: While (Tx VAD = false) and (Rx VAD = false):
  Determine the Direction of Arrival (DoA) for the strongest noise
  sources greater than the noise threshold.
  Generate an ordered list of DoA(n) according to noise level,
  where 0 <= n <= N. (N = max # steerable nulls)

306: Determine acoustic echo level.
  While (Rx VAD = true) and (Tx VAD = false):
    Measure the cross-correlation of the Rx & Tx signals at various angles.
  While (Rx VAD = false) and (Tx VAD = false):
    Measure the cross-correlation of the Rx & Tx signals at various angles.
  Calculate the ratio of (correlation during Rx VAD = true) /
  (correlation during Rx VAD = false).
  Select the Acoustic Echo Direction of Arrival (AE DoA) with the
  strongest ratio.

308: (AE level >= threshold) & (AE DoA not equal to an AE DoA on
  the list)? Yes / No

310: Number of point noise sources >= max number of steerable
  nulls? Yes / No

316: Add AE DoA to end of list.

312: Number of point noise sources > 0? Yes / No

318: Beamform Noise Output:
  Direct null at user's mouth.
  Direct beam(s) at point noise source(s).
  Beamform Voice Output:
  Direct beam at user's mouth.
  Direct null(s) at point noise source(s).

320: Beamform Noise Output:
  Direct null at user's mouth.
  Direct wide beam away from user's mouth.
  Beamform Voice Output:
  Direct beam at user's mouth.
  Direct equal-spaced nulls away from user's mouth.

End

Figure 3
`
`
`
U.S. Patent    Apr. 15, 2008    Sheet 4 of 4    US 7,359,504 B1
`
`
`
`METHOD AND APPARATUS FOR
`REDUCING ECHO AND NOISE
`
`TECHNICAL FIELD
`
The present invention relates to the general field of signal
processing. More specifically, the invention relates to audio
quality in telecommunications.
`
`BACKGROUND
`
Headset and other telephonic device designs used for
telephony must deal with the acoustic response from device
speakers being detected by the device microphone and then
sent back to the far-end talker, where, after the delays
inherent in any telecommunications circuit, it may be detected
by the far-end user as an echo of their own voice. Here, the
“transmit signal” refers to the audio signal from a near end
user, e.g., a headset wearer, transmitted to a far-end listener.
The “receive signal” refers to the audio signal received by
the headset wearer from the far-end talker. In the prior art,
one solution to this echo problem was to ensure that the
acoustic isolation from the headset speaker to the headset
microphone is so great as to render any residual echo
imperceptible. For example, one solution is to use a headset
with a long boom to place the microphone near the user's mouth.
However, such a headset may be uncomfortable to wear
or too restrictive in certain environments. Furthermore,
many applications require a headset design that cannot
achieve the acoustic isolation required, such as a headset
with a very short microphone boom used in either cellular
telephony or Voice over Internet Protocol (VoIP), or more
generally Voice over Packet (VoP) applications. In these
applications, the delay through the telecommunications
network can be hundreds of milliseconds, which can make even
a small amount of acoustic echo annoying to the far-end
user. The required acoustic isolation is more difficult to
achieve with boomless headsets, hands-free headsets,
speaker-phones, and other devices in which a microphone
and speaker may be in close proximity. One solution
described in the prior art has been to utilize an echo
cancellation technique to reduce the acoustic echo. Such
techniques are discussed, for example, in U.S. Pat. No.
6,415,029 entitled “Echo Canceler and Double-Talk Detector
for Use in a Communications Unit.” However, such
techniques focus on the voice signal alone as opposed to
acoustic echo in the noise sources, thereby limiting their
effectiveness.
Headset and other telephonic device designs must also
deal with background noise, caused by a variety of noise
sources in the headset wearer's vicinity, such as other people
conversing nearby, wind noise in an automobile, machinery
and ventilation noise, loud music, and intercom announcements
in public places. These sources may be either diffuse
or point noise sources. In the prior art, such acoustic
interference is normally managed by the use of a long
microphone boom, which places the microphone as close as
possible to the user's mouth; a voice tube, which has the
same effect as a long boom; or a noise canceling microphone,
which enhances the microphone response in one
direction oriented towards the user's mouth and attenuates
the response from the other directions. However, for many
applications these solutions are either inadequate, such as
in very high noise environments, or are not compatible with
the stylistic and user comfort requirements of the headset,
such as a headset with a short microphone boom. Also, when
using noise-canceling microphones, if the microphone is not
properly positioned, as is often the case, the noise reducing
mechanism is rendered useless. In these cases, additional
background noise reduction is required in the microphone
output signal.
Thus, there has been a need for improvements in the
reduction of acoustic echo and reduction of background
noise. More specifically, there has been a need for improved
systems and methods for echo cancellation and noise
reduction techniques.
`
`SUMMARY OF THE INVENTION
`
The present invention provides a solution to the needs
described above through an apparatus and method for reducing
acoustic echo and background noise.
The present invention provides an apparatus for processing
a signal. The apparatus includes a microphone array for
receiving an audio signal, the audio signal including a voice
signal component and a noise signal component. The apparatus
further includes a voice processing path having an
input coupled to the microphone array and a noise processing
path having an input coupled to the microphone array.
The voice processing path is adapted to detect voice signals
and the noise processing path is adapted to detect noise
signals. A first echo controller is coupled to the voice
processing path and a second echo controller is coupled to
the noise processing path. A noise reducer is coupled to the
output of the first echo controller and second echo controller.
The present invention further provides a device for use in
a bi-directional communications system. The device
includes a microphone array for receiving a near end audio
signal, the audio signal including a voice signal
component and a noise signal component. The device further
includes a speaker and a signal processing circuit. The
speaker broadcasts to a near end user of the communication
device an audio signal which is generated by a far end user.
The signal processing circuit attenuates an echo signal
generated by the speaker and detected by the microphone array,
and attenuates background noise detected by the microphone
array. The signal processing circuit comprises a voice
beamformer adapted to detect the voice signal component, a noise
beamformer adapted to detect a noise signal component, a
first echo controller coupled to the output of the voice
beamformer, a second echo controller coupled to the output
of the noise beamformer, and a noise reducer coupled to the
output of the echo controller.
The present invention further presents a method for
processing a signal to reduce undesired noise. The method
comprises receiving an audio signal with a microphone
array, where the audio signal comprises one or more
components. The audio signal is provided to a voice processing
path having an input coupled to the microphone
array, and the voice processing path is adapted to detect voice
signals. The audio signal is provided to a noise processing
path having an input coupled to the microphone array and
adapted to detect noise signals. An acoustic echo component
in the audio signal is cancelled with a first echo controller
coupled to the voice processing path. An acoustic echo
component in the audio signal is cancelled with a second
echo controller coupled to the noise processing path. A noise
component in the audio signal is reduced with a noise
reducer coupled to the output of the first echo controller and
second echo controller.
`
`10
`
`15
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`Page 6
`
`
`
`3
`DESCRIPTION OF THE DRAWINGS
`
`US 7,359,504 B1
`
`4
`utilize an A/D converter to present a digitized version of the
`far-end signal to echo controllers 112, 114 and double talk
`detector 118.
`The speaker may be part of headsets, other hands free
`devices, handsets, or other telephonic device. In an embodi
`ment of the invention, the headset is boomless. However, the
`headset may comprise a short or regular length boom.
`Although reference may be made herein to the use of a
`headset, e.g., headset speaker, this reference is meant to
`include other hands free devices, handsets, or other tele
`phonic devices with speakers and microphones.
`Microphone array 102 may comprise either omni-direc
`tional microphones, directional microphones, or a mix of
`omni-directional and directional microphones. Microphone
`array 102 detects the voice of a near end user which will be
`the primary component of the audio signal, and will also
`detect secondary components which may include the output
`of a headset or handset speaker and background noise. If
`omni-directional microphones are used, the microphone
`response pattern is affected by the mounting geometry
`within a headset packaging and by a wearer's head. The use
`of directional microphones is also possible, but will affect
`the performance of the beam forming algorithms used in a
`Subsequent stage. These beam forming algorithms may have
`to be modified accordingly. In the instance of a two element
`array, if the elements are directional microphones, then one
`element is oriented towards the wearer's mouth and the
`other oriented away from the mouth.
`Microphone array 102 comprises two or more micro
`phones. Use of two microphones is beneficial to facilitate
`generation of high quality speech signals since desired vocal
`signatures can be isolated and destructive interference tech
`niques can be utilized. Apparatus 100 may be implemented
`with any number of microphones. Those of ordinary skill in
`the art will appreciate that the inventive concepts described
`herein apply equally well to microphone arrays having any
`number of microphones and array shapes which are different
`than linear. The only impact on this generalization is the
`added cost and complexity of the additional microphones
`and their mounting and wiring, plus the added A/D convert
`ers, plus the added processing capacity (processor speed and
`memory) required to perform the beam forming functions on
`the larger array.
`Each microphone in microphone array 102 is coupled to
`an analog to digital (A/D) converters 104. Analog near end
`signals 103 are output from microphone array 102. The
`individual microphone output near end signals 103 are
`applied to A/D converters 104 to form individual digitized
`signals 106. Transmission of Voice by digital techniques has
`become widespread, particularly in cellular telephone and
`PCS applications. In a typical digital telephone system,
`speech is converted from an analog signal to a sampled
`stream of digital Pulse Code Modulated (PCM) samples by
`an A/D converter. In a typical embodiment, a date rate of 64
`kbps is chosen in order to retain Sufficient voice quality.
`Once the speech signal has been digitized, it can be manipu
`lated to achieve certain benefits, such as beam forming, echo
`cancellation, and noise reduction. The digitized voice signal
`can be processed to remove undesired echo by an echo
`canceller and background noise Suppressed by a noise
`reduction algorithm. As described further below, the near
`end audio signal detected by microphone array 102 and
`converted by A/D converters 106 may comprise several
`signal components, including near end speech, near-end
`noise, and far-end echo.
`There is one A/D converter for each microphone in the
`microphone array 102. The A/D converters 104 include
`
`The features and advantages of the apparatus and method
`of the present invention will be apparent from the following
`description in which:
`FIG. 1 is a diagram illustrating a presently preferred
`embodiment of the apparatus utilizing the invention.
`FIG. 2 is a diagram illustrating an embodiment of a
`beam former utilized by the invention.
`FIG. 3 is a flow chart illustrating an example method of
`operation of an adaptive Voice beam former and adaptive
`noise beam former in directing beams and nulls.
`FIG. 4 is a diagram illustrating an embodiment of an
`apparatus for noise reduction using blind source separation
`noise reduction.
`
`10
`
`15
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`
`25
`
`30
`
`35
`
`45
`
`The present invention provides a solution to the needs
`described above through an apparatus and method for reduc
`ing acoustic echo and background noise. The invention
`utilizes beam forming techniques on a voice and noise signal
`together with echo cancellation techniques on a Voice and
`noise signal with noise reduction algorithms.
`Still other embodiments of the present invention will
`become apparent to those skilled in the art from the follow
`ing detailed description, wherein is shown and described
`only the embodiments of the invention by way of illustration
`of the best modes contemplated for carrying out the inven
`tion. As will be realized, the invention is capable of modi
`fication in various obvious aspects, all without departing
`from the spirit and scope of the present invention. Accord
`ingly, the drawings and detailed description are to be
`regarded as illustrative in nature and not restrictive.
`Referring to FIG. 1, diagram of a basic configuration
`utilizing an embodiment of the apparatus for reduction of
`acoustic echo and background noise of the present invention
`40
`is shown. The apparatus 100 of the present invention may be
`constructed using discrete components, such as microphones
`and digital signal processing (DSP) chips (and associated
`filters, A/D and D/A converters, power supplies, etc.). For
`simplicity of explanation, only a Subset of elements is
`shown. The apparatus 100 includes a multi-element micro
`phone array 102, analog to digital converters (A/D) 106,
`beam form voice processor 108, beam form noise processor
`110, voice echo controller 112, noise echo controller 114,
`transmit (Tx) voice activity detector (VAD) 116, double talk
`detector 118, noise reducer 120, transmit output digital to
`analog (D/A) converter 122, far end receive input A/D 124,
`and far end receive (RX) voice activity detector (VAD) 127.
`One of ordinary skill in the art will recognize that other
`architectures may be employed for the apparatus by chang
`ing the position of one or more of the various apparatus
`elements. For example, voice echo controller 112 and noise
`echo controller 114 may be situated between microphone
`array 102 and beam form voice processor 108 and beam form
`noise processor 110.
`The speech of a far end user is converted to a digital far
`end signal 125 by analog to digital converter 124 and
`transmitted to a speaker 128 where it is output to a near end
`user. Prior to output by speaker 128 the digital far end signal
`125 is converted to an analog audio signal by digital to
`analog converter 126. An alternate embodiment would
`couple the far end signal directly to the speaker 128 and
`
`50
`
`55
`
`60
`
`65
`
`Page 7
`
`
`
`US 7,359,504 B1
`
`10
`
`15
`
`5
anti-alias filters for proper signal preconditioning. Alternatively,
the A/D conversion can be implemented using a
single high speed converter with an analog N to 1 signal
multiplexer in front of it to switch the analog signal from a
specific channel onto the input of the ADC. A signal sampling
mechanism is required for each microphone with
sample timing synchronized in order to preserve the time
delay information between microphones as required by the
beamforming stage. While the invention can be implemented
as a purely analog embodiment, it is considered
simpler and therefore cheaper to implement it using digital
signal processing (DSP) technology. One of ordinary skill in
the art will recognize that purely analog implementations
should be considered as merely an implementation variation
of the same invention. A far end A/D converter 124 is
provided for the incoming input receive signal from a
far-end talker.
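The size of the inter-microphone delays shows why synchronized sampling matters. The following is a minimal sketch, not taken from the patent: it assumes a far-field source, a speed of sound of 343 m/s, and hypothetical function names.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def inter_mic_delay_s(spacing_m, angle_deg):
    """Arrival-time difference between two microphones for a far-field
    source at angle_deg from broadside (0 = straight ahead of the pair)."""
    return spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S


def delay_in_samples(delay_s, sample_rate_hz):
    """Express a delay as a (possibly fractional) number of sample periods."""
    return delay_s * sample_rate_hz
```

For an assumed 2 cm spacing and an assumed 8 kHz telephony sample rate, the largest possible delay is under half of one sample period, so a channel-to-channel sampling skew of even one sample would swamp the directional information the beamforming stage relies on.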
The individual A/D output signals 106 are applied to
beamform voice processor 108 and beamform noise processor
110. Beamform voice processor 108 outputs enhanced
voice signal 109 and beamform noise processor 110 outputs
enhanced noise signal 111. The digitized output of microphone
array 102 is electronically processed by beamform
voice processor 108 and beamform noise processor 110 to
emphasize sounds from a particular location and to
de-emphasize sounds from other locations.
A beamformer is a form of spatial filter that receives
inputs from an array of spatially distributed sensors and
combines them in such a way that it either enhances or
suppresses signals coming from certain directions relative to
signals from other directions. As a result, the beamformer
can alter the direction of sensitivity without movement of
the sensor array. The inputs received from each sensor in the
array are combined in a weighted manner to achieve the
desired direction of sensitivity. The filter coefficients of a
non-adaptive beamformer are predetermined such that the
beamformer can form a beam (exhibit the greatest sensitivity)
or a null (exhibit minimal sensitivity) in a predetermined
direction. The filter coefficients of an adaptive beamformer
are continually updated so that directional sensitivity can be
dynamically changed depending on the changing locations
or conditions associated with a target source, such as a user
voice, and undesired sources, such as acoustic echo or
background noise.
Electronic processing using beamforming makes it possible
to electronically “steer” an array by emphasizing
and/or de-emphasizing voice or noise sounds from objects as
they move from location to location. Through the use of
beamform voice processor 108, microphone array 102 can
be advantageously used to pick up speech in situations such
as teleconferences, where hands-free speech acquisition is
desired, where there are multiple talkers, or where the
talkers are moving. Through the use of beamforming and
other such techniques, the array's directivity pattern can be
updated rapidly to follow a moving talker or to switch
between several alternating or simultaneous talkers. Beamform
voice processor 108 may improve the voice signal to
noise ratio by forming a composite antenna pattern beam in
the direction of the voice and an antenna pattern null in the
direction of one or more point noise sources. Through the
use of beamform noise processor 110, the microphone array
can be advantageously used to pick up point noise sources;
the array's directivity pattern can be updated rapidly to
follow a moving noise source or simultaneous noise sources.
Referring to FIG. 2, a four microphone beamformer is
shown. The beamformer includes a microphone array 202.
One of ordinary skill in the art will recognize that beamformers
with other numbers of microphones may be selected. An
embodiment including four microphones has been selected
for the purposes of illustration only and should not be
construed as limiting. Individual A/D output signals 206 are
applied to beamform voice processor 208, which generates
complex weights that are multiplied by the individual A/D
output signals 206. The results are summed to produce an
enhanced voice signal 109. Furthermore, in the present
invention, individual A/D output signals 206 are applied to
beamform noise processor 210, which generates complex
weights that are multiplied by the individual A/D output
signals 206. The results are summed to produce an enhanced
noise signal 211. Operation of the beamform voice processor
208 and the beamform noise processor 210 is described in
further detail below.
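The weight-multiply-and-sum operation can be sketched for a single frequency as follows. This is an illustrative sketch only, not the processors' actual algorithm: it assumes a uniform linear array, far-field plane waves, a speed of sound of 343 m/s, and delay-and-sum weights; all function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value


def steering_vector(num_mics, spacing_m, freq_hz, angle_deg):
    """Relative phase of a unit plane wave at each element of a uniform
    linear array, for a source at angle_deg from broadside."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND_M_S
    s = math.sin(math.radians(angle_deg))
    return [cmath.exp(-1j * k * m * spacing_m * s) for m in range(num_mics)]


def beam_weights(num_mics, spacing_m, freq_hz, look_deg):
    """Delay-and-sum complex weights: the conjugate of the steering vector
    for the look direction, normalised to unit gain in that direction."""
    sv = steering_vector(num_mics, spacing_m, freq_hz, look_deg)
    return [v.conjugate() / num_mics for v in sv]


def array_response(weights, spacing_m, freq_hz, angle_deg):
    """Magnitude of the weighted sum of the per-microphone signals for a
    unit plane wave arriving from angle_deg."""
    sv = steering_vector(len(weights), spacing_m, freq_hz, angle_deg)
    return abs(sum(w * v for w, v in zip(weights, sv)))
```

With four microphones at an assumed 4 cm spacing and a 1 kHz tone, weights steered to 30 degrees give unit response at 30 degrees and a smaller response in other directions, which is the "beam" behaviour described above.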
Referring to FIG. 2, beamform voice processor 208
receives the signals from A/D converters 204 and forms one
or more beams or nulls 240a, 240b, and 240c. The beams are
formed using conventional or adaptive beamforming techniques
well known to those of ordinary skill in the art.
Although three beams or nulls are shown, those of ordinary
skill in the art will recognize that beamform voice processor
208 can form fewer or greater than the three beams or nulls
and that the beams or nulls can be directed in any desired
direction and not just in the directions shown in FIG. 2.
In further reference to FIG. 2, beamform noise processor
210 receives the signals from A/D converters 204 and forms
one or more beams or nulls 242a, 242b. The beams are
formed using conventional or adaptive beamforming techniques
well known to those of ordinary skill in the art.
Although two beams or nulls are shown, those of ordinary
skill in the art will recognize that beamform noise processor
110 can form fewer or greater than the two beams or nulls
and that the beams or nulls can be directed in any desired
direction and not just in the directions shown in FIG. 2.
Beamform voice processor 208 isolates a near-end
speaker voice 212. Beamform noise processor 210 isolates
the noise from point noise sources such as X1 244 and X2
246 for noise reduction at subsequent stages of apparatus
100. One or more nulls may be directed at a headset speaker
228 to minimize the acoustic echo.
Referring to FIG. 1, in one embodiment both beamform
voice processor 108 and beamform noise processor 110 are
implemented as wide-band (pass band covers at least 300 to
3,300 Hz) beamformers, using any one of several common
DSP algorithms as described in publications known to those
of ordinary skill in the art and in sonar and radar applications.
The beamformers may be either a fixed configuration,
for lower cost, or adaptive for better performance. The voice
beamformer is configured to orient the main lobe of the
beamformed response towards the wearer's mouth. An
adaptive beamformer is capable of adjusting the direction of
the main lobe to compensate for different wearing positions
encountered on different wearers' heads. This eliminates the
need for the user to precisely position the headset on their
head with respect to the mouth to headset orientation.
Use of adaptive beamformers can also adaptively place a
null in one or more directions. This capability can be utilized
to adaptively orient a null in the array response towards a
major noise point source. If more than two microphones are
used in the array, then several nulls can be adaptively
oriented to reduce the response from several noise point
sources. Additional nulls, if available, may be oriented
towards the noise point sources, or else they may be oriented
towards the headset speaker to reduce the acoustic echo
perceived by the far-end talker. The acoustic echo is
reduced, but not eliminated, utilizing beamformer nulls.
Additional acoustic echo control is implemented in subsequent
stages by the echo canceller. In the absence of a
specific major noise point source, i.e. the noise is diffuse or
reverberant, the nulls may be oriented in directions generally
away from the mouth, or else towards the headset speaker to
reduce the far-end talker's perceived acoustic echo. In the
case of diffuse noise, the effectiveness of the noise
beamformer is reduced but still advantageous.
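Null placement with the smallest possible array can be sketched as below. This is a simplified single-frequency illustration under stated assumptions (two omni-directional microphones, far-field plane waves, speed of sound 343 m/s), not the adaptive algorithm of the invention; the function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value


def phase_step(spacing_m, freq_hz, angle_deg):
    """Phase lag of the second microphone relative to the first for a
    plane wave arriving from angle_deg (0 = broadside)."""
    return (2.0 * math.pi * freq_hz * spacing_m
            * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S)


def null_weights(spacing_m, freq_hz, null_deg):
    """Two-element complex weights whose weighted sum cancels a plane
    wave arriving from null_deg."""
    return [1.0 + 0.0j,
            -cmath.exp(1j * phase_step(spacing_m, freq_hz, null_deg))]


def response(weights, spacing_m, freq_hz, angle_deg):
    """Magnitude of the weighted sum for a unit plane wave from angle_deg."""
    step = phase_step(spacing_m, freq_hz, angle_deg)
    signals = [cmath.exp(-1j * m * step) for m in range(len(weights))]
    return abs(sum(w * s for w, s in zip(weights, signals)))
```

A two-microphone array has only one such steerable null, which is consistent with the statement above that several nulls require more than two microphones.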
The noise beamformer is configured to place the main
lobe of its output response away from the wearer's mouth.
An adaptive beamformer can adaptively orient the main
lobe towards a major noise point source. In the absence of
a specific major noise point source, i.e. the noise is diffuse,
a broad main lobe is oriented in a direction generally away
from the mouth. Again a null can be implemented, or several
nulls for a larger array, in the output response of the noise
beamformer. This null is oriented towards the wearer's
mouth, reducing the response from the wearer's own voice.
Additional nulls, if available, may also be oriented towards
the wearer's mouth or else they may be oriented towards the
headset speaker, to reduce the acoustic echo perceived by the
far-end talker.
Referring to FIG. 3 and FIG. 2, a method of operation of
an adaptive voice beamformer 208 and adaptive noise
beamformer 210 in directing beams and nulls is illustrated.
One of ordinary skill in the art will recognize that the
direction of the beams and nulls as described in FIG. 3 is
only illustrative, and other configurations of beams and nulls
can be utilized by beamform voice processor 208 and
beamform noise processor 210 in addition to that described in
FIG. 3. At step 304, the noise energy level from point noise
sources is determined as follows. While (Tx VAD=false) and (Rx
VAD=false): (1) determine the Direction of Arrival (DoA)
for the strongest noise sources greater than a noise threshold,
and (2) generate an ordered list of DoA(n) according to the
noise level, where 0<=n<=N (N is equal to the maximum
number of steerable nulls). These determinations and others
described below can be made by a separate processor (not
shown) from the beamform voice processor 208 and beamform
noise processor 210, or as part of the function of the
beamform voice processor 208 or beamform noise processor
210.
At step 306, the acoustic echo level is determined as
follows. While (Rx VAD=true) and (Tx VAD=false): measure
the cross-correlation of the Rx and Tx signals at various
angles. While (Rx VAD=false) and (Tx VAD=false): measure
the cross-correlation of the Rx and Tx signals at various
angles. The ratio of (correlation during Rx VAD=true)/
(correlation during Rx VAD=false) is calculated. The Acoustic
Echo Direction of Arrival (AE DoA) with the strongest
ratio is selected.
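The ratio test of step 306 can be sketched as follows, assuming the two sets of cross-correlation magnitudes have already been measured per candidate angle. The dictionary inputs, the small `floor` guard against division by zero, and the function name are assumptions for illustration.

```python
def select_echo_doa(corr_rx_active, corr_rx_idle, floor=1e-12):
    """Pick the angle whose Rx/Tx cross-correlation rises the most when
    the far-end talker is active.

    corr_rx_active: angle_deg -> correlation while Rx VAD is true
    corr_rx_idle:   angle_deg -> correlation while Rx VAD is false
    Returns (angle_deg, ratio), or (None, 0.0) if no angle was measured.
    """
    best_angle, best_ratio = None, 0.0
    for angle, active in corr_rx_active.items():
        # Guard against a zero or missing idle measurement (assumption).
        idle = max(corr_rx_idle.get(angle, 0.0), floor)
        ratio = active / idle
        if ratio > best_ratio:
            best_angle, best_ratio = angle, ratio
    return best_angle, best_ratio
```

Normalising by the idle-period correlation favours directions where the correlation is caused by the far-end signal rather than by background noise that happens to correlate at that angle.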
At step 308, a determination is made whether the acoustic
echo noise level is greater than the noise threshold and
whether the AE DoA is not equal to an AE DoA on the list.
If yes, at step 310 a determination is made whether the
number of point noise sources is greater than or equal to the
maximum number of steerable nulls. If no at step 308, at step
312 it is determined whether the number of point sources is
greater than zero.
If yes at step 310, at step 314 it is determined whether the
AE noise level is greater than point noise level (n). If no at
step 310, at step 316 the AE DoA is added to the end of the list.
If yes at step 314, DoA(n) is replaced with the Acoustic Echo
DoA at step 315. If no at step 314, at step 312 it is
determined whether the number of point sources is greater
than zero.
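The branching in steps 308 through 316 can be sketched as one list update. This is an interpretation, not the definitive method: the text does not say which entry "point noise level (n)" refers to, so the sketch assumes it is the last (weakest) entry of the ordered list, and the names are hypothetical.

```python
def update_list_with_echo(doa_list, ae_doa, ae_level,
                          noise_threshold, max_nulls):
    """Fold the acoustic echo direction into the ordered noise DoA list.

    doa_list holds (angle_deg, level) pairs, strongest first.
    Returns a new list; the input is not modified.
    """
    updated = list(doa_list)
    on_list = any(angle == ae_doa for angle, _ in updated)
    # Step 308: echo strong enough and not already on the list?
    if ae_level > noise_threshold and not on_list:
        # Step 310: are all steerable nulls already assigned?
        if len(updated) >= max_nulls:
            # Step 314 (assumed comparison against the weakest entry).
            if updated and ae_level > updated[-1][1]:
                # Step 315: replace DoA(n) with the echo direction.
                updated[-1] = (ae_doa, ae_level)
        else:
            # Step 316: a null is still free, so append the echo direction.
            updated.append((ae_doa, ae_level))
    return updated
```

After this update the flow continues to step 312, which only checks whether the list is non-empty.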
`
`50
`
`55
`
`60
`
`65
`
`8
`If yes at step 312, at step 318 the beam form noise output
`directs a null at the user's mouth and directs beam(s) at point
`noise Source(s). The beam form voice output directs a beam
`at the user's mouth and directs null(s) at point noise Source
`(s). If no at step 312, at step 320 the beam form noise output
`directs a null at the user's mouth and directs a wide beam
`away from the user's mouth. The beam form voice output
`directs a beam at the user's mouth and directs equal-spaced
`nulls away from the user's mouth.
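The two outcomes of step 312 can be summarised as a steering plan. The sketch below is illustrative only: the dictionary layout, the textual "wide" beam marker, and the way equal-spaced null angles are computed (evenly around a full circle, skipping the mouth direction) are assumptions, not details given in the text.

```python
def plan_beams(point_source_doas, mouth_doa, max_nulls):
    """Return beam and null directions (degrees) for the noise and voice
    beamformers, following steps 312, 318 and 320."""
    if point_source_doas:
        # Step 318: point noise sources are present.
        noise = {"nulls": [mouth_doa], "beams": list(point_source_doas)}
        voice = {"beams": [mouth_doa], "nulls": list(point_source_doas)}
    else:
        # Step 320: no point sources, e.g. diffuse noise.
        # Assumed spacing: equal angular steps that never land on the mouth.
        spaced = [(mouth_doa + 360.0 * (i + 1) / (max_nulls + 1)) % 360.0
                  for i in range(max_nulls)]
        noise = {"nulls": [mouth_doa], "beams": ["wide, away from mouth"]}
        voice = {"beams": [mouth_doa], "nulls": spaced}
    return {"noise": noise, "voice": voice}
```

The voice and noise plans mirror each other: wherever one beamformer points a beam, the other points a null.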
In an alternative embodiment where there are limited
beams and/or nulls available, an adaptive algorithm may
balance the noise energy level from the headset speaker
against the diffuse noise energy level to determine the
strength of the null to direct toward the headset speaker. In
a further alternative embodiment, an adaptive algorithm
balances the noise energy level from the headset speaker
against the energy from one or more distinct point noise
sources to determine the appropriate weighting to place
beams and nulls in particular directions. The determined
weighting may be adaptively updated as point noise sources
or acoustic echo changes. The adaptive algorithm maximizes
the voice to noise ratio, where the noise comprises
point noise sources, diffuse noise, and acoustic echo. This
voice to noise ratio is also maximized in subsequent echo
cancellation and noise reduction stages.
Referring to FIG. 1, the output of beamform voice processor,
enhanced voice signal 109, and the output of beamform
noise processor 110, enhanced noise signal 111, are
propagated to voice activity detector (VAD) 116.
Voice activity detector 116 determines when the headset
user is speaking and when the user is silent (i.e., whether the
signals 109 and 111 include voice or only noise). A binary
output “Voice/No Voice” signal 117 is output and used by
`other stages to control the echo cancellation and transmit
`noise reductio