`Burnett
`
(10) Patent No.: US 8,280,072 B2
(45) Date of Patent: Oct. 2, 2012

US008280072B2

(54) MICROPHONE ARRAY WITH REAR VENTING

(75) Inventor: Gregory C. Burnett, Dodge Center, MN (US)

(73) Assignee: AliphCom, Inc., San Francisco, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1041 days.

(21) Appl. No.: 12/163,617

(22) Filed: Jun. 27, 2008

(65) Prior Publication Data
US 2009/0010449 A1    Jan. 8, 2009

Related U.S. Application Data
(63) Continuation-in-part of application No. 12/139,333, filed on Jun. 13, 2008, and a continuation-in-part of application No. 11/805,987, filed on May 25, 2007, now abandoned, and a continuation-in-part of application No. 10/667,207, filed on Sep. 18, 2003, now Pat. No. 8,019,091, and a continuation-in-part of application No. 10/400,282, filed on Mar. 27, 2003.
(60) Provisional application No. 60/937,603, filed on Jun. 27, 2007.

(51) Int. Cl.
H04R 3/00 (2006.01)
(52) U.S. Cl. ......................... 381/92; 381/94.1; 381/94.7
(58) Field of Classification Search .................... 381/92, 381/94.1, 94.7
See application file for complete search history.

(56) References Cited
U.S. PATENT DOCUMENTS
6,173,059 B1* 1/2001 Huang et al. .................... 381/92
6,448,488 B1* 9/2002 Ekhaus et al. .................. 84/735
6,618,485 B1* 9/2003 Matsuo ........................... 381/92
* cited by examiner

Primary Examiner: Wai Sing Louie
(74) Attorney, Agent, or Firm: Kokka & Backus, PC

(57) ABSTRACT
Microphone arrays (MAs) are described that position and vent microphones so that performance of a noise suppression system coupled to the microphone array is enhanced. The MA includes at least two physical microphones to receive acoustic signals. The physical microphones make use of a common rear vent (actual or virtual) that samples a common pressure source. The MA includes a physical directional microphone configuration and a virtual directional microphone configuration. By making the input to the rear vents of the microphones (actual or virtual) as similar as possible, the real-world filter to be modeled becomes much simpler to model using an adaptive filter.
`9 Claims, 14 Drawing Sheets
`
`
`
`
`
`
`U.S. Patent
`
`Oct. 2, 2012
`
`Sheet 1 of 14
`
`US 8,280,072 B2
`
`
Sheet 2 of 14 (drawing)
`
`
`
`
Sheet 3 of 14

FIG. 3: Results in cafe environment with no NS (top, plot 302) and PF + SS (bottom, plot 312). Horizontal axis: Time (sec), 0 to 35.
`
`
`
Sheets 4 through 9 of 14 (drawings)
`
`
`
Sheet 10 of 14

FIG. 11 (drawing; reference numeral 1100)
`
`
`
Sheet 11 of 14

FIG. 12 (flow diagram 1200)
1202: Position first microphone in housing relative to speech source.
1204: Position second microphone in housing relative to first microphone.
1206: Form common rear port that is common to first and second microphone, the common rear port including a vent cavity in an interior region of housing.
`
`
`
Sheet 12 of 14

FIG. 13 (flow diagram 1300)
1302: Position first microphone in housing relative to speech source.
1304: Position second microphone in housing relative to first microphone.
1306: Position third microphone in housing relative to first and second microphone and configure third microphone as rear "vent" for first and second microphone.
`
`
`
Sheet 13 of 14

FIG. 14 (flow diagram 1400)
1402: Receive acoustic signals at first microphone and second microphone.
1404: Control delay of first rear port of first microphone to be approximately equal to delay of second rear port of second microphone.
1406: Generate denoised output signals by combining signals output from first and second microphones.
`
`
`
Sheet 14 of 14

FIG. 15 (flow diagram 1500)
1502: Receive acoustic signals at first physical microphone and output first microphone signal.
1504: Receive acoustic signals at second physical microphone and output second microphone signal.
1506: Receive acoustic signals at third physical microphone and output third microphone signal.
1508: Form first virtual microphone by generating combination of first microphone signal and third microphone signal.
1510: Form second virtual microphone by generating combination of second microphone signal and third microphone signal.
1512: Generate denoised output signals by combining signals output from the first virtual microphone and the second virtual microphone.
`
`
`
MICROPHONE ARRAY WITH REAR VENTING
`
`RELATED APPLICATIONS
`
This application claims the benefit of U.S. Patent Application No. 60/937,603, filed Jun. 27, 2007.
This application is a continuation-in-part application of U.S. patent application Ser. Nos. 10/400,282, filed Mar. 27, 2003, 10/667,207, filed Sep. 18, 2003, 11/805,987, filed May 25, 2007, and 12/139,333, filed Jun. 13, 2008.
`
`TECHNICAL FIELD
`
The disclosure herein relates generally to noise suppression. In particular, this disclosure relates to noise suppression systems, devices, and methods for use in acoustic applications.
`
`BACKGROUND
`
Conventional adaptive noise suppression algorithms have been around for some time. These conventional algorithms have used two or more microphones to sample both an (unwanted) acoustic noise field and the (desired) speech of a user. The noise relationship between the microphones is then determined using an adaptive filter (such as Least-Mean-Squares as described in Haykin & Widrow, ISBN #0471215708, Wiley, 2002, but any adaptive or stationary system identification algorithm may be used) and that relationship is used to filter the noise from the desired signal.
Most conventional noise suppression systems currently in use for speech communication systems are based on a single-microphone spectral subtraction technique first developed in the 1970s and described, for example, by S. F. Boll in "Suppression of Acoustic Noise in Speech using Spectral Subtraction," IEEE Trans. on ASSP, pp. 113-120, 1979. These techniques have been refined over the years, but the basic principles of operation have remained the same. See, for example, U.S. Pat. No. 5,687,243 of McLaughlin, et al., and U.S. Pat. No. 4,811,404 of Vilmur, et al. There have also been several attempts at multi-microphone noise suppression systems, such as those outlined in U.S. Pat. No. 5,406,622 of Silverberg et al. and U.S. Pat. No. 5,463,694 of Bradley et al. Multi-microphone systems have not been very successful for a variety of reasons, the most compelling being poor noise cancellation performance and/or significant speech distortion.
`
`INCORPORATION BY REFERENCE
`
Each patent, patent application, and/or publication mentioned in this specification is herein incorporated by reference in its entirety to the same extent as if each individual patent, patent application, and/or publication was specifically and individually indicated to be incorporated by reference.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a two-microphone adaptive noise suppression system, under an embodiment.
FIG. 2 is a block diagram of a directional microphone array (MA) having a shared-vent configuration, under an embodiment.
FIG. 3 shows results obtained for a MA having a shared-vent configuration, under an embodiment.
FIG. 4 is a three-microphone adaptive noise suppression system, under an embodiment.
FIG. 5 is a block diagram of the MA in the shared-vent configuration including omnidirectional microphones to form virtual directional microphones (VDMs), under an embodiment.
FIG. 6 is a block diagram for a MA including three physical omnidirectional microphones configured to form two virtual microphones M1 and M2, under an embodiment.
FIG. 7 is a generalized two-microphone array including an array and speech source S configuration, under an embodiment.
FIG. 8 is a system for generating a first order gradient microphone V using two omnidirectional elements O1 and O2, under an embodiment.
FIG. 9 is a block diagram for a MA including two physical microphones configured to form two virtual microphones V1 and V2, under an embodiment.
FIG. 10 is a block diagram for a MA including two physical microphones configured to form N virtual microphones V1 through VN, where N is any number greater than one, under an embodiment.
FIG. 11 is an example of a headset or head-worn device that includes the MA, under an embodiment.
FIG. 12 is a flow diagram for forming the MA having the physical shared-vent configuration, under an embodiment.
FIG. 13 is a flow diagram for forming the MA having the shared-vent configuration including omnidirectional microphones to form VDMs, under an alternative embodiment.
FIG. 14 is a flow diagram for denoising acoustic signals using the MA having the physical shared-vent configuration, under an embodiment.
FIG. 15 is a flow diagram for denoising acoustic signals using the MA having the shared-vent configuration including omnidirectional microphones to form VDMs, under an alternative embodiment.
`
`DETAILED DESCRIPTION
`
Systems and methods are provided including microphone arrays and associated processing components for use in noise suppression. The systems and methods of an embodiment include systems and methods for noise suppression using one or more of microphone arrays having multiple microphones, an adaptive filter, and/or speech detection devices. More specifically, the systems and methods described herein include microphone arrays (MAs) that position and vent microphones so that performance of a noise suppression system coupled to the microphone array is enhanced.
The MA configuration of an embodiment uses rear vents with the directional microphones, and the rear vents sample a common pressure source. By making the input to the rear vents of directional microphones (actual or virtual) as similar as possible, the real-world filter to be modeled becomes much simpler to model using an adaptive filter. In some cases, the filter collapses to unity, the simplest filter of all. The MA systems and methods described herein have been successfully implemented in the laboratory and in physical systems and provide improved performance over conventional methods. This is accomplished differently for physical directional microphones and virtual directional microphones (VDMs). The theory behind the microphone configuration, and more specific configurations, are described in detail below for both physical directional microphones and VDMs.
The MAs, in various embodiments, can be used with the Pathfinder system (referred to herein as "Pathfinder") as the adaptive filter system or noise removal. The Pathfinder system, available from AliphCom, San Francisco, Calif., is described in detail in other patents and patent applications referenced herein. Alternatively, any adaptive filter or noise removal algorithm can be used with the MAs in one or more various alternative embodiments or configurations.
The Pathfinder system includes a noise suppression algorithm that uses multiple microphones and a VAD signal to remove undesired noise while preserving the intelligibility and quality of the speech of the user. Pathfinder does this using a configuration including directional microphones and overlapping the noise and speech response of the microphones; that is, one microphone will be more sensitive to speech than the other but they will both have similar noise responses. If the microphones do not have the same or similar noise responses, the denoising performance will be poor. If the microphones have similar speech responses, then devoicing will take place. Therefore, the MAs of an embodiment ensure that the noise response of the microphones is as similar as possible while simultaneously constructing the speech response of the microphones as dissimilar as possible. The technique described herein is effective at removing undesired noise while preserving the intelligibility and quality of the speech of the user.
In the following description, numerous specific details are introduced to provide a thorough understanding of, and enabling description for, embodiments of the microphone array (MA). One skilled in the relevant art, however, will recognize that these embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.
Unless otherwise specified, the following terms have the corresponding meanings in addition to any meaning or understanding they may convey to one skilled in the art.
The term "speech" means desired speech of the user.
The term "noise" means unwanted environmental acoustic noise.
The term "denoising" means removing unwanted noise from MIC 1, and also refers to the amount of reduction of noise energy in a signal in decibels (dB).
The term "devoicing" means removing/distorting the desired speech from MIC 1.
The term "directional microphone (DM)" means a physical directional microphone that is vented on both sides of the sensing diaphragm.
The term "virtual microphones (VM)" or "virtual directional microphones" means a microphone constructed using two or more omnidirectional microphones and associated signal processing.
The term "MIC 1 (M1)" means a general designation for a microphone that is more sensitive to speech than noise.
The term "MIC 2 (M2)" means a general designation for a microphone that is more sensitive to noise than speech.
The term "null" means a zero or minimum in the spatial response of a physical or virtual directional microphone.
The term "O1" means a first physical omnidirectional microphone used to form a microphone array.
The term "O2" means a second physical omnidirectional microphone used to form a microphone array.
The term "O3" means a third physical omnidirectional microphone used to form a microphone array.
The term "V1" means the virtual directional "speech" microphone, which has no nulls.
The term "V2" means the virtual directional "noise" microphone, which has a null for the user's speech.
The term "Voice Activity Detection (VAD) signal" means a signal indicating when user speech is detected.

FIG. 1 is a two-microphone adaptive noise suppression system 100, under an embodiment. The two-microphone system 100 includes the combination of microphone array 110 along with the processing or circuitry components to which the microphone array couples. The processing or circuitry components, some of which are described in detail below, include the noise removal application or component 105 and the VAD sensor 106. The output of the noise removal component is cleaned speech, also referred to as denoised acoustic signals 107.
The microphone array 110 of an embodiment comprises physical microphones MIC 1 and MIC 2, but the embodiment is not so limited, and either of MIC 1 and MIC 2 can be a physical or virtual microphone. Referring to FIG. 1, in analyzing the single noise source 101 and the direct path to the microphones, the total acoustic information coming into MIC 1 is denoted by m1(n). The total acoustic information coming into MIC 2 is similarly labeled m2(n). In the z (digital frequency) domain, these are represented as M1(z) and M2(z). Then,

$$
\begin{aligned}
M_1(z) &= S(z) + N(z)\,H_1(z) \\
M_2(z) &= N(z) + S(z)\,H_2(z)
\end{aligned}
\qquad \text{(Eq. 1)}
$$

This is the general case for all two-microphone systems. Equation 1 has four unknowns and only two known relationships and therefore cannot be solved explicitly.
However, there is another way to solve for some of the unknowns in Equation 1. The analysis starts with an examination of the case where the speech is not being generated, that is, where a signal from the VAD subsystem 106 (optional) equals zero. In this case, s(n)=S(z)=0, and Equation 1 reduces to

$$
\begin{aligned}
M_{1N}(z) &= N(z)\,H_1(z) \\
M_{2N}(z) &= N(z)
\end{aligned}
$$
`
where the N subscript on the M variables indicates that only noise is being received. This leads to

$$
\begin{aligned}
M_{1N}(z) &= M_{2N}(z)\,H_1(z) \\
H_1(z) &= \frac{M_{1N}(z)}{M_{2N}(z)}
\end{aligned}
\qquad \text{(Eq. 2)}
$$
`
The function H1(z) can be calculated using any of the available system identification algorithms and the microphone outputs when the system is certain that only noise is being received. The calculation can be done adaptively, so that the system can react to changes in the noise.
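As an illustration of the noise-only identification step, the following is a minimal sketch that estimates H1(z) as a short FIR filter with a normalized LMS (NLMS) update. NLMS is chosen here only as one example of "any of the available system identification algorithms"; the function name `nlms_identify`, the tap count, the step size, and the synthetic `true_h1` taps are all hypothetical and are not taken from the disclosure.

```python
import numpy as np

def nlms_identify(m2, m1, num_taps=8, mu=0.5, eps=1e-8):
    """Estimate FIR taps h so that m1[n] ~ sum_k h[k] * m2[n - k].

    Intended to be run only on frames where the VAD reports no speech,
    so that both inputs contain noise only (M1N and M2N in Eq. 2).
    """
    h = np.zeros(num_taps)
    buf = np.zeros(num_taps)              # recent m2 samples, newest first
    for x, d in zip(m2, m1):
        buf = np.roll(buf, 1)
        buf[0] = x
        err = d - h @ buf                 # prediction error at MIC 1
        h += mu * err * buf / (buf @ buf + eps)
    return h

# Synthetic noise-only data (assumed values, for illustration only).
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)         # n(n), seen directly by MIC 2
true_h1 = np.array([0.9, 0.3, -0.1])      # hypothetical H1(z) taps
m2n = noise
m1n = np.convolve(noise, true_h1)[:len(noise)]

h1_est = nlms_identify(m2n, m1n)
```

Because the update runs sample by sample, the same loop can keep tracking H1(z) as the noise field changes, provided it is frozen whenever speech is present.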
A solution is now available for H1(z), one of the unknowns in Equation 1. The final unknown, H2(z), can be determined by using the instances where speech is being produced and the VAD equals one. When this is occurring, but the recent (perhaps less than 1 second) history of the microphones indicates low levels of noise, it can be assumed that n(n)=N(z)≈0. Then Equation 1 reduces to

$$
\begin{aligned}
M_{1S}(z) &= S(z) \\
M_{2S}(z) &= S(z)\,H_2(z)
\end{aligned}
$$

which in turn leads to

$$
\begin{aligned}
M_{2S}(z) &= M_{1S}(z)\,H_2(z) \\
H_2(z) &= \frac{M_{2S}(z)}{M_{1S}(z)}
\end{aligned}
$$
`
which is the inverse of the H1(z) calculation. However, it is noted that different inputs are being used (now only the speech is occurring whereas before only the noise was occurring). While calculating H2(z), the values calculated for H1(z) are held constant (and vice versa) and it is assumed that the noise level is not high enough to cause errors in the H2(z) calculation.
After calculating H1(z) and H2(z), they are used to remove the noise from the signal. If Equation 1 is rewritten as

$$
\begin{aligned}
S(z) &= M_1(z) - N(z)\,H_1(z) \\
N(z) &= M_2(z) - S(z)\,H_2(z)
\end{aligned}
$$

then N(z) may be substituted as shown to solve for S(z) as

$$
S(z) = \frac{M_1(z) - M_2(z)\,H_1(z)}{1 - H_1(z)\,H_2(z)}
\qquad \text{(Eq. 3)}
$$
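The substitution behind Equation 3 can be written out step by step. This is a worked sketch that assumes Equation 1 has the two-path form M1(z) = S(z) + N(z)H1(z) and M2(z) = N(z) + S(z)H2(z), which is the form implied by Equation 2 and by the speech-only case above.

```latex
\begin{aligned}
N(z) &= M_2(z) - S(z)\,H_2(z) \\
M_1(z) &= S(z) + \bigl[M_2(z) - S(z)\,H_2(z)\bigr]\,H_1(z) \\
M_1(z) - M_2(z)\,H_1(z) &= S(z)\,\bigl[1 - H_1(z)\,H_2(z)\bigr] \\
S(z) &= \frac{M_1(z) - M_2(z)\,H_1(z)}{1 - H_1(z)\,H_2(z)}
\end{aligned}
```

The division in the last line requires that H1(z)H2(z) is not equal to one.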
`
If the transfer functions H1(z) and H2(z) can be described with sufficient accuracy, then the noise can be completely removed and the original signal recovered. This remains true without respect to the amplitude or spectral characteristics of the noise. If there is very little or no leakage from the speech source into M2, then H2(z)≈0 and Equation 3 reduces to

$$
S(z) \approx M_1(z) - M_2(z)\,H_1(z)
\qquad \text{(Eq. 4)}
$$
`
Equation 4 is much simpler to implement and is very stable, assuming H1(z) is stable. However, if significant speech energy is in M2(z), devoicing can occur. In order to construct a well-performing system and use Equation 4, consideration is given to the following conditions:
R1. Availability of a perfect (or at least very good) VAD in noisy conditions.
R2. Sufficiently accurate H1(z).
R3. Very small (ideally zero) H2(z).
R4. During speech production, H1(z) cannot change substantially.
R5. During noise, H2(z) cannot change substantially.
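The role of condition R3 can be sketched numerically by applying Equation 4 in the time domain, once with no speech leakage into MIC 2 and once with leakage. This is only an illustrative sketch: the helper names `denoise_eq4` and `simulate`, the sinusoid used as a stand-in for speech, and the filter taps are hypothetical, and H1(z) is assumed to be known exactly.

```python
import numpy as np

def denoise_eq4(m1, m2, h1):
    """Equation 4 in the time domain: s_hat[n] = m1[n] - (h1 * m2)[n]."""
    noise_in_m1 = np.convolve(m2, h1)[:len(m1)]
    return m1 - noise_in_m1

def simulate(h1, h2, n=2000, seed=1):
    """Build m1 and m2 from the two-path model of Equation 1 (assumed form)."""
    rng = np.random.default_rng(seed)
    speech = np.sin(2 * np.pi * 0.01 * np.arange(n))   # stand-in for s(n)
    noise = rng.standard_normal(n)                     # n(n)
    m1 = speech + np.convolve(noise, h1)[:n]
    m2 = noise + np.convolve(speech, h2)[:n]
    return speech, m1, m2

h1 = np.array([0.8, 0.2])                # hypothetical noise path H1(z)

# Case 1: H2(z) = 0, so MIC 2 hears noise only and Eq. 4 recovers the speech.
speech, m1, m2 = simulate(h1, h2=np.array([0.0]))
residual_no_leak = np.max(np.abs(denoise_eq4(m1, m2, h1) - speech))

# Case 2: speech leaks into MIC 2, so Eq. 4 also subtracts speech (devoicing).
speech, m1, m2 = simulate(h1, h2=np.array([0.3]))
residual_leak = np.max(np.abs(denoise_eq4(m1, m2, h1) - speech))
```

In the second case the noise is still cancelled, but the output differs from the original speech by the leaked speech filtered through H1(z), which is the devoicing effect described above.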
Condition R1 is easy to satisfy if the SNR of the desired speech to the unwanted noise is high enough. "Enough" means different things depending on the method of VAD generation. If a VAD vibration sensor is used, as in Burnett U.S. Pat. No. 7,256,048, accurate VAD in very low SNRs (-10 dB or less) is possible. Acoustic-only methods using information from MIC 1 and MIC 2 can also return accurate VADs, but are limited to SNRs of -3 dB or greater for adequate performance.
Condition R5 is normally simple to satisfy because for most applications the microphones will not change position with respect to the user's mouth very often or rapidly. In those applications where it may happen (such as hands-free conferencing systems) it can be satisfied by configuring MIC 2 so that H2(z)≈0.
Satisfying conditions R2, R3, and R4 is more difficult but is possible given the right combination of microphone output signals. Methods are examined below that have proven to be effective in satisfying the above, resulting in excellent noise suppression performance and minimal speech removal and distortion in an embodiment.
The MA, in various embodiments, can be used with the Pathfinder system as the adaptive filter system or noise removal (element 105 in FIG. 1), as described above. When the MA is used with the Pathfinder system, the Pathfinder system generally provides adaptive noise cancellation by combining the two microphone signals (e.g., MIC 1, MIC 2) by filtering and summing in the time domain. The adaptive filter generally uses the signal received from a first microphone of the MA to remove noise from the speech received from at least one other microphone of the MA, which relies on a slowly varying linear transfer function between the two microphones for sources of noise. Following processing of the two channels of the MA, an output signal is generated in which the noise content is attenuated with respect to the speech content, as described in detail below.
A description follows of the theory supporting the MA with the Pathfinder. While the following description includes reference to two directional microphones, the description can be generalized to any number of microphones.
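Pathfinder operates using an adaptive algorithm to continuously update the filter constructed using MIC 1 and MIC 2. In the frequency domain, each microphone's output can be represented as:

$$
\begin{aligned}
M_1(z) &= F_1(z) - z^{-d_1}\,B_1(z) \\
M_2(z) &= F_2(z) - z^{-d_2}\,B_2(z)
\end{aligned}
$$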
`
where F1(z) represents the pressure at the front port of MIC 1, B1(z) the pressure at the back (rear) port, and z^(-d1) the delay instituted by the microphone. This delay can be realized through port venting and/or microphone construction and/or other ways known to those skilled in the art, including acoustic retarders which slow the acoustic pressure wave. If using omnidirectional microphones to construct virtual directional microphones, these delays can also be realized using delays in DSP. The delays are not required to be integer delays. The filter that is constructed using these outputs is

$$
H_1(z) = \frac{M_1(z)}{M_2(z)} = \frac{F_1(z) - z^{-d_1}\,B_1(z)}{F_2(z) - z^{-d_2}\,B_2(z)}
$$
`
In the case where B1(z) is not equal to B2(z), this is an IIR filter. It can become quite complex when multiple microphones are employed. However, if B1(z)=B2(z) and d1=d2, then

$$
H_1(z) = \frac{F_1(z) - z^{-d_1}\,B_1(z)}{F_2(z) - z^{-d_1}\,B_1(z)}
$$
The front ports of the two microphones are related to each other by a simple relationship:

$$
F_2(z) = A\,z^{-d_{12}}\,F_1(z)
$$

where A is the difference in amplitude of the noise between the two microphones and d12 is the delay between the microphones. Both of these will vary depending on where the acoustic source is located with respect to the microphones. A single noise source is assumed for purposes of this description, but the analysis presented can be generalized to multiple noise sources. For noise, which is assumed to be more than a meter away (in the far field), A is approximately 1. The delay d12 will vary depending on the noise source between -d12max and +d12max, where d12max is the maximum delay possible between the two front ports. This maximum delay is a function of the distance between the front vents of the microphones and the speed of sound in air.
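The dependence of the maximum delay on port spacing and the speed of sound is a one-line calculation. The sketch below assumes a speed of sound of 345 m/s; the 3 cm spacing and the 8 kHz sample rate are illustrative values only, and the function name `max_port_delay` is hypothetical.

```python
def max_port_delay(spacing_m, speed_of_sound_m_s=345.0):
    """Largest possible acoustic delay, in seconds, between two ports.

    The maximum occurs when the source lies on the line through both ports.
    """
    return spacing_m / speed_of_sound_m_s

delay_s = max_port_delay(0.03)        # ports 3 cm apart (example value)
delay_ms = delay_s * 1e3              # about 0.087 msec
delay_samples = delay_s * 8000.0      # at an assumed 8 kHz sample rate
```

At this spacing and sample rate the maximum delay is less than one sample, which is why non-integer delays matter when the delays are realized in DSP.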
The rear ports of the two microphones are related to the front port by a similar relationship:

$$
B_1(z) = B\,z^{-d_{13}}\,F_1(z)
$$

where B is the difference in amplitude of the noise between the two microphones and d13 is the delay between front port 1 and the common back port 3. Both of these will vary depending on where the acoustic source is located with respect to the microphones, as shown above with d12. The delay d13 will vary depending on the noise source between -d13max and +d13max, where d13max is the maximum delay possible between front port 1 and the common back port 3. This maximum delay is determined by the path length between front port 1 and the common back port 3; for example, if they are located 3 centimeters (cm) apart, d13max will be

$$
d_{13\max} = \frac{d}{c} = \frac{0.03\ \text{m}}{345\ \text{m/s}} = 0.087\ \text{msec}
$$

Again, for noise, B is approximately one (1) since the noise sources are assumed to be greater than one (1) meter away from the microphones. Thus, in general, the above equation reduces to:

$$
H_{1N}(z) = \frac{1 - z^{-(d_1 + d_{13})}}{z^{-d_{12}} - z^{-(d_1 + d_{13})}}
$$

where the "N" denotes that this response is for far-field noise. Since d1 is a characteristic of the microphone, it remains the same for all different noise orientations. Conversely, d12 and d13 are relative measurements that depend on the location of the noise source with respect to the array.
If d12 goes to or becomes zero (0), then the filter H1N(z) collapses to

$$
H_{1N}(z) = \frac{1 - z^{-(d_1 + d_{13})}}{1 - z^{-(d_1 + d_{13})}} = 1 \qquad (d_{12} \to 0)
$$

and the resulting filter is a simple unity response filter, which is extremely simple to model with an adaptive FIR system. For noise sources perpendicular to the array axis, the distance from the noise source to the front vents will be equal and d12 will go to zero. Even for small angles from the perpendicular, d12 will be small and the response will still be close to unity.
Thus, for many noise locations, the H1N(z) filter can be easily modeled using an adaptive FIR algorithm. This is not the case if the two directional microphones do not have a common rear vent. Even for noise sources away from a line perpendicular to the array axis, the H1N(z) filter is still simpler and more easily modeled using an adaptive FIR filter algorithm and improvements in performance have been observed.
A first approximation made in the description above is that B1(z)=B2(z). This approximation means the rear vents are exposed to and have the same response to the same pressure volume. This approximation can be satisfied if the common vented volume is small compared to a wavelength of the sound wave of interest.
A second approximation made in the description above is that d1=d2. This approximation means the rear port delays for each microphone are the same. This is no problem with physical directional microphones, but must be specified for VDMs. These delays are relative; the front ports can also be delayed if desired, as long as the delay is the same for both microphones.
A third approximation made in the description above is that F2(z)≈F1(z)z^(-d12). This approximation means the amplitude responses of the front vents are about the same and the only difference is a delay. For noise sources greater than one (1) meter away, this is a good approximation, as the amplitude of a sound wave varies as 1/r.
For speech, since it is much closer to the microphones (approximately 1 to 10 cm), A is not unity. The closer to the mouth of the user, the more different from unity A becomes. For example, if MIC 1 is located 8 cm away from the mouth and MIC 2 is located 12 cm away from the mouth, then for speech A would be

$$
A = \frac{|F_2(z)|}{|F_1(z)|} = \frac{8}{12} \approx 0.67
$$

This means for speech H1(z) will be

$$
H_{1S}(z) = \frac{1 - B\,z^{-(d_1 + d_{13})}}{A\,z^{-d_{12}} - B\,z^{-(d_1 + d_{13})}}
$$

with the "S" denoting the response for near-field speech and A≠1. This does not reduce to a simple FIR approximation and will be harder for the adaptive FIR algorithm to adapt to. This means that the models for the filters H1N(z) and H1S(z) will be very different, thus reducing devoicing. Of course, if a noise source is located close to the microphone, the response will be similar, which could cause more devoicing. However, unless the noise source is located very near the mouth of the user, a non-unity A and nonzero d12 should be enough to limit devoicing.
As an example, the difference in response is next examined for speech and noise when the noise is located behind the microphones. Let d1=3. For speech, let d12=2, A=0.67, and B=0.82. Then

$$
H_{1S}(z) = \frac{1 - 0.82\,z^{-(3 + d_{13})}}{0.67\,z^{-2} - 0.82\,z^{-(3 + d_{13})}}
$$

which has a very non-FIR response. For noise located directly opposite the speech, d12=-2 and A=B=1. Thus the phase of the noise at F2 is two samples ahead of F1. Then

$$
H_{1N}(z) = \frac{1 - z^{-(3 + d_{13})}}{z^{2} - z^{-(3 + d_{13})}}
$$
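The contrast between the noise and speech filters can be checked numerically by evaluating the common-rear-vent form of H1(z) on the unit circle. This is a sketch under stated assumptions: it takes H1(z) = (1 - B z^-(d1+d13)) / (A z^-d12 - B z^-(d1+d13)), uses d1=3 with the speech values A=0.67, B=0.82, d12=2 from the example, and assumes d13=1, a hypothetical value chosen only so the expression can be evaluated. The function name `h1_response` is also hypothetical.

```python
import numpy as np

def h1_response(omega, A, B, d1, d12, d13):
    """Evaluate H1 on the unit circle, z = exp(j*omega), omega in rad/sample."""
    z = np.exp(1j * omega)
    rear = B * z ** (-(d1 + d13))          # common rear-vent term
    return (1.0 - rear) / (A * z ** (-d12) - rear)

# Frequency grid chosen to avoid points where 1 - z^-4 is exactly zero.
omega = np.linspace(0.1, 3.0, 50)

# Far-field noise perpendicular to the array axis: A = B = 1, d12 = 0.
h_noise = h1_response(omega, A=1.0, B=1.0, d1=3, d12=0, d13=1)

# Near-field speech: A = 0.67, B = 0.82, d12 = 2 (d13 = 1 is an assumption).
h_speech = h1_response(omega, A=0.67, B=0.82, d1=3, d12=2, d13=1)

speech_spread = np.abs(h_speech).max() - np.abs(h_speech).min()
```

Under these assumptions `h_noise` is unity at every frequency on the grid, while the magnitude of `h_speech` varies strongly with frequency, which is consistent with the noise filter being easy to model and the speech filter being hard to model with a short adaptive FIR filter.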
first and the second microphone multiplied by a delay between the first and the second microphones. Further, a pressure of the first rear port is approximately proportional to a pressure of the first front port multiplied by a difference in amplitude of noise between the first and the second microphone multiplied by a delay between the first front port and the common rear port.
Generally, physical microphones of the MA of an embodiment are selected and configured so that a first noise response and a first speech response of the first microphone overlap with a second noise response and a second speech response of the second microphone. This is accomplished by selecting and configuring the microphones such that a first noise response of the first microphone and a second noise response of the second microphone are substantially similar, and a first speech response of the first microphone and a second speech response of the second microphone are substantially dissimilar.
The first microphone and the second microphone of an embodiment are directional microphones. An example MA configuration includes electret directional microphones having a 6 millimeter (mm) diameter, but the embodiment is not so limited. Alternative embodiments can include any type of directional microphone having any number of different sizes and/or configurations. The vent openings for the front of each microphone and the common rear vent volume must be large enough to ensure adequate speech energy at the front and rear of each microphone. A vent opening of approximately 3 mm in diameter has been implemented with good results.
FIG. 3 shows results obtained for a microphone array having a shared-vent configuration, under an embodiment. These experimental results were obtained using the shared-rear-vent configuration described herein using a live subject in a sound room in the presence of complex babble noise. The top plot 302 ("MIC 1 no processing") is the original noisy signal in MIC 1, and the bottom plot 312 ("MIC 1 after PF+SS") is the denoised signal (Pathfinder plus spectral subtraction) (under identical or nearly identical conditions) after adaptive Pathfinder denoising of approximately 8 dB and additional single-channel spectral subtraction of approximately 12 dB. Clearly the technique is adept at removing the unwanted noise from the desired signal.
FIG. 4 is a three-microphone adaptive noise suppression system 400, under an embodiment. The three-microphone system 400 includes the combination of microphone array 410 along with the processing or circuitry components to which the microphone array is coupled (described in detail herein, but not shown in this figure). The microphone array 410 includes three physical omnidirectional microphones in a shared-vent configuration in which the omnidirectional microphones form VDMs. The microphone array 410 of an embodiment comprises physical microphones MIC 1, MIC 2 and MIC 3 (corresponding to omnidirectional microphones O1, O2, and O3), but the embodiment is not so limited.
FIG. 5 is a block diagram of the microphone array 410 in the shared-vent configuration including omnidirectional microphones to form VDMs, under an embodiment. Here, the common "rear vent" is a third omnidirectional microphone situated between the other two microphones. This example embodiment places the first microphone O1 on a first side, and places the second O2 and third O3 microphones on a second side, but the embodiment is not so limited. The relationship between the three microphones is shown only as an example, and the positional relationship between the three microphones can be any num