(12) United States Patent
Avendano

(10) Patent No.: US 8,194,880 B2
(45) Date of Patent: Jun. 5, 2012

(54) SYSTEM AND METHOD FOR UTILIZING OMNI-DIRECTIONAL MICROPHONES FOR SPEECH ENHANCEMENT
`
`
(75) Inventor: Carlos Avendano, Mountain View, CA (US)

(73) Assignee: Audience, Inc., Mountain View, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1384 days.

(21) Appl. No.: 11/699,732

(22) Filed: Jan. 29, 2007

(65) Prior Publication Data
US 2008/0019548 A1, Jan. 24, 2008

Related U.S. Application Data

(63) Continuation-in-part of application No. 11/343,524, filed on Jan. 30, 2006.

(60) Provisional application No. 60/850,928, filed on Oct. 10, 2006.

(51) Int. Cl.
H04R 3/00 (2006.01)

(52) U.S. Cl. ....... 381/92; 381/94.1; 381/94.2; 381/94.3; 381/94.7; 381/122; 704/226; 704/227; 704/233; 704/275

(58) Field of Classification Search .................. 381/313, 381/312, 91, 92, 122, 95, 110, 94.1, 94.2, 381/94.3, 94.7; 704/226, 227, 233, 275
See application file for complete search history.
`
`
`Primary Examiner — Vivian Chin
`Assistant Examiner — Paul Kim
`
`(74) Attorney, Agent, or Firm — Carr & Ferrell LLP
`
(57) ABSTRACT

Systems and methods for utilizing inter-microphone level
differences (ILD) to attenuate noise and enhance speech are
provided. In exemplary embodiments, primary and secondary
acoustic signals are received by omni-directional microphones,
and converted into primary and secondary electric signals. A
differential microphone array module processes the electric
signals to determine a cardioid primary signal and a cardioid
secondary signal. The cardioid signals are filtered through a
frequency analysis module which takes the signals and mimics a
cochlea implementation (i.e., cochlear domain). Energy levels of
the signals are then computed, and the results are processed by
an ILD module using a non-linear combination to obtain the ILD.
In exemplary embodiments, the non-linear combination comprises
dividing the energy level associated with the primary microphone
by the energy level associated with the secondary microphone.
The ILD is utilized by a noise reduction system to enhance the
speech of the primary acoustic signal.
`
28 Claims, 9 Drawing Sheets

Amazon v. Jawbone
U.S. Patent 8,321,213
Amazon Ex. 1005
`

`

`
`

`

FIG. 1a and FIG. 1b (Sheet 1 of 9): each panel shows an audio source 102, an audio device 104 with a primary microphone 106 and a secondary microphone 108, and noise 110.
`
`
`

`

FIG. 2 (Sheet 2 of 9): block diagram of the audio device 104, showing a processor, the primary microphone 106, the secondary microphone 108, an audio processing engine 204, and an output device 206.
`
`
`
`
`

`

FIG. 3 (Sheet 3 of 9): block diagram of the audio processing engine, showing the DMA, frequency analysis, energy, and ILD modules, the noise reduction system, masking and frequency synthesis stages, and the path to the output device.
`
`
`
`

`

FIG. 4a (Sheet 4 of 9): exemplary implementation of the DMA module 302, frequency analysis module, energy module 306, and ILD module 308, feeding the noise reduction system 310.
`
`

`

FIG. 4b (Sheet 5 of 9): exemplary implementation of the DMA module, built from delay lines and filter blocks.
`
`

`

FIG. 5 (Sheet 6 of 9): block diagram of an alternative embodiment, showing the DMA module 302, energy module 306, ILD module 308, and noise reduction system 310.
`
`
`
`
`

`

FIG. 6 (Sheet 7 of 9): polar plot of a front-to-back cardioid directivity pattern and ILD diagram.
`
`

`

FIG. 7 (Sheet 8 of 9): flowchart 700 of the method.

702 Receive audio signals
704 Perform differential array analysis
706 Perform frequency analysis
708 Compute energy
710 Compute inter-microphone level difference
712 Perform noise reduction processing
Output audio signal
End
`
`

`

FIG. 8 (Sheet 9 of 9): flowchart of the noise reduction process, beginning with Estimate Noise (802) and Estimate Filter (804) and continuing through steps 806, 808, and 810.
`
`

`

SYSTEM AND METHOD FOR UTILIZING
OMNI-DIRECTIONAL MICROPHONES FOR
SPEECH ENHANCEMENT

CROSS-REFERENCE TO RELATED
APPLICATION

The present application claims the priority benefit of U.S.
Provisional Patent Application No. 60/850,928, filed Oct. 10,
2006, and entitled “Array Processing Technique for Producing
Long-Range ILD Cues with Omni-Directional Microphone Pair;” the
present application is also a continuation-in-part of U.S. patent
application Ser. No. 11/343,524, filed Jan. 30, 2006 and entitled
“System and Method for Utilizing Inter-Microphone Level
Differences for Speech Enhancement,” which claims the priority
benefit of U.S. Provisional Patent Application No. 60/756,826,
filed Jan. 5, 2006, and entitled “Inter-Microphone Level
Difference Suppresor,” all of which are herein incorporated by
reference.

BACKGROUND OF THE INVENTION

1. Field of Invention

The present invention relates generally to audio processing and
more particularly to speech enhancement using inter-microphone
level differences.
2. Description of Related Art

Currently, there are many methods for reducing background noise
and enhancing speech in an adverse environment. One such method
is to use two or more microphones on an audio device. These
microphones are in prescribed positions and allow the audio
device to determine a level difference between the microphone
signals. For example, due to a space difference between the
microphones, the difference in times of arrival of the signals
from a speech source to the microphones may be utilized to
localize the speech source. Once localized, the signals can be
spatially filtered to suppress the noise originating from the
different directions.

In order to take advantage of the level difference between two
omni-directional microphones, a speech source needs to be closer
to one of the microphones. That is, in order to obtain a
significant level difference, a distance from the source to a
first microphone needs to be shorter than a distance from the
source to a second microphone. As such, a speech source must
remain in relative closeness to the microphones, especially if
the microphones are in close proximity as may be required by
mobile telephony applications.
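
The distance constraint described above can be made concrete with a small numerical sketch. It assumes an idealized free-field point source whose sound pressure falls off as 1/distance; that model, the function name `level_difference_db`, and the example distances are assumptions introduced here for illustration and are not taken from this document.

```python
import math


def level_difference_db(d_primary_m: float, d_secondary_m: float) -> float:
    """Level difference in dB between two omni-directional microphones.

    Assumption (not from the source text): an ideal point source in a free
    field, so pressure amplitude is proportional to 1 / distance.
    """
    if d_primary_m <= 0 or d_secondary_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(d_secondary_m / d_primary_m)


# Hypothetical geometry: the source is 2 cm closer to the primary microphone.
near = level_difference_db(0.05, 0.07)  # source 5 cm from the primary microphone
far = level_difference_db(0.50, 0.52)   # source 50 cm from the primary microphone
print(f"near source: {near:.2f} dB, far source: {far:.2f} dB")
```

Under these assumptions the same 2 cm path difference yields roughly 2.9 dB for the near source but only about 0.3 dB for the far one, which is why a pair of closely spaced omni-directional microphones gives a usable level difference only for nearby sources.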
A solution to the distance constraint may be obtained by using
directional microphones. Using directional microphones allows a
user to extend an effective level difference between the two
microphones over a larger range with a narrow inter-level
difference (ILD) beam. This may be desirable for applications
such as push-to-talk (PTT) or videophones, where a speech source
is not in as close proximity to the microphones as in, for
example, a telephone application.

Disadvantageously, directional microphones have numerous physical
drawbacks. Typically, directional microphones are large in size
and do not fit well in small telephones or cellular phones.
Additionally, directional microphones are difficult to mount as
they require ports in order for sounds to arrive from a plurality
of directions. Slight variations in manufacturing may result in a
mismatch, leading to higher manufacturing and production costs.

Therefore, it is desirable to utilize the characteristics of
directional microphones in a speech enhancement system, without
the disadvantages of using directional microphones themselves.
SUMMARY OF THE INVENTION

Embodiments of the present invention overcome or substantially
alleviate prior problems associated with noise suppression and
speech enhancement. In general, systems and methods for utilizing
inter-microphone level differences (ILD) to attenuate noise and
enhance speech are provided. In exemplary embodiments, the ILD is
based on energy level differences of a pair of omni-directional
microphones.

Exemplary embodiments of the present invention use a non-linear
process to combine components of the acoustic signals from the
pair of omni-directional microphones in order to obtain the ILD.
In exemplary embodiments, a primary acoustic signal is received
by a primary microphone, and a secondary acoustic signal is
received by a secondary microphone (e.g., omni-directional
microphones). The primary and secondary acoustic signals are
converted into primary and secondary electric signals for
processing.

A differential microphone array (DMA) module processes the
primary and secondary electric signals to determine a cardioid
primary signal and a cardioid secondary signal. In exemplary
embodiments, the primary and secondary electric signals are
delayed by a delay node. The cardioid primary signal is then
determined by taking a difference between the primary electric
signal and the delayed secondary electric signal, while the
cardioid secondary signal is determined by taking a difference
between the secondary electric signal and the delayed primary
electric signal. In various embodiments the delayed primary
electric signal and the delayed secondary electric signal are
adjusted by a gain. The gain may be a ratio between a magnitude
of the primary acoustic signal and a magnitude of the secondary
acoustic signal.
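
The delay-and-subtract step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in this document: it assumes sampled signals, an integer-sample delay, and a single scalar gain applied to both delayed signals. The function name `cardioid_pair`, its parameters, and the test signal are hypothetical.

```python
import numpy as np


def cardioid_pair(x1, x2, delay_samples=1, gain=1.0):
    """Return (c1, c2), the cardioid primary and secondary signals.

    c1[n] = x1[n] - gain * x2[n - delay]
    c2[n] = x2[n] - gain * x1[n - delay]
    Assumptions: integer delay, one scalar gain, zero initial state.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)

    def delay(x):
        # Shift right by delay_samples, padding the start with zeros.
        return np.concatenate([np.zeros(delay_samples), x[:len(x) - delay_samples]])

    c1 = x1 - gain * delay(x2)
    c2 = x2 - gain * delay(x1)
    return c1, c2


# Hypothetical source on the primary side: the sound reaches the secondary
# microphone exactly one sample after the primary microphone.
rng = np.random.default_rng(0)
s = rng.standard_normal(1000)
x1 = s
x2 = np.concatenate([[0.0], s[:-1]])

c1, c2 = cardioid_pair(x1, x2, delay_samples=1)
print(float(np.sum(c1 ** 2)), float(np.sum(c2 ** 2)))
```

In this idealized case the secondary cardioid cancels the source exactly while the primary cardioid retains it, so the two outputs differ strongly in energy even though the two omni-directional inputs have identical levels.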
The cardioid signals are filtered through a frequency analysis
module which takes the signals and mimics the frequency analysis
of the cochlea (i.e., cochlear domain) simulated in this
embodiment by a filter bank. Alternatively, other filters such as
short-time Fourier transform (STFT), sub-band filter banks,
modulated complex lapped transforms, cochlear models, wavelets,
etc. can be used for the frequency analysis and synthesis. Energy
levels associated with the cardioid primary signal and the
cardioid secondary signals are then computed (e.g., as power
estimates) and the results are processed by an ILD module using a
non-linear combination to obtain the ILD. In exemplary
embodiments, the non-linear combination comprises dividing the
power estimate associated with the cardioid primary signal by the
power estimate associated with the cardioid secondary signal. The
ILD may then be used as a spatial discrimination cue in a noise
reduction system to suppress unwanted sound sources and enhance
the speech.
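
The energy and ILD computation described above can be sketched as follows. This is only an illustration under assumptions: it uses a plain short-time Fourier transform, which the text names as an alternative, in place of the cochlear filter bank. The function names `band_power` and `ild`, the frame length, the hop size, and the small constant `eps` that guards against division by zero are all hypothetical.

```python
import numpy as np


def band_power(x, frame_len=256, hop=128):
    """Per-frame, per-bin power estimate |X|^2 from a Hann-windowed STFT."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def ild(c1, c2, eps=1e-12):
    """Non-linear combination: primary power divided by secondary power."""
    return band_power(c1) / (band_power(c2) + eps)


# Hypothetical input: the primary cardioid has twice the amplitude of the
# secondary cardioid, so the power ratio should be about 4 in every bin.
rng = np.random.default_rng(0)
c2 = rng.standard_normal(2048)
c1 = 2.0 * c2

ratio = ild(c1, c2)
ild_db = 10.0 * np.log10(ratio + 1e-12)
print(ratio.shape, float(np.median(ratio)), float(np.median(ild_db)))
```

A noise reduction stage could then treat time-frequency cells with a high ratio as dominated by the source on the primary side; how the ratio is thresholded or smoothed is outside this sketch.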
`
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1a and FIG. 1b are diagrams of two environments in which
embodiments of the present invention may be practiced.
FIG. 2 is a block diagram of an exemplary audio device
implementing embodiments of the present invention.
FIG. 3 is a block diagram of an exemplary audio processing
engine.
FIG. 4a illustrates an exemplary implementation of the DMA
module, frequency analysis module, energy module, and the ILD
module.
FIG. 4b is an exemplary implementation of the DMA module.
FIG. 5 is a block diagram of an alternative embodiment of the
present invention.
FIG. 6 is a polar plot of a front-to-back cardioid directivity
pattern and ILD diagram produced according to embodiments of the
present invention.
FIG. 7 is a flowchart of an exemplary method for utilizing ILD of
omni-directional microphones for speech enhancement.
FIG. 8 is a flowchart of an exemplary noise reduction process.
`
`DESCRIPTION OF EXEMPLARY
`EMBODIMENTS
`
`The present invention provides exemplary systems and
`methods for utilizing inter-microphone level differences
`(ILD) of at least two microphones to identify frequency
`regions dominated by speech in order to enhance speech and
`attenuate background noise and far-field distracters. Embodiments
`of the present invention may be practiced on any audio
`device that is configured to receive sound such as, but not
`limited to, cellular phones, phone handsets, headsets, and
`conferencing systems. Advantageously, exemplary embodiments
`are configured to provide improved noise suppression
`on small devices and in applications where the main audio
`source is far from the device. While some embodiments of the
`present invention will be described in reference to operation
`on a cellular phone, the present invention may be practiced on
`any audio device.
`Referring to FIG. 1a and FIG. 1b, environments in which
`embodiments of the present invention may be practiced are
`shown. A user provides an audio (speech) source 102 to an
`audio device 104. The exemplary audio device 104 comprises
`two microphones: a primary microphone 106 relative to the
`audio source 102 and a secondary microphone 108 located a
`distance, d, away from the primary microphone 106. In exemplary
`embodiments, the microphones 106 and 108 are omni-directional
`microphones.
`While the microphones 106 and 108 receive sound (i.e.,
`acoustic signals) from the audio source 102, the microphones
`106 and 108 also pick up noise 110. Although the noise 110 is
`shown coming from a single location in FIG. 1a and FIG. 1b,
`the noise 110 may comprise any sounds from one or more
`locations different than the audio source 102, and may
`include reverberations and echoes.
`Embodiments of the present invention exploit level differences
`(e.g., energy differences) between the acoustic signals
`received by the two microphones 106 and 108 independent of
`how the level differences are obtained. In FIG. 1a, because the
`primary microphone 106 is much closer to the audio source
`102 than the secondary microphone 108, the intensity level is
`higher for the primary microphone 106 resulting in a larger
`energy level during a speech/voice segment, for example. In
`FIG. 1b, because directional response of the primary mi
