`
`US008194880B2
`
`(12) United States Patent
`Avendano
`
`(10) Patent No.: US 8,194,880 B2
`(45) Date of Patent: Jun. 5, 2012
`
`
`Primary Examiner - Vivian Chin
`Assistant Examiner - Paul Kim
`(74) Attorney, Agent, or Firm - Carr & Ferrell LLP
`
`(57) ABSTRACT
`
`Systems and methods for utilizing inter-microphone level
`differences (ILD) to attenuate noise and enhance speech are
`provided. In exemplary embodiments, primary and secondary
`acoustic signals are received by omni-directional microphones,
`and converted into primary and secondary electric signals. A
`differential microphone array module processes the electric
`signals to determine a cardioid primary signal and a cardioid
`secondary signal. The cardioid signals are filtered through a
`frequency analysis module which takes the signals and mimics a
`cochlea implementation (i.e., cochlear domain). Energy levels
`of the signals are then computed, and the results are processed
`by an ILD module using a non-linear combination to obtain the
`ILD. In exemplary embodiments, the non-linear combination
`comprises dividing the energy level associated with the primary
`microphone by the energy level associated with the secondary
`microphone. The ILD is utilized by a noise reduction system to
`enhance the speech of the primary acoustic signal.
`
`28 Claims, 9 Drawing Sheets
`
`(54) SYSTEM AND METHOD FOR UTILIZING
`OMNI-DIRECTIONAL MICROPHONES FOR
`SPEECH ENHANCEMENT
`(75) Inventor: Carlos Avendano, Mountain View, CA (US)
`(73) Assignee: Audience, Inc., Mountain View, CA (US)
`( * ) Notice: Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 1384 days.
`(21) Appl. No.: 11/699,732
`(22) Filed: Jan. 29, 2007
`(65) Prior Publication Data
`US 2008/0019548 A1, Jan. 24, 2008
`Related U.S. Application Data
`(63) Continuation-in-part of application No. 11/343,524,
`filed on Jan. 30, 2006.
`(60) Provisional application No. 60/850,928, filed on Oct.
`10, 2006.
`(51) Int. Cl.
`H04R 3/00 (2006.01)
`(52) U.S. Cl. ....... 381/92; 381/94.1; 381/94.2; 381/94.3;
`381/94.7; 381/122; 704/226; 704/227; 704/233;
`704/275
`(58) Field of Classification Search .................. 381/313,
`381/312, 91, 92, 122, 95, 110, 94.1, 94.2,
`381/94.3, 94.7; 704/226, 227, 233, 275
`See application file for complete search history.
`
`
`[Representative drawing. Labels: Audio Source 102; 104; Primary Microphone 106; Secondary Microphone 108]
`
`
`
`
`
`
`
`[Sheet 1 of 9: FIG. 1a and FIG. 1b. Drawing labels are not recoverable from the extracted text.]
`
`
`
`[Sheet 2 of 9: FIG. 2. Drawing labels are not recoverable from the extracted text.]
`
`
`
`[Sheet 3 of 9: FIG. 3. Labels: Audio Processing Engine 204; DMA Module 302; Freq. Analysis Module 304; Masking Module 318; Frequency Synthesis Module 320; Noise Reduction System 310.]
`
`
`
`[Sheet 4 of 9: FIG. 4a. Labels: DMA Module 302 (x1, x2, delay elements z^-T1 and z^-T2, gain g, reference numerals 402, 404, 406, outputs cf and cb); Frequency Analysis Modules 304; Energy Module 306 (squared-magnitude blocks); ILD Module 308 (output: ILD); Noise Reduction System 310.]
`
`
`
`[Sheet 5 of 9: FIG. 4b. Legend: two IIR filters, 10th order (symbols illegible); two delay lines, L = 10 (symbols illegible); F(z) = FIR filter, L = 129 (128th order); D(z) = delay line, L = 64. Other labels: 8 kHz; 16 kHz; reference numerals 414, 420, 422, 424, 426.]
`
`
`
`[Sheet 6 of 9: FIG. 5. Labels: x1; DMA Module 302 (two complex-exponential factors with illegible exponents; annotation "Per tap"); 304 (two instances); outputs cf and cb; Energy Module 306 (squared-magnitude blocks); ILD Module 308 (output: ILD Cue); Noise Reduction System 310.]
`
`
`
`[Sheet 7 of 9: FIG. 6. Drawing content is not recoverable from the extracted text.]
`
`
`
`[Sheet 8 of 9: FIG. 7, flowchart 700]
`Start
`702: Receive Audio Signals
`704: Perform Differential Array Analysis
`706: Perform Frequency Analysis
`708: Compute Energy
`710: Compute Inter-Microphone Level Difference
`712: Perform Noise Reduction Processing
`714: Output Audio Signal
`End
`
`
`
`[Sheet 9 of 9: FIG. 8, flowchart of step 712]
`Start
`802: Estimate Noise
`804: Estimate Filter
`806: Smooth Filter
`808: Apply Filter
`810: Perform Frequency Synthesis
`End
`
`
`
`SYSTEM AND METHOD FOR UTILIZING
`OMNI-DIRECTIONAL MICROPHONES FOR
`SPEECH ENHANCEMENT
`
`CROSS-REFERENCE TO RELATED
`APPLICATION
`
`The present application claims the priority benefit of U.S.
`Provisional Patent Application No. 60/850,928, filed Oct. 10,
`2006, and entitled "Array Processing Technique for Producing
`Long-Range ILD Cues with Omni-Directional Microphone
`Pair;" the present application is also a continuation-in-part
`of U.S. patent application Ser. No. 11/343,524, filed Jan. 30,
`2006 and entitled "System and Method for Utilizing
`Inter-Microphone Level Differences for Speech Enhancement,"
`which claims the priority benefit of U.S. Provisional Patent
`Application No. 60/756,826, filed Jan. 5, 2006, and entitled
`"Inter-Microphone Level Difference Suppresor," all of which
`are herein incorporated by reference.
`
`SUMMARY OF THE INVENTION
`
`
`Embodiments of the present invention overcome or substantially
`alleviate prior problems associated with noise suppression and
`speech enhancement. In general, systems and methods for
`utilizing inter-microphone level differences (ILD) to attenuate
`noise and enhance speech are provided. In exemplary embodiments,
`the ILD is based on energy level differences of a pair of
`omni-directional microphones.
`Exemplary embodiments of the present invention use a non-linear
`process to combine components of the acoustic signals from the
`pair of omni-directional microphones in order to obtain the ILD.
`In exemplary embodiments, a primary acoustic signal is received
`by a primary microphone, and a secondary acoustic signal is
`received by a secondary microphone (e.g., omni-directional
`microphones). The primary and secondary acoustic signals are
`converted into primary and secondary electric signals for
`processing.
`A differential microphone array (DMA) module processes the
`primary and secondary electric signals to determine a cardioid
`primary signal and a cardioid secondary signal. In exemplary
`embodiments, the primary and secondary electric signals are
`delayed by a delay node. The cardioid primary signal is then
`determined by taking a difference between the primary electric
`signal and the delayed secondary electric signal, while the
`cardioid secondary signal is determined by taking a difference
`between the secondary electric signal and the delayed primary
`electric signal. In various embodiments, the delayed primary
`electric signal and the delayed secondary electric signal are
`adjusted by a gain. The gain may be a ratio between a magnitude
`of the primary acoustic signal and a magnitude of the secondary
`acoustic signal.
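The delay-and-subtract step described above can be sketched in Python. This is only an illustration under stated assumptions, not the patented implementation: delays are whole samples, one shared gain is applied to both delayed signals, and the helper names `delay` and `cardioid_pair` are hypothetical.

```python
def delay(signal, n_samples):
    """Delay a sequence by a whole number of samples, zero-padding the start."""
    if n_samples <= 0:
        return list(signal)
    return [0.0] * n_samples + list(signal[:len(signal) - n_samples])


def cardioid_pair(primary, secondary, delay_samples=1, gain=1.0):
    """Form the two cardioid signals by delay-and-subtract.

    cardioid primary   c_f = x1 - gain * delayed(x2)
    cardioid secondary c_b = x2 - gain * delayed(x1)

    Assumption: a single gain and a single integer delay are shared by both
    branches; a real design could use fractional delays and separate gains.
    """
    delayed_primary = delay(primary, delay_samples)
    delayed_secondary = delay(secondary, delay_samples)
    c_f = [p - gain * ds for p, ds in zip(primary, delayed_secondary)]
    c_b = [s - gain * dp for s, dp in zip(secondary, delayed_primary)]
    return c_f, c_b
```

With these assumptions, a wave that reaches the primary microphone exactly `delay_samples` before the secondary microphone cancels in `c_b` while remaining present in `c_f`, which is the directional behavior the cardioid pair is meant to provide.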
`The cardioid signals are filtered through a frequency analysis
`module which takes the signals and mimics the frequency analysis
`of the cochlea (i.e., cochlear domain), simulated in this
`embodiment by a filter bank. Alternatively, other filters such
`as short-time Fourier transform (STFT), sub-band filter banks,
`modulated complex lapped transforms, cochlear models, wavelets,
`etc. can be used for the frequency analysis and synthesis.
`Energy levels associated with the cardioid primary signal and
`the cardioid secondary signal are then computed (e.g., as power
`estimates) and the results are processed by an ILD module using
`a non-linear combination to obtain the ILD. In exemplary
`embodiments, the non-linear combination comprises dividing the
`power estimate associated with the cardioid primary signal by
`the power estimate associated with the cardioid secondary
`signal. The ILD may then be used as a spatial discrimination
`cue in a noise reduction system to suppress unwanted sound
`sources and enhance the speech.
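The energy and ILD steps can be sketched as follows. The sketch makes several assumptions that are not in the source: it handles one frame of one frequency band at a time, takes the power estimate to be the mean squared amplitude, adds a small `eps` to avoid division by zero, and uses the hypothetical names `power_estimate`, `ild`, and `ild_db`. The decibel conversion is a convenience for reading the result, not something the text specifies.

```python
import math


def power_estimate(samples):
    """Mean squared amplitude of one frame (an assumed form of power estimate)."""
    return sum(s * s for s in samples) / len(samples)


def ild(cardioid_primary, cardioid_secondary, eps=1e-12):
    """ILD as the ratio of primary to secondary power estimates.

    eps is an assumption added here so that a silent secondary frame
    does not cause a division by zero.
    """
    e_primary = power_estimate(cardioid_primary)
    e_secondary = power_estimate(cardioid_secondary)
    return e_primary / (e_secondary + eps)


def ild_db(cardioid_primary, cardioid_secondary, eps=1e-12):
    """The same ratio expressed in decibels (power ratio, so 10 * log10)."""
    return 10.0 * math.log10(ild(cardioid_primary, cardioid_secondary, eps) + eps)
```

In a multi-band system the same ratio would be evaluated per band and per frame, so that a noise reduction stage can treat bands with a high ratio as dominated by the near source.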
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1a and FIG. 1b are diagrams of two environments in which
`embodiments of the present invention may be practiced.
`FIG. 2 is a block diagram of an exemplary audio device
`implementing embodiments of the present invention.
`FIG. 3 is a block diagram of an exemplary audio processing
`engine.
`FIG. 4a illustrates an exemplary implementation of the DMA
`module, frequency analysis module, energy module, and the ILD
`module.
`FIG. 4b is an exemplary implementation of the DMA module.
`FIG. 5 is a block diagram of an alternative embodiment of the
`present invention.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of Invention
`The present invention relates generally to audio processing
`and more particularly to speech enhancement using
`inter-microphone level differences.
`2. Description of Related Art
`Currently, there are many methods for reducing back-
`ground noise and enhancing speech in an adverse environ-
`ment. One such method is to use two or more microphones on
`an audio device. These microphones are in prescribed posi-
`tions and allow the audio device to determine a level differ-
`ence between the microphone signals. For example, due to a
`space difference between the microphones, the difference in
`times of arrival of the signals from a speech source to the
`microphones may be utilized to localize the speech source.
`Once localized, the signals can be spatially filtered to sup-
`press the noise originating from the different directions.
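The time-of-arrival idea mentioned above can be illustrated with a simple far-field model. Everything specific here is an assumption made for illustration: a plane wave, a speed of sound of 343 m/s, a two-microphone array, and the hypothetical function names `tdoa_seconds` and `angle_from_tdoa_deg`.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly room temperature


def tdoa_seconds(mic_spacing_m, angle_deg):
    """Arrival-time difference for a far-field source.

    angle_deg is measured from the axis through the two microphones,
    so 0 degrees is end-fire and 90 degrees is broadside.
    """
    return mic_spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S


def angle_from_tdoa_deg(mic_spacing_m, tdoa_s):
    """Invert the model above to recover the arrival angle in degrees."""
    ratio = tdoa_s * SPEED_OF_SOUND_M_S / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding before acos
    return math.degrees(math.acos(ratio))
```

With two microphones this recovers only the angle relative to the microphone axis, which is enough to steer a spatial filter toward or away from that direction.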
`In order to take advantage of the level difference between
`two omni-directional microphones, a speech source needs to
`be closer to one of the microphones. That is, in order to obtain
`a significant level difference, a distance from the source to a
`first microphone needs to be shorter than a distance from the
`source to a second microphone. As such, a speech source must
`remain in relative closeness to the microphones, especially if
`the microphones are in close proximity as may be required by
`mobile telephony applications.
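A rough sense of this distance constraint can be had from an idealized model. The sketch below assumes a point source in a free field, where sound pressure falls off as 1/distance, and a source lying on the axis through both microphones; the function names and the example distances are hypothetical and are not taken from the source.

```python
import math


def level_difference_db(d_near_m, d_far_m):
    """Level difference between two distances under the 1/distance pressure law."""
    return 20.0 * math.log10(d_far_m / d_near_m)


def mic_level_difference_db(source_distance_m, mic_spacing_m):
    """Level difference between two microphones for an on-axis source.

    source_distance_m is the distance from the source to the nearer
    microphone; the farther microphone is mic_spacing_m beyond it.
    """
    return level_difference_db(source_distance_m, source_distance_m + mic_spacing_m)
```

Under these assumptions, microphones 2 cm apart see a difference of about 6 dB when the source is 2 cm from the nearer microphone, but only about 0.3 dB when it is 50 cm away, which is why the level difference of closely spaced omni-directional microphones is useful only for nearby sources.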
`A solution to the distance constraint may be obtained by
`using directional microphones. Using directional micro-
`phones allows a user to extend an effective level difference
`between the two microphones over a larger range with a
`narrow inter-level difference (ILD) beam. This may be desir-
`able for applications such as push-to-talk (PTT) or video-
`phones where a speech source is not in as close a proximity to
`the microphones, as for example, a telephone application.
`Disadvantageously, directional microphones have numer-
`ous physical drawbacks. Typically, directional microphones
`are large in size and do not fit well in small telephones or
`cellular phones. Additionally, directional microphones are
`difficult to mount as they require ports in order for sounds to
`arrive from a plurality of directions. Slight variations in
`manufacturing may result in a misma