PTO/SB/05 (08-03)
Approved for use through 07/31/2006. OMB 0651-0032
U.S. Patent and Trademark Office. U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number.

UTILITY PATENT APPLICATION TRANSMITTAL
(Only for new nonprovisional applications under 37 CFR 1.53(b))

Attorney Docket No.: ALPH.P010X
First Inventor: Gregory C. Burnett
Title: VAD-based Multiple-Microphone Acoustic Noise Suppression
Express Mail Label No.: EV 326 938 875 US

ADDRESS TO: Mail Stop Patent Application, Commissioner for Patents, P.O. Box 1450, Alexandria VA 22313-1450

APPLICATION ELEMENTS
See MPEP chapter 600 concerning utility patent application contents.

1. Fee Transmittal Form (e.g., PTO/SB/17)
   (Submit an original and a duplicate for fee processing)
2. Applicant claims small entity status. See 37 CFR 1.27.
3. Specification [Total Pages 34]
   (preferred arrangement set forth below)
   - Descriptive title of the invention
   - Cross Reference to Related Applications
   - Statement Regarding Fed sponsored R & D
   - Reference to sequence listing, a table, or a computer program listing appendix
   - Background of the Invention
   - Brief Summary of the Invention
   - Brief Description of the Drawings (if filed)
   - Detailed Description
   - Claim(s)
   - Abstract of the Disclosure
4. Drawing(s) (35 U.S.C. 113) [Total Sheets 42]
5. Oath or Declaration [Total Sheets ]
   a. Newly executed (original or copy)
   b. Copy from a prior application (37 CFR 1.63(d))
      (for continuation/divisional with Box 18 completed)
      i. DELETION OF INVENTOR(S): Signed statement attached deleting inventor(s) named in the prior application, see 37 CFR 1.63(d)(2) and 1.33(b).
6. Application Data Sheet. See 37 CFR 1.76
7. CD-ROM or CD-R in duplicate, large table or Computer Program (Appendix)
8. Nucleotide and/or Amino Acid Sequence Submission (if applicable, all necessary)
   a. Computer Readable Form (CRF)
   b. Specification Sequence Listing on:
      i. CD-ROM or CD-R (2 copies); or
      ii. Paper
   c. Statements verifying identity of above copies

ACCOMPANYING APPLICATION PARTS

9. Assignment Papers (cover sheet & document(s))
10. 37 CFR 3.73(b) Statement (when there is an assignee); Power of Attorney
11. English Translation Document (if applicable)
12. Information Disclosure Statement (IDS)/PTO-1449; Copies of IDS Citations
13. Preliminary Amendment
14. Return Receipt Postcard (MPEP 503) (Should be specifically itemized)
15. Certified Copy of Priority Document(s) (if foreign priority is claimed)
16. Nonpublication Request under 35 U.S.C. 122(b)(2)(B)(i). Applicant must attach form PTO/SB/35 or its equivalent.
17. Other: ........

18. If a CONTINUING APPLICATION, check appropriate box, and supply the requisite information below and in the first sentence of the specification following the title, or in an Application Data Sheet under 37 CFR 1.76:
    Continuation / Divisional / Continuation-in-part (CIP) of prior application No.: 09/905,361
    Prior application information: Examiner Tony Jacobson; Art Unit: 2644
    For CONTINUATION OR DIVISIONAL APPS only: The entire disclosure of the prior application, from which an oath or declaration is supplied under Box 5b, is considered a part of the disclosure of the accompanying continuation or divisional application and is hereby incorporated by reference. The incorporation can only be relied upon when a portion has been inadvertently omitted from the submitted application parts.

19. CORRESPONDENCE ADDRESS
    Customer Number: OR Correspondence address below
    Name: Shemwell Gregory & Courtney LLP
    Address: 4880 Stevens Creek Boulevard, Suite 201
    Zip Code: 95129
    Name (Print/Type): [illegible]    Registration No. (Attorney/Agent): 42,607

This collection of information is required by 37 CFR 1.53(b). The information is required to obtain or retain a benefit by the public which is to file (and by the USPTO to process) an application. Confidentiality is governed by 35 U.S.C. 122 and 37 CFR 1.14. This collection is estimated to take 12 minutes to complete, including gathering, preparing, and submitting the completed application form to the USPTO. Time will vary depending upon the individual case. Any comments on the amount of time you require to complete this form and/or suggestions for reducing this burden, should be sent to the Chief Information Officer, U.S. Patent and Trademark Office, U.S. Department of Commerce, P.O. Box 1450, Alexandria, VA 22313-1450. DO NOT SEND FEES OR COMPLETED FORMS TO THIS ADDRESS. SEND TO: Mail Stop Patent Application, Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450.
If you need assistance in completing the form, call 1-800-PTO-9199 and select option 2.

Amazon v. Jawbone
U.S. Patent 8,019,091
Amazon Ex. 1013

`EXPRESS MAIL CERTIFICATE OF MAILING
`
“Express Mail” mailing label number: EV 326 938 875 US
Date of Deposit: September 18, 2003
I hereby certify that I am causing the paper(s) and/or fee(s) indicated below to be deposited with the United States Postal Service “Express Mail Post Office to Addressee” service on the date indicated above and that the paper(s) and/or fee(s) have been addressed to Mail Stop Patent Application, Commissioner for Patents, PO Box 1450, Alexandria, VA 22313-1450.

Richard L. Gregory, Jr.
(Typed or printed name of person mailing paper(s) or fee(s))

(Signature of person mailing paper or fee)

9-18-2003
(Date signed)
`
Filing/Issue Date: Herewith
Serial/Patent No.:
Title: VAD-BASED MULTIPLE-MICROPHONE ACOUSTIC NOISE SUPPRESSION
Atty. Docket No.: ALPH.P010X    Date Mailed: September 18, 2003

The following has been received in the U.S. Patent & Trademark Office on the date stamped hereon:

[ ] Amendment/Response ( pgs.)
[ ] Preliminary Amendment ( pgs.)
[x] Information Disclosure Statement & PTO/SB/08A
[x] Application - Utility (34 pgs.)
[ ] Application - Rule 1.53(b) Contin. ( pgs.)
[ ] Application - Rule 1.53(b) Divis. ( pgs.)
[ ] Application - Rule 1.53(b) CIP ( pgs.)
[ ] Application - Rule 1.53(d) CPA ( pgs.)
[ ] Application - PCT ( pgs.)
[ ] Application - Provisional ( pgs.)
[x] Utility Patent Application Transmittal
[x] Drawings (12 sheets)
[ ] Fee Transmittal (in duplicate)
[ ] Declaration ( pgs.)
[x] Itemized Postcard
[ ] Assignment & Cover Sheet ( pgs.)
[x] Express Mail Certificate Of Mailing
[ ] Power of Attorney ( pgs.)
[x] Express Mail No. EV 326 938 875 US
[ ] Nonpublication Request (35 USC 122(b))
[ ] Petition for Extension of Time ( month(s))
[ ] Issue Fee Transmittal
[ ] Submission of Formal Drawings
[ ] Notice of Appeal
[ ] Appeal Brief ( pgs. in triplicate)
[ ] Reply Brief
[ ] Response to Notice of Missing Parts
[ ] Check No.      Amt
Other: Copies of twenty-five (25) cited references.
`
`
`
Attorney Docket No. ALPH.P010X
`
UNITED STATES PATENT APPLICATION
`
`for
`
`Voice Activity Detector (VAD) -Based Multiple-Microphone Acoustic Noise Suppression
`
`Inventors:
`
`Gregory C. Burnett
`
`Eric F. Breitfeller
`
`Prepared by
`
`Shemwell Gregory & Courtney LLP
`4880 Stevens Creek Blvd., Suite 201
`San Jose, CA 95129
`408-236-6647
`
Attorney Docket No. ALPH.P010X
`
`EXPRESS MAIL CERTIFICATE OF MAILING
`
`“Express Mail” mailing label number: EV 326 938 875 US
Date of Deposit: September 18, 2003
`I hereby certify that this paper is being deposited with the United States Postal
`Service “Express Mail Post Office to Addressee” service under 37 CFR §1.10 on the date
`indicated above and is addressed to Mail Stop Patent Application, Commissioner for
`Patents, PO Box 1450, Alexandria, VA 22313-1450.
`
`
`
`
Voice Activity Detector (VAD)-Based Multiple-Microphone Acoustic Noise Suppression
`
`RELATED APPLICATIONS
This patent application is a continuation-in-part of United States Patent Application Number 09/905,361, filed July 12, 2001, which claims priority from United States Patent Application Number 60/219,297, filed July 19, 2000. This patent application also claims priority from United States Patent Application Number 10/383,162, filed March 5, 2003.
`
`FIELD OF THE INVENTION
`
The disclosed embodiments relate to systems and methods for detecting and processing a desired signal in the presence of acoustic noise.
`
`BACKGROUND
`
Many noise suppression algorithms and techniques have been developed over the years. Most of the noise suppression systems in use today for speech communication systems are based on a single-microphone spectral subtraction technique first developed in the 1970s and described, for example, by S. F. Boll in “Suppression of Acoustic Noise in Speech using Spectral Subtraction,” IEEE Trans. on ASSP, pp. 113-120, 1979. These techniques have been refined over the years, but the basic principles of operation have remained the same. See, for example, United States Patent Number 5,687,243 of McLaughlin, et al., and United States Patent Number 4,811,404 of Vilmur, et al. Generally, these techniques make use of a microphone-based Voice Activity Detector (VAD) to determine the background noise characteristics, where “voice” is generally understood to include human voiced speech, unvoiced speech, or a combination of voiced and unvoiced speech.
The VAD has also been used in digital cellular systems. As an example of such a use, see United States Patent Number 6,453,291 of Ashley, where a VAD configuration appropriate to the front-end of a digital cellular system is described. Further, some Code Division Multiple Access (CDMA) systems utilize a VAD to minimize the effective radio spectrum used, thereby allowing for more system capacity. Also, Global System for Mobile Communication (GSM) systems can include a VAD to reduce co-channel interference and to reduce battery consumption on the client or subscriber device.
These typical microphone-based VAD systems are significantly limited in capability as a result of the addition of environmental acoustic noise to the desired speech signal received by the single microphone, wherein the analysis is performed using typical signal processing techniques. In particular, limitations in performance of these microphone-based VAD systems are noted when processing signals having a low signal-to-noise ratio (SNR), and in settings where the background noise varies quickly. Thus, similar limitations are found in noise suppression systems using these microphone-based VADs.

`
`BRIEF DESCRIPTION OF THE FIGURES
`Figure 1 is a block diagram of a denoising system, under an embodiment.
Figure 2 is a block diagram including components of a noise removal algorithm, under the denoising system of an embodiment assuming a single noise source and direct paths to the microphones.
Figure 3 is a block diagram including front-end components of a noise removal algorithm of an embodiment generalized to n distinct noise sources (these noise sources may be reflections or echoes of one another).
Figure 4 is a block diagram including front-end components of a noise removal algorithm of an embodiment in a general case where there are n distinct noise sources and signal reflections.
Figure 5 is a flow diagram of a denoising method, under an embodiment.
Figure 6 shows results of a noise suppression algorithm of an embodiment for an American English female speaker in the presence of airport terminal noise that includes many other human speakers and public announcements.
Figure 7A is a block diagram of a Voice Activity Detector (VAD) system including hardware for use in receiving and processing signals relating to VAD, under an embodiment.
Figure 7B is a block diagram of a VAD system using hardware of a coupled noise suppression system for use in receiving VAD information, under an alternative embodiment.
Figure 8 is a flow diagram of a method for determining voiced and unvoiced speech using an accelerometer-based VAD, under an embodiment.
Figure 9 shows plots including a noisy audio signal (live recording) along with a corresponding accelerometer-based VAD signal, the corresponding accelerometer output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.
Figure 10 shows plots including a noisy audio signal (live recording) along with a corresponding SSM-based VAD signal, the corresponding SSM output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.
Figure 11 shows plots including a noisy audio signal (live recording) along with a corresponding GEMS-based VAD signal, the corresponding GEMS output signal, and the denoised audio signal following processing by the noise suppression system using the VAD signal, under an embodiment.

`
`DETAILED DESCRIPTION
`
The following description provides specific details for a thorough understanding of, and enabling description for, embodiments of the noise suppression system. However, one skilled in the art will understand that the invention may be practiced without these details. In other instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the embodiments of the noise suppression system. In the following description, “signal” represents any acoustic signal (such as human speech) that is desired, and “noise” is any acoustic signal (which may include human speech) that is not desired. An example would be a person talking on a cellular telephone with a radio in the background. The person’s speech is desired and the acoustic energy from the radio is not desired. In addition, “user” describes a person who is using the device and whose speech is desired to be captured by the system.
`
Also, “acoustic” is generally defined as acoustic waves propagating in air. Propagation of acoustic waves in media other than air will be noted as such. References to “speech” or “voice” generally refer to human speech including voiced speech, unvoiced speech, and/or a combination of voiced and unvoiced speech. Unvoiced speech or voiced speech is distinguished where necessary. The term “noise suppression” generally describes any method by which noise is reduced or eliminated in an electronic signal.
`
Moreover, the term “VAD” is generally defined as a vector or array signal, data, or information that in some manner represents the occurrence of speech in the digital or analog domain. A common representation of VAD information is a one-bit digital signal sampled at the same rate as the corresponding acoustic signals, with a zero value representing that no speech has occurred during the corresponding time sample, and a unity value indicating that speech has occurred during the corresponding time sample. While the embodiments described herein are generally described in the digital domain, the descriptions are also valid for the analog domain.
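
As an illustration of the one-bit VAD representation described above, the following is a minimal sketch. The sample rate, the placeholder audio, and the speech interval are assumptions chosen for the example and are not taken from the specification.

```python
import numpy as np

fs = 8000                        # sample rate in Hz (assumed for illustration)
audio = np.zeros(fs)             # one second of placeholder audio samples

# One VAD value per audio sample: 0 = no speech, 1 = speech.
vad = np.zeros(audio.shape[0], dtype=np.uint8)
vad[2000:6000] = 1               # hypothetical speech interval, 0.25 s to 0.75 s

# Because the VAD shares the audio sample rate, it can index the audio directly.
speech_samples = audio[vad == 1]
noise_only_samples = audio[vad == 0]
```

Keeping the VAD at the audio sample rate means each time sample can be routed to a noise-only or speech-present branch without any resampling step.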
`
Figure 1 is a block diagram of a denoising system 1000 of an embodiment that uses knowledge of when speech is occurring derived from physiological information on voicing activity. The system 1000 includes microphones 10 and sensors 20 that provide signals to at least one processor 30. The processor includes a denoising subsystem or algorithm 40.
`
Figure 2 is a block diagram including components of a noise removal algorithm 200 of an embodiment. A single noise source and a direct path to the microphones are assumed. An operational description of the noise removal algorithm 200 of an embodiment is provided using a single signal source 100 and a single noise source 101, but is not so limited. This algorithm 200 uses two microphones: a “signal” microphone 1 (“MIC 1”) and a “noise” microphone 2 (“MIC 2”), but is not so limited. The signal microphone MIC 1 is assumed to capture mostly signal with some noise, while MIC 2 captures mostly noise with some signal. The data from the signal source 100 to MIC 1 is denoted by $s(n)$, where $s(n)$ is a discrete sample of the analog signal from the source 100. The data from the signal source 100 to MIC 2 is denoted by $s_2(n)$. The data from the noise source 101 to MIC 2 is denoted by $n(n)$. The data from the noise source 101 to MIC 1 is denoted by $n_2(n)$. Similarly, the data from MIC 1 to noise removal element 205 is denoted by $m_1(n)$, and the data from MIC 2 to noise removal element 205 is denoted by $m_2(n)$.
`
The noise removal element 205 also receives a signal from a voice activity detection (VAD) element 204. The VAD 204 uses physiological information to determine when a speaker is speaking. In various embodiments, the VAD can include at least one of an accelerometer, a skin surface microphone in physical contact with skin of a user, a human tissue vibration detector, a radio frequency (RF) vibration and/or motion detector/device, an electroglottograph, an ultrasound device, an acoustic microphone that is being used to detect acoustic frequency signals that correspond to the user’s speech directly from the skin of the user (anywhere on the body), an airflow detector, and a laser vibration detector.
The transfer functions from the signal source 100 to MIC 1 and from the noise source 101 to MIC 2 are assumed to be unity. The transfer function from the signal source 100 to MIC 2 is denoted by $H_2(z)$, and the transfer function from the noise source 101 to MIC 1 is denoted by $H_1(z)$. The assumption of unity transfer functions does not inhibit the generality of this algorithm, as the actual relations between the signal, noise, and microphones are simply ratios and the ratios are redefined in this manner for simplicity.

`
In conventional two-microphone noise removal systems, the information from MIC 2 is used to attempt to remove noise from MIC 1. However, an (generally unspoken) assumption is that the VAD element 204 is never perfect, and thus the denoising must be performed cautiously, so as not to remove too much of the signal along with the noise. However, if the VAD 204 is assumed to be perfect such that it is equal to zero when there is no speech being produced by the user, and equal to one when speech is produced, a substantial improvement in the noise removal can be made.
In analyzing the single noise source 101 and the direct path to the microphones, with reference to Figure 2, the total acoustic information coming into MIC 1 is denoted by $m_1(n)$. The total acoustic information coming into MIC 2 is similarly labeled $m_2(n)$. In the z (digital frequency) domain, these are represented as $M_1(z)$ and $M_2(z)$. Then,

$$
\begin{aligned}
M_1(z) &= S(z) + N_2(z) \\
M_2(z) &= N(z) + S_2(z)
\end{aligned}
$$

with

$$
\begin{aligned}
N_2(z) &= N(z)H_1(z) \\
S_2(z) &= S(z)H_2(z),
\end{aligned}
$$

so that

$$
\begin{aligned}
M_1(z) &= S(z) + N(z)H_1(z) \\
M_2(z) &= N(z) + S(z)H_2(z).
\end{aligned}
\qquad \text{(Eq. 1)}
$$

This is the general case for all two-microphone systems. In a practical system there is always going to be some leakage of noise into MIC 1, and some leakage of signal into MIC 2. Equation 1 has four unknowns and only two known relationships and therefore cannot be solved explicitly.
`
However, there is another way to solve for some of the unknowns in Equation 1. The analysis starts with an examination of the case where the signal is not being generated, that is, where a signal from the VAD element 204 equals zero and speech is not being produced. In this case, $s(n) = S(z) = 0$, and Equation 1 reduces to

$$
\begin{aligned}
M_{1n}(z) &= N(z)H_1(z) \\
M_{2n}(z) &= N(z),
\end{aligned}
$$

where the n subscript on the M variables indicates that only noise is being received. This leads to

$$
\begin{aligned}
M_{1n}(z) &= M_{2n}(z)H_1(z) \\
H_1(z) &= \frac{M_{1n}(z)}{M_{2n}(z)}.
\end{aligned}
\qquad \text{(Eq. 2)}
$$

`
The function $H_1(z)$ can be calculated using any of the available system identification algorithms and the microphone outputs when the system is certain that only noise is being received. The calculation can be done adaptively, so that the system can react to changes in the noise.
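
The ratio in Equation 2 can be sketched numerically. The example below is only one possible system identification approach (a frame-averaged spectral ratio), not necessarily the one used in an embodiment. The function name `estimate_h1`, the frame length, and the synthetic test data are assumptions made for illustration.

```python
import numpy as np

def estimate_h1(m1_noise, m2_noise, frame=256):
    """Estimate H1(f) = M1n(f) / M2n(f) from noise-only microphone data.

    Averages M1n * conj(M2n) and |M2n|^2 over frames before dividing,
    which is less sensitive to weak bins than a single-frame ratio.
    """
    n_frames = len(m1_noise) // frame
    cross = np.zeros(frame // 2 + 1, dtype=complex)
    power = np.zeros(frame // 2 + 1)
    for k in range(n_frames):
        seg = slice(k * frame, (k + 1) * frame)
        M1 = np.fft.rfft(m1_noise[seg])
        M2 = np.fft.rfft(m2_noise[seg])
        cross += M1 * np.conj(M2)       # accumulates M1n * conj(M2n)
        power += np.abs(M2) ** 2        # accumulates |M2n|^2
    return cross / np.maximum(power, 1e-12)

# Synthetic noise-only segment (VAD = 0): MIC 1 hears the noise at half amplitude.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
m2n = noise                  # unity path from the noise source to MIC 2
m1n = 0.5 * noise            # hypothetical H1 = 0.5 at every frequency
h1 = estimate_h1(m1n, m2n)
```

For this synthetic case the estimate equals 0.5 in every frequency bin. With a real acoustic path the estimate would vary with frequency and would need to be refreshed as the noise changes.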
`
A solution is now available for one of the unknowns in Equation 1. Another unknown, $H_2(z)$, can be determined by using the instances where the VAD equals one and speech is being produced. When this is occurring, but the recent (perhaps less than 1 second) history of the microphones indicates low levels of noise, it can be assumed that $n(n) = N(z) \approx 0$. Then Equation 1 reduces to

$$
\begin{aligned}
M_{1s}(z) &= S(z) \\
M_{2s}(z) &= S(z)H_2(z),
\end{aligned}
$$

which in turn leads to

$$
\begin{aligned}
M_{2s}(z) &= M_{1s}(z)H_2(z) \\
H_2(z) &= \frac{M_{2s}(z)}{M_{1s}(z)},
\end{aligned}
$$

which is the inverse of the $H_1(z)$ calculation. However, it is noted that different inputs are being used (now only the signal is occurring whereas before only the noise was occurring). While calculating $H_2(z)$, the values calculated for $H_1(z)$ are held constant and vice versa. Thus, it is assumed that while one of $H_1(z)$ and $H_2(z)$ is being calculated, the one not being calculated does not change substantially.
After calculating $H_1(z)$ and $H_2(z)$, they are used to remove the noise from the signal. If Equation 1 is rewritten as

$$
\begin{aligned}
S(z) &= M_1(z) - N(z)H_1(z) \\
N(z) &= M_2(z) - S(z)H_2(z) \\
S(z) &= M_1(z) - \left[M_2(z) - S(z)H_2(z)\right]H_1(z) \\
S(z)\left[1 - H_2(z)H_1(z)\right] &= M_1(z) - M_2(z)H_1(z),
\end{aligned}
$$

then $N(z)$ may be substituted as shown to solve for $S(z)$ as

$$
S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_2(z)H_1(z)}.
\qquad \text{(Eq. 3)}
$$

If the transfer functions $H_1(z)$ and $H_2(z)$ can be described with sufficient accuracy, then the noise can be completely removed and the original signal recovered. This remains true without respect to the amplitude or spectral characteristics of the noise. The only assumptions made include use of a perfect VAD, sufficiently accurate $H_1(z)$ and $H_2(z)$, and that when one of $H_1(z)$ and $H_2(z)$ is being calculated the other does not change substantially. In practice these assumptions have proven reasonable.
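
Equation 3 can be checked numerically on synthetic spectra. The sketch below evaluates the two-microphone model of Equation 1 on a grid of frequency bins and applies Equation 3 with exactly known transfer functions. The bin count, noise level, and transfer function values are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
bins = 129

# Synthetic signal and (much louder) noise spectra.
S = rng.standard_normal(bins) + 1j * rng.standard_normal(bins)
N = 10 * (rng.standard_normal(bins) + 1j * rng.standard_normal(bins))

# Hypothetical leakage paths: noise -> MIC 1 (H1) and signal -> MIC 2 (H2).
H1 = 0.6 * np.exp(-1j * np.linspace(0, np.pi, bins))
H2 = 0.2 * np.exp(-1j * np.linspace(0, 2 * np.pi, bins))

# Equation 1: what each microphone receives.
M1 = S + N * H1
M2 = N + S * H2

# Equation 3: recover the signal from the two microphone spectra.
S_hat = (M1 - M2 * H1) / (1 - H2 * H1)
```

With exact `H1` and `H2` the recovered `S_hat` matches `S` to floating-point precision even though the noise is about ten times larger than the signal, which illustrates why the result does not depend on the noise amplitude or spectrum.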
`
The noise removal algorithm described herein is easily generalized to include any number of noise sources. Figure 3 is a block diagram including front-end components 300 of a noise removal algorithm of an embodiment, generalized to n distinct noise sources. These distinct noise sources may be reflections or echoes of one another, but are not so limited. There are several noise sources shown, each with a transfer function, or path, to each microphone. The previously named path $H_2$ has been relabeled as $H_0$, so that labeling noise source 2’s path to MIC 1 is more convenient. The outputs of each microphone, when transformed to the z domain, are:

$$
\begin{aligned}
M_1(z) &= S(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + \ldots + N_n(z)H_n(z) \\
M_2(z) &= S(z)H_0(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + \ldots + N_n(z)G_n(z).
\end{aligned}
\qquad \text{(Eq. 4)}
$$

When there is no signal (VAD = 0), then (suppressing z for clarity)

$$
\begin{aligned}
M_{1n} &= N_1H_1 + N_2H_2 + \ldots + N_nH_n \\
M_{2n} &= N_1G_1 + N_2G_2 + \ldots + N_nG_n.
\end{aligned}
\qquad \text{(Eq. 5)}
$$

A new transfer function can now be defined as

$$
\tilde{H}_1 = \frac{M_{1n}}{M_{2n}} = \frac{N_1H_1 + N_2H_2 + \ldots + N_nH_n}{N_1G_1 + N_2G_2 + \ldots + N_nG_n},
\qquad \text{(Eq. 6)}
$$

where $\tilde{H}_1$ is analogous to $H_1(z)$ above. Thus $\tilde{H}_1$ depends only on the noise sources and their respective transfer functions and can be calculated any time there is no signal being transmitted. Once again, the “n” subscripts on the microphone inputs denote only that noise is being detected, while an “s” subscript denotes that only signal is being received by the microphones.
`
Examining Equation 4 while assuming an absence of noise produces

$$
\begin{aligned}
M_{1s} &= S \\
M_{2s} &= SH_0.
\end{aligned}
$$

Thus, $H_0$ can be solved for as before, using any available transfer function calculating algorithm. Mathematically, then,

$$
H_0 = \frac{M_{2s}}{M_{1s}}.
$$

Rewriting Equation 4, using $\tilde{H}_1$ defined in Equation 6, provides,

$$
\tilde{H}_1 = \frac{M_1 - S}{M_2 - SH_0}.
\qquad \text{(Eq. 7)}
$$

Solving for $S$ yields,

$$
S = \frac{M_1 - M_2\tilde{H}_1}{1 - H_0\tilde{H}_1},
\qquad \text{(Eq. 8)}
$$

which is the same as Equation 3, with $H_0$ taking the place of $H_2$, and $\tilde{H}_1$ taking the place of $H_1$. Thus the noise removal algorithm still is mathematically valid for any number of noise sources, including multiple echoes of noise sources. Again, if $H_0$ and $\tilde{H}_1$ can be estimated to a high enough accuracy, and the above assumption of only one path from the signal to the microphones holds, the noise may be removed completely.
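
The multiple-source result in Equation 8 can be sketched the same way. The example below builds three synthetic noise sources, forms $\tilde{H}_1$ from the noise-only microphone spectra as in Equation 6, and then applies Equation 8. All array sizes and path values are assumptions for illustration, and the check assumes the noise spectra are the same when $\tilde{H}_1$ is estimated and when the signal is present.

```python
import numpy as np

rng = np.random.default_rng(2)

def crandn(*shape):
    """Complex Gaussian samples (helper defined only for this example)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

bins, n_sources = 64, 3
S = crandn(bins)                      # signal spectrum
Ni = crandn(n_sources, bins)          # spectra of the individual noise sources
Hi = 0.5 * crandn(n_sources, bins)    # paths from noise source i to MIC 1
Gi = crandn(n_sources, bins)          # paths from noise source i to MIC 2
H0 = 0.1 * crandn(bins)               # path from the signal source to MIC 2

# Equation 5: microphone spectra when only noise is present (VAD = 0).
M1n = (Ni * Hi).sum(axis=0)
M2n = (Ni * Gi).sum(axis=0)

# Equation 6: combined noise transfer function.
H1_tilde = M1n / M2n

# Equation 4: microphone spectra with signal and noise together.
M1 = S + M1n
M2 = S * H0 + M2n

# Equation 8: recover the signal.
S_hat = (M1 - M2 * H1_tilde) / (1 - H0 * H1_tilde)
```

The individual paths `Hi` and `Gi` never need to be known separately, because only their combined ratio `H1_tilde` enters Equation 8.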
`
The most general case involves multiple noise sources and multiple signal sources. Figure 4 is a block diagram including front-end components 400 of a noise removal algorithm of an embodiment in the most general case where there are n distinct noise sources and signal reflections. Here, signal reflections enter both microphones MIC 1 and MIC 2. This is the most general case, as reflections of the noise source into the microphones MIC 1 and MIC 2 can be modeled accurately as simple additional noise sources. For clarity, the direct path from the signal to MIC 2 is changed from $H_0(z)$ to $H_{00}(z)$, and the reflected paths to MIC 1 and MIC 2 are denoted by $H_{01}(z)$ and $H_{02}(z)$, respectively.
`
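The input into the microphones now becomes

$$
\begin{aligned}
M_1(z) &= S(z) + S(z)H_{01}(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + \ldots + N_n(z)H_n(z) \\
M_2(z) &= S(z)H_{00}(z) + S(z)H_{02}(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + \ldots + N_n(z)G_n(z).
\end{aligned}
\qquad \text{(Eq. 9)}
$$

When the VAD = 0, the inputs become (suppressing z again)

$$
\begin{aligned}
M_{1n} &= N_1H_1 + N_2H_2 + \ldots + N_nH_n \\
M_{2n} &= N_1G_1 + N_2G_2 + \ldots + N_nG_n,
\end{aligned}
$$

which is the same as Equation 5. Thus, the calculation of $\tilde{H}_1$ in Equation 6 is unchanged, as expected. In examining the situation where there is no noise, Equation 9 reduces to

$$
\begin{aligned}
M_{1s} &= S + SH_{01} \\
M_{2s} &= SH_{00} + SH_{02}.
\end{aligned}
$$

This leads to the definition of $\tilde{H}_2$ as

$$
\tilde{H}_2 = \frac{M_{2s}}{M_{1s}} = \frac{H_{00} + H_{02}}{1 + H_{01}}.
\qquad \text{(Eq. 10)}
$$

Rewriting Equation 9 again using the definition for $\tilde{H}_1$ (as in Equation 7) provides

$$
\tilde{H}_1 = \frac{M_1 - S(1 + H_{01})}{M_2 - S(H_{00} + H_{02})}.
\qquad \text{(Eq. 11)}
$$

Some algebraic manipulation yields

$$
\begin{aligned}
S\left(1 + H_{01} - \tilde{H}_1(H_{00} + H_{02})\right) &= M_1 - M_2\tilde{H}_1 \\
S(1 + H_{01})\left[1 - \tilde{H}_1\frac{H_{00} + H_{02}}{1 + H_{01}}\right] &= M_1 - M_2\tilde{H}_1 \\
S(1 + H_{01})\left[1 - \tilde{H}_1\tilde{H}_2\right] &= M_1 - M_2\tilde{H}_1,
\end{aligned}
$$

and finally

$$
S(1 + H_{01}) = \frac{M_1 - M_2\tilde{H}_1}{1 - \tilde{H}_1\tilde{H}_2}.
\qquad \text{(Eq. 12)}
$$
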
`
Equation 12 is the same as Equation 8, with the replacement of $H_0$ by $\tilde{H}_2$, and the addition of the $(1 + H_{01})$ factor on the left side. This extra factor $(1 + H_{01})$ means that $S$ cannot be solved for directly in this situation, but a solution can be generated for the signal plus the addition of all of its echoes. This is not such a bad situation, as there are many conventional methods for dealing with echo suppression, and even if the echoes are not suppressed, it is unlikely that they will affect the comprehensibility of the speech to any meaningful extent. The more complex calculation of $\tilde{H}_2$ is needed to account for the signal echoes in MIC 2, which act as noise sources.
`
Figure 5 is a flow diagram 500 of a denoising algorithm, under an embodiment. In operation, the acoustic signals are received, at block 502. Further, physiological information associated with human voicing activity is received, at block 504. A first transfer function representative of the acoustic signal is calculated upon determining that voicing information is absent from the acoustic signal for at least one specified period of time, at block 506. A second transfer function representative of the acoustic signal is calculated upon determining that voicing information is present in the acoustic signal for at least one specified period of time, at block 508. Noise is removed from the acoustic signal using at least one combination of the first transfer function and the second transfer function, producing denoised acoustic data streams, at block 510.
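
The flow of Figure 5 can be sketched as a loop over frames of microphone spectra. This is a simplified illustration under stated assumptions, not the implementation of an embodiment: the function `denoise_frames`, the single-frame transfer function estimates, and the `noise_is_low` flag (standing in for the "recent history indicates low noise" condition) are all hypothetical.

```python
import numpy as np

def denoise_frames(M1, M2, vad, noise_is_low, eps=1e-12):
    """M1, M2: (frames, bins) complex spectra; vad: 0/1 per frame;
    noise_is_low: bool per frame, True when little noise is present."""
    bins = M1.shape[1]
    H1 = np.zeros(bins, dtype=complex)   # first transfer function (block 506)
    H2 = np.zeros(bins, dtype=complex)   # second transfer function (block 508)
    out = np.empty_like(M1)
    for t in range(M1.shape[0]):
        if vad[t] == 0:
            # Voicing absent: update H1 = M1n / M2n, holding H2 constant.
            H1 = M1[t] * np.conj(M2[t]) / (np.abs(M2[t]) ** 2 + eps)
        elif noise_is_low[t]:
            # Voicing present and little noise: update H2 = M2s / M1s.
            H2 = M2[t] * np.conj(M1[t]) / (np.abs(M1[t]) ** 2 + eps)
        # Block 510: remove the noise using both transfer functions (Eq. 3).
        out[t] = (M1[t] - M2[t] * H1) / (1 - H2 * H1)
    return out

# Three synthetic frames: noise only, speech only, then speech plus noise.
rng = np.random.default_rng(3)
bins = 8
H1_true = np.full(bins, 0.5 + 0.1j)
H2_true = np.full(bins, 0.2 - 0.05j)
S = rng.standard_normal((3, bins)) + 1j * rng.standard_normal((3, bins))
N = rng.standard_normal((3, bins)) + 1j * rng.standard_normal((3, bins))
S[0] = 0                                 # frame 0: no speech
N[1] = 0                                 # frame 1: no noise
M1 = S + N * H1_true
M2 = N + S * H2_true
cleaned = denoise_frames(M1, M2, vad=[0, 1, 1],
                         noise_is_low=[False, True, False])
```

In the third frame neither transfer function is updated, so the frame is denoised with the values learned from the first two frames, which mirrors the assumption that one transfer function stays roughly constant while the other is calculated.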
An algorithm for noise removal, or denoising algorithm, is described herein, from the simplest case of a single noise source with a direct path to multiple noise sources with reflections and echoes. The algorithm has been shown herein to be viable under any environmental conditions. The type and amount of noise are inconsequential if a good estimate has been made of $\tilde{H}_1$ and $\tilde{H}_2$, and if one does not change substantially while the other is calculated. If the user environment is such that echoes are present, they can be compensated for if coming from a noise source. If signal echoes are also present, they will affect the cleaned signal, but the effect should be negligible in most environments.
In operation, the algorithm of an embodiment has shown excellent results in dealing with a variety of noise types, amplitudes, and orientations. However, there are always approximations and adjustments that have to be made when moving from mathematical concepts to engineering applications. One assumption is made in Equation 3, where $H_2(z)$ is assumed small and therefore $H_2(z)H_1(z) \approx 0$, so that Equation 3 reduces to

$$
S(z) \approx M_1(z) - M_2(z)H_1(z).
$$

This means that only $H_1(z)$ has to be calculated, speeding up the process and reducing the number of computations required considerably. With the proper selection of microphones, this approximation is easily realized.
Another approximation involves the filter used in an embodiment. The actual $H_1(z)$ will undoubtedly have both poles and zeros, but for stability and simplicity an all-zero Finite Impulse Response (FIR) filter is used. With enough taps the approximation to the actual $H_1(z)$ can be very good.
To further increase the performance of the noise suppression system, the spectrum of interest (generally about 125 to 3700 Hz) is divided into subbands. The wider the range of frequencies over which a transfer function must be calculated, the more difficult it is to calculate it accurately. Therefore the acoustic data was divided into 16 subbands, and the denoising algorithm was then applied to each subband in turn. Finally, the 16 denoised data streams were recombined to yield the denoised acoustic data. This works very well, but any combinations of subbands (i.e., 4, 6, 8, 32, equally spaced, perceptually spaced, etc.) can be used and all have been found to work better than a single subband.
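
The subband split and recombination can be sketched as follows. The specification does not state how the subbands were formed, so the FFT-mask filter bank below is an assumption, as are the function name `split_subbands`, the equal band spacing, and the test tones.

```python
import numpy as np

def split_subbands(x, fs, n_bands=16, f_lo=125.0, f_hi=3700.0):
    """Split x into equally spaced subbands between f_lo and f_hi (Hz)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    edges = np.linspace(f_lo, f_hi, n_bands + 1)
    bands = []
    for k in range(n_bands):
        upper = freqs <= edges[k + 1] if k == n_bands - 1 else freqs < edges[k + 1]
        mask = (freqs >= edges[k]) & upper      # keep only this band's bins
        bands.append(np.fft.irfft(X * mask, n=len(x)))
    return bands

fs = 8000
t = np.arange(fs) / fs                          # one second of samples
x = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)

bands = split_subbands(x, fs)
# Each band would be denoised separately here; this sketch leaves them unchanged.
recombined = np.sum(bands, axis=0)
```

Because the masks do not overlap and together cover 125 to 3700 Hz, summing the bands reproduces any input whose energy lies inside that range, so a per-band denoiser can be inserted between the split and the sum.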
`
The amplitude of the noise was constrained in an embodiment so that the microphones used did not saturate (that is, operate outside a linear response region). It is important that the microphones operate linearly to ensure the best performance. Even with this restriction, very low signal-to-noise ratio (SNR) signals can be denoised (down to -10 dB or less).
The calculation of $H_1(z)$ is accomplished every 10 milliseconds using the Least-Mean Squares (LMS) method, a common adaptive transfer function. An explanation may be found in “Adaptive Signal Processing” (1985), by Widrow and Stearns, published by Prentice-Hall, ISBN 0-13-004029-0. The LMS was used for demonstration purposes, but many other system identification techniques can be used to identify $H_1(z)$ and $H_2(z)$ in Figure 2.
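
A minimal sketch of LMS identification of an FIR approximation to $H_1(z)$ is given below. It updates the filter on every sample rather than every 10 milliseconds, and the function name `lms_identify`, the tap count, the step size, and the synthetic impulse response are assumptions for illustration.

```python
import numpy as np

def lms_identify(x, d, taps=8, mu=0.01):
    """Adapt FIR weights w so that the filtered x tracks d (plain LMS)."""
    w = np.zeros(taps)
    for n in range(taps - 1, len(x)):
        window = x[n - taps + 1:n + 1][::-1]    # x[n], x[n-1], ..., x[n-taps+1]
        err = d[n] - w @ window                 # prediction error at sample n
        w += mu * err * window                  # LMS weight update
    return w

# Noise-only data (VAD = 0): MIC 2 hears the noise, MIC 1 hears it through H1.
rng = np.random.default_rng(4)
m2_noise = rng.standard_normal(5000)
h1_true = np.array([0.5, -0.3, 0.1])            # hypothetical FIR noise path
m1_noise = np.convolve(m2_noise, h1_true)[:len(m2_noise)]

h1_est = lms_identify(m2_noise, m1_noise)
```

For this noise-free synthetic case the first three estimated taps converge to `h1_true` and the remaining taps converge to zero. The step size `mu` trades convergence speed against stability and would need tuning for real microphone data.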
`
The VAD for an embodiment is derived from a radio frequency sensor and the two microphones, yielding very high accuracy (>99%) for both voiced and unvoiced speech. The VAD of an embodiment uses a radio frequency (RF) vibration detector interferometer to detect tissue motion associated with human speech production, but is not so limited. The signal from the RF device is completely acoustic-noise free, and is able to function in any acoustic noise environment. A simple energy measurement of the RF signal can be used to determine if voiced speech is occurring. Unvoiced speech can be determined using conventional acoustic-based methods, by proximity to voiced sections determined using the RF sensor or similar voicing sensors, or through a combination of the above. Since there is much less energy in unvoiced speech, its detection accuracy is not as critical to good noise suppression performance as is voiced speech.
`
With voiced and unvoiced speech detected reliably, the algorithm of an embodiment can be implemented. Once again, it is useful to repeat that the noise removal algorithm does not depend on how the VAD is obtained, only that it is accurate, especially for voiced speech. If speech is not detected and training occurs on the speech, the subsequent denoised acoustic data can be distorted.
Data were collected in four channels, one for MIC 1, one for MIC 2, and two for the radio frequency sensor that detected the tissue motions associated with voiced speech. The data were sampled simultaneously at 40 kHz, then digitally filtered and decimated down to 8 kHz. The high sampling rate was used to reduce any aliasing that might result from the analog to digital process. A four-channel National Instruments A/D board was used along with Labview to capture and store the data. The data were then read into a C program and denoised 10 milliseconds at a time.
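
The filter-and-decimate step from 40 kHz to 8 kHz can be sketched as below. The specification does not describe the digital filter that was used, so the windowed-sinc low-pass design, its cutoff, its length, and the function name `decimate_by_5` are assumptions for illustration.

```python
import numpy as np

def decimate_by_5(x, fs_in=40000, cutoff_hz=3700.0, taps=201):
    """Low-pass filter x below the new Nyquist rate, then keep every 5th sample."""
    n = np.arange(taps) - (taps - 1) / 2
    fc = cutoff_hz / fs_in                       # cutoff in cycles per sample
    h = 2 * fc * np.sinc(2 * fc * n) * np.hamming(taps)
    h /= h.sum()                                 # unity gain at DC
    y = np.convolve(x, h, mode="same")           # zero-delay linear-phase filtering
    return y[::5]                                # 40 kHz -> 8 kHz

fs_in = 40000
t = np.arange(fs_in) / fs_in                     # one second at 40 kHz
in_band = np.sin(2 * np.pi * 1000 * t)           # 1 kHz tone, should be preserved
out_of_band = np.sin(2 * np.pi * 10000 * t)      # 10 kHz tone, would alias to 2 kHz

kept = decimate_by_5(in_band)
rejected = decimate_by_5(out_of_band)
```

Without the low-pass stage the 10 kHz tone would fold down to 2 kHz after decimation and land inside the speech band. With it, the tone is strongly attenuated while the 1 kHz tone passes at close to full amplitude.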
Figure 6 shows a denoised audio 602 signal output upon application of the noise
`suppression algorith