PTO/SB/16 (6-95)
Approved for use through 04/11/98. OMB 0651-0037
`Patent and Trademark Office; US. DEPARTMENT OF COMMERCE
`
`PROVISIONAL APPLICATION COVER SHEET
`
This is a request for filing a PROVISIONAL APPLICATION under 37 CFR § 1.53(c)
`
`Express Mail label number EL473992223US Date of Deposit July 19, 2000
I hereby certify that this paper or fee is being deposited with the United States Postal Service
"Express Mail Post Office to Addressee" service under 37 CFR § 1.10
`on the date indicated above and is addressed to the Commissioner for Patents, Washington, DC 20231.
`
`Cindy Baglietto
`Name of person signing
`
`
`Signature
`
`Docket
`Number
`
`20628-701
`
`Type a plus sign
`( +) inside this
`box
`->
`
`+
`
`INVENTOR(s)/APPLICANT(s)
`
`LASTNAME
`
`Burnett
`
`FIRSTNAME
`
`Greg
`
`MIDDLE
`INITIAL
`
RESIDENCE (CITY AND EITHER STATE OR
`FOREIGN COUNTRY)
`San Francisco, California
`TITLE OF THE INVENTION (280 characters max)
`
`METHOD AND APPARATUS FOR NOISE REMOVAL
`
`CORRESPONDENCE ADDRESS
`
`WILSON SONSINI GOODRICH & ROSATI
`650 Page Mill Road
`Palo Alto, California 94304-1050
`Telephone: (650) 493-9300
`Facsimile: (650) 493-6811
`
`ENCLOSED APPLICATION PARTS (check all that apply)
`
Specification: Number of Pages 14
Drawing(s): Number of Sheets __
`
`Small Entity Statement
`Other (specify)
`
`METHOD OF PAYMENT (check one)
`
`A check or money order is enclosed to cover the Provisional filing fees.
`The Commissioner is hereby authorized to charge filing fees and credit
`Deposit Account Number: 23-2415
`
`PROVISIONAL FILING
`FEE AMOUNT ($)
`
`$150.00
`
`
The invention was made by an agency of the United States Government or under a contract with an agency of the United States Government.

No.
Yes, the name of the U.S. Government agency and the Government contract number are: ______________
`
`Respectfully submitted,
`
Date: 7-19-00
REGISTRATION NO. 42,442
`(if appropriate)
`Additional inventors are being named on separately numbered sheets attached hereto.
`PROVISIONAL APPLICATION FILING ONLY
`
`
`Sony v. Jawbone
`
`U.S. Patent No. 8,019,091
`
`Sony Ex. 1009
`
`

`

`
`This is to describe the noise removal algorithm that I devised that takes advantage of our
`knowledge of when speech is occurring. It is simple, robust, and should work for any type of
`noise regardless of spectral content, duration, or stationarity. I call it the "Pathfinder" algorithm,
`as it uses the knowledge of when speech occurs to determine the transfer functions (paths)
`between two microphones.
`
`Overview
One of the most common acoustic-only adaptive noise removal algorithms (as opposed to our algorithm, which uses both acoustic and other information) is shown in Figure 1. It uses the input from two microphones and the normalized least-mean-squares (NLMS) method to adaptively remove noise. It works very well on sinusoidal inputs, even ones with 50 sinusoids and close to a full-bandwidth spectrum. It updates its noise parameters during unvoiced periods, thus ensuring that it trains only on noise and not on voiced information (although it may occasionally pick up unvoiced speech if areas within ~500 msec of voiced speech are used for training). However, it does not work well on random noise or non-stationary noise.
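
As an illustration of the conventional approach described above, the following is a minimal sketch of a two-microphone NLMS noise canceller. It is not the implementation used by the author; the function name `nlms_cancel`, the tap count, the step size `mu`, and the test noise path are all assumptions chosen for the example. Mic 2 is treated as a noise-only reference, and the test input is a single sinusoid, the kind of noise on which the memo says this method works well.

```python
import math

def nlms_cancel(primary, reference, taps=8, mu=0.5, eps=1e-8):
    """Classical NLMS canceller: predict the noise in `primary` (Mic 1)
    from `reference` (Mic 2) and return the error sequence e(n)."""
    w = [0.0] * taps      # adaptive FIR weights
    buf = [0.0] * taps    # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # noise estimate
        e = d - y                                    # cleaned output sample
        norm = eps + sum(b * b for b in buf)         # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned

# Hypothetical test: sinusoidal noise reaches Mic 1 through a short FIR path.
n = 4000
noise = [math.sin(2 * math.pi * 0.05 * k) for k in range(n)]
path = [0.6, 0.3, -0.1]   # assumed noise path into Mic 1
mic1 = [sum(path[j] * noise[k - j] for j in range(len(path)) if k - j >= 0)
        for k in range(n)]
cleaned = nlms_cancel(mic1, noise)
power_in = sum(v * v for v in mic1[-500:]) / 500
power_out = sum(e * e for e in cleaned[-500:]) / 500
```

In this sketch there is no speech at all, so the ideal output is silence and `power_out` measures the residual noise after convergence.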
`
`The approach I took was to re-examine the assumptions behind this widely used noise removal
`technique. One of them is that the time that the signal is being received, that is the time that the
`signal s(n) > 0, is not known. For acoustic-only applications, this is certainly true. However, we
`
`BUSINESS SENSITIVE AND CONFIDENTIAL
`
`
`don't operate under this restriction. Using the energy of the GEMS signal, we know precisely
`when voiced speech is occurring. The question is then: What changes in the algorithm will
`occur if the times when s(n) is nonzero are known?
`
In Figure 1, the acoustic information coming into Microphone 1 is denoted by m1(n). The information coming into Microphone 2 is similarly labeled m2(n). In the z (digital frequency) domain, we can represent them as M1(z) and M2(z). Then

$$M_1(z) = S(z) + N_2(z)$$
$$M_2(z) = N(z) + S_2(z)$$

with

$$N_2(z) = N(z)H_1(z)$$
$$S_2(z) = S(z)H_2(z)$$

so that

$$M_1(z) = S(z) + N(z)H_1(z)$$
$$M_2(z) = N(z) + S(z)H_2(z) \tag{1}$$
`
`This is the general case for all two microphone systems. There is always going to be some
`leakage of noise into Mic 1, and some leakage of signal into Mic 2. Equation 1 has four
`unknowns and only two relationships and cannot be solved explicitly.
`
`However, perhaps there is some way to solve for some of the unknowns in Equation 1 by other
`means. Let's examine the case where the signal is not being generated - that is, where the
`GEMS signal indicates voicing is not occurring. In this case, s(n) = S(z) = 0, and Equation 1
`reduces to
`

$$M_{1n}(z) = N(z)H_1(z)$$
$$M_{2n}(z) = N(z)$$

where the n subscript on the M variables indicates that only noise is being received. This leads to

$$M_{1n}(z) = M_{2n}(z)H_1(z)$$
$$H_1(z) = \frac{M_{1n}(z)}{M_{2n}(z)} \tag{2}$$

`H1(z) can be calculated using any of the available system identification algorithms and the
`microphone outputs when only noise is being received. The calculation can be done adaptively,
`although if the relative position of the two microphones is held fairly constant H1(z) should be
`fairly constant as well. If done adaptively, the update would probably not need to be that often,
`perhaps on the order of once a second. The interesting thing is that the whiter the input, the
`better the transfer function calculation. That is, white noise, the most difficult type of noise to
`remove for acoustic-only systems, would actually result in the best performance for this
`technique.
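
The H1(z) calculation described above can be sketched as a gated system identification. The example below is only one possible realization: it models H1 as a short FIR filter and uses NLMS, adapting only on samples that a voicing flag marks as noise-only. The names `identify_h1` and `fir`, the tap count, the white-noise stand-in for speech, and the test paths are assumptions made for the illustration.

```python
import random

def identify_h1(mic1, mic2, voiced, taps=4, mu=0.5, eps=1e-8):
    """Estimate FIR coefficients of H1 (Mic 2 to Mic 1 noise path) with NLMS,
    updating only on samples flagged as unvoiced (noise only)."""
    w = [0.0] * taps
    buf = [0.0] * taps
    for m1, m2, v in zip(mic1, mic2, voiced):
        buf = [m2] + buf[:-1]
        if v:
            continue  # speech present: freeze the H1 estimate
        e = m1 - sum(wi * bi for wi, bi in zip(w, buf))
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return w

def fir(x, h):
    """Causal FIR filtering of list x by coefficients h."""
    return [sum(h[j] * x[k - j] for j in range(len(h)) if k - j >= 0)
            for k in range(len(x))]

rng = random.Random(0)
n = 6000
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]       # white noise source
voiced = [2000 <= k < 3000 for k in range(n)]         # assumed voicing flag
speech = [rng.gauss(0.0, 1.0) if v else 0.0 for v in voiced]  # stand-in for speech
true_h1 = [0.5, -0.2, 0.1]                            # assumed noise path to Mic 1
true_h2 = [0.1]                                       # assumed speech leak to Mic 2
mic1 = [s + y for s, y in zip(speech, fir(noise, true_h1))]
mic2 = [x + y for x, y in zip(noise, fir(speech, true_h2))]
h1_est = identify_h1(mic1, mic2, voiced)
```

Because the noise here is white and the noise-only data contain no measurement error, `h1_est` converges to the assumed path, which is consistent with the remark that whiter inputs give better transfer function estimates.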
`
So now we have solved for one of the unknowns in Equation 1. We can solve for another, H2(z), by using the amplitude of the GEMS or similar device along with the amplitude of the two microphones. When the GEMS indicates voicing, but the recent (less than 1 second) history of the microphones indicates low levels of noise, we can assume that n(n) = N(z) ≈ 0. Then Equation 1 reduces to

$$M_{1s}(z) = S(z)$$
$$M_{2s}(z) = S(z)H_2(z)$$

which in turn leads to

$$M_{2s}(z) = M_{1s}(z)H_2(z)$$
$$H_2(z) = \frac{M_{2s}(z)}{M_{1s}(z)}$$

which is the inverse of the H1(z) calculation, but remember that we are using different inputs.
`
After we have calculated H1(z) and H2(z) above, we can use them to remove the noise from the signal. If we rewrite Equation 1 as

$$S(z) = M_1(z) - N(z)H_1(z)$$
$$N(z) = M_2(z) - S(z)H_2(z)$$
$$S(z) = M_1(z) - [M_2(z) - S(z)H_2(z)]H_1(z)$$
$$S(z)[1 - H_2(z)H_1(z)] = M_1(z) - M_2(z)H_1(z)$$

we can solve for S(z):

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_2(z)H_1(z)} \tag{3}$$
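
Equation 3 can be checked numerically at a single frequency, where every quantity is just a complex number. The sketch below builds the two microphone values from Equation 1 and recovers the signal with Equation 3. The function name `recover_signal` and all numeric values are made up for the illustration; the guard on the denominator reflects the fact that Equation 3 is undefined where H1(z)H2(z) = 1.

```python
def recover_signal(m1, m2, h1, h2):
    """Equation 3 at one frequency: S = (M1 - M2*H1) / (1 - H2*H1)."""
    denom = 1 - h2 * h1
    if abs(denom) < 1e-12:
        raise ZeroDivisionError("H1*H2 is too close to 1 at this frequency")
    return (m1 - m2 * h1) / denom

# Hypothetical values at one frequency bin.
s = 1.0 + 0.5j        # speech
noise = -0.7 + 2.0j   # noise
h1 = 0.6 - 0.2j       # noise path into Mic 1
h2 = 0.1 + 0.3j       # speech path into Mic 2

m1 = s + noise * h1   # Equation 1, Mic 1
m2 = noise + s * h2   # Equation 1, Mic 2
s_hat = recover_signal(m1, m2, h1, h2)
```

With exact H1 and H2 the recovered `s_hat` matches `s` to rounding error, whatever values are chosen for the noise.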
`
`Summary
Given that we can determine the two cases when there is no noise and when there is no signal and can therefore calculate H1(z) and H2(z), the original signal may be recovered by Equation 3. Since H1 and H2 should not change very rapidly, once they are determined we could just calculate

$$H_3(z) = \frac{1}{1 - H_2(z)H_1(z)}$$

and apply it until the next update of H1 and H2 occurs. We might not even need H3(z): it would just apply a magnitude and phase change to the audio, and that might not help intelligibility. Omitting it would simplify the computations and result in a faster response.
`
`The processing above would have to take place on finite windows of information, possibly 30
`milliseconds or so in length or a certain number of glottal cycles. We could also try longer
`windows, but we don't want to delay the cleaned audio with respect to the original any more than
`necessary.
`
`It is interesting to note that we are not making any approximations or assumptions regarding the
noise or the signal. If we are able to calculate H1 and H2 sufficiently accurately, the noise will be
`completely removed regardless of the noise and signal characteristics.
`
`
1. This technique will allow the noise removal of any type of noise in the presence of any type of signal as long as the microphones are operating linearly and simply adding the signals, i.e. the signals are additive. If the microphones saturate, this algorithm will still work to a certain degree but will probably not remove the noise completely.
2. This technique does not depend on the location or type of microphones, nor does it require that the microphones be matched to each other. These details may be compensated for in the calculation of H1 and H2.
3. This technique is not affected by aging of the microphones, as H1 and H2 can be recalculated whenever it is convenient. It is estimated that the calculation of H1 and H2 will take on the order of 10-100 milliseconds.
4. This technique does require the use of a "voicing device", a device that can determine when the user is voicing (using the vocal folds to produce speech). This device would include but is not limited to radio frequency devices (such as the GEMS), electroglottographs (EGG), ultrasound devices, acoustic throat microphones, and airflow detectors.
5. The calculation of the transfer functions H1 and H2 can be accomplished by the use of any techniques used by those skilled in the art, including but not limited to adaptive techniques, recursive techniques, and simpler ones such as Matlab's AR and ARX.
6. The calculation of the transfer functions need not be done constantly, as they should change slowly or not at all. The calculation of H1 can be accomplished when the voicing device indicates that no voicing is occurring or has occurred within a set time. H2 may be calculated when the recent history of the microphone signals indicates the absence of background noise and the voicing device indicates voicing is occurring.
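
The update rule in item 6 can be sketched as a small decision function. This is an illustrative reading of the text, not a specification: the function name `choose_update`, the frame-based hangover count, and the `noise_floor` threshold are hypothetical parameters that the memo does not define.

```python
def choose_update(voiced, frames_since_voicing, recent_noise_level,
                  hangover_frames=5, noise_floor=0.01):
    """Decide which transfer function, if any, may be re-estimated on this frame.

    voiced               : True if the voicing device reports voicing now
    frames_since_voicing : frames elapsed since voicing was last reported
    recent_noise_level   : noise level from recent microphone history
    """
    if voiced:
        # Speech with negligible background noise: H2 can be estimated.
        # Speech plus noise: keep the stored H1 and H2 and only apply Equation 3.
        return "update_H2" if recent_noise_level < noise_floor else "hold"
    if frames_since_voicing >= hangover_frames:
        return "update_H1"   # noise only, and no voicing within the set time
    return "hold"            # too soon after voicing to trust as noise only

decisions = [
    choose_update(False, 10, 0.5),    # steady noise, no speech
    choose_update(False, 2, 0.5),     # speech ended two frames ago
    choose_update(True, 0, 0.001),    # speech in a quiet room
    choose_update(True, 0, 0.5),      # speech in noise
]
```

The hangover guards against training H1 on unvoiced speech that surrounds voiced segments.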
`
`
`

`

`multiple noise sources and reflective
`paths caused by the environment of the user (the multipath problem).
`
`Overview
`Several different circumstances can occur in the user's environment. In the first memo, we
`examined only the relatively simple case of a single noise source and a single signal source with
`only direct paths from the sources to the microphones. This situation is shown in Figure 1. We
`found that by determining when the signal was occurring, we could calculate H1(z) and H2(z) and
remove all of the noise in this situation. The signal was given by

$$S(z) = \frac{M_1(z) - M_2(z)H_1(z)}{1 - H_2(z)H_1(z)}$$

This algorithm works quite well, using a normalized LMS algorithm to calculate H1 and H2 given the GEMS information on voicing. However, it only works on simulated data, not actual recorded data. The real world is just not as simple as that portrayed in Figure 1.
`
`I will now generalize the Pathfinder algorithm to include many situations that are more realistic.
`The first is one in which there are multiple noise sources or equivalently, one noise source and
`
`
`many reflections of the noise source. These situations are equivalent in that the microphones are
`unable to determine if a second noise input is a different noise source or just the original noise
`source after reflection from an interface. It is still assumed that there is only one path from the
`signal to the microphones.
`
`Multiple noise sources, single signal with direct path
This case is illustrated in Figure 2. There are several noise sources illustrated, each with a transfer function (or path) to each microphone. The previously named path H2 has been relabeled as H0, so that labeling noise source 2's path to Mic 1 is more convenient. The outputs of each microphone, when transformed to the z domain, are:

$$M_1(z) = S(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + \ldots + N_n(z)H_n(z)$$
$$M_2(z) = S(z)H_0(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + \ldots + N_n(z)G_n(z) \tag{1}$$

When there is no signal (S = 0), as determined by the GEMS or other device, then (suppressing the z's for clarity)

$$M_{1n} = N_1H_1 + N_2H_2 + \ldots + N_nH_n$$
$$M_{2n} = N_1G_1 + N_2G_2 + \ldots + N_nG_n \tag{2}$$

We can define a new transfer function now, analogous to H1(z) in the previous memo:

$$\tilde{H}_1 = \frac{M_{1n}}{M_{2n}} = \frac{N_1H_1 + N_2H_2 + \ldots + N_nH_n}{N_1G_1 + N_2G_2 + \ldots + N_nG_n} \tag{3}$$

Thus $\tilde{H}_1$ depends only on the noise sources and their respective transfer functions and can be calculated any time there is no signal being transmitted. Once again, the n subscripts on the microphone inputs denote only that noise is being detected, while an s subscript denotes that only signal is being received by the microphones.
`
If we now examine Equation 1 assuming that there is no noise (N = 0), we get

$$M_{1s} = S$$
$$M_{2s} = SH_0$$

so that H0 can be solved for as before, using any available transfer function calculating algorithm desired. Mathematically,

$$H_0 = \frac{M_{2s}}{M_{1s}}$$

If we now solve Equation 1, using $\tilde{H}_1$ above, we get

$$\tilde{H}_1 = \frac{M_1 - S}{M_2 - SH_0} \tag{4}$$

Solving for S, we get

$$S = \frac{M_1 - M_2\tilde{H}_1}{1 - H_0\tilde{H}_1} \tag{5}$$
which is the same as before, with H0 taking the place of H2, and $\tilde{H}_1$ taking the place of H1. Thus the Pathfinder algorithm is still mathematically valid for any number of noise sources, including multiple echoes of noise sources. The only change in the algorithm written by Eric is that the estimates of $\tilde{H}_1$ contain both poles and zeros, whereas with a simpler FIR model there were only zeros. Still, if H0 and $\tilde{H}_1$ can be estimated to a high enough accuracy, and the above assumption of only one path from the signal to the microphones holds, the noise may be removed completely.
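
The multiple-noise-source result can be checked with a few complex numbers standing in for the z-domain quantities at one frequency. The sketch below uses three hypothetical noise sources with made-up paths; the variable names are assumptions for the example. It forms the combined noise transfer function from the noise-only microphone values and then recovers S from the noisy microphone values, under the idealization that the same noise values apply in both steps.

```python
# Hypothetical values at one frequency (three noise sources).
S = 1.0 + 2.0j                                   # speech
H0 = 0.2 - 0.1j                                  # speech path into Mic 2
N = [0.5 + 0.5j, -0.3 + 0.8j, 1.0 - 0.2j]        # noise sources
H = [0.4 + 0.1j, -0.2 + 0.3j, 0.1 - 0.5j]        # noise paths into Mic 1
G = [1.0 + 0.0j, 0.8 - 0.2j, 0.9 + 0.1j]         # noise paths into Mic 2

# Noise-only microphone values (S = 0).
M1n = sum(n * h for n, h in zip(N, H))
M2n = sum(n * g for n, g in zip(N, G))
H1_tilde = M1n / M2n                             # combined noise transfer function

# Microphone values with speech and all noise sources present.
M1 = S + M1n
M2 = S * H0 + M2n

# Recover the speech from the two microphone values.
S_hat = (M1 - M2 * H1_tilde) / (1 - H0 * H1_tilde)
```

The recovery is exact here because the ratio M1n/M2n is the same in the noise-only step and the mixed step; with real signals that holds only to the accuracy of the estimate.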
`
`Multiple noise sources, multiple signal sources
This case is the most general one possible and is illustrated in Figure 3. Here, we are allowing reflections of the signal to enter both microphones. This is the most general case, as reflections of the noise source into the microphones can be modeled accurately as simple additional noise sources. I have modified the names of the signal transfer functions: the direct path from the signal to Mic 2 has changed from H0(z) to H00(z), and the reflected paths to Microphones 1 and 2 are denoted by H01(z) and H02(z), respectively.
`
The input into the microphones now becomes

$$M_1(z) = S(z) + S(z)H_{01}(z) + N_1(z)H_1(z) + N_2(z)H_2(z) + \ldots + N_n(z)H_n(z)$$
$$M_2(z) = S(z)H_{00}(z) + S(z)H_{02}(z) + N_1(z)G_1(z) + N_2(z)G_2(z) + \ldots + N_n(z)G_n(z) \tag{6}$$

When the signal = 0 (S = 0), the inputs become (suppressing the z's again)

$$M_{1n} = N_1H_1 + N_2H_2 + \ldots + N_nH_n$$
$$M_{2n} = N_1G_1 + N_2G_2 + \ldots + N_nG_n$$
`
`
which is the same as Equation 2 above. Thus the calculation of $\tilde{H}_1$ in Equation 3 is unchanged, as expected. If we now examine the situation where there is no noise (N = 0), Equation 6 reduces to

$$M_{1s} = S + SH_{01}$$
$$M_{2s} = SH_{00} + SH_{02}$$

This leads to the definition of $\tilde{H}_2$:

$$\tilde{H}_2 = \frac{M_{2s}}{M_{1s}} = \frac{H_{00} + H_{02}}{1 + H_{01}} \tag{7}$$

If we again solve Equation 6 using the definition for $\tilde{H}_1$ (as in Equation 4), we get

$$\tilde{H}_1 = \frac{M_1 - S(1 + H_{01})}{M_2 - S(H_{00} + H_{02})} \tag{8}$$

Some algebraic manipulation yields

$$S\left(1 + H_{01} - \tilde{H}_1(H_{00} + H_{02})\right) = M_1 - M_2\tilde{H}_1$$
$$S(1 + H_{01})\left[1 - \tilde{H}_1\frac{H_{00} + H_{02}}{1 + H_{01}}\right] = M_1 - M_2\tilde{H}_1$$
$$S(1 + H_{01})\left[1 - \tilde{H}_1\tilde{H}_2\right] = M_1 - M_2\tilde{H}_1$$

and finally

$$S(1 + H_{01}) = \frac{M_1 - M_2\tilde{H}_1}{1 - \tilde{H}_1\tilde{H}_2} \tag{9}$$
Equation 9 is the same as Equation 5, with the replacement of H0 by $\tilde{H}_2$, and the addition of the (1 + H01) factor on the left side. This extra factor means that we cannot solve directly for S in this situation, but rather can only solve for the signal plus the addition of all of its echoes. This is not such a bad situation, as there are many conventional methods for dealing with echo suppression, and even if the echoes are not suppressed, it is unlikely that they will affect the comprehensibility of the speech to any meaningful extent. The more complex calculation of $\tilde{H}_2$ is needed to account for the signal echoes in Microphone 2, which act similarly to noise sources. The effects of signal echoes into Mic 2 can therefore be compensated for, unlike those into Mic 1. This makes sense, as Mic 1 cannot determine if the signal it is receiving contains echoes and can therefore not remove them.
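
The general case with signal echoes can be checked the same way, again using complex numbers for the z-domain quantities at one frequency. All values and variable names below are hypothetical. The sketch confirms that the right-hand side of Equation 9 returns the signal together with its Mic 1 echo, S(1 + H01), rather than S alone.

```python
# Hypothetical values at one frequency.
S = 1.0 + 2.0j
H00 = 0.2 - 0.1j      # direct speech path into Mic 2
H01 = 0.1 + 0.05j     # speech echo path into Mic 1
H02 = -0.05 + 0.1j    # speech echo path into Mic 2
M1n = -0.03 - 0.52j   # summed noise reaching Mic 1 (all sources)
M2n = 1.34 + 1.12j    # summed noise reaching Mic 2 (all sources)

H1_tilde = M1n / M2n                      # noise-only ratio, as in Equation 3
H2_tilde = (H00 + H02) / (1 + H01)        # Equation 7

M1 = S * (1 + H01) + M1n                  # Equation 6, Mic 1
M2 = S * (H00 + H02) + M2n                # Equation 6, Mic 2

recovered = (M1 - M2 * H1_tilde) / (1 - H1_tilde * H2_tilde)   # Equation 9
echoed_signal = S * (1 + H01)
```

Here `recovered` equals `echoed_signal`, not `S`, which is the point of the extra factor discussed above.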
`
`
One thing that Equation 7 makes clear is that the addition of signal echoes makes $\tilde{H}_2$ dependent on the environment, not just the paths between the signal and the microphones. This means that $\tilde{H}_2$ will not be as constant as it is without signal echoes. Hopefully, though, this will be a small effect and not require frequent recalculation of $\tilde{H}_2$.
`
`Conclusion
`The Pathfinder algorithm has been shown to be viable under any environmental conditions. The
type and amount of noise are inconsequential if a good estimate has been made of $\tilde{H}_1$ and $\tilde{H}_2$.
`If the user environment is such that echoes are present, they can be compensated for if coming
`from a noise source. If signal echoes are also present, they will affect the cleaned signal, but the
`effect should be negligible in most environments.
`
1. This technique will allow the noise removal of an arbitrary number of noise sources of any type in the presence of any type of signal. For example, this will work with noise that is low or high bandwidth, stationary or changing, and short or long in duration, whether there are 2 noise sources or 20.
2. This algorithm will function properly as long as the microphones are operating linearly and simply adding the signals, i.e. the signals are additive. If the microphones saturate, this algorithm will still work to a certain degree but will probably not remove the noise completely.
3. The amount of noise removal will depend on the accuracy of the calculation of $\tilde{H}_1$ and $\tilde{H}_2$. With accurate enough calculations, the noise will be completely removed.
4. This technique does not depend on the location or type of microphones, nor does it require that the microphones be matched to each other. These details may be compensated for in the calculation of $\tilde{H}_1$ and $\tilde{H}_2$.
5. This technique is not affected by aging of the microphones, as $\tilde{H}_1$ and $\tilde{H}_2$ can be recalculated whenever it is convenient. It is estimated that the calculation of $\tilde{H}_1$ and $\tilde{H}_2$ will take on the order of 10-100 milliseconds.
`
`
6. This technique does require the use of a "voicing device", a device that can determine when the user is voicing (using the vocal folds to produce speech). This device would include but is not limited to radio frequency devices (such as the GEMS), electroglottographs (EGG), ultrasound devices, acoustic throat microphones, and airflow detectors.
7. The calculation of the transfer functions $\tilde{H}_1$ and $\tilde{H}_2$ can be accomplished by the use of any techniques used by those skilled in the art, including but not limited to adaptive techniques, recursive techniques, and simpler ones such as Matlab's AR and ARX.
8. The calculation of the transfer functions need not be done constantly, as they should change slowly or not at all. The calculation of $\tilde{H}_1$ can be accomplished when the voicing device indicates that no voicing is occurring or has occurred within a set time. $\tilde{H}_2$ may be calculated when the recent history of the microphone signals indicates the absence of background noise and the voicing device indicates voicing is occurring. To increase the accuracy of the $\tilde{H}_1$ and $\tilde{H}_2$ calculations, the parameters may be averaged. A longer time constant should be used for $\tilde{H}_2$, as it will change quite slowly.
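
The parameter averaging mentioned in item 8 could take many forms; one simple possibility is a one-pole (exponential) average of the filter coefficients. The sketch below is an assumption about how such averaging might be done, not the method in the memo: the function names `alpha_for` and `smooth`, the time constants of 5 and 50 updates, and the coefficient values are all hypothetical.

```python
import math

def alpha_for(time_constant_updates):
    """Smoothing factor for a one-pole average with the given time constant,
    measured in number of updates (larger means slower, smoother tracking)."""
    return math.exp(-1.0 / time_constant_updates)

def smooth(previous, new, alpha):
    """Blend a new coefficient estimate into the running average."""
    return [alpha * p + (1.0 - alpha) * n for p, n in zip(previous, new)]

alpha_h1 = alpha_for(5)     # shorter time constant for the noise paths
alpha_h2 = alpha_for(50)    # longer time constant for the slowly varying speech path

# Hypothetical successive estimates of the speech-path coefficients.
estimates = [[0.32, -0.11], [0.28, -0.09], [0.31, -0.10]]
h2_avg = estimates[0]
for estimate in estimates[1:]:
    h2_avg = smooth(h2_avg, estimate, alpha_h2)
```

With the longer time constant, each new estimate moves the running average only slightly, which suits a path that is expected to change slowly.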
`
`
`

`

Figure 1. Setup for NLMS algorithm. [Block diagram not reproduced. Recoverable labels: SIGNAL s(n); NOISE n(n); MIC 1 with output m1(n); MIC 2 with output m2(n); error output e(n).]
`
`

`

Figure 2. Situation with an arbitrary number of noise sources, including noise reflections. [Diagram not reproduced. Recoverable labels: SIGNAL S(z); NOISE 1 N1(z) through NOISE n Nn(z); MIC 1; MIC 2.]
`
`

`

Figure 3. Most general case including multiple noise sources and signal reflections. [Diagram not reproduced. Recoverable labels: SIGNAL S(z); NOISE 1 N1(z); NOISE 2 N2(z); NOISE n Nn(z); MIC 2.]
`
`
