`61/403952
`09/24/2010
`PTO/SB/16 (12-08)
Approved for use through 09/30/2010. OMB 0651-0032
U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number.
`PROVISIONAL APPLICATION FOR PATENT COVER SHEET - Page 1 of 2
This is a request for filing a PROVISIONAL APPLICATION FOR PATENT under 37 CFR 1.53(c).
Express Mail Label No.
`--------------------------
INVENTOR(S)
Given Name (first and middle [if any]) | Family Name or Surname | Residence (City and either State or Foreign Country)
Manli | Zhu | Pearl River, NY
Qi | Li | New Providence, NJ
`
`Additional inventors are being named on the
`separately numbered sheets attached hereto.
`TITLE OF THE INVENTION (500 characters max):
`
`Microphone Array Design and Implementation for Telecommunications and Handheld Devices
`
`Direct all correspondence to:
`
`CORRESPONDENCE ADDRESS
`
□ The address corresponding to Customer Number:
OR
[X] Firm or Individual Name: Qi (Peter) Li
Address: Li Creative Technologies, Inc., 25 B Hanover Road, Suite 140
City: Florham Park    State: NJ    Zip: 07932
Country: USA    Telephone: 973-822-0048    Email: li@licreativetech.com

ENCLOSED APPLICATION PARTS (check all that apply)
`
□ Application Data Sheet. See 37 CFR 1.76
□ Drawing(s) Number of Sheets
□ Specification (e.g. description of the invention) Number of Pages
□ CD(s), Number of CDs
[X] Other (specify): descrip w/ drawings 23 ee
`
Fees Due: Filing Fee of $220 ($110 for small entity). If the specification and drawings exceed 100 sheets of paper, an application size fee is
also due, which is $270 ($135 for small entity) for each additional 50 sheets or fraction thereof. See 35 U.S.C. 41(a)(1)(G) and 37 CFR 1.16(s).
`
`METHOD OF PAYMENT OF THE FILING FEE AND APPLICATION SIZE FEE FOR THIS PROVISIONAL APPLICATION FOR PATENT
[X] Applicant claims small entity status. See 37 CFR 1.27.
□ A check or money order made payable to the Director of the United States Patent and Trademark Office
is enclosed to cover the filing fee and application size fee (if applicable).
[X] Payment by credit card. Form PTO-2038 is attached.
□ The Director is hereby authorized to charge the filing fee and application size fee (if applicable) or credit any overpayment to Deposit
Account Number:

TOTAL FEE AMOUNT ($): 110
`USE ONLY FOR FILING A PROVISIONAL APPLICATION FOR PATENT
This collection of information is required by 37 CFR 1.51. The information is required to obtain or retain a benefit by the public which is to file (and by the USPTO to
process) an application. Confidentiality is governed by 35 U.S.C. 122 and 37 CFR 1.11 and 1.14. This collection is estimated to take 10 hours to complete, including
gathering, preparing, and submitting the completed application form to the USPTO. Time will vary depending upon the individual case. Any comments on the
amount of time you require to complete this form and/or suggestions for reducing this burden, should be sent to the Chief Information Officer, U.S. Patent and
Trademark Office, U.S. Department of Commerce, P.O. Box 1450, Alexandria, VA 22313-1450. DO NOT SEND FEES OR COMPLETED FORMS TO THIS
ADDRESS. SEND TO: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450.
If you need assistance in completing the form, call 1-800-PTO-9199 and select option 2.
`
`Page 1 of 32
`
`GOOGLE EXHIBIT 1005
`
`
`
PROVISIONAL APPLICATION COVER SHEET
`Page 2 of 2
`
`
`The invention was made by an agency of the United States Government or under a contract with an agency of the United States Government.
[X] No.
□ Yes, the name of the U.S. Government agency and the Government contract number are: ______________
`
`WARNING:
`Petitioner/applicant is cautioned to avoid submitting personal information in documents filed in a patent application that may
`contribute to identity theft. Personal information such as social security numbers, bank account numbers, or credit card
`numbers (other than a check or credit card authorization form PTO-2038 submitted for payment purposes) is never required by
`the USPTO to support a petition or an application. If this type of personal information is included in documents submitted to the
`USPTO, petitioners/applicants should consider redacting such personal information from the documents before submitting them
`to the USPTO. Petitioner/applicant is advised that the record of a patent application is available to the public after publication of
`the application (unless a non-publication request in compliance with 37 CFR 1.213(a) is made in the application) or issuance of
`a patent. Furthermore, the record from an abandoned application may also be available to the public if the application is
`referenced in a published application or an issued patent (see 37 CFR 1.14). Checks and credit card authorization forms
PTO-2038 submitted for payment purposes are not retained in the application file and therefore are not publicly available.

SIGNATURE: [signature]    Date: 09-22-2010

TYPED or PRINTED NAME: Qi (Peter) Li

REGISTRATION NO.:
`(if appropriate)
`
`TELEPHONE 973-822-0048
`
`Docket Number: _____________ _
`
`
`
`
PTO/SB/17 (10-08)
Approved for use through 09/30/2010. OMB 0651-0032
U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number.

Effective on 12/08/2004.
Fees pursuant to the Consolidated Appropriations Act, 2005 (H.R. 4818).

FEE TRANSMITTAL For FY 2009
[X] Applicant claims small entity status. See 37 CFR 1.27
TOTAL AMOUNT OF PAYMENT ($): 110

Complete if Known
Application Number:
Filing Date: 09-22-2010
First Named Inventor: Manli Zhu
Examiner Name:
Art Unit:
Attorney Docket No.:

METHOD OF PAYMENT (check all that apply)

□ Check  [X] Credit Card  □ Money Order  □ None  □ Other (please identify):
□ Deposit Account  Deposit Account Number:  Deposit Account Name:
For the above-identified deposit account, the Director is hereby authorized to: (check all that apply)
□ Charge fee(s) indicated below
□ Charge fee(s) indicated below, except for the filing fee
□ Charge any additional fee(s) or underpayments of fee(s) under 37 CFR 1.16 and 1.17
□ Credit any overpayments
WARNING: Information on this form may become public. Credit card information should not be included on this form. Provide credit card
information and authorization on PTO-2038.
FEE CALCULATION

1. BASIC FILING, SEARCH, AND EXAMINATION FEES

Application Type | Filing Fee ($) | Small Entity ($) | Search Fee ($) | Small Entity ($) | Examination Fee ($) | Small Entity ($) | Fees Paid ($)
Utility | 330 | 165 | 540 | 270 | 220 | 110 |
Design | 220 | 110 | 100 | 50 | 140 | 70 |
Plant | 220 | 110 | 330 | 165 | 170 | 85 |
Reissue | 330 | 165 | 540 | 270 | 650 | 325 |
Provisional | 220 | 110 | 0 | 0 | 0 | 0 |

2. EXCESS CLAIM FEES

Fee Description | Fee ($) | Small Entity Fee ($)
Each claim over 20 (including Reissues) | 52 | 26
Each independent claim over 3 (including Reissues) | 220 | 110
Multiple dependent claims | 390 | 195

Total Claims ___ - 20 or HP = Extra Claims ___ x Fee ($) ___ = Fee Paid ($) ___
HP = highest number of total claims paid for, if greater than 20.
Indep. Claims ___ - 3 or HP = Extra Claims ___ x Fee ($) ___ = Fee Paid ($) ___
HP = highest number of independent claims paid for, if greater than 3.
Multiple Dependent Claims: Fee ($) ___ Fee Paid ($) ___

3. APPLICATION SIZE FEE
If the specification and drawings exceed 100 sheets of paper (excluding electronically filed sequence or computer
listings under 37 CFR 1.52(e)), the application size fee due is $270 ($135 for small entity) for each additional 50
sheets or fraction thereof. See 35 U.S.C. 41(a)(1)(G) and 37 CFR 1.16(s).
Total Sheets ___ - 100 = Extra Sheets ___ / 50 = Number of each additional 50 or fraction thereof ___ (round up to a whole number) x Fee ($) ___ = Fee Paid ($) ___

4. OTHER FEE(S)
Non-English Specification, $130 fee (no small entity discount)
Other (e.g., late filing surcharge):

SUBMITTED BY
Signature: [signature]
Name (Print/Type): Qi (Peter) Li
Registration No. (Attorney/Agent):
Telephone: 973-822-0048
Date: 09-22-2010

This collection of information is required by 37 CFR 1.136. The information is required to obtain or retain a benefit by the public which is to file (and by the
USPTO to process) an application. Confidentiality is governed by 35 U.S.C. 122 and 37 CFR 1.14. This collection is estimated to take 30 minutes to complete,
including gathering, preparing, and submitting the completed application form to the USPTO. Time will vary depending upon the individual case. Any comments
on the amount of time you require to complete this form and/or suggestions for reducing this burden, should be sent to the Chief Information Officer, U.S. Patent
and Trademark Office, U.S. Department of Commerce, P.O. Box 1450, Alexandria, VA 22313-1450. DO NOT SEND FEES OR COMPLETED FORMS TO THIS
ADDRESS. SEND TO: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450.
If you need assistance in completing the form, call 1-800-PTO-9199 and select option 2.
`
`
`
`
`PTO/SB/92 (07-09)
Approved for use through 07/31/2012. OMB 0561-0031
U.S. Patent and Trademark Office; U.S. DEPARTMENT OF COMMERCE
Under the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it contains a valid OMB control number.
`
`Certificate of Mailing under 37 CFR 1.8
`
`I hereby certify that this correspondence is being deposited with the United States Postal Service
`with sufficient postage as first class mail in an envelope addressed to:
`
`Commissioner for Patents
`P.O. Box 1450
`Alexandria, VA 22313-1450
`
on 09-22-2010 (Date)

Signature: [signature]

Typed or printed name of person signing Certificate: Qi (Peter) Li

Registration Number, if applicable:

Telephone Number: 973-822-0048
`
`Note: Each paper must have its own certificate of mailing, or this certificate must identify
`each submitted paper.
`
`This collection of information is required by 37 CFR 1.8. The information is required to obtain or retain a benefit by the public which is to file (and by the USPTO to
`process) an application. Confidentiality is governed by 35 U.S.C. 122 and 37 CFR 1.11 and 1.14. This collection is estimated to take 1.8 minutes to complete,
`including gathering, preparing, and submitting the completed application form to the USPTO. Time will vary depending upon the individual case. Any comments
`on the amount of time you require to complete this form and/or suggestions for reducing this burden, should be sent to the Chief Information Officer, U.S. Patent
`and Trademark Office, U.S. Department of Commerce, P.O. Box 1450, Alexandria, VA 22313-1450. DO NOT SEND FEES OR COMPLETED FORMS TO THIS
`ADDRESS. SEND TO: Commissioner for Patents, P.O. Box 1450, Alexandria, VA 22313-1450.
`
If you need assistance in completing the form, call 1-800-PTO-9199 and select option 2.
`
`
`
`
`
`Li Creative Technologies, Inc.
`25 B Hanover Road, Suite 140, Florham Park, NJ 07932, USA
`Tel: (973) 822-0048; Fax: (973) 822-0399; Website: www.licreativetech.com
`
`Written Assertion of Small Entity Status
`
`This document states that Li Creative Technologies, Inc. (LcT) is entitled to small entity
`status with regards to our filing a patent application with the USPTO and therefore should
`be permitted to pay reduced fees.
Signature: [signature]
Printed Name: Qi (Peter) Li, President
Date: 09-22-2010
`
`
`
`
`Microphone Array Design and Implementation for Telecommunications and
`Handheld Devices
`
`Manli Zhu and Qi Li
`
`1. Background of the Invention
`
A microphone array consists of a set of microphone sensors located at different positions. The array can
achieve directional gain in any preferred spatial direction and frequency band while suppressing signals
from other directions and bands. The array can be implemented by filtering and summing multiple
microphone outputs. Conventional array processing techniques, typically developed for applications such
as radar and sonar, are generally not appropriate for hands-free or handheld speech acquisition devices.
The main reason is that the desired speech signal has an extremely wide bandwidth relative to its center
frequency, so conventional narrowband techniques are not suitable. Approaches that maintain a constant
response over a wide frequency range usually require a large array; as a result, most microphone array
prototypes and products on the market are quite large, which prevents them from being used more
broadly, for example in mobile and handheld communication devices.
`
`2. Summary of the Invention
`
Our invention is Microphone Array Design and Implementation for Telecommunications and Handheld
Devices. Our invention can be used to produce an arbitrary directivity pattern for arbitrarily distributed
microphones. It can be used to design a microphone array for small, portable communication devices, such
as conference phones, mobile phones, or tablet computers. To illustrate our invention, we present three
applications: (1) a microphone array for a conference phone device with eight microphones non-uniformly
distributed around a circle with a diameter of 4 inches; (2) a microphone array of four microphones
located at the four corners of a rectangle for a wireless phone or handheld device; and (3) a microphone
array of four microphones located on the frame of a tablet computer.
`
`3. Description of the Invention
`
In this section, we provide a complete description of our invention together with all the drawings
necessary to understand the invention. Our invention can be used for arbitrary numbers of microphone
components and arbitrary locations of the microphone components. Our invention can be implemented in
software, in hardware, or in a combination of both.
`
`
`
`
[Figure 1: block diagram of four modules in sequence: Microphone Array, Sound Source Localization, Beamforming, Noise Reduction]
`
`Figure 1. For the best performance, the invented microphone array system may consist of the following
`modules: microphone array sensors, sound source localization, beamforming, and noise reduction.
`
`The microphone array module consists of multiple microphone components functioning as a single unit to
`pick up sound signals. The source localization module serves to find the spatial location of the principal
`sound source such that an acoustic beam can point to the sound source. The beamforming module serves
`to form acoustic beams in the direction of the principal sound source enhancing sound from this range and
`suppressing sound from all other directions. The noise reduction module serves to further reduce
`background noise and enhance speech. Depending on applications, a real product may use some or all of
`the modules.
`
`3.1. Two-Dimensional Microphone Array Configuration
`
[Figure 2: circular array diagram showing microphones M0, M1, M2, and M3 with angle marks at 0°, 90°, 180°, and -90°]

Figure 2. Illustration of a microphone array configuration wherein N microphone sensors are arbitrarily
distributed on a circle with diameter of d (N=4).
`
Assume N microphones are arbitrarily distributed on a circle with diameter of d as shown in Figure 2,
where only four microphones are displayed. Microphone locations are specified as acute angles from the
y-axis, shown as Φ_n (Φ_n ≥ 0, n = 1, ..., N). The output y of the array is the filter-and-sum of the N
microphone outputs, i.e., $y = \sum_{n=0}^{N-1} w_n^T x_n$, where $x_n$ is the output of the (n+1)th
microphone and $w_n$ is the length-L filter applied to it, as shown in Figure 3.
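
The filter-and-sum operation described above can be sketched in a few lines. This is a minimal illustration, assuming each microphone signal is a plain list of samples and each filter is a list of L real taps; the function name `filter_and_sum` and the example values are hypothetical and not taken from the filing.

```python
def filter_and_sum(signals, filters):
    """Return y[t] = sum over n and k of filters[n][k] * signals[n][t - k]."""
    num_samples = len(signals[0])
    output = [0.0] * num_samples
    for x_n, w_n in zip(signals, filters):
        for t in range(num_samples):
            acc = 0.0
            for k, coeff in enumerate(w_n):
                if t - k >= 0:  # samples before t = 0 are treated as zero
                    acc += coeff * x_n[t - k]
            output[t] += acc
    return output


# Hypothetical example: N = 2 microphones receiving the same signal, L = 2 taps.
signals = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
filters = [[0.5, 0.0], [0.5, 0.0]]  # each filter passes half of its input
y = filter_and_sum(signals, filters)
```

With these averaging filters the output reproduces the common input; a filter such as `[0.0, 1.0]` would instead delay its channel by one sample before the sum.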
`
`
`
`
[Figure 3: filter-and-sum diagram with a sound source]

Figure 3. Illustration of filter-and-sum beamforming.
`
The spatial directivity pattern H(ω,θ) for a sound source at angle θ and normalized frequency ω is
defined as:

$$H(\omega,\theta) = \frac{\sum_{n=0}^{N-1} W_n(\omega)\,X_n(\omega,\theta)}{X(\omega,\theta)} \tag{1}$$

where X is the signal received at the center of the circular array and W_n is the frequency response of the
real-valued FIR filter w_n. If the sound source is far enough away from the array, the difference between the
signal received by the (n+1)th microphone, x_n, and the signal at the center of the array is a pure delay τ_n,
i.e., $X_n(\omega,\theta) = X(\omega,\theta)\,e^{-j\omega\tau_n}$. Figure 4 illustrates the distance between
the origin and microphones M1 and M3 when the incoming sound is from angle θ.
`
[Figure 4: circular array diagram]

Figure 4. Illustration of τ1 and τ3, the distances between the origin and microphones M1 and M3 when the
incoming sound is from angle θ.
`
We derived the distance for each microphone, measured both in meters and in number of samples, and
summarize the results in Table 1.
`
`
`
`
Table 1. Distance between each microphone and origin.
Note: d is the radius of the circle, fs is the sampling frequency, and C is the sound speed.

Microphone | Distance (m) | Distance (number of samples)
M0 | d*cos(θ+Φ0) | d*cos(θ+Φ0)*fs/C
M1 | d*cos(θ-Φ1) | d*cos(θ-Φ1)*fs/C
M2 | -d*cos(θ+Φ2) | -d*cos(θ+Φ2)*fs/C
M3 | -d*cos(θ-Φ3) | -d*cos(θ-Φ3)*fs/C
`
`In general, the distance and the location have the following relationship:
`
Table 2. Relationship of microphone position and its distance to the origin.

Microphone position | Distance (m) | Distance (number of samples)
0° | -d*cos(θ) | -d*cos(θ)*fs/C
180° | d*cos(θ) | d*cos(θ)*fs/C
90° | -d*sin(θ) | -d*sin(θ)*fs/C
-90° | d*sin(θ) | d*sin(θ)*fs/C
Φ clockwise away from 0° (0<Φ<90°) | -d*cos(θ-Φ) | -d*cos(θ-Φ)*fs/C
Φ anticlockwise away from 0° (0<Φ<90°) | -d*cos(θ+Φ) | -d*cos(θ+Φ)*fs/C
Φ clockwise away from 180° (0<Φ<90°) | d*cos(θ-Φ) | d*cos(θ-Φ)*fs/C
Φ anticlockwise away from 180° (0<Φ<90°) | d*cos(θ+Φ) | d*cos(θ+Φ)*fs/C
`
Now, the spatial directivity pattern H can be re-written as:

$$H(\omega,\theta) = w^T g(\omega,\theta) \tag{2}$$

where $w = [w_0^T, w_1^T, w_2^T, w_3^T, \ldots, w_{N-1}^T]^T$ and
$g(\omega,\theta) = \{g_i(\omega,\theta)\}_{i=1\ldots NL} = \{e^{-j\omega(k+\tau_n(\theta))}\}_{i=1\ldots NL}$
is the steering vector, i = 1, ..., NL, k = mod(i-1, L), and n = floor((i-1)/L).
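
As an illustration of the indexing in the steering vector, the sketch below builds g for a given set of per-microphone delays (in samples) and evaluates H as the inner product of the stacked filter taps with g. The function names and the example delays and weights are hypothetical; in practice the delays would come from Table 1 or Table 2 for a chosen angle.

```python
import cmath


def steering_vector(omega, delays, num_taps):
    """g_i = exp(-j*omega*(k + tau_n)), with k = (i-1) mod L and n = floor((i-1)/L)."""
    return [cmath.exp(-1j * omega * (k + tau_n))
            for tau_n in delays          # outer loop over microphones n
            for k in range(num_taps)]    # inner loop over taps k


def directivity(weights, omega, delays, num_taps):
    """H(omega, theta) as the product of the stacked real weights with g."""
    g = steering_vector(omega, delays, num_taps)
    return sum(w_i * g_i for w_i, g_i in zip(weights, g))


# Hypothetical example: N = 4 microphones, L = 2 taps, all delays zero
# (sound reaching every microphone at the same time).
delays = [0.0, 0.0, 0.0, 0.0]
weights = [0.25, 0.0] * 4  # each filter keeps a quarter of its input
H = directivity(weights, 0.3, delays, 2)
```

With zero delays and averaging weights the response is exactly 1; with nonzero delays the terms no longer add in phase and the magnitude of H drops.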
`
`3.2. Extension to 3-Dimensional Sound Source
`
The calculation in Section 3.1 is for a sound source in the same plane as the array. In real applications,
the sound can come from any direction in 3-D space. We generalize the problem as shown in Figure 5.
The sound comes from three-dimensional (3-D) space, where Ψ is the elevation angle and θ is the azimuth.
We have proved that when the sound comes from the direction (Ψ,θ), the delay between each microphone
and the center of the array is similar to Table 2 but with an extra factor sin(Ψ), as shown in Table 3. When
Ψ moves from 90° to 0°, sin(Ψ) changes from 1 to 0, and as a result, the difference between the
microphones gets smaller and smaller. When Ψ=0°, there is no difference between microphones, which
means the sound reaches each microphone at the same time. Taking into account that the sample delay
between microphones can only be an integer, we determine the range where all microphones receive
identical signals. As shown in Figure 6, when Ψ<Φ, the four microphones receive identical signals for
0°<θ<360°. Our beamforming technique enhances sound from this range and suppresses sound from all
other directions, treating it as background noise.
`
`
`
`
[Figure 5: 3-D diagram of the sound source S relative to the array]

Figure 5. Illustration of a 3-D sound source: The sound is from the direction (Ψ,θ), where Ψ is the elevation angle
and θ is the azimuth.
`
Table 3. The delay between each microphone and the array center for sound from (Ψ,θ).

τ1 = -r*fs*sin(θ+Φ)*sin(Ψ)/C | τ2 = -r*fs*sin(θ-Φ)*sin(Ψ)/C | τ3 = r*fs*sin(θ+Φ)*sin(Ψ)/C | τ4 = r*fs*sin(θ-Φ)*sin(Ψ)/C
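
To make the integer-delay argument concrete, the sketch below evaluates the four Table 3 delays and checks whether they all round to zero samples, in which case the microphones would receive identical sampled signals. The radius, sampling rate, and sound speed are hypothetical values chosen for illustration only, and the function names are not from the filing.

```python
import math


def table3_delays(theta, phi, psi, radius, fs, c=343.0):
    """Delays in samples for the four microphones, following the Table 3 expressions."""
    scale = radius * fs * math.sin(psi) / c
    tau1 = -scale * math.sin(theta + phi)
    tau2 = -scale * math.sin(theta - phi)
    tau3 = scale * math.sin(theta + phi)
    tau4 = scale * math.sin(theta - phi)
    return [tau1, tau2, tau3, tau4]


def all_delays_round_to_zero(theta, phi, psi, radius, fs, c=343.0):
    """True when every delay rounds to 0 samples (identical sampled signals)."""
    return all(round(tau) == 0
               for tau in table3_delays(theta, phi, psi, radius, fs, c))


# Hypothetical numbers: 2 cm radius, 16 kHz sampling, microphones at phi = 45 degrees.
RADIUS, FS, PHI = 0.02, 16000.0, math.pi / 4
low_elevation = all_delays_round_to_zero(math.pi / 4, PHI, math.pi / 6, RADIUS, FS)
high_elevation = all_delays_round_to_zero(math.pi / 4, PHI, math.pi / 2, RADIUS, FS)
```

For these assumed values the largest possible delay is about 0.93 samples, so a source at Ψ = 30° gives delays that all round to zero, while a source at Ψ = 90° does not.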
`
[Figure 6: diagram of the array working space]

Figure 6. Illustration of the array working space: When sound comes from Ψ<Φ, the four microphones receive the
same signals. Our beamforming technique enhances sound from this range and treats sound from other directions
(for example S1 and S2) as background noise to suppress.
`
`3.3. Least Mean Square Solution
`
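Let the desired spatial directivity pattern be 1 in the pass band and 0 in the stop band. The least-square cost
function can be defined as

$$J(w) = \int_{\Omega_p}\int_{\Theta_p} |H(\omega,\theta)-1|^2 \, d\omega\, d\theta + \alpha \int_{\Omega_s}\int_{\Theta_s} |H(\omega,\theta)|^2 \, d\omega\, d\theta$$

$$= \int_{\Omega_p}\int_{\Theta_p} |H(\omega,\theta)|^2 \, d\omega\, d\theta + \alpha \int_{\Omega_s}\int_{\Theta_s} |H(\omega,\theta)|^2 \, d\omega\, d\theta - 2\int_{\Omega_p}\int_{\Theta_p} \mathrm{Re}(H(\omega,\theta)) \, d\omega\, d\theta + \int_{\Omega_p}\int_{\Theta_p} 1 \, d\omega\, d\theta \tag{3}$$

Here the subscripts p and s denote the pass band and the stop band, and α weights the stop-band term.

Replacing $|H(\omega,\theta)|^2 = w^T g(\omega,\theta) g^H(\omega,\theta) w = w^T G(\omega,\theta) w = w^T (G_R(\omega,\theta) + jG_I(\omega,\theta)) w = w^T G_R(\omega,\theta) w$ and
$\mathrm{Re}(H(\omega,\theta)) = w^T g_R(\omega,\theta)$, we then have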
`
`
`
`
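$$J(w) = w^T Q w - 2 w^T a + d, \tag{4}$$

where

$$Q = \int_{\Omega_p}\int_{\Theta_p} G_R(\omega,\theta)\, d\omega\, d\theta + \alpha \int_{\Omega_s}\int_{\Theta_s} G_R(\omega,\theta)\, d\omega\, d\theta, \quad a = \int_{\Omega_p}\int_{\Theta_p} g_R(\omega,\theta)\, d\omega\, d\theta, \quad d = \int_{\Omega_p}\int_{\Theta_p} 1 \, d\omega\, d\theta.$$

When $\partial J / \partial w = 0$, the cost function J is minimized. The least-square estimate of w is obtained by

$$w = Q^{-1} a. \tag{5}$$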
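
A rough numerical sketch of this least-square design follows. It assumes the integrals behind Q and a are approximated by sums over a grid of (ω, θ) points, with Q accumulating the real part of g g^H over the pass band plus α times the same over the stop band, and a accumulating the real part of g over the pass band. The two-microphone delay model, the grids, and the small diagonal loading term are hypothetical choices for illustration, not values from the filing.

```python
import math


def steering_parts(omega, delays, num_taps):
    """Real and imaginary parts of g_i = exp(-j*omega*(k + tau_n))."""
    phases = [omega * (k + tau) for tau in delays for k in range(num_taps)]
    return [math.cos(p) for p in phases], [-math.sin(p) for p in phases]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def design_ls(pass_grid, stop_grid, delay_fn, num_taps, alpha=1.0, loading=1e-3):
    """Grid approximation of Q and a, then w = Q^{-1} a (Q includes the loading)."""
    size = len(delay_fn(0.0)) * num_taps
    Q = [[loading if i == j else 0.0 for j in range(size)] for i in range(size)]
    a = [0.0] * size
    for grid, weight, is_pass in ((pass_grid, 1.0, True), (stop_grid, alpha, False)):
        for omega, theta in grid:
            g_re, g_im = steering_parts(omega, delay_fn(theta), num_taps)
            for i in range(size):
                for j in range(size):
                    # Re(g g^H)_ij = Re(g_i)Re(g_j) + Im(g_i)Im(g_j)
                    Q[i][j] += weight * (g_re[i] * g_re[j] + g_im[i] * g_im[j])
                if is_pass:
                    a[i] += g_re[i]
    return solve(Q, a), Q, a


def two_mic_delays(theta):
    # Hypothetical geometry: two microphones one sample apart on the 0-180 degree axis.
    return [0.5 * math.cos(theta), -0.5 * math.cos(theta)]


omegas = [0.5, 1.0, 1.5, 2.0, 2.5]
pass_grid = [(om, th) for om in omegas for th in (-0.3, 0.0, 0.3)]
stop_grid = [(om, th) for om in omegas for th in (1.2, 1.6, 2.0, 2.4, 2.8)]
w, Q, a = design_ls(pass_grid, stop_grid, two_mic_delays, num_taps=2)
```

The diagonal loading is only a numerical safeguard against an ill-conditioned Q; setting it to zero recovers the plain solution when Q is invertible.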
`
3.4. Linear Constraints

Applying linear constraints $Cw = b$, we can further constrain the spatial response to a predefined value b
at angle θ_r using the following equation:

(6)

Now, the design problem becomes

$$\min_{w} \; w^T Q w - 2 w^T a + d \quad \text{subject to } Cw = b \tag{7}$$

and the solution of the constrained minimization problem is:

$$w = Q^{-1} a - Q^{-1} C^T \left(C Q^{-1} C^T\right)^{-1} \left(C Q^{-1} a - b\right) \tag{8}$$

where w is the filter parameters for the designed beamformer.
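
The constrained design can be illustrated for the simplest case of a single constraint row, where C reduces to a vector c and b to a scalar. Under that assumption the standard Lagrange-multiplier solution is the unconstrained solution Q^{-1}a corrected along Q^{-1}c so that c^T w = b holds exactly. The sketch below is only that special case, with a small hypothetical Q, a, c, and b; it does not reproduce the specific constraint matrix used in the filing.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def constrained_ls(Q, a, c, b):
    """Minimize w^T Q w - 2 w^T a subject to the single constraint c^T w = b."""
    w_free = solve(Q, a)    # unconstrained solution Q^{-1} a
    q_inv_c = solve(Q, c)   # Q^{-1} c
    violation = sum(ci * wi for ci, wi in zip(c, w_free)) - b
    scale = violation / sum(ci * qi for ci, qi in zip(c, q_inv_c))
    return [wi - scale * qi for wi, qi in zip(w_free, q_inv_c)]


# Hypothetical 2-tap example: force the two weights to sum to 1.
Q = [[2.0, 0.0], [0.0, 1.0]]
a = [1.0, 1.0]
w = constrained_ls(Q, a, c=[1.0, 1.0], b=1.0)
```

In this example the unconstrained solution is (0.5, 1.0), which sums to 1.5; the correction moves it to the closest point, in the metric defined by Q, that satisfies the constraint.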
`
`3.5 Sound Source Localization
`
There are two categories of techniques for sound source localization: one employs the time difference of
arrival (TDOA) and the other is based on the steered response power (SRP).

For an array with N microphones, a delayed, filtered, and noise-corrupted version of the sound signal is
present in each of the microphone signals. The delay-and-sum beamformer time-aligns and sums all
the microphone signals as $y(t,q) = \sum_{n=0}^{N-1} x_n(t+\Delta_n)$, where $\Delta_n$ is the steering
delay appropriate for focusing the array in the direction q. When the focus corresponds to the location of
the sound source, the steered response power (SRP) should reach a global maximum.
`
Time difference of arrival can be used to estimate the sound source location. According to sound
propagation theory, the sound direction is uniquely determined by the time difference for a wave to
propagate through non-linearly distributed distant microphones. Estimating the sound direction is
essentially equivalent to estimating the TDOA, which is achieved by estimating the cross-correlation.
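
The cross-correlation step can be illustrated with a brute-force search over candidate lags between two microphone signals. This is a plain time-domain cross-correlation written for clarity, not the SRP-PHAT method; the function name, the lag range, and the synthetic test signal are hypothetical.

```python
import random


def estimate_tdoa(sig, ref, max_lag):
    """Return the lag (in samples) that maximizes sum over t of sig[t + lag] * ref[t]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t, r in enumerate(ref):
            if 0 <= t + lag < len(sig):
                score += sig[t + lag] * r
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic example: the second microphone hears the same noise 3 samples later.
rng = random.Random(0)
ref = [rng.gauss(0.0, 1.0) for _ in range(200)]
sig = [0.0] * 3 + ref[:-3]
lag = estimate_tdoa(sig, ref, max_lag=10)
```

A positive lag means `sig` lags behind `ref`. Converting the lag to an angle requires the microphone spacing, the sampling frequency, and the sound speed, which are left out of this sketch.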
`
Our preliminary research showed that TDOA-based localization is effective under low to moderate
reverberation conditions. The SRP approach requires shorter analysis intervals and exhibits an elevated
insensitivity to environmental conditions, but does not allow for use under excessive multi-path. We
implemented a new method called SRP-PHAT, which combines the advantages of the two approaches and
has a decreased sensitivity to noise and reverberation and more precise location estimates than the
existing localization methods.

Figure 7 shows our experimental results. The upper plot is the value of SRP-PHAT at each angle. The
minimum value corresponds to the sound location.
`
[Figure 7: upper panel, SRP-PHAT value versus angle; lower panel, polar plot of the estimate and the ground truth]

Figure 7. The upper image shows the value of SRP-PHAT for every 10°; the lower image represents the estimation
and ground truth.
`
`3.6 Adaptive Beamforming
`
Sections 3.2 and 3.3 introduce the algorithm used to derive the fixed beamformer that forms the directivity
pattern. We further extend it to adaptive beamforming. Adaptive beamforming can achieve better
interference suppression than fixed beamforming. This is because the target direction of arrival, which is
assumed to be stable in fixed beamforming, does change with the movement of the speaker. Also, the
sensor gains, which are assumed uniform in fixed beamforming, exhibit significant variation. All these
factors reduce speech quality. Adaptive beamforming, on the other hand, adaptively performs beam
steering and null steering; therefore, the adaptive beamforming method is more robust against steering
errors caused by the array imperfections mentioned above.
`
The structure of our adaptive beamforming method is shown in Figure 8. It comprises a fixed
beamformer, a blocking matrix (BM), and a set of adaptive filters. The purpose of the blocking matrix is
to block the target signal and let interfering noises through. The interfering noises are fed into an adaptive
filter to minimize their influence on the output. One of the key steps in adaptive beamforming is to
determine when the adaptation should be applied. Because of signal leakage, the output z of the blocking
matrix may contain some weak speech signals. If the adaptation is active when speech is present, the
speech will be cancelled out together with the noise; therefore, our invention uses a control module on the
adaptation. This module enables adaptation according to the spectrum and energy of both the noise signal
and the speech signal.
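
The interplay between the adaptive filter and the adaptation control can be sketched with a simple full-band, time-domain canceller. The sketch assumes a normalized LMS (NLMS) update, which is a common choice but is not stated in the filing, and it takes the control decision as a per-sample flag supplied by the caller. All names and numbers are hypothetical.

```python
import random


def adaptive_canceller(b, z, adapt, num_taps=4, mu=0.5, eps=1e-8):
    """Subtract a filtered version of the noise reference z from the beamformer output b.

    b     : fixed beamformer output (speech plus residual noise)
    z     : blocking matrix output (noise reference)
    adapt : per-sample flags; the filter is updated only where the flag is true
    """
    h = [0.0] * num_taps    # adaptive filter coefficients
    buf = [0.0] * num_taps  # most recent noise-reference samples
    out = []
    for b_t, z_t, adapt_t in zip(b, z, adapt):
        buf = [z_t] + buf[:-1]
        e = b_t - sum(h_k * z_k for h_k, z_k in zip(h, buf))
        out.append(e)
        if adapt_t:  # in this sketch the control module is just this flag
            norm = sum(z_k * z_k for z_k in buf) + eps
            h = [h_k + mu * e * z_k / norm for h_k, z_k in zip(h, buf)]
    return out


# Noise-only example: b and z are scaled copies of the same interference.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
b = [0.75 * n for n in noise]
z = [0.5 * n for n in noise]
cleaned = adaptive_canceller(b, z, adapt=[True] * len(b))
frozen = adaptive_canceller(b, z, adapt=[False] * len(b))
```

With adaptation enabled during this noise-only segment, the residual decays toward zero; with adaptation disabled, the filter stays at zero and the output equals b. In a real system the flags would come from a detector that compares the spectrum and energy of b and z.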
`
[Figure 8: block diagram with a fixed beamformer (output b, speech signal), a blocking matrix (BM, output z, noise), an adaptive filter, and a control module driven by the spectrum and/or energy of b and z]

Figure 8. Diagram of adaptive beamforming: It consists of a fixed beamformer, the blocking matrix, and the
adaptive filter. A control module is applied to enable/disable the adaptation process.
`
In Figure 8, the dotted block represents our adaptive filtering process. We developed a sub-band adaptive
filter for this invention for two reasons. First, it leads to a higher convergence speed than a
full-band adaptive filter. Second, our noise reduction algorithm is developed in sub-bands, so applying
sub-band adaptive filtering here provides the same framework for both beamforming and noise reduction
and saves computational cost. Figure 9 shows the structure of our sub-band adaptive filtering. Both
input signals are split into frequency sub-bands via an analysis filter bank. Each sub-band adaptive filter
usually has a shorter impulse response than its full-band counterpart. The step size can be adjusted
individually for each sub-band, which leads to a higher convergence speed than a full-band
filter.
`
`
`
`
[Figure 9: block diagram titled "Sub-band adaptive filtering". The output b of fixed beamforming and the output z of the blocking matrix each pass through an analysis filter bank, followed by adaptation and a synthesis filter bank that produces the speech signal. Stages: Analysis, Adaptation, Synthesis]

Figure 9. The structure of the sub-band adaptive filter: In the analysis step, the outputs of both the fixed beamformer
and the blocking matrix are split into sub-bands through the analysis filter bank. In the adaptation step, the filter is
adapted such that the output only contains the speech signal. Finally, in the synthesis step, the sub-band speech signal
is synthesized to full-band speech through the synthesis filter bank. Because noise reduction and beamforming are in
the same sub-band framework, we applied noise reduction (NR) before synthesis to save computation. The NR
module will be introduced in the next section.
`
To ensure speech quality, the filter bank should not distort the sound signal by itself. We have already
implemented an efficient perfect-reconstruction filter bank, which fully meets this requirement. In this
implementation, all sub-band filters are factorized into operations on the prototype filter coefficients, and a
modulation matrix is used to take advantage of the FFT. This modification ensures a minimum number of
multiply-accumulate operations. Figure 10 shows the performance of our filter bank. The blue line
represents the input signal to the filter bank, and the red circles are the output of the filter bank after analysis
and synthesis. The output perfectly matches the input, which is why it is called a perfect-reconstruction filter bank.
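
The perfect-reconstruction property can be demonstrated with a much simpler filter bank than the one described here. The sketch below uses a sine window with 50% overlap and a naive DFT, for which the squared windows of overlapping frames add up to one, so analysis followed by synthesis returns the input wherever two frames overlap. It is a stand-in for illustration only and is not the oversampled GDFT filter bank of the filing; the frame length and test signal are hypothetical.

```python
import cmath
import math

N = 8         # hypothetical frame length (number of sub-bands)
HOP = N // 2  # 50% overlap
# Sine window: WINDOW[n]**2 + WINDOW[n + HOP]**2 == 1 for every n < HOP.
WINDOW = [math.sin(math.pi * n / N) for n in range(N)]


def dft(frame):
    return [sum(x * cmath.exp(-2j * math.pi * k * n / N) for n, x in enumerate(frame))
            for k in range(N)]


def idft(spectrum):
    return [sum(X * cmath.exp(2j * math.pi * k * n / N) for k, X in enumerate(spectrum)).real / N
            for n in range(N)]


def analysis(x):
    """Split x into windowed, overlapping frames and transform each to sub-bands."""
    return [dft([w * s for w, s in zip(WINDOW, x[i:i + N])])
            for i in range(0, len(x) - N + 1, HOP)]


def synthesis(subbands, length):
    """Inverse-transform each frame, window it again, and overlap-add."""
    y = [0.0] * length
    for m, spectrum in enumerate(subbands):
        for n, value in enumerate(idft(spectrum)):
            y[m * HOP + n] += WINDOW[n] * value
    return y


x = [math.sin(0.3 * t) + 0.1 * t for t in range(32)]
y = synthesis(analysis(x), len(x))
```

Only the first and last half-frame, which are covered by a single window, differ from the input; every sample in between is recovered up to rounding error.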
`
`
`
`
[Figure 10: plot titled "I/O for real valued GDFT OSFB"; legend: input, filterbank output; x-axis: time [fullband sampling periods], x 10^5]

Figure 10. Perfect reconstruction filter bank input and output: The blue line represents the input signal to the filter
bank and the red circles are the output of the filter bank after analysis and synthesis. The output perfectly matches the input.
`
The noise reduction (NR) module shown in Figure 9 is used to further reduce background noise after
adaptive beamforming. It exploits the short-term and long-term statistics of speech and noise, and the
wide-band and narrow-band signal-to-noise ratios (SNR), to support Wiener gain filtering. After the
spectrum of the noisy speech passes through the Wiener filter, an estimate of the clean-speech spectrum is
generated. The filter bank synthesis module, as the inverse of the filter bank analysis module,
reconstructs the clean speech signal from the estimated clean-speech spectrum.
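
A minimal sketch of a per-band Wiener gain follows, assuming the textbook form gain = SNR / (1 + SNR), with the SNR estimated in each band from the noisy-speech power and a noise power estimate. The gain floor and the example numbers are hypothetical, and the statistics tracking described above is not modeled.

```python
def wiener_gains(noisy_power, noise_power, floor=0.1):
    """Per-band gain SNR / (1 + SNR), limited below by a floor.

    noisy_power : power of the noisy speech in each sub-band
    noise_power : estimated noise power in each sub-band (must be positive)
    """
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        snr = max(p_y - p_n, 0.0) / p_n  # estimated speech-to-noise power ratio
        gains.append(max(snr / (1.0 + snr), floor))
    return gains


# Two hypothetical bands: one dominated by speech, one containing only noise.
gains = wiener_gains(noisy_power=[4.0, 1.0], noise_power=[1.0, 1.0])
```

Bands with high SNR pass almost unchanged, while noise-only bands are attenuated down to the floor, which limits artifacts from over-suppression.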
`
`3.7 Noise reduction
`
The noise reduction module can include any kind of noise reduction algorithm, such as Wiener filter-based
noise reduction, spectral subtraction noise reduction, auditory (or cochlear) transform-based noise
reduction, or model-based noise reduction.
`
`3.8 Hardware Implementation:
`
The structure of the circuit design is shown in Figure 11. The acoustic signal is picked up by four or eight
microphone components/elements arranged as a linear or circular array. First, the microphone amplifiers
provide 20 dB of gain to boost the signal level and enhance the microphone sensitivity, and then the audio
Codec provides an adjustable gain level from -74 dB to 6 dB before it converts the four channels of analog
signals into digital signals. The pre-amplifier may not be needed for some applications. The Codec then
transmits the digital audio signals to the DSP (digital signal processing) chip for audio signal processing and
computation. The DSP chip also transmits the output signal to the Codec, which converts it into an
analog signal that is then amplified by the speaker amplifier to drive the internal loudspeaker if needed.
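
For a sense of scale, the gain figures above can be converted from decibels to linear amplitude ratios. The sketch assumes the gains are voltage (amplitude) gains, so the ratio is 10^(dB/20); the filing does not say whether amplitude or power gain is meant.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in dB to a linear amplitude ratio, assuming 20*log10 scaling."""
    return 10.0 ** (gain_db / 20.0)


preamp = db_to_amplitude_ratio(20.0)      # microphone amplifier
codec_min = db_to_amplitude_ratio(-74.0)  # lowest Codec gain setting
codec_max = db_to_amplitude_ratio(6.0)    # highest Codec gain setting
```

Under this assumption the microphone amplifier multiplies the signal amplitude by 10, and the Codec range spans from roughly 0.0002 times to about 2 times the input amplitude.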
`
The flash memory stores the code for the DSP chip and compressed audio signals. Once the system boots
up, the DSP chip reads the code from flash memory into internal memory and starts to execute the code.
During the start-up stage, we can also configure the Codec by writing to the registers of the DSP chip.
Switch power regulators and linear power regulators provide the appropriate voltage and current
supply for all the components on the board.
`
[Figure 11: block diagram spanning the analog and digital domains, with blocks for the external microphone socket, four microphone amplifiers, external headphone, speaker amplifier, audio Codec, DSP chip, flash memory, linear power regulators, and switch power regulators]

Figure 11. Hardware implementation of the invention: It consists of 3 major chips: codec, DSP, and flash memory.
The USB control is built into the DSP chip. For an 8-sensor microphone array, we can use two four-channel codec chips.
`
We will use a mixed-signal circuit board (6-layer PCB). The board layout will be carefully partitioned to
isolate the analog circuits from the digital circuits, because the noisy digital signals can easily contaminate
the low-voltage analog signals from the microphones. Although the speaker amplifier's input and output are
analog signals, it will be placed in the digital region because of its high power consumption and its
`switch amplifier nature. Only linear power regulators are deployed in the analog region due to their low
`noise pro