United States Patent [19]
Holzrichter

[54] METHODS AND APPARATUS FOR NON-ACOUSTIC SPEECH CHARACTERIZATION AND RECOGNITION

[75] Inventor: John F. Holzrichter, Berkeley, Calif.

[73] Assignee: The Regents of the University of California, Oakland, Calif.

[21] Appl. No.: 08/597,596

[22] Filed: Feb. 6, 1996

[51] Int. Cl.6 .............................. G10L 3/02
[52] U.S. Cl. ............................... 704/208; 704/205; 704/206; 704/207
[58] Field of Search ....................... 395/2.1, 2.16, 2.27, 2.37, 2.67, 2.17, 2.15, 2.14; 704/205-208, 201, 218, 228, 258
`
[56]                    References Cited

                  U.S. PATENT DOCUMENTS

2,193,102   3/1940   Koch ............................ 250/6
2,539,594   1/1951   Rines et al. .................... 250/17
2,823,365   2/1958   Rines ........................... 340/6
3,555,188   1/1971   Meacham ......................... 381/115
3,699,856  10/1972   Chabot et al. ................... 95/1.1
3,925,774  12/1975   Amlung .......................... 340/258
4,027,303   5/1977   Neuwirth et al. ................. 340/258
4,092,493   5/1978   Rabiner et al. .................. 179/1
4,260,229   4/1981   Bloomstein ...................... 352/50
4,461,025   7/1984   Franklin ........................ 381/56
4,621,348  11/1986   Tender .......................... 367/116
4,769,845   9/1988   Nakamura ........................ 381/43
4,783,803  11/1988   Baker et al. .................... 381/42
4,803,729   2/1989   Baker ........................... 381/43
4,882,746  11/1989   Shimada ......................... 455/462
4,903,305   2/1990   Gillick et al. .................. 381/41
4,914,703   4/1990   Gillick ......................... 381/43
5,008,941   4/1991   Sejnoha ......................... 381/43
5,027,406   6/1991   Roberts et al. .................. 381/43
5,030,956   7/1991   Murphy .......................... 342/22
5,127,055   6/1992   Larkey .......................... 381/43
5,202,952   4/1993   Gillick et al. .................. 395/2
5,227,797   7/1993   Murphy .......................... 342/22
5,280,563   1/1994   Ganong .......................... 395/2
5,337,394   8/1994   Sejnoha ......................... 395/2.5
`
US006006175A

[11] Patent Number:   6,006,175
[45] Date of Patent:  Dec. 21, 1999

          U.S. PATENT DOCUMENTS (continued)

5,345,471   9/1994   McEwan .......................... 375/1
5,361,070  11/1994   McEwan .......................... 342/21
5,386,492   1/1995   Wilson et al. ................... 395/2.61
5,388,183   2/1995   Lynch ........................... 395/2.51
5,390,278   2/1995   Gupta et al. .................... 395/2.52
5,428,707   6/1995   Gould et al. .................... 395/2.4
5,573,012  11/1996   McEwan .......................... 128/782
`
OTHER PUBLICATIONS

Rabiner, L. R. "Applications of Voice Processing to Telecommunications", Proc. of the IEEE, 82(2), 199-228 (Feb. 1994).
Skolnik, M. I. (ed.) "Radar Handbook, 2nd ed.", McGraw-Hill, page v (1990).
Waynant, R. W. and Ediger, M. N. (eds.) "Electro-Optics Handbook", McGraw-Hill, p. 24.22 (1994).
Flanagan, J. L. "Speech Analysis Synthesis, and Perception", Academic Press, NY, pp. 8, 16-20, 154-156 (1965).
Coker, C. H. "A Model of Articulatory Dynamics and Control", Proc. IEEE, 64(4), 452-459 (1976).
Javkin, H. et al. "Multi-Parameter Speech Training System", Speech and Language Technology for Disabled Persons, Proceedings of a European Speech Communication Association (ESCA) Workshop, Stockholm, Sweden, 137-140 (May 31, 1993).

(List continued on next page.)
`
Primary Examiner-Tariq R. Hafiz
Attorney, Agent, or Firm-John P. Wooldridge

[57] ABSTRACT

By simultaneously recording EM wave reflections and acoustic speech information, the positions and velocities of the speech organs as speech is articulated can be defined for each acoustic speech unit. Well defined time frames and feature vectors describing the speech, to the degree required, can be formed. Such feature vectors can uniquely characterize the speech unit being articulated during each time frame. The onset of speech, rejection of external noise, vocalized pitch periods, articulator conditions, accurate timing, the identification of the speaker, acoustic speech unit recognition, and organ mechanical parameters can be determined.

48 Claims, 31 Drawing Sheets
`
OTHER PUBLICATIONS (continued)

Papcun, G. et al. "Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data", J. Acoustic Soc. Am. 92(2), 688-700 (Aug. 1992).
Olive, J. P. et al. "Acoustics of American English Speech", Springer-Verlag, pp. 79-80 (1993).
Hirose, H. and Gay, T. "The Activity of the Intrinsic Laryngeal Muscles in Voicing Control", Phonetica 25, 140-164 (1972).
Tuller, B. et al. "An evaluation of an alternating magnetic field device for monitoring tongue movements", J. Acoust. Soc. Am. 88(2), 674-679 (Aug. 1990).
Gersho, A. "Advances in Speech and Audio Compression", Proceedings of the IEEE 82(6), 900-918 (1994).
Schroeter, J. and Sondhi, M. M. "Techniques for Estimating Vocal-Tract Shapes from the Speech Signal", IEEE Trans. on Speech and Audio Processing 2(1), Part II, 133-150 (Jan. 1994).
Atal, B. S. and Hanauer, S. L. "Speech Analysis and Synthesis by Linear Prediction of the Speech Wave", J. Acoustic Soc. Am. 50(2), Part II, 637-655 (1971).
Furui, S. "Cepstral Analysis Technique for Automatic Speaker Verification", IEEE Trans. on Acoustics, Speech, and Signal Processing, ASSP-29(2), 254-272 (1981).
Rabiner, L. and Juang, B.-H. "Fundamentals of Speech Recognition", Prentice Hall, pp. 436-438, 494 (1993).
[Drawing Sheets 1-31 (FIGS. 1-27), U.S. Pat. No. 6,006,175, Dec. 21, 1999. The drawings themselves are not recoverable from the OCR text; surviving labels indicate: block diagrams of the combined EM-sensor/acoustic speech recognition system and its sensor, A/D, memory, feature-vector, pattern-matching, and word-assembly units (FIGS. 1, 3, 5, 12, 13); head cross sections showing the vocal organs (lips, teeth, tongue, velum, pharynx, vocal folds) and EM sensor placement (FIGS. 2, 4); EM-sensor and acoustic signal traces versus time (FIGS. 6, 7, 15A-15D, 16, 17, 20A-20B, 26A-26D, 27); EM wave transmission, range-gate, and bin timing diagrams (FIGS. 8A-8B, 10A-10C, 11A-11D, 21A-21C, 22A-22B); decision flowcharts for combining acoustic and EM-sensor recognition, detecting organ contact, and detecting the onset and end of speech (FIGS. 14, 18, 19); and further sensor-configuration and response plots (FIGS. 9, 23-25).]
METHODS AND APPARATUS FOR NON-ACOUSTIC SPEECH CHARACTERIZATION AND RECOGNITION

The United States Government has rights in this invention pursuant to Contract No. W-7405-ENG-48 between the United States Department of Energy and the University of California for the operation of Lawrence Livermore National Laboratory.
`
BACKGROUND OF THE INVENTION

The invention relates generally to speech recognition and more particularly to the use of nonacoustic information in combination with acoustic information for speech recognition and related speech technologies.

Speech Recognition

The development history of speech recognition (SR) technology has spanned four decades of intensive research. In the '50s, SR research was focused on isolated digits, monosyllabic words, speaker dependence, and phonetic-based attributes. Feature descriptions included a set of attributes like formants, pitch, voiced/unvoiced, energy, nasality, and frication, associated with each distinct phoneme. The set of numerical attributes from such a phonetic description is called a feature vector. In the '60s, researchers addressed the problem that time intervals spanned by units like phonemes, syllables, or words are not maintained at fixed proportions of utterance duration, from one speaker to another or from one speaking rate to another. No adequate solution was found for aligning the sounds in time in such a way that statistical analysis could be used. Variability in phonetic articulation due to changes in speaker vocal organ positioning was found to be a key problem in speech recognition. Variability was in part due to sounds running together (often causing incomplete articulation), or half-way organ positioning between two sounds (often called coarticulation). Variability due to speaker differences was also very difficult to deal with. By the early '70s, the phonetic-based approach was virtually abandoned because of the limited ability to solve the above problems. A much more efficient way to extract and store acoustic feature vectors, and relate acoustic patterns to underlying phonemic units and words, was needed.
In the 1970s, workers in the field showed that short "frames" (e.g., 10 ms intervals) of the time waveform could be well approximated by an all-poles (but no zeros) analytic representation, using numerical "linear predictive coding" (LPC) coefficients found by solving covariance equations. Specific procedures are described in B. S. Atal and S. L. Hanauer, "Speech analysis and synthesis by linear prediction of the speech wave," J. Acoust. Soc. Am. 50(2), 637 (1971) and L. Rabiner, U.S. Pat. No. 4,092,493. Better coefficients for achieving accurate speech recognition were shown to be the Cepstral coefficients, e.g., S. Furui, "Cepstral analysis technique for automatic speaker verification," IEEE Trans. on Acoust. Speech and Signal Processing, ASSP-29(2), 254 (1981). They are Fourier coefficients of the expansion of the logarithm of the absolute value of the corresponding short time interval power spectrum. Cepstral coefficients effectively separate excitation effects of the vocal cords from resonant transfer functions of the vocal tract. They also capture the characteristic that human hearing responds to the logarithm of changes in the acoustic power, and not to linear changes. Cepstral coefficients are related directly to LPC coefficients. They provide a mathematically accurate method of approximation requiring only a small number of values. For example, 12 to 24 numbers are used as the component values of the feature vector for the measured speech time interval or "frame" of speech.
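To make the preceding description concrete, the sketch below (not part of the patent) fits an all-pole LPC model to one short frame and converts it to cepstral coefficients. It uses the autocorrelation method solved by the Levinson-Durbin recursion rather than the covariance equations mentioned above, together with the standard LPC-to-cepstrum recursion; the 12th-order model, the Hamming window, and the synthetic test frame are illustrative assumptions.

import numpy as np

def lpc_coefficients(frame, order=12):
    # Fit an all-pole LPC model to one windowed frame using the
    # autocorrelation method and the Levinson-Durbin recursion.
    frame = frame * np.hamming(len(frame))
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)                # residual prediction-error energy
    return a, err                           # a[0] = 1; a[1:] are the LPC coefficients

def lpc_to_cepstrum(a, n_ceps=12):
    # Standard recursion relating LPC coefficients to cepstral coefficients.
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = -a[n] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc -= (k / n) * c[k] * a[n - k]
        c[n] = acc
    return c[1:]                            # 12 numbers: one frame's feature vector

# Illustrative use on a synthetic 10 ms frame sampled at 8 kHz.
fs = 8000
t = np.arange(int(0.010 * fs)) / fs
frame = np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 700 * t)
lpc, gain = lpc_coefficients(frame, order=12)
feature_vector = lpc_to_cepstrum(lpc, n_ceps=12)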
The extraction of acoustic feature vectors based on the LPC approach has been successful, but it has serious limitations. Its success relies on being able to simply find the best match of the unknown waveform feature vector to one stored in a library (also called a codebook) for a known sound or word. This process circumvented the need for a specific detailed description of phonetic attributes. The LPC-described waveform could represent a speech phoneme, where a phoneme is an elementary word-sound unit. There are 40 to 50 phonemes in American English, depending upon whose definition is used. However, the LPC information does not allow unambiguous determination of physiological conditions for vocal tract model constraints. For example, it does not allow accurate, unambiguous vocal fold on/off period measurements or pitch. Alternatively, the LPC representation could represent longer time intervals such as the entire period over which a word was articulated. Vector "quantization" (VQ) techniques assisted in handling large variations in articulation of the same sound from a potentially large speaker population. This helped provide speaker independent recognition capability, but the speaker normalization problem was not completely solved, and remains an issue today. Automatic methods were developed to time-align the same sound units when spoken at a different rate by the same or a different speaker. One successful technique was the Dynamic Time Warping algorithm, which did a nonlinear time scaling of the feature coefficients. This provided a partial solution to the problem identified in the '60s as the nonuniform rate of speech.
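As a brief illustration of the Dynamic Time Warping idea just described (a sketch, not the patent's procedure), the following compares two sequences of per-frame feature vectors with the classic dynamic-programming recursion; the Euclidean frame distance and the symmetric step pattern are simplifying assumptions.

import numpy as np

def dtw_distance(seq_a, seq_b):
    # Dynamic Time Warping: accumulate the cheapest nonlinear alignment
    # between two sequences of feature vectors (rows = frames).
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])   # local frame distance
            # a frame may match, repeat (stretch), or be skipped (compress)
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    return cost[n, m]

# The same word spoken twice as slowly still matches its template closely.
rng = np.random.default_rng(0)
template = rng.normal(size=(40, 12))        # 40 frames of 12 cepstral coefficients
slow = np.repeat(template, 2, axis=0)       # 80 frames: a 2x slower rendition
print(dtw_distance(slow, template))         # 0.0 despite the rate difference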
For medium size vocabularies (e.g., about 500 words), it is acceptable to use the feature vectors for the several speech units in a single word as basic matching units. During the late 1970s, many commercial products became available on the market, permitting limited vocabulary recognition. However, word matching also required the knowledge of the beginning and the end of the word. Thus sophisticated end-point (and onset) detection algorithms were developed. In addition, purposeful insertion of pauses by the user between words simplified the problem for many applications. This approach is known as discrete speech. However, for a larger vocabulary (e.g., >1000 words), the matching library becomes large and unwieldy. In addition, discrete speech is unnatural for human communications, but continuous speech makes end-point detection difficult. Overcoming the difficulties of continuous speech with a large size vocabulary was a primary focus of speech recognition (SR) research in the '80s. To accomplish this, designers of SR systems found that the use of shorter sound units such as phonemes or PLUs (phone-like units) was preferable, because of the smaller number of units needed to describe human speech.
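A minimal energy-threshold end-point detector of the general kind referred to above might be sketched as follows (an illustrative sketch only, not the patent's method; the 10 ms frame length, the fixed threshold, and the hangover count are assumptions):

import numpy as np

def detect_endpoints(signal, fs, frame_ms=10, threshold=0.02, hangover=20):
    # Mark word boundaries: onset when short-time energy rises above the
    # threshold, offset after `hangover` consecutive low-energy frames.
    frame_len = int(fs * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    energy = np.array([np.mean(signal[k * frame_len:(k + 1) * frame_len] ** 2)
                       for k in range(n_frames)])
    segments, start, quiet = [], None, 0
    for k, e in enumerate(energy):
        if e > threshold:
            if start is None:
                start = k                    # onset of a word
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= hangover:            # sustained silence ends the word
                segments.append((start * frame_len, (k - quiet + 1) * frame_len))
                start, quiet = None, 0
    if start is not None:                    # word still running at end of signal
        segments.append((start * frame_len, n_frames * frame_len))
    return segments                          # (start_sample, end_sample) pairs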
In the '80s, a statistical pattern matching technique known as the Hidden Markov Model (HMM) was applied successfully in solving the problems associated with continuous speech and large vocabulary size. HMMs were constructed to first recognize the 50 phonemes, and to then recognize the words and word phrases based upon the pattern of phonemes. For each phoneme, a probability model is built during a learning phase, indicating the likelihood that a particular acoustic feature vector represents each particular phoneme. The acoustic system measures the qualities of each speaker during each time frame (e.g., 10 ms), software corrects for speaker rates, and forms Cepstral coefficients. In specific systems, other values such as total acoustic energy, differential Cepstral coefficients, pitch, and zero crossings are measured and added as components with the Cepstral coefficients, to make a longer feature vector. By example, assume 10 Cepstral coefficients are extracted from a continuous sp
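To make the HMM description above concrete, the following sketch (not the patent's implementation) scores a sequence of per-frame feature vectors against one phoneme HMM with the standard forward algorithm in log space; the left-to-right topology and the diagonal-covariance Gaussian observation model are illustrative assumptions.

import numpy as np

def log_gaussian(x, mean, var):
    # Log-likelihood of one feature vector under a diagonal-covariance Gaussian.
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def forward_log_likelihood(features, log_trans, means, variances):
    # Forward algorithm: log P(frame sequence | phoneme HMM).
    # features: (T, D) feature vectors; log_trans: (S, S) log transition matrix;
    # means, variances: (S, D) per-state Gaussian parameters.
    T, S = len(features), len(means)
    log_alpha = np.full(S, -np.inf)
    log_alpha[0] = log_gaussian(features[0], means[0], variances[0])  # start in state 0
    for t in range(1, T):
        trans = log_alpha[:, None] + log_trans          # from-state x to-state
        log_alpha = np.logaddexp.reduce(trans, axis=0)
        log_alpha += np.array([log_gaussian(features[t], means[s], variances[s])
                               for s in range(S)])
    return np.logaddexp.reduce(log_alpha)

# Recognition scores the observed frames against each phoneme's model and,
# at the next level, assembles the most probable word from the phoneme scores.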
