US007693720B2

(12) United States Patent
Kennewick et al.

(10) Patent No.: US 7,693,720 B2
(45) Date of Patent: *Apr. 6, 2010

(54) MOBILE SYSTEMS AND METHODS FOR RESPONDING TO NATURAL LANGUAGE SPEECH UTTERANCE

(75) Inventors: Robert A. Kennewick, Seattle, WA (US); David Locke, Redmond, WA (US); Michael R. Kennewick, Sr., Bellevue, WA (US); Michael R. Kennewick, Jr., Bellevue, WA (US); Richard Kennewick, Woodinville, WA (US); Tom Freeman, Mercer Island, WA (US); Stephen F. Elston, Seattle, WA (US)

(73) Assignee: VoiceBox Technologies, Inc., Kirkland, WA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 950 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 10/618,633

(22) Filed: Jul. 15, 2003

(65) Prior Publication Data
    US 2004/0193420 A1    Sep. 30, 2004

Related U.S. Application Data
(60) Provisional application No. 60/395,615, filed on Jul. 15, 2002.

(51) Int. Cl.
    G10L 21/00 (2006.01)
    G10L 15/18 (2006.01)
    G10L 13/00 (2006.01)
(52) U.S. Cl. 704/275; 704/270; 704/270.1
(58) Field of Classification Search 704/270, 704/270.1, 275, 257; 709/202
    See application file for complete search history.

(56) References Cited

    U.S. PATENT DOCUMENTS
    4,430,669 A  2/1984  Cheung  358/122
    (Continued)

    FOREIGN PATENT DOCUMENTS
    WO  WO 01/78065  10/2001

    OTHER PUBLICATIONS
    Lin et al., "A Distributed Architecture for Cooperative Spoken Dialogue Agents With Coherent Dialogue State and History", ASRU '99, 1999.*
    (Continued)

Primary Examiner - James S Wozniak
(74) Attorney, Agent, or Firm - Pillsbury Winthrop Shaw Pittman LLP
`
(57) ABSTRACT

Mobile systems and methods that overcome the deficiencies of prior art speech-based interfaces for telematics applications through the use of a complete speech-based information query, retrieval, presentation and local or remote command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a complete speech-based natural language query and response environment can be created. The invention creates, stores and uses extensive personal profile information for each user, thereby improving the reliability of determining the context and presenting the expected results for a particular question or command. The invention may organize domain specific behavior and information into agents that are distributable or updateable over a wide area network. The invention can be used in dynamic environments such as those of mobile vehicles to control and communicate with both vehicle systems and remote systems and devices.
`
55 Claims, 6 Drawing Sheets

[Representative drawing: Speech Processing System Block Diagram, showing a main unit containing a transceiver, speech coder, speech recognition engine, parser, text-to-speech engine, user profile, personality and agents, together with a network interface and a graphical user interface.]
`
U.S. PATENT DOCUMENTS

5,155,743 A  10/1992  Jacobs  375/28
5,274,560 A  12/1993  LaRue  364/444
5,377,350 A  12/1994  Skinner  395/600
5,386,556 A  1/1995  Hedin et al.  395/600
5,424,947 A  6/1995  Nagao et al.  364/419.08
5,471,318 A  11/1995  Ahuja et al.  358/400
5,475,733 A  12/1995  Eisdorfer et al.  379/52
5,499,289 A  3/1996  Bruno et al.  379/220
5,500,920 A  3/1996  Kupiec  395/2.79
5,517,560 A  5/1996  Greenspan  379/114
5,533,108 A  7/1996  Harris et al.  379/201
5,537,436 A  7/1996  Bottoms et al.  375/222
5,539,744 A  7/1996  Chu et al.  370/60
5,557,667 A  9/1996  Bruno et al.  379/201
5,563,937 A  10/1996  Bruno et al.  379/201
5,590,039 A  12/1996  Ikeda et al.  395/759
5,617,407 A  4/1997  Bareis  369/275.3
5,633,922 A  5/1997  August et al.  379/220
5,675,629 A  10/1997  Raffel et al.  379/58
5,696,965 A  12/1997  Dedrick  395/610
5,708,422 A  1/1998  Blonder et al.  340/825.34
5,721,938 A  2/1998  Stuckey  395/754
5,722,084 A  2/1998  Chakrin et al.  455/551
5,742,763 A  4/1998  Jones  395/200.3
5,752,052 A  5/1998  Richardson et al.  395/759
5,754,784 A  5/1998  Garland et al.  395/200.49
5,761,631 A  6/1998  Nasukawa  704/9
5,774,859 A  6/1998  Houser et al.  704/275
5,794,050 A  8/1998  Dahlgren et al.  395/708
5,797,112 A  8/1998  Komatsu et al.  701/201
5,799,276 A  8/1998  Komissarchik et al.  704/251
5,802,510 A  9/1998  Jones  707/2
5,832,221 A  11/1998  Jones  375/200.36
5,878,385 A  3/1999  Bralich et al.  704/9
5,878,386 A  3/1999  Coughlin  704/10
5,892,813 A *  4/1999  Morin et al.  379/88.01
5,895,466 A  4/1999  Goldberg et al.  707/5
5,902,347 A  5/1999  Backman et al.  701/200
5,911,120 A  6/1999  Jarett et al.  455/417
5,918,222 A  6/1999  Fukui et al.  707/1
5,926,784 A  7/1999  Richardson et al.  704/9
5,933,822 A  8/1999  Braden-Harder et al.  707/5
5,953,393 A  9/1999  Culbreth et al.  379/88.25
5,963,894 A  10/1999  Richardson et al.  704/9
5,963,940 A  10/1999  Liddy et al.  707/5
5,987,404 A  11/1999  Della Pietra et al.  704/9
5,991,721 A  11/1999  Asano et al.  704/257
5,995,119 A  11/1999  Cosatto et al.  345/473
6,009,382 A  12/1999  Martino et al.  704/1
6,014,559 A  1/2000  Amin  455/413
6,021,384 A  2/2000  Gorin et al.  704/1
6,044,347 A  3/2000  Abella et al.  704/272
6,049,602 A  4/2000  Foladare et al.  379/265
6,058,187 A  5/2000  Chen  380/21
6,078,886 A  6/2000  Dragosh et al.  704/270
6,081,774 A  6/2000  De Hita et al.  704/9
6,101,241 A  8/2000  Boyce et al.  379/88.01
6,119,087 A  9/2000  Kuhn et al.  704/270
6,134,235 A  10/2000  Goldman et al.  370/352
6,144,667 A  11/2000  Doshi et al.  370/401
6,160,883 A  12/2000  Jackson et al.  379/230
6,173,279 B1  1/2001  Levin et al.  707/5
6,175,858 B1  1/2001  Bulfer et al.  709/206
6,185,535 B1 *  2/2001  Hedin et al.  704/270
6,192,110 B1  2/2001  Abella et al.  379/88.01
6,192,338 B1  2/2001  Haszto et al.  704/257
6,195,634 B1  2/2001  Dudemaine et al.  704/231
6,195,651 B1  2/2001  Handel et al.  707/2
6,208,972 B1 *  3/2001  Grant et al.  704/275
6,219,346 B1  4/2001  Maxemchuk  370/338
6,219,643 B1  4/2001  Cohen et al.  704/257
`
6,233,556 B1  5/2001  Teunen et al.  704/250
6,233,559 B1 *  5/2001  Balakrishnan  704/275
6,246,981 B1  6/2001  Papineni et al.  704/235
6,272,455 B1  8/2001  Hoshen et al.  704/1
6,292,767 B1  9/2001  Jackson et al.  704/1
6,314,402 B1  11/2001  Monaco et al.  704/275
6,366,886 B1  4/2002  Dragosh et al.  704/270.1
6,381,535 B1  4/2002  Durocher et al.  701/202
6,385,646 B1  5/2002  Brown et al.  709/217
6,393,428 B1  5/2002  Miller et al.  707/102
6,404,878 B1  6/2002  Jackson et al.  379/221.01
6,408,272 B1 *  6/2002  White et al.  704/270.1
6,411,810 B1  6/2002  Maxemchuk  455/453
6,415,257 B1  7/2002  Junqua et al.  704/275
6,418,210 B1  7/2002  Sayko  379/142.15
6,420,975 B1 *  7/2002  DeLine et al.  340/815.4
6,430,285 B1  8/2002  Bauer et al.  379/265.01
6,434,523 B1  8/2002  Monaco  704/257
6,434,524 B1  8/2002  Weber  704/257
6,442,522 B1  8/2002  Carberry et al.  704/257
6,446,114 B1  9/2002  Bulfer et al.  709/206
6,453,153 B1  9/2002  Bowker et al.  455/67.4
6,453,292 B2  9/2002  Ramaswamy et al.  704/235
6,456,711 B1  9/2002  Cheung et al.  379/265.09
6,466,654 B1  10/2002  Cooper et al.  379/88.01
6,466,899 B1  10/2002  Yano et al.  704/1
6,498,797 B1  12/2002  Anerousis et al.  370/522
6,499,013 B1  12/2002  Weber  704/257
6,501,833 B2  12/2002  Phillips et al.  379/88.07
6,501,834 B1  12/2002  Milewski et al.  379/93.24
6,513,006 B2  1/2003  Howard et al.  704/257
6,523,061 B1  2/2003  Halverson et al.  709/202
6,532,444 B1  3/2003  Weber  704/257
6,539,348 B1  3/2003  Bond et al.  704/9
6,553,372 B1  4/2003  Brassell et al.  707/5
6,556,970 B1  4/2003  Sasaki et al.  704/257
6,556,973 B1  4/2003  Lewin  704/277
6,560,576 B1  5/2003  Cohen et al.  704/270
6,567,778 B1  5/2003  Chao Chang et al.  704/257
6,567,797 B1  5/2003  Schuetze et al.  707/2
6,570,555 B1  5/2003  Prevost et al.  345/156
6,570,964 B1  5/2003  Murveit et al.  379/67.1
6,574,597 B1  6/2003  Mohri et al.  704/251
6,574,624 B1  6/2003  Johnson et al.  707/5
6,594,257 B1  7/2003  Doshi et al.  370/352
6,598,018 B1  7/2003  Junqua  704/251
6,604,077 B2  8/2003  Dragosh et al.  704/270.1
6,611,692 B2  8/2003  Raffel et al.  455/552
6,614,773 B1  9/2003  Maxemchuk  370/337
6,615,172 B1 *  9/2003  Bennett et al.  704/257
6,629,066 B1  9/2003  Jackson et al.  704/9
6,631,346 B1  10/2003  Karaorman et al.  704/9
6,643,620 B1  11/2003  Contolini et al.  704/270
6,650,747 B1  11/2003  Bala et al.  379/265.06
6,691,151 B1  2/2004  Cheyer et al.  709/202
6,721,001 B1  4/2004  Berstis  348/231.3
6,721,706 B1  4/2004  Strubbe et al.  704/275
6,735,592 B1  5/2004  Neumann et al.  707/101
6,741,931 B1 *  5/2004  Kohut et al.  701/209
6,742,021 B1 *  5/2004  Halverson et al.  709/218
6,757,718 B1  6/2004  Halverson et al.  709/218
6,795,808 B1  9/2004  Strubbe et al.  704/275
6,833,848 B1  12/2004  Wolff et al.
6,865,481 B2  3/2005  Kawazoe et al.  701/211
6,877,134 B1  4/2005  Fuller et al.  704/275
6,901,366 B1  5/2005  Kuhn et al.
6,937,977 B2 *  8/2005  Gerson  704/201
6,944,594 B2 *  9/2005  Busayapongchai et al.  704/275
6,973,387 B2  12/2005  Masclet et al.  701/211
6,980,092 B2 *  12/2005  Turnbull et al.  340/425.5
6,990,513 B2  1/2006  Belfiore et al.  709/203
6,996,531 B2  2/2006  Korall et al.  704/270
7,020,609 B2  3/2006  Thrift et al.  704/270.1
`
`
7,027,975 B1  4/2006  Pazandak et al.  704/9
7,062,488 B1  6/2006  Reisman  707/8
7,092,928 B1  8/2006  Elad et al.  706/60
7,127,400 B2  10/2006  Koch  704/270.1
7,137,126 B1  11/2006  Coffman et al.  719/328
7,146,319 B2  12/2006  Hunt  704/254
7,289,606 B2  10/2007  Sibal et al.  379/52
7,376,645 B2  5/2008  Bernard  707/3
7,398,209 B2  7/2008  Kennewick et al.  704/255
7,424,431 B2  9/2008  Greene et al.  704/270
7,461,059 B2  12/2008  Richardson et al.  707/5
7,493,559 B1  2/2009  Wolff et al.  715/727
2001/0041980 A1  11/2001  Howard et al.  704/270
2002/0049805 A1 *  4/2002  Yamada et al.  709/202
2002/0065568 A1  5/2002  Silfvast et al.  700/94
2002/0082911 A1  6/2002  Dunn et al.  705/14
2002/0120609 A1  8/2002  Lang et al.  707/1
2002/0124050 A1  9/2002  Middeljans  709/203
2002/0143535 A1  10/2002  Kist et al.  704/251
2002/0188602 A1  12/2002  Stubler et al.  707/3
2002/0198714 A1  12/2002  Zhou  704/252
2003/0014261 A1  1/2003  Kageyama  704/275
2003/0112267 A1  6/2003  Belrose  345/728
2003/0182132 A1 *  9/2003  Niemoeller  704/275
2004/0025115 A1  2/2004  Sienel et al.  715/513
2004/0044516 A1  3/2004  Kennewick et al.  704/5
2004/0166832 A1  8/2004  Portman et al.  455/412.1
2004/0199375 A1  10/2004  Ehsani et al.  704/4
2005/0021334 A1  1/2005  Iwahashi  704/240
2005/0021826 A1  1/2005  Kumar  709/232
2005/0033574 A1  2/2005  Kim et al.  704/251
2005/0114116 A1  5/2005  Fiedler  704/201
2005/0246174 A1  11/2005  DeGolla  704/270
2007/0033005 A1  2/2007  Cristo et al.  704/9
2007/0038436 A1  2/2007  Cristo et al.  704/9
2007/0050191 A1  3/2007  Weider et al.  704/275
2008/0091406 A1  4/2008  Baldwin et al.  704/4
2008/0103761 A1  5/2008  Printz et al.  704/9
2008/0115163 A1  5/2008  Gilboa et al.  725/34
2008/0133215 A1  6/2008  Sarukkai  704/2
2009/0117885 A1  5/2009  Roth  455/414.3
2009/0144271 A1  6/2009  Richardson et al.  707/5
`
OTHER PUBLICATIONS

Kuhn et al., "Hybrid In-Car Speech Recognition for Mobile Multimedia Applications", Vehicular Technology Conference, IEEE, Jul. 1999, pp. 2009-2013.*
Belvin et al., "Development of the HRL Route Navigation Dialogue System", Proceedings of the First International Conference on Human Language Technology Research, San Diego, 2001, pp. 1-5.*
Lind et al., "The Network Vehicle - A Glimpse into the Future of Mobile Multimedia", IEEE Aerosp. Electron. Syst. Mag., vol. 14, No. 9, Sep. 1999, pp. 27-32.*
Zhao, "Telematics: Safe and Fun Driving", IEEE Intelligent Systems, vol. 17, Issue 1, 2002, pp. 10-14.*
Elio et al., "On Abstract Task Models and Conversation Policies", Workshop on Specifying and Implementing Conversation Policies, Autonomous Agents '99, Seattle, 1999.*
Turunen, "Adaptive Interaction Methods in Speech User Interfaces", Conference on Human Factors in Computing Systems, Seattle, Washington, 2001, pp. 91-92.*
Reuters, "IBM to Enable Honda Drivers to Talk to Cars", Charles Schwab & Co., Inc., Jul. 28, 2002, 1 page.
Mao, Mark Z., "Automatic Training Set Segmentation for Multi-Pass Speech Recognition", Department of Electrical Engineering, Stanford University, CA, copyright 2005, IEEE, pp. I-685 to I-688.
Vanhoucke, Vincent, "Confidence Scoring and Rejection Using Multi-Pass Speech Recognition", Nuance Communications, Menlo Park, CA, [no date], 4 pages.
Weng, Fuliang, et al., "Efficient Lattice Representation and Generation", Speech Technology and Research Laboratory, SRI International, Menlo Park, CA, [no date], 4 pages.
Chai et al., "MIND: A Semantics-Based Multimodal Interpretation Framework for Conversational System", Proceedings of the International Class Workshop on Natural, Intelligent and Effective Interaction in Multimodal Dialogue Systems, Jun. 2002, pp. 37-46.
Cheyer et al., "Multimodal Maps: An Agent-Based Approach", International Conference on Cooperative Multimodal Communication (CMC/95), May 24-26, 1995, pp. 111-121.
El Meliani et al., "A Syllabic-Filler-Based Continuous Speech Recognizer for Unlimited Vocabulary", Canadian Conference on Electrical and Computer Engineering, vol. 2, Sep. 5-8, 1995, pp. 1007-1010.
Arrington, Michael, "Google Redefines GPS Navigation Landscape: Google Maps Navigation for Android 2.0", TechCrunch, printed from the Internet <http://www.techcrunch.com/2009/10/28/google-redefines-car-gps-navigation-google-maps-navigation-android/>, Oct. 28, 2009, 4 pages.

* cited by examiner
`
`
`
`
[Figure 1. First Embodiment System Block Diagram: a vehicle-side main speech processing unit and speech unit with control and device interfaces, data interfaces, telematics control unit(s), devices 1 through N with manual controls, a handheld interface, display, keypad, video camera, navigation system and wide-area RF transceiver, linked through wireless, PSTN and fixed networks to a service representative workstation, a fixed computer, GPS, and emergency, location, payment, CRM and specialized services.]
`
`
`
[Figure 2. Second Embodiment System Block Diagram: substantially the same elements as Figure 1, with the main speech processing unit and speech unit connected through wireless, PSTN and Internet/data networks to the fixed computer, service representative workstation and external services.]
`
`
`
[Figure 3. Handheld Computer Block Diagram: processing unit(s), nonvolatile memory, display, keypad, wide-area network transceiver and local area network transceiver(s), with links to the speech unit and the main speech processing unit.]
`
`
`
[Figure 4. Fixed Computer Block Diagram: processing unit(s), nonvolatile memory, display, keyboard, wide-area network transceiver and local area network transceiver(s), with links to the speech unit and the main speech processing unit.]
`
`
`
[Figure 5. Speech Processing System Block Diagram: a main unit containing an event manager, databases, update manager, agents, personality, user profile, dictionary and phrases, parser, speech recognition engine, text-to-speech engine, speech coder, transceiver, network interface and graphical user interface; a speech unit with array microphone and transceiver; and a connection to a data network/PSTN.]
`
`
`
[Figure 6. Agent Architecture: an agent library holding a system agent, an agent manager with criteria handler, and domain agents, in communication with the event manager and databases.]
`
`
`
MOBILE SYSTEMS AND METHODS FOR RESPONDING TO NATURAL LANGUAGE SPEECH UTTERANCE

This application claims priority from U.S. Provisional Patent Application Ser. No. 60/395,615, filed Jul. 15, 2002, the disclosure of which is hereby incorporated by reference in its entirety.
`
FIELD OF THE INVENTION

The present invention relates to the retrieval of online information and processing of commands through a speech interface in a vehicle environment. More specifically, the invention is a fully integrated environment allowing mobile users to ask natural language speech questions or give natural language commands in a wide range of domains, supporting local or remote commands, making local and network queries to obtain information, and presenting results in a natural manner even in cases where the question asked or the responses received are incomplete, ambiguous or subjective.
`
`BACKGROUND OF THE INVENTION
`
`
Telematics systems are systems that bring human-computer interfaces to vehicular environments. Conventional computer interfaces use some combination of keyboards, keypads, point and click techniques and touch screen displays. These conventional interface techniques are generally not suitable for a vehicular environment, owing to the speed of interaction and the inherent danger and distraction. Therefore, speech interfaces are being adopted in many telematics applications.

However, creating a natural language speech interface that is suitable for use in the vehicular environment has proved difficult. A general-purpose telematics system must accommodate commands and queries from a wide range of domains and from many users with diverse preferences and needs. Further, multiple vehicle occupants may want to use such systems, often simultaneously. Finally, most vehicle environments are relatively noisy, making accurate speech recognition inherently difficult.

Human retrieval of both local and network hosted online information and processing of commands in a natural manner remains a difficult problem in any environment, especially onboard vehicles. Cognitive research on human interaction shows that a person asking a question or giving a command typically relies heavily on context and the domain knowledge of the person answering. On the other hand, machine-based queries of documents and databases and execution of commands must be highly structured and are not inherently natural to the human user. Thus, human questions and commands and machine processing of queries are fundamentally incompatible. Yet the ability to allow a person to make natural language speech-based queries remains a desirable goal.

Much work covering multiple methods has been done in the fields of natural language processing and speech recognition. Speech recognition has steadily improved in accuracy and today is successfully used in a wide range of applications. Natural language processing has previously been applied to the parsing of speech queries. Yet, no system developed provides a complete environment for users to make natural language speech queries or commands and receive natural sounding responses in a vehicular environment. There remain a number of significant barriers to creation of a complete natural language speech-based query and response environment.

The fact that most natural language queries and commands are incomplete in their definition is a significant barrier to natural human query-response interaction. Further, some questions can only be interpreted in the context of previous questions, knowledge of the domain, or the user's history of interests and preferences. Thus, some natural language questions and commands may not be easily transformed to machine processable form. Compounding this problem, many natural language questions may be ambiguous or subjective. In these cases, the formation of a machine processable query and the return of a natural language response is difficult at best.

Even once a question is asked, parsed and interpreted, machine processable queries and commands must be formulated. Depending on the nature of the question, there may not be a simple set of queries that returns an adequate response. Several queries may need to be initiated, and even these queries may need to be chained or concatenated to achieve a complete result. Further, no single available source may include the entire set of results required. Thus multiple queries, perhaps with several parts, may need to be made to multiple data sources, which can be local or on a network. Not all of these sources and queries will return useful results or any results at all. In a mobile or vehicular environment, the use of wireless communications compounds the chances that queries will not complete or return useful results. Useful results that are returned are often embedded in other information, from which they may need to be extracted. For example, a few key words or numbers often need to be "scraped" from a larger amount of other information in a text string, table, list, page or other source. At the same time, other extraneous information, such as graphics or pictures, needs to be removed so that the response can be processed as speech. In any case, the multiple results must be evaluated and combined to form the best possible answer, even in the case where some queries do not return useful results or fail entirely. In cases where the question is ambiguous or the result inherently subjective, determining the best result to present is a complex process. Finally, to maintain a natural interaction, responses need to be returned rapidly to the user. Managing and evaluating complex and uncertain queries while maintaining real-time performance is a significant challenge.
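The multi-source behavior just described lends itself to a short illustration. The following sketch (Python, purely illustrative and not taken from the patent; every function and source name is a hypothetical stand-in) fans a single machine-processable query out to several data sources, tolerates sources that fail or return nothing, as wireless links in a vehicle often will, and keeps whatever partial results survive for later evaluation and combination:

    import concurrent.futures

    # Hypothetical stand-ins for local and network data sources; a real
    # fetcher would hit an on-board database, a web service, etc.
    def fetch_local_db(query: str):
        return f"local match for {query!r}"

    def fetch_network_service(query: str):
        raise TimeoutError("wireless link dropped")  # simulated failure

    def fetch_profile_cache(query: str):
        return None  # a source may simply have nothing useful

    SOURCES = (fetch_local_db, fetch_network_service, fetch_profile_cache)

    def gather_partial_results(query: str, timeout_s: float = 2.0):
        """Fan one query out to all sources; keep whatever succeeds."""
        results = []
        with concurrent.futures.ThreadPoolExecutor(len(SOURCES)) as pool:
            futures = [pool.submit(source, query) for source in SOURCES]
            done, _ = concurrent.futures.wait(futures, timeout=timeout_s)
            for future in done:
                try:
                    answer = future.result()
                except Exception:
                    continue  # a failed source is not fatal
                if answer:
                    results.append(answer)
        return results  # later stages scrape, rank and combine these

    print(gather_partial_results("traffic on I-90"))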
These and other drawbacks exist in existing systems.

SUMMARY OF THE INVENTION

An object of the invention is to overcome these and other drawbacks of prior speech-based telematic systems.

According to one aspect of the invention, systems and methods are provided that may overcome deficiencies of prior systems through the application of a complete speech-based information query, retrieval, presentation and command environment. This environment makes significant use of context, prior information, domain knowledge, and user specific profile data to achieve a natural environment for one or more users making queries or commands in multiple domains. Through this integrated approach, a speech-based natural language query, response and command environment is created. Further, at each step in the process, accommodation may be made for full or partial failure and graceful recovery. The robustness to partial failure is achieved through the use of probabilistic and fuzzy reasoning at several stages of the process. This robustness to partial failure promotes the feeling of a natural response to questions and commands.
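As a sketch of how probabilistic reasoning can yield the graceful recovery described above (the names and threshold here are illustrative assumptions, not the patent's algorithm), each stage can attach a confidence to its partial result, and the response layer presents the best surviving candidate or signals the dialogue to ask a clarifying question when nothing is trustworthy:

    from dataclasses import dataclass

    @dataclass
    class PartialResult:
        text: str
        confidence: float  # 0.0 (pure guess) .. 1.0 (certain)

    def best_answer(partials, floor: float = 0.2):
        """Pick a presentable answer even when some stages failed.

        Candidates below the confidence floor are treated as failures;
        if nothing survives, return None so the dialogue layer can ask
        a clarifying question instead of guessing.
        """
        usable = [p for p in partials if p.confidence >= floor]
        return max(usable, key=lambda p: p.confidence) if usable else None

    print(best_answer([PartialResult("rain likely", 0.7),
                       PartialResult("query failed", 0.0)]))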
According to another aspect of the invention, a mobile interactive natural language speech system (herein "the system") is provided that includes a speech unit. The speech unit may be incorporated into a vehicle computer device or system, or may be a separate device. If a separate device, the speech unit may be connected to the vehicle computer device via a wired or wireless connection. In some embodiments, the interactive natural language speech device can be handheld. The handheld device may interface with vehicle computers or other electronic control systems through wired or wireless links. The handheld device can also operate independently of the vehicle. The handheld device can be used to remotely control the vehicle through a wireless local area connection, a wide area wireless connection or through other communication links.
According to another aspect of the invention, the system may include a standalone or networked PC attached to a vehicle, a standalone or networked fixed computer in a home or office, a PDA, wireless phone, or other portable computer device, or other computer device or system. For convenience, these and other computer alternatives shall be simply referred to as a computer. One aspect of the invention includes software that is installed onto the computer, where the software includes one or more of the following modules: a speech recognition module for capturing the user input; a parser for parsing the input; a text to speech engine module for converting text to speech; a network interface for enabling the computer to interface with one or more networks; a graphical user interface module; and an event manager for managing events and other modules. In some embodiments the event manager is in communication with a dictionary and phrases module, a user profile module that enables user profiles to be created, modified and accessed, a personality module that enables various personalities to be created and used, an agent module, an update manager and one or more databases. It will be understood that this software can be distributed in any way between a handheld device, a computer attached to a vehicle, a desktop computer or a server without altering the function, features, scope, or intent of the invention.
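As a structural sketch only, the modules enumerated above might be wired together through the event manager roughly as follows; every class and method name is invented for illustration, and the patent does not prescribe this decomposition:

    class EventManager:
        """Routes events among the modules listed above."""
        def __init__(self):
            self.handlers = {}

        def subscribe(self, event, handler):
            self.handlers.setdefault(event, []).append(handler)

        def publish(self, event, payload):
            for handler in self.handlers.get(event, []):
                handler(payload)

    class SpeechRecognitionModule:
        """Captures user input and emits recognized text."""
        def __init__(self, events):
            self.events = events
            events.subscribe("audio", self.on_audio)

        def on_audio(self, audio):
            self.events.publish("text", f"<recognized {len(audio)} samples>")

    class ParserModule:
        """Parses recognized text into a machine-processable request."""
        def __init__(self, events):
            self.events = events
            events.subscribe("text", self.on_text)

        def on_text(self, text):
            self.events.publish("request", {"query": text})

    events = EventManager()
    SpeechRecognitionModule(events)
    ParserModule(events)
    events.subscribe("request", print)  # stand-in for agents/databases
    events.publish("audio", b"\x00" * 1600)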
According to one aspect of the invention, and regardless of the distribution of the functionality, the system may include a speech interface device that receives spoken natural language queries, commands and/or other utterances from a user, and a computer device or system that receives input from the speech unit and processes the input (e.g., retrieves information responsive to the query, takes action consistent with the command and performs other functions as detailed herein), and responds to the user with a natural language speech response.
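Read as a pipeline, that aspect amounts to: capture an utterance, interpret it as a query or command, act on it, and speak the result back. The compressed sketch below is illustrative only; each function is a hypothetical placeholder for the corresponding component:

    from dataclasses import dataclass

    @dataclass
    class Request:
        kind: str  # "query" or "command"
        body: str

    # Placeholder stages; a real system would invoke the recognizer,
    # parser, retrieval agents and text-to-speech engine here.
    def recognize(audio: bytes) -> str:
        return "what is the traffic ahead"

    def parse(text: str) -> Request:
        return Request("query", text)

    def retrieve(req: Request) -> str:
        return "traffic is light for the next ten miles"

    def execute(req: Request) -> str:
        return "done"

    def synthesize(reply: str) -> bytes:
        return reply.encode("utf-8")  # stand-in for speech audio

    def handle_utterance(audio: bytes) -> bytes:
        """Speech in, speech out: one query/command round trip."""
        request = parse(recognize(audio))
        result = retrieve(request) if request.kind == "query" else execute(request)
        return synthesize(result.capitalize() + ".")

    print(handle_utterance(b"..."))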
According to another aspect of the invention, the system can be interfaced by wired or wireless connections to one or more vehicle-related systems. These vehicle-related systems can themselves be distributed between electronic controls or computers attached to the vehicle or external to the vehicle. Vehicle systems employed can include electronic control systems, entertainment devices, navigation equipment, and measurement equipment or sensors. External systems employed include those used during vehicle operation, such as weight sensors, payment systems, emergency assistance networks, remote ord