US006230132B1
Class et al.
(10) Patent No.: US 6,230,132 B1
(45) Date of Patent: May 8, 2001
`
(54) PROCESS AND APPARATUS FOR REAL TIME VERBAL INPUT OF A TARGET ADDRESS OF A TARGET GUIDANCE SYSTEM
`
`(75) Inventors: Fritz Class, Roemerstein; Thomas
`Kuhn, Ulm; Carsten-Uwe Moeller,
`Koengen; Frank Reh, Stuttgart;
`Gerhard Nuessle, Blaustein, all of
`(DE)
`(73) Assignee: DaimlerChrysler AG, Stuttgart (DE)
`
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 09/038,147
(22) Filed: Mar. 10, 1998
(30) Foreign Application Priority Data

Mar. 10, 1997 (DE) ................................. 197 09 518
`
(51) Int. Cl.7 ..................................... G10L 11/00
(52) U.S. Cl. ............................... 704/270; 704/276
(58) Field of Search .................. 704/270, 276; 701/210
`
(56) References Cited
`
`U.S. PATENT DOCUMENTS
`
4,866,778      9/1989  Baker ........................... 381/43
5,054,082 *   10/1991  Smith et al. ................... 704/275
5,165,095 *   11/1992  Borcherding
5,172,321 *   12/1992  Ghaem et al. ................... 364/444
5,677,990 *   10/1997  Junqua ......................... 704/255
5,832,429 *   11/1998  Gammel et al. .................. 704/255
5,893,901 *    4/1999  Maki ........................... 704/260
`
`FOREIGN PATENT DOCUMENTS
`
 36 08 497 A1 *    9/1987 (DE)
196 00 700 A1 *    8/1996 (DE)
195 33 541         3/1997 (DE)
  0 346 483       12/1989 (EP)
  0 477 688 A2 *   9/1991 (EP)
  0 736 853 A1 *   4/1996 (EP)
  61-147298 *      7/1986 (JP)
    6-66591 *      3/1994 (JP)
    6-85893 *      3/1994 (JP)
    6-42154 *      6/1994 (JP)
    6-54440 *      7/1994 (JP)
   7-219961 *      8/1994 (JP)
   6-261126 *      9/1994 (JP)
   6-318977 *     11/1994 (JP)
    7-64480 *      3/1995 (JP)
   7-219590 *      8/1995 (JP)
   7-261784 *     10/1995 (JP)
   7-319383 *     12/1995 (JP)
   8-166797 *      6/1996 (JP)
   8-202386 *      8/1996 (JP)
   8-328584 *     12/1996 (JP)
WO 96/13030 *      5/1996 (WO)
`
`* cited by examiner
`
Primary Examiner—Krista Zele
Assistant Examiner—Michael N. Opsasnick
(74) Attorney, Agent, or Firm—Evenson, McKeown, Edwards & Lenahan, P.L.L.C.

(57) ABSTRACT
In a method for real-time speech input of a destination address into a navigation system, speech statements entered by a user are recognized by a speech recognition device and classified in accordance with their recognition probability. The speech statement with the greatest recognition probability is identified as the input speech statement, at least one speech statement being an admissible speech command that activates the operating functions of the navigation system associated with this speech command; all admissible speech statements are stored in at least one database. According to the invention, at least one operating function of the navigation system comprises an input dialogue. Following activation of that operating function, depending on the input dialogue, at least one lexicon is generated in real time from the admissible speech statements stored in the at least one database, and the generated lexicon is loaded as vocabulary into the speech recognition device.
`
`17 Claims, 10 Drawing Sheets
`
[Front page: representative drawing, an excerpt of the "street input" flowchart of FIG. 9 (generate street lexicon and load in recognition engine; resolve ambiguity; continue with step 500).]

[Sheet 1 of 10, FIG. 1: overview of the possible input dialogues for speech input of a destination address.]
[Sheet 2 of 10, FIG. 2: flowchart of a first embodiment of the input dialogue "destination location input".]
[Sheet 3 of 10, FIG. 3: flowchart of a second embodiment of the input dialogue "destination location input".]
[Sheet 4 of 10, FIG. 4: flowchart of the input dialogue "choose from list".]
[Sheet 5 of 10, FIG. 5: flowchart of the input dialogue "resolve ambiguity".]
[Sheet 6 of 10, FIG. 6: flowchart of the input dialogue "spell destination location".]
[Sheet 7 of 10, FIG. 7: flowchart of the input dialogue "coarse destination input".]
[Sheet 8 of 10, FIG. 8: flowchart of the input dialogue "store address".]
[Sheet 9 of 10, FIG. 9: flowchart of the input dialogue "street input".]
[Sheet 10 of 10, FIG. 10: block diagram of a device for performing the method.]
`
PROCESS AND APPARATUS FOR REAL TIME VERBAL INPUT OF A TARGET ADDRESS OF A TARGET GUIDANCE SYSTEM
`
BACKGROUND AND SUMMARY OF THE INVENTION
`
This application claims the priority of German patent document 197 09 518.6, filed Mar. 10, 1997, the disclosure of which is expressly incorporated by reference herein.

The invention relates to a method and apparatus for real-time speech input of a destination address into a navigation system.

German patent document DE 196 00 700 describes a target guidance system for a motor vehicle in which a fixedly mounted circuit, a contact field circuit or a voice recognition apparatus can be used as an input device. The document, however, does not deal with the vocal input of a target address into a target guidance system.

Published European patent application EP 0 736 853 A1 likewise describes a target guidance system for a motor vehicle. The speech input of a target address into a target guidance system is, however, not the subject of this document either.

Published German patent application DE 36 08 497 A1 describes a process for speech-controlled operation of a long-distance communication apparatus, especially an auto telephone. A disadvantage of this process is that it does not deal with the special problems of speech input of a target address into a target guidance system.
German patent application P 195 33 541.4-52, not previously published, discloses a method and apparatus of this type for automatic control of one or more devices by speech commands or by speech dialogue in real time. Input speech commands are recognized by a speech recognition device comprising a speaker-independent speech recognition engine and a speaker-dependent additional speech recognition engine; the speech command with the greatest recognition probability is identified as the input speech command, and the functions of the device or devices associated with this speech command are initiated. The speech command or speech dialogue is formed on the basis of at least one syntax structure, at least one basic command vocabulary, and if necessary at least one speaker-specific additional command vocabulary. The syntax structures and basic command vocabularies are presented in speaker-independent form and are established in real time. The speaker-specific additional vocabulary is input by the respective speaker and/or modified by him or her, with an additional speech recognition engine that operates according to a speaker-dependent recognition method being trained, in training phases during and outside real-time operation, to the speaker-specific features of the respective speaker by at least one-time input of the additional command. The speech dialogue and/or control of the devices proceeds in real time as follows:

Speech commands input by the user are fed to a speaker-independent speech recognition engine operating on the basis of phonemes and to the speaker-dependent additional speech recognition engine, where they are subjected to feature extraction, checked for the presence of additional commands from the additional command vocabulary, and classified in the speaker-dependent additional speech recognition engine on the basis of the features extracted therein.

Then the classified commands and syntax structures of the two speech recognition engines, recognized with a certain probability, are assembled into hypothetical speech commands, and the latter are checked and classified for their reliability and recognition probability in accordance with the syntax structure provided.

Thereafter, the hypothetical speech commands are checked for their plausibility in accordance with specified criteria and, of the hypothetical speech commands recognized as plausible, the one with the highest recognition probability is selected and identified as the speech command input by the user.

Finally, the functions of the device to be controlled that are associated with the identified speech command are initiated and/or answers are generated in accordance with a predetermined speech dialogue structure to continue the speech dialogue. According to this document, the method described can also be used to operate a navigation system, with a destination address being input by entering letters or groups of letters in a spelling mode, and with it being possible for the user to supply a list of destination addresses for storage in the navigation system, using names and abbreviations that can be determined in advance.
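The last two steps of this prior-art pipeline amount to a filter-and-rank operation. The following is a minimal sketch of that selection logic, assuming each assembled hypothesis carries a recognition probability and the outcome of the plausibility check; the class and function names are illustrative, not taken from the document:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Hypothesis:
    command: str        # hypothetical speech command assembled from both engines
    probability: float  # recognition probability from classification
    plausible: bool     # outcome of the plausibility check against specified criteria

def select_command(hypotheses: list[Hypothesis]) -> Optional[str]:
    """Pick the plausible hypothesis with the highest recognition probability."""
    candidates = [h for h in hypotheses if h.plausible]
    if not candidates:
        return None  # no admissible speech command was recognized
    return max(candidates, key=lambda h: h.probability).command
```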
The disadvantage of this method is that the special properties of the navigation system are not discussed; only the speech input of a destination location by means of a spelling mode is described.

The object of the invention is to provide an improved method and apparatus of the type described above, in which the special properties of a navigation system are taken into account and input is simplified. Another object of the invention is to provide such an arrangement which enables faster speech input of a destination address into a navigation system, improving operator comfort.
These and other objects and advantages are achieved by the method and apparatus according to the invention for speech input of destination addresses into a navigation system, which uses a known speech recognition device, such as described for example in the document referred to above, comprising at least one speaker-independent speech recognition engine and at least one speaker-dependent additional speech recognition engine. The method according to the invention makes possible various input dialogues for speech input of destination addresses. In a first input dialogue (hereinafter referred to as "destination location input"), the speaker-independent speech recognition device is used to detect destination locations spoken in isolation and, if such a destination location is not recognized, to recognize continuously spoken letters and/or groups of letters. In a second input dialogue (hereinafter referred to as "spell destination location"), the speaker-independent speech recognition engine is used to recognize continuously spoken letters and/or groups of letters. In a third input dialogue (hereinafter referred to as "coarse destination input"), the speaker-independent speech recognition engine is used to recognize destination locations spoken in isolation and, if such a destination location is not recognized, to recognize continuously spoken letters and/or groups of letters. In a fourth input dialogue (hereinafter referred to as "indirect input"), the speaker-independent speech recognition engine is used to recognize continuously spoken numbers and/or groups of numbers. In a fifth input dialogue (hereinafter referred to as "street input"), the speaker-independent speech recognition device is used to recognize street names spoken in isolation and, if the street name spoken in isolation is not recognized, to recognize continuously spoken letters and/or groups of letters.
`
By means of the input dialogues described above, the navigation system is supplied with verified destination addresses, each comprising a destination location and a street. In a sixth input dialogue (hereinafter referred to as "call up address"), the speaker-dependent additional speech recognition engine is used, in addition to the speaker-independent speech recognition engine, to recognize keywords spoken in isolation. In a seventh input dialogue (hereinafter referred to as "store address"), a keyword spoken in isolation by the user is assigned a destination address entered by the user, so that during the input dialogue "call up address" the destination address associated with the corresponding recognized keyword is transferred to the navigation system.
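The pairing of these two dialogues behaves like a small key-value store. A minimal sketch, with hypothetical names and a deliberately simplified address record, neither of which is taken from the patent:

```python
# Keyword -> destination address table shared by "store address" and
# "call up address"; all names and the record layout are illustrative.
address_book: dict[str, dict] = {}

def store_address(keyword: str, destination_address: dict) -> None:
    """'store address': bind a keyword spoken in isolation to a destination address."""
    address_book[keyword] = destination_address

def call_up_address(recognized_keyword: str) -> dict | None:
    """'call up address': return the bound address for transfer to the navigation system."""
    return address_book.get(recognized_keyword)

# Example with invented data:
store_address("office", {"destination_location": "Stuttgart", "street": "Hauptstrasse"})
assert call_up_address("office")["street"] == "Hauptstrasse"
```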
The method according to the invention is based primarily on the fact that the entire admissible vocabulary for the speech recognition device is not loaded into the speech recognition device at the moment it is activated; rather, at least one required lexicon is generated from the entire possible vocabulary during real-time operation and loaded into the speech recognition device as a function of the input dialogue required to execute an operating function. There are more than 100,000 locations in the Federal Republic of Germany that can serve as vocabulary for the navigation system. If this entire vocabulary were loaded into the speech recognition device, the recognition process would be extremely slow and prone to error. A lexicon generated from this vocabulary comprises only about 1,500 words, so the recognition process is much faster and the recognition rate higher.
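In code, the idea reduces to filtering the destination file on demand and handing only the result to the recognizer. A sketch under assumed interfaces; the record schema and the `set_vocabulary` call are illustrative, not the patent's API:

```python
def generate_lexicon(destination_file: list[dict], predicate) -> list[dict]:
    """Build a lexicon in real time as a subset of the full destination file.

    The full vocabulary (>100,000 place names) never reaches the recognizer;
    only the subset needed by the active input dialogue does.
    """
    return [entry for entry in destination_file if predicate(entry)]

def load_vocabulary(recognizer, lexicon: list[dict]) -> None:
    """Load only the generated lexicon into the speech recognition device."""
    recognizer.set_vocabulary([entry["phonetic"] for entry in lexicon])

# e.g. a ~1,500-word lexicon of the larger locations:
# lexicon = generate_lexicon(destination_file, lambda e: e["population"] > 15000)
# load_vocabulary(recognizer, lexicon)
```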
At least one destination file, which contains all possible destination addresses of a guidance system together with certain additional information for those destination addresses and is stored in at least one database, is used as the data source for the method according to the invention. From this destination file, lexica are generated that comprise at least parts of the destination file, with at least one lexicon being generated in real time as a function of at least one activated input dialogue. It is especially advantageous for the destination file to contain, for each stored destination location, additional information, for example political affiliation or an additional naming component, postal code or postal code range, telephone area code, state, population, geographic code, phonetic description, or lexicon membership. This additional information can then be used to resolve ambiguities or to accelerate the search for the desired destination location. Instead of the phonetic description, a transcription of the phonetic description in the form of a chain of indices, depending on the implementation of the transcription, can be used for the speech recognition device. In addition, a so-called automatic phonetic transcription, which performs a rule-based conversion of orthographically present names into a phonetic description using a table of exceptions, can be provided. Entry of lexicon membership is possible only if the corresponding lexica are generated from the destination file in an "off-line editing mode," separately from the actual operation of the navigation system, and have been stored in the (at least one) database, for example a CD-ROM or a remote database at a central location that can be accessed by corresponding communications devices such as a mobile radio network. Generation of the lexica in the "off-line editing mode" makes sense only if sufficient storage space is available in the (at least one) database, and it is especially suitable for lexica that are required very frequently. In particular, a CD-ROM or an external database can be used as the database for the destination file, since in this way the destination file can always be kept up to date.
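The rule-based conversion with an exception table described above can be pictured as follows. This is a sketch only: the rules and table entries are invented for illustration and are not the actual transcription rules of the system.

```python
# Exception table consulted first; irregular names are looked up, not converted.
EXCEPTIONS = {
    "flensburg": "fl'EnsbUrk",   # invented entry for illustration
}

# Toy grapheme-to-phoneme rules, applied in order (longer graphemes first).
RULES = [
    ("sch", "S"),
    ("ch", "x"),
    ("ei", "aI"),
    ("w", "v"),
]

def transcribe(name: str) -> str:
    """Rule-based automatic phonetic transcription with an exception table."""
    key = name.lower()
    if key in EXCEPTIONS:
        return EXCEPTIONS[key]
    phonetic = key
    for grapheme, phoneme in RULES:
        phonetic = phonetic.replace(grapheme, phoneme)
    return phonetic
```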
`
At the moment, not all possible place names in the Federal Republic of Germany have been digitized and stored in a database; similarly, a corresponding street list is not available for all locations. It is therefore important to be able to update the database at any time. An internal nonvolatile storage area of the navigation system can also be used as the database for the (at least one) lexicon generated in the "off-line editing mode."

To facilitate more rapid speech entry of a desired destination address into the navigation system, a basic vocabulary is loaded following the initialization phase of the navigation system or, with sufficiently large nonvolatile internal storage, each time the database is changed; this vocabulary comprises at least one basic lexicon generated from the destination file. The basic lexicon can be generated in the "off-line editing mode" and stored in the database in addition to the destination file, or it can be stored in a nonvolatile internal memory area of the navigation system. As an alternative, generation of the basic lexicon can wait until after the initialization phase. Dynamic generation of lexica during real-time operation of the navigation system, in other words during operation, offers two important advantages: first, it creates the possibility of putting together any desired lexica from the data stored in the (at least one) database; second, considerable storage space is saved in the (at least one) database, since not all of the lexica required for the various input dialogues need to be stored there prior to activation of the speech recognition engine.
In the embodiment described below, the basic vocabulary comprises two lexica generated in the "off-line editing mode" and stored in the (at least one) database, and two lexica generated following the initialization phase. If the speech recognition device has sufficient working memory, the basic vocabulary is loaded into it after the initialization phase, in addition to the admissible speech commands for the speech dialogue system, as described in the above-mentioned German patent application P 195 33 541.4-52. Following the initialization phase and pressing of the PTT (push-to-talk) button, the speech dialogue system then allows the input of various information to control the devices connected to the speech dialogue system, as well as to perform the basic functions of a navigation system and to enter a destination location and/or a street as the destination address for the navigation system. If the speech recognition device has insufficient RAM, the basic vocabulary is not loaded into it until a suitable operating function that accesses the basic vocabulary has been activated.
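This memory-dependent strategy is an eager-versus-lazy loading decision. A minimal sketch, assuming a hypothetical recognizer interface and capacity threshold, neither of which the patent specifies:

```python
class VocabularyLoader:
    """Load the basic vocabulary at initialization if memory allows,
    otherwise defer loading until an operating function first needs it."""

    def __init__(self, recognizer, basic_vocabulary, min_free_words=2000):
        self.recognizer = recognizer
        self.basic_vocabulary = basic_vocabulary
        self.loaded = False
        if recognizer.free_capacity() >= min_free_words:  # assumed interface
            self._load()            # sufficient working memory: load at init

    def _load(self):
        self.recognizer.add_vocabulary(self.basic_vocabulary)
        self.loaded = True

    def activate(self, operating_function):
        if not self.loaded:         # insufficient RAM: defer until first use
            self._load()
        operating_function()
```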
The basic lexicon, stored in at least one database, comprises the "p" largest cities in the Federal Republic of Germany, with the parameter "p" in the design described being set at 1000. This makes approximately 53 million citizens of the FRG, or 65% of the population, directly addressable; the basic lexicon comprises all locations with more than 15,000 inhabitants. A regional lexicon, also stored in the database, includes "z" names of regions and areas such as Bodensee, Schwäbische Alb, etc., the regional lexicon in the version described comprising about 100 names. The regional lexicon is used to find known areas and conventional regional names; these names cover combinations of place names that can be generated and loaded as a new regional lexicon after the local or regional name is spoken. An area lexicon, generated only after initialization, comprises "a" dynamically loaded place names in the vicinity of the actual vehicle location, so that even smaller places in the immediate vicinity can be addressed directly, with the parameter "a" in the embodiment described being set at 400.
`
This area lexicon is constantly updated at certain intervals while driving, so that locations in the immediate vicinity can always be addressed directly. The current vehicle location is reported to the navigation system by a positioning system known from the prior art, for example a global positioning system (GPS). The lexica described so far are assigned to the speaker-independent speech recognition engine. A name lexicon, which is not generated from the destination file and is assigned to the speaker-dependent speech recognition engine, comprises approximately 150 keywords from the personal address list of the user, spoken by the user. Each keyword is then assigned a certain destination address from the destination file by the input dialogue "store address." These specific destination addresses are transferred to the navigation system by speech input of the associated keywords using the input dialogue "call up address." This results in a basic vocabulary of about 1,650 words that are recognized by the speech recognition device and can be entered as words spoken in isolation (place names, street names, keywords).
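The periodic area-lexicon update described at the start of this passage might look as follows. This is a sketch only, since the patent does not specify how proximity is computed; the record schema is assumed:

```python
import math

def update_area_lexicon(destination_file: list[dict],
                        vehicle_position: tuple[float, float],
                        a: int = 400) -> list[dict]:
    """Recompute the area lexicon: the `a` places nearest the current vehicle
    location, ranked by the geographic code stored with each entry."""
    lon0, lat0 = vehicle_position

    def distance(entry: dict) -> float:
        lon, lat = entry["geographic_code"]   # (longitude, latitude), as in Table 1
        # Flat-earth approximation: adequate for ranking nearby places.
        return math.hypot((lon - lon0) * math.cos(math.radians(lat0)), lat - lat0)

    return sorted(destination_file, key=distance)[:a]

# Called at certain intervals while driving, with the GPS fix as input:
# area_lexicon = update_area_lexicon(destination_file, gps_position)
```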
Provision can also be made for transferring addresses from an external data source, for example a PDA (personal digital assistant) or a portable laptop computer, to the speech dialogue system or to the navigation system by means of data transfer, and for integrating them as an address lexicon into the basic vocabulary. Normally, no phonetic descriptions for the address data (name, destination location, street) are stored in the external data sources. Nevertheless, in order to be able to transfer these data into the vocabulary of the speech recognition device, an automatic phonetic transcription of these address data, especially the names, must be performed. Assignment to the correct destination location is then performed using a table.
For the sample dialogues described below, a destination file must be stored in the (at least one) database of the navigation system that contains a data set according to Table 1 for each place found in the navigation system. Depending on the storage location and availability, parts of the information entered can also be missing; however, this applies only to data used to resolve ambiguities, for example the additional naming component, county, telephone area code, etc. If address data from an outside data source are used, the address data must be supplemented accordingly. Especially important are the word subunits for the speech recognition device, which operates with hidden Markov model speech recognition engines (HMM recognition engines).
`
TABLE 1

Description of Entry                  Example
Place Name                            Flensburg
Political Affiliation or
  additional naming component
Postal Code or Postal Code Range      24900-24999
Telephone Area Code                   0461
County                                Flensburg, county
State                                 Schleswig-Holstein
Population                            87,526
Geographic Code                       9.43677, 54.78204
Phonetic Description                  |fl'Ens|bUrk|
Word Subunits for HMM Speech
  Recognition Engine
Lexicon Membership                    3, 4, 78 . . .
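Expressed as a record type, a Table 1 entry might look like this. It is a sketch only: the field names and types are illustrative, since the patent prescribes no storage schema.

```python
from dataclasses import dataclass

@dataclass
class DestinationEntry:
    """One destination-file record mirroring Table 1."""
    place_name: str                        # e.g. "Flensburg"
    naming_component: str | None           # political affiliation / added name
    postal_code_range: str                 # "24900-24999"
    telephone_area_code: str               # "0461"
    county: str                            # "Flensburg, county"
    state: str                             # "Schleswig-Holstein"
    population: int                        # 87526
    geographic_code: tuple[float, float]   # (9.43677, 54.78204)
    phonetic_description: str              # input form for the recognizer
    word_subunits: list[str]               # units for the HMM recognition engine
    lexicon_membership: list[int]          # e.g. [3, 4, 78]
```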
`
Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 is a schematic diagram providing an overview of the possible input dialogues for speech input of a destination address for a navigation system according to the invention;

FIG. 2 is a schematic representation of a flowchart of a first embodiment of the input dialogue "destination location input";

FIG. 3 is a schematic view of a flowchart of a second embodiment of the input dialogue "destination location input";

FIG. 4 is a schematic view of a flowchart for the input dialogue "choose from list";

FIG. 5 is a schematic view of a flowchart for the input dialogue "resolve ambiguity";

FIG. 6 is a schematic diagram of a flowchart for the input dialogue "spell destination location";

FIG. 7 is a schematic view of a flowchart for the input dialogue "coarse destination input";

FIG. 8 is a schematic view of a flowchart for the input dialogue "store address";

FIG. 9 is a schematic view of a flowchart for the input dialogue "street input"; and

FIG. 10 is a schematic view of a block diagram of a device for performing the method according to the invention.
`
`DETAILED DESCRIPTION OF THE DRAWINGS
`
FIG. 1 shows an overview of the possible input dialogues for speech input of a destination address into a navigation system. A speech dialogue between a user and the speech dialogue system according to FIG. 1 begins, following the initialization phase, with a wait state 0, in which the speech dialogue system remains until the PTT button (push-to-talk button) is actuated, and to which the speech dialogue system returns after the speech dialogue is terminated. The user activates the speech dialogue system by actuating the PTT button in step 100. The speech dialogue system replies in step 200 with an acoustic output, for example a signal tone or a speech output, indicating to the user that the speech dialogue system is ready to receive a speech command. In step 300, the speech dialogue system waits for an admissible speech command in order, by means of dialogue and process control, to control the various devices connected to the speech dialogue system or to launch a corresponding input dialogue. However, only those admissible speech commands that relate to the navigation system are detailed here. The following speech commands relating to the various input dialogues of the navigation system can now be entered:

"Destination location input" E1: This speech command activates the input dialogue "destination location input."

"Spell destination location" E2: This speech command activates the input dialogue "spell destination location."

"Coarse destination input" E3: This speech command activates the input dialogue "coarse destination input."

"Postal code" E4 or "telephone area code" E5: The input dialogue "indirect input" is activated by these two speech commands.

"Street input" E6: This speech command activates the input dialogue "street input."

"Store address" E7: This speech command activates the input dialogue "store address."
`
"Call up address" E8: This speech command activates the input dialogue "call up address."

Instead of the above, of course, other terms can be used to activate the various input dialogues. In addition to the above speech commands, general speech commands can also be used to control the navigation system, for example "navigation information," "start/stop navigation," etc.
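The mapping from a recognized speech command to its input dialogue is a plain dispatch table. A minimal sketch with placeholder dialogue bodies; the real dialogues are those of FIGS. 2 through 9:

```python
def on_speech_command(command: str, dialogues: dict) -> None:
    """Dispatch an admissible speech command to its input dialogue; each
    dialogue loads its own lexica before running (see below)."""
    dialogue = dialogues.get(command)
    if dialogue is not None:
        dialogue()

# Placeholder dialogue bodies for illustration only.
dialogues = {
    "destination location input": lambda: print("E1"),
    "spell destination location": lambda: print("E2"),
    "coarse destination input": lambda: print("E3"),
    "postal code": lambda: print("E4: indirect input"),
    "telephone area code": lambda: print("E5: indirect input"),
    "street input": lambda: print("E6"),
    "store address": lambda: print("E7"),
    "call up address": lambda: print("E8"),
}

on_speech_command("street input", dialogues)   # prints "E6"
```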
After an input dialogue has been started by speaking the corresponding speech command, the corresponding lexica are loaded as the vocabulary into the speech recognition device. Following a successful speech input of the destination location as part of the destination address, by means of one of the input dialogues "destination location input" in step 1000, "spell destination location" in step 2000, "coarse destination input" in step 3000, or "indirect input" in step 4000, a check is made in step 350 whether or not a corresponding street list is available for the recognized destination location. If the check yields a negative result, a branch is made to step 450. If the check yields a positive result, a check is made in step 400 to determine whether or not the user wants to enter a street name. If the user responds to question 400 with "yes," the input dialogue "street input" is called up; if the user answers question 400 with "no," a branch is made to step 450. Question 400 is therefore implemented only if the street names for the corresponding destination location are included in the navigation system. In step 450, the recognized desired destination location is automatically updated by entering "center" or "downtown" as the street input, since only a complete destination address