US006188985B1

(12) United States Patent
Thrift et al.

(10) Patent No.: US 6,188,985 B1
(45) Date of Patent: Feb. 13, 2001

(54) WIRELESS VOICE-ACTIVATED DEVICE FOR CONTROL OF A PROCESSOR-BASED HOST SYSTEM

(75) Inventors: Philip R. Thrift, Dallas; Charles T. Hemphill, Allen, both of TX (US)

(73) Assignee: Texas Instruments Incorporated, Dallas, TX (US)

(*) Notice: Under 35 U.S.C. 154(b), the term of this patent shall be extended for 0 days.

(21) Appl. No.: 08/943,795

(22) Filed: Oct. 3, 1997

Related U.S. Application Data

(60) Provisional application No. 60/034,685, filed on Jan. 6, 1997.

(51) Int. Cl.7 .......................... G10L 15/00; H04N 5/44
(52) U.S. Cl. .......................... 704/275; 348/734
(58) Field of Search .......................... 704/275, 270; 348/734, 738

(56) References Cited

U.S. PATENT DOCUMENTS

5,199,080 * 3/1993 Kimura et al. .......................... 381/110
5,247,580 * 9/1993 Kimura et al. .......................... 381/110
5,636,211 * 6/1997 Newlin et al. .......................... 370/465
5,737,491 * 4/1998 Allen et al. .......................... 704/270
5,774,628 * 6/1998 Hemphill .......................... 704/255
5,796,394 * 8/1998 Wicks et al. .......................... 345/329
5,802,526 * 9/1998 Fawcett et al. .......................... 707/104
5,890,122 * 3/1999 Van Kleeck et al. .......................... 704/275
5,890,123 * 3/1999 Brown et al. .......................... 704/275
6,075,575 * 6/2000 Schein et al. .......................... 348/734

OTHER PUBLICATIONS

Holmes, "Speech Synthesis and Recognition," Chapman Hill, p. 109, 1988.*
Ballou, "Handbook for Sound Engineers," Howard Sams, p. 376, 1987.*
Dragon, "Dragon Dictate 1.0 for Windows," Dragon Systems, pp. 140, 13.*

* Cited by examiner

Primary Examiner—David R. Hudspeth
Assistant Examiner—Harold Zintel
(74) Attorney, Agent, or Firm—Robert L. Troike; Frederick J. Telecky, Jr.
(57) ABSTRACT

A hand-held wireless voice-activated device (10) for controlling a host system (11), such as a computer connected to the World Wide Web. The device (10) has a display (10a), a microphone (10b), and a wireless transmitter (10g) and receiver (10h). It may also have a processor (10e) and memory (10f) for performing voice recognition. A device (20) can be specifically designed for Web browsing, by having a processor (20e) and memory (20f) that perform both voice recognition and interpretation of the results of the voice recognition.

18 Claims, 3 Drawing Sheets
[Representative drawing: FIG. 1, a voice-activated control unit 10 (display, voice input transmitter, data receiver, processor, and memory holding the recognizer, grammar files, and dynamic grammar generator) in wireless communication with the host computer (processor, control interpreter, grammar files, and web browser).]
U.S. Patent, Feb. 13, 2001, Sheet 1 of 3, US 6,188,985 B1

[FIG. 1: Voice-activated control unit 10, containing a display, voice input transmitter, data receiver, processor, and memory holding the recognizer, grammar files, and dynamic grammar generator, in wireless communication with host computer 11, which contains a processor, control interpreter, grammar files, and WWW browser.]

[FIG. 5: Dynamic grammar creation. Grammar constraints 52 feed vocabulary 54; the vocabulary feeds online dictionary 56, which feeds pronunciations 58. Pronunciations 58 and speaker-independent continuous-speech phonetic models 60 feed user agent 64, which also receives speech 66 and context 68 inputs.]
U.S. Patent, Feb. 13, 2001, Sheet 2 of 3, US 6,188,985 B1

[FIG. 2: Voice-activated control unit 20, containing a display, processor, signal interface, and memory holding the recognizer, voice control interpreter, grammar files, and dynamic grammar generator, in wireless communication with host computer 21, which runs the WWW browser.]
U.S. Patent, Feb. 13, 2001, Sheet 3 of 3, US 6,188,985 B1

[FIG. 3: Example of a display provided by the speakable command process. FIG. 4: Portion of a Web page and its speakable links.]
WIRELESS VOICE-ACTIVATED DEVICE FOR CONTROL OF A PROCESSOR-BASED HOST SYSTEM

This application claims the benefit of Ser. No. 60/034,685, filed Jan. 6, 1997.

TECHNICAL FIELD OF THE INVENTION

The present invention relates generally to voice recognition devices, and more particularly to a wireless voice-controlled device that permits a user to browse a hypermedia network, such as the World Wide Web, with voice commands.
RELATED PATENT APPLICATIONS

This patent application is related to the following patent applications, each assigned to Texas Instruments Incorporated:

U.S. Pat. No. 5,774,628, entitled "Speaker-Independent Dynamic Vocabulary and Grammar in Speech Recognition"

U.S. patent application Ser. No. 08/419,229, entitled "Voice Activated Hypermedia Systems Using Grammatical Metadata"
BACKGROUND OF THE INVENTION

The Internet is a world-wide computer network, or more accurately, a world-wide network of networks. It provides an exchange of information and offers a vast range of services. Today, the Internet has grown so as to include all kinds of institutions, businesses, and even individuals at their homes.

The World-Wide Web ("WWW" or "Web") is one of the services available on the Internet. It is based on a technology known as "hypertext", in which a document has links to its other parts or to other documents. Hypertext has been extended so as to encompass links to any kind of information that can be stored on a computer, including images and sound. For example, using the Web, from within a document one can select highlighted words or phrases to get definitions, sources, or related documents, stored anywhere in the world. For this reason, the Web may be described as a "hypermedia" network.

The basic unit in the Web is a "page", a (usually) text-plus-graphics document with links to other pages. "Navigating" the Web primarily means moving around from page to page.

The idea behind the Web is to collect all kinds of data from all kinds of sources, avoiding the problems of incompatibilities by allowing a smart server and a smart client program to deal with the format of the data. This capability to negotiate formats enables the Web to accept all kinds of data, including multimedia formats, once the proper translation code is added to the servers and clients. The Web client is used to connect to and to use Web resources located on Web servers.

One type of client software used to access and use the Web is referred to as "web browser" software. This software can be installed on the user's computer to provide a graphic interface, where links are highlighted or otherwise marked for easy selection with a mouse or other pointing device.
SUMMARY OF THE INVENTION

One aspect of the invention is a wireless voice-activated control unit for controlling a processor-based host system, such as a computer connected to the World Wide Web. A compact hand-held unit has a microphone, a wireless audio input transmitter, a wireless data receiver, and a display. The microphone receives voice input from a user, thereby providing an audio input signal. The audio transmitter wirelessly transmits data derived from the audio signal to the host system. After the host acts on the audio input, it delivers a response in the form of image data wirelessly delivered to the receiver. A display generates and displays images represented by the image data.

Variations of the device can include a speaker for audio output information. The device can also have a processor and memory for performing front-end voice recognition processes or even all of the voice recognition.

An advantage of the invention is that it makes information on the Web more accessible and useful. Speech control brings added flexibility and power to the Web interface and makes access to information more natural.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates one embodiment of a wireless voice-activated control unit in accordance with the invention.

FIG. 2 illustrates another embodiment of a wireless voice-activated control unit, specially configured for translating and interpreting audio input from the user.

FIG. 3 illustrates an example of a display provided by the speakable command process.

FIG. 4 illustrates a portion of a Web page and its speakable links.

FIG. 5 illustrates a process of dynamically creating grammars for use by the voice recognizer of FIGS. 1 and 2.
DETAILED DESCRIPTION OF THE INVENTION

The invention described herein is directed to a wireless voice-activated device for controlling a processor-based host system. That is, the device is a voice-activated remote control device. In the example of this description, the host system is a computer connected to the World-Wide Web and the device is used for voice-controlled web browsing. However, the same concepts can be applied to a voice-controlled device for controlling any processor-based system that provides display or audio information, for example, a television.

Various embodiments of the device differ with regard to the "intelligence" embedded in the device. For purposes of the invention, the programming used to recognize an audio input and to interpret the audio input so that it can be used by conventional web browser software is modularized in a manner that permits the extent of embedded programming to become a matter of design and cost.

FIG. 1 illustrates one embodiment of a wireless voice-activated control unit 10 in accordance with the invention. It communicates with a host system 11. As stated above, for purposes of this description, host system 11 is a computer and is in data communication with the World-Wide Web.

Control unit 10 has a display 10a and a microphone 10b. Display 10a is designed for compactness and portability, and could be an LCD. Microphone 10b receives voice input from a user. It may have a "mute" switch 10c, so that control unit 10 can be on, displaying images and even receiving non-audio input via an alternative input device such as a keypad (not shown), but not performing voice recognition. Microphone 10b may be a microphone array, to enhance the ability to differentiate the user's voice from other sounds.
In the embodiment of FIG. 1, control unit 10 performs all or part of the voice recognition process and delivers speech data to host computer 11 via transmitter 10g. Host computer 11 performs various voice control interpretation processes and also executes a web browser. However, in its simplest form, control unit 10 would transmit audio data directly from microphone 10b to host system 11, which would perform all processing.

In the case where control unit 10 performs all or part of the voice recognition process, control unit 10 has a processor 10e. Memory 10f stores voice recognition programming to be executed by processor 10e. An example of a suitable processor 10e for speech recognition is a signal processor, such as those manufactured by Texas Instruments Incorporated. Where microphone 10b is a microphone array, processor 10e may perform calculations for targeting the user's voice.

If control unit 10 performs only some voice processing, it may perform one or more of the "front end" processes, such as linear predictive coding (LPC) analysis or speech end pointing.
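As a rough illustration of such a front-end step, the following sketch computes LPC coefficients for a single speech frame using the autocorrelation method and the Levinson-Durbin recursion. The frame length, model order, and windowing are assumptions for illustration; the patent does not specify the control unit's actual front-end algorithm.

import numpy as np

def lpc_coefficients(frame: np.ndarray, order: int = 10) -> np.ndarray:
    """Return prediction coefficients a[0..order] (a[0] = 1) for one frame."""
    frame = frame * np.hamming(len(frame))      # taper frame edges
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: len(frame) - k], frame[k:])
                  for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                                  # assumes a non-silent frame
    for i in range(1, order + 1):               # Levinson-Durbin recursion
        acc = r[i] + np.dot(a[1:i], r[1:i][::-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a

frame = np.sin(2 * np.pi * 0.01 * np.arange(240))  # stand-in for 30 ms at 8 kHz
print(lpc_coefficients(frame, order=10))

The coefficients (or a quantized form of them) rather than raw audio would then be transmitted to the host, reducing the wireless bandwidth required.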
If control unit 10 performs all voice recognition processes, memory 10f stores these processes (as a voice recognizer) as well as grammar files. In operation, the voice recognizer receives audio input from microphone 10b, and accesses the appropriate grammar file. A grammar file handler converts the grammar to speech-ready form, creating a punctuation grammar, and loading the grammar into the voice recognizer. The voice recognizer uses the grammar file to convert the audio input to a text translation.
The grammar files in memory 10f may be pre-defined and stored, may be dynamically created, or may be a combination of both types of grammar files. An example of dynamic grammar file creation is described below in connection with FIG. 5. The grammars may be written with the Backus-Naur form of context-free grammars and can be customized. In the embodiment of FIG. 1, and where unit 10 is used for Web browsing, host computer 11 delivers the HTML (hypertext markup language) for a currently displayed Web page to unit 10. Memory 10f stores a grammar file generator for dynamically generating the grammar. In alternative Web browsing embodiments, host 11 could dynamically generate the grammar and download the grammar file to control unit 10.
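A minimal sketch of what such a grammar file generator might do, assuming the host delivers ordinary HTML: extract the link/URL pairs from anchor tags and emit one BNF-style rule per link. The rule naming and grammar syntax here are illustrative, not the patent's actual file format.

from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects (link text, URL) pairs from <a href=...> elements."""
    def __init__(self):
        super().__init__()
        self.links = []          # list of (text, url)
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((text, self._href))
            self._href = None

def links_grammar(html: str) -> str:
    """Emit one grammar rule per link found in the delivered HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    return "\n".join(f'<link_{i}> ::= "{text}"'
                     for i, (text, _) in enumerate(parser.links))

page = '<p><a href="/news/42">Diana in N.Y.</a></p>'
print(links_grammar(page))   # <link_0> ::= "Diana in N.Y."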
The output of the voice recognizer is speech data. The speech data is transmitted to host system 11, which performs voice control interpretation processes. Various voice control interpretation processes for voice-controlled Web browsing are described in U.S. patent application Ser. No. 08/419,229, entitled "Voice Activated Hypermedia Systems Using Grammatical Metadata", assigned to Texas Instruments Incorporated and incorporated herein by reference. As a result of the interpretation, the host system 11 may respond to the voice input to control unit 10 by executing a command or providing a hypermedia (Web) link.
An example of voice control interpretation other than for Web browsing is for commands to a television, where host system 11 is a processor-based television system. For example, the vocal command, "What's on TV tonight?", would result in a display of the television schedule. Another example of voice control interpretation other than for Web browsing is for commands for computer-based household control. The vocal command, "Show me the sprinkler schedule", would result in an appropriate display.

After host system 11 has taken the appropriate action, a wireless receiver 10h receives data from host system 11 for display on display 10a or for output by speaker 10d. Thus, the data received from host system 11 may be graphical (including text, graphics, images, and video) or audio.
FIG. 2 illustrates an alternative embodiment of the invention, a wireless voice-activated control unit 20 that performs voice control interpretation as well as voice recognition. The voice control interpretation is specific to browsing a hypermedia resource, such as the Web. The host system 21 is connected to the hypermedia resource.

Control unit 20 has components similar to those of control unit 10. However, its processor 20e performs additional programming stored in memory 20f. Specifically, the voice control interpretation processes may comprise a speakable command process, a speakable hotlist process, or a speakable links process. These processes and their associated grammar files reside on control unit 20.

The speakable command process displays a command interface on display 20a and accepts various Web browsing commands. The process has an associated grammar file for the words and phrases that may be spoken by the user.
FIG. 3 illustrates an example of a display 30 provided by the voice control interpretation process. One speakable command is a "Help" command, activated with a button 31. In response, the command process displays a "help page" that describes how to use voice-controlled browsing. Another speakable command is, "Show me my speakable command list". Speaking this command displays a page listing a set of grammars, each representing a speakable command. Examples are pagedown_command, back_command, and help_command. When the command process receives a translation of one of these commands, it performs the appropriate action.
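A minimal sketch of that dispatch step, using the grammar names mentioned above; the browser object and its methods are hypothetical stand-ins, not an API from the patent.

class StubBrowser:
    """Hypothetical stand-in for the browser on the host computer."""
    def page_down(self): print("scrolling down")
    def go_back(self): print("going back")
    def show_help_page(self): print("showing help page")

def dispatch_command(translation: str, browser: StubBrowser) -> None:
    # Grammar names from the text; each maps to a browser action.
    commands = {
        "pagedown_command": browser.page_down,
        "back_command": browser.go_back,
        "help_command": browser.show_help_page,
    }
    action = commands.get(translation)
    if action is not None:
        action()        # perform the appropriate action

dispatch_command("back_command", StubBrowser())   # prints "going back"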
FIG. 3 also illustrates a feature of the voice recognizer that is especially useful for Web browsing. The user has spoken the words, "What is the value of XYZ stock?" Once the voice recognizer recognizes an utterance, it determines the score and various statistics for time and memory use. As explained below, the request for a stock value can be a hotlist item, permitting the user to simply voice the request without identifying the Web site where the information is located.

Another speakable command is "Show me my speakable hotlist", activated by button 33. A "hotlist" is a stored list of selected Uniform Resource Locators (URLs), such as those that are frequently used. Hotlists are also known as bookmarks. URLs are a well known feature of the Web, and provide a short and consistent way to name any resource on the Internet. A typical URL has the following form:

http://www.ncsa.uiuc.edu/General/NCSAHome.html

The various parts of the URL identify the transmission protocol, the computer address, and a directory path at that address. URLs are also known as "links" and "anchors".
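For illustration, Python's standard library decomposes the example URL into the parts just described:

from urllib.parse import urlparse

parts = urlparse("http://www.ncsa.uiuc.edu/General/NCSAHome.html")
print(parts.scheme)   # 'http'                    -> transmission protocol
print(parts.netloc)   # 'www.ncsa.uiuc.edu'       -> computer address
print(parts.path)     # '/General/NCSAHome.html'  -> directory path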
The speakable hotlist process permits the user to construct a grammar for each hotlist item and to associate the grammar with a URL. To create the grammar, the user can edit an ASCII grammar file and type in the grammar using the BNF syntax. For example, a grammar for retrieving weather information might define phrases such as, "How does the weather look today?" and "Give me the weather". The user then associates the appropriate URL with the grammar.
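A minimal sketch of that association, assuming recognized phrases arrive as plain text. The weather URL is a hypothetical placeholder, and the matching is far simpler than a real grammar-based recognizer.

from typing import Optional

# Phrases the user might put in the ASCII grammar file (BNF in the
# patent; listed here as plain alternatives for simplicity).
WEATHER_PHRASES = [
    "how does the weather look today",
    "give me the weather",
]

# Associate each phrase with a URL (hypothetical placeholder).
hotlist = {p: "http://weather.example.com/today" for p in WEATHER_PHRASES}

def lookup_hotlist(translation: str) -> Optional[str]:
    """Return the URL for a recognized hotlist phrase, if any."""
    return hotlist.get(translation.lower().strip(" ?.,!"))

print(lookup_hotlist("How does the weather look today?"))
# http://weather.example.com/today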
The hotlist grammar file can be modified by voice. For example, a current page can be added as a hotlist item. Speaking the phrase, "Add this page to my hotlist", adds the title of the page to the grammar and associates that grammar with the current URL. Speaking the phrase, "Edit my speakable hotlist", permits the user to edit the grammar by adding additional phrases that will cause the page to be retrieved by voice.
The speakable hotlist process is activated when the voice recognizer recognizes a hotlist translation from the hotlist grammar file and passes the translation to the hotlist process. The hotlist process looks up the associated URL. It passes the URL to the browser residing on host computer 11 (via wireless communication), so that the Web page may be retrieved and transmitted to the voice control unit 10 for display on display 10a.

The grammar files for speakable commands and the speakable hotlist are active at all times. This permits the user to speak the commands or hotlist links in any context. A speakable links process may also reside in memory 20f of voice control unit 20. Selected information in a Web page may provide links, for access to other web pages. Links are indicated as such by being underlined, highlighted, differently colored, outlined as in the case of pictures, or otherwise identified. Instead of using a mouse or other pointing device to select a link, the user of voice control unit 10 may speak a link from a page being displayed on display 10a.

FIG. 4 illustrates a portion of a Web page 40 and its links. For example, the second headline 41 is a link.

The grammar for speakable links includes the full phrase as well as variations. In addition to speaking the full phrase, the speaker may say "Diana in N period Y period" (a literal variation), "Diana in NY", or "Diana in New York".
Making a link speakable first requires obtaining the link/URL pair from its Web page. Because a Web page in HTML (hypertext markup language) format can have any length, the number of candidate link/URL pairs that the recognizer searches may be limited to those that are visible on a current screen of display 20a. A command such as, "Scroll down", updates the candidate link/URL pairs. Once the link/URL pairs for a screen are obtained, a grammar is created for all the links on the current screen. Next, tokens in the links are identified and grammars for the tokens are created. These grammars are added to the recognizer's grammar files. Correct tokenization is challenging because link formats can vary widely. Links can include numbers, acronyms, invented words, and novel uses of punctuation.
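A rough sketch of link tokenization and variant generation, reproducing the "N period Y period" style of literal variation described earlier; the variant rules shown are illustrative, not the patent's actual set.

import re

def tokenize_link(text: str) -> list:
    """Split link text into word, number, and punctuation tokens."""
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)

def spoken_variants(text: str) -> list:
    tokens = tokenize_link(text)
    # Literal variation: speak punctuation ("." -> "period")
    literal = " ".join("period" if t == "." else t for t in tokens)
    # Plain variation: drop punctuation entirely ("N.Y." -> "N Y")
    plain = " ".join(t for t in tokens if t.isalnum())
    return [literal, plain]

for v in spoken_variants("Diana in N.Y."):
    print(v)
# Diana in N period Y period
# Diana in N Y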
Other challenges for speakable links are the length of links, ambiguity of links in the same page, and graphics containing bit-mapped links. For long links, the speakable links process permits the user to stop speaking the words in a link any time after N words. For ambiguity, the process may either default to the first URL or it may offer a choice of URLs to the user. For bit-mapped links, the process uses an <ALT> tag to look for link information.

The grammars for speakable links may be dynamically created so that only the grammar for a current display is active and is updated when a new current display is generated. Dynamic grammar creation also reduces the amount of required memory 10f.
FIG. 5 illustrates a suitable process of dynamically creating grammar files. This is the process implemented by the dynamic grammar generator of FIGS. 1 and 2. As explained above, dynamic grammar files are created from current Web pages so that speakable links may be recognized. U.S. Pat. No. 5,774,628, incorporated by reference above, further describes this method as applied to a voice-controlled host system 11, that is, voice control without a separate remote control device 10.

A display, such as the display 40 of FIG. 4, affects grammar constraints 52. The grammar constraints 52 are input into a vocabulary 54 and the user agent 64. In turn, the vocabulary 54 feeds the online dictionary 56, which inputs into the pronunciations module 58. The pronunciations module 58, as well as the Speaker Independent Continuous Speech Phonetic Models module 60, input into the User Agent 64. In addition, the Speech module 66 inputs the user's speech into the User Agent 64. In parallel, the Context module 68 gets inputs from the screen 40 and inputs into the User Agent 64.
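A toy sketch of the pronunciation step, assuming a dictionary lookup with a letter-spelling fallback for tokens the online dictionary does not contain; the phone symbols and data below are illustrative only.

# Tiny stand-ins for an online pronunciation dictionary and a
# letter-spelling table (both hypothetical, not the patent's data).
PRONUNCIATIONS = {"diana": "d ay ae n ax", "in": "ih n"}
LETTER_SOUNDS = {"n": "eh n", "y": "w ay"}

def pronounce(token: str) -> str:
    word = token.lower()
    if word in PRONUNCIATIONS:
        return PRONUNCIATIONS[word]
    # Fallback: spell the token letter by letter, e.g. "NY" -> "eh n w ay"
    return " ".join(LETTER_SOUNDS.get(ch, ch) for ch in word)

for tok in ["Diana", "in", "NY"]:
    print(tok, "->", pronounce(tok))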
An existing RGDAG (Regular Grammar Directed Acyclic Graph) may dynamically accommodate new syntax and vocabulary. Every time the screen 40 changes, the user agent 64 creates a grammar containing the currently visible underlined phrases (links). From this grammar, the user agent 64 tokenizes the phrases to create phrase grammars that can include, for example, optional letter spelling and deleted/optional punctuation. From the tokens, the user agent 64 creates phonetic pronunciation grammars using a combination of online dictionaries and a text-to-phoneme mapping. The voice recognition process then adds the grammars created. This involves several simple bookkeeping operations for the voice recognizer, including identifying which symbols denote "words" to output. Finally, global changes are implemented to incorporate the new/changed grammars. For this, the grammars are connected in an RGDAG relationship. In addition, the maximum depth for each symbol is computed. It is also determined whether the voice recognizer requires parse information by looking for ancestor symbols with output. Then the structure of the grammar for efficient parsing is identified.
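One of these bookkeeping steps, computing the maximum depth for each symbol, can be sketched by treating the RGDAG as a mapping from each symbol to the symbols its rules reference. The graph representation and symbol names here are assumptions for illustration.

from functools import lru_cache

# symbol -> symbols its rules reference (terminals reference nothing)
rgdag = {
    "page_links": ["link_phrase"],
    "link_phrase": ["token"],
    "token": [],
}

@lru_cache(maxsize=None)
def max_depth(symbol: str) -> int:
    """Longest chain of rule references below this symbol (0 at terminals)."""
    children = rgdag.get(symbol, [])
    return 0 if not children else 1 + max(max_depth(c) for c in children)

print(max_depth("page_links"))  # 2

Because the graph is acyclic, the memoized recursion visits each symbol once, so the depths for a whole screen's grammars can be recomputed cheaply on every display change.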
Other Embodiments

Although the invention has been described with reference to specific embodiments, this description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiments, as well as alternative embodiments, will be apparent to persons skilled in the art. It is, therefore, contemplated that the appended claims will cover all modifications that fall within the true scope of the invention.

What is claimed is:

1. A wireless voice-activated control system comprising:
a remote processor-based host system;
a voice recognition processor operable to perform a voice recognition process and a memory that stores said voice recognition process and grammar files; and
a voice activated control unit for remotely controlling said remote processor-based host system comprising:
a microphone operable to receive voice command input from a user, thereby providing an audio input signal; said microphone operably coupled to said voice recognition processor, said memory and said grammar files for voice recognition of said voice commands;
an audio transmitter operable to wirelessly transmit data derived from said audio input signal to said host system to control said host system;
a data receiver operable to wirelessly receive image data from said host system representing voice commanded display images; and
a display operable to generate and display said voice commanded images represented by said image data.

2. The control unit of claim 1, wherein said microphone is switchable to an on or off state separately from said display.

3. The control unit of claim 1, wherein said microphone is a multi-element microphone array.

4. The system of claim 1, wherein said voice recognition process comprises linear predictive coding analysis, and wherein said transmitter is operable to transmit the results of said analysis.

5. The system of claim 1, wherein said grammar files are dynamically created, wherein said processor is further operable to perform a dynamic grammar generation process.
6. The system of claim 1, wherein said voice recognition processor comprises speech end pointing analysis and wherein said transmitter is operable to transmit the result of said analysis.

7. A wireless voice-activated control system for voice-control of a remote processor-based host system in data communication with a hypermedia resource to permit a user to browse a hypermedia network, comprising:
said remote processor-based host system including a web browser and in data communication with a hypermedia resource;
a voice recognition processor operable to perform a voice recognition process and a memory that stores said voice recognition process and grammar files; and
a voice-activated control unit for remotely controlling said remote processor-based host system comprising:
a microphone operable to receive voice browser commands input from a user, thereby generating an audio input signal; said microphone operably coupled to said voice recognition processor, said memory and said grammar files for voice recognition of said voice commands;
an audio transmitter operable to wirelessly transmit data representing browser commands derived from said audio input signal to said remote processor-based host system to cause said host system to browse said hypermedia network and retrieve a selected web page;
a data receiver operable to wirelessly receive image data representing a selected web page from said remote host system; and
a display operable to generate and display web page images represented by said image data and retrieved from said hypermedia resource by said host system.

8. The system of claim 7, wherein said voice recognition processor, said memory, and said grammar files are in said control unit.
9. The system of claim 7, wherein said voice recognition processor comprises linear predictive coding analysis, and wherein said transmitter is operable to transmit the results of said analysis.

10. The system of claim 7, wherein said voice recognition processor comprises speech end pointing analysis, and wherein said transmitter is operable to transmit the results of said analysis.

11. The system of claim 7, wherein said grammar files are dynamically created, wherein said processor is further operable to perform a dynamic grammar generation process.

12. The system of claim 7, further comprising a processor operable to perform voice control processes and a memory that stores said voice control processes.

13. The system of claim 12, wherein said voice control processes comprise a speakable commands process such that said user may vocally direct the operations of said host system.

14. The system of claim 12, wherein said voice control processes comprise a speakable hotlist process such that said user may vocally request a particular one of said resources to be retrieved by said host system.

15. The system of claim 12, wherein said voice control processes comprise a speakable links process such that said user may vocally request that a link on a current page being displayed on said display be retrieved by said host system.

16. The system of claim 7, further comprising a processor operable to perform dynamic grammar creation processes, and memory that stores said processes.

17. The system of claim 7, wherein said host system performs voice control processes.

18. The system of claim 7, wherein audio data from the microphone is sent to the host system which performs all processing.