US 6,757,718 B1

environments and various combinations thereof, including, by way of just a few examples: a general-purpose hardware microprocessor such as the Intel Pentium series; operating system software such as Microsoft Windows/CE, Palm OS, or Apple Mac OS (particularly for client devices and client-side processing), or Unix, Linux, or Windows/NT (the latter three particularly for network data servers and server-side processing), and/or proprietary information access platforms such as Microsoft's WebTV or the Diva Systems video-on-demand system.

2. Processing Methodology

The present invention provides a spoken natural language interface for interrogation of remote electronic databases and retrieval of desired information. A preferred embodiment of the present invention utilizes the basic methodology outlined in the flow diagram of FIG. 4 in order to provide this interface. This methodology will now be discussed.

a. Interpreting Spoken Natural Language Requests

At step 402, the user's spoken request for information is initially received in the form of raw (acoustic) voice data by a suitable input device, as previously discussed in connection with FIGS. 1-2. At step 404 the voice data received from the user is interpreted in order to understand the user's request for information. Preferably this step includes performing speech recognition in order to extract words from the voice data, and further includes natural language parsing of those words in order to generate a structured linguistic representation of the user's request.

Speech recognition in step 404 is performed using speech recognition engine 310. A variety of commercial-quality speech recognition engines are readily available on the market, as practitioners will know. For example, Nuance Communications offers a suite of speech recognition engines, including Nuance 6, its current flagship product, and Nuance Express, a lower-cost package for entry-level applications. As one other example, IBM offers the ViaVoice speech recognition engine, including a low-cost shrink-wrapped version available through popular consumer distribution channels. Basically, a speech recognition engine processes acoustic voice data and attempts to generate a text stream of recognized words.

Typically, the speech recognition engine is provided with a vocabulary lexicon of likely words or phrases that the recognition engine can match against its analysis of acoustical signals, for purposes of a given application. Preferably, the lexicon is dynamically adjusted to reflect the current user context, as established by the preceding user inputs. For example, if a user is engaged in a dialogue with the system about movie selection, the recognition engine's vocabulary may preferably be adjusted to favor relevant words and phrases, such as a stored list of proper names for popular movie actors and directors, etc.; whereas if the current dialogue involves selection and viewing of a sports event, the engine's vocabulary might preferably be adjusted to favor a stored list of proper names for professional sports teams, etc. In addition, a speech recognition engine is provided with language models that help the engine predict the most likely interpretation of a given segment of acoustical voice data, in the current context of phonemes or words in which the segment appears. In addition, speech recognition engines often echo to the user, in more or less real time, a transcription of the engine's best guess at what the user has said, giving the user an opportunity to confirm or reject.
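
By way of illustration only, the following Python sketch suggests one way such context-dependent lexicon adjustment might be realized. The dialogue contexts, vocabulary lists, and multiplicative "boost" weighting shown here are assumptions of the example, not features of any particular commercial recognition engine.

    # Hypothetical sketch: bias a recognizer's lexicon toward the current
    # dialogue context (invented data and weighting, for illustration only).

    BASE_LEXICON = {"weather": 1.0, "movie": 1.0, "score": 1.0}

    CONTEXT_VOCABULARY = {
        "movies": ["Clint Eastwood", "Unforgiven", "director", "western"],
        "sports": ["Yankees", "Lakers", "quarterback", "playoffs"],
    }

    def biased_lexicon(context, boost=5.0):
        """Copy the base lexicon, up-weighting words tied to the context."""
        lexicon = dict(BASE_LEXICON)
        for phrase in CONTEXT_VOCABULARY.get(context, []):
            # Favor, e.g., actor and director names during movie selection.
            lexicon[phrase] = lexicon.get(phrase, 1.0) * boost
        return lexicon

    print(biased_lexicon("movies"))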

In a further aspect of step 404, natural language interpreter (or parser) 320 linguistically parses and interprets the textual output of the speech recognition engine. In a preferred embodiment of the present invention, the natural-language interpreter attempts to determine both the meaning of spoken words (semantic processing) as well as the grammar of the statement (syntactic processing), using a system such as the Gemini Natural Language Understanding System developed by SRI International. The Gemini system is described in detail in publications entitled "Gemini: A Natural Language System for Spoken-Language Understanding" and "Interleaving Syntax and Semantics in an Efficient Bottom-Up Parser," both of which are currently available online at http://www.ai.sri.com/natural-language/projects/arpa-sls/nat-lang.html. (Copies of those publications are also included in an information disclosure statement submitted herewith, and are incorporated herein by this reference.)

Briefly, Gemini applies a set of syntactic and semantic grammar rules to a word string using a bottom-up parser to generate a logical form, which is a structured representation of the context-independent meaning of the string. Gemini can be used with a variety of grammars, including general English grammar as well as application-specific grammars. The Gemini parser is based on "unification grammar," meaning that grammatical categories incorporate features that can be assigned values, so that when grammatical category expressions are matched in the course of parsing or semantic interpretation, the information contained in the features is combined, and if the feature values are incompatible the match fails.
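
The unification behavior just described can be pictured with a minimal sketch, assuming feature structures are flat dictionaries; Gemini's actual machinery is considerably richer.

    # Minimal sketch of feature unification: matching grammatical categories
    # combines their features, and incompatible values make the match fail.

    def unify(features_a, features_b):
        """Merge two feature dictionaries, or return None on a value clash."""
        result = dict(features_a)
        for name, value in features_b.items():
            if name in result and result[name] != value:
                return None  # incompatible feature values: the match fails
            result[name] = value
        return result

    # A singular subject unifies with a singular verb form, not a plural one.
    print(unify({"number": "sg", "person": 3}, {"number": "sg"}))  # combined
    print(unify({"number": "sg"}, {"number": "pl"}))               # None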

It is possible for some applications to achieve a significant reduction in speech recognition error by using the natural-language processing system to re-score recognition hypotheses. For example, the grammars defined for a language parser like Gemini may be compiled into a context-free grammar that, in turn, can be used directly as a language model for speech recognition engines like the Nuance recognizer. Further details on this methodology are provided in the publication "Combining Linguistic and Statistical Knowledge Sources in Natural-Language Processing for ATIS," which is currently available online through http://www.ai.sri.com/natural-language/projects/arpa-sls/spnl-int.html. A copy of this publication is included in an information disclosure submitted herewith, and is incorporated herein by this reference.

In an embodiment of the present invention that may be preferable for some applications, the natural language interpreter "learns" from the past usage patterns of a particular user or of groups of users. In such an embodiment, the successfully interpreted requests of users are stored, and can then be used to enhance accuracy by comparing a current request to the stored requests, thereby allowing selection of a most probable result.
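
One way such a learning embodiment might select a most probable result is sketched below; the use of a string-similarity ratio and the 0.6 threshold are assumptions made purely for illustration.

    # Sketch: compare a current request against stored, successfully
    # interpreted requests and return the closest match (or None).
    import difflib

    STORED_REQUESTS = [
        "what's the weather in miami",
        "show me movies starring john wayne",
    ]

    def most_probable(request, history=STORED_REQUESTS):
        """Return the stored request most similar to the current one."""
        scored = [
            (difflib.SequenceMatcher(None, request.lower(), past).ratio(), past)
            for past in history
        ]
        score, best = max(scored)
        return best if score > 0.6 else None  # threshold chosen arbitrarily

    print(most_probable("whats the weather in miami?"))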

b. Constructing Navigation Queries

In step 405 request processing logic 300 identifies and selects an appropriate online data source where the desired information (in this case, current weather reports for a given city) can be found. Such selection may involve look-up in a locally stored table, or possibly dynamic searching through an online search engine, or other online search techniques. For some applications, an embodiment of the present invention may be implemented in which only access to a particular data source (such as a particular vendor's proprietary content database) is supported; in that case, step 405 may be trivial or may be eliminated entirely.

Step 406 attempts to construct a navigation query, reflecting the interpretation of step 404. This operation is preferably performed by query construction logic 330.

A "navigation query" means an electronic query, form, series of menu selections, or the like, being structured appropriately so as to navigate a particular data source of interest in search of desired information.

In other words, a navigation query is constructed such that it includes whatever content and structure is required in order to access desired information electronically from a particular database or data source of interest.

For example, for many existing electronic databases, a navigation query can be embodied using a formal database query language such as Structured Query Language (SQL). For many databases, a navigation query can be constructed through a more user-friendly interactive front-end, such as a series of menus and/or interactive forms to be selected or filled in. SQL is a standard interactive and programming language for getting information from and updating a database. SQL is both an ANSI and an ISO standard. As is well known to practitioners, a Relational Database Management System (RDBMS), such as Microsoft's Access, Oracle's Oracle7, or Computer Associates' CA-OpenIngres, allows programmers to create, update, and administer a relational database. Practitioners of ordinary skill in the art will be thoroughly familiar with the notion of database navigation through structured query, and will be readily able to appreciate and utilize the existing data structures and navigational mechanisms for a given database, or to create such structures and mechanisms where desired.

In accordance with the present invention, the query constructed in step 406 must reflect the user's request as interpreted by the speech recognition engine and the NL parser in step 404. In embodiments of the present invention wherein data source 110 (or 210 in the corresponding embodiment of FIG. 2) is a structured relational database or the like, step 406 of the present invention may entail constructing an appropriate Structured Query Language (SQL) query or the like, or automatically filling out a front-end query form, series of menus, or the like, as described above.
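
As a concrete illustration of this step, the sketch below builds a parameterized SQL query from interpreted request slots; the films table, its columns, and the slot names are invented for the example and do not reflect any particular data source.

    # Sketch: turn a structured interpretation (step 404) into a SQL
    # navigation query (step 406). Table and column names are invented.

    def build_navigation_query(interpretation):
        """Build a parameterized SQL query from interpreted request slots."""
        clauses, params = [], []
        for column, value in interpretation.items():
            clauses.append(f"{column} = ?")
            params.append(value)
        sql = "SELECT title FROM films WHERE " + " AND ".join(clauses)
        return sql, params

    # "that movie starring and directed by Clint Eastwood"
    print(build_navigation_query({"actor": "Clint Eastwood",
                                  "director": "Clint Eastwood"}))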

In many existing Internet (and Intranet) applications, an online electronic data source is accessible to users only through the medium of interaction with a so-called Common Gateway Interface (CGI) script. Typically the user who visits a web site of this nature must fill in the fields of an online interactive form. The online form is in turn linked to a CGI script, which transparently handles actual navigation of the associated data source and produces output for viewing by the user's web browser. In other words, direct user access to the data source is not supported; only mediated access through the form and CGI script is offered.

For applications of this nature, an advantageous embodiment of the present invention "scrapes" the scripted online site where information desired by a user may be found in order to facilitate construction of an effective navigation query. For example, suppose that a user's spoken natural language request is: "What's the weather in Miami?" After this request is received at step 402 and interpreted at step 404, assume that step 405 determines that the desired weather information is available online through the medium of a CGI-scripted interactive form. Step 406 is then preferably carried out using the expanded process diagrammed in FIG. 5. In particular, at sub-step 520, query construction logic 330 electronically "scrapes" the online interactive form, meaning that query construction logic 330 automatically extracts the format and structure of the input fields accepted by the online form. At sub-step 522, a navigation query is then constructed by instantiating (filling in) the extracted input format, essentially an electronic template, in a manner reflecting the user's request for information as interpreted in step 404. The flow of control then returns to step 407 of FIG. 4. Ultimately, when the query thus constructed by scraping is used to navigate the online data source in step 408, the query effectively initiates the same scripted response as if a human user had visited the online site and had typed appropriate entries into the input fields of the online form.
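
The two sub-steps can be pictured with the short sketch below, in which Python's standard html.parser stands in for a WebL-style extraction utility and the form HTML is invented; a real site's form would of course be fetched over HTTP first.

    # Sketch of sub-steps 520/522: extract the input fields accepted by an
    # online form (the "electronic template"), then instantiate the template
    # from the interpreted request.
    from html.parser import HTMLParser
    from urllib.parse import urlencode

    class FormScraper(HTMLParser):
        """Collect the names of <input> fields accepted by a form."""
        def __init__(self):
            super().__init__()
            self.fields = []

        def handle_starttag(self, tag, attrs):
            if tag == "input":
                name = dict(attrs).get("name")
                if name:
                    self.fields.append(name)

    FORM_HTML = '<form action="/weather"><input name="city"></form>'

    scraper = FormScraper()
    scraper.feed(FORM_HTML)                   # sub-step 520: extract template
    template = {field: None for field in scraper.fields}

    template["city"] = "Miami"                # sub-step 522: instantiate it
    print("/weather?" + urlencode(template))  # the same request a human's
                                              # form submission would produce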

In the embodiment just described, scraping step 520 is preferably carried out with the assistance of an online extraction utility such as WebL. WebL is a scripting language for automating tasks on the World Wide Web. It is an imperative, interpreted language that has built-in support for common web protocols like HTTP and FTP, and popular data types like HTML and XML. WebL's implementation language is Java, and the complete source code is available from Compaq. In addition, step 520 is preferably performed dynamically when necessary, in other words, on the fly in response to a particular user query; but in some applications it may be possible to scrape relatively stable (unchanging) web sites of likely interest in advance and to cache the resulting template information.

It will be apparent, in light of the above teachings, that preferred embodiments of the present invention can provide a spoken natural language interface atop an existing, non-voice data navigation system, whereby users can interact by means of intuitive natural language input not strictly conforming to the linear browsing architecture or other artifacts of an existing menu/text/click navigation system. For example, users of an appropriate embodiment of the present invention for a video-on-demand application can directly speak the natural request "Show me the movie 'Unforgiven'" instead of walking step-by-step through a typically linear sequence of genre/title/actor/director menus, scrolling and selecting from potentially long lists on each menu, or instead of being forced to use an alphanumeric keyboard that cannot be as comfortable to hold or use as a lightweight remote control. Similarly, users of an appropriate embodiment of the present invention for a web-surfing application in accordance with the process shown in FIG. 5 can directly speak the natural request "Show me a one-month price chart for Microsoft stock" instead of potentially having to navigate to an appropriate web site, search for the right ticker symbol, enter/select the symbol, and specify display of the desired one-month price chart, each of those steps potentially involving manual navigation and data entry to one or more different interaction screens. (Note that these examples are offered to illustrate some of the potential benefits offered by appropriate embodiments of the present invention, and not to limit the scope of the invention in any respect.)

c. Error Correction

Several problems can arise when attempting to perform searches based on spoken natural language input. As indicated at decision step 407 in the process of FIG. 4, certain deficiencies may be identified during the process of query construction, before search of the data source is even attempted. For example, the user's request may fail to specify enough information to construct a navigation query that is specific enough to obtain a satisfactory search result. For example, a user might orally request "what's the weather?" whereas the national online data source identified in step 405 and scraped in step 520 might require specifying a particular city.

Additionally, certain deficiencies and problems may arise following the navigational search of the data source at step 408, as indicated at decision step 409 in FIG. 4. For example, with reference to a video-on-demand application, a user may wish to see the movie "Unforgiven", but perhaps the user can't recall the name of the film, knowing only that it was directed by and starred actor Clint Eastwood.

A typical video-on-demand database might indeed be expected to allow queries specifying the name of a leading actor and/or director, but in the case of this query, as in many cases, that will not be enough to narrow the search to a single film, and additional user input in some form is required.

In the event that one or more deficiencies in the user's spoken request, as processed, result in the problems described, either at step 407 or 409, some form of error handling is in order. A straightforward, crude technique might be for the system to respond simply "input not understood/insufficient, please try again." However, that approach will likely result in frustrated users, and is not optimal or even acceptable for most applications. Instead, a preferred technique in accordance with the present invention handles such errors and deficiencies in user input at step 412, whether detected at step 407 or step 409, by soliciting additional input from the user in a manner taking advantage of the partial construction already performed and via user interface modalities in addition to spoken natural language ("multi-modality"). This supplemental interaction is preferably conducted through client display device 112 (202, in the embodiment of FIG. 2), and may include textual, graphical, audio and/or video media. Query refinement logic 340 preferably carries out step 412. The additional input received from the user is fed into and augments interpreting step 404, and query construction step 406 is likewise repeated with the benefit of the augmented interpretation. These operations, and subsequent navigation step 408, are preferably repeated until no remaining problems or deficiencies are identified at decision points 407 or 409. Further details and examples for this query refinement process are provided immediately below.
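
The loop through steps 404, 406, 407/409, and 412 can be summarized in a brief sketch; the required-field check and the solicit callback below are placeholders assumed for illustration, not the patent's own logic.

    # Sketch of the query refinement loop: check for deficiencies, solicit
    # supplemental input, and repeat until the query can be submitted.

    REQUIRED_FIELDS = ["city"]                 # e.g., learned by scraping

    def deficiencies(query):
        """Decision step 407: report any required fields still unfilled."""
        return [f for f in REQUIRED_FIELDS if query.get(f) is None]

    def refine(query, solicit):
        """Step 412: fill each missing field from supplemental user input."""
        while deficiencies(query):
            for field in deficiencies(query):
                query[field] = solicit(field)  # multi-modal in practice
        return query                           # ready for navigation, step 408

    # "what's the weather?" leaves the city unspecified; the system asks.
    print(refine({"city": None}, solicit=lambda field: "Miami"))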

Consider again the example in which the user of a video-on-demand application wishes to see "Unforgiven" but can only recall that it was directed by and starred Clint Eastwood. First, it bears noting that using a prior art navigational interface, such as a conventional menu interface, will likely be relatively tedious in this case. The user can proceed through a sequence of menus, such as Genre (select "western"), Title (skip), Actor ("Clint Eastwood"), and Director ("Clint Eastwood"). In each case, and especially for the last two items, the user would typically scroll and select from fairly long lists in order to enter his or her desired name, or perhaps use a relatively couch-unfriendly keypad to manually type the actor's name twice.

Using a preferred embodiment of the present invention, the user instead speaks aloud, holding remote control microphone 102: "I want to see that movie starring and directed by Clint Eastwood. Can't remember the title." At step 402 the voice data is received. At step 404 the voice data is interpreted. At step 405 an appropriate online data source is selected (or perhaps the system is directly connected to a proprietary video-on-demand provider). At step 406 a query is automatically constructed by query construction logic 330 specifying "Clint Eastwood" in both the actor and director fields. Step 407 detects no obvious problems, and so the query is electronically submitted and the data source is navigated at step 408, yielding a list of several records satisfying the query (e.g., "Unforgiven", "True Crime", "Absolute Power", etc.). Step 409 detects that additional user input is needed to further refine the query in order to select a particular film for viewing.

At that point, in step 412 query refinement logic 340 might preferably generate a display for client display device 112 showing the (relatively short) list of film titles that satisfy the user's stated constraints. The user can then preferably use a relatively convenient input modality, such as buttons on the remote control, to select the desired title from the menu. In a further preferred embodiment, the first title on the list is highlighted by default, so that the user can simply press an "OK" button to choose that selection. In a further preferred feature, the user can mix input modalities by speaking a response like "I want number one on the list." Alternatively, the user can preferably say, "Let's see Unforgiven," having now been reminded of the title by the menu display.

Utilizing the user's supplemental input, request processing logic 300 iterates again through steps 404 and 406, this time constructing a fully-specified query that specifically requests the Eastwood film "Unforgiven." Step 408 navigates the data source using that query and retrieves the desired film, which is then electronically transmitted in step 410 from network server 108 to client display device 112 via communications network 106.

Now consider again the example in which the user of a web surfing application wants to know his or her local weather, and simply asks, "what's the weather?" At step 402 the voice data is received. At step 404 the voice data is interpreted. At step 405 an online web site providing current weather information for major cities around the world is selected. At step 406 and sub-step 520, the online site is scraped using a WebL-style tool to extract an input template for interacting with the site. At sub-step 522, query construction logic 330 attempts to construct a navigation query by instantiating the input template, but determines (quite rightly) that a required field, the name of the city, cannot be determined from the user's spoken request as interpreted in step 404. Step 407 detects this deficiency, and in step 412 query refinement logic 340 preferably generates output for client display device 112 soliciting the necessary supplemental input. In a preferred embodiment, the output might display the name of the city where the user is located, highlighted by default. The user can then simply press an "OK" button, or perhaps mix modalities by saying "yes, exactly", to choose that selection. A preferred embodiment would further display an alphabetical scrollable menu listing other major cities, and/or invite the user to speak or select the name of the desired city.

Here again, utilizing the user's supplemental input, request processing logic 300 iterates through steps 404 and 406. This time, in performing sub-step 520, a cached version of the input template already scraped in the previous iteration might preferably be retrieved. In sub-step 522, query construction logic 330 succeeds this time in instantiating the input template and constructing an effective query, since the desired city has now been clarified. Step 408 navigates the data source using that query and retrieves the desired weather information, which is then electronically transmitted in step 410 from network server 108 to client display device 112 via communications network 106.
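
A minimal sketch of that caching behavior, assuming templates are keyed by the form's URL and that a scrape callback embodies sub-step 520, might look as follows.

    # Sketch: cache scraped input templates so a repeat iteration can skip
    # re-scraping. The scrape callback and URL are invented placeholders.

    _template_cache = {}

    def get_template(url, scrape):
        """Return a cached input template, scraping only on a cache miss."""
        if url not in _template_cache:
            _template_cache[url] = scrape(url)  # on the fly, first time only
        return dict(_template_cache[url])       # fresh copy to instantiate

    fake_scrape = lambda url: {"city": None}    # stands in for sub-step 520
    first = get_template("http://weather.example/form", fake_scrape)
    second = get_template("http://weather.example/form", fake_scrape)  # cached
    print(first, second)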

It is worth noting that in some instances, there may be details that are not explicitly provided by the user, but that query construction logic 330 or query refinement logic 340 may preferably deduce on their own through reasonable assumptions, rather than requiring the user to provide explicit clarification. For example, in the example previously described regarding a request for a weather report, in some applications it might be preferable for the system to simply assume that the user means a weather report for his or her home area and to retrieve that information, if the cost of doing so is not significantly greater than the cost of asking the user to clarify the query. Making such an assumption might be even more strongly justified in a preferred embodiment, as described earlier, where user histories are tracked, and where such history indicates that a particular user or group of users typically expects local information when asking for a weather forecast. At any rate, in the event such an assumption is made, if the user actually intended to request the weather for a different city, the user would then need to ask his or her question again. It will be apparent to practitioners, in light of the above teachings, that the choice of whether to program query construction logic 330 and query refinement logic 340 to make particular assumptions will typically involve trade-offs involving user convenience that can be assessed in the context of specific applications.

3. Open Agent Architecture (OAA®)

Open Agent Architecture™ (OAA®) is a software platform, developed by the assignee of the present invention, that enables effective, dynamic collaboration among communities of distributed electronic agents. OAA is described in greater detail in co-pending U.S. patent application Ser. No. 09/225,198, which has been incorporated herein by reference. Very briefly, the functionality of each client agent is made available to the agent community through registration of the client agent's capabilities with a facilitator. A software "wrapper" essentially surrounds the underlying application program performing the services offered by each client. The common infrastructure for constructing agents is preferably supplied by an agent library. The agent library is preferably accessible in the runtime environment of several different programming languages. The agent library preferably minimizes the effort required to construct a new system and maximizes the ease with which legacy systems can be "wrapped" and made compatible with the agent-based architecture of the present invention. When invoked, a client agent makes a connection to a facilitator, which is known as its parent facilitator. Upon connection, an agent registers with its parent facilitator a specification of the capabilities and services it can provide, using a high-level, declarative Interagent Communication Language ("ICL") to express those capabilities. Tasks are presented to the facilitator in the form of ICL goal expressions. When a facilitator determines that the registered capabilities of one of its client agents will help satisfy a current goal or sub-goal thereof, the facilitator delegates that sub-goal to the client agent in the form of an ICL request. The client agent processes the request and returns answers or information to the facilitator. In processing a request, the client agent can use ICL to request services of other agents, or utilize other infrastructure services for collaborative work. The facilitator coordinates and integrates the results received from different client agents on various sub-goals, in order to satisfy the overall goal.
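
The registration-and-delegation pattern described above is summarized in the toy sketch below, in which ICL goals are reduced to plain strings and agents to callables; OAA's actual facilities are, of course, far richer.

    # Toy sketch of the facilitator pattern: client agents register their
    # capabilities, and the facilitator delegates matching goals to them.

    class Facilitator:
        def __init__(self):
            self.registry = {}                 # capability -> client agent

        def register(self, capability, agent):
            """On connection, an agent declares a service it can provide."""
            self.registry[capability] = agent

        def solve(self, capability, goal):
            """Delegate a goal to an agent registered for the capability."""
            agent = self.registry.get(capability)
            if agent is None:
                raise LookupError(f"no agent registered for {capability!r}")
            return agent(goal)

    facilitator = Facilitator()
    facilitator.register("interpret_speech", lambda g: f"interpreted({g})")
    facilitator.register("video_query", lambda g: f"results for {g}")

    goal = facilitator.solve("interpret_speech", "movies starring John Wayne")
    print(facilitator.solve("video_query", goal))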

OAA provides a useful software platform for building systems that integrate spoken natural language as well as other user input modalities. For example, see the above-referenced co-pending patent application, especially FIG. 13 and the corresponding discussion of a "multi-modal maps" application, and FIG. 12 and the corresponding discussion of a "unified messaging" application. Another example is the InfoWiz interactive information kiosk developed by the assignee and described in the document entitled "InfoWiz: An Animated Voice Interactive Information System" available online at http://www.ai.sri.com/~oaa/applications.html. A copy of the InfoWiz document is provided in an Information Disclosure Statement submitted herewith and incorporated herein by this reference. A further example is the "CommandTalk" application developed by the assignee for the U.S. military, as described online at http://www.ai.sri.com/~lesaf/commandtalk.html and in the following publications, copies of which are provided in an Information Disclosure Statement submitted herewith and incorporated herein by this reference:

"CommandTalk: A Spoken-Language Interface for Battlefield Simulations", 1997, by Robert Moore, John Dowding, Harry Bratt, J. Mark Gawron, Yonael Gorfu and Adam Cheyer, in "Proceedings of the Fifth Conference on Applied Natural Language Processing", Washington, D.C., pp. 1-7, Association for Computational Linguistics

"The CommandTalk Spoken Dialogue System", 1999, by Amanda Stent, John Dowding, Jean Mark Gawron, Elizabeth Owen Bratt and Robert Moore, in "Proceedings of the Thirty-Seventh Annual Meeting of the ACL", pp. 183-190, University of Maryland, College Park, Md., Association for Computational Linguistics

"Interpreting Language in Context in CommandTalk", 1999, by John Dowding, Elizabeth Owen Bratt and Sharon Goldwater, in "Communicative Agents: The Use of Natural Language in Embodied Systems", pp. 63-67, Association for Computing Machinery (ACM) Special Interest Group on Artificial Intelligence (SIGART), Seattle, Wash.

For some applications and systems, OAA can provide an advantageous platform for constructing embodiments of the present invention. For example, a representative application is now briefly presented, with reference to FIG. 6. If the statement "show me movies starring John Wayne" is spoken into the voice input device, the voice data for this request will be sent by UI agent 650 to facilitator 600, which in turn will ask natural language (NL) agent 620 and speech recognition agent 610 to interpret the query and return the interpretation in ICL format. The resulting ICL goal expression is then routed by the facilitator to the appropriate agents (in this case, video-on-demand database agent 640) to execute the request. Video database agent 640 preferably includes or is coupled to an appropriate embodiment of query construction logic 330 and query refinement logic 340, and may also issue ICL requests to facilitator 600 for additional assistance (e.g., display of menus and capture of additional user input in the event that query refinement is needed), and facilitator 600 will delegate such requests to appropriate client agents in the community. When the desired video content is ultimately retrieved by video database agent 640, UI agent 650 is invoked by facilitator 600 to display the movie.

Other spoken user requests, such as a request for the current weather in New York City or for a stock quote, would eventually lead the facilitator to invoke web database agent 630 to access the desired information from an appropriate Internet site. Here again, web database agent 630 preferably includes or is coupled to an appropriate embodiment of query construction logic 330 and query refinement logic 340, including a scraping utility such as WebL. Other spoken requests, such as a request to view recent emails or access voice mail, would lead the facilitator to invoke the appropriate email agent 660 and/or telephone agent 680. A request to record a televised program of interest might lead facilitator 600 to invoke web database agent 630 to return televised program schedule information, and then invoke VCR controller agent 680 to program the associated VCR unit to record the desired television program at the scheduled time.

Control and connectivity embracing additional electronic home appliances (e.g., microwave oven, home surveillance system, etc.) can be integrated in comparable fashion. Indeed, an advantage of OAA-based embodiments of the present invention, that will be apparent to practitioners in light of the above teachings and in light of the teachings disclosed in the cited co-pending patent applications, is the relative ease and flexibility with which additional service agents can be plugged into the existing platform, immediately enabling the facilitator to respond dynamically to spoken natural language requests for the corresponding services.

4. Further Embodiments and Equivalents

While the present invention has been described in terms of several preferred embodiments, there are many alterations, permutations, and equivalents that may fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and apparatuses of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.

What is claimed is:

1. A method for speech-based navigation of an electronic data source located at one or more network servers located remotely from a user, wherein a data link is established between a mobile information appliance of the user and the one or more network servers, comprising the steps of:

(a) receiving a spoken request for desired information from the user utilizing the mobile information appliance of the user, wherein said mobile information appliance comprises a portable remote control device or a set-top box for a television;

(b) rendering an interpretation of the spoken request;

(c) constructing a navigation query based upon the interpretation;

(d) utilizing the navigation query to select a portion of the electronic data source; and

(e) transmitting the selected portion of the electronic data source from the network server to the mobile information appliance of the user.

2. The method of claim 1, wherein the step of rend
