throbber
United States Patent
`
`[19}
`
`[11] Patent Number:
`
`5,454,106
`
`Bums et a].
`
`[45] Date of Patent:
`
`Sep. 26, 1995
`
`”111 1111 III "III "III Ill” [[11 111 1111 "III "III 11" M III" 11
`USUOS454106A
`
`[54]
`
`DATABASE RETRIEVAL SYSTEM USING
`NATURAL LANGUAGE FOR PRESENTING
`UNDERSTOOD COMPONENTS OF AN
`AMIBIGUOUS QUERY ON A USER
`INTERFACE
`
`[75]
`
`Inventors: Luanne M. Burns, Ridgefield, Conn;
`Ashe]: Malhotra, Croton-Onnfludson,
`N.Y.
`-
`
`[73]
`
`Assignee:
`
`International Business Machines
`Corporation, Armonk, N.Y.
`
`[21]
`
`[22]
`
`[51]
`[52]
`
`[531
`
`[561
`
`Appl. No.: 62,502
`
`May 17, 1993
`
`Filed:
`Int. Cl.“i
`U.S. C1.
`
`
`
`Field of Search
`
`G06F 17130
`3951600: 364141908; 3641963.];
`36419746; 3641DIG. 2
`3951600, 419;
`3641419
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`111976 Goddard eta}.
`3,932,948
`4341314
`4,241,402 1211980 Mayper,.1r.eta3.
`3951575
`
`4,506,326
`311985 Shaw el al.
`.............
`3951100
`4,553,222
`1111985 Kurland et a1.
`.
`3641401
`
`4,570,217
`211986 Allen et a1.
`3641188
`
`4,670,848
`611987 Scbmmm ........
`.. 395160
`
`4,674,042
`611987 Hernandez etal.
`3641401
`
`4,688,195
`811987 Thompson et a1.
`.. 395112
`.
`
`4,688,198
`811987 Wiggins ..........
`367146
`3951147
`4,723,211
`211988 Barker et a].
`
`364141908
`411988 Katayama et a1.
`..
`4,736,296
`511988 Hirayama et a1.
`..
`4,747,059
`3641496
`
`.. 395176
`4,803,641
`211989 Hardy et a1.
`
`3951146
`4,815,029
`311989 Barker et 81.
`
`.
`4,816,994
`311989 Freiling eta].
`395175
`511989 Tennantet a1.
`.
`364141908
`4,829,423
`
`4,829,427
`511989 Green
`..... 3641300
`
`511989 Yamatla ........... 3951133
`4,831,580
`
`4,349,398
`71.1989 Adj .............
`.. 364141921
`4,866,634
`911989 Reboh el al.
`....... 395176
`111990 Den- et a]. ............. 395176
`4,891,766
`
`364141903
`4,914,590
`411990 Loattnan et al.
`
`611990 Isle et a].
`395111
`4,931,950
`711990 Okajima et a1.
`364141908
`4,942,526
`711990 Lark etal.
`...... 395176
`4,943,932
`1111990 Amirghodsi et a].
`..
`3951275
`4,974,191
`
`1211990 Hutchins ............
`39512.63
`4,980,917
`......... 3641419
`211991 Asakawa
`4,994,967
`
`711991 lto ..............
`364141908
`5,029,085
`1111991 Pajak et a1. .......... 3951159
`5,065,347
`
`211992 Spielmal et 31.
`. 3951158
`5.088.052
`5.175.814 1211992 Articltetal.
`. 3951161
`5,301,317
`411994 Lehman et al.
`3951600
`
`..
`364141908
`5,369,575
`1111994 [amberti et a1.
`.................. 364141908
`5,377,103
`1211994 Lambefii el al.
`
`OTHER PUBLICATIONS
`
`M. D. Williams, “What Makes Rabbit Run?", 1984, Aca-
`demic Press Inc. (London) limited. pp. 333—352.
`K. Y. Whang, A Malhotra, G. Sockut, L. Burns Sr. K. Choi,
`‘Two—Dimensional Specification of Universal Quantifica-
`tion in a Graphical Database Query Language” [BEE Trans
`actions on Software Engineering, vol. 18, No. 3, Mar. 1992
`pp. 216-224.
`L. M. Burns, A Malhotra. G. Soekut, K. Whang “Aerial Ad
`Hoc Entity—Relationship Investigation and Learning", [El-3E
`Transactions on Soflware Eng. (USA vol. 18, No. 3 Mar.
`1992 pp. 115141159.
`
`Primary Examiner—Thomas G. Black
`Assistant Exmniner—Peter Y. Wang
`Attorney, Agent, or Finm—louis J. PerceDo
`
`[57]
`
`ABSTRACT
`
`Information is retrieved from a database using natural lan-
`guage (NL) queries and graphical interfaces and displays. A
`query is separated into tokens which are parsed into ele-
`ments. The parsed elements are matched to a list of database
`names. If all the parsed elements can be uniquely matched
`to database names, a database query is constructed and used
`to query the database to retrieve information and to present
`to a user. However. when an ambiguous query is encoun—
`tered, i.e. all of whose elements cannot be uniquely matched
`with database names, the understood components of the
`ambiguous query, i.e., those elements matching database
`names, are presented the user along with relationships of the
`elements to other names in the database so that the user can
`use an interface to explore the database by accessing and
`displaying this database information and these relationships.
`
`
`
`.—_.. 501E
`Ell REWIZED
`
`
`
`
`RECOGNIZED
`
`(‘NEI 229
`PICK BEST
`
`
`PARSE
`
`
`(2'3
` No)
`
`mast: /EI5
`SUEZHUL'I [F'LE
`
`HATCHED
`
`
`DISPLRY
`
`01 SP1. A? PMTIRL.
`
`DPT IONS
`'U'NDERS'IIIUD‘
`
`
`SCIIEM
`216
`
`CREME
`MTABASE
`
`QUERY RESUBHI
`GLERY
`
`
`DEFINE
`
`
`
`
`
`2&0
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 1
`Petitioner Microsoft Corporation - EX. 1004, p. 1
`
`

`

`5,454,106
`Page 2
`
`The interface can take the form of views on a graphical
`interface. Using the displayed information,
`the user can
`create associations between database names and compo-
`nents not understood in the query. In other words, database
`names can be associated with the natural language words or
`phrases. These associations are added to the system knowl-
`edge and used to respond to future queries. In this way. the
`
`i.e., by using the added associations the
`system learns.
`system is able to respond to queries that it was unable to
`respond in a satisfactory manner before the association was
`added.
`
`14 Claims, 7 Drawing Sheets
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 2
`Petitioner Microsoft Corporation - EX. 1004, p. 2
`
`

`

`US. Patent
`
`Sep. 26, 1995
`
`Sheet 1 of 7
`
`5,454,106
`
`{05
`
`100
`
`PROCESSOR
`
`DICTIONARY
`L35
`
`
`
`
`I _
`
`PARSER
`{20
`
`SCANNERITOKENIZER
`I40
`
`DATABASE
`130
`
`STORAGE
`125
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 3
`Petitioner Microsoft Corporation - EX. 1004, p. 3
`
`

`

`US. Patent
`
`Sep.26,1995
`
`Sheet 2 of 7
`
`5,454,106
`
`
`
`._.DZPDQZH
`
`nunkmmwnza
`
`mmm
`
`hmwmMOE
`
`wmma‘m
`
`
`
`mmmmmmnumwmaiuzu
`
`mzawpmm
`
`mam
`
`
`>¢4mmHmw0m_vfim
`
`_D._.Dz
`
`xummmzuP¢z
`
`Iuh¢z
`
`
`
`mm2¢zmw¢m¢H¢Q
`
`E.mhzmzwufi
`
`omm
`
`3m
`
`
`
`E..mezzuu
`
`
`
`>mu30um¢m¢h¢m
`
`amazon—DUN...
`
`
`44¢~4¢Hhm¢m>¢4mmHn
`
`Dom
`
`0mm
`
`«FinU>m=mhum
`
`
`
`k¢4ammm.w
`
`ohm
`
`owmmeuzau
`
`mZHuwm
`
`numuunm>mm3a«zwxum
`
`
`
`um¢m¢h¢nmanoaxm
`
`mw¢mmu
`
`352085
`uzam.r||_fim
`
`w4¢¢432\mznw
`
`mum—IDES;
`
`
`
`QMIUhczwh¢umu
`
`HQ4wm2¢z
`
`.QDDHWEMQZD.
`
`dzwIUm
`
`pfizmnmmm
`
`>muna
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 4
`Petitioner Microsoft Corporation - EX. 1004, p. 4
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Sep. 26, 1995
`
`Sheet 3 of 7
`
`5,454,106
`
`320
`
`535
`
`328 DEPARTMENT-ff 322
`
`52?
`
`355“
`WORK LOCATION
`
`
`BO
`
`D EPA RTMEMT DEPARTMENT
`
`NO
`
`NAME
`
`
`
`3
`
`10\
`
`3T4
`
`3%?
`EMPLOYEE
`
`
`
`
`
`EMPLOYEE DEPARTMENT
`
`330
`
`
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 5
`Petitioner Microsoft Corporation - EX. 1004, p. 5
`
`

`

`US. Patent
`
`Sep. 26, 1995
`
`Sheet 4 of 7
`
`5,454,106
`
`mzczHES111%
`
`¢.o_n_
`
`m:
`
`Dav
`
`
`
`mmvflfffff
`
`”:225mg;Emmv
`mz¢zBu:HEv
`ma:3%Emmv
`
`E328%Enew
`39:ENEE36
`
`mm;
`
`mm;/mznfiazuuI261
`
`omv
`
`m3.
`
`3%
`
`vow
`
`wt»
`
`QN¢Nthum»...mz¢z
`
`Illulllill!
`
` 33>.mzu.pxmIIIII
`“:2Faunaim
`L.«Mia—H3
`nI“I“IIII
`mozEwEw
`
`OS»
`
`31»
`
`m:
`
`omv
`
`mmw
`
`mmv
`
`3%
`
`omv
`
`mhv
`
`NNV
`
`
`
`Sawhauzuzuuqmw
`
`MZDHHEZDU2:2;
`
`wzngumfiuma
`
`wnv
`
`ht“
`
`mnw
`
`9:.
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 6
`Petitioner Microsoft Corporation - EX. 1004, p. 6
`
`
`
`

`

`US. Patent
`
`Sep. 26, 1995
`
`Sheet 5 of 7
`
`5,454,106
`
`Dom
`
`omm
`
`ovm
`
`0mm
`
`com
`
`\\2m
`mmzqzM753,9?.
`
`IUPQZbulk
`
`mmzqzMJmE.mad Iota:
` mlfiqh.and 10.2251.;wmz¢z1.5.3MZHD...JDLDZHZa‘m—z
`
`
`
`0mm.1.53mmzqzHjmqfiand.
`
`
`
`uhqzfizjm mmIUPd‘zHQIHmzqz
`«zwluw4.ZSMDZEDDMU\WQHIMZD:¢JMM
`1:335>a.ZHmmzqz
`mu£¢2mimic...mtqumfufim
`
`
`>mQMFUMZZDQ
`DZQPM:BZHHJDmmm\CQIEME
`
`
`3m?).q
`
`
`.EIHmmzqz94H:
`
`mmm
`
`mvm
`
`mmloqu
`
`wm2¢z
`
`mom
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 7
`Petitioner Microsoft Corporation - EX. 1004, p. 7
`
`
`

`

`US. Patent
`
`Sep. 26, 1995
`
`Sheet 6 of 7
`
`5,454,106
`
`v
`H
`0
`
`v
`0.!
`
`O
`
`TABLE4
`
`600
`J(//
`
`FIG.6
`
`6H
`
`TABLE3_
`
`
`625([’
`
`623
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 8
`Petitioner Microsoft Corporation - EX. 1004, p. 8
`
`

`

`US. Patent
`
`Sep.26,1995
`
`Sheet 7 of”?
`
`5,454,106
`
`FIG. 7
`
`700
`
`/
`
`IDENTIFY NAMES UF TABLES
`
`780
`
`CUMPRISING THE VIEW
`
`IDENTIFY MEANINCFUL JUINS BETWEEN SLECTED
`
`730
`
`TABLES DR SPECIFY NEW JDINS BETWEEN TABLES
`
`DEFINE SELECTIDN CRITERIA DN
`FIELD VALUES EIF SELECTED TABLES
`
`7
`
`4‘3
`
`SELECT FIELDS DF SELECTED TABLES
`TU BE INCLUDED IN THE WIEW
`
`750
`
`SELECT DR INPUT NL WEIRD DR PHRASE
`TD ASSDCIATE WITH THE CREATED VIEW
`
`7 0
`6
`
`STDRE THE ASSDCIATIDN [IF THE VIEW &
`THE NL CDNCERT ABDVE IN MEMDRY
`
`770
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 9
`Petitioner Microsoft Corporation - EX. 1004, p. 9
`
`

`

`5,454,106
`
`1
`DATABASE RETRIEVAL SYSTEM USING
`NATURAL LANGUAGE FOR PRESENTING
`UNDERSTOOD COWONENTS OF AN
`AMLBIGUOUS QUERY ON A USER
`INTERFACE
`
`FIELD OF THE INVENTION
`
`This invention relates to the field of gaining access to
`computer database information by using a novel combina-
`tion of natural language and graphical queries and displays.
`
`BACKGROUND OF THE INVENTION
`
`The prior art includes many computer systems that allow
`the user to obtain information from databases by entering a
`nattu‘al
`language query or command. Examples include
`Intellectfl], Natural Language from NLlfz], and Language
`Aceess from IBM[3]. (language Access“ is a trademark of
`the IBM Corporation.) These prior art systems generally
`follow the same method. First a natural language query is
`posed to the system using some sort of interface like a
`computer keyboard and screen. The system runs the input
`through a scanner or tokenizer that breaks the natural
`language (NL) query into individual words or tokens and
`looks up each worda'token in a system dictionary. The system
`then uses a NL parser that parses the query into its elements.
`The output of the parser is organized as a parse tree that
`shows the relationship between the elements. The parser
`may also provide additional information about each parse
`tree element, called element attributes, that might include:
`the parse tree element part of speech. its tense, under any
`parse one element synonyms, hypouyms, and hypernyms. A
`matching step is then performed where one or more parse
`tree elements anchor attributes are matched to names in the
`database. For a relational database. the names would include
`table and table field names. If the NL query can be com—
`pletely and unambiguously parsed and if the relevant ele-
`ments can be matched to the database names and, further, if
`the NL query can be transformed into a complete and correct
`database query then the desired information is retrieved
`from the database and displayed in some format on the user
`interface (e.g., computer screen). However, if the query
`cannot be unambiguously parsed or if there is a partial or
`multiple match between the parse tree elementis] and the
`database names, or a correct database query cannot be
`constructed, then the system is unable to ‘hnderstand" the
`user request. Incorrect database information or no informa»
`tion at all will be retrieved in these cases.
`
`There are many ways in which the system can fail to
`“understand" the user request. First, the scannen'tokeniaer
`may not recognize one or more wordsltokens of the NL
`query if, for example, one or more of the wordsftokens (or
`their synonyms) making up the NL query do not match the
`entries in the system dictionary. Second, the parser may fail
`to correctly parse the natural language input. This can occur
`if the natural language input has a structure which the parser
`does not recognize. Alternatively, the parser can fail by
`yielding multiple parses. This can occur even for relatively
`simple NL queries.
`The prior art tries to resolve these problems in a number
`of ways. Often the prior art asks for clarification. Clarifica»
`tion is helpful if the natural language query can be resolved
`by using a different word or by defining the misunderstood
`word. If the system does not understand the syntactic
`structure of the query, the system may ask the user to clarify
`
`10
`
`15
`
`20
`
`35
`
`40
`
`45
`
`5t}
`
`55
`
`60
`
`65
`
`2
`the query by rephrasing the NL query in an understandable
`form. However, in case of multiple parses, the system must
`then decide to which of the possible parses it should
`respond. Several heuristics are used to determine this. For
`example, the parse that best matches the names in the
`database may be selected. However,
`these heuristics are
`often no better than guesswork.
`Failures can occur in the prior an even after a single
`correct parse. In these cases, some or all of the elements in
`the parse u-ee cannot be matched to the names in the
`database. In these instances. the process fails and a database
`query cannot be developed to retrieve the desired informa-
`tion from the database. The system can still ask for clarifi—
`cation or rephrasing but it is very ditiicult in this case to tell
`the user how to change the query. Repeated, non-specific
`requests to rephrase the query can quickly discourage the
`user and cause the system to be rejected. To avoid this, some
`prior art guesses at the meaning of the natural language
`query. Guessing sometime permits information retrieval
`from the database, but the user has no way of telling if the
`information retrieved and presented is the correct system
`response or not. Guessing and presenting the wrong infor-
`mation can rapidly cause the user to lose faith in the system
`and stop using it.
`
`STATEMENT OF PROBLEMS WITH THE
`PRIOR ART
`
`Understanding nanu‘al language using computer recogni-
`tion systems is very difiicult and is still the subject of a great
`deal of ongoing research. The prior art fails to understand a
`largefraction of natural language queries and fails to accu—
`rately translate them into database queries. As discussed
`above,
`this can occur due to a variety of causes: not
`recognizing individual wordsltokens or phrases, not recog—
`nizing the syntactic form employed, and not developing a
`proper parse. Further, at times the prior art fails to query the
`database information even though the input query can be
`parsed unambiguously because it cannot match all
`the
`elements in the parse tree unambiguously to database names
`or is unable to develop a correct database query from the
`matched elements. In all of these situations, no information
`is retrieved that responds to the input query or, worse yet, the
`retrieved infomation is erroneous.
`
`The prior art does not improve with usage. There is no
`capability to “teach" the prior an new concepts. The prior art
`systems do not improve with use over time because they
`cannot “learn“ to retrieve database information using new
`and different input words and NL phrases.
`Finally, and most importantly, the prior art cannot explain
`itself to the user. In case of partial understanding, it is unable
`tell the user what it has understood. Neither can it explain
`what concepts are included in the database that can be
`queried. Therefore. prior art systems lack the ability to teach
`the user to use system more effectively.
`
`OBJECTS OF THE INVENTION
`
`An object of this invention is an improved, user-friendly
`apparatus and method that enables a user to respond to
`ambiguous queries.
`Another object of this invention is an improved database
`query system using a combination of natural language and
`graphical displays and operations.
`Another object of this invention is an improved combined
`natural language and graphical data retrieval system and
`method that resolves ambiguous input queries by displaying
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 10
`Petitioner Microsoft Corporation - EX. 1004, p. 10
`
`

`

`5,454,106
`
`3
`a subset of the database organization according to what is
`understood of the query and by allowing the user to further
`explore the database using the graphical capabilities.
`A further object of this invention is an improved data
`retrieval system and method that learns in the process of
`resolving ambiguous NI. queries so that those queries can
`subsequently be understood.
`
`SUMMARY OF THE INVENTION
`
`The present invention is a novel method and apparatus for
`retrieving information from a database using natural lan—
`guage (NL) queries and graphical interfaces and displays.
`The system is able to resolve ambiguities that result from the
`N1 query by presenting the user with information in the
`database that matches wordsr'rokens or parsed elements of
`the query. Specifically. the system presents the user what it
`“understands" of the query in the preferred form of a
`graphical display. Using the graphical display and what the
`system understands as a starting point1 the user explores the
`database by accessing and displaying database information
`and information relationships. The user is also able to
`specify queries directly by using graphical operations on the
`information presented. Finally, the user can specify a data-
`base “view” by using graphical operations on the display
`and establishing an association between the view and 21 NL
`word or phrase. This association is stored in the system and
`used to match elements in the parse tree for future queries.
`In this way. the system learns and improves with use and
`over time. As associations are added to it. the system can
`respond to an ever increasing set of NL queries.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block diagram showing the hardware compo-
`nents of a preferred natural language data retrieval system.
`FIG. 2 is a flow chart showing the steps of the process of
`resolving ambiguous queries and retrieving matched infor-
`mation from the natural language data retrieval system.
`FIG. 3 shows one preferred database structure used to
`store information that is to be retrieved using the present
`invention.
`
`FIG. 4 shows a data structure of the present invention
`used to store the database names.
`
`FIG. 5 is a flow chart showing how database names are
`matched to input query wordsr'tokcns or parser result ele-
`ments to resolve ambiguous input queries.
`FIG. 6 is an example of a preferred schema display
`showing the user only what the system understands of an
`ambiguous query.
`FIG. 7 is a flow chart showing how the present invention
`learns to subsequently respond to an ambiguous NL query
`by associating a NL word or phrase with a selected set of
`information within the database.
`
`DETAILED DESCRlPTION OF THE
`INVENTION
`
`FIG. 1 shows a preferred computer system apparatus 100
`of the present invention comprising a user interface 105. a
`processor 110, and a storage memory 125. The user interface
`105 can he a standard keyboardICRT display. a speech
`recognition interface. andlor an interface to another system,
`such as another computer, that requests information from the
`database 130 resident on the storage memory 125. A dictio-
`nary 135 is also resident on the storage memory 125 or
`alternatively in main memory 115. User interfaces 105 like
`
`10
`
`15
`
`20
`
`25
`
`3t]
`
`35
`
`45
`
`SD
`
`55
`
`60
`
`65
`
`4
`interfaces, are well
`these. along with other equivalent
`known. The processor 110 can be a standard personal
`computer, mainframe computer or other computer system
`that is also well know. The processor 110 has some main
`memory 115, a parser program 120, and a seannen‘tolrenizer
`program 140. The storage memory 125 can be a disc drive
`or other equivalent bulk storage apparatus. Alternatively, the
`database 130 andlor dictionary 135 can be stored in the main
`memory 115 and the storage memory 125 can be omiued.
`One preferred system for embodying this invention is an
`IBM P512 Model 95 computer equipped a microchannel
`speech accelerator card #6306. a graphical CRT display. 16
`megabytes of random access memory. and 300 megabytes of
`storage memory. One preferred parser 120 is Lbeparserin the
`IBM Language Access product [3]. and one preferred scan—
`nerltokenizer 140 is the scannerftokenizer in [3]. Equivalent
`parsers 120 and seannerltokenizers 140 may also be used.
`FIG. 2 shows a. flow chart of the method 200 used by the
`present apparatus shown in FIG. 1.
`A user 202 (a human or an apparatus capable of querying
`a system database) interacts with the system interface 205.
`The interface 205. embodied in FIG. 1 as component 105,
`receives an input query. preferably a natural language (NL)
`query. from the user. The NL query can be in the form of text
`input to a keyboard interface 205 or auditory speech to a
`speech recognition interface 205. The user interface 205
`ultimately converts the user input query into a system
`recognizable format so that it can be scanned by the scanner!
`tokeru'zer 210.
`Block 210 shows the scannen'tokenizer. A seannen’token-
`izer takes the input (NL) query and breaks it up into
`identifiable word or token components. [A token is a sym-
`bol. like a NL word, that can be recognized by matching it
`to entries in a system dictionary 135.) For typed input. the
`wordltokens are identified by being delimited by blanks. For
`speech input. the wordftoken may be delinrited by pauses.
`Other ways of identifying speech tokens e.g., matching of
`words or partial words. are well known in the art.
`In one embodiment, if one or more words/tokens of the
`NL query are scanned and not understood (e.g.. the words!
`tokens fail
`to match data in the scannen’tokenizer 210
`dictionary), feedback is given to the user 202 through the
`user interface 205 by block 208. Block 208 typically would
`indicate that the NL query is not understood by the system
`100 and would request an alternate NL query. The user 202
`would continue to input alternative N1. queries until all the
`wordsftokens are understood by the scannen'tokcnizer 210.
`At this point, the scannen’tokenizer 210 would convert the
`wordsr’toltens of the NL query into a form recognized by the
`parser 212.
`The case where some of the wordsr‘tokens of the NLqucry
`are never recognized by the scannen'tokenizer 210. for our
`purposes. is called a scanner or tokenizer ambiguity 211.
`Resolution of the tokcnizer ambiguity 211 is described
`below.
`
`The parser 212 can be any known parser that is capable of
`parsing the user input into elements. As stated above, the
`parsed query elements can have information about the
`relationships of the components with in the NL query and
`possibly additional attribute information about the NLquety
`components.
`The parser 212 results can take three forms: many parses
`of the NL query. one parse of the NL query. and no parses
`of the NLqucry. For example, many parses can occur where
`one or more wordltoken in the N1. query has more than one
`meaning or can be used as more than one part of speech in
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 11
`Petitioner Microsoft Corporation - EX. 1004, p. 11
`
`

`

`5,454,106
`
`5
`
`the NI. query. Other examples are also known. The one parse
`case occurs when there is only one meaning and one
`organization of the NLquery components that the parser 212
`is able to provide. The no parse case 213 might occur if the
`query contains a NL construction that the parser does not
`recognize or if there is a typographical error in NL query
`which yields recognizable wordsltokens that do not make
`sense in a sentence. For example, misspelling “show" as
`“how".
`
`Ifthere are many parse results, an algorithm is used in box
`225 to select the best or most likely parse of the NL query.
`This can be done in two ways (22? and 229). First 229, the
`system can display the misunderstood candidate parse
`results (box 230) to the user 202 through the user interface
`205 with a request to the user 202 to select the correct parse.
`Alternatively, the system can select one of candidate parse
`results using a heuristic 227. There are many known heu-
`ristics 227 for doing this (for example using statistical
`analysis to select the most probable parse), but often the
`heuristics boil down to the system guessing which candidate
`parse result represents the user’s intention. The elements of
`a single successfill parse or a selected candidate parse (if
`there is more than one parser result) are matched to names
`in the database in box 214.
`
`The last parser result form, the no parse result. is called a
`parser failure 213 for our purposes. The case of the parser
`failure 213 will be discussed below.
`
`The database 130 of the system 100 is stored in the system
`memory (typically in bulk storage 125) and can have the
`structure of databases generally known in the art. See FIG.
`3. The preferred database comprises records having fields
`(or attributes) that contain values. Collections or records are
`called files or tables. The “Harries" in the database are the
`names of the tables (these names are the same as the names
`of the records in the table) and the names of the fields of the
`table. They also include meaningful join names and database
`view names. These are explained below.
`For example, data about employees in a company is
`organized in a table 310. The table name 312 is “Employee".
`(Likewise, each given record 318 of the table 310 named
`“Employee" is canted “Employee” or “Employee Record".)
`Each given record 318 of the table named "Employee" has
`one or more fields. Each of the plurality of fields also has a
`name. In the example, three field names (also called column
`names) are shown for the “Employee" table 310: Employee
`number 314, Employee department 315, Salary 316, etc.
`Tables in a relational database may be related to one
`another. In the relational database case, there are two or
`more tables organized like the employee table 310. That is,
`each of the tables in the relational database has table (record)
`names and field (column) names for each field in a given
`record. For example, there can be an additional table 320 in
`the database with the name “Depanment” 322. The “Depart-
`ment" table also has records with one or more fields. Each
`of the fields also have names. In this example, the fields
`shown are: “Department number 325, “Department name"
`326, “Manager" 327, etc. When a given field of two or more
`tables in a relational database represents the same type of
`information, there is a possible join between the two tables.
`In the case 350, this means that infonnation can be accessed
`from the “Department" table through the "Employee" table
`by using the common field (here “Employee Department"
`315) as a key. A subset of the possible joins are designated
`meaningful joins, such as 350, because they represent a
`useful relationship between the tables. Using meaningful
`joins to access information in relational database tables is
`
`it)
`
`15
`
`20
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`well known. In the preferred embodiment, the meaningful
`joins 350 of the relational database 130 are also given names
`355.
`
`Using relational database techniques, other tables, called
`extended tables or views 330, can be created. The views are
`created using information from other related tables (310,
`320) in the database 130. The newly created tables (views)
`also have a table (view) name 332 (here “Extended
`Employee") and fields with names (here “Employee Num—
`ber” 334, “Department Name" 335, “Manager" 337. and
`“Salary” 336). Views (extended tables) can be created by the
`system designer or alternatively by the user (see box 280 of
`FIG. 2). The steps in defining a view are exactly the same as
`the steps in defining a query except dint information is not
`used to access the database (as in query) but stored in the
`system for future use.
`FIG. 3 also shows examples of the values that reside in
`fields of the database tables. A given record, 318, 328, and
`338 in tables 312, 322, and 332, is shown to have the values:
`6000 for the field named “Employee Number", Engineering
`for the field named “Department Name", Smith for the field
`name “Manage ", and $l05,000 for the field named “Sal-
`ary“, respectively.
`Refer again to FIG. 2. in box 214, the elements of the
`results of the parse can be matched to database names in a
`number of ways that are known in the art. For example. one
`or more selected elements of the parser output or their
`synonyms can be matched against a list of names in the
`database i.e., for a relational database,
`the names of the
`tables, the names of the fields or columns, the names of the
`meaningful joins and the names of the views [1].
`There are four possible results of the matching step in box
`214:
`
`1. None of the parser result elements match any database
`names.
`
`2. Each of the parser result elements matches a single
`database name.
`
`3. Some parser result elements (but not all) match data-
`base names.
`
`4. Some or all of the parser result elements match more
`than one database name.
`For the first result, the no match case 219, a message is
`given to the user 202 through the user interface 205 using a
`function like that in box 208. For the second set of parser
`results, the individually matched case, the matched parser
`result elements are convened into a database query in box
`216. Results 3 and 4 are called match ambiguities 215 and
`will be discussed below.
`When all of the parser result elements are successfully
`matched to names in the database, box 216 attempts to
`convert the parse tree and the matched elements into a
`database query. A database query is a formal specification
`used to retrieve data from a database. Thus, the output of box
`216 is a set of field and record names in the database along
`with join specifications and some logical selection of the
`values of the fields in those records. For example, a simple,
`single table database query might take the form of: Select the
`“Department" (field name 315) from “Employee" (table!
`record name 312) with “Salary" (field name 316) greater
`than “$100,000" (logical lower limit of the value in the
`“Salary" field.) This is well known prior art. See.
`for
`example, [3}.
`Sometimes the system fails to conven the successfully
`matched elements of the parser into a database query in box
`216. For our purposes, this is called a query conversion
`ambiguity 217.15. query conversion ambiguity 217 can occur
`
`Petitioner Microsoft Corporation - Ex. 1004, p. 12
`Petitioner Microsoft Corporation - EX. 1004, p. 12
`
`

`

`7
`
`8
`
`5,454,106
`
`in a number of ways. For example, if a single table and query
`name is matched an ambiguity occurs if there is no field in
`the selected table with a name that matches a field nante
`selected by the database query. The handling of query
`conversion ambiguities 217 is described below.
`In box 20, database queries, successfully converted from
`matched elements, are used to retrieve data from the data-
`base. The data is displayed (or provided} to the user 2202
`through the user interface 205. This is handled by a database
`management system using well known methods.
`In the above discussion. specific preferred examples of
`the user 202, user interface 205, searmerftokenizer 210,
`display feedback 208, parser 212, picking the “best parse”
`225, displaying misunderstood parsed options 230, match—
`ing elements 214, converting to database queries 216, and
`retrieving and displaying database data 220 where pre~
`sentcd. These aspects of the system have many alternative
`embodiments in the prior art that are envisioned within the
`scope of this invention. However, the prior art, as mentioned
`above, fails to resolve the many ambiguities (21.1, 213. 215,
`and 217} that may result in servicing a user input query,
`specifically the scanner/tokeniaer ambiguity 211, the parser
`ambiguity 213, the matching ambiguity 215, and the query
`conversion ambiguity 217 that are identified above.
`The present invention introduces a novel solution to
`resolving these ambiguities and furthermore is able to
`"learn" how to subsequently retrieve database information
`requested by input queries that initially produce ambiguities.
`These novel features are: selecting words/tokens or parser
`elements that the system 109 understands (create matched
`names list) 240, displaying the understood wordsltoltcns or
`parser result elements to the user 250, and enabling the user
`to define new concepts so that ambiguities associated with
`the NL query do not arise in the future 200, i.c., the system
`learns.
`Box 240 initially processes the scanning 211, parsing 223,
`match 215 and conversion 21';l ambiguities 211.
`In the event diat some wordsr'tokens of the input query are
`not recognized 211 [matched to entries in the dictionary} or
`the wordst‘tokens of the input query cannot be parsed 213.
`the input query wordsl‘tokens themselves are matched to the
`names in the database. In the case of a match ambiguity 215,
`the parser result elements that matched database names
`andror have multiple matches to database names in hex 214
`are passed to box 240. In the case of a conversion ambiguity
`217, all the parser result elements are passed to box 240.
`Box 240 uses matching techniques similar
`to those
`described in box 214. However, only those wordsltokens or
`parser elements that match database names (i.e., table, field,
`meaningful join or view names) are “understood” by the
`system 10!). Those matching database names pass as output
`from box 240. The other wordltokens or parser result
`elements, those not having a match to a database name, are
`ignorcdldiscardcd.'1‘hereforc, the output of box 240 is a list
`of database names that match the recognised components.
`(A component is a wordltoken or a parser result element.) In
`boxes 240 and 250, there may be one or more matches for
`each query component.
`Box 250 resolves the ambiguities created in the scanner!
`tokenizer 210 {ambiguity 211), parser 212 (ambiguity 213),
`the matching elements 214 (ambiguity 215), a

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket