United States Patent [19]
`Wical
`
`[54] CONCEPT KNOWLEDGE BASE SEARCH
`AND RETRIEVAL SYSTEM
`
`[75] Inventor: Kelly Wical, San Carlos, Calif.
`
`[73] Assignee: Oracle Corporation, Redwood Shores,
`Calif.
`
`[21] Appl. No.: 08/861,983
`
`[22] Filed: May 21, 1997
`
`[51] Int. Cl.⁷ ...................................................... G06F 17/30
`[52] U.S. Cl. ................................................... 707/5; 706/50
`[58] Field of Search ................................ 706/50, 61, 934;
`707/5
`
`[56] References Cited
`
`U.S. PATENT DOCUMENTS
`
`5,103,498   4/1992  Lanier et al. ............ 706/11
`5,159,667  10/1992  Borrey et al. ............ 707/500
`5,167,011  11/1992  Priest ................... 706/62
`5,226,111   7/1993  Black et al. ............. 706/50
`5,257,185  10/1993  Farley et al. ............ 707/100
`5,276,616   1/1994  Kuga et al. .............. 704/10
`5,325,298   6/1994  Gallant .................. 707/5
`5,369,763  11/1994  Biles .................... 707/3
`5,442,780   8/1995  Takanashi et al. ......... 707/1
`5,555,408   9/1996  Fujisawa et al. .......... 707/5
`5,598,557   1/1997  Doner et al. ............. 707/5
`5,615,112   3/1997  Liusheng et al. .......... 706/50
`5,625,767   4/1997  Bartell et al. ........... 345/440
`5,630,117   5/1997  Oren et al. .............. 707/100
`5,630,125   5/1997  Zellweger ................ 707/103
`5,634,051   5/1997  Thomson .................. 707/5
`5,659,724   8/1997  Borgida et al. ........... 706/50
`5,720,008   2/1998  McGuinness et al. ........ 706/50
`
`OTHER PUBLICATIONS
`
`Cox, John, "'Text-Analysis' Server to Simplify Queries",
`Communications Week, Apr. 19, 1993.
`
`"Verity finds the Topic," The Seybold Report on Publishing
`Systems, vol. 19(4), Oct. 1989.
`
`US006038560A
`[11] Patent Number: 6,038,560
`[45] Date of Patent: Mar. 14, 2000
`
`D.R. Cutting, et al., "Constant interaction-time scatter/
`gather browsing of very large document collections," Proc.
`Sixteenth annual international ACM SIGIR Conf. on
`Research and Development in Information Retrieval, pp.
`126-134, Dec. 1993.
`R.B. Allen, "An Interface for Navigating Clustered Document
`Sets Returned by Queries," Proc. of the Conf. on
`Organizational Computing Systems, pp. 166-171, Dec.
`1993.
`E.D. Liddy, et al., "Text Categorization for Multiple Users
`Based on Semantic Features from a Machine-Readable
`Dictionary," ACM Transactions on Information Systems, vol.
`12(3), pp. 278-295, Jul. 1994.
`R.B. Allen, "Two Digital Library Interfaces that Exploit
`Hierarchical Structure," DAGS95: Electronic Publishing
`and the Information Superhighway, (10 pages), May 1995.
`A. Celentano, et al., "Knowledge-based Document
`Retrieval in Office Environments: the Kabiria System,"
`ACM Trans. on Information Systems, vol. 13(3), pp.
`237-268, Jul. 1995.
`
`(List continued on next page.)
`
`Primary Examiner-Robert W. Downs
`Attorney, Agent, or Firm-Fliesler, Dubb, Meyer & Lovejoy
`LLP
`
`[57] ABSTRACT
`
`A knowledge base search and retrieval system, which
`includes factual knowledge base queries and concept
`knowledge base queries, is disclosed. A knowledge base stores
`associations among terminology/categories that have a
`lexical, semantical or usage association. Document theme
`vectors identify the content of documents through themes as
`well as through classification of the documents in categories
`that reflects what the documents are primarily about. The
`factual knowledge base queries identify, in response to an
`input query, documents relevant to the input query through
`expansion of the query terms as well as through expansion
`of themes. The concept knowledge base query does not
`identify specific documents in response to a query, but
`specifies terminology that identifies the potential existence
`of documents in a particular area.
`
`29 Claims, 21 Drawing Sheets
`
`IBM-1006
`Page 1 of 41
`
`IPR2016-00020
`Petitioners Old Republic General Ins. Group, Inc., et al. Ex. 1034, p. 1
`
`

`
`
`OTHER PUBLICATIONS
`
`M. Iwayama and T. Tokunaga, "Cluster-based Text
`Categorization: a Comparison of Category Search Strategies," Proc.
`18th Annual Int'l. ACM SIGIR Conf. on Research and
`Development in Information Retrieval, pp. 273-280, Dec.
`1995.
`
`G. Salton, et al., "Automatic Text Decomposition Using Text
`Segments and Text Themes," Proc. Seventh ACM Conf. on
`Hypertext '96, pp. 53-65, Dec. 1996.
`P. Pirolli, et al., "Scatter/Gather Browsing Communicates
`the Topic Structure of a Very Large Text Collection," Conf.
`Proc. on Human Factors in Computing Systems, pp.
`213-220, Dec. 1996.
`
`
`

`
`FIG. 1
`
`[Block diagram of search and retrieval system 100. Labeled blocks: Document 130; Content Processing System 110; Knowledge Scoring 140; Inference Processing 145; Learning Processing 165; Document(s) Theme Vector 160; Knowledge Catalog 150; Knowledge Base 155; Query Processing 175; User Query; To Screen Module.]
`
`

`
`U.S. Patent
`
`Mar.14,2000
`
`Sheet 2 of 21
`
`6,038,560
`
`FIG. 2
`
`[Block diagram of Query Processing 175. Labeled elements: Mode; User Query; Query Term Processing 205; Concept Query Processing 200; Factual Query Processing 210; Information Retrieval; To Screen Module 230; Document Signatures 160; Knowledge Base 155.]
`
`
`FIG. 3
`
`Query: Legal, Betting, China (610)
`
`Government, Casino, Asia (620)
`    Gaming Industry (2) (625)
`
`Patents, Slot Machines, Japan (630)
`    Patent Law (4) (635)
`    Gaming Industry (2) (640)
`
`Crime, Wagering, China (645)
`    Insects (1) (650)
`    Conservation - Ecology (2) (655)
`
`
`

`
`FIG. 4
`
`Geography
`
`Leisure and Recreation
`
`Political
`Geography
`
`(Marker)
`
`Europe
`
`8
`
`Western
`Europe
`
`Arts & Entertainment
`
`Tourism
`
`Visual
`Arts
`
`10
`
`Eiffel Tower
`
`
`FIG. 5
`
`Generate senses and distinct parts from query (400)
`Generate query term strengths (402)
`Expand query terms using knowledge base (405)
`Select categories in knowledge base identified by expanded query terms (410)
`Select documents classified for those categories (420)
`Select themes from documents (430)
`Sort and compile information by theme (440)
`List themes in order of strongest themes (450)
`Select top themes from additional documents based on predetermined criteria (460)
`Organize themes in groups (465)
`Order theme groups (470)
`Order documents within groups (475)
`Display groups and associated document names (480)
`Display categories classified for documents (485)
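The flow of FIG. 5 may be easier to follow as a small sketch. The Python below is only an illustration under stated assumptions: the tables `EXPANSIONS`, `CATEGORY_DOCS` and `DOC_THEMES`, the document ids, the theme strengths, and the rule of ordering theme groups by summed strength are all hypothetical and are not taken from the patent. Sense generation, query term strengths, the additional-document step, and display (blocks 400, 402, 460, 480, 485) are omitted.

```python
from collections import defaultdict

# All data below is invented for illustration.
EXPANSIONS = {"stocks": ["stocks", "portfolios"]}        # block 405
CATEGORY_DOCS = {                                        # blocks 410-420
    "stocks": ["doc1", "doc2"],
    "portfolios": ["doc2", "doc3"],
}
DOC_THEMES = {                                           # block 430: (theme, strength)
    "doc1": [("Investors", 40)],
    "doc2": [("Investors", 10), ("Interest Rates", 30)],
    "doc3": [("Interest Rates", 15)],
}

def factual_query(term):
    """Return [(theme, [documents])], strongest theme group first."""
    docs = []
    for expanded in EXPANSIONS.get(term, [term]):
        for doc in CATEGORY_DOCS.get(expanded, []):
            if doc not in docs:
                docs.append(doc)
    by_theme = defaultdict(list)                         # block 440
    for doc in docs:
        for theme, strength in DOC_THEMES.get(doc, []):
            by_theme[theme].append((strength, doc))
    # Blocks 450-475 (assumed rule): order groups by summed theme strength,
    # then order documents inside each group by their own strength.
    groups = sorted(by_theme.items(),
                    key=lambda kv: -sum(s for s, _ in kv[1]))
    return [(theme, [d for _, d in sorted(members, reverse=True)])
            for theme, members in groups]

result = factual_query("stocks")
print(result)
```

The point of the sketch is the shape of the pipeline: documents are reached through categories rather than through literal word matches, and the response is organized by theme.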
`
`
`FIG. 6
`
`[Directed graph of knowledge base nodes; the node labels are not legible in this copy.]
`
`FIG. 7
`
`Generate applicable senses and forms for distinctive query terms (500)
`Generate strengths for query terms (510)
`Map query terms to knowledge base (520)
`Expand query terms through knowledge base (530)
`Select theme set for expanded query terms (540)
`Expand theme set through knowledge base (550)
`Select common denominators of expanded themes among expanded query terms to satisfy input query (560)
`Relevance rank query terms, expanded query terms, and themes (570)
`Display query response (580)
`
`
`FIG. 8A
`
`Social Sciences
`
`History
`
`Ancient History
`
`Ancient Rome
`
`Anthropology
`
`Customs and Practices
`
`Kinship and Marriage
`
`Peoples
`
`Races of Peoples
`
`Linguistics
`
`Languages
`
`FIG. 8B
`
`Food and Agriculture
`
`Cereal and Grains
`
`Condiments
`
`Dairy Products
`
`Drinking and Dining
`
`Beers
`
`Liquors
`
`Liqueurs
`
`Wines
`
`Meats
`
`Beef
`
`Lamb
`
`Pate and Sausages
`
`Seafood
`
`Pastas
`
`Prepared Foods
`
`Desserts
`
`Cakes
`
`Cookies
`
`Pastries
`
`Sauces
`
`Soups and Stews
`
`
`FIG. 8C
`
`Geography
`
`Political Geography
`
`Europe
`
`Western Europe
`
`Austria
`
`Germany
`
`France
`
`Iberia
`
`Spain
`
`Ireland
`
`Italy
`
`Sweden
`
`Netherlands
`
`United Kingdom
`
`England
`
`Eastern Europe
`
`Greece
`
`
`

`
`FIG. 9A
`
`Leisure and Recreation
`
`Social Sciences
`
`Arts and Entertainment
`
`Performing Arts
`
`7
`
`Dance
`
`Ballet
`
`Folk Dance
`
`Marker
`
`8
`
`8
`
`9
`
`Anthropology
`
`Customs and Practices
`
`History
`
`Ancient
`History
`
`Festivals
`
`Ancient
`Rome
`
`I
`
`National
`
`Religious
`Festivals
`
`
`

`
`FIG. 9B
`
`Food and Agriculture
`
`Drinking and Dining
`
`Occupations
`
`Chefs
`
`French Chefs
`
`French Chefs
`
`Crepes
`Tripe Sausages
`Chicken Cordon Bleu
`
`8
`
`French Cheeses
`
`(Theme Strength = 5)
`
`Dairy Products
`
`Cheeses
`
`Brie
`
`(Theme Strength = 50)
`
`
`FIG. 9C
`
`I) Festivals, Foods, Western Europe
`    A) Festivals, Drinking and Dining, Germany
`        1) Beer
`        2) Knockwurst
`        3) Oktoberfest
`        4) Stein
`        5) Sauerkraut
`    B) Festivals, Drinking and Dining, France
`        1) Mardi Gras
`        2) Crepes
`        3) Calembert
`        4) Croissant
`        5) Brie
`        6) Tripe Sausage
`        7) Onion Soup
`        8) Chicken Cordon Bleu
`
`II) Festivals, Food
`    A) Ancient Rome, Wines
`        1) Wine
`        2) Grapes
`        3) Fermentation
`        4) Barrels
`        5) Vineyards
`
`
`FIG. 10A
`
`Internet
`
`Virtual Clerk
`Concept Search
`Knowledge Search
`List Topics
`Help
`
`Found 15 Documents and 5 Categories
`**** Computer Networking (15)
`* Internet CreditBureau, Incorporated (0)
`* Internet Fax Server (0)
`* Internet Productions, Incorporated (0)
`* Internet Newbies (0)
`
`
`FIG. 10B
`
`Science and Technology (2380) | Communications (279) | Telecommunications Industry (90)
`
`Computer Networking (15)
`Electronic Mail (1)
`GE Networks (1)
`Internet Technology (2)
`Messaging (1)
`NBC Networks (3)
`Networks (1)
`
`Documents About Computer Networking and Also:
`Colorado
`7/01/88 Business Brief: Noted...
`8/19/88 The Americas: Mexico's...
`Mexican
`NBC Officials
`7/05/88 NBC Talks With European...
`State Agencies
`10/07/88 Three Companies Win $180 ...
`Television and Radio
`8/09/88 NBC-TV Trying to Beat...
`
`See Also:
`Computer Hardware Industry (56)
`Computer Industry (256)
`Computer Standards (1)
`Information Technology (9)
`Mathematics (4)
`
`
`FIG. 11A-1
`
`Internet
`
`Virtual Clerk
`Concept Search
`Knowledge Search
`List Topics
`Help
`
`Stocks
`
`Found 152 Documents and 64 Categories
`[Most per-category document counts are not legible in this copy; only counts printed on the same line as a category are kept.]
`*** Commerce and Trade
`** Companies
`** Financial Investments
`** Investors
`** Portfolios
`* Pharmaceutical Industry
`* Magazines
`* Automotive Industry
`* Mineralogy
`* Computer Software Industry
`* Stocks and Bonds
`* Food and Drink Industry
`* Petroleum Products Industry
`* Television and Radio
`* New York Life Insurance Company
`* McGraw-Hill, Incorporated
`* Banking Industry
`* Industrial Goods Manufacturing (2)
`* Texaco, Incorporated
`* Insurance Industry
`* Lawyers
`* CitiCorp
`* Preferred Stocks
`* Computer Hardware Industry
`* Walt Disney Company
`* Diversified Companies
`* Buys
`
`
`FIG. 11A-2
`
`[Most per-category document counts are not legible in this copy; only counts printed on the same line as a category are kept.]
`* Dun & Bradstreet Corporation
`* Health-care Companies
`* Brokers
`* Personal Finance
`* Lawsuits
`* Leveraged Buy-outs
`* Airlines
`* Cinema
`* Construction Industry
`* Automotive Service and Repair
`* Retail Trade Industry
`* Dow Chemical Company
`* Real Estate
`* Consumer Electronics
`* Chemical Industry
`* Convenience Products Businesses (1)
`* American Brands, Incorporated (1)
`* Motorola, Incorporated
`* Package Delivery Industry
`* Masco Corporation
`* Computer Industry
`* Aviation
`* Plastic and Rubber
`* Drugs
`* Clothing
`* Itel Corporation
`* Hard Sciences
`* Rail Transportation
`* Financial Lending
`* Chrysler Corporation
`* Gillette
`* Brush Wellman, Incorporated
`* Taxes and Tariffs
`* Manufacturing
`* Japanese Companies
`* Shares Outstanding
`
`
`FIG. 11B
`
`Business and Economics (5438) | Business and Industry (2889) | Corporate Practices (263)
`
`Portfolios (4)
`
`Documents About Portfolios and Also:
`Commerce and Trade
`11/16/88 Money Managers With ...
`Interest Rates
`8/24/88 Your Money Matters: Many...
`10/10/88 These Stocks Are a...
`Investors
`Securities
`7/14/88 Fannie Mae Net Rose 97%...
`
`
`FIG. 12
`
`Virtual Clerk
`Subject Location
`Knowledge Search
`List Topics
`Help
`
`President George Herbert Walker Bush
`Appears in 28 Docs/17 Categories:
`
`** President George Herbert Walker Bush (7)
`* Republican Party (6)
`* Capital Gains Taxes (1)
`* White House (1)
`* President Ronald Wilson Reagan (1)
`* Senate (1)
`* Democratic Party (1)
`* Iran Contra Affair (1)
`* Congress (1)
`* Job Actions (1)
`* Campaigns (1)
`* Meetings (1)
`* Tax Rates (1)
`* Presidential Candidates (1)
`* Senators (1)
`* Florida Governor (1)
`* AIDS - Acquired Immune Deficiency Syndrome (1)
`
`
`

`
`FIG. 13
`
`[Block diagram of a content processing system. Labeled blocks: Document 130; Linguistic Engine 700; Structured Output 710 (Contextual Tags 720, Thematic Tags 730, Stylistic Tags 735, Content carrying Words 737); Morphology Section 770; Knowledge Catalog 150; Lexicon 760; Knowledge Catalog Processor 740; Theme Vector Processor 750; Content Indexing 770; Document(s) Theme Vector 160; Knowledge Base 155.]
`
`
`

`
`FIG. 14
`
`[Block diagram of computer system 1000. Labeled blocks: Processor Unit 1005; Memory 1010; Mass Storage Device 1020; Peripheral Device(s) 1030; Portable Storage Medium Drive 1040; Graphics Subsystem 1050; Output Display 1060; Input Control Device(s) 1070. Reference numeral 1025 appears without a legible label.]
`
`CONCEPT KNOWLEDGE BASE SEARCH
`AND RETRIEVAL SYSTEM
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`The present invention is directed toward the field of
`search and retrieval systems, and more particularly to a
`knowledge base search and retrieval system.
`2. Art Background
`In general, search and retrieval systems permit a user to
`locate specific information from a repository of documents,
`such as articles, books, periodicals, etc. For example, a
`search and retrieval system may be utilized to locate specific
`medical journals from a large database that consists of a
`medical library. Typically, to locate the desired information,
`a user enters a "search string" or "search query." The search
`query consists of one or more words, or terms, composed by
`the user. In response to the query, some prior art search and
`retrieval systems match words of the search query to words
`in the repository of information to locate information.
`Additionally, boolean prior art search and retrieval systems
`permit a user to specify a logic function to connect the
`search terms, such as "stocks AND bonds", or "stocks OR
`bonds."
`In response to a query, a word match based search and
`retrieval system parses the repository of information to
`locate a match by comparing the words of the query to words
`of documents in the repository. If there is an exact word
`match between the query and words of one or more
`documents, then the search and retrieval system identifies
`those documents. These types of prior art search and
`retrieval systems are thus extremely sensitive to the words
`selected for the query.
`The terminology used in a query reflects each individual
`user's view of the topic for which information is sought.
`Thus, different users may select different query terms to
`search for the same information. For example, to locate
`information about financial securities, a first user may
`compose the query "stocks and bonds", and a second user may
`compose the query "equity and debt." For these two different
`queries, a word match based search and retrieval system
`would identify two different sets of documents (i.e., the first
`query would return all documents that have the words stocks
`and bonds and the second query would return all documents
`that contain the words equity and debt). Although both of
`these query terms seek to locate the same information, with
`a word search and retrieval system, different terms in the
`query generate different responses. Thus, the contents of the
`query, and subsequently the response from word based
`search and retrieval systems, is highly dependent upon how
`the user expresses the query term. Consequently, it is
`desirable to construct a search and retrieval system that is not
`highly dependent upon the exact words chosen for the query,
`but one that generates a similar response for different queries
`that have similar meanings.
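The sensitivity described above can be reproduced with a few lines of Python. This is a minimal sketch of word matching in general, not of any particular prior art system; the two document texts, the stop-word set, and the function name `word_match` are invented for illustration.

```python
# Toy repository; the document texts are invented for illustration.
DOCS = {
    "doc1": "stocks and bonds rallied as investors bought financial securities",
    "doc2": "equity and debt markets rallied as investors bought financial securities",
}

def word_match(query: str, docs: dict[str, str]) -> set[str]:
    """Return ids of documents that contain every non-connective query word."""
    connectives = {"and", "or"}
    terms = {w for w in query.lower().split() if w not in connectives}
    return {doc_id for doc_id, text in docs.items()
            if terms <= set(text.lower().split())}

first = word_match("stocks and bonds", DOCS)
second = word_match("equity and debt", DOCS)
print(first, second)
```

Both documents concern financial securities, yet each query retrieves only the document that happens to share its exact words, so the two result sets do not overlap.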
`Prior art search and retrieval systems do not draw
`inferences about the true content of documents available. If the
`search and retrieval system merely compares words in a
`document with words in a query, then the content of a
`document is not really being compared with the subject
`matter identified by the query term. For example, a
`restaurant review article may include words such as food quality,
`food presentation, service, etc., without expressly using the
`word restaurant because the topic, restaurant, may be
`inferred from the context of the article (e.g., the restaurant
`review article appeared in the dining section of a newspaper
`or travel magazine). For this example, a word comparison
`between a query term "restaurant" and the restaurant review
`article may not generate a match. Although the main topic of
`the restaurant review article is "restaurant", the article would
`not be identified. Accordingly, it is desirable to infer topics
`from documents in a search and retrieval system in order to
`truly compare the content of documents with a query term.
`Some words in the English language connote more than a
`single meaning. These words have different senses (i.e.,
`different senses of the word connote different meanings).
`Typically, prior art search and retrieval systems do not
`differentiate between the different senses. For example, the
`query "stock" may refer to a type of financial security or to
`cattle. In prior art search and retrieval systems, a response to
`the query "stock" may include displaying a list of
`documents, some about financial securities and others about
`cattle. Without any further mechanism, if the query term has
`more than one sense, a user is forced to review the
`documents to determine the context of the response to the query
`term. Therefore, it is desirable to construct a search and
`retrieval system that displays the context of the response to
`the query.
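One way to display the context of a response is to group results by the category under which each document was classified. The sketch below illustrates that idea only; the `CLASSIFIED` and `SENSES` tables, the document titles, and the function name are hypothetical and are not drawn from the patent.

```python
from collections import defaultdict

# Hypothetical index: each document has already been classified under a category.
CLASSIFIED = [
    ("Quarterly dividend report", "financial securities"),
    ("Feeding cattle in winter", "animals"),
    ("Bond and share prices", "financial securities"),
]
# Hypothetical sense table: categories associated with each query term.
SENSES = {"stock": {"financial securities", "animals"}}

def grouped_response(term):
    """Group matching documents under the category (sense) that explains them."""
    groups = defaultdict(list)
    for title, category in CLASSIFIED:
        if category in SENSES.get(term, set()):
            groups[category].append(title)
    return dict(groups)

response = grouped_response("stock")
print(response)
```

With the results grouped this way, a user who meant cattle can skip the "financial securities" group without opening any document.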
`Some prior art search and retrieval systems include a
`classification system to facilitate in the location of
`information. For these systems, information is classified into several
`pre-defined categories. For example, Yahoo!™, an Internet
`directory guide, includes a number of categories to help
`users locate information on the World Wide Web. To locate
`information in response to a search query, Yahoo!™
`compares the words of the search query to the word strings of the
`pre-defined category. If there is a match, the user is referred
`to web sites that have been classified for the matching
`category. However, similar to the word match search and
`retrieval systems, words of the search query must match
`words in the category names. Thus, it is desirable to
`construct a search and retrieval system that utilizes a
`classification system, but does not require matching words of the
`search query with words in the name strings of the
`categories.
`
`SUMMARY OF THE INVENTION
`
`Concept knowledge base query processing in a search and
`retrieval system identifies, in response to a query, the
`potential existence of documents by displaying terminology
`related to the query. The search and retrieval system includes
`a knowledge base that links terminology having a lexical,
`semantic or usage association. In response to an input query,
`the search and retrieval system selects and displays
`terminology relevant to one or more terms of the input query. The
`terminology guides the user in the overall search because the
`user may view the terminology to learn different contexts for
`the query.
`The knowledge base includes a plurality of categories and
`terminology, arranged hierarchically. To process a query, the
`search and retrieval system maps the terms of the query to
`categories/terminology in the knowledge base. In one
`embodiment, an expanded set of query terms are generated
`through use of the knowledge base, and the expanded set of
`query terms are used to identify relevant terminology.
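As a rough illustration of mapping a query term into a hierarchical knowledge base and expanding it, the sketch below walks from a term up through its parent categories. The category names follow the "geography" hierarchy of FIG. 8c, but the placement of France and Germany under Western Europe, the `PARENT` table, the `expand` function, and the two-level limit are assumptions made for this example; this excerpt does not specify the expansion rule.

```python
# Child -> parent links for a fragment of the "geography" hierarchy (assumed layout).
PARENT = {
    "France": "Western Europe",
    "Germany": "Western Europe",
    "Western Europe": "Europe",
    "Europe": "Political Geography",
    "Political Geography": "Geography",
}

def expand(term, levels=2):
    """Return the term plus up to `levels` ancestor categories."""
    expanded = [term]
    node = term
    for _ in range(levels):
        node = PARENT.get(node)
        if node is None:
            break
        expanded.append(node)
    return expanded

expanded_terms = expand("France")
print(expanded_terms)
```

Limiting the number of levels keeps the expanded set close to the original term; a very general ancestor such as "Geography" would match nearly everything.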
`The search and retrieval system further uses a plurality of
`themes that relate context information to one or more of the
`categories. In one embodiment, a content processing system
`processes a plurality of documents to identify themes for a
`document, and classifies the documents, including themes
`identified for the documents, in categories of the knowledge
`base. The themes are selected from the categories/
`terminology identified by the query terms for potential
`display as terminology for the query response.
`In one embodiment, concept knowledge base query
`processing further includes selecting additional themes, based
`on the original themes selected, by associating, through use
`of the knowledge base, the additional themes with the
`themes selected. To identify terminology for the query
`response, themes are matched with the query terms, or the
`expanded set of query terms, to select terminology common
`to both the themes and the expanded set of query terms that
`satisfies as many query terms as possible. Groupings of
`expanded query terms and themes, which satisfy more than
`one query term, are extracted for display with the query
`terms. Furthermore, the groupings and the themes are
`relevance ranked to display the most relevant groups and
`themes first.
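The matching and ranking described in this paragraph can be sketched as a set intersection followed by a sort. Everything in the example is hypothetical: the expanded term sets, the theme terminology, and the scoring rule (number of query terms satisfied, ties broken alphabetically) are assumptions chosen to make the idea concrete, not the patent's relevance ranking.

```python
# Hypothetical expanded query: each original term maps to its expanded set.
EXPANDED = {
    "festivals": {"festivals", "Oktoberfest", "Mardi Gras"},
    "food": {"food", "beer", "crepes", "wine"},
    "western europe": {"western europe", "Germany", "France"},
}
# Hypothetical themes, each tagged with the terminology associated with it.
THEMES = {
    "Oktoberfest": {"Oktoberfest", "beer", "Germany"},
    "Mardi Gras": {"Mardi Gras", "crepes", "France"},
    "Vineyards": {"wine"},
}

def rank_themes(expanded, themes):
    """Score each theme by how many query terms its terminology satisfies."""
    scored = []
    for theme, terminology in themes.items():
        satisfied = sorted(q for q, terms in expanded.items()
                           if terms & terminology)
        if satisfied:
            scored.append((theme, satisfied))
    # Most query terms satisfied first; ties broken alphabetically by theme.
    scored.sort(key=lambda item: (-len(item[1]), item[0]))
    return scored

ranking = rank_themes(EXPANDED, THEMES)
print(ranking)
```

Themes that satisfy all three query terms are listed ahead of "Vineyards", which satisfies only one, mirroring the preference for terminology that satisfies as many query terms as possible.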
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block diagram illustrating one embodiment for
`the search and retrieval system of the present invention.
`FIG. 2 is a block diagram illustrating one embodiment for
`query processing.
`FIG. 3 illustrates a response to an example query
`configured in accordance with one embodiment of the search and
`retrieval system of the present invention.
`FIG. 4 illustrates an example portion of a knowledge base
`that includes a directed graph.
`FIG. 5 is a flow diagram illustrating one embodiment for
`factual knowledge base query processing.
`FIG. 6 illustrates one embodiment for expanding query
`terms using a directed graph of the knowledge base.
`FIG. 7 is a flow diagram illustrating one embodiment for
`processing concept knowledge base queries.
`FIG. 8a illustrates one embodiment of categories,
`arranged hierarchically, for the "social sciences" topic.
`FIG. 8b illustrates one embodiment of categories,
`arranged hierarchically, for the "food and agriculture" topic.
`FIG. 8c illustrates one embodiment of categories,
`arranged hierarchically, for the "geography" topic.
`FIG. 9a illustrates an example portion of a knowledge
`base used to expand the query term "festivals."
`FIG. 9b is a block diagram illustrating a portion of an
`example knowledge base used to expand themes.
`FIG. 9c illustrates one embodiment for a search and
`retrieval response in accordance with the example query
`input.
`FIG. 10a illustrates an example display of the search and
`retrieval system to the query "Internet."
`FIG. 10b illustrates another example display for the
`query "Internet."
`FIG. 11a illustrates an example display of the search and
`retrieval system to the query "stocks."
`FIG. 11b illustrates an example display in response to the
`selection to the category "portfolios" from the display
`shown in FIG. 11a.
`FIG. 12 illustrates an example display for a profile query
`in accordance with one embodiment of the present
`invention.
`FIG. 13 is a block diagram illustrating one embodiment
`for a content processing system.
`FIG. 14 illustrates a high level block diagram of a general
`purpose computer system in which the search and retrieval
`system of the present invention may be implemented.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`Search and Retrieval Paradigm
`The search and retrieval system of the present invention
`utilizes a rich and comprehensive content processing system
`to accurately identify themes that define the content of the
`source material (e.g., documents). In response to a search
`query, the search and retrieval system identifies themes, and
`the documents classified for those themes. In addition, the
`search and retrieval system of the present invention draws
`inferences from the themes extracted from a document. For
`example, a document about wine, appearing in a wine club
`magazine, may include the words "vineyards",
`"Chardonnay", "barrel fermented", and "french oak", which
`are all words associated with wine. As described more fully
`below, if the article includes many content carrying words
`that relate to the making of wine, then the search and
`retrieval system infers that the main topic of the document
`is about wine, even though the word "wine" may only
`appear a few times, if at all, in the article. Consequently, by
`inferring topics from terminology of a document, and
`thereby identifying the content of a document, the search
`and retrieval system locates documents with the content that
`truly reflect the information sought by the user. In addition,
`the inferences of the search and retrieval system provide the
`user with a global view of the information sought by
`identifying topics related to the search query although not
`directly included in the search query.
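The wine example can be made concrete with a small sketch that infers a topic from content carrying words. The term-to-category table, the voting rule (count of associated terms found in the text), and the function name `infer_topic` are assumptions for illustration; the content processing system described later in the specification is considerably richer than this.

```python
from collections import Counter

# Hypothetical term -> category associations, in the spirit of the wine example.
TERM_CATEGORY = {
    "vineyards": "wines",
    "chardonnay": "wines",
    "barrel fermented": "wines",
    "french oak": "wines",
    "service": "restaurants",
}

def infer_topic(text):
    """Return the category with the most associated terms in the text, or None."""
    lowered = text.lower()
    votes = Counter()
    for term, category in TERM_CATEGORY.items():
        votes[category] += lowered.count(term)
    topic, count = votes.most_common(1)[0]
    return topic if count > 0 else None

article = ("The vineyards produced a barrel fermented Chardonnay "
           "aged in French oak.")
topic = infer_topic(article)
print(topic)
```

The sample article never uses the word "wine", yet the associated terminology is enough to place it under the "wines" category.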
`The search and retrieval system of the present invention
`utilizes sense associations to identify related terms and
`concepts. In general, sense associations relate terminology
`to topics or categories based on contexts for which the term
`may potentially appear. In one embodiment, to implement
`the use of sense association in a search and retrieval system,
`a knowledge base is compiled. The knowledge base reflects
`the context of certain terminology by associating terms with
`categories based on the use of the terms in documents. For
`the above example about wine making, the term "barrel
`fermented" may be associated with the category "wines." A
`user, by processing documents in the content processing
`system described herein, may compile a knowledge base
`that associates terms of the documents with categories of a
`classification system to develop contextual associations for
`terminology.
`As described more fully below, the search and retrieval
`system of the present invention maps search queries to all
`senses, and presents the results of the query to reflect the
`contextual mapping of the query to all possible senses. In
`one embodiment, the search and retrieval system presents
`the results relative to a classification system to reflect a
`context associated with the query result. For example, if the
`user search term is "stock", the search and retrieval system
`response may include a first list of documents under the
`category "financial securities", a second list of documents
`under the category "animals", and a third category under the
`category "race automobiles." In addition, the search and
`retrieval system groups categories identified in response to
`a query. The grouping of categories further reflects a context
`for the search results. Accordingly, with contextual mapping
`of the present
