throbber
I lllll llllllll Ill lllll lllll lllll lllll lllll 111111111111111111111111111111111
`US007885963B2
`
`c12) United States Patent
`Sanders
`
`(IO) Patent No.:
`(45) Date of Patent:
`
`US 7 ,885,963 B2
`Feb.8,2011
`
`(54) FREE TEXT AND ATTRIBUTE SEARCHING
`OF ELECTRONIC PROGRAM GUIDE (EPG)
`DATA
`
`(75)
`
`Inventor: Scott D. Sanders, Union City, CA (US)
`
`(73) Assignee: Microsoft Corporation, Redmond, WA
`(US)
`
`( *) Notice:
`
`Subject to any disclaimer, the term ofthis
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 1446 days.
`
`(21) Appl. No.: 10/395,679
`
`(22) Filed:
`
`Mar. 24, 2003
`
`(65)
`
`(51)
`
`(52)
`
`(58)
`
`Prior Publication Data
`
`US 2004/0194141 Al
`
`Sep.30,2004
`
`Int. Cl.
`G06F 17130
`(2006.01)
`U.S. Cl. ....................... 7071750; 707/706; 707/713;
`707/722; 707/736; 707/748; 707/752; 707/754;
`707/755; 707/758; 707/781; 707/791; 707/802;
`707/822; 704/275; 704/270.1; 704/257; 725/53;
`725/39; 725/40; 235/380; 709/203; 705/14
`Field of Classification Search ................. 704/275,
`704/270.1, 257; 707/3, 706, 713, 722, 736,
`707/748, 752, 754, 755, 758, 781, 791, 802,
`707/822, 999.002, 999.003, 999.007, 999.01;
`725/53, 39, 40; 235/380; 709/203; 705/14
`See application file for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`5,992,737 A *
`6,456,978 Bl *
`6,553,345 Bl *
`7,213,256 Bl*
`
`1111999 Kubota ....................... 235/380
`912002 Wymore et al .............. 704/275
`4/2003 Kuhn et al.
`................. 704/275
`5/2007 Kikinis ........................ 725/53
`
`2001/0049826 Al *
`2002/0069218 Al*
`2002/0143860 Al*
`2002/0151327 Al*
`2003/0093790 Al*
`200410177063 Al *
`
`12/2001 Wilf ........................... 725/120
`612002 Sull et al.
`................ 707/501.1
`10/2002 Catan ......................... 709/203
`10/2002 Levitt ......................... 455/556
`5/2003 Logan et al.
`.................. 725/38
`912004 Weber et al .................... 707 /3
`
`FOREIGN PATENT DOCUMENTS
`
`EP
`EP
`EP
`GB
`GB
`RU
`
`1079371 Al
`1094406 A2
`1289287 A2
`2340633 A
`2340633 A *
`2138076 Cl
`
`212001
`4/2001
`3/2003
`212000
`212000
`9/1999
`
`OTHER PUBLICATIONS
`
`Decision on Grant from the Russian Patent Office for Application No.
`2004108667/09(009236), mailed on Mar. 30, 2009, 21 pgs.
`* cited by examiner
`Primary Examiner-Syling Yen
`(74) Attorney, Agent, or Firm-Lee & Hayes, PLLC
`
`(57)
`
`ABSTRACT
`
`Subject matter includes a search engine for electronic pro(cid:173)
`gram guide (EPG) data and related methods. In an exemplary
`method, a text search string can be normalized into searchable
`terms and the terms interpreted as either text search terms or
`attribute search terms. One or more queries having search
`conditions of varying degrees of complexity are created
`according to the interpretation of the terms of the search
`string. One or more searches in EPG databases and/or web(cid:173)
`resources are performed based on interpretation of the text
`and attribute terms and results are given a relevancy ranking
`according to the interpretation. The combined search results
`may be grouped, ranked, and filtered for display to the user.
`Results may also be displayed progressively as each character
`of a search string is entered by the user.
`
`29 Claims, 7 Drawing Sheets
`
`rfillll
`
`RECEIVE A TEXT SEARCH STRING
`
`NORMALIZE THE TEXT SEARCH
`STRING
`
`SEPARATE THE SEARCH STRING INTO
`TEXT TERMS AND ATTRIBUTE TERMS.
`
`SEARCH ON EACH TEXT TERM AND
`EACH ATTRIBUTE TERM.
`
`COMBINE, RANK, ANDfOR FILTER THE
`SEARCH RESULTS.
`
`COMBINE, RANK, AND/OR FILTER THE
`SEARCH RESULTS.
`
`602
`
`604
`
`606
`
`608
`
`610
`
`612
`
`Comcast, Exhibit-1002
`
`1
`
`

`

`N
`~CIO
`?'
`('D
`""f'j
`
`....
`0 ....
`
`~ = ~
`
`~
`~
`~
`•
`00
`~
`
`11§.
`
`HIT LIST
`
`ill
`
`FILTERING LOGIC
`
`IB
`
`RESULTS
`
`.1.1Q
`
`SEARCH
`
`PERFORM ATTRIBUTE
`
`106
`
`VIA CHECKBOXES
`SEARCH CRITERIA
`INPUT ATTRIBUTE
`
`102\
`
`ill
`
`RESULTS
`
`108
`
`SEARCH
`
`PERFORM TEXT
`
`104
`
`SEARCH FIELD
`KEYWORD(S) To
`
`INPUT TEXT
`
`100\
`
`2
`
`

`

`?'9, 2
`
`............
`
`222
`
`HIT LIST
`RANKED
`FILTERED,
`COMBINED,
`
`·~
`
`QUERY 216
`
`r
`
`' +
`
`'
`
`INTERPRETATION(S)
`
`,_
`
`I-
`
`l
`
`212
`
`SEARCHES
`ATIRIBUTE
`CONDITONAL
`
`,_
`
`....
`
`210
`
`TEXT SEARCHES
`
`CONDITIONAL
`
`,_
`
`....
`
`208
`
`SEARCHES
`ATIRIBUTE
`POSSIBLE
`
`,_
`
`1-.
`
`206
`
`TEXT SEARCHES
`
`POSSIBLE
`
`STRING INTERPRETER 214
`
`SEARCH ENGINE 204
`
`~200
`
`[ SEARCH STRING 202
`
`3
`
`

`

`332
`
`DISPLAY
`
`DISPLAY UPDATER 330
`
`INCLUSION FILTER 328
`
`RESULTS RANKER 326
`
`FILTER 324
`
`SEARCH COMPARATOR/
`
`HIT RECEIVER 322
`
`320
`
`RESULTS COMPILER
`
`FORMULATOR 318
`ATTRIBUTE QUERY
`
`FORMULATOR 316
`
`TEXT QUERY
`
`QUERY GENERATOR dM
`
`ATTRIBUTE IDENTIFIER 308
`
`STRING TRANSFORMER 306
`
`STRING INTERPRETER 214
`
`302
`
`NORMALIZER
`
`DATA MANAGER 319
`
`}Q1
`
`SEARCH STRING RECEIVER
`
`SEARCH ENGINE 204
`
`300
`
`DATABASE(S)
`
`EPG
`
`312
`
`CONSTRAINTS
`
`SEARCH
`TABLE OF
`
`INDEX(ES)
`
`ill
`
`METADATA
`
`ATTRIBUTES .llQ
`
`LIST OF
`
`4
`
`

`

`How THE WEST WAS WON
`
`FILTERED HIT LIST 422
`
`ALIAS BILLY THE KID
`
`THE ALAMO
`
`ADVENTURES OF KIT CARSON
`
`[SHANE WEST MOVIES]
`
`[DOMINIC WEST MOVIES]
`
`[STEVE DOLBY MOVIES]
`[THOMAS DOLBY MOVIES]
`
`FLYING G MEN
`
`G
`
`ALIG INDAHOUSE
`
`SECOND RATE
`
`CHEAP RATE GRAVITY
`
`THE DOLBY DIGITAL EXPERIENCE
`
`BEN-HUR
`
`BAREFOOT IN THE PARK
`
`AN AMERICAN TAIL
`AMAZING GRACE
`
`AIRPORT
`
`2001: A SPACE ODYSSEY
`
`102 DALMATIONS
`
`HIT LISTS 420
`
`HIT LISTS 418
`
`HIT LISTS 416
`
`WEST BANK BROOKLYN
`
`WEST OF THE RIO GRANDE
`
`THE WEST WING
`WILD WILD WEST
`WEST SIDE STORY
`
`How THE WEST WAS WON
`
`HIT LISTS _1ll
`
`-G?
`-WEST?
`-DOLBY?
`
`-"G"?
`-"RATED"?
`
`11DOLBY WEST"?
`
`-
`
`-"DOLBY"?
`
`ALT. ATTRIBUTE SEARCHES 412
`
`ALTERNATE TEXT SEARCHES 410
`
`-RATED G
`-DOLBY AUDIO
`-DOLBY DIGITAL
`
`ATTRIBUTE SEARCHES 408
`
`PRIMARY
`
`11WEST"
`
`-
`
`TEXT SEARCHES 406
`
`PRIMARY
`
`DOLBY WEST RA TED G 402
`
`doLBy WEST-rated "9" 202
`
`5
`
`

`

`U.S. Patent
`
`Feb.8,2011
`
`Sheet 5of7
`
`US 7 ,885,963 B2
`
`PROGRESSIVE SEARCH ,5QQ
`
`502
`
`THE
`
`506
`
`THE WEST
`
`510
`
`THE Wl::ST WING
`
`514
`
`THE WEST WING DOLBY
`
`518
`
`THE WEST WING DOLBY ELECTION
`
`----------1- ENGINE -------.i
`
`SEARCH
`
`2Qi
`
`504
`
`508
`
`512
`
`516
`
`520
`
`6
`
`

`

`U.S. Patent
`
`Feb.8,2011
`
`Sheet 6of7
`
`US 7 ,885,963 B2
`
`602 \
`
`604 \
`
`606 \
`
`608 \
`
`RECEIVE A TEXT SEARCH STRING.
`
`1J
`
`NORMALIZE THE TEXT SEARCH
`STRING.
`
`,,
`
`SEPARATE THE SEARCH STRING INTO
`TEXT TERMS AND ATTRIBUTE TERMS.
`
`1J
`
`SEARCH ON EACH TEXT TERM AND
`EACH ATTRIBUTE TERM.
`
`,,
`
`610 \
`
`COMBINE, RANK, AND/OR FILTER THE
`SEARCH RESULTS.
`
`,,
`
`612 \
`
`COMBINE, RANK, AND/OR FILTER THE
`SEARCH RES UL TS.
`
`7
`
`

`

`PROGRAMS
`APPLICATION
`
`NETWORK
`
`785
`
`MODEM
`
`I oo o>o'i
`
`773
`
`772
`
`756
`
`KEYBOARD
`
`747
`
`746
`
`745
`
`DATA
`
`MODULES
`
`PROGRAM
`
`OTHER
`
`SYSTEM
`
`OPERATING
`
`--·----·-·----~.--··
`
`--·-
`
`-
`
`737
`
`PROGRAM DATA
`
`796
`
`797
`
`791
`
`PRINTER
`
`SPEAKERS
`
`D
`
`INTERFACE
`NETWORK l/L--U.l~"""'-Ui.l.-1
`
`INTERFACE
`USER INPUT
`
`771
`
`1 LOCALAREA
`
`INTERFACE
`MEMORY
`
`REMOVABLE
`
`MEMORY
`
`REMOVABLE
`
`760
`
`740 750
`
`Bus
`
`721
`
`ADAPTER
`
`VIDEO
`
`790
`
`PROCESSING UNIT
`
`MODULES 736
`OTHER PROGRAM
`
`204
`
`SEARCH ENGINE
`
`EXEMPLARY
`
`PROGRAMS
`APPLICATION
`
`735
`
`I
`I
`I
`732 I
`
`-
`
`-
`
`-
`
`-
`
`SYSTEM 734
`
`OPERATING
`
`L --·-·-·--
`·--
`I 1
`11
`I 1
`I 1
`11
`11
`I 1
`11
`11
`I 1
`I 1
`11
`I 1
`I I
`I 1
`11
`11
`11
`I 1
`I 1
`ii (RAM
`Ir --·-·-·-·--
`
`-----~----
`
`I
`
`720
`
`: = =. §y:§~E~ ~E~O=Ri. =. ~ 730
`
`731
`
`BIOS zaa)
`
`: : (
`
`I I (ROM)
`
`I
`
`8
`
`

`

`US 7,885,963 B2
`
`1
`FREE TEXT AND ATTRIBUTE SEARCHING
`OF ELECTRONIC PROGRAM GUIDE (EPG)
`DATA
`
`One set of XML text file listings used in accordance with
`the subject matter are provided in an appendix after the
`abstract on three sheets of paper and incorporated by refer(cid:173)
`ence into the specification. The XML text file listing is an
`exemplary list of attributes (e.g., block 310 of FIG. 3).
`
`TECHNICAL FIELD
`
`This invention relates generally to multimedia data com(cid:173)
`munications and specifically to free text and attribute search(cid:173)
`ing of electronic program guide (EPG) data.
`
`BACKGROUND
`
`In a digital television system, users typically search for
`program viewing information in an electronic program guide
`(EPG). An EPG is a dataset of program and schedule meta(cid:173)
`data that relate to programs available from a user's local
`headend (a programming provider's channel lineup) during
`an interval of time, for example one week. A user searches for
`programs by designating keyword terms to perform a text 25
`search of program titles. A search result, typically a program
`title, that fits the criteria of the search is known as a "hit."
`Television programs can also be found by searching for
`program attributes, not just keywords from the program title.
`Although a program's title can be viewed as one attribute of 30
`the program, the term "attribute" is used more technically in
`the multimedia arts to describe many non-title characteristics
`of a program. As such, attributes include an unlimited myriad
`of characteristics about a program and its presentation. A
`program and its attributes typically fit into a vast hierarchy of 35
`EPG information. Some attributes, such as "category," may
`describe a broad set of programs of which the program being
`sought is a member, that is, the "category" attribute describes
`a hierarchical level inclusive of the program being sought, so
`that a user's search for the "category" attribute results in hits 40
`for every program in the category. A search for "westerns," for
`example, may result in many hits belonging to the western
`movie and television genre.
`Attributes can also describe characteristics proper to a
`program title, i.e., attributes can describe options that cause 45
`two versions of the same program to vary from each other.
`This narrower type ofattribute describes elements of the EPG
`hierarchy "below" a particular program, i.e., that cause the
`same program to have different flavors. For example, one
`version of "The Mountain" starring Spencer Tracy may have 50
`attributes of "black & white" and "monophonic" while
`another version has attributes of"color," "DOLBY® Digital,"
`"close captioning," and "rated PG." A text search limited to
`keyword elements of the title will bring up both versions as
`hits, whereas a text search for keyword elements of the title 55
`combined with an attribute search for "Dolby" will bring up
`only the DOLBY® digital and DOLBY® Stereo sound sys(cid:173)
`tem versions.
`A conventional text search for a keyword element of the
`title is performed by entering keyword text into a search field
`of a user interface (UI), and a conventional attribute search is
`performed by entering the attribute criterion (e.g., in check(cid:173)
`boxes) separately from the text search keywords in a separate
`part of the UI, or in a separate UI altogether. Hits from the
`separate text search and the separate attribute search are logi- 65
`cally combined only at the end of both searches to produce a
`hit list.
`
`2
`FIG. 1 shows a conventional text search 100 and a conven(cid:173)
`tional attribute search 102 wherein the text keywords and the
`attribute criteria are entered separately, such as in a search
`field separate from attribute checkboxes that are displayed via
`one or more Uls.
`At block 104, the text keyword(s) are input. At block 106,
`the attribute criteria are input separately from the keyword(s)
`input. A text search is performed 108, typically on the pro(cid:173)
`gram title, using the entered text keyword(s ), and an attribute
`10 search is performed 110 using the entered attribute search
`criteria. Each separate search yields separate results 112, 114.
`Filtering logic 116 may compare the results 112, 114 and
`produces a hit list 118 that typically satisfies both the text
`search 100 and the attribute search 102.
`Since users are typically required to search for attributes in
`a separate part of a UI, many restrict their search input to
`keyword style queries of program title words only, while
`others incorrectly enter attributes as keywords for a text
`search of program titles without realizing that the digital
`20 television system interprets the attributes thus entered as ordi(cid:173)
`nary program title search terms.
`
`15
`
`SUMMARY
`
`Subject matter includes a search engine for electronic pro(cid:173)
`gram guide (EPG) data and related methods. In an exemplary
`method, a text search string can be normalized into searchable
`terms and the terms interpreted as either text search terms or
`attribute search terms. Queries using search logic of varying
`complexity are created according to the interpretation of the
`terms in the search string. One or more searches in EPG
`databases and/or EPG web-resources are performed for each
`text and each attribute term and results are given a relevancy
`ranking according to the interpretation of the search string.
`The combined search results may be ranked, filtered, and
`grouped for display to the user. Results may also be displayed
`progressively as each character of a search string is entered by
`the user.
`In one implementation, a list of attributes can be used to
`separate the text terms from the attribute terms. An exemplary
`list of attributes may additionally contain indexes or metadata
`for each attribute that direct the course of searches performed
`for each attribute term in the search string.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a graphic representation of prior art text and
`attribute searches.
`FIG. 2 is a graphic representation of an exemplary search
`according to the subject matter.
`FIG. 3 is a block diagram of an exemplary search engine.
`FIG. 4 is a graphic representation of an exemplary search
`performed by the exemplary search engine of FIG. 3.
`FIG. 5 is a graphic representation of an exemplary progres(cid:173)
`sive search performed by the exemplary search engine of FIG.
`3.
`
`FIG. 6 is a flow diagram of an exemplary method according
`to the subject matter.
`FIG. 7 is a block diagram of an exemplary computing
`60 device environment in which to practice the subject matter.
`
`DETAILED DESCRIPTION
`
`Overview
`Quick, intuitive, and accurate searching of electronic pro(cid:173)
`gram guide (EPG) data is important to end-users of digital
`television systems, including both "set-top box" systems
`
`9
`
`

`

`US 7,885,963 B2
`
`3
`added to a television set and personal computer (PC) systems
`capable of receiving and recording television broadcasts.
`Exemplary subject matter describes a search engine and
`related methods for entering and interpreting a user's text and
`attribute queries in a digital television system. This quick and
`accurate search mechanism interprets, for example, a single
`text string input by a user and infers the user's likely intent,
`displaying a set of hit results that attempts to match the
`interpretation of what the user is searching for. Such
`advanced search capabilities are important as the PC digital 10
`video recorder (DVR) and set-top box markets expand at a
`rapid rate.
`FIG. 2 gives an overview of an exemplary search 200 in
`EPG databases or other resources. The exemplary search 200
`begins with a search string 202 entered by or received from a 15
`user or retrieved from storage. An exemplary search engine
`204 receives the search string 202, performing several trans(cid:173)
`formations on the terms of the string to generate possible
`query instructions of various levels of complexity.
`A single search string 202 may spawn one or more inde- 20
`pendent searches, depending on the complexity and indefi(cid:173)
`niteness of the search string 202, and so accordingly in one
`implementation, a string interpreter 214 tries to compose as
`few queries (one if possible) and as restrictive queries as
`possible, based on the particular search string 202. For 25
`example, in one implementation the string interpreter 214
`achieves this minimum number of concise and/or restrictive
`queries by determining which terms in the search string 202
`are known attributes to be searched as attribute search terms.
`The remaining terms in the search string 202 may be possible
`attribute terms or possible text terms, but determining known
`attributes (e.g., using a database or list of known attributes)
`often narrows a query considerably. Thus, the string inter(cid:173)
`preter 214 "considers" series of text searches 206, attribute
`searches 208, conditional text searches 210, and conditional 35
`attribute searches 212, using various methods and resources
`as discussed more fully below with regard to FIG. 3.
`In one implementation, a brute force determination of the
`likely meaning of the search string 202 is accomplished by
`performing a preliminary search on each normalized term in 40
`the search string 202 either in an EPG database and/or in an
`online dictionary andranking results. For example, if"Ryan"
`is a search term, then "Saving Private Ryan" might be
`selected over "Ryan O'Neal" as the likely interpretation of
`the search if it is more common in the database or dictionary.
`In another implementation, the string interpreter 214 tries
`different combinations of the search terms until one of the
`combinations is found in the EPG resources being searched or
`is logically favored. Hence, the search terms, "moon" and
`"paper" occurring randomly but as the only text terms among
`a host of attribute terms in a search string 202 would likely
`result in the interpretation "Paper Moon," which would be
`ranked highly if actually found as a hit via a query.
`The string interpreter 214 thus arrives at one or more inter(cid:173)
`pretations, which result in a query 216 (or queries). The query
`216 is used to perform one or more searches resulting in a
`combined, filtered, and/or ranked hit list 222 that is displayed
`for the user. This exemplary search 200 is only one example
`of how one or more text and attribute sensitive searches can
`stem from only one search string 202. The search string 202
`is parsed, transformed, and analyzed, e.g., by the string inter(cid:173)
`preter 214, to achieve a text and attribute aware interpretation
`of the terms in the search string 202.
`In one implementation, the combined, filtered, and/or
`ranked hit list 222 is updated as various search paths return
`results. Alternatively, results may progressively appear in the
`combined, filtered, and/or ranked hit list 222 as the search
`
`4
`string 202 is being entered, i.e., a new and/or modified search
`is initiated as each character is added to the search string 202.
`Exemplary Search Engine
`FIG. 3 shows the exemplary search engine 204 of FIG. 2 in
`greater detail. The exemplary search engine 204 is commu(cid:173)
`nicatively coupled with one or more EPG database(s) 300.
`Certain features and processes performed by the exemplary
`search engine 204 will now be described, but of course an
`exemplary search engine 204 may have features and perform
`processes that are additional to those illustrated, for example,
`indexing database contents. Various implementations of the
`exemplary search engine 204 will of course operate with
`various different human languages. Thus, some features of
`the exemplary search engine 204 may not apply or may not be
`needed with some languages.
`In general, the exemplary search engine 204 executes the
`EPG database search( es) by first exposing a search string
`receiver 301, such as an application program interface (API),
`that receives search expressions and accesses a set of con(cid:173)
`straints generated either progranimatically or by the user.
`After creating queries based partly on the constraints, the
`exemplary search engine 204 returns the result set.
`The search string receiver 301 can accept the search string
`202 via one or more Uls. Uis for EPG data searching may be
`optimized for either a "two-foot" or a "ten-foot" system. A
`two-foot system is a digital television system in which the
`user has the information entering capability typical of a PC
`system. For example, a two-foot system might use a full
`keyboard, mouse, and monitor. A ten-foot system, on the
`30 other hand, is a digital television system in which the user has
`limited inputting capability compared with a two-foot sys(cid:173)
`tem.Aten-foot system might be the UI generated by a set-top
`box and a search string 202 entered using a remote control
`with relatively few keys to actuate.
`Because data entry and navigation around a UI is generally
`more difficult with a ten-foot system, it is useful minimize the
`amount of data necessary to capture the user's intent in enter(cid:173)
`ing a search into the ten-foot system, while maximizing the
`accuracy and power of the search.
`The search string receiver 301 can, of course, accept user
`input from any number of entry fields, checkboxes, switches,
`and/or devices. If the input is just a single search string 202,
`however, the exemplary search engine 204 can "decode" the
`search string to separate out text and attribute search infor-
`45 mation and construct appropriate queries.
`A normalizer 302 communicatively coupled with the
`search string receiver 301 converts irregularities (i.e., con(cid:173)
`verts the format of "non-standardized" terms) in the search
`string 202 into terms that can be consistently applied in the
`50 various queries. A non-standard entry is a term or a part of the
`search string 202 that has text, character, language, and/or
`punctuation formalities that do not match the formalities of
`the EPG database( s) 3 00 to be searched. A non-standard entry
`runs the risk of being unintelligible in part or whole to the
`55 string interpreter 304, and might result in the miss of quali(cid:173)
`fying hits when its formalities do not match the EPG
`database(s) 300 being searched.
`As part of a search request, the search string receiver 301 is
`often passed a string of characters entered by the user, for
`60 example, "The West Wing". The normalizer 302, or another
`module of the exemplary search engine 204, may employ
`term separation to attempt to split the phrase into each of the
`words that compose it, performing a text search for each word
`individually. This separation into individual words has the
`65 advantage of finding matches when the user has omitted or
`rearranged words in a well-known title, but may also generate
`more hits than intended by the user. Individual word search
`
`10
`
`

`

`US 7,885,963 B2
`
`5
`terms are often separated at white-space and dash boundaries.
`The normalizer 302 may also remove extra spaces between
`words of a quoted phrase.
`Users do not always enter a search string 202 using the
`correct letter casing. For example, "The West Wing" might be
`typed as "the west wing". Additionally, EPG database infor(cid:173)
`mation sometimes contains non-conventional letter casing in
`phrases like "SUNDAY TICKET" instead of "Sunday
`Ticket" or "sunday ticket". The normalizer 302 may perform
`letter-case normalization by converting each search term and 10
`also the EPG database text being searched to a common case
`(either upper or lower), e.g., by following the rules of the
`current locale.
`The normalizer 302 may also normalize certain symbols
`and marks. A diacritical mark is a combining character such 15
`as an accent or umlaut that may appear in many international
`language alphabets, usually accompanying vowels. In one
`implementation of the subject matter, search terms and the
`database text being searched are compared as ifthe diacritical
`marks were not present. Symbols and special alphabet char- 20
`acters can also be rewritten and/or added to search terms by
`the normalizer 302 to facilitate an accurate search. For
`example, characters such as the German esset "W' can be
`rewritten as "ss" so that a search containing the German es set
`will produce hits if either the esset form or the "ss" form are
`present in the text being searched.
`Symbol normalization involves the removal or conversion
`of symbols, such as punctuation, from both the search term(s)
`and the text being searched for more accurate matching. For
`example, periods within abbreviations such as "F.B.I." and
`asterisks in "M* A *S *H" are be removed. Symbols such as
`"&"and"@" may be converted to their word forms "and" and
`"at". Symbols substituted for letters may be converted, e.g.,
`the "$" sign in "Vega$". This symbol normalization may
`result in additional searches. For example, the exemplary
`search engine 204 might search for both "Vegas" and "Vega$"
`after the normalizer 302 has processed the search string 202,
`instead of using only one form by convention and normaliz(cid:173)
`ing both the search term and the text being search according
`to the one conventional form.
`Articles of speech as well as ubiquitous but not very
`descriptive words such as "a", "the", "is", "are", "el", "la",
`"los", "!es", etc. can be removed from the search terms by the
`normalizer 302 to speed processing. In some implementa(cid:173)
`tions, the articles and ubiquitous words are not removed or are 45
`conditionally searched for if they are the only words present
`in the search, or if a first pass of the search yields no results.
`The exemplary search engine 204 may scan the search
`string in advance for tokens that alter the search meaning in
`some way. For instance, preceding a word with a minus-sign 50
`"-"might indicate that the word must not appear in a match.
`Grouping terms together as a quote might indicate that the
`group should only be considered when the terms in the group
`appear in the given order, and together. The words "AND",
`"OR", and "NOT" can be used in a special manner to influ- 55
`ence search operators. This allows a user more control over a
`search.
`In one implementation, the normalizer 302 or other com(cid:173)
`ponents of the search engine 204, may apply spell-checking
`to the search string 202 as entered. Several strategies are
`employed to aid the user in avoiding mistakes when entering
`the search string 202. The user can be alerted to the possible
`presence of misspelled words in the search text before the
`search is executed either via a spelling dialog box or another
`text indicator, such as the "wavy-line" INTELLISENSE®
`highlight. Alternatively, spell-checking can be deferred to the
`end of search execution and performed only if no exact results
`
`6
`are found. Lastly, spell-checking can be performed transpar(cid:173)
`ently. Corrected forms of words interpreted by the normalizer
`302 to be misspelled can be automatically appended to the
`search string 202 and queried as conditional searches. Rel(cid:173)
`evancy ranking (to be discussed below) can then be used to
`lower the rank of these conditional search terms if the
`searches yield hits that actually contain the misspelled search
`terms.
`The normalizer 302 can also use auto-correct, which is a
`simplified form of spell-checking. Instead of comparing each
`search term to a dictionary, words are corrected as they are
`typed using a simple lookup table of commonly misspelled
`words. In rare cases in which auto-correct makes the wrong
`decision (correcting a word the user intentionally meant to
`"misspell"), the user can backspace once to undo the auto(cid:173)
`matic correction. As a user types, a pop-up list of possible
`word completions can also be presented by the normalizer
`302 based upon a pre-existing full-text index. This allows the
`user to enter search criteria more quickly.
`A string interpreter 304 is communicatively coupled with
`the normalizer 302. In one implementation, the string inter(cid:173)
`preter 304 includes a string transformer 306 and an attribute
`identifier 308 communicatively coupled with a list of
`attributes 310. The string interpreter 304 may also be com-
`25 municatively coupled with a table of search constraints 312.
`The string transformer 306 may test the search terms in a
`search string 202 either singly or in combination to achieve an
`interpretation for the search string 202 and to determine if the
`string should be rearranged. The string transformer 306 may
`30 use a rearrangement and/or scrambling scheme to try combi(cid:173)
`nations of the various terms within the search string 202, and
`in various orders to identify possible attributes for which the
`user was trying to search. Thus, "dolby the closed mile green
`captioning" might yield search term candidates including
`35 "dolby," "the Green Mile," and "closed captioning." If it
`appears two or more of the terms should be grouped together,
`the string transformer may transpose the terms and/or trans(cid:173)
`parently add the posited group to the search terms in the
`search string 202. To accomplish this, the string transformer
`40 306 may test the search terms against the list ofattributes 310
`and/or the table of search constraints 312.
`The string transformer 306 may stem a search term to
`broaden the scope of a search. "Stemming" is a process of
`removing prefixes and/or suffixes from a search term to allow
`the root of the search term to proxy for the original search
`term. This produces more hits capturing more variants of the
`original search term. For example, the word "divine" can be
`stemmed to "divin" so that it matches "divinity," "divination",
`"diviner", "divinity," etc. The string transformer 306 may use
`a known stemming algorithm such as the "Porter" algorithm.
`Stemming has the advantage of generating hits that might
`have been missed ifthe search term was queried exactly, but
`has the disadvantage of often generating irrelevant hits. This
`effect can be decreased by using a relevance ranker, either in
`the string transformer 306 or in another part of the exemplary
`search engine 204, such as a results compiler (described
`below). In some implementations, stemming requires a data
`preprocessing step that can cause a decrease in search speed.
`Hence, in some implementations, stemming is limited to spe-
`60 cial words, e.g., rare words as determined by a frequency of
`occurrence index.
`The string transformer 306 may perform additional term
`and string transformations. Word branching involves gener(cid:173)
`ating a set of search terms from a single term. A list of related
`65 words may be used. For example, the search term "NFL"
`might result in the term "football" being appended to the
`search string 202, and thus "NFL" might spawn an additional
`
`11
`
`

`

`US 7,885,963 B2
`
`7
`search for "football". The term "thirty" might be added to the
`search string 202 ifthe term "30" appears in the search string
`202, etc. The related terms can come from an exception list
`relevant to the data being searched, and/or an exception list
`stored in the user-defined or programmatically determined
`table of constraints 312. Word branching may also slow down
`searches and may generate irrelevant results, but it will also
`generate hits in some cases where none would have been
`found. A relevance ranker can be used to place less relevant
`results at the bottom of the hit list 222.
`Pluralizing may also be used. Pluralizing is a simple form
`of word stemming that generates additional search terms
`based on the singular or plural form of the original word. For
`example, if the search term is "baby", an additional search
`term of "babies" can be included.
`Optionally, the string interpreter 3 04 or another component
`of the search engine 204, may add natural language parsing to
`the other string interpretation features and processes. This
`type of parsing includes converting a search phrase such as
`"When is Star Wars shown in Dolby-Digital?" into workable
`search terms that meet the users' search request. Though
`natural language parsing is very powerful, the grammar is
`generally limited to the user's choice of noun/verb meanings.
`Additionally, given the increased number of words in the
`natural language search string 202 (relative to a Boolean term 25
`search), the natural language search may not be the optimal
`choice for a ten-foot system in which search terms have to be
`entered using a remote control keypad. Lastly, natural lan(cid:173)
`guage parsing can require more processing power to imple(cid:173)
`ment.
`The features and processes performed by the string trans(cid:173)
`former 306 and by the other modules of the exemplary search
`engine 204 are described in a certain order, but the order in
`which these features and processes are presented is not meant
`to indicate a suggested order of use through a processing 35
`pipeline. In many cases, the ordering and/or the performance
`of each of these features and processes is important to the
`results generated.
`The attribute identifier 308 separates text search terms
`from attribute search terms by various methods, such as com(cid:173)
`paring candidate or potential search terms to attributes in

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket