Case 1:20-cv-23178-WPD Document 1-2 Entered on FLSD Docket 07/31/2020 Page 1 of 18
EXHIBIT B

(12) United States Patent
Parikh

(10) Patent No.: US 8,311,805 B2
(45) Date of Patent: Nov. 13, 2012

(54) AUTOMATIC DYNAMIC CONTEXTUAL DATA ENTRY COMPLETION SYSTEM
(76) Inventor: Prashant Parikh, New York, NY (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1245 days.
(21) Appl. No.: 11/422,939
(22) Filed: Jun. 8, 2006
(65) Prior Publication Data
US 2006/0217953 A1, Sep. 28, 2006

Related U.S. Application Data
(63) Continuation-in-part of application No. 11/040,470, filed on Jan. 21, 2005.

(51) Int. Cl.
G06F 7/27 (2006.01)
G06F 7/20 (2006.01)
G06F 5/00 (2006.01)
(52) U.S. Cl. 704/9; 704/1; 704/10
(58) Field of Classification Search: 715/256, 715/261, 271; 707/737
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
[earlier entries illegible]
7,039,631 B1 *   5/2006   Finger, II          707/3
7,111,248 B2 *   9/2006   Mulvey et al.       715/812
7,149,695 B1 *   12/2006  Bellegarda          704/275
7,218,249 B2 *   5/2007   Chadha              341/23
7,644,102 B2 *   1/2010   Gaussier et al.     707/999.104
7,650,348 B2 *   1/2010   Lowles et al.       707/999.101
7,657,423 B1 *   2/2010   Harik et al.        704/9
7,679,534 B2 *   3/2010   Kay et al.          341/22
8,036,878 B2 *   10/2011  Assadollahi         704/10
2004/0093557 A1 *  5/2004  Kawatani           715/500
2004/0153963 A1 *  8/2004  Simpson et al.     715/500.1
2005/0060448 A1    3/2005  Gutowitz           710/72

OTHER PUBLICATIONS
Matiasek et al. "Exploiting Long Distance Collocational Relations in Predictive Typing". Proc. of EACL-03 Workshop on Language Modeling for Text Entry Methods. 2003.*
Witten et al. "The Zero-frequency problem: estimating the probabilities of novel events in adaptive text compression". IEEE Transactions on Information Theory, vol. 37, No. 4. (1991), p. 1085-1094.*
* cited by examiner

Primary Examiner: Jesse Pullias
(74) Attorney, Agent, or Firm: Weitzman Law Offices, LLC

(57) ABSTRACT
A method, performed in a character entry system, for interrelating character strings so that an incomplete input character string can be completed by selection of a presented character string involves computing relationship scores for individual character strings in the system from documents present in the character entry system, in response to inputting of a string of individual characters that exceeds a specified threshold, identifying at least one selectable character string from among contextual associations that can complete the input character string in context based upon an overall ranking score computed as a function of at least two other scores, and providing the identified at least one selectable character string to a user for selection.

15 Claims, 6 Drawing Sheets

[Representative drawing: matrix 500 of word pairs over the words Finance, Summary, Sugar, Two, One, Chapter, Chili, Spoon (reference numerals 502 to 516); cell values illegible.]

[Sheet 1 of 6. FIG. 1: top-level flowchart. Legible labels: Character Input, Other Input, Documents (140), Contextual Associations (150), Suggested Completions (170), Completed Input (180); reference numerals 110 and 160 also appear.]

[Sheet 2 of 6. FIG. 2a: flowchart, offline or online computation.]
200: Create list of pertinent documents on the device.
205: Create list of unique words from documents.
210: (Optionally) remove stop words from word list.
215: For each document, count number of occurrences of each word in word list.
220: Store in a matrix of documents vs. words.
225: Calculate the similarity value for all possible pairs of documents using matrix.
230: Compare similarity values to the threshold value.
235: Discard document pairs whose similarity value falls below specified threshold value.
240: Form groups of documents, using remaining document pairs, such that similarity value of all possible pairs in each group is above the threshold value. Words within each similar document group are contextually related.
245: Create lists of unique words from each group of similar documents.
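
The steps of FIG. 2a can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the cosine coefficient is picked as the similarity measure because the description names it as one option, the stop word list is a made-up sample, the greedy grouping is a simplification, and the names `bag_of_words`, `cosine` and `group_documents` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Assumed sample stop word list (step 210); a real list would be longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "at"}

def bag_of_words(text):
    """Steps 205-215: lowercase, strip punctuation, drop stop words, count words."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return Counter(w for w in words if w and w not in STOP_WORDS)

def cosine(bag_a, bag_b):
    """Step 225: cosine coefficient between two word-count vectors."""
    dot = sum(count * bag_b[word] for word, count in bag_a.items())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def group_documents(documents, threshold):
    """Steps 200-245: return one sorted unique-word list per document group."""
    bags = {name: bag_of_words(text) for name, text in documents.items()}
    # Steps 230-235: keep only the pairs whose similarity reaches the threshold.
    similar = {frozenset(pair) for pair in combinations(bags, 2)
               if cosine(bags[pair[0]], bags[pair[1]]) >= threshold}
    # Step 240 (greedy simplification): a document joins every existing group
    # in which it is similar to all members, so groups may overlap; a document
    # that fits no group starts its own (an assumption of this sketch).
    groups = []
    for name in bags:
        placed = False
        for group in groups:
            if all(frozenset((name, member)) in similar for member in group):
                group.append(name)
                placed = True
        if not placed:
            groups.append([name])
    # Step 245: one list of unique words per group.
    return [sorted(set().union(*(bags[n] for n in group))) for group in groups]
```

For example, with three hypothetical documents `{"d1": "finance report summary", "d2": "finance summary report quarterly", "d3": "two spoon sugar chili"}` and a threshold of 0.5, the first two documents fall into one group and the third stands alone, giving two word lists.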

[Sheet 3 of 6. FIG. 2b: flowchart, online computation.]
250: Accept characters from the user until the threshold number of required characters are entered.
255: Identify relevant document groups from entered characters.
260: Using the identified document groups, choose words that are contextually associated with and match the characters entered.
265: Offer the chosen contextually associated words to the user for selection to complete the entry.
270: The user accepts or rejects the offered contextually associated words and the process repeats with the beginning of the next word entry.
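
A minimal sketch of the selection steps of FIG. 2b follows, written in Python. It rests on several assumptions that are not stated in the flowchart: previously typed words serve as the context that identifies relevant word lists, candidates are ranked by how many context words share a list with them, and the function name `suggest` and parameter `min_chars` are hypothetical. The sample lists reuse only the words legible in FIG. 6.

```python
# Word lists as legible in FIG. 6 (elided entries are simply left out).
WORD_LISTS = [
    ["finance", "report", "summary"],
    ["chili", "one", "salt", "spoon", "sugar", "two", "three"],
    ["book", "chapter", "summary", "two", "three"],
    ["apple", "cider", "orange"],
    ["orange", "paper"],
]

def suggest(prefix, context_words, word_lists, min_chars=2):
    """Return completions for `prefix`, best match first.

    Step 250: nothing is offered until `min_chars` characters are entered.
    Step 255: a list is relevant if it contains at least one context word.
    Step 260: words in relevant lists that start with the prefix are chosen.
    """
    if len(prefix) < min_chars:
        return []
    prefix = prefix.lower()
    context = {w.lower() for w in context_words}
    scores = {}
    for words in word_lists:
        overlap = len(context & set(words))
        if overlap == 0:
            continue  # list is not relevant to what was typed so far
        for word in words:
            if word.startswith(prefix):
                scores[word] = scores.get(word, 0) + overlap
    # Step 265: offer candidates, highest score first, ties alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))
```

With these lists, `suggest("su", ["finance"], WORD_LISTS)` offers only "summary", while `suggest("su", ["two", "spoon"], WORD_LISTS)` ranks "sugar" ahead of "summary", because "two" and "spoon" both appear in the list that contains "sugar".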

[Sheet 4 of 6. FIG. 3: documents versus words matrix (reference numerals 302, 312, 314, 316 legible; cell values illegible). FIG. 5: matrix 500 of word pairs over the words Finance, Summary, Sugar, Two, One, Chapter, Chili, Spoon (reference numerals 502 to 516; cell values illegible).]

[FIG. 6: word lists (reference numerals 600 to 608).]
Word List #1: finance, report, summary, ...
Word List #2: chili, one, salt, spoon, sugar, two, three, ...
Word List #3: book, chapter, summary, two, three, ...
Word List #4: apple, cider, orange, ...
Word List #5: orange, paper, ...

[Sheet 5 of 6. FIG. 4a: flowchart, offline or online computation.]
400: Create list of pertinent documents on the device.
410: Create list of unique words from documents.
420: (Optionally) remove stop words from word list.
430: Count frequency of co-occurrence, within a unit, across documents for all possible pairs of words from list.
440: [Enter] frequency results into a matrix.
450: Use matrix to identify word pairs that are contextually associated based on their frequency of co-occurrence.
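
The co-occurrence counting of FIG. 4a can be sketched as follows. This is a hedged illustration only: the flowchart does not say what a "unit" is, so the sketch assumes a unit is a sentence delimited by a period, uses a made-up stop word list, and stores the "matrix" as a sparse mapping from word pairs to counts. The names `cooccurrence_counts` and `associated_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumed sample stop word list (step 420).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "at", "for", "is"}

def cooccurrence_counts(documents, stop_words=STOP_WORDS):
    """Steps 400-440: count, for every word pair, the units containing both.

    A unit is assumed to be a sentence ending in a period. Each pair is
    stored as an alphabetically ordered tuple, so (a, b) and (b, a) are
    never counted separately.
    """
    counts = Counter()
    for text in documents:
        for unit in text.lower().split("."):
            words = {w.strip(",;:!?\"'()") for w in unit.split()}
            words = sorted(w for w in words if w and w not in stop_words)
            counts.update(combinations(words, 2))
    return counts

def associated_pairs(counts, min_count):
    """Step 450: keep the pairs that co-occur at least `min_count` times."""
    return {pair for pair, n in counts.items() if n >= min_count}
```

For the two hypothetical documents "Finance summary for the year. The finance summary is short." and "Two spoons of sugar.", the pair ("finance", "summary") is counted twice and is the only pair that survives a `min_count` of 2.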

[Sheet 6 of 6. FIG. 4b: flowchart, online computation. Reference numerals are largely illegible; 480 appears beside the later steps.]
Accept characters from the user until the threshold number of required characters are entered.
Identify relevant words in the matrix from entered characters.
Using identified words in matrix, choose words that are contextually associated.
Offer the chosen contextually associated words to the user for selection to complete the entry.
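
The selection steps of FIG. 4b might look like the following Python sketch. The pair counts are invented for illustration (the cell values of FIG. 5 are illegible here), previously typed words are assumed to supply the context, and the name `suggest_from_pairs` and parameter `min_chars` are hypothetical.

```python
# Hypothetical co-occurrence counts keyed by word pair; not taken from FIG. 5.
PAIR_COUNTS = {
    ("finance", "summary"): 5,
    ("spoon", "sugar"): 4,
    ("spoon", "summary"): 1,
}

def suggest_from_pairs(prefix, context_words, pair_counts, min_chars=2):
    """Return completions for `prefix` ranked by co-occurrence with the context."""
    if len(prefix) < min_chars:
        return []  # threshold number of characters not yet entered
    prefix = prefix.lower()
    context = {w.lower() for w in context_words}
    scores = {}
    for (first, second), count in pair_counts.items():
        if count <= 0:
            continue
        # A pair is useful in either orientation: one member was already
        # typed and the other matches the characters entered so far.
        for typed, candidate in ((first, second), (second, first)):
            if typed in context and candidate.startswith(prefix):
                scores[candidate] = scores.get(candidate, 0) + count
    return sorted(scores, key=lambda w: (-scores[w], w))
```

Under these made-up counts, typing "su" after "finance" yields "summary", and typing "su" after "two" and "spoon" yields "sugar" first and "summary" second.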

AUTOMATIC DYNAMIC CONTEXTUAL DATA ENTRY COMPLETION SYSTEM

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of and claims the benefit of priority of U.S. patent application Ser. No. 11/040,470 filed Jan. 21, 2005, the entirety of which is incorporated herein by reference as if fully set forth herein.

FIELD OF THE INVENTION

The present invention relates to information processing and, more particularly, computer, cellphone, personal digital assistant, or other similar device-based text entry.

BACKGROUND OF THE INVENTION

In modern life, there are a number of devices, notably digital computers and multifunctional handheld units that involve data entry, typically text, including for example cellular phones and other devices like organizers and handheld computers. For all of these, one important use is the entry of linguistic items like words, phrases, and sentences. For example, a user may create an unstructured text document or might formulate an email message or a short text message to be sent as an SMS message on a cellphone. In such cases, text entry may occur through use of a keyboard or stylus for some handheld computers or cellphones, etc. However, data entry can be difficult when the keyboard is relatively small as it is on a handheld cell phone, organizer or computer, or uses individual keys for entry of multiple letters, especially when a large number of characters must be entered. Similarly, with devices employing a stylus for text entry, entry of text can be slow and burdensome.

Automated word completion programs have eased the burden somewhat. Such automated word completion programs have appeared recently in a variety of applications in a variety of devices. These programs are typically based on either predefined word suggestion lists (e.g. a dictionary) or are culled from the user's own most recently typed terms, the latter often called MRU (i.e. "Most Recently Used") programs. For example, the former type of program is based on a pre-given word suggestion list based on a dictionary augmented with information about which words are more frequently used. If a user types the characters "su" in a document, then it might suggest "super" as the appropriate word completion based on the fact that it belongs to the pre-given word suggestion list and has a high frequency of use in general English. On the other hand, the latter type of program would suggest a word completion based on the user's own recently used words (e.g. "Supreme" may be suggested to a lawyer who has recently input "Supreme Court"). Such programs are often found in web browsers for example and will suggest the most recently used "uniform resource locator" or URL (e.g. www.google.com when the user types "www.g") as characters are input.

A third type of program is able to detect that the user is in a particular type of field (e.g. the closing of a letter) and will suggest word completions (e.g. "Sincerely" when the user types "Si") based on a more limited "contextual" list. An extension of this is to maintain many separate word suggestion lists and allow the user to choose an appropriate list for each document the user creates. Other variants allow users to actually insert entries manually into word suggestion lists (e.g. a name and address) or to maintain frequencies of word usage by a user and thus, rather than offering the most recently used word, offer the user's most frequently used words.

SUMMARY OF THE INVENTION

While the methods delineated above have many useful features, there is still a lack of a true context based system that is dynamic and automatic and thus, there is still much room for improvement when it comes to data entry in such devices. Systems that maintain separate word lists and allow the user to choose an appropriate list are contextual to some degree, but still have the drawback of requiring the user to make a list selection each time, something that can become annoying for a user who typically creates several documents within the course of a single day. Moreover, separate word suggestion lists are still inefficient because they are not automatically generated but instead depend on the user's guidance and input.

The present invention combines certain features from existing techniques but goes significantly beyond them in creating a family of techniques that are automatic, dynamic, and context based as explained in greater detail herein.

The present invention involves a method, performed in a character entry system. The method is used for interrelating character strings so that incomplete input character strings can be completed by a selection of a presented character string. The approach involves computing contextual associations between multiple character strings based upon co-occurrence of character strings relative to each other in documents present in the character entry system, identifying at least one selectable character string from among the computed contextual associations that can complete the incomplete input character string in context (performed in response to inputting of a specified threshold of individual characters), and providing the identified at least one selectable character string to a user for selection.

The advantages and features described herein are a few of the many advantages and features available from representative embodiments and are presented only to assist in understanding the invention. It should be understood that they are not to be considered limitations on the invention as defined by the claims, or limitations on equivalents to the claims. For instance, some of these advantages or features are mutually exclusive or contradictory, in that they cannot be simultaneously present in a single embodiment. Similarly, some advantages are applicable to one aspect of the invention, and inapplicable to others. Thus, the elaborated features and advantages should not be considered dispositive in determining equivalence. Additional features and advantages of the invention will become apparent in the following description, from the drawings, and from the claims.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates, in simplified form, a top-level flowchart for the automatic completion of character input using contextual word associations;
FIG. 2a illustrates a simplified flowchart for computing contextual associations in one example implementation of the invention;
FIG. 2b illustrates a simplified flowchart for the selection of contextual associations in an example implementation of the invention;
FIG. 3 illustrates an example documents versus words matrix used to compute contextual associations with an example implementation of the invention;
FIG. 4a illustrates a simplified flowchart for computing contextual associations in an alternative example implementation of the invention;
FIG. 4b illustrates a simplified flowchart for the selection of contextual associations in an alternative example implementation of the invention;
FIG. 5 illustrates an example matrix of pairs of words used to compute contextual associations in the alternative example implementation of the invention; and
FIG. 6 illustrates an example set of word lists for a word completion example involving the alternative example implementation.

DETAILED DESCRIPTION OF THE INVENTION

The present invention can be used with a variety of electronic devices. The minimum requirements for any such device are some means for accepting textual input from a user, one or more processor(s) that execute stored program instructions to process the input, storage for the data and the program instructions and a display or other output device of some sort to make output visible or available to the user. Representative, non-exhaustive, example input devices can include, but are not limited to, a keyboard, a handwriting recognition system that makes use of a stylus, a touchpad, a telephone keypad, a pointing device like a mouse, joystick, trackball or multi-directional pivoting switch or other analogous or related input devices. The storage preferably includes non-volatile memory, and can also include volatile semiconductor-based memory, electro-magnetic media, optical media or other types of rewriteable storage used with computer devices. If a display is used, the display may be small and capable of displaying only text or much larger and capable of displaying monochrome or color images in addition to text. If another output device is used, like a text to speech converter, appropriate implementing equipment will be included. Although described, for purposes of clarity, with reference to keyboard type entry, it is to be understood that the present invention is independent of the particular mode of, or device used for, text data entry.

At the outset, it should be noted that, for the purposes of this invention, a "document" as used herein is intended to be a very general term covering one or more characters, whether alone or in conjunction with numerals, pictures or other items. A document's length can vary from a single "word" to any number of words and it can contain many types of data other than words (e.g. numbers, images, sounds etc.). Thus, ordinary documents such as pages of text are documents, but so are spreadsheets, image files, sound files, emails, SMS text messages etc.

As noted above, a "word," for the purposes of this invention, can be considered to be more than a string of alphabetic characters; it may include numeric and other symbols as well. Broadly, the invention provides contextual completion of character strings, where a character string includes not only alphabetic words but any other discrete collection of characters, symbols, or stroke based pictographs or ideograms, for example, those used in languages like Chinese, Korean and Japanese, and thus can benefit from use of the present invention. Thus, although for simplicity the term "word" is used in the following discussion, it should be understood to encompass any discrete collection of characters, symbols or other stroke based representations of communicative concepts, thoughts or ideas. Thus, the present invention, although described with reference to English, is independent of any particular language. It can be used for phonetic, pictographic or ideographic languages when the characters, pictograms or ideograms used therein (or "stroke" components thereof) are considered "words" and thereby are intended to be encompassed by the terms "text" and "textual." In some cases, an entire pictogram or ideogram will be usable as a "word" as described herein with entry of a component of the pictogram or ideogram, such as a defined stroke, being analogous to entry of a letter in English. Likewise, for simplicity in the following examples, the terms "typing" or "typed" are used to describe data entry. However, those terms should be broadly read to encompass any and all methods of data entry, whether involving entry through use of a keyboard, a pointing or selection device, a stylus or other handwriting recognition system, etc. They are not in any way intended to be limited only to methods that make use of a typewriter-like keyboard.

Examples of devices that can use and benefit from incorporation of the invention therein range from large computer networks, where an implementation of the invention may be part of or an application on the network, to small portable hand held devices of more limited or specialized function such as cell phones, text messaging devices and pagers. Implementations incorporating the invention can be used to assist users in interacting with large databases by helping in the entry of search terms or in data entry. Other implementations incorporating the invention are particularly useful for portable devices, in which the input device is limited by size and difficult to work with, because the automatic completion of character string entries provides greater benefits in such devices. Still other implementations incorporating the invention are particularly useful for devices used by those with physical handicaps. In addition to the methods of character input already mentioned, devices intended for use by handicapped individuals may rely on some type of pointing device to select individual characters for input. The pointing device may be controlled by movement of the eyes, head, hands, feet or other body part depending on the abilities of the particular individual. The present invention may also be used with "text" that is implemented in braille or other tactile representations for individuals with impaired vision.

In overview, in connection with the invention, words from one or more documents are associated, in either a fully or partially automated way, based on context. Context is derived from the co-occurrence of individual words in documents. In addition, the associations can be pre-computed and static or dynamic so they can thereby evolve and improve with continued use.

For example, in an implementation of the invention, an association between "finance" and "summary" may be generated but not one between "finance" and "sugar"; in this case, if a user has typed in the word "finance" followed by the characters "su", then, based on the association, the invention will suggest "summary" as the appropriate word completion rather than "sugar." Here, the word "finance" has provided the context that suggests the appropriate completion; if instead the user had typed "two spoons of" and then the characters "su" and if an association had been generated between "spoon" and "sugar" rather than "spoon" and "summary" then the invention would suggest "sugar" as the contextually appropriate completion. As more words are entered in the document, the contextual associations become richer.

The invention permits the use of different techniques for actually creating the associations. As a result, for purposes of understanding, two fully automated example techniques are described below with the understanding that semi-automatic implementation techniques are considered to be literally the same as the fully automated ones. The automatic or manual nature of a technique is, in most respects, independent of the
invention because it relates more to the ease of processing large amounts of text, not the technique itself.

The general approach is illustrated, in simplified overview, in FIG. 1 with respect to a single document. The approach begins with a device such as a personal digital assistant, cell phone, computer or other device (100, 110, 120 or 130) which has documents (140) stored in its memory. These documents are used to create associations (150) between pairs of words or character strings within the document and use these associations to suggest word or character string completions (170) to the user entering text (160) in a document. The associations among the words or strings may be static or dynamic. With implementations incorporating a more dynamic approach, as the user adds to a document or creates more documents on the device, the associations are recomputed or suitably augmented. This will alter the set of associations by either adding new associations, deleting existing associations or both. Thus, with implementations of the automatic contextual word completion system having this "dynamic" aspect, the system evolves as the user adds to or creates new documents and thus generally improves with use. Extensions to these implementations further allow the device to impliedly track the user's evolving interests.

Associations between words can be computed in a variety of ways and, as non-limiting examples, two alternative automatic methods of doing so are described.

In the first method, the first step is to assess the similarity of words within one document or from one document to other documents that may exist on the user's device. In this method, contextual associations are arrived at by grouping documents based on similarity and creating lists of words that are common to each group. There are many known methods to assess document similarity including the Jaccard, Dice or cosine coefficients and the K-vec methods. For purposes of explanation, one such example similarity assessment method, based on treating documents as vectors in a multidimensional space, is used, it being understood that, depending on the particular implementation, other similarity assessment methods can be used in addition to, or instead of those used in the examples described herein for practical reasons.

This example method is outlined in the flowcharts in FIGS. 2a and 2b. The method starts by creating a list of all the pertinent documents (200) on the device. From this list of pertinent documents a list of unique words is created (205). An optional step is to remove stop words from the word list (210). Stop words are described in greater detail below but include words like "the," "at" and "in." For each word in the word list, the number of times it occurs in each document is counted (215) and this number is stored in a matrix of documents vs. words (220). This matrix is used to calculate a similarity value (225) for each possible pair of documents in the document list. The similarity value for each document pair is compared to a threshold value (230) and those document pairs whose similarity value falls below the specified threshold value are discarded (235). The remaining document pairs are used to group documents such that the similarity value of each possible pair in each group is above a specified threshold value (240). Lists of unique words from each group of similar documents are created (245). Words within each of these lists are contextually related. The steps of the example method to this point may be carried out independently of user text entry or, in implementations where the dynamic aspects of the invention are utilized, carried out simultaneously with user text entry, so that the contextual associations are updated as the user enters more words into the device.

Once at least an initial set of contextual associations exists, it can be used at some point thereafter. The approach to use is as follows. The device accepts character input from the user until a specified threshold number of characters has been entered (250). Using the entered characters, relevant word lists are identified (255). Due to the processing, the words within these identified lists are deemed contextually related and thus, words in the identified lists having a corresponding initial character string matching the entered characters are chosen (260) to be offered for selection by the user to complete the character entry (265).

The above referenced process can be fully understood by way of the following simplified example. To assess the similarity or dissimilarity of documents, one way of thinking of a document that contains one or more words is as a bag or multiset of words. A bag or multiset is like an ordinary set in mathematics, a collection, except that it can contain multiple occurrences of the same element. For example, {book, cape, pencil, book} is a bag containing four words of which the word "book" appears twice. The order of occurrence of elements in a bag does not matter, and could equally be written as {book, book, pencil, cape}. Also, any bag can be converted to a set just by dropping multiple occurrences of the same element. Thus, the example bag above, when converted to a set, would be {book, cape, pencil}. To create the bag or multiset, the contents of a document with the exception of numbers which are a special case are stripped of all internal structure (e.g. syntactic structure, punctuation etc.) including all non-lexical items like images, sounds etc. The resulting stripped document would be a bag or multiset of words as described above which may also include numbers and in which some words may occur multiple times. For a user who has a device with a number of stored documents, each pertinent document is similarly stripped down to form bags and the mathematical union of these bags can be taken to form a larger bag.

As a side note, optionally, a certain class of words, typically called "stop words," are removed from such document-derived bags. Stop words are words like "the," "of," "in" etc. and are removable because they usually are not very informative about the content of the document. Stop words, if removed, can be removed from the bags either before or after a mathematical union of the bags is made, as the end result is the same. Typically stop words are identified in a list which can be used for the exclusion process. Since the stop word removal process is well known it is not described herein. In addition, in some implementations where a stop word list is used, the list may be editable so that additional words can be defined as "stop words." Examples are otherwise non-trivial words that are trivial in the particular context because they occur too often in that context (e.g. words like "shares" in stock related government filings).

By way of simplified example (FIG. 3), if the user has just two documents on a device: "d1" (306) made up of "an apple, apple cider and an orange" and "d2" (308) made up of "a paper apple" then, each corresponding bag is {apple, cider, apple, orange} and {paper, apple}. Their union is the larger bag {apple, cider, apple, orange, paper, apple} and a set for the bag would be {apple, cider, orange, paper}.

A matrix (300) is then formed with, for example, each element in the set of words derived from the documents on the user's device listed along the columns (302) of the matrix and each document itself (symbolized in some way) along the rows (304) of the matrix. In the cell corresponding to the intersection of a document "d" with a word "w," the number of times "w" occurs in "d" is entered (318). For the simple example above, as shown in FIG. 3, for the cell corresponding to the intersection of the row for the first document "d1" and the column for the word "apple" a "2" (318) is entered since it occurs twice in document "d1." This occurrence frequency information is obtained from the document bags. If a word does not occur in a particular document at all, a zero is entered in the corresponding cell. Note that depending upon the number of documents and the number of words, the size of the matrix can be exceedingly large. Moreover, there is no significance to whether rows list documents and columns list words or vice versa; the contents of the rows and columns could be exchanged without affecting the invention.

Once the matrix is created, each document is treated as a vector in a multidimensional Euclidean space, with the number of dimensions being the number of words or columns of the matrix. Thus, in the simplified example of FIG. 3, each of documents d1 and d2 can be treated as a four dimensional vector since there are four elements in the corresponding set {apple, cider, orange, paper}. Notably, by using this approach, the words can al [...]

[...] approaches, such as Jaccard, Dice or cosine coefficients, the K-vec methods or some other method, is a judgment of similarity and dissimilarity of document pairs in the pertinent document collection or set.

This similarity judgment is then used to form groups of documents, each group of which contains only documents that are sufficiently similar to one another when compared in a pairwise fashion. Note that the relationship of similarity is reflexive and symmetric, but it is not necessarily transitive. This means that the groups may not be disjoint, i.e. the same document may belong to more than one group, particularly in implementations where a document need not be sufficiently similar to every other member of the group, but only some specified portion thereof. In other words, as a result of the grouping, two or more groups will be formed wherein each document is meaningfully similar to at least some specified portion of the other documents in that group. In general, each [...]
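
The bag, union, set and matrix construction from the "d1" and "d2" example can be reproduced with Python's `collections.Counter`, which behaves as a multiset. This is an illustrative sketch only: the stop word list is an assumption chosen so that "a", "an" and "and" drop out as they do in the example bags, and the helper name `to_bag` is hypothetical.

```python
from collections import Counter

# Assumed stop words, chosen to match the bags given in the example.
STOP_WORDS = {"a", "an", "and"}

def to_bag(text):
    """Strip punctuation and stop words, returning a bag (multiset) of words."""
    words = (w.strip(".,") for w in text.lower().split())
    return Counter(w for w in words if w and w not in STOP_WORDS)

d1 = to_bag("an apple, apple cider and an orange")  # bag {apple, cider, apple, orange}
d2 = to_bag("a paper apple")                        # bag {paper, apple}

# The union described in the text keeps every occurrence, so it is the
# additive multiset union (Counter "+"), not the maximum-based "|".
union = d1 + d2

# Dropping repeated occurrences turns the bag into a set; sorting fixes
# a column order for the matrix.
vocabulary = sorted(union)

# Documents-versus-words matrix: one row per document, one column per word.
# Indexing a Counter with an absent word yields 0, which fills the empty cells.
matrix = [[bag[word] for word in vocabulary] for bag in (d1, d2)]
```

Here `vocabulary` is `['apple', 'cider', 'orange', 'paper']` and `matrix` is `[[2, 1, 1, 0], [1, 0, 0, 1]]`, so the cell for "d1" and "apple" holds 2 and each document is a four dimensional vector, in line with the example.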
