`VMware - Exhibit 1014
`VMware v. IV I - IPR2020-00470
`Page 1 of 435
`
`

`

`Create and effectively manage agents and explore their effects on the Internet
`
`
`

`

`Internet Agents:
`Spiders, Wanderers, Brokers, and 'Bots
`
`Fah-Chun Cheong
`
`New Riders Publishing, Indianapolis, Indiana
`
`
`

`

`
`Internet Agents: Spiders, Wanderers, Brokers, and 'Bots
`By Fah-Chun Cheong
`
`Published by:
`New Riders Publishing
`201 West 103rd Street
`Indianapolis, IN 46290 USA
`
`All rights reserved. No part of this book may be reproduced or transmitted
`in any form or by any means, electronic or mechanical, including
`photocopying, recording, or by any information storage and
`retrieval system, without written permission from the publisher, except
`for the inclusion of brief quotations in a review.
`
`Copyright © 1996 by New Riders Publishing
`
`Printed in the United States of America 1 2 3 4 5 6 7 8 9 0
`
`CIP Data Available upon Request
`
`Warning and Disclaimer
`This book is designed to provide information about Internet agents.
`Every effort has been made to make this book as complete and as
`accurate as possible, but no warranty or fitness is implied.
`
`The information is provided on an "as is" basis. The author and New
`Riders Publishing shall have neither liability nor responsibility to any
`person or entity with respect to any loss or damages arising from
`the information contained in this book or from the use of the disks or
`programs that may accompany it.
`
`Publisher
`Don Fowley
`Publishing Manager
`Jim LeValley
`Marketing Manager
`Ray Robinson
`Managing Editor
`Tad Ringo
`
`Product Development Specialist
`Julie Fairweather
`Development Editor
`Suzanne Snyder
`Production Editor
`Cliff Shubs
`Copy Editors
`Amy Bezek, Fran Blauw, Gail Burlakoff, Laura Frey, Lisa Wilson
`Associate Marketing Manager
`Tamara Apple
`Acquisitions Coordinator
`Tracy Turgeson
`Publisher's Assistant
`Karen Opal
`Cover Designer
`Jay Corpus
`Cover Illustrator
`Roger Morgan
`Book Designer
`Sandra Schroeder
`Manufacturing Coordinator
`Paul Gilchrist
`Production Manager
`Kelly D. Dobbs
`Production Team Supervisor
`Laurie Casey
`Graphics Image Specialists
`Jason Hand, Clint Lahnen, Laura Robbins, Craig Small, Todd Wente
`Production Analysts
`Angela D. Bannan, Bobbi Satterfield
`Production Team
`Heather Butler, Dan Caparo, Kim Cofer, Kevin Foltz, Erika Millen, Erich J. Richter, Christine Tyner, Karen Walsh
`Indexer
`Christopher Cleveland
`
`
`

`

`About the Author
`Fah-Chun Cheong consults with start-up companies
`around the San Francisco Bay Area in the
`application of agent technologies for electronic
`commerce on the World Wide Web and Internet.
`
`Mr. Cheong received his B.S. in Electrical Engineering
`from The University of Texas at Austin in
`1986, and his M.S. and Ph.D. degrees in Computer
`Science from the University of Michigan in
`1988 and 1992, respectively. His Ph.D. research
`work is on the design and development of an experimental
`agent-oriented programming language
`and compiler system for heterogeneous distributed
`computing environments. He founded Agent
`Computing, Inc. in 1994, with a vision to develop
`innovative application-specific agent technologies
`for the Internet.
`
`Trademark Acknowledgments
`All terms mentioned in this book that are known to
`be trademarks or service marks have been appropriately
`capitalized. New Riders Publishing cannot
`attest to the accuracy of this information. Use of a
`term in this book should not be regarded as affecting
`the validity of any trademark or service mark.
`
`Dedication
`To my parents and sisters
`
`Acknowledgments
`This book might not have been written (well, at
`least not in 1995!) if Vinay Kumar had not invited
`me along to a dinner earlier this year at a sushi
`place in San Francisco with Jim LeValley, Publishing
`Manager for New Riders Publishing. I thank
`him for that and for the many interesting and insightful
`discussions on a variety of topics we have
`had over many cups of espresso.
`
`A very big thank you to Kevin Hughes for reviewing
`drafts of this book. I am grateful to ex-colleagues
`at EIT and ex-EIT friends, especially
`Jeff Pan and Jim McGuire, for information in a
`variety of areas, most notably procurement
`agents, Web robots, and secure HTTP.
`
`I would like to thank all the people on the Internet
`whose pioneering work in agents, spiders, wanderers,
`and Web robots has made an early book
`on this topic a possibility. Special thanks to all the
`authors of Web robots, spiders, and wanderers
`who have answered e-mail questionnaires on
`Internet agents; their insightful comments and
`responses have contributed much toward shaping
`the content of this book.
`
`I am indebted to Roy Fielding for his libwww-perl
`and MOMspider source code, which, in a vastly simplified
`form, have now become the basis upon which
`WebWalker is built. Many thanks to Bruce Krulwich
`whose Bargain Finder agent on the Web inspired the
`development of WebShopper for this book.
`
`Martijn Koster has authored and maintained a
`number of marvelous Web pages on the net.
`Among his creations, I have found the List of Robots
`a comprehensive reference and an invaluable
`resource for much of this book.
`
`The Stanford Libraries have proved invaluable to
`me on this project, as on others. I am extremely
`grateful that Stanford opens its Mathematics and
`Computer Science Library, and also the Engineering
`Library, to the surrounding community at large.
`
`A very big thank you to the friendly, competent,
`and generally fantastic editorial staff at New Riders
`who prepared this book for publication. I am
`indebted to Jim LeValley for taking an interest in
`Internet agents, coming up with an initial plan for
`this book, and supplying me continuously with an
`unending stream of helpful sources and materials.
`I am especially thankful to Julie Fairweather
`for developing the book, coordinating the process
`to keep publication on schedule, and for helping
`with numerous screen-shots of the Web. Special
`thanks to Cliff Shubs for his excellent editing and
`his many thoughtful remarks on the book, and to
`Suzanne Snyder for helping with the development
`of the book. Many thanks go to Roger Morgan for
`designing the great spider on the front cover.
`
`
`

`

`Contents at a Glance
`
`Part I: Introduction
`1 The World of Agents
`2 The Internet: Past, Present, and Future
`3 World Wide Web: Playground for Robots
`
`Part II: Web Robot Construction
`4 Spiders for Indexing the Web
`5 Web Robots: Operational Guidelines
`6 HTTP: Protocol of Web Robots
`7 WebWalker: Your Web Maintenance Robot
`
`Part III: Agents and Money on the Net
`8 Web Transaction Security
`9 Electronic Cash and Payment Services
`
`Part IV: Bots in Cyberspace
`10 Worms and Viruses
`11 MUD Agents and Chatterbots
`
`Part V: Appendices
`A HTTP 1.0 Protocol Specifications
`B WebWalker 1.00 Program Listing
`C WebShopper 1.00 Program Listing
`D List of Online Bookstores Visited by BookFinder
`E List of Online Music Stores Visited by CDFinder
`F List of Active MUD Sites on the Internet
`G List of World Wide Web Spiders and Robots
`
`Bibliography
`
`Index
`
`
`

`

`Table of Contents
`
`Part I: Introduction
`
`1 The World of Agents
`What are Agents?
`Agents and Delegation
`Personal Assistants
`Envoy Desktop Agents
`New Wave Desktop Agents
`Surrogate Bots
`Internet Softbots
`Agents and Coordination
`Conference-Support Agents
`Integrated Agents
`Communicative Agents
`Agents and Knowledge
`Teaching Agents
`Learning Agents
`Common-Sense Agents
`Physical Agents
`Agents and Creativity
`Creative Agents
`Automated Design Agents
`Agents and Emotion
`Art of Animation
`Artificial Intelligence
`The Oz Project
`Agents and Programming
`KidSim
`Oasis
`Agents and Society
`Control
`Over Expectations
`Safety
`Privacy
`Commercial Future of Agents
`Product Suites
`Mobile Computing
`Concluding Remarks
`
`2 The Internet: Past, Present, and Future
`Early Days of ARPAnet
`Notable Computer Networks
`Internet and NSFnet
`NSF and AUP
`Growth of the Internet
`How Big is the Internet?
`Internet Society, IAB, and IETF
`Information Superhighway and the National Information Infrastructure
`
`3 World Wide Web: Playground for Robots
`World Wide Web Development
`Growth of the Web
`Information Dissemination with the Web
`Innovative Uses of the Web
`Architecture of the World Wide Web
`Web Clients
`Web Servers
`Web Proxies
`Web Resource Naming, Protocols, and Formats
`URI and URL: Universal Resource Identifier and Locator
`Common URI Syntax
`URLs for Various Protocols
`Gopher and WAIS
`HTTP: HyperText Transfer Protocol
`Statelessness in HTTP
`Format Negotiations
`HTML: HyperText Markup Language
`Level of HTML Conformance
`HTML Tags
`Forms and Image maps: Enhanced Web Interactivity
`Fill-Out Forms
`Clickable Images
`Gateway Programming: Processing Client Input
`Gateway Program Interaction
`The Next Step: Agents on the Web
`Early Commerce Agents
`Web Agents of the Future?
`
`Part II: Web Robot Construction
`
`4 Spiders for Indexing the Web
`Web Indexing Spiders
`WebCrawler: Finding What People Want
`Searching with WebCrawler
`How WebCrawler Moves in Webspace
`Lycos: Hunting WWW Information
`Searching with Lycos
`Lycos' Search Space
`Lycos Indexing
`How Lycos Moves in Webspace
`Harvest: Gathering and Brokering Information
`Searching with Harvest
`Harvest Architecture
`WebAnts: Hunting in Packs
`WebAnts Motivation
`WebAnts Searching and Indexing
`Issues of Web Indexing
`Recall and Precision
`Good Web Citizenship
`Performance
`Scalability
`Spiders of the Future
`
`5 Web Robots: Operational Guidelines
`Web Robot Uses
`Web Resource Discovery
`Web Maintenance
`Web Mirroring
`Proposed Standard for Robot Exclusion
`Robot Exclusion Method
`Robot Exclusion File Format
`Recognized Field Names
`Sample Robot Exclusion Files
`The Four Laws of Web Robotics
`I. A Web Robot Must Show Identifications
`II. A Web Robot Must Obey Exclusion Standard
`III. A Web Robot Must Not Hog Resources
`IV. A Web Robot Must Report Errors
`The Six Commandments for Robot Operators
`I. Thou Shalt Announce thy Robot
`II. Thou Shalt Test, Test, and Test thy Robot Locally
`III. Thou Shalt Keep thy Robot Under Control
`IV. Thou Shalt Stay in Contact with the World
`V. Thou Shalt Respect the Wishes of Webmasters
`VI. Thou Shalt Share Results with thy Neighbors
`Robot Tips for Webmasters
`Web Ethics
`
`6 HTTP: Protocol of Web Robots
`Understanding HTTP Operation
`Messaging with HTTP
`Message Headers
`General Message Header Fields
`Request Message
`Method
`Request Header Fields
`Response Message
`Status Codes and Reason Phrases
`Response Header Fields
`Entity
`Entity Header Fields
`Entity Body
`Protocol Parameters
`HTTP Version
`Universal Resource Identifiers
`Date/Time Formats
`Content Parameters
`Media Types
`Character Sets
`Encoding Mechanisms
`Transfer Encodings
`Language Tags
`Content Negotiation
`Access Authentication
`
`7 WebWalker: Your Web Maintenance Robot
`The Web Maintenance Problem
`Web Infostructure
`Past Approaches
`Web Maintenance Spiders
`WebWalker Operation
`Processing Task Descriptions
`Avoiding and Excluding URLs
`Keeping History
`Traversing the Web
`Generating Reports
`Is WebWalker a Good Robot?
`WebWalker Limitations
`WebWalker Program Installation
`WebWalker Task File
`Global Directives
`Task Directives
`Task File Format
`WebWalker Usage Examples
`Sample WebWalker Output
`WebWalker Forms Interface
`WebWalker Program Organization
`External Library Calls
`WebWalker Program Call-Graph
`Configuration Section
`Avoidance Package
`History Package
`Traversal Package
`Summary Package
`Growing into the Future
`
`Part III: Agents and Money on the Net
`
`8 Web Transaction Security
`Concepts of Security
`Privacy: Keeping Private Messages Private
`Authentication: Proving You Are Who You Claim to Be
`Integrity: Ensuring Message Content Remains Unaltered
`Brief Tour of Classical Cryptography
`The Role of NSA
`Development of Data Encryption Standard (DES)
`Development of Public-Key Cryptography
`Problems with Secret Keys
`Key Management
`The RSA Alternative
`Comparing Secret-Key and Public-Key Cryptography
`Digital Signatures
`How Digital Signatures Work
`The Digital Signature Standard
`Key Certification
`Certifying Authority
`Certificate Format
`Two Approaches to Web Security
`Secure Socket Layer (SSL)
`Secure HTTP (S-HTTP)
`Current Practice and Future Trend in Web Security
`
`9 Electronic Cash and Payment Services
`Brief History of Money
`Choice of Payment Methods
`What is Digital Cash?
`Digital Cashier's Check
`Anonymous Digital Cash through Blind Signatures
`Ecash from DigiCash
`Ecash Security and Other Issues
`Payment Systems on the Internet
`U.S. Payment Systems Today
`CyberCash Internet Payment Service
`Information Commerce on the Internet
`Economics of Information Commerce
`First Virtual Payment System
`The Future
`
`Part IV: Bots in Cyberspace
`
`10 Worms and Viruses
`Short History of Worms
`The First Worm
`The Christmas Tree Worm
`The Internet Worm
`Anatomy of the Internet Worm
`Method of Worm Attack
`Method of Worm Defense
`What Does the Worm Not Do?
`Brief History of Viruses
`Types of Viruses
`Boot-Sector Infectors
`File Infectors
`PC Virus Basics
`Viral Activation in the Boot Process
`Step One: ROM BIOS Routines Execution
`Step Two: Partition Record Code Execution
`Step Three: Boot-Sector Code Execution
`Step Four: IQ
