`Simon St. Laurent
`
`Twitter-Google Exhibit 1046
`Page 1 of 46
`
`
`
`McGraw-Hill
A Division of The McGraw-Hill Companies
`
`Copyright © 1998 by The McGraw-Hill Companies, Inc. All rights reserved. Printed in
`the United States of America. Except as permitted under the United States Copyright Act
`of 1976, no part of this publication may be reproduced or distributed in any form or by
`any means, or stored in a data base or retrieval system, without the prior written permission
`of the publisher.
`The views expressed in this book are solely those of the author, and do not
`represent the views of any other party or parties.
`1234567890 DOQDOC 90201987
`
`ISBN 0-07-050498-9
`The sponsoring editor for this book was Michael Sprague and the production
`supervisor was Claire Stanley. It was set in Janson Text by Patricia Wallenburg.
Printed and bound by R. R. Donnelley & Sons Company.
`McGraw-Hill books are available at special quantity discounts to use as premiums and
`sales promotions or for use in corporate training programs. For more information, please
write to the Director of Special Sales, McGraw-Hill, 11 West 19th Street, New York, NY
`10011. Or contact your local bookstore.
`
`Information contained in this work has been obtained by The McGraw-Hill Companies,
Inc. ("McGraw-Hill") from sources believed to be reliable. However, neither McGraw-Hill
nor its authors guarantee the accuracy or completeness of any information published
herein and neither McGraw-Hill nor its authors shall be responsible for any errors, omissions,
or damages arising out of use of this information. This work is published with the
`understanding that McGraw-Hill and its authors are supplying information but are not
`attempting to render engineering or other professional services. If such services are
`required, the assistance of an appropriate professional should be sought.
`
`This book is printed on recycled, acid-free paper containing
`a minimum of 50% recycled de-inked fiber.
`
`
`
`
`Cookie Anatomy
`
Before we continue our examination of the pros and cons
of cookies, we need to take a detailed look at their contents.
The information contained in most cookies is trivial,
but is still enough to make programmers' and marketers'
dreams conceivable and give privacy advocates fits. Despite
their tremendous power, cookies perform these grand tasks
using only a tiny amount of information, making this opening
tour quite brief.
`
`Chapter Two
`
Remember, cookies are not supported the same way in all browsers.
Throughout the rest of this book, we will be exploring the Netscape and
Microsoft implementations of cookies, but not all browsers implement
those particular models of this technology. Lynx, for instance, implements
cookies (including much of the new RFC 2109 cookies) but "gobbles"
all of them when the user exits the program. Some browsers you
might not expect to support cookies, like those created by Spyglass for
embedding into appliances, also support cookies this way. Older
browsers, like Mosaic and the early AOL browsers, do not support cookies
at all.
`
`Looking into Cookies
`
After several years of remarkable stability, the cookie standard is in flux.
Netscape originally created cookies, but is handing them over to a standards
body. RFC 2109, a proposed IETF standard, will transform some of
the basic mechanisms of cookies and add extra features to the current standard.
Most of the examples in this book will use the older Version 0
(Netscape) cookies, as RFC 2109 is not yet widely supported, but notes
along the way will point out ways to improve your cookie development
with RFC 2109. In this section, we will cover both kinds of cookies, starting
with the current but older standard. For now we will cover only the
contents of cookies; tools for creating and managing them will get full
treatment in subsequent chapters.

The two varieties of cookie have much in common, and provide similar
services. Browsers and servers that can handle RFC 2109 cookies still work
with the older versions as well. To maintain compatibility with the widest
range of browsers, the authors of RFC 2109 recommend using both kinds
of cookies and allowing the browser to decide which to use. Later we will
examine techniques that developers can use to manage this transition. For
now, we will cover the contents of both kinds of cookies to give you a sense
of where cookies are now and where they're headed.

RFC 2109 has been published, but the proposals it makes are receiving
new revisions that reflect difficulties, both technical and political, vendors
have had in implementing the standard. RFC stands for Request for Comments,
which is something of a misnomer. While most standards begin as proposals
and have a name change to standards at some point, an RFC remains
an RFC even after the commenting process is complete. If RFC 2109
receives two implementations from different vendors, i.e. a compatible client
and a server, then it will be given the more prestigious title of Internet standard.
Even if it achieves that lofty position, however, there is no requirement
that vendors implement the new standard. RFCs provide the detailed standards
for much of the basic infrastructure of the web. Unfortunately, they are
not always as stable as other standards, and can be superseded or made obsolete
by later RFCs. In RFC 2109's case, it is already being supplanted by
working drafts, which may eventually turn into a new and improved RFC,
and, with any luck, be implemented in the mainstream browsers.
`
`Cookies Today: Version 0
`
Version 0 (Netscape) cookies have six parts: name, value, domain, path,
expires, and a secure property that determines whether the cookies can be
transferred unencrypted. Each cookie is supposed to be limited to 4K of
information. Not all browsers enforce the 4K limit, but development
beyond that point is not recommended even if it is possible because of performance
issues. Uploading and downloading a 20K cookie would definitely
annoy the average user on a modem connection. If you really need to push
the envelope, your site could create and use as many as 20 cookies, all of
which were smaller, but there are usually easier ways to manage information.
`
Table 2-1  Structure of a Version 0 (Netscape) Cookie

Part      Value
Name      Value
Domain    domain name
Path      path information
Expires   date (in GMT)
Secure    No value. Cookie is transmitted securely if attribute is listed.

Name
`
The name is a sequence of characters that uniquely identifies the cookie.
The name is required, and cannot contain whitespace, semicolons, or commas.
If you create two cookies with the same domain, path and name, the
cookie that was there first will be obliterated by the newcomer.
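
As a rough illustration of the replacement rule, here is a minimal Python sketch. The dictionary-based jar and the `store_cookie` helper are hypothetical names invented for this example; they do not describe how any particular browser stores its cookies. The only assumption is that a cookie's identity is the combination of domain, path, and name.

```python
# Hypothetical in-memory cookie jar; real browsers keep cookies in files.
jar = {}

def store_cookie(jar, domain, path, name, value):
    """Store a cookie, replacing any earlier cookie that has the
    same domain, path, and name."""
    jar[(domain, path, name)] = value

store_cookie(jar, ".mydomain.com", "/", "visits", "1")
store_cookie(jar, ".mydomain.com", "/", "visits", "2")       # obliterates the first
store_cookie(jar, ".mydomain.com", "/store", "visits", "7")  # different path, kept separately
```

After these three calls the jar holds two cookies: the newer `visits` cookie for `/` and the separate one for `/store`.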
`
`Value
`
Once you have gotten past the required cookie header material, this is the
area developers can use to store information. A value is also required, and
cannot contain whitespace, semicolons, or commas. Many times developers
will use escape encoding to get around this restriction; see Chapters 6 and 7
for encoding techniques in CGI and JavaScript applications.
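
The sketch below shows one way escape encoding might look, assuming that URL-style percent-encoding is the scheme in use. It relies on Python's standard `urllib.parse` module; the function names `encode_cookie_value` and `decode_cookie_value` are made up for this illustration and are not the techniques presented in Chapters 6 and 7.

```python
from urllib.parse import quote, unquote

def encode_cookie_value(text):
    """Percent-encode spaces, semicolons, commas, and other special
    characters so the result is safe to use as a cookie value."""
    return quote(text, safe="")

def decode_cookie_value(encoded):
    """Reverse encode_cookie_value()."""
    return unquote(encoded)

raw = "Simon St. Laurent; cookies, etc."
encoded = encode_cookie_value(raw)
restored = decode_cookie_value(encoded)
```

The encoded form contains no whitespace, semicolons, or commas, and decoding it returns the original string.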
`
`Domain
`
Only the pages from the domain which created a cookie are supposed to be
allowed to read that cookie. Microsoft cannot read Netscape's cookies; IBM
cannot read cookies from Dell. This field contains the domain name from
which the cookie was created. Cookies from IBM's top-level domain will
have a domain value of .ibm.com. As we'll see later, not all cookies actually
come from pages in the domain listed, but as a general rule this simple protection
can at least keep unwanted readers out of the cookie jar.

By default, the domain is set to the full domain name of the Web server that
made the request or provided the page that created the cookie. A developer can
use a smaller portion of the domain name to share cookies among several
servers sharing a top-level domain. For instance, myserver.mydomain.com
could request that the domain for its cookies be .mydomain.com instead of
the full .myserver.mydomain.com. This way hisserver.mydomain.com and
herserver.mydomain.com could both read and use those cookies. You cannot,
however, request that a cookie have a domain of .com or a similar top-level
domain; that would make it too easy to create cookies completely open to prying
eyes. The domain limitations on cookie access are shown in Figure 2-1.
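
The domain check described above amounts to comparing the tail of the host name with the cookie's domain. The following Python sketch illustrates the idea; `domain_matches` is a hypothetical helper written for this example, not code from any browser, and it assumes the domain is compared without regard to case.

```python
def domain_matches(host, cookie_domain):
    """Return True if `host` falls inside `cookie_domain`.

    `cookie_domain` is normally written with a leading period,
    for example ".mydomain.com"; one is added if it is missing.
    """
    host = host.lower()
    cookie_domain = cookie_domain.lower()
    if not cookie_domain.startswith("."):
        cookie_domain = "." + cookie_domain
    # Prefixing the host with a period lets "mydomain.com" itself match
    # while keeping look-alikes such as "evilmydomain.com" out.
    return ("." + host).endswith(cookie_domain)
```

With this check, hisserver.mydomain.com and herserver.mydomain.com can both read a .mydomain.com cookie, while a server in .yourdomain.com cannot.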
`
[Figure 2-1: How Cookie Access can be Limited by Domains and Subdomains. Documents: .myplace.mydomain.com. Cookies: .mydomain.com, .myplace.mydomain.com, .yourdomain.com.]
`
All domain names are prepended with periods to meet a requirement
Netscape created for security: all domains must include at least two
periods (for domains in the top-level .com, .mil, .gov, .edu, .org, .net,
and .int domains) and possibly three (for all other domains). This prevents
developers from creating generic .com cookies or even .ny.us cookies
that can be read by a large number of servers.
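
The period-counting rule can be expressed in a few lines. This Python sketch is an illustration only; the names `TWO_PERIOD_TLDS` and `has_enough_periods` are invented here, and the logic assumes the rule is exactly as stated in the note above.

```python
# The seven top-level domains that need only two periods.
TWO_PERIOD_TLDS = {"com", "mil", "gov", "edu", "org", "net", "int"}

def has_enough_periods(cookie_domain):
    """Apply the two-or-three-period rule to a domain such as ".ibm.com"."""
    tld = cookie_domain.rsplit(".", 1)[-1].lower()
    required = 2 if tld in TWO_PERIOD_TLDS else 3
    return cookie_domain.count(".") >= required
```

Under this rule .ibm.com passes, while .com and .ny.us are both rejected.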
`
`Path
`
[Figure 2-2: How Cookie Access can be Limited by Path]

The Path value is similar to the domain, but restricts cookie usage within a
site. By default, the path is set to the path for the page that creates the
cookie. Only pages in the path specified by this part of the cookie can read
or set the cookie. Suppose a developer creates a mall site that contains
many different stores. All of the stores use the same shopping cart software,
but do not want to share shopping carts because they need or want to have
their own private cash registers. Each store could receive its own private
cookie, kept separate from the others by a path, while sharing the general
user information stored in a cookie for the site without a path specified. If
the domain were mymall.com, then there could be one cookie for
mymall.com with no path specified, and separate cookies for the paths
/storea and /storeb. Store A could not read Store B's cookies at all,
allowing competing shops to share a web server politely, but both stores
could get general information from the cookie without a specified path.
Figure 2-2 shows how path information can restrict cookie access.
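
The mall example can be sketched in Python as a simple prefix test. Everything named here (`path_matches`, `cookies_for`, the `mall` dictionary, and the cookie names) is hypothetical, and the site-wide cookie is modeled with a path of / as an assumption made for this example.

```python
def path_matches(request_path, cookie_path):
    """Return True if the cookie's path is a prefix of the requested path."""
    return request_path.startswith(cookie_path)

def cookies_for(request_path, cookies):
    """List the cookies visible to a page; `cookies` maps path -> cookie name."""
    return sorted(name for path, name in cookies.items()
                  if path_matches(request_path, path))

# One shared cookie for the whole mall, one private cookie per store.
mall = {"/": "shopper", "/storea": "cart_a", "/storeb": "cart_b"}
store_a_view = cookies_for("/storea/checkout.html", mall)
store_b_view = cookies_for("/storeb/index.html", mall)
```

Each store sees its own cart plus the shared `shopper` cookie, and neither sees the other's cart. Note that a plain prefix test also lets a path such as /storeabc match /storea, so store paths need to be chosen with that in mind.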
`
[Figure 2-2 diagram labels: Documents; Cookies at paths /, /code, /code/samples, /code/samples/]
`
You should always set the path explicitly if you specify an expiration
date for a cookie. Set the path to / if necessary. Netscape 1.1 will delete
all cookies without a specified path when the user exits the browser.
`
`Expires
`
Much like real cookies, browser cookies don't last forever. Some cookies
disappear when the user quits the browser, while others can hang around
for months or years. Most of the cookies detested by privacy advocates are
of the more permanent variety, allowing developers to track users between
sessions. This variable holds the expiration date for the cookie, expressed in
Greenwich Mean Time (GMT) in the format:

Wdy, DD-Mon-YYYY HH:MM:SS GMT

Wdy is the weekday (optional), DD is the date, Mon is the month, YYYY is
the year, HH is the hour (in 24-hour time), MM is the minute, and SS is the
second. If you leave this value blank, most browsers will hold on to the
cookie for the duration of the session.
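
A date in this format can be produced from a Unix timestamp with a few lines of Python. This is a sketch rather than a reference implementation; `format_expires` is a name invented for this example, and the day and month abbreviations are spelled out by hand so the result does not depend on the machine's locale.

```python
import time

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def format_expires(epoch_seconds):
    """Format a Unix timestamp as Wdy, DD-Mon-YYYY HH:MM:SS GMT."""
    t = time.gmtime(epoch_seconds)  # always GMT, never local time
    return "%s, %02d-%s-%04d %02d:%02d:%02d GMT" % (
        WEEKDAYS[t.tm_wday], t.tm_mday, MONTHS[t.tm_mon - 1],
        t.tm_year, t.tm_hour, t.tm_min, t.tm_sec)

# A cookie meant to last 30 days from now:
thirty_days = format_expires(time.time() + 30 * 24 * 60 * 60)
```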
`
`Some browsers, including Lynx, will delete all cookies when the user
`quits the browser no matter what the expires variable requests.
`
`Secure
`
The Secure option allows developers to create cookies that are encrypted in
transit, using HTTPS, SSL, or another means of providing security for all
communications about this cookie between the server and the browser.
This attribute has no value; unless it appears when the cookie is created, the
cookie will travel without any security.

At least in current browsers, the Secure option has no effect on the way
cookies are stored on a user's machine. They remain unencrypted and
available to anyone with access to the cookie files. Developers who need
to keep cookie information away from prying eyes must implement their
own encryption schemes.
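
To pull the six parts of a Version 0 cookie together, the following Python sketch assembles them into a single Set-Cookie header line. The function name `build_set_cookie` and the sample values are hypothetical; the attribute order shown is one reasonable choice, not a requirement.

```python
def build_set_cookie(name, value, domain=None, path=None,
                     expires=None, secure=False):
    """Assemble a Version 0 Set-Cookie header line from its six parts."""
    parts = ["%s=%s" % (name, value)]
    if expires:
        parts.append("expires=" + expires)
    if path:
        parts.append("path=" + path)
    if domain:
        parts.append("domain=" + domain)
    if secure:
        parts.append("secure")  # no value: its presence is the flag
    return "Set-Cookie: " + "; ".join(parts)

header = build_set_cookie("cart", "12345",
                          domain=".mymall.com", path="/",
                          expires="Thu, 31-Dec-1998 23:59:59 GMT",
                          secure=True)
```

Only the name and value are required; leaving out the other arguments yields a session cookie that falls back on the default domain and path.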
`
`
Cookies Tomorrow: Version 1 and RFC 2109

Although the cookie technology in Version 0 is primitive, it has proven capable
of a wide variety of tasks. RFC 2109 proposes to extend cookie abilities and
provide extra information that will make it easier for users to manage cookies.
RFC 2109 is in flux and there is no browser yet available that supports all of its
features, but the basic structures are worth exploring. The general structure is
similar to that of the original (Netscape) cookie but many of the details are different.
The following outline is an introduction to the improved cookie; later
chapters will provide more depth on how to apply these tools.
`
Table 2-2  Structure of Version 1 Cookies

Part         Value
Name         Value
Comment      Text that will be displayed to users considering whether to
             accept cookie
CommentURL   Site with more information on cookie
Discard      No value. Cookie will be discarded upon browser quit if present.
Domain       Originating domain name
Max-Age      Suggested lifetime of the cookie, in seconds from time of creation
Path         Path information for cookie
Port         Port information for cookie (0-65535)
Secure       No value. Cookie will be transmitted securely if present.
Version      Should be 1 in all RFC 2109 cookies

Name
`
As it was in the older standard, the name is a sequence of characters that
uniquely identifies the cookie. The name is required, and cannot contain
whitespace, semicolons, or commas. If you create two cookies with the
same domain, path, and name, the cookie that was there first will be obliterated
by the newcomer. The only innovation here is that names beginning
with the dollar sign ($) are reserved and may not be used by applications.
These$4U is an acceptable name, but $4U is not.
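
The naming rules are simple enough to check mechanically. This Python sketch is only an illustration; `is_valid_cookie_name` and `FORBIDDEN` are names invented for the example.

```python
# Characters a cookie name may not contain: whitespace, semicolon, comma.
FORBIDDEN = set(" \t\r\n;,")

def is_valid_cookie_name(name):
    """Check a Version 1 cookie name against the rules described above."""
    if not name:
        return False              # the name is required
    if name.startswith("$"):
        return False              # leading dollar sign is reserved
    return not any(ch in FORBIDDEN for ch in name)
```

A dollar sign is rejected only at the start of the name, which is why These$4U passes and $4U does not.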
`
`Value
`
A value is a required part of the cookie. RFC 2109 does not explicitly
restrict the use of semicolons and whitespace, but developers will probably
find it easier to stick to the old encoding for easier compatibility with the
older cookie standards.
`
`Comment
`
The Comment attribute is an innovation that allows developers to tell users
why they are creating cookies. Although it is optional, I strongly recommend
that developers use the comment to provide clear information about
the purpose behind a cookie. If users have set their browsers to ask them
before accepting cookies, this message will appear in the dialog box. The
comment can give users a clearer picture of the cookie they're considering,
and perhaps the comment will be persuasive.
`
`CommentURL
`
CommentURL gives developers an opportunity to provide users with a
more detailed explanation of what a cookie is supposed to do. If a user
needs more information than the basic comment provides, and the browser
displays this URL, the user can visit the site to get a clearer picture of why
a particular cookie is attractive. It is unclear how many developers want to
give away the secrets of their trade, or how many users will be interested in
reading Web programming diagrams, but it is certainly an additional
opportunity to communicate with users. Also, be sure that the site referenced
by the CommentURL doesn't hand out cookies itself, or users will end
up negotiating a maze of dialog boxes to reach your explanation.
`
`Discard
`
If the Discard attribute is present, the browser will discard the cookie when
the program exits, no matter what the Max-Age of the cookie may be. This
is very useful for creating cookies that could last for hours in certain cases (a
user working on a project through a browser for a long time) without accidentally
leaving the cookie behind when the work is complete. In this version
of cookies, the expiration time is supposed to be enforced, and
browsers are expected to clean up their cookies as the cookies expire. Most
browsers using Version 0 only discarded expired cookies when the browser
exited; Discard simply mandates that they discard the cookie at browser
exit. The Discard attribute, in combination with Max-Age, defines the lifespan
of a cookie more precisely, giving programmers more control over the
life of their data.
`
`
`Domain
`
`
The Domain attribute works the same way it did in Version 0. The default
domain is the domain name from which the cookie came, though the script
setting the cookie may specify a subsection of that domain name:
www.foozball.tablesport.com could specify a cookie for .tablesport.com.
As in Version 0, the domain is stored with a period placed at the front, if
the period wasn't specified, to make it more difficult to create generic cookies
that can be accessed by multiple domains.
`
`Max-Age
`
Max-Age performs a function similar to that of Expires in the old standard,
but with a clearer, stronger set of rules. Instead of providing an expiration
date, Max-Age gives the browser a lifespan for the cookie measured in seconds.
If Max-Age is set to 60, the cookie will last for one minute. At 3,600 it
gets to live for an hour, at 86,400 it lasts a day, and so on. If Max-Age is set
to zero, the cookie is deleted as soon as it is received. Max-Age must be set
to a positive integer or zero; negative lifetimes are not permitted. Browsers
are supposed to discard cookies as soon as their Max-Age has been reached,
instead of waiting until the program is exiting. This means that you cannot
count on a cookie you create in a session to last the duration of a transaction.
To avoid temporary cookies that disappear too soon, give Max-Age a high
value and use the Discard attribute to make sure they get deleted when your
transaction is complete. Of course, if you really do want the cookie to stay
around for a few months, just set Max-Age accordingly.
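
The Max-Age arithmetic can be sketched in Python as follows. The helpers `validate_max_age` and `is_expired` are hypothetical, and times are assumed to be plain counts of seconds (such as Unix timestamps); a real browser's bookkeeping will differ.

```python
def validate_max_age(max_age):
    """Max-Age must be zero or a positive whole number of seconds."""
    if not isinstance(max_age, int) or max_age < 0:
        raise ValueError("Max-Age must be a non-negative integer")
    return max_age

def is_expired(created_at, max_age, now):
    """True once `max_age` seconds have passed since `created_at`."""
    return now - created_at >= validate_max_age(max_age)

ONE_MINUTE = 60
ONE_HOUR = 60 * ONE_MINUTE
ONE_DAY = 24 * ONE_HOUR
```

A Max-Age of zero is expired from the moment it arrives, which is how a server asks the browser to delete a cookie.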
`
`Path
`
Path works as it did in Version 0. If a path is specified, only URLs that contain
that complete path are eligible to read or modify the cookie. Cookies
for use site-wide should always explicitly have a path of /.
`
`Port
`
Port is an additional feature of the new cookie standard. A single Web server
could host a site with several levels of security, enforced by Web servers
operating on different ports. Ports allow a single machine with a single IP
address to have multiple TCP/IP applications running simultaneously. All
traffic for an IP address goes to that machine; the IP stack on that machine
examines the incoming packets and routes them to their proper application
based on the port number assigned to the packet. Port numbers can range
from 0-65,535, but 0-1,023 are generally reserved for specific system applications
that need to have a standardized location. FTP is usually on port 21,
Telnet is usually on port 23, and Web servers are on port 80. You can
reconfigure these any way you like or even create new services on other
ports, but most people stick to the standards.

Web servers can and do float among ports fairly frequently. The default
port is 80, but many servers also use port 8080 or port 81. A single machine
could have a Microsoft Internet Information Server on port 80, a Netscape
Communications Server on port 81, and a Lotus Domino server on port
8080, each of which has its own set of pages and even its own administrator.
While it is extremely unlikely that a hacker could break into a site, set up a
parallel server on the same machine, and use that server to gather cookies,
it might be possible for a rogue administrator or someone else with internal
access to breach security that way. If you have data that need to be kept private,
setting the port attribute is probably worthwhile. Just remember that
cookies are usually stored in plain text on user machines, and a secret server
could easily be brought to light by someone rooting through a cookie file
on a client machine if cookies are left behind.
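
A port restriction could be checked along the following lines. This Python sketch rests on an assumption the text does not spell out: that the Port attribute carries a list of permitted port numbers, and that a cookie without the attribute may be sent to any port. The `port_allowed` helper is hypothetical.

```python
def port_allowed(request_port, cookie_ports=None):
    """Decide whether a cookie may be sent to a server on `request_port`.

    `cookie_ports` is a list of integers taken from the cookie's Port
    attribute, or None when the cookie has no Port attribute.
    """
    if cookie_ports is None:
        return True
    for port in cookie_ports:
        if not 0 <= port <= 65535:
            raise ValueError("port out of range: %r" % port)
    return request_port in cookie_ports
```

In the three-server example above, a cookie restricted to port 80 would reach the server on port 80 but not the ones on ports 81 and 8080.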
`
`Secure
`
`Secure behaves as it did in Version 0, demanding that the cookies be sent
`securely when transmitted between client and server. It does not require
`encryption of the cookie on the client, nor does it do anything to hide
`cookie values from the user.
`
`Version
`
The Version attribute is required. For this version of the cookie standard,
you should always specify Version=1. The Version attribute will dramatically
simplify the process of updating the cookie standard in the future. As
we will see, the standards developers had to come up with some intricate
tricks to make the new and old cookies coexist without creating serious
errors for the older servers and browsers.
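
As a rough sketch of how the Version 1 parts might be combined, the Python function below builds a Set-Cookie header using only attributes defined in RFC 2109 itself. The name `build_v1_set_cookie` is invented for this example, and the code assumes that none of the values contains a double quote, since it wraps them in quotation marks without escaping.

```python
def build_v1_set_cookie(name, value, max_age=None, path=None,
                        domain=None, comment=None, secure=False):
    """Assemble an RFC 2109 style Set-Cookie header line."""
    parts = ['%s="%s"' % (name, value), 'Version="1"']  # Version is required
    if comment is not None:
        parts.append('Comment="%s"' % comment)
    if domain is not None:
        parts.append('Domain="%s"' % domain)
    if max_age is not None:
        parts.append("Max-Age=%d" % max_age)  # lifetime in seconds
    if path is not None:
        parts.append('Path="%s"' % path)
    if secure:
        parts.append("Secure")
    return "Set-Cookie: " + "; ".join(parts)

header_v1 = build_v1_set_cookie("cart", "12345", max_age=3600, path="/")
```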
`
For more information about RFC 2109, visit the IETF at
http://www.ietf.org or the RFC 2109 versions site at
http://portal.research.bell-labs.com/~dmk/cookie-ver.html.
`
`
`The Future of Cookies
`
`
Although the proposals surrounding RFC 2109 are still very much a work
in progress, there are always calls on the Internet for more features and
more disclosure. While the Comment and CommentURL attributes give
developers a space to explain themselves, some privacy advocates have
asked for a classification system for cookies. Cookies for tracking banner
advertisements could then be sorted out from cookies for shopping carts,
without requiring the browser constantly to pester the user about whether
or not to accept a particular cookie. More complex schemes that use
encrypted keys and certification to provide a guarantee of proper cookie use
(and automatic acceptance of "certified cookies") are also in the works.
Others have suggested that the software tool which creates a cookie be
specified as an attribute; as we will see later, Netscape's LiveWire already
includes its identification in the names of the cookies it creates.

Several other proposals, including some examined below, create cookie-like
data objects that are open to multiple readers, sparing users the need
to enter their address at 30 different sites. Other cookie-like proposals provide
for key exchanges to make encryption simpler and more automatic. At
this point, however, it looks like cookies may stay simple, focusing on what
they do best and allowing other structures to carry the burdens of identification
and authentication.
`
`
`
`
`Cookies and CGI
`
The Common Gateway Interface (CGI) was one of the earliest
tools for connecting HTTP requests to programs, and
it remains extremely popular despite the appearance of new
tools. CGI is available on nearly every Web server and provides
basic services that developers can use to read in Web
requests and send back appropriate responses. For our purposes,
the cookie information is the most critical part of
those requests and responses. CGI development demands
more from the programmer than many of the other tools
explored in this book, but can provide rewards in compatibility
across server platforms and the considerable number
of libraries available to simplify the task. This chapter will
explore CGI applications that take advantage of cookies,
demonstrating how to set, collect, and process cookies using
the most popular language for CGI development, Perl.
`
`
`Chapter Six
`
`Introduction to the Common Gateway Interface
`
The CGI provides a set of standards that allow Web servers and programs
to communicate with each other. CGI has been available since the earliest
days of the NCSA and CERN servers, and continues to be available on
Web servers from Netscape, Microsoft, Apache, Sun, IBM, Lotus, O'Reilly,
and nearly every other server vendor. There may occasionally be extra
information made available or a slightly different standard for communicating
it (the WinCGI standard, for example), but all of these systems allow
developers to use a common set of tools for passing information between
Web servers and programs.

The CGI takes the information from the HTTP requests (shown in the
previous chapter) and passes it to the program as a set of variables. When
the server receives an HTTP request directed at a script, it parses the
request headers and transforms them into a list of environment variables that is
made available to the CGI program when the program first begins; the body
of a POST request arrives on standard input. Table 6-1 shows a standard list
of CGI variables as well as the equivalents in the
Perl CGI.pm module; more variables may appear on particular servers.
`
Table 6-1  Standard List of CGI Variables

Variable            CGI.pm method       Contents
AUTH_TYPE           auth_type()         Authentication type, if the user had to log in
                                        with a user name and password
CONTENT_LENGTH      none                The length in bytes of the request (used with
                                        POST requests only)
CONTENT_TYPE        none                The MIME type of data sent in the request;
                                        generally application/x-www-form-urlencoded
                                        (used with POST requests only)
GATEWAY_INTERFACE   none                Provides the name and revision number of the
                                        CGI interface used; CGI/1.1 would indicate
                                        Version 1.1 of the CGI interface
HTTP_ACCEPT         accept()            Provides a list of the MIME types the browser
                                        will accept as a response
HTTP_COOKIE         raw_cookie()        Provides the cookies available to the requested
                                        URL as a list of URL-encoded name-value pairs
                                        separated by semicolons
HTTP_REFERER        referer()           Provides the URL of the page from which the
                                        user reached the requested URL
HTTP_USER_AGENT     user_agent()        Provides information about the user's browser
PATH_INFO           path_info()         Extra path information provided in the URL
                                        (often used for CGI redirection programs)
PATH_TRANSLATED     path_translated()   The absolute path on the server used to reach
                                        the script; useful for accessing files in
                                        directories on the server
QUERY_STRING        query_string()      Information sent to the server as part of the
                                        URL after a question mark (frequently used
                                        with GET requests)
REMOTE_ADDR         remote_addr()       The IP address of the machine making the request
REMOTE_HOST         remote_host()       The domain name of the machine making the
                                        request, if available
REMOTE_USER         remote_user()       The name used to authenticate with the Web
                                        server; present only if the user had to
                                        authenticate at some point during the current
                                        browser session with this server
REQUEST_METHOD      request_method()    The request method used to make this request,
                                        generally GET or POST, though some scripts may
                                        want to check for HEAD
SCRIPT_NAME         script_name()       The name of the script; useful for
                                        reconstructing URL references to the script
SERVER_NAME         server_name()       The server name (domain name if available, IP
                                        address if not) of the server that received the
                                        request; like SCRIPT_NAME, useful for
                                        reconstructing URL references
SERVER_PORT         server_port()       The port number on the server that received
                                        the request
SERVER_PROTOCOL     none                The protocol used to make the request; usually
                                        HTTP/1.0 or HTTP/1.1
SERVER_SOFTWARE     server_software()   Provides information about the HTTP server that
                                        received the request in the format name/version
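
To show what reading one of these variables involves, here is a minimal sketch in Python rather than the Perl used for the chapter's own examples. It assumes the server has placed the cookie string in the HTTP_COOKIE environment variable as described in Table 6-1; the helper names `parse_cookie_header` and `cookies_from_environment` are invented for this illustration and are not part of CGI.pm.

```python
import os
from urllib.parse import unquote

def parse_cookie_header(raw):
    """Split a string like "a=1; b=two%20words" into a dictionary,
    decoding the URL-encoded names and values along the way."""
    cookies = {}
    for pair in raw.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip empty or malformed fragments
            cookies[unquote(name)] = unquote(value)
    return cookies

def cookies_from_environment(environ=os.environ):
    """Read HTTP_COOKIE the way a CGI program would see it."""
    return parse_cookie_header(environ.get("HTTP_COOKIE", ""))
```

Passing a plain dictionary as `environ` makes it easy to try the function outside a Web server, for example `cookies_from_environment({"HTTP_COOKIE": "id=42"})`.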
`
Creating a CGI program is mostly a matter of matching up appropriate
outputs to particular inputs. When a request arrives, the CGI program
parses the information it receives from the Web server, extracting information
like the query string, the form information sent by a POST request, and
any cookies that came with the request. Based on this information, the CGI
application will interact with server-side resources like files and databases
(if necessary) and generate a response for the user, usually a set of headers
followed by some HTML. This response is passed back to the server when
the program concludes, and the server then passes the response to the user.
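
The request-and-response cycle just described can be sketched as a tiny visit counter. This Python example is an illustration under stated assumptions, not one of the book's Perl programs: the cookie name `visits` and the helpers `visits_from` and `build_response` are hypothetical, and the script assumes the server supplies cookies in the HTTP_COOKIE environment variable.

```python
import os
import sys

def visits_from(raw_cookie):
    """Pull the value of the hypothetical "visits" cookie out of HTTP_COOKIE."""
    for pair in raw_cookie.split(";"):
        name, _, value = pair.strip().partition("=")
        if name == "visits":
            return value
    return ""

def build_response(visits):
    """Return a complete CGI response: headers, a blank line, then HTML."""
    # Treat anything that is not a plain number as a first visit.
    count = int(visits) + 1 if visits.isascii() and visits.isdigit() else 1
    headers = ["Content-Type: text/html",
               "Set-Cookie: visits=%d; path=/" % count]
    body = "<html><body><p>You have visited %d time(s).</p></body></html>" % count
    return "\n".join(headers) + "\n\n" + body

if __name__ == "__main__":
    sys.stdout.write(build_response(visits_from(os.environ.get("HTTP_COOKIE", ""))))
```

The blank line between the headers and the HTML matters: it is how the server tells where the headers end and the document begins.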
`
`CGI programs are not limited to producing HTML output, but most
`return only HTML.
`
CGI programs can be written in nearly any language, from Visual Basic to
assembly to C to Tcl to Python to Perl, but Perl still retains the lion's share
of CGI development. CGI only specifies what the Web server sends to the
program and what it expects back; the CGI standard by itself places no limits
or expectations on the behavior of that program. Unlike the other tools
covered in later chapters, CGI provides no built-in mechanisms for accessing
databases, maintaining state, or even writing programs. CGI is the ultimate
do-it-yourself environment for Web programming, which gives it
enormous flexibility. Still, many programmers prefer to avoid reinventing
the wheel, which has led to the development of some powerful libraries for
CGI programming. They do not offer all the features of the server development
environments that will be covered in the next chapters, but they
can certainly spare developers a lot of hassle.

CGI does have some drawbacks. Every time a user makes a call to a CGI
program, the server builds the CGI request and runs that program, usually
as a separate application. Other tools, like Active Server Pages and servlets,
can run dynamic requests in the same program space as the server, reducing
the overhead needed to create dynamic pages. The implementation varies
from server to server and platform to platform; most UNIX machines run
CGI applications as separate processes, while Windows and Macintosh programs
run CGI as applications or services. Using a server company's proprietary
systems (Netscape's NSAPI or LiveWire or Microsoft's ISAPI or
Active Server Pages, for instance) often demands less overhead and is therefore
quicker but comes at a cost: those dynamic pages must always be hosted
on the particular server for which they were developed. Using CGI may
be more complicated, but it allows developers to choose their own tools
and keep their Web applications much more portable. Perl applications can
usually move from platform to platform, and even C and C++ programs can
often be recompiled.
`
When using Perl with IIS, there is an application called Perl for ISAPI
that allows the server to work with one copy of the Perl application,
avoiding the constant relaunching of the interpreter. Perl for ISAPI is
available from http://www.activestate.com.
`
CGI and Perl

Perl remains the language of choice for many CGI developers, and will be
the only language used in this chapter. There are several reasons for Perl's
popularity. Perl is available for nearly every kind of UNIX, for Windows 95
and NT, and for the Macintosh, as well as for a variety of other platforms.
Developers can use their Perl scripts on nearly any machine, and libraries
are generally portable among the different implementations. Perl offers an
incredibly powerful set of tools for managing arrays and finding and modifying
text, and it is not difficult to learn. Perl is well suited to interpret the
information provided by the server and produce complex documents with a
minimum of programming, especially with the assistance of the CGI.pm
`module that is now a part of the Perl distributi