Understanding Web Internals
`
The Definitive Guide
`
`O'REILLY®
`
`David Gourley & Brian Totty
`with Marjorie Sayer, Sailu Reddy & Anshu Aggarwal
`
`Exhibit 2002
`IPR2016-01431 - Part 1 of 2
`
`

`
`
`
`
`

`
`HTTP
`The Definitive Guide
`
`

`
`HTTP
`The Definitive Guide
`
`David Gourley and Brian Totty
`with Marjorie Sayer, Sailu Reddy, and Anshu Aggarwal
`
`O'REILLY®
`
`

`
HTTP: The Definitive Guide
`by David Gourley and Brian Totty
`with Marjorie Sayer, Sailu Reddy, and Anshu Aggarwal
`
Copyright © 2002 O'Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
`
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol,
CA 95472.
`
O'Reilly Media, Inc. books may be purchased for educational, business, or sales promotional use.
Online editions are also available for most titles (safari.oreilly.com). For more information, contact our
corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.
`
Editor: Linda Mui
Production Editor: Rachel Wheeler
Cover Designer: Ellie Volckhausen
Interior Designers: David Futato and Melanie Wang

Printing History:
September 2002: First Edition.
`
`Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of
`O'Reilly Media, Inc. HTTP: The Definitive Guide, the image of a thirteen-lined ground squirrel, and
`related trade dress are trademarks of O'Reilly Media, Inc. Many of the designations used by
`manufacturers and sellers to distinguish their products are claimed as trademarks. Where those
`designations appear in this book, and O'Reilly Media, Inc. was aware of a trademark claim, the
`designations have been printed in caps or initial caps.
`
`While every precaution has been taken in the preparation of this book, the publisher and authors
`assume no responsibility for errors or omissions, or for damages resulting from the use of the
`information contained herein.
`
`ISBN: 978-1-56592-509-0
`[LSI]
`
[2011-01-27]
`
`

`
Table of Contents

Preface

Part I. HTTP: The Web's Foundation

1. Overview of HTTP
HTTP: The Internet's Multimedia Courier
Web Clients and Servers
Resources
Transactions
Messages
Connections
Protocol Versions
Architectural Components of the Web
The End of the Beginning
For More Information

2. URLs and Resources
Navigating the Internet's Resources
URL Syntax
URL Shortcuts
Shady Characters
A Sea of Schemes
The Future
For More Information

3. HTTP Messages
The Flow of Messages
The Parts of a Message
`
`

`
Methods
Status Codes
Headers
For More Information

4. Connection Management
TCP Connections
TCP Performance Considerations
HTTP Connection Handling
Parallel Connections
Persistent Connections
Pipelined Connections
The Mysteries of Connection Close
For More Information

Part II. HTTP Architecture

5. Web Servers
Web Servers Come in All Shapes and Sizes
A Minimal Perl Web Server
What Real Web Servers Do
Step 1: Accepting Client Connections
Step 2: Receiving Request Messages
Step 3: Processing Requests
Step 4: Mapping and Accessing Resources
Step 5: Building Responses
Step 6: Sending Responses
Step 7: Logging
For More Information

6. Proxies
Web Intermediaries
Why Use Proxies?
Where Do Proxies Go?
Client Proxy Settings
Tricky Things About Proxy Requests
Tracing Messages
Proxy Authentication
`
`

`
Proxy Interoperation
For More Information

7. Caching
Redundant Data Transfers
Bandwidth Bottlenecks
Flash Crowds
Distance Delays
Hits and Misses
Cache Topologies
Cache Processing Steps
Keeping Copies Fresh
Controlling Cachability
Setting Cache Controls
Detailed Algorithms
Caches and Advertising
For More Information

8. Integration Points: Gateways, Tunnels, and Relays
Gateways
Protocol Gateways
Resource Gateways
Application Interfaces and Web Services
Tunnels
Relays
For More Information

9. Web Robots
Crawlers and Crawling
Robotic HTTP
Misbehaving Robots
Excluding Robots
Robot Etiquette
Search Engines
For More Information

10. HTTP-NG
HTTP's Growing Pains
HTTP-NG Activity
`
`

`
Modularize and Enhance
Distributed Objects
Layer 1: Messaging
Layer 2: Remote Invocation
Layer 3: Web Application
WebMUX
Binary Wire Protocol
Current Status
For More Information

Part III. Identification, Authorization, and Security

11. Client Identification and Cookies
The Personal Touch
HTTP Headers
Client IP Address
User Login
Fat URLs
Cookies
For More Information

12. Basic Authentication
Authentication
Basic Authentication
The Security Flaws of Basic Authentication
For More Information

13. Digest Authentication
The Improvements of Digest Authentication
Digest Calculations
Quality of Protection Enhancements
Practical Considerations
Security Considerations
For More Information

14. Secure HTTP
Making HTTP Safe
Digital Cryptography
`
`

`
Symmetric-Key Cryptography
Public-Key Cryptography
Digital Signatures
Digital Certificates
HTTPS: The Details
A Real HTTPS Client
Tunneling Secure Traffic Through Proxies
For More Information

Part IV. Entities, Encodings, and Internationalization

15. Entities and Encodings
Messages Are Crates, Entities Are Cargo
Content-Length: The Entity's Size
Entity Digests
Media Type and Charset
Content Encoding
Transfer Encoding and Chunked Encoding
Time-Varying Instances
Validators and Freshness
Range Requests
Delta Encoding
For More Information

16. Internationalization
HTTP Support for International Content
Character Sets and HTTP
Multilingual Character Encoding Primer
Language Tags and HTTP
Internationalized URIs
Other Considerations
For More Information

17. Content Negotiation and Transcoding
Content-Negotiation Techniques
Client-Driven Negotiation
Server-Driven Negotiation
Transparent Negotiation
`
`

`
Transcoding
Next Steps
For More Information

Part V. Content Publishing and Distribution

18. Web Hosting
Hosting Services
Virtual Hosting
Making Web Sites Reliable
Making Web Sites Fast
For More Information

19. Publishing Systems
FrontPage Server Extensions for Publishing Support
WebDAV and Collaborative Authoring
For More Information

20. Redirection and Load Balancing
Why Redirect?
Where to Redirect
Overview of Redirection Protocols
General Redirection Methods
Proxy Redirection Methods
Cache Redirection Methods
Internet Cache Protocol
Cache Array Routing Protocol
Hyper Text Caching Protocol
For More Information

21. Logging and Usage Tracking
What to Log?
Log Formats
Hit Metering
A Word on Privacy
For More Information
`
`

`
Part VI. Appendixes

A. URI Schemes
B. HTTP Status Codes
C. HTTP Header Reference
D. MIME Types
E. Base-64 Encoding
F. Digest Authentication
G. Language Tags
H. MIME Charset Registry

Index
`
`

`
`

`
`Preface
`
The Hypertext Transfer Protocol (HTTP) is the protocol programs use to communicate
over the World Wide Web. There are many applications of HTTP, but HTTP is
most famous for two-way conversation between web browsers and web servers.
`HTTP began as a simple protocol, so you might think there really isn't that much to
`say about it. And yet here you stand, with a two-pound book in your hands. If you're
`wondering how we could have written 650 pages on HTTP, take a look at the Table
`of Contents. This book isn't just an HTTP header reference manual; it's a veritable
`bible of web architecture.
`
`In this book, we try to tease apart HTTP's interrelated and often misunderstood
`rules, and we offer you a series of topic-based chapters that explain all the aspects of
`HTTP. Throughout the book, we are careful to explain the "why" of HTTP, not just
`the "how." And to save you time chasing references, we explain many of the critical
`non-HTTP technologies that are required to make HTTP applications work. You can
`find the alphabetical header reference (which forms the basis of most conventional
`HTTP texts) in a conveniently organized appendix. We hope this conceptual design
`makes it easy for you to work with HTTP.
`
This book is written for anyone who wants to understand HTTP and the underlying
`architecture of the Web. Software and hardware engineers can use this book as a
`coherent reference for HTTP and related web technologies. Systems architects and
`network administrators can use this book to better understand how to design,
deploy, and manage complicated web architectures. Performance engineers and analysts
can benefit from the sections on caching and performance optimization. Marketing
and consulting professionals will be able to use the conceptual orientation to
better understand the landscape of web technologies.
`
This book illustrates common misconceptions, advises on "tricks of the trade," provides
convenient reference material, and serves as a readable introduction to dry and
confusing standards specifications. In a single book, we detail the essential and interrelated
technologies that make the Web work.
`
`
`

`
`This book is the result of a tremendous amount of work by many people who share
`an enthusiasm for Internet technologies. We hope you find it useful.
`
`Running Example: Joe's Hardware Store
`Many of our chapters include a running example of a hypothetical online hardware
`and home-improvement store called "Joe's Hardware" to demonstrate technology
concepts. We have set up a real web site for the store (http://www.joes-hardware.com)
for you to test some of the examples in the book. We will maintain this web site
`while this book remains in print.
`
`Chapter-by-Chapter Guide
`This book contains 21 chapters, divided into 5 logical parts (each with a technology
theme), and 8 useful appendixes containing reference data and surveys of related
`technologies:
`
`Part I, HTTP: The Web's Foundation
Part II, HTTP Architecture
`Part III, Identification, Authorization, and Security
`Part IV, Entities, Encodings, and Internationalization
`Part V, Content Publishing and Distribution
`Part VI, Appendixes
`
Part I, HTTP: The Web's Foundation, describes the core technology of HTTP, the
foundation of the Web, in four chapters:
`
`• Chapter 1, Overview of HTTP, is a rapid-paced overview of HTTP.
`• Chapter 2, URLs and Resources, details the formats of uniform resource locators
`(URLs) and the various types of resources that URLs name across the Internet. It
`also outlines the evolution to uniform resource names (URNs).
`• Chapter 3, HTTP Messages, details how HTTP messages transport web content.
• Chapter 4, Connection Management, explains the commonly misunderstood and
`poorly documented rules and behavior for managing HTTP connections.
`
`Part II, HTTP Architecture, highlights the HTTP server, proxy, cache, gateway, and
`robot applications that are the architectural building blocks of web systems. (Web
`browsers are another building block, of course, but browsers already were covered
`thoroughly in Part I of the book.) Part II contains the following six chapters:
`
`• Chapter 5, Web Servers, gives an overview of web server architectures.
• Chapter 6, Proxies, explores HTTP proxy servers, which are intermediary servers
that act as platforms for HTTP services and controls.
• Chapter 7, Caching, delves into the science of web caches, devices that improve
performance and reduce traffic by making local copies of popular documents.
`
`
`

`
`• Chapter 8, Integration Points: Gateways, Tunnels, and Relays, explains gateways
and application servers that allow HTTP to work with software that speaks different
protocols, including Secure Sockets Layer (SSL) encrypted protocols.
`• Chapter 9, Web Robots, describes the various types of clients that pervade the
`Web, including the ubiquitous browsers, robots and spiders, and search engines.
`• Chapter 10, HTTP-NG, talks about HTTP developments still in the works: the
HTTP-NG protocol.
`
`Part III, Identification, Authorization, and Security, presents a suite of techniques and
technologies to track identity, enforce security, and control access to content. It contains
the following four chapters:
`
`• Chapter 11, Client Identification and Cookies, talks about techniques to identify
users so that content can be personalized to the user audience.
`• Chapter 12, Basic Authentication, highlights the basic mechanisms to verify user
`identity. The chapter also examines how HTTP authentication interfaces with
`databases.
`• Chapter 13, Digest Authentication, explains digest authentication, a complex
`proposed enhancement to HTTP that provides significantly enhanced security.
• Chapter 14, Secure HTTP, is a detailed overview of Internet cryptography, digital
certificates, and SSL.

`
Part IV, Entities, Encodings, and Internationalization, focuses on the bodies of HTTP
`messages (which contain the actual web content) and on the web standards that
`describe and manipulate content stored in the message bodies. Part IV contains three
`chapters:
`
`• Chapter 15, Entities and Encodings, describes the structure of HTTP content.
`• Chapter 16, Internationalization, surveys the web standards that allow users
`around the globe to exchange content in different languages and character sets.
`• Chapter 17, Content Negotiation and Transcoding, explains mechanisms for
`negotiating acceptable content.
`
`Part V, Content Publishing and Distribution, discusses the technology for publishing
`and disseminating web content. It contains four chapters:
`
`• Chapter 18, Web Hosting, discusses the ways people deploy servers in modern
`web hosting environments and HTTP support for virtual web hosting.
• Chapter 19, Publishing Systems, discusses the technologies for creating web content
and installing it onto web servers.
`• Chapter 20, Redirection and Load Balancing, surveys the tools and techniques for
`distributing incoming web traffic among a collection of servers.
`• Chapter 21, Logging and Usage Tracking, covers log formats and common
`questions.
`
`
`

`
`Part VI, Appendixes, contains helpful reference appendixes and tutorials in related
`technologies:
`
• Appendix A, URI Schemes, summarizes the protocols supported through uniform
resource identifier (URI) schemes.
`• Appendix B, HTTP Status Codes, conveniently lists the HTTP response codes.
`• Appendix C, HTTP Header Reference, provides a reference list of HTTP header
`fields.
`• Appendix D, MIME Types, provides an extensive list of MIME types and
`explains how MIME types are registered.
`• Appendix E, Base-64 Encoding, explains base-64 encoding, used by HTTP
`authentication.
`• Appendix F, Digest Authentication, gives details on how to implement various
`authentication schemes in HTTP.
`• Appendix G, Language Tags, defines language tag values for HTTP language
`headers.
• Appendix H, MIME Charset Registry, provides a detailed list of character encodings,
used for HTTP internationalization support.
`
`Each chapter contains many examples and pointers to additional reference material.
`
`Typographic Conventions
`In this book, we use the following typographic conventions:
`Italic
`Used for URLs, C functions, command names, MIME types, new terms where
`they are defined, and emphasis
`Constant width
`Used for computer output, code, and any literal text
`Constant width bold
`Used for user input
`
`Comments and Questions
`Please address comments and questions concerning this book to the publisher:
`
`O'Reilly & Associates, Inc.
`1005 Gravenstein Highway North
`Sebastopol, CA 95472
`(800) 998-9938 (in the United States or Canada)
`(707) 829-0515 (international/local)
`(707) 829-0104 (fax)
`
`
`

`
`There is a web page for this book, which lists errata, examples, or any additional
`information. You can access this page at:
http://www.oreilly.com/catalog/httptdg/
To comment or ask technical questions about this book, send email to:
bookquestions@oreilly.com
`
`For more information about books, conferences, Resource Centers, and the O'Reilly
`Network, see the O'Reilly web site at:
`http://www.oreilly.com
`
`Acknowledgments
This book is the labor of many. The five authors would like to hold up a few people
in thanks for their significant contributions to this project.
`
`To start, we'd like to thank Linda Mui, our editor at O'Reilly. Linda first met with
`David and Brian way back in 1996, and she refined and steered several concepts into
`the book you hold today. Linda also helped keep our wandering gang of first-time
book authors moving in a coherent direction and on a progressing (if not rapid) timeline.
Most of all, Linda gave us the chance to create this book. We're very grateful.
`
`We'd also like to thank several tremendously bright, knowledgeable, and kind souls
`who devoted noteworthy energy to reviewing, commenting on, and correcting drafts
`of this book. These include Tony Bourke, Sean Burke, Mike Chowla, Shernaz Daver,
Fred Douglis, Paula Ferguson, Vikas Jha, Yves Lafon, Peter Mattis, Chuck Neerdaels,
Luis Tavera, Duane Wessels, Dave Wu, and Marco Zagha. Their viewpoints
`and suggestions have improved the book tremendously.
`
`Rob Romano from O'Reilly created most of the amazing artwork you'll find in this
`book. The book contains an unusually large number of detailed illustrations that
make subtle concepts very clear. Many of these illustrations were painstakingly created
and revised numerous times. If a picture is worth a thousand words, Rob added
`hundreds of pages of value to this book.
`
`Brian would like to personally thank all of the authors for their dedication to this
`project. A tremendous amount of time was invested by the authors in a challenge to
`make the first detailed but accessible treatment of HTTP. Weddings, childbirths,
`killer work projects, startup companies, and graduate schools intervened, but the
`authors held together to bring this project to a successful completion. We believe the
`result is worthy of everyone's hard work and, most importantly, that it provides a
valuable service. Brian also would like to thank the employees of Inktomi for their
enthusiasm and support and for their deep insights about the use of HTTP in real-world
applications. Also, thanks to the fine folks at Cajun-shop.com for allowing us
`to use their site for some of the examples in this book.
`
`
`

`
`David would like to thank his family, particularly his mother and grandfather for
`their ongoing support. He'd like to thank those that have put up with his erratic
`schedule over the years writing the book. He'd also like to thank Slurp, Orctomi, and
`Norma for everything they've done, and his fellow authors for all their hard work.
`Finally, he would like to thank Brian for roping him into yet another adventure.
Marjorie would like to thank her husband, Alan Liu, for technical insight, familial
`support and understanding. Marjorie thanks her fellow authors for many insights
`and inspirations. She is grateful for the experience of working together on this book.
`
`Sailu would like to thank David and Brian for the opportunity to work on this book,
`and Chuck Neerdaels for introducing him to HTTP.
Anshu would like to thank his wife, Rashi, and his parents for their patience, support,
and encouragement during the long years spent writing this book.
`Finally, the authors collectively thank the famous and nameless Internet pioneers,
whose research, development, and evangelism over the past four decades contributed
so much to our scientific, social, and economic community. Without these
`labors, there would be no subject for this book.
`
`
`

`
`PART I
`HTTP: The Web's Foundation
`
`This section is an introduction to the HTTP protocol. The next four chapters
`describe the core technology of HTTP, the foundation of the Web:
`
`• Chapter 1, Overview of HTTP, is a rapid-paced overview of HTTP.
`• Chapter 2, URLs and Resources, details the formats of URLs and the various
types of resources that URLs name across the Internet. We also outline the evolution
to URNs.
`• Chapter 3, HTTP Messages, details the HTTP messages that transport web
`content.
`• Chapter 4, Connection Management, discusses the commonly misunderstood
`and poorly documented rules and behavior for managing TCP connections by
`HTTP.
`
`

`
`

`
`CHAPTER 1
`Overview of HTTP
`
`The world's web browsers, servers, and related web applications all talk to each
other through HTTP, the Hypertext Transfer Protocol. HTTP is the common language
of the modern global Internet.
`
`This chapter is a concise overview of HTTP. You'll see how web applications use
`HTTP to communicate, and you'll get a rough idea of how HTTP does its job. In
`particular, we talk about:
`
`• How web clients and servers communicate
`• Where resources (web content) come from
`• How web transactions work
`• The format of the messages used for HTTP communication
`• The underlying TCP network transport
`• The different variations of the HTTP protocol
`• Some of the many HTTP architectural components installed around the Internet
`
`We've got a lot of ground to cover, so let's get started on our tour of HTTP.
`
`HTTP: The Internet's Multimedia Courier
Billions of JPEG images, HTML pages, text files, MPEG movies, WAV audio files,
`Java applets, and more cruise through the Internet each and every day. HTTP moves
`the bulk of this information quickly, conveniently, and reliably from web servers all
`around the world to web browsers on people's desktops.
`
`Because HTTP uses reliable data-transmission protocols, it guarantees that your data
`will not be damaged or scrambled in transit, even when it comes from the other side of
`the globe. This is good for you as a user, because you can access information without
`worrying about its integrity. Reliable transmission is also good for you as an Internet
`application developer, because you don't have to worry about HTTP communications
`
`
`

`
`being destroyed, duplicated, or distorted in transit. You can focus on programming
`the distinguishing details of your application, without worrying about the flaws and
`foibles of the Internet.
`
`Let's look more closely at how HTTP transports the Web's traffic.
`
`Web Clients and Servers
`Web content lives on web servers. Web servers speak the HTTP protocol, so they are
`often called HTTP servers. These HTTP servers store the Internet's data and provide
`the data when it is requested by HTTP clients. The clients send HTTP requests to
`servers, and servers return the requested data in HTTP responses, as sketched in
Figure 1-1. Together, HTTP clients and HTTP servers make up the basic components
of the World Wide Web.
`
`Figure 1-1. Web clients and servers
`
`You probably use HTTP clients every day. The most common client is a web
`browser, such as Microsoft Internet Explorer or Netscape Navigator. Web browsers
`request HTTP objects from servers and display the objects on your screen.
`
`When you browse to a page, such as "http://www.oreilly.com/index.html," your
`browser sends an HTTP request to the server www.oreilly.com (see Figure 1-1). The
`server tries to find the desired object (in this case, "/index.html") and, if successful,
`sends the object to the client in an HTTP response, along with the type of the object,
`the length of the object, and other information.
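
The exchange described above can be sketched in Python. This is only a minimal illustration, not how any particular browser or server is built. It assumes a throwaway server on the local machine standing in for www.oreilly.com, and the page content, the IndexHandler class, and the fetch and demo helpers are hypothetical names invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content; a real server would read it from its filesystem.
PAGE = b"<html><body>Welcome</body></html>"


class IndexHandler(BaseHTTPRequestHandler):
    """Serves one object, /index.html, and reports 404 for anything else."""

    def do_GET(self):
        if self.path == "/index.html":
            self.send_response(200)
            # The response carries the type and length of the object.
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(host, port, path):
    """Send an HTTP GET request and return (status, type, length, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read()
        return (response.status,
                response.getheader("Content-Type"),
                response.getheader("Content-Length"),
                body)
    finally:
        conn.close()


def demo():
    """Start the local server, fetch /index.html once, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), IndexHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_port, "/index.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, content_type, content_length, body = demo()
    print(status, content_type, content_length)
```

The client side is the fetch function: it names the server, asks for an object by path, and receives the object together with its type and length.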
`
`Resources
`Web servers host web resources. A web resource is the source of web content. The
`simplest kind of web resource is a static file on the web server's filesystem. These
`files can contain anything: they might be text files, HTML files, Microsoft Word
files, Adobe Acrobat files, JPEG image files, AVI movie files, or any other format you
`can think of.
`
However, resources don't have to be static files. Resources can also be software programs
that generate content on demand. These dynamic content resources can generate
content based on your identity, on what information you've requested, or on
the time of day. They can show you a live image from a camera, or let you trade
stocks, search real estate databases, or buy gifts from online stores (see Figure 1-2).
`
[Figure: a web server hosting resources, including an image file, a text file, a real estate search gateway, and an e-commerce gateway]
`
`Figure 1-2. A web resource is anything that provides web content
`
In summary, a resource is any kind of content source. A file containing your company's
sales forecast spreadsheet is a resource. A web gateway to scan your local
`public library's shelves is a resource. An Internet search engine is a resource.
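
The distinction between static files and content generated on demand can be sketched as follows. This is an illustrative toy, assuming a hypothetical RESOURCES table in which a resource is either fixed bytes or a function that produces bytes when asked. The paths, the greeting text, and the helper names are all invented for the example.

```python
from datetime import datetime


def time_of_day_greeting(now):
    """Dynamic resource: the content depends on the time of day."""
    if now.hour < 12:
        period = "morning"
    elif now.hour < 18:
        period = "afternoon"
    else:
        period = "evening"
    return f"Good {period}! It is {now:%H:%M}.".encode("ascii")


# Hypothetical resource table: path -> static bytes or a content generator.
RESOURCES = {
    "/specials/notice.txt": b"Saw blades are on sale this week.\n",  # static
    "/greeting": time_of_day_greeting,                               # dynamic
}


def get_content(path, now=None):
    """Return the content for a resource, generating it if necessary."""
    resource = RESOURCES[path]
    if callable(resource):
        return resource(now or datetime.now())
    return resource


if __name__ == "__main__":
    print(get_content("/specials/notice.txt"))
    print(get_content("/greeting"))
```

From the client's point of view both entries look the same: each is named by a path and yields content, which is why both count as resources.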
`
`Media Types
`Because the Internet hosts many thousands of different data types, HTTP carefully
`tags each object being transported through the Web with a data format label called a
`MIME type. MIME (Multipurpose Internet Mail Extensions) was originally designed
to solve problems encountered in moving messages between different electronic mail
systems. MIME worked so well for email that HTTP adopted it to describe and label
`its own multimedia content.
`
`Web servers attach a MIME type to all HTTP object data (see Figure 1-3). When a
`web browser gets an object back from a server, it looks at the associated MIME type
`to see if it knows how to handle the object. Most browsers can handle hundreds of
`popular object types: displaying image files, parsing and formatting HTML files,
`playing audio files through the computer's speakers, or launching external plug-in
`software to handle special formats.
`
`
`

`
[Figure: the server's response to the client carries the headers "Content-type: image/jpeg" (the MIME type) and "Content-length: 12984"]
`
`Figure 1-3. MIME types are sent back with the data content
`
A MIME type is a textual label, represented as a primary object type and a specific
`subtype, separated by a slash. For example:
`
`• An HTML-formatted text document would be labeled with type text/html.
`• A plain ASCII text document would be labeled with type text/plain.
• A JPEG version of an image would be image/jpeg.
`• A GIF-format image would be image/gif.
`• An Apple QuickTime movie would be video/quicktime.
`• A Microsoft PowerPoint presentation would be application/vnd.ms-powerpoint.
`
There are hundreds of popular MIME types, and many more experimental or limited-use
types. A very thorough MIME type list is provided in Appendix D.
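
As a rough sketch of how a client might use these labels, the Python below splits a MIME type at the slash and picks a handling action. The split_mime_type and choose_action helpers and the action strings are hypothetical, and treating other text types as displayable text is an assumption of the example. Real browsers use much larger handler tables.

```python
import mimetypes


def split_mime_type(mime_type):
    """Split a label such as 'text/html' into (primary type, subtype)."""
    primary, _, subtype = mime_type.strip().partition("/")
    if not primary or not subtype:
        raise ValueError(f"not a MIME type: {mime_type!r}")
    return primary.lower(), subtype.lower()


def choose_action(mime_type):
    """Pick a handling action for an object, based on its MIME type."""
    primary, subtype = split_mime_type(mime_type)
    if (primary, subtype) == ("text", "html"):
        return "parse and format HTML"
    if primary == "text":
        return "display text"  # assumption: other text types shown as-is
    if primary == "image":
        return "display image"
    if primary == "audio":
        return "play audio"
    return "launch external plug-in"


if __name__ == "__main__":
    for label in ("text/html", "image/jpeg", "video/quicktime"):
        print(label, "->", choose_action(label))
    # The standard library can also guess a MIME type from a file name.
    print(mimetypes.guess_type("saw-blade.gif")[0])
```

Dispatching on the primary type first keeps the table small, since one rule can cover every image or audio subtype.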
`
URIs
Each web server resource has a name, so clients can point out what resources they
are interested in. The server resource name is called a uniform resource identifier, or
URI. URIs are like the postal addresses of the Internet, uniquely identifying and
locating information resources around the world.
`
`Here's a URI for an image resource on Joe's Hardware store's web server:
`http://www.joes-hardware.com/specials/saw-blade.gif
`Figure 1-4 shows how the URI specifies the HTTP protocol to access the saw-blade
`GIF resource on Joe's store's server. Given the URI, HTTP can retrieve the object.
URIs come in two flavors, called URLs and URNs. Let's take a peek at each of these
`types of resource identifiers now.
`
`URLs
`The uniform resource locator (URL) is the most common form of resource identifier.
`URLs describe the specific location of a resource on a particular server. They tell you
`exactly how to fetch a resource from a precise, fixed location. Figure 1-4 shows how
`a URL tells precisely where a resource is located and how to access it. Table 1-1
`shows a few examples of URLs.
`
`
`

`
[Figure: for the URL http://www.joes-hardware.com/specials/saw-blade.gif, the client (1) uses the HTTP protocol, (2) goes to www.joes-hardware.com, and (3) grabs the resource called /specials/saw-blade.gif]
`
`Figure 1-4. URLs specify protocol, server, and local resource
`
Table 1-1. Example URLs

URL: http://www.oreilly.com/index.html
Description: The home URL for O'Reilly & Associates, Inc.

URL: http://www.yahoo.com/images/logo.gif
Description: The URL for the Yahoo! web site's logo

URL: http://www.joes-hardware.com/inventory-check.cgi?item=12731
Description: The URL for a program that checks if inventory item #12731 is in stock

URL: ftp://joe:tools4u@ftp.joes-hardware.com/locking-pliers.gif
Description: The URL for the locking-pliers.gif image file, using password-protected FTP as the access protocol
`
`Most URLs follow a standardized format of three main parts:
`
• The first part of the URL is called the scheme, and it describes the protocol used
to access the resource. This is usually the HTTP protocol (http://).
• The second part gives the server Internet address (e.g., www.joes-hardware.com).
• The rest names a resource on the web server (e.g., /specials/saw-blade.gif).
`
`Today, almost every URI is a URL.
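
The three parts can be pulled out of a URL with Python's standard urllib.parse module. This is a small sketch: the describe_url helper and its dictionary keys are hypothetical names chosen to mirror the list above, while urlsplit and its scheme, hostname, and path attributes come from the standard library.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into its scheme, server address, and resource name."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # the protocol used to access the resource
        "server": parts.hostname,    # the server Internet address
        "resource": parts.path,      # the resource on that server
    }


if __name__ == "__main__":
    url = "http://www.joes-hardware.com/specials/saw-blade.gif"
    for name, value in describe_url(url).items():
        print(f"{name}: {value}")
```

The same call also exposes the less common pieces, such as the query string of a program URL (the query attribute) and the user name and password embedded in an FTP URL (the username and password attributes).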
`
`URNs
`The second flavor of URI is the uniform resource name, or URN. A URN serves as a
`unique name for a particular piece of content, independent of where the resource
`currently resides. These location-independent URNs allow resources to move from
`place to place. URNs also allow resources to be accessed by multiple network access
`protocols while maintaining the same name.
`
`For example, the following
