`Architect
`of the Netscape
`Proxy Server and
`Co-Developer
`of the CERN
`Proxy Server
`
`Thorough Coverage
`of Proxy
`Network Design,
`Management,
`and Optimization
`
`
`
`
`
`
`Web Proxy Servers
`
`Ari Luotonen
`Paul Mockapetris, Series Advisor
`
`To join a Prentice Hall PTR Internet mailing list, point to
`http://www.prenhall.com/mail_lists/
`
`
`
`Library of Congress Cataloging-in-Publication Data
`
`Luotonen, Ari.
`Web proxy servers / Ari Luotonen.
`p. cm. - (Web infrastructure series)
`Includes bibliographical references and index.
`ISBN: 0-13-680612-0 (alk. paper)
`1. Web servers. I. Title. II. Series.
`TK5105.888.L86 1997
`005.7'1376-dc21
`
`Editorial/Production Supervision: Kathleen M Caren
`Acquisitions Editor: Mary Franz
`Editorial Assistant: Noreen Regina
`Series Advisor: Paul Mockapetris
`Marketing Manager: Miles Williams
`Buyer: Alexis Heydt
`Cover Design: Anthony Gemmellaro; Illustration: Leonardo da Vinci
`Cover Design Direction: Jerry Votta
`Series Design: Meg Van Arsdale
`Art Director: Gail Cocker-Bogusz
`
`© 1998 Netscape Communications Corporation
`Published by Prentice Hall PTR
`Prentice-Hall, Inc., A Simon & Schuster Company
`Upper Saddle River, NJ 07458
`
`97-40619
`CIP
`
`Prentice Hall books are widely used by corporations and government agencies for training,
`marketing, and resale. The publisher offers discounts on this book when ordered in bulk quantities.
`For more information, contact:
`
`Corporate Sales Department,
`Prentice Hall PTR
`One Lake Street
`Upper Saddle River, NJ 07458
`Phone: 800-382-3419; FAX: 201-236-714
`E-mail (Internet): corpsales@prenhall.com
`
`All rights reserved. No part of this book may be reproduced, in any form
`or by any means, without permission in writing from the publisher.
`
`Printed in the United States of America
`
`10 9 8 7 6 5 4 3 2 1
`
`ISBN 0-13-680612-0
`
`Prentice-Hall International (UK) Limited, London
`Prentice-Hall of Australia Pty. Limited, Sydney
`Prentice-Hall Canada Inc., Toronto
`Prentice-Hall Hispanoamericana, S.A., Mexico
`Prentice-Hall of India Private Limited, New Delhi
`Prentice-Hall of Japan, Inc., Tokyo
`Simon & Schuster Asia Pte. Ltd., Singapore
`Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
`
`
`
`SUCCESS
`To laugh often and much;
`to win the respect of intelligent people and the affection of children;
`to earn the appreciation of honest critics and endure the betrayal of false
`friends;
`to appreciate beauty, to find the best in others;
`to leave the world a bit better, whether by a healthy child, a garden patch
`or a redeemed social condition;
`to know even one life has breathed easier because you have lived.
`This is to have succeeded.
`
`Ralph Waldo Emerson
`
`This book is dedicated to Nirmalya Bhowmick and Daniel Gonzalez
`for their help in teaching me how to be successful,
`believe in myself, live my life to the fullest,
`and help other people.
`
`
`
`Table of Contents
`
`Preface .............................. xiii
`Acknowledgments ...................... xv
`
`Part I
`Overview of Firewalls and Proxy Servers .... 1
`
`1 Firewall Overview 3
`
`Terminology .................................... 4
`Firewalls ....................................... 5
`Summary ..................................... 15
`2 Overview of Proxy Servers 17
`
`History of Web Proxy Servers ...................... 18
`General Properties of Proxy Servers .................. 19
`Different Types of Proxy Servers .................... 21
`
`Generic Firewall Proxy Servers ...................... 21
`Proxy Chaining ................................. 22
`Departmental Proxy Servers ....................... 22
`Personal Proxy Servers ............................ 23
`Specialized Proxy Servers .......................... 24
`Why Proxy Servers are Not Part of Web Servers? ........ 25
`Dynamic Content ............................... 26
`Summary ..................................... 27
`3 Internal Server Architectures ..................... 29
`
`Single-Process Serialized Server Architecture ........... 30
`Forking ....................................... 30
`Process Mob Architecture ......................... 31
`Multithreaded, Single-Process Architecture ............ 32
`Multithreaded, Multiprocess Architecture ............. 33
`Single-Process, Asynchronous I/O Architecture ......... 33
`Mixed Asynchronous I/O with Threads Architecture ..... 34
`Summary ..................................... 35
`
`Part 2
`Protocols .............................. 37
`
`4 The HTTP Protocol ............................. 39
`
`Overall Operation of HTTP ........................ 40
`Design Goals of HTTP ............................ 41
`HTTP/0.9 ..................................... 42
`HTTP/1.0 ..................................... 43
`The HTTP/1.1 Protocol ........................... 47
`HTTP Persistent Connections (Keep-Alive) ............. 47
`HTTP Authentication ............................. 54
`Virtual Servers .................................. 59
`META HTTP-EQUIV .............................. 62
`Mime Media Types .............................. 63
`HTTP Request Methods ........................... 64
`HTTP Headers .................................. 69
`HTTP Response Status Codes ....................... 96
`Summary ..................................... 103
`5 Cookies-The HTTP State Management Protocol .... 107
`
`Overall Operation of Cookies ...................... 108
`Common Uses of Cookies ........................ 108
`Cookies vs. Proxy Cookies ........................ 111
`Non-Static Route and Cookies with Encoded IP Address . 112
`Summary ..................................... 113
`6 ICP-The Internet Cache Protocol ................ 115
`
`ICP Message Format ............................ 120
`ICP Op Codes ................................. 122
`ICP Option Flags ............................... 128
`Multicast with ICP .............................. 130
`Security Considerations .......................... 131
`Summary ..................................... 132
`7 Handling of Different Protocols by Proxies ......... 133
`
`Standard Port Numbers .......................... 134
`HTTP ........................................ 136
`FTP ......................................... 139
`Gopher ...................................... 147
`News ........................................ 148
`SSL, HTTPS, and SNEWS ......................... 148
`(SSL) Tunneling Protocol ......................... 150
`WAIS ........................................ 151
`LDAP ........................................ 151
`IIOP ......................................... 152
`Telnet ....................................... 152
`Streaming Protocols Based on UDP ................. 152
`Summary ..................................... 152
`
`Part 3 Caching ........................ 155
`
`8 Caching .................................... 157
`
`Advantages of Caching .......................... 158
`Disadvantages of Caching ........................ 158
`Conditional Requests ............................ 158
`Guaranteeing Freshness of Cached Documents ........ 161
`Cache Hit Ratio ................................ 165
`On-Demand Caching ........................... 166
`On-Command Caching .......................... 169
`Caching of Data Requiring Authentication ............ 171
`Caching Data from Local Hosts .................... 171
`Caching and SSL ............................... 172
`Caching Queries ............................... 172
`HTTP/1.1 Cache Control Terminology ............... 173
`HTTP/1.1 Cache Control ......................... 176
`Summary .................................... 179
`9 Caching and Online Advertising. . . . . . . . . . . . . . . . . 181
`
`"Cache Busting" ............................... 182
`Alternatives for "Cache Busting" ................... 183
`Copyright Violation by Cache ..................... 193
`Summary .................................... 194
`10 Cache Architectures ........................... 195
`
`Components of a Cache Architecture ............... 196
`Existing Cache Architectures ...................... 198
`Summary .................................... 203
`11 Garbage Collection . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
`
`The Idea of Garbage Collection .................... 206
`Cache Garbage Collection ........................ 207
`Run-Time Cache Management .................... 210
`Summary .................................... 210
`
`Part 4
`Filtering, Monitoring, and Access Control ... 211
`
`12 Filtering ..................................... 213
`
`URL Filtering .................................. 214
`Content Rating ................................ 217
`Censorship on the Internet ....................... 219
`Request Header Filtering ......................... 219
`Request Content Filtering ........................ 222
`Response Header Filtering ........................ 223
`Response Content Filtering ....................... 224
`Summary ..................................... 224
`13 Access Control. ............................... 227
`
`Access Control By User Authentication .............. 228
`Access Control By Client Host Address ............... 228
`Summary ..................................... 230
`14 Logging and Monitoring ....................... 233
`
`Format of Access Log Files ........................ 234
`Log Analyzers ................................. 234
`Analyzing Proxy Logs ............................ 237
`Determining the Peak Load ....................... 243
`Monitoring ................................... 243
`Summary ..................................... 244
`Part 5 Security. . . . . . . . . . . . . . . . . . . . . . . . 245
`
`15 Encryption and Authentication Security ........... 247
`
`Single Key Cryptography ......................... 248
`Public Key Cryptography ......................... 250
`Authentication with Public Key Cryptography ......... 252
`Message Digest (Hash) Algorithms ................. 253
`The MD5 Algorithm ............................ 255
`Certificates ................................... 255
`Summary .................................... 257
`16 Setup Security ................................ 259
`
`Server User ID ................................. 260
`File Ownerships and Permissions (UNIX) ............. 260
`Common Security Holes in Server Software Itself. ...... 261
`Access Control Based on Incoming IP Address ......... 265
`Reverse Proxy Security .......................... 266
`Firewall Router Configuration ..................... 266
`Information Revealed in HTTP Headers .............. 267
`Protocol Verification ............................ 270
`Capturing Authentication Credentials ............... 271
`Securing the Logs .............................. 271
`Passwords in FTP URLS .......................... 272
`Java, JavaScript, and ActiveX Security ............... 272
`File Upload Security ............................. 273
`Summary .................................... 273
`Part 6 Performance .................... 275
`
`17 Performance ................................. 277
`
`DNS Lookups ................................. 278
`Protocol Performance ........................... 283
`Cache Performance ............................. 286
`Filtering ...................................... 286
`Summary .................................... 290
`18 Capacity Planning ............................ 291
`
`Purposes Of The Proxy Server ..................... 292
`Estimated Load ................................ 294
`Average Transaction Time ........................ 299
`Choosing the Proxy Hierarchy ..................... 306
`Choosing the Hardware and Software ............... 308
`Disk Space .................................... 309
`Cache Configuration ............................ 312
`Summary ..................................... 313
`19 Load Balancing ............................... 315
`
`DNS Round-Robin-Based Load Balancing ............. 316
`Hash-Function-Based Proxy Selection ............... 317
`CARP-Cache Array Routing Protocol ............... 318
`ICP-Based Proxy Selection ........................ 322
`Client Proxy Auto-Configuration in Load Balancing ..... 322
`Other Load Balancing Solutions .................... 324
`Summary ..................................... 324
`20 Reverse Proxying .............................. 325
`
`Uses of Reverse Proxy Servers ..................... 326
`Components of a Reverse Proxy Setup ............... 328
`Secure Reverse Proxying ......................... 338
`Dynamic Content and Reverse Proxying ............. 341
`Alternatives to Reverse Proxying ................... 342
`Summary ..................................... 342
`Part 7 Deployment Scenarios . . . . . . . . . . . 345
`
`21 Case Studies ................................ 347
`
`Case Study 1: A Small Internet Software Company ..... 348
`Case Study 2: A Small Accounting Firm .............. 349
`Case Study 3: A Medium-sized Company ............ 351
`Case Study 4: A Large Corporation ................. 352
`Summary ..................................... 354
`22 Trouble-Shooting ............................. 357
`
`Debugging with Telnet .......................... 358
`Packet Sniffing ................................. 363
`Tracing System Calls ............................ 364
`Tracing the Network Route ....................... 369
`HTTP Tracing .................................. 370
`Trouble-Shooting the Cache ...................... 371
`Summary .................................... 373
`Part 8 Appendices ..................... 375
`
`Appendix A
`Proxy Auto-Configuration Support in Clients ....... 377
`
`Auto-Configuration File Mime Type ................. 378
`Setting Up an Auto-Configuration File ............... 379
`Predefined JavaScript Utility Functions for
`Proxy Auto-Configuration File ..................... 380
`Example Proxy Auto-Configuration Scripts ........... 388
`Generating Proxy Auto-Configuration File from CGI .... 395
`CARP In Proxy Auto-configuration .................. 396
`Summary .................................... 396
`Appendix B
`Wildcard Expressions .......................... 399
`
`Regular Expression Syntax ........................ 400
`Using Regular Expressions for URLS ................. 404
`Shell Expression Syntax .......................... 409
`Summary .................................... 411
`Appendix C
`Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
`
`References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
`
`Index .......................................... 423
`
`
`
`Preface
`
`Thanks, Ari. You're a fountain of information as usual.
`
`-Gregg Ulrich
`
`During the four years that I have worked on the fast-evolving
`World Wide Web technology, specifically proxy servers, I have
`felt uneasy about the fact that no one has really had time to
`produce good documentation that would provide true insight and
`understanding of this great technology. We engineers who
`understand how things work are often just too busy to explain
`them to others. After all, an Internet year is just 52 days
`long, and the average release cycle tends to be less than nine
`(Earth) months. So many new features, so little time.
`I've always taken great pride in promptly answering all
`the E-mail that I get. But there have been times when I have
`been completely overwhelmed by the number of messages
`and the span of questions that I have received. And I've realized
`that I have explained the same things over and over again in E-mail.
`If only there were a book that covered all these issues, people could just
`read it and receive the same answer I would give them anyway-and
`maybe learn a bit more and gain depth in their understanding about how
`things work, and why they work like that.
`So one day I got an E-mail from Mary Franz at Prentice Hall asking if
`I'd be interested in writing a book on Web proxy servers. Well, not being
`able to say "no" to more work, I said yes, sure, might as well. I have
`always found working on the Web to be rewarding. It's great to receive
`messages from the Internet, thanking and praising my efforts.
`
`
`
`Acknowledgments
`
`The list of people I want to thank and acknowledge is a
`long one, and I'm afraid I'll forget someone-so if I do,
`please believe me it is unintentional.
`First, I want to thank Tim Berners-Lee, the inventor of
`the World Wide Web, for his enthusiasm and invention. I
`had the pleasure of working with Tim for a year when we
`were still at CERN in Switzerland. My supervisor Robert
`Cailliau deserves a very special thank you for his support
`and inspiration, too. I had a lot of fun in Geneva, thanks to
`him. But don't remind me of the time my apartment was
`without gas for a week .... And my colleague Henrik Frystyk
`Nielsen, are you still working on that FTP module?
`Next, I want to extend much gratitude to Marc
`Andreessen for his belief in Web navigation software, for
`making people see the coolness of the Web, and for hiring
`me. By the way, I was never interviewed for my job at
`Netscape; Marc simply dropped me an E-mail one day
`and asked if I wanted to work for him. And my answer was as straightforward
`as his question had been. Equal appreciation belongs to Jim Clark
`for founding our great company, believing in it, encouraging us at the
`time of doubt-and he's a funny guy on top of everything else!
`My work group at Netscape deserves a big bucket o' kudos, especially
`Pinaki Shah who read the manuscript from cover to cover and provided a
`truckload of feedback. Many of the people I work with reviewed this
`book and provided valuable feedback which helped make this book better.
`The head hunting here in Silicon Valley is bloody, so I'm not going to
`list the names here; otherwise, I might find my valuable colleagues being
`recruited by other companies. But you know who you are, and consider
`the fact that your names are withheld an expression of special appreciation!
`The management at Netscape has been very supportive and excited
`about this book project, and I want to thank them as well.
`Paul Mockapetris deserves a huge thank you for his efforts and the priceless
`feedback he's provided for me while reviewing the manuscript.
`Other people I should definitely mention here are Thomas Kroeger
`for his bibliography entries; Jeffrey Mogul for his extensive work on the
`caching chapter of the HTTP/1.1 standard; and Neil Smith for his long-term
`help and support when he was at the Hensa UNIX in the United
`Kingdom, and later here at Netscape.
`Finally, huge kudos to Mary Franz and the rest of the Prentice Hall
`staff I have had the pleasure to work with. It was a lot of fun, and I hope
`to do it again some time soon!
`
`Ari Luotonen
`October 1, 1997
`
`
`
`PART I
`
`Overview of Firewalls and Proxy Servers
`
`The first part of this book provides an overview of firewalls,
`proxy servers, and their internal architectures. This part sets
`the terminology and lays the groundwork for understanding
`the latter parts of this book.
`
`
`
`
`CHAPTER 1
`
`Firewall Overview
`
`The bloom of the Internet has encouraged companies to get
`online and establish their own Internet presence by setting up
`Web servers that contain information about the company, its
`products, order information, and oftentimes, ordering online.
`This exposure to the Internet involves a risk of hackers
`attempting to break into the internal network of the company
`or institution. To protect these internal networks from outside
`intruders, firewalls are set up.
`
`This first chapter provides an overview of firewalls, with their basic concepts
`and components. It provides a top-level view of proxy servers in general,
`their benefits, and role in the firewall solution. The rest of Part I goes into
`more detail about the internal architecture of proxy servers, and provides
`the framework for understanding the rest of this book.
`
`TERMINOLOGY
`
`Before we go on, let's review some important terms that are used throughout
`this book, and that should be fully understood before continuing. A
`full list of Web-related technical terms is found in Appendix C.
`
`resource A file, HTML document, image, applet, or any other object
`addressable by a single URL. Do not confuse with an HTML page
`which may consist of multiple resources (the HTML text itself, and
`several inlined images and applets).
`object Same as "resource."
`URL Uniform Resource Locator; a World Wide Web resource address,
`for example http://www.prenhall.com
`user An actual user (a person), usually using the client software to surf the
`Net.
`client The client side of a request-response transaction; the client side
`makes the request, and server side responds. The client may be the
`Web navigation software program, such as the Netscape Navigator
`[1] or Internet Explorer [2]. However, a proxy server acting as a
`client may also be referred to as a "client."
`server A program accepting and servicing requests from clients; a server
`may be an origin server or a proxy server.
`origin server The Web server that hosts the resource, such as a Web page.
`destination server Same as "origin server."
`proxy (server) An intermediary server that accepts requests from clients
`and either forwards them to other proxy servers or the origin server, or
`services them from its own cache. A proxy acts as both a server
`and a client: the proxy is a server to the client connecting to it,
`and a client to servers that it connects to.
`host A physical computer, running client, server, proxy, or other software.
`
`
`
`FIREWALLS
`
`The term "firewall" does not refer to any specific piece of hardware or
`software. "Firewall" is just a general name for hardware, software, or often
`the combination of the two, used to protect the internal network
`from intruders.
`Think of firewalls as transit devices that examine traffic before they let
`it pass. Firewall hardware consists of routers and dedicated computers that
`run firewall software. Routers typically operate at the network level and
`may filter IP packets based on the values of the packet header fields,
`such as the source and destination addresses. Routers can be configured to
`allow only certain types of packets to pass, allow connections to be established
`only from a certain set of [internal] hosts, and block any attempts
`to access internal hosts from the external (unsafe) Internet. This screening
`process is often referred to as packet filtering.
`More complicated application-level firewalls perform more complex
`tasks that are beyond the capabilities of router hardware. They understand
`the application and the application content, and hence can perform
`such intelligent tasks as filtering mail based on its source. Future firewalls
`may have even higher-level filters that scan for dirty pictures, or even bad
`grammar, and such. The key thing for firewalls is to understand the contents
`being transferred, not the protocol per se, even though understanding
`the protocol is often a prerequisite to finding out what the
`content is.
`A component of a firewall is often one or more application-level proxy
`servers. An application-level proxy server is familiar with the protocol that
`is being relayed through the proxy server, and because of this knowledge
`it is able to perform a higher level of access control, monitoring, logging,
`and performance improvement-related tasks than would otherwise be
`possible.
`The main subject of this book is specifically World Wide Web proxy
`server software. While proxy servers exist for other protocols as well, we
`will focus on protocols related to the World Wide Web only.
`The concept of Web proxy servers is introduced in Chapter 2. The
`Hypertext Transfer Protocol, or HTTP for short, is described in Chapter 4.
`Other Web-related protocols are discussed in the other chapters of Part II.
`The rest of this chapter will describe the overall operation of firewalls,
`their different components, and define the terms used throughout this
`book.
`
`Packet Filtering by Routers
`A router is hardware that can perform simple packet filtering. Packet filtering
`means the process of inspecting the TCP/IP [3] header data in network
`packets, and based on that information, denying or allowing the
`packet to proceed. Examples of typical filtering rules might be
`
`1. Allow outbound connections from the internal network to the
`proxy server host
`2. Allow outbound connections from the proxy server host to the
`external Internet
`3. Block all attempts to connect directly from the internal network
`to the Internet
`4. Allow inbound connections to the SMTP [4] port (allows E-mail
`to be received)
`
`Note that the above rules would not all exist on the same router-rules 1
`and 2 would be on different routers. See Figure 1.4 for an example of a
`firewall configuration using two (or more) routers.
`Also note that the above example is not complete. There may be a
`wealth of other rules to allow other protocols for other applications to
`work through the firewall, such as receiving USENET news feeds.
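
The rule list above can be read as a first-match table over packet header fields. The following is only a minimal sketch of that idea, not how any particular router is configured: the addresses, the `Rule` and `Packet` types, and the default-deny fallback are all assumptions made for illustration, and for brevity all four rules are placed in one table even though, as noted, rules 1 and 2 would normally live on different routers.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Packet:
    src: str       # source IP address from the packet header
    dst: str       # destination IP address from the packet header
    dst_port: int  # destination TCP port


@dataclass(frozen=True)
class Rule:
    action: str                     # "allow" or "deny"
    src: str                        # source network, CIDR notation
    dst: str                        # destination network, CIDR notation
    dst_port: Optional[int] = None  # None matches any port

    def matches(self, pkt: Packet) -> bool:
        return (ip_address(pkt.src) in ip_network(self.src)
                and ip_address(pkt.dst) in ip_network(self.dst)
                and (self.dst_port is None or self.dst_port == pkt.dst_port))


def filter_packet(rules: List[Rule], pkt: Packet, default: str = "deny") -> str:
    """Return the action of the first rule that matches, else the default."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


# Hypothetical addresses, chosen only for this example.
INTERNAL = "10.0.0.0/8"
PROXY = "192.0.2.10/32"
MAIL = "192.0.2.25/32"
ANY = "0.0.0.0/0"

RULES = [
    Rule("allow", INTERNAL, PROXY),  # rule 1: internal network -> proxy host
    Rule("allow", PROXY, ANY),       # rule 2: proxy host -> external Internet
    Rule("deny", INTERNAL, ANY),     # rule 3: no direct internal -> Internet
    Rule("allow", ANY, MAIL, 25),    # rule 4: inbound SMTP
]

if __name__ == "__main__":
    print(filter_packet(RULES, Packet("10.1.2.3", "198.51.100.7", 80)))
    print(filter_packet(RULES, Packet("192.0.2.10", "198.51.100.7", 80)))
```

Note that the sketch looks only at addresses and ports, which mirrors the limitation of packet filtering: nothing in the table can refer to users or to application-level content.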
`A simple firewall solution might consist of just a single router; Figure
`1.1 illustrates such a scenario. The router sits between the internal network
`and the Internet connection and packet filters all the traffic between
`the two networks. Usually, most incoming connections (except for mail
`transfers and news feeds) are blocked, whereas outbound connections
`are allowed with few limitations. This will shield the internal network
`from intruders trying to break in from the Internet, while not interfering
`terribly with internal users' ability to surf the Internet.
`
`However, this simple firewall provides only a single level of protection.
`If an intruder manages to break through to the internal network,
`there is nothing further preventing the attack. The next section covers
`proxy servers with and without routers to provide a more secure firewall
`with more protective layers.
`The filtering capabilities of routers are limited to the information in the
`TCP/IP headers; data at the application protocol level (such as HTTP) is
`beyond the capabilities of routers. Routers cannot enforce authentication of
`users, either; their access control capabilities are limited to the IP address level.
`Furthermore, mere routers are not able to provide meaningful logs of transactions
`that occur. All of these shortcomings can be remedied by application-level
`proxy servers, described in the next section.
`
`Figure 1.1 A simple firewall consisting of only a single router performing packet
`filtering.
`
`Application-level Proxy Servers
`Application-level gateways, or proxy servers, are software programs that
`are familiar with a specific protocol-or several protocols-that they
`relay. Clients on the internal network will make requests to the proxy
`server instead of connecting directly to a remote service, and the proxy
`server will perform the actual request on behalf of the client.
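
As a small illustration of "performing the request on behalf of the client" for HTTP: a client talking to a proxy sends the full URL in its request line, and the proxy works out which origin server to contact and what request line to send there. The sketch below shows only that one step, under stated assumptions: the function name `plan_forward` is hypothetical, only absolute `http://` URLs are handled, and no network connection is made.

```python
from urllib.parse import urlsplit


def plan_forward(request_line: str):
    """Split a proxy-style request line into (host, port, origin request line).

    Example input: "GET http://www.prenhall.com/ HTTP/1.0"
    """
    method, url, version = request_line.split()
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("this sketch only relays absolute http:// URLs")
    port = parts.port or 80          # default HTTP port when none is given
    path = parts.path or "/"         # an empty path means the server root
    if parts.query:
        path += "?" + parts.query
    # The origin server receives only the path, not the full URL.
    return parts.hostname, port, f"{method} {path} {version}"


if __name__ == "__main__":
    print(plan_forward("GET http://www.prenhall.com/ HTTP/1.0"))
```

A real proxy would go on to open a connection to the returned host and port, send the rewritten request line and headers, and relay the response back to the client.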
`Proxy servers are often run on dual-homed hosts-server machines
`that have two (or more) network interfaces. Each network interface has its
`own IP address. Typically, one of the interfaces is connected to the internal
`network, the other to the Internet. The proxy server software running
`on the host machine will relay the authorized traffic between the two
`interfaces and block traffic that is denied. Figure 1.2 illustrates a proxy
`server on a dual-homed host as a firewall.
`
`In a way, a dual-homed host running a proxy server performs a function similar
`to that of a router, with the difference that it has more intelligence
`and provides a richer set of features:
`
`Latency reduction, bandwidth conservation Application-level proxies
`are able to cache data and service requests from their cache. By
`caching content, it is possible to trade disk space for faster responses
`and conserve network bandwidth. Part III of this book focuses on
`caching.
`
`[Figure 1.2 diagram labels: Internal Network, Client, Firewall, two separate network interfaces, Server, Internet]
`
`Figure 1.2 A firewall using a single dual-homed host running a proxy server.
`
`Advanced access control Application-level proxies can perform
`authentication and use the authentication information in access
`control. Access control is the subject of Chapter 13.
`Advanced filtering Application-level proxies can perform more
`sophisticated filtering because they understand the internals of the
`protocol and are therefore not limited to the information available
`in the TCP/IP headers in network packets as regular packet filters
`are. Chapter 12 discusses filtering.
`Application-level proxies may actually change the request: filter out
`sensitive information, insert additional information, or remap the
`request to a mirror site.
`Logging and auditing Application-level proxies are able to provide
`extensive logs of transactions that occur through the proxy server. This
`enables auditing of Web traffic. Logging is covered in Chapter 14.
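
Two of the features listed above, caching and logging, can be pictured together in a few lines. This is a deliberately simplified sketch: the `CachingProxy` class and its `fetch` callback are hypothetical names introduced here, the cache lives in memory rather than on disk, and cached entries never expire, so freshness is ignored entirely.

```python
from typing import Callable, Dict, List, Tuple


class CachingProxy:
    """Serve repeated requests from a cache and log every transaction."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch                    # retrieves a URL from the origin server
        self._cache: Dict[str, bytes] = {}     # URL -> response body
        self.log: List[Tuple[str, str]] = []   # (URL, "HIT" or "MISS")

    def get(self, url: str) -> bytes:
        if url in self._cache:
            # Cache hit: no traffic to the origin server is needed.
            self.log.append((url, "HIT"))
            return self._cache[url]
        # Cache miss: fetch once, store the copy, and record the transaction.
        body = self._fetch(url)
        self._cache[url] = body
        self.log.append((url, "MISS"))
        return body


if __name__ == "__main__":
    proxy = CachingProxy(lambda url: b"response for " + url.encode())
    proxy.get("http://www.prenhall.com/")
    proxy.get("http://www.prenhall.com/")
    print(proxy.log)
```

The second request is answered from the stored copy, which is the storage-for-speed trade described in the first item of the list, and the log gives the kind of per-transaction record that a packet-filtering router cannot provide.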
`
`Usually, routers and application-level proxy servers are used together
`to provide a more secure firewall solution. In this scenario, the proxy
`server does not need to be run on a dual-homed host as the network level
`traffic control is carried out by the router.
`
`[Figure 1.3 diagram labels: Internal Network, Client, Router, Server]
`
`Figure 1.3 A firewall with a proxy server, protected by a router.
`
`Figure 1.3 illustrates a configuration where the router protects the
`proxy server from the Internet. The router will drop any connection
`attempts from the Internet trying to come into the intranet, unless they
`target one of the explicitly allowed ports, such as SMTP (for incoming mail). A
`typical rule set for the router in this type of firewall might be
`
`1. Allow outbound connections from the internal network to the
`proxy server host/port.
`2. Deny any outbound connections from the internal network
`directly to the Internet. This rule forces the proxy to be used to
`go out, allowing the proxy server to be the single gateway out,
`making it possible to collect an exact log of all activity that occurs
`between the Internet and the intranet.
`3. Allow outbound connections from the proxy server host out to
`the Internet; or allow a set of well-known ports, such as 80 for
`HTTP and 443 for HTTPS (see page 135 for a list of well-known
`Internet protocol ports related to the WWW).
`4. Deny all inbound connections from the Internet to the proxy
`server, or any host in the internal network. E-mail and news feed
`traffic, as well as some other well-known protocols may be an
`exception.
`
`An even more secure firewall architecture is shown in Figure 1.4. The
`proxy server host (often referred to as the firewall bastion) is surrounded
`by a router on both sides. This encloses the proxy server in its own subnetwork
`and provides three layers of protection. This subnetwork is often
`referred to as the DMZ, the demilitarized zone. The DMZ is the part of
`the network which lies between the Internet and the internal network
`(intranet). It is more exposed to the threats of the Internet than the internal
`hosts, and more stringent security measures must be taken in that
`zone. At the same time, the DMZ shields the rest of the intranet from the
`threats of the Internet.
`
`[Figure 1.4 diagram labels: Internal Network, DMZ, Internet]
`
`Figure 1.4 A firewall with a proxy server enclosed in a subnet protected by two
`routers. The area between the routers is the DMZ.
`
`An example router rule set for this type of firewall configuration follows.
`
`Inner router:
`
`1. Allow outbound connections from the internal network to the
`bastion host.
`2. Block all other outbound connections from the internal network.
`3. Block all inbound connections attempting to enter the internal
`network, either from the bastion host, or from any host in the
`external Internet. This rule protects the internal network even if
`the outer router and the bastion host get compromised.
`
`Proxy server on the bas