Speeding Up the Web
Performance Tuning

O’REILLY

Patrick Killelea
`
`Microsoft Corp. Exhibit 1044
`
`Web Performance Tuning
`
`Patrick Killelea
`
Beijing · Cambridge · Köln · Paris · Sebastopol · Taipei · Tokyo

O’REILLY
`
`Web Performance Tuning
`by Patrick Killelea
`
`Copyright © 1998 O’Reilly & Associates, Inc. All rights reserved.
`Printed in the United States of America.
`
`Published by O’Reilly & Associates, Inc., 101 Morris Street, Sebastopol, CA 95472.
`
`Editor: Linda Mui
`
`Production Editor: Madeleine Newell
`
`Printing History:
`
`October 1998:
`
`First Edition.
`
Nutshell Handbook and the Nutshell Handbook logo are registered trademarks of O’Reilly &
Associates, Inc. Java™ and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc., in the United States and other countries. O’Reilly &
Associates, Inc. is independent of Sun Microsystems. The association between the image of a
hummingbird and the topic of web performance tuning is a trademark of O’Reilly &
Associates, Inc.
`
`Netscape, Netscape Navigator, and the Netscape Communications Corporate logos are
`trademarks and tradenames of Netscape Communications Corporation. Internet Explorer and
`the Internet Explorer logo are trademarks and tradenames of Microsoft Corporation. All other
`product names and logos are trademarks of their respective owners. Many of the designations
`used by manufacturers and sellers to distinguish their products are claimed as trademarks.
`Where those designations appear in this book, and O’Reilly & Associates, Inc. was aware of
`a trademark claim, the designations have been printed in caps or initial caps.
`
`Appendix A is Copyright © 1998 Netscape Communications Corporation. Reproduced with
`permission.
`
`While every precaution has been taken in the preparation of this book, the publisher assumes
`no responsibility for errors or omissions, or for damages resulting from the use of the
`information contained herein.
`
This book is printed on acid-free paper with 85% recycled content, 15% post-consumer waste.
`O’Reilly & Associates is committed to using paper with the highest recycled content available
`consistent with high quality.
`
ISBN: 1-56592-379-0
`
In this chapter:
- Parameters of Performance
- Benchmark Specifications and Benchmark Tests
- Web Performance Measuring Tools and ...
- Recommendations

Web Performance Measurement
`Parameters of Performance
`
There are four classic parameters describing the performance of any computer system:
latency, throughput, utilization, and efficiency. Tuning a system for performance
can be defined as minimizing latency and maximizing the other three
parameters. Though the definition is straightforward, the task of tuning itself is not,
because the parameters can be traded off against one another and will vary with
the time of day, the sort of content served, and many other circumstances. In addition,
some performance parameters are more important to an organization’s goals
than others.
`
Latency and Throughput
`
Latency is the time between making a request and beginning to see a result. Some
define latency as the time between making a request and the completion of the
request, but this definition does not cleanly distinguish the psychologically significant
time spent waiting, not knowing whether your request has been accepted or
understood. You will also see latency defined as the inverse of throughput, but
this is not useful because latency would then give you the same information as
throughput. Latency is measured in units of time, such as seconds.
`
Throughput is the number of items processed per unit time, such as bits transmitted
per second, HTTP operations per day, or millions of instructions per second
(MIPS). It is conventional to use the term bandwidth when referring to throughput
in bits per second. Throughput is found simply by adding up the number of
items and dividing by the sample interval. This calculation may produce correct
but misleading results because it ignores variations in processing speed within the
sample interval.
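The way an interval average can hide variation is easy to see with numbers. The following Python sketch is only an illustration: the per-window operation counts are invented, and the 10-second window length is an assumption.

```python
def throughput(items, seconds):
    """Items processed per second over one sample interval."""
    return items / seconds

# Hypothetical counts of HTTP operations in six consecutive 10-second windows.
ops_per_window = [500, 480, 20, 0, 510, 490]
window_seconds = 10

# One figure for the whole minute.
overall = throughput(sum(ops_per_window), window_seconds * len(ops_per_window))

# The same data, one figure per window.
per_window = [throughput(n, window_seconds) for n in ops_per_window]

print(f"overall: {overall:.1f} ops/sec")
print(f"per window: {per_window}")
print(f"worst window: {min(per_window):.1f} ops/sec")
```

The overall figure is correct for the minute as a whole, yet it says nothing about the window in which no operations completed at all.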
`
`The following three traditional examples help clarify the difference between
`latency and throughput:
`
1. An overnight (24-hour) shipment of 1000 different CDs holding 500 megabytes
each has terrific throughput but lousy latency. The throughput is (500 × 2^20 × 8 × 1000)
bits/(24 × 60 × 60) seconds = about 49 million bits/second,
which is better than a T3’s 45 million bits/second. The difference is that the
overnight shipment bits are delayed for a day and then arrive all at once, but
T3 bits begin to arrive immediately, so the T3 has much better latency, even
though both methods have approximately the same throughput when considered
over the interval of a day. We say that the overnight shipment is bursty
traffic.
`
2. Supermarkets would like to achieve maximum throughput per checkout clerk
because they can then get by with fewer of them. One way for them to do this
is to increase your latency, that is, to make you wait in line, at least up to the
limit of your tolerance. In his book Configuration and Capacity Planning for
Solaris Servers (Prentice Hall), Brian Wong phrased this dilemma well by saying
that throughput is a measure of organizational productivity while latency is
a measure of individual productivity. The supermarket may not want to waste
your individual time, but it is even more interested in maximizing its own
organizational productivity.
`
3. One woman has a throughput of one baby per 9 months, barring twins or triplets,
etc. Nine women may be able to bear 9 babies in 9 months, giving the
group a throughput of 1 baby per month, even though the latency cannot be
decreased (i.e., even 9 women cannot produce 1 baby in 1 month). This
mildly offensive but unforgettable example is from The Mythical Man-Month,
by Frederick P. Brooks (Addison Wesley).
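The arithmetic behind examples 1 and 3 can be checked with a few lines of Python. This is a sketch using only the figures quoted above; the variable names are illustrative.

```python
BITS_PER_BYTE = 8
MEGABYTE = 2 ** 20  # bytes

# Example 1: overnight shipment of CDs versus a T3 line.
cd_count = 1000
cd_megabytes = 500
shipment_seconds = 24 * 60 * 60   # latency of the shipment: one full day

shipment_bits = cd_megabytes * MEGABYTE * BITS_PER_BYTE * cd_count
shipment_bps = shipment_bits / shipment_seconds
t3_bps = 45_000_000               # T3 figure quoted in the text

print(f"shipment: {shipment_bps / 1e6:.1f} million bits/sec")
print(f"T3:       {t3_bps / 1e6:.1f} million bits/sec")

# Example 3: more workers raise throughput but leave latency unchanged.
months_per_baby = 9
women = 9
group_throughput = women / months_per_baby   # babies per month for the group
group_latency = months_per_baby              # months until any one baby arrives

print(f"group throughput: {group_throughput:.0f} baby/month, "
      f"latency still {group_latency} months")
```

The shipment works out to roughly 48.5 million bits per second, which the text rounds to about 49 million.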
`
Although high throughput systems often have low latency, there is no causal link.
You’ve just seen how an overnight shipment can have high throughput with high
latency. Large disks tend to have better throughput but worse latency: the disk is
physically bigger, so the arm has to seek longer to get to any particular place. The
latency of packet network connections also tends to increase with throughput. As
you approach your maximum throughput, there are simply more packets to put on
the wire, so a packet will have to wait longer for an opening, increasing latency.
This is especially true for Ethernet, which allows packets to collide and simply
retransmits them if there is a collision, hoping that it retransmitted them into an
open slot. It seems obvious that increasing throughput capacity will decrease
latency for packet switched networks. While this is true for latency imposed by
traffic congestion, it is not true for cases where the latency is imposed by routers
or sheer physical distance.
`
`Finally, you can also have low throughput with low latency: a 14.4kbps modem
`may get the first of your bits back to you reasonably quickly, but its relatively low
`throughput means it will still take a tediously long time to get a large graphic to
`you.
`
With respect to the Internet, the point to remember is that latency can be more
significant than throughput. For small HTML files, say under 2K, more of a
28.8kbps modem user’s time is spent between the request and the beginning of a
response (probably over one second) than waiting for the file to complete its
arrival (one second or under).
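A rough calculation shows why latency dominates for small files. The sketch below assumes the file is exactly 2K (2048 bytes), takes one second as the latency (the text says "probably over one second"), and ignores protocol overhead and modem compression.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Time to push size_bytes through a link, ignoring all overhead."""
    return size_bytes * 8 / bits_per_second

file_bytes = 2 * 1024        # a 2K HTML file
modem_bps = 28_800           # 28.8kbps modem
assumed_latency = 1.0        # seconds; an assumption, see lead-in

transfer = transfer_seconds(file_bytes, modem_bps)
total = assumed_latency + transfer

print(f"transfer time: {transfer:.2f} s")
print(f"latency share of total wait: {assumed_latency / total:.0%}")
```

Under these assumptions the transfer itself takes a little over half a second, so most of the user's wait is latency.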
`
`Measuring network latency
`
Each step on the network from client to server and back contributes to the latency
of an HTTP operation. It is difficult to figure out where in the network most of the
latency originates, but there are two commonly available Unix tools that can help.
Note that we’re considering network latency here, not application latency, which is
the time the applications running on the server itself take to begin to put a result
back out on the network.
`
If your web server is accessed over the Internet, then much of your latency is
probably due to the store and forward nature of routers. Each router must accept
an incoming packet into a buffer, look at the header information, and make a
decision about where to send the packet next. Even once the decision is made,
the router will usually have to wait for an open slot to send the packet. The
latency of your packets will therefore depend strongly on the number of router
hops between the web server and the user. Routers themselves will have connections
to each other that vary in latency and throughput. The odd, yet essential
thing about the Internet is that the path between two endpoints can change automatically
to accommodate network trouble, so your latency may vary from packet
to packet. Packets can even arrive out of order. You can see the current path your
packets are taking and the time between router hops by using the traceroute utility
that comes with most versions of Unix. (See the traceroute manpage for more
information.) A number of kind souls have made traceroute available from their
web servers back to the requesting IP address, so you can look at path and performance
to you from another point on the Internet, rather than from you to that
point. One page of links to traceroute servers is at
http://www.slac.stanford.edu/comp/net/wan-mon/traceroute-srv.html. Also see
http://www.internetweather.com/ for continuous measurements of ISP latency as
measured from one point on the Internet.
`
Note that traceroute does a reverse DNS lookup on all intermediate IPs so you can
see their names, but this delays the display of results. You can skip the DNS
lookup with the -n option and you can do fewer measurements per router (the
default is three) with the -q option. Here’s an example of traceroute usage:
`
% traceroute -q 2 www.umich.edu
traceroute to www.umich.edu (141.211.144.53), 30 hops max, 40 byte packets
 1  router.cableco-op.com (206.24.110.65)  22.779 ms  139.675 ms
 2  mv103.mediacity.com (206.24.105.8)  18.714 ms  145.161 ms
 3  grfge000.mediacity.com (206.24.105.55)  23.789 ms  141.473 ms
 4  bordercore2-hssi0-0.SanFrancisco.mci.net (166.48.15.249)  29.091 ms  39.856 ms
 5  bordercore2.WillowSprings.mci.net (166.48.22.1)  62.75 ms  63.16 ms
 6  merit.WillowSprings.mci.net (166.48.23.254)  82.212 ms  76.774 ms
 7  f-umbin.c-ccb2.umnet.umich.edu (198.108.3.5)  80.474 ms  76.875 ms
 8  www.umich.edu (141.211.144.53)  81.611 ms *
`
If you are not concerned with intermediate times and only want to know the current
time it takes to get a packet from the machine you’re on to another machine
on the Internet (or on an intranet) and back to you, you can use the Unix ping
utility. ping sends Internet Control Message Protocol (ICMP) packets to the named
host and reports the round-trip latency between you and the named host in milliseconds. A
latency of 25 milliseconds is pretty good, while 250 milliseconds is not good. See
the ping manpage for more information. Here’s an example of ping usage:
`
% ping www.umich.edu
PING www.umich.edu (141.211.144.53): 56 data bytes
64 bytes from 141.211.144.53: icmp_seq=0 ttl=248 time=112.2 ms
64 bytes from 141.211.144.53: icmp_seq=1 ttl=248 time=83.9 ms
64 bytes from 141.211.144.53: icmp_seq=2 ttl=248 time=82.2 ms
64 bytes from 141.211.144.53: icmp_seq=3 ttl=248 time=80.6 ms
64 bytes from 141.211.144.53: icmp_seq=4 ttl=248 time=87.2 ms
64 bytes from 141.211.144.53: icmp_seq=5 ttl=248 time=81.0 ms

--- www.umich.edu ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max = 80.6/87.8/112.2 ms
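The summary lines at the end of the ping output are simple statistics over the individual round-trip times. As a sketch, the same figures can be recomputed in Python from the six samples shown above:

```python
# Round-trip times (ms) copied from the ping example above.
rtts_ms = [112.2, 83.9, 82.2, 80.6, 87.2, 81.0]
sent = 6
received = len(rtts_ms)

loss_pct = 100 * (sent - received) / sent
rtt_min = min(rtts_ms)
rtt_avg = sum(rtts_ms) / len(rtts_ms)
rtt_max = max(rtts_ms)

print(f"{sent} packets transmitted, {received} packets received, "
      f"{loss_pct:.0f}% packet loss")
print(f"min {rtt_min} ms, avg about {rtt_avg:.2f} ms, max {rtt_max} ms")
```

The average comes to about 87.85 ms, which the sample output shows as 87.8. Note how the first packet (112.2 ms) pulls the average well above the typical value of about 82 ms.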
`
`Measuring network latency and throughput
`
When ping measures the latency between you and some remote machine, it sends
ICMP messages, which routers handle differently than the TCP segments used to
carry HTTP. Routers are sometimes configured to ignore ICMP packets entirely.
Furthermore, by default, ping sends only a very small amount of information, 56
data bytes, although some versions of ping let you send packets of arbitrary size.
For these reasons, ping is not necessarily accurate in measuring HTTP latency to
the remote machine, but it is a good first approximation. Using telnet and the Unix
talk program will give you a manual feel for the latency of a connection.
`
The simplest ways to measure web latency and throughput are to clear your
browser’s cache and time how long it takes to get a particular page from your
server, have a friend get a page from your server from another point on the Internet,
or log in to a remote machine and run time lynx -source http://myserver.com/
> /dev/null. This last method is sometimes referred to as the
stopwatch method of web performance monitoring.
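The stopwatch method can also be scripted. The following is a minimal Python sketch, not a replacement for the lynx command above: the function name time_fetch is hypothetical, and "time to first byte" here means the time until the first byte of the response body can be read, which is only an approximation of the latency the text describes. Python's urllib keeps no page cache, so each call goes to the server.

```python
import time
import urllib.request

def time_fetch(url, timeout=30):
    """Fetch url once; return (seconds to first body byte, total seconds, bytes)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        first = response.read(1)                 # blocks until body data arrives
        first_byte = time.perf_counter() - start
        rest = response.read()                   # drain the remainder of the body
    total = time.perf_counter() - start
    return first_byte, total, len(first) + len(rest)
```

Calling time_fetch("http://myserver.com/") from a remote machine gives a latency figure and, by dividing the byte count by the total time, a rough throughput figure. As with any single stopwatch reading, several runs are needed before the numbers mean much.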
`
Another way to get an idea of network throughput is to use FTP to transfer files to
and from a remote system. FTP is like HTTP in that it is carried over TCP. There
are some hazards to this approach, but if you are careful, your results should
reflect your network conditions. First, do not put too much stock in the numbers
the FTP program reports to you. While the first significant digit or two will probably
be correct, the FTP program internally makes some approximations, so the
number reported is only approximately accurate. More importantly, what you do
with FTP will determine exactly which part of the system is the bottleneck. To put
it another way, what you do with FTP will determine what you’re measuring. To
ensure that you are measuring the throughput of the network and not of the disk
of the local or remote system, you want to eliminate any requirements for disk
access which could be caused by the FTP transfer. For this reason, you should not
FTP a collection of small files in your test; each file creation requires a disk access.

Similarly, you need to limit the size of the file you transfer because a huge file will
not fit in the filesystem cache of either the transmitting or receiving machine, again
resulting in disk access. To make sure the file is in the cache of the transmitting
machine when you start the FTP, you should do the FTP at least twice, throwing
away the results from the first iteration. Also, do not write the file on the disk of
the receiving machine. You can do this with some versions of FTP by directing the
result to /dev/null. Altogether, we have something like this:
`
`ftp> get bigfile /dev/null
`
Try using the FTP hash command to get an interactive feel for latency and
throughput. The hash command prints hash marks (#) after the transfer of a block
of data. The size of the block represented by the hash mark varies with the FTP
implementation, but FTP will tell you the size when you turn on hashing:
`
`ftp> hash
`Hash mark printing on (1024 bytes/hash mark).
`ftp> get ers.27may
`200 PORT command successful.
`150 Opening BINARY mode data connection for ers.27may (362805 bytes).
`#############################################################################
`#############################################################################
`#############################################################################
`#############################################################################
`##############################################
`226 Transfer complete.
`362805 bytes received in 15 secs (24 Kbytes/sec)
`ftp> bye
`221 Goodbye.
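The same test can be sketched with Python's standard ftplib module. This is an illustration under assumptions, not the procedure used in the session above: the function name ftp_throughput, the anonymous login, and the default of two runs are all choices made for the example. The callback discards each block, which plays the role of /dev/null, and only the last run is reported so that the first run can warm the server's filesystem cache.

```python
import time
from ftplib import FTP

class ByteCounter:
    """Callback that counts the bytes it is given and throws the data away."""
    def __init__(self):
        self.total = 0

    def __call__(self, block):
        self.total += len(block)

def ftp_throughput(host, path, user="anonymous", passwd="", runs=2):
    """Download path `runs` times; return bytes/sec measured on the last run."""
    rate = 0.0
    with FTP(host) as ftp:
        ftp.login(user, passwd)
        for _ in range(runs):
            counter = ByteCounter()
            start = time.perf_counter()
            ftp.retrbinary(f"RETR {path}", counter)   # binary-mode transfer
            rate = counter.total / (time.perf_counter() - start)
    return rate
```

For comparison, the figure FTP printed in the session above follows from its own numbers: 362805 bytes in 15 seconds is about 23.6 Kbytes per second, reported as 24.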
`
You can use the Expect scripting language to run an FTP test automatically at regular
intervals. Other scripting languages have a difficult time controlling the terminal
of a spawned process; if you start FTP from within a shell script, for example,
execution of the script halts until FTP returns, so you cannot continue the FTP session.
Expect is designed to deal with this exact problem. Expect is well documented
in Exploring Expect, by Don Libes (O’Reilly & Associates).
`
You can of course also retrieve content via HTTP from your server to test network
performance, but this does not cleanly distinguish network performance from
server performance.
`
`Here are a few more network testing tools:
`
ttcp

ttcp is an old C program, circa 1985, for testing TCP connection speed. It
makes a connection on port 2000 and transfers zeroed buffers or data copied
from STDIN. It is available from ftp://ftp.arl.mil/pub/ttcp/ and distributed with
some Unix systems. Try which ttcp and man ttcp on your system to see if the
binary and documentation are already there.

nettest

A more recent tool, circa 1992, is Nettest, available at
ftp://ftp.sgi.com/sgi/src/nettest/. Nettest was used to generate some performance
statistics for vBNS, the very-high-performance backbone network service,
http://www.vbns.net/.
`
bing

bing attempts to measure bandwidth between two points on the Internet. See
http://web.cnam.fr/reseau/bing.html.
`
chargen

The chargen service, defined in RFC 864 and implemented by most versions
of Unix, simply sends back nonsense characters to the user at the maximum
possible rate. This can be used along with some measuring mechanism to
determine what that maximum rate is. The TCP form of the service sends a
continuous stream, while the UDP form sends a packet of random size for
each packet received. Both run on well-known port 19.
`
netspec

NetSpec simplifies network testing by allowing users to control processes
across multiple hosts using a set of daemons. It can be found at
http://www.tisl.ukans.edu/Projects/AAI/products/netspec/.
`
`Utilization
`
Utilization is simply the fraction of the capacity of a component that you are actually
using. You might think that you want all your components at close to 100%
utilization in order to get the most bang for your buck, but this is not necessarily
how things work. Remember that for disk drives and Ethernet, latency suffers
greatly at high utilization. A rule of thumb is that many components can run at
their best performance up to about 70% utilization. The perfmeter tool that comes
with many versions of Unix is a good graphical way to monitor the utilization of
your system.
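One common way to picture why latency suffers at high utilization is a simple single-server queueing model. The sketch below is an assumption brought in for illustration and is not taken from the text: it uses the textbook M/M/1 approximation, in which average response time equals service time divided by (1 - utilization), and a made-up service time of 10 ms. Real disks and Ethernet segments behave differently in detail.

```python
def mm1_response_time(service_time, utilization):
    """Average response time under the M/M/1 model (an idealized assumption)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1 - utilization)

service_ms = 10.0   # hypothetical time to serve one request on an idle component

for u in (0.1, 0.5, 0.7, 0.9, 0.99):
    print(f"{u:4.0%} utilized -> {mm1_response_time(service_ms, u):7.1f} ms")
```

In this model, response time roughly triples by 70% utilization and then climbs steeply, which is consistent in spirit with the rule of thumb above.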
`
Efficiency
`
Efficiency is usually defined as throughput divided by utilization. When comparing
two components, if one has a higher throughput at the same level of utilization,
it is regarded as more efficient. If both have the same throughput but one has
a lower level of utilization, that one is regarded as more efficient. While useful as a
basis for comparing components, this definition is otherwise irrelevant, because it
is only a division of two other parameters of performance.
`
A more useful measure of efficiency is performance per unit cost. This is usually
called cost efficiency. Performance tuning is the art of increasing cost efficiency:
getting more bang for your buck. In fact, the Internet itself owes its popularity to
the fact that it is much more cost-efficient than previously existing alternatives for
transferring small amounts of information. Email is vastly more cost-efficient than a
letter. Both send about the same amount of information, but email has near-zero
latency and near-zero incremental cost; it doesn’t cost you any more to send two
emails rather than one. Web sites providing product information are lower latency
and cheaper than printed brochures. As the throughput of the Internet increases
faster than its cost, entire portions of the economy will be replaced with more
cost-efficient alternatives, especially in the business-to-business market, which has
little sentimentality for old ways. First, relatively static information such as business
paperwork, magazines, books, CDs, and videos will be virtualized. Second,
the Internet will become a real-time communications medium.
`
The cost efficiency of the Internet for real-time communications threatens not only
the obvious target of telephone carriers, but also the automobile industry. That is,
telecommuting threatens physical commuting. Most of the workforce simply moves
bits around, either with computers, on the phone, or in face-to-face conversations,
which are, in essence, gigabit-per-second, low-latency video connections. It
is only these face-to-face conversations that currently require workers to buy cars
for the commute to work. Cars are breathtakingly inefficient, and telecommuting
represents an opportunity to save money. Look at the number of cars on an urban
highway during rush hour. It’s a slow river of metal, fantastically expensive in
terms of car purchase, gasoline, driver time, highway construction, insurance, and
fatalities. Then consider that most of those cars spend most of the day sitting in a
parking lot. Just think of the lost interest on that idle capital. And consider the cost
of the parking lot itself, and the office. As data transmission costs continue to
accelerate their fall, car costs cannot fall at the same pace. Gigabit connections
between work and home will inevitably be far cheaper than the daily commute,
for both the worker and employer. And at gigabit bandwidth, it will feel like
you’re really there.
`
`Benchmark Specifications and
`Benchmark Tests
`
For clarity, we should distinguish between benchmark specifications and benchmark
tests. There are several web benchmarks that may be implemented by more
than one test, since there are implementation details that do not affect the results
of the test. For example, a well-specified HTTP load is the same regardless of the
hardware and software used to generate the load and regardless of the actual bits
in the content. On the other hand, some benchmarks are themselves defined by a
test program or suite, so that running the test is the only way to run the benchmark.
We will be considering both specifications and tests in this section.
`
The point of a benchmark is to generate performance statistics that can legitimately
be used to compare products. To do this, you must try to hold constant all
of the conditions around the item under test and then measure performance. If the
only thing different between runs of a test is a particular component, then any difference
in results must be due to the difference between the components.
`
Exactly defining the component under test can be a bit tricky. Say you are trying
to compare the performance of Solaris and Irix in running Netscape server software.
The variable in the tests is not only the operating system, but also, by necessity,
the hardware. It would be impossible to say from a benchmark alone which
performance characteristics are due to the operating system and which are due to
the hardware. You would need to undertake a detailed analysis of the OS and the
hardware, which is far more difficult.
`
It may sound odd, but another valid way to think of a benchmark test is the creation
of a deliberate bottleneck at the subject of the test. When the subject is definitely
the weakest link in the chain, then the throughput and latency of the whole
system will reflect those of the subject. The hard part is assuring that the subject is
actually the weakest link, because subtle changes in the test can shift the bottleneck
from one part of the system to another, as we saw earlier with the FTP test of
network capacity. If you’re testing server hardware throughput, for example, you
want to have far more network throughput than the server could possibly need,
otherwise you may get identical results for all hardware, namely the bandwidth of
the network.
`
In this chapter:
- Brief History of the Web Browser
- How Browsers Work
- Popular Browsers
- Browser Speed
- Browser Tuning Tips
- Figuring Out Why the Browser Is Hanging
- Recommendations

Client Software
`
`Brief History of the Web Browser
`
The idea of a hypertext browser is not new. Many word processing packages such
as FrameMaker and formats such as PDF generate or incorporate hyperlinks. The
idea of basing a hypertext browser on common standards such as ASCII text and
Unix sockets was an advance first made by the Gopher client and server from the
University of Minnesota. Gopher proved to be extremely light and quick, but the
links were presented in a menu separate from the text, and Gopher did not have
the ability to automatically load images. The first drawback was solved by the
invention of HTML, and the second was solved in the first graphical HTML
browser, Mosaic, produced in 1993 at the University of Illinois National Center for
Supercomputing Applications (NCSA).
`
Many of the original students who developed Mosaic were among the founders of
Netscape the following year. An effort by the University of Illinois to commercialize
Mosaic led to the founding of Spyglass, which licensed its code to Microsoft for
the creation of Internet Explorer. Netscape and IE have been at the forefront of
browser advances in the last few years, but the core function of the browser, to
retrieve and display hypertext and images, has remained the same.
`
`How Browsers Work
`
The basic function of a browser is extremely simple. Any programmer with a good
knowledge of Perl or Java can write a minimal but functional text-only browser in
one day. The browser makes a TCP socket connection to a web server, usually on
port 80, and requests a document using HTTP syntax. The browser receives an
HTML document over the connection and then parses and displays it, indicating in
some way which parts of the text are links to other documents or images. When
the user selects one of the links, perhaps by clicking on it, the process starts all
over again, with the browser requesting another document. In spite of the
advances in HTML, HTTP, and Java, the basic functionality is exactly the same for
all web browsers.
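The connect-and-request step described above fits in a few lines. The following Python sketch is a bare illustration, not a browser: the function name fetch is hypothetical, it speaks only HTTP/1.0 so that the server closes the connection when the document is complete, and it does no parsing, link handling, or error recovery.

```python
import socket

def fetch(host, path="/", port=80, timeout=10):
    """Request one document with a plain HTTP/1.0 GET; return (headers, body)."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:        # server closed the connection: response complete
                break
            chunks.append(data)
    # Headers and body are separated by the first blank line.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body
```

The first line of the returned headers is the status line, and the body is the raw HTML that a real browser would go on to parse and display.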
`
Let’s take a look at the functionality of recent browsers in more detail, noting performance
issues. To get the ball rolling, the browser first has to parse the URL
you’ve typed into the “Location:” box or recognize which link you’ve clicked on.
This should be extremely quick. The browser then checks its cache to see if it has
that page. The page is looked up through a quick hashed database mapping URLs
to cache locations. Dynamic content should not be cached, but if the provider of
the content did not specify an immediate timeout in the HTTP header or if the
browser is not clever enough to recognize CGI output from URLs, then dynamic
content will be cached as well.
`
If the page requested is in the cache and the user has requested via a preference
setting that the browser check for updated versions of pages, then a good browser
will try to save time by making only an HTTP HEAD request to the server with an
If-Modified-Since line to check whether the cached page is out of date. If the
reply is that the cached page is still current, the browser simply displays the page
from the cache. If the desired web page is not in the cache, or is in the cache but
is stale, then the browser needs to request the current version of the page from the
server.
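The freshness check can be sketched with Python's standard http.client module. This is an illustration under assumptions: the function name cached_copy_is_current is hypothetical, the caller is assumed to have stored the page's Last-Modified date string alongside the cached copy, and the sketch treats only a 304 (Not Modified) reply as meaning "still current".

```python
import http.client

def cached_copy_is_current(host, path, last_modified, port=80, timeout=10):
    """Return True if the server answers 304 Not Modified for the cached page.

    last_modified is an HTTP date string, for example the Last-Modified
    value saved when the page was first cached.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # HEAD asks for headers only, so no page content crosses the network.
        conn.request("HEAD", path, headers={"If-Modified-Since": last_modified})
        response = conn.getresponse()
        response.read()
        return response.status == http.client.NOT_MODIFIED
    finally:
        conn.close()
```

Either way, a "still current" answer costs one small round trip instead of a full page transfer.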
`
In order to connect to a web server, the client machine needs to know the server’s
4-byte IP address (e.g., 198.137.240.92). But the browser usually has only the fully-qualified
server name (e.g., www.whitehouse.gov) from the user’s manual request
or the HTML of a previous page. The client machine must figure out which IP
address is associated with the DNS name of a web server. It does this via the distributed
database of domain name to IP mappings, that is, DNS. The client
machine makes a request of its local name server, which either knows the answer
or queries a higher-level server for the answer. If an IP answer is found, the client
can then make a request directly to the server by using that IP address. If no
answer is found, the request cannot proceed and the browser will display “No
DNS Entry” or some other cryptic message to the user.
`
The performance problem here is that DNS lookups are almost always implemented
with blocking system calls, meaning that nothing else can happen in the
browser until the DNS lookup succeeds or fails. If the local DNS server is overloaded,
the browser will simply hang until some rather long operating system timeout
expires, perhaps one minute. DNS services, like most other Internet services,
tend to get exponentially slower under heavy load. The only guaranteed way to
avoid the performance penalty associated with DNS is not to use it. You can simply
embed IP addresses in HTML or type them in by hand. This is hard on the
user, because DNS names are much easier to remember than IP addresses, and
because it is confusing to see an IP address appear in the “Location:” box of the
browser. Under good conditions, DNS lookup takes only a few tenths of a second.
Under bad conditions, it can be intolerably slow.
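The cost of a blocking lookup is easy to measure. The Python sketch below is illustrative only; the function name timed_lookup is hypothetical. socket.gethostbyname is a blocking call of the kind described above: nothing else in the calling thread runs until it returns or fails.

```python
import socket
import time

def timed_lookup(hostname):
    """Resolve hostname; return (IPv4 address or None, seconds elapsed)."""
    start = time.perf_counter()
    try:
        address = socket.gethostbyname(hostname)   # blocks until answer or failure
    except socket.gaierror:
        address = None                             # no DNS entry or lookup failed
    return address, time.perf_counter() - start
```

Calling timed_lookup on the same name twice in a row will often show a much shorter time on the second call if a cache sits anywhere along the lookup path, though that depends on the system's configuration.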
`
The client-side implementation of DNS is known as the resolver. The resolver is
usually just a set of library calls rather than a distinct program. Under Unix, for
example, the resolver is part of the libc library that most C programmers use for
their applications. Fortunately, most DNS resolvers cache recently requested DNS
names, so subsequent lookups are much faster than the first.
`
Once a browser client has the IP address of the desired server, it generates the
HTTP request describing its abilities and what it wants, and hands it off to the OS
for transmission. In generating the HTTP request, the browser will check for previously
received cookies associated with the desired page or DNS domain and send
those along with the request so that the web server can easily identify repeat customers.
The whole request is small, a hundred bytes or so. The OS attempts to
establish a TCP connection to the server and to give the server the browser’s
request. The browser then simply waits for the answer or a timeout. If no reply is
forthcoming, the browser does not know whether it is because the server is overloaded
and cannot accept a new connection, because the server crashed, or
because the server’s network connection is down.
`
When the response from the server arrives, the OS gives it to the browser, which
then checks the header for a valid HTTP response code and a new cookie. If the
response is OK, the browser stores any cookie, parses the HTML content or image,
and starts to calculate how to display it. Parsing is very CPU-intensive. You can
feel how fast your CPU is when you load a big HTML page, say 100K or more,
from cache or over a very fast network connection. Remember that parsing text is
a step distinct from laying out and displaying it. Netscape, in particular, will delay
the display of parsed text until the size of every embedded image is known. If the
image sizes are not included in the HTML <IMG> tag, this means that the browser
must request every image and receive a response before the user sees anything on
the page.
`
`The order in which an HTML page is laid out is up to the particular browser. In
`Netscape 4.x, web pages are rendered in the following order, once all the image
`sizes are known:
`
1. The text of the page is laid out. Links in the text are checked against a history
database, and if found, are shown in a different color to indicate that the user
has already clicked on them.

2. The boundary boxes for images are displayed with any ALT text for the image
and with the image icon.

3. Images are displayed, perhaps with progressive rendering, where the image
gains in definition as data arrives rather than simply filling in from top to bottom.
It is common for Netscape to load and show an image before showing
any text.

4. Subsidiary frames are loaded starting over at step 1.
`
`A browser may open multiple connections to the server. You can clearly
