`
`
`
`
`
`
`
`
`
`
`
`Advancing Technology
`for Humanity
`
`
`
`
`
`DECLARATION OF GORDON MACPHERSON
`
`I, Gordon MacPherson, am over twenty-one (21) years of age. I have never been
`convicted of a felony, and I am fully competent to make this declaration. I declare the following
`to be true to the best of my knowledge, information and belief:
`
`1. I am Director Board Governance & IP Operations of The Institute of Electrical and
`Electronics Engineers, Incorporated (“IEEE”).
`
`2. IEEE is a neutral third party in this dispute.
`
`3. I am not being compensated for this declaration and IEEE is only being reimbursed
`for the cost of the article I am certifying.
`
`4. Among my responsibilities as Director Board Governance & IP Operations, I act as a
`custodian of certain records for IEEE.
`
`5. I make this declaration based on my personal knowledge and information contained
`in the business records of IEEE.
`
`6. As part of its ordinary course of business, IEEE publishes and makes available
`technical articles and standards. These publications are made available for public
`download through the IEEE digital library, IEEE Xplore.
`
`7. It is the regular practice of IEEE to publish articles and other writings including
`article abstracts and make them available to the public through IEEE Xplore. IEEE
`maintains copies of publications in the ordinary course of its regularly conducted
`activities.
`
`8. The article below has been attached as Exhibit A to this declaration:
`
`
`A. R. van Renesse; A.S. Tanenbaum; A. Wilschut, “The design of a high-
`performance file server”, [1989] Proceedings. The 9th International
`Conference on Distributed Computing Systems, June 5 - 9, 1989.
`
`
`9. I obtained a copy of Exhibit A through IEEE Xplore, where it is maintained in the
`ordinary course of IEEE’s business. Exhibit A is a true and correct copy of the
`Exhibit, as it existed on or about July 13, 2021.
`
`10. The article and abstract from IEEE Xplore show the date of publication. IEEE
`Xplore populates this information using the metadata associated with the publication.
`
`445 Hoes Lane Piscataway, NJ 08854
`
`
`
`
`
`DocuSign Envelope ID: CE82FE46-D91B-4A08-8F8B-034CB5620879
`
`Netflix, Inc. - Ex. 1034, Page 000001
`
`IPR2021-01319 (Netflix, Inc. v. CA, Inc.)
`
`
`
`
`11. R. van Renesse; A.S. Tanenbaum; A. Wilschut, “The design of a high-performance
`file server” was published in the 1989 Proceedings of the 9th International
`Conference on Distributed Computing Systems. The 9th International Conference on
`Distributed Computing Systems was held from June 5 - 9, 1989. Copies of the
`conference proceedings were made available no later than the last day of the
`conference. The article is currently available for public download from the IEEE
`digital library, IEEE Xplore.
`
`12. I hereby declare that all statements made herein of my own knowledge are true and
`that all statements made on information and belief are believed to be true, and further
`that these statements were made with the knowledge that willful false statements and
`the like are punishable by fine or imprisonment, or both, under 18 U.S.C. § 1001.
`
`I declare under penalty of perjury that the foregoing statements are true and correct.
`
`
`
`
`Executed on: 7/14/2021
`
`Netflix, Inc. - Ex. 1034, Page 000002
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`EXHIBIT A
`
`
`
`
`Netflix, Inc. - Ex. 1034, Page 000003
`
`
`
`
`
`
`
`The design of a high-performance file server
`Publisher: IEEE
`
`
`R. van Renesse ; A.S. Tanenbaum ; A. Wilschut All Authors
`
`Paper Citations: 11
`Patent Citations: 19
`Full Text Views: 96
`
`
`Abstract:
`The Bullet server is a file server that outperforms traditional file servers by more than a
`factor of three. It achieves high throughput and low delay by a software design radically
`different from that of file servers currently in use. Whereas files are normally stored as a
`sequence of disk blocks, each Bullet server file is stored contiguously, both on disk and
`in the server's random access memory cache. Furthermore, it uses the concept of an
`immutable file to improve performance, to enable caching, and to provide a clean
`semantic model to the user. The authors describe the design and implementation of the
`Bullet server in detail, present measurements of its performance, and compare this
`performance to that of the SUN file server running on the same hardware.
`
`
`Netflix, Inc. - Ex. 1034, Page 000004
`
`
`
`Published in: [1989] Proceedings. The 9th International Conference on Distributed
`Computing Systems
`
`Date of Conference: 5-9 June 1989
`
`INSPEC Accession Number: 3472178
`
`Date Added to IEEE Xplore: 06 August 2002
`
`DOI: 10.1109/ICDCS.1989.37926
`
`Publisher: IEEE
`
`Print ISBN: 0-8186-1953-8
`
`Conference Location: Newport Beach, CA, USA
`
`
`Authors
`
`R. van Renesse
`Department of Computer Science, Vrije Universiteit, Netherlands
`
`A.S. Tanenbaum
`Department of Computer Science, Vrije Universiteit, Netherlands
`
`A. Wilschut
`Department of Computer Science, University of Twente, Enschede, Netherlands
`Department of Computer Science, Vrije Universiteit, Netherlands
`
`
`
`
`
`
`
`
`
`
`
`
`A not-for-profit organization, IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity.
`
`© Copyright 2021 IEEE - All rights reserved. Use of this web site signifies your agreement to the terms and conditions.
`
`Netflix, Inc. - Ex. 1034, Page 000005
`
`
`
`
`Netflix, Inc. - Ex. 1034, Page 000006
`
`
`
`The Design of a High-Performance File Server
`
`Robbert van Renesse* Andrew S. Tanenbaum Annita Wilschut‡
`
`Dept of Computer Science
`Vrije Universiteit
`The Netherlands
`
`ABSTRACT
`The Bullet server is an innovative file server that outperforms
`traditional file servers like SUN's NFS by more than a factor
`of three. It achieves high throughput and low delay by a
`radically different software design than current file servers in use.
`Instead of storing files as a sequence of disk blocks, each
`Bullet server file is stored contiguously, both on disk and in the
`server's RAM cache. Furthermore, it employs the concept of
`an immutable file, to improve performance, to enable caching,
`and to provide a clean semantic model to the user. The paper
`describes the design and implementation of the Bullet server in
`detail, presents measurements of its performance, and
`compares this performance with other well-known file servers
`running on the same hardware.
`
`1. Introduction
`Traditional file systems were designed for small
`machines, that is, computers with little RAM memory and
`small disks. Emphasis was on supporting large files using as
`few resources as possible. To allow dynamic growth of files,
`files were split into fixed size blocks scattered all over the
`disk. Blocks would be dynamically allocated to files, such that
`a large file would be scattered all over a disk. Performance
`suffered since each block had to be separately accessed. Also
`the block management introduced high overhead: indirect
`blocks were necessary to administer the files and their blocks.
`A small part of the computer's little memory was used to keep
`parts of files in a RAM cache to make access somewhat more
`efficient.
`Today the situation has changed considerably.
`Machines have enormous RAM memories and huge disks.
`Files usually fit completely in memory. For example, memory
`sizes of at least 16 Megabytes are common today, enough to
`hold most files encountered in practice. Measurements [1]
`
`* This research was supported in part by the Netherlands Organization for
`Scientific Research (N.W.O.) under grant 125-30-10.
`‡ Current affiliation: Dept. of Computer Science, University of Twente,
`Enschede, The Netherlands
`
`show that the median file size in a UNIX† system is 1 Kbyte
`and 99% of all files are less than 64 Kbytes. File systems,
`however, have not changed yet. Files are still subdivided into
`blocks. To take some advantage of new technology, the size
`of blocks has been increased, and the memory caches have
`been enlarged. This has led to a marginal performance
`improvement.
`As part of the Amoeba distributed operating system project
`[2, 3] we have designed and implemented a file server that
`is intended for current and future computer and disk
`technology. We have devoted considerable energy to making the file
`server fast. Since we believe we have achieved this goal, we
`have named it the Bullet file server. Among its features are
`support for replication, caching, and immutability.
`This paper has six sections. In the following section we
`will present the architectural model of the Bullet file service.
`In section three we present the implementation of the server,
`including the data structures and interfaces. The performance
`of the file server is the subject of section four. In section five
`we will compare the Bullet file server with other file servers,
`such as SUN NFS. Section six contains our conclusions.
`
`2. Architectural model
`The basic idea behind the design of the Bullet file server
`is to do away with the block model. In fact, we have chosen
`the extreme, which is to maintain files contiguously
`throughout the system. That is, files are contiguously stored
`on disk, contiguously cached in RAM, and kept contiguously
`in processes' memories. This dictates the choice for whole file
`transfer. As a consequence, processors can only operate on
`files that fit in their physical memory. This affects the way in
`which we store data structures on files, and how we assign
`processors to applications. Since most files (about 75%) are
`accessed in entirety [4], whole file transfer optimizes overall
`scaling and performance, as has also been reported in other
`systems that do whole file transfer, such as in the Andrew ITC
`file system [5].
`
`† UNIX is a Registered Trademark of AT&T in the USA and other countries.
`
`CH2706-0/89/0000/0022$01.00 © 1989 IEEE
`
`22
`
`Authorized licensed use limited to: IEEE Staff. Downloaded on July 13,2021 at 15:03:52 UTC from IEEE Xplore. Restrictions apply.
`
`Netflix, Inc. - Ex. 1034, Page 000007
`
`
`
`Another design choice, which is closely linked to keeping
`files contiguous, is to make all files immutable. That is,
`the only operations on files are creation, retrieval, and
`deletion; there are no update-in-place operations. Instead, if we
`want to update a data structure that is stored on a file, we do
`this by creating a new file holding the updated data structure.
`In other words, we store files as sequences of versions. Note
`that as we do whole file transfer anyway, this puts no
`performance penalties on the file server. Version mechanisms have
`positive influences on caching, as reported in the Cedar File
`System [6], and on replication. It also presents the possibility
`of keeping versions on write-once storage such as optical
`disks. Although the version mechanism is itself quite
`interesting, space limitations prevent us from describing it here. It is
`discussed in [7].
`For most applications this model works well, but there
`are some applications where different solutions will have to be
`found. Each append to a log file, for example, would require
`the whole file to be copied. Similarly, for data bases, a small
`update might incur a large overhead. For log files we have
`implemented a separate server. Data bases can be subdivided
`over many smaller Bullet files, for example based on the
`identifying keys.
`Throughout the design we have strived for performance,
`scalability, and availability. The Bullet file server is the main
`storage server for the Amoeba distributed operating system,
`where these issues are of high importance [8, 9]. Performance
`can only be achieved if the management of storage and
`replication are low, and the model of contiguity and immutability
`corresponds to how files are usually accessed. Scalability
`involves both geographic scalability (Amoeba currently runs
`in four different countries) and quantitative scalability (there
`may be thousands of processors accessing files). Availability
`implies the need for replication.
`Since these issues correlate heavily with those of the
`Amoeba distributed operating system, we will first devote a
`section to Amoeba. In this section we will also describe the
`naming service of Amoeba, which plays a role in how data
`structures, especially large ones, may be stored efficiently on
`immutable files. In the following section we will describe the
`Bullet file interface.
`
`2.1. Amoeba
`Amoeba [2, 3] is a distributed operating system that was
`designed and implemented at the Vrije Universiteit in
`Amsterdam, and is now being further developed there and at the
`Centre for Mathematics and Computer Science, also in
`Amsterdam. It is based on the object model. An object is an abstract
`data type, and operations on it are invoked through remote
`procedure calls. The hardware on which Amoeba runs consists of
`four principal components:
`
`• workstations
`• dynamically allocatable processors
`• specialized services
`• gateways
`Workstations provide the user interface to Amoeba and are
`only involved with interactive tasks such as command
`interpretation and text editing. Consequently they do not deal
`with very large or dynamically changing files. The
`dynamically allocatable processors together form the so-called
`processor pool. These processors may be allocated for compiling
`or text formatting purposes, or for distributed or parallel
`algorithms. Among other applications, we have implemented a
`parallel make [10] and parallel heuristic search [11].
`Specialized servers include filing servers such as the
`Bullet file server, and the directory server. The directory
`server is used in conjunction with the Bullet server. Its
`function is to handle naming and protection of Bullet server files
`and other objects in a simple, uniform way. Servers manage
`the Amoeba objects, that is, they handle the storage and
`perform the operations. Gateways provide transparent
`communication among Amoeba sites currently operating in four
`different countries (The Netherlands, England, Norway, and
`Germany).
`All objects in Amoeba are addressed and protected by
`capabilities [3, 12]. A capability consists of four parts:
`1) The server port identifies the server that manages the
`   object. It is a 48-bit location-independent number that
`   is chosen by the server itself and made known to the
`   server's potential clients.
`2) The object number identifies the object within the
`   server. For example, a file server may manage many
`   files, and use the object number to index in a table of
`   inodes. An inode contains the position of the file on
`   disk, and accounting information.
`3) The rights field specifies which access rights the holder
`   of the capability has to the object. For a file server
`   there may be a bit indicating the right to read the file,
`   another bit for deleting the file, and so on.
`4) The check field is used to protect capabilities against
`   forging and tampering. In the case of the file server
`   this can be done as follows. Each time a file is created
`   the server generates a large random number and stores
`   this in the inode for the file. Capabilities for the file
`   can be generated by taking the server's port, the index
`   of the file in the inode table, and the required rights.
`   The check field can be generated by taking the rights
`   and the random number from the inode, and encrypting
`   both. If, later, a client shows a capability for a file, its
`   validity can be checked by decrypting the check field
`   and comparing the rights and the random number.
`   Other schemes are described in [12]. Capabilities can
`   be cached to avoid decryption for each access.
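As an illustration only, the check-field idea described in the list above can be sketched in Python. This is not the Amoeba implementation: the paper encrypts the rights together with the per-file random number, whereas this sketch swaps in a keyed hash (HMAC-SHA256 truncated to 6 bytes) from the Python standard library. The names `Capability`, `make_check`, `is_valid` and the rights bit values are hypothetical.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

RIGHT_READ = 0x01     # hypothetical bit assignments
RIGHT_DELETE = 0x02


@dataclass(frozen=True)
class Capability:
    port: int      # 48-bit server port
    obj: int       # object number (index into the inode table)
    rights: int    # bit mask of access rights (0..255 in this sketch)
    check: bytes   # check field protecting the rights


def make_check(rights: int, random_number: bytes) -> bytes:
    # Stand-in for "encrypt the rights and the random number":
    # a keyed hash of the rights, keyed by the per-file random number.
    return hmac.new(random_number, bytes([rights]), hashlib.sha256).digest()[:6]


def is_valid(cap: Capability, random_number: bytes) -> bool:
    # The server recomputes the check field from the inode's random number.
    return hmac.compare_digest(cap.check, make_check(cap.rights, random_number))


random_number = secrets.token_bytes(6)   # stored in the file's inode at creation
cap = Capability(0x123456789ABC, 7, RIGHT_READ, make_check(RIGHT_READ, random_number))
# A client that tampers with the rights field cannot produce a matching check field.
forged = Capability(cap.port, cap.obj, RIGHT_READ | RIGHT_DELETE, cap.check)
```

In this sketch `is_valid(cap, random_number)` accepts the genuine capability and rejects `forged`, which mirrors the comparison step the paper describes.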
`
`23
`
`
`Netflix, Inc. - Ex. 1034, Page 000008
`
`
`
`Although capabilities are a convenient way for addressing and
`protecting objects, they are not usable for human users. For
`this the directory service maps human-chosen ASCII names to
`capabilities. Directories are two-column tables, the first
`column containing names, and the second containing the
`corresponding capabilities. Directories are objects themselves,
`and can be addressed by capabilities. By placing directory
`capabilities in directories an arbitrary naming structure can be
`built at the convenience of the user. The directory service
`provides a single global naming space for objects. This has
`allowed us to link multiple Bullet file servers together
`providing one single large file service that crosses international
`borders [13, 14].
`
`2.2. Bullet Server Interface
`The simple architectural model of the file service is
`reflected in its simple interface. Whole file transfer eliminates
`the need for relatively complicated interfaces to access parts of
`files. Immutability eliminates the need for separate update
`operators. Version management is not part of the file server
`interface, since it is done by the directory service [7].
`The Bullet interface consists of four functions:
`• BULLET.CREATE(SERVER, DATA, SIZE, P-FACTOR) → CAPABILITY
`• BULLET.SIZE(CAPABILITY) → SIZE
`• BULLET.READ(CAPABILITY, &DATA)
`• BULLET.DELETE(CAPABILITY)
`The BULLET.CREATE function is the only way to store data on
`a Bullet server. The SERVER argument specifies which Bullet
`server to use. This enables users to use more than one Bullet
`server. The DATA and SIZE arguments describe the contents of
`the file to be created. A capability for the file is returned for
`subsequent usage.
`P-FACTOR stands for Paranoia Factor. It is a measure for
`how carefully the file should be stored before BULLET.CREATE
`can return to the invoker. If the P-FACTOR is zero,
`BULLET.CREATE will return immediately after the file has been
`copied to the file server's RAM cache, but before it has been
`stored on disk. This is fast, but if the server crashes shortly
`afterwards the file may be lost. If the P-FACTOR is one, the file
`will be stored on one disk before the client can resume. If the
`P-FACTOR is N, the file will be stored on N disks before the
`client can resume. This requires the file server to have at least
`N disks available for replication. At present we have two
`disks.
`The BULLET.SIZE and BULLET.READ functions are used
`to retrieve files from a server. First BULLET.SIZE is called to
`get the size of the file addressed by CAPABILITY, after which
`local memory is allocated to store its contents. Then
`BULLET.READ is invoked to get the contents, where &DATA is
`the address of the allocated local memory. Alternatively a
`section of the virtual address space can be reserved, after which
`the file can be mapped into the virtual memory of the process.
`In that case the underlying kernel performs the BULLET.READ
`function. BULLET.DELETE allows files to be discarded from
`the file server.
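The four-call interface and the immutability rule can be modelled with a small in-memory sketch. Everything here is an assumption made for illustration: the class name `BulletServer`, the use of a random hex token in place of a real capability, and dictionaries standing in for the RAM cache and the disks. The `p_factor` argument only controls how many disk copies exist when `create` returns; later background writes are not modelled.

```python
import secrets


class BulletServer:
    """In-memory model of the four calls; files are never updated in place."""

    def __init__(self, n_disks=2):
        self.cache = {}                             # RAM cache: capability -> contents
        self.disks = [{} for _ in range(n_disks)]   # stand-ins for the replica disks

    def create(self, data, p_factor=1):
        if not 0 <= p_factor <= len(self.disks):
            raise ValueError("P-FACTOR must be between 0 and the number of disks")
        cap = secrets.token_hex(8)                  # opaque token standing in for a capability
        self.cache[cap] = bytes(data)
        for disk in self.disks[:p_factor]:          # copies made before the reply is sent
            disk[cap] = bytes(data)
        return cap

    def read(self, cap):
        if cap not in self.cache:                   # cache miss: reload from a disk that holds it
            for disk in self.disks:
                if cap in disk:
                    self.cache[cap] = disk[cap]
                    break
            else:
                raise KeyError(cap)
        return self.cache[cap]

    def size(self, cap):
        return len(self.read(cap))

    def delete(self, cap):
        self.cache.pop(cap, None)
        for disk in self.disks:
            disk.pop(cap, None)


server = BulletServer()
v1 = server.create(b"log line 1\n", p_factor=2)
# There is no update call: a changed file is a new file with a new capability.
v2 = server.create(server.read(v1) + b"log line 2\n")
```

The last two lines show why appending to a log is expensive under this model: the whole old version is read and a complete new version is stored.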
`
`3. Implementation
`Keeping files contiguous (i.e., not splitting them up in
`blocks) greatly simplifies file server design. Consequently, the
`implementation of the file server can be simple. In this section
`we will discuss an implementation on a 16.7 MHz Motorola
`68020-based server with 16 Mbytes of RAM memory and two
`800 Mbyte magnetic disk drives. We will describe the disk
`layout of the file server, the file server cache, and how
`replication is done.
`The disk is divided into two sections. The first is the
`inode table, each entry of which gives the ownership, location,
`and size of one file. The second section contains contiguous
`files, along with the gaps between files. Inode entry 0 is
`special, and contains three 4-byte integers:
`
`1) block size: the physical sector size used by the disk
`   hardware;
`2) control size: the number of blocks in the control
`   section;
`3) data size: the number of blocks in the data section;
`
`The remaining inodes describe files. An inode consists of four
`fields:
`1) A 6-byte random number that is used for access
`   protection. It is essentially the key used to decrypt
`   capabilities that are presented to the server.
`2) A 2-byte integer that is called the index. The index has
`   no significance on disk, but is used for cache
`   management and will be described later.
`3) A 4-byte integer specifying the first block of the file on
`   disk. Files are aligned on blocks (sectors). This
`   alignment is forced by the disk hardware.
`4) A 4-byte integer giving the size of the file in bytes.
`When the file server starts up, it reads the complete inode table
`into the RAM inode table and keeps it there permanently. By
`scanning the inodes it can figure out which parts of disk are
`free. It uses this information to build a free list in RAM. Also
`unused inodes (inodes that are zero-filled) are maintained in a
`list. While scanning the inodes, the file server performs some
`consistency checks, for example to make sure that files do not
`overlap. All of the server's remaining memory will be used
`for file caching. At this time the file server is ready for
`operation and starts awaiting client requests.
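The start-up scan can be sketched as follows. The 16-byte inode layout (6 + 2 + 4 + 4 bytes) comes from the text; the big-endian byte order, the assumption that block numbers are relative to the data section, and the names `INODE` and `scan` are choices made for this illustration and are not taken from the paper.

```python
import struct

# random number (6 bytes), index (2 bytes), first block (4 bytes), size in bytes (4 bytes)
INODE = struct.Struct(">6sHII")


def scan(table: bytes, block_size: int, data_blocks: int):
    """Return (free inode numbers, free block ranges) for a raw inode table."""
    free_inodes, used = [], []
    for i in range(1, len(table) // INODE.size):        # entry 0 is the disk descriptor
        raw = table[i * INODE.size:(i + 1) * INODE.size]
        if raw == bytes(INODE.size):                     # zero-filled inode: unused
            free_inodes.append(i)
            continue
        _random, _index, first, size = INODE.unpack(raw)
        nblocks = -(-size // block_size)                 # round up to whole blocks
        used.append((first, first + nblocks))
    used.sort()
    free, pos = [], 0
    for start, end in used:
        if start < pos:                                  # consistency check
            raise ValueError("files overlap on disk")
        if start > pos:
            free.append((pos, start))                    # gap between two files
        pos = end
    if pos < data_blocks:
        free.append((pos, data_blocks))                  # free space after the last file
    return free_inodes, free


# Descriptor, a 1000-byte file at block 10, an unused inode, a 512-byte file at block 0.
table = (bytes(16) + INODE.pack(b"abcdef", 0, 10, 1000)
         + bytes(16) + INODE.pack(b"ghijkl", 0, 0, 512))
free_inodes, free_blocks = scan(table, block_size=512, data_blocks=100)
```

Keeping the free ranges as sorted `(start, end)` pairs also makes a first-fit search for a new file a simple linear walk over the list.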
`The entries in the RAM inode table are a slightly
`different format from those on the disk, and are called rnodes.
`An rnode contains the following information:
`
`
`Netflix, Inc. - Ex. 1034, Page 000009
`
`
`
`1) The inode table index of the corresponding file;
`2) A pointer to the file in RAM cache, if present;
`3) An age field to implement an LRU cache strategy.
`Just like the disk, the free rnodes and free parts in the RAM
`cache are maintained using free lists.
`
`[Fig. 1. The Bullet Disk Layout. The inode table (disk descriptor,
`inode 1, inode 2, ..., inode N) is followed by the contiguous files
`and holes.]
`
`Client requests basically come in three varieties: reading
`files, creating files, and deleting files. To read a file the
`client has to provide the capability of the file. The object
`number in the capability indexes into the inode table to find
`the inode for the file. Using the random number in the inode,
`and the check field in the capability, the right to read the file
`by this client can be checked. Next the index field in the inode
`is inspected to see whether there is a copy of the file in the
`RAM cache. If the index in the inode is non-zero, there is a
`copy in the server's RAM cache. The index is used to locate
`an rnode, which describes where to find the file in memory.
`Since the file is contiguously kept in RAM, it can be sent to
`the client in one RPC operation. The age field is updated to
`reflect the recent access of the file.
`If the index in the inode is zero, the file is not in
`memory and has to be loaded from disk. First the memory free
`list is searched to see if there is a part large enough to hold the
`file. If not, the least recently accessed file is removed from the
`RAM cache, found by checking the age fields in the rnodes.
`This is done by re-claiming the rnode, freeing the associated
`memory, and clearing the index field in the corresponding
`inode. This is repeated until enough memory is found. Then
`an rnode is allocated for this file, and its fields initialized. The
`index field in the inode of the file is set to the rnode index.
`Then the file can be read into the RAM cache. Most modern
`controllers can do this in a single DMA transfer. The client
`read operation can now proceed as with files that were already
`cached.
`Creating files is much the same as reading files that
`were not in the cache. A large enough part of cache memory
`has to be allocated to hold the file, after which it can be filled
`with data specified by the client. Also, an inode and a free
`part in the disk data section have to be allocated. For this we
`use a first fit strategy. In our implementation we use a
`write-through caching scheme, that is, we immediately write the file
`to disk. The new inode, complete with a new random number,
`is immediately written as well. For this the whole disk block
`containing the inode has to be written. The new random
`number is used to create a capability for the user, which is
`returned in a reply message.
`In our hardware configuration we have two disks that we
`use as identical replicas. One of the disks is the main disk on
`which the file server reads. Disk writes are performed on both
`disks. If the main disk fails, the file server can proceed
`uninterruptedly by using the other disk. Recovery is simply done
`by copying the complete disk. The P-FACTOR in the create
`operation is used to determine where in the execution to send a
`reply back to the client.
`Deleting a file involves checking the capability, freeing
`an inode by zeroing it and writing it back to the disk. If the
`file is in the cache, the space in the cache can be freed. The
`disk free list in RAM has to be updated to include the part
`previously occupied by the file.
`At first glance, it appears that storing files contiguously
`in memory and on disk is very wasteful due to the external
`fragmentation problem (gaps between files). However, the
`fragmentation in memory can be alleviated by compacting part
`or all of the RAM cache from time to time. The disk
`fragmentation can also be relieved by compaction every morning at say
`3 am when the system is lightly loaded.
`However, it is our belief that this trade-off is not
`unreasonable. In effect, the conscious choice of using contiguous
`files may require buying, say, an 800 MB disk to store 500
`MB worth of files (the rest being lost to fragmentation unless
`compaction is done). Since our scheme gives an increased
`
`25
`
`
`Netflix, Inc. - Ex. 1034, Page 000010
`
`
`
`performance of a factor of 3 (as described in the next section)
`and disk prices rise only very slowly with disk capacity, a
`relatively small increment in total file server cost gives a major
`gain in speed.
`The complete code of the file server is less than 30
`pages of C. The MC68020 object size of the server, including
`all library routines, is 23 Kbytes. This small and simple
`implementation has resulted in a file server that has been
`operational flawlessly for over a year. The most vulnerable
`component of the server is the disk, but because of its
`replication, the complete file server is highly reliable.
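The read path of Section 3 (index field, rnode lookup, LRU eviction by age) can be illustrated with a simplified model. This sketch only counts bytes and does not model contiguous placement in memory or the memory free list; the class name `RamCache`, the dictionary standing in for the disk, and the logical clock used for the age field are assumptions made here.

```python
class RamCache:
    """Toy model of the read path: inode index field -> rnode -> cached bytes."""

    def __init__(self, capacity, disk):
        self.capacity = capacity   # bytes of RAM available for file caching
        self.disk = disk           # inode number -> file contents (stand-in for the disk)
        self.index = {}            # inode number -> rnode number; a missing key plays the role of index 0
        self.rnodes = {}           # rnode number -> [inode number, data, age]
        self.used = 0
        self.clock = 0             # logical time used for the age field
        self.next_rnode = 1

    def _evict_lru(self):
        victim = min(self.rnodes, key=lambda r: self.rnodes[r][2])
        inode, data, _age = self.rnodes.pop(victim)     # reclaim the rnode
        self.used -= len(data)                           # free the associated memory
        del self.index[inode]                            # clear the index field

    def read(self, inode):
        self.clock += 1
        r = self.index.get(inode, 0)
        if r == 0:                                       # not cached: make room, then load
            data = self.disk[inode]
            if len(data) > self.capacity:
                raise ValueError("file does not fit in the cache")
            while self.used + len(data) > self.capacity:
                self._evict_lru()
            r = self.next_rnode
            self.next_rnode += 1
            self.rnodes[r] = [inode, data, self.clock]
            self.index[inode] = r
            self.used += len(data)
        self.rnodes[r][2] = self.clock                   # refresh the age on every access
        return self.rnodes[r][1]


disk = {1: b"a" * 60, 2: b"b" * 60, 3: b"c" * 30}
cache = RamCache(capacity=100, disk=disk)
cache.read(1)
cache.read(2)   # file 1 is evicted because 60 + 60 bytes exceed the 100-byte cache
```

A real server would search the memory free list for a contiguous hole before evicting anything, which is why the paper also mentions compacting the cache.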
`
`4. Performance and Comparison
`Figure 2 gives the performance of the Bullet file server.
`In the first column the delay and bandwidth for read operations
`are shown. The measurements have been done on a normally
`loaded Ethernet from a 16 MHz 68020 processor. In all cases
`the test file will be completely in memory, and no disk
`accesses are necessary. In the second column a create and a
`delete operation together is measured, and the file is written to
`both disks. Note that both creation and deletion involve
`requests to two disks.
`To compare this with the SUN NFS file system, we have
`measured reading and creating files on a SUN 3/50 using a
`remote SUN 3/180 file server (using 16.7 MHz 68020s and
`SUN OS 3.5), equipped with a 3 Mbyte buffer cache. The
`measurements were made on an idle system. To disable local
`caching on the SUN 3/50, we have locked the file using the
`SUN UNIX lockf primitive. The read test consisted of an lseek
`followed by a read system call. The write test consisted of
`consecutively executing creat, write, and close. The SUN
`NFS file server uses a write-through cache, but writes the file
`to one disk only. The results are depicted in Figure 3.
`These measurements include both the communication
`time and the file server time. Since Amoeba uses a dedicated
`processor for the file server, it is impossible to separate
`communication and file server performance. Observe that reading
`and creating 1 Mbyte NFS files result in lower bandwidths
`than reading and creating 64 Kbyte NFS files. The Bullet file
`server performs read operations three to six times better than
`the SUN NFS file server for all file sizes. Although the Bullet
`file server stores the files on two disks, for large files the
`bandwidth is ten times that of SUN NFS. For very large files
`(> 64 Kbytes) the Bullet server even achieves a higher
`bandwidth for writing than SUN NFS achieves for reading
`files.
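As a rough cross-check of Figure 2, the bandwidth column appears to follow from the delay column as file size divided by delay. The sketch below assumes that relationship and 1 Kbyte = 1024 bytes; both are assumptions of this example, not statements from the paper. Only the two largest file sizes are used, since the small-file delays are rounded to whole milliseconds.

```python
# (size in Kbytes, READ delay in msec, CREATE+DEL delay in msec), from Fig. 2(a)
DELAYS = {
    "64 Kbytes": (64, 109, 331),
    "1 Mbyte": (1024, 1970, 2700),
}


def bandwidth_kbytes_per_sec(size_kbytes, delay_msec):
    """Bandwidth implied by transferring size_kbytes in delay_msec."""
    return size_kbytes / (delay_msec / 1000.0)


for label, (size, read_ms, create_del_ms) in DELAYS.items():
    print(label,
          round(bandwidth_kbytes_per_sec(size, read_ms)),
          round(bandwidth_kbytes_per_sec(size, create_del_ms)))
```

Under these assumptions the rounded results are 587 and 193 Kbytes/sec for 64 Kbytes and 520 and 379 Kbytes/sec for 1 Mbyte, which agrees with the corresponding rows of Fig. 2(b).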
`
`5. Discussion and Conclusions
`The simple architectural model of immutable files that
`are kept contiguous on disk, in memory, and on the network,
`results in a major performance boost. Whole file transfer
`minimizes the load on the file server and on the network,
`allowing the service to be used on a larger scale [5].
`
`             Delay (msec)
`File Size    READ    CREATE+DEL
`1 byte          2            94
`16 bytes        2            97
`256 bytes       2            98
`4 Kbytes        7           109
`64 Kbytes     109           331
`1 Mbyte      1970          2700
`                 (a)
`
`             Bandwidth (Kbytes/sec)
`File Size    READ    CREATE+DEL
`1 byte        0.5          0.01
`16 bytes        8          0.16
`256 bytes     110             3
`4 Kbytes      559            37
`64 Kbytes     587           193
`1 Mbyte       520           379
`                 (b)
`
`Fig. 2. Performance of the Bullet file server for read
`operations, and create an