DEPARTMENT OF COMPUTER SCIENCE

Declaration of Shahram Ghandeharizadeh
I, Shahram Ghandeharizadeh, hereby declare as follows:

1. I am over twenty-one (21) years of age. I am fully competent to make this declaration.

2. I am a Professor of Computer Science at the University of Southern California.

3. I am making this declaration based on my personal knowledge.

4. I am one of the authors of Shahram Ghandeharizadeh, et al., Placement of Data in Multi-Zone Disk Drives, published in Hele-Mai Haav and Bernhard Thalheim (Editors), Databases and Information Systems: Proceedings of the Second International Baltic Workshop, June 12-14, 1996 (ISBN 9985-50-132-2, ISBN 9985-50-133-0) ("Placement of Data in Multi-Zone Disk Drives").

5. Placement of Data in Multi-Zone Disk Drives was submitted as a conference paper in February 1996 and published by the Institute of Cybernetics at Tallinn University in June 1996. Attached as Exhibit A hereto is a true and correct copy of Placement of Data in Multi-Zone Disk Drives.

6. As of June 1996, Placement of Data in Multi-Zone Disk Drives was publicly available to those interested in the subject matter of the article. A person interested in the subject matter of the article could purchase copies of the Proceedings of the Second International Baltic Workshop from the publisher. Further, I am aware that the Proceedings of the Second International Baltic Workshop were indexed and publicly available from libraries.

I declare under penalty of perjury under the laws of the United States of America that the foregoing is true and correct.

Executed on: April 23, 2021

Signature:
University of Southern California
Viterbi School of Engineering | Department of Computer Science
941 Bloom Walk, SAL 106, Los Angeles, California 90089-0781 • Tel: 213 740 4498 • Fax: 213 740 7285

Netflix, Inc. - Ex. 1033, Page 000001
IPR2021-01319 (Netflix, Inc. v. CA, Inc.)
Exhibit A
Placement of Data in Multi-Zone Disk Drives*

Shahram Ghandeharizadeh, Douglas J. Ierardi, Dongho Kim, Roger Zimmermann

Department of Computer Science
University of Southern California
Los Angeles, California 90089

February 22, 1996
Abstract

This paper describes a placement technique for the layout of files across the surface of a multi-zone magnetic disk drive to maximize its bandwidth. It argues for a file system that monitors the frequency of access to the files and modifies the placement of files in order to respond to an evolving pattern of file accesses. Such a file system is particularly useful for repositories whose primary functionality is to disseminate information, such as the World-Wide Web and the thousands of ftp sites that render documents, graphic pictures, images, audio clips, and full-motion video available on the Internet. Employing software techniques to enhance disk performance is important because the disk's mechanical nature has limited its hardware performance improvements to 7 to 10 percent a year.
1 Introduction

While microprocessor and networking technologies have been advancing at a rapid pace during the past decade, disk performance improvements have been negligible, resulting in the I/O bottleneck phenomenon. The speedup for microprocessor technology is typically quoted at a rate of 40 to 60 percent annually. While disk storage densities are improving at a high rate (from 60 to 80 percent annually), performance improvements have been rated at only 7 to 10 percent annually [RW94].

A technique employed by modern magnetic disk drives to improve their storage density is zoning. Zoning introduces a disk with regions that provide different transfer rates. Depending on the characteristics of a disk, the fastest zone might provide a transfer rate 60 to 80 percent higher than that of the slowest zone. In this paper, we study software techniques that improve the performance of multi-zone disks by controlling the placement of data across the zones. To the best of our knowledge, no related work has appeared in the literature, due to the recent introduction of multi-zone disk drives. We show that one can enhance the performance of a multi-zone disk by assigning the frequently accessed files to the zone with the highest transfer rate. We use the term heat [CABK88] to denote the frequency of access to a file.

* This research was supported in part by the National Science Foundation under grants IRI-9203389, IRI-9258362 (NYI award), and CDA-9216321, and a Hewlett-Packard unrestricted cash/equipment gift.

[Figure 1: A disk drive, labeling the arm assembly, arm, head, spindle, platter, and cylinder.]
The rest of this paper is organized as follows. Section 2 provides an overview of multi-zone magnetic disks and transfer rates from two commercial disk drives. Assuming that the size of a file system equals the storage capacity of a magnetic disk drive(1), Section 3 describes an optimal layout strategy that maximizes the performance of the disk drive. Section 4 describes an on-line reorganization technique that enables the file system to respond to an evolving pattern of disk accesses. In addition, this section describes the design of a heat tracking module that estimates the frequency of access to each file. Section 5 quantifies the tradeoffs associated with the proposed techniques using a trace-driven simulation study. The obtained results demonstrate significant improvements using the proposed designs. Section 6 eliminates the assumption that the size of a file system equals the storage capacity of a disk and describes two alternative designs for data layout. Our conclusion and future research directions are contained in Section 7.

(1) This assumption is relaxed in Section 6.

2 Multi-zone magnetic disks

A magnetic disk drive is a mechanical device, operated by its controlling electronics. The mechanical parts of the device consist of a stack of platters that rotate in unison on a central spindle (see [RW94] for details). Presently, a single disk contains one, two, or as many as sixteen platters (see Figure 1).
(a) HP C2247 (Fast & Wide) disk

Zone #   Size (MB)   Transfer Rate (MBps)
0        324.0       3.40
1        112.0       3.17
2         76.0       3.04
3         77.0       2.92
4         71.0       2.78
5        145.0       2.54
6        109.0       2.27
7         89.0       2.02

(b) Seagate ST31200W (Fast & Wide) disk

Zone #   Size (MB)   Transfer Rate (MBps)
0        139.7       4.17
1         67.0       4.08
2         45.7       4.02
3         45.3       3.97
4         43.9       3.88
5         42.6       3.83
6         41.3       3.74
7         56.4       3.60
8         39.2       3.60
9         37.5       3.50
10        51.5       3.36
11        35.0       3.31
12        33.8       3.22
13        46.5       3.13
14        31.8       3.07
15        30.7       2.99
16        41.8       2.85
17        28.0       2.76
18        27.6       2.76
19        37.1       2.62
20        25.0       2.53
21        33.5       2.43
22        25.4       2.33

Table 1: Two commercial disks and their zoning information
Each platter surface has an associated disk head responsible for reading and writing data. Each platter is set up to store data in a series of tracks. A single stack of tracks at a common distance from the spindle is termed a cylinder. To access the data stored in a track, the disk head must be positioned over it. The operation to reposition the head from the current track to the desired track is termed a seek. Next, the disk must wait for the desired data to rotate under the head. This time is termed rotational latency. In a final step, the disk reads the referenced data. This time is termed transfer time. Its value depends on the size of the referenced data and the transfer rate of the disk.

To meet the demands for higher storage capacity, magnetic disk drive manufacturers have introduced disks with zones. A zone is a contiguous collection of disk cylinders whose tracks have the same storage capacity. Tracks are longer towards the outer portions of a disk platter as compared to the inner portions; hence, more data may be recorded in the outer tracks. While zoning increases the storage capacity of the disk, it produces a disk that does not have a single transfer rate. The multiple transfer rates are due to (1) the variable storage capacity of the tracks and (2) a fixed number of revolutions per second for the platters. Table 1 presents the storage capacity and transfer rate of the zones for two commercial disk drives. Both are SCSI-2 fast and wide disks with a 1 gigabyte storage capacity.
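The two causes of multiple transfer rates can be made concrete with a small calculation: because the spindle speed is fixed, a track's sustained transfer rate is simply the bytes stored on the track times the revolutions per second. The sketch below illustrates this in Python; the spindle speed and sectors-per-track values are assumed round numbers, not the geometry of either disk in Table 1.

```python
# Why zoned disks have multiple transfer rates: fixed spindle speed,
# variable track capacity. All numbers are illustrative.

RPM = 5400                     # fixed spindle speed (assumed)
SECTOR_BYTES = 512

def transfer_rate_mbps(sectors_per_track):
    """Sustained transfer rate of a track, in MB/s."""
    bytes_per_rev = sectors_per_track * SECTOR_BYTES
    revs_per_sec = RPM / 60.0
    return bytes_per_rev * revs_per_sec / 1e6

# Outer zones pack more sectors per track than inner zones,
# so they transfer proportionally faster.
for zone, spt in enumerate([90, 75, 60]):
    print(f"zone {zone}: {transfer_rate_mbps(spt):.2f} MB/s")
```

With these assumed parameters the outer zone is 50 percent faster than the inner one, the same order of difference as between the fastest and slowest zones in Table 1.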
3 Optimal zone layout

Assume that the disk consists of a sequence of blocks, partitioned into zones Z0, Z1, ..., Zk, where the ith zone Zi has a transfer rate of tfr(Zi) MB/s. The zones are ordered from fastest to slowest, so that tfr(Zi) > tfr(Zi+1) for i = 0, 1, ..., k − 1.

To improve the average service time sustained by a disk-resident file, we propose to keep the "hottest" files — the most popular files, with the highest frequency of access — in the fastest zones. Specifically, we propose the following Optimal Zone (OZ) Layout for a given set of files, under the assumptions that (1) we have an accurate estimate for the probability of access of each file fx, denoted heat(fx), and (2) the size of the set of files is no more than the capacity of the disk. First, order the files by their heats, from maximum to minimum. Then lay out the sequence of files contiguously, starting with zone Z0 and proceeding progressively towards the innermost zone. Files may span multiple zones, and no blocks are left unoccupied.
The OZ layout assigns the hottest files to the fastest zones. As one might expect, this layout in fact minimizes the expected (or average) service time for a file from the file system. As discussed in the previous section, the actual service time of any file includes time for seeking, rotational latency and transfer time. Since every access requires at least an initial seek and latency, we shall exclude this amount from our calculations and focus on the transfer time and additional latencies that arise from the fragmentation of files. However, in the proposed layout all files are stored contiguously, so none of these additional costs are observed.

Without loss of generality, assume that each file occupies only one zone. (If not, we can apply the following argument by considering each fragment of the file separately.) The actual transfer time for a file fx in zone Zi is size(fx)/tfr(Zi); so the expected transfer time for a disk-resident file is

    Σ_i Σ_{fx ∈ Zi} heat(fx) × size(fx) / tfr(Zi)        (1)

where the sum is computed over all zones Zi and files contained therein.
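The OZ layout and the cost in equation (1) can be sketched as follows. The zone table and file set are illustrative; the sketch works at file rather than block granularity (the paper's layout works on blocks), and it assumes the files fit on the disk.

```python
# Sketch of the OZ layout (Section 3) and equation (1). Zone capacities,
# transfer rates, file sizes, and heats are illustrative.

zones = [(400, 4.0), (300, 3.3), (300, 2.5)]   # (capacity MB, tfr MB/s), fastest first

def oz_layout(files):
    """files: name -> (size_MB, heat). Returns name -> list of
    (zone index, MB placed in that zone); hottest files are laid out
    first, contiguously from the fastest zone inward."""
    order = sorted(files, key=lambda f: files[f][1], reverse=True)
    placement, zi, free = {}, 0, zones[0][0]
    for name in order:
        need, frags = files[name][0], []
        while need > 0:                      # assumes files fit on the disk
            put = min(need, free)
            frags.append((zi, put))
            need, free = need - put, free - put
            if free == 0 and zi + 1 < len(zones):
                zi, free = zi + 1, zones[zi + 1][0]
        placement[name] = frags
    return placement

def expected_transfer_time(files, placement):
    """Equation (1): sum of heat(f) * fragment_size / tfr(zone), in seconds."""
    return sum(files[f][1] * mb / zones[z][1]
               for f, frags in placement.items() for z, mb in frags)

files = {"hot": (350, 0.7), "warm": (300, 0.2), "cold": (350, 0.1)}
p = oz_layout(files)
print(expected_transfer_time(files, p))
```

Sorting the other way (coldest first) models the worst-case layout used for comparison in Section 5.2.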
Lemma 3.1 The layout proposed by OZ is optimal — that is, it minimizes the expected transfer time, and thus the expected service time, of each request — when the given heat values provide an accurate estimate for the probability of access to any file.

Proof. We claim that the quantity in (1) is minimized when the heat of each file in Zi is greater than or equal to the heat of each file in Zi+1, for all i. This is proved by contradiction. Assume, for a contradiction, that we in fact have an optimal file layout in which some file f1 in zone Zi has a lower heat than some file f2 in zone Zi+1. Then we could decrease the expected transfer time by swapping the blocks of f1 and f2, proving that this layout could not be optimal. Since an optimal layout must exist, the argument shows that it can only be achieved when the files are laid out so that the fastest zones have the hottest files. □

Section 5.2 describes a trace-driven simulation study that quantifies the performance improvements that can be achieved in this best-case scenario.
4 Dynamic on-line reorganization

It is not always the case that the static layout described in the previous section will be possible, because it requires that: (1) all heat values be known at the time that the layout is realized on the disk, (2) files are not added to or deleted from the file system, and (3) the heats remain constant. In many situations, one would expect the content of the system and/or the popularity of its files to evolve over time. In this case, we envision a file system which (1) maintains statistical records of the popularity of its files, and (2) uses these computed heat values to reorganize data on the disk, migrating the hotter files to faster zones, in order to approximate the layout obtained using OZ.

4.1 Accumulating heat statistics

The file system must learn about the heat of files by gathering statistics from the issued requests. Our design employs the past pattern of access to estimate the identity of files referenced frequently in the future.

Its learning process is as follows. The file system maintains a queue of timestamps for every file, as well as an estimated heat value. All the queues are initially empty and the heat values are uniformly set(2) to 1/n, where n is the total number of files. Upon the arrival of a request referencing file fx, the current time is recorded in the queue of file fx. Whenever the timestamp queue of file fx becomes full, the heat value of that file is updated according to

    heatnew(fx) = (1 − c) × (1/K) × Σ_{i=1}^{K−1} 1/(t_{i+1} − t_i) + c × heatold(fx)        (2)

(2) This is the case in our simulation, and would be true in any situation where there is no reliable information available on the expected heat of files. However, if such information is available, it could be used in initializing the system.
where K is the length of the timestamp queue, c is a constant between 0 and 1, and ti is one individual timestamp. After the update is completed, the queue of this file is flushed and new timestamps can be recorded.
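A minimal sketch of this learning process, assuming the reconstruction of equation (2) above: when a file's timestamp queue fills, its heat becomes a weighted blend of the old estimate and the mean instantaneous access rate over the K − 1 queue intervals. The values of K and c, the file name, and the timestamps are illustrative.

```python
# Heat-tracking module of Section 4.1 (sketch). K, c, and all inputs
# are illustrative, not the paper's tuned values.

K = 5          # timestamp queue length
C = 0.5        # weight given to the previous heat estimate

class HeatTracker:
    def __init__(self, n_files):
        self.heat = {}                 # file -> estimated heat
        self.queue = {}                # file -> pending timestamps
        self.initial = 1.0 / n_files   # uniform initial heat, 1/n

    def access(self, f, now):
        self.heat.setdefault(f, self.initial)
        q = self.queue.setdefault(f, [])
        q.append(now)
        if len(q) == K:                # queue full: update heat, then flush
            rate = sum(1.0 / (q[i + 1] - q[i]) for i in range(K - 1)) / K
            self.heat[f] = (1 - C) * rate + C * self.heat[f]
            q.clear()

tracker = HeatTracker(n_files=100)
for t in [0, 2, 4, 6, 8]:              # one access every 2 time units
    tracker.access("fx", t)
print(tracker.heat["fx"])
```

Updating only when the queue fills, and blending with the old estimate, are the two overhead- and smoothing-related design choices discussed in the comparison with LRU-k below.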
This approach is similar to the concept of the backward k-distance used by the authors of [OOW93] in the LRU-k algorithm. The two schemes differ in three ways. First, the heat estimates are not based on the interval between the first and the last timestamp in the queue, but are averages over all intervals. Second, the heat of a file fx is only updated when the timestamp queue of fx is full, thereby reducing overhead. And third, the previous heat value heatold(fx) is taken into account when heatnew(fx) is calculated. These measures balance the need to smooth out short-term fluctuations in the access pattern against the need to remain responsive to longer-term trends. The constants K and c can thus be used to tune the module's behavior to a particular application.
4.2 Migrating files among zones

In Section 3, the optimal layout for files was attained by ensuring that each was laid out contiguously along the surface of the disk, thus avoiding file fragmentation. However, since the file system may now rearrange files on the disk, there is the likelihood that files will become fragmented, causing the system to incur additional seeks and latencies. So while the system migrates hotter files to faster zones to improve their transfer time, it may also degrade the expected service time of files by fragmenting them.

To deal with these issues, we propose the following policies for reorganization:

1. On each access to a file fx, if it is discovered that fx's heat has changed, then it can be moved or promoted to a hotter zone. Any valid data residing in its target location will be swapped into fx's current location. So in effect, we attempt to improve the performance of the system only by swapping files "pairwise", although the target of the swap may involve one or more files, or fragments of files. Moreover, the swap is initiated to improve the transfer time of a hotter file, never to intentionally degrade the transfer time of a colder one.

2. To deal with the fragmentation that can arise, each file that is promoted to a faster zone must remain contiguous. In other words, if the hot file that is targeted for migration is currently contiguously laid out, then it remains contiguous after its transfer to the new location. If the file was initially fragmented, then it is reassembled in a single contiguous sequence of blocks once moved. Note that the swap may result in the fragmentation of the colder files currently residing in the faster zone; yet the policy tries to ensure the integrity of the more popular files on the disk.

Given these policies, the system must still choose potential target locations for the migrating file, and for each potential target, must determine whether the proposed swap is worthwhile. These problems are resolved in the following way.
Initially, upon a reference to a file fx whose heat has changed, we determine the zone that it should occupy under OZ (say zone Zoptimal). Assume fx resides in zone Zx. If Zx and Zoptimal refer to the same zone, then no migration is necessary. Otherwise, fx is a candidate for migration from Zx to Zoptimal. The system searches Zoptimal for the coldest contiguous sequence of blocks that can hold fx. To calculate the heat of a sequence of blocks, we note that each block of the disk that holds valid data inherits a heat from the file that occupies it, which is just the frequency at which that block is accessed from the device. The total heat of a sequence of blocks is then just the sum of the heats of all blocks in the sequence. So if the difference in total heat between fx and the current target exceeds a preset threshold, the file is swapped into this location. This heat difference is proportional to the amount by which the total service time is expected to decrease if the swap is made, excluding the time for any additional seeks that have been introduced.(3) If the difference does not exceed the threshold, Zoptimal is reset to Zoptimal+1 (the next slower zone) and this process is repeated. The process terminates when Zoptimal = Zx.

A more detailed description of the algorithm is presented in Figure 2. The procedure is conservative, to the extent that it attempts to maintain the contiguity of the more popular files, and never tries to move a file beyond its placement in the optimal static schedule. (This is an attempt to promote files, while avoiding competition with other, presumably hotter, files that should reside in the faster outermost zones.) The procedure also relies on a configurable threshold parameter (denoted THRESHOLD in Figure 2) that is used to tune the sensitivity of the reorganization procedure: a swap is done only if the gain in expected service time exceeds this threshold. For smaller values of this parameter, files are migrated frequently in response to small variations in heat; as the parameter's value is increased, files will migrate only if their heat changes by a large amount, or if the files are themselves large. In this way, the parameter helps to quantify the tradeoff between the one-time cost of moving a file and the expected reduction in service time that will be observed during future accesses.

(3) A more accurate measure of the change in service time can be computed by estimating the service time for both the current and proposed layouts of the files involved in the swap, weighting these by the files' heats, and determining the effect upon the overall average service time. This approach is discussed further in Section 5.
move(f)
{
    if the file already resides in the fastest zone
        then return;

    /* At this point all the heats should be current. */
    current location = blocks currently occupied by f;
    current zone = slowest zone in which any fragment of f resides;
    target zone = best static placement of f for current heats;

    while (current zone > target zone) {
        target location = a sequence of size(f) blocks of minimum total
                          heat in target zone;
        if (total heat(current location) − total heat(target location) > THRESHOLD) {
            swap the contents of current location and target location;
            break;
        } else
            ++target zone;    /* try the next slower zone */
    }
    return;
}

Figure 2: Procedure for attempting to promote a file during on-line reorganization.
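The procedure of Figure 2 can be sketched at block granularity. This is a simplified reading, not the paper's implementation: zones are contiguous block ranges, per-block heats travel with the data on a swap, and the migrating file is assumed to lie wholly outside the window it is swapped into. Zone boundaries, heats, and THRESHOLD are illustrative.

```python
# Toy block-level sketch of the promotion procedure of Figure 2.
# The disk is a list of per-block file names plus per-block heats.

ZONES = [(0, 8), (8, 16), (16, 24)]    # (start, end) block ranges, fastest first
THRESHOLD = 0.05                       # illustrative sensitivity parameter

def zone_of(block):
    return next(z for z, (s, e) in enumerate(ZONES) if s <= block < e)

def promote(disk, heat, f, target_zone):
    """Try to move file f into target_zone, or failing that into
    successively slower zones, stopping at f's current zone."""
    loc = [b for b, name in enumerate(disk) if name == f]
    size = len(loc)
    cur_zone = max(zone_of(b) for b in loc)   # slowest zone holding f
    while cur_zone > target_zone:
        s, e = ZONES[target_zone]
        # coldest contiguous window of `size` blocks in the target zone
        start = min(range(s, e - size + 1),
                    key=lambda i: sum(heat[i:i + size]))
        gain = sum(heat[b] for b in loc) - sum(heat[start:start + size])
        if gain > THRESHOLD:
            for b, t in zip(loc, range(start, start + size)):
                disk[b], disk[t] = disk[t], disk[b]   # swap block contents
                heat[b], heat[t] = heat[t], heat[b]   # heat follows the data
            return True
        target_zone += 1                              # next slower zone
    return False
```

For example, a four-block file of heat 0.2 per block sitting in the slowest zone will be swapped into the coldest four-block window of zone 0, since the heat gain far exceeds THRESHOLD.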
5 Trace driven evaluation

We employed a trace driven simulation study to investigate the tradeoffs associated with the proposed techniques. The trace is a sequence of file accesses to an ftp site at the University of California, Berkeley (s2k-ftp.CS.Berkeley.EDU). It was accumulated over a period of 17 months (from 11/14/93 to 4/13/95). The file system consists of 59,072 files and is 4.22 gigabytes in size. The trace contains neither the creation nor the deletion date of a file, only the references to retrieve a file. It consists of 460,000 requests. A few files are hot and are referenced more frequently than the others. On average, a request retrieves 294 kilobytes of data.
5.1 Simulation model

For the purposes of this evaluation, we represented a disk drive as an array of blocks. The size of each block corresponded to a physical disk block, 512 bytes. A zone occupied a contiguous number of blocks. The simulator maintained the number of blocks that constituted a file and the location of each block. As we did not have the zone characteristics of a 4 gigabyte disk drive, we employed the zone characteristics of the Seagate ST31200W disk for this evaluation. The storage capacity of this disk is 25% of the file system size. We increased the size of each zone by a factor of four to store the file system.

We employed the analytical models of [GSZ95] to represent the seek operation and latency of a magnetic disk drive. These models estimate the characteristics of the Seagate ST31200W disk and are similar to those of [RW94, WGPW95]. The minimum and maximum seek times are 3 and 21.2 milliseconds, respectively. The average rotational latency is 5.5 milliseconds. When a file transfer was initiated, the simulator read the file's physically contiguous blocks in one transfer (eliminating seek and rotational latency for two or more physically adjacent blocks). A seek is incurred if a file is fragmented across the surface of the disk. The simulator computes the number of seeks incurred when reading a file. We assume all the references are directed to the file system and do not observe a memory hit.
5.2 Optimal zone layout

We analyzed the performance of the system with the OZ layout. For comparison purposes, we analyzed the worst-case scenario that lays the files on the surface of the disk in the reverse order, assigning the frequently accessed files to the innermost zones. The average service time of the system is 83.5 milliseconds and 134 milliseconds with the optimal and worst-case assignments, respectively. The OZ layout provides a 38 percent improvement relative to the worst case.

The average service time of the disk drive based on a random assignment of files to the disk is estimated to be 100.4 msec. This number is computed based on the probability of a request referencing data stored in a zone (for zone Zi, this probability is size of Zi / disk capacity), the time required to retrieve the average number of bytes per request from zone Zi (average request size / tfr(Zi)), and the average seek and rotational delays. This average estimate was verified by running the simulator several times assuming a random assignment of files. OZ provides a 17 percent improvement as compared to this average service time.
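The random-assignment estimate can be approximately reproduced from the Seagate zone table (Table 1b). The seek-plus-rotational overhead used below is an assumed round figure, since the seek model of [GSZ95] is not reproduced here.

```python
# Random-assignment estimate of Section 5.2, from Table 1b.
# OVERHEAD_MS is an assumed figure, not the paper's seek model.

# (size in MB, transfer rate in MB/s) for zones 0-22 of the ST31200W
ZONES = [(139.7, 4.17), (67.0, 4.08), (45.7, 4.02), (45.3, 3.97),
         (43.9, 3.88), (42.6, 3.83), (41.3, 3.74), (56.4, 3.60),
         (39.2, 3.60), (37.5, 3.50), (51.5, 3.36), (35.0, 3.31),
         (33.8, 3.22), (46.5, 3.13), (31.8, 3.07), (30.7, 2.99),
         (41.8, 2.85), (28.0, 2.76), (27.6, 2.76), (37.1, 2.62),
         (25.0, 2.53), (33.5, 2.43), (25.4, 2.33)]

REQUEST_MB = 0.294     # average request size: 294 kilobytes
OVERHEAD_MS = 13.0     # assumed average seek + rotational latency

capacity = sum(size for size, _ in ZONES)
# P(request hits zone i) = size(Zi)/capacity;
# transfer time from zone i = average request size / tfr(Zi).
transfer_ms = 1000 * sum((size / capacity) * (REQUEST_MB / tfr)
                         for size, tfr in ZONES)
print(transfer_ms + OVERHEAD_MS)
```

Note that scaling every zone by the factor of four of Section 5.1 cancels out of the probabilities, so the unscaled table gives the same expected transfer time; with the assumed overhead the total lands near the paper's 100.4 msec.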
5.3 Heat tracking

In order to quantify the impact and effectiveness of the configuration parameters K and c (Equation 2) of the heat tracking module, we analyzed its performance by computing the root mean square (r.m.s.) deviation [Wal84] of the estimates computed by the on-line heat-tracking module from the overall values computed over the duration of the run. That is, we consider the quality factor Q:

    Q = sqrt( (1/N) × Σ_{i=1}^{N} (E(fi) − R(fi))² )

where N is the total number of files on the disk, E(fi) is the estimated heat value for file fi computed using the tracking module, and R(fi) denotes the real heat value, calculated as the fraction (number of accesses to fi) / (total number of accesses). Of course, this is not necessarily the only measure to use here. We are, in effect, tracking a moving target — the changing popularity of files. Nevertheless, this value can provide a good estimate of how much change there is among the heats of individual files throughout the run, and how responsive or unresponsive the heat-tracking module is to local fluctuations.
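The quality factor Q is straightforward to compute; a small helper follows, with illustrative heat values.

```python
# Quality factor Q of Section 5.3: r.m.s. deviation of estimated
# heats from real (whole-run) heats. The values below are illustrative.
import math

def quality_factor(estimated, real):
    """r.m.s. deviation between two file -> heat mappings."""
    files = list(real)
    n = len(files)
    return math.sqrt(sum((estimated[f] - real[f]) ** 2 for f in files) / n)

real = {"a": 0.5, "b": 0.3, "c": 0.2}         # fraction of all accesses
estimated = {"a": 0.4, "b": 0.35, "c": 0.25}  # tracker output
print(quality_factor(estimated, real))
```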
The deviation Q was computed after every 20,000 requests during the simulation of the trace. We investigated the following c values: 0.1, 0.25, 0.5, and 0.75. For each of the c values, the heat statistics queue length K was set to 5, 15, 25, and 50 in different experiments. The results indicate that for this particular trace, the best (i.e., smallest) overall deviation can be achieved with either K = 50 and c = 0.5 or K = 25 and c = 0.75, although in general the r.m.s. deviations are relatively consistent. A small c value that favors the more recent history of the frequency of access is more accurate at system startup time; however, its margin of error increases as the number of issued requests increases. Short heat queues (i.e., either K = 5 or K = 15) lead to more responsive heat tracking; however, they also increase the fluctuation in the estimated heat values, resulting in a higher margin of error.
5.4 On-line reorganization

We analyzed the on-line reorganization technique by assuming the worst-case scenario for the initial placement of files on the disk. Next, the simulator was invoked to learn the heat of files (K = 50, c = 0.5 in Equation 2) and was allowed to migrate the frequently accessed files from the slower zones to the faster ones using the algorithm developed earlier. We tallied the average service time of the disk for each 20,000 requests. Finally, we computed the percentage improvement in service time with reorganization as compared to the worst-case layout with no reorganization. Figure 3 shows the obtained results. In this figure, every tick on the x-axis represents 20,000 requests (approximately 15 days of operation). The y-axis represents the percent improvement. For comparison purposes, this figure also shows the percent improvement(4) in service time with OZ relative to the worst case layout with no reorganization.

(4) Assuming that S(layout) is a function that denotes the average service time of the system with a given layout, the percent improvement of a layout, say OZ, as compared to worst is computed as 100 × (S(worst) − S(OZ)) / S(worst). This number is always less than 100%.

[Figure 3: Percent improvement with on-line reorganization and with the optimal layout, plotted against the accumulated number of requests (0 to 400,000+); curves for OZ and for on-line reorganization.]
During the first 20,000 requests, the reorganization module improves the average service time by 16 percent. This improvement increases to 30 percent during the next 20,000 requests because the heat tracking module provides a better estimate of file heats. The reorganization module approximates the percent improvement provided by the optimal layout. It cannot provide a better improvement because it is approximating the heat of files, while OZ has organized the files based on "perfect" heat statistics (that is, heat statistics corresponding to global averages across the entire trace). The amount of fragmentation attributed to the reorganization process is negligible. At the end of the simulation, the reorganization process had resulted in four hundred non-contiguous chunks due to fragmentation (as compared to zero at system startup).
6 Discussion

The simulation results of the previous section assumed that the size of the file system was about equal to the capacity of the disk, and thus that the files were packed into zones, leaving no free space.(5) In the case where the file system is smaller than the disk capacity, the system may choose between two alternatives when assigning files to the zones: it can either (1) try to completely fill each of the fastest zones, to take advantage of their higher transfer rate, or (2) distribute the files more evenly across the disk, leaving free space in each of the zones. Filling up each of the fastest zones completely has the advantage of utilizing the whole space of these zones, and therefore maximizes the average observed transfer rate of the disk for its resident files. This can potentially give the greatest possible performance enhancement. However, this approach might suffer from the following limitation: greater difficulties during on-line compaction, file creation and update, which are more likely to lead to the fragmentation of files.

(5) The proposed online reorganization procedure, however, made no assumptions in this regard.

On the other hand, if we do not fill up each zone completely, it is likely that online compaction, file creation and file update will be more efficient and effective, and that fragmentation can be minimized during these operations because of the available free space within zones. (By comparison, the Unix Fast File System [MJLF84] groups the cylinders of a hard disk drive into cylinder groups. It then tries to localize related files within a cylinder group, while also spreading unrelated files across cylinder groups in order to improve performance by minimizing seek time. To achieve this, it usually reserves 10 percent of the space as free space for compaction.) In addition, it is likely that the initial stage of object promotions can be made cheaper, and that the possibility of multiple moves (both ping-ponging and domino effects) can be reduced. However, this strategy does not fully utilize the capacity of the fastest zones; therefore, it cannot be expected to achieve the best possible performance.

It is likely that these tradeoffs can best be managed in an application-dependent way. For example, the amount of free space reserved in each zone could be tuned to the projected stability of the file system (i.e., how often files are created, deleted and updated, and how much variation in the heat of files is expected).

In the trace driven simulation, we also did not deal with the issue of file creation (since file creation times were not explicitly represented in the trace). Nevertheless, the creation of new files can be handled within the framework developed above. Specifically, if we are given a prospective heat for the newly created file, we can determine from this what its appropriate zone should be. If we have free space for the file in this zone, it is inserted there; otherwise, the reorganization algorithm is applied to this file. If this also fails, the file can be placed into the zone with the most free space, which is most likely the inner-most, or slowest, zone. Updates can be handled in a similar manner.
`7 Conclusion and Future Directions
`
This study describes an optimal placement technique for the assignment of files to the zones of a disk drive. This technique strives to assign the frequently accessed files to the fastest zones in order to maximize the disk bandwidth. In order to respond to evolving access patterns, we outlined an on-line reorganization process that monitors the frequency of access to the files and migrates the frequently accessed ones to the faster zones upon reference. Our trace-driven simulation study demonstrates that: 1) the optimal layout improves the average service time of the system, 2) effective heat-tracking modules can be constructed, and 3) the on-line reorganization process can approximate the percentage improvement provided by the optimal organization. As discussed in Section 6, the reorganization process can be fine-tuned based on the characteristics of a target application.
`
As part of our future research activity, we intend to analyze alternative multi-zone disk drives and traces obtained from other ftp sites. In addition, we intend to analyze the concept of striping using multi-zone disk drives. When a file is striped across multiple disks, its retrieval time is determined by the fragment assigned to the slowest zone of the participating disks. Is it important to control the assignment of different fragments of a file to the zones of available disks? Does this impact the size of a fragment? What is the impact of this on the parity block and its computation? Some of these issues are addressed in the context of heterogeneous disks [CRS95], and we intend to extend these designs to multi-disk environments consisting of multi-zone disk drives. We intend to investigate these design decisions, along with the implementation details of the proposed ideas, using the Unix operating system.
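The observation that a striped file's retrieval time is governed by its slowest fragment can be illustrated with a back-of-the-envelope sketch; the per-zone transfer rates below are illustrative assumptions, not measurements from any particular drive.

```python
# Sketch: fragments of a striped file are read in parallel, one per disk,
# so the fragment placed on the slowest zone dictates when the whole
# stripe is available. Rates (MB/s) are illustrative assumptions.

def fragment_time(size_mb, rate_mbps):
    return size_mb / rate_mbps

def striped_retrieval_time(fragment_mb, zone_rates):
    # The stripe completes only when its slowest fragment arrives.
    return max(fragment_time(fragment_mb, r) for r in zone_rates)

# 1 MB fragments: all fragments on fast outer zones vs. one fragment
# landing on a slow inner zone at half the transfer rate.
fast = striped_retrieval_time(1.0, [9.0, 9.0, 9.0, 9.0])
mixed = striped_retrieval_time(1.0, [9.0, 9.0, 9.0, 4.5])
```

A single fragment on a half-speed zone doubles the stripe's retrieval time even though three of the four disks finish early, which is why controlling the zone assignment of individual fragments matters.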
`
References

[CABK88] G. Co
