throbber

`
`
`
`
`UNIFIED PATENTS
`
`EXHIBIT 1009
`
`
`UNIFIED PATENTS
`
`EXHIBIT 1009
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 1
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 1
`
`

`

`Unlted States Patent
`
`[19]
`
`[11] Patent Number:
`
`5,974,502
`
`
`DeKoning et al.
`[45] Date of Patent:
`Oct. 26, 1999
`
`U5005974502A
`
`[54] APPARATUS AND METHOD FOR
`ANALYZING AND MODIFYING DATA
`TRANSFER REGUESTS IN A RAID SYSTEM
`
`[75]
`
`InVentorS: Rodney AI DeKOning; Donald R.
`Humllcek; Curtls W. Rlnk, all of
`W1Ch1ta, Kans.
`
`[73] Assignee:
`
`ICJSleLogic Corporation, Milpitas,
`a 1 .
`
`[21] Appl. No.: 08/549,384
`
`[22]
`
`Filed:
`
`Oct. 27, 1995
`
`Int. Cl.6 ..................................................... G11B 17/22
`[51]
`[52] US. Cl.
`............................. 711/114; 711/111; 710/36;
`714/770
`[58] Field of Search ............................... 395/441, 182.05,
`395/182.04, 183.18; 371/401, 40.4, 51.1;
`711/114, 165, 111; 710/36, 39, 60
`
`[56]
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`......................... 395/200.32
`11/1992 Row et al.
`5,163,131
`8/1993 Beardsley et a1.
`395/440
`5,235,690
`
`5/1995 De Bey ................
`. 455/5.1
`5,421,031
`
`5,455,934 10/1995 HOHand 6t al~ -~
`711/4
`
`575197435
`5/1996 Anderson ~~~~~~~~~
`348/8
`5,519,849
`5/1996 Malan et a1.
`............................ 395/500
`
`3:33:33? 131332 file/(11%;: :lt‘ ail."""""""""""""
`3337;)?
`
`5,572,699
`11/1996 Kamo et a1.
`395/441
`11/1996 Rathunde ........................... 395/182.05
`5,574,851
`
`5,584,008 12/1996 Shimada et a1.
`........................ 395/441
`5,598,549
`1/1997 Rathunde ............... 711/11
`
`..... 711/114
`8/1997 Stallmo et a1.
`.
`5,657,468
`
`.
`5,721,823
`2/1998 Chen et a1.
`395/200.33
`
`3/1998 Taoda ...................................... 711/165
`OTHER PUBLICATIONS
`
`5,724,552
`
`Andrew S. Tanenbaum, “Modern Operating System”, 1992,
`Prentice Hall, Inc. , pp. 27_71.
`Primary Examiner—Eddie P. Chan
`Assistant Examiner—Hong Kim
`
`[57]
`
`ABSTRACT
`
`The invention provides a method and apparatus for increas-
`ing the efficiency of data transfer between a host computer
`and a disk array in a RAID system. The invention operates
`by Splitting up large 1/0 requests from the computer into
`smaller, more manageable pieces and processing the pieces
`as though they were individual
`I/O requests.
`In one
`embodiment, the invention keeps only a limited number of
`these smaller individual I/O requests “active” at any par-
`ticular time so that a single large I/O request cannot preclude
`other I/O requests from making progress in the controller.
`Both the size of the smaller I/O request pieces and the
`limited number of these pieces which will be “active” at any
`one time may be tunable parameters. The invention
`improves the efficiency of data transfer between the host
`computer and the array of disk drives by providing for
`increased overlap of activity in the controller. This increased
`overlap of activity results in increased controller throughput.
`
`22 Claims, 4 Drawing Sheets
`
`
`
`
`CALCULATE
`SIZE OF
`REQUEST
`
`SIZE
`
`FULL I/O
`
`REQUEST
`
` BLOCK
`
`REQUEST
`
`
`
`
`
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 2
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 2
`
`

`

`US. Patent
`
`Oct. 26, 1999
`
`Sheet 1 0f 4
`
`5,974,502
`
`><mm<me
`
`EMIEOEPZOO
`
`DE55.1»F...GE
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 3
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 3
`
`

`

`US. Patent
`
`Oct. 26, 1999
`
`Sheet 2 0f 4
`
`5,974,502
`
`nKOOIE
`
`5200.5
`
`
`
`
`
`mmZAOEHZOO><mm<xmfi.
`
`Om>ED
`
`Fv_OOI_m_
`
`0N\«mn__<m_mm
`
`._.mOI
`
`Emzb>m
`
`mKOOJm
`
`N>._._m<n_
`
`NIGE
`
`mm.om
`
`
`
`E3.09m
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 4
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 4
`
`
`
`
`

`

`US. Patent
`
`Oct. 26, 1999
`
`Sheet 3 0f 4
`
`5,974,502
`
`
`
`
`
`ZOPOZDu—EMIEOEFZOOD_<m
`
`MEI/NJ
`
`MN_mO:
`
`PmeOmm
`
`200.5
`
`mD_>_n_
`
`FwMDOmmOh
`
`CD.33“.
`
`HwMDOmm
`
`m._.<._30._<0
`
`“.0MN_m
`
`HwMDOmm
`
`v/Qv
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 5
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 5
`
`
`
`
`
`

`

`US. Patent
`
`Oct. 26, 1999
`
`Sheet 4 0f 4
`
`5,974,502
`
`_>_<m
`
`Vb
`
`m6
`
`Pmm30mmme30mmfi/. __________4on“mmmmoo<mm"<F<o___mumnszom<4
`
`x004m0:443;_mm
`
`oh20mm
`
`Dom10:26,
`
`20mm
`
`Dom
`
`
`
`m@H%IIIIIIIIIIIIIIL
`
`4on
`
`
`
`mmmox
`
`mezm
`
`
`
`mummsm
`
`
`
`><mm<O.§>_Om_n_
`
`mm_>_mn_
`
`IIIwIGE
`
`
`
`.50:OtEOmn.
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 6
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 6
`
`

`

`5,974,502
`
`1
`APPARATUS AND METHOD FOR
`ANALYZING AND MODIFYING DATA
`TRANSFER REGUESTS IN A RAID SYSTEM
`
`FIELD OF THE INVENTION
`
`The invention relates, in general, to magnetic data storage
`systems and, in particular, to an apparatus and technique for
`the efficient transfer of data between a host computer and an
`array of disk drives.
`BACKGROUND OF THE INVENTION
`
`Magnetic disk storage is currently the most widely used
`method of mass storage for computer systems. Traditionally,
`systems using this method of mass storage have included a
`single disk capable of storing large amounts of data.
`However, systems using an array of smaller capacity, less
`expensive disk drives are currently emerging as a low cost
`alternative to large single disk systems. These array systems
`are known as RAID (redundant array of inexpensive disks
`systems.
`When used in conjunction with a host computer, a RAID
`system appears to behave just like a single disk system.
`RAID systems, however, offer many advantages over single
`disk systems. Among the advantages of RAID technology
`are improved reliability and performance. Reliability is
`improved through the use of redundancy information in the
`array which allows the system to continue operating even
`though one of the drives in the array has failed. The failed
`drive may then be replaced, and the lost data regenerated,
`without having to shut down the system. Conversely, if the
`one disk in a single disk system fails, the system is rendered
`inoperable and valuable data may be lost. RAID systems
`may achieve improvement in performance over single disk
`systems by, among other things, allowing data to be read
`from and written to multiple disks in the array in parallel.
`This greatly increases the speed with which I/O operations
`can be performed.
`RAID technology encompasses a series of techniques for
`managing the operation of multiple disks. These techniques
`are discussed in an article entitled “A Case for Redundant
`
`Arrays of Inexpensive Disks (RAID)” by Patterson, Gibson,
`and Katz of the University of California (Report No. UCB/
`CSD 87/391, December 1987) which categorizes the differ-
`ent techniques into five RAID “levels” and is hereby incor-
`porated by reference. Each RAID level represents a different
`approach to storing and retrieving data and the associated
`redundancy information across the array of disk drives.
`FIG. 1 illustrates a typical RAID system 10. This system
`10 may be used to implement any of the five RAID levels.
`As seen in the figure,
`the system 10 includes:
`a host
`computer 12, a disk array controller 14, and an array of disk
`drives 16. The host computer 12, in addition to other tasks,
`is operative for sending I/O commands to the disk array
`controller 14, instructing the controller 14 to perform certain
`read and/or write operations. The disk array controller 14
`receives the commands from the host computer 12 and
`coordinates the transfer of data between the host computer
`12 and the array of disk drives 16 in accordance with specific
`stored algorithms. The array of disk drives 16 includes a
`plurality of drives which are each independently coupled to
`and controlled by the disk array controller 14 for receiving,
`storing, and retrieving data delivered to it by the controller
`14.
`
`As described above, the disk array controller 14 includes
`stored algorithms which the controller 14 uses to coordinate
`the transfer of data between the host computer 12 and the
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`array 16. The algorithms perform such tasks as determining
`and reserving an adequate amount of buffer space to perform
`a particular transfer, controlling the delivery of data between
`the buffer and the host computer 12, calculating the redun-
`dancy data to be stored in the disk array 16, distributing
`write data and redundancy data to the separate drives in the
`array 16, and retrieving data from the disk array 16. It should
`be apparent that the algorithm used to perform any particular
`transfer will depend on both the I/O function being
`requested and the RAID level being implemented.
`In performing its function of coordinating the transfer of
`data between the host computer 12 and the array of disk
`drives 16, it is desirable that the disk array controller 14
`operate as efficiently as possible. Improved efficiency of
`operation in the controller 14 will result in improved data
`transfer bandwidths between the host computer 12 and the
`array 16.
`
`SUMMARY OF THE INVENTION
`
`The invention provides a method and apparatus for
`increasing the efficiency of data transfer between a host
`computer and a disk array in a RAID system. The invention
`operates by splitting up large I/O requests from the computer
`into smaller, more manageable pieces and processing the
`pieces as though they were individual I/O requests. In one
`embodiment, the invention keeps only a limited number of
`these smaller individual I/O requests “active” at any par-
`ticular time so that a single large I/O request cannot preclude
`other I/O requests from making progress in the controller.
`Both the size of the smaller I/O request pieces and the
`limited number of these pieces which will be “active” at any
`one time may be tunable parameters. The invention
`improves the efficiency of data transfer between the host
`computer and the array of disk drives by allowing for
`increased overlap of activity in the controller.
`In other
`words, controller operations such as:
`(1) data transfers
`between the host and the buffer, (2) buffer setup in the
`controller, and (3) data transfers between the buffer and the
`drives can take place, to an increased extent, at the same
`time. This increased overlap of activity results in increased
`controller throughput.
`In one aspect of the present invention, a data storage
`system is provided comprising: (a) an array of disk drives;
`(b) a host computer capable of issuing I/O requests for read
`and write operations to be performed using the array of disk
`drives, the read and write operations requiring a transfer of
`data between the host computer and the array of disk drives;
`(c) means for coordinating the transfer of data between the
`host computer and the array of disk drives in response to the
`I/O requests from the host computer; and (d) means for
`determining an amount of data required to be transferred
`between the host computer and the array of disk drives for
`each I/O request issued by the host computer, comparing the
`amount of data to a predetermined amount, dividing each
`I/O request
`into a plurality of separate block requests
`whenever the amount of data exceeds the predetermined
`amount, each block request requiring a transfer of no more
`than the predetermined amount of data, and issuing each of
`the plurality of separate block requests to the means for
`coordinating as an individual I/O request. By dividing larger
`I/O requests, the system allows greater functional overlap in
`the means for coordinating which can result in increased
`controller throughput.
`In one embodiment of the system of the present invention,
`the means for coordinating processes no more than a pre-
`determined number of separate block requests at one time.
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 7
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 7
`
`

`

`5,974,502
`
`3
`This also allows greater functional overlap in the controller.
`In addition, both the predetermined amount of data and the
`predetermined number of separate block requests may be
`adjustable parameters to provide even further gains in func-
`tional overlap. For example,
`the system of the present
`invention may include a feedback loop for dynamically
`tuning either or both of the parameters based on measured
`system variables to ensure continuous, efficient data transfer
`between the host computer and the array of disk drives.
`In another aspect of the present invention, a method for
`increasing data transfer efficiency between a host computer
`and an array of disk drives is provided. The method com-
`prises the steps of: (a) receiving an I/O request from the host
`computer; (b) determining an amount of data required to be
`transferred between the host computer and the array of disk
`drives to process the I/O request; (c) dividing the I/O request
`into a plurality of block requests when the amount of data
`exceeds a predetermined amount of data, each of the plu-
`rality of block requests requiring a transfer of no more than
`the predetermined amount of data; and (d) processing each
`of the block requests as an individual I/O request. As
`described above, dividing larger I/O requests allows greater
`functional overlap in the means for coordinating which can
`result in increased controller throughput.
`In one embodiment, the method further includes the step
`of processing no more than a predetermined number of
`block requests concurrently in an array controller.
`In
`addition, both the predetermined amount of data and the
`predetermined number of block requests may be adjustable
`parameters. In this regard, the method may also include the
`step of adjusting either or both of these parameters to
`increase data transfer bandwidth between the host computer
`and the array of disk drives. Similarly,
`the method may
`include the step of dynamically adjusting either or both of
`the parameters based on a plurality of measured system
`variables, via a feedback loop.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block diagram illustrating a typical RAID data
`storage system;
`FIG. 2 is a block diagram illustrating a RAID system
`implementing RAID level 5;
`FIG. 3 is a block diagram illustrating a RAID system
`implementing one embodiment of the present invention; and
`FIG. 4 is a block diagram illustrating one embodiment of
`the RAID controller function unit 52 of FIG. 3.
`
`DETAILED DESCRIPTION
`
`invention provides a RAID data storage
`The present
`system which is capable of more efficient operation than past
`RAID systems. The invention accomplishes this, in part, by
`splitting up large I/O requests from the computer into
`smaller, more manageable pieces and processing these
`pieces as though they were individual I/O requests. In one
`embodiment, the invention keeps only a limited number of
`these smaller individual I/O requests “active” at any par-
`ticular time so that a single large I/O request cannot preclude
`other I/O requests from making progress in the controller.
`Both the size of the smaller I/O request pieces and the
`limited number of these pieces which will be “active” at any
`one time may be tunable parameters.
`To better understand the utility and functioning of the
`present invention, a description will now be made of the
`various RAID levels. As described above, RAID technology
`was originally divided into five separate levels, each level
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`representing a particular technique for storing and retrieving
`data and redundancy information to and from an array of
`disk drives. Since that time, other RAID levels have been
`defined, most of which are merely combinations of the
`original five levels. Consequently, only the original five
`levels will be described. It should be appreciated, however,
`that the present invention can be used with RAID systems
`implementing any of the various RAID techniques, such as
`RAID level 0, and also in other bridge controller architec-
`tures which control multiple devices, such as JBOD systems.
`RAID level 1 includes the RAID techniques known as
`mirroring and duplexing. Mirroring comprises making a
`duplicate copy of the write data and storing the original write
`data and the duplicate data on separate disks. In this way, if
`the disk storing the original data fails,
`the data is still
`available on the alternate drive. Duplexing is similar to
`mirroring but goes one step further by also providing a
`duplicate disk controller and cable to further increase redun-
`dancy in the system. These techniques offer a high degree of
`reliability, but are relatively expensive to implement because
`they require double the storage space that would normally be
`required to store the data. Nonetheless, the level 1 tech-
`niques are currently the most widely used in RAID systems.
`In RAID level 2, the write data is bit interleaved across the
`drives in the array (i.e., the first bit is stored on the first drive,
`the second bit is stored on the second drive, etc.) with some
`of the drives being reserved for error correctional codes
`(ECCs) corresponding to the stored write data. If one or
`more of the drives fail, the error correctional codes may be
`used to reconstruct the lost data. This technique offers high
`performance and reliability but, like the level 1 techniques,
`is very expensive to implement. The technique is expensive
`because the number of drives required to implement the
`technique corresponds to the word length used by the system
`(i.e., 32 drives in a 32 bit system) plus the additional drives
`required for storing the ECC. This level is normally used in
`applications requiring high data transfer rates.
`RAID level 3 uses byte interleaving across the drives in
`the array with one of the drives being reserved for parity
`information. If a data drive in the array fails,
`the parity
`information and the data on the surviving data drives can be
`used to reconstruct the lost data. If the parity drive fails, the
`data can be read from and written to the data drives as usual.
`
`Byte interleaving allows the level 3 system to employ all of
`the drives in the array in parallel for every read or write
`operation and,
`therefore,
`level 3 systems are generally
`referred to as parallel access arrays. Because the disk drives
`are accessed in parallel, level 3 systems support extremely
`high data transfer rates. As such, level 3 systems are best
`suited for applications having low to medium I/O volume
`with relatively long records.
`RAID level 4 employs block interleaving across the
`drives in the array with one of the drives being reserved for
`parity information. Block interleaving allows the reading or
`writing of a complete logical data unit in a single I/O data
`access. Systems using this technique are capable of reading
`multiple drives simultaneously (depending on block length),
`but cannot write to more than one drive at a time because all
`
`writes must share the same parity drive. Consequently, level
`4 systems are best suited for processing short records which
`are read more often than written. Because the disk drives in
`
`a level 4 system can be accessed individually, these systems
`are known as independent access arrays.
`RAID level 5 provides block interleaving and indepen-
`dent access like level 4, but instead of reserving a single
`drive for parity information, level 5 systems distribute the
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 8
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 8
`
`

`

`5,974,502
`
`5
`parity information across all of the drives in the array. FIG.
`2 illustrates one way of accomplishing such a distribution of
`the parity information. This distribution of the parity infor-
`mation allows the system to perform multiple write opera-
`tions concurrently, resulting in high transaction throughput.
`Level 5 systems are best suited to handle high I/O activity
`with relatively short records.
`Operation of a RAID system, implementing any of the
`above-described levels, with one or more failed drives is
`known as degraded mode operation. Once in degraded
`mode, the tolerance of the system to additional hardware
`faults is reduced or eliminated. To take a system out of the
`degraded mode of operation,
`it
`is usually necessary to
`replace, or “hot-swap”, the failed drive with a fully opera-
`tional drive. After hot-swapping the failed drive, the missing
`data may then be reconstructed by the RAID controller using
`the redundancy information and stored on the new drive,
`thus restoring the system to full fault tolerance.
`The invention will be described as implemented in a
`RAID level 5 data storage system 20, such as the one
`illustrated in FIG. 2. As seen in the figure, a RAID level 5
`system may include: a host system 22, a disk array controller
`24, and an array of disk drives 26. As described previously;
`the disk array controller 24 receives I/O requests from the
`host 22 and performs these I/O requests by coordinating the
`transfer of data between the host 22 and the array of disk
`drives 26. The array of disk drives 26 may include any
`number of drives, each of which is capable of being read
`from and written to by the controller 24.
`As mentioned previously, in a RAID level 5 system, data
`is block interleaved among all of the drives 30—38 in the
`array 26 with interspersed parity information. FIG. 2 illus-
`trates one method of accomplishing this. In FIG. 2, each
`parity block stored in the array corresponds to 4 data blocks
`to form a redundancy group, i.e., PARITY 0 corresponds to
`BLOCKs 0, 1, 2, and 3 to form a first redundancy group,
`PARITY 1 corresponds to BLOCKs 4, 5, 6, and 7 to form a
`second redundancy group, etc. As seen in the figure, parity
`information is stored on every disk in the array 26 in a
`staggered fashion across the array.
`The parity information stored in the parity block of any
`particular redundancy group may be calculated in the con-
`troller 24 using a bit wise exclusive-OR of the data stored in
`the 4 corresponding data blocks. In this regard, anytime it is
`desired to rewrite the data in one of the data blocks of a
`
`redundancy group, the corresponding parity information has
`to be recalculated and resaved. Recalculation of the parity
`information can be accomplished in one of two ways. In one
`approach, the controller can read the data blocks from the
`four data drives which are not being written to and
`exclusive-OR these with the new write data to create the new
`
`parity information. In a simpler and preferred approach,
`called the Read-Modify-Write method, the controller first
`reads the data block to which new data is to be written and
`
`the corresponding parity block and then bit-wise exclusive-
`ORs the two blocks with the new data to create the new
`
`parity information. The new data and new parity information
`are then written to the appropriate blocks.
`If one of the disk drives 30—38 in the array 26 fails,
`operation of the RAID level 5 data storage system 20 may
`continue in degraded mode. In other words, if drive D in
`FIG. 2 fails, data can still be read from and written to the
`array 26, even if the data is supposed to be stored on drive
`D. For example, suppose Drive D fails and it is desired to
`write data to BLOCK 2. Instead of writing the data to
`BLOCK 2, the level 5 controller simply reads BLOCKs 0,
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`1, and 3 and exclusive-ORs them with the new data to create
`new parity information for PARITY 0. When it is desired to
`read from BLOCK 2, the controller simply reads BLOCKs
`0, 1, and 3 and PARITY 0 and uses the data obtained to
`calculate the desired data.
`
`FIG. 5 illustrates a computer system 40 implementing one
`embodiment of the present invention. The computer system
`40 includes: a host computer 42, a disk array controller 44,
`and an array of disk drives 46. The host computer 42 delivers
`I/O requests to the disk array controller 44 via I/O request
`line 48. The disk array controller 44 then processes the I/O
`requests by controlling the transfer of data between the host
`computer 42 and the array of disk drives 46. When a write
`operation is being performed, the host 42 delivers data to the
`controller, via data line 50, where it is separated, processed,
`and distributed to the array of disk drives 46 according to the
`appropriate RAID level. When a read operation is being
`performed,
`the controller 44 accesses and assembles the
`requested data from the array of disk drives 46 according to
`the appropriate RAID level and delivers the data to the host
`computer via data line 50.
`As illustrated in FIG. 3, the disk array controller 44 may
`include: a request size calculation unit (RSCU) 54, a com-
`parator 56, a switch 58, a request division unit (RDU) 60,
`and a RAID controller function unit (RCFU) 52. The con-
`troller 44 may be implemented in the form of a plug-in card
`for a computer or as a resident circuit in a drive array
`subsystem. The RSCU 54 is operative for receiving I/O
`requests from the host computer 42 via I/O request line 48
`and for calculating the amount of data which each I/O
`request requires to be transferred between the host 42 and
`the array 46. The comparator 56 receives a signal from the
`RSCU 54 indicative of the size of a current request and
`compares the size of the current request to a LARGE I/O
`SIZE parameter received from the RCFU 52. The LARGE
`I/O SIZE parameter represents the largest single I/O request
`which will be processed by the RCFU 52 of the present
`invention and, as will be described in more detail shortly,
`may be a tunable parameter.
`The comparator 56 creates an output signal indicative of
`whether the size of the current I/O request is greater than the
`LARGE I/O SIZE parameter. This output signal is then used
`by switch 58 to determine whether to deliver the current I/O
`request directly to the RCFU 52 or to deliver it first to the
`RDU 60. If the size of the current I/O request is less than or
`equal to the LARGE I/O SIZE parameter, than the entire
`request is delivered to the RCFU 52 for processing. If the
`size of the current I/O request is greater than the LARGE I/O
`SIZE parameter, the request is delivered to the RDU 60
`which, among other things, divides the current I/O request
`into a plurality of block requests which are each equal to or
`smaller in size than the LARGE I/O SIZE parameter. The
`RDU then delivers the block requests to the RCFU 52.
`As illustrated in FIG. 3,
`the RDU 60 may receive a
`NUMBER OF CONCURRENT LARGE I/O PIECES
`(NCLIOP) parameter from the RCFU 52. The NCLIOP
`parameter represents the total number of block requests that
`will be active in the RCFU 52 at any one time. The RDU 60
`receives the NCLIOP parameter and uses it to determine
`how many block requests to issue concurrently to the RCFU
`52. As with the LARGE I/O SIZE parameter, the NCLIOP
`parameter of the present invention may be a tunable param-
`eter.
`FIG. 4 illustrates one embodiment of the RFCU 52 of
`
`FIG. 3. As seen in the figure, the RFCU 52 may include: a
`microprocessor 62, a random access memory (RAM) 64, an
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 9
`
`UNIFIED PATENTS EXHIBIT 1009
`PAGE 9
`
`

`

`5,974,502
`
`7
`
`exclusive-OR (XOR) engine 66, and a buffer memory 68.
`The microprocessor 62 is operative for controlling the
`timing and operation of the RCFU 52. The microprocessor
`62 receives the current I/O request either as a full 1/0 request
`from the switch 58 or as a series of block requests from the
`RDU 60 and proceeds to process the requests. To process the
`requests, the microprocessor 62 must perform such func-
`tions as: (a) allotting microprocessor overhead, (b) reserving
`space in the buffer memory 68 to receive and store data from
`the host 42 or from the array 46, (c) controlling the transfer
`of data between the buffer 68 and the host 42, (d) requesting
`the XOR engine 66 to perform operations using data stored
`in the buffer 68 in order to produce redundancy information
`or to regenerate data stored on a failed drive, (e) controlling
`the transfer of data and redundancy information from the
`array 46 to the buffer 68, and (f) controlling the transfer of
`data and redundancy information from the buffer 68 to the
`proper locations in the array 68. All of these operations will
`be performed in accordance with the RAID level which is
`being implemented in the controller. The RAM 64 is opera-
`tive for, among other things, storing the algorithms used by
`the microprocessor 62 in performing the I/O requests.
`As described previously, the NCLIOP parameter repre-
`sents the total number of block requests that will be active
`in the RCFU 52 at any one time. By “active” it is meant that
`the microprocessor will only reserve enough buffer
`resources to process a specific number of the block requests,
`X, at any one time. Initially, the microprocessor will load
`data into the buffer 68 corresponding to the first X block
`requests and will begin to process the data. When the
`processing of the data corresponding to one of the X block
`requests is completed, the microprocessor 62 will load data
`into the buffer 68 corresponding to the next block request
`and will begin to process that data. The result of this is that
`the total number of block requests which are active, or are
`becoming active, in the RCFU 52 is constant until the full
`1/0 request is near completion.
`The disk array controller 44 of FIG. 3 may be used in
`conjunction with any number of different host computers or
`drive arrays. These host computers and drive arrays may
`operate at any number of different channel speeds. The
`present invention is capable of tuning the LARGE I/O SIZE
`parameter and the NCLIOP parameter to increase the effi-
`ciency of data transfer between the host 42 and the array 46.
`In general, the optimal values for the LARGE I/O SIZE and
`NCLIOP parameters will be the values which allow the
`greatest amount of functional overlap to occur in the RCFU
`52. In other words, the optimal values of the two parameters
`will result in the RCFU 52 being able to perform a maximum
`number of different functions concurrently and, therefore, to
`maximize data throughput in the controller 24.
`In one embodiment of the present
`invention, a static
`tuning approach is implemented. In this approach, the RAM
`64, or another memory device, stores substantially optimal
`values for the LARGE I/O SIZE and NCLIOP parameters
`for each possible combination of system variables, such as
`host channel speed and drive channel speed. These substan-
`tially optimal values may be determined empirically or by
`utilizing an appropriate algorithm. The microprocessor 62
`determines the type of host 42 and drive array 46 which are
`being used by dynamically inquiring the active channels and
`by knowing the respective clock frequencies. Once the
`microprocessor 62 determines the type of host 42 and drive
`array 46 that
`is being used,
`it retrieves the appropriate
`parameter data from the RAM 64. The microprocessor 62
`then transmits the LARGE I/O SIZE parameter to the
`comparator 56 and the NCLIOP parameter to the RDU 60.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`8
`In addition to host channel speed and drive channel speed,
`other system variables may also be used in the above-
`described embodiment. For example, optimal values for the
`LARGE I/O SIZE and NCLIOP parameters can be found
`and stored for every possible combination of host channel
`speed, drive channel speed, XOR engine processing speed,
`and processor overhead. Values for all of these variables
`would then have to be estimated by the controller and the
`appropriate parameters retrieved from the memory. A prob-
`lem with this embodiment is that once a particular set of
`optimal values is being used, the system is unable to change
`the parameters if one or more of the system variables
`changes.
`in another embodiment of the present
`Therefore,
`invention, a dynamic feedback tuning approach is imple-
`mented. In this approach, the microprocessor 62 periodically
`estimates the values of a number of different feedback
`
`variables and uses these values to dynamically tune the
`LARGE I/O SIZE and NCLIOP parameters. The feedback
`approach offers the advantage that it is free running and
`therefore is constantly adapting the drive array controller 44
`to the needs of the system 40.
`Among the feedback variables which may be used by the
`invention are: measured or estimated incoming 1/0 rate, 1/0
`rate from the disk array, XOR processing rate, cache full
`percentage, controller overhead, and others. The micropro-
`cessor 62 can, for example, approximate the values of these
`variables by utilizing known feedback techniques. Once the
`values of these parameters are known or approximated, the
`microprocessor 62 can tune the values of the LARGE I/O
`SIZE and NCLIOP parameters according to a formula or by
`other methods. The formula can be developed by empirical
`means or based upon known I/O relationships.
`As mentioned previously, the dynamic feedback tuning
`approach periodically determines values for a number of
`feedback variables and uses the variables to update the
`LARGE I/O SIZE and NCLIOP parameters. During the
`period between updates, the RCFU 52 processes the current
`block requests using the current values of the parameters. It
`should be appreciated that in one embodiment of the present
`invention, the length of the period between updates, known
`as the I/O interval, is another system parameter which may
`be tuned in the controller 44.
`
`The operation of an embodiment of the present invention
`employ

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket