UNIFIED PATENTS
EXHIBIT 1011
PAGE 1
United States Patent [19]                                [11] Patent Number: 5,617,530
Stallmo et al.                                           [45] Date of Patent: *Apr. 1, 1997

US005617530A

[54] STORAGE DEVICE ARRAY ARCHITECTURE WITH COPYBACK CACHE

[75] Inventors: David C. Stallmo; William A. Brant, both of Boulder, Colo.

[73] Assignee: EMC Corporation, Hopkinton, Mass.

[*] Notice: The term of this patent shall not extend beyond the expiration date of Pat. No. 5,526,482.

[21] Appl. No.: 579,545

[22] Filed: Dec. 27, 1995

Related U.S. Application Data

[63] Continuation of Ser. No. 112,791, Aug. 26, 1993, Pat. No. 5,526,482, which is a continuation-in-part of Ser. No. 638,167, Jan. 4, 1991, Pat. No. 5,274,799.

[51] Int. Cl.6 .......................... G06F 11/10
[52] U.S. Cl. .......................... 395/182.04; 395/441
[58] Field of Search .................. 395/182.04, 182.03, 182.01, 181, 182.13; 371/40.1

[56] References Cited

U.S. PATENT DOCUMENTS

3,893,178   7/1975  Sordello .............. 360/73.02
4,092,732   5/1978  Ouchi ................. 395/182.03
4,430,701   2/1984  Christian et al. ...... 395/446
4,467,421   8/1984  White ................. 395/445
4,562,576  12/1985  Ratcliffe ............. 395/185.02
4,593,354   6/1986  Ushiro ................ 395/185.01
4,604,750   1/1986  Manton ................ 371/40.2
4,667,326   5/1987  Young et al. .......... 371/39.1

(List continued on next page.)

FOREIGN PATENT DOCUMENTS

0416968A2    8/1990  European Pat. Off.
WO89/09468  10/1989  WIPO

OTHER PUBLICATIONS

Patterson, D.A., Gibson, G., and Katz, R.H.; A Case For Redundant Arrays of Inexpensive Disks (RAID) (Jun. 1, 1988).
Lee, E.K.; Software and Performance Issues in the Implementation of a RAID Prototype (May 1990).
Chen, P., Gibson, G., Katz, R.H., Patterson, D.A., and Schulze, M.; Introduction to Redundant Arrays of Inexpensive Disks (RAID) (Dec. 1988).

(List continued on next page.)

Primary Examiner—Robert W. Beausoliel, Jr.
Assistant Examiner—Ly V. Hua
Attorney, Agent, or Firm—William J. Kubida; Richard A. Bachand; Holland & Hart LLP

[57] ABSTRACT

A fault-tolerant storage device array using a copyback cache storage unit for temporary storage. When a Write occurs to the RAID system, the data is immediately written to the first available location in the copyback cache storage unit. Upon completion of the Write to the copyback cache storage unit, the host CPU is immediately informed that the Write was successful. Thereafter, further storage unit accesses by the CPU can continue without waiting for an error-correction block update for the data just written. In a first embodiment of the invention, Read-Modify-Write operations are performed during idle time. In a second embodiment of the invention, normal Read-Modify-Write operations by the RAID system controller continue using Write data in the controller's buffer memory. In a third embodiment, at least two controllers, each associated with one copyback cache storage unit, copy Write data from controller buffers to the associated copyback cache storage unit. If a copyback cache storage unit fails, more than one controller can share a single copyback cache storage unit. In a fourth embodiment, Write data is copied from a controller buffer to a reserved area of each storage unit comprising the array.

32 Claims, 10 Drawing Sheets

[Front-page drawing: block diagram showing CONTROLLER #2 and reference numerals 403, 405, 407. Drawing not reproducible in text.]

5,617,530
Page 2

U.S. PATENT DOCUMENTS

4,722,085   1/1988  Flora et al. .......... 371/40.1
4,733,396   3/1988  Baldwin ............... 371/40.2
4,754,397   6/1988  Varaiya et al. ........ 361/685
4,761,785   8/1988  Clark et al. .......... 371/51.1
4,768,193   8/1988  Takemae ............... 371/10.3
4,775,978  10/1988  Hartness .............. 371/40.1
4,791,642  12/1988  Taylor ................ 371/40.2
4,817,035   3/1989  Timsit ................ 395/185.07
4,849,929   3/1989  Timsit ................ 395/182.03
4,870,643   9/1989  Bultman et al. ........ 395/182.05
4,899,342   2/1990  Potter et al. ......... 395/182.04
4,914,656   4/1990  Dunphy, Jr. et al. .... 395/182.05
4,920,539   4/1990  Albonesi .............. 371/40.2
4,958,351   9/1990  Flora ................. 371/40.1
4,993,030   2/1991  Krakauer et al. ....... 371/40.1
4,995,041   2/1991  Hetherington .......... 371/40.1
5,058,116  10/1991  Chao .................. 371/40.2
5,185,876   2/1993  Nguyen et al. ......... 395/841
5,274,799  12/1993  Brant et al. .......... 395/182.04
5,402,428   3/1995  Kakuta et al. ......... 395/182.04
5,463,765  10/1995  Kakuta et al. ......... 395/182.04
5,499,337   3/1996  Gordon ................ 395/182.04

OTHER PUBLICATIONS

Chen, P., Gibson, G., Katz, R.H., Patterson, D.A., and Schulze, M.; Evolution of the RAID 2 Architecture (Jun. 12, 1990).
Chen, P., Gibson, G., Katz, R.H., Patterson, D.A., and Schulze, M.; How Reliable is RAID? (Feb. 1988).
Maximum Strategy, Inc., San Jose, CA; Strategy 2 Disk Array Controller Operation Manual (Nov. 2, 1988).
Maximum Strategy, Inc., San Jose, CA; Strategy 1 Disk Array Controller Operation Manual (Date Unknown).
Gibson, G.A.; Performance and Reliability in Redundant Arrays of Inexpensive Disks (Date Unknown).
Chen, P.; An Evaluation of Redundant Arrays of Disks Using an Amdahl 5890 (May 1989).
Katz, R.H., Gibson, G.A., and Patterson, D.A.; Disk System Architectures for High Performance Computing (Mar. 1989).
Gray, J., Horst, B., and Walker, M.; Parity Striping and Disc Arrays: Low-Cost Reliable Storage with Acceptable Throughput (Jan. 1990).
Schultz, M.E.; Considerations in the Design of a Raid Prototype (Aug. 1988).
Clark and Corrigan; IBM Systems Journal, vol. 28, No. 3, 1989.
Schulze, M., et al.; "How Reliable is RAID?"; Compcon, Spring 1989, pp. 1-12.
Ousterhout, J., and Douglis, F.; Beating the I/O Bottleneck: A Case for Log-Structured File Systems (Aug. 4, 1988).
[U.S. Patent, Apr. 1, 1997, Sheet 1 of 10, 5,617,530. FIG. 1: block diagram of a copyback cache RAID system (reference numerals 1, 2, 3). Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 2 of 10, 5,617,530. FIG. 2A: flow chart of Read and Write operation in the first embodiment. Recoverable step labels (steps 20-37): has CPU issued Write command; write block to cache; update directory to point to cache & flag array location; acknowledge Write operation to CPU; has CPU issued Read command; any pending blocks; read a pending block; read old data & parity blocks on flagged stripe; XOR old data & parity blocks w/pending block for new parity block. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 3 of 10, 5,617,530. FIG. 2B: continuation of the FIG. 2A flow chart. Recoverable step labels (steps 27-39): store new parity & pending block at flagged array location; update directory entry to point to array; read directory for pointer to data block; is data block in array; read data block from array and transfer to CPU; read data block from cache and transfer to CPU; return to start. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 4 of 10, 5,617,530. FIG. 3A: flow chart of Read and Write operation in the second embodiment. Recoverable step labels (steps 50-67): has CPU issued Write command; write block to cache; update directory to point to buffer & flag array location; acknowledge Write operation to CPU; has CPU issued Read command; any pending blocks; access a pending block; read old data & parity blocks on flagged stripe; XOR old data & parity blocks w/pending block for new parity block. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 5 of 10, 5,617,530. FIG. 3B: continuation of the FIG. 3A flow chart. Recoverable step labels (steps 57-68): store new parity block & pending block at flagged array location; update directory entry to point to array; read directory for pointer to data block; is data block in array; read data block from array and transfer to CPU; read data block from buffer and transfer to CPU; return to start. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 6 of 10, 5,617,530. FIG. 4: block diagram of a copyback cache RAID system with multiple controllers (reference numerals 407, 408; controller blocks labeled #1 and #2). Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 7 of 10, 5,617,530. FIG. 5A: flow chart of Read and Write operations for the multiple-controller embodiment. Recoverable step labels (STEPS 501-516): Write request received; Read request received; write pending data block to controller buffer; determine location of data block to be read; append header and trailer to pending data block; copy pending data block, header, and trailer to storage unit; read data from array and transfer to CPU; acknowledge Write operation to CPU; return to start. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 8 of 10, 5,617,530. FIG. 5B: continuation of the FIG. 5A flow chart. Recoverable step labels (STEPS 520-542): any pending data blocks in controller buffer; access pending data block from buffer; read old data block from array; write pending data block to array; invalidate bit map in controller buffer; note sequence number last used; read old parity block; XOR old data and parity block w/pending data block; write new parity block in array; read sequence number last used; same as last time read; invalidate bit map in copyback cache storage unit; return to start. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 9 of 10, 5,617,530. FIG. 6: flow chart of the steps taken when a copyback cache storage unit becomes unavailable. Recoverable step labels (STEPS 601-615): copyback cache storage unavailable; any pending data blocks; write pending data blocks to array; hold up subsequent Write operations; rebuild failed storage unit; any unused storage units; use unused storage unit as copyback cache storage unit; at least one other copyback cache storage unit; share that storage unit; acknowledge Write complete only after RMW sequence is complete; end. Drawing not reproducible in text.]
[U.S. Patent, Apr. 1, 1997, Sheet 10 of 10, 5,617,530. FIG. 7: block diagram of a RAID system in which pending data blocks are spread over the storage units comprising the array (CPU; reference numerals 408, 501). Drawing not reproducible in text.]
STORAGE DEVICE ARRAY ARCHITECTURE WITH COPYBACK CACHE

RELATED APPLICATION

This is a continuation of application Ser. No. 08/112,791 to David C. Stallmo and William A. Brant, filed Aug. 26, 1993 for Storage Device Array Architecture with Copyback Cache, now U.S. Pat. No. 5,526,482, which issued Jun. 11, 1996, and which is a continuation-in-part of application Ser. No. 07/638,167 of William A. Brant, filed Jan. 4, 1991, entitled Storage Device Array Architecture With Copyback Cache, now U.S. Pat. No. 5,274,799, which issued Dec. 28, 1993.
BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to computer system data storage, and more particularly to a fault-tolerant storage device array using a copyback cache storage unit for temporary storage.

2. Description of Related Art

A typical data processing system generally involves one or more storage units which are connected to a Central Processor Unit (CPU) either directly or through a control unit and a channel. The function of the storage units is to store data and programs which the CPU uses in performing particular data processing tasks.

Various types of storage units are used in current data processing systems. A typical system may include one or more large capacity tape units and/or disk drives (magnetic, optical, or semiconductor) connected to the system through respective control units for storing data.

However, a problem exists if one of the large capacity storage units fails such that information contained in that unit is no longer available to the system. Generally, such a failure will shut down the entire computer system.

The prior art has suggested several ways of solving the problem of providing reliable data storage. In systems where records are relatively small, it is possible to use error correcting codes which generate ECC syndrome bits that are appended to each data record within a storage unit. With such codes, it is possible to correct a small amount of data that may be read erroneously. However, such codes are generally not suitable for correcting or recreating long records which are in error, and provide no remedy at all if a complete storage unit fails. Therefore, a need exists for providing data reliability external to individual storage units.

Other approaches to such "external" reliability have been described in the art. A research group at the University of California, Berkeley, in a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)", Patterson, et al., Proc. ACM SIGMOD, June 1988, has catalogued a number of different approaches for providing such reliability when using disk drives as storage units. Arrays of disk drives are characterized in one of five architectures, under the acronym "RAID" (for Redundant Arrays of Inexpensive Disks).

A RAID 1 architecture involves providing a duplicate set of "mirror" storage units and keeping a duplicate copy of all data on each pair of storage units. While such a solution solves the reliability problem, it doubles the cost of storage. A number of implementations of RAID 1 architectures have been made, in particular by Tandem Corporation.
A RAID 2 architecture stores each bit of each word of data, plus Error Detection and Correction (EDC) bits for each word, on separate disk drives (this is also known as "bit striping"). For example, U.S. Pat. No. 4,722,085 to Flora et al. discloses a disk drive memory using a plurality of relatively small, independently operating disk subsystems to function as a large, high capacity disk drive having an unusually high fault tolerance and a very high data transfer bandwidth. A data organizer adds 7 EDC bits (determined using the well-known Hamming code) to each 32-bit data word to provide error detection and error correction capability. The resultant 39-bit word is written, one bit per disk drive, on to 39 disk drives. If one of the 39 disk drives fails, the remaining 38 bits of each stored 39-bit word can be used to reconstruct each 32-bit data word on a word-by-word basis as each data word is read from the disk drives, thereby obtaining fault tolerance.

An obvious drawback of such a system is the large number of disk drives required for a minimum system (since most large computers use a 32-bit word), and the relatively high ratio of drives required to store the EDC bits (7 drives out of 39). A further limitation of a RAID 2 disk drive memory system is that the individual disk actuators are operated in unison to write each data block, the bits of which are distributed over all of the disk drives. This arrangement has a high data transfer bandwidth, since each individual disk transfers part of a block of data, the net effect being that the entire block is available to the computer system much faster than if a single drive were accessing the block. This is advantageous for large data blocks. However, this arrangement also effectively provides only a single read/write head actuator for the entire storage unit. This adversely affects the random access performance of the drive array when data files are small, since only one data file at a time can be accessed by the "single" actuator. Thus, RAID 2 systems are generally not considered to be suitable for computer systems designed for On-Line Transaction Processing (OLTP), such as in banking, financial, and reservation systems, where a large number of random accesses to many small data files comprises the bulk of data storage and transfer operations.
A RAID 3 architecture is based on the concept that each disk drive storage unit has internal means for detecting a fault or data error. Therefore, it is not necessary to store extra information to detect the location of an error; a simpler form of parity-based error correction can thus be used. In this approach, the contents of all storage units subject to failure are "Exclusive OR'd" (XOR'd) to generate parity information. The resulting parity information is stored in a single redundant storage unit. If a storage unit fails, the data on that unit can be reconstructed on to a replacement storage unit by XOR'ing the data from the remaining storage units with the parity information. Such an arrangement has the advantage over the mirrored disk RAID 1 architecture in that only one additional storage unit is required for "N" storage units. A further aspect of the RAID 3 architecture is that the disk drives are operated in a coupled manner, similar to a RAID 2 system, and a single disk drive is designated as the parity unit.
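The XOR generation and reconstruction just described can be stated concretely in a few lines. The following is an illustrative sketch only; the three-unit array and block contents are hypothetical, and the patent does not prescribe any particular implementation:

```python
from functools import reduce

def parity(blocks):
    """XOR a list of equal-length data blocks, byte by byte, into one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Rebuild a failed unit's block by XOR'ing the survivors with the parity block."""
    return parity(surviving_blocks + [parity_block])

# Three hypothetical data units plus one parity unit:
data = [bytes([1, 2, 3]), bytes([4, 5, 6]), bytes([7, 8, 9])]
p = parity(data)
# If unit 1 fails, its contents are recovered from the remaining units and parity:
assert reconstruct([data[0], data[2]], p) == data[1]
```

Because XOR is its own inverse, the same routine serves both to generate parity and to reconstruct a lost block.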
One implementation of a RAID 3 architecture is the Micropolis Corporation Parallel Drive Array, Model 1804 SCSI, that uses four parallel, synchronized disk drives and one redundant parity drive. The failure of one of the four data disk drives can be remedied by the use of the parity bits stored on the parity disk drive. Another example of a RAID 3 system is described in U.S. Pat. No. 4,092,732 to Ouchi.

A RAID 3 disk drive memory system has a much lower ratio of redundancy units to data units than a RAID 2 system. However, a RAID 3 system has the same performance limitation as a RAID 2 system, in that the individual disk actuators are coupled, operating in unison. This adversely affects the random access performance of the drive array when data files are small, since only one data file at a time can be accessed by the "single" actuator. Thus, RAID 3 systems are generally not considered to be suitable for computer systems designed for OLTP purposes.
A RAID 4 architecture uses the same parity error correction concept of the RAID 3 architecture, but improves on the performance of a RAID 3 system with respect to random reading of small files by "uncoupling" the operation of the individual disk drive actuators, and reading and writing a larger minimum amount of data (typically, a disk sector) to each disk (this is also known as block striping). A further aspect of the RAID 4 architecture is that a single storage unit is designated as the parity unit.

A limitation of a RAID 4 system is that Writing a data block on any of the independently operating data storage units also requires writing a new parity block on the parity unit. The parity information stored on the parity unit must be read and XOR'd with the old data (to "remove" the information content of the old data), and the resulting sum must then be XOR'd with the new data (to provide new parity information). Both the data and the parity records then must be rewritten to the disk drives. This process is commonly referred to as a "Read-Modify-Write" sequence.
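The Read-Modify-Write parity update described above amounts to two XOR operations. A minimal sketch with hypothetical one-byte blocks (the patent itself specifies no code):

```python
def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def read_modify_write(old_data, old_parity, new_data):
    """RAID 4/5 parity update: 'remove' the old data's contribution from the
    parity, then 'add' the new data's. Returns the new parity block."""
    intermediate = xor_blocks(old_parity, old_data)   # remove old information
    return xor_blocks(intermediate, new_data)         # add new information

# The updated parity must equal the XOR of all current data blocks in the stripe:
d1, d2, d3 = bytes([10]), bytes([20]), bytes([30])
stripe_parity = xor_blocks(xor_blocks(d1, d2), d3)
new_d2 = bytes([99])
new_parity = read_modify_write(d2, stripe_parity, new_d2)
assert new_parity == xor_blocks(xor_blocks(d1, new_d2), d3)
```

Note that only the changed data block and the parity block are touched; the other data blocks in the stripe never need to be read.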
Thus, a Read and a Write on the single parity unit occurs each time a record is changed on any of the data storage units covered by the parity record on the parity unit. The parity unit becomes a bottleneck to data writing operations since the number of changes to records which can be made per unit of time is a function of the access rate of the parity unit, as opposed to the faster access rate provided by parallel operation of the multiple data storage units. Because of this limitation, a RAID 4 system is generally not considered to be suitable for computer systems designed for OLTP purposes. Indeed, it appears that a RAID 4 system has not been implemented for any commercial purpose.
A RAID 5 architecture uses the same parity error correction concept of the RAID 4 architecture and independent actuators, but improves on the writing performance of a RAID 4 system by distributing the data and parity information across all of the available disk drives. Typically, "N+1" storage units in a set (also known as a "redundancy group") are divided into a plurality of equally sized address areas referred to as blocks. Each storage unit generally contains the same number of blocks. Blocks from each storage unit in a redundancy group having the same unit address ranges are referred to as "stripes". Each stripe has N blocks of data, plus one parity block on one storage unit containing parity for the remainder of the stripe. Further stripes each have a parity block, the parity blocks being distributed on different storage units. Parity updating activity associated with every modification of data in a redundancy group is therefore distributed over the different storage units. No single unit is burdened with all of the parity update activity.

For example, in a RAID 5 system comprising 5 disk drives, the parity information for the first stripe of blocks may be written to the fifth drive; the parity information for the second stripe of blocks may be written to the fourth drive; the parity information for the third stripe of blocks may be written to the third drive; etc. The parity block for succeeding stripes typically "precesses" around the disk drives in a helical pattern (although other patterns may be used).
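The "precessing" placement in the five-drive example above can be modeled with a simple modular formula. The formula below is one common convention for rotating parity and is offered only as an illustration; as the text notes, other patterns may be used:

```python
def parity_drive(stripe, n_drives=5):
    """Rotating-parity placement: stripe 0's parity goes on the last drive,
    stripe 1's on the next-to-last, and so on, wrapping around."""
    return (n_drives - 1 - stripe) % n_drives

# Five drives: parity precesses 4, 3, 2, 1, 0, then wraps back to 4.
assert [parity_drive(s) for s in range(6)] == [4, 3, 2, 1, 0, 4]
```

Spreading the parity blocks this way means every drive carries an equal share of the parity update traffic over a full rotation of stripes.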
Thus, no single disk drive is used for storing the parity information, and the bottleneck of the RAID 4 architecture is eliminated. An example of a RAID 5 system is described in U.S. Pat. No. 4,761,785 to Clark et al.
As in a RAID 4 system, a limitation of a RAID 5 system is that a change in a data block requires a Read-Modify-Write sequence comprising two Read and two Write operations: the old parity block and old data block must be read and XOR'd, and the resulting sum must then be XOR'd with the new data. Both the data and the parity blocks then must be rewritten to the disk drives. While the two Read operations may be done in parallel, as can the two Write operations, modification of a block of data in a RAID 4 or a RAID 5 system still takes substantially longer than the same operation on a conventional disk. A conventional disk does not require the preliminary Read operation, and thus does not have to wait for the disk drives to rotate back to the previous position in order to perform the Write operation. The rotational latency time alone can amount to about 50% of the time required for a typical data modification operation. Further, two disk storage units are involved for the duration of each data modification operation, limiting the throughput of the system as a whole.

Despite the Write performance penalty, RAID 5 type systems have become increasingly popular, since they provide high data reliability with a low overhead cost for redundancy, good Read performance, and fair Write performance. However, it would be desirable to have the benefits of a RAID 5 system without the Write performance penalty resulting from the rotational latency time imposed by the parity update operation.

The present invention provides such a system.
SUMMARY OF THE INVENTION

The present invention solves the error-correction block bottleneck inherent in a RAID 5 architecture by recognition that storage unit accesses are intermittent. That is, at various times one or more of the storage units in a RAID 5 system are idle in terms of access requests by the CPU. This characteristic can be exploited by providing a "copyback cache" storage unit as an adjunct to a standard RAID system. The present invention provides four alternative methods of operating such a system.

In each embodiment, when a Write occurs to the RAID system, the data is immediately written to the first available location in the copyback cache storage unit. Upon completion of the Write to the copyback cache storage unit, the host CPU is immediately informed that the Write was successful. Thereafter, further storage unit accesses by the CPU can continue without waiting for an error-correction block update for the data just written.

In the first embodiment of the invention, during idle time for relevant storage units of the storage system, an error-correction block (e.g., XOR parity) is computed for each "pending" data block on the copyback cache storage unit, and the data block and corresponding error-correction block are copied to their proper location in the RAID system. Optionally, if a number of pending data blocks are to be written to the same stripe, an error-correction block can be calculated from all data blocks in the stripe at one time, thus achieving some economy of time. In this embodiment, the copyback cache storage unit in effect stores "peak load" Write data and then completes the actual Read-Modify-Write operations to the RAID system during relatively quiescent periods of I/O accesses by the CPU.
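The first embodiment's Write path and idle-time flush can be sketched as a toy model. All class and method names here are hypothetical, and the parity update performed during the flush is elided:

```python
class CopybackCacheRAID:
    """Toy model of the first embodiment: Writes land in the copyback cache
    and are acknowledged at once; the Read-Modify-Write to the array proper
    is deferred to idle time."""

    def __init__(self):
        self.cache = {}      # pending blocks: address -> data
        self.array = {}      # the RAID array proper: address -> data
        self.directory = {}  # address -> "cache" or "array"

    def write(self, addr, data):
        self.cache[addr] = data          # write to the copyback cache
        self.directory[addr] = "cache"   # flag the array location as pending
        return "ack"                     # CPU is informed immediately

    def read(self, addr):
        # The directory says where the current copy of the block lives.
        if self.directory[addr] == "cache":
            return self.cache[addr]
        return self.array[addr]

    def idle_flush(self):
        # During idle time, move each pending block (with its parity update,
        # omitted here) to the array and repoint the directory.
        for addr, data in list(self.cache.items()):
            self.array[addr] = data
            self.directory[addr] = "array"
            del self.cache[addr]

r = CopybackCacheRAID()
assert r.write(100, b"new") == "ack"
assert r.read(100) == b"new"   # served from the cache before the flush
r.idle_flush()
assert r.read(100) == b"new"   # served from the array afterwards
```

The directory lookup mirrors the flow charts of FIGS. 2A and 2B: Reads consult the directory to decide whether the block comes from the cache or the array.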
In the second embodiment of the invention, after Write data is logged to the copyback cache storage unit, normal Read-Modify-Write operation by the RAID system controller continues in overlapped fashion with other CPU I/O accesses, using Write data in the array controller's buffer memory. Thus, there are at least two redundant copies of each pending data block. Performance is enhanced because the CPU can continue processing as soon as the simple Write operation to the copyback cache storage unit completes, thus eliminating the delay caused by a normal Read-Modify-Write RAID system. In this embodiment, the copyback cache storage unit acts more as a running "log" of Write data. Data integrity and redundancy is preserved since the Write data is saved to the copyback cache storage unit and thus accessible even if the Read-Modify-Write operation from the controller buffer to the RAID system never completes.
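The running "log" behavior of the second embodiment can likewise be sketched. This toy model (hypothetical names, no parity handling) shows how the log preserves pending Write data if the controller buffer is lost before the Read-Modify-Write completes:

```python
class LoggingController:
    """Toy model of the second embodiment: every Write is kept in the
    controller buffer AND appended to the copyback cache log, so each
    pending block exists in at least two places."""

    def __init__(self):
        self.buffer = {}   # volatile controller buffer: addr -> data
        self.log = []      # copyback cache used as a running log

    def write(self, addr, data):
        self.buffer[addr] = data
        self.log.append((addr, data))  # redundant copy; CPU is acked after this

    def recover(self):
        """Rebuild the pending blocks from the log after a buffer failure."""
        self.buffer = {}
        for addr, data in self.log:    # replay in order; later writes win
            self.buffer[addr] = data

c = LoggingController()
c.write(7, b"a")
c.write(7, b"b")
c.buffer.clear()          # simulate loss of the controller buffer
c.recover()
assert c.buffer[7] == b"b"
```

Replaying the log in write order is what makes the most recent copy of each pending block the one that survives.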
In the third embodiment of the invention, one or more logical arrays of storage units are defined. Each logical array of storage units is associated with at least one controller. When Write data is presented to a controller, the data is immediately stored in a controller buffer. Each controller buffer is associated with a unique copyback cache storage unit. A copy of the data stored in the controller buffer is written to its associated copyback cache storage unit. Thus, there are at least two redundant copies of each pending data block. The CPU is informed that the Write operation was successfully completed immediately after the data is written to the copyback cache storage unit. Data blocks are only read from a copyback cache storage unit upon a failure of the associated controller buffer. In one alternative embodiment, each copyback cache unit is logically divided into at least two logical areas, so that if a copyback cache storage unit fails, another copyback cache storage unit associated with another logical array can be shared by two logical arrays.

In the fourth embodiment of the invention, no physical storage unit is assigned as a copyback cache unit. Rather, at least one logical stripe within each of the storage units that comprise the array is reserved for storing a copy of the pending blocks of data in the controller buffer. As is the case in the third embodiment, pending data blocks are read only from the controller buffer during Read-Modify-Write operations unless the controller buffer has failed.
The copyback cache storage unit is preferably non-volatile, so that data will not be lost on a power failure. If the copyback cache storage unit is a disk drive, it may be paired with a "mirror" storage unit for additional fault tolerance. Optionally, the copyback cache storage unit may be a solid-state storage unit, which can achieve substantially faster Write and error-correction block update times than a disk drive.

The details of the preferred embodiments of the present invention are set forth in the accompanying drawings and the description below. Once the details of the invention are known, numerous additional innovations and changes will become obvious to one skilled in the art.
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a copyback cache RAID system in accordance with the present invention.

FIGS. 2A and 2B are a flow chart of Read and Write operation in accordance with a first embodiment of the present invention.

FIGS. 3A and 3B are a flow chart of Read and Write operation in accordance with a second embodiment of the present invention.

FIG. 4 is a block diagram of a copyback cache RAID system in accordance with an alternative embodiment of the present invention having multiple controllers.

FIGS. 5A and 5B are a high-level flow chart representing the steps of Read and Write operations for the embodiment of the invention in which a RAID system has multiple controllers.

FIG. 6 is a high-level flow chart of the steps taken in accordance with the present invention when a copyback cache storage unit becomes unavailable.

FIG. 7 is a block diagram of a RAID system in accordance with another embodiment of the present invention, in which pending data blocks are spread over a number of storage units that comprise the array.

Like reference numbers and designations in the drawings refer to like elements.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`Throughout this description, the preferred embodiments
`and examples shown should be considered as exemplars,
`rather than as limitations on the present invention.
FIG. 1 is a block diagram of a copyback cache RAID system in accordance with the present invention. Shown are a CPU 1 coupled by a bus 2 to an array controller 3, which in the preferred embodiment is a fault-tolerant controller. The array controller 3 is coupled to each of the plurality of storage units S1-S5 (five being shown by way of example only) by an I/O bus (e.g., a SCSI bus). The array controller 3 preferably includes a separately programmable processor (for example, the MIPS R3000 RISC processor, made by MIPS of Sunnyvale, Calif.) which can act independently of the CPU 1 to control the storage units.

The storage units S1-S5 are failure independent, meaning that the failure of one unit does not affect the physical operation of other units. Optionally, the storage units S1-S5 are disk drive units including solid-state storage unit buffers 7. Such solid-state storage unit buffers 7 allow data to be written by a controller to a disk drive without encountering the delay associated with the rotational latency of the disk drive. Data that is written to the solid-state storage unit buffer 7 is preferably written from the solid-state storage unit buffer 7 to the rotating medium of the disk drive unit as soon as possible.
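The buffering behavior described above can be sketched as a minimal model. The class and method names (`BufferedDrive`, `write`, `flush`, `read`) are hypothetical, chosen for illustration only; the patent describes hardware behavior, not software.

```python
class BufferedDrive:
    """Illustrative model of a disk drive with a solid-state buffer 7.

    A write completes as soon as the data lands in the solid-state
    buffer, avoiding rotational latency; buffered data is drained to
    the rotating medium as soon as possible.
    """

    def __init__(self):
        self.buffer = {}   # solid-state buffer: block number -> data
        self.platter = {}  # rotating medium: block number -> data

    def write(self, block, data):
        # The controller's write returns once data is in the buffer,
        # without waiting for the platter to rotate into position.
        self.buffer[block] = data

    def flush(self):
        # Drain buffered blocks to the rotating medium.
        while self.buffer:
            block, data = self.buffer.popitem()
            self.platter[block] = data

    def read(self, block):
        # Prefer the buffer, which may hold data not yet on the platter.
        return self.buffer.get(block, self.platter.get(block))
```

In this sketch the buffer acts as a write-behind stage: the write path never blocks on the rotating medium, and `flush` models the background transfer to the platter.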
Also attached to the controller 3 is a copyback cache storage unit CC, which in the preferred embodiment is coupled to the common I/O bus (e.g., a SCSI bus) so that data can be transferred between the copyback cache storage unit CC and the storage units S1-S5. The copyback cache storage unit CC is preferably non-volatile, so that data will not be lost on a power failure. If the copyback cache storage unit CC is a disk drive, it may be paired with a "mirror" storage unit CC' for additional fault tolerance. The mirror storage unit CC' is coupled to the controller 3 such that all data written to the copyback cache storage unit CC is also written essentially simultaneously to the mirror storage unit CC', in known fashion. Optionally, the copyback cache storage unit CC may be a solid-state storage unit, which can achieve substantially faster Write and error-correction block update times than a disk drive. In such a case, the solid-state storage unit preferably includes error-detection and correction circuitry, and is either non-volatile or has a battery backup on the power supply.
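The mirrored arrangement described above can likewise be sketched as a minimal model; `MirroredCache` and its parameter names are hypothetical and appear nowhere in the patent.

```python
class MirroredCache:
    """Illustrative model of a copyback cache unit CC paired with a
    mirror unit CC': every write applied to CC is also applied to
    CC', so either unit can supply the data if the other fails.
    """

    def __init__(self):
        self.cc = {}         # copyback cache storage unit CC
        self.cc_mirror = {}  # mirror storage unit CC'

    def write(self, block, data):
        # Write to CC and, essentially simultaneously, to CC'.
        self.cc[block] = data
        self.cc_mirror[block] = data

    def read(self, block, cc_failed=False):
        # If CC is unavailable, the mirror supplies the data.
        source = self.cc_mirror if cc_failed else self.cc
        return source[block]
```

The point of the sketch is the fault-tolerance invariant: after any `write`, CC and CC' hold identical data, so a failure of either unit loses nothing.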

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket