`
`
`
`
`UNIFIED PATENTS
`
`EXHIBIT 1003
`
`
`
`
(12) United States Patent
     Chong, Jr.
(10) Patent No.: US 6,370,605 B1
(45) Date of Patent: Apr. 9, 2002

(54) SWITCH BASED SCALABLE PERFORMANCE STORAGE ARCHITECTURE

(75) Inventor: Fay Chong, Jr., Cupertino, CA (US)

(73) Assignee: Sun Microsystems, Inc., Palo Alto, CA (US)

( * ) Notice: Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/358,356

(22) Filed: Jul. 21, 1999

Related U.S. Application Data

(63) Continuation-in-part of application No. 09/262,407, filed on
Mar. 4, 1999, which is a continuation-in-part of application
No. 09/261,978, filed on Mar. 4, 1999.

(51) Int. Cl.7: G06F 13/14
(52) U.S. Cl.: 710/131; 710/38
(58) Field of Search: 710/30, 31, 33,
710/38, 131; 370/389, 355
`
(56) References Cited

U.S. PATENT DOCUMENTS

6,065,096 A      5/2000  Day et al.
6,085,285 A      7/2000  Lucas et al.
6,101,565 A      8/2000  Nishtala et al.
6,115,394 A *    9/2000  Balachandran et al. ..... 370/477
6,134,627 A *   10/2000  Bak ..................... 711/6
6,161,208 A     12/2000  Dutton et al.
6,167,424 A *   12/2000  Bak et al. .............. 709/100

OTHER PUBLICATIONS

AC&NC Raid Technology, Raid Level: 0 1 2 3 4 5 6 7 10 53,
http://www.acnc.com/raid.html, Dec. 17, 1998, (11 pages).
Fibre Channel Overview,
http://www.cern.ch/HIS/fcs/spec/overview.html, Jan. 11, 1999, (9 pages).

* cited by examiner
`
Primary Examiner: B. James Peikari
(74) Attorney, Agent, or Firm: Conley, Rose & Tayon, PC;
Robert C. Kowert
`
`(57)
`
`ABSTRACT
`
`Several embodiments of a computer system are described
`which achieve separation of control and data paths during
data transfer operations, thus allowing independent scalability
of storage system performance factors (e.g., storage
system iops and data transfer rate). In one embodiment, the
`computer system includes a data switch coupled between a
`host computer and one or more storage devices. A storage
`controller for managing the storage of data within the one or
`more storage devices is coupled to the switch. The switch
`includes a memory for storing data routing information
`generated by the controller, and uses the data routing infor-
`mation to route data directly between the host computer and
`the one or more storage devices such that the data does not
`pass through the storage controller. Within the computer
system, information may be conveyed between the host
`computer, the switch, the one or more storage devices, and
`the storage controller according to a two party protocol such
`as the Fibre Channel protocol. The computer system
`achieves separation of control and data paths using a modi-
`fied switch and standard host adapter hardware and host
`driver software. In addition, a two party protocol such as the
`Fibre Channel protocol is not violated.
`
`17 Claims, 18 Drawing Sheets
`
4,151,593 A      4/1979  Jenkins et al.
4,603,416 A *    7/1986  Sewel et al. ............ 370/417
5,148,432 A      9/1992  Gordon et al.
5,206,943 A      4/1993  Callison et al.
5,448,709 A      9/1995  Chandler et al.
5,487,160 A      1/1996  Bemis
5,526,497 A      6/1996  Zilka et al.
5,668,956 A      9/1997  Okazawa et al.
5,720,028 A      2/1998  Matsumoto et al.
5,724,539 A      3/1998  Riggle et al.
5,793,763 A *    8/1998  Mayes et al. ............ 370/389
5,867,733 A      2/1999  Meyer
5,870,521 A *    2/1999  Shimoda ................. 386/52
5,896,492 A      4/1999  Chong, Jr.
5,913,057 A *    6/1999  Labatte et al. .......... 713/2
6,023,754 A      2/2000  DuLac et al.
`
`
`
`
[Sheet 1 of 18: FIG. 1. Labels: computer system 10; Host Computer (ID=H) 12; Storage Controller (ID=A) 14.]
`
`
`
`
`
`
`
`
`
`
[Sheet 2 of 18: drawing text not recoverable.]
`
`
`
`
[Sheet 3 of 18: FIG. 3A. Labels: Host Computer 12; Control Module 24; Storage 18; Commands; Status; 1X MB/sec.]
`
`
`
`
[Sheet 4 of 18: FIG. 3B. Labels: Host Computer; Control Module 24; Storage; Commands; Data; Status; 1X MB/sec; 262.]
`
`
`
`
[Sheet 5 of 18: figure number not legible. Labels: Host Computer; Control Module; 261C; 262C; 1X MB/sec; > 3X MB/sec.]
`
`
`
`
[Sheet 6 of 18: FIG. 3D. Labels: Host Computer; Commands; Status; Storage; 1X MB/sec; 262.]
`
`
`
`
[Sheet 7 of 18: figure number not legible. Labels: Host Computer; two Control Modules (one labeled 24A); Data; 1X MB/sec.]
`
`
`
`
[Sheet 8 of 18: FIG. 4A. Labels: Host Computer; Control Module 24A; Parity Calculator; Storage.]
`
`
`
`
[Sheet 9 of 18: FIG. 4B. Labels: Host Computer; Parity Calculator; Storage; Command.]
`
`
`
`
`
`
[Sheet 10 of 18: figure number not legible. Labels: Control Module 1; Parity Calculator 1; Parity Calculator 2; Storage 1; Storage 2.]
`
`
`
`
[Sheet 11 of 18: FIG. 6. Labels: computer system 60; Host Computer 12; Control Module (ID=A); Commands; Status; Command+Status+Data.]
`
`
`
`
[Sheet 12 of 18: drawing text not recoverable.]
`
`
`
`
[Sheet 13 of 18: FIG. 8. Labels: 36; 38; Host Computer (ID=H) 12; Storage Controller (ID=A); Storage Device (ID=B); Storage Device (ID=C).]
`
`
`
`
[Sheet 14 of 18: drawing text not recoverable.]
`
`
`
`
`
`
`
[Sheet 15 of 18: drawing text not recoverable.]
`
`
`
`
`
`
`
`
`
[Sheet 16 of 18: FIG. 11. Labels: Port Control Unit; Memory Unit; Packet Processing Circuitry; Offset Calc. Unit; CRC Calc. Unit; Queue.]
`
`
`
`
[Sheet 17 of 18: drawing text not recoverable.]
`
`
`
`
[Sheet 18 of 18: drawing text not recoverable.]
`
`
`
`
`
`SWITCH BASED SCALABLE
`PERFORMANCE STORAGE
`ARCHITECTURE
`
`CONTINUATION DATA
`
`This patent application is a continuation-in-part to appli-
`cation Ser. No. 09/262,407 entitled “Scalable Performance
`Storage Architecture” by Fay Chong, Jr. filed Mar. 4, 1999,
`and application Ser. No. 09/261,978 entitled “Redirected I/O
`for Scalable Performance Storage Architecture” by Fay
`Chong, Jr. filed Mar. 4, 1999.
`Patent application Ser. No. 09/262,407 entitled “Scalable
`Performance Storage Architecture” and application Ser. No.
`09/261,978 entitled “Redirected I/O for Scalable Perfor-
`mance Storage Architecture” are incorporated herein by
`reference in their entirety.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`This invention relates to data storage systems, and more
`particularly to data storage systems having a storage device
`controller interposed between a host computer and one or
`more data storage devices wherein the controller manages
`the storage of data within the one or more storage devices.
`2. Description of the Related Art
`Auxiliary storage devices such as magnetic or optical disk
`arrays are usually preferred for high-volume data storage.
Many modern computer applications, such as high-resolution
video or graphic displays involving on-demand video
`servers, may heavily depend on the capacity of the host
`computer to perform in a data-intensive environment. In
`other words, necessity for external storage of data in rela-
`tively slower auxiliary data storage devices demands that the
`host computer system accomplish requisite data transfers at
a rate that does not severely restrict the utility of the
`application that necessitated high-volume data transfers.
`Due to the speed differential between a host processor and
`an external storage device, a storage controller is almost
`invariably employed to manage data transfers to/from the
`host and from/to the storage device.
`The purpose of a storage controller is to manage the
`storage for the host processor, leaving the higher speed host
`processor to perform other tasks during the time the storage
`controller accomplishes the requested data transfer to/from
`the external storage. The host generally performs simple
`data operations such as data reads and data writes. It is the
`duty of the storage controller to manage storage redundancy,
`hardware failure recovery, and volume organization for the
`data in the auxiliary storage. RAID (Redundant Array of
`Independent Disks) algorithms are often used to manage
`data storage among a number of disk drives.
`FIG. 1 is a diagram of a conventional computer system 10
`including a host computer 12 coupled to a storage controller
14 by a link 16 and two storage devices 18a-b coupled to
storage controller 14 by respective links 20a-b. Each storage
device 18 may be, for example, a disk drive array or a
tape drive. Links 16 and 20a-b may include suitable interfaces
for I/O data transfers (e.g., Fibre Channel, small
computer system interface or SCSI, etc.). As is evident from
FIG. 1, all of the information involved in data transfers
between host computer 12 and storage devices 18a-b passes
`through storage controller 14. Storage controller 14 receives
`command, status, and data packets during the data transfer.
`FIG. 2 is a diagram illustrating an exemplary flow of
packets during a data read operation initiated by host computer
12 of FIG. 1. Links 16 and 20a-b in FIG. 1 may be
`Fibre Channel links, and the data transfer protocol evident in
`FIG. 2 may be the Fibre Channel protocol. Referring now to
`FIGS. 1 and 2 together, host computer 12 issues a read
`command packet identifying storage controller 14 as its
`destination (XID=H,A) via link 16. Storage controller 14
`receives the read command and determines that two separate
`read operations are required to obtain the requested data; one
`from storage device 18a and the other from storage device
`18b.
`
`Storage controller 14 translates the read command from
`host computer 12 into two separate read commands, one
`read command for storage device 18a and the other read
`command for storage device 18b. Storage controller 14
`transmits a first read command packet identifying storage
`device 18a as its destination (XID=A,B) via link 20a, and a
`second read command packet identifying storage device 18b
as its destination (XID=A,C) via link 20b. Each read command
packet instructs respective storage devices 18a-b to
`access and provide data identified by the read command.
`Storage device 18a (ID=B) accesses the requested data and
`transmits a data packet followed by a status packet (XID=
`B,A) to storage controller 14 via link 20a. Storage device
`18b (ID=C) accesses the requested data and transmits a data
`packet followed by a status packet (XID=C,A) to storage
`controller 14 via link 20b. Each status packet may indicate
`whether the corresponding read operation was successful,
`i.e. whether the data read was valid.
`
`As indicated in FIG. 2, storage controller 14 temporarily
`stores the data and status packets in a memory unit within
storage controller 14. Storage controller 14 then consolidates
the data received from storage devices 18a-b and
processes the status packets received from storage devices
18a-b to form a composite status. Storage controller 14
`transmits the consolidated data followed by the composite
`status (XID=A,H) to host computer 12 via link 16, com-
`pleting the read operation. In the event that the composite
`status indicates a read operation error, host computer 12 may
`ignore the consolidated data and initiate a new read opera-
`tion. In general, the flow of packets depicted in FIG. 2 is
`typical of a two-party point-to-point interface protocol (e.g.,
`the Fibre Channel protocol).
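By way of illustration only, the conventional read flow of FIGS. 1 and 2 may be sketched in Python as follows. The `Packet` class and the `conventional_read` function are hypothetical names that do not appear in the specification, and the status value `b"ok"` is an assumed placeholder. The sketch is intended to show only that every data byte is buffered by the storage controller (ID=A) before it reaches the host (ID=H).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str             # sender ID, e.g. "H" for the host
    dst: str             # receiver ID, e.g. "A" for the controller
    kind: str            # "read", "data", or "status"
    payload: bytes = b""


def conventional_read(device_data):
    """Trace a FIG. 2 style read; device_data maps device ID -> stored bytes.

    Returns the packet trace and the number of data bytes that the
    controller (ID "A") buffered before answering the host.
    """
    trace = [Packet("H", "A", "read")]
    # The controller translates the host command into one read per device.
    for dev in device_data:
        trace.append(Packet("A", dev, "read"))
    buffered, statuses = [], []
    # Each device returns a data packet followed by a status packet.
    for dev, blob in device_data.items():
        trace.append(Packet(dev, "A", "data", blob))
        trace.append(Packet(dev, "A", "status", b"ok"))
        buffered.append(blob)
        statuses.append(b"ok")
    # The controller consolidates the data and forms a composite status.
    consolidated = b"".join(buffered)
    composite = b"ok" if all(s == b"ok" for s in statuses) else b"error"
    trace.append(Packet("A", "H", "data", consolidated))
    trace.append(Packet("A", "H", "status", composite))
    return trace, len(consolidated)
```

Calling `conventional_read({"B": b"abcd", "C": b"efgh"})` yields a trace in which the controller is an endpoint of every packet, which is the bottleneck discussed in the paragraphs that follow.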
`A typical storage controller includes multiple ports and
`one or more CPUs coupled to a communication bus, and a
`memory bus coupling the one or more CPUs to a memory
`unit. Two parameters are commonly used to measure the
`performance of a storage system: (1) the number of input/
`output (I/O) operations per second (iops), and (2) the data
`transfer rate of the storage system. Generally, the rate of
`execution of iops by a storage controller is governed by the
`type, speed and number of CPUs within the storage con-
`troller. The data transfer rate depends on the data transfer
`bandwidth of the storage controller. In computer system 10
`described above, all of the data transferred between host
computer 12 and storage devices 18a-b is temporarily stored
`within the memory unit of storage controller 14, and thus
`travels through the memory bus of storage controller 14. As
`a result, the data transfer bandwidth of storage controller 14
`is largely dependent upon the bandwidth of the memory bus
`of storage controller 14.
`Current storage systems have restricted scalability
`because of the storage controllers having a relatively inflex-
`ible ratio of CPU to bandwidth capability. In other words, as
`evident in FIGS. 1 and 2, the data transfer rate between host
computer 12 and storage devices 18a-b is dependent upon
`control functions (i.e., command translation and status
processing) performed by storage controller 14. This interdependence
between iops and data transfer rate results in
`less efficient scalability of performance parameters. For
`example, in conventional storage controller architectures, an
`increase in data transfer rate may require both an increase in
`data transfer bandwidth and an increase in the number of
`
`CPUs residing within the controller.
`It would thus be desirable to have a storage controller
`where control functionality (as measured by the iops
`parameter) is scalable independently of the data transfer
`bandwidth (which determines the data transfer rate), and
`vice versa. It may be further desirable to achieve indepen-
`dence in scalability without necessitating a change in the
existing interface protocol managing the host-controller-storage
interface.
`SUMMARY OF THE INVENTION
`
`Several embodiments of a computer system are described
`which achieve separation of control and data paths during
`data transfer operations, thus allowing independent scalabil-
`ity of storage system performance factors (e.g., storage
`system iops and data transfer rate). In one embodiment, the
`computer system includes a data switch coupled between a
`host computer and one or more storage devices. A storage
`controller for managing the storage of data within the one or
`more storage devices is coupled to the switch. The switch
`includes a memory for storing data routing information
`generated by the controller, and uses the data routing infor-
`mation to route data directly between the host computer and
`the one or more storage devices such that the data does not
`pass through the storage controller. Within the computer
system, information may be conveyed between the host
`computer, the switch, the one or more storage devices, and
`the storage controller according to a two party protocol such
`as the Fibre Channel protocol. The computer system
`achieves separation of control and data paths using a modi-
`fied switch and standard host adapter hardware and host
`driver software. In addition, a two party protocol such as the
`Fibre Channel protocol is not violated.
`The one or more storage devices, the storage controller,
`and the switch make up a storage system of the computer
`system. The switch receives a data transfer command from
`the host computer and directs the data transfer command to
the storage controller. In response to the data transfer
`command, the storage controller translates the data transfer
`command into one or more translated data transfer
`
`commands, and also generates frame header substitution
`information. The storage controller transmits the one or
`more translated data transfer commands and the frame
`header substitution information to the switch.
The switch routes the one or more translated data transfer
commands to the appropriate storage devices and stores the
`frame header substitution information within the memory.
`The switch replaces header information of one or more data
`frames associated with the data transfer operation with the
`substitute header information such that the data frames are
`
`routed directly between the host computer and the storage
`device and do not pass through the storage controller.
`Each data frame includes header information within a
`header field, and the header information includes a destina-
`tion address. The switch routes a given data frame based
`upon the destination address. The frame header substitution
`information includes a substitute destination address gener-
`ated by the storage controller such that when the switch
`replaces header information of the data frames with the
`substitute header information, the data frames are routed
`directly between the host computer and the storage device
`and do not pass through the storage controller.
`
`When the data transfer command from the host computer
`is a read command, the substitute destination address is the
`address of the host computer. The switch receives the one or
`more data frames associated with the read operation from
the one or more storage devices, and routes the one or more
`data frames directly to the host computer such that the data
`frames do not pass through the storage controller.
`When the data transfer command from the host computer
`is a write command, the substitute destination address is the
`address of the one or more storage devices. The switch
`receives the one or more data frames associated with the
`
`write operation from the host computer, and routes the data
`frames directly to the one or more storage devices such that
`the data frames do not pass through the storage controller.
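As a minimal sketch of the read and write cases just described, the choice of substitute destination address may be expressed in Python as follows. The function name `substitution_entry` and the modeling of a header as a (source, destination) pair are hypothetical. The sketch also assumes that data frames are initially addressed to the storage controller, which this summary does not state explicitly.

```python
def substitution_entry(command, host_id, controller_id, device_id):
    """Return (target_header, substitute_destination) for one storage device.

    Headers are modeled as (source, destination) pairs.
    """
    if command == "read":
        # Data frames leave the device addressed to the controller
        # (an assumption) and are redirected to the host.
        return (device_id, controller_id), host_id
    if command == "write":
        # Data frames leave the host addressed to the controller
        # (an assumption) and are redirected to the storage device.
        return (host_id, controller_id), device_id
    raise ValueError(f"unsupported command: {command!r}")
```

For a host H, controller A, and device B, a read yields the target header ("B", "A") with substitute destination "H", and a write yields the target header ("H", "A") with substitute destination "B".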
`The frame header substitution information may include
`target header information and corresponding substitute
`header information. Upon receiving a data frame, the switch
`may compare the header information of the data frame to the
`target header information stored within the memory. If the
`header information of the data frame matches the target
header information, the switch may replace the header
`information of the data frame with the substitute header
`
`information corresponding to the target header information.
`Following replacement of the header information of the data
`frame with the substitute header information, the switch may
`calculate a cyclic redundancy check (CRC) value for the
`data frame and insert the CRC value into the data frame. The
`
`substitute header information may include the substitute
`destination address as described above. The switch may then
`route the data frame dependent upon the substitute destina-
`tion address. As a result, the data frame may move directly
`between the host computer and the storage device such that
`the data frame does not pass through the storage controller.
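The compare, replace, and CRC steps described above may be sketched as follows. This is an illustration under stated assumptions, not the circuitry of the switch: the `Frame` class, `frame_crc`, and `process_frame` are hypothetical names, the header is reduced to a source and a destination ID, and `zlib.crc32` stands in for whatever CRC the transfer protocol actually specifies.

```python
import zlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src: str          # source address in the header field
    dst: str          # destination address in the header field
    payload: bytes
    crc: int = 0


def frame_crc(src, dst, payload):
    # Stand-in CRC over header and payload (assumed coverage and polynomial).
    return zlib.crc32(src.encode() + dst.encode() + payload)


def process_frame(frame, table):
    """table maps a target header (src, dst) to a substitute header (src, dst)."""
    substitute = table.get((frame.src, frame.dst))
    if substitute is None:
        return frame                      # no match: frame is left unchanged
    new_src, new_dst = substitute
    # Replace the header, then recompute the CRC so the frame stays valid.
    return replace(frame, src=new_src, dst=new_dst,
                   crc=frame_crc(new_src, new_dst, frame.payload))
```

With `table = {("B", "A"): ("A", "H")}`, a data frame sent by device B toward controller A leaves `process_frame` addressed to host H with a freshly computed CRC, while a frame whose header does not match any target header passes through untouched.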
Following a data transfer operation, the switch may
`receive status information associated with the data transfer
`
`operation from the one or more storage devices. The switch
`may route the status information to the storage controller. In
response, the storage controller may generate an overall status
`which may be a consolidation of separate status information
`from multiple storage devices. The storage controller may
`transmit the overall status to the switch, and the switch may
`route the overall status to the host computer.
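The consolidation of per-device status into an overall status may be illustrated with the short sketch below. The function name `overall_status` and the dictionary layout are assumptions made for the example; the specification does not define a status format.

```python
def overall_status(device_status):
    """device_status maps device ID -> True if that device's transfer succeeded."""
    failed = sorted(dev for dev, ok in device_status.items() if not ok)
    # The overall status is successful only if every device reported success.
    return {"ok": not failed, "failed_devices": failed}
```

In this sketch the controller would report success to the host only when no device appears in `failed_devices`.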
`The one or more storage devices may include multiple
`disk drives, and the storage controller may manage the one
`or more storage devices as a RAID (Redundant Array of
`Independent Disks) array. Accordingly, the storage control-
`ler may generate the translated data transfer commands
`dependent upon the RAID array configuration of the one or
`more storage devices.
`One embodiment of the data switch is a crossbar switch
`
`including multiple input and output ports coupled to an array
`of switching elements. Each input port is adapted for cou-
`pling to a transmission medium and receives information via
`the transmission medium. Each output port is adapted for
`coupling to a transmission medium and configured to trans-
`mit information via the transmission medium. The array of
`switching elements selectively couples the input ports to the
output ports. A switch matrix control unit receives routing
`information from the input ports and controls the array of
`switching elements dependent upon the routing information.
`Each input port includes a memory unit for storing the frame
`header substitution information. Each input port receives
`frame header substitution information and stores the frame
`
`header substitution information within the memory unit.
`During a data transfer operation, one or more of the input
ports receives a data frame including header as described
above. Each input port receiving a data frame replaces the
`header information of the data frame with the substitute
`
`header information stored within the memory unit. As a
`result, the substitute destination address becomes the desti-
`nation address, and the input port provides the destination
`address to the switch matrix control unit as the routing
`information.
`
`Each input port may include a port control unit configured
`to control the input port and an input queue for storing
received information, wherein the port control unit is
`coupled to the memory unit and to the input queue. When the
`data frame is received, the data frame is stored within the
`input queue. The port control unit may compare the header
`information of the data frame to the target header informa-
`tion stored within the memory. If the header information of
`the data frame matches the target header information, the
`port control unit may replace the header information of the
`data frame with the substitute header information corre-
`
`sponding to the target header information. After the port
`control unit replaces the header information of the data
frame with the substitute header information, the port control
unit may calculate a CRC value for the data frame and
insert the CRC value into the data frame. The switch matrix
`
`control unit couples the input port to an output port via the
`array of switching elements dependent upon the substitute
`destination address. As a result, the data frame may move
`directly between the host computer and the storage device
`such that the data frame does not pass through the storage
`controller.
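The cooperation between an input port and the switch matrix control unit may be sketched as follows. The classes `InputPort` and `CrossbarSwitch`, the mapping from destination ID to output port index, and the (source, destination) header model are all hypothetical simplifications; CRC recalculation is omitted here for brevity.

```python
class InputPort:
    def __init__(self):
        self.memory = {}   # target header (src, dst) -> substitute header
        self.queue = []    # input queue of received frames

    def store_substitution(self, target, substitute):
        self.memory[target] = substitute

    def receive(self, header, payload):
        """Queue the frame and return the destination used as routing information."""
        header = self.memory.get(header, header)   # substitute on a match
        self.queue.append((header, payload))
        return header[1]


class CrossbarSwitch:
    def __init__(self, output_port_of):
        self.output_port_of = output_port_of       # destination ID -> output port index
        self.inputs = [InputPort() for _ in output_port_of]

    def forward(self, input_index, header, payload):
        """Return (output port index, frame) for one received frame."""
        port = self.inputs[input_index]
        destination = port.receive(header, payload)
        frame = port.queue.pop(0)
        # Matrix control step: couple this input to the matching output port.
        return self.output_port_of[destination], frame
```

For example, after `sw = CrossbarSwitch({"H": 0, "A": 1, "B": 2, "C": 3})` and `sw.inputs[2].store_substitution(("B", "A"), ("A", "H"))`, a data frame arriving on input 2 with header ("B", "A") is coupled to output port 0 (the host) rather than output port 1 (the controller).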
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`Other objects and advantages of the invention will
`become apparent upon reading the following detailed
`description and upon reference to the accompanying draw-
`ings in which:
`FIG. 1 is a diagram of a conventional computer system
`including a host computer coupled to a storage controller
`and two storage devices coupled to the storage controller;
`FIG. 2 is an exemplary flow diagram of control and data
`packets during a read operation initiated by the host com-
`puter of FIG. 1, wherein the host computer, storage
`controller, and storage devices communicate via a two-party
`data transfer protocol;
`FIGS. 3A, 3B, 3C, 3D, 3E, and 4A show several different
`embodiments of a computer system which achieves separa-
`tion of data and control paths between the host computer and
`a storage device;
`FIG. 4B shows an exemplary flow of command, status,
`and data packets within the computer system of FIG. 4A;
`FIG. 5 illustrates a computer system in an exemplary
`fault-tolerant configuration and including a data storage
`system with scalable performance;
`FIG. 6 shows an exemplary embodiment of a computer
`system wherein the storage controller employs a messaging
`scheme that facilitates data transfer to/from the host com-
`puter under a two-way point-to-point interconnect standard;
`FIG. 7 is an exemplary flow diagram of control and data
`packets during a read operation initiated by the host com-
`puter of FIG. 6;
`FIG. 8 is a block diagram of one embodiment of a
`computer system including a switch coupled between the
`host computer and two storage devices, and wherein the
`storage controller is coupled to the switch, and wherein the
`computer system achieves separation of control and data
`paths using a modified switch and standard host adapter
`hardware and host driver software, and wherein a two party
`protocol such as the Fibre Channel protocol is not violated;
`FIG. 9 is a block diagram of one embodiment of the
`switch of the computer system of FIG. 8, wherein the switch
`includes multiple input ports, and wherein each input port
includes a receiver, an input queue, a port control unit, and
`a memory unit for storing frame header substitution infor-
`mation;
`FIG. 10A is a diagram of an exemplary data frame
`according to a data transfer standard such as the Fibre
`Channel standard, wherein the data frame includes a header
`field;
`FIG. 10B is a diagram of an exemplary header field of the
`data frame of FIG. 10A;
`FIG. 11 is a block diagram of one embodiment of one of
`the port control units of FIG. 9 coupled to the respective
`memory unit, wherein the port control unit includes packet
`processing circuitry, an offset calculation unit, and a CRC
`calculation unit; and
`FIGS. 12A and 12B illustrate an exemplary flow of
`control and data packets during a data read operation initi-
`ated by the host computer of the computer system of FIG. 8.
`While the invention is susceptible to various modifica-
`tions and alternative forms, specific embodiments thereof
`are shown by way of example in the drawings and will
herein be described in detail. It should be understood,
`however, that the drawings and detailed description thereto
`are not intended to limit the invention to the particular form
`disclosed, but on the contrary, the intention is to cover all
`modifications, equivalents and alternatives falling within the
`spirit and scope of the present invention as defined by the
`appended claims.
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`Referring now to FIG. 3A, a block diagram of a computer
`system 21 including one embodiment of a storage controller
`26 is shown. The storage controller 26 includes a control
module 24 and a switch 22. The control information
(including command and status signals) flows over a control
`path defined by the interconnecting links 271, 272 and 273.
`On the other hand, the data flows directly between the host
`computer 12 and the storage device 18 through the switch 22
`and over the data path defined by the interconnecting links
`251 and 252. This is different from the conventional storage
`controller 14 (FIG. 1) where all command, status and data
`information is passed between the host computer 12 and the
storage controller 14 as well as between the storage controller
14 and storage devices 18a-b.
`The storage controller architecture described herein is
`organized into functional units. The control module receives
`data transfer commands (read or write commands) from the
`host computer 12 through the control path including the
`links 271 and 273. The control module 24 translates a data
`
`transfer command from the host 12 prior to transmitting the
`translated commands to the storage device 18 over the links
`273 and 272. The control module 24 performs translation of
`the command received from the host 12 into one or more
`
`commands depending on the data transfer request (read or
`write request) specified by the command from the host. The
`storage controller 26 may store data into the storage device
`18 using, for example, one or more RAID (Redundant Array
`of Independent Disks) levels. In this case, the translated set
`of commands from the control module 24 may also include
`appropriate commands for the RAID level selected. The
`control module 24 may include one or more processors
`labeled 241 and 242 in FIG. 3A to perform various control
`functions (or iops), including the translation of the com-
`mands received from the host computer 12.
`In general, the RAID level is determined when the storage
`volume is set up. At that time, the system software or the
`user may decide which RAID level to use. For example,
`mirroring under RAID 1 may be used. Alternatively, RAID
`5 with parity calculation may be chosen. A combination of
`more than one RAID level (for example, RAID 0 and RAID
`1) may also be implemented. In one embodiment, parts of
`the storage volume may be stored under different RAID
`levels or combination of RAID levels. The control module
`
`24 may be provided with the necessary information for the
`RAID level selected for data storage. This information may
`then be utilized by the control module 24 when issuing
`appropriate commands during data write operations. In some
`embodiments, during a data read operation, there may be no
`choice of RAID level and any redundancy present in the data
`read may be ignored.
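To illustrate how a single host write command may be translated into commands appropriate to the selected RAID level, the following Python sketch handles mirroring under RAID 1 and a simplified parity layout for RAID 5. The names `translate_write` and `xor_parity` are hypothetical. The RAID 5 branch is a deliberate simplification: it writes one stripe, pads the last chunk with zero bytes, and always places parity on the last device, whereas a real RAID 5 layout rotates parity across the devices.

```python
from functools import reduce


def xor_parity(chunks):
    """Byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def translate_write(data, raid_level, devices):
    """Translate one host write into a list of (device, bytes) write commands."""
    if raid_level == 1:
        # Mirroring: every device receives a full copy of the data.
        return [(dev, data) for dev in devices]
    if raid_level == 5:
        if len(devices) < 3:
            raise ValueError("this sketch needs at least three devices for RAID 5")
        n = len(devices) - 1                  # number of data chunks
        size = -(-len(data) // n)             # ceiling division
        chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0")
                  for i in range(n)]
        # Simplification: parity always goes to the last device.
        return list(zip(devices, chunks + [xor_parity(chunks)]))
    raise ValueError(f"unsupported RAID level: {raid_level}")
```

Because the parity chunk is the XOR of the data chunks, any one missing chunk can be rebuilt by XOR-ing the remaining chunks with the parity, which is the redundancy that the control module's translated commands are meant to maintain.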
`In one embodiment, the control module 24 dynamically
`selects one or more RAID levels (from the group of RAID
`levels identified when storage volume was set up) for the
`data to be written into the storage device 18. Depending on
`the write command received from the host 12 and depending
`on the prior storage history for specific types of writes from
the host 12, the control module driving software may
`instruct the storage device 18 to divide the data to be stored
`into more than one block and each block of data may be
`stored according to a different RAID algorithm (for
`example, one data block may be stored according to RAID
1 whereas another data block may be stored according to
`RAID 5) as indicated by the commands from the control
module 24 to the storage device 18. In an alternative
`embodiment, the control module 24 may simply instruct the
`storage device 18 to store the data received from the host 12
`using one fixed, predetermined RAID level



