`
(12) United States Patent
     Coates

(10) Patent No.: US 7,266,556 B1
(45) Date of Patent: Sep. 4, 2007

(54) FAILOVER ARCHITECTURE FOR A DISTRIBUTED STORAGE SYSTEM

(75) Inventor: Joshua L. Coates, Orinda, CA (US)

(73) Assignee: [...] Corporation, Santa Clara, CA
`
`
(*) Notice: Subject to any disclaimer, the term of this patent is
    extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/753,332

(22) Filed: Dec. 29, 2000
`
`
(51) Int. Cl.
     G06F 17/30 (2006.01)
(52) U.S. Cl. ........ 707/10; 707/203
(58) Field of Classification Search ........ 714/100, 6
     See application file for complete search history.
Primary Examiner: Joon Hwan Hwang
(74) Attorney, Agent, or Firm: Caroline M. Fleming
`
(57) ABSTRACT

A network storage system includes a virtual file system (“VFS”) that
manages the files of the network storage system, and a storage center
that stores the files. The VFS and the storage center are separated,
such that a client accesses the VFS to conduct file system operations
and the client accesses the storage center to upload/download files.
The client accesses the network storage system through one or more
storage ports. The storage center includes a plurality of distributed
object storage managers (DOSMs) and a storage cluster that includes a
plurality of intelligent storage nodes. The network storage system
includes additional storage centers at geographically disparate
locations. The network storage system uses a multi-cast protocol to
maintain file information at the DOSMs regarding files stored in the
intelligent storage nodes, including files stored in disparate
storage centers.
`
`29 Claims, 32 Drawing Sheets
`
[Cover drawing: File Upload/Download Operations; Load Balancing Fabric (310); DOSM 1, DOSM 2; Interconnect Fabric; Storage Node 1, Storage Node 2; Intelligent Storage Nodes]
`
`HPE, Exh. 1004, p. 1
`
`
`
`
`
`
Figure 1 (Sheet 1 of 32)
`
`
`
`
`
`
`
Figure 2 (Sheet 2 of 32)
`
`
`
`
`
Figure 3 (Sheet 3 of 32)

File Upload/Download Operations (300)
Load Balancing Fabric (310)
Distributed Object Storage Managers (320): DOSM 1, DOSM 2, ... DOSM "n"
Interconnect Fabric (330)
Intelligent Storage Nodes (340): Storage Node 1, Storage Node 2, ... Storage Node "n"
`
`
`
Figure 4 (Sheet 4 of 32)

Receive Download Request (SRL)?
Switching Fabric Selects DOSM (410)
Parse SRL to Extract Certificate and Object Fingerprint (415)
Does SRL Authenticate? (420)
Error Message Is Sent to Recipient
Does DOSM Contain Location? (440)
Storage Node Readable?
DOSM Obtains Connection With Storage Node With Object (435)
Object Is Transmitted From Storage Cluster to Recipient (495)
Broadcast Request To Storage Nodes to Locate Object (450)
Each Storage Node Determines Whether It Stores The Object (460)
DOSM Establishes Point to Point Communications With Each Storage Node (464)
Object Located? (466)
Perform Failover Operation to Obtain Object (468)
Storage Node With Object Broadcasts Object Identification Information for all DOSMs (470)
`
`
`
Figure 5 (Sheet 5 of 32)

Decode SRL To Extract Client Identification, Object Fingerprint And SRL Certificate (500)
Extract Secret For The Corresponding Client (510)
Generate Calculated Certificate (520)
Compare Calculated Certificate With SRL Certificate
Generate Error Message To Requester (550)
SRL Authenticates (560)
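The authentication steps of Figure 5 can be sketched in Python as follows. This is a minimal illustration only: the figure does not say how the certificate is calculated or how an SRL is laid out, so the HMAC-SHA256 construction, the `client/fingerprint/certificate` layout, and every name below are assumptions.

```python
import hashlib
import hmac
from typing import NamedTuple


class ParsedSRL(NamedTuple):
    client_id: str
    fingerprint: str
    certificate: str


# Hypothetical per-client secrets consulted in step 510.
CLIENT_SECRETS = {"client-3": b"example-secret"}


def make_certificate(secret: bytes, client_id: str, fingerprint: str) -> str:
    """Assumed certificate: HMAC-SHA256 over the client id and fingerprint."""
    message = f"{client_id}/{fingerprint}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def parse_srl(srl: str) -> ParsedSRL:
    """Step 500, with an assumed layout: client_id/fingerprint/certificate."""
    client_id, fingerprint, certificate = srl.split("/")
    return ParsedSRL(client_id, fingerprint, certificate)


def authenticate(srl: str) -> bool:
    """Return True when the SRL certificate matches the calculated one."""
    try:
        parsed = parse_srl(srl)
    except ValueError:
        return False  # malformed SRL: error message to requester (550)
    secret = CLIENT_SECRETS.get(parsed.client_id)  # step 510
    if secret is None:
        return False
    calculated = make_certificate(secret, parsed.client_id, parsed.fingerprint)  # 520
    # Constant-time comparison of calculated and presented certificates.
    return hmac.compare_digest(calculated, parsed.certificate)
```

A caller would reject the request whenever `authenticate` returns `False` and proceed to the download or upload path otherwise.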
`
`
`
Figure 6 (Sheet 6 of 32)

DOSM Server (600)

DOSM File Look-up Table (610)

File Id      IP Addr       Disk Id
file1.MD5    10.3.100.1    3
file2.MD5    10.3.098.1    1
file3.MD5    10.3.050.1    6
file4.MD5    10.3.100.1    2
file5.MD5    10.3.098.1    1

Data Cache (620): Film Snippet, Advertisement, Film Preview

State Table (630)
- Read-Write State of Storage Nodes
- Health of Storage Nodes
- Load of Storage Nodes
  - Storage Capacity
  - Number of I/O Operations Per Second
`
`
`
Figure 7 (Sheet 7 of 32)
`
`
`
Figure 8 (Sheet 8 of 32)

Start
VFS Performs Directory Operations for New Object File And Transmits Upload Request To Storage Cluster (805)
Switching Fabric Selects DOSM (810)
Parse Request to Extract Certificate, Client, Directory And Object Information (820)
Does Request Authenticate? (830)
Send Error To Source (835)
DOSM Selects Destination Storage Node(s) To Store Object File (840)
DOSM Obtains Connection With Destination Storage Node(s) (850)
Object File Is Transmitted to Destination Storage Node(s) (860)
Generate and Verify A Fingerprint for the Object File (870)
DOSM Communicates Fingerprint, Folder Id, Client Information and Meta Data to VFS (880)
(890)
DOSM Verifies Upload To The Source (895)
`
`
`
Figure 9 (Sheet 9 of 32)

Create A Temporary File With The Contents Of The Object At The Destination Node (900)
Compute an MD5 Hash On The Contents Of The Temporary File (910)
DOSM Determines Whether The MD5 Hash Fingerprint Currently Exists In The Storage System (920)
Fingerprint Currently Exists? (930)
Convert Temporary File To Permanent File With MD5 Fingerprint Identification (960)
Delete Temporary File
Increment The Reference Count For The Existing
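The fingerprinting flow of Figure 9 can be sketched as a small in-memory store. This is a hedged illustration, not the patented implementation: the class and method names are hypothetical, the `.MD5` suffix mirrors the file identifiers shown in Figure 6, and starting a new object's reference count at one is an assumption.

```python
import hashlib


class ObjectStore:
    """In-memory stand-in for one storage node's fingerprinted objects."""

    def __init__(self):
        self.files = {}       # fingerprint -> object contents
        self.ref_counts = {}  # fingerprint -> number of references

    def upload(self, contents: bytes) -> str:
        temporary = bytes(contents)  # step 900: temporary copy of the object
        # Step 910: MD5 hash of the temporary file's contents.
        fingerprint = hashlib.md5(temporary).hexdigest() + ".MD5"
        if fingerprint in self.files:  # steps 920/930: fingerprint exists
            # The temporary copy is discarded; only the count changes.
            self.ref_counts[fingerprint] += 1
        else:
            # Step 960: the temporary file becomes the permanent file.
            self.files[fingerprint] = temporary
            self.ref_counts[fingerprint] = 1  # assumed initial count
        return fingerprint
```

Uploading the same bytes twice therefore stores one copy and raises its reference count to two, which is what lets identical objects share storage.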
`
`
`
Figure 10 (Sheet 10 of 32)
`
`
`
Figure 11 (Sheet 11 of 32)
`
`
`
`
Figure 12 (Sheet 12 of 32)

Customer Table (1200)

Customer Name    Customer Reserved Fields
Customer A       [Customer stores data ...]
Customer B       [Customer stores data ...]
Customer C       [Customer stores data ...]
Customer D       [Customer stores data ...]

Folder Table (1210)

Customer Id    Folder Id    Folder Parent Id    Metadata
3              2            -                   [Reserved]
3              100          2                   [Reserved]
3              251          2                   [Reserved]
3              166          251                 [Reserved]

File Table (1220)

Customer Id    File Handle    Folder Id    Folder Parent Id    Metadata
3              52.MD5         100          2                   [Reserved]
3              55.MD5         100          2                   [Reserved]
3              99.MD5         166          251                 [Reserved]
3              67.MD5         166          251                 [Reserved]
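Using the rows shown in Figure 12, the following Python sketch illustrates how folder and file tables of this shape can answer an "open folder" request and recover a folder's ancestry. The dictionaries and function names are hypothetical stand-ins for the database tables, and only the folder id and parent id columns are modeled.

```python
# Folder Table (1210) for customer 3: folder id -> parent folder id.
# None marks the folder whose parent column is "-" in the figure.
FOLDERS = {2: None, 100: 2, 251: 2, 166: 251}

# File Table (1220) for customer 3: file handle -> containing folder id.
FILES = {"52.MD5": 100, "55.MD5": 100, "99.MD5": 166, "67.MD5": 166}


def folder_chain(folder_id):
    """Walk parent pointers up to the top folder; return ids top-down."""
    chain = []
    while folder_id is not None:
        chain.append(folder_id)
        folder_id = FOLDERS[folder_id]
    return list(reversed(chain))


def open_folder(folder_id):
    """Return (file handles, sub-folder ids) directly inside a folder."""
    files = sorted(handle for handle, parent in FILES.items() if parent == folder_id)
    subfolders = sorted(child for child, parent in FOLDERS.items() if parent == folder_id)
    return files, subfolders
```

For example, `open_folder(100)` returns the two file handles stored in folder 100 and no sub-folders, while `folder_chain(166)` shows that folder 166 sits under folder 251, which sits under folder 2.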
`
`
`
Figure 13A (Sheet 13 of 32)

Parse Request to Extract Certificate, Client Information, Operation Code and Arguments
Validate?
Send Error Message To Requester (1325)
Open Folder? If so: Access File And Folder Tables To Extract File Ids And Sub-Folder Ids (1345)
Move Folder? If so: Revise Folder Table Entries To Reflect New Location (1355)
Create Folder? If so: Add Entry For New Folder In Folder Table
Move File? (1370) If so: Revise File Table Entries To Reflect New Location (1375)
Return Arguments To Requester (1390)
End
(To Figure 13B)
`
`
`
Figure 13B (Sheet 14 of 32)

(From Figure 13A)
Delete Folder? If so: Delete Folder Entry From Folder Table (1374)
Delete File? If so: Delete File Entry From File Table (1378)
Create File? If so: Add Entry For File In File Table (1384)
Update Folder? If so: Update Client Metadata In Folder Table For Folder Entry (1388)
Update File? If so: Update Client Metadata In File Table For File Entry (1394)
Return Arguments To Requester (1396)
End
`
`
`
Figure 14 (Sheet 15 of 32)

Perform Validation Using Client Information and Certificate From The Request (1405)
Request Validated? (1410)
Send Error Message (1415)
Extract MD5 Handler From Database Entry (1420)
Is Reference Count Greater Than One?
Decrement Reference Count By One (1440)
Delete File Identification in File Table (1450)
DDM Constructs A Delete SRL And Transmits The Delete SRL To The Storage Cluster (1460)
Storage Cluster Deletes File From Appropriate Storage Node (1470)
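The reference-counted delete of Figure 14 can be sketched as below. The ordering of the branches is an assumption, since the arrows of the flow diagram are not recoverable from the text; in particular, removing the file table entry on both branches is a guess. The three dictionaries and the function name are hypothetical.

```python
def delete_file(name, file_table, ref_counts, stored_objects):
    """Remove one reference to an object; drop the object with its last reference.

    file_table:      file name -> fingerprint (MD5 handle), step 1420
    ref_counts:      fingerprint -> reference count
    stored_objects:  fingerprint -> contents held by the storage cluster
    """
    fingerprint = file_table[name]  # step 1420: look up the MD5 handle
    if ref_counts[fingerprint] > 1:
        ref_counts[fingerprint] -= 1  # step 1440: other references remain
    else:
        # Steps 1460/1470: stands in for sending a delete SRL to the cluster.
        del stored_objects[fingerprint]
        del ref_counts[fingerprint]
    del file_table[name]  # step 1450 (assumed to apply on both branches)
```

With two names pointing at one fingerprint, the first delete only lowers the count and the second removes the stored object.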
`
`
`
Figure 15 (Sheet 16 of 32)
`
`
`
Figure 16 (Sheet 17 of 32)
`
`
`
`
Figure 17 (Sheet 18 of 32)
`
`
`
`
`
`
`
Figure 18 (Sheet 19 of 32)

End-User Computer Generates HTTP Request To Content Origin Web Server (1800)
Content Origin Server Returns to End-User Computer HTML With Embedded File URL (1810)
End-User Computer Generates HTTP File Request To Content Delivery Network (1820)
File In CDN Cache?
CDN Generates An HTTP File Request to Storage Center (1830)
Storage Center Downloads File To CDN Cache (1840)
CDN Delivers File To End-User Computer (1850)
`
`
`
Figure 19 (Sheet 20 of 32)

End-User Computer (1900)
Client Site (1910)
Content Web Server (1925)
Storage Port (1930)
Storage Center (1950)
`
`
`
Figure 20 (Sheet 21 of 32)

Client Receives URL File Request From End-User (2010)
Client Generates Local File System Request (2020)
Object File Local? (2030)
Retrieve Object File From Data Cache (2040)
Request Object File From Storage Center(s) (2050)
Receive Object File From Storage Center(s) (2060)
Return Object File In Response To Local File System Request (2070)
Deliver Object File To End-User In Response To URL File Request (2080)
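The cache-or-fetch decision in Figure 20 can be sketched as follows. This is an illustration under assumptions: the class name, the injected `fetch_from_storage_center` callable, and the choice to keep a fetched object in the data cache (Figure 20 itself does not show that step) are all hypothetical.

```python
class StoragePort:
    """Serves object files from a local data cache, else from a storage center."""

    def __init__(self, fetch_from_storage_center):
        # Callable taking a file name and returning its bytes (steps 2050/2060).
        self._fetch = fetch_from_storage_center
        self.data_cache = {}

    def read(self, name: str) -> bytes:
        if name in self.data_cache:  # decision 2030: object file is local
            return self.data_cache[name]  # step 2040
        contents = self._fetch(name)  # steps 2050 and 2060
        self.data_cache[name] = contents  # assumed: keep a local copy
        return contents  # step 2070
```

A second read of the same name is then answered from the cache without contacting the storage center.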
`
`
`
Figures 21a and 21b (Sheet 22 of 32)
`
`
`
`
`
`
`
`
Figure 22 (Sheet 23 of 32)
`
`
`
Figure 23 (Sheet 24 of 32)

(2300)
Client Local File System (2310)
File System Translator (2320)
Storage System Access Processes (2330)
Local File System Interception (2340)
Storage System Kernel Processes (2350)
Operating System Kernel (2360): Local File System
Directory Cache (2370)
Labeled flows: Directory Information; Directory Operation XML Requests To VFS; Storage System Requests
`
`
`
Figure 24 (Sheet 25 of 32)

Client Issues Local File Open Request (2400)
Import Local File Request (2410)
Dispatch File System Request to File System Translator (2420)
Additional Directory Information Required?
Data Required?
Return Directory Information (2437)
Generate Request to VFS for File and Directory Information (2450)
Receive Directory Information And Store In Directory Cache (2460)
Is File in Data Cache?
Generate SRL Request To Download File From Storage Cluster (2470)
Receive and Cache Object File (2480)
Transfer Object From Storage Port to Client Requester (2490)
`
`
`
Figure 25 (Sheet 26 of 32)

(2600)
Computer (2610)
Content Web Server (2630)
Storage Port (2640)
Storage Center (2660)
Labeled flows: URL Request; HTML With Embedded S...; SRL Request; Object File Served
`
`
`
Figure 26 (Sheet 27 of 32)

Content Web Server Generates Local File System Request for SRL(s) for File(s) (2700)
Storage Port Generates SRL(s) for File(s) (2710)
Time Out Parameter Is Added to SRL (2720)
SRL Is Embedded In HTML Web Page (2730)
End-User Issues Web Page Request (2740)
Content Web Server Downloads HTML With Embedded SRL (2745)
End-User Generates HTTP Request to Storage Center With SRL (2750)
Valid?
Authenticate File
Storage Center Downloads Object File To End-User (2770)
`
`
`
Figure 27 (Sheet 28 of 32)

End-User Computer (2810)
Client Site (2820)
Content Web Server (2830)
Private File Manager (2840)
Storage Center
Labeled flows: URL Request; HTML With Embedded S...; File Location Request; SRL Request; Object File Served
`
`
`
Figure 28 (Sheet 29 of 32)

End-User Issues URL Request (2900)
Content Web Server Generates A File Location Request to File Manager (2910)
File Manager Retrieves File SRL for Content Web Server (2920)
Content Web Server Transmits To End-User HTML With Embedded SRL (2930)
End-User Generates HTTP Request to Storage Center With SRL (2940)
SRL Authenticate?
Storage Center Generates MD5 Hash on "SRL" and Identifies File (2947)
Storage Center Downloads Object File To End-User (2950)
`
`
`
Figure 29 (Sheet 30 of 32)
`
`
`
`
Figure 30 (Sheet 31 of 32)

Storage Port Failover?
Read File Operation Requested?
Open File Operation Requested? (3130)
Execute Open File Operation (3140)
Generate XML Request To VFS for File Identification (3150)
VFS Returns File Identification (3160)
Storage Port Updates Directory Cache (3170)
Generate SRL Request To Storage Cluster (3180)
Receive File and Update Cache (3190)
`
`
`
Figure 31 (Sheet 32 of 32)

Fails?
DOSMs Update State Table For Storage Node Failover (3220)
File Requested?
DOSM Issues Multi-cast Protocol Request To Storage Nodes (3230)
Each Storage Node Determines Whether It Contains The Requested File
DOSM Selects Different Storage Center (3247)
Storage Node With File Broadcasts File Identification Information (3250)
DOSM Snoops, Through Multi-Cast Protocol, To Update File Information (3260)
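The locate-and-snoop steps of Figure 31 can be sketched with an in-process simulation. No real multi-cast networking is performed here; the loop over nodes merely stands in for the multi-cast request, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class StorageNode:
    address: str
    files: Set[str] = field(default_factory=set)


@dataclass
class Dosm:
    # File look-up table: file id -> address of the node holding it.
    lookup: Dict[str, str] = field(default_factory=dict)

    def snoop(self, file_id: str, address: str) -> None:
        """Step 3260: record a location overheard on the multi-cast group."""
        self.lookup[file_id] = address


def locate(file_id: str, nodes: List[StorageNode], dosms: List[Dosm]) -> Optional[str]:
    """Simulate steps 3230 to 3260 for a single requested file."""
    for node in nodes:  # each node checks whether it holds the file
        if file_id in node.files:
            for dosm in dosms:  # step 3250: the answer reaches every DOSM
                dosm.snoop(file_id, node.address)
            return node.address
    return None  # not found: the DOSM may select a different storage center (3247)
```

Because every DOSM records the broadcast answer, a later request for the same file can be served from any DOSM's look-up table without another search.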
`
`
`
FAILOVER ARCHITECTURE FOR A
DISTRIBUTED STORAGE SYSTEM

CROSS-REFERENCES TO RELATED
APPLICATIONS

This application claims the benefit of U.S. patent application
Ser. No. 09/695,499, filed Oct. 23, 2000, entitled “A Network
Storage System”, and to U.S. Provisional Patent Applications
Nos. 60/186,693 and 60/186,774, filed Mar. 3, 2000, entitled
“Method and Apparatus for Implementing A Network-Based Storage
Service” and “Method and Apparatus for Establishing Control and
Data Lines To A Storage Facility, And API For Supporting Such
Lines”, respectively.

BACKGROUND OF THE INVENTION

1. Field of the Invention
The present invention is directed toward the field of storage,
and more particularly toward accessing remote storage through
use of a local device.
2. Art Background
With the rapid digitization of music, film and photographs,
customer demand is driving the Internet to become the most
preferred transport mechanism for all forms of digital media.
Using the Internet, users have instantaneous worldwide access to
their favorite movies, songs, or personal memorabilia. As the
producers and owners of media content increasingly use the
Internet as a primary method for worldwide distribution, the
aggregate amount of rich media content available over the
Internet is increasing at an extremely rapid rate.
Not only is the number of rich media objects available over the
Internet growing exponentially, but the size of the media,
generally referred to herein as objects, is also dramatically
increasing. A median Web object is 5 kilobytes (KB) in size,
while the size of a rich media object may be 100 to 1 million
times larger. For example, high-resolution digital photographs
average 500 KB per picture. Digital music runs 3 to 5 megabytes
(“MB”) per song, and digital movies may reach up to 4 gigabytes
(“GB”) in size.
As the number of personal computers, digital camcorders, digital
cameras, and personal digital audio players grows, demand for
Internet bandwidth to store, share and retrieve media files
across the Internet also will grow. As the use of high-bandwidth
digital subscriber lines (“DSL”), cable modems, and digital
broadcast satellite networks gains in popularity, which supports
the growth of the Internet backbone, the demand for using the
Internet as a primary delivery channel for rich media objects
also gains in popularity. This development causes a virtuous
cycle, where the installation of broadband networks drives the
use of rich media devices, which in turn creates demand for
further improvements in network bandwidth, and so on.
The distribution of rich media objects across the Internet
creates the need for increased storage capacity to store these
rich media objects. As the number of personal media devices
grows, and the network bandwidth expands, the amount of storage
media required to store the various MP3 files, photographs,
films, and video clips will also grow. Also, as more storage
becomes readily available, more people will use the Internet to
catalog, store, and access their rich media objects (e.g.,
digital photographs of family members).
To date, only traditional storage solutions from established
enterprise vendors have been available to a Web site developer
implementing rich media repositories. One challenge with adopting
today’s existing storage technology for use with the Internet is
meeting current and future scalability requirements. Today, large
scale storage systems only scale to a few dozen terabytes. This
amount of storage space is inadequate for storing substantial
amounts of rich media objects. For example, if just 10 percent of
America Online (“AOL”) users place two 15 minute videos on a
personal home page, then one petabyte (i.e., 1000 terabytes) of
storage would be required. Today’s enterprise storage system
architectures cannot support this level of storage capacity.
In the Internet world, in addition to providing mass storage, it
is also critically important to provide universal access to that
storage across the wide area network. The content provider,
regardless of the location of their content servers, cache
servers, or stream servers, would ideally like to provide
ubiquitous access to an entire store of rich media objects.
Current technology, including storage area networks and network
attached storage technologies, does not provide direct access to
the wide area network. Only servers located within the same
metropolitan area can directly access these types of storage
systems.
Since Internet users are measured in the tens of thousands or
even millions of users, instead of hundreds of users, another
challenge in mass storage is the ability to scale delivery of
media as the demand increases. A true Internet based storage
system must be able to handle peak loads of millions of
simultaneous requests from all around the world. Traditional
storage architectures are designed to support a few hundred
simultaneous requests with the fastest possible response time to
match the speed of the server CPU. For the Internet, storage
systems must be able to manage literally millions of simultaneous
downloads at the speed of the wide area network. Thus, these
traditional storage architectures are not “impedance matched”
with the wide area network because the storage devices handle far
too few simultaneous transactions while far exceeding the latency
requirements of the wide area network. In addition, these
traditional storage architectures are typically implemented with
expensive disks and expensive connection technologies.
Another issue regarding storage of rich media objects is the time
to market. The time to market is often a crucial requirement for
new rich media Web sites. Growth rates are measured in terabytes
per month. Quickly bringing new capacity online becomes a
strategic advantage in fast-moving markets. Typically, with
traditional storage solutions, it takes a customer two to six
months to integrate a fully operational multi-terabyte storage
unit with the content provider’s site. This start-up time is too
slow to meet rapidly increasing business demands. Pre-building
large amounts of excess capacity in anticipation of this demand
is one tactic to deal with unpredictable demand spikes, but this
approach is prohibitively expensive.
Traditional storage architectures have been optimized for
database and file server applications. The Internet introduces a
whole new set of demands on storage devices, including
scalability, global access, user accounts, and rapid deployment.
With the explosive growth in rich media served over the Internet
over the next several years, this is coming to a head. The coming
tidal wave of rich content will surpass the capabilities of even
the most robust enterprise storage architectures. Accordingly,
there is a demand to develop new paradigms in new ways of
designing Internet ready rich media storage systems.
`
`
`
FIG. 3 is a block diagram illustrating one embodiment for
the storage cluster.
FIG. 4 is a flow diagram illustrating one embodiment for
the download operation in the storage cluster.
FIG. 5 is a flowchart illustrating one embodiment for
authentication in the network storage system.
FIG. 6 illustrates one embodiment of a distributed object
storage manager (“DOSM”).
FIG. 7 is a block diagram illustrating one embodiment for
an intelligent storage node.
FIG. 8 is a flow diagram illustrating one embodiment for
processing upload requests in the storage cluster.
FIG. 9 is a flow diagram illustrating one embodiment for
generating unique fingerprints of object files.
FIG. 10 is a block diagram illustrating one embodiment
for caching data in the storage cluster.
FIG. 11 is a block diagram illustrating one embodiment
for implementing a VFS for use with a network storage
system.
FIG. 12 illustrates example database tables for
implementing the file system with a database.
FIGS. 13A and 13B are flow diagrams illustrating one
embodiment for performing directory operations in the VFS.
FIG. 14 is a flow diagram illustrating one embodiment for
the delete file operation for the network storage system.
FIG. 15 illustrates geographical replications of storage
centers.

SUMMARY OF THE INVENTION
`
A distributed storage system includes multiple intelligent
storage nodes arranged in one or more storage centers. The
intelligent storage nodes, which store the files for the
distributed storage system, are combined with one or more
distributed object storage managers (“DOSMs”). A network couples
the DOSMs to the intelligent storage nodes. The DOSMs manage
requests from clients, such as content servers and end-user
computers, to download files from a storage center.
The distributed storage system is arranged to support
uninterrupted delivery of files in the event of a failure in an
intelligent storage node. In one embodiment, the failover
architecture includes storing, for each file, a duplicate file in
a different intelligent storage node. In the event the
distributed storage system enters a failover condition, the DOSM
determines, for files stored in the failed intelligent storage
node, a location (i.e., intelligent storage node) for the
duplicate files. In one embodiment, to determine the location of
the intelligent storage node storing a duplicate file, the DOSMs
store a map. The map provides a correspondence between a network
address of the failed intelligent storage node and a network
address of the intelligent storage node that stores the duplicate
file. In one embodiment for a distributed storage system that
uses TCP/IP network protocols, the IP addresses between two
corresponding intelligent storage nodes differ only in a subnet
portion of the IP network addresses.
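The address correspondence described above can be illustrated with a short Python sketch. It rests on assumptions the text does not state: that the second octet of the IPv4 address is the "subnet portion", and that subnets 3 and 4 form a mirrored pair. The constant and function names are hypothetical.

```python
import ipaddress

# Assumption: zero-based index of the octet treated as the subnet portion.
SUBNET_OCTET = 1

# Assumption: subnet 3 and subnet 4 hold mirrored storage nodes.
MIRROR_SUBNET = {3: 4, 4: 3}


def mirror_address(failed: str) -> str:
    """Map a failed node's address to the node holding its duplicate files."""
    octets = list(ipaddress.IPv4Address(failed).packed)
    # Only the subnet octet changes; the other three octets are kept as-is.
    octets[SUBNET_OCTET] = MIRROR_SUBNET[octets[SUBNET_OCTET]]
    return str(ipaddress.IPv4Address(bytes(octets)))
```

Under these assumptions `mirror_address("10.3.100.1")` yields `10.4.100.1`, and applying the function twice returns the original address, so a DOSM needs no per-file state to find a duplicate after a node failure.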
`
In one embodiment, the intelligent storage nodes, which store
duplicate files, are located in different storage centers (i.e.,
first and second storage centers). The storage centers are
located in different geographic areas (e.g., west coast of United
States and east coast of United States). In one embodiment, the
storage centers are mirrored. Thus, the files stored in one
intelligent storage node in the first storage center are the same
as the files stored in one intelligent storage node in the second
storage center. The copies of files may also be stored locally in
the same storage center.
In other embodiments, after a failover condition, the DOSMs may
search for files in other intelligent storage nodes. For this
embodiment, the distributed storage system employs a multi-cast
protocol. The multi-cast protocol permits the DOSM to broadcast a
request to intelligent storage nodes to locate a file. If
located, the intelligent storage node broadcasts a location for
all DOSMs. The DOSMs snoop the multi-cast protocol packets, learn
the new location for the file, and update their reference tables
to reflect the new location for the file.

FIG. 16 is a block diagram illustrating one embodiment
for replicating the storage centers.
FIG. 17 illustrates one embodiment for use of the storage
center in a content delivery network.
FIG. 18 is a flow diagram illustrating one embodiment for
use of the storage center with a content delivery network.
FIG. 19 illustrates one embodiment for use of the storage
port in the network storage system.
FIG. 20 is a flow diagram illustrating one embodiment for
use of a storage port to deliver content.
FIG. 21a illustrates one hardware configuration for a
storage port device.
FIG. 21b illustrates embodiments for implementing the
storage port in software.
FIG. 22 is a block diagram illustrating one embodiment
for a storage port.
FIG. 23 is a block diagram illustrating one embodiment
for file system translation in the storage port.
FIG. 24 is a flow diagram illustrating one embodiment for
translating a file system operation from a local file system to
the network storage file system.
FIG. 25 is a block diagram illustrating one embodiment
for using the storage port to directly download object files to
the end-user.
FIG. 26 is a flow diagram illustrating one embodiment for
directly downloading object files to an end-user.
`FIG. 27 is a block diagramillustrating one embo