US 2006/0031842 A1

(19) United States
(12) Patent Application Publication: Neiman et al.
(10) Pub. No.: US 2006/0031842 A1
(43) Pub. Date: Feb. 9, 2006
(54) SYSTEM AND METHOD FOR DIVIDING COMPUTATIONS

(76) Inventors: Steven Neiman, Staten Island, NY (US); Roman Sulzhyk, New York, NY (US)

Correspondence Address:
MILBANK, TWEED, HADLEY & MCCLOY
1 CHASE MANHATTAN PLAZA
NEW YORK, NY 10005-1413 (US)

(21) Appl. No.: 11/222,470

(22) Filed: Sep. 8, 2005

Related U.S. Application Data

(62) Division of application No. 10/177,597, filed on Jun. 20, 2002.

(60) Provisional application No. 60/355,274, filed on Feb. 8, 2002.

Publication Classification

(51) Int. Cl. G06F 9/46 (2006.01)
(52) U.S. Cl. 718/103
(57) ABSTRACT

In certain aspects, the invention features a system and method for receiving a parent job configured to produce one or more descendant jobs, and scheduling computation of the parent job on a node computing device that is one of a plurality of node computing devices of a distributed computing system. In such an aspect, the distributed computing system further includes a scheduler server configured to selectively reschedule computation of a job other than a parent job from any one of the plurality of node computing devices to another of the node computing devices. Such an aspect further includes preventing rescheduling of the parent job unless each of the descendant jobs is completed or terminated. In other aspects, the invention features a system and method for receiving, for computation by a node computing device, a parent job configured to produce a descendant job, wherein the node computing device is one of a plurality of node computing devices of a distributed computing system that also includes a scheduler server. In such aspects, the distributed computing system creates the descendant job, and the parent and descendant jobs are scheduled for computation on different node computing devices.
[Representative drawing: Transaction Manager 400, comprising CPU 405, Input Device 410, Data Storage Devices 415 (Databases 420-1 and 420-2), I/O Communications Ports 425, and Software 430 (Operating System 432, Database Management Program 434), in communication with Service Manager 700.]
Netflix, Inc. - Ex. 1006, Page 000001
IPR2022-00322 (Netflix, Inc. v. CA, Inc.)
Patent Application Publication  Feb. 9, 2006  Sheet 1 of 13  US 2006/0031842 A1
APPLICATION LAYER
PRESENTATION LAYER
SESSION LAYER
TRANSPORT LAYER
NETWORK LAYER
LINK LAYER
PHYSICAL LAYER

FIG. 1
[FIG. 2 (Sheet 2 of 13): System 10, in which local computers 100-1 to 100-N, each with an API 190, communicate with a compute backbone 300. The compute backbone 300 includes a transaction manager 400, a queue 500, a scheduler 600, a service manager 700, node computers 800-1 to 800-N, a global cache 900, and an administrative GUI 1000.]
[FIG. 3 (Sheet 3 of 13): Local Computer 100, comprising data storage devices 110, input devices 130, output devices 140, input/output communications ports 150, and software 160 (operating system 170, application 180, API 190), used by a user 20 and an applications developer 30.]

[FIG. 4 (Sheet 4 of 13): Transaction Manager 400, comprising CPU 405, input device 410, data storage devices 415 (databases 420-1 and 420-2), I/O communications ports 425, and software 430 (operating system 432, database management program 434), in communication with Service Manager 700.]

[FIG. 5 (Sheet 5 of 13): Scheduler 600, comprising CPU 605, input device 610, data storage devices 615 (databases 620-1 and 620-2), I/O communications ports 625, and software 630 (operating system 632, database management program 634), in communication with Service Manager 700.]

[FIG. 6 (Sheet 6 of 13): Service Manager 700, comprising CPU 705, input device 710, data storage devices 715 (databases 720-1 and 720-2), I/O communications ports 725, and software 730 (operating system 732, database management program 734), in communication with node computers 800-1 to 800-N, transaction manager 400, scheduler 600, and administrative GUI 1000.]
[FIG. 7 (Sheet 7 of 13): Node Computer 800, comprising data storage devices 810, input devices 830, output devices 840, input/output communications ports 850, and software 860 (operating system 870, launcher 880), in communication with queue 500, service manager 700, and global cache 900; used by a user 20 and an applications developer 30.]

FIG. 8a (Sheet 8 of 13) illustrates one embodiment of a method of executing a computing application:

1. Obtain a worker module having a compute function therein and deploy the compute function on the compute backbone 300 (1610).
2. Obtain a request from a calling application 180 to create a job (computation session) with the compute backbone 300 (1620).
3. Queue the job 182 in persistent storage queue 500 (1625).
4. Determine the availability of node computers 800-1 to 800-N (1630).
5. Schedule the job 182 on an available node computer 800 in accordance with node allocation requirements provided in the meta-information of the job (1640).
6. Send the job 182 to the proper node computer 800 and initiate the job 182 on that node 800 (1650).

(Continued in FIG. 8b.)

FIG. 8b (Sheet 9 of 13), continuing the method of FIG. 8a:

1. Determine whether the worker module to be invoked by the job 182 is loaded into memory 820 of node computer 800 (1660).
2. If not, node computer 800 accesses the worker module and loads it into the memory 820 of the node computer (1670).
3. The job 182 receives a task 186 and global data, and node computer 800 invokes the worker module to get a result (1680).
4. Node computer 800 makes the result available on the compute backbone 300 for retrieval by the calling application 180 (1690).

FIG. 10b (Sheet 10 of 13), a decision branch of the caching method of FIG. 10a:

- On the YES branch, return the partial or intermediate result (1855).
- On the NO branch, compute the compute function on the node computer to determine the partial or intermediate result (1860), and block other node computers from computing the compute function while it is being computed to obtain the partial or intermediate result (1865).

FIG. 9 (Sheet 11 of 13) illustrates one embodiment of a method of distributing computations:

1. A parent job 182 of a calling application 180 is sent to a node computer 800 according to a scheduling instruction by the scheduler 600 (1710).
2. The parent job 182 uses meta-information to instruct node computer 800 to divide out descendant jobs 182-1 to 182-N, to send the descendant jobs 182-1 to 182-N to the scheduler 600 for prioritization, and not to terminate the parent job 182 until all descendant jobs 182-1 to 182-N are completed (1720).
3. The scheduler 600 prioritizes and sends each descendant job 182-1 to 182-N to an available node computer 800 for processing (1730).
4. The node computer 800 processes each descendant job according to the compute function specified by the meta-information (1740).
5. The results of each descendant job 182-1 to 182-N are stored in queue 500 and made available to the parent job 182 (1750).
6. The results of any one descendant job 182-1 are stored in global cache 900 for use by other descendant jobs or the parent job 182 (1760).
7. The parent job 182 uses the results from the descendant jobs 182-1 to 182-N to create a parent job result (1770).
8. The parent job result is made available for retrieval by the calling application 180 (1780).

In an alternative branch, the parent job 182 terminates (1752), and all descendant jobs 182-1 to 182-N of the parent job 182 terminate upon termination of the parent job 182 (1754).

FIG. 10a (Sheet 12 of 13) illustrates one embodiment of a method of caching results:

1. The compute backbone 300 receives a job 182 from the calling application 180 (1810).
2. The scheduler 600 sends the job 182 to an available node computer 800-1 (1815).
3. Node computer 800-1 processes tasks 186-1 to 186-N of the job 182 to create a result previously identified as a partial or intermediate result to be cached (1820).
4. Node computer 800-1 sends the previously identified partial or intermediate result to global cache 900 (e.g., SAN or RAID) for storage therein by assigning a key/result pair to the partial or intermediate result (1825).
5. On one branch (YES), the scheduler 600 sends the job 182 to another available node 800-2 (1835), and node computer 800-2 accesses global cache 900 to retrieve the partial or intermediate result (1840).
6. A subsequent job 182 running on any node computer 800 accesses global cache 900 to retrieve the partial or intermediate result from the prior job 182 by presenting to global cache 900 an atomic lookup function with a key and compute function for each intermediate result (1845).
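The atomic "lookup with a key and compute function" described in steps 1845-1865 resembles lock-guarded memoization. The sketch below is only an illustrative analogy (the class and method names are hypothetical, and the patent's global cache 900 is a shared storage device such as a SAN or RAID, not an in-process dictionary); threads stand in for node computers contending for the same key:

```python
import threading

# Sketch of the FIG. 10a/10b pattern: an atomic lookup that returns a
# cached intermediate result if present, and otherwise computes it while
# blocking other callers, so the compute function runs at most once.
class GlobalCache:
    def __init__(self):
        self._results = {}              # key -> cached result
        self._lock = threading.Lock()   # blocks other "nodes" (threads)

    def lookup(self, key, compute):
        with self._lock:
            if key not in self._results:   # miss: compute and store
                self._results[key] = compute()
            return self._results[key]      # hit: return cached result
```

A second caller presenting the same key gets the stored result without re-running the compute function, mirroring step 1845.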

FIG. 11 (Sheet 13 of 13) illustrates one embodiment of a method of debugging:

1. The applications developer 30 creates a job 182 and worker module 195 on local computer 100 (1910).
2. The calling application 180 is initialized and linked to the API 190 "local" mode file (1920).
3. The API 190 loads worker module 195 into the process space of local computer 100 (1930).
4. The API 190 loads a functional replica of the compute backbone 300 into the process space of local computer 100 (1940).
5. Workers 155-1 to 155-N of worker module 195 are processed by CPU 120 of local computer 100 (1950).

US 2006/0031842 A1
Feb. 9, 2006
SYSTEM AND METHOD FOR DIVIDING COMPUTATIONS

BACKGROUND

[0001] I. Field of the Invention

[0002] The present invention relates to the structure and operation of computing systems, and more particularly, to distributed computing systems and methods of operating such systems.

[0003] II. Description of the Related Art
[0004] Certain organizations have a need for high performance computing resources. For example, a financial institution may use such resources to perform risk management modeling of the valuations for particular instruments and portfolios at specified states of the world. As another example, a pharmaceutical manufacturer may use high performance computing resources to model the effects, efficacy and/or interactions of new drugs it is developing. As a further example, an oil exploration company may evaluate seismic information using high performance computing resources.
[0005] One conventional computing system includes a mainframe computer attached to an individual user terminal by a network connection. Using the terminal, a user may instruct the mainframe computer to execute a command. In this conventional system, almost all data storage and processing functionality resides on the mainframe computer, while relatively little memory or processing capability exists at the terminal. This terminal/mainframe architecture may not, however, allow computations requested by a user to be computed rapidly or automatically.
[0006] The open systems interconnection (OSI) model describes one conceptual network architecture represented by seven functional layers. In this model, the functions of a networking system in a data communications network are reflected as a set of seven layers, including a physical layer, data link layer, network layer, transport layer, session layer, presentation layer and application layer. One or more entities within each layer implement the functionality of the layer. Each entity provides facilities for use only by the layer above it, and interacts directly only with the layer below it. FIG. 1 depicts the seven functional layers of the OSI model.
[0007] The physical layer describes the physical characteristics of hardware components used to form a network. For example, the size of cable, the type of connector, and the method of termination are defined in the physical layer.

[0008] The data link layer describes the organization of the data to be transmitted over the particular mechanical/electrical/optical devices described in the physical layer. For example, the framing, addressing and checksumming of Ethernet packets is defined in the data link layer.
[0009] The network layer describes how data is physically routed and exchanged along a path for delivery from one node of a network to another. For example, the addressing and routing structure of the network is defined in this layer.

[0010] The transport layer describes means used to ensure that data is delivered from place to place in a sequential, error-free, and robust (i.e., no losses or duplications) condition. The complexity of the transport protocol is defined by the transport layer.
[0011] The session layer involves the organization of data generated by processes running on multiple nodes of a network in order to establish, use and terminate a connection between those nodes. For example, the session layer describes how security, name recognition and logging functions are to take place to allow a connection to be established, used and terminated.

[0012] The presentation layer describes the format the data presented to the application layer must possess. This layer translates data from the format it possesses at the sending/receiving station of the network node to the format it must embody to be used by the application layer.

[0013] The application layer describes the service made available to the user of the network node in order to perform a particular function the user wants to have performed. For example, the application layer implements electronic messaging (such as "e-mail") or remote file access.
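The layering just described can be illustrated with a minimal sketch (the layer-name strings and bracketed "headers" are purely illustrative, not part of the OSI specification): each layer wraps the payload handed down by the layer above it, and the receiving side strips the wrappers in reverse order.

```python
# Minimal sketch of OSI-style encapsulation: each layer wraps the
# payload handed down by the layer above it, so the physical layer's
# header ends up outermost (names are illustrative only).
LAYERS = ["application", "presentation", "session", "transport",
          "network", "link", "physical"]

def encapsulate(payload: str) -> str:
    """Wrap the payload in one header per layer, top to bottom."""
    for layer in LAYERS:
        payload = f"[{layer}]{payload}"
    return payload

def decapsulate(frame: str) -> str:
    """Strip the headers in reverse order, bottom to top."""
    for layer in reversed(LAYERS):
        prefix = f"[{layer}]"
        assert frame.startswith(prefix), f"missing {layer} header"
        frame = frame[len(prefix):]
    return frame
```

Round-tripping a payload through both functions returns it unchanged, which is the sense in which each layer serves only the layer above it.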
[0014] In certain conventional high performance computing systems designed using the OSI model, the hardware used for computation-intensive processing may be dedicated to only one long-running program and, accordingly, may not be accessible by other long-running programs. Moreover, it may be difficult to easily and dynamically reallocate the computation-intensive processing from one long-running program to another. In the event processing resources are to be reallocated, a program currently running on a conventional high performance computer system typically must be terminated and re-run in its entirety at a later time.
SUMMARY OF THE INVENTION
[0015] In one aspect, the invention features a method including receiving, for computation by a node computing device of a distributed computing system, a parent job configured to produce one or more descendant jobs, wherein the node computing device is one of a plurality of node computing devices of the distributed computing system. Such a method also includes scheduling computation of the parent job on the node computing device. In accordance with such an aspect, the distributed computing system further includes a scheduler server configured to selectively reschedule computation of a job other than the parent job from any one of said plurality of node computing devices to another of the node computing devices, and to receive data descriptive of an indication that the parent job is not to be rescheduled unless each of the descendant jobs is completed or terminated. According to such an aspect, the method further includes preventing rescheduling of the parent job unless each of the descendant jobs is completed or terminated.
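The rescheduling constraint of this aspect can be sketched as follows. This is a simplified in-memory model with hypothetical class and function names, not the scheduler server's actual implementation: a job may be moved between nodes freely unless it is a parent whose descendants are still live.

```python
# Simplified sketch of the claimed rule: a parent job is pinned to its
# node until every descendant job is completed or terminated, while any
# other job may be freely rescheduled.
class Job:
    def __init__(self, name, parent=None):
        self.name = name
        self.descendants = []   # jobs this job has divided out
        self.done = False       # completed or terminated
        if parent is not None:
            parent.descendants.append(self)

def may_reschedule(job: Job) -> bool:
    """True unless the job has a descendant that is still running."""
    return all(d.done for d in job.descendants)
```

A job with no descendants trivially satisfies the check, so only parent jobs with outstanding descendants are ever prevented from moving.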
[0016] In another aspect, the invention features a distributed computing system including a plurality of node computing devices and means for receiving, for computation by at least one of the node computing devices, a parent job configured to produce one or more descendant jobs. Such a system also includes means for scheduling computation of the parent job on the node computing device. In accordance with such an aspect, the means for scheduling is further configured to selectively reschedule computation of a job other than the parent job from any one of the plurality of node computing devices to another of the node computing devices, and to receive data descriptive of an indication that the parent job is not to be rescheduled unless each of the
descendant jobs is completed or terminated. According to such an aspect, the distributed computing system further includes means for preventing rescheduling of the parent job unless each of the descendant jobs is completed or terminated.
BRIEF DESCRIPTION OF THE DRAWINGS

[0017] Features and other aspects of the invention are explained in the following description taken in conjunction with the accompanying drawings, wherein:

[0018] FIG. 1 depicts the seven functional layers of the open systems interconnection (OSI) model;

[0019] FIG. 2 illustrates a system 10 including a compute backbone 300 according to one embodiment of the present invention;

[0020] FIG. 3 illustrates certain components of one embodiment of a local computer 100 of the system 10 shown in FIG. 2;

[0021] FIG. 4 illustrates certain components of one embodiment of a transaction manager 400 of the system 10 shown in FIG. 2;

[0022] FIG. 5 illustrates certain components of one embodiment of a scheduler 600 of the system 10 shown in FIG. 2;

[0023] FIG. 6 illustrates certain components of one embodiment of a service manager 700 of the system 10 shown in FIG. 2;

[0024] FIG. 7 illustrates certain components of one embodiment of a node computer 800 of the system 10 shown in FIG. 2;

[0025] FIGS. 8a and 8b illustrate one embodiment of a method of executing a computing application using the system shown in FIG. 2;

[0026] FIG. 9 illustrates one embodiment of a method of distributing computations using the system 10 shown in FIG. 2;

[0027] FIGS. 10a and 10b illustrate one embodiment of a method of caching results using the system 10 shown in FIG. 2; and

[0028] FIG. 11 illustrates one embodiment of a method of debugging using the system 10 shown in FIG. 2.

[0029] It is to be understood that the drawings are exemplary, and are not limiting.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0030] Various embodiments of the present invention will now be described in greater detail with reference to the drawings.

I. System Embodiments of the Invention

[0031] FIG. 2 illustrates certain components of one embodiment of a system 10 of the present invention, which may generally include a number of local computers 100-1 to 100-N in communication, via a network 200, with a compute backbone 300.
[0032] A function of this embodiment of the system 10 is to service parametric computation requests of various users 20 or groups of users. In particular, such a system 10 may allow each user 20 access to a service on a common infrastructure for performing compute-dense calculations by dynamically allocating a portion of the compute backbone 300 infrastructure to the user 20 for processing of each user's 20 distinct application. A system 10 of one embodiment may include software that allows compute-intensive applications to queue, schedule and prioritize their calculations on the infrastructure. In addition, the infrastructure and software of such an embodiment may operate to manage resource allocation, authentication, job distribution, data flow and fault tolerance. In accordance with this system 10, distinct applications may each connect to the compute backbone 300 infrastructure, which may perform several operations including prioritizing compute requests from the applications according to a policy (predetermined or otherwise), allocating hardware and software resources, assigning compute requests to a proper computation resource, and returning results to the applications.
[0033] A. Local Computer 100

[0034] In the embodiment depicted in FIGS. 2 and 3, each local computer 100 may generally include one or more data storage devices 110, a central processing unit (CPU) 120, one or more input devices 130, one or more output devices 140, input/output (I/O) communications ports 150, and other hardware components (not shown) which facilitate performance of the functions of the local computer 100 and/or the system 10 as described herein. In one embodiment, the hardware devices of a local computer 100 may be in communication with one another by a shared data bus and/or by dedicated connections (not shown). In addition, a number of software components 160 may run on each local computer 100.
[0035] A local computer 100-1 of one embodiment may be, for example, a shared memory multiprocessor machine made by Sun Microsystems configured to run programs created using the Smalltalk programming language. Another embodiment of a local computer 100-2 may be an IBM machine running programs created using the C programming language. Yet another embodiment of a local computer 100-3 may be an SGI machine running programs using the C++ and/or Java programming languages. A further embodiment of a local computer 100-4 may include a composition of a number of separate devices.
[0036] The data storage devices 110 of one embodiment may include one or more hard disk drives. However, it is to be understood that data storage devices 110 such as RAM, ROM, CD-ROM, DVD-ROM, solid state drive, floppy disk-drive or combinations thereof may also be included in the embodiment shown in FIG. 3, or in certain other appropriate embodiments. One embodiment of a local computer 100-1 may include input device(s) 130 (e.g., keyboard, pointing/selecting device such as a mouse or track ball, floppy disk-drive, scanner and/or touch screen interface) that may enable a user 20 and/or applications developer 30 of the system 10 to provide information and instructions for storage in the local computer 100 and use in operation of the system 10. An embodiment of a local computer 100-1 may also include output devices 140 (e.g., printer, display device, floppy disk-drive and/or computer monitor) that may enable
a user 20 and/or applications developer 30 to receive, for further manipulation and/or storage, information generated using the local computer 100 and/or the system 10. The I/O communications ports 150 of a local computer 100-1 of one embodiment may be serial and parallel, and may be configured to include multiple communications channels for simultaneous connections. The software components 160 may include an operating system 170 (e.g., Linux, Unix, Microsoft Windows NT), one or more user interface tools 175, calling applications 180, and an application program interface (API) 190. One embodiment of the system 10 may include ten or more local computers 100-1 to 100-N.
[0037] i. Calling Application 180
[0038] In one embodiment, a calling application 180 may be a computer program that contains logic to achieve or produce an outcome for a user 20. The software architecture of certain applications may conceptually consist of four layers: user interface and ad hoc calculation tools; logic; persistence; and high performance computing. The user 20 may send certain computation-intensive portions of a particular calling application 180 (i.e., the high performance computing layer) to the compute backbone 300 for processing rather than have the local computer 100 process those computation-intensive portions. In accordance with one embodiment, the user 20 may do so by (i) creating one or more worker modules 195-1 to 195-N (e.g., shared libraries, executable files compliant with a compute backbone 300, Java archive files and/or other archive files), each of which contains one or more compute functions or engines called "workers" 155-1 to 155-N, (ii) deploying the worker modules 195-1 to 195-N on the compute backbone 300, and (iii) sending to the compute backbone 300 a job 182 that requests the compute backbone 300 to perform a computation using a worker 155 contained in a worker module 195 that has been deployed on the compute backbone 300. A worker 155 may be constructed to conform to and operate with the API 190, and may conceptually "plug" into the infrastructure of the compute backbone 300 (in particular, to the launcher 880 as described below in section v.). A compute function may be implemented in a number of ways including, without limitation, as a function, as a class method or as an executable constructed to be compatible with the compute backbone 300. In accordance with one embodiment, a worker 155 may be capable of staying initialized after completing a computation in order to handle additional compute requests should the scheduler 600 send such requests to the node computer 800 on which the worker 155 is invoked.
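As a rough sketch of the worker concept, a worker might be a compute function that the backbone invokes with a task input and global data, returning a task output or an error indication. The function name, argument shapes, and error convention below are all hypothetical; the specification does not define the API 190's actual signatures.

```python
# Hypothetical sketch of a worker 155: a compute function that takes a
# task input plus global data and returns either a task output or an
# error indication (conventions here are illustrative only).
def price_worker(task_input, global_data):
    """Toy worker: scale an instrument quantity by a market factor."""
    try:
        return {"output": task_input["quantity"] * global_data["factor"]}
    except KeyError as exc:
        return {"error": f"missing field: {exc}"}
```

Packaged into a worker module and deployed on the backbone, such a function could then be invoked once per task of a job.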
[0039] According to one embodiment, a worker 155 may be capable of computing tasks 186-1 to 186-N once loaded onto the compute backbone 300. For example, a worker 155 may be a function that takes task inputs and returns a task output or an error indication. Furthermore, a worker 155 may itself create a job 182 and schedule tasks 186-1 to 186-N with the compute backbone 300, thereby further subdividing computations to be performed.
[0040] A job 182 may be conceptualized as a means for opening or establishing a computation session with the infrastructure of the compute backbone 300. In one embodiment, a job 182 may include and supply to the compute backbone 300 certain defining requirements or parameters for a computation session. In particular, one embodiment of a job 182 may include meta-information, such as an identification of a particular worker 155 to be used with the job. In one embodiment, meta-information supplied by a job 182 identifies only one worker 155 such that all jobs 182-1 to 182-N on the compute backbone may have a generally homogeneous format. In another embodiment, meta-information supplied by a job 182 may identify more than one worker 155-1 to 155-N.
[0041] Other optional meta-information may include information about the priority of the job 182 in relation to other jobs, a specification of minimal hardware requirements (e.g., minimum RAM and/or CPU power) for the job 182, a specification of a minimum number of nodes to be allocated in order for the particular job 182 to be run properly or efficiently, the amount of debugging information the job 182 is to provide while it is running, and task logic to control sequencing and control of task computation (e.g., fail all tasks if one task fails, one task is dependent upon another task).
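Purely as an illustrative sketch, the meta-information enumerated above might be collected in a structure like the following. Every field name here is an assumption for illustration; the specification does not define a concrete schema.

```python
# Illustrative sketch of job 182 meta-information (field names assumed,
# not drawn from the specification). Each entry maps to an item listed
# in paragraph [0041].
job_meta = {
    "worker": "risk_worker",            # worker 155 identified for the job
    "priority": 5,                      # priority relative to other jobs
    "min_ram_mb": 2048,                 # minimal hardware requirements
    "min_cpu_ghz": 1.0,
    "min_nodes": 4,                     # minimum nodes to run efficiently
    "debug_level": "info",              # amount of debugging information
    "task_logic": "fail_all_on_error",  # sequencing/control of tasks
}
```

Per paragraph [0042], fields such as the priority or debug level could then be adjusted while the job runs.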
[0042] According to one embodiment, certain meta-information may be changed while a job 182 is running. For example, the priority of the job 182 may be adjusted by a user 20 without terminating or suspending the job 182. As another example, a user 20 may modify the amount of debugging information the job is to provide while it is running.
[0043] In one embodiment, a job 182 may also contain one or more tasks 186 and inputs which collectively represent a unit of computational work to be performed by a processor. Such inputs may include optional global data. A particular worker 155 of a worker module 195 deployed on the compute backbone 300 may perform each task 186-1 to 186-N. Global data and task inputs 187-1 to 187-N may combine to represent the inputs to a particular computation. For example, a job 182 may be defined to compute the value of a number of financial instruments based on the market conditions at closing time on a particular trading day. A user 20 may configure the job 182 such that the global data for the job 182 defines the market conditions at closing, and each instrument may be represented by a separate task 186. In such a case, the task inputs 187-1 to 187-N and global data would be supplied to generate task output 189. However, inputs (e.g., global data and/or task inputs 187-1 to 187-N) need not be provided to a job 182 at the time the job 182 is created. In addition, tasks 186-1 to 186-N need not be supplied at the time of job 182 creation. A job 182 also may have a dynamic collection of one or more tasks 186-1 to 186-N.
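The valuation example above can be sketched as follows. The data layout and the compute function are hypothetical (the specification fixes no concrete structure): global data carries the closing market conditions shared by every task, and each instrument becomes one task input.

```python
# Sketch of the [0043] example: global data holds market conditions
# shared by all tasks; each instrument is one task input 187; the
# worker combines the two to produce each task output 189.
global_data = {"close_date": "2002-06-20", "rate": 0.25}

task_inputs = [
    {"instrument": "bond-A", "notional": 100},
    {"instrument": "bond-B", "notional": 250},
]

def value(task_input, global_data):
    """Toy compute function: grow each notional at the shared rate."""
    return task_input["notional"] * (1 + global_data["rate"])

task_outputs = [value(t, global_data) for t in task_inputs]
```

The point of the split is that the shared conditions are supplied once as global data rather than repeated in every task input.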
[0044] A task 186 may be an encapsulation of a single computation to be performed by the compute backbone 300. A task 186 has an input object 187 (i.e., the input needed for a calculation), and on success it will have an output object or an error indication 189. At any point in time a task 186 also has a state 188, such as an indication of whether the task 186 has been completed or not (e.g., queued, running, completed, rescheduled, suspended, or error), and may produce log data as generated by the worker 155. In accordance with one embodiment, a worker 155 on the compute backbone 300 loads a worker module 195, performs a requested computation, and creates task output 189.
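The input/output/state triple of a task might be sketched like this (class and field names are illustrative; only the state names come from paragraph [0044]):

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Sketch of a task 186: an input object 187, an output object or error
# indication 189, a state 188 drawn from the lifecycle in [0044], and
# log data produced by the worker.
STATES = {"queued", "running", "completed", "rescheduled",
          "suspended", "error"}

@dataclass
class Task:
    input: Any                      # input object 187
    output: Optional[Any] = None    # output object / error indication 189
    state: str = "queued"           # state 188
    log: list = field(default_factory=list)

    def set_state(self, new_state: str) -> None:
        assert new_state in STATES, f"unknown state: {new_state}"
        self.state = new_state
```

A newly created task starts queued and moves through the lifecycle as the backbone schedules and runs it.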
[0045] In one embodiment, calling applications 180-1 to 180-N running on the local computers 100-1 to 100-N are programmed to interface with the compute backbone 300. In
particular, a calling application 180 running on a particular local computer 100 is compatible with the API 190 also running on that local computer 100. For example, a calling application 180 created in C programming language may be compatible with the C language API 190 running on a particular local computer 100. In such an example, a portion of the API 190 may communicate with both the calling application 180 and the compute backbone 300 in the following manner. First, a calling application 180 may send a request, in C langua
