US 20090271799 A1

(19) United States
(12) Patent Application Publication: Barsness et al.
(10) Pub. No.: US 2009/0271799 A1
(43) Pub. Date: Oct. 29, 2009
`
(54) EXECUTING A DISTRIBUTED JAVA APPLICATION ON A PLURALITY OF COMPUTE NODES

(75) Inventors: Eric L. Barsness, Pine Island, MN (US); David L. Darrington, Rochester, MN (US); Amanda E. Peters, Rochester, MN (US); John M. Santosuosso, Rochester, MN (US)
`
`Correspondence Address:
`IBM (ROC-BLF)
`C/O BIGGERS & OHANIAN, LLP, P.O. BOX 1469
`AUSTIN, TX 78767-1469 (US)
`
(73) Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION, ARMONK, NY (US)

(21) Appl. No.: 12/109,248

(22) Filed: Apr. 24, 2008

Publication Classification

(51) Int. Cl.
     G06F 9/46 (2006.01)
(52) U.S. Cl. ................................ 718/106
`
`(57)
`
`ABSTRACT
`
`Methods, systems, and products are disclosed for executing a
`distributed Java application on a plurality of compute nodes.
`The Java application includes a plurality of jobs distributed
`among the plurality of compute nodes. The plurality of com(cid:173)
`pute nodes are connected together for data communications
`through a data communication network. Each of the plurality
`of compute nodes has installed upon it a Java Virtual Machine
`(' NM') capable of supporting at least one job of the Java
`application. Executing a distributed Java application on a
`plurality of compute nodes includes: tracking, by an applica(cid:173)
`tion manager, a just-in-time ('JIT') compilation history for
`the JVMs installed on the plurality of compute nodes; and
`configuring, by the application manager, the plurality of jobs
`for execution on the plurality of compute nodes in depen(cid:173)
`dence upon the JIT compilation history for the JVMs installed
`on the plurality of compute nodes.
`
[Representative drawing (reproduced as FIG. 7): track, by an application manager, a JIT compilation history (652) for the JVMs installed on the plurality of compute nodes (700); configure, by the application manager, the plurality of jobs (158) of the application (601) for execution on the plurality of compute nodes (600) in dependence upon the JIT compilation history (702), performing for at least one job of the Java application (704): select, for that job, one of the compute nodes in dependence upon the JIT compilation history and at least one desired JIT level (707) for that job (706), and configure that job on the selected compute node (708) for execution (710).]
`
`
`
[FIG. 1 (Sheet 1 of 8): Exemplary parallel computer (100) for executing a distributed Java application, including compute nodes (102) organized into an operational group (132), a service application (124) and service application interface (126) for a user (128), I/O nodes (110, 114), LAN (130), data storage (118), printer (120), and terminal (122).]
`
`
`
[FIG. 2 (Sheet 2 of 8): Block diagram of an exemplary compute node (152), including a processor (164) with ALU (166), RAM (156) storing an application manager (125), network monitor (201), job (158), Java Virtual Machine (200), messaging module (161), and operating system (162), a memory bus (154), bus adapter (194), extension bus (168), DMA controller (195) with DMA engine (197), an Ethernet adapter (172) connected to Gigabit Ethernet (174), a JTAG slave (176) connected to a JTAG master (178), a Point To Point Adapter (180) with +x, -x, +y, -y, +z, and -z links (181-186) to the point to point network (108), and a Global Combining Network Adapter (188) with parent and children links to the collective operations network (106).]
`
`
`
[FIG. 3A (Sheet 3 of 8): Exemplary Point To Point Adapter (180) of a compute node (152) with +x (181), -x (182), +y (183), -y (184), +z (185), and -z (186) links. FIG. 3B: Exemplary Global Combining Network Adapter (188) of a compute node (152) with links to a parent (192) and children (190).]
`
`
`
[FIG. 4 (Sheet 4 of 8): A parallel operations network (108) organized as a 'torus' or 'mesh'; dots represent compute nodes (102), each with +x, -x, +y, -y, +z, and -z links (181-186).]
`
`
`
[FIG. 5 (Sheet 5 of 8): A collective operations network (106) organized as a binary tree, with a physical root (202), ranks (250), branch nodes (204), leaf nodes, and links (103); dots represent compute nodes (102).]
`
`
`
[FIG. 6 (Sheet 6 of 8): Exemplary system for executing a distributed Java application. A compute node (600a) runs a job (158a) on a JVM (200) that includes a heap (610) with objects (612), a stack (614), JIT code (616), a JIT compiler (618), a JIT code executor (654), class loaders (620: primordial 622, extension 624, application 626), a class loader cache (634), class storage (636: method code 638, constant pool 640, field data 642, method block 644, static initializers 646), a monitor pool (648), and a garbage collector (650), along with a JIT compilation history (652a) and network monitor (201). Additional compute nodes (600b) and an I/O node (110) connect to a service node (116) running the application manager (125), which maintains the Java application (601), its jobs (158), a JIT profile (603), and the JIT compilation history (652), with data storage (118).]
`
`
`
[FIG. 7 (Sheet 7 of 8): Flow chart. Track, by an application manager, a JIT compilation history (652) for the JVMs installed on the plurality of compute nodes (700). Configure, by the application manager, the plurality of jobs (158) of the application (601) for execution on the plurality of compute nodes (600) in dependence upon the JIT compilation history (702), performing for at least one job of the Java application (704): select, for that job, one of the compute nodes on which to execute that job in dependence upon the JIT compilation history for the JVMs and at least one desired JIT level (707) for that job (706), and configure that job on the selected compute node (708) for execution (710).]
`
`
`
[FIG. 8 (Sheet 8 of 8): Flow chart. Record, by each JVM supporting at least one job of the Java application on the plurality of compute nodes, the JIT compilation history (652) for the jobs supported by that JVM (800). Provide, by each JVM, the recorded JIT compilation history to the application manager (125) (802).]
`
`
`
`
EXECUTING A DISTRIBUTED JAVA APPLICATION ON A PLURALITY OF COMPUTE NODES
`
BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The field of the invention is data processing, or, more specifically, methods, apparatus, and products for executing a distributed Java application on a plurality of compute nodes.

[0003] 2. Description of Related Art
`[0004] The development of the EDVAC computer system
`of 1948 is often cited as the beginning of the computer era.
`Since that time, computer systems have evolved into
`extremely complicated devices. Today's computers are much
`more sophisticated than early systems such as the EDVAC.
`Computer systems typically include a combination of hard(cid:173)
`ware and software components, application programs, oper(cid:173)
`ating systems, processors, buses, memory, input/output
`devices, and so on. As advances in semiconductor processing
`and computer architecture push the performance of the com(cid:173)
`puter higher and higher, more sophisticated computer soft(cid:173)
`ware has evolved to take advantage of the higher performance
`of the hardware, resulting in computer systems today that are
`much more powerful than just a few years ago.
`[0005] Parallel computing is an area of computer technol(cid:173)
`ogy that has experienced advances. Parallel computing is the
`simultaneous execution of the same task (split up and spe(cid:173)
`cially adapted) on multiple processors in order to obtain
`results faster. Parallel computing is based on the fact that the
`process of solving a problem usually can be divided into
`smaller tasks, which may be carried out simultaneously with
`some coordination.
`[0006] Parallel computers execute parallel algorithms. A
`parallel algorithm can be split up to be executed a piece at a
`time on many different processing devices, and then put back
`together again at the end to get a data processing result. Some
`algorithms are easy to divide up into pieces. Splitting up the
`job of checking all of the numbers from one to a hundred
`thousand to see which are primes could be done, for example,
`by assigning a subset of the numbers to each available pro(cid:173)
`cessor, and then putting the list of positive results back
`together. In this specification, the multiple processing devices
`that execute the individual pieces of a parallel program are
`referred to as 'compute nodes.' A parallel computer is com(cid:173)
`posed of compute nodes and other processing nodes as well,
`including, for example, input/output ('I/O') nodes, and ser(cid:173)
`vice nodes.
`[0007] Parallel algorithms are valuable because it is faster
`to perform some kinds oflarge computing tasks via a parallel
`algorithm than it is via a serial (non-parallel) algorithm,
`because of the way modern processors work. It is far more
`difficult to construct a computer with a single fast processor
`than one with many slow processors with the same through(cid:173)
`put. There are also certain theoretical limits to the potential
`speed of serial processors. On the other hand, every parallel
`algorithm has a serial part and so parallel algorithms have a
`saturation point. After that point adding more processors does
`not yield any more throughput but only increases the over(cid:173)
`head and cost.
`[0008] Parallel algorithms are designed also to optimize
`one more resource the data communications requirements
`among the nodes of a parallel computer. There are two ways
`parallel processors communicate, shared memory or message
`
`passing. Shared memory processing needs additional locking
`for the data and imposes the overhead of additional processor
`and bus cycles and also serializes some portion of the algo(cid:173)
`rithm.
`[0009] Message passing processing uses high-speed data
`communications networks and message buffers, but this com(cid:173)
`munication adds transfer overhead on the data communica(cid:173)
`tions networks as well as additional memory need for mes(cid:173)
`sage buffers and latency in the data communications among
`nodes. Designs of parallel computers use specially designed
`data communications links so that the communication over(cid:173)
`head will be small but it is the parallel algorithm that decides
`the volume of the traffic.
`[0010] Many data communications network architectures
`are used for message passing among nodes in parallel com(cid:173)
`puters. Compute nodes may be organized in a network as a
`'torus' or 'mesh,' for example. Also, compute nodes may be
`organized in a network as a tree. A torus network connects the
`nodes in a three-dimensional mesh with wrap around links.
`Every node is connected to its six neighbors through this torus
`network, and each node is addressed by its x,y,z coordinate in
`the mesh. A torus network lends itself to point to point opera(cid:173)
`tions. In a tree network, the nodes typically are connected into
`a binary tree: each node has a parent, and two children (al(cid:173)
`though some nodes may only have zero children or one child,
`depending on the hardware configuration). In computers that
`use a torus and a tree network, the two networks typically are
`implemented independently of one another, with separate
`routing circuits, separate physical links, and separate mes(cid:173)
`sage buffers. A tree network provides high bandwidth and low
`latency for certain collective operations, message passing
`operations where all compute nodes participate simulta(cid:173)
`neously, such as, for example, an allgather.
`[0011] The parallel applications that execute on the nodes
`in the data communications networks may be implemented in
`a variety of software programming languages, including the
`various versions and derivatives of Java™ technology pro(cid:173)
`mulgated by Sun Microsystems. Java applications generally
`run in a virtual execution environment called the Java Virtual
`Machine ('JVM'), rather than running directly on the com(cid:173)
`puter hardware. The Java application is typically compiled
`into byte-code form, and then compiled in a just-in-time
`(' JIT') manner, or on-the-fly, by the JVM into JIT code rep(cid:173)
`resenting hardware commands specific to the hardware plat(cid:173)
`form on which the NM is installed.
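
As context for the JIT compilation just described, the following minimal Java program (an illustration added here, not part of the disclosure) contains a method that is called often enough that a typical JIT-enabled JVM will translate its byte code into native code; on Sun/HotSpot JVMs the compilation events can usually be observed by launching with the -XX:+PrintCompilation option.

// JitDemo.java: a minimal sketch. Run with, for example:
//   java -XX:+PrintCompilation JitDemo
// (-XX:+PrintCompilation is a HotSpot diagnostic flag; other JVMs expose similar options.)
public class JitDemo {
    // A small method that becomes 'hot' after many invocations; the JVM's JIT
    // compiler translates its byte code into optimized native machine code.
    static long square(long x) {
        return x * x;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (long i = 0; i < 10000000L; i++) {
            sum += square(i);  // repeated calls trigger JIT compilation of square()
        }
        System.out.println("sum = " + sum);
    }
}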
`[0012]
`In a parallel computer, the Java application is gen(cid:173)
`erally a distributed application that is composed of multiple
`jobs; each job implemented using one or more Java classes.
`Because the jobs are typically designed in a modular fashion,
`each job may be utilized in more than one Java application.
`The NMs on the compute nodes of the parallel computer
`provide an execution environment for the jobs that make up a
`Java application. As the compute nodes execute the jobs, each
`compute node performs JIT compilation and optimization of
`the job supported by that compute node's JVM. Each time the
`Java application executes on the parallel computer, the com(cid:173)
`pute nodes generally have to recompile and re-optimize the
`byte code representing the jobs of the Java application.
`Because a particular job may be executed on a parallel com(cid:173)
`puter hundreds or thousands of times in one or more Java
`applications, valuable computing resources are often con(cid:173)
`sumed to repeatedly perform JIT compilation and optimiza-
`
`Netflix, Inc. - Ex. 1017, Page 000010
`
`
`
`US 2009/0271799 Al
`
`Oct. 29, 2009
`
`2
`
`tion for the same job regardless of whether that job is used
`repeatedly in the same Java application or different Java
`applications.
`
SUMMARY OF THE INVENTION

[0013] Methods, systems, and products are disclosed for executing a distributed Java application on a plurality of compute nodes. The Java application includes a plurality of jobs distributed among the plurality of compute nodes. The plurality of compute nodes are connected together for data communications through a data communication network. Each of the plurality of compute nodes has installed upon it a JVM capable of supporting at least one job of the Java application. Executing a distributed Java application on a plurality of compute nodes includes: tracking, by an application manager, a JIT compilation history for the JVMs installed on the plurality of compute nodes; and configuring, by the application manager, the plurality of jobs for execution on the plurality of compute nodes in dependence upon the JIT compilation history for the JVMs installed on the plurality of compute nodes.
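
A minimal sketch of the two steps named in this summary is shown below; the type and method names (ApplicationManager, trackJitCompilationHistory, configureJobs) are hypothetical illustrations and are not taken from the disclosure.

import java.util.List;

// Hypothetical interface sketching the two steps named in the summary.
interface ApplicationManager {
    // Step one: collect the JIT compilation history reported by the JVMs
    // installed on the compute nodes.
    void trackJitCompilationHistory(List<String> computeNodes);

    // Step two: place and configure the application's jobs on compute nodes
    // in dependence upon the tracked JIT compilation history.
    void configureJobs(List<String> jobs);
}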
`[0014] The foregoing and other objects, features and
`advantages of the invention will be apparent from the follow(cid:173)
`ing more particular descriptions of exemplary embodiments
`of the invention as illustrated in the accompanying drawings
`wherein like reference numbers generally represent like parts
`of exemplary embodiments of the invention.
`
BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 illustrates an exemplary system for executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.

[0016] FIG. 2 sets forth a block diagram of an exemplary compute node useful in a parallel computer capable of executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.

[0017] FIG. 3A illustrates an exemplary Point To Point Adapter useful in systems capable of executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.

[0018] FIG. 3B illustrates an exemplary Global Combining Network Adapter useful in systems capable of executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.

[0019] FIG. 4 sets forth a line drawing illustrating an exemplary data communications network optimized for point to point operations useful in systems capable of executing a distributed Java application on a plurality of compute nodes in accordance with embodiments of the present invention.

[0020] FIG. 5 sets forth a line drawing illustrating an exemplary data communications network optimized for collective operations useful in systems capable of executing a distributed Java application on a plurality of compute nodes in accordance with embodiments of the present invention.

[0021] FIG. 6 sets forth a block diagram illustrating an exemplary system useful in executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.

[0022] FIG. 7 sets forth a flow chart illustrating an exemplary method for executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.

[0023] FIG. 8 sets forth a flow chart illustrating a further exemplary method for executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention.
`
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

[0024] Exemplary methods, apparatus, and computer program products for executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention are described with reference to the accompanying drawings, beginning with FIG. 1. FIG. 1 illustrates an exemplary system for executing a distributed Java application on a plurality of compute nodes according to embodiments of the present invention. The system of FIG. 1 includes a parallel computer (100), non-volatile memory for the computer in the form of data storage device (118), an output device for the computer in the form of printer (120), and an input/output device for the computer in the form of computer terminal (122). Parallel computer (100) in the example of FIG. 1 includes a plurality of compute nodes (102).

[0025] The compute nodes (102) are coupled for data communications by several independent data communications networks including a Joint Test Action Group ('JTAG') network (104), a global combining network (106) which is optimized for collective operations, and a torus network (108) which is optimized for point to point operations. The global combining network (106) is a data communications network that includes data communications links connected to the compute nodes so as to organize the compute nodes as a tree. Each data communications network is implemented with data communications links among the compute nodes (102). The data communications links provide data communications for parallel operations among the compute nodes of the parallel computer. The links between compute nodes are bi-directional links that are typically implemented using two separate directional data communications paths.
`[0026]
`In addition, the compute nodes (102) of parallel
`computer are organized into at least one operational group
`(132) of compute nodes for collective parallel operations on
`parallel computer (100). An operational group of compute
`nodes is the set of compute nodes upon which a collective
`parallel operation executes. Collective operations are imple(cid:173)
`mented with data communications among the compute nodes
`of an operational group. Collective operations are those func(cid:173)
`tions that involve all the compute nodes of an operational
`group. A collective operation is an operation, a message(cid:173)
`passing computer program instruction that is executed simul(cid:173)
`taneously, that is, at approximately the same time, by all the
`compute nodes in an operational group of compute nodes.
`Such an operational group may include all the compute nodes
`in a parallel computer (100) or a subset all the compute nodes.
`Collective operations are often built around point to point
`operations. A collective operation requires that all processes
`on all compute nodes within an operational group call the
`same collective operation with matching arguments. A
`'broadcast' is an example of a collective operation for moving
`data among compute nodes of an operational group. A
`'reduce' operation is an example of a collective operation that
`executes arithmetic or logical functions on data distributed
`among the compute nodes of an operational group. An opera(cid:173)
`tional group may be implemented as, for example, an MPI
`'communicator.'
`[0027]
`'MPI' refers to 'Message Passing Interface,' a prior
`art parallel communications library, a module of computer
`program instructions for data communications on parallel
`
`Netflix, Inc. - Ex. 1017, Page 000011
`
`
`
`US 2009/0271799 Al
`
`Oct. 29, 2009
`
`3
`
`computers. Examples of prior-art parallel communications
`libraries that may be improved for use with systems according
`to embodiments of the present invention include MPI and the
`'Parallel Virtual Machine' ('PYM') library. PYM was devel(cid:173)
`oped by the University of Tennessee, The Oak Ridge National
`Laboratory, and Emory University. MPI is promulgated by
`the MPI Forum, an open group with representatives from
`many organizations that define and maintain the MPI stan(cid:173)
`dard. MPI at the time of this writing is a de facto standard for
`communication among compute nodes running a parallel pro(cid:173)
`gram on a distributed memory parallel computer. This speci(cid:173)
`fication sometimes uses MPI terminology for ease of expla(cid:173)
`nation, although the use of MPI as such is not a requirement
`or limitation of the present invention.
`[0028] Some collective operations have a single originating
`or receiving process running on a particular compute node in
`an operational group. For example, in a 'broadcast' collective
`operation, the process on the compute node that distributes
`the data to all the other compute nodes is an originating
`process. In a 'gather' operation, for example, the process on
`the compute node that received all the data from the other
`compute nodes is a receiving process. The compute node on
`which such an originating or receiving process runs is
`referred to as a logical root.
`[0029] Most collective operations are variations or combi(cid:173)
`nations of four basic operations: broadcast, gather, scatter,
`and reduce. The interfaces for these collective operations are
`defined in the MPI standards promulgated by the MPI F arum.
`Algorithms for executing collective operations, however, are
`not defined in the MPI standards. In a broadcast operation, all
`processes specify the same root process, whose buffer con(cid:173)
`tents will be sent. Processes other than the root specify receive
`buffers. After the operation, all buffers contain the message
`from the root process.
`[0030]
`In a scatter operation, the logical root divides data
`on the root into segments and distributes a different segment
`to each compute node in the operational group. In scatter
`operation, all processes typically specify the same receive
`count. The send arguments are only significant to the root
`process, whose buffer actually contains sendcount*N ele(cid:173)
`ments of a given data type, where N is the number of pro(cid:173)
`cesses in the given group of compute nodes. The send buffer
`is divided and dispersed to all processes (including the pro(cid:173)
`cess on the logical root). Each compute node is assigned a
`sequential identifier termed a 'rank.' After the operation, the
`root has sent sendcount data elements to each process in
`increasing rank order. Rank O receives the first sendcount data
`elements from the send buffer. Rank 1 receives the second
`sendcount data elements from the send buffer, and so on.
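
To make the rank-ordered division concrete, the following plain-Java sketch (added for illustration; it does not use a real MPI binding, and the names are hypothetical) computes the block of the root's send buffer that a process of a given rank receives:

class ScatterSketch {
    // rootSendBuffer holds sendcount * N elements; the process with the given
    // rank receives the rank-th block of sendcount elements, in increasing
    // rank order (rank 0 gets the first block, rank 1 the second, and so on).
    static int[] scatterBlock(int[] rootSendBuffer, int sendcount, int rank) {
        int[] recvBuffer = new int[sendcount];
        System.arraycopy(rootSendBuffer, rank * sendcount, recvBuffer, 0, sendcount);
        return recvBuffer;
    }
}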
`[0031] A gather operation is a many-to-one collective
`operation that is a complete reverse of the description of the
`scatter operation. That is, a gather is a many-to-one collective
`operation in which elements of a datatype are gathered from
`the ranked compute nodes into a receive buffer in a root node.
`[0032] A reduce operation is also a many-to-one collective
`operation that includes an arithmetic or logical function per(cid:173)
`formed on two data elements. All processes specify the same
`'count' and the same arithmetic or logical function. After the
`reduction, all processes have sent count data elements from
`computer node send buffers to the root process. In a reduction
`operation, data elements from corresponding send buffer
`locations are combined pair-wise by arithmetic or logical
`operations to yield a single corresponding element in the root
`process's receive buffer. Application specific reduction
`
`operations can be defined at runtime. Parallel communica(cid:173)
`tions libraries may support predefined operations. MPI, for
`example, provides the following pre-defined reduction opera(cid:173)
`tions:
`
MPI_MAX    maximum
MPI_MIN    minimum
MPI_SUM    sum
MPI_PROD   product
MPI_LAND   logical and
MPI_BAND   bitwise and
MPI_LOR    logical or
MPI_BOR    bitwise or
MPI_LXOR   logical exclusive or
MPI_BXOR   bitwise exclusive or
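
The pair-wise combination described above can be sketched in plain Java as follows (an illustration added here, not a real MPI binding; the sum stands in for any of the predefined operations listed above):

class ReduceSketch {
    // sendBuffers[p] is the send buffer of process p; every buffer holds 'count'
    // elements. Element i of the result is the reduction (here, a sum) of the
    // elements at index i across all processes, as delivered to the root.
    static long[] reduceSum(long[][] sendBuffers, int count) {
        long[] rootRecvBuffer = new long[count];
        for (long[] buffer : sendBuffers) {
            for (int i = 0; i < count; i++) {
                rootRecvBuffer[i] += buffer[i];  // pair-wise, element-by-element combination
            }
        }
        return rootRecvBuffer;
    }
}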
`
`[0033]
`In addition to compute nodes, the parallel computer
`(100) includes input/output ('I/0') nodes (110, 114) coupled
`to compute nodes (102) through the global combining net(cid:173)
`work (106). The compute nodes in the parallel computer
`(100) are partitioned into processing sets such that each com(cid:173)
`pute node in a processing set is connected for data commu(cid:173)
`nications to the same I/0 node. Each processing set, there(cid:173)
`fore, is composed of one I/0 node and a subset of compute
`nodes (102). The ratio between the number of compute nodes
`to the number of I/0 nodes in the entire system typically
`depends on the hardware configuration for the parallel com(cid:173)
`puter. For example, in some configurations, each processing
`set may be composed of eight compute nodes and one I/0
`node. In some other configurations, each processing set may
`be composed of sixty-four compute nodes and one I/0 node.
`Such example are for explanation only, however, and not for
`limitation. Each I/0 nodes provide I/0 services between com(cid:173)
`pute nodes (102) of its processing set and a set ofl/0 devices.
`In the example of FIG. 1, the I/0 nodes (110, 114) are con(cid:173)
`nected for data communications I/0 devices (118, 120, 122)
`through local area network ('LAN') (130) implemented using
`high-speed Ethernet.
`[0034] The parallel computer (100) of FIG. 1 also includes
`a service node (116) coupled to the compute nodes through
`one of the networks (104). Service node (116) provides ser(cid:173)
`vices common to pluralities of compute nodes, administering
`the configuration of compute nodes, loading programs into
`the compute nodes, starting program execution on the com(cid:173)
`pute nodes, retrieving results of program operations on the
`computer nodes, and so on. Service node (116) runs a service
`application (124) and communicates with users (128) through
`a service application interface (126) that runs on computer
`terminal (122).
`[0035]
`In the example of FIG. 1, the service node (116) has
`installed upon it an application manager (125). The applica(cid:173)
`tion manager (125) of FIG. 1 includes a set of computer
`program instructions capable of executing a distributed Java
`application on a plurality of compute nodes according to
`embodiments of the present invention. The Java application
`includes a plurality of jobs distributed among the plurality of
`compute nodes (102) for execution. The application manager
`(125) operates generally for executing a distributed Java
`application on a plurality of compute nodes according to
`embodiments of the present invention by: tracking a JIT com(cid:173)
`pilation history for the NMs (200) installed on the plurality
`of compute nodes (102); and configuring the plurality of jobs
`for execution on the plurality of compute nodes (102) in
`dependence upon the JIT compilation history for the NMs
`
`Netflix, Inc. - Ex. 1017, Page 000012
`
`
`
`US 2009/0271799 Al
`
`Oct. 29, 2009
`
`4
`
`(200) installed on the plurality of compute nodes (102).
`Although FIG. 1 illustrates the application manager (125)
`installed on a service node, readers will note that such an
`example is for explanation only and not for limitation. An
`application manager is a software component that may be
`installed on any compute nodes or other computer as will
`occur to those of skill in the art.
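
One way an application manager of the kind described above might use the tracked history when placing a job is sketched below; this is a hypothetical illustration only (the method selectNodeForJob and the map-based representation are not taken from the disclosure), showing a node being chosen because its JVM has already jitted the most of the job's methods at the desired JIT level:

import java.util.List;
import java.util.Map;

class NodeSelectionSketch {
    // history maps a node id to the methods its JVM has already jitted and the
    // JIT level reached for each. The node whose history best matches the job's
    // methods at the desired JIT level is selected, so that the job can reuse
    // prior JIT compilation work instead of recompiling from scratch.
    static String selectNodeForJob(List<String> jobMethods,
                                   String desiredJitLevel,
                                   Map<String, Map<String, String>> history) {
        String selectedNode = null;
        int bestMatches = -1;
        for (Map.Entry<String, Map<String, String>> node : history.entrySet()) {
            int matches = 0;
            for (String method : jobMethods) {
                if (desiredJitLevel.equals(node.getValue().get(method))) {
                    matches++;
                }
            }
            if (matches > bestMatches) {
                bestMatches = matches;
                selectedNode = node.getKey();
            }
        }
        return selectedNode;  // the job is then configured on this node for execution
    }
}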
`[0036] The JIT compilation history represents a historical
`record of the jitting performed by a particular NM. The terms
`'jit,' 'jitting,' or 'jitted' refer to a NM's process of translating
`byte code into native platform machine code executable on
`the platform's processor and optimizing the machine code for
`enhanced execution performance. The JIT compilation his(cid:173)
`tory typically specifies the portion of the jobs jitted by the
`JVM and the JIT level at which the jitting on those portions
`was performed. The job portions identified in the JIT compi(cid:173)
`lation history may be Java classes, methods, call sequences
`for methods, and so on. Accordingly, the JIT compilation
`history may include JIT levels for Java constructs processed
`on the plurality of compute nodes (102) or JIT levels for Java
`call sequences for particular Java methods processed on the
`plurality of compute nodes (102).
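
A single entry of such a history might be represented along the following lines; the field names here are hypothetical and added only to illustrate the kind of information the paragraph above describes:

// Illustrative sketch of one JIT compilation history entry.
class JitHistoryEntry {
    String nodeId;     // compute node whose JVM performed the jitting
    String jobId;      // job to which the jitted portion belongs
    String construct;  // the jitted portion: a class, method, or method call sequence
    String jitLevel;   // the JIT level at which that portion was compiled
}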
`[0037] A JIT level, also referred to as a' JIT mode,' specifies
`the type of JIT compiling and optimizations performed on a
`particular portion of a Java application. The number and type
`of JIT levels available for a JVM vary from one implementa(cid:173)
`tion to another. For one example, Sun Microsystems' JVM
`has two major JIT levels-client and server. In the client JIT
`level, minimal compilation and optimization is performed in
`an effort to reduce the startup time required for the application
`to begin executing. In the server JIT level, initial startup time
`is sacrificed, and extensive compilation and optimization is
`performed to maximize application performance when the
`application executes. Readers will note that these two JIT
`level are for example and explanation only and not for limi(cid:173)
`tations. Other JIT levels and other terms besides 'client' and
`'server' may be used to identify JIT level as will occur to those
`of skill in the art.
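
As a small illustration of how the two example levels above might be represented and requested for a job (hypothetical names; the mapping to a JVM's actual launch options, such as HotSpot's historical -client and -server flags, would be implementation-specific):

// Illustrative only: a startup-sensitive job requests the lighter 'client'
// level, while a long-running, compute-bound job requests 'server'.
enum JitLevel { CLIENT, SERVER }

class JitLevelChooser {
    static JitLevel desiredLevelFor(boolean startupSensitive) {
        return startupSensitive ? JitLevel.CLIENT : JitLevel.SERVER;
    }
}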
`[0038] Each compute node (102) of FIG. 1 has installed
`upon it a Java Virtual Machine ('JVM') (200) capable of
`supporting a Java application. Each NM (2



