(12) Patent Application Publication    (10) Pub. No.: US 2002/0133529 A1
(43) Pub. Date: Sep. 19, 2002
`
`Schmidt
`
`US 20020133529A1
`
`(54) METHOD AND APPARATUS FOR
`REPRESENTING AND ENCAPSULATING
`ACTIVE COMPUTING ENVIRONMENTS
`
(76) Inventor: Brian Keith Schmidt, Mountain View, CA (US)
`
`Correspondence Address:
`J. D. Harriman II
`COUDERT BROTHERS
`23rd Floor
`
`333 South Hope St.
`Los Angeles, CA 90071 (US)
`
(21) Appl. No.: 09/764,771

(22) Filed: Jan. 16, 2001

Publication Classification

(51) Int. Cl. ........ G06F 9/00
(52) U.S. Cl. ........ 709/102
`
(57) ABSTRACT

The present invention provides a representation and encapsulation of
active computing environments. In accordance with one or more
embodiments of the present invention, a “compute capsule” is
implemented. Each compute capsule serves to represent and encapsulate
an active computing environment. An active computing environment
comprises one or more active processes and their associated state
information. The associated state information is information in a form
that can be understood by any computer and tells the computer exactly
what the processes in the capsule are doing at any given time. In this
way, the compute capsule is a host-independent encapsulation that can
be suspended on one computer, moved to a new computer, and re-started
on the new computer where the new computer is binary compatible.
`
`
`
`
[Representative drawing: the flowchart of FIG. 2 (steps 200 through 250), reproduced on Sheet 2 of 16.]
`
`Google Exhibit 1009
`Google v. VirtaMove
`
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 1 of 16    US 2002/0133529 A1

[FIG. 1: block diagram. Legible labels: Compute Capsule, Operating System, page, swap, free, table.]
`
`
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 2 of 16    US 2002/0133529 A1

FIGURE 2

200  ADD ALL PROCESSES TO THE COMPUTE CAPSULE
210  ADD SYSTEM ENVIRONMENT INFORMATION TO THE COMPUTE CAPSULE
220  HAVE THE PROCESSES CHANGED?
230  MODIFY PROCESS INFORMATION IN THE COMPUTE CAPSULE
240  HAS THE SYSTEM ENVIRONMENT CHANGED?
250  MODIFY SYSTEM ENVIRONMENT INFORMATION IN THE COMPUTE CAPSULE
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 3 of 16    US 2002/0133529 A1

FIGURE 3A

300  ADD ALL OF A USER'S PROCESSES TO A CAPSULE
310  ADD ALL OF THE USER'S OPEN DEVICES TO THE CAPSULE
320  ADD ALL OF THE USER'S [illegible] TO THE CAPSULE
330  ADD ALL OF THE USER'S ENVIRONMENT INFORMATION TO THE CAPSULE
340  ADD ALL OF THE USER'S WORKING DIRECTORIES AND FILES TO THE CAPSULE
350  ADD ALL OF THE USER'S ASSIGNED RESOURCES TO THE CAPSULE
360  ADD ALL OF THE USER'S INSTALLED SOFTWARE TO THE CAPSULE
370  ADD ALL OF THE USER'S INTERNAL PROGRAM STATE TO THE CAPSULE
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 4 of 16    US 2002/0133529 A1

FIGURE 3B

START
380  ADD ALL OF A USER'S PROCESSES TO A CAPSULE
382  ADD ALL OF THE USER'S SYSTEM ENVIRONMENT TO THE CAPSULE
384  RE-PARTITION CPU STATE WHERE SOME OF THE STATE IS MOVED AWAY FROM
     THE OPERATING SYSTEM AND INTO THE COMPUTE CAPSULE
386  RE-PARTITION FILE SYSTEM STATE WHERE SOME OF THE STATE IS MOVED AWAY
     FROM THE OPERATING SYSTEM AND INTO THE COMPUTE CAPSULE
388  RE-PARTITION DEVICE STATE WHERE SOME OF THE STATE IS MOVED AWAY FROM
     THE OPERATING SYSTEM AND INTO THE COMPUTE CAPSULE
390  RE-PARTITION VIRTUAL MEMORY STATE WHERE SOME OF THE STATE IS MOVED
     AWAY FROM THE OPERATING SYSTEM AND INTO THE COMPUTE CAPSULE
392  RE-PARTITION IPC STATE WHERE SOME OF THE STATE IS MOVED AWAY FROM
     THE OPERATING SYSTEM AND INTO THE COMPUTE CAPSULE
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 5 of 16    US 2002/0133529 A1

FIGURE 4

[Flowchart. Box labels: START; INITIATE CAPSULE_CREATE SYSTEM CALL; OBTAIN USER NAME; OBTAIN INITIAL PROCESS TO EXECUTE; IS PRIVILEGED ACCESS NEEDED?; OBTAIN PRIVILEGE INFORMATION; INITIALIZE CAPSULE. Reference numerals on the sheet: 400, 410, 420, 430, 440, 450.]
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 6 of 16    US 2002/0133529 A1

FIGURE 5

500  CREATE FILE SYSTEM VIEW
510  INITIALIZE NAME TRANSLATION TABLES
520  INCLUDE INITIAL PROCESS
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 7 of 16    US 2002/0133529 A1

FIGURE 6

START
600  REGISTER CAPSULE WITH A WELL KNOWN DIRECTORY SERVICE
610  HAVE ALL MEMBERS OF THE CAPSULE EXITED?
620  FREE TRANSLATION TABLES
     REMOVE FILE SYSTEM VIEW
     REMOVE CAPSULE FROM DIRECTORY SERVICE
[Reference numerals 630 and 640 belong to the last two boxes; the pairing is not recoverable from the scan.]
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 8 of 16    US 2002/0133529 A1

FIGURE 7

700  WITHIN THE KERNEL DOES THE USER HAVE THE REQUIRED PERMISSIONS?
710  ALLOCATE COMPUTE CAPSULE STRUCTURE
720  INITIALIZE NAME TRANSLATION TABLES
730  ASSIGN UNIQUE NAME TO THE CAPSULE
740  MOVE INVOKING PROCESS INTO THE CAPSULE
750  REGISTER CAPSULE WITH WELL KNOWN DATABASE SERVICE
     ESTABLISH FILE SYSTEM [remainder of label and reference numeral not legible]
770  INVOKING PROCESS RESUMES USER LEVEL EXECUTION
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 9 of 16    US 2002/0133529 A1

FIGURE 8

800  EXECUTE CAPSULE_JOIN OPERATION
810  LOCATE TARGET CAPSULE
820  REQUEST THAT NEW PROCESS BE CREATED IN THE TARGET CAPSULE
830  DOES REQUESTOR HAVE ACCESS RIGHTS?
840  CAPSULE_JOIN OPERATION FAILS
850  INSTANTIATE DESIRED APPLICATION IN TARGET CAPSULE
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 10 of 16    US 2002/0133529 A1

FIGURE 9

900  LOOK UP NAME OF CAPSULE IN DIRECTORY SERVICE
910  INITIATE CAPSULE_JOIN FUNCTION CALL ON TARGET HOST
920  DOES THE USER HAVE THE REQUIRED PERMISSIONS?
930  SEVER TIES WITH FORMER CAPSULE
940  BECOME CHILD OF THE INIT PROCESS
`
`
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 11 of 16    US 2002/0133529 A1

FIGURE 10

[Block diagram. Legible labels: PROCESSOR, MASS STORAGE, SERVER, NETWORK LINK. Legible reference numerals: 1012, 1021, 1026, 1043.]
`
`
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 12 of 16    US 2002/0133529 A1

FIGURE 11

[Diagram. Legible labels: CENTRAL SERVER INSTALLATION, END USER HARDWARE.]
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 13 of 16    US 2002/0133529 A1

FIGURE 12

[Diagram; no labels legible in the scan.]
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 14 of 16    US 2002/0133529 A1

FIGURE 13 (HID)

[Block diagram. Legible labels: Flash Memory, Embedded Processor, Smartcard Interface, Video Controller, Video Decoder, Video Encoder, PCI Bus, Control Block, USB Controller, Audio Codec, Network. Reference numerals are not legible.]
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 15 of 16    US 2002/0133529 A1

FIGURE 14 (HID Single Chip Implementation)

[Block diagram. Legible labels: Interconnect Interface, Internal Bus Controller, Graphics Renderer, Video Controller/Interface, Sound, Memory, Controller. Partly legible reference numerals: 1502, 1503, 1504, 1505, 1506.]
`
`
`
Patent Application Publication    Sep. 19, 2002  Sheet 16 of 16    US 2002/0133529 A1

FIGURE 15

1500  PROVIDE ACL HAVING ONLY CAPSULE OWNER AS MEMBER
1520  INVOKE ONE OR MORE CAPSULE_ACL SYSTEM CALLS
1530  HAS A CAPSULE_JOIN CALL BEEN MADE?
1540  HAS THE CALL BEEN MADE BY A MEMBER OF THE ACL?
1550  GRANT ACCESS TO THE CAPSULE
1560  DENY ACCESS TO THE CAPSULE
`
`
`
`
US 2002/0133529 A1
`
`Sep. 19, 2002
`
`METHOD AND APPARATUS FOR REPRESENTING
`AND ENCAPSULATING ACTIVE COMPUTING
`ENVIRONMENTS
`
BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to the representation
and encapsulation of active computing environments.
`
[0003] Portions of the disclosure of this patent document contain
material that is subject to copyright protection. The copyright owner
has no objection to the facsimile reproduction by anyone of the patent
document or the patent disclosure as it appears in the Patent and
Trademark Office file or records, but otherwise reserves all copyright
rights whatsoever.
`
[0004] 2. Background Art

[0005] In modern computing it is desirable for a user to be
interacting with a computer, to stop the interaction with the
computer, to move to a new computer, and to begin interacting with the
new computer at precisely the point where the user stopped interacting
with the first computer. Using current schemes, however, this is not
possible because the user’s computing environment cannot be
represented in a form that can be understood by both computers and
moved between the computers. Before further discussing the drawbacks
of current schemes, it is instructive to discuss how the nature of
computing is changing.
`
The Nature of Computing

[0006] The nature of computing is changing. Until recently, modern
computing was mostly “machine-centric”, where a user accessed a
dedicated computer at a single location. The dedicated computer had
all the data and computer programs necessary for the user to operate
the computer, and ideally, it had large amounts of hardware, such as
disk drives, memory, processors, and the like. With the advent of
computer networks, however, different computers have become more
desirable and the focus of computing has become “service-oriented”. In
particular, computer networks allow a user to access data and computer
programs that exist elsewhere in the network. When the user accesses
such data or computer programs, the remote computer is said to be
providing a service to the user. With the improvement in services
available to users, the need to have a dedicated computer following
the machine-centric paradigm is greatly reduced. The machine-centric
paradigm also becomes much less practical in this environment because
distributing services is much more cost-effective.
`
[0007] In particular, computers in a service-oriented environment
have little need for powerful hardware. For instance, the remote
computer processes the instructions before providing the service, so a
powerful processor is not needed on the local access hardware.
Similarly, since the service is providing the data, there is little
need to have large capacity disk drives on the local access hardware.
In such an environment, one advantage is that computer systems have
been implemented that allow a user to access any computer in the
system and still use the computer in the same manner (i.e., have
access to the same data and computer programs).
`
[0008] For instance, a user may be in location A and running a word
processor, a web browser, and an interactive multimedia simulation. In
a service-oriented environment, the user might stop using the computer
in location A and move to location B where the user could resume these
computer programs on a different machine at the exact point where the
user stopped using the machine at location A, as long as both
computers had access via the computer network to the servers where the
programs were being executed. The programs in this example, however,
cannot be moved between computers when they are active because of the
design of current operating systems.
`
Current Operating Systems

[0009] It would be beneficial to move the active processes in the
above example for various reasons. One reason occurs if the servers
crashed or were shut down for maintenance or upgrades. In this
situation, if the active processes cannot be moved, the user has no
access to the processes while the server is off-line. Also, the
processes and their state will be lost, possibly resulting in loss of
critical data or effort.
`
`[0010] Another reason to move the processes might be if
`one of the servers became very busy. In this scenario, it
`might allow the user a better computing experience if the
`active processes were dynamically load balanced. For
`instance, if the resources on a server became scarce (the
`processor and memory, for instance) one of the processes
`could be moved to a less busy machine and resumed there
`where there is more processor capacity and memory avail-
`able.
`
[0011] Sometimes it is useful to suspend active processes in a
persistent storage so that they release scarce system resources. For
instance, if the server running the word processor, web browser, and
multimedia simulation is overloaded, it would be beneficial to have
the ability to suspend the non-critical processes and store them in a
non-volatile storage medium, such as a disk, so that later they can be
retrieved and restarted on the same or a different computer system.
Consider also the scenario where there are many clients who have
initiated long-running jobs on the server. When those users
disconnect, their sessions become idle, but the sessions remain on the
server, which reduces the number of sessions that can reside on the
server. By storing these idle sessions on disk, scalability is
improved (i.e., the server can host more active sessions).
`
[0012] In addition, it would be beneficial to migrate a group of
active computer programs (i.e., a session) to be closer to a user. For
instance, if a user working in Los Angeles traveled to New York, it is
beneficial to migrate that session to a server closer to New York for
improved response time.
`
[0013] Using current operating systems, however, the critical state
of an active computation is inaccessible, which makes it impossible to
stop the active processes and transport them to new machines. The
critical state is located in the kernel. The kernel acts as a mediator
between the user’s computer programs and the computer hardware. The
kernel, among other things, performs memory management for all of the
running processes and makes sure that they all get a share of the
processor. The critical state of a process is dispersed among various
kernel data structures, and there is no facility for extracting it,
storing it off-line, or re-creating it from an intermediate
representation. For example, the complete state of a Unix pipe cannot
be precisely determined by user-level routines. The identities of the
endpoints and any in-flight data are known only by the kernel, and
there is no mechanism to duplicate a pipe only from its description.
`
[0014] Thus, there is no mechanism for encapsulating active
computations such that they can be separated from the kernel. As such,
each active process is locked to a specific machine, which binds its
existence to the transient lifetime of the underlying operating
system.
`
`SUMMARY OF THE INVENTION
`
[0015] The present invention provides a method and apparatus that
represent and encapsulate active computing environments. According to
one or more embodiments of the present invention, a “compute capsule”
is implemented. Each compute capsule serves to represent and
encapsulate an active computing environment. An active computing
environment comprises one or more active processes and the complete
state necessary to allow the encapsulation to be suspended and revived
on any binary compatible machine.
`
[0016] The state information is information in a host-independent
form that tells a computer exactly what each of the processes is doing
at any given time. This may include privileges, configuration
settings, working directories and files, assigned resources, open
devices, installed software, and internal program state.
`
[0017] To encapsulate the state necessary to move a group of
processes between machines, one embodiment virtualizes the application
interface to the operating system and repartitions state ownership so
that all host-dependent information is moved out of the kernel and
into the compute capsules. In one embodiment, the repartitioning
comprises moving one or more members of the CPU state, the file system
state, the device state, the virtual memory state, and the
inter-process communication (IPC) state into the capsule.
`
[0018] In one embodiment, capsules are created using a
capsule_create system call. In another embodiment, a process may join
a capsule by invoking a capsule_join system call. A compute capsule is
entirely self-contained and can be suspended in secondary storage,
arbitrarily bound to different machines and different operating
systems, and transparently resumed.
`
BRIEF DESCRIPTION OF THE DRAWINGS

[0019] These and other features, aspects and advantages of the
present invention will become better understood with regard to the
following description, appended claims and accompanying drawings
where:

[0020] FIG. 1 is a block diagram showing the re-partitioning of
functionality between the operating system and the compute capsule.

[0021] FIG. 2 is a flowchart describing how compute capsules are
created and maintained according to an embodiment of the present
invention.

[0022] FIG. 3A illustrates the capsule creation process according to
an embodiment of the present invention.

[0023] FIG. 3B illustrates the capsule creation process according to
another embodiment of the present invention.

[0024] FIG. 4 is a flowchart showing an embodiment of the present
invention that uses a capsule_create system call.

[0025] FIG. 5 is a flowchart showing capsule initialization
according to an embodiment of the present invention.

[0026] FIG. 6 is a flowchart showing how a capsule is maintained and
removed from the system according to an embodiment of the present
invention.

[0027] FIG. 7 is a flowchart showing how a capsule is created
according to an embodiment of the present invention where some of the
functionality remains in the kernel.

[0028] FIG. 8 illustrates the capsule join process according to an
embodiment of the present invention.

[0029] FIG. 9 illustrates the capsule join process according to
another embodiment of the present invention.

[0030] FIG. 10 is an embodiment of a computer execution environment
in which one or more embodiments of the present invention can be
implemented.

[0031] FIG. 11 shows an example of a thin client topology called a
virtual desktop system architecture.

[0032] FIG. 12 displays the partitioning of the functionality of the
virtual desktop system architecture.

[0033] FIG. 13 is a block diagram of an example embodiment of a
human interface device.
`
`[0034] FIG. 14 is a block diagram of a single chip
`implementation of a human interface device.
`
`[0035] FIG. 15 is a flowchart showing the manner in
`which one embodiment of the present invention controls
`access to compute capsules.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
[0036] The invention relates to the representation and encapsulation
of active computing environments. In the following description,
numerous specific details are set forth to provide a more thorough
description of embodiments of the invention. It is apparent, however,
to one skilled in the art, that the invention may be practiced without
these specific details. In other instances, well known features have
not been described in detail so as not to obscure the invention.
`
`Compute Capsules
`
[0037] A compute capsule comprises one or more processes and their
associated system environment. A compute capsule is configured to
provide an encapsulated form that is capable of being moved between
computers or stored off-line, for instance on a disk drive or other
non-volatile storage medium. The system environment in a capsule
comprises state information relating to exactly what the processes are
doing at any given time in a form that is understandable by any binary
compatible machine. System environment information may include, for
instance, privileges, configuration settings, working directories and
files, assigned resources, open devices, installed software, and
internal program state.
`
[0038] Processes in the same capsule may communicate with each other
and share data via standard IPC mechanisms, for instance using pipes,
shared memory, or signals. Communication with processes outside the
capsule, on the other hand, is restricted to Internet sockets and
globally shared files. This ensures that capsules can move without
restriction. For example, a pipe between processes in different
capsules would force both capsules to reside on the same machine, but
a socket can be redirected. The use of compute capsules is completely
transparent, and applications need not take any special measures, such
as source code modification, re-compilation, or linking with special
libraries. In addition, a system using compute capsules can
seamlessly inter-operate with systems that do not.
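By way of illustration only, the communication rule of paragraph [0038] might be sketched in Python as follows. The Channel enumeration, the channel_allowed function, and the use of plain strings as capsule names are assumptions made for this example and are not part of the disclosure.

```python
from enum import Enum


class Channel(Enum):
    PIPE = "pipe"
    SHARED_MEMORY = "shared_memory"
    SIGNAL = "signal"
    INTERNET_SOCKET = "internet_socket"
    GLOBAL_FILE = "global_file"


# Channels that stay usable across capsule boundaries, per paragraph [0038].
CROSS_CAPSULE_CHANNELS = {Channel.INTERNET_SOCKET, Channel.GLOBAL_FILE}


def channel_allowed(src_capsule: str, dst_capsule: str, channel: Channel) -> bool:
    """Return True if two processes may communicate over `channel`."""
    if src_capsule == dst_capsule:
        return True  # standard IPC is unrestricted inside one capsule
    return channel in CROSS_CAPSULE_CHANNELS
```

Under this rule a pipe between two capsules is refused, so neither capsule is pinned to the other's host, while a socket between them remains permitted because it can be redirected.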
`
`Re-Partitioning the Operating System
`
[0039] To provide such functionality, the traditional operating
system is re-partitioned as shown in FIG. 1 so that all host-dependent
and personalized elements of the computing environment are moved into
the capsule 100, while leveraging policies and management of the
shared underlying system 105. The computing environment comprises CPU
110, file system 115, devices 120, virtual memory 125, and IPC 130.
Each of these components of the computing environment has been
partitioned as indicated by the curved line 135.
`
[0040] The state of the CPU scheduler 140 is left in the operating
system 105. This state comprises information that the operating system
maintains so that it knows which processes may run, where they are,
what priority they have, how much processor time they will be granted,
etc. Process state 145, which is moved to the compute capsule 100, has
process-specific information, such as the values in the registers, the
signal handlers registered, parent/child relationships, access rights,
and file tables. The file system 115 leaves local files 150 that are
identically available on all machines (e.g., /usr/bin or /man on a
UNIX system) in the operating system 105. The file system 115 further
leaves disk blocks 152 outside the capsule, which are caches of disk
blocks that are read into the system and can be used later when they
need to be read again. The disk structure 154 is also left outside the
capsule. The disk structure is specific to an operating system and
serves as a cache of where files are located on the disk (i.e., a
mapping of pathnames to file locations). Network file system (NFS) is
a protocol for accessing files on remote systems. The operating system
maintains information 156 with respect to the NFS and a cache 158,
which is a cache of files the operating system has retrieved from
remote servers and stored locally. Similar state is maintained for
other network based file systems.
`
[0041] What has been partitioned away from the operating system is
the file state 160. The file state 160 is moved to the capsule 100.
The file state 160 is the state of a file that some process in the
capsule has opened. File state 160 includes, for instance, the name of
the file and where the process is currently accessing the file. If the
file is not accessible via the network (e.g., stored on a local disk),
then its contents are placed in the capsule.
`
[0042] Devices 120 are components that are attached to the computer.
For each device there is a driver that maintains the state of the
device. The disk state 165 remains in the operating system 105. The
other device components are specific to a log-in session and are moved
to the capsule 100. The other devices include a graphics controller
state 170, which is the content that is being displayed on the screen,
for instance the contents of a frame buffer that holds color values
for each pixel on a display device, such as a monitor.
`
[0043] Keyboard state 172 and mouse state 175 include the state
associated with the user’s current interaction with the keyboard, for
instance whether caps lock is on or off, and with the screen, for
instance where the pointer is currently located. Tty state 174
includes information associated with the terminals the user is
accessing, for instance if a user opens an X window on a UNIX system
or if a user telnets or performs an rlogin. Tty state 174 also
includes information about what the cursor looks like, what types of
fonts are displayed in the terminals, and what filters should be
applied to make the text appear a certain way, for instance.
`
[0044] Virtual memory 125 has state associated with it. The capsule
tracks the state associated with changes made from within the capsule,
which are termed read/write pages 176. Read-only pages 178 remain
outside the capsule. However, in one embodiment read-only pages 178
are moved to the capsule as well, which is useful in some scenarios.
For instance, certain commands a user would expect to find on a new
machine when the user’s capsule migrates there may not be available.
Take, for instance, a command such as ls or more on a UNIX system.
Those read-only pages may not be necessary to bring into the capsule
when it is migrating between UNIX machines, because those pages exist
on every UNIX machine. If, however, a user is moving to a machine that
does not use those commands, it is useful to move those read-only
pages into the capsule as well. The swap table 180, which records what
virtual memory pages have been replaced and moved to disk, remains
outside the capsule, as do the free list 182 (which is a list of empty
virtual memory pages) and the page table 184.
`
[0045] Nearly all IPC 130 is moved into the capsule. This includes
shared memory 186, which comprises a portion of memory that multiple
processes may be using, pipes 188, fifos 190, and signals 192,
including handler lists and the state needed to know what handler the
process was using and to find the handler. Virtual interface and
access control 194 is useful for separating the capsule from
host-dependent information that is specific to a machine, such as the
structure of internal program state or the IDs for its resources. The
interface 194 refers generally to the virtualized naming of resources
and translations between virtual resource names and physical
resources, as well as lists that control access by processes trying to
access capsules.
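The translation between virtual resource names and physical resources described for interface 194 can be pictured with a small table. The following Python sketch is illustrative only: the class name NameTranslationTable, its methods, and the use of integers as resource identifiers are assumptions for this example, not details taken from the disclosure.

```python
class NameTranslationTable:
    """Maps capsule-local (virtual) resource names to host (physical) ones."""

    def __init__(self):
        self._virt_to_phys = {}
        self._next_virtual = 1

    def bind(self, physical_id):
        """Assign a fresh virtual name to a host resource and return it."""
        virtual_id = self._next_virtual
        self._next_virtual += 1
        self._virt_to_phys[virtual_id] = physical_id
        return virtual_id

    def resolve(self, virtual_id):
        """Translate a virtual name to the current host identifier."""
        return self._virt_to_phys[virtual_id]

    def rebind(self, virtual_id, new_physical_id):
        """After a move, point an existing virtual name at a new host ID."""
        if virtual_id not in self._virt_to_phys:
            raise KeyError(virtual_id)
        self._virt_to_phys[virtual_id] = new_physical_id
```

In this sketch, processes inside the capsule only ever hold virtual names, so a change of host requires rewriting the table rather than the processes.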
`
[0046] Thus, capsule state includes data that are host-specific,
cached on the local machine to which the capsule is bound, or not
otherwise globally accessible. This includes the following
information:

[0047] Capsule State: Name translation tables, access control list,
owner ID, capsule name, etc.;
`
`[0048] Processes: Tree structure, process control block,
`machine context, thread contexts, scheduling parameters,
`etc.;
`
`[0049] Address Space Contents: Read/write pages of vir-
`tual memory; because they are available in the file system,
`contents of read-only files mapped into the address space
`(e.g., the application binary and libraries) are not included
`unless explicitly requested;
`
[0050] Open File State: Only file names, permissions, offsets, etc.
are required for objects available in the global file system. However,
the contents of personal files in local storage (e.g., /tmp) must be
included. Because the pathname of a file is discarded after it is
opened, for each process one embodiment of the invention maintains a
hash table that maps file descriptors to their corresponding
pathnames. In addition, some open files have no pathname (i.e., if an
unlink operation has been performed). The contents of such files are
included in the capsule as well;
`
[0051] IPC Channels: IPC state has been problematic in most prior
systems. The present invention adds a new interface to the kernel
modules for each form of IPC. This interface includes two
complementary elements: export current state, and import state to
re-create channel. For example, the pipe/fifo module is modified to
export the list of processes attached to a pipe, its current mode, the
list of filter modules it employs, file system mount points, and
in-flight data. When given this state data, the system can
re-establish an identical pipe;
`
[0052] Open Devices: By adding a state import/export interface
similar to that used for IPC, the invention supports the most commonly
used devices: keyboard, mouse, graphics controller, and
pseudo-terminals. The mouse and keyboard have very little state,
mostly the location of the cursor and the state of the LEDs (e.g.,
caps lock). The graphics controller is more complex. The video mode
(e.g., resolution and refresh rate) and the contents of the frame
buffer must be recorded, along with any color tables or other
specialized hardware settings. Supporting migration between machines
with different graphics controllers is troublesome, but a standard
remote display interface can address that issue. Pseudo-terminal state
includes the controlling process, control settings, a list of streams
modules that have been pushed onto it, and any unprocessed data.
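The paired export and import elements described above for IPC channels and devices can be illustrated with a user-level toy. This Python sketch is not the kernel interface of the disclosure: the Pipe and PipeState classes and their field names are hypothetical, and only three of the listed items (attached processes, mode, in-flight data) are modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipeState:
    """Host-independent description of a pipe (field names are illustrative)."""
    attached_processes: List[int]
    mode: str
    in_flight: bytes = b""


class Pipe:
    def __init__(self, attached_processes, mode="rw"):
        self.attached_processes = list(attached_processes)
        self.mode = mode
        self.buffer = bytearray()  # data written but not yet read

    def write(self, data: bytes) -> None:
        self.buffer.extend(data)

    def export_state(self) -> PipeState:
        """Export current state, including in-flight data."""
        return PipeState(list(self.attached_processes), self.mode, bytes(self.buffer))

    @classmethod
    def import_state(cls, state: PipeState) -> "Pipe":
        """Re-create an equivalent pipe from an exported description."""
        pipe = cls(state.attached_processes, state.mode)
        pipe.buffer.extend(state.in_flight)
        return pipe
```

A device driver could expose the same two operations; for a keyboard, for example, the exported record would carry little more than the LED state.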
`
[0053] Capsules do not include shared resources or the state
necessary to manage them (e.g., the processor scheduler, page tables),
state for kernel optimizations (e.g., disk caches), local file system,
physical resources (e.g., the network), etc.
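The rule for open file state in paragraph [0050] (metadata only for globally available files, full contents for local or unlinked files) can be sketched as follows. This is a minimal Python illustration; the OpenFile record, the globally_available flag, and the read_contents callback are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class OpenFile:
    path: Optional[str]        # None once the file has been unlinked
    offset: int
    globally_available: bool   # hypothetical flag: reachable from every host


def open_file_record(f: OpenFile, read_contents: Callable[[OpenFile], bytes]) -> dict:
    """Build the capsule record for one open file."""
    record = {"path": f.path, "offset": f.offset}
    # Local or unlinked files cannot be re-opened elsewhere, so carry contents.
    if f.path is None or not f.globally_available:
        record["contents"] = read_contents(f)
    return record


def capture_open_files(fd_table: Dict[int, OpenFile],
                       read_contents: Callable[[OpenFile], bytes]) -> Dict[int, dict]:
    """Map each file descriptor to its record (the per-process table)."""
    return {fd: open_file_record(f, read_contents) for fd, f in fd_table.items()}
```

The dictionary keyed by file descriptor plays the role of the per-process hash table from descriptors to pathnames mentioned in paragraph [0050].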
`
`Capsule Operation
`
[0054] Once the operating system is re-partitioned, one embodiment
of the present invention creates and maintains a compute capsule as
shown in the flowchart of FIG. 2. At step 200, all of a user’s
processes are added to the capsule. Next, at step 210, all of the
user’s system environment is added to the capsule. Thereafter, at step
220, it is determined whether a user has initiated new processes or if
some of the user’s processes have terminated. If so, the process
information in the capsule is modified at step 230. Next, it is
determined at step 240 whether the user’s system environment has
changed. If it has not, the process repeats at step 220. If it has,
the system environment information in the capsule is updated at step
250 and the process repeats at step 220.
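The create-and-maintain flow of FIG. 2 can be sketched as a pair of Python functions. This is an illustrative model only: representing a capsule as a dictionary, processes as a set of identifiers, and the system environment as a dictionary are assumptions of the example.

```python
def create_capsule(processes: set, environment: dict) -> dict:
    """Steps 200 and 210: seed the capsule with processes and environment."""
    return {"processes": set(processes), "environment": dict(environment)}


def sync_capsule(capsule: dict, live_processes: set, live_environment: dict) -> bool:
    """One pass over steps 220 to 250; returns True if the capsule changed."""
    changed = False
    if capsule["processes"] != live_processes:           # step 220
        capsule["processes"] = set(live_processes)       # step 230
        changed = True
    if capsule["environment"] != live_environment:       # step 240
        capsule["environment"] = dict(live_environment)  # step 250
        changed = True
    return changed
```

Calling sync_capsule repeatedly corresponds to the loop back to step 220; how often it runs, or whether it is driven by events instead of polling, is left open here.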
`
[0055] Another embodiment of the present invention is shown in FIG.
3A. At step 300, all of a user’s processes are added to the capsule.
Next, at step 310, all privileges are added to the capsule. Then, at
step 320, all open devices are added to the capsule. After that, all
configuration settings are added to the capsule at step 330. Next, all
working directories and files are added to the capsule at step 340.
Then, all assigned resources are added to the capsule at step 350.
Next, all installed software is added to the capsule at step 360.
Finally, all internal program state is added to the capsule at step
370. Note that FIG. 3A is for the purpose of providing an example of
what may be added to a capsule and is not an exhaustive list.
`
[0056] Another embodiment of the present invention is shown in FIG.
3B. At step 380, all of a user’s processes are added to the capsule.
Next, at step 382, all of the user’s system environment is added to
the capsule. Then, at step 384, CPU state is re-partitioned, where
some of the state is moved away from the operating system and into the
compute capsule, which may include process state. Next, at step 386,
file system state is re-partitioned, where some of the state is moved
away from the operating system and into the compute capsule, which may
include file state. After that, at step 388, device state is
re-partitioned, where some of the state is moved away from the
operating system and into the compute capsule, which may include
keyboard state, tty (i.e., pseudo-terminal) state, mouse state, and
graphics state. Then, at step 390, virtual memory state is
re-partitioned, where some of the state is moved away from the
operating system and into the compute capsule, which may include
read/write pages and optionally read-only pages. Thereafter, IPC state
is re-partitioned at step 392, where some of the state is moved away
from the operating system and into the compute capsule, which may
include shared memory, pipes, fifos, and signals.
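The re-partitioning of steps 384 through 392 amounts to splitting each subsystem's state into a part that stays with the operating system and a part that moves into the capsule. The Python sketch below illustrates that split; the key names are labels chosen for this example (they follow the items named in the text but are not identifiers from the disclosure), and read-only pages are left on the operating-system side because the text describes their inclusion as optional.

```python
# Which items of each subsystem move into the capsule in this sketch.
CAPSULE_STATE = {
    "cpu": {"process_state"},
    "file_system": {"file_state"},
    "devices": {"keyboard", "tty", "mouse", "graphics_controller"},
    "virtual_memory": {"read_write_pages"},
    "ipc": {"shared_memory", "pipes", "fifos", "signals"},
}


def repartition(host_state: dict) -> tuple:
    """Split {subsystem: {item: value}} into (os_part, capsule_part)."""
    os_part, capsule_part = {}, {}
    for subsystem, items in host_state.items():
        moved = CAPSULE_STATE.get(subsystem, set())
        os_part[subsystem] = {k: v for k, v in items.items() if k not in moved}
        capsule_part[subsystem] = {k: v for k, v in items.items() if k in moved}
    return os_part, capsule_part
```

Adding "read_only_pages" to the virtual memory entry would model the embodiment in which read-only pages travel with the capsule.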
`
[0057] In operation, one embodiment of the invention defines several
new operating system interface routines. These new routines are
defined in Table 1.

TABLE 1

Name                 Description

capsule_create       Create a new capsule and install a process.
capsule_join         Move a process into an existing capsule.
capsule_acl          Manage the access control list for a capsule.
capsule_checkpoint   Suspend a capsule and record its state.
capsule_restart      Restart a capsule from its recorded state.
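To show how the five routines of Table 1 relate to one another, the following Python sketch models them as methods of an in-memory registry. It is a toy, not the system-call interface: every signature here is hypothetical, the initial access control list containing only the owner follows FIG. 15, and the restriction that only the owner may edit the list is an assumption of this example.

```python
import copy


class CapsuleRegistry:
    """Toy in-memory stand-in for the five routines named in Table 1."""

    def __init__(self):
        self.capsules = {}
        self._next_id = 1

    def capsule_create(self, owner: str, process: int) -> str:
        """Create a new capsule and install a process; returns its name."""
        name = f"capsule-{self._next_id}"
        self._next_id += 1
        self.capsules[name] = {"owner": owner, "acl": {owner}, "processes": {process}}
        return name

    def capsule_acl(self, name: str, caller: str, add=(), remove=()) -> None:
        """Manage the access control list (owner-only in this sketch)."""
        capsule = self.capsules[name]
        if caller != capsule["owner"]:
            raise PermissionError("only the owner may edit the ACL in this sketch")
        capsule["acl"] |= set(add)
        capsule["acl"] -= set(remove)

    def capsule_join(self, name: str, caller: str, process: int) -> None:
        """Move a process into an existing capsule if the caller is on the ACL."""
        capsule = self.capsules[name]
        if caller not in capsule["acl"]:
            raise PermissionError("caller is not on the capsule's ACL")
        capsule["processes"].add(process)

    def capsule_checkpoint(self, name: str) -> dict:
        """Suspend the capsule (remove it) and return its recorded state."""
        recorded = copy.deepcopy(self.capsules.pop(name))
        recorded["name"] = name
        return recorded

    def capsule_restart(self, recorded: dict) -> str:
        """Restart a capsule from its recorded state; returns its name."""
        state = copy.deepcopy(recorded)
        name = state.pop("name")
        self.capsules[name] = state
        return name
```

Because capsule_checkpoint returns a self-contained record, the record can be handed to a different registry instance, which stands in for restarting the capsule on another machine.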
`
`Creating a Capsule
`
[0058] In one embodiment, an application may instantiate a new
capsule or join an existing capsule via two new system calls,
capsule_create and capsule_join, respectively. Typically, a user
performs these operations via standard login facilities, which are
modified to capture entire user sessions as capsules. To ensure proper
use, the capsule_create an