throbber
(12) United States Patent
`US 6,795,966 B1
`(10) Patent N0.:
`Lim et al.
`
`(45) Date of Patent: Sep. 21, 2004
`
`US006795966B1
`
`(54) MECHANISM FOR RESTORING, PORTING,
`REPLICATING AND CHECKPOINTING
`COMPUTER SYSTEMS USING STATE
`EXTRACTION
`
`(75)
`
`Inventors: Beng-Hong Lim, Menlo Park, CA
`(US); Edouard Bugnion, Mountain
`View, CA (US); Scott W. Devine,
`Redwood City, CA (US)
`
`(73) Assignee: VMWare, Inc., Palo Alto, CA (US)
`
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 09/497,978
`
`(22)
`
`Filed:
`
`Feb. 4, 2000
`
`Related US. Application Data
`
`(63)
`
`(60)
`
`Continuation—in—part of application No. 09/151,175, filed on
`Sep. 10, 1998, now Pat. No. 6,496,847.
`Provisional application No. 60/118,862, filed on Feb. 5,
`1999, and provisional application No. 60/085,685, filed on
`Feb. 15, 1998.
`
`Int. Cl.7 ................................................ G06F 9/455
`(51)
`
`.. 718/1; 718/100
`(52) US. Cl.
`.....
`(58) Field of Search .......................... 709/1; 718/1, 100
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`5,386,552 A
`5,452,462 A *
`5,634,096 A
`5,715,464 A
`5,758,174 A
`5,905,855 A *
`
`1/1995 Garney et al.
`9/1995 Matsuura et al.
`5/1997 Baylor et al.
`2/1998 Crump et al.
`5/1998 Crump et al.
`5/1999 Klaiber et al.
`
`.............. 709/1
`
`................ 714/31
`
`OTHER PUBLICATIONS
`
`J. Howard, et al., “Scale and performance in a distributed file
`system,” ACM Transactions
`on Computer Systems,
`6(1):51—81, Feb., 1988.
`
`M. Litzkow, et al., “Supporting Checkpointing and Process
`Migration Outside the UNIX Kernel,” Proceedings of the
`1994 Winter USENIX Technical Conference, San Francisco,
`CA, Jan. 1992.
`C. Landau, “The Checkpoint Mechanism in KeyKOS,”
`Proceedings of the Second International Workshop on
`Object Orientation in Operating Systems, Sep. 1992.
`De Jonge,et al., “The Logical Disk: A New Approach to
`Improving File Systems,” in Proceedings of the 141h ACM
`Symposium on Operating System Principles, pp. 15—28,
`Dec. 1993.
`
`D. Hitz, et al., “File system design for a file server appli-
`ance,” Proceedings of the 1994 Winter USENIX Technical
`Conference, pp. 235—245, San Francisco, CA, Jan. 1994.
`Lee, et al., “Petal: Distributed Virtual Disks,” Proc. 1 “Intl.
`Conf. On Architectural Support
`for Programming Lan-
`guages and Operating Systems,” pp. 84—92, Oct. 1996.
`
`(List continued on next page.)
`
`Primary Examiner—John Follansbee
`Assistant Examiner—Ashok Patel
`
`(74) Attorney, Agent, or Firm—Jeffrey Pearce
`
`(57)
`
`ABSTRACT
`
`A computer system is interrupted, and its entire state infor-
`mation is extracted as one or more checkpoints at one or
`more respective points during operation of the system. The
`checkpoint may be restored into the system at any later time,
`even multiple times, and it may also even be loaded into one
`or more other systems; all systems loaded with the same
`checkpoint will then execute from the same checkpointed
`state. The state extraction mechanism is preferably a virtual
`machine monitor, on which one or more virtual machines are
`installed, each virtual machine constituting an encapsulated,
`virtualized computer system whose states can be check-
`pointed under control of the virtual machine monitor.
`Checkpoints may be stored on a portable memory device or
`transmitted as a batch or dynamically over a network so that
`even virtual machines installed at different sites may execute
`from the same state.
`
`21 Claims, 5 Drawing Sheets
`
`200
`VM1
`2201
`2202
`
`VPROC
`
`NON-
`VOLATILE
`STORAGE
`
`
`
`VMM
`
` DEVICE EMULATORS
`._.
`_____
`252
`___
`
`
`
`
` HARDWARE
`
`<->
`PROC
`
`100/
`
`110
`
`142
`
`VEEAM 1004
`
`IPR of US. Patent No. 7,093,086
`
`1 50
`PERIPH
`
`
`E
`)
`202
`
`VOS
`VPERIPH
`
`_/
`208
`
`200n
`
`
`
`
`
`
`
`
`
`
`

`

`US 6,795,966 B1
`Page 2
`
`OTHER PUBLICATIONS
`
`Litzkow, et al., “Checkpoint and Migration of UNIX Pro-
`cesses in the Condor Distributed Processing System,” M.
`University of Wisconsin—Madison Computer Sciences Tech-
`nical Report #1346, Apr. 1997.
`J. Shapiro, et al., “EROS: a fast capability system,” Pro-
`ceedings of the 17th ACM Symposium on Operating Sys
`
`tems Principles (SOSP ’99), Dec. 1999, Charleston, South
`Carolina.
`
`D. Santry, et al., “Deciding When to forget in the Elephant
`file system,” Proceedings of the 17th ACM Symposium on
`Operating Principles, Dec. 1999, Charleston, Sout Carolina.
`
`* cited by examiner
`
`

`

`US. Patent
`
`Sep.21,2004
`
`SheetlrofS
`
`US 6,795,966 B1
`
`0:
`
`N:
`
`-ZOZ
`
`mu__._.<._O>
`
`m0<m0._.m
`
`m_I__._.<._O>
`
`MOEOHw
`
`omF ..Imam:
`
`
`
`mm<>>0m<I
`
`oow
`
`_‘
`EV
`
`_ON_\
`
`NONF
`
`gr
`
`om;
`
`_..0.”—
`
`

`

`S.U
`
`eton”
`
`m
`
`S
`
`5f0
`
`.com
`
`:1.COON
`
`tmomnomw
`
`Smom
`
`4m_|__._.<n_O>m.202
`
`29ooon:_>_>
`
`wmu<mO._.mVON
`
`omN
`
`
`
`2wN—OFSDSEMOSMDwamma>mo
`
`N.0."—
`
`In=mm_n_
`
`o:
`
`m.__._.<._O>
`
`m0<m0._.w
`
`US 6,795,966 B1
`
`N:o:08
`
`DONE
`
`mm<>>DE<I
`
`
`

`

`US. Patent
`
`Sep.21,2004
`
`Sheet3 0f5
`
`US 6,795,966 B1
`
`
`
`

`

`US. Patent
`
`Sep.21,2004
`
`Sheet4 0f5
`
`US 6,795,966 B1
`
`
`
`m.0."—
`
`

`

`US. Patent
`
`Sep. 21, 2004
`
`Sheet 5 0f 5
`
`US 6,795,966 B1
`
`
`
`SEEMS.ZO_ww_S_wZ<N_._.
`
`a:can.aa20
`
`o.0."—
`
`
`

`

`US 6,795,966 B1
`
`1
`
`MECHANISM FOR RESTORING, PORTING,
`REPLICATING AND CHECKPOINTING
`COMPUTER SYSTEMS USING STATE
`EXTRACTION
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
`This application is a Continuation-in-Part of commonly
`assigned US. patent application Ser. No. 09/151,175, “Sys-
`tem and Method for Virtualizing Computer Systems, ” filed
`Sep. 10, 1998, which issued as US. Pat. No. 6,496,847 on
`Dec. 17, 2002. This application also claims priority of the
`pending US. Provisional Application No. 60/085,685, filed
`Feb. 15, 1998.
`This application also incorporates by reference the US.
`patent application Ser. No. 09/179,137, “Virtualization Sys-
`tem Including a Virtual Machine Monitor for a Computer
`with a Segmented Architecture,” filed Oct. 26, 1998, which
`issued as US. Pat. No. 6,397,242 on May 28, 2002.
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`
`This application relates to an arrangement of a computer
`system, in particular, to a system and a method for acquiring,
`storing and using data concerning the state of hardware and
`software components within the computer system.
`2. Description of the Related Art
`Modern computers “crash” with irritating frequency, with
`much work lost or recovered only with time-consuming
`effort. Sometimes, crashes or other errors are expected, for
`example, when designing new software or debugging an
`existing program. In such cases, and even when first turning
`the computer on, time is also lost waiting for computers to
`“boot” or “reboot.” At other times, when problems occur for
`an ordinary user of a commercial application, even more
`time is often lost when the frustrated user must try to explain
`orally what has happened to a technician located far away in
`a customer service department. These are just a few of many
`possible examples of situations when information about the
`state of the computer system is either desirable, for example,
`when debugging a new program, or necessary, for example,
`when the computer is to reboot and automatically load
`previously running applications along with the data they
`were processing when exited.
`One known attempt to ensure the ability to analyze and
`reconstruct the state of a physical memory, disk or data base
`is based on the concept of a “transaction,” which involves
`on-going tracking of updates to at
`least one region of
`storage. In this context, a transaction is a collection of
`updates that are bundled together so that they are atomic that
`is, either all of the updates occur, or none of them occur. The
`idea of transactions is typically applied to databases, where
`a series of updates to different tables need to occur simul-
`taneously.
`A transaction proceeds as follows: A begin command
`from the operating system or an application marks the
`beginning of the series of updates that make up the trans-
`action. After the updates complete, a commit command
`marks the end of the transaction and the updates become
`permanent. If an error occurs during one of the updates that
`are part of the transaction, a rollback command is used to
`undo any updates in the transaction that may have com-
`pleted.
`
`Transactional Disks
`
`In the prior art, this use of the concept of transactions is
`commonly implemented in database systems. Recently,
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`transactions have been extended to apply to logical disks
`(also referred to as virtual disks), which are a software
`construct that emulate physical disks. One example of this
`solution, in the context of a parallel or distributed processing
`arrangement,
`is described in US. Pat. No. 5,634,096
`(Baylor, et al., May 27, 1997, “Using virtual disks for disk
`system checkpointing”), which discloses a scheme for stor-
`ing data on disks in such a way that a “checkpoint” is taken
`across several disks connected to different processors. This
`checkpoint is then used to restore the entire disk system to
`a known state after one or more of the disks or processors
`fails.
`
`Yet another solution involving virtual disks is described in
`“The Logical Disk: A New Approach to Improving File
`Systems,” by de Jonge, Kaashoek, and Hsieh, in Proceed-
`ings of the 141h ACM Symposium on Operating System
`Principles, pp. 15—28, December 1993. In this paper, the
`term “Atomic Recovery Unit” is used to describe transac-
`tions to the logical disk.
`The implementation of a logical disk requires the inter-
`ception of requests to the physical disk, and transforming
`them into operations on a logical disk. Once this has been
`accomplished, it is possible to keep a log of all of the updates
`to the logical disk and defer the update so that the original
`data is not overwritten. When the updates are kept in a log
`in this fashion, then a rollback can be accomplished by
`discarding the updates in the log for a particular transaction.
`Acommit can be accomplished by retaining these updates in
`the log, and eventually applying them to the logical disk.
`A similar concept has been proposed in “Petal: Distrib-
`uted Virtual Disks,” by Lee and Thekkath, in Proc. 1 “Intl.
`Conf. 0n Architectural Support for Programming Lan-
`guages and Operating Systems,” pp. 84—92, October 1996.
`The Petal virtual disk supports the ability to take snapshots
`of the virtual disk, using techniques known as “copy-on-
`write.” Copy-on-write is a common technique that allows
`copies to be created quickly, using a table of pointers to the
`actual data, and only copying the data when it is modified by
`a user program.
`In Petal, the virtual disk itself is implemented as a table
`of pointers, and the snapshot (equivalent to a “checkpoint”)
`is implemented by including an identifier (called an epoch
`number) in this table. When a snapshot is taken, the current
`epoch number is assigned to the snapshot. The epoch
`number is then incremented, and all subsequent updates to
`the virtual disk belong to this new epoch number. When a
`block of the disk is next updated, there will be no copy at the
`current epoch number, so a copy of the block will be created.
`In short, as the term “copy-on-write” implies, a copy is made
`only when a disk block is written to. The original data is still
`available, under the epoch number of the snapshot.
`Both the logging technique and the snapshot technique
`allow the implementation of transactions on a logical disk.
`In both cases, there are two copies of the modified disk
`block:
`the original version and the updated version. By
`restoring the state of the logical disk to point to the original
`version of all the disk blocks that were modified during the
`transaction, the transaction can be rolled back, that is, the
`state of the disk at the beginning of the transaction can be
`restored.
`
`The concepts of transactions on virtual disks and snap-
`shots of virtual disks have a number of limitations. The first
`
`is that they are useful only in the context of restoring the
`state of the disk: These systems provide no way to recover
`from, for example, failures caused by errors in a peripheral
`device.
`
`

`

`US 6,795,966 B1
`
`3
`Another limitation is that, during the operation of a typical
`computer system,
`the state of the disk is not complete:
`Modern operating systems employ disk caches that contain
`copies of data from the disk, as well as data that needs to be
`written to the disk. Applications also buffer data, so that even
`the operating system itself lacks a complete view of all the
`data entered by a user of the computer system. Snapshots of
`the disk state taken at an arbitrary point are only as consis-
`tent as the disk would be if the computer system were to
`crash at that point. On the other hand, any data that is present
`in the cache or in application memory, but that is not yet
`written to disk, is lost.
`If snapshots of the disk state are taken only at points when
`the operating system is shut down, then the disk is in a
`consistent state, and no data is lost. However, this represents
`a significant
`limitation on the concept of transactions:
`Before a transaction can begin or end, all applications must
`be closed and the operating system must be shut down. This
`makes the snapshot technique inadequate to restore the full
`state of the disk when the system or an application
`“crashes,” that is, when an application terminates other than
`as a result of a prescribed shut-down routine and whose
`execution cannot proceed. Alternatively, the application or
`operating system must explicitly issue commands that cause
`the buffered or cached data to be written back to the disk. In
`
`short, the reality of modern systems does not always con-
`form to the “clean” assumptions of the snapshot model, or
`they require the explicit coordination of application or
`operating system software.
`The technique of taking snapshots (also known as
`“checkpointing”) has also been used not only for virtual
`disks, but also for other subsystems such as file systems.
`Moreover, checkpointing has also been proposed for
`applications, and, in certain very restricted senses and cases,
`for systems as a whole. Examples of each will now be given.
`
`File System Checkpointing
`
`One example of checkpointing of file systems is disclosed
`in “Deciding when to forget in the Elephant file system,” D.
`Santry, et al., Proceedings of the 17th ACM Symposium on
`Operating Systems Principles, Charleston, SC. This
`“Elephant File System” uses copy-on-write techniques, as
`well as per-file characteristics to implement checkpointing
`of the file system, albeit only on a file-by-file basis.
`Other checkpointing techniques for
`file systems are
`described in “File system design for a file server appliance,”
`D. Hitz, et al., Proceedings of the 1994 Winter USENIX
`Technical Conference, pages 235—245, San Francisco,
`Calif., January 1994; and “Scale and performance in a
`distributed file system,” J. Howard, et al., ACM Transactions
`on Computer Systems, 6(1):51—81, February, 1988. In both
`of these systems, copy-on-write techniques are used to
`create whole file system checkpoints.
`
`System Checkpointing
`
`Many different proposals have also been put forward for
`checkpointing systems in certain restricted situations. One
`such proposal for the system known as KeyKOS is
`described, for example, in “The Checkpoint Mechanism in
`KeyKOS,” C. Landau, Proceedings of the Second Interna-
`tional Workshop on Object Orientation in Operating
`Systems, September 1992. The KeyKOS system, which
`operates as a microkernel-based operating system (OS),
`treats an entire system (from a software perspective) as a
`collection of objects and periodically takes checkpoints of
`all the objects. After a crash, the objects can be restored and
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`the system resumed. One shortcoming of the KeyKOS
`system is that it requires new system software to be written,
`in particular, new application program interfaces (API’s).
`Yet another disadvantage of KeyKOS is that, after a crash,
`the OS still needs to go through a boot-up process before
`restoring the objects.
`Still another known system-checkpointing technique is
`described in “EROS: a fast capability system,” J. Shapiro, et
`al., Proceedings of the 17th ACM Symposium on Operating
`Systems Principles (SOSP ’99), December 1999,
`Charleston, SC. Like KeyKOS, this EROS system is an
`object-oriented operating system with objects that are made
`persistent by checkpointing them. This checkpointing
`requires that all state resides in special objects called
`“pages” and “nodes,” and that all kernel (OS) operations are
`atomic. Like KeyKOS, the system requires a new API, that
`is, new software, to be written, and requires O/S coordina-
`tion. In EROS, periodic copies (checkpoints) are made of all
`objects, which are saved using copy-on-write techniques.
`Also like KeyKOS, the EROS system requires an O/S reboot
`after a crash.
`
`As its title implies, US. Pat. No. 5,715,464 (Crump, et al.,
`Feb. 3, 1998, “Computer system having suspend once
`resume many sessions”) describes a computer system that
`has suspend once resume many (SORM) sessions. This
`SORM arrangement operates in a manner similar to the way
`in which existing portable computers are able to “suspend”
`their operation, for example, when the lid is closed, and then
`resume operation when reactivated. In the SORM system
`described in the Crump ’464 patent, however, the suspended
`image is preserved after resuming and thus may be restored
`multiple times, although subject
`to the very restrictive
`condition that the suspended image may no longer be valid
`after the next disk access in a resumed system. Moreover, the
`disclosed system-checkpointing solution describes possibil-
`ity of keeping multiple suspended images, each for a dif-
`ferent operating system, so that one can alternate between
`running the suspended operating systems.
`Yet another system with features similar to the suspend-
`to-disk features of a portable computer is disclosed in US.
`Pat. No. 5,758,174 (Crump, et al., May 26, 1998, “Computer
`system having a plurality of stored system capability states
`from which to resume”). In this system, multiple suspended
`images may be kept and the user may resume from any one
`of them.
`
`In both the Crump ’464 and ’174 systems, the operating
`system (OS) and application software must participate in the
`suspension and must go through a shutdown and a wake-up
`phase. In particular, these known systems require software
`executing within the operating-system, such as an Advanced
`Power Management
`(APM) driver, and applications/
`subsystems to register with the APM driver.
`Still another system of this type is described in US. Pat.
`No. 5,386,552 (Garney, et al., Jan. 31, 1995, “Preservation
`of a computer system processing state in a mass storage”).
`In this system, the contents of system registers and system
`memory are saved in a mass storage device upon the
`occurrence of a triggering event, such as during power-off or
`when the system is to enter a low-power mode. The system
`then enters a suspend state. Once processing is resumed, the
`contents of a previously saved processing state are read in
`and control is returned to the previously running application
`program. This system requires two separate modules—a
`special
`interrupt handler and a system management
`module—to handle saving different partitions—isolated and
`non-isolated—of the memory.
`
`

`

`US 6,795,966 B1
`
`5
`As in other suspend-and-resume systems, in the Garney
`system, the evolution of the computer system state is always
`moving forward in a linear trajectory. In other words, once
`the system is resumed, there is no way to go back to the
`previously suspended state. This is in part because the
`contents of the disk, which are not saved when the system
`enters the suspend state, may be freely modified after
`resuming—any post-resume modification prevents resum-
`ing again from the previously saved state. Thus,
`is not
`possible to resume multiple times from a saved image. It is
`also not possible to save the state, continue execution, and
`then resume later from the saved state.
`
`The Garney system also illustrates another common dis-
`advantage of existing arrangements that provide for saving
`at
`least some part of the system state:
`It requires that
`software within the system itself must participate in saving
`the system state. Thus, in order to save the partial state in the
`Garney system,
`the additional system software needs to
`cause the processor to go into a system management inter-
`rupt state so that it can access a system management memory
`area. The processor must also be in the system management
`interrupt state in order to ensure that a critical part of the
`save routine will not be interrupted by a hardware interrupt.
`
`Application/Process-level Checkpointing
`
`One known system for checkpointing applications is the
`“Condor” distributed processing system, which is described
`in “Checkpoint and Migration of UNIX Processes in the
`Condor Distributed Processing System,” M. Litzkow, et al.,
`University of Wisconsin-Madison Computer Sciences Tech-
`nical Report #1346, April 1997; and “Supporting Check-
`pointing and Process Migration Outside the UNIX Kernel,”
`M. Litzkow, et al., Proceedings of the 1994 Winter USENIX
`Technical Conference, San Francisco, Calif., January 1992.
`The Condor system checkpoints the processes of running
`applications, and can migrate them to other machines as long
`as these also are running Condor. Only the application state
`is checkpointed, however, and the applications themselves
`must participate in the checkpointing by making calls to a
`checkpoint library.
`All of the known systems and methods mentioned above
`suffer from one or more of the following disadvantages:
`1) They save only part of the entire system state; as such,
`they cannot ensure complete restoration of the system state
`sufficient to guarantee that all applications will be able to
`continue exactly as they would have when the saved state is
`restored.
`
`2) They are not able to generate checkpoints and save the
`state of the system at arbitrary points, or at multiple points.
`The systems will therefore not correctly save the partial state
`except when processing is interrupted at specific points or
`under specific conditions. This implies, of course, that there
`will be circumstances when the state cannot be saved at all.
`
`This means, in turn that such systems cannot be used for
`such operations as full-state, step-by-step debugging of
`applications. In many cases, this limitation is caused by a
`need for synchronization of the partial state-saving proce-
`dure with applications, or a need to wait for some other
`internal process—such as a shut down of some sub-
`system—to be completed before saving the partial state.
`3) They require specialized system software such as
`special API’s or operating systems. Alternatively,
`they
`assume and work only for particular operating systems and
`hardware architectures. They are therefore not beneficial to
`the most common users—those who need to run off-the-
`
`shelf applications using an off-the-shelf operating system.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`An additional consequence of this is that the checkpoints are
`not portable between different systems.
`4) They need to flush disk caches.
`What is needed is some way to overcome these disad-
`vantages of the prior art, and in particular, to extract and
`restore the entire state of the computer system as a whole,
`not just of some portion of the memory. This then would
`enable complete restoration of the system to any point in its
`processing without requiring any application or operating
`system intervention, or any specialized or particular system
`software (such as API’s and 08’s) or hardware architecture.
`This invention provides a system and method that accom-
`plishes this, and it does so in a way that makes possible even
`other unique features, such as the ability for one or even
`multiple users to run, evaluate, test, restart, and duplicate a
`processing stream not only from the same point, but also
`from different points. The invention accomplishes this,
`moreover, in a manner that allows checkpointing the entire
`state of the system in a way that allows state information to
`be portable between different hardware platforms and sys-
`tem software configurations.
`SUMMARY OF THE INVENTION
`
`The preferred embodiment of the invention provides a
`computer system that has a hardware processor; physical
`system storage; a virtual machine monitor functionally con-
`nected to the hardware processor and physical system stor-
`age; and at least one virtual machine that is functionally
`connected to the virtual machine monitor.
`
`Each virtual machine includes: a) a virtual processor that
`has a plurality of registers and executes a sequence of virtual
`processor instructions, virtual processor instructions being
`conveyed by the virtual machine monitor to the hardware
`processor, physical execution of the virtual processor
`instructions being carried out by the hardware processor; b)
`a virtual memory; c) a virtual operating system; d) at least
`one virtual peripheral device having a plurality of peripheral
`settings and peripheral device data; and e) at
`least one
`application program executing on the virtual machine.
`At the completion of execution of each processor instruc-
`tion by the virtual processor, the virtual machine has a total
`state that includes state information of the virtual processor,
`of the virtual memory, of the virtual operating system, of
`each application program, and of each virtual peripheral
`device. The virtual machine monitor is then used as a
`
`checkpointing mechanism a) for interrupting the virtual
`processor; b) for sensing a checkpoint request for the virtual
`machine; and c) for generating at least one checkpoint, each
`checkpoint comprising a list of the total state of the virtual
`machine corresponding to the respective checkpoint request,
`at the completion of execution of a respective virtual pro-
`cessor instruction.
`
`In the preferred embodiment of the invention, checkpoint-
`ing mechanism is further provided a)
`for storing each
`generated checkpoint; b)
`for sensing a state restoration
`request corresponding to one of the checkpoints; and c) upon
`sensing the state restoration request, for restoring into the
`virtual machine the total state corresponding to the check-
`point that in turn corresponds to the state restoration request,
`the virtual machine thereby resuming execution from the
`restored total state,
`that
`is, of the sequence of virtual
`processor instructions following the virtual processor
`instruction after whose completion the checkpoint was
`generated, identically as it would have if the virtual proces-
`sor had not been interrupted.
`The physical system storage preferably includes a check-
`point storage portion accessible only by the virtual machine
`
`

`

`US 6,795,966 B1
`
`7
`monitor. Each generated checkpoint is then stored in the
`checkpoint storage portion. Stored checkpoints thereby
`remain unchanged by other access to the physical system
`storage. In this case, the total state of the virtual machine
`excludes the contents of the corresponding checkpoint stor-
`age portion.
`The invention allows the use of a plurality of virtual
`machines. In this case, one of the virtual machines may serve
`as a source virtual machine, for which at least one check-
`point is generated, and the checkpointing mechanism is then
`further provided for loading into at least one secondary
`virtual machine, which is different from the source virtual
`machine, the total state of the source virtual machine cor-
`responding to one of the generated checkpoints. The differ-
`ent virtual machines may also be installed on different
`virtual machine monitors, and the different virtual machine
`monitors may even be installed on different hardware plat-
`forms.
`If the source virtual machine is located within a source
`
`computer, the secondary virtual machine may be located
`within a secondary computer that contains a secondary
`virtual machine that is structurally substantially identical to
`the source virtual machine. The source and secondary com-
`puters may thereby have separate hardware processors and
`physical system storage. The system according to the inven-
`tion may then include a transmission medium via which the
`checkpoint is transferred from the source virtual machine to
`the secondary virtual machine. Examples of the transmission
`medium include a portable memory device and a network to
`which the source and secondary computers are connected.
`One advantage of having distinct but structurally identical
`virtual machines running on a virtual machine monitor is
`that they may then be independent of an underlying system
`hardware platform that includes the hardware processor and
`the physical system storage. The total state of any source
`virtual machine is thus exportable to any other secondary
`virtual machine.
`
`The computer system according to the invention may
`include a plurality of peripheral system hardware devices. In
`this case,
`in the preferred embodiment, for each of the
`plurality of peripheral system hardware devices, the virtual
`machine monitor preferably includes a respective device
`emulation module corresponding to a predetermined emu-
`lation of a representative, virtualized hardware device cor-
`responding to the peripheral system hardware device. The
`device emulation modules then form a device interface
`
`between the virtual machines and the peripheral system
`hardware devices. The virtual machine monitor then pref-
`erably further includes a conversion module converting
`device control signals and data from any virtual machine in
`a format of the respective representative virtualized hard-
`ware device into corresponding signals and data into a
`format of the corresponding peripheral system hardware
`device, and vice versa. Each virtual machine is thereby made
`independent of the underlying system hardware platform,
`which includes the hardware processor, the physical system
`storage, and the peripheral system hardware devices. The
`total state of any source virtual machine then is exportable
`to any other secondary virtual machine, regardless of what
`physical peripherals might be included in a particular sys-
`tem.
`
`In an embodiment of the invention that does not rely on
`the use of virtual machines and a virtual machine monitor
`
`there is at least one application program running on system
`software such as an operating system. In this embodiment,
`the mechanism for generating checkpoints is a driver
`installed in the operating system.
`
`8
`According to the invention, checkpoints may be restored
`a plurality of times. In such case, the corresponding virtual
`machine thereby resumes execution multiple times from an
`identical total state.
`
`Whenever multiple checkpoints are generated and stored,
`each subsequently generated checkpoint is generated with-
`out changing any earlier generated checkpoint.
`Also in the preferred embodiment of the invention, any
`arbitrary one of the stored checkpoints may be restored into
`an arbitrary one of the virtual machines. Moreover, the same
`stored checkpoint may be loaded concurrently into a plu-
`rality of virtual machines; each virtual machine is then able
`to proceed from the loaded checkpoint independently of the
`other virtual machines.
`
`The checkpointing procedure does not require modifica-
`tion or long suspension of the operation of a virtual machine.
`Rather, according to the invention, unmodified execution of
`the virtual machine may be resumed immediately after the
`step of generating each checkpoint.
`The invention also provides a way to reduce how much
`storage space is needed for storing the state information
`contained in checkpoints. If this feature is implemented,
`then the steps of sensing each checkpoint request and
`generating a corresponding checkpoint may then include the
`following sub-steps: a) generating an initial checkpoint
`including the total state of the virtual machine; and b) for at
`least one subsequent checkpoint after the initial checkpoint,
`generating the subsequent checkpoint as a list of state
`changes relative to an immediately preceding generated
`checkpoint as well as link information to the immediately
`preceding generated checkpoint.
`According to another aspect of the preferred embodiment
`of the invention, checkpoints may also be deleted. In this
`case, at least one checkpoint is designated for deletion. If it
`is the initial checkpoint is the one designated for deletion, all
`the total state information in the initial checkpoint is incor-
`porated into each checkpoint
`immediately following the
`designated checkpoints. If the initial checkpoint is not one
`designated for deletion then all state change information in
`the checkpoints designated for deletion is incorporated into
`each checkpoint
`immediately following the designated
`checkpoints. (Note that a checkpoint may have more than
`one immediate successor, for example, when a checkpoint

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket