`
`
`
`Application Performance Management in
`Virtualized Server Environments
`
`
`
`
`
Gunjan Khanna*
Dept. of Electrical and Computer Engineering,
Purdue University, West Lafayette, IN, USA
gkhanna@purdue.edu

Kirk Beaty, Gautam Kar, Andrzej Kochut
IBM T.J. Watson Research Center,
Hawthorne, NY, USA
(kirkbeaty, gkar, akochut)@us.ibm.com

*The work was done while the author was a summer intern at the IBM T. J. Watson Research Center.
`
`
`Abstract — As businesses have grown, so has the need to
`deploy I/T applications rapidly to support the expanding
`business processes. Often, this growth was achieved in an
`unplanned way: each time a new application was needed a new
`server along with the application software was deployed and
`new storage elements were purchased. In many cases this has
`led to what is often referred to as “server sprawl”, resulting in
`low server utilization and high system management costs. An
`architectural approach that is becoming increasingly popular
`to address this problem is known as server virtualization. In
`this paper we introduce the concept of server consolidation
`using virtualization and point out associated issues that arise
`in the area of application performance. We show how some of
`these problems can be solved by monitoring key performance
`metrics and using the data to trigger migration of Virtual
`Machines within physical servers. The algorithms we present
`attempt to minimize the cost of migration and maintain
`acceptable application performance levels.
`
`Index Terms: Application performance management, virtual
`machine migration, virtual server.
`
`
`I. INTRODUCTION
`
`The guiding principles of distributed computing and
`client-server architectures have shaped the typical I/T
`environment for the last three decades. Many vendors have
`thrived by selling elements of a distributed infrastructure,
`such as servers, desktops, storage and networking elements
`and, of course, distributed applications, such as email,
`CRM, etc.
`As businesses have grown, so has the need to deploy I/T
`applications rapidly to support the expanding business
`processes. Often, this growth was achieved in an unplanned
`way: each time a new application was needed a new server
`along with the application software was deployed and new
`storage elements were purchased. In many cases this has
led to what is often referred to as “server and storage sprawl”, i.e., many underutilized servers with heterogeneous storage elements. A critical problem associated with “server sprawl” is the difficulty of managing such an environment. For example, average use of server capacity is often only 10-35%, wasting resources, and a larger staff is required to manage the large number of heterogeneous servers, thereby increasing the total cost of ownership (TCO).
`An architectural approach that is becoming increasingly
`popular to address this problem is known as virtualization.
`Virtualization occurs both at the server and the storage
`levels and several papers have recently been published on
`this topic [5], [7], [8], [12], [14]. Most of these publications
`deal with the design of operating system hypervisors that
`enable multiple virtual machines, often heterogeneous guest
`operating systems, to exist and operate on one physical
`server. In this paper we start with the concept of server
`level virtualization and explore how it can be used to
`address some of the typical management issues in a small to
`medium size data center.
`The first system management problem we will look at in
`this paper is in the category of configuration management
`and is called server consolidation where the goal is to
`reduce the number of servers in a data center by grouping
`together multiple applications in one server. A very
`effective way to do this is by using the concept of server
virtualization as shown in Figure 1. Each application is packaged to run on its own virtual machine; these are in turn mapped to physical machines, with storage provided by a storage area network (SAN). Details of this approach are given in Section II.
Each application in an I/T environment is usually associated with a service level agreement (SLA), which, in the simplest case, consists of response time and throughput requirements. At run time, if the SLA of an application is violated, it is often because of factors such as high CPU utilization or high memory usage on the server where it is hosted. This leads to the main issue that we address in this
`paper: how to detect and resolve application performance
`problems in a virtual server based data center. Our
`approach is based on a novel algorithm for migrating virtual
`machines (VMs) within a pool of physical machines (PMs)
`when performance problems are detected.
`In Section II, we outline a simple procedure to perform
`server consolidation using the concept of virtualization.
`Related work is presented in Section III. Section IV
`describes our algorithm (DMA) for dynamic migration of
`virtual machines and Section V presents some experimental
`results. We conclude the paper and outline our future
`research plans in Section VI.
`
`II. BACKGROUND
`
`The following process, very commonly used by I/T
`service organizations [22], provides a simple algorithm for
`server consolidation, as represented pictorially in Figure 1:
`
`
`
`
`
`
`
`
[Figure 1: A Typical Virtualized I/T Environment. Top: a heterogeneous, underutilized server environment with one application per server (Server 1 ... Server n, each application on its own OS, with utilizations between 25% and 50%). Bottom: after the consolidation process, a homogeneous server environment with virtual machines and high utilization, in which each application runs on a guest OS inside its own VM (VM 1 ... VM n) and multiple VMs share each physical server through a hypervisor.]
`
`
1) For each server to be consolidated, collect measurements that can be used to compute the average CPU usage, memory requirements, disk I/O and network bandwidth usage over a period of time (e.g., several weeks). Let us assume there are X servers.
2) Choose a target server type with compatible architecture, associated memory, access to shared disk storage and network communications.
3) Take each of the X servers, one at a time, and construct a virtual machine image of it. For instance, if server 1 is an email application on Windows, create a Windows virtual machine (e.g., using VMWare [5], [8]). The resource requirements of the virtual machine will be approximately the same as those of the original server being virtualized. At the end of this step we will have X virtual machines.
4) Map the first virtual machine to the first server of the type selected in step 2. Map the second virtual machine to the same server if it can accommodate the resource requirements; if not, introduce a new physical machine (PM) and map the VM to it. Continue until each of the VMs has been mapped to a PM, introducing a new PM when required.
5) The set of PMs at the end of step 4, each with possibly multiple associated VMs, comprises the new, consolidated server farm. Readers will recognize this process as static bin packing, which yields a sub-optimal mapping of VMs to physical servers; a sketch of the first-fit mapping of steps 3 and 4 is given below.
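As an illustration, the first-fit mapping of steps 3 and 4 can be sketched as follows. This is a minimal sketch under simplifying assumptions of our own (each VM reduced to a single scalar demand, identical target servers of unit capacity); the name first_fit_consolidate is illustrative, not from any product.

    # First-fit consolidation sketch (steps 3 and 4): one scalar resource
    # demand per VM and identical target-server capacity (assumptions ours).
    def first_fit_consolidate(vm_demands, capacity=1.0):
        """Map each VM index to a PM index, opening new PMs as needed."""
        residuals = []   # residual capacity of each PM opened so far
        placement = {}   # VM index -> PM index
        for vm, demand in enumerate(vm_demands):
            for pm, residual in enumerate(residuals):
                if residual >= demand:      # first PM that can host this VM
                    residuals[pm] -= demand
                    placement[vm] = pm
                    break
            else:                           # no existing PM fits: add a new PM
                residuals.append(capacity - demand)
                placement[vm] = len(residuals) - 1
        return placement, len(residuals)

    # Utilizations taken from Figure 1 (30%, 40%, 25%, 30%, 35%, 28%, 50%):
    placement, pms = first_fit_consolidate([.30, .40, .25, .30, .35, .28, .50])
    print(placement, pms)   # the seven applications fit on 3 PMs

Like any first-fit packing, the result depends on the order in which the VMs are considered, which is one reason the resulting mapping is sub-optimal.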
`
`
In this paper we start with such a mapping of VMs to physical servers and propose techniques that provide two system management functions:
a) Observe the key performance metrics of the operational virtualized systems (VMs) and, as necessary (because of increased workload) or periodically (to optimize the consolidation when demand is less), move the virtual machines to maximize performance while using the minimum number of physical servers.
b) If any of the VMs reports an SLA violation (e.g., high response time), perform dynamic re-allocation of the VM(s) from the hosting PM to another physical machine such that the SLA is restored.
`
`III. RELATED WORK
`
`Related work is considered in two parts: the first looks at
`schemes for virtualization and server management, the
`second at the relevant algorithmic approaches of packing
`objects into bins, i.e., bin packing.
Virtualization has provided a new dimension for today’s systems. Classic virtual machine managers (VMMs) like VM/370 [20] have existed for quite some time, but VMWare [8] and dynamic logical partitioning (DLPARs) [19] have provided an impetus for using virtual machines in new ways. There have also been efforts to build open source virtual machine managers like Xen [6], which provides an open source framework for building custom solutions. In addition, virtualization has opened new avenues for research, such as Trusted Virtual Domains [23] and Grid computing using VMs [12]. Virtualization at the application level is well addressed by products from Meiosys [18].
The VMWare ESX Server (hypervisor) runs on the bare hardware and provides the ability to create VMs and move them from one PM to another using VMotion [8]; it requires the PMs to have shared storage such as SAN or NAS. The cited paper [5] provides an overview of the memory management scheme employed in the ESX Server. VMWare also offers the Virtual Center, which provides a management interface to the virtual farm. Although some metrics are provided by the
`
`
`
`
`
`
`
Virtual Center, they are not fine-grained and require extensive human interaction to be used for management. Resource management of these virtual machines still needs to be addressed in a cost-effective manner, whether in a virtual server farm or in a grid computing environment, as pointed out in [12]. We provide a solution to this problem through a dynamic management infrastructure that manages the virtual machines while adhering to the SLA. A wide variety of approaches exist in the literature to perform load management, but most of them concentrate on how to re-direct incoming customer requests to provide a balanced workload distribution [1] (identical to a front-end sprayer which selectively directs incoming requests to servers). An example is WebSphere XD [4], which uses a dynamic algorithm to manage workload and allocate customer requests. The ODR component of XD is responsible for load management and for efficiently directing customer requests to back-end replicas, similar to a sprayer. The use of replication to tackle high workload and provide fair allocation with fault tolerance is common practice, but for a virtual farm where each server runs a different application it is not applicable.
The problem of allocating virtual machines to physical machines falls in the category of vector-packing problems in theoretical computer science (see the survey [16]). It is known that finding optimal solutions to vector-packing (or its supersets, the bin-packing and class-constrained packing problems) is NP-hard. Several authors, including [9] and [11], have proposed polynomial-time approximation schemes (PTAS) for these problems with a low approximation ratio. The cited paper [10] gives an algorithm for a restricted job allocation problem with a minimum migration constraint, but that problem does not allow multiple jobs to be assigned to a single machine; it is similar to the sprayer approach, where a system sits at the front end and decides where to forward incoming requests. These approaches also assume that the sizes of the vectors and bins are fixed, i.e., deterministic values are considered. In a virtual server environment a VM’s utilization may change, making static allocation techniques unfit and instead requiring accurate modeling and dynamic re-allocation. In order to model the changing workload precisely, the authors in [13] propose stochastic load balancing, in which probabilistic bounds on the resources are provided. Stochastic and deterministic packing solutions have largely looked at a static initial allocation that is close to the optimal.
It is, however, significantly more challenging to design a dynamic re-allocation mechanism that performs allocation of VMs (vectors) at discrete time steps, making the system self-adjusting to workload without violating the SLA. It is also important to note that minimizing migrations among bins (VMs to PMs in our case) during re-allocation is still an open research problem. Specifically, in this paper we address the issue of dynamic re-allocation of VMs while minimizing the migration cost, where cost is defined in terms of metrics such as CPU and memory usage.
`
`IV. PROBLEM FORMULATION
`
`We start with the environment described in the previous
`section, namely, an I/T environment consisting of a set of
`
`physical servers, each of which hosts one or more virtual
machines (VMs). In the interest of simplicity of presentation, we assume that the physical server environment is homogeneous. Heterogeneous environments, where PMs may have different resource capacities (e.g., CPU, memory), can be handled by appropriate scaling of the migration cost matrix; scaling is a widely used approach (see [11], [16]) that simplifies the problem formulation without changing the proposed solution (or algorithm). Typically each virtual machine implements one customer application. Due to workload changes, the resources used by the VMs (CPU, memory, disk and network I/O) will vary, possibly leading to SLA violations. The objective of our research is to design algorithms that resolve SLA violations by reallocating VMs to PMs as needed. Metrics representing CPU and memory utilization, disk usage, etc. are collected from both the VMs and the PMs hosting them, using standard resource monitoring modules. Thus, from a resource usage viewpoint, each VM can be represented as a d-dimensional vector in which each dimension represents one of the monitored resources. We model the resource utilization of a virtual machine VMi as a random process represented by a d-dimensional utilization vector Ui(t) at time t. For a physical machine PMk the combined system utilization is represented by Lk(t). Each physical machine, say PMj, has a fixed capacity Cj in d-dimensional space. Assume that there are a total of n VMs residing on m physical machines. The number of PMs (m) may change dynamically, as the algorithm proceeds, to meet increasing workload requirements. At the initial state (t=0) the system starts with some predefined allocation (obtained through server consolidation as outlined in Section II). As the states of the VMs change (due to changes in utilization), utilization can exceed the thresholds of the pre-defined allocation, leading to possible SLA violations. We propose dynamic re-allocation of these stochastic vectors (Ui(t)) over the PMs to meet the required SLA. The dynamic algorithm runs at discrete time instances t0, t1, ..., tk to perform re-allocation when triggered by a resource threshold violation alert. In our model we assume a mapping of SLA to system resource utilization; hence thresholds are placed on utilization, and exceeding them triggers the re-allocation procedure. Below, we explain the nature of the inputs to the algorithm and the objective function that we attempt to optimize.
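As a concrete rendering of this formulation, the sketch below models VMs as d-dimensional utilization vectors and PMs as capacity vectors with per-dimension knee thresholds. The class names and the vector-sum approximation of Lk(t) are our illustrative assumptions (the next paragraph notes where a plain vector sum falls short).

    # Minimal data model for the formulation above (illustrative only).
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class VM:
        name: str
        util: List[float]          # Ui(t): one entry per monitored resource

    @dataclass
    class PM:
        name: str
        capacity: List[float]      # Cj, fixed d-dimensional capacity
        knees: List[float]         # per-dimension thresholds ni
        vms: List[VM] = field(default_factory=list)

        def load(self) -> List[float]:
            """Lk(t), approximated here as a plain vector sum of VM utilizations."""
            return [sum(vm.util[d] for vm in self.vms)
                    for d in range(len(self.capacity))]

        def violates_threshold(self) -> bool:
            """True if any dimension's load exceeds its knee (triggers re-allocation)."""
            return any(l > n for l, n in zip(self.load(), self.knees))

    pm = PM("PM1", capacity=[1.0, 1.0], knees=[0.7, 0.8],
            vms=[VM("VM1", [0.40, 0.30]), VM("VM2", [0.35, 0.30])])
    print(pm.load(), pm.violates_threshold())   # [0.75, 0.6] -> True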
The input includes a function which maps the individual resource utilizations to the combined utilization of the physical machine, i.e., Lk(t) = f(U1(t), U2(t), ...) for all VMs located on machine PMk. The combined utilization is usually considered a vector sum in the traditional vector-packing literature, but this is not generally true for several shared system resources, like SAN and CPU, because of the overhead associated with resource sharing among VMs. The latency of SAN access grows non-linearly with respect to the applied load. If we look at the average response time Ravg(t) for all the VMs on the same PM, it grows non-linearly as a function of the load on the physical machine (Figure 2).

[Figure 2: Response time vs. applied load on a system dimension.]

Let VMj’s resource utilization at time t be denoted by the vector:
Uj(t) = [uj1(t), uj2(t), ..., ujd(t)]
We assume that a VM’s resource utilization for a specific resource is equivalent to the fraction of that resource used by this VM on the associated PM. If A denotes the set of VMs allocated to a physical machine PMk, then the load on PMk in the ith dimension (i.e., the ith resource) is given by:

Li(t) = Σj∈A uji(t)

In general, it is hard to relate an SLA parameter, e.g., the response time of a customer application, quantitatively to the utilization of the ith resource. Equation (1) approximates the non-linear behavior of the response time Ravg(t) as it relates to the load in the ith dimension on the physical machine. Here ni is the knee of the system, beyond which the response time rises exponentially and approaches infinity asymptotically, and k is a tuning parameter to adjust the initial slope of the graph. The authors in [1] use a similar function for the customer utility associated with a given allocation of resources to a specific customer; their function yields a linear increase below the knee. In real systems, in order to model the graph depicted in Figure 2, we need an asymptotic increase as utilization moves close to 1, which is yielded by equation (1):

Ravg(t) = [ (Li(t) − ni) + sqrt( (Li(t) − ni)^2 + k ) ] / ( 2 (1 − Li(t)) )        (1)
`
`
`
`
`
`
`
`
Equation (1) is a hyperbola which closely approximates the graph in Figure 2. One can obtain similar equations for multiple dimensions. To meet the SLA requirements in terms of response time, a system should (preferably) operate below the knee. The knee can occur at a different point in each resource dimension, so the set of knees can be represented as a vector. This serves as a threshold vector for triggering incremental re-allocation to lower the utilizations. The utilizations are constantly monitored, and the algorithm ensures, through dynamic re-allocation of VMs to physical servers, that they stay below the threshold (knee). Since the Li(t) are modeled as random processes, checking for a threshold violation is done as a probabilistic guarantee P(Li(t) < ni) > ξ, which means that with probability ξ the utilization remains below ni; this forms a constraint.
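A small numeric sketch of this model is given below. Note that it relies on our reconstruction of equation (1) from a garbled original, so the exact functional form should be treated as an assumption; knee and k play the roles of ni and the slope-tuning parameter.

    import math

    def r_avg(load, knee, k=0.01):
        """Reconstructed equation (1): nearly flat below the knee, rising
        sharply beyond it and diverging as the load approaches 1."""
        assert 0.0 <= load < 1.0
        hinge = (load - knee) + math.sqrt((load - knee) ** 2 + k)
        return hinge / (2.0 * (1.0 - load))

    for load in (0.2, 0.5, 0.7, 0.9, 0.99):
        print(f"load={load:.2f}  Ravg={r_avg(load, knee=0.7):.3f}")

The probabilistic guarantee P(Li(t) < ni) > ξ can likewise be checked empirically, e.g., as the fraction of recent utilization samples that fall below the knee.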
`
The re-allocation procedure must consider the costs associated with performing the migration of VMs. These VMs are logical servers and may be serving real-time requests; therefore, any delay resulting from the migration needs to be considered as a cost. Use of a cost function also helps in designing an algorithm which is stable and does not cause frequent migration of machines. Let the cost of migrating one unit vector in d dimensions be denoted by the row vector Mc, which consists of migration cost coefficients for each dimension. These cost coefficients depend on the implementation of the virtual server migration; in this model we assume that the coefficients of Mc remain the same for all migrations. Thus the cost of migrating VMj is given by Mc · Uj(t). The cost of bringing in a new PM during migration is denoted by NBc, which is assumed to be orders of magnitude larger than the migration cost Mc because introducing a new machine incurs hardware, software, and provisioning costs. Let the matrix R(t) denote the residual capacity of the system:

R(t) = [r1(t), r2(t), r3(t), ..., rm(t)]

where ri(t) is the residual capacity vector of the ith physical machine at time t. Residual capacity for a resource, such as CPU or memory, in a given machine denotes the unused portion of that resource that could be allocated to an incoming VM. To keep the response time within acceptable bounds, it is desirable that each physical machine’s utilization be below the threshold (knee). In some cases, such as batch applications, throughput rather than response time is the more critical SLA parameter; in such situations, thresholds can be set appropriately (from Figure 2) by specifying a higher value for the acceptable response time. We would like to achieve the maximum possible utilization for a given set of machines and to avoid adding new physical machines unless necessary.
The aim is to achieve a new allocation of the VMs, given a previous allocation, which minimizes the cost of migration and provides the same throughput. System performance is monitored continuously for SLA violations; re-allocation is triggered when a violation occurs and is performed at discrete time instances t1, t2, ..., tk. The system metrics can be collected using standard monitoring tools like IBM Director [24]. Because of the costs associated with migration and with the use of new physical machines, the residual capacity of a machine should be as low as possible and migrations should be minimized, thus keeping the new solution close to the previous one. It is important to note, however, that low values of ri(t) might not be sufficient to accommodate an incoming VM during migration. Thus the goal of our algorithm is to keep the variance of the vector R(t) as high as possible. We illustrate the concept behind this principle using an example. Let each ci(t) be simply the CPU utilization (0 ≤ ci(t) ≤ 100) of the ith machine, and consider the following two vectors C(t): [40, 50, 30] and [90, 0, 30]. The latter vector has a higher variance of the residual vector ([10, 100, 70]), with fewer machines having high utilization. Thus, this state is more likely to be able to accommodate a migrating VM or, in some cases, a new VM when a new customer application is introduced. Alternatively, since PM2’s resource usage, represented by the second number, is 0, it can effectively be removed. This provides us with one of the
`
`
`
`
`
`
`
`
members of the objective function that we have formulated, i.e., maximizing the variance of the residual vector.
When a resource threshold is violated and the migration algorithm is set in motion, there are three decisions it needs to make, namely:
a. Which physical machine (PM) to remove a VM from (i.e., migrate)?
b. Which VM to migrate from the chosen PM (from a)?
c. Which new PM to migrate the chosen VM (from b) to?
Since thresholds are set on the utilization of each physical machine, violation of a threshold triggers the algorithm to determine which (one or more) of the VMs on the physical machine at which the violation took place need to be migrated to another physical machine.
More formally, let Xt be an n x m allocation matrix containing allocation variables xij, equal to 1 if virtual machine i is allocated to physical machine j. Given an allocation at time t, denoted by Xt, we want to compute another allocation at time t + Δ, i.e., Xt+Δ. The migrations performed by the re-allocation are given by the migration matrix ZM (n x 1), obtained by taking the difference Xt+Δ − Xt and setting the rows with a positive difference to 1. The expected migration cost incurred by the new allocation is then the scalar value E[Mc · U(t)^T · ZM]. The problem in its most general form can be represented as follows:

Max { w1·Var(R(t)) − w2·E[Mc · U(t)^T · ZM] − w3·n·NBc }

subject to   P( Σi=1..n uki · xij < njk ) > ξ,   1 ≤ j ≤ m        (2)

where n is the number of new physical machines brought in to accommodate the migrating VMs, if necessary. For a matrix M, Var(M) is defined as the L2 norm of the variance vector obtained by computing the sample variance of each row. For example, if M1 is a sample row with values [10 10 20], then the variance of this row is 33.33. For an n x m matrix we first obtain a variance vector (say A) of size n x 1 such that element i of the vector A is the sample variance of the values in the ith row of M; the L2 norm of the vector A then gives the required scalar. Equation (2) expresses the SLA constraint, which forces a physical machine’s total utilization in each dimension, i.e., Σi=1..n uki · xij, to stay below the knee (njk). Here ξ is the probability confidence with which the utilization is below the input capacity knee njk; this equation holds for all physical machines. Thus the optimization function in this formulation consists of the costs/gains associated with each of the previously discussed metrics. The maximization function can be divided into three components, each with a configurable coefficient wi. The first sub-component reflects the gain due to Var(R(t)), i.e., how tightly packed the allocation is. The second term is the migration cost relative to the previous allocation. The last term is the cost incurred by adding n new servers. Each wi represents the amount of weight its sub-component carries in the objective function; in other words, these weights are used to normalize each component to an equal scale. They can be specified by the system administrator and fine-tuned. Weights also reflect the importance of each cost/gain function. For example, in a system where it is relatively cheap to perform migration of VMs across the physical servers and more expensive to add a new PM, w2 would be much lower than w3. If an administrator would like a fair utilization across physical machines and would not like to reclaim a physical machine when utilization wanes, then s/he can reduce the weight w1. The constraint in Equation (2) represents the amount of load which each PM can hold without going over the threshold (njk) and hence not violating the SLA.
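To make the objective concrete, the sketch below scores a candidate allocation using the three terms of the maximization function. The specific weights, the scalar migration-cost model, and the helper names are our illustrative assumptions.

    import statistics

    def var_norm(residual_rows):
        """Var(R(t)): L2 norm of the vector of per-row sample variances;
        each row holds one resource dimension's residuals across the PMs."""
        return sum(statistics.variance(row) ** 2 for row in residual_rows) ** 0.5

    def score(residual_rows, migrated_utils, new_pms,
              w=(1.0, 10.0, 1000.0), mc=1.0, nbc=1.0):
        """w1*Var(R(t)) - w2*(migration cost) - w3*n*NBc (illustrative)."""
        migration_cost = sum(mc * u for u in migrated_utils)
        return (w[0] * var_norm(residual_rows)
                - w[1] * migration_cost
                - w[2] * new_pms * nbc)

    # Residual vectors for the CPU-only example from the previous section:
    balanced = [[60, 50, 70]]    # residuals of C(t) = [40, 50, 30]
    skewed   = [[10, 100, 70]]   # residuals of C(t) = [90,  0, 30]
    print(var_norm(balanced), var_norm(skewed))   # 100.0 vs. 2100.0
    print(score(skewed, migrated_utils=[0.4], new_pms=0))

As the example shows, the skewed allocation scores higher: it concentrates load and leaves one PM reclaimable, which is exactly what the Var(R(t)) term rewards.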
A. Design Challenges
The general scenario of allocating VMs to physical machines is conceptually close to the classical vector-packing problem [11], making it NP-hard. As is evident from the problem formulation, the number of migrations needs to be minimized, and since the number of physical machines is not fixed, techniques based on relaxation to Linear Programming are not suitable: a solution involving LP would require re-solving the LP at each discrete time instance, and a new solution might require a whole new re-allocation, leading to a high migration cost. Minimizing the number of moves across physical machines (analogous to bins) is still an open research problem. The solution approach must handle dynamic inclusion and exclusion of physical machines to satisfy the constraints. Cost to the customers can then be calculated on the basis of their usage of these PMs, providing greater flexibility at the user level.
The general problem, as represented by Equation (2), is NP-hard. In a practical setting, such as the consolidated server environment introduced in Section II, we would like to implement a heuristic (PTAS) algorithm that can be executed online to address SLA problems arising from over-utilization of resources. The section below presents such an algorithm, which outlines actions that can be performed to optimize each of the components separately.
`
B. Algorithm
Assume that PM1, PM2, ..., PMm are the physical machines and VMij is the jth virtual machine on PMi. An initial allocation is already provided, and the proposed dynamic management algorithm (DMA) focuses on the dynamic part. For each VM and its host physical machine the utilizations are monitored, where utilization covers the observed metrics like CPU, memory, etc. For each physical machine PMi we maintain a list of all the virtual machines allocated to it in non-decreasing utilization order, i.e., the first virtual machine, VMi1, has the lowest utilization. Since migration cost is calculated directly from utilization, another way to look at this order is that “virtual machines are ordered according to their migration costs within each physical machine”. For each physical machine we calculate and store its residual capacity ri(t); additionally, we maintain the list of residual capacities in non-decreasing order of l2 norm (i.e., the magnitude of the vector). Without loss of generality we can represent the VMs as shown in Figure 3.

[Figure 3: The VMs on each PM are ordered with respect to their migration costs, Mc(VMi1) < Mc(VMi2) < ... ; the height of a VM represents its utilization.]

The constraints, as indicated in Equation (2), are constantly monitored and any violation of these constraints triggers the algorithm to
`
`
`
`
`
`
`
`
`
perform re-allocation. Because the resource variations are not predictable, the times at which the algorithm runs are not pre-determined. Since our problem setting dictates minimum residual space, we keep lower bounds on utilization as well: if the utilization of one or more physical machines falls below a low-water mark, the same algorithm is triggered, but for garbage collection, i.e., reclaiming the physical machine by emptying it of VMs. Assume that for physical machine PMk the constraints are violated; hence a VM from PMk must be migrated. We use the residual capacity list to decide the destination physical machine.
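The selection loop described next can be sketched as follows: the lowest-utilization (cheapest-to-migrate) VM is moved to the PM with the least residual capacity that can still hold it, repeating until enough load has been shed. Scalar utilizations, unit PM capacity, and the function name rebalance are our simplifying assumptions.

    import bisect

    def rebalance(source_vms, residuals, overload):
        """Shed `overload` utilization from a violating PM.

        source_vms: VM utilizations on that PM, ascending (migration-cost order).
        residuals:  residual capacities of the other PMs, ascending (the l2-norm
                    order collapses to plain magnitude in one dimension).
        """
        moves = []
        while overload > 0 and source_vms:
            vm = source_vms.pop(0)                   # cheapest VM to migrate
            i = bisect.bisect_left(residuals, vm)    # O(log m): smallest fit
            if i < len(residuals):
                dest = residuals.pop(i)
                bisect.insort(residuals, dest - vm)  # re-order updated residual
                moves.append((vm, "existing PM"))
            else:                                    # nothing fits: new PM
                bisect.insort(residuals, 1.0 - vm)   # assumes unit capacity
                moves.append((vm, "new PM"))
            overload -= vm
        return moves

    print(rebalance(source_vms=[0.10, 0.25, 0.40],
                    residuals=[0.15, 0.30], overload=0.20))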
We select the VM from PMk with the lowest utilization (i.e., the least migration cost) and move it to the physical machine with the least residual capacity big enough to hold this VM. The destination physical machine is chosen by searching through the ordered residual capacity list (which requires O(log m) time). After moving VMk1, the residual capacities are re-calculated and the list is re-ordered. Moving VMk1 might not yet satisfy the SLA constraints on PMk, so we repeat this process with the next-lowest-utilization VM, i.e., VMk2, until we satisfy the constraints. Since the destination PM (say PMj) is chosen only if it has enough residual capacity to hold the VM (VMk1), allocating VMk1 to PMj does not violate the SLA constraints on that machine. If the algorithm is unable to find a physical machine with residual capacity big enough to hold the VM, it instantiates a new physical machine and allocates the VM to that machine. As a pre-condition to performing the migration, we compare the sum of residual capacities with the extra utilization needed by physical machine PMk; if the residual capacities are not enough to meet the required extra need of machine PMk, then we introduce a new physical machine and re-check. This process a