Waclawski

(10) Patent No.: US 6,574,587 B2
(45) Date of Patent: *Jun. 3, 2003
`(54) SYSTEM AND METHOD FOR EXTRACTING
`AND FORECASTING COMPUTING
`RESOURCE DATA SUCH AS CPU
`CONSUMPTION USING AUTOREGRESSIVE
`METHODOLOGY
`
(75) Inventor: Anthony C. Waclawski, Colorado
Springs, CO (US)
`
`(73) Assignee: MCI Communications Corporation,
`Washington, DC (US)
`
(*) Notice: This patent issued on a continued prosecution
application filed under 37 CFR 1.53(d), and is subject
to the twenty year patent term provisions of 35 U.S.C.
154(a)(2).

Subject to any disclaimer, the term of this
patent is extended or adjusted under 35
U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.: 09/031,966
`
(22) Filed: Feb. 27, 1998

(65) Prior Publication Data
US 2001/0013008 A1 Aug. 9, 2001
`
(51) Int. Cl.7 .............................. G06F 17/00
(52) U.S. Cl. ........................ 702/186; 702/179
(58) Field of Search ................. 705/1; 709/104,
709/103, 224, 226; 714/1, 4; 702/179, 181,
182, 186
`
`(56)
`
`References Cited
`U.S. PATENT DOCUMENTS
`
5,796,606 A * 8/1998 Spring ................. 702/179 X
5,884,037 A * 3/1999 Aras et al. .............. 709/226
5,956,702 A * 9/1999 Matsuoka et al. .......... 706/22
5,966,509 A * 10/1999 Abe et al. ................ 714/4
`
`FOREIGN PATENT DOCUMENTS
`
WO 99/44112 * 9/1999
`
`OTHER PUBLICATIONS
`
Groschwitz, et al., "A Time Series Model of Long-Term
NSFNET Backbone Traffic", Proceedings of the IEEE Intnl.
Conf. on Communications, pp. 1400-4, May 1994.*
Zhou, "Forecasting Sales and Price for Existing Single-Family
Homes: A VAR Model with Error Correction", Journal of
Real Estate Research, 1997, pp. 155-167.*
Brockwell and Davis, Introduction to Time Series and
Forecasting, Springer-Verlag, 1996, pp. 13-14, 28-29,
31-34.*
Bolot et al., "Performance Engineering of the World Wide
Web: Application to Dimensioning and Cache Design",
Proceedings of the Fifth International Conf. on the WWW,
Paris, France, pp. 1-12, 1996.*
`Primary Examiner—M. Kemper
(57) ABSTRACT
`
`A system and method for extracting and forecasting com-
`puting resource data such as workload consumption of
`mainframe computing resources using an autoregressive
`model. The system and method forecast mainframe central
`processing unit (CPU) consumption with ninety-five percent
`accuracy using historical performance data. The system and
`method also provide an upper ninety-five percent confidence
`level and a lower ninety-five percent confidence level. The
`system and method retrieve performance records from a
`computer platform in one second intervals, statistically
`collapses the one second performance data into fifteen
`minute performance data, statistically collapses the fifteen
`minute performance data into one week performance data,
`and generates a time series equivalent to collecting perfor-
`mance data at one week intervals. The system and method
`ensure that the resulting timeseriesis statistically stationary,
`and applies an autoregressive construct to the time series to
`generale forecast of future CPU utilization, as well as to
`generate reports and graphs comparing actual vs. forecast
`CPU utilization. Because the system and method rely on
`electronically generated empirical historical computer per-
`formance data as an input, they provide a turnkey solution
`to CPU consumption forecasting that can be implemented
`easily by any system network manager.
`
`13 Claims, 5 Drawing Sheets
`
[Representative figure: block diagram of a computer 106 and resource manager 202, with statistical collapsers, a database, a data extractor, a time series analyzer, a time point converter, an autoregressive modeling tool, and a result processor 210.]
`
Google Exhibit 1027
Google v. Valtrus
`
`
`
`
`OTHER PUBLICATIONS
`
Wolski, "Forecasting Network Performance to Support
Dynamic Scheduling Using the Network Weather Service",
The Sixth IEEE International Symposium on High Performance
Distributed Computing 1997 Proceedings, pp. 316-325,
Aug. 1997.*
Prokopenko, "Learning Algorithm for Selection of an
Autoregressive Model for Multi-step Ahead Forecast",
Proceedings of the Third Australian and New Zealand
Conference on Intelligent Information Systems, pp. 47-52,
Nov. 1995.*
Alexopoulos, C., "Advanced Simulation Output Analysis for
a Single System", Winter Simulation Conference Proceedings,
pp. 89-96, Dec. 1993.*
Basu et al., "Time Series Models for Internet Traffic",
INFOCOM '96, Fifteenth Annual Joint Conference of the IEEE
Computer Societies, Networking the Next Generation, v. 2,
pp. 611-620, Mar. 1996.*
Chanda, "Chi-square Goodness of Fit Tests for Strong
Mixing Stationary Processes", abstract, Interim Report,
Aug. 1973.*
Choukri et al., "A General Class of Chi-square Statistics for
Goodness-of-Fit Tests for Stationary Time Series", Proc.
SPIE-The Int'l Soc. Optical Engineering, abstract, Jul.
1994.*
Shimakawa et al., "Acquisition and Service of Temporal
Data for Real-Time Plant Monitoring", Proc. Real-Time
Sys. Symposium, pp. 112-118, Dec. 1993.*
Vis et al., "A Note on Recursive Maximum Likelihood for
Autoregressive Modeling", IEEE Trans. on Signal Processing,
v. 42, n. 10, pp. 2881-2883, Oct. 1994.*
`
`* cited by examiner
`
`
`
[FIG. 1 (Sheet 1 of 5): high-level block diagram of computing platform 100, showing multiple computers coupled to results processors.]
`
`
`
[FIG. 2 (Sheet 2 of 5): detailed block diagram of the computing platform, showing computer 106 with resource manager, statistical collapser #1, database, data extractor, statistical collapser #2, time series analyzer, time point converter, autoregressive modeling tool, and result processor.]
`
[FIG. 3 (Sheet 3 of 5): detailed block diagram showing the mainframe computer, resource manager, performance data records, database, data extractor, and statistical collapsers.]
`
[FIG. 4 (Sheet 4 of 5): flowchart of forecasting process 400: collect one-second performance data; collapse one-second performance data into fifteen-minute data; collapse fifteen-minute data into weekly data; generate statistically stationary time series; apply autoregressive tool to time series.]
`
`
`
[FIG. 5 (Sheet 5 of 5): graph comparing actual and forecast CPU utilization; vertical axis "Percent CPU Busy" (0 to 130), horizontal axis "Data".]
`
`
`
`SYSTEM AND METHOD FOR EXTRACTING
`AND FORECASTING COMPUTING
`RESOURCE DATA SUCH AS CPU
`CONSUMPTION USING AUTOREGRESSIVE
`METHODOLOGY
`
`TECHNICAL FIELD
`
The present invention relates to a computer platform, and
in particular, to a system and method to forecast the
performance of computing resources.
`
`
`BACKGROUND OF THE INVENTION
`
The computing resources of a large business represent a
significant financial investment. When the business grows,
resource managers must ensure that new resources are added
as processing requirements increase. The fact that the
growth and evolution of a computing platform is often rapid
and irregular complicates management efforts. This is
especially true for computing platforms common to banking
institutions and telecommunications companies, for example,
whose computing platforms typically include hundreds of
geographically distributed computers.
To effectively manage the vast resources of a computing
platform and to justify any requests for acquisition of new
resources, managers need accurate forecasts of computing
platform resource performance. However, conventional
forecasting tools may not be adequate for use on computing
platforms. For example, conventional sales performance
forecasting tools, which use linear regression and
multivariable regression to analyze data, commonly factor in
such causal variables as the effect of holiday demand,
advertising campaigns, price changes, etc. Similarly,
pollution forecasting tools typically consider the causal
effect of variations in traffic patterns. As such, using
these tools to forecast computing platform resources may be
problematic because causal parameters generally are
difficult to establish and are unreliable.
`
Other conventional forecasting tools may be limited by
the amount of data they can process. For example, some
forecasting tools may not adequately purge older or
nonessential data. Other forecasting tools may not
appropriately incorporate new data as it becomes available.
Still other forecasting tools may not have the computing
power to perform calculations on large amounts of data.
The limitations of established forecasting tools are
particularly troublesome when forecasting resources in
computing platforms that are expanding or have already been
re-engineered. These computing platforms need a forecasting
system and method that deal appropriately with new data
as well as unneeded data. Moreover, these computing
platforms need a forecasting system and method that augment
causal-based forecasting tools to provide accurate and
reliable forecasts.
`
SUMMARY OF THE INVENTION
`
Presented herein is a system and method to forecast
computing platform resource performance that overcomes
the limitations associated with conventional forecasting
tools. An embodiment applies an autoregressive model to
electronically generated empirical data to produce accurate
and reliable computing platform resource performance
forecasts. An embodiment of the present invention also
statistically collapses large amounts of data, eliminates
unneeded data, and recursively processes new data. The
forecasts are
`
`
compared to actual performance data, which may be
graphically displayed or printed. A specific type of data is
not important for the present invention, and those skilled
in the art will understand that a wide variety of data may
be used in the present invention. For example, the present
invention contemplates any data that may be collected and
verified over time. These data include, for example,
Internet metering data, marketing data on the success or
failure of product offerings, telephone usage patterns, cash
flow analyses, financial data, customer survey data on
product reliability, customer survey data on product
preference, etc.
`The system and method operate within a computing
`platform. In one embodiment, the computing platform may
`be a multiple virtual storage (MVS) computing platform. In
`another embodiment,
`the computing platform may be a
`UNIX computing platform. In other embodiments, the com-
`puting platforms may be disk operating system (DOS)
`computing platforms. Those skilled in the art will appreciate
`that a variety of computing platforms may be used to
`implement the present invention.
`The computing platform includes at least one resource
`whose performance is forecast.
`In one embodiment,
`the
`computing platform resource may be a central processing
`unit (CPU). In another embodiment, the computing platform
`resource may be a memory storage unit.
`In other
`embodiments, the computing platform resource may be a
`printer, a disk, or a disk drive unit. A specific computing
`platform resource is not important for the present invention,
`and those skilled in the art will understand that a number of
`
`resources may be used in the present invention.
`Each resource includes at least one aspect. The aspect
`may be a performance metric. The performance metric may
`be resource utilization. “Utilization” is defined generally
`herein as the percentage that a particular computing platform
`resource is kept busy. Utilization is often termed “consump-
`tion.”
`
In another embodiment, the performance metric may be
resource efficiency or resource redundancy. "Efficiency" is
defined generally herein as the measure of the useful
portion of the total work performed by the resource.
"Redundancy" is defined generally herein as the measure of
the increase in the workload of a particular resource. Of
course, those skilled in the art will appreciate that a
particular performance metric is not required by the present
invention. Instead, a number of performance metrics may be
used.
`In one embodiment, the computing platform includes a
`resource manager. The resource manager collects perfor-
`mance data from its associated resource. The performance
`data is associated with a performance metric.
`In one
`embodiment,
`the resource manager collects performance
`data representing a CPU utilization performance metric.
The resource manager collects the performance data in
regular intervals. In one embodiment, regular intervals
include one-second intervals, for example. That is, in this
embodiment, the resource manager collects performance
data from its associated computer(s) every second. The
interval size in which performance data is collected may be
determined by the particular use for the performance metric,
the particular resource, the particular computing platform,
etc.
`
`The computing platform also includes a plurality of
`statistical collapsers that statistically collapse the perfor-
`mance data into a series. In one embodiment, the series may
`be a time series representing a performance metric. A “time
`series” is defined generally herein as any ordered sequence
`of observations. Each observation represents a given point in
`
`
`
`time and is thus termed a “time point.” Accordingly, a time
`series includes at least one time point.
A first statistical collapser generates a first time series
representing a performance metric as though its associated
performance data had been collected at a first interval. The
first time series includes a first set of time points. In one
embodiment, the first statistical collapser generates a time
series representing a performance metric as though its
associated performance data had been collected in
fifteen-minute intervals. Accordingly, the time series
includes four time points for each hour. In another
embodiment, the first statistical collapser generates a time
series representing a performance metric as though its
associated performance data had been collected hourly.
Accordingly, the time series includes one time point for
each hour. It will be understood by persons skilled in the
relevant art that the present invention encompasses
statistical collapsers that generate time series representing
performance metrics as though their associated performance
data had been collected at any of a variety of suitable
intervals. The interval size and corresponding number of
time points generated by the first statistical collapser may
be determined by the particular use for the performance
metric, the particular resource, the particular computing
platform, etc.
`The computing platform also includes a database that
`stores data. In one embodiment, the database stores the time
`series representing the performance metric as though its
`associated performance data had been collected at fifteen-
`minute intervals.
`
The computing platform also includes a data extractor to
extract data from the database. According to one
embodiment, the data extractor extracts from the database
the time series representing the performance metric as
though its associated performance data had been collected at
fifteen-minute intervals.
`
The computing platform also includes a second statistical
collapser. The second statistical collapser statistically
collapses the first time series, producing a second time
series. The second time series includes a second set of time
points. In one embodiment, the second statistical collapser
statistically collapses the fifteen-minute time series into a
one-week time series. That is, the second statistical
collapser generates a time series representing a performance
metric as though its associated performance data had been
collected weekly. Accordingly, the time series includes
approximately four time points for each month. In another
embodiment, the second statistical collapser generates a
time series representing a performance metric as though its
associated performance data had been collected daily. The
corresponding time series includes approximately thirty time
points for each month. It will be understood by persons
skilled in the relevant art that the second statistical
collapser may generate time series representing a
performance metric as though its performance data had been
collected at any of a variety of suitable intervals. As
described above with reference to the first statistical
collapser, the interval size and corresponding number of
time points generated by the second statistical collapser
may be determined by the particular use for the performance
metric, the particular resource, the particular computing
platform, etc.
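The two-stage statistical collapse described above can be sketched as plain averaging. This is a minimal illustration only: the helper names and the use of simple means are assumptions, since the patent does not prescribe a particular statistic or implementation language.

```python
def collapse(samples, group_size):
    """Average consecutive groups of `group_size` samples into one time point."""
    return [
        sum(samples[i:i + group_size]) / group_size
        for i in range(0, len(samples) - group_size + 1, group_size)
    ]

def collapse_to_weekly(one_second_data):
    """Collapse one-second samples into fifteen-minute points, then weekly points."""
    fifteen_minute = collapse(one_second_data, 900)    # 15 minutes = 900 seconds
    weekly = collapse(fifteen_minute, 7 * 24 * 4)      # 672 fifteen-minute points per week
    return weekly
```

A day of one-second data would collapse to 96 fifteen-minute points, and a full week of fifteen-minute points collapses to the single weekly time point from which the forecast time series is built.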
The computing platform also includes a time series
analyzer to determine whether the second time series is
statistically stationary. The time series analyzer uses a
plurality of chi-square tests to make this determination.
The time series analyzer also evaluates autocorrelation
statistics and autocovariance statistics. If the time series
analyzer determines that the time series is statistically
nonstationary, which is likely the case, then the time series
analyzer converts the statistically nonstationary time series
to a statistically stationary time series by differencing
each time point in the time series. The statistically
stationary time series now represents the differenced values
of performance data.
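The conversion step just described, first-order differencing, can be illustrated as follows. This is a hedged sketch: the lag-1 autocorrelation shown is only one simple diagnostic, not the full battery of chi-square, autocorrelation, and autocovariance tests the analyzer applies.

```python
def difference(series):
    """Replace each time point with its change from the previous point."""
    return [b - a for a, b in zip(series, series[1:])]

def lag1_autocorrelation(series):
    """Lag-1 autocorrelation, one simple stationarity diagnostic."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return cov / var
```

A steadily trending series (strongly autocorrelated, hence nonstationary) differences into a series of roughly constant increments, which is far closer to stationary.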
`
The computing platform also includes a time point
converter. If the time series is already statistically
stationary, or after the time series analyzer converts the
time series to statistical stationarity, the time point
converter applies a statistical data set to the time series.
Recall that the time series represents the performance
metric as though its associated performance data had been
collected from the computing platform at regular intervals.
As such, the time series includes information indicating the
time that the performance data was collected. In one
embodiment, this information includes a date/time stamp.
That is, each data point in the time series includes a
date/time stamp. The statistical data set converts each
date/time stamp in the time series into a value representing
a decimal number equivalent to the date/time stamp.
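The date/time-stamp conversion can be sketched as follows. The epoch choice and the function name are assumptions made for illustration; the patent requires only that each stamp map to an equivalent decimal value.

```python
from datetime import datetime

# Reference epoch is an arbitrary assumption for this sketch.
EPOCH = datetime(1998, 1, 1)

def stamp_to_decimal(stamp):
    """Convert a date/time stamp into decimal days elapsed since EPOCH."""
    delta = stamp - EPOCH
    return delta.days + delta.seconds / 86400.0
```

For example, noon on Jan. 2, 1998 converts to the decimal value 1.5 under this epoch.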
One feature of the present invention is an autoregressive
modeling tool, which is applied to the converted time series
to forecast a particular aspect of the computing platform.
The autoregressive modeling tool is chosen by calculating
autocorrelation, inverse autocorrelation, and partial
autocorrelation functions, and by comparing these functions
to theoretical correlation functions of several
autoregressive constructs. In particular, one embodiment
applies a first-order mixed autoregressive construct, such
as an autoregressive moving average (ARMA) construct, to the
differenced time series. Another embodiment applies an
autoregressive integrated moving average (ARIMA) construct
to the differenced time series. In the embodiment where the
performance metric is resource utilization and the resource
is a CPU, the resulting autoregressive modeling tool
reliably forecasts CPU consumption with a ninety-five
percent accuracy, provides an upper ninety-five percent
confidence level, and provides a lower ninety-five percent
confidence level. Conventional systems and methods that rely
on linear regression or multivariable regression techniques
may carry a lower confidence level.
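As a stand-in for the modeling step, a first-order autoregressive fit with upper and lower ninety-five percent levels might look like the sketch below. Everything here is an illustrative assumption: the patent's embodiment uses a full ARMA/ARIMA modeling tool, whereas this sketch fits a simple AR(1) coefficient from the lag-1 autocorrelation and uses 1.96-sigma bounds on the one-step forecast.

```python
import math

def fit_ar1(series):
    """Return (mean, phi): AR(1) coefficient from the lag-1 autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return mean, cov / var

def forecast_ar1(series):
    """One-step forecast with lower/upper 95% bounds (1.96 * residual std)."""
    mean, phi = fit_ar1(series)
    pred = mean + phi * (series[-1] - mean)
    residuals = [
        series[i + 1] - (mean + phi * (series[i] - mean))
        for i in range(len(series) - 1)
    ]
    sigma = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return pred - 1.96 * sigma, pred, pred + 1.96 * sigma
```

The returned triple corresponds to the lower ninety-five percent level, the point forecast, and the upper ninety-five percent level described above.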
`
Another feature of the present invention is that it uses
empirical data as inputs to the autoregressive modeling
tool. Using empirical data rather than causal variables
provides more accurate forecasts. In the embodiment where
the performance metric is resource utilization and the
resource is a central processing unit, the empirical data is
actual historical performance data, including logical CPU
utilization information as well as physical CPU utilization
information. Moreover, the system and method generate
recursive forecasts whereby actual future performance data
is fed back into the autoregressive modeling tool to
calibrate the autoregressive modeling tool.
The computing platform includes a results processor,
which generates graphical representations of a performance
metric. The results processor also generates information for
use in written reports that document the results of the
forecasting process. The graphical and textual
representations demonstrate the greater accuracy and
reliability the present invention provides over conventional
forecasting systems and methods.
In one embodiment, the results processor may be a
graphical display unit, such as a computer display screen.
In another embodiment, the results processor may be a textual
`
`
`display unit, such as a printer. In the embodiment where the
`performance metric is resource utilization and the resource
`is a central processing unit, the results processor produces
`reports and graphical representations of comparisons of
`actual CPU utilization with CPU utilization forecasts.
`
`Further features and advantages of the present invention
`as well as the structure and operation of various embodi-
`ments are described in detail below.
`
BRIEF DESCRIPTION OF THE FIGURES
`
The present invention is best understood by reference to
the figures, wherein references with like reference numbers
indicate identical or functionally similar elements. In
addition, the left-most digits refer to the figure in which
the reference first appears in the accompanying figures in
which:
FIG. 1 is a high-level block diagram of a computer
platform suitable for use in an embodiment of the present
invention;
FIG. 2 is a more detailed depiction of the block diagram
of the computer platform of FIG. 1;
FIG. 3 is a more detailed depiction of the block diagram
of the computer platform of FIG. 2;
FIG. 4 shows a flowchart of a forecasting process suitable
for use in an embodiment of the present invention; and
FIG. 5 graphically depicts the comparisons of actual CPU
utilization with CPU utilization forecasts which may be
produced by one embodiment of the present invention.
`DETAILED DESCRIPTION OF THE
`INVENTION
`
A computer platform, and in particular, a system and
method for forecasting computer platform resource
performance is described herein. In the following
description, numerous specific details, such as specific
statistical symbols and relationships, specific methods of
analyzing and processing computer performance data, etc.,
are set forth in order to provide a full understanding of
the present invention. One skilled in the relevant art,
however, will readily recognize that the present invention
can be practiced without one or more of the specific
details, or with other methods, etc. In other instances,
well-known structures or operations are not shown in detail
in order to avoid obscuring the present invention.
For illustrative purposes, embodiments of the present
invention are sometimes described with respect to a system
and method for forecasting computer platform resource
performance. It should be understood that the present
invention is not limited to these embodiments. Instead, the
present invention contemplates any data that may be
collected and verified over time. These data may include,
for example, Internet metering data, marketing data on the
success or failure of product offerings, telephone usage
patterns, cash flow analyses, financial data, customer
survey data on product reliability, customer survey data on
product preference, etc.
`I. EXAMPLE ENVIRONMENT
`
FIG. 1 is a high-level block diagram of a computing
platform 100 suitable for implementing an embodiment of
the present invention. In this embodiment, the computer
platform 100 is a multiple virtual storage (MVS) platform
available from International Business Machines (IBM), or an
equivalent platform available from Amdahl and Hitachi Data
Systems. In another embodiment, the computing platform
100 may be a UNIX computing platform. In other
embodiments, the computing platform 100 may be a disk
`
operating system (DOS) or a personal computer disk
operating system (PC-DOS) computing platform. Those skilled
in the art will appreciate that a variety of computing
platforms may be used to implement the present invention.
The computing platform 100 includes a computing network
102. Typically, the computing network 102 may be a
manufacturing facility, a telecommunications network, a
multinational corporation, a financial institution, or a
university, for example, that operates in a client-server
environment. In that instance, the computing network 102
may connect "client" systems with "server" systems so that
the server systems may perform a computation, retrieve a
file, or search a database for a particular entry in
response to a request by the client system. It is not
uncommon for the client system to subsequently translate the
response from the server system into a format that a human
can understand.
To illustrate, suppose that the computing network 102
supports a bank. The bank has customer financial records,
including personal bank account information stored in a
large database. The personal bank account database acts as
a server. The bank also provides access to its personal
account database by certain client systems. For example, one
client system may include a large number of point-of-sale
cash registers or gas pump bank card readers. As a customer
with a bank account at the bank attempts to purchase
merchandise or gas using her bank card, the point-of-sale
cash register or gas pump bank card reader accesses the
customer's bank account information stored in the database.
The point-of-sale cash register or gas pump bank card reader
acting as a client system requests a determination from the
bank personal account database of whether the customer has
funds to cover the purchase price. The database responds
accordingly, and the purchase is either authorized or
refused. A particular type of client-server environment is
not essential to the present invention. It will be apparent
to those skilled in the art that the exemplary embodiment
may be implemented in other client-server environments, such
as an airline flight reservation system, a mail-order
facility, etc.
In one embodiment, the computing network 102 includes
a plurality of computers 106, as represented by computers
106a-106d. For ease of explanation, however, the various
embodiments generally are described with respect to only
one computer 106. Moreover, although an embodiment is
sometimes described in the context of a large complex of
distributed computers, the present invention is not limited
to this embodiment. For example, the computers 106 may be
arranged in a local area network (LAN) configuration in a
building or in a group of buildings within a few miles of
each other. Alternatively, the computers 106 may be located
in a wide area network (WAN) configuration, wherein the
computers 106 are linked together but geographically
separated by great distances. The computers 106 may also be
stand-alone devices not necessarily in communication with
each other. The computer 106 in one embodiment is a
mainframe computer available from IBM or an equivalent
mainframe computer available from Amdahl and Hitachi Data
Systems. Alternatively, the computer 106 may be a
high-performance workstation. Alternatively still, the
computer 106 may be a personal computer.
The computing platform 100 includes at least one
resource. In one embodiment, the computing platform
resource may be a central processing unit (CPU). In another
embodiment, the computing platform resource may be a
memory storage unit. In other embodiments, the computing
platform resource may be a printer, a disk, or a disk drive
unit. While a specific computing platform resource is not
important for the present invention, those skilled in the art
`
`
`
`will understand that any number of resources can be used in
`the present invention.
Each resource includes at least one aspect. The aspect
may be a performance metric. In one embodiment, the
performance metric may be resource utilization. Utilization
is the measure of the percentage that a particular computing
platform resource is kept busy, and is sometimes termed
consumption. In another embodiment, the performance metric
may be resource efficiency, which is defined as the measure
of the useful portion of the total work performed by the
resource. In another embodiment, the performance metric may
be resource redundancy, which is defined as the measure of
the increase in the workload of a particular resource. Of
course, those skilled in the art will appreciate that a
particular performance metric is not required by the present
invention. Instead, the present invention supports any of a
number of performance metrics.
FIG. 2 is a more detailed block diagram of the computing
platform 100 according to one embodiment. As illustrated,
each computer 106 includes a resource manager 202. Each
resource manager 202 collects performance data from its
associated resource. The performance data is associated with
a performance metric. According to one embodiment, the
resource manager 202 is a resource management facility
(RMF) available with the multiple virtual storage (MVS)
operating system that is running on the IBM mainframe
computer as noted above or an equivalent mainframe computer
available from Amdahl and Hitachi Data Systems. According to
this embodiment, the resource manager 202 extracts
historical performance data from a processor
resource/systems manager (PR/SM) (not shown) of the
computer 106. This historical computer performance data
represents the CPU utilization and is equivalent to
performance metering data obtained by real-time monitors.
Thus, the CPU utilization information collected by the
resource manager 202 are CPU utilization records that
contain CPU activity measurements.
The resource manager 202 collects the performance data
from the computer 106 at regular intervals. According to an
exemplary embodiment, the regular intervals are one-second
intervals. That is, according to the exemplary embodiment,
the resource manager collects CPU workload performance
data every second from computer 106. In this way, the
resource manager 202 provides the percent busy for each
computer 106 each second in time. The interval size in
which performance data is collected may be determined by
the particular use of the performance metric, the particular
resource, the particular computing platform, etc.
Because the computers 106 typically are maintained by
large entities, the amount of data collected usually is
quite large. Consequently, the data must be reduced to a
manageable level. Statistically collapsing the one-second
records generated by the resource manager 202 serves this
purpose. The computing platform 100 thus also includes a
plurality of statistical collapsers that statistically
collapse the performance data into time series representing
a performance metric. A "time series" is defined herein
generally as any ordered sequence of observations. Each
observation represents a given point in time and is thus
termed a "time point." A statistical collapser averages a
series of time points and generates a time series
representing a performance metric as though its associated
performance data had been collected at a particular
interval. The resulting time series contains a set of time
points commensurate with the representative collection
interval.
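The averaging collapser just described might be sketched as follows. The record layout, the names, and the use of a plain mean are assumptions made for illustration.

```python
def collapse_records(records, interval_seconds):
    """Group (epoch_second, percent_busy) records into fixed intervals and
    emit one averaged time point per interval, labeled by interval start."""
    buckets = {}
    for t, busy in records:
        key = t - (t % interval_seconds)   # start of the enclosing interval
        buckets.setdefault(key, []).append(busy)
    return [(start, sum(v) / len(v)) for start, v in sorted(buckets.items())]
```

With a 900-second interval, the per-second percent-busy records for each quarter hour collapse into a single fifteen-minute time point, reducing the data volume by nearly three orders of magnitude.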
`According to one embodiment, the computing platform
`100 includes a statistical collapser 204 that statistically
`
`
`
`
collapses the performance data collected by the resource
manager 202 into a time series. The statistical collapser
204 generates a time series representing performance data as
though it had been collected at fifteen-minute intervals.
Accordingly, the time series would include four time points
for each hour. In another embodiment, the first statistical
collapser generates a time series representing a performance
metric as