WO2004059477A1 - A system and method for resource usage prediction in the deployment of software applications - Google Patents
A system and method for resource usage prediction in the deployment of software applications
- Publication number
- WO2004059477A1 (PCT/US2002/041546)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- parameter
- statistic
- deployment
- accordance
- values
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/61—Installation
Definitions
- the present invention relates to a system and method for predicting the resources required for the deployment of a software application.
- Deploying new software or upgrading existing software to a newer version can often disrupt a computer system, which might need to be shut down and restarted as part of the deployment process and/or might experience degradation of performance. It would be of use to the deployer (the person deploying software applications in a computer system) to know in advance the expected duration of such interruptions and system performance degradation. Deployment tools (typical installation programs) do not provide sufficient information to the deployer to make an informed decision regarding whether to carry out the deployment or to alter his/her deployment plan before deploying in order to minimize the impact on the users of the system.
- the present invention provides a method of predicting a quantity of a resource required for the deployment of a software application on a computing system, comprising the steps of providing historical resource utilisation data for deployment of software applications on computing systems, providing a value for a parameter of the computing system relevant to resource utilisation, providing a value for a parameter of the software application relevant to resource utilisation, and utilising the historical resource utilisation data and parameter values to predict the quantity of the resource required for deployment of the software application.
- the present invention essentially utilises the historical data relating to resources required for deployment of software applications to predict a resource required for future deployment of a software application.
- the historical resource utilisation data includes parameter values of the computing systems and parameter values of the software applications historically deployed. It also preferably includes values of the quantities of resources used in the historical deployments (termed herein statistics).
- a parameter will be understood to mean a feature or characteristic of the configuration of the computing system, such as, for example, the amount of Random Access Memory in the computing system, or a feature or characteristic of the software application, such as the size of the software application.
- a statistic will be understood to mean a quantity of a specific resource required to perform a task, such as, for example, the time it may take for the software application to be deployed.
- the historical resource utilisation data includes at least two parameter/statistic pairs for historical deployments.
- a relationship between the parameter and statistic pairs is derived, wherein the resultant relationship may be utilised to predict a statistic for any parameter value.
- the relationship between the parameter and statistic pairs is derived by applying a statistical model to the parameter/statistic pairs.
- the method comprises the further step of obtaining $m_n$ different values for each parameter $P_n$, and further obtaining at least $m_1 m_2 \ldots m_n$ values of a statistic for each distinct combination of parameter values, where $m_1 m_2 \ldots m_n$ represents the product of the values $m_1, m_2, \ldots, m_n$.
- the relationship between the statistic and the parameter or n parameters is determined by assuming that the relationship between the parameter/statistic pairs takes the form of a linear relationship.
- the equation of the linear relationship is calculated using co-ordinate geometry.
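As an illustration of the co-ordinate geometry step, the following minimal sketch derives the line through two parameter/statistic pairs and evaluates it at a new parameter value. The function names and the sample figures are hypothetical and are not taken from the source.

```python
def line_through(p1, s1, p2, s2):
    """Return (slope, intercept) of the line through (p1, s1) and (p2, s2)."""
    if p1 == p2:
        raise ValueError("the two parameter values must differ")
    slope = (s2 - s1) / (p2 - p1)
    intercept = s1 - slope * p1
    return slope, intercept


def predict_statistic(p, p1, s1, p2, s2):
    """Estimate the statistic at parameter value p from two measured pairs."""
    slope, intercept = line_through(p1, s1, p2, s2)
    return slope * p + intercept


# Hypothetical pairs: 50 MB of free RAM took 120 s, 150 MB took 80 s.
estimate = predict_statistic(100, 50, 120.0, 150, 80.0)
```

With these assumed pairs the estimate for 100 MB lies midway between the two measurements, at 100 seconds.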
- the mathematical model takes the form:
- a computer resource may encompass any hardware or software resource, such as a CPU, volatile or non-volatile memory, the number of processors, the operating system or other software packages, or any other suitable resource.
- the present invention preferably provides a number of advantages. Firstly, the invention allows a system administrator or deployer to calculate an estimate of the amount of time needed to deploy an application. In environments running mission critical applications, the amount of "down time" is an important consideration when deciding to upgrade software. A system administrator needs to be able to predict, with reasonable accuracy, the amount of "up time" that will be lost in deploying an application, as it is commonly necessary to make other arrangements (e.g. letting users know in advance when the system will be down, transferring the load to another server, etc.).
- an embodiment of the present invention allows a system administrator to provide estimates for different computing systems with different resources.
- Many large corporations run a mixture of different machines, with different resources, different architectures, and different operating systems.
- a system administrator can preferably plan and more efficiently deploy system resources if an estimate of deployment time can be provided for each different system.
- an application developer may know how much time will be taken for an application to deploy, as this may allow the application developer to incorporate changes into the application to make the deployment process more efficient. For example, if an application developer finds that an application deployment time is appreciably increased when a system has little free memory, the application developer may reconfigure or tweak the deployment process to use less volatile memory.
- the present invention provides a computing system arranged to facilitate the prediction of resources required for the deployment of a software application, comprising a database arranged to provide historical resource utilisation data for deployment of software applications on computing systems, means for providing a value for a parameter of the computing system relevant to resource utilisation, and a value for a parameter of the software application relevant to resource utilisation, and computation means arranged to utilise the historical resource utilisation data and parameter values to predict the quantity of the resource required for deployment of the software application.
- a computer program arranged, when loaded on a computing system, to control the computing system to implement the method provided in the first aspect of the invention.
- a computer readable medium providing a computer program in accordance with a third aspect of the invention.
- the present invention provides a method for building a model for use in the prediction of resources required for the deployment of a software application, the method comprising the steps of collecting historical resource utilisation data for deployment of software applications on computing systems, and storing the historical resource usage data.
- the present invention provides a model comprising historical resource utilisation data for deployment of software applications on computing systems, the data being stored in a database.
- Figure 1 illustrates a general-purpose computer that may be used to implement the present invention
- Figure 2 is a logical diagram of the components of a computer system that may utilize the present invention
- Figure 3 is a flowchart showing the steps for creating an Impact Analysis Model and predicting the resource usage during deployment
- Figure 4 is a diagram showing a possible graphical representation of the relationship between an attribute of the computer system and the quantity of a resource used during deployment
- Figure 5 is a diagram showing an example of a binary tree that can be used to estimate the quantity of a resource used during deployment based on more than one attribute
- Figure 6 is a flowchart showing the steps of a method that can be used to estimate the resource quantity used during a deployment based on more than one attribute
- Figure 7 is a flowchart showing the steps for updating the Impact Analysis Model to improve the accuracy of the prediction of resource usage for a particular computer system.
- In FIG. 1 there is shown a schematic diagram of a computing system 10 suitable for use with an embodiment of the present invention.
- the computing system 10 may be used to execute applications and/or system services such as deployment services in accordance with an embodiment of the present invention.
- the computing system 10 preferably comprises a processor 12, read-only memory (ROM) 14, random access memory (RAM) 16, and input/output devices such as disk drives 18, keyboard 22, mouse 24, display 26, printer 28, and communications link 20.
- the computer includes programs that may be stored in RAM 16, ROM 14, or disk drives 18 and may be executed by the processor 12.
- the communications link 20 connects to a computer network but could be connected to a telephone line, an antenna, a gateway or any other type of communications link.
- Disk drives 18 may include any suitable storage media, such as, for example, floppy disk drives, hard disk drives, CD ROM drives or magnetic tape drives.
- the computing system 10 may use a single disk drive 18 or multiple disk drives.
- the computing system 10 may use any suitable operating system, such as Windows™ or Unix™.
- FIG. 2 is a diagram showing a networked computer system 40 comprising one or more computing systems 10 of Figure 1, networked such that data may be interchanged between them.
- the networked computer system 40 preferably comprises a server 42 arranged to run one or more software applications 44. Data used by the software applications 44 is maintained in one or more databases 50 contained in storage media controlled by one or more of the computers 10.
- the computer system also contains a deployment engine 48 used to deploy software applications to the server.
- the deployment engine 48 may or may not be running on the same server as the deployed software applications.
- An embodiment of the present invention relates to the functionality of the deployment engine 48 to predict the quantities of various resources of the computer system 40 including disk space consumption and time required to perform a deployment of a software application 44 to the server 42, and provide this information to the deployer 46 who will use it to improve decision making regarding deployments.
- Figure 3 is a flowchart detailing the process 60 of constructing an Impact Analysis Model, in accordance with an embodiment of the present invention, in order to implement the functionality of the deployment engine 48 of predicting the amount of various resources required during deployment.
- the model relies on the following definitions.
- a parameter is an independent variable that can be quantified.
- a parameter is a feature of the configuration of the computer system 40 and/or the software application 44 to be deployed that the computer system 40 can be instrumented to measure. Examples of parameters include but are not limited to: • the size of the software application to be deployed measured in kilobytes; • the amount of RAM 16 available on the server 42 measured in megabytes;
- a statistic is a variable assumed to be dependent on one or more parameters that can be measured. Insofar as it relates to this embodiment of the present invention, a statistic is the quantity of a particular resource required during a deployment. Examples of statistics include but are not limited to:
- the example of predicting the time required for deployment will be used throughout the description of this embodiment of the invention.
- the present invention is not limited to prediction of this resource.
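The parameter and statistic definitions above could be modelled as a simple record type. This is only a sketch: the class name `DeploymentRecord`, the helper `pairs`, the dictionary keys and the sample numbers are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass
class DeploymentRecord:
    """One historical deployment.

    parameters: independent variables measured before the deployment.
    statistics: resource quantities measured during the deployment.
    """
    parameters: dict
    statistics: dict


def pairs(records, parameter, statistic):
    """Extract (parameter value, statistic value) pairs from a list of records."""
    return [(r.parameters[parameter], r.statistics[statistic]) for r in records]


# Hypothetical history of two deployments.
history = [
    DeploymentRecord({"app_size_kb": 1024, "free_ram_mb": 50}, {"time_required_s": 90.0}),
    DeploymentRecord({"app_size_kb": 3072, "free_ram_mb": 50}, {"time_required_s": 210.0}),
]
size_time_pairs = pairs(history, "app_size_kb", "time_required_s")
```

Keeping parameters and statistics in separate mappings mirrors the independent/dependent split in the definitions, so a different statistic can be predicted from the same records.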
- the "assumptions" of the model have to be defined at step 64. This involves identifying the parameters that are assumed to influence the value of the statistic to be predicted. In accordance with the example, a possible assumption is that the time required statistic is dependent on only two parameters (for simplicity), the size of the application to be deployed and the amount of RAM 16 available on the server 42. However, more than two parameters may be used in constructing the model without departing from the scope of the invention. There may be provided a system in accordance with an embodiment of this invention that contains "built in" assumptions. That is, a deployer or user may not have the opportunity to choose which independent variables are to be used in the estimation of the statistic. Alternatively, in another embodiment, a deployer or user may choose one or more from a range of independent variables.
- After the assumptions of the model are defined at step 64, actual data is collected at step 66; this data forms the historical utilisation data. The data establishes the relationship between the parameters defined in step 64 and the statistic being predicted. Step 66 thus involves performing many deployments. For each deployment, the parameters of the specific configuration (in the example, the size of the application to be deployed and the amount of RAM available) are measured prior to performing the deployment, and the statistic of interest (in the example, the time taken to deploy) is measured during the deployment. This data is then stored in permanent storage, for instance a relational database, to be retrieved when required for the prediction of the statistic. The volume and scope of data collected in step 66 depend on the method used in step 70 to predict the statistic and on the accuracy level required.
- the deployment service could include an internal counter or clock, which counts the time elapsed since the beginning of deployment.
- the deployment could be monitored by internal counters and/or clocks that form part of the computing system or operating system.
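A minimal sketch of the data collection in step 66 follows, assuming an SQLite database as the relational store and a wall-clock timer as the internal clock. The table layout, the function names and the `deploy` callable are hypothetical stand-ins, not part of the source.

```python
import sqlite3
import time


def open_history(path=":memory:"):
    """Open (or create) the store of historical deployment data."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deployments ("
        "app_size_kb REAL, free_ram_mb REAL, time_required_s REAL)"
    )
    return conn


def timed_deployment(conn, app_size_kb, free_ram_mb, deploy):
    """Run deploy(), time it, and store the parameters with the measured statistic.

    The parameter values are assumed to have been measured before this call.
    """
    start = time.perf_counter()
    deploy()  # hypothetical callable that performs the actual deployment
    elapsed = time.perf_counter() - start
    conn.execute(
        "INSERT INTO deployments VALUES (?, ?, ?)",
        (app_size_kb, free_ram_mb, elapsed),
    )
    conn.commit()
    return elapsed


conn = open_history()
timed_deployment(conn, 2048, 50, lambda: None)  # no-op stand-in deployment
```

Storing one row per deployment keeps each set of parameter values associated with the statistic measured for that deployment.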
- step 68 can utilize any means necessary to determine the values of the parameters from the current configuration of the computer system. Values of some parameters may be determined by calls to the operating system running on the computer 10. Other parameter values may need to be sampled programmatically. Both step 66 and step 68 assume that the computer system can be instrumented to enable the embodiment of the present invention to collect values of parameters and measure values of statistics. Step 68 is also implicitly carried out in step 66 when deployment statistics are collected, as a set of parameters is collected prior to each deployment and is then associated with the statistic for that deployment. Based on the assumptions in step 64 and the definitions of a parameter and a statistic above, each parameter has a relationship with the statistic. This relationship can be expressed mathematically as a function. In general, if there are $n$ parameters that influence a statistic $S$, the statistic can be expressed as a function $f_k$ of each parameter $P_k$, $1 \le k \le n$, as follows:
- the value of the statistic is to be predicted using the values of the parameters that influence the statistic and that are sampled in step 68. Since the exact relationship between each parameter and the statistic is not known, the value of the statistic based on each parameter can be approximated.
- a sufficient set of data is collected in step 66 in process 60.
- data needs to be collected for at least 2 values of each parameter. If data is to be collected for $m_1$ different values of parameter $P_1$, $m_2$ different values of parameter $P_2$, ..., $m_n$ different values of parameter $P_n$, then the number of statistics that need to be collected in step 66 is $m_1 \times m_2 \times \ldots \times m_n$, one for each of the different combinations of parameter values.
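The count of required statistics can be checked by enumerating the combinations directly. The sketch below uses hypothetical function names and sample values.

```python
from itertools import product
from math import prod


def required_combinations(values_per_parameter):
    """List every combination of parameter values that needs a measured statistic."""
    return list(product(*values_per_parameter))


def required_count(values_per_parameter):
    """Number of statistics to collect: the product of the per-parameter counts."""
    return prod(len(values) for values in values_per_parameter)


# Hypothetical sampling plan: 2 application sizes and 2 amounts of free RAM.
plan = [[1, 3], [50, 150]]
combinations = required_combinations(plan)
```

For two values of each of two parameters this yields four combinations, so four deployments must be measured before prediction is possible.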
- the output of the process 120 (figure 6) is a predicted value of the statistic.
- An (n + 1)-level binary tree of objects is constructed in step 124 according to the following rules. (Note that the top level is 1 and the bottom level is n + 1.)
- a. The root node object is called a Line object. The leaf node objects are called BoundaryValue objects. The other node objects in the tree are called LineBoundaryValue objects.
- b. Each node above the leaf level contains the value of a parameter for which the statistic is to be estimated.
- c. All left child nodes in the tree contain the lower boundary value for the parameter in the parent node.
- d. All right child nodes contain the upper boundary value of the parameter in the parent node.
- e. Each node in the kth level contains a statistic value corresponding to the set of boundary values of parameters represented by nodes on the path from that node to the root node of the tree.
- the values of the statistics in the (n + 1)th level correspond to combinations of values of all n parameters, and are obtained from the data collected in step 66.
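The tree described by the rules above might be built as in the following sketch. The `Node` class, the `build_tree` function, the field names and the sample data are hypothetical; the source names only the three object kinds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    kind: str                           # "Line", "LineBoundaryValue" or "BoundaryValue"
    target: Optional[float] = None      # parameter value to estimate for (non-leaf nodes)
    boundary: Optional[float] = None    # boundary value of the parent's parameter
    statistic: Optional[float] = None   # measured (leaf) or later estimated value
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(targets, bounds, stats, path=()):
    """Build the (n + 1)-level tree for n parameters.

    targets: one value per parameter for which the statistic is wanted.
    bounds:  one (lower, upper) boundary pair per parameter.
    stats:   measured statistic for each tuple of boundary values.
    """
    if not bounds:
        raise ValueError("at least one parameter is required")
    level = len(path) + 1
    if level == len(bounds) + 1:
        # Leaf: statistic measured for this combination of boundary values.
        return Node("BoundaryValue", boundary=path[-1], statistic=stats[path])
    lower, upper = bounds[level - 1]
    node = Node(
        "Line" if level == 1 else "LineBoundaryValue",
        target=targets[level - 1],
        boundary=path[-1] if path else None,
    )
    node.left = build_tree(targets, bounds, stats, path + (lower,))
    node.right = build_tree(targets, bounds, stats, path + (upper,))
    return node


# Hypothetical data for two parameters.
sample_stats = {(1, 50): 100.0, (1, 150): 60.0, (3, 50): 200.0, (3, 150): 120.0}
tree = build_tree((2, 100), [(1, 3), (50, 150)], sample_stats)
```

Each root-to-leaf path spells out one combination of boundary values, which is why the leaf can look up its statistic by that path.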
- the process 120 then moves to the nth level in step 126, which is the second from the last level in the tree.
- In step 128, for each node in the level, the value of the parameter in the node (according to paragraph (b) above), the boundary values in the two child nodes (according to paragraphs (c) and (d) above) and the values of the statistic in the two child nodes (according to paragraph (e) above), together with the linear approximation equation II, are used to find a statistic for that node.
- a statistic value is obtained in the Line object node of the tree (the root node), which is an estimate of the statistic for the set of parameter values obtained in step 68 of process 60 (Figure 3) prior to performing the deployment.
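The bottom-up evaluation can be sketched without explicit node objects by keying each level on the path of boundary values. This is an assumption-laden illustration: the function name, the dictionary representation and the sample numbers are hypothetical, and plain two-point linear interpolation is assumed for the linear approximation step.

```python
def predict(targets, bounds, stats):
    """Estimate a statistic by interpolating one parameter at a time.

    targets: one value per parameter for which the statistic is wanted.
    bounds:  one (lower, upper) boundary pair per parameter.
    stats:   measured statistic for each tuple of boundary values.
    """
    level = dict(stats)  # bottom level: statistics measured in data collection
    for k in range(len(targets) - 1, -1, -1):  # from level n up to level 1
        lower, upper = bounds[k]
        weight = (targets[k] - lower) / (upper - lower)
        parents = {}
        for path in {key[:k] for key in level}:
            s_lower = level[path + (lower,)]  # statistic in the left child
            s_upper = level[path + (upper,)]  # statistic in the right child
            parents[path] = s_lower + (s_upper - s_lower) * weight
        level = parents
    return level[()]  # the value held by the root node


# Hypothetical data: param1 boundaries 1 and 3, param2 boundaries 50 and 150.
sample = {(1, 50): 100.0, (1, 150): 60.0, (3, 50): 200.0, (3, 150): 120.0}
estimate = predict((2, 100), [(1, 3), (50, 150)], sample)
```

With this assumed data the level-2 estimates are 80 and 160, and interpolating between them for a first-parameter value of 2 gives 120.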
- the process 120 depicted in Figure 6 will now be further explained using an example and Figure 5.
- the example will solve the problem of predicting the time required statistic for a deployment based on two parameters: the size of the application to be deployed, param1, and the amount of RAM available in the system, param2.
- the necessary inputs to the process 120 are:
- Table I: Sampled data for the relationship between the parameters param1 and param2 and the time required statistic, as collected in step 66.
- the process 120 begins at step 122.
- the object tree 100 constructed to represent the input data is shown in Figure 5.
- the two child objects hold the boundary values (values of the parameter for which statistics were measured in step 66).
- the Line object 102 holds the first parameter (param1) with the value 2 for which the statistic is to be estimated.
- the two left child objects, the BoundaryValue objects 108 and 112, hold the lower boundary value of 50 for param2 (from the third column of Table I).
- the combination of the boundary values held by the objects in the branch is a combination for which a statistic was measured.
- step 126 in process 120 (Figure 6) is carried out and level 2 is chosen as the current level since there are two parameters.
- In step 128, the LineBoundaryValue objects 104 and 106 in level 2 of the tree 100 are considered in turn.
- statistic1 in the LineBoundaryValue1 object 104 is given by
- This concludes step 128 in process 120, and process 120 moves to step 130. Since the current level is level 1, process 120 ends in step 134.
- the predicted value of the statistic is now in the Line object 102 and is equal to 107.5 seconds.
- the process 120 can be applied to predicting any statistic, not just time, based on any number of parameters, not just two parameters, as described in the example above. Furthermore, the process 120 is only one possible method of implementing step 70 of process 60 in Figure 3, and the present invention is not limited to this implementation of step 70.
- the present invention also relates to the method for improving the accuracy of the predicted statistic by updating the Impact Analysis model 60 to include actual statistics collected as more and more deployments are performed.
- This method 140 is depicted in Figure 7. Beginning at step 142, the resource usage for a deployment is predicted in step 144 by carrying out the steps of the model 60. In step 146, the deployment for which resource usage has been predicted is carried out, and the actual resource usage during the deployment is measured and stored in step 148.
- In step 150, the model 60 is updated to incorporate the statistic from step 148 by combining the actual statistic with the data previously collected and stored in step 66 of the model 60.
- This embodiment of the present invention does not propose a method for performing step
- the method for updating the model 60 ends at step 152.
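The source leaves the combining method open. Purely as an assumption, one possible approach is to keep a running mean of the measured statistic for each combination of parameter values, as sketched below with hypothetical names and numbers.

```python
def update_history(history, parameters, actual_statistic):
    """Return a new history with the actual statistic folded into a running mean.

    history maps a tuple of parameter values to (mean statistic, sample count).
    This averaging rule is an assumption, not a method taken from the source.
    """
    mean, count = history.get(parameters, (0.0, 0))
    count += 1
    mean += (actual_statistic - mean) / count
    updated = dict(history)
    updated[parameters] = (mean, count)
    return updated


# Hypothetical measurements for the same parameter combination.
model = {}
model = update_history(model, (2, 50), 100.0)
model = update_history(model, (2, 50), 110.0)
```

Returning a new mapping rather than mutating the old one keeps earlier predictions reproducible against the data they were made from.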
- the present invention is concerned with the prediction of statistics.
- This embodiment of the present invention uses linear approximation to estimate a statistic, and this estimate is thus subject to the linear approximation error.
- This embodiment of the present invention does not include a method for determining this error. It should be noted that the methods used to implement this embodiment of the present invention could be modified without departing from the scope of the invention.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2002/041546 WO2004059477A1 (en) | 2002-12-27 | 2002-12-27 | A system and method for resource usage prediction in the deployment of software applications |
US10/540,947 US20060075399A1 (en) | 2002-12-27 | 2002-12-27 | System and method for resource usage prediction in the deployment of software applications |
AU2002361877A AU2002361877A1 (en) | 2002-12-27 | 2002-12-27 | A system and method for resource usage prediction in the deployment of software applications |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2002/041546 WO2004059477A1 (en) | 2002-12-27 | 2002-12-27 | A system and method for resource usage prediction in the deployment of software applications |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004059477A1 true WO2004059477A1 (en) | 2004-07-15 |
Family
ID=32679958
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2002/041546 WO2004059477A1 (en) | 2002-12-27 | 2002-12-27 | A system and method for resource usage prediction in the deployment of software applications |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2002361877A1 (en) |
WO (1) | WO2004059477A1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5805898A (en) * | 1995-02-24 | 1998-09-08 | International Business Machines Corporation | Method and apparatus for estimating installation time in a data processing system |
US5901320A (en) * | 1996-11-29 | 1999-05-04 | Fujitsu Limited | Communication system configured to enhance system reliability using special program version management |
US6343312B1 (en) * | 1995-07-14 | 2002-01-29 | Sony Corporation | Data processing method and device |
US6421778B1 (en) * | 1999-12-20 | 2002-07-16 | Intel Corporation | Method and system for a modular scalability system |
-
2002
- 2002-12-27 AU AU2002361877A patent/AU2002361877A1/en not_active Abandoned
- 2002-12-27 WO PCT/US2002/041546 patent/WO2004059477A1/en active Application Filing
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5805898A (en) * | 1995-02-24 | 1998-09-08 | International Business Machines Corporation | Method and apparatus for estimating installation time in a data processing system |
US5960206A (en) * | 1995-02-24 | 1999-09-28 | International Business Machines Corporation | Method and apparatus for estimating installation time in a data processing system |
US6343312B1 (en) * | 1995-07-14 | 2002-01-29 | Sony Corporation | Data processing method and device |
US5901320A (en) * | 1996-11-29 | 1999-05-04 | Fujitsu Limited | Communication system configured to enhance system reliability using special program version management |
US6421778B1 (en) * | 1999-12-20 | 2002-07-16 | Intel Corporation | Method and system for a modular scalability system |
Also Published As
Publication number | Publication date |
---|---|
AU2002361877A1 (en) | 2004-07-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AU US |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | ENP | Entry into the national phase | Ref document number: 2006075399; Country of ref document: US; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 10540947; Country of ref document: US |
| | 122 | Ep: pct application non-entry in european phase | |
| | WWP | Wipo information: published in national office | Ref document number: 10540947; Country of ref document: US |