US20070162602A1 - Template-based approach for workload generation - Google Patents

Template-based approach for workload generation

Publication number
US20070162602A1
Authority
US
United States
Prior art keywords
template
workload
generation
workload generation
defines
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/327,071
Inventor
Kay Anderson
Eric Bouillet
Parijat Dube
Zhen Liu
Dimitrios Pendarakis
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Security Agency
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US11/327,071
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ANDERSON, KAY S, PENDARAKIS, DIMITRIOS, BOUILLET, ERIC P, DUBE, PARIJAT, LIU, ZHEN
Publication of US20070162602A1
Assigned to NATIONAL SECURITY AGENCY reassignment NATIONAL SECURITY AGENCY ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION
Priority to US12/128,959 (US8924189B2)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/06: Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3414: Workload generation, e.g. scripts, playback

Definitions

  • Workload generation is employed for performance characterization, testing and benchmarking of computer systems dealing with processing, forwarding, storing and/or analysis of network traffic.
  • Workload generation typically aims to simulate or emulate traffic generated by different types of applications, protocols and activities.
  • the activities might include email, chat, web browsing and traffic from sensor networks.
  • the sensor networks might include video surveillance sensors, temperature monitoring sensors, and the like.
  • Different approaches have been used for generating the traffic, such as model driven simulations and client-server architectures.
  • Examples of currently available traffic generation tools include commercial products such as LoadRunner, Netpressure, Http-Load, and MegaSIP; and academic prototypes such as SURGE, Wagon, Httperf, Harpoon, NetProbe, D-ITG, MGEN, and LARIAT.
  • the existing workload generation approaches focus primarily on matching predetermined volumetric and timing properties, and ignore statistical properties at the content level, such as content and contextual semantics.
  • Most of the existing approaches for traffic generation are application specific or lack scalability and/or modularity.
  • the traffic generated by these approaches is not suitable for testing and benchmarking systems that analyze data content and make intelligent decisions based on the content.
  • the majority of these tools are not content based or generate only a limited level of content and contextual richness.
  • An exemplary system for workload generation includes a processor for identifying a workload model by determining each of a hierarchy for workload generation, time scales for workload generation, and states and transitions at each of the time scales, and defining a parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes; a user level template unit corresponding to a relatively slow time scale in signal communication with the processor; an application level template corresponding to a relatively faster time scale in signal communication with the processor; a stream level template corresponding to a relatively fastest time scale in signal communication with the processor; and a communications adapter in signal communication with the processor for defining a workload generating unit (WGU) responsive to the template units.
  • a corresponding exemplary method for workload generation includes identifying a workload model by determining each of a hierarchy for workload generation, time scales for workload generation, and states and transitions at each of the time scales; defining a parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes; constructing a template for workload generation wherein the template is a user level template corresponding to a relatively slow time scale, an application level template corresponding to a relatively faster time scale or a stream level template corresponding to a relatively fastest time scale; and defining a workload generating unit (WGU) responsive to the template.
  • the present disclosure teaches a template-based approach for workload generation in accordance with the following exemplary figures, in which:
  • FIG. 1 shows a schematic diagram of a system implementing a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of a network supporting a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure
  • FIG. 3 shows a flow diagram of a method for a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of templates for a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure.
  • the present disclosure provides a template-based approach for workload generation.
  • An exemplary embodiment lays a framework for generating scalable, content and contextually rich traffic in accordance with the template-based approach.
  • a template is a common pattern characterizing the traffic to be generated for different layers, different protocols, different users or different application domains. Templates capture the most pertinent and repetitive patterns of traffic and can be combined in a layered or recursive manner to define complex traffic generation models. In addition, templates contain fields that allow the specification of different application, protocol and network specific attributes of the traffic.
  • the different attributes are parametric and are treated as variables or random variables. By specifying different values or probability distributions for these parameters, the behavior of a wide population of users, applications and network conditions can be captured. Templates can specify underlying distributions and other attributes that define the pattern and behavior of the traffic generating units, where a single unit can be used to generate traffic for either a large or a small class of communicants. This approach has the advantage that it gives complete control over what is generated, including simulating protocols that are not yet well defined, such as sensor networks, network impairments, and the like. Further, templates allow simplified construction of models without recreating full protocol models.
  • Templates are then used to define Workload Generating Units (WGU).
  • Multiple templates can be used to define a single WGU when different templates specify different components of a WGU behavior, or a single template can be used to construct many WGUs with all of the WGUs having the same behavior as specified by the template.
  • a single WGU can be used to generate traffic for either a large or a small class of communicants.
  • the system 100 includes at least one processor or central processing unit (CPU) 102 in signal communication with a system bus 104.
  • a read only memory (ROM) 106, a random access memory (RAM) 108, a display adapter 110, an I/O adapter 112, a user interface adapter 114 and a communications adapter 128 are also in signal communication with the system bus 104.
  • a display unit 116 is in signal communication with the system bus 104 via the display adapter 110.
  • a disk storage unit 118, such as, for example, a magnetic or optical disk storage unit, is in signal communication with the system bus 104 via the I/O adapter 112.
  • a mouse 120, a keyboard 122, and an eye tracking device 124 are in signal communication with the system bus 104 via the user interface adapter 114.
  • a user level template unit 170, an application level template unit 180 and a stream level template unit 190 are also included in the system 100 and in signal communication with the CPU 102 and the system bus 104. While the user level template unit 170, application level template unit 180 and stream level template unit 190 are illustrated as coupled to the at least one processor or CPU 102, these components are preferably embodied in computer program code stored in at least one of the memories 106, 108 and 118, wherein the computer program code is executed by the CPU 102.
  • the network 200 may be a part of a bigger application system, such as when connected in signal communication with the communications adapter of FIG. 1.
  • the network 200 includes two remote servers 209 and 210 connected to client machines performing web requests, and also connected to a local server 208 where a main database and web site are hosted via a network connection 207.
  • the local server 208 includes a database 201, an application server 202 and a web server 203.
  • the remote servers 209 and 210 each include a remote application server 205 and a remote web server 206.
  • the remote server 209 has a remote data cache 204. Requests for dynamic content are received by the remote server and handled by application components hosted inside the remote application server 205. These components issue database queries, which are intercepted by the remote data cache 204 and handled from the remote database, if possible. If the query cannot be handled by the remote database, the remote data cache 204 forwards the request to the local database 201 and retrieves the results from there.
  • the method 300 includes a function block 310 for model identification, which determines the hierarchy of workload generation, the time scales of workload generation, as well as the states and transitions at different scales.
  • the function block 310 passes control to a function block 320 for parameter definition.
  • the function block 320 determines the fields for user specific, application specific, network specific, and content specific attributes, as well as a probability distribution function (PDF) for different attributes.
  • the function block 320 passes control to a function block 330 for template construction, which constructs templates for different scales of workload behavior.
  • the function block 330 passes control to a function block 340, which provides workload generating units.
  • a set of templates for a template-based approach for workload generation is indicated generally by the reference numeral 400.
  • the set includes a user level template 410, an application level template 420, and a stream level template 430.
  • the user level template 410 provides states and transitions, the states including times of day such as 9AM-5PM, morning/noon/evening, and the like, and activities such as email, chat, browsing, telephone, video conferencing, and the like; and the transitions including going from email to chat and the like, and the fraction of time spent in email, chat and the like.
  • the application level template 420 is for any given application, such as chat, for example.
  • the application level template for chat includes states, transitions and parameters applicable to chat.
  • the relevant states include typing, clearing, and sending.
  • the relevant transitions include going from typing to clearing, and the like.
  • the relevant parameters include language, topic, and the relationship between the parties to the chat, for example.
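The chat states and transitions listed above can be pictured as a small probabilistic state machine. The following is only a sketch under stated assumptions: the transition probabilities, the `TRANSITIONS` table and the `walk` helper are hypothetical and do not come from the disclosure, which names the states but gives no numeric values.

```python
import random
from typing import Dict, List

# Hypothetical transition probabilities between the chat states
# named in the text (typing, clearing, sending).
TRANSITIONS: Dict[str, Dict[str, float]] = {
    "typing": {"typing": 0.6, "clearing": 0.1, "sending": 0.3},
    "clearing": {"typing": 1.0},
    "sending": {"typing": 1.0},
}


def walk(start: str, steps: int, rng: random.Random) -> List[str]:
    """Follow the transition table for `steps` moves and return the visited states."""
    state = start
    path = [start]
    for _ in range(steps):
        targets = list(TRANSITIONS[state])
        weights = [TRANSITIONS[state][t] for t in targets]
        state = rng.choices(targets, weights=weights, k=1)[0]
        path.append(state)
    return path


path = walk("typing", 20, random.Random(1))
```

Parameters such as language or topic would be carried alongside the state and handed to the stream level when a "sending" state is reached; that hand-off is not modeled here.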
  • the stream level template 430 is for any given application, such as chat, for example.
  • the stream level template for chat includes parameters applicable to chat.
  • the relevant parameters are the length of the sentences, a text construction model using n-grams, dictionaries for words, biometrics such as typing speed, and the like.
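As an illustration of the stream level parameters just listed, the sketch below draws a sentence length, builds the sentence from a bigram (2-gram) table, and converts a typing speed into a delay. Everything concrete in it is an assumption made for the example: the `BIGRAMS` vocabulary, the uniform length distribution, and the convention of five characters per word used to turn words per minute into seconds.

```python
import random
from typing import Dict, List

# Hypothetical bigram model: each word maps to the words that may follow it.
BIGRAMS: Dict[str, List[str]] = {
    "<s>": ["the", "a"],
    "the": ["server", "report"],
    "a": ["server", "report"],
    "server": ["is", "was"],
    "report": ["is", "was"],
    "is": ["ready", "late"],
    "was": ["ready", "late"],
    "ready": ["and", "but"],
    "late": ["and", "but"],
    "and": ["the", "a"],
    "but": ["the", "a"],
}


def make_sentence(rng: random.Random, min_len: int = 3, max_len: int = 8) -> str:
    """Draw a length (uniform here, as an assumption) and chain bigrams to fill it."""
    length = rng.randint(min_len, max_len)
    words: List[str] = []
    prev = "<s>"
    for _ in range(length):
        prev = rng.choice(BIGRAMS[prev])
        words.append(prev)
    return " ".join(words)


def typing_seconds(text: str, words_per_minute: float) -> float:
    """Time to type `text`, assuming five characters per word."""
    chars_per_second = words_per_minute * 5.0 / 60.0
    return len(text) / chars_per_second


rng = random.Random(3)
text = make_sentence(rng)
delay = typing_seconds(text, 40.0)
```

Swapping the length distribution or the dictionary changes the generated content without touching the generation logic, which is the kind of control the template fields are meant to give.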
  • the workload generation behavior is viewed as the aggregate of correlated behaviors at different time scales. For example, to generate templates for workload generated on the internet due to human activities such as chat, web browsing, VoIP and the like, different time scales of traffic generation are identified and the human behavior and the resulting traffic are modeled in a hierarchical manner.
  • the user level behavioral model is characterized by a slower time scale on the order of minutes to hours; the usage frequencies of the various applications; the fraction of time spent in different applications during the day; the types of applications, such as emails, chat, http and the like; and the number and identification of associates.
  • the application level behavioral model is characterized by a faster time scale on the order of seconds to minutes; dynamics of activities within a session; possible states within an application; and OSI Layer 7 level protocols such as login, handshake, and session closing.
  • the data stream level model is characterized by a very fast time scale on the order of microseconds; content-based attributes such as topic, language, and volumetrics; the codec, such as GSM, MPEG, or MP3; and OSI Layer 2-6 protocols.
  • Templates are created for these three different time-scales of traffic.
  • the template for the slow-time scale session-level behavioral model has fields corresponding to different times of day; different types of applications such as web-browsing, email, and chat, that an individual is involved in; associates with whom an individual interacts; and transitions between different places.
  • the parameters are places, transitions, fraction of time spent before firing a transition and other attributes specific to the types of the places and the transitions.
  • the template at this level will be used to schedule traffic generation units at the fast-time scale. At this level, specifics such as the protocol-level details of the particular applications are relatively unimportant.
  • the template for the fast-time scale application-level behavioral model has fields corresponding to the different possible states an individual can be in within a particular application, such as typing, sending, and clearing in the case of chat, and transitions between these places.
  • the parameters are places, transitions, fraction of time spent before firing a transition and other attributes specific to the type of the place or the transition.
  • the templates at this level will be used to generate data streams that shall constitute the traffic.
  • the streams are generated in compliance with the specific protocol on which the application is running.
  • the data generation templates implement the logic for generating the content according to high-level control parameters passed on by the application level behavioral model.
  • the parameters can be topic, spoken language, dictionaries, noise levels, level of realism, and source if pre-recorded.
  • By specifying the probability distribution functions (PDFs) and dictionaries, the user can control the length of the sentences, stochastic rules for concatenating the words, the language and the various topics during the chat, and biometric characteristics such as typing speed.
  • the content generated by using the templates at this level will be packaged into the appropriate stack of Protocol Data Units (PDU) before writing it to the respective output streams.
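Packaging content into a stack of PDUs amounts to wrapping the payload in one header per protocol layer before it is written out. The sketch below shows that nesting only; the header bytes, the 2-byte length field and the `encapsulate` function are hypothetical framing chosen for the example and do not correspond to any real protocol or to a format given in the disclosure.

```python
from typing import List, Tuple


def encapsulate(payload: bytes, layers: List[Tuple[str, bytes]]) -> bytes:
    """Wrap `payload` in one header per layer; `layers` lists the innermost layer first."""
    pdu = payload
    for _name, header in layers:
        # Hypothetical framing: header, 2-byte big-endian length, then the inner PDU.
        pdu = header + len(pdu).to_bytes(2, "big") + pdu
    return pdu


# Placeholder layer names and headers, for illustration only.
stack = [("app", b"CHAT"), ("transport", b"TCP_"), ("network", b"IP__")]
frame = encapsulate(b"hello", stack)
```

Each layer adds six bytes here, so the five-byte payload becomes a 23-byte frame whose outermost header belongs to the last layer in the list.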
  • these templates can provide the user with the additional ability to control network related attributes such as IP addresses of the parties involved in the chat, TCP parameters such as port numbers, window sequence numbers, ACK, and the like.
  • the method 300 that provides a framework for template-based workload generation highlights the major building blocks of embodiments of the present disclosure.
  • the exemplary templates 410, 420 and 430 are relevant to a workload generation pattern in a corporate environment, where different templates are shown for different scales.
  • this exemplary embodiment identifies different time scales of workload generation and defines templates at these time scales for workload generation in a generic corporate scenario with 9AM-5PM working hours.
  • the templates work for defining workload generation patterns at different time scales in a corporate environment.
  • the template-based approach provides the foundation for building workload generators with important features.
  • the feature of controllability provides for easy orchestration of volumetric and contextual statistics such as protocol mix of generated traffic, time ranges of causal traffic, virtual and network topology attributes, traffic loss and delay characteristics, data source perturbation, tunable levels of accuracy in the data offered to the tested system, and ability to infuse cross-stream correlations.
  • the feature of scalability is achieved since all the traffic is artificially generated.
  • the template-based approach is much more scalable and is not limited by the storage bottlenecks as in the case of client-server approaches for traffic generation.
  • the template-based approach is less dependent on external parameters such as intermittent resource congestions and server availability.
  • the features of modularity and extensibility are attained because the templates for different applications can be built independently using application specific statistical properties. These can be used, in turn, to define or build on the fly independent agents generating traffic for the particular application.
  • the right volumetric mix of traffic from different applications can be easily generated by invoking the right number of these agents, and the right contextual mix can be generated by tuning the contents of the data units generated by these agents.
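One way to read "invoking the right number of these agents" is as a proportional allocation: given a total agent budget and a target share per application, decide how many agents of each kind to start. The sketch below uses largest-remainder rounding so the counts always sum to the budget; the function name, the rounding rule and the example shares are assumptions, not something the disclosure specifies.

```python
from typing import Dict


def agents_for_mix(total_agents: int, mix: Dict[str, float]) -> Dict[str, int]:
    """Split `total_agents` across applications in proportion to `mix`."""
    total_weight = sum(mix.values())
    exact = {app: total_agents * w / total_weight for app, w in mix.items()}
    counts = {app: int(x) for app, x in exact.items()}
    # Hand the agents lost to truncation to the largest fractional remainders.
    leftover = total_agents - sum(counts.values())
    by_remainder = sorted(mix, key=lambda app: exact[app] - counts[app], reverse=True)
    for app in by_remainder[:leftover]:
        counts[app] += 1
    return counts


# Hypothetical target mix for illustration.
counts = agents_for_mix(10, {"email": 0.5, "chat": 0.3, "web": 0.2})
```

The contextual mix is a separate knob: it would be tuned inside each agent through its content parameters rather than through the agent counts.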
  • teachings of the present disclosure may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof. Most preferably, the teachings of the present disclosure are implemented as a combination of hardware and software.
  • the software is preferably implemented as an application program tangibly embodied on a program storage unit.
  • the application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
  • the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interfaces.
  • the computer platform may also include an operating system and microinstruction code.
  • the various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU.
  • various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.
  • the exemplary method for determining how many attributes should be determined may be augmented or replaced with more sophisticated attribute determination techniques.
  • the template-based framework may be incorporated into advanced network support systems that are responsive to multi-modal data, such as numeric data, text data, voice data and video data. All such changes and modifications are intended to be included within the scope of the present disclosure as set forth in the appended claims.

Abstract

A system and method for workload generation include a processor for identifying a workload model by determining each of a hierarchy for workload generation, time scales for workload generation, and states and transitions at each of the time scales, and defining a parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function for each of the attributes; a user level template unit corresponding to a relatively slow time scale in signal communication with the processor; an application level template corresponding to a relatively faster time scale in signal communication with the processor; a stream level template corresponding to a relatively fastest time scale in signal communication with the processor; and a communications adapter in signal communication with the processor for defining a workload generating unit responsive to the template units.

Description

  • GOVERNMENT LICENSE RIGHTS
  • This invention was made with Government support under Contract No. H98230-04-3-0001 awarded by the U.S. Department of Defense. The Government has certain rights in this invention.
  • BACKGROUND
  • Workload generation is employed for performance characterization, testing and benchmarking of computer systems dealing with processing, forwarding, storing and/or analysis of network traffic. Workload generation typically aims to simulate or emulate traffic generated by different types of applications, protocols and activities. For example, the activities might include email, chat, web browsing and traffic from sensor networks. The sensor networks might include video surveillance sensors, temperature monitoring sensors, and the like. Different approaches have been used for generating the traffic, such as model driven simulations and client-server architectures.
  • Examples of currently available traffic generation tools include commercial products such as LoadRunner, Netpressure, Http-Load, and MegaSIP; and academic prototypes such as SURGE, Wagon, Httperf, Harpoon, NetProbe, D-ITG, MGEN, and LARIAT.
  • The existing workload generation approaches focus primarily on matching predetermined volumetric and timing properties, and ignore statistical properties at the content level, such as content and contextual semantics. Most of the existing approaches for traffic generation are application specific or lack scalability and/or modularity. The traffic generated by these approaches is not suitable for testing and benchmarking systems that analyze data content and make intelligent decisions based on the content. The majority of these tools are not content based or generate only a limited level of content and contextual richness.
  • SUMMARY
  • These and other drawbacks and disadvantages of the prior art are addressed by a template-based approach for workload generation.
  • An exemplary system for workload generation includes a processor for identifying a workload model by determining each of a hierarchy for workload generation, time scales for workload generation, and states and transitions at each of the time scales, and defining a parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes; a user level template unit corresponding to a relatively slow time scale in signal communication with the processor; an application level template corresponding to a relatively faster time scale in signal communication with the processor; a stream level template corresponding to a relatively fastest time scale in signal communication with the processor; and a communications adapter in signal communication with the processor for defining a workload generating unit (WGU) responsive to the template units.
  • A corresponding exemplary method for workload generation includes identifying a workload model by determining each of a hierarchy for workload generation, time scales for workload generation, and states and transitions at each of the time scales; defining a parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes; constructing a template for workload generation wherein the template is a user level template corresponding to a relatively slow time scale, an application level template corresponding to a relatively faster time scale or a stream level template corresponding to a relatively fastest time scale; and defining a workload generating unit (WGU) responsive to the template.
  • These and other aspects, features and advantages of the present disclosure will become apparent from the following description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present disclosure teaches a template-based approach for workload generation in accordance with the following exemplary figures, in which:
  • FIG. 1 shows a schematic diagram of a system implementing a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure;
  • FIG. 2 shows a schematic diagram of a network supporting a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure;
  • FIG. 3 shows a flow diagram of a method for a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure; and
  • FIG. 4 shows a schematic diagram of templates for a template-based approach for workload generation in accordance with an illustrative embodiment of the present disclosure.
  • DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
  • The present disclosure provides a template-based approach for workload generation. An exemplary embodiment lays a framework for generating scalable, content and contextually rich traffic in accordance with the template-based approach.
  • In exemplary embodiments, a template is a common pattern characterizing the traffic to be generated for different layers, different protocols, different users or different application domains. Templates capture the most pertinent and repetitive patterns of traffic and can be combined in a layered or recursive manner to define complex traffic generation models. In addition, templates contain fields that allow the specification of different application, protocol and network specific attributes of the traffic.
  • The different attributes are parametric and are treated as variables or random variables. By specifying different values or probability distributions for these parameters, the behavior of a wide population of users, applications and network conditions can be captured. Templates can specify underlying distributions and other attributes that define the pattern and behavior of the traffic generating units, where a single unit can be used to generate traffic for either a large or a small class of communicants. This approach has the advantage that it gives complete control over what is generated, including simulating protocols that are not yet well defined, such as sensor networks, network impairments, and the like. Further, templates allow simplified construction of models without recreating full protocol models.
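To make the idea of parametric fields concrete, the sketch below models a template whose fields are either fixed values or samplers that draw from a distribution, so that one template can yield a varied population. The `Template` class, the field names and the chosen distributions are hypothetical and serve only as an illustration.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, Union

# A field is either a fixed value or a sampler that draws from a distribution.
FieldSpec = Union[object, Callable[[random.Random], object]]


@dataclass
class Template:
    """Hypothetical template: a named set of parametric fields."""

    name: str
    fields: Dict[str, FieldSpec] = field(default_factory=dict)

    def instantiate(self, rng: random.Random) -> Dict[str, object]:
        """Resolve every field: call samplers, copy fixed values as they are."""
        return {k: (v(rng) if callable(v) else v) for k, v in self.fields.items()}


chat_user = Template(
    name="chat_user",
    fields={
        "language": "en",  # plain variable
        "typing_speed_wpm": lambda r: r.gauss(40.0, 8.0),  # random variable
        "topic": lambda r: r.choice(["sports", "work", "travel"]),
    },
)

rng = random.Random(7)
population = [chat_user.instantiate(rng) for _ in range(3)]
```

Each call to `instantiate` yields one concrete set of attribute values, so changing a distribution in the template changes the behavior of every unit derived from it.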
  • Templates are then used to define Workload Generating Units (WGU). Multiple templates can be used to define a single WGU when different templates specify different components of a WGU behavior, or a single template can be used to construct many WGUs with all of the WGUs having the same behavior as specified by the template. In addition, a single WGU can be used to generate traffic for either a large or a small class of communicants.
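The two ways of relating templates to WGUs described above can be sketched as follows. The `Template` and `WGU` classes and the helper functions are hypothetical names introduced for this example; the disclosure does not define a programming interface.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Template:
    level: str  # "user", "application" or "stream"
    params: Dict[str, object]


@dataclass
class WGU:
    """Hypothetical workload generating unit built from one or more templates."""

    behavior: Dict[str, Dict[str, object]]


def wgu_from_templates(templates: List[Template]) -> WGU:
    """Several templates, each supplying one component of a single unit's behavior."""
    return WGU({t.level: dict(t.params) for t in templates})


def many_wgus(template: Template, count: int) -> List[WGU]:
    """One template stamped out into `count` units that all behave the same."""
    return [wgu_from_templates([template]) for _ in range(count)]


user = Template("user", {"hours": "9AM-5PM"})
chat = Template("application", {"states": ["typing", "clearing", "sending"]})
combined = wgu_from_templates([user, chat])
fleet = many_wgus(chat, 4)
```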
  • As shown in FIG. 1, a system implementing a template-based approach for workload generation, according to an illustrative embodiment of the present disclosure, is indicated generally by the reference numeral 100. The system 100 includes at least one processor or central processing unit (CPU) 102 in signal communication with a system bus 104. A read only memory (ROM) 106, a random access memory (RAM) 108, a display adapter 110, an I/O adapter 112, a user interface adapter 114 and a communications adapter 128 are also in signal communication with the system bus 104. A display unit 116 is in signal communication with the system bus 104 via the display adapter 110. A disk storage unit 118, such as, for example, a magnetic or optical disk storage unit, is in signal communication with the system bus 104 via the I/O adapter 112. A mouse 120, a keyboard 122, and an eye tracking device 124 are in signal communication with the system bus 104 via the user interface adapter 114.
  • A user level template unit 170, an application level template unit 180 and a stream level template unit 190 are also included in the system 100 and in signal communication with the CPU 102 and the system bus 104. While the user level template unit 170, application level template unit 180 and stream level template unit 190 are illustrated as coupled to the at least one processor or CPU 102, these components are preferably embodied in computer program code stored in at least one of the memories 106, 108 and 118, wherein the computer program code is executed by the CPU 102.
  • Turning to FIG. 2, an exemplary network embodiment is indicated generally by the reference numeral 200. The network 200 may be a part of a bigger application system, such as when connected in signal communication with the communications adapter of FIG. 1. The network 200 includes two remote servers 209 and 210 connected to client machines performing web requests, and also connected to a local server 208 where a main database and web site are hosted via a network connection 207. The local server 208 includes a database 201, an application server 202 and a web server 203.
  • The remote servers 209 and 210 each include a remote application server 205 and a remote web server 206. The remote server 209 has a remote data cache 204. Requests for dynamic content are received by the remote server and handled by application components hosted inside the remote application server 205. These components issue database queries, which are intercepted by the remote data cache 204 and handled from the remote database, if possible. If the query cannot be handled by the remote database, the remote data cache 204 forwards the request to the local database 201 and retrieves the results from there.
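  • The query path through the remote data cache 204 can be sketched as below. The class name, the dictionary standing in for the local database 201, and the choice to retain forwarded results at the remote site are assumptions made for illustration; the disclosure does not specify them.

```python
class RemoteDataCache:
    """Hypothetical stand-in for the remote data cache 204."""

    def __init__(self, local_database):
        self._local_database = local_database  # stands in for local database 201
        self._remote_copy = {}                 # data already held at the remote site
        self.forwarded = 0                     # queries sent on to the local server

    def query(self, key):
        # Answer from the remote copy when possible.
        if key in self._remote_copy:
            return self._remote_copy[key]
        # Otherwise forward the request to the local database and fetch the result.
        self.forwarded += 1
        result = self._local_database[key]
        # Assumption: the result is kept so that later queries stay remote.
        self._remote_copy[key] = result
        return result


cache = RemoteDataCache({"price:42": 19.99})
first = cache.query("price:42")    # forwarded to the local database
second = cache.query("price:42")   # served from the remote copy
```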
  • Turning now to FIG. 3, a method for a template-based approach for workload generation is indicated generally by the reference numeral 300. The method 300 includes a function block 310 for model identification, which determines the hierarchy of workload generation, the time scales of workload generation, as well as the states and transitions at different scales. The function block 310 passes control to a function block 320 for parameter definition.
  • The function block 320 determines the fields for user specific, application specific, network specific, and content specific attributes, as well as a probability distribution function (PDF) for different attributes. The function block 320, in turn, passes control to a function block 330 for template construction, which constructs templates for different scales of workload behavior. The function block 330 passes control to a function block 340, which provides workload generating units.
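  • As a rough sketch of the parameter definition step of function block 320, each attribute field can be paired with a sampler for its PDF. The field names, the chosen distributions and their numeric values are all hypothetical; the disclosure only states that user, application, network and content specific attributes each have a PDF.

```python
import random


def define_parameters():
    """Return hypothetical attribute fields, each paired with a sampler for its PDF."""
    return {
        "user.session_minutes": lambda rng: rng.expovariate(1.0 / 30.0),
        "application.name": lambda rng: rng.choice(["email", "chat", "browsing"]),
        "network.port": lambda rng: rng.randint(1024, 65535),
        "content.sentence_words": lambda rng: max(1, round(rng.gauss(8.0, 3.0))),
    }


def sample_parameters(parameter_pdfs, rng):
    """Draw one concrete value per attribute from its distribution."""
    return {name: draw(rng) for name, draw in parameter_pdfs.items()}


pdfs = define_parameters()
sample = sample_parameters(pdfs, random.Random(7))
```

  • Passing a seeded `random.Random` instance makes a generated workload repeatable, which is one way to obtain the controllability the approach aims for.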
  • As shown in FIG. 4, a set of templates for a template-based approach for workload generation is indicated generally by the reference numeral 400. The set includes a user level template 410, an application level template 420, and a stream level template 430. The user level template 410 provides states and transitions. The states include times of day, such as 9AM-5PM, morning/noon/evening, and the like, and activities, such as email, chat, browsing, telephone, video conferencing, and the like. The transitions include going from email to chat and the like, together with the fraction of time spent in email, chat and the like.
  • The application level template 420 is for any given application, such as chat, for example. Here, the application level template for chat includes states, transitions and parameters applicable to chat. Thus, the relevant states include typing, clearing, and sending. The relevant transitions include going from typing to clearing, and the like. The relevant parameters include language, topic, and the relationship between the parties to the chat, for example.
  • The stream level template 430 is for any given application, such as chat, for example. Here, the stream level template for chat includes parameters applicable to chat. Thus, the relevant parameters are the length of the sentences, a text construction model using n-grams, dictionaries for words, biometrics such as typing speed, and the like.
  • In operation, the workload generation behavior is viewed as the aggregate of correlated behaviors at different time scales. For example, to generate templates for workload generated on the internet due to human activities such as chat, web browsing, VoIP and the like, different time scales of traffic generation are identified and the human behavior and the resulting traffic are modeled in a hierarchical manner.
  • Here, the user level behavioral model is characterized by a slower time scale on the order of minutes to hours; the usage frequencies of the various applications; the fraction of time spent in different applications during the day; the types of applications, such as email, chat, HTTP and the like; and the number and identification of associates. The application level behavioral model is characterized by a faster time scale on the order of seconds to minutes; the dynamics of activities within a session; the possible states within an application; and OSI Layer 7 protocol exchanges such as login, handshake, and session closing. The data stream level model is characterized by a very fast time scale on the order of microseconds; content-based attributes such as topic, language, and volumetrics; the codec, such as GSM, MPEG, or MP3; and OSI Layer 2 through Layer 6 protocols.
  • Templates are created for these three different time-scales of traffic. The template for the slow-time scale session-level behavioral model has fields corresponding to different times of day; different types of applications such as web-browsing, email, and chat, that an individual is involved in; associates with whom an individual interacts; and transitions between different places. The parameters are places, transitions, fraction of time spent before firing a transition and other attributes specific to the types of the places and the transitions. The template at this level will be used to schedule traffic generation units at the fast-time scale. At this level, the specificities such as protocol level of the particular applications are relatively unimportant.
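  • A user level template of this kind can be sketched as a set of places, transition probabilities and dwell times that together produce a schedule for one workday. The template contents, the exponential dwell-time assumption and the `schedule_day` function are illustrative only and are not taken from the disclosure.

```python
import random

# Hypothetical user level template for a 9AM-5PM workday (all numbers invented).
USER_TEMPLATE = {
    "workday_minutes": 8 * 60,
    "mean_dwell_minutes": {"email": 20.0, "chat": 10.0, "browsing": 15.0},
    "transitions": {
        "email": {"chat": 0.5, "browsing": 0.5},
        "chat": {"email": 0.7, "browsing": 0.3},
        "browsing": {"email": 0.6, "chat": 0.4},
    },
}


def schedule_day(template, rng, start="email"):
    """Return (activity, start_minute, duration_minutes) tuples covering the workday."""
    end = float(template["workday_minutes"])
    clock, state, schedule = 0.0, start, []
    while clock < end:
        # Assumption: time spent in a place before a transition fires is exponential.
        draw = rng.expovariate(1.0 / template["mean_dwell_minutes"][state])
        remaining = end - clock
        dwell = min(draw, remaining)
        schedule.append((state, clock, dwell))
        clock = end if draw >= remaining else clock + dwell
        # Fire a transition chosen according to the template's probabilities.
        choices = template["transitions"][state]
        state = rng.choices(list(choices), weights=list(choices.values()))[0]
    return schedule


day = schedule_day(USER_TEMPLATE, random.Random(1))
```

  • Each entry of the resulting schedule marks an interval during which a faster time scale traffic generation unit for that activity would be run.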
  • The template for the fast-time scale application-level behavioral model has fields corresponding to the different possible states an individual can be in within a particular application, such as typing, sending, and clearing in the case of chat, and transitions between these places. As before, the parameters are places, transitions, fraction of time spent before firing a transition and other attributes specific to the type of the place or the transition. The templates at this level will be used to generate data streams that shall constitute the traffic. The streams are generated in compliance with the specific protocol on which the application is running.
  • The data generation templates implement the logic for generating the content according to high-level control parameters passed on by the application level behavioral model. For example, in chat the parameters can be topic, spoken language, dictionaries, noise levels, level of realism, and source if pre-recorded. By specifying the probability distribution functions (PDFs) and dictionaries, the user can control the length of the sentences, stochastic rules for concatenating the words, the language and the various topics during the chat, and biometric characteristics such as typing speed. The content generated by using the templates at this level will be packaged into the appropriate stack of Protocol Data Units (PDU) before writing it to the respective output streams. In addition, by emulating the protocol stack down to the IP layer, these templates can provide the user with the additional ability to control network related attributes such as IP addresses of the parties involved in the chat, TCP parameters such as port numbers, window sequence numbers, ACK, and the like.
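  • The content side of a stream level chat template can be sketched with a bigram model, a sentence-length PDF and a typing speed. The corpus, the normal length distribution, the default values and both function names are assumptions for illustration; packaging into PDUs and protocol stack emulation are not shown.

```python
import random
from collections import defaultdict


def build_bigram_model(corpus):
    """Map each word to the words observed to follow it in the corpus."""
    words = corpus.split()
    followers = defaultdict(list)
    for first, second in zip(words, words[1:]):
        followers[first].append(second)
    return words, dict(followers)


def generate_sentence(words, followers, rng, mean_words=6.0, chars_per_second=5.0):
    """Return a generated sentence and the seconds needed to type it."""
    # Sentence length drawn from an assumed normal PDF, at least one word.
    length = max(1, round(rng.gauss(mean_words, 2.0)))
    word = rng.choice(words)
    sentence = [word]
    while len(sentence) < length:
        # Bigram rule: pick a follower of the current word, else restart anywhere.
        options = followers.get(word) or words
        word = rng.choice(options)
        sentence.append(word)
    text = " ".join(sentence)
    return text, len(text) / chars_per_second


CORPUS = "the meeting is at noon and the report is due at noon today"
vocabulary, model = build_bigram_model(CORPUS)
text, typing_seconds = generate_sentence(vocabulary, model, random.Random(3))
```

  • Swapping the corpus changes topic and language, while `mean_words` and `chars_per_second` play the role of the sentence-length and typing-speed controls described above.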
  • Referring back to FIG. 3, the method 300 that provides a framework for template-based workload generation highlights the major building blocks of embodiments of the present disclosure.
  • Recalling FIG. 4, the exemplary templates 410, 420 and 430 are relevant to a workload generation pattern in a corporate environment, where different templates are shown for different scales. Thus, this exemplary embodiment identifies different time scales of workload generation and defines templates at these time scales for workload generation in a generic corporate scenario with 9AM-5PM working hours. Here, the templates work for defining workload generation patterns at different time scales in a corporate environment.
  • The template-based approach provides the foundation for building workload generators with important features. The feature of controllability provides for easy orchestration of volumetric and contextual statistics such as protocol mix of generated traffic, time ranges of causal traffic, virtual and network topology attributes, traffic loss and delay characteristics, data source perturbation, tunable levels of accuracy in the data offered to the tested system, and ability to infuse cross-stream correlations. The feature of scalability is achieved since all the traffic is artificially generated. Thus, the template-based approach is much more scalable and is not limited by the storage bottlenecks as in the case of client-server approaches for traffic generation.
  • The features of reliability and robustness are attained. Unlike client-server approaches, the template-based approach is less dependent on external parameters such as intermittent resource congestion and server availability. The features of modularity and extensibility are attained because the templates for different applications can be built independently using application specific statistical properties. These can be used, in turn, to define or build on the fly independent agents generating traffic for the particular application. The right volumetric mix of traffic from different applications can be easily generated by invoking the right number of these agents, and the right contextual mix can be generated by tuning the contents of the data units generated by these agents.
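  • One way to pick "the right number" of agents for a target volumetric mix is largest-remainder apportionment, sketched below. The function name and the simplifying assumptions that every agent produces the same traffic volume and that the target fractions sum to one are introduced here and are not part of the disclosure.

```python
def agents_for_mix(target_fractions, total_agents):
    """Split `total_agents` across applications in proportion to target fractions."""
    # Ideal (possibly fractional) number of agents per application.
    ideal = {app: share * total_agents for app, share in target_fractions.items()}
    # Start from the whole-number part of each ideal count.
    counts = {app: int(value) for app, value in ideal.items()}
    # Hand out the agents still unassigned to the largest fractional remainders.
    leftover = total_agents - sum(counts.values())
    by_remainder = sorted(ideal, key=lambda app: ideal[app] - counts[app], reverse=True)
    for app in by_remainder[:leftover]:
        counts[app] += 1
    return counts


mix = {"chat": 0.5, "email": 0.3, "web": 0.2}
ten_agents = agents_for_mix(mix, 10)
seven_agents = agents_for_mix(mix, 7)
```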
  • It is to be understood that the teachings of the present disclosure may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof. Most preferably, the teachings of the present disclosure are implemented as a combination of hardware and software.
  • Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interfaces.
  • The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.
  • It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present disclosure is programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present disclosure.
  • Although exemplary embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present disclosure is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present disclosure. For example, the exemplary method for determining how many attributes should be determined may be augmented or replaced with more sophisticated attribute determination techniques. For another example, the template-based framework may be incorporated into advanced network support systems that are responsive to multi-modal data, such as numeric data, text data, voice data and video data. All such changes and modifications are intended to be included within the scope of the present disclosure as set forth in the appended claims.

Claims (20)

1. A method for workload generation comprising:
identifying a workload model by determining each of a hierarchy for workload generation, a plurality of time scales for workload generation, and states and transitions at each of the plurality of time scales;
defining at least one parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes;
constructing at least one template for workload generation wherein the at least one template is a user level template corresponding to a relatively slow time scale of the plurality of time scales, an application level template corresponding to a relatively faster time scale of the plurality of time scales or a stream level template corresponding to a relatively fastest time scale of the plurality of time scales; and
defining at least one workload generating unit (WGU) responsive to the at least one template.
2. A method as defined in claim 1 wherein the at least one template defines states for workload generation.
3. A method as defined in claim 1 wherein the at least one template defines transitions for workload generation.
4. A method as defined in claim 1 wherein the at least one template defines parameters for workload generation.
5. A method as defined in claim 1 wherein a plurality of templates defines the at least one WGU.
6. A method as defined in claim 1 wherein the at least one template defines a plurality of WGUs.
7. A method as defined in claim 1 wherein the at least one WGU is used to generate traffic for a large or small class of communicants.
8. A system for workload generation comprising:
a processor for identifying a workload model by determining each of a hierarchy for workload generation, a plurality of time scales for workload generation, and states and transitions at each of the plurality of time scales, and defining at least one parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes;
a user level template unit corresponding to a relatively slow time scale of the plurality of time scales in signal communication with the processor;
an application level template corresponding to a relatively faster time scale of the plurality of time scales in signal communication with the processor;
a stream level template corresponding to a relatively fastest time scale of the plurality of time scales in signal communication with the processor; and
a communications adapter in signal communication with the processor for defining at least one workload generating unit (WGU) responsive to at least one of the template units.
9. A system as defined in claim 8 wherein at least one of the template units defines states for workload generation.
10. A system as defined in claim 8 wherein at least one of the template units defines transitions for workload generation.
11. A system as defined in claim 8 wherein at least one of the template units defines parameters for workload generation.
12. A system as defined in claim 8 wherein a plurality of template units defines the at least one WGU.
13. A system as defined in claim 8 wherein at least one template unit defines a plurality of WGUs.
14. A system as defined in claim 8 wherein the communications adapter uses at least one WGU to generate traffic for a large or small class of communicants.
15. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform program steps for workload generation, the program steps comprising:
identifying a workload model by determining each of a hierarchy for workload generation, a plurality of time scales for workload generation, and states and transitions at each of the plurality of time scales;
defining at least one parameter by determining each of fields for user specific attributes, application specific attributes, network specific attributes, content specific attributes, and a probability distribution function (PDF) for each of the attributes;
constructing at least one template for workload generation wherein the at least one template is a user level template corresponding to a relatively slow time scale of the plurality of time scales, an application level template corresponding to a relatively faster time scale of the plurality of time scales or a stream level template corresponding to a relatively fastest time scale of the plurality of time scales; and
defining at least one workload generating unit (WGU) responsive to the at least one template.
16. A program storage device as defined in claim 15 wherein the at least one template defines states for workload generation.
17. A program storage device as defined in claim 15 wherein the at least one template defines transitions for workload generation.
18. A program storage device as defined in claim 15 wherein the at least one template defines parameters for workload generation.
19. A program storage device as defined in claim 15 wherein a plurality of templates defines the at least one WGU.
20. A program storage device as defined in claim 15 wherein the at least one template defines a plurality of WGUs.
US11/327,071 2006-01-06 2006-01-06 Template-based approach for workload generation Abandoned US20070162602A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/327,071 US20070162602A1 (en) 2006-01-06 2006-01-06 Template-based approach for workload generation
US12/128,959 US8924189B2 (en) 2006-01-06 2008-05-29 Template-based approach for workload generation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/327,071 US20070162602A1 (en) 2006-01-06 2006-01-06 Template-based approach for workload generation

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/128,959 Continuation US8924189B2 (en) 2006-01-06 2008-05-29 Template-based approach for workload generation

Publications (1)

Publication Number Publication Date
US20070162602A1 true US20070162602A1 (en) 2007-07-12

Family

ID=38234023

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/327,071 Abandoned US20070162602A1 (en) 2006-01-06 2006-01-06 Template-based approach for workload generation
US12/128,959 Expired - Fee Related US8924189B2 (en) 2006-01-06 2008-05-29 Template-based approach for workload generation

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/128,959 Expired - Fee Related US8924189B2 (en) 2006-01-06 2008-05-29 Template-based approach for workload generation

Country Status (1)

Country Link
US (2) US20070162602A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9009007B2 (en) 2012-03-13 2015-04-14 International Business Machines Corporation Simulating stream computing systems

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6996517B1 (en) * 2000-06-06 2006-02-07 Microsoft Corporation Performance technology infrastructure for modeling the performance of computer systems
US20070011052A1 (en) * 2005-06-08 2007-01-11 Tieming Liu Method and apparatus for joint pricing and resource allocation under service-level agreement
US20070162602A1 (en) * 2006-01-06 2007-07-12 International Business Machines Corporation Template-based approach for workload generation
US8918496B2 (en) * 2007-04-30 2014-12-23 Hewlett-Packard Development Company, L.P. System and method for generating synthetic workload traces
US20090313631A1 (en) * 2008-06-11 2009-12-17 Fabio De Marzo Autonomic workload planning
KR101269549B1 (en) * 2009-05-08 2013-06-04 한국전자통신연구원 System and method for testing software reliability using fault injection
AU2010317394A1 (en) * 2009-11-16 2012-04-05 Tata Consultancy Services Limited A system and method for budget-compliant, fair and efficient manpower management

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5440719A (en) * 1992-10-27 1995-08-08 Cadence Design Systems, Inc. Method simulating data traffic on network in accordance with a client/server paradigm
US5809282A (en) * 1995-06-07 1998-09-15 Grc International, Inc. Automated network simulation and optimization system
US6295518B1 (en) * 1997-12-09 2001-09-25 Mci Communications Corporation System and method for emulating telecommunications network devices
US6393386B1 (en) * 1998-03-26 2002-05-21 Visual Networks Technologies, Inc. Dynamic modeling of complex networks and prediction of impacts of faults therein
US6134514A (en) * 1998-06-25 2000-10-17 Itt Manufacturing Enterprises, Inc. Large-scale network simulation method and apparatus
US6549882B1 (en) * 1998-12-21 2003-04-15 Cisco Technology, Inc. Mechanisms for providing and using a scripting language for flexibly simulating a plurality of different network protocols
US6845352B1 (en) * 2000-03-22 2005-01-18 Lucent Technologies Inc. Framework for flexible and scalable real-time traffic emulation for packet switched networks
US20050149908A1 (en) * 2002-12-12 2005-07-07 Extrapoles Pty Limited Graphical development of fully executable transactional workflow applications with adaptive high-performance capacity
US20040181370A1 (en) * 2003-03-10 2004-09-16 International Business Machines Corporation Methods and apparatus for performing adaptive and robust prediction
US7039559B2 (en) * 2003-03-10 2006-05-02 International Business Machines Corporation Methods and apparatus for performing adaptive and robust prediction
US20040240387A1 (en) * 2003-05-28 2004-12-02 Lucent Technologies, Incorporated System and method for simulating traffic loads in packetized communication networks
US7356770B1 (en) * 2004-11-08 2008-04-08 Cluster Resources, Inc. System and method of graphically managing and monitoring a compute environment

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090327492A1 (en) * 2006-01-06 2009-12-31 Anderson Kay S Template-based approach for workload generation
US8924189B2 (en) * 2006-01-06 2014-12-30 International Business Machines Corporation Template-based approach for workload generation
US7698106B2 (en) 2006-05-05 2010-04-13 International Business Machines Corporation System and method for benchmarking correlated stream processing systems
EP2343653A2 (en) * 2009-12-29 2011-07-13 Microgen Aptitude Limited Generating and monitoring data items
US9265458B2 (en) 2012-12-04 2016-02-23 Sync-Think, Inc. Application of smooth pursuit cognitive testing paradigms to clinical drug development
US9380976B2 (en) 2013-03-11 2016-07-05 Sync-Think, Inc. Optical neuroinformatics
US9800465B2 (en) 2014-11-14 2017-10-24 International Business Machines Corporation Application placement through multiple allocation domain agents and flexible cloud scheduler framework
US10326649B2 (en) 2014-11-14 2019-06-18 International Business Machines Corporation Application placement through multiple allocation domain agents and flexible cloud scheduler framework

Also Published As

Publication number Publication date
US8924189B2 (en) 2014-12-30
US20090327492A1 (en) 2009-12-31

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ANDERSON, KAY S;BOUILLET, ERIC P;DUBE, PARIJAT;AND OTHERS;REEL/FRAME:017036/0899;SIGNING DATES FROM 20050924 TO 20051024

AS Assignment

Owner name: NATIONAL SECURITY AGENCY, MARYLAND

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:019632/0374

Effective date: 20061012

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION