US7246159B2 - Distributed data gathering and storage for use in a fault and performance monitoring system - Google Patents


Info

Publication number: US7246159B2
Authority: US (United States)
Prior art keywords: data, performance, performance data, fault, gathering
Prior art date
Legal status: Active, expires (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: US10/286,447
Other versions: US20040088386A1 (en)
Inventors: Vikas Aggarwal, Rajib Rashid
Current Assignee: McAfee LLC; NetScout Systems Inc (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Network General Corp; Fidelia Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Network General Corp and Fidelia Technology Inc
Priority to US10/286,447
Assigned to FIDELIA TECHNOLOGY, INC.: Assignment of assignors interest (see document for details). Assignors: AGGARWAL, VIKAS
Publication of US20040088386A1
Assigned to NETWORK GENERAL CORPORATION: Assignment of assignors interest (see document for details). Assignors: RASHID, RAJIB
Application granted
Publication of US7246159B2
Assigned to WELLS FARGO BANK, NATIONAL ASSOCIATION: Security agreement. Assignors: FIDELIA TECHNOLOGY, INC.
Assigned to KEYBANK NATIONAL ASSOCIATION: Intellectual property security agreement. Assignors: FIDELIA TECHNOLOGY, INC.
Assigned to KEYBANK NATIONAL ASSOCIATION, AS ADMINISTRATIVE AGENT: Intellectual property security agreement. Assignors: FIDELIA TECHNOLOGY, INC.
Assigned to FIDELIA TECHNOLOGY, INC.: Security agreement. Assignors: WELLS FARGO BANK, NATIONAL ASSOCIATION
Assigned to JPMORGAN CHASE BANK, N.A.: Security interest (see document for details). Assignors: FIDELIA TECHNOLOGY, INC., NETSCOUT SERVICE LEVEL CORPORATION, NETSCOUT SYSTEMS, INC., OnPath Technologies Inc.
Assigned to FIDELIA TECHNOLOGY, INC.: Release by secured party (see document for details). Assignors: KEYBANK NATIONAL ASSOCIATION
Assigned to FIDELIA TECHNOLOGY, INC.: Release by secured party (see document for details). Assignors: KEYBANK NATIONAL ASSOCIATION
Assigned to NETSCOUT SYSTEMS, INC.: Assignment of assignors interest (see document for details). Assignors: FIDELIA TECHNOLOGY, INC.
Legal status: Active
Adjusted expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F 11/3466 Performance evaluation by tracing or monitoring
    • G06F 11/3495 Performance evaluation by tracing or monitoring for systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/06 Management of faults, events, alarms or notifications
    • H04L 41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F 2201/81 Threshold
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/22 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks comprising specially adapted graphical user interfaces [GUI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L 41/5003 Managing SLA; Interaction between SLA and QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/50 Network service management, e.g. ensuring proper service fulfilment according to agreements
    • H04L 41/5058 Service discovery by the service manager

Definitions

  • the present invention concerns network management systems (“NMSs”).
  • the present invention concerns combining fault and performance management.
  • FIG. 1 illustrates components of a system 100 that may be used by a so-called e-commerce business.
  • this system may include a web interface server 110 , a search and navigation server 120 associated with a product inventory database 125 , a purchase or “shopping cart” server 130 associated with a user database 135 , a payment server 140 associated with a credit card database 145 , a transaction server 150 associated with a transaction database 155 , a shipping server 180 associated with a shipping database 185 , a local area network (“LAN”) 160 , and a network 170 including linked routers 175 .
  • the search and navigation server 120 , the purchase or “shopping cart” server 130 , the payment server 140 and the transaction server 150 may communicate with one another via the LAN 160 .
  • these servers may communicate with the shipping server 180 via the network 170 .
  • Each of the servers may include components (e.g., power supplies, power supply backups, printers, interfaces, CPUs, chassis, fans, memory, disk storage, etc.) and may run applications or operating systems (e.g., Windows, Linux, Solaris, Microsoft Exchange, etc.) that may need to be monitored.
  • the various databases e.g., Microsoft SQL Server, Oracle Database, etc.
  • the networks, as well as their components, may need to be monitored.
  • Although the system 100 includes various discrete servers, networks, and databases, the system can be thought of as offering an end-to-end service.
  • that end-to-end service is on-line shopping—from browsing inventory, to product selection, to payment, to shipping.
  • Fault management pertains to whether something is operating or not.
  • Performance management pertains to a measure of how well something is working and to historical and future trends.
  • a fault management system generates and works with “real time” events (exceptions). It can query the state of a device and trigger an event upon a state change or threshold violation.
  • fault management systems typically do not store the polled data; they store only events and alerts (including SNMP traps, which are essentially events).
  • the user interface console for a fault management system is “exception” driven. That is, if a managed element is functioning, it is typically not even displayed. Generally, higher severity fault events are displayed with more prominence (e.g., at the top of a list of faults), and less critical events are displayed with less prominence (e.g., lower in the list).
  • performance management systems generally store all polled data. This stored data can then be used to analyze trends or to generate historical reports on numerical data collected.
  • a major challenge in performance management systems is storing such large amounts of data. For example, just polling 20 variables every 5 minutes from 1000 devices generates nearly 6 million data samples per day. Assuming each data sample requires 50 bytes of storage, about 9 GB of storage will be needed per month. Consequently, performance management systems are designed to handle large volumes of data and to perform data warehousing and reporting functions.
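The storage figures above can be checked with a quick back-of-the-envelope sketch (the numbers are the ones cited in the example; a "sample" here is one polled value):

```python
# Storage estimate for the example above: 20 variables polled every
# 5 minutes from 1000 devices, at 50 bytes per stored sample.
variables_per_device = 20
devices = 1000
polls_per_day = 24 * 60 // 5                 # one poll every 5 minutes

samples_per_day = polls_per_day * variables_per_device * devices
bytes_per_month = samples_per_day * 50 * 30  # 50 bytes/sample, ~30 days

print(samples_per_day)        # 5760000, roughly the 6 million cited
print(bytes_per_month / 1e9)  # 8.64, roughly the 9 GB per month cited
```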
  • Performance management systems are typically batch oriented. More specifically, distributed data collectors generally poll data and periodically (e.g., each night) feed it to a centralized database. Since the centralized database becomes huge, database management is a prime concern in such products.
  • the present invention discloses apparatus, data structures, and/or methods for distributing data gathering and storage for use in a fault and performance monitoring system.
  • FIG. 1 is a diagram of an e-commerce system to which the present invention may be applied to monitor faults and performance.
  • FIG. 2 is a bubble chart illustrating an architecture of the present invention.
  • FIG. 3 is a diagram illustrating an exemplary application of the present invention to the e-commerce system of FIG. 1 .
  • FIG. 4 is a flow diagram of an exemplary method that may be used to perform system configuration operations in a manner consistent with the principles of the present invention.
  • FIG. 5 is a flow diagram of an exemplary method that may be used to perform information extraction, combination and presentation operations in a manner consistent with the principles of the present invention.
  • FIG. 6 is a flow diagram of an exemplary method that may be used to perform distributed data gathering, (preprocessing) and storage operations in a manner consistent with the principles of the present invention.
  • FIGS. 7–10 are exemplary object-oriented data structures that may be used to store configuration information in a manner consistent with the principles of the present invention.
  • FIGS. 11A and 11B illustrate an exemplary events report.
  • FIG. 12 illustrates an exemplary test status summary report.
  • FIGS. 13A and 13B illustrate an exemplary test details report.
  • FIGS. 14A and 14B illustrate an exemplary service instability report.
  • FIG. 15 illustrates an exemplary usage and trend report.
  • FIG. 16 illustrates an exemplary account status summary report.
  • FIG. 17 illustrates an exemplary service status summary report.
  • FIG. 18 is a block diagram of apparatus that may be used to effect at least some aspects of the present invention.
  • the present invention involves methods, apparatus and/or data structures for monitoring system faults and system performance.
  • the following description is presented to enable one skilled in the art to make and use the invention, and is provided in the context of particular applications and their requirements.
  • Various modifications to the disclosed embodiments will be apparent to those skilled in the art, and the general principles set forth below may be applied to other embodiments and applications.
  • the present invention is not limited to the embodiments shown, and the inventors regard their invention as the following disclosed methods, apparatus and data structures and any other patentable subject matter.
  • FIG. 2 is a bubble chart of an exemplary system fault and performance monitoring architecture 200 which employs distributed data gathering and storage.
  • This distributed architecture enables the system to handle the large volume of data collected for performance monitoring. It also enables real-time performance monitoring. More specifically, a number of data gathering operations 210 (also referred to as “data gathering elements” or “DGEs”) are distributed across a number of facilities or components of a system (not shown). For example, referring back to the exemplary system 100 of FIG. 1 , a first DGE may be provided on the local area network 160 , a second DGE may be provided on the network 170 , and a third DGE may be provided on the shipping server 180 .
  • DGEs can collect traps and messages and can receive data from an external feed. As described in more detail in § 4.2 below, DGEs can perform further tasks. Data gathered and/or generated by each DGE 210 is stored in an associated database 220 .
  • DGEs 210 can be configured using system configuration operations 230 , in accordance with a configuration database 240 .
  • the system configuration operations 230 can (i) allow configuration information to be entered into the configuration database 240 , (ii) inform each DGE 210 of its startup configuration, and (iii) inform each DGE 210 of runtime changes to its configuration.
  • Information extraction, combination and presentation operations 250 may collect fault information from the DGEs 210 (either by asking a proxy process or directly via their databases 220 ), may collect performance information from the databases 220 of the DGEs 210 , may combine fault and performance information from different DGEs, and may present fault and performance information to a user in a unified, integrated manner.
  • the presentation of this information may be in the form of screens, graphs, reports, etc.
  • an application programming interface (“API”) operation 260 may be provided to permit users to expand the fault and performance monitoring functionality of the present invention.
  • the API permits provisioning accounts, users, devices, tests, actions, DGE locations, and DGE hosts through a socket interface. Such an embodiment enables mass data entry, updates and searches. Searches for test results and events are also permitted via this interface. A limited number of reports are available through this interface, while a full complement of reporting is offered via a graphical user interface (“GUI”).
  • a Perl API is provided which uses the underlying socket interface. Organizations with large numbers of monitored devices can provision, update or search systems using the API.
  • the system configuration operations 230 , the configuration database 240 , the information extraction, combination and presentation operations 250 , and the API operations 260 may all be performed from and provided at the same facility or server.
  • the information extraction, combination and presentation operations may be referred to as a “business visibility engine” or “BVE”.
  • a “BVE” may also include the configuration operations 230 , the configuration database 240 , and the API operations 260 .
  • the architecture 200 of FIG. 2 is much different in that the information extraction, combination and presentation operations 250 seamlessly integrate the distributed DGE databases 220 and can issue queries in parallel across the distributed DGEs 210 . The responses from such queries can then be combined (also referred to as response “correlation”).
  • the n-tier architecture 200 is centered on a configuration database management system. The distributed nature of the system 200 permits committing explicit resources to important processes and systems, hence achieving real-time scalability and performance. Typical traffic flow across an n-tier system consists of a number of clients that access services from one tier, which in turn requests services from one or more systems.
  • This architecture pushes even the correlation and notification to the distributed DGEs so that there is no central bottleneck and the system operates as a loosely coupled but coordinated cluster.
  • One embodiment consistent with the principles of the present invention uses key technology standards such as XML, JMS, JDBC, SOAP and XSLT layered on a J2EE framework.
  • FIG. 3 illustrates an exemplary system 300 in which the fault and performance monitoring architecture of FIG. 2 has been applied to the exemplary e-commerce system 100 of FIG. 1 .
  • the components of the exemplary e-commerce system 100 are depicted with dashed lines.
  • a first data gathering element (and an associated database) 310 a / 320 a is provided on the LAN 160
  • a second data gathering element (and an associated database) 310 b / 320 b is provided on the shipping server 180 ′
  • a third data gathering element (and an associated database) 310 c / 320 c is provided on the network 170 ′.
  • These elements may be configured by, and may provide information to, a business visibility engine 390 .
  • the business visibility engine 390 may include system configuration operations 330 , a configuration database 340 , information extraction, combination and presentation operations 350 and API operations 360 .
  • System configuration may include information learned or discovered from the system and/or information entered via the API operation.
  • FIG. 4 is a flow diagram of an exemplary method 400 that may be used to generate system configuration information.
  • a list of (e.g., Internet Protocol) networks can be read and this list can be used to discover devices (e.g., servers, routers, applications, etc.) on those networks.
  • this information may be manually entered or otherwise defined (e.g., via the API operation).
  • Each of the devices is associated with one or more fault and/or performance tests as indicated by block 420 . This association may be established via an auto-discovery mechanism. Alternatively, this association may be manually entered or otherwise defined (e.g., via the API operation).
  • each of a number of device objects 720 may include one or more test objects 730 .
  • each of at least one data gathering operation is associated with one or more of the devices as indicated by block 430 .
  • This association may be manually entered or otherwise defined (e.g., via the API operation), but is preferably discovered.
  • a DGE at a particular location is associated with devices at the same location.
  • the load of monitoring the devices at that location may be balanced across the DGEs at that location.
  • a location 1010 may include one or more DGEs 1020 .
  • Each of the DGEs 1020 may be associated with one or more device objects 1030 .
  • thresholds are associated with the tests.
  • the thresholds may be default thresholds, or may be provided, for example via the API operation, on a case-by-case basis.
  • Exemplary thresholds may include, for example, a “warning” threshold and a “critical” threshold.
  • the test may, by definition, include (default) thresholds.
  • performance test parameters may be associated with at least some of the tests. The parameters may be default parameters, or may be provided, for example via the API operation, on a case-by-case basis.
  • a number of actions may be provided, and one or more tests may be associated with each action.
  • an action may be “e-mail a critical threshold violation to network administrator”.
  • a number of fault tests may be associated with this action such that if any of the tests violate a critical threshold, the network administrator is informed.
  • These associations may be entered via the API operation, or may be defined in some other way (e.g., by default).
  • an action object 810 may include one or more test objects 820 .
  • the various associations may be stored in the configuration database 240 . Although these associations may be stored in an object-oriented database, other data structures may be used to store this information in an alternate database type. However, an object-oriented database allows easy and flexible schema maintenance as compared to other database types available today.
  • the fault and performance configuration information may be provided (e.g., signaled) to respective data gathering operations as indicated by block 470 . If the respective data gathering operations are already available (e.g., on standby), this signaling may occur immediately. If, on the other hand, the respective data gathering operations are not yet available, this signaling may be done in response to an indication that a new data gathering operation has been added. For example, in such an embodiment, upon startup, a DGE only needs to know its own identifier (as used in the configuration database) and the (IP) address of the server running the configuration database.
  • a new DGE can be started up with the identifier of the failed DGE, and this new DGE will download its configuration from the configuration database and thus assume the work of the failed DGE. Furthermore, if a connection to the configuration database is lost, or if the configuration database goes down, configured DGEs can continue to function as presently configured until the connection and/or configuration database is restored.
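The startup and failover behavior just described can be sketched as follows (class and method names are illustrative; the patent does not prescribe an implementation):

```python
# A DGE needs only its own identifier and the address of the configuration
# database at startup; a replacement DGE started with a failed DGE's
# identifier downloads that configuration and assumes its work.
class ConfigDB:
    def __init__(self, table):
        self.table = table

    def fetch(self, dge_id):
        return self.table[dge_id]

class DGE:
    def __init__(self, dge_id, config_db):
        self.dge_id = dge_id        # identifier as used in the config database
        self.config_db = config_db
        self.config = None          # last known-good configuration

    def start(self):
        try:
            self.config = self.config_db.fetch(self.dge_id)
        except ConnectionError:
            # If the configuration database is unreachable, keep running
            # with the configuration already downloaded, if any.
            if self.config is None:
                raise
        return self.config

db = ConfigDB({"dge-7": {"devices": ["router-1"], "tests": ["icmp"]}})
replacement = DGE("dge-7", db)      # started with the failed DGE's identifier
print(replacement.start())
```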
  • tests may be associated with a device.
  • a “monitor” at a DGE performs a test based on the test object.
  • a “scheduler” at the DGE determines a test type from the test object and then puts it onto a queue for the monitor. Thus, the actual testing is done via a monitor of a DGE.
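The scheduler/monitor split described above can be sketched as a simple queue hand-off (the function names and test-object layout are illustrative, not from the patent):

```python
import queue

test_queue = queue.Queue()

def scheduler(test_objects):
    # The scheduler determines each test's type from the test object and
    # puts it onto the queue for the monitors.
    for test in test_objects:
        test_queue.put(test)

def run_monitors(perform):
    # A monitor dequeues each test and actually performs it.
    results = []
    while not test_queue.empty():
        test = test_queue.get()
        results.append((test["type"], perform(test)))
    return results

scheduler([{"type": "icmp", "target": "10.0.0.1"},
           {"type": "snmp", "target": "10.0.0.2"}])
print(run_monitors(lambda test: "OK"))   # [('icmp', 'OK'), ('snmp', 'OK')]
```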
  • Although monitors may be predefined, the API operation may allow users to create “plug-ins” to define new tests (e.g., for a new device) to be performed by new monitors.
  • monitors are similar to device drivers in a PC operating system. More specifically, a PC operating system has drivers for many popular peripherals. However, device drivers for new peripherals or less popular peripherals may be added. Similarly, as new device types are added to the system being monitored, new monitors for testing these new device types may be added.
  • the present invention may overprovision a DGE with monitors. In this way, even though some monitors might not be used, as devices are added, the DGE can simply activate a monitor needed to test the newly added device.
  • a list of at least some exemplary monitors that may be supported by the present invention is provided in § 4.3.1.1.1 below.
  • ICMP network monitors may be used to check the reachability of hosts on an Internet Protocol (“IP”) network using the ICMP protocol.
  • the ICMP monitor reports on packet loss and latency for a sequence of ICMP packets. These monitors may include:
  • SNMP (v1, v2, v3) monitors may use 64-bit counters where available, and may also account for rollover of 32-bit counters. Multiple SNMP queries to the same host may be sent in the same packet for optimization. An alternate SNMP port may be queried instead of the default.
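Accounting for 32-bit counter rollover between two polls is a standard technique; a minimal sketch (not verbatim from the patent):

```python
# When a 32-bit SNMP counter wraps between polls, the raw difference goes
# negative; adding the counter's modulus recovers the true increase.
MAX32 = 2 ** 32

def counter_delta(previous, current, modulus=MAX32):
    if current >= previous:
        return current - previous
    return current + modulus - previous   # counter rolled over

print(counter_delta(100, 150))            # 50
print(counter_delta(MAX32 - 10, 40))      # 50, despite the wrap
```

A 64-bit counter wraps so rarely at typical link speeds that preferring it where available, as described above, largely avoids the problem.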
  • These monitors may include:
  • SNMP Host Resources (SNMP v1, v2, v3) monitors may include:
  • TCP Port monitors for monitoring the transactions of well-known Internet services such as HTTP, HTTPS, FTP, POP3, IMAP, IMAPS, SMTP, NNTP.
  • Exemplary port monitors may include:
  • the Simple Network Management Protocol (“SNMP”) is a popular protocol for network management.
  • SNMP facilitates communication between a managed device (i.e., a device with an SNMP agent, such as a router) and an SNMP manager or management application (which represents a user of network management).
  • the SNMP agent on the managed device provides access to data (managed objects) stored in the managed device.
  • the SNMP manager or management application uses this access to monitor and control the managed device.
  • Communication between the manager and the agent takes place using SNMP protocol data units (“PDUs”).
  • the manager can perform a GET (or read) to obtain information from the agent about an attribute of a managed object.
  • the manager can perform a GET-NEXT to do the same for the next object in the tree of objects in the managed device.
  • the manager can perform a SET (or write) to set the value of an attribute of a managed object.
  • the agent can send a TRAP, or asynchronous notification, to the manager telling it about some event in the managed device.
  • the managed objects that an agent exposes are defined in a management information base (“MIB”).
  • Exemplary application monitors may include:
  • External data feeds (“EDF”) monitors may be used to insert result values into the system using a socket interface.
  • the inserted data is treated just as if it were collected using internal monitors.
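A minimal sketch of an external data feed pushing one result over a socket, which the receiver then treats like an internally collected result (the `device|test|value` line format is hypothetical; the patent does not specify a wire format):

```python
import socket
import threading

def edf_receiver(server_sock, results):
    # Accept one connection, parse one pushed result, and store it just as
    # if it had been collected by an internal monitor.
    conn, _ = server_sock.accept()
    with conn:
        line = conn.makefile().readline().strip()
        device, test, value = line.split("|")
        results.append((device, test, float(value)))

server = socket.socket()
server.bind(("127.0.0.1", 0))       # ephemeral port for the example
server.listen(1)

results = []
worker = threading.Thread(target=edf_receiver, args=(server, results))
worker.start()

with socket.create_connection(server.getsockname()) as client:
    client.sendall(b"web01|cpu_load|0.42\n")

worker.join()
server.close()
print(results)                      # [('web01', 'cpu_load', 0.42)]
```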
  • the present invention can provide a plug-in monitor framework so that a user can write a custom monitor in Java or any other external script or program.
  • the monitor itself and a definition file in XML are put into a plug-in directory, and treated as integrated parts of the DGE itself.
  • a payroll service may consist of a payroll application on one server, a backend database on another server, and a printer, all connected by a network router. Any of these underlying IT components can fail and cause the payroll service to go down.
  • Service views and reports can be created in the exemplary product by grouping together all the underlying components of a service into a consolidated service view. If and when any of the underlying IT components fails, the entire service is reported as down, thus allowing one to measure the impact of underlying IT components on business services.
  • Tests can be provisioned using one or more of the following techniques.
  • Port and SNMP tests can be automatically “discovered” by querying the device to see what services are running.
  • the system can automatically detect disk partitions, volumes and their sizes so that the usage is normalized as a percentage. This normalization may also be done for memory, disk partitions, and database tablespace.
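Normalizing raw usage against a discovered maximum, as described above, lets a single set of percentage thresholds apply to devices of different sizes (the capacities below are made-up examples):

```python
def normalize_percent(used, capacity):
    # Express usage as a percentage of the discovered maximum (e.g., a
    # partition size or total memory) so thresholds apply uniformly.
    return round(100.0 * used / capacity, 1)

print(normalize_percent(30, 120))   # 25.0 (30 GB used of a 120 GB partition)
print(normalize_percent(6, 8))      # 75.0 (6 GB used of 8 GB of memory)
```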
  • the target device database record may be updated with vendor and model information. If a user has checked the SNMP tests box when creating a device, the model and vendor information may be displayed on a configure tests page.
  • the present invention can provide a mechanism for refreshing maximum values or SNMP object identifiers (SNMP OID) when an SNMP test has changed. For example, when memory or disk capacity has changed, tests that return percentage-based values would be incorrect unless the maximum value (for determining 100%) is refreshed. Similarly, in the case of a device rebuild, it is possible that the SNMP OIDs may change, thus creating a mismatch between the current SNMP OIDs and the ones discovered during initial provisioning. If any of these situations occurs, the user need only repeat the test provisioning process in the web application for a changed device. The present invention can discover whether any material changes on the device have occurred and highlight those changes on the configure tests page, giving the user the option to also change thresholds and/or actions that apply to the test.
  • Default warning and critical thresholds may be set globally for each type of test. Tests can be overridden at the individual device level, or reset for a set of tests in a department or other group.
  • a service level agreement (“SLA”) threshold can be set separately to track levels of service or system utilization; such thresholds do not trigger alarms or actions.
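The threshold kinds described above can be sketched as follows: warning and critical thresholds classify a result, while an SLA threshold is only tracked for reporting (the threshold values are illustrative):

```python
def classify(value, warning, critical):
    # Warning/critical thresholds determine the severity of a test result.
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"

def sla_exceeded(value, sla):
    # SLA thresholds track service levels or utilization only; they do not
    # raise alarms or trigger actions.
    return value >= sla

print(classify(85, warning=80, critical=95))   # WARNING
print(classify(97, warning=80, critical=95))   # CRITICAL
print(sla_exceeded(85, sla=90))                # False
```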
  • data gathering may be performed by distributed data gathering operations (e.g., DGEs). Gathered data may be stored locally by each DGE. Further, DGEs may optionally perform some local data preprocessing such as calculating rate, delta, percentages, etc.
  • FIG. 6 is a flow diagram of an exemplary method 600 that may be used to perform a data gathering operation. Since these operations are distributed, this method 600 may be performed, possibly asynchronously and independently, by multiple autonomous DGEs. As indicated by decision block 605 and block 610 , if the DGE is not yet configured, it should try to get such configuration information. For example, it may do so by connecting to the configuration database and downloading any needed configuration information. Referring back to decision block 605 , once the DGE is configured, it monitors device(s) in accordance with such configuration information as indicated by block 615 . Recall each DGE may test devices using “monitors” at scheduled intervals specified in each test object.
  • the remainder of the method 600 may depend on whether the DGE gathers data using a “pull model” (i.e., with distinct requests) or a “push model” (i.e., without a distinct request).
  • the DGE can receive an exception indication if a device performs a self-test and finds an error. Such errors are typically reported using SNMP traps or via a log message.
  • the various ways of gathering data are shown together. However, a particular implementation of the present invention need not use all of these alternative data gathering techniques.
  • As indicated by trigger (event) block 620 , if it is time for the DGE to get data for a particular test (e.g., as specified by a polling scheduler in the DGE), it requests (polls for) data as indicated by block 625 and the requested data is accepted as indicated by block 630 . Since these blocks “pull” data from devices, they effect a pull data gathering technique. The period at which data for a particular test is requested may be defined by the test (object) and/or configuration data associated with the test. The request may be placed in a queue. The method 600 then proceeds to decision block 635 , described later.
  • At decision block 635 , it is determined whether the data is fault data or performance data. If the data is performance data, it is stored locally as indicated by block 640 , before the method 600 is left via RETURN node 670 . In one embodiment, the stored data is aggregated (e.g., daily data is combined to weekly data, weekly data is combined to quarterly data, quarterly data is combined to annual data, etc.). As shown by optional block 642 , the performance data may be pre-processed. For example, the DGE can pre-process the performance data to calculate rates, deltas, percentages, etc. It can also normalize the collected data.
  • If, on the other hand, the data is fault data, it is compared with one or more thresholds as indicated by block 645 . Then, as indicated by decision block 650 , it is determined whether or not the threshold is violated. (In the following, it will be assumed that the fault data is only checked against one threshold to simplify the description. However, the data can be compared against more than one threshold, such as a “critical” threshold and a “warning” threshold.) If the threshold is not violated, the method 600 is simply left via RETURN node 670 . If, on the other hand, the threshold is violated, the method 600 branches to block 660 , which starts processing for a fault exception.
  • In the event of a fault exception (e.g., one generated by a device self-test), an action for the fault exception is determined (recall, e.g., data structure 800 of FIG. 8 ) and performed.
  • fault events may be handled by the DGE.
  • the occurrence of the fault exception may be stored.
  • Although fault data is not stored if no threshold violation exists, the data itself, or merely the fault exception, can and should be stored when a fault exception occurs.
  • If a threshold has been crossed, an event is generated and fed into a correlation processor.
  • This thread looks at a rules engine to determine the root-cause of the problem (e.g., upstream devices, IP stack, etc.) and if a notification or action needs to be taken.
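The data-handling branches of method 600 described above can be sketched in Python (a simplified illustration only; the function, threshold, and variable names are assumptions, not part of the disclosed implementation):

```python
# Simplified sketch of a DGE handling one polled result (method 600).
# Names and structure are illustrative assumptions.

def handle_result(data_type, value, thresholds, store, events):
    """Store performance data locally; check fault data against thresholds."""
    if data_type == "performance":
        # Performance data is stored locally (block 640), possibly after
        # pre-processing (rates, deltas, percentages) and aggregation.
        store.append(value)
        return None
    # Fault data: compare against one or more thresholds (block 645),
    # checking the most severe threshold first.
    for severity, limit in sorted(thresholds.items(), key=lambda kv: -kv[1]):
        if value >= limit:
            # Threshold violated: start fault exception processing (block 660);
            # the event is fed into the correlation processor.
            events.append((severity, value))
            return severity
    return None  # no violation: method left via RETURN node 670

store, events = [], []
handle_result("performance", 42.0, {}, store, events)
assert store == [42.0] and events == []

sev = handle_result("fault", 95.0, {"WARNING": 80, "CRITICAL": 90}, store, events)
assert sev == "CRITICAL" and events == [("CRITICAL", 95.0)]
assert handle_result("fault", 50.0, {"WARNING": 80, "CRITICAL": 90}, store, events) is None
```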
  • all data is stored in a JDBC-compliant SQL database such as Oracle or MySQL.
  • Data is collected by the DGEs and stored using JDBC in one of a set of distributed databases which may be local or remote on another server.
  • Such distributed storage minimizes data maintenance requirements and offers parallel processing.
  • All events (i.e., test results that cross a threshold) may be recorded for historical reporting and archiving.
  • Information may be permanently stored for all events (until expired from database). All messages and alerts that may have been received may be permanently stored by the appropriate DGE (until expired from the database).
  • Raw results data (polled data values) may be progressively aggregated over time.
  • a default aggregation scheme is five-minute samples for a day, 30-minute averages for a week, one-hour averages for three months and daily averages for a year.
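The default aggregation scheme can be illustrated with a short sketch (illustrative only; the disclosure does not specify aggregation code):

```python
# Sketch of progressive aggregation: raw five-minute samples are rolled up
# into 30-minute averages (and further into hourly and daily averages,
# per the default scheme above). Purely illustrative.

def aggregate(samples, group_size):
    """Average consecutive groups of `group_size` samples."""
    return [
        sum(samples[i:i + group_size]) / group_size
        for i in range(0, len(samples) - group_size + 1, group_size)
    ]

# Twelve 5-minute samples = one hour of raw data.
raw = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
half_hourly = aggregate(raw, 6)    # 6 samples x 5 minutes = 30 minutes
hourly = aggregate(half_hourly, 2)
assert half_hourly == [35.0, 95.0]
assert hourly == [65.0]
```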
  • Each event, as well as each exception or message received by the DGE is assigned a severity.
  • a message is assigned a severity based on a user-specified regular expression pattern match.
  • the visual GUI indicates these severity conditions by unique icons or other means.
  • severity states such as OK, WARNING, CRITICAL, and UNREACHABLE are supported.
  • Tests may be ‘suppressed’ when they are in a known condition, and are hidden from view until the state changes, after which the suppressed flag is automatically cleared.
  • An event may be recorded for a test's very first result and for every time a test result crosses a defined threshold. For example, the very first test result for an ICMP round trip time test falls into the “OK” range. Five minutes later, the same test returns a higher value that falls in the “WARNING” range. Another five minutes passes, the test is run again, and the round trip time decreases and falls back into the “OK” range. For the ten minutes that just passed, three separate events may have been recorded: one because the test was run for the first time, and two more for crossing the “WARNING” threshold, both up and back.
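The event-recording behavior in this example can be sketched as follows (an illustrative sketch; the threshold value and function names are assumptions):

```python
# Sketch of event recording: an event is logged for a test's very first
# result and whenever a result crosses into a different severity range.
# Illustrative only.

def severity_of(rtt_ms):
    """Map an ICMP round-trip time to a severity range (example threshold)."""
    return "OK" if rtt_ms < 100 else "WARNING"

def record_events(results):
    events, last = [], None
    for value in results:
        state = severity_of(value)
        if state != last:            # first result, or a threshold crossing
            events.append((value, state))
            last = state
    return events

# OK -> WARNING -> OK over three 5-minute polls yields three events,
# matching the ICMP round-trip-time example above.
assert record_events([40, 150, 60]) == [(40, "OK"), (150, "WARNING"), (60, "OK")]
```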
  • One-time text messages, SNMP traps, or text alarms may be displayed in a separate ‘message’ window. All messages should have a severity and device associated with them, and the user can filter the messages displayed and acknowledge them to remove them from the messages window. A user can match on a regular expression and assign a severity to a text message, thus triggering actions and notifications similar to events.
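Regular-expression severity matching for messages might look like the following sketch (the patterns and function names are illustrative assumptions):

```python
import re

# Sketch of assigning a severity to a one-time text message via
# user-specified regular-expression patterns. Patterns are examples only.
SEVERITY_PATTERNS = [
    (r"(?i)\b(fail|down|critical)\b", "CRITICAL"),
    (r"(?i)\b(degraded|slow|warn)\b", "WARNING"),
]

def classify_message(text, default="OK"):
    """Return the severity of the first matching pattern, else the default."""
    for pattern, severity in SEVERITY_PATTERNS:
        if re.search(pattern, text):
            return severity
    return default

assert classify_message("Link down on port 3") == "CRITICAL"
assert classify_message("Response time degraded") == "WARNING"
assert classify_message("Routine status update") == "OK"
```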
  • An action may be a notification via email or pager, or any other programmable activity such as opening a trouble ticket or restarting a server.
  • Actions may be configured and assigned to tests in the form of a profile, with each profile preferably containing any number of individual sub-actions. Each of these sub-actions may be configured with the following information:
  • data collection and storage is distributed across various DGEs, which each store data locally or in a remote distributed database.
  • data analysis may be distributed across various DGEs, each of which may analyze local data.
  • a (more) centralized reporting facility is relieved of at least some data storage and analysis responsibilities.
  • FIG. 5 is a flow diagram of an exemplary method 500 that may be used to perform information extraction, combination and presentation operations. As indicated by trigger (event) block 510, various branches of the method 500 may be effected depending upon the occurrence of a trigger (event).
  • In response to a user query (Note that a user login may imply a default query.), the user should be authenticated as indicated by block 520. Any known authentication technique, such as password, RADIUS, or an external directory, may be used.
  • a user's authorization is determined as indicated by block 522 .
  • a user's authorization may depend on a group to which the user belongs. (Recall, e.g., data structure 900 of FIG. 9.)
  • An administrator may associate a user to a group using the configuration API.
  • a group object may have defined “permissions” (e.g., create actions, create devices, see data of other user, etc.) and defined “limits” (e.g., number of devices, types of devices, device locations, number of tests, etc.).
  • the defined permissions are typically provided for security purposes.
  • the defined limits are typically provided for security purposes and/or for providing flexible software licensing terms.
  • a database query may be generated using a report type (e.g., fault report or performance report) and the user's authorization.
  • The query may then be disseminated (e.g., via a multicast or broadcast fan-out) to the appropriate data gathering operations (e.g., DGE databases). Since the configuration information associates users with devices (see, e.g., 710 and 720 of FIG. 7), the appropriate DGEs can be determined.
  • the query can be simply broadcast to all DGEs.
  • Non-relevant DGEs can simply not transmit back their data.
  • the data combination act (described later with reference to block 546 ) could suppress such non-relevant data.
  • When a query response is received, as indicated by decision block 540, it is determined whether all (or enough) responses have been received. If not, it is determined whether a time out (for receiving enough query responses) has occurred. If not, the method 500 branches back to trigger (event) block 510. If, on the other hand, a time out has occurred, a time out error action may be taken as indicated by block 544, before the method 500 is left via RETURN node 550. Referring back to decision block 540, if it is determined that all (or enough) responses have been received, the data from the various DGEs is combined (e.g., correlated) for presentation, as indicated by block 546.
  • a presentation of the information (e.g., a report, a table, a graph, etc.) is generated for rendering to the user. Since the method 500 gets “fresh” data from the distributed databases, real-time performance reporting is possible in addition to real-time fault reporting. Accounting, if any, is performed as indicated by block 549, before the method 500 is left via RETURN node 550.
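The response-collection and combination steps (blocks 540 and 546) can be sketched as follows (a simplified, synchronous illustration; all names are assumptions):

```python
# Sketch of combining query responses from multiple DGEs (blocks 540-546).
# A response is awaited from each relevant DGE; on timeout an error action
# is taken, otherwise the per-DGE results are merged for presentation.
# Illustrative only.

def combine_responses(responses, expected, timed_out=False):
    """Merge per-DGE result lists once all expected responses have arrived."""
    if len(responses) < expected:
        if timed_out:
            return {"error": "timeout", "partial": sum(responses, [])}
        return None  # keep waiting (branch back to trigger block 510)
    combined = sum(responses, [])
    combined.sort(key=lambda row: row["device"])  # correlate for presentation
    return {"rows": combined}

dge1 = [{"device": "router-2", "loss": 0.0}]
dge2 = [{"device": "router-1", "loss": 1.5}]
assert combine_responses([dge1], expected=2) is None
result = combine_responses([dge1, dge2], expected=2)
assert [r["device"] for r in result["rows"]] == ["router-1", "router-2"]
```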
  • the user can “drill-down” into a report to view data or information underlying a presentation result.
  • Such a presentation may be in the form of reports, graphs and tables. Exemplary reports, graphs and tables are now described. Various embodiments of the present invention may support some or all of the following reports.
  • FIG. 16 illustrates an exemplary account status summary report.
  • FIG. 17 illustrates an exemplary service status summary report. Administrative users can then drill down on individual devices for more detail. End users running the report will only see the device level metrics.
  • a “Downtime” report is similar to the Availability report, in that it is based on device availability as measured by the ICMP packet loss test. However, the results are only for device states equal to CRITICAL, rather than CRITICAL and UNREACHABLE. This more accurately reflects the situation when a single device outage occurs, with no regard for any possible parent device outages that may cause a child device to become UNREACHABLE. Again, downtime distribution metrics and a histogram permit administrative users to see account level metrics and drill down to individual device details, whereas end users may only see the device level metrics.
  • An exemplary “Event” report is illustrated in FIGS. 11A and 11B .
  • a “Number of Events per Day” report displays the number of events recorded each day during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics.
  • a “Number of Events” report displays the total number of events recorded during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics.
  • An “Event Distribution” report displays the total number of events recorded during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics.
  • the histogram is an event duration distribution of the numbers of accounts/devices/tests falling into bins of equal duration for the reporting period. That is, the reporting period may be divided into equal multi-hour (e.g., four-hour) blocks, with the number of accounts/devices/tests falling into each of those blocks counted.
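The binning described above can be sketched as (illustrative only):

```python
# Sketch of the event-duration histogram: durations (in hours) are binned
# into equal multi-hour blocks (four-hour bins here, per the example above).
# Illustrative only.

def duration_histogram(durations_hours, bin_hours=4, period_hours=24):
    bins = [0] * (period_hours // bin_hours)
    for d in durations_hours:
        index = min(int(d // bin_hours), len(bins) - 1)
        bins[index] += 1
    return bins

# Event durations of 1h, 3h, 5h and 23h over a 24-hour reporting period.
assert duration_histogram([1, 3, 5, 23]) == [2, 1, 0, 0, 0, 1]
```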
  • a “Device Performance” report snapshot is a period (e.g., 24 hour) snapshot (hour by hour) of event summaries for all tests on a single device.
  • Raw event data is analyzed hourly and the worst test state is displayed for each test as a colored block on the grid (24 hours × the list of active tests on the device). For example, if a test is CRITICAL for one minute during the hour, the entire hour may be displayed as a red box representing the CRITICAL state.
  • the Device Performance Report only applies to target devices, not to device groups.
  • An exemplary test status summary report is illustrated in FIG. 12 .
  • Trend reports can use a regression algorithm for analyzing raw data and predicting the number of days until the specified thresholds are reached.
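Such a trend prediction can be sketched with a least-squares fit (an illustrative sketch; the disclosure does not specify which regression algorithm is used):

```python
# Sketch of a trend prediction: fit a least-squares line to daily samples
# and estimate the number of days until a specified threshold is reached.
# Illustrative only; names are assumptions.

def days_to_threshold(samples, threshold):
    """samples: one value per day, oldest first. Days until threshold is hit."""
    n = len(samples)
    xs = range(n)
    mean_x, mean_y = (n - 1) / 2, sum(samples) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) / \
            sum((x - mean_x) ** 2 for x in xs)
    if slope <= 0:
        return None  # not trending toward the threshold
    intercept = mean_y - slope * mean_x
    # Solve intercept + slope * x = threshold, measured from the last sample.
    return (threshold - intercept) / slope - (n - 1)

# Disk usage growing 2% per day from 50%: 80% is reached 10 days after day 5.
assert abs(days_to_threshold([50, 52, 54, 56, 58, 60], 80) - 10.0) < 1e-9
```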
  • An exemplary service instability report is illustrated in FIGS. 14A and 14B .
  • An exemplary usage and trend report is illustrated in FIG. 15 .
  • Users can define custom reports in which devices, tests and the type of report to generate for these devices (e.g., top 10, events per day, statistical, trend, event distribution) are selected.
  • the method 500 runs under an application server such as Jakarta Tomcat or BEA Weblogic.
  • FIG. 18 is a high-level block diagram of a machine 1800 that may perform one or more of the operations discussed above.
  • the machine 1800 basically includes a processor(s) 1810, an input/output interface unit(s) 1830, a storage device(s) 1820, and a system bus or network 1840 for facilitating the communication of information among the coupled elements.
  • An input device(s) 1832 and an output device(s) 1834 may be coupled with the input/output interface(s) 1830.
  • the processor(s) 1810 may execute machine-executable instructions (e.g., C or C++ or Java running on the Solaris operating system available from Sun Microsystems Inc. of Palo Alto, Calif., or the Linux operating system widely available from a number of vendors such as Red Hat, Inc. of Durham, N.C.) to perform one or more aspects of the present invention. At least a portion of the machine-executable instructions may be stored (temporarily or more permanently) on the storage device(s) 1820 and/or may be received from an external source via an input interface unit 1830.
  • the machine 1800 may be one or more conventional personal computers.
  • the processing unit(s) 1810 may be one or more microprocessors.
  • the bus 1840 may include a system bus.
  • the storage devices 1820 may include system memory, such as read only memory (ROM) and/or random access memory (RAM).
  • the storage device(s) 1820 may also include a hard disk drive for reading from and writing to a hard disk, a magnetic disk drive for reading from or writing to a (e.g., removable) magnetic disk, and an optical disk drive for reading from or writing to a removable (magneto-) optical disk such as a compact disk or other (magneto-) optical media.
  • a user may enter commands and information into the personal computer through input devices 1832 , such as a keyboard and pointing device (e.g., a mouse) for example.
  • Other input devices such as a microphone, a joystick, a game pad, a satellite dish, a scanner, or the like, may also (or alternatively) be included.
  • These and other input devices are often connected to the processing unit(s) 1810 through an appropriate interface 1830 coupled to the system bus 1840 .
  • the output device(s) 1834 may include a monitor or other type of display device, which may also be connected to the system bus 1840 via an appropriate interface.
  • the personal computer may include other (peripheral) output devices (not shown), such as speakers and printers for example.
  • a refined embodiment of the present invention can eliminate sending multiple notifications when a device goes down or is unavailable. Based on the inherent dependency between the ping packet loss test results and the availability of the device, if the ping packet loss test returns a CRITICAL result, then communication with the device has somehow been lost. Configured notifications for all other tests on the device are suppressed until packet loss returns to normal. Smart notification may include:
  • a refined embodiment of the present invention supports device dependencies to suppress excessive notifications when a gateway-type device has gone down or is unavailable.
  • Switches, routers, and other hardware are often the physical gateways that govern whether other network devices are reachable. Monitoring of many devices may be impeded if one of these critical “parent devices” becomes unavailable.
  • a parent and child hierarchy is created between monitored devices in order to distinguish the difference between a CRITICAL test on a device and an UNREACHABLE one.
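The parent/child distinction between a CRITICAL test and an UNREACHABLE device can be sketched as follows (illustrative; the data structures and names are assumptions):

```python
# Sketch of parent/child dependency checking: when a device's ping test
# goes CRITICAL, the device is CRITICAL only if its parent gateway is
# reachable; otherwise it is marked UNREACHABLE and its notifications
# are suppressed. Illustrative only.

def effective_state(device, ping_critical, parents, parent_down):
    if not ping_critical.get(device, False):
        return "OK"
    parent = parents.get(device)
    if parent is not None and parent_down(parent):
        return "UNREACHABLE"  # suppress notifications: root cause is upstream
    return "CRITICAL"

parents = {"server-a": "switch-1", "switch-1": None}
ping_critical = {"switch-1": True, "server-a": True}
down = lambda d: ping_critical.get(d, False)

# The gateway itself is CRITICAL; the device behind it is UNREACHABLE.
assert effective_state("switch-1", ping_critical, parents, down) == "CRITICAL"
assert effective_state("server-a", ping_critical, parents, down) == "UNREACHABLE"
```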
  • a refined embodiment of the present invention supports a “federated user model”. End user security may be controlled by permissions granted to a “User Group”. Each end user can only belong to a single “Account”, and each Account can only belong to a single User Group. Thus, an end user belongs to one and only one User Group for ease of administration. End users of one account are isolated from all other accounts, thus allowing various departments within an enterprise to each have a fully functional “virtual” copy of the invention.
  • Each User Group may have a unique privilege and limits matrix as defined by an Administrative user with administrative control over the User Group. Privileges for User Groups may be defined for devices, tests & actions. Limits at the User Group level may be defined for minimum test interval, max devices, max tests, max actions and max reports.
  • the system permits separate administrative users who can look at multiple ‘accounts’ (which a normal end-user cannot do).
  • This framework allows senior management or central operation centers or customer care to report on multiple departments that they are responsible for. This eliminates the need for multiple deployments of the same product, while allowing seamless reporting across services that span IT infrastructure managed by different departments in an enterprise.
  • Administrative user security may be controlled by permissions granted to an Administrative Group.
  • Administrative Groups and User Groups have a many-to-many relationship, allowing the administration of User Groups by numerous administrators who have varying permissions.
  • Privileges for Administrative Groups may be defined for accounts, users, user groups, limits, devices, tests, and actions.
  • a separate set of privileges is defined for each relationship between an Administrative Group and a User Group.
  • a very simple configuration could establish the organization's Superuser as the only administrative user and all end-users belonging to a single User Group.
  • a complex organizational model might require the establishment of Administrative Groups for Network Administration, Database Administration, and Customer Service, with User Groups for C-level executives, IT Support, Marketing, etc.
  • Superusers are not constrained by a privileges matrix—they can perform any of the actions in the matrix on any user.
  • Superusers create Administrative Groups and User Groups, and define the privileges the former has over the latter.
  • the ‘superuser’ accounts are used to effectively bootstrap the system.
  • Each User Group has a privileges matrix associated with it that describes what operations the members of that User Group can perform. As mentioned previously, there is a similar, but more complex privileges matrix that describes what operations a member of an Administrative Group can do to administer one or more User Groups.
  • Limits are numerical bounds associated with a User Group that define minimum test interval, maximum devices, maximum tests, maximum actions and maximum reports for end-user accounts. An end user's actions are constrained by the Limits object associated with their User Group, unless there is another Limits object that is associated with the particular user (e.g. Read-only user) that would override the limits imposed by the User Group.
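The override behavior of a per-user Limits object can be sketched as follows (illustrative; the field names are assumptions):

```python
# Sketch of resolving an end user's effective limits: a per-user Limits
# object, if present, overrides the Limits of the user's User Group.
# Illustrative only.

GROUP_LIMITS = {"max_devices": 100, "max_tests": 500, "min_test_interval": 60}

def effective_limits(group_limits, user_limits=None):
    limits = dict(group_limits)
    if user_limits:
        limits.update(user_limits)  # user-specific object takes precedence
    return limits

# A read-only user whose own Limits object forbids creating devices or tests.
readonly = effective_limits(GROUP_LIMITS, {"max_devices": 0, "max_tests": 0})
assert readonly["max_devices"] == 0
assert readonly["min_test_interval"] == 60  # inherited from the User Group
```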
  • Administrative users occasionally need to directly administer an end-user's account, by logging into that account and providing on-line support to view the account and perform operations. This capability is especially helpful when an end-user's capabilities are limited to administer their own account.
  • the administrative user need not use the end-user's login/password, but rather “masquerades” as the end-user subject only to the administrative user's own privileges, which are often more extensive.
  • Administrators that have permissions to create end users and their accounts have the option of creating users with read-only capabilities. In this way, administrators may give certain end users access to large amounts of data in the system, but without authority to change any of the characteristics of the devices, tests, actions or reports they are viewing.
  • When representing an end user, an administrator (if given proper create privileges) may create devices and tests for the end user in the end user's own account, via a “Represent” feature.
  • One option the administrator has at the time of device creation is to make the device read-only. The tests on the read-only device become read-only as well. This feature was created to enable an end-user to observe the activity on a mission-critical network component, such as a switch or even a switch port, but not have the authority to modify its device or test settings.
  • Data may be collected from all DGEs and presented to the user as a consolidated view, primarily using a Web-based interface.
  • An end user only needs a commonly available Web browser to access the full functionality and reporting features of the product.
  • Real-time status views are available for all accounts, devices, or tests within an administrator's domain, all devices or tests within an account, or all tests on a single device or device group. Users can drill down on specific accounts, devices, and tests, and see six-hour, daily, weekly, monthly, and yearly performance information.
  • users can set default filters for the account and device summary pages to filter out devices in OK state, etc. For example, administrators may elect to filter out accounts and devices that are in an “OK” status. Especially for large deployments, this can dramatically cut down on the number of entries a user must scroll through to have a clear snapshot of system health.
  • a toggle switch on the account and device summary pages may be used to quickly disable or enable the filter(s).
  • General administration features including: DGE location and host creation; administration of Administrative Group domains; Administration of User Group thresholds, privileges and actions; Account and user management; Administration of devices, device groups, tests and actions; and Password Management, all may be supported by a graphical user interface.
  • Via either an “Update Device” page or during device suspension, a user can enter a comment that will display on a “Device Status Summary” page. This could be used to identify why a device is being suspended, or as general information on the current state of the device.
  • the present invention can export data to other systems, or can send notifications to trouble ticketing or other NOC management tools.
  • the present invention can import data from third party systems, such as OpenView from Hewlett-Packard, to provide a single administrative and analytical interface to all performance management measurements. More specifically, the present invention can import device name, IP address, SNMP community string and topology information from the HP OpenView NNM database, thereby complementing OpenView's topology discovery with the enhanced reporting capabilities of the present invention. Devices are automatically added/removed as the nodes are added or removed from NNM. Traps can be sent between NNM and the present invention as desired.
  • the present invention can open trouble tickets automatically using the Remedy notification plug in. It can automatically open trouble tickets in RT using the RT notification plug in.
  • Physical locations (which are arbitrarily defined by the superuser) where Data Gathering Elements are installed are created in the system. Recall that a DGE is a data collection agent assigned to a “location.” To create a new DGE, its IP address and location are provided. Since multiple DGEs can exist in one location, soft and hard limits that define DGE load balancing may be set. The present invention may use a load balancing mechanism based on configurable device limits to ensure that DGE hosts are not overloaded. In this embodiment, each device is provisioned to a DGE when it is created based on the following heuristics:
  • DGE-based test setup
  • devices and tests are provisioned in the system, typically using an auto-discovery tool which finds all IP devices and available tests on them in the given subnets. Default thresholds and actions are used if none are provided by the user.
  • the system is ready to be operational.
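The DGE load-balancing provisioning mentioned above (soft and hard device limits per DGE) can be sketched as follows; the heuristic shown here, preferring the least-loaded DGE under its soft limit, is an illustrative assumption rather than the disclosed heuristics:

```python
# Sketch of DGE load balancing: a new device is provisioned to the
# least-loaded DGE at its location, respecting soft and hard device limits.
# Illustrative only; not the heuristics disclosed in the specification.

def assign_device(dges, soft_limit, hard_limit):
    """dges: {name: current device count}. Returns the chosen DGE name."""
    # Prefer DGEs under their soft limit; fall back to any under the hard limit.
    under_soft = {d: n for d, n in dges.items() if n < soft_limit}
    candidates = under_soft or {d: n for d, n in dges.items() if n < hard_limit}
    if not candidates:
        raise RuntimeError("all DGEs at this location are at their hard limit")
    choice = min(candidates, key=candidates.get)
    dges[choice] += 1
    return choice

dges = {"dge-nyc-1": 95, "dge-nyc-2": 40}
assert assign_device(dges, soft_limit=100, hard_limit=120) == "dge-nyc-2"
assert dges["dge-nyc-2"] == 41
```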
  • When a DGE is enabled (either a process on the same machine as the configuration database or on another machine), it connects to the configuration database, identifies itself and downloads its configuration. After downloading its configuration, the DGE starts monitoring tests as described earlier.
  • the fault and performance monitoring system of the present invention can be set up and installed in a stand-alone environment in a few hours. Default test settings, action profiles, and reports may be pre-loaded into the system. Lists of devices can be batch-imported automatically into the system using the API.
  • the present invention discloses apparatus, data structures and methods for combining system fault and performance monitoring.
  • with distributed data collection and storage of performance data, storage requirements are relaxed and real-time performance monitoring is possible.
  • Data collection and storage elements can be easily configured via a central configuration database.
  • the configuration database can be easily updated and changed.
  • a federated user model allows normal end users to monitor devices relevant to the part of a service they are responsible for, while allowing administrative users to view the fault and performance of a service in an end-to-end manner across multiple accounts or departments.

Abstract

Combining system fault and performance monitoring using distributed data collection and storage of performance data. Storage requirements are relaxed and real-time performance monitoring is possible. Data collection and storage elements can be easily configured via a central configuration database. The configuration database can be easily updated and changed. A federated user model allows normal end users to monitor devices relevant to the part of a service they are responsible for, while allowing administrative users to view the fault and performance of a service in an end-to-end manner.

Description

§ 1. BACKGROUND OF THE INVENTION
§ 1.1 Field of the Invention
The present invention concerns network management systems (“NMSs”). In particular, the present invention concerns combining fault and performance management.
§ 1.2 Description of Related Art
The description of art in this section is not, and should not be interpreted to be, an admission that such art is prior art to the present invention.
As computer, hardware, software and networking systems, and systems combining one or more of these systems, have become more complex, it has become more difficult to monitor the “health” of these systems. For example, FIG. 1 illustrates components of a system 100 that may be used by a so-called e-commerce business. As shown, this system may include a web interface server 110, a search and navigation server 120 associated with a product inventory database 125, a purchase or “shopping cart” server 130 associated with a user database 135, a payment server 140 associated with a credit card database 145, a transaction server 150 associated with a transaction database 155, a shipping server 180 associated with a shipping database 185, a local area network (“LAN”) 160, and a network 170 including linked routers 175. As shown, the search and navigation server 120, the purchase or “shopping cart” server 130, the payment server 140 and the transaction server 150 may communicate with one another via the LAN 160. As further shown, these servers may communicate with the shipping server 180 via the network 170.
Each of the servers may include components (e.g., power supplies, power supply backups, printers, interfaces, CPUs, chassis, fans, memory, disk storage, etc.) and may run applications or operating systems (e.g., Windows, Linux, Solaris, Microsoft Exchange, etc.) that may need to be monitored. The various databases (e.g., Microsoft SQL Server, Oracle Database, etc.) may also need to be monitored. Finally, the networks, as well as their components, (e.g., routers, firewalls, switches, interfaces, protocols, etc.) may need to be monitored.
Although the system 100 includes various discrete servers, networks, and databases, the system can be thought of as offering an end-to-end service. In this exemplary system, that end-to-end service is on-line shopping: from browsing inventory, to product selection, to payment, to shipping.
Tools have been developed to monitor these systems. Such tools have come to be known as network management systems (NMSs). (The term network management systems should not be interpreted to be limited to monitoring networks—network management systems have been used to monitor things other than networks.) Traditionally, NMSs have performed either fault management, or performance management, but not both. Fault management pertains to whether something is operating or not. Performance management pertains to a measure of how well something is working and to historical and future trends.
A fault management system generates and works with “real time” events (exceptions). It can query the state of a device and trigger an event upon a state change or threshold violation. However, fault management systems typically do not store the polled data—they only store events and alerts (including SNMP traps which are essentially events). Generally, the user interface console for a fault management system is “exception” driven. That is, if a managed element is functioning, it is typically not even displayed. Generally, higher severity fault events are displayed with more prominence (e.g., at the top of a list of faults), and less critical events are displayed with less prominence (e.g., lower in the list).
On the other hand, performance management systems generally store all polled data. This stored data can then be used to analyze trends or to generate historical reports on numerical data collected. A major challenge in performance management systems is storing such large amounts of data. For example, just polling 20 variables every 5 minutes from 1000 devices generates 6 million data samples per day. Assuming each data sample requires 50 bytes of storage, about 9 GB of data will be needed per month. Consequently, performance management systems are designed to handle large volumes of data and to perform data warehousing and reporting functions.
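The storage arithmetic in this paragraph can be verified directly (using decimal gigabytes; the figures above are rounded):

```python
# The storage arithmetic worked through: 20 variables polled every
# 5 minutes from 1000 devices, at 50 bytes per sample.

polls_per_day = 24 * 60 // 5            # 288 polls per variable per day
samples_per_day = 20 * 1000 * polls_per_day
assert samples_per_day == 5_760_000     # roughly the "6 million" quoted

bytes_per_month = samples_per_day * 50 * 30
assert round(bytes_per_month / 1e9, 1) == 8.6   # about 9 GB per month
```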
Performance management systems are typically batch oriented. More specifically, generally, distributed data collectors poll data and periodically (e.g., each night) feed them to a centralized database. Since the size of the centralized database will become huge, database management is a prime concern in such products.
As can be appreciated from the foregoing, conventional fault management systems are limited in that they do not store data gathered for later use in performance analysis. Conventional performance management systems are limited in that they require huge amounts of storage. Furthermore, since data is batched and sent to a centralized location for storage, the stored data can become “stale” if enough time has elapsed since the last batch of data was stored.
Furthermore, most enterprises currently use a minimum of two, if not more, products for information technology management. It is common to find several independent products being used by various departments within an enterprise to meet the basic needs of monitoring and performance management across networks, servers and applications. Moreover, since the performance and fault monitoring systems are disjointed, correlating data from these different systems is not trivial.
Recognizing that correlation between the collective information technology (“IT”) infrastructure and business service is needed, several Manager of Manager (“MoM”) tools have appeared in the market. These products interface with the various well known commercial tools and try to present a unified view to IT managers. Unfortunately, however, such integration is complex and requires depending on yet another product which needs to be learned and supported each time an underlying tool is updated. The addition of yet another tool just adds to the operational costs rather than reducing it.
In view of the foregoing limitations of existing network management systems, there is a need to simplify the processing related to monitoring faults and performance. There is also a need to monitor end-to-end service faults and performance of a service. Such needs should be met by a technique or system that is simple to install and administer, that has real-time capabilities, and that scales well in view of the large amount of data storage that may be required by a performance management system. Finally, there is a need to provide different users with different levels of monitoring, either for purposes of security, for purposes of software licensing, or both.
§ 2. SUMMARY OF THE INVENTION
The present invention discloses apparatus, data structures, and/or methods for distributing data gathering and storage for use in a fault and performance monitoring system.
§ 3. BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram of an e-commerce system to which the present invention may be applied to monitor faults and performance.
FIG. 2 is a bubble chart illustrating an architecture of the present invention.
FIG. 3 is a diagram illustrating an exemplary application of the present invention to the e-commerce system of FIG. 1.
FIG. 4 is a flow diagram of an exemplary method that may be used to perform system configuration operations in a manner consistent with the principles of the present invention.
FIG. 5 is a flow diagram of an exemplary method that may be used to perform information extraction, combination and presentation operations in a manner consistent with the principles of the present invention.
FIG. 6 is a flow diagram of an exemplary method that may be used to perform distributed data gathering, (preprocessing) and storage operations in a manner consistent with the principles of the present invention.
FIGS. 7–10 are exemplary object-oriented data structures that may be used to store configuration information in a manner consistent with the principles of the present invention.
FIGS. 11A and 11B illustrate an exemplary events report.
FIG. 12 illustrates an exemplary test status summary report.
FIGS. 13A and 13B illustrate an exemplary test details report.
FIGS. 14A and 14B illustrate an exemplary service instability report.
FIG. 15 illustrates an exemplary usage and trend report.
FIG. 16 illustrates an exemplary account status summary report.
FIG. 17 illustrates an exemplary service status summary report.
FIG. 18 is a block diagram of apparatus that may be used to effect at least some aspects of the present invention.
§ 4. DETAILED DESCRIPTION
The present invention involves methods, apparatus and/or data structures for monitoring system faults and system performance. The following description is presented to enable one skilled in the art to make and use the invention, and is provided in the context of particular applications and their requirements. Various modifications to the disclosed embodiments will be apparent to those skilled in the art, and the general principles set forth below may be applied to other embodiments and applications. Thus, the present invention is not limited to the embodiments shown and the inventor regards his invention as the following disclosed methods, apparatus and data structures and any other patentable subject matter.
§ 4.1 Exemplary Architecture
FIG. 2 is a bubble chart of an exemplary system fault and performance monitoring architecture 200 which employs distributed data gathering and storage. This distributed architecture enables the system to handle the large volume of data collected for performance monitoring. It also enables real-time performance monitoring. More specifically, a number of data gathering operations 210 (also referred to as “data gathering elements” or “DGEs”) are distributed across a number of facilities or components of a system (not shown). For example, referring back to the exemplary system 100 of FIG. 1, a first DGE may be provided on the local area network 160, a second DGE may be provided on the network 170, and a third DGE may be provided on the shipping server 180. As indicated by the arrows, DGEs can collect traps and messages and can receive data from an external feed. As described in more detail in § 4.2 below, DGEs can perform further tasks. Data gathered and/or generated by each DGE 210 is stored in an associated database 220.
DGEs 210 can be configured using system configuration operations 230, in accordance with a configuration database 240. Basically, the system configuration operations 230 can (i) allow configuration information to be entered into the configuration database 240, (ii) inform each DGE 210 of its startup configuration, and (iii) inform each DGE 210 of runtime changes to its configuration.
Information extraction, combination and presentation operations 250 may collect fault information from the DGEs 210 (either by asking a proxy process or directly via their databases 220), may collect performance information from the databases 220 of the DGEs 210, may combine fault and performance information from different DGEs, and may present fault and performance information to a user in a unified, integrated manner. The presentation of this information may be in the form of screens, graphs, reports, etc.
Finally, an application programming interface (“API”) operation 260 may be provided to permit users to expand the fault and performance monitoring functionality of the present invention. In one embodiment consistent with the principles of the present invention, the API permits provisioning accounts, users, devices, tests, actions, DGE locations, and DGE hosts through a socket interface. Such an embodiment enables mass data entry, updates and searches. Searches for test results and events are also permitted via this interface. A limited number of reports are available, although a full complement of reporting is offered via a graphical user interface (“GUI”). In a particular embodiment of the present invention, a perl API is provided which uses the underlying socket interface. Organizations with large numbers of monitored devices can provision, update or search systems using the API.
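The socket-based provisioning interface described above can be sketched as follows. This is a minimal illustration, not the actual API: the patent states only that accounts, users, devices, tests, actions and DGEs may be provisioned over a socket, so the `action.object field="value"` wire format, the function names, and the port are all assumptions made for the example.

```python
import socket

def build_command(action, object_type, **fields):
    # Build a line-oriented provisioning command, e.g.
    #   add.device address="10.0.0.5", name="webserver1"
    # (This wire format is hypothetical; only the socket interface itself
    # is described in the text above.)
    pairs = ", ".join(f'{k}="{v}"' for k, v in sorted(fields.items()))
    return f"{action}.{object_type} {pairs}\n"

class ProvisioningClient:
    """Minimal socket client for mass data entry, updates and searches."""

    def __init__(self, host, port=7700):  # port number is an assumption
        self.host, self.port = host, port

    def send(self, command):
        # One request/response exchange per provisioning command.
        with socket.create_connection((self.host, self.port)) as sock:
            sock.sendall(command.encode("utf-8"))
            return sock.recv(4096).decode("utf-8")
```

A mass-provisioning script would loop over a device inventory calling `build_command("add", "device", ...)` and sending each line through a single `ProvisioningClient`.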
In one embodiment of the invention, the system configuration operations 230, the configuration database 240, the information extraction, combination and presentation operations 250, and the API operations 260 may all be performed from and provided at the same facility or server. The information extraction, combination and presentation operations may be referred to as a “business visibility engine” or “BVE”. A “BVE” may also include the configuration operations 230, the configuration database 240, and the API operations 260.
Recall that although some traditional NMS products have distributed collectors, they require consolidating all the data into a central database for reporting. The architecture 200 of FIG. 2 is thus quite different in that the information extraction, combination and presentation operations 250 seamlessly integrate the distributed DGE databases 220 and can issue queries in parallel across the distributed DGEs 210. The responses from such queries can then be combined (also referred to as response “correlation”). The n-tier architecture 200 is centered on a configuration database management system. The distributed nature of the system 200 permits committing explicit resources to important processes and systems, hence achieving real-time scalability and performance. Typical traffic flow across an n-tier system consists of a number of clients that access services from one tier, which in turn requests services from one or more other systems.
This architecture pushes even the correlation and notification to the distributed DGEs so that there is no central bottleneck and the system operates as a loosely coupled but coordinated cluster. One embodiment, consistent with the principles of the present invention, uses key technology standards such as XML, JMS, JDBC, SOAP and XSLT layered on a J2EE framework.
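The parallel query and response-correlation behavior described above can be sketched as follows. This is a simplified model, not the J2EE/JDBC implementation: each DGE database is represented by a plain callable, and "correlation" is reduced to tagging each row with its originating DGE before merging.

```python
from concurrent.futures import ThreadPoolExecutor

def query_all_dges(dges, query):
    # Issue the same query to every DGE in parallel and merge the rows.
    # `dges` maps a DGE name to a callable that runs the query against that
    # DGE's local database (the callables stand in for JDBC connections).
    with ThreadPoolExecutor(max_workers=max(len(dges), 1)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in dges.items()}
        merged = []
        for name, fut in futures.items():
            # Tag each row with its source DGE so combined results can still
            # be traced back to the location that produced them.
            merged.extend((name, row) for row in fut.result())
    return merged
```

Because no query waits on another, total latency is bounded by the slowest DGE rather than the sum of all DGEs, which is what removes the central reporting bottleneck.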
§ 4.2 ENVIRONMENT IN WHICH THE PRESENT INVENTION MAY OPERATE
FIG. 3 illustrates an exemplary system 300 in which the fault and performance monitoring architecture of FIG. 2 has been applied to the exemplary e-commerce system 100 of FIG. 1. The components of the exemplary e-commerce system 100 are depicted with dashed lines. As shown, a first data gathering element (and an associated database) 310 a/320 a is provided on the LAN 160, a second data gathering element (and an associated database) 310 b/320 b is provided on the shipping server 180′, and a third data gathering element (and an associated database) 310 c/320 c is provided on the network 170′. These elements may be configured by, and may provide information to, a business visibility engine 390. The business visibility engine 390 may include system configuration operations 330, a configuration database 340, information extraction, combination and presentation operations 350 and API operations 360.
§ 4.3 EXEMPLARY METHODS, APPARATUS AND DATA STRUCTURES
Exemplary methods, apparatus, and data structures that may be used to effect the configuration, data gathering, and information extraction, combination and presentation operations are now described.
§ 4.3.1 CONFIGURATION
System configuration may include information learned or discovered from the system and/or information entered via the API operation. FIG. 4 is a flow diagram of an exemplary method 400 that may be used to generate system configuration information. As indicated by block 410, a list of (e.g., Internet Protocol) networks can be read and this list can be used to discover devices (e.g., servers, routers, applications, etc.) on those networks. Alternatively, this information may be manually entered or otherwise defined (e.g., via the API operation). Each of the devices is associated with one or more fault and/or performance tests as indicated by block 420. This association may be established via an auto-discovery mechanism. Alternatively, this association may be manually entered or otherwise defined (e.g., via the API operation). As shown in the exemplary data structure 700 of FIG. 7, each of a number of device objects 720 may include one or more test objects 730.
Further, each of at least one data gathering operation (e.g., a DGE) is associated with one or more of the devices as indicated by block 430. This association may be manually entered or otherwise defined (e.g., via the API operation), but is preferably discovered. In one embodiment, a DGE at a particular location is associated with devices at the same location. In this embodiment, when additional DGEs are added to a location, the load of monitoring the devices at that location may be balanced across the DGEs at that location. As shown in the exemplary data structure 1000 of FIG. 10, a location 1010 may include one or more DGEs 1020. Each of the DGEs 1020 may be associated with one or more device objects 1030.
As indicated by block 440, thresholds are associated with the tests. The thresholds may be default thresholds, or may be provided, for example via the API operation, on a case-by-case basis. Exemplary thresholds, for example, may include a “warning” threshold and a “critical” threshold. As just alluded to, the test may, by definition, include (default) thresholds. Similarly, as indicated by block 450, performance test parameters may be associated with at least some of the tests. The parameters may be default parameters, or may be provided, for example via the API operation, on a case-by-case basis.
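A minimal sketch of how a test result might be mapped to a severity using the “warning” and “critical” thresholds of block 440. The function name and the greater-than comparison are assumptions; some tests (e.g., free disk space) would naturally compare in the opposite direction.

```python
def classify(value, warning, critical):
    # Map a test result to a severity using the thresholds associated with
    # the test (defaults, or per-test values supplied via the API operation).
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return "ok"
```

For example, a CPU-load test with `warning=80` and `critical=90` would report a sample of 85% as a warning and a sample of 95% as critical.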
As indicated by block 460, a number of actions may be provided, and one or more tests may be associated with each action. For example, an action may be “e-mail a critical threshold violation to network administrator”. A number of fault tests may be associated with this action such that if any of the tests violate a critical threshold, the network administrator is informed. These associations may be entered via the API operation, or may be defined in some other way (e.g., by default). As shown in the exemplary data structure 800 of FIG. 8, an action object 810 may include one or more test objects 820.
The various associations may be stored in the configuration database 240. Although these associations may be stored in an object-oriented database, other data structures may be used to store this information in an alternate database type. However, an object-oriented database allows easy and flexible schema maintenance as compared to other database types available today.
Referring back to FIG. 4, the fault and performance configuration information may be provided (e.g., signaled) to respective data gathering operations as indicated by block 470. If the respective data gathering operations are already available (e.g., on standby), this signaling may occur immediately. If, on the other hand, the respective data gathering operations are not yet available, this signaling may be done in response to an indication that a new data gathering operation has been added. For example, in such an embodiment, upon startup, a DGE only needs to know its own identifier (as used in the configuration database) and the (IP) address of the server running the configuration database. Further, if there is a failure, a new DGE can be started up with the identifier of the failed DGE, and this new DGE will download its configuration from the configuration database and thus assume the work of the failed DGE. Furthermore, if a connection to the configuration database is lost, or if the configuration database goes down, configured DGEs can continue to function as presently configured until the connection and/or configuration database is restored.
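The startup and failure behavior just described can be sketched as follows. The function names and the shape of the configuration object are assumptions; the essential points from the text are that a DGE needs only its identifier and the configuration server address, and that it keeps running on its last-known configuration if the configuration database becomes unreachable.

```python
def start_dge(dge_id, config_server, fetch_config, cached_config=None):
    # Bring up a DGE knowing only its own identifier and the address of the
    # configuration database server. `fetch_config(server, dge_id)` stands in
    # for the real download from the configuration database.
    try:
        config = fetch_config(config_server, dge_id)
    except ConnectionError:
        if cached_config is None:
            raise  # a first start with no reachable config database is fatal
        config = cached_config  # keep monitoring with the last-known config
    return config
```

Note that failover falls out of the same path: starting a replacement DGE with the failed DGE's identifier makes `fetch_config` return the failed DGE's configuration, so the replacement assumes its work.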
§ 4.3.1.1 MONITORS AND PLUG-INS
Recall from block 420 that tests may be associated with a device. A “monitor” at a DGE performs a test based on the test object. A “scheduler” at the DGE determines a test type from the test object and then puts it onto a queue for the monitor. Thus, the actual testing is done via a monitor of a DGE.
Although monitors may be predefined, the API operation may allow users to create “plug-ins” to define new tests (e.g., for a new device) to be performed by new monitors. In this regard, monitors are similar to device drivers in a PC operating system. More specifically, a PC operating system has drivers for many popular peripherals. However, device drivers for new peripherals or less popular peripherals may be added. Similarly, as new device types are added to the system being monitored, new monitors for testing these new device types may be added. The present invention may overprovision a DGE with monitors. In this way, even though some monitors might not be used, as devices are added, the DGE can simply activate a monitor needed to test the newly added device.
A list of at least some exemplary monitors that may be supported by the present invention is provided in § 4.3.1.1.1 below.
§ 4.3.1.1.1 EXEMPLARY NETWORK MONITORS
ICMP network monitors may be used to check the reachability of hosts on an Internet Protocol (“IP”) network using the ICMP protocol. The ICMP monitor reports on packet loss and latency for a sequence of ICMP packets. These monitors may include:
    • ICMP Round Trip Time—Average time of 5 packets sent at 1 second intervals of 100 bytes each. Measured in milliseconds.
    • ICMP Packet Loss—% of packets lost out of 5 packets sent at 1 second intervals of 100 bytes each.
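The two ICMP metrics above reduce to a simple summary over one batch of probe packets. A minimal sketch, assuming each lost packet is recorded as `None` in the sample list:

```python
def summarize_ping(rtts_ms, packets_sent=5):
    # Summarize one ICMP test: rtts_ms holds the round-trip time (ms) of
    # each packet that came back; lost packets are recorded as None.
    replies = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (packets_sent - len(replies)) / packets_sent
    avg_rtt = sum(replies) / len(replies) if replies else None
    return avg_rtt, loss_pct
```

With 5 packets of 100 bytes sent at 1-second intervals, three replies of 10, 20 and 30 ms yield an average round-trip time of 20 ms and 40% packet loss.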
SNMP network monitors may be used for querying devices using the standard SNMP v1, v2 and v3 protocols. Certain enhancements have been made to the monitor, such as using 64-bit counters where available, accounting for rollover of 32-bit counters, asynchronous polling to avoid waiting for responses and to optimize timeout periods, sending multiple queries in the same SNMP packet, automatically sending individual queries if the multiple-query packet fails for any reason, and querying an alternate SNMP port instead of the default. In an exemplary embodiment, an external definition library has been built which defines which SNMP variables are to be queried, and which post-processing (such as rate, delta, etc.) is to be applied, based on the device type. This permits easily updating the definition library without having to edit the core product. These monitors may include:
    • Bandwidth Utilization by Interface—% of total network bandwidth, both incoming and outgoing, calculated by the delta bytes between each sample.
    • Throughput by Interface—number of packets per second.
    • Interface Errors—CRC error rate (per minute) calculated by the delta between sample intervals.
    • BGP Monitor—BGP peer state (connected or failed), route flaps (rate of routing updates).
    • Environment—Cisco, Foundry chassis temperature, fan status, power supply.
    • SNMP Traps—Customizable trap handler which assigns a severity to received traps based on a customizable configuration file and inserts them into the system.
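The delta-based calculations in the list above depend on correct handling of counter rollover, since SNMP v1 interface counters are 32-bit and wrap. A sketch of the arithmetic, assuming two successive samples of an interface's octet counter and a known interface speed:

```python
def counter_delta(prev, curr, bits=32):
    # Delta between two SNMP counter samples, accounting for rollover of
    # fixed-width counters (32-bit by default; 64-bit where available).
    if curr >= prev:
        return curr - prev
    return (1 << bits) - prev + curr  # counter wrapped past its maximum

def utilization_pct(prev_octets, curr_octets, interval_s, if_speed_bps):
    # % of interface bandwidth used between two samples, calculated from
    # the delta in the octet counter (octets -> bits).
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_s * if_speed_bps)
```

For example, a delta of 1,250,000 octets over a 10-second interval on a 10 Mbps interface works out to 10% utilization; without the rollover branch, a wrapped counter would produce a huge negative delta and a nonsense reading.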
SNMP Host Resources (SNMP v1, v2, v3) monitors may include:
    • CPU load—Average % per minute.
    • Disk space—% of total disk available for each partition; does not show total size.
    • Physical Memory—% of physical memory used.
    • Virtual Memory—% of virtual memory used.
    • Paging/Memory Swapping—number of page swaps per unit time.
    • Printer MIB support—printer health, paper tray capacity, cover status, available storage.
TCP Port monitors may be used for monitoring the transactions of well-known Internet services such as HTTP, HTTPS, FTP, POP3, IMAP, IMAPS, SMTP and NNTP.
Exemplary port monitors may include:
    • HTTP—Hypertext Transport Protocol—Monitors the availability and response time of HTTP Web servers. Checks for error response.
    • HTTPS—HTTP Secure Socket Layer—This monitor supports all of the features of the HTTP monitor, but also supports SSL encapsulation, in which case the communication is encrypted using SSLv2/SSLv3 protocols for increased security. The monitor may establish the SSL session and then perform HTTP tests to ensure service availability.
    • SMTP—Simple Mail Transport Protocol—Monitors the availability and response time of any mail transport application that supports the SMTP protocol (e.g., Microsoft Exchange, Sendmail, Netscape Mail.)
    • POP3—Post Office Protocol (E-mail)—Monitors the availability and response time of POP3 email services. If a legitimate username and password are supplied, it may log in and validate the server response.
    • Generic Port—Any TCP port can be monitored for a response string.
    • IMAP4—Internet Message Access Protocol—Monitors the availability and response time of IMAP4 email services. If a legitimate username and password are supplied, it may log in and validate the server response.
    • IMAPS—IMAP Secure Socket Layer—This monitor may support all of the features of the IMAP monitor, but may also support SSL encapsulation, in which case the communication is encrypted using SSLv2/SSLv3 protocols for increased security. The monitor may establish the SSL session and then perform IMAP tests to ensure service availability.
    • FTP—File Transport Protocol—Monitors the availability and response time of an FTP port connection. It may send a connection request, receive an OK response and then disconnect. If a legitimate username and password are supplied, it may log in and validate the server response.
    • NNTP—Connects to the NNTP service to check whether or not Internet newsgroups are available, receives an OK response and then disconnects. Note that for the POP, FTP and IMAP monitors, if the user does not specify a username or password, then a port connection alone is deemed OK. If the user specifies a username/password combination, then an actual LOGIN is required for the test to pass; otherwise the test fails.
§ 4.3.1.1.1.1 MORE NETWORK MONITORS
The Simple Network Management Protocol (“SNMP”) is a popular protocol for network management. SNMP facilitates communication between a managed device (i.e., a device with an SNMP agent, such as a router for example) and an SNMP manager or management application (represents a user of network management). The SNMP agent on the managed device provides access to data (managed objects) stored in the managed device. The SNMP manager or management application uses this access to monitor and control the managed device.
Communication between the managed device and the management operation is via SNMP Protocol Data Units (“PDUs”) that are typically encapsulated in UDP packets. Basically, four kinds of operations are permitted between managers and agents (managed device). The manager can perform a GET (or read) to obtain information from the agent about an attribute of a managed object. The manager can perform a GET-NEXT to do the same for the next object in the tree of objects in the managed device. The manager can perform a SET (or write) to set the value of an attribute of a managed object. Finally, the agent can send a TRAP, or asynchronous notification, to the manager telling it about some event in the managed device.
SNMP agents for different types of devices provide access to objects that are specific to the type of device. To enable the SNMP manager or management application to operate intelligently on the data available in the device, the manager needs to know the names and types of objects in the managed device. This is made possible by Management Information Base (“MIB”) modules, which are specified in MIB files usually provided with managed devices. (See, e.g., the publication Request for Comments 1213, the Internet Engineering Task Force (incorporated herein by reference).)
One embodiment of the present invention may support at least some of the following SNMP MIBs:
  • RFC1253—OSPF Version 2
    • OSPF {neighbor} Status
    • OSPF {neighbor} Errors
    • OSPF External LSA
    • OSPF LSA Sent/Received
  • RFC1514—Host Resources MIB
    • Disk Space Utilization
    • Physical Memory Utilization
    • Swap/Virtual Memory Utilization
    • CPU Load
    • Running Application/Process Count
    • Logged In User Count
  • RFC1657—Border Gateway Protocol (BGP-4)
    • BGP {neighbor} Status
    • BGP {neighbor} Updates
    • Sent/Received
    • BGP {neighbor} FSM Transitions
  • RFC1697—Relational Database Management
    • {rdbms} Status
    • {rdbms} Disk Space Utilization
    • {rdbms} Transaction Rate
    • {rdbms} Disk Reads/Writes
    • {rdbms} Page Reads/Writes
    • {rdbms} Out Of Space Errors
  • RFC1724—RIP Version 2
    • RIP Route Changes
    • RIP {interface} Updates Sent
    • RIP {neighbor} Bad Routes Received
  • RFC1759—Printer MIB
    • Printer Status
    • Printer Paper Capacity
    • Printer Door Status
  • RFC2115—Frame Relay DTE
    • Frame Relay {dlci} Status
    • Frame Relay {dlci} FECN/BECN
    • Frame Relay {dlci} Discards/DE
    • Frame Relay {dlci} Traffic In/Out
  • RFC2863—Interfaces Group MIB
    • {interface} Status
    • {interface} Utilization In/Out
    • {interface} Traffic In/Out
    • {interface} Packets In/Out
    • {interface} Discards In/Out
    • {interface} Errors In/Out.
One embodiment of the present invention may support at least some of the following vendor specific MIBs:
  • APC UPS
    • UPS Battery Status
    • UPS Battery Capacity
    • UPS Battery Temperature
    • UPS Voltage
    • UPS Output Status
  • Checkpoint FW-1
    • Packets Accepted
    • Packets Rejected
    • Packets Dropped
    • Packets Logged
    • CPU Utilization
  • Cisco 340/350 Wireless Access Points
    • Associated Stations
    • Neighbor Access Point Count
  • Cisco Local Director
    • Virtual {server}:{port} Status
    • Virtual {server}:{port} Connections
    • Virtual {server}:{port} Traffic In/Out
    • Virtual {server}:{port} Packets In/Out
    • Real {server}:{port} Status
    • Real {server}:{port} Connections
    • Real {server}:{port} Traffic In/Out
    • Real {server}:{port} Packets In/Out
    • Failover Cable Status
  • Cisco PIX Firewall
    • Firewall Status
    • Active IP Connections
    • Active FTP Connections
    • Active HTTP Connections
    • Active HTTPS Connections
    • Active SMTP Connections
    • Active H.323 Connections
    • Active NetShow Connections
    • Active NFS Connections
  • Cisco Router/Catalyst Switch
    • {interface} CRC Errors
    • Backplane Utilization
    • VLAN Traffic In/Out
    • VLAN Error In/Out
    • CPU Utilization
    • Memory Utilization
    • Buffer Allocation Failure
    • Chassis Temperature
    • Fan Status
    • Power Supply Status
    • Module Status
  • Compaq Insight Manager
    • Network Interface Status
    • Network Interface Utilization In/Out
    • Network Interface Alignment Error In/Out
    • Network Interface FCS Error In/Out
    • CPU Utilization
    • Disk Space Utilization
    • RAID Controller Status
    • RAID Array Chassis Temperature
    • RAID Array Fan Status
    • RAID Array Power Supply Status
  • Foundry Network Router/Switch
    • CPU Utilization
    • Chassis Temperature
    • Fan Status
    • Power Supply Status
  • HP/UX
    • Disk Space Utilization
    • Physical Memory Utilization
    • Swap/Virtual Memory Utilization
    • CPU Load
    • Running Application/Process Count
    • Logged In User Count
  • LAN Manager (Windows Only)
    • Windows Login Errors
    • System Errors
    • Workstation I/O Response
    • Active Connections
  • Microsoft DHCP Server
    • Available Address In Scope
    • DISCOVER Request Received
    • REQUEST Request Received
    • RELEASE Request Received
    • OFFER Response Sent
    • ACK Request Received
    • NACK Request Received
  • Microsoft Exchange Server
    • Exchange Server Traffic In/Out
    • Exchange Server ExDS Access Violations
    • Exchange Server ExDS Reads
    • Exchange Server ExDS Writes
    • Exchange Server ExDS Connections
    • Exchange Server Address Book Connections
    • Exchange Server LDAP Queries
    • Exchange Server MTS
    • Exchange Server SMTP Connections
    • Exchange Server Failed Connections
    • Exchange Server Queue
    • Exchange Server Delivered Mails
    • Exchange Server Looped Mails
    • Exchange Server Active Users
    • Exchange Server Active Connections
    • Exchange Server Xfer Via IMAP
    • Exchange Server Xfer Via POP3
    • Exchange Server Thread Pool Usage
    • Exchange Server Disk Operation (delete)
    • Exchange Server Disk Operation (sync)
    • Exchange Server Disk Operation (open)
    • Exchange Server Disk Operation (read)
    • Exchange Server Disk Operation (write)
  • Microsoft Internet Information Server (IIS)
    • Incoming/Outgoing Traffic
    • Files Sent/Received
    • Active Anonymous Users
    • Active Authenticated Users
    • Active Connections
    • GET Requests
    • POST Requests
    • HEAD Requests
    • PUT Requests
    • CGI Requests
    • Throttled Requests
    • Rejected Requests
    • Not Found (404) Errors
  • Microsoft SQL Server (Using Network Harmoni ACM)
    • {database} Status
    • {database} Page Reads/Writes
    • {database} TDS Packets
    • {database} Network Errors
    • {database} CPU Utilization
    • {database} Threads
    • {database} Page Faults
    • {database} Users Connected
    • {database} Lock Timeouts
    • {database} Deadlocks
    • {database} Cache Hit Ratio
    • {database} Disk Space Utilization
    • {database} Transaction Rate
    • {database} Log Space Utilization
    • {database} Replication Rate
  • Oracle 8/9i Database
    • Oracle DB {database} Status
    • Oracle DB {database} Disk Utilization
    • Oracle DB {database} Transaction Rate
    • Oracle DB {database} Disk Reads/Writes
    • Oracle DB {database} Page Reads/Writes
    • Oracle DB {database} OutOfSpace Errors
    • Oracle DB {database} Query Rate
    • Oracle DB {database} Committed/Aborted Transactions
    • Oracle Table {table} Space Utilization
    • Oracle Table {table} Status
    • Oracle Datafile {file} Reads
    • Oracle Datafile {file} Writes
    • Oracle Replication Status
    • Oracle Listener Status
    • Oracle SID Connections
  • Sun Solaris
    • System Interrupts
    • Swap In/Out to Disk
    • CPU Load
  • NET-SNMP (formerly UCD-SNMP)
    • Disk Space Utilization
    • Physical Memory Utilization
    • Swap/Virtual Memory Utilization
    • CPU Load
    • System Interrupts
    • Swap In/Out to Disk
    • Block I/O Sent/Received
    • System Load Average.
One embodiment of the present invention may support at least some of the following non-SNMP tests:
  • Networking
    • Ping Packet Loss
    • Ping Round Trip Time
    • RPC Ping
  • Internet Services
    • HTTP
    • HTTPS
    • SMTP
    • IMAP
    • IMAPS
    • POP3
    • POP3S
    • NNTP
    • FTP
  • Applications
    • Radius
    • NTP
    • DNS Domain
    • SQL Query
    • LDAP Search
    • DHCP Request
    • URL/Web Transaction Test
  • Custom
    • External Data Feed
    • External Plug in Monitors
    • Advanced Port Test
    • Advanced SNMP Test.
§ 4.3.1.1.2 EXEMPLARY APPLICATION MONITORS
Exemplary application monitors may include:
    • URL transaction monitor—Measures the time to complete an entire multi-step URL transaction. Can fill in forms, click on hyperlinks, etc. May work with a proxy and may also support HTTPS.
    • Oracle system performance—Measures RDBMS size, RDBMS transaction rate, and table size.
    • SQL database query—Measures query response time for a SQL query against databases such as Oracle, Sybase, SQL Server, Postgres and MySQL. Required inputs may include a legitimate username, password, database driver selection, database name, and proper SQL query syntax.
    • Poet OQL database query—Measures query response time. Required inputs may include legitimate username, password, database name, and proper OQL query syntax.
    • LDAP database query—Connects to any directory service supporting an LDAP interface and checks whether the directory service is available within response bounds and provides the correct lookup to a known entity. Required inputs may include base, scope and filter.
    • NTP—Monitors time synchronization service running on NTP servers.
    • RADIUS—Remote Authentication Dial-In User Service (RFC 2138 and 2139)—Performs a complete authentication test against a RADIUS service.
    • DNS—Domain Name Service (RFC 1035)—Uses the DNS service to look up the IP addresses of one or more hosts. It monitors the availability of the service by recording the response times and the results of each request.
    • DHCP Monitor—Checks if DHCP service on a host is available, whether it has IP addresses available for lease and how long it takes to answer a lease request.
    • RPC Portmapper—Checks if the RPC portmapper is running on a Unix host (a better alternative to icmp ping for an availability test).
    • BEA Weblogic—Checks heap size and transaction rate.
    • SQL Server—Checks state, transaction rate, write operations performance, cache hit rate, buffers, concurrent users, available database and log space.
    • LAN Manager—Checks authentication failures, system errors, I/O performance, and concurrent sessions.
§ 4.3.1.1.3 EXTERNAL DATA FEED MONITORS
External data feeds (“EDF”) monitors may be used to insert result values into the system using a socket interface. The inserted data is treated just as if it were collected using internal monitors.
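A sketch of the client side of such a feed. The pipe-delimited record layout and field order here are assumptions for illustration; the text specifies only that results are inserted through a socket interface and then treated like internally collected results.

```python
def format_edf_result(device, test, value, timestamp):
    # Format one externally collected result for insertion over the EDF
    # socket. (Hypothetical record layout: the source describes the socket
    # interface but not the wire format.)
    return f"insert|{device}|{test}|{timestamp}|{value}\n"
```

A script wrapping a legacy collector could emit one such line per sample, after which the value flows through the same thresholds, actions and reports as any monitor-collected result.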
§ 4.3.1.1.4 PLUG-IN MONITORS
The present invention can provide a plug-in monitor framework so that a user can write a custom monitor in Java or any other external script or program. The monitor itself and a definition file in XML are put into a plug-in directory, and treated as integrated parts of the DGE itself.
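A sketch of what such an XML definition file and its loader might look like. The element and attribute names below are invented for illustration; the text specifies only that a plug-in consists of the monitor itself plus an XML definition file placed in a plug-in directory.

```python
import xml.etree.ElementTree as ET

# Hypothetical plug-in definition; the actual schema is not given in the text.
PLUGIN_XML = """
<monitor name="my-custom-check" interval="300">
  <command>check_widget.py</command>
  <units>ms</units>
</monitor>
"""

def load_plugin(xml_text):
    # Parse a plug-in definition so the DGE can schedule the external
    # monitor like any built-in one.
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "interval": int(root.get("interval")),
        "command": root.findtext("command").strip(),
        "units": root.findtext("units").strip(),
    }
```

Once loaded, the scheduler can queue the plug-in's command on the declared interval, which is what makes the custom monitor "an integrated part of the DGE itself."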
§ 4.3.1.2 MONITORING BUSINESS SERVICES (END-TO-END)
Since IT infrastructure is typically used to deliver business services within an enterprise, it is increasingly important to correlate the different IT components of a business service. As an example, a payroll service may consist of a payroll application on one server, a backend database on another server, and a printer, all connected by a network router. Any of these underlying IT components can fail and cause the payroll service to go down.
Service views and reports can be created in the exemplary product by grouping together all the underlying components of a service into a consolidated service view. If and when any of the underlying IT components fails, the entire service is reported as down, thus allowing one to measure the impact of underlying IT components on business services.
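The roll-up rule just described (the service is down if any underlying component is down) can be sketched as a worst-state-wins aggregation. The severity ordering and state names are assumptions consistent with the warning/critical thresholds described in § 4.3.1.

```python
def service_status(component_states):
    # Roll component states up to a single service state: the worst state
    # of any underlying IT component becomes the state of the service.
    severity = {"ok": 0, "warning": 1, "critical": 2, "down": 3}
    return max(component_states.values(), key=lambda s: severity[s])

# The payroll example from the text: application server, backend database,
# printer and router together make up one business service.
payroll = {"app-server": "ok", "database": "ok", "printer": "down", "router": "ok"}
```

Here the failed printer alone drives the whole payroll service view to "down", which is exactly the impact measurement the consolidated service view is meant to provide.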
§ 4.3.1.3 TEST PROVISIONING
Most of the test discovery on a device is done by a separate task. Note that any adds/changes are made to the configuration database which essentially controls the behavior of the DGE processes as described earlier.
Tests can be provisioned using one or more of the following techniques.
Automated Test Discovery
Port and SNMP tests can be automatically “discovered” by querying the device to see what services are running. The system can automatically detect disk partitions, volumes and their sizes so that the usage is normalized as a percentage. This normalization may also be done for memory, disk partitions, and database tablespaces.
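The normalization step can be sketched as below; the `normalize_percent` helper is illustrative, assuming the maximum capacity has already been discovered from the device.

```python
def normalize_percent(used, maximum):
    """Express a raw usage value as a percentage of the discovered maximum,
    so disk, memory, and tablespace tests report on a common 0-100 scale."""
    if maximum <= 0:
        raise ValueError("maximum capacity must be positive")
    return 100.0 * used / maximum

# e.g. 7.5 GB used on a discovered 10 GB partition
print(normalize_percent(7.5, 10.0))  # 75.0
```

Normalizing to a percentage is what makes a single global threshold (e.g., warn at 90%) meaningful across partitions and devices of very different sizes.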
Auto-Discovery of Vendor, Model, OS
When the auto-discovery for SNMP occurs, the target device database record may be updated with vendor and model information. If a user has checked the SNMP tests box when creating a device, the model and vendor information may be displayed on a configure tests page.
Auto-Update for Device Capacity Change
The present invention can provide a mechanism for refreshing maximum values or SNMP object identifiers (SNMP OID) when an SNMP test has changed. For example, when memory or disk capacity has changed, tests that return percentage-based values would be incorrect unless the maximum value (for determining 100%) is refreshed. Similarly, in the case of a device rebuild, it is possible that the SNMP OIDs may change, thus creating a mismatch between the current SNMP OIDs and the ones discovered during initial provisioning. If any of these situations occurs, the user need only repeat the test provisioning process in the web application for a changed device. The present invention can discover whether any material changes on the device have occurred and highlight those changes on the configure tests page, giving the user the option to also change thresholds and/or actions that apply to the test.
Centralized Administration of Thresholds and Notifications
Default warning and critical thresholds may be set globally for each type of test. These defaults can be overridden at the individual device level, or reset for a set of tests in a department or other group. In addition, a service level (SLA) threshold can be set separately to track levels of service or system utilization; it does not trigger alarms or actions.
At this point, the system is configured. Data gathering and storage (in accordance with the configuration) is described in § 4.3.2 below. Then, information extraction, combination and presentation (in accordance with the configuration) is described in § 4.3.3 below.
§ 4.3.2 DATA GATHERING AND STORAGE
To reiterate, under the present invention, data gathering may be performed by distributed data gathering operations (e.g., DGEs). Gathered data may be stored locally by each DGE. Further, DGEs may optionally perform some local data preprocessing such as calculating rate, delta, percentages, etc.
FIG. 6 is a flow diagram of an exemplary method 600 that may be used to perform a data gathering operation. Since these operations are distributed, this method 600 may be performed, possibly asynchronously and independently, by multiple autonomous DGEs. As indicated by decision block 605 and block 610, if the DGE is not yet configured, it should try to get such configuration information. For example, it may do so by connecting to the configuration database and downloading any needed configuration information. Referring back to decision block 605, once the DGE is configured, it monitors device(s) in accordance with such configuration information as indicated by block 615. Recall each DGE may test devices using “monitors” at scheduled intervals specified in each test object.
The remainder of the method 600 may depend on whether the DGE gathers data using a “pull model” (i.e., with distinct requests) or whether it gathers data using a “push model” (i.e., without a distinct request). In either model, the DGE can receive an exception indication if a device performs a self-test and finds an error. Such errors are typically reported using SNMP traps or via a log message. For purposes of simplicity, the various ways of gathering data are shown together. However, a particular implementation of the present invention need not use all of these alternative data gathering techniques.
Referring to trigger (event) block 620, if it is time for the DGE to get data for a particular test (e.g., as specified by a polling scheduler in the DGE), it requests (polls for) data as indicated by block 625 and the requested data is accepted as indicated by block 630. Since these blocks “pull” data from devices, they effect a pull data gathering technique. The period at which data for a particular test is requested may be defined by the test (object) and/or configuration data associated with the test. The request may be placed in a queue. The method 600 then proceeds to decision block 635, described later.
Referring back to trigger (event) block 620, if data is made available (e.g., “pushed”) to the DGE, it accepts the data as indicated by block 655, before the method 600 proceeds to decision block 635. Since this branch accepts data that has been “pushed” to the DGE from a device, it effects a push data gathering technique.
Referring now to decision block 635, it is determined whether the data is fault data or performance data. If the data is performance data, it is stored locally as indicated by block 640, before the method 600 is left via return node 670. In one embodiment, the stored data is aggregated (e.g., daily data is combined to weekly data, weekly data is combined to quarterly data, quarterly data is combined to annual data, etc.). As shown by optional block 642, the performance data may be pre-processed. For example, the DGE can pre-process the performance data to calculate rates, deltas, percentages, etc. It can also normalize the collected data.
Referring back to decision block 635, if the data is fault data, it is compared with one or more thresholds as indicated by block 645. Then, as indicated by decision block 650, it is determined whether or not the threshold is violated. (In the following, it will be assumed that the fault data is only checked against one threshold to simplify the description. However, the data can be compared against more than one threshold, such as a “critical” threshold and a “warning” threshold.) If the threshold is not violated, the method 600 is simply left via RETURN node 670. If, on the other hand, the threshold is violated, the method 600 branches to block 660 which starts processing for a fault exception.
Referring back to trigger (event) block 620, notice that the method 600 proceeds to block 660 if a fault exception (e.g., generated by a device self-test) is reported to it. As indicated by blocks 660 and 665, an action for the fault exception is determined (Recall, e.g., data structure 800 of FIG. 8.) and performed. Thus, fault events may be handled by the DGE. As indicated by optional block 670, the occurrence of the fault exception may be stored. Thus, in this embodiment, although fault data is not stored if no threshold violation exists, the data itself, or merely the fault exception, can and should be stored in the event of a fault exception occurrence.
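The branching just described (blocks 635 through 665 of method 600) can be sketched as follows. This is a simplified illustration; `handle_result`, `store`, and `act` are hypothetical stand-ins for the DGE's local database and configured action profile.

```python
def handle_result(kind, value, threshold, store, act):
    """Sketch of method 600's result handling: performance data is stored
    locally (block 640); fault data is only checked against a threshold
    (blocks 645/650), and a violation triggers the configured action
    (blocks 660/665) rather than being stored as ordinary data."""
    if kind == "performance":
        store.append(value)           # block 640: store locally
        return "stored"
    # fault data: compare against the threshold (block 645)
    if value > threshold:             # block 650: threshold violated
        act(value)                    # blocks 660/665: determine and perform action
        return "fault"
    return "ok"                       # no violation: nothing stored

stored, alerts = [], []
handle_result("performance", 42, None, stored, alerts.append)
handle_result("fault", 99, 80, stored, alerts.append)
handle_result("fault", 10, 80, stored, alerts.append)
print(stored, alerts)  # [42] [99]
```

Note the asymmetry the text describes: performance values are always stored, while fault data only produces a stored event (and an action) when a threshold is crossed.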
In one embodiment, if a threshold has been crossed, an event is generated and fed into a correlation-processor. This thread looks at a rules engine to determine the root-cause of the problem (e.g., upstream devices, IP stack, etc.) and if a notification or action needs to be taken.
§ 4.3.2.1 DATA STORAGE
In an exemplary embodiment, consistent with the principles of the present invention, all data is stored in a JDBC compliant SQL database such as Oracle or MySQL. Data is collected by the DGEs and stored using JDBC in one of a set of distributed databases which may be local or remote on another server. Such distributed storage minimizes data maintenance requirements and offers parallel processing. All events (a test result that crosses a threshold) may be recorded for historical reporting and archiving. Information may be permanently stored for all events (until expired from database). All messages and alerts that may have been received may be permanently stored by the appropriate DGE (until expired from the database). Raw results data (polled data values) may be progressively aggregated over time. In one embodiment, a default aggregation scheme is five-minute samples for a day, 30-minute averages for a week, one-hour averages for three months and daily averages for a year.
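The progressive aggregation scheme can be sketched as a simple group-and-average pass; the `aggregate` helper below is illustrative and ignores timestamps and partial trailing groups.

```python
def aggregate(samples, group_size):
    """Collapse raw samples into averages over fixed-size groups, e.g. six
    5-minute samples -> one 30-minute average. The default scheme in the
    text keeps 5-minute samples for a day, 30-minute averages for a week,
    one-hour averages for three months, and daily averages for a year."""
    return [sum(samples[i:i + group_size]) / group_size
            for i in range(0, len(samples) - group_size + 1, group_size)]

five_min = [10, 20, 30, 40, 50, 60]   # six 5-minute samples
print(aggregate(five_min, 6))          # one 30-minute average: [35.0]
```

Each aggregation level trades resolution for retention, which is what keeps the per-DGE databases bounded while still supporting year-long reports.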
§ 4.3.2.2 EVENTS AND MESSAGES
Recall from blocks 650, 660 and 665 that a threshold violation or exception may cause an event to be generated. Each event, as well as each exception or message received by the DGE, is assigned a severity. A message is assigned a severity based on a user specified regular expression pattern match.
Based on these severity levels, the visual GUI indicates these severity conditions by unique icons or other means. The following severity states are supported:
    • OK, WARNING, CRITICAL: Typical alarming occurs when test results cross warning and critical thresholds set by the end-user or administrator, and may display yellow and red icons or bars on the various status pages. Devices and tests in a normal state may display an OK icon or green color bar.
    • UNKNOWN: A test result returns an “unknown” value when the monitor receives no response from the device for that particular test. Unknown results may display a question mark (?) and may also create events that are graphed on reports.
    • FAIL: This state occurs when a test result is received, but the value returned is invalid. For example, if a POP3 username or password is incorrect, the device may be reached by the test but the login will fail. Failed tests may be displayed and stored as CRITICAL events and graphed accordingly.
    • UNREACHABLE: It is desirable to differentiate between when a device is unavailable due to its own error and when it is unreachable due to the unavailability of a gateway device (e.g. router or switch).
    • SUSPENDED: Although not an alarm per se, suspended devices and tests may be displayed with a unique icon to indicate the state.
Events may be recorded for these state changes in order to track historical activity, or lack thereof. Tests can be ‘suppressed’ when they are in a known condition, and are hidden from view until the state changes, after which the suppressed flag is automatically cleared.
An event may be recorded for a test's very first result and for every time a test result crosses a defined threshold. For example, the very first test result for an ICMP round trip time test falls into the “OK” range. Five minutes later, the same test returns a higher value that falls in the “WARNING” range. Another five minutes passes, the test is run again, and the round trip time decreases and falls back into the “OK” range. For the ten minutes that just passed, three separate events may have been recorded: one because the test was run for the first time, and two more for crossing the “WARNING” threshold, both up and back.
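The event-recording rule in the example above can be sketched as follows; the `classify` function and the 100 ms boundary are illustrative assumptions, not values from the text.

```python
def events_from_results(results, classify):
    """Record an event for the very first test result and for every result
    whose severity band differs from the previous one, as in the ICMP
    round-trip-time example. `classify` maps a raw value to a band."""
    events, previous = [], None
    for value in results:
        band = classify(value)
        if previous is None or band != previous:
            events.append((value, band))
        previous = band
    return events

# round-trip times in ms; assume WARNING above 100 ms (illustrative threshold)
classify = lambda ms: "WARNING" if ms > 100 else "OK"
print(events_from_results([40, 130, 60], classify))
# three events: first result, crossing into WARNING, crossing back to OK
```

This is why event storage stays compact relative to raw results: a test that sits steadily in one band generates a single event no matter how many intervals elapse.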
One-time text messages, SNMP traps, or text alarms may be displayed in a separate ‘message’ window. All messages should have a severity and a device associated with them, and the user can filter the messages displayed and acknowledge them to remove them from the messages window. A user can match on a regular expression and assign a severity to a text message, thus triggering actions and notifications similar to events.
§ 4.3.2.3 ACTIONS
Recall that events and exceptions trigger actions. An action may be a notification via email or pager, or any other programmable activity such as opening a trouble ticket or restarting a server. Actions may be configured and assigned to tests in the form of a profile, with each profile preferably containing any number of individual sub-actions. Each of these sub-actions may be configured with the following information:
    • notification type—email, pager or external script;
    • message recipient—email address;
    • notify on state—OK, Warning, Critical, Unknown (choose one, several, or all);
    • delay—choose to notify immediately or after N test cycles;
    • repeat—if the test stays in the trigger state, either don't repeat notification or repeat it every N tests; and
    • time of day—the time of day during which this sub-action is valid.
Actions may be assigned to tests by reference. They may be assigned en masse to multiple devices, and thus to all the test configurations on each device. Updating an action may automatically update all test configurations to which the action was assigned.
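The delay and repeat settings above can be sketched as a small scheduling predicate; `should_notify` and its parameters are hypothetical names introduced for illustration.

```python
def should_notify(cycles_in_state, delay, repeat_every):
    """Sketch of a sub-action's delay/repeat schedule: fire once the test
    has stayed in the trigger state for `delay` cycles, and again every
    `repeat_every` cycles thereafter (None means do not repeat)."""
    if cycles_in_state < delay:
        return False
    if cycles_in_state == delay:
        return True
    if repeat_every is None:
        return False
    return (cycles_in_state - delay) % repeat_every == 0

# notify after 3 cycles in the trigger state, then every 2 cycles
fires = [c for c in range(1, 8) if should_notify(c, 3, 2)]
print(fires)  # [3, 5, 7]
```

A delay greater than one cycle suppresses notifications for transient conditions, while the repeat interval keeps escalating a problem that persists.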
Having described data gathering (in accordance with the configuration), information extraction, combination and presentation (in accordance with the configuration) is now described in § 4.3.3 below.
§ 4.3.3 INFORMATION EXTRACTION, COMBINATION AND PRESENTATION
To reiterate, under the present invention, data collection and storage is distributed across various DGEs, each of which stores data locally or in a remote distributed database. Further, at least some data analysis may be distributed across various DGEs, each of which may analyze local data. Thus, a (more) centralized reporting facility is relieved of at least some data storage and analysis responsibilities.
FIG. 5 is a flow diagram of an exemplary method 500 that may be used to perform information extraction, combination and presentation operations. As indicated by trigger (event) block 510, various branches of the method 500 may be effected depending upon the occurrence of a trigger (event).
In response to a user query (Note that a user login may imply a default query.), the user should be authenticated as indicated by block 520. Any known authentication technique, such as password, RADIUS, or an external directory, may be used.
Then, the user's authorization is determined as indicated by block 522. A user's authorization may depend on a group to which the user belongs. (Recall, e.g., data structure 900 of FIG. 9.) An administrator may associate a user to a group using the configuration API. For example, in one exemplary embodiment, a group object may have defined “permissions” (e.g., create actions, create devices, see data of other user, etc.) and defined “limits” (e.g., number of devices, types of devices, device locations, number of tests, etc.). The defined permissions are typically provided for security purposes. The defined limits are typically provided for security purposes and/or for providing flexible software licensing terms.
Referring back to FIG. 5, as indicated by block 524, a database query may be generated using a report type (e.g., fault report or performance report) and the user's authorization. Finally, as indicated by block 526, the dissemination (e.g., multicast or broadcast fan-out) of the database query to appropriate ones of the data gathering elements is started, before the method 500 is left via RETURN node 550. That is, since the fault and performance data is distributed among various data gathering elements, and is not centrally stored, a query is distributed to the appropriate data gathering operations (e.g., DGE databases). Since the configuration information associates users with devices (See, e.g., 710 and 720 of FIG. 7.) and devices with DGEs (See, e.g., 1020 and 1030 of FIG. 10.), the appropriate DGEs can be determined. Alternatively, as alluded to above, the query can be simply broadcast to all DGEs. Non-relevant DGEs can simply not transmit back their data. Alternatively, the data combination act (described later with reference to block 546) could suppress such non-relevant data.
Referring back to trigger (event) block 510, if a query response is received, as indicated by decision block 540, it is determined whether all (or enough) responses have been received. If not, it is determined whether a time out (for receiving enough query responses) has occurred. If not, the method 500 branches back to trigger (event) block 510. If, on the other hand, a time out has occurred, a time out error action may be taken as indicated by block 544, before the method 500 is left via RETURN node 550. Referring back to decision block 540, if it is determined that all (or enough) responses have been received, the data from the various DGEs is combined (e.g., correlated) for presentation, as indicated by block 546. The correlation is transparent from the user's perspective. Then, as indicated by block 548, a presentation of the information (e.g., a report, a table, a graph, etc.) is generated for rendering to the user. Since the method 500 gets “fresh” data from the distributed databases, real-time performance reporting is possible in addition to real-time fault reporting. Accounting, if any, is performed as indicated by block 549, before the method 500 is left via RETURN node 550.
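The fan-out and combination steps (blocks 526 and 546) amount to a scatter-gather pattern, sketched below. The `fan_out_query` helper and the DGE stubs are illustrative; a real deployment would query remote DGE databases rather than local callables.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out_query(query, dges):
    """Disseminate one query to every relevant DGE in parallel, gather the
    partial result sets, and combine them for presentation. Each element
    of `dges` is a callable standing in for a remote DGE database query."""
    with ThreadPoolExecutor(max_workers=len(dges)) as pool:
        partials = list(pool.map(lambda dge: dge(query), dges))
    # block 546: combine (here, flatten and sort) the partial results
    combined = [row for partial in partials for row in partial]
    return sorted(combined)

# two hypothetical DGEs, each holding part of the distributed data
dge_east = lambda q: [("router1", 12)]
dge_west = lambda q: [("router2", 7)]
print(fan_out_query("latency", [dge_east, dge_west]))
```

Because each DGE answers from its own local store, the combined report reflects current data without any central replication step, which is what enables the real-time reporting noted above.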
Although not shown, in one embodiment, the user can “drill-down” into a report to view data or information underlying a presentation result.
§ 4.3.3.1 REPORTS, GRAPHS AND TABLES
Recall from block 548, information is presented back to the user. Such a presentation may be in the form of reports, graphs and tables. Exemplary reports, graphs and tables are now described. Various embodiments of the present invention may support some or all of the following reports.
An “Availability” report may be based on event data which shows the number of threshold violations, the distribution of such violations and total downtime. This report can be generated for a device, or individual tests or a business service. Device availability may be measured by the ICMP packet loss test. Metrics are captured for the device state equal to CRITICAL or UNREACHABLE. The report shows the top n (e.g., n=10) violations by amount of “unavailability”, displaying total time unavailable and % unavailable, with graphics showing either view. Users may link to an availability distribution report/graph for either accounts or devices, depending on which view is being accessed. This histogram is a distribution of the numbers of accounts or devices falling into blocks of 10% availability. That is, it displays the number of accounts/devices falling between 0–10% availability, 10–20% availability, and so on. Administrative users can view this report at the account level. FIG. 16 illustrates an exemplary account status summary report. Similarly, FIG. 17 illustrates an exemplary service status summary report. Administrative users can then drill down on individual devices for more detail. End users running the report will only see the device level metrics.
A “Downtime” report is similar to the Availability report, in that it is based on device availability as measured by the ICMP packet loss test. However, the results are only for device states equal to CRITICAL, rather than CRITICAL and UNREACHABLE. This more accurately reflects the situation when a single device outage occurs, with no regard for any possible parent device outages that may cause a child device to become UNREACHABLE. Again, downtime distribution metrics and a histogram permit administrative users to see account level metrics and drill down to individual device details, whereas end users may only see the device level metrics.
A “Top N” report displays the top N (e.g., N=10) accumulations (based on number of events recorded) during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics. An exemplary “Event” report is illustrated in FIGS. 11A and 11B.
A “Number of Events per Day” report displays the number of events recorded each day during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics.
A “Number of Events” report displays the total number of events recorded during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics.
An “Event Distribution” report displays the total number of events recorded during the reporting period per account, per device, and per test. Users may select time frame and event severity. Administrative users can view this report at the account level and then drill down on individual devices and tests for more detail. End users running the report may only see the device and test level metrics. The histogram is an event duration distribution of the numbers of accounts/devices/tests falling into bins of equal duration for the reporting period. That is, the reporting period may be divided into an equal number of multi-hour (e.g. 4 hour) blocks, with the number of accounts/devices/tests falling into each of those blocks.
A “Device Performance” report snapshot is a period (e.g., 24 hour) snapshot (hour by hour) of event summaries for all tests on a single device. Raw event data is analyzed hourly and the worst test state is displayed for each test as a colored block on the grid (24 hours×list of active tests on the device). For example, if a test is CRITICAL for one minute during the hour, the entire hour may be displayed as a red box representing the CRITICAL state. The Device Performance Report only applies to target devices, not to device groups. An exemplary test status summary report is illustrated in FIG. 12.
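The hour-by-hour "worst state" reduction behind the snapshot grid can be sketched as follows; `worst_state_per_hour` and the severity ranking are illustrative assumptions.

```python
SEVERITY_RANK = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

def worst_state_per_hour(events):
    """Bucket raw events by the hour in which they occurred and keep only
    the worst state seen in each hour; each colored block on the 24-hour
    grid shows this per-hour worst state. `events` is a list of
    (hour, state) pairs."""
    grid = {}
    for hour, state in events:
        if hour not in grid or SEVERITY_RANK[state] > SEVERITY_RANK[grid[hour]]:
            grid[hour] = state
    return grid

events = [(9, "OK"), (9, "CRITICAL"), (9, "OK"), (10, "WARNING")]
print(worst_state_per_hour(events))  # {9: 'CRITICAL', 10: 'WARNING'}
```

Keeping only the worst state per hour is deliberately pessimistic: one minute of CRITICAL paints the whole hour red, so brief outages remain visible at a glance.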
From the “Test Details” pages, users can view the “raw” data, showing all the individual test results for a single test. The difference between the raw data and viewing events is that events only occur when thresholds are crossed, whereas raw data shows the test results for every test interval. An exemplary test details report is illustrated in FIGS. 13A and 13B.
Statistical reports calculate statistics from raw results data such as mean, 95th and 98th percentiles, max and min values.
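These statistics can be computed directly from the raw results; the sketch below uses the nearest-rank percentile convention, which is an assumption since the text does not specify a percentile method.

```python
import math

def nearest_rank_percentile(values, p):
    """Nearest-rank percentile over raw results (one common convention;
    the text does not specify which method is used)."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

def statistics(values):
    """Summary statistics for a statistical report over raw result data."""
    return {
        "mean": sum(values) / len(values),
        "p95": nearest_rank_percentile(values, 95),
        "p98": nearest_rank_percentile(values, 98),
        "max": max(values),
        "min": min(values),
    }

print(statistics(list(range(1, 101))))  # on 1..100: p95 -> 95, p98 -> 98
```

High percentiles such as the 95th are common in capacity reporting precisely because they discard short spikes while still reflecting sustained load.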
Trend reports can use a regression algorithm for analyzing raw data and predicting the number of days until specified thresholds are reached. An exemplary service instability report is illustrated in FIGS. 14A and 14B. An exemplary usage and trend report is illustrated in FIG. 15.
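A simple least-squares linear trend is one way such a prediction could be made; the sketch below is an assumption, since the text does not name the regression algorithm used.

```python
def days_to_threshold(samples, threshold):
    """Fit a least-squares line through daily samples, then return the
    number of days from the last sample until the fitted line reaches
    `threshold`. Returns None if the trend is flat or falling."""
    n = len(samples)
    xs = list(range(n))
    mean_x, mean_y = sum(xs) / n, sum(samples) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) \
        / sum((x - mean_x) ** 2 for x in xs)
    if slope <= 0:
        return None  # usage is not growing; no crossing predicted
    intercept = mean_y - slope * mean_x
    return (threshold - intercept) / slope - (n - 1)

# disk usage grows 2% per day from 50%; 16 more days until it reaches 90%
print(days_to_threshold([50, 52, 54, 56, 58], 90))  # 16.0
```

Even this simple model is enough for the report's purpose: flagging tests whose measured values are on course to violate a threshold well before they actually do.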
Users can define custom reports in which devices, tests and the type of report to generate for these devices (e.g., top 10, events per day, statistical, trend, event distribution) are selected.
In one embodiment, the method 500 runs under an application server such as Jakarta Tomcat or BEA Weblogic.
§ 4.3.4 EXEMPLARY APPARATUS
FIG. 18 is a high-level block diagram of a machine 1800 that may perform one or more of the operations discussed above. The machine 1800 basically includes a processor(s) 1810, an input/output interface unit(s) 1830, a storage device(s) 1820, and a system bus or network 1840 for facilitating the communication of information among the coupled elements. An input device(s) 1832 and an output device(s) 1834 may be coupled with the input/output interface(s) 1830.
The processor(s) 1810 may execute machine-executable instructions (e.g., C or C++ or Java running on the Solaris operating system available from Sun Microsystems Inc. of Palo Alto, Calif. or the Linux operating system widely available from a number of vendors such as Red Hat, Inc. of Durham, N.C.) to perform one or more aspects of the present invention. At least a portion of the machine executable instructions may be stored (temporarily or more permanently) on the storage device(s) 1820 and/or may be received from an external source via an input interface unit 1830.
In one embodiment, the machine 1800 may be one or more conventional personal computers. In this case, the processing unit(s) 1810 may be one or more microprocessors. The bus 1840 may include a system bus. The storage devices 1820 may include system memory, such as read only memory (ROM) and/or random access memory (RAM). The storage device(s) 1820 may also include a hard disk drive for reading from and writing to a hard disk, a magnetic disk drive for reading from or writing to a (e.g., removable) magnetic disk, and an optical disk drive for reading from or writing to a removable (magneto-) optical disk such as a compact disk or other (magneto-) optical media.
A user may enter commands and information into the personal computer through input devices 1832, such as a keyboard and pointing device (e.g., a mouse) for example. Other input devices such as a microphone, a joystick, a game pad, a satellite dish, a scanner, or the like, may also (or alternatively) be included. These and other input devices are often connected to the processing unit(s) 1810 through an appropriate interface 1830 coupled to the system bus 1840.
The output device(s) 1834 may include a monitor or other type of display device, which may also be connected to the system bus 1840 via an appropriate interface. In addition to (or instead of) the monitor, the personal computer may include other (peripheral) output devices (not shown), such as speakers and printers for example.
§ 4.3.4 ADDITIONAL FEATURES
Various refinements to the present invention are now described. Various embodiments of the present invention may include some or all of these refinements.
§ 4.3.4.1 SMART EVENT NOTIFICATION
A refined embodiment of the present invention can eliminate sending multiple notifications when a device goes down or is unavailable. Based on the inherent dependency between the ping packet loss test results and the availability of the device, if the ping packet loss test returns a CRITICAL result, then communication with the device has somehow been lost. Configured notifications for all other tests on the device are suppressed until packet loss returns to normal. Smart notification may include:
    • Suppressing alarms for all other device events. Smart alarming shows only actual failed tests.
    • Identifying relationships between devices to correlate and identify the actual point of network failure/outage and suppress alarms downstream.
    • Creating multi-level action profiles to handle event escalation.
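The first suppression rule above can be sketched as a filter over pending notifications; `filter_notifications` and the test names are illustrative.

```python
def filter_notifications(events, packet_loss_critical):
    """Sketch of smart notification: while the ping packet-loss test for a
    device is CRITICAL, configured notifications for all of that device's
    other tests are suppressed. `events` is a list of (device, test) pairs;
    `packet_loss_critical` is the set of devices whose packet-loss test is
    currently CRITICAL."""
    sent = []
    for device, test in events:
        if device in packet_loss_critical and test != "packet_loss":
            continue  # suppressed: the device itself is unreachable
        sent.append((device, test))
    return sent

events = [("web1", "packet_loss"), ("web1", "http"), ("db1", "sql")]
print(filter_notifications(events, {"web1"}))
# [('web1', 'packet_loss'), ('db1', 'sql')] -- web1's http alarm is suppressed
```

The operator thus receives a single "device down" notification instead of one alarm per test on the unreachable device.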
§ 4.3.4.2 DEVICE DEPENDENCIES
A refined embodiment of the present invention supports device dependencies to suppress excessive notifications when a gateway-type device has gone down or is unavailable. Switches, routers, and other hardware are often the physical gateways that govern whether other network devices are reachable. Monitoring of many devices may be impeded if one of these critical “parent devices” becomes unavailable. To provide correlation, a parent and child hierarchy is created between monitored devices in order to distinguish the difference between a CRITICAL test on a device and an UNREACHABLE one.
In many cases, a device is considered to be “reachable”. However, if a test on a device is CRITICAL (for all thresholds), UNKNOWN, or FAILED, some additional processing is used to determine if the device is truly reachable. Such additional processing may involve the following. First, a current packet loss test is examined for the device. If such a test exists and the packet loss test result is not CRITICAL, the device is considered reachable. If no such test exists, all immediate parent devices are examined. If the device has no parents, the device is considered reachable and the result of the test is the measured value. The device is only considered unreachable if all the immediate parents have a “current” packet loss test result of 100%. “Old” packet loss tests (those that occurred prior to the state change in the child's test result (i.e., OK to CRITICAL)) or the nonexistence of a packet loss test for a parent have no effect on the result.
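The reachability rules above can be sketched as follows. This is a simplified illustration that treats a CRITICAL packet-loss result as 100% loss and ignores the "current" vs. "old" test distinction; `is_reachable` and its arguments are hypothetical names.

```python
def is_reachable(device, packet_loss, parents):
    """If the device has a current packet-loss result, it is reachable
    unless that result is CRITICAL (taken here as 100% loss). With no such
    test, the device is unreachable only if every immediate parent has a
    current packet-loss result of 100%; a parent with no packet-loss test
    has no effect. `packet_loss` maps device -> loss percentage (absent
    means no test); `parents` maps device -> immediate parent devices."""
    if device in packet_loss:
        return packet_loss[device] < 100
    immediate = parents.get(device, [])
    if not immediate:
        return True  # no parents: considered reachable
    return not all(packet_loss.get(p) == 100 for p in immediate)

packet_loss = {"router": 100}            # the gateway is down
parents = {"server": ["router"]}
print(is_reachable("server", packet_loss, parents))
# False: the server is marked UNREACHABLE rather than CRITICAL
```

Distinguishing UNREACHABLE from CRITICAL is what lets the system suppress downstream alarms and point the operator at the failed gateway instead of its children.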
§ 4.3.4.3 MULTI-TIERED ADMINISTRATION MODEL
A refined embodiment of the present invention supports a “federated user model”. End user security may be controlled by permissions granted to a “User Group”. Each end user can only belong to a single “Account”, and each Account can only belong to a single User Group. Thus, an end user belongs to one and only one User Group for ease of administration. End users of one account are isolated from all other accounts, thus allowing various departments within an enterprise to each have a fully functional “virtual” copy of the invention.
Each User Group may have a unique privilege and limits matrix as defined by an Administrative user with administrative control over the User Group. Privileges for User Groups may be defined for devices, tests & actions. Limits at the User Group level may be defined for minimum test interval, max devices, max tests, max actions and max reports.
In addition to end-users, the system permits separate administrative users who can look at multiple ‘accounts’ (which a normal end-user cannot do). This framework allows senior management or central operation centers or customer care to report on multiple departments that they are responsible for. This eliminates the need for multiple deployments of the same product, while allowing seamless reporting across services that span IT infrastructure managed by different departments in an enterprise.
Administrative user security may be controlled by permissions granted to an Administrative Group. Administrative Groups and User Groups have a many-to-many relationship, allowing the administration of User Groups by numerous administrators who have varying permissions. Privileges for Administrative Groups may be defined for accounts, users, user groups, limits, devices, tests, and actions. A separate set of privileges is defined for each relationship between an Administrative Group and a User Group. A very simple configuration could establish the organization's Superuser as the only administrative user and all end-users belonging to a single User Group. In contrast, a complex organizational model might require the establishment of Administrative Groups for Network Administration, Database Administration, and Customer Service, with User Groups for C-level executives, IT Support, Marketing, etc.
Unlike administrators, “Superusers” are not constrained by a privileges matrix; they can perform any of the actions in the matrix on any user. Superusers create Administrative Groups and User Groups, and define the privileges the former has over the latter. The ‘superuser’ accounts are used to effectively bootstrap the system.
“Privileges” are the right to create, read, update, delete, suspend, etc. Each User Group has a privileges matrix associated with it that describes what operations the members of that User Group can perform. As mentioned previously, there is a similar, but more complex privileges matrix that describes what operations a member of an Administrative Group can do to administer one or more User Groups.
“Limits” are numerical bounds associated with a User Group that define minimum test interval, maximum devices, maximum tests, maximum actions and maximum reports for end-user accounts. An end user's actions are constrained by the Limits object associated with their User Group, unless there is another Limits object that is associated with the particular user (e.g. Read-only user) that would override the limits imposed by the User Group.
Administrative users occasionally need to directly administer an end-user's account, by logging into that account and providing on-line support to view the account and perform operations. This capability is especially helpful when an end-user's capabilities are limited to administer their own account. To circumvent the limited privileges of the end-user, the administrative user need not use the end-user's login/password, but rather “masquerades” as the end-user subject only to the administrative user's own privileges, which are often more extensive.
Administrators that have permissions to create end users and their accounts have the option of creating users with read-only capabilities. In this way, administrators may give certain end users access to large amounts of data in the system, but without authority to change any of the characteristics of the devices, tests, actions or reports they are viewing.
When representing an end user, an administrator (if given proper create privileges) may create devices and tests for the end user in the end user's own account, via a “Represent” feature. One option the administrator has at the time of device creation is to make the device read-only. The tests on the read-only device become read-only as well. This feature was created to enable an end-user to observe the activity on a mission-critical network component, such as a switch or even a switch port, but not have the authority to modify its device or test settings.
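The cascade described above, where a device created read-only makes its tests read-only as well, can be sketched by having each test inherit its device's flag. The class and attribute names are hypothetical.

```python
# Hypothetical sketch: tests inherit the read-only flag of their device.

class Test:
    def __init__(self, name, device):
        self.name = name
        self.device = device

    @property
    def read_only(self):
        # A test on a read-only device becomes read-only as well.
        return self.device.read_only

class Device:
    def __init__(self, name, read_only=False):
        self.name = name
        self.read_only = read_only
        self.tests = []

    def add_test(self, name):
        test = Test(name, self)
        self.tests.append(test)
        return test

# A mission-critical switch provisioned read-only by an administrator:
switch = Device("core-switch-1", read_only=True)
port_test = switch.add_test("port-utilization")
```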
§ 4.3.4.4 GRAPHICAL USER INTERFACE
Data may be collected from all DGEs and presented to the user as a consolidated view, primarily via a Web-based interface. An end user needs only a commonly available Web browser to access the full functionality and reporting features of the product. Real-time status views are available for all accounts, devices, or tests within an administrator's domain, all devices or tests within an account, or all tests on a single device or device group. Users can drill down on specific accounts, devices, and tests, and see six-hour, daily, weekly, monthly, and yearly performance information.
Using the user administration pages, users can set default filters for the account and device summary pages. For example, administrators may elect to filter out accounts and devices that are in an "OK" status. Especially for large deployments, this can dramatically cut down the number of entries a user must scroll through to get a clear snapshot of system health. A toggle switch on the account and device summary pages may be used to quickly enable or disable the filter(s).
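The summary-page filter above amounts to hiding entries whose status is "OK" when the toggle is enabled. A minimal sketch, with the status strings assumed:

```python
def filter_summary(entries, hide_ok=True):
    # entries: list of (name, status) pairs; statuses are assumed to be
    # strings such as "OK", "WARNING", "CRITICAL".  With hide_ok enabled,
    # entries in "OK" status are filtered out of the summary view.
    if not hide_ok:
        return list(entries)
    return [(name, status) for name, status in entries if status != "OK"]

devices = [("router-1", "OK"), ("switch-2", "CRITICAL"), ("host-3", "OK")]
```

Toggling the filter off simply returns the full list, mirroring the toggle switch on the summary pages.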
General administration features may all be supported by a graphical user interface, including: DGE location and host creation; administration of Administrative Group domains; administration of User Group thresholds, privileges and actions; account and user management; administration of devices, device groups, tests and actions; and password management.
Via either an “Update Device” page or during device suspension, a user can enter a comment that will display on a “Device Status Summary” page. This could be used to identify why a device is being suspended, or as general information on the current state of the device.
§ 4.3.4.5 INTEGRATION WITH EXTERNAL SYSTEMS
The present invention can export data to other systems, or can send notifications to trouble ticketing or other NOC management tools. In addition, the present invention can import data from third party systems, such as OpenView from Hewlett-Packard, to provide a single administrative and analytical interface to all performance management measurements. More specifically, the present invention can import device name, IP address, SNMP community string and topology information from the HP OpenView NNM database, thereby complementing OpenView's topology discovery with the enhanced reporting capabilities of the present invention. Devices are automatically added/removed as the nodes are added or removed from NNM. Traps can be sent between NNM and the present invention as desired.
The present invention can open trouble tickets automatically using the Remedy notification plug-in, and can likewise open trouble tickets in RT using the RT notification plug-in.
§ 4.4 EXEMPLARY DEPLOYMENT AND ADMINISTRATION
The following exemplifies how the present invention may be deployed on a system and administered. All configuration can be done by the GUI or via the API.
Physical locations where Data Gathering Elements are installed (arbitrarily defined by the superuser) are created in the system. Recall that a DGE is a data collection agent assigned to a "location." To create a new DGE, its IP address and location are provided. Since multiple DGEs can exist in one location, soft and hard limits that define DGE load balancing may be set. The present invention may use a load-balancing mechanism based on configurable device limits to ensure that DGE hosts are not overloaded. In this embodiment, each device is provisioned to a DGE when it is created based on the following heuristics:
    • 1. Find a DGE that services the location of the device.
    • 2. If there are many such DGEs and the user already has devices on one of them, pick that DGE.
    • 3. If there are many DGEs where the user already has devices, choose the one that's the least loaded.
    • 4. If there is no DGE on which the user already has devices, pick the least loaded DGE that services the location of the device.
    • 5. Only pick a DGE that has available capacity—available is defined as “below critical level” if the DGE already has devices for the user, else “below warning level”.
    • 6. If there's no DGE that services the device location and has available capacity, log the error.
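The provisioning heuristics above can be sketched as a single selection function. The dictionary keys and the representation of load are illustrative assumptions; "available capacity" follows the definition in step 5.

```python
def pick_dge(dges, device_location, user):
    """Pick a DGE for a new device per the heuristics above (a sketch).

    Each dge is assumed to be a dict with keys: 'location', 'load'
    (current device count), 'warning_level', 'critical_level', and
    'users' (names of users that already have devices on it).
    Returns the chosen dge, or None when no serving DGE has capacity
    (step 6: the caller logs the error).
    """
    # 1. Restrict to DGEs that service the device's location.
    candidates = [d for d in dges if d["location"] == device_location]

    # 5. Available capacity: below critical level if the user already
    #    has devices on the DGE, otherwise below warning level.
    def available(d):
        limit = d["critical_level"] if user in d["users"] else d["warning_level"]
        return d["load"] < limit

    candidates = [d for d in candidates if available(d)]
    if not candidates:
        return None  # 6. No serving DGE with available capacity.

    # 2./3. Prefer a DGE already holding this user's devices; among
    #       several, choose the least loaded.
    with_user = [d for d in candidates if user in d["users"]]
    if with_user:
        return min(with_user, key=lambda d: d["load"])

    # 4. Otherwise pick the least loaded DGE serving the location.
    return min(candidates, key=lambda d: d["load"])
```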
After creating the DGEs in the system, user groups and accounts are created in the configuration database. Devices and tests are then provisioned in the system, typically using an auto-discovery tool which finds all IP devices, and the available tests on them, in the given subnets. Default thresholds and actions are used if none are provided by the user. At this stage, the system is ready to be operational. When a DGE is enabled (either as a process on the same machine as the configuration database or on another machine), it connects to the configuration database, identifies itself and downloads its configuration. After downloading its configuration, the DGE starts monitoring tests as described earlier.
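The DGE startup sequence just described, identifying itself to the configuration database, downloading its configuration, then monitoring, can be sketched as follows. The data shapes and names are assumptions for illustration, not the patented protocol.

```python
def dge_startup(dge_id, config_db):
    """Hypothetical DGE bootstrap sketch.

    config_db is assumed to map a DGE identifier to the list of test
    configurations provisioned to it; names are illustrative only.
    """
    # Identify itself and download only the configuration relevant
    # to this DGE from the central configuration database.
    config = config_db.get(dge_id, [])
    # Begin monitoring each provisioned test.
    return ["monitoring:" + test["name"] for test in config]
```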
The fault and performance monitoring system of the present invention can be set up and installed in a stand-alone environment in a few hours. Default test settings, action profiles, and reports may be pre-loaded into the system. Lists of devices can be batch-imported automatically into the system using the API.
§ 4.5 CONCLUSIONS
As can be appreciated from the foregoing disclosure, the present invention discloses apparatus, data structures and methods for combining system fault and performance monitoring. By using distributed data collection and storage of performance data, storage requirements are relaxed and real-time performance monitoring is possible. Data collection and storage elements can be easily configured via a central configuration database. The configuration database can be easily updated and changed. A federated user model allows normal end users to monitor devices relevant to the part of a service they are responsible for, while allowing administrative users to view the fault and performance of a service in an end-to-end manner across multiple accounts or departments.

Claims (26)

1. A method for gathering and storing performance data of a system, based on global configuration information, using data gathering elements distributed within the system, the method comprising:
a) for each of the data gathering elements, configuring the data gathering element using the global configuration information such that the data gathering element has relevant configuration information;
b) for each of the data gathering elements, requesting, with the data gathering element, data using the relevant configuration information;
c) for each of the data gathering elements, accepting, with the data gathering element, the requested data; and
d) for each of the data gathering elements, if the accepted requested data is performance data, storing, with the data gathering element, the performance data or data derived from the performance data,
wherein the performance data or data derived from the performance data is stored, regardless of its value, for use by a performance information extraction, combination and presentation facility.
2. The method of claim 1 wherein the act of storing the performance data or data derived from the performance data for each of the data gathering elements includes aggregating the stored data.
3. The method of claim 1 further comprising:
for each of the data gathering elements, preprocessing, with the data gathering element, the performance data to generate data derived from the performance data.
4. The method of claim 3 wherein the act of preprocessing the performance data includes determining a rate of the performance data.
5. The method of claim 3 wherein the act of preprocessing the performance data includes determining a difference between the performance data and a predetermined value.
6. The method of claim 3 wherein the act of preprocessing the performance data includes determining a difference between the performance data and previously gathered performance data.
7. The method of claim 3 wherein the act of preprocessing the performance data includes determining a percentage of the performance data with respect to a predetermined value.
8. The method of claim 1 further comprising:
if the accepted data is fault data, then
i) comparing, at the data gathering element, the fault data to at least one threshold,
ii) if the fault data violates any one of the at least one data threshold,
A) determining, at the data gathering element, an action, and
B) initiating, at the data gathering element, the determined action.
9. The method of claim 8 wherein if the fault data violates any one of the at least one data threshold, then storing, with the data gathering element, a fault threshold violation occurrence.
10. The method of claim 1 further comprising:
e) for each of at least some of the data gathering elements, accepting a request from the performance information extraction, combination and reporting facility; and
f) for each of at least some of the data gathering elements, providing stored performance data or data derived from the performance data to the performance information extraction, combination and reporting facility in response to the accepted request.
11. The method of claim 1 wherein each of the data gathering elements store performance data or data derived from the performance data for longer periods of time than the performance information extraction, combination and reporting facility.
12. The method of claim 1 wherein the performance information extraction, combination and reporting facility uses the data gathering elements for exclusive long term storage of performance data or data derived from the performance data.
13. The method of claim 1 wherein the act of storing the performance data or data derived from the performance data if the accepted requested data is performance data is performed without prior analysis of the data.
14. A monitoring system for gathering and storing performance data of a system, based on global configuration information, the monitoring system comprising:
a) data gathering elements distributed within the system, each of the data gathering elements adapted to
i) configure itself using the global configuration information such that the data gathering element has relevant configuration information;
ii) request data using the relevant configuration information;
iii) accept the requested data; and
iv) if the accepted requested data is performance data, store the performance data or data derived from the performance data,
wherein the performance data or data derived from the performance data is stored, regardless of its value, for use by a performance information extraction, combination and presentation facility.
15. The monitoring system of claim 14 wherein each of the data gathering elements is further adapted to aggregate the stored data.
16. The monitoring system of claim 14 wherein each of the data gathering elements is further adapted to preprocess the performance data to generate data derived from the performance data.
17. The monitoring system of claim 16 wherein each of the data gathering elements is adapted to preprocess the performance data by determining a rate of the performance data.
18. The monitoring system of claim 16 wherein each of the data gathering elements is adapted to preprocess the performance data by determining a difference between the performance data and a predetermined value.
19. The monitoring system of claim 16 wherein each of the data gathering elements is adapted to preprocess the performance data by determining a difference between the performance data and previously gathered performance data.
20. The monitoring system of claim 16 wherein each of the data gathering elements is adapted to preprocess the performance data by determining a percentage of the performance data with respect to a predetermined value.
21. The monitoring system of claim 14 wherein each of the data gathering elements is further adapted to determine whether the accepted data is fault data, and if the accepted data is fault data, then comparing the fault data to at least one threshold, and if the fault data violates any one of the at least one data threshold,
determining an action, and
initiating the determined action.
22. The monitoring system of claim 21 wherein each of the data gathering elements is further adapted to, if the fault data violates any one of the at least one data threshold, store a fault threshold violation occurrence.
23. The monitoring system of claim 14 wherein each of the data gathering elements is further adapted to
e) accept a request from the performance information extraction, combination and reporting facility; and
f) provide stored performance data or data derived from the performance data to the performance information extraction, combination and reporting facility in response to the accepted request.
24. The monitoring system of claim 14 wherein each of the data gathering elements is adapted to store performance data or data derived from the performance data for longer periods of time than the performance information extraction, combination and reporting facility.
25. The monitoring system of claim 14 wherein the performance information extraction, combination and reporting facility uses the data gathering elements for exclusive long term storage of performance data or data derived from the performance data.
26. A machine readable medium storing machine executable instructions which, when executed by a machine, perform the method of claim 1.
US10/286,447 2002-11-01 2002-11-01 Distributed data gathering and storage for use in a fault and performance monitoring system Active 2024-09-03 US7246159B2 (en)

CN109739837A (en) * 2018-12-28 2019-05-10 深圳市简工智能科技有限公司 Analysis method, terminal and the readable storage medium storing program for executing of smart lock log
US10305865B2 (en) 2016-06-21 2019-05-28 Cisco Technology, Inc. Permutation-based content encryption with manifests in a content centric network
US10313227B2 (en) 2015-09-24 2019-06-04 Cisco Technology, Inc. System and method for eliminating undetected interest looping in information-centric networks
US10313211B1 (en) * 2015-08-25 2019-06-04 Avi Networks Distributed network service risk monitoring and scoring
US10320675B2 (en) 2016-05-04 2019-06-11 Cisco Technology, Inc. System and method for routing packets in a stateless content centric network
US10320760B2 (en) 2016-04-01 2019-06-11 Cisco Technology, Inc. Method and system for mutating and caching content in a content centric network
US10333840B2 (en) 2015-02-06 2019-06-25 Cisco Technology, Inc. System and method for on-demand content exchange with adaptive naming in information-centric networks
US10355999B2 (en) 2015-09-23 2019-07-16 Cisco Technology, Inc. Flow control with network named fragments
US20190245767A1 (en) * 2016-07-08 2019-08-08 Convida Wireless, Llc Methods to monitor resources through http/2
US10404450B2 (en) 2016-05-02 2019-09-03 Cisco Technology, Inc. Schematized access control in a content centric network
US10402435B2 (en) 2015-06-30 2019-09-03 Microsoft Technology Licensing, Llc Utilizing semantic hierarchies to process free-form text
US10425503B2 (en) 2016-04-07 2019-09-24 Cisco Technology, Inc. Shared pending interest table in a content centric network
US10447805B2 (en) 2016-10-10 2019-10-15 Cisco Technology, Inc. Distributed consensus in a content centric network
US10454820B2 (en) 2015-09-29 2019-10-22 Cisco Technology, Inc. System and method for stateless information-centric networking
US20190370145A1 (en) * 2010-02-24 2019-12-05 Salesforce.Com, Inc. System, method and computer program product for monitoring data activity utilizing a shared data store
US10547589B2 (en) 2016-05-09 2020-01-28 Cisco Technology, Inc. System for implementing a small computer systems interface protocol over a content centric network
US10594562B1 (en) 2015-08-25 2020-03-17 Vmware, Inc. Intelligent autoscale of services
CN110914742A (en) * 2017-03-17 2020-03-24 英国研究与创新组织 Super-resolution microscopy
US10693734B2 (en) 2016-03-04 2020-06-23 Vmware, Inc. Traffic pattern detection and presentation in container-based cloud computing architecture
US10701038B2 (en) 2015-07-27 2020-06-30 Cisco Technology, Inc. Content negotiation in a content centric network
US10742596B2 (en) 2016-03-04 2020-08-11 Cisco Technology, Inc. Method and system for reducing a collision probability of hash-based names using a publisher identifier
US10929363B2 (en) 2016-11-11 2021-02-23 International Business Machines Corporation Assisted problem identification in a computing system
US10931548B1 (en) 2016-03-28 2021-02-23 Vmware, Inc. Collecting health monitoring data pertaining to an application from a selected set of service engines
US10956412B2 (en) 2016-08-09 2021-03-23 Cisco Technology, Inc. Method and system for conjunctive normal form attribute matching in a content centric network
US10999168B1 (en) 2018-05-30 2021-05-04 Vmware, Inc. User defined custom metrics
US11044180B2 (en) 2018-10-26 2021-06-22 Vmware, Inc. Collecting samples hierarchically in a datacenter
US11283697B1 (en) 2015-03-24 2022-03-22 Vmware, Inc. Scalable real time metrics management
US11290358B2 (en) 2019-05-30 2022-03-29 Vmware, Inc. Partitioning health monitoring in a global server load balancing system
US11436656B2 (en) 2016-03-18 2022-09-06 Palo Alto Research Center Incorporated System and method for a real-time egocentric collaborative filter on large datasets
US11792155B2 (en) 2021-06-14 2023-10-17 Vmware, Inc. Method and apparatus for enhanced client persistence in multi-site GSLB deployments
US11811861B2 (en) 2021-05-17 2023-11-07 Vmware, Inc. Dynamically updating load balancing criteria

Families Citing this family (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8266270B1 (en) 2002-07-16 2012-09-11 At&T Intellectual Property I, L.P. Delivery performance analysis for internet services
US7779113B1 (en) * 2002-11-25 2010-08-17 Oracle International Corporation Audit management system for networks
JP2004264995A (en) * 2003-02-28 2004-09-24 Toshiba Tec Corp Office equipment, information equipment, information management system of office equipment, information management method of office equipment and information management program
US9137033B2 (en) 2003-03-18 2015-09-15 Dynamic Network Services, Inc. Methods and systems for monitoring network routing
US7457866B1 (en) * 2003-03-24 2008-11-25 Netapp, Inc. Method and apparatus for diagnosing connectivity problems from a network management station
US7574431B2 (en) * 2003-05-21 2009-08-11 Digi International Inc. Remote data collection and control using a custom SNMP MIB
US7848259B2 (en) * 2003-08-01 2010-12-07 Opnet Technologies, Inc. Systems and methods for inferring services on a network
US7353265B2 (en) * 2004-06-02 2008-04-01 Lehman Brothers Inc. Method and system for monitoring and managing assets, applications, and services using aggregated event and performance data thereof
US9537731B2 (en) * 2004-07-07 2017-01-03 Sciencelogic, Inc. Management techniques for non-traditional network and information system topologies
WO2006014504A2 (en) 2004-07-07 2006-02-09 Sciencelogic, Llc Self configuring network management system
US7874000B1 (en) * 2004-11-22 2011-01-18 Symantec Corporation Reducing false positives generated by a database intrusion detection system
US7475130B2 (en) * 2004-12-23 2009-01-06 International Business Machines Corporation System and method for problem resolution in communications networks
JP4313336B2 (en) * 2005-06-03 2009-08-12 株式会社日立製作所 Monitoring system and monitoring method
US20070116234A1 (en) * 2005-10-19 2007-05-24 Marco Schneider Methods and apparatus for preserving access information during call transfers
US7924987B2 (en) * 2005-10-19 2011-04-12 At&T Intellectual Property I., L.P. Methods, apparatus and data structures for managing distributed communication systems
US20070086433A1 (en) * 2005-10-19 2007-04-19 Cunetto Philip C Methods and apparatus for allocating shared communication resources to outdial communication services
US20070086432A1 (en) * 2005-10-19 2007-04-19 Marco Schneider Methods and apparatus for automated provisioning of voice over internet protocol gateways
US8238327B2 (en) * 2005-10-19 2012-08-07 At&T Intellectual Property I, L.P. Apparatus and methods for subscriber and enterprise assignments and resource sharing
US7643472B2 (en) 2005-10-19 2010-01-05 At&T Intellectual Property I, Lp Methods and apparatus for authorizing and allocating outdial communication services
US20070100647A1 (en) * 2005-11-03 2007-05-03 International Business Machines Corporation Eligibility list management in a distributed group membership system
US7761538B2 (en) * 2006-08-30 2010-07-20 Microsoft Corporation Dynamically configuring, allocating and deploying computing systems
US20080065616A1 (en) * 2006-09-13 2008-03-13 Brown Abby H Metadata integration tool, systems and methods for managing enterprise metadata for the runtime environment
US8463894B2 (en) * 2007-06-08 2013-06-11 Oracle International Corporation Performance monitoring web console for distributed transaction service
US20090003310A1 (en) * 2007-06-27 2009-01-01 Kadel Bryan F Dynamic allocation of VOIP service resources
US8595369B2 (en) * 2007-11-13 2013-11-26 Vmware, Inc. Method and system for correlating front-end and back-end transactions in a data center
US8676998B2 (en) * 2007-11-29 2014-03-18 Red Hat, Inc. Reverse network authentication for nonstandard threat profiles
JP5326303B2 (en) * 2008-03-10 2013-10-30 富士通株式会社 Integration device, integration program, and integration method
US8768892B2 (en) * 2008-09-29 2014-07-01 Microsoft Corporation Analyzing data and providing recommendations
US8661116B2 (en) * 2008-12-15 2014-02-25 Verizon Patent And Licensing Inc. Network testing
JP2013030062A (en) * 2011-07-29 2013-02-07 Nomura Research Institute Ltd Operation management support device
US9106663B2 (en) 2012-02-01 2015-08-11 Comcast Cable Communications, Llc Latency-based routing and load balancing in a network
EP2842058A1 (en) * 2012-03-28 2015-03-04 BMC Software, Inc. Requesting and displaying a business service context from a virtual database
US8743893B2 (en) 2012-05-18 2014-06-03 Renesys Path reconstruction and interconnection modeling (PRIM)
US9449032B2 (en) * 2013-04-22 2016-09-20 Sap Se Multi-buffering system supporting read/write access to different data source type
US9559928B1 (en) * 2013-05-03 2017-01-31 Amazon Technologies, Inc. Integrated test coverage measurement in distributed systems
CN103812631A (en) * 2013-11-07 2014-05-21 奥维通信股份有限公司 Embedded linux device based on DHCP synchronous network clock
US10346367B1 (en) * 2015-04-30 2019-07-09 Amazon Technologies, Inc. Load shedding techniques for distributed services with persistent client connections to ensure quality of service
IN2015CH03327A (en) * 2015-06-30 2015-07-17 Wipro Ltd
US11424998B2 (en) * 2015-07-31 2022-08-23 Micro Focus Llc Information technology service management records in a service level target database table
US10708151B2 (en) * 2015-10-22 2020-07-07 Level 3 Communications, Llc System and methods for adaptive notification and ticketing
CN105893226A (en) * 2015-11-13 2016-08-24 乐视云计算有限公司 Service monitoring system
CN105871602B (en) 2016-03-29 2019-10-18 华为技术有限公司 A kind of control method, device and system counting flow
US10514993B2 (en) * 2017-02-14 2019-12-24 Google Llc Analyzing large-scale data processing jobs
US11138230B2 (en) * 2018-03-26 2021-10-05 Mcafee, Llc Methods, apparatus, and systems to aggregate partitioned computer database data
CN111258855B (en) * 2020-02-03 2023-12-05 杭州迪普科技股份有限公司 Apparatus and method for monitoring the operational health of a managed device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6367034B1 (en) * 1998-09-21 2002-04-02 Microsoft Corporation Using query language for event filtering and aggregation
US20020049838A1 (en) * 2000-06-21 2002-04-25 Sylor Mark W. Liveexception system
US20020161873A1 (en) * 2001-04-30 2002-10-31 Mcguire Jacob Console mapping tool for automated deployment and management of network devices
US20030037177A1 (en) * 2001-06-11 2003-02-20 Microsoft Corporation Multiple device management method and system
US6754664B1 (en) * 1999-07-02 2004-06-22 Microsoft Corporation Schema-based computer system health monitoring


Cited By (266)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7899900B1 (en) * 2002-08-22 2011-03-01 Ricoh Company, Ltd. Method and system for monitoring network connected devices with multiple protocols
US20040165605A1 (en) * 2003-02-25 2004-08-26 Nassar Ayman Esam System and method for automated provisioning of inter-provider internet protocol telecommunication services
US7636324B2 (en) * 2003-02-25 2009-12-22 Ayman Esam Nassar System and method for automated provisioning of inter-provider internet protocol telecommunication services
US20050105508A1 (en) * 2003-11-14 2005-05-19 Innomedia Pte Ltd. System for management of Internet telephony equipment deployed behind firewalls
US20060031482A1 (en) * 2004-05-25 2006-02-09 Nortel Networks Limited Connectivity fault notification
US8862943B2 (en) * 2004-05-25 2014-10-14 Rockstar Consortium Us Lp Connectivity fault notification
US9075717B2 (en) 2004-05-25 2015-07-07 Rpx Clearinghouse Llc Connectivity fault notification
US20050278426A1 (en) * 2004-06-15 2005-12-15 First Data Corporation Systems and methods for merging communications
US8799891B2 (en) * 2005-03-02 2014-08-05 Hewlett-Packard Development Company, L.P. System and method for attributing CPU usage of a virtual machine monitor to a corresponding virtual machine
US20060200819A1 (en) * 2005-03-02 2006-09-07 Ludmila Cherkasova System and method for attributing CPU usage of a virtual machine monitor to a corresponding virtual machine
US20070011282A1 (en) * 2005-06-28 2007-01-11 Utstarcom, Inc. System and method for performing a distributed configuration across devices
US7403954B2 (en) * 2005-09-30 2008-07-22 Sap Ag Systems and methods for repeatable database performance testing
US20070078825A1 (en) * 2005-09-30 2007-04-05 Sap Ag Systems and methods for repeatable database performance testing
US8396198B2 (en) 2005-10-19 2013-03-12 At&T Intellectual Property I, L.P. Methods and apparatus for authorization and/or routing of outdial communication services
US8693651B2 (en) 2005-10-19 2014-04-08 At&T Intellectual Property I, L.P. Methods and apparatus for authorization and/or routing of outdial communication services
US20110044439A1 (en) * 2005-10-19 2011-02-24 Marco Schneider Methods and apparatus for authorization and/or routing of outdial communication services
US8346732B1 (en) * 2005-11-30 2013-01-01 Symantec Operating Corporation Method and apparatus for providing high availability of a database
US7634496B1 (en) 2006-01-03 2009-12-15 Emc Corporation Techniques for managing state changes of a data storage system utilizing the object oriented paradigm
US7693176B2 (en) * 2006-02-27 2010-04-06 Vonage Network Llc Method and system for bidirectional data transfer
US9118583B2 (en) 2006-06-30 2015-08-25 Centurylink Intellectual Property Llc System and method for re-routing calls
US9094257B2 (en) 2006-06-30 2015-07-28 Centurylink Intellectual Property Llc System and method for selecting a content delivery network
US9054915B2 (en) 2006-06-30 2015-06-09 Centurylink Intellectual Property Llc System and method for adjusting CODEC speed in a transmission path during call set-up due to reduced transmission performance
US8976665B2 (en) 2006-06-30 2015-03-10 Centurylink Intellectual Property Llc System and method for re-routing calls
US8184549B2 (en) 2006-06-30 2012-05-22 Embarq Holdings Company, LLP System and method for selecting network egress
US9154634B2 (en) 2006-06-30 2015-10-06 Centurylink Intellectual Property Llc System and method for managing network communications
US10560494B2 (en) 2006-06-30 2020-02-11 Centurylink Intellectual Property Llc Managing voice over internet protocol (VoIP) communications
US7948909B2 (en) 2006-06-30 2011-05-24 Embarq Holdings Company, Llc System and method for resetting counters counting network performance information at network communications devices on a packet network
US8000318B2 (en) 2006-06-30 2011-08-16 Embarq Holdings Company, Llc System and method for call routing based on transmission performance of a packet network
US9549004B2 (en) 2006-06-30 2017-01-17 Centurylink Intellectual Property Llc System and method for re-routing calls
US8717911B2 (en) 2006-06-30 2014-05-06 Centurylink Intellectual Property Llc System and method for collecting network performance information
US10230788B2 (en) 2006-06-30 2019-03-12 Centurylink Intellectual Property Llc System and method for selecting a content delivery network
US8570872B2 (en) 2006-06-30 2013-10-29 Centurylink Intellectual Property Llc System and method for selecting network ingress and egress
US8488447B2 (en) 2006-06-30 2013-07-16 Centurylink Intellectual Property Llc System and method for adjusting code speed in a transmission path during call set-up due to reduced transmission performance
US8477614B2 (en) 2006-06-30 2013-07-02 Centurylink Intellectual Property Llc System and method for routing calls if potential call paths are impaired or congested
US9838440B2 (en) 2006-06-30 2017-12-05 Centurylink Intellectual Property Llc Managing voice over internet protocol (VoIP) communications
US9749399B2 (en) 2006-06-30 2017-08-29 Centurylink Intellectual Property Llc System and method for selecting a content delivery network
US8576722B2 (en) 2006-08-22 2013-11-05 Centurylink Intellectual Property Llc System and method for modifying connectivity fault management packets
US7889660B2 (en) 2006-08-22 2011-02-15 Embarq Holdings Company, Llc System and method for synchronizing counters on an asynchronous packet communications network
US8144587B2 (en) 2006-08-22 2012-03-27 Embarq Holdings Company, Llc System and method for load balancing network resources using a connection admission control engine
US8144586B2 (en) 2006-08-22 2012-03-27 Embarq Holdings Company, Llc System and method for controlling network bandwidth with a connection admission control engine
US8125897B2 (en) 2006-08-22 2012-02-28 Embarq Holdings Company Lp System and method for monitoring and optimizing network performance with user datagram protocol network performance information packets
US9806972B2 (en) 2006-08-22 2017-10-31 Centurylink Intellectual Property Llc System and method for monitoring and altering performance of a packet network
US9832090B2 (en) 2006-08-22 2017-11-28 Centurylink Intellectual Property Llc System, method for compiling network performancing information for communications with customer premise equipment
US8194555B2 (en) 2006-08-22 2012-06-05 Embarq Holdings Company, Llc System and method for using distributed network performance information tables to manage network communications
US20080049757A1 (en) * 2006-08-22 2008-02-28 Bugenhagen Michael K System and method for synchronizing counters on an asynchronous packet communications network
US8199653B2 (en) 2006-08-22 2012-06-12 Embarq Holdings Company, Llc System and method for communicating network performance information over a packet network
US8213366B2 (en) 2006-08-22 2012-07-03 Embarq Holdings Company, Llc System and method for monitoring and optimizing network performance to a wireless device
US8223655B2 (en) 2006-08-22 2012-07-17 Embarq Holdings Company, Llc System and method for provisioning resources of a packet network based on collected network performance information
US8223654B2 (en) 2006-08-22 2012-07-17 Embarq Holdings Company, Llc Application-specific integrated circuit for monitoring and optimizing interlayer network performance
US8224255B2 (en) 2006-08-22 2012-07-17 Embarq Holdings Company, Llc System and method for managing radio frequency windows
US8228791B2 (en) 2006-08-22 2012-07-24 Embarq Holdings Company, Llc System and method for routing communications between packet networks based on intercarrier agreements
US8238253B2 (en) 2006-08-22 2012-08-07 Embarq Holdings Company, Llc System and method for monitoring interlayer devices and optimizing network performance
US8274905B2 (en) 2006-08-22 2012-09-25 Embarq Holdings Company, Llc System and method for displaying a graph representative of network performance over a time period
US20080049615A1 (en) * 2006-08-22 2008-02-28 Bugenhagen Michael K System and method for dynamically shaping network traffic
US8307065B2 (en) 2006-08-22 2012-11-06 Centurylink Intellectual Property Llc System and method for remotely controlling network operators
US9929923B2 (en) 2006-08-22 2018-03-27 Centurylink Intellectual Property Llc System and method for provisioning resources of a packet network based on collected network performance information
US9712445B2 (en) 2006-08-22 2017-07-18 Centurylink Intellectual Property Llc System and method for routing data on a packet network
US8358580B2 (en) 2006-08-22 2013-01-22 Centurylink Intellectual Property Llc System and method for adjusting the window size of a TCP packet through network elements
US8374090B2 (en) 2006-08-22 2013-02-12 Centurylink Intellectual Property Llc System and method for routing data on a packet network
US8107366B2 (en) 2006-08-22 2012-01-31 Embarq Holdings Company, LP System and method for using centralized network performance tables to manage network communications
US8407765B2 (en) 2006-08-22 2013-03-26 Centurylink Intellectual Property Llc System and method for restricting access to network performance information tables
US8472326B2 (en) 2006-08-22 2013-06-25 Centurylink Intellectual Property Llc System and method for monitoring interlayer devices and optimizing network performance
US8102770B2 (en) 2006-08-22 2012-01-24 Embarq Holdings Company, LP System and method for monitoring and optimizing network performance with vector performance tables and engines
US8098579B2 (en) 2006-08-22 2012-01-17 Embarq Holdings Company, LP System and method for adjusting the window size of a TCP packet through remote network elements
US8488495B2 (en) 2006-08-22 2013-07-16 Centurylink Intellectual Property Llc System and method for routing communications between packet networks based on real time pricing
US8509082B2 (en) 2006-08-22 2013-08-13 Centurylink Intellectual Property Llc System and method for load balancing network resources using a connection admission control engine
US8520603B2 (en) 2006-08-22 2013-08-27 Centurylink Intellectual Property Llc System and method for monitoring and optimizing network performance to a wireless device
US8531954B2 (en) * 2006-08-22 2013-09-10 Centurylink Intellectual Property Llc System and method for handling reservation requests with a connection admission control engine
US9992348B2 (en) 2006-08-22 2018-06-05 Century Link Intellectual Property LLC System and method for establishing a call on a packet network
US8537695B2 (en) 2006-08-22 2013-09-17 Centurylink Intellectual Property Llc System and method for establishing a call being received by a trunk on a packet network
US8549405B2 (en) 2006-08-22 2013-10-01 Centurylink Intellectual Property Llc System and method for displaying a graphical representation of a network to identify nodes and node segments on the network that are not operating normally
US9660917B2 (en) 2006-08-22 2017-05-23 Centurylink Intellectual Property Llc System and method for remotely controlling network operators
US9813320B2 (en) 2006-08-22 2017-11-07 Centurylink Intellectual Property Llc System and method for generating a graphical user interface representative of network performance
US8619596B2 (en) 2006-08-22 2013-12-31 Centurylink Intellectual Property Llc System and method for using centralized network performance tables to manage network communications
US8619820B2 (en) 2006-08-22 2013-12-31 Centurylink Intellectual Property Llc System and method for enabling communications over a number of packet networks
US8619600B2 (en) 2006-08-22 2013-12-31 Centurylink Intellectual Property Llc System and method for establishing calls over a call path having best path metrics
US10075351B2 (en) 2006-08-22 2018-09-11 Centurylink Intellectual Property Llc System and method for improving network performance
US9661514B2 (en) 2006-08-22 2017-05-23 Centurylink Intellectual Property Llc System and method for adjusting communication parameters
US8670313B2 (en) 2006-08-22 2014-03-11 Centurylink Intellectual Property Llc System and method for adjusting the window size of a TCP packet through network elements
US8687614B2 (en) 2006-08-22 2014-04-01 Centurylink Intellectual Property Llc System and method for adjusting radio frequency parameters
US8064391B2 (en) 2006-08-22 2011-11-22 Embarq Holdings Company, Llc System and method for monitoring and optimizing network performance to a wireless device
US8040811B2 (en) 2006-08-22 2011-10-18 Embarq Holdings Company, Llc System and method for collecting and managing network performance information
US8743703B2 (en) 2006-08-22 2014-06-03 Centurylink Intellectual Property Llc System and method for tracking application resource usage
US20080049747A1 (en) * 2006-08-22 2008-02-28 Mcnaughton James L System and method for handling reservation requests with a connection admission control engine
US8743700B2 (en) 2006-08-22 2014-06-03 Centurylink Intellectual Property Llc System and method for provisioning resources of a packet network based on collected network performance information
US8750158B2 (en) 2006-08-22 2014-06-10 Centurylink Intellectual Property Llc System and method for differentiated billing
US9621361B2 (en) 2006-08-22 2017-04-11 Centurylink Intellectual Property Llc Pin-hole firewall for communicating data packets on a packet network
US8015294B2 (en) 2006-08-22 2011-09-06 Embarq Holdings Company, LP Pin-hole firewall for communicating data packets on a packet network
US8811160B2 (en) 2006-08-22 2014-08-19 Centurylink Intellectual Property Llc System and method for routing data on a packet network
US9602265B2 (en) 2006-08-22 2017-03-21 Centurylink Intellectual Property Llc System and method for handling communications requests
US7940735B2 (en) 2006-08-22 2011-05-10 Embarq Holdings Company, Llc System and method for selecting an access point
US9479341B2 (en) 2006-08-22 2016-10-25 Centurylink Intellectual Property Llc System and method for initiating diagnostics on a packet network node
US8130793B2 (en) 2006-08-22 2012-03-06 Embarq Holdings Company, Llc System and method for enabling reciprocal billing for different types of communications over a packet network
US9014204B2 (en) 2006-08-22 2015-04-21 Centurylink Intellectual Property Llc System and method for managing network communications
US9042370B2 (en) 2006-08-22 2015-05-26 Centurylink Intellectual Property Llc System and method for establishing calls over a call path having best path metrics
US9054986B2 (en) 2006-08-22 2015-06-09 Centurylink Intellectual Property Llc System and method for enabling communications over a number of packet networks
US7843831B2 (en) 2006-08-22 2010-11-30 Embarq Holdings Company Llc System and method for routing data on a packet network
US10298476B2 (en) 2006-08-22 2019-05-21 Centurylink Intellectual Property Llc System and method for tracking application resource usage
US9094261B2 (en) 2006-08-22 2015-07-28 Centurylink Intellectual Property Llc System and method for establishing a call being received by a trunk on a packet network
US7808918B2 (en) 2006-08-22 2010-10-05 Embarq Holdings Company, Llc System and method for dynamically shaping network traffic
US9112734B2 (en) 2006-08-22 2015-08-18 Centurylink Intellectual Property Llc System and method for generating a graphical user interface representative of network performance
US10469385B2 (en) 2006-08-22 2019-11-05 Centurylink Intellectual Property Llc System and method for improving network performance using a connection admission control engine
US9253661B2 (en) 2006-08-22 2016-02-02 Centurylink Intellectual Property Llc System and method for modifying connectivity fault management packets
US9241277B2 (en) 2006-08-22 2016-01-19 Centurylink Intellectual Property Llc System and method for monitoring and optimizing network performance to a wireless device
US9240906B2 (en) 2006-08-22 2016-01-19 Centurylink Intellectual Property Llc System and method for monitoring and altering performance of a packet network
US9225609B2 (en) 2006-08-22 2015-12-29 Centurylink Intellectual Property Llc System and method for remotely controlling network operators
US9225646B2 (en) 2006-08-22 2015-12-29 Centurylink Intellectual Property Llc System and method for improving network performance using a connection admission control engine
US9241271B2 (en) 2006-08-22 2016-01-19 Centurylink Intellectual Property Llc System and method for restricting access to network performance information
US8194643B2 (en) 2006-10-19 2012-06-05 Embarq Holdings Company, Llc System and method for monitoring the connection of an end-user to a remote network
US8289965B2 (en) 2006-10-19 2012-10-16 Embarq Holdings Company, Llc System and method for establishing a communications session with an end-user based on the state of a network connection
US8189468B2 (en) 2006-10-25 2012-05-29 Embarq Holdings, Company, LLC System and method for regulating messages between networks
US9521150B2 (en) 2006-10-25 2016-12-13 Centurylink Intellectual Property Llc System and method for automatically regulating messages between networks
US20080183715A1 (en) * 2007-01-31 2008-07-31 Wei Wen Chen Extensible system for network discovery
US8111692B2 (en) 2007-05-31 2012-02-07 Embarq Holdings Company Llc System and method for modifying network traffic
US20090013398A1 (en) * 2007-07-06 2009-01-08 Acterna Llc Remote Testing Of Firewalled Networks
US20090177249A1 (en) * 2007-08-10 2009-07-09 Smiths Medical Md Package deployment of data between a server and a medical device
US8879391B2 (en) 2008-04-09 2014-11-04 Centurylink Intellectual Property Llc System and method for using network derivations to determine path states
US8068425B2 (en) 2008-04-09 2011-11-29 Embarq Holdings Company, Llc System and method for using network performance information to determine improved measures of path states
US10104041B2 (en) 2008-05-16 2018-10-16 Cisco Technology, Inc. Controlling the spread of interests and content in a content centric network
US20100131315A1 (en) * 2008-11-25 2010-05-27 International Business Machines Corporation Resolving incident reports
US20100241907A1 (en) * 2009-03-19 2010-09-23 Fujitsu Limited Network monitor and control apparatus
US8195985B2 (en) * 2009-03-19 2012-06-05 Fujitsu Limited Network monitor and control apparatus
US8745702B2 (en) 2009-04-30 2014-06-03 Centurylink Intellectual Property Llc System and method for managing access to a network interface device
US8533784B2 (en) * 2009-04-30 2013-09-10 Centurylink Intellectual Property Llc System and method for separating control of a network interface device
US20100281518A1 (en) * 2009-04-30 2010-11-04 Embarq Holdings Company, Llc System and method for separating control of a network interface device
US9686194B2 (en) 2009-10-21 2017-06-20 Cisco Technology, Inc. Adaptive multi-interface use for content networking
US20190370145A1 (en) * 2010-02-24 2019-12-05 Salesforce.Com, Inc. System, method and computer program product for monitoring data activity utilizing a shared data store
US8767707B2 (en) 2010-04-23 2014-07-01 Blackberry Limited Monitoring a mobile data service associated with a mailbox
US9208012B2 (en) * 2010-11-29 2015-12-08 Nec Corporation Display processing system, display processing method, and program
US20120331034A1 (en) * 2011-06-22 2012-12-27 Alain Fawaz Latency Probe
US20140006861A1 (en) * 2012-06-28 2014-01-02 Microsoft Corporation Problem inference from support tickets
US9262253B2 (en) 2012-06-28 2016-02-16 Microsoft Technology Licensing, Llc Middlebox reliability
US9229800B2 (en) * 2012-06-28 2016-01-05 Microsoft Technology Licensing, Llc Problem inference from support tickets
US20140019797A1 (en) * 2012-07-11 2014-01-16 Ca, Inc. Resource management in ephemeral environments
US9218205B2 (en) * 2012-07-11 2015-12-22 Ca, Inc. Resource management in ephemeral environments
US9325748B2 (en) 2012-11-15 2016-04-26 Microsoft Technology Licensing, Llc Characterizing service levels on an electronic network
US10075347B2 (en) 2012-11-15 2018-09-11 Microsoft Technology Licensing, Llc Network configuration in view of service level considerations
US9565080B2 (en) 2012-11-15 2017-02-07 Microsoft Technology Licensing, Llc Evaluating electronic network devices in view of cost and service level considerations
US20140280220A1 (en) * 2013-03-13 2014-09-18 Sas Institute Inc. Scored storage determination
US9542459B2 (en) 2013-05-20 2017-01-10 International Business Machines Corporation Adaptive data collection
US9350601B2 (en) 2013-06-21 2016-05-24 Microsoft Technology Licensing, Llc Network event processing and prioritization
US10098051B2 (en) 2014-01-22 2018-10-09 Cisco Technology, Inc. Gateways and routing in software-defined manets
US9954678B2 (en) 2014-02-06 2018-04-24 Cisco Technology, Inc. Content-based transport security
US9836540B2 (en) 2014-03-04 2017-12-05 Cisco Technology, Inc. System and method for direct storage access in a content-centric network
US10445380B2 (en) 2014-03-04 2019-10-15 Cisco Technology, Inc. System and method for direct storage access in a content-centric network
US9626413B2 (en) 2014-03-10 2017-04-18 Cisco Systems, Inc. System and method for ranking content popularity in a content-centric network
US9716622B2 (en) 2014-04-01 2017-07-25 Cisco Technology, Inc. System and method for dynamic name configuration in content-centric networks
US9473576B2 (en) 2014-04-07 2016-10-18 Palo Alto Research Center Incorporated Service discovery using collection synchronization with exact names
US9992281B2 (en) 2014-05-01 2018-06-05 Cisco Technology, Inc. Accountable content stores for information centric networks
US10158656B2 (en) 2014-05-22 2018-12-18 Cisco Technology, Inc. Method and apparatus for preventing insertion of malicious content at a named data network router
US9609014B2 (en) 2014-05-22 2017-03-28 Cisco Systems, Inc. Method and apparatus for preventing insertion of malicious content at a named data network router
US9426113B2 (en) * 2014-06-30 2016-08-23 Palo Alto Research Center Incorporated System and method for managing devices over a content centric network
US20150381546A1 (en) * 2014-06-30 2015-12-31 Palo Alto Research Center Incorporated System and method for managing devices over a content centric network
US9699198B2 (en) 2014-07-07 2017-07-04 Cisco Technology, Inc. System and method for parallel secure content bootstrapping in content-centric networks
US10237075B2 (en) 2014-07-17 2019-03-19 Cisco Technology, Inc. Reconstructable content objects
US9621354B2 (en) 2014-07-17 2017-04-11 Cisco Systems, Inc. Reconstructable content objects
US9929935B2 (en) 2014-07-18 2018-03-27 Cisco Technology, Inc. Method and system for keeping interest alive in a content centric network
US9590887B2 (en) 2014-07-18 2017-03-07 Cisco Systems, Inc. Method and system for keeping interest alive in a content centric network
US10305968B2 (en) 2014-07-18 2019-05-28 Cisco Technology, Inc. Reputation-based strategy for forwarding and responding to interests over a content centric network
US9729616B2 (en) 2014-07-18 2017-08-08 Cisco Technology, Inc. Reputation-based strategy for forwarding and responding to interests over a content centric network
US9882964B2 (en) 2014-08-08 2018-01-30 Cisco Technology, Inc. Explicit strategy feedback in name-based forwarding
US9729662B2 (en) 2014-08-11 2017-08-08 Cisco Technology, Inc. Probabilistic lazy-forwarding technique without validation in a content centric network
US9800637B2 (en) 2014-08-19 2017-10-24 Cisco Technology, Inc. System and method for all-in-one content stream in content-centric networks
US10367871B2 (en) 2014-08-19 2019-07-30 Cisco Technology, Inc. System and method for all-in-one content stream in content-centric networks
US10069933B2 (en) 2014-10-23 2018-09-04 Cisco Technology, Inc. System and method for creating virtual interfaces based on network characteristics
US10715634B2 (en) 2014-10-23 2020-07-14 Cisco Technology, Inc. System and method for creating virtual interfaces based on network characteristics
US9590948B2 (en) 2014-12-15 2017-03-07 Cisco Systems, Inc. CCN routing using hardware-assisted hash tables
US10237189B2 (en) 2014-12-16 2019-03-19 Cisco Technology, Inc. System and method for distance-based interest forwarding
US10003520B2 (en) 2014-12-22 2018-06-19 Cisco Technology, Inc. System and method for efficient name-based content routing using link-state information in information-centric networks
US9660825B2 (en) 2014-12-24 2017-05-23 Cisco Technology, Inc. System and method for multi-source multicasting in content-centric networks
US10091012B2 (en) 2014-12-24 2018-10-02 Cisco Technology, Inc. System and method for multi-source multicasting in content-centric networks
US9954795B2 (en) 2015-01-12 2018-04-24 Cisco Technology, Inc. Resource allocation using CCN manifests
US9832291B2 (en) 2015-01-12 2017-11-28 Cisco Technology, Inc. Auto-configurable transport stack
US9916457B2 (en) 2015-01-12 2018-03-13 Cisco Technology, Inc. Decoupled name security binding for CCN objects
US9946743B2 (en) 2015-01-12 2018-04-17 Cisco Technology, Inc. Order encoded manifests in a content centric network
US10440161B2 (en) 2015-01-12 2019-10-08 Cisco Technology, Inc. Auto-configurable transport stack
US9697017B2 (en) * 2015-01-26 2017-07-04 Fuji Xerox Co., Ltd. Configuring and processing management information base (MIB) in a distributed environment
US20160216981A1 (en) * 2015-01-26 2016-07-28 Fuji Xerox Co., Ltd. Information processing apparatus, non-transitory computer readable medium, and information processing method
US10333840B2 (en) 2015-02-06 2019-06-25 Cisco Technology, Inc. System and method for on-demand content exchange with adaptive naming in information-centric networks
US10075401B2 (en) 2015-03-18 2018-09-11 Cisco Technology, Inc. Pending interest table behavior
US11283697B1 (en) 2015-03-24 2022-03-22 Vmware, Inc. Scalable real time metrics management
US10075402B2 (en) 2015-06-24 2018-09-11 Cisco Technology, Inc. Flexible command and control in content centric networks
US9959328B2 (en) 2015-06-30 2018-05-01 Microsoft Technology Licensing, Llc Analysis of user text
US10402435B2 (en) 2015-06-30 2019-09-03 Microsoft Technology Licensing, Llc Utilizing semantic hierarchies to process free-form text
US10701038B2 (en) 2015-07-27 2020-06-30 Cisco Technology, Inc. Content negotiation in a content centric network
US9986034B2 (en) 2015-08-03 2018-05-29 Cisco Technology, Inc. Transferring state in content centric network stacks
US10313211B1 (en) * 2015-08-25 2019-06-04 Avi Networks Distributed network service risk monitoring and scoring
US11411825B2 (en) 2015-08-25 2022-08-09 Vmware, Inc. In intelligent autoscale of services
US10594562B1 (en) 2015-08-25 2020-03-17 Vmware, Inc. Intelligent autoscale of services
US10419345B2 (en) 2015-09-11 2019-09-17 Cisco Technology, Inc. Network named fragments in a content centric network
US9832123B2 (en) 2015-09-11 2017-11-28 Cisco Technology, Inc. Network named fragments in a content centric network
US10355999B2 (en) 2015-09-23 2019-07-16 Cisco Technology, Inc. Flow control with network named fragments
US9977809B2 (en) 2015-09-24 2018-05-22 Cisco Technology, Inc. Information and data framework in a content centric network
US10313227B2 (en) 2015-09-24 2019-06-04 Cisco Technology, Inc. System and method for eliminating undetected interest looping in information-centric networks
US10454820B2 (en) 2015-09-29 2019-10-22 Cisco Technology, Inc. System and method for stateless information-centric networking
US10263965B2 (en) 2015-10-16 2019-04-16 Cisco Technology, Inc. Encrypted CCNx
US9912776B2 (en) 2015-12-02 2018-03-06 Cisco Technology, Inc. Explicit content deletion commands in a content centric network
US10097346B2 (en) 2015-12-09 2018-10-09 Cisco Technology, Inc. Key catalogs in a content centric network
CN105577431A (en) * 2015-12-11 2016-05-11 青岛云成互动网络有限公司 User information identification and classification method based on internet application and system thereof
US10581967B2 (en) 2016-01-11 2020-03-03 Cisco Technology, Inc. Chandra-Toueg consensus in a content centric network
US10257271B2 (en) 2016-01-11 2019-04-09 Cisco Technology, Inc. Chandra-Toueg consensus in a content centric network
US10043016B2 (en) 2016-02-29 2018-08-07 Cisco Technology, Inc. Method and system for name encryption agreement in a content centric network
US10051071B2 (en) 2016-03-04 2018-08-14 Cisco Technology, Inc. Method and system for collecting historical network information in a content centric network
US10003507B2 (en) 2016-03-04 2018-06-19 Cisco Technology, Inc. Transport session state protocol
US10693734B2 (en) 2016-03-04 2020-06-23 Vmware, Inc. Traffic pattern detection and presentation in container-based cloud computing architecture
US10038633B2 (en) 2016-03-04 2018-07-31 Cisco Technology, Inc. Protocol to query for historical network information in a content centric network
US10742596B2 (en) 2016-03-04 2020-08-11 Cisco Technology, Inc. Method and system for reducing a collision probability of hash-based names using a publisher identifier
US10469378B2 (en) 2016-03-04 2019-11-05 Cisco Technology, Inc. Protocol to query for historical network information in a content centric network
US9832116B2 (en) 2016-03-14 2017-11-28 Cisco Technology, Inc. Adjusting entries in a forwarding information base in a content centric network
US10129368B2 (en) 2016-03-14 2018-11-13 Cisco Technology, Inc. Adjusting entries in a forwarding information base in a content centric network
US10212196B2 (en) 2016-03-16 2019-02-19 Cisco Technology, Inc. Interface discovery and authentication in a name-based network
US10067948B2 (en) 2016-03-18 2018-09-04 Cisco Technology, Inc. Data deduping in content centric networking manifests
US11436656B2 (en) 2016-03-18 2022-09-06 Palo Alto Research Center Incorporated System and method for a real-time egocentric collaborative filter on large datasets
US10091330B2 (en) 2016-03-23 2018-10-02 Cisco Technology, Inc. Interest scheduling by an information and data framework in a content centric network
US10033639B2 (en) 2016-03-25 2018-07-24 Cisco Technology, Inc. System and method for routing packets in a content centric network using anonymous datagrams
US10931548B1 (en) 2016-03-28 2021-02-23 Vmware, Inc. Collecting health monitoring data pertaining to an application from a selected set of service engines
US10320760B2 (en) 2016-04-01 2019-06-11 Cisco Technology, Inc. Method and system for mutating and caching content in a content centric network
US10348865B2 (en) 2016-04-04 2019-07-09 Cisco Technology, Inc. System and method for compressing content centric networking messages
US9930146B2 (en) 2016-04-04 2018-03-27 Cisco Technology, Inc. System and method for compressing content centric networking messages
US10425503B2 (en) 2016-04-07 2019-09-24 Cisco Technology, Inc. Shared pending interest table in a content centric network
US10841212B2 (en) 2016-04-11 2020-11-17 Cisco Technology, Inc. Method and system for routable prefix queries in a content centric network
US10027578B2 (en) 2016-04-11 2018-07-17 Cisco Technology, Inc. Method and system for routable prefix queries in a content centric network
US10404450B2 (en) 2016-05-02 2019-09-03 Cisco Technology, Inc. Schematized access control in a content centric network
US10320675B2 (en) 2016-05-04 2019-06-11 Cisco Technology, Inc. System and method for routing packets in a stateless content centric network
US10547589B2 (en) 2016-05-09 2020-01-28 Cisco Technology, Inc. System for implementing a small computer systems interface protocol over a content centric network
US10404537B2 (en) 2016-05-13 2019-09-03 Cisco Technology, Inc. Updating a transport stack in a content centric network
US10063414B2 (en) 2016-05-13 2018-08-28 Cisco Technology, Inc. Updating a transport stack in a content centric network
US10693852B2 (en) 2016-05-13 2020-06-23 Cisco Technology, Inc. System for a secure encryption proxy in a content centric network
US10084764B2 (en) 2016-05-13 2018-09-25 Cisco Technology, Inc. System for a secure encryption proxy in a content centric network
US10103989B2 (en) 2016-06-13 2018-10-16 Cisco Technology, Inc. Content object return messages in a content centric network
US10305865B2 (en) 2016-06-21 2019-05-28 Cisco Technology, Inc. Permutation-based content encryption with manifests in a content centric network
US10148572B2 (en) 2016-06-27 2018-12-04 Cisco Technology, Inc. Method and system for interest groups in a content centric network
US10581741B2 (en) 2016-06-27 2020-03-03 Cisco Technology, Inc. Method and system for interest groups in a content centric network
US10009266B2 (en) 2016-07-05 2018-06-26 Cisco Technology, Inc. Method and system for reference counted pending interest tables in a content centric network
US20190245767A1 (en) * 2016-07-08 2019-08-08 Convida Wireless, Llc Methods to monitor resources through http/2
US11070456B2 (en) * 2016-07-08 2021-07-20 Convida Wireless, Llc Methods to monitor resources through HTTP/2
US9992097B2 (en) 2016-07-11 2018-06-05 Cisco Technology, Inc. System and method for piggybacking routing information in interests in a content centric network
US10122624B2 (en) 2016-07-25 2018-11-06 Cisco Technology, Inc. System and method for ephemeral entries in a forwarding information base in a content centric network
US10069729B2 (en) 2016-08-08 2018-09-04 Cisco Technology, Inc. System and method for throttling traffic based on a forwarding information base in a content centric network
US10956412B2 (en) 2016-08-09 2021-03-23 Cisco Technology, Inc. Method and system for conjunctive normal form attribute matching in a content centric network
US10033642B2 (en) 2016-09-19 2018-07-24 Cisco Technology, Inc. System and method for making optimal routing decisions based on device-specific parameters in a content centric network
US10212248B2 (en) 2016-10-03 2019-02-19 Cisco Technology, Inc. Cache management on high availability routers in a content centric network
US10897518B2 (en) 2016-10-03 2021-01-19 Cisco Technology, Inc. Cache management on high availability routers in a content centric network
US10447805B2 (en) 2016-10-10 2019-10-15 Cisco Technology, Inc. Distributed consensus in a content centric network
US10135948B2 (en) 2016-10-31 2018-11-20 Cisco Technology, Inc. System and method for process migration in a content centric network
US10721332B2 (en) 2016-10-31 2020-07-21 Cisco Technology, Inc. System and method for process migration in a content centric network
US10929364B2 (en) 2016-11-11 2021-02-23 International Business Machines Corporation Assisted problem identification in a computing system
US11650966B2 (en) 2016-11-11 2023-05-16 International Business Machines Corporation Assisted problem identification in a computing system
US11537576B2 (en) 2016-11-11 2022-12-27 International Business Machines Corporation Assisted problem identification in a computing system
US10929363B2 (en) 2016-11-11 2021-02-23 International Business Machines Corporation Assisted problem identification in a computing system
US10243851B2 (en) 2016-11-21 2019-03-26 Cisco Technology, Inc. System and method for forwarder connection information in a content centric network
US20180173698A1 (en) * 2016-12-16 2018-06-21 Microsoft Technology Licensing, Llc Knowledge Base for Analysis of Text
US10679008B2 (en) * 2016-12-16 2020-06-09 Microsoft Technology Licensing, Llc Knowledge base for analysis of text
US11086137B2 (en) 2017-03-17 2021-08-10 United Kingdom Research And Innovation Super-resolution microscopy
CN110914742A (en) * 2017-03-17 2020-03-24 英国研究与创新组织 Super-resolution microscopy
US10999168B1 (en) 2018-05-30 2021-05-04 Vmware, Inc. User defined custom metrics
US11044180B2 (en) 2018-10-26 2021-06-22 Vmware, Inc. Collecting samples hierarchically in a datacenter
US11171849B2 (en) 2018-10-26 2021-11-09 Vmware, Inc. Collecting samples hierarchically in a datacenter
US11736372B2 (en) 2018-10-26 2023-08-22 Vmware, Inc. Collecting samples hierarchically in a datacenter
CN109739837A (en) * 2018-12-28 2019-05-10 深圳市简工智能科技有限公司 Analysis method for smart lock logs, terminal, and readable storage medium
US11290358B2 (en) 2019-05-30 2022-03-29 Vmware, Inc. Partitioning health monitoring in a global server load balancing system
US11582120B2 (en) 2019-05-30 2023-02-14 Vmware, Inc. Partitioning health monitoring in a global server load balancing system
US11909612B2 (en) 2019-05-30 2024-02-20 VMware LLC Partitioning health monitoring in a global server load balancing system
US11811861B2 (en) 2021-05-17 2023-11-07 Vmware, Inc. Dynamically updating load balancing criteria
US11792155B2 (en) 2021-06-14 2023-10-17 Vmware, Inc. Method and apparatus for enhanced client persistence in multi-site GSLB deployments
US11799824B2 (en) 2021-06-14 2023-10-24 Vmware, Inc. Method and apparatus for enhanced client persistence in multi-site GSLB deployments

Also Published As

Publication number Publication date
US20040088386A1 (en) 2004-05-06

Similar Documents

Publication Publication Date Title
US7246159B2 (en) Distributed data gathering and storage for use in a fault and performance monitoring system
US6985944B2 (en) Distributing queries and combining query responses in a fault and performance monitoring system using distributed data gathering and storage
US20040088404A1 (en) Administering users in a fault and performance monitoring system using distributed data gathering and storage
US20040088403A1 (en) System configuration for use with a fault and performance monitoring system using distributed data gathering and storage
US7275103B1 (en) Storage path optimization for SANs
US10491486B2 (en) Scalable performance management system
US7685269B1 (en) Service-level monitoring for storage applications
US7606895B1 (en) Method and apparatus for collecting network performance data
US8234365B2 (en) Method and system of alert notification
US8024494B2 (en) Method of monitoring device forming information processing system, information apparatus and information processing system
US6754664B1 (en) Schema-based computer system health monitoring
US8239527B2 (en) System and interface for monitoring information technology assets
US7577701B1 (en) System and method for continuous monitoring and measurement of performance of computers on network
US20030135611A1 (en) Self-monitoring service system with improved user administration and user access control
US20040015579A1 (en) Method and apparatus for enterprise management
US20040042470A1 (en) Method and apparatus for rate limiting
US20080177874A1 (en) Method and System for Visualizing Network Performance Characteristics
US20030212643A1 (en) System and method to combine a product database with an existing enterprise to model best usage of funds for the enterprise
US7269647B2 (en) Simplified network packet analyzer for distributed packet snooper
JP2006331392A (en) Method and system for remotely monitoring storage system
CN114244676A (en) Intelligent IT integrated gateway system
US10986136B1 (en) Methods for application management and monitoring and devices thereof
Horn et al. Monitoring the World with NetBSD
Wynd Enterprise Network Monitoring and Analysis
Stamatelopoulos et al. QoS Management for Internet Information Services

Legal Events

Date Code Title Description
AS Assignment

Owner name: FIDELIA TECHNOLOGY, INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AGGARWAL, VIKAS;REEL/FRAME:013754/0101

Effective date: 20030129

AS Assignment

Owner name: NETWORK GENERAL CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:RASHID, RAJIB;REEL/FRAME:018760/0001

Effective date: 20060222

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: WELLS FARGO BANK, NATIONAL ASSOCIATION, MINNESOTA

Free format text: SECURITY AGREEMENT;ASSIGNOR:FIDELIA TECHNOLOGY, INC.;REEL/FRAME:020054/0365

Effective date: 20071101

AS Assignment

Owner name: KEYBANK NATIONAL ASSOCIATION, OHIO

Free format text: INTELLECTUAL PROPERTY SECURITY AGREEMENT;ASSIGNOR:FIDELIA TECHNOLOGY, INC.;REEL/FRAME:020521/0568

Effective date: 20071221

FEPP Fee payment procedure

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: KEYBANK NATIONAL ASSOCIATION AS ADMINISTRATIVE AGENT

Free format text: INTELLECTUAL PROPERTY SECURITY AGREEMENT;ASSIGNOR:FIDELIA TECHNOLOGY, INC.;REEL/FRAME:027282/0856

Effective date: 20111122

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: FIDELIA TECHNOLOGY, INC., MASSACHUSETTS

Free format text: SECURITY AGREEMENT;ASSIGNOR:WELLS FARGO BANK, NATIONAL ASSOCIATION;REEL/FRAME:036065/0075

Effective date: 20150701

AS Assignment

Owner name: FIDELIA TECHNOLOGY, INC., MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:KEYBANK NATIONAL ASSOCIATION;REEL/FRAME:036087/0786

Effective date: 20150714

Owner name: FIDELIA TECHNOLOGY, INC., MASSACHUSETTS

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:KEYBANK NATIONAL ASSOCIATION;REEL/FRAME:036087/0781

Effective date: 20150714

Owner name: JPMORGAN CHASE BANK, N.A., NEW YORK

Free format text: SECURITY INTEREST;ASSIGNORS:NETSCOUT SYSTEMS, INC.;FIDELIA TECHNOLOGY, INC.;NETSCOUT SERVICE LEVEL CORPORATION;AND OTHERS;REEL/FRAME:036087/0808

Effective date: 20150714

AS Assignment

Owner name: NETSCOUT SYSTEMS, INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:FIDELIA TECHNOLOGY, INC.;REEL/FRAME:048035/0559

Effective date: 20190116

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12