US20050243731A1 - Silent datapath failure detection

Silent datapath failure detection

Info

Publication number
US20050243731A1
Authority
US
United States
Prior art keywords
egress
datapath
count
ingress
pdu
Prior art date
Legal status
Granted
Application number
US10/834,129
Other versions
US7489642B2
Inventor
Desmond Smith
Terrence Sellars
Stephen Shortt
David Graham
Christopher Trader
Myles Dear
Thomas Kam
Current Assignee
WSOU Investments LLC
Original Assignee
Alcatel SA
Priority date
Filing date
Publication date
Application filed by Alcatel SA
Priority to US10/834,129 (granted as US7489642B2)
Assigned to ALCATEL (assignment of assignors interest). Assignors: DEAR, MYLES KEVIN; GRAHAM, DAVID HENRY; KAM, THOMAS MAN CHUN; SELLARS, TERRENCE VINCENT; SHORTT, STEPHEN MICHAEL; SMITH, DESMOND GLENN; TRADER, CHRISTOPHER EDWARD
Priority to EP05300327A (granted as EP1592171B1)
Priority to EP05300326A (published as EP1655892A1)
Publication of US20050243731A1
Assigned to ALCATEL LUCENT (change of name). Assignor: ALCATEL
Application granted
Publication of US7489642B2
Assigned to OMEGA CREDIT OPPORTUNITIES MASTER FUND, LP (security interest). Assignor: WSOU INVESTMENTS, LLC
Assigned to WSOU INVESTMENTS, LLC (assignment of assignors interest). Assignor: ALCATEL LUCENT
Assigned to BP FUNDING TRUST, SERIES SPL-VI (security interest). Assignor: WSOU INVESTMENTS, LLC
Assigned to WSOU INVESTMENTS, LLC (release by secured party). Assignor: OCO OPPORTUNITIES MASTER FUND, L.P. (F/K/A OMEGA CREDIT OPPORTUNITIES MASTER FUND, LP)
Assigned to OT WSOU TERRIER HOLDINGS, LLC (security interest). Assignor: WSOU INVESTMENTS, LLC
Assigned to WSOU INVESTMENTS, LLC (release by secured party). Assignor: TERRIER SSC, LLC
Status: Expired - Fee Related
Adjusted expiration

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 - Management of faults, events, alarms or notifications
    • H04L41/0654 - Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659 - Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 - Management of faults, events, alarms or notifications
    • H04L41/0681 - Configuration of triggering conditions
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 - Arrangements for monitoring or testing data switching networks
    • H04L43/16 - Threshold monitoring

Abstract

The apparatus and method herein provide a means for actively monitoring a communications node for datapath disruptions caused within the node. It was developed to address linecard failures that were detected in the field and for which traditional error detection mechanisms such as CRC and OAM CC checks are not feasible. According to one implementation, statistics are collected over a preset time interval. If any egress cell count is zero over a full interval, then the ingress interface for that datapath is determined, e.g. using the node's cross-connection information, and the corresponding linecard of that ingress interface is contacted to determine the ingress cell count. If the ingress cell count is non-zero then the datapath is alarmed, since the complete lack of cells transmitted at the egress indicates a datapath disruption. Other implementations that can detect datapath disruptions without a complete interval of statistics collection, and which do not require a zero egress cell count, are possible for cases where congestion, linecard resets, etc. can be ruled out.

Description

    FIELD OF THE INVENTION
  • The invention is directed to communication networks and in particular to an apparatus and method for detection of silent datapath failures.
  • BACKGROUND OF THE INVENTION
  • Generally speaking, detection and isolation of connectivity loss is much more difficult than detection of transmission protocol data unit (PDU) corruption. This is because a corrupted PDU is typically available for inspection and analysis, while detection of a datapath interruption requires analysis of the PDU stream, rather than of individual PDU's.
  • Thus, most transmission protocols associate a CRC (Cyclic Redundancy Check) with each PDU, which is computed by applying a predetermined function to the block of data to be transmitted. The receiver at the far end of the datapath recalculates the CRC using the same function as at transmission and compares the transmitted and received values. Corrupted bits are detected and may then be corrected using the CRC bits.
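  • As a minimal sketch of this transmit/receive CRC exchange, the example below uses Python's standard zlib.crc32 as the predetermined function and a 4-byte CRC trailer; both choices are assumptions for illustration (the patent does not specify them), and this simple check detects corruption but does not correct it:

    import zlib

    def attach_crc(payload: bytes) -> bytes:
        """Transmitter: compute the CRC over the data block and append it."""
        return payload + zlib.crc32(payload).to_bytes(4, "big")

    def crc_ok(pdu: bytes) -> bool:
        """Receiver: recompute the CRC with the same function and compare."""
        data, received = pdu[:-4], int.from_bytes(pdu[-4:], "big")
        return zlib.crc32(data) == received

    pdu = attach_crc(b"example PDU")
    assert crc_ok(pdu)                                 # intact PDU passes
    assert not crc_ok(bytes([pdu[0] ^ 1]) + pdu[1:])   # a flipped bit is caught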
  • It is known to use OAM-CC cells to monitor end-to-end datapath connectivity in ATM networks. An OAM (operation, administration and maintenance) cell is a specially tagged ATM cell specified to support ATM network maintenance features such as connectivity verification, alarm surveillance, continuity check, and performance monitoring. However, OAM-CC is not supported by all network equipment makes (network equipment suppliers). In addition, the operational impact of configuring and monitoring large networks with thousands of connections becomes significant. Also, although OAM-CC can detect a datapath issue, it cannot isolate the cause (node, card) and therefore fails to reduce the fault isolation time and the related revenue loss. For these reasons, many network customers decline this solution.
  • Another conventional solution for datapath fault detection is to measure the traffic at each node along the datapath using customized, off-line testers. These testers generally provide counters at the ingress and egress ends of the datapath for counting the number of PDUs traversing these ends. The values of the counters over a preset period of time are compared to determine if cell loss or cell addition has occurred at the respective node. However, since such a tester is a stand-alone device, the traffic needs to be stopped during the measurement, adversely affecting subscriber services. Also, these measurements take place only after an end-user complains to the service provider about a failure. These limitations make this type of conventional solution essentially incompatible with real-time background diagnostic monitoring of a datapath.
  • On-line counters may also be used, as described in co-pending U.S. patent application Ser. No. 10/717,377, entitled “Method And Apparatus For Detection Of Transmission Unit Loss And/Or Replication”, filed by Steven Driediger et al. on 19 Nov. 2003 and assigned to Alcatel. According to the solution proposed in Driediger et al.'s application, aggregate per-connection coherent counters are kept on ingress and egress line cards and periodically compared. This mechanism requires bounded latency through the switch fabric, but it also detects subtle traffic loss or replication. On the other hand, this method adds complexity, because the PDU's must be accurately measured and tracked as they traverse the fabric, which is difficult since the latency through the fabric is continuously changing.
  • There is a need to provide a method and apparatus that enables fast datapath failure detection while leveraging the hardware infrastructure that most nodes already have.
  • SUMMARY OF THE INVENTION
  • It is an object of the invention to provide an apparatus and method for detection of silent datapath failures that alleviates, totally or in part, the drawbacks of prior art datapath failure detection systems.
  • The invention is directed to a switched communication network that enables establishing a datapath over a switching node of the network. Namely, the invention provides an apparatus for silent datapath failure detection at the node, comprising: an ingress statistics unit for collecting an ingress protocol data unit (PDU) count over a time interval at an ingress port of the node associated with the datapath; an egress statistics unit for collecting an egress PDU count over the time interval at an egress port of the node associated with the datapath; means for comparing the ingress PDU count and the egress PDU count and generating an error signal in case of a mismatch between the ingress PDU count and the egress PDU count; and means for alarming the mismatch and specifying a type of datapath failure associated with the error signal.
  • According to another aspect of the invention, a traffic manager is provided at a switching node of such a communication network. The traffic manager comprises: means for determining an egress count including all egress traffic PDU's in the datapath counted at the egress port over a period of time; and means for comparing the egress count with an ingress count and generating an error signal in the case of a mismatch between the ingress count and the egress count, wherein the ingress count includes all ingress traffic PDU's in the datapath counted at the ingress port over the period of time.
  • According to still another aspect of the invention, a traffic manager is provided at a switching node of a communication network. The traffic manager comprises: means for determining an ingress count including all ingress traffic PDU's in the datapath counted at the ingress port over a period of time; and means for comparing the ingress count with an egress count and generating an error signal in the case of a mismatch between the ingress count and the egress count, wherein the egress count includes all egress traffic PDU's in the datapath counted at the egress port over the period of time.
  • A method for silent datapath failure detection is also provided for a switched communication network of the type that enables establishing a datapath over a switching node of the network. According to another aspect of the invention, the method includes the steps of: at an ingress port associated with the datapath, collecting an ingress PDU count over a time interval; at an egress port associated with the datapath, collecting an egress PDU count over the time interval; identifying the ingress port associated with the egress port whenever the egress PDU count violates a threshold; and comparing the ingress PDU count with the egress PDU count and generating an error signal in case of a mismatch between the ingress PDU count and the egress PDU count.
  • From the market perspective, node availability is a very important performance parameter of any telecommunication network. Advantageously, the apparatus and method of the invention enable a network provider to isolate a fault to a certain node, and to a certain pair of line cards, without interrupting live traffic. This improves node availability by reducing the fault detection and isolation times, thereby reducing the node's mean time-to-repair (MTTR).
  • In addition, no special hardware is required for implementing the solution according to the invention. The fault detection time can be controlled by selecting the statistics collection time, so that fast detection may be obtained. Furthermore, the invention proposed herein does not consume any network bandwidth as for example the OAM-CC mechanism does.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of the preferred embodiments, as illustrated in the appended drawings, where:
  • FIG. 1 illustrates a datapath connecting two end-points over a switched network;
  • FIG. 1A shows the types of datapaths in a switched network;
  • FIG. 2 shows the datapath from an ingress port to an egress port at a node of a switched network according to an embodiment of the invention;
  • FIG. 3A illustrates a block diagram of the apparatus for detection of silent datapath failures according to the invention; and
  • FIG. 3B illustrates a block diagram of a further embodiment of the apparatus for detection of silent datapath failures according to the invention.
  • DETAILED DESCRIPTION
  • A silent failure is a disruption in a connection-oriented datapath that causes the datapath traffic to become unidirectional, or to partially or completely stop flowing, and that goes undetected except by the end-user.
  • The invention provides a means for actively monitoring a node of a communications network for silent datapath disruptions caused within the node. FIG. 1 illustrates a datapath 10 connecting two end-points over a switched network 1. In this example, a data aggregation unit 5 combines the traffic from a plurality of local users connected to node A and a data distribution unit 5′ distributes the traffic received at node B over network 1 between a plurality of local users. The invention enables the network provider to detect linecard and switching fabric failures for which traditional error detection mechanisms such as CRC and OAM CC checks are not feasible. In this example, the invention enables locating the failure at node C and furthermore, enables locating the pair of cards at this node causing the failure.
  • It is to be noted here that datapath 10 may be a bidirectional point-to-point connection as shown at 10′, or a unidirectional point-to-multipoint connection as shown at 10″, as shown in FIG. 1A.
  • FIG. 2 shows the datapath from an ingress port to an egress port at node C of network 1 (see FIG. 1). More precisely, this figure illustrates a linecard 20, 20′ and the switch fabric 15, which are the main elements along the datapath. The linecard is illustrated here as two distinct units 20, 20′, one comprising the ingress (input) logic and the other the egress (output) logic. Note that a datapath may use the ingress and egress logic of the same linecard, or of two distinct linecards.
  • Ingress logic 20 comprises in general terms a line interface 21, a traffic manager 22 with the associated memory 23, and a backplane interface 24. The line interface 21 on ingress logic 20 accommodates a plurality of ingress ports, such as ports Pi1 to Pin. Index ‘i’ indicates an ingress (input) port and “n” indicates the maximum number of ingress ports on the respective card.
  • Line interface 21 includes the line receivers for physically terminating the respective line. Interface 21 forwards the PDU's to traffic manager 22, illustrated in further detail in FIG. 2. Traffic manager 22 examines the PDU header, verifies the header integrity and provides the PDU to the backplane interface 24 and the switching fabric 15.
  • Let us assume that network 1 is an ATM network; ATM is a transmission protocol based upon asynchronous time division multiplexing using fixed length protocol data units called cells. These cells typically have a length of 53 bytes (octets), each cell containing 48 octets of user data (payload) and 5 octets of network information (header). The header of a cell contains address information which allows the network to route the cell over the network from the entry point to the exit, and also includes error correction information. It is however to be understood that the invention is applicable to other types of connection-oriented services (Ethernet, MPLS, FR).
  • Generically, traffic manager 22 is responsible for extracting OAM (operation, administration and maintenance) cells and executing the operations required by these cells. Furthermore, traffic manager 22 calculates the local routing/switching information (fabric header) for the input cells so that they are routed correctly by the switching fabric. The fabric routing information is obtained based on the origin and destination address in the cell header and established using look-up tables in memory 23 (preferably a random addressable memory). In the case of an ATM cell, this fabric header is preferably 7 bytes long.
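  • To make the cell geometry and the fabric-header step concrete, a small sketch follows. The look-up table standing in for memory 23 and the fabric-header bytes are invented for illustration; the table is keyed on VPI/VCI, ATM's standard address fields, and the egress-side stripping of the fabric header (described below) is included for symmetry:

    ATM_HEADER_OCTETS = 5      # network information (addressing, error control)
    ATM_PAYLOAD_OCTETS = 48    # user data
    ATM_CELL_OCTETS = ATM_HEADER_OCTETS + ATM_PAYLOAD_OCTETS  # 53 octets
    FABRIC_HEADER_OCTETS = 7   # internal routing tag, per the text above

    # Hypothetical stand-in for the look-up tables held in memory 23: maps the
    # address fields of the incoming cell header to a precomputed fabric header.
    fabric_lut: dict[tuple[int, int], bytes] = {
        (0, 32): b"\x01" * FABRIC_HEADER_OCTETS,   # (VPI, VCI) -> fabric header
    }

    def ingress_tag(cell: bytes, vpi: int, vci: int) -> bytes:
        """Ingress traffic manager 22: prepend the fabric routing header so the
        switching fabric routes the cell to the correct egress card."""
        assert len(cell) == ATM_CELL_OCTETS, "not a 53-octet ATM cell"
        return fabric_lut[(vpi, vci)] + cell

    def egress_strip(tagged: bytes) -> bytes:
        """Egress traffic manager 22': strip the internal fabric header again."""
        return tagged[FABRIC_HEADER_OCTETS:]

    cell = bytes(ATM_CELL_OCTETS)
    assert egress_strip(ingress_tag(cell, 0, 32)) == cell   # round trip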
  • Traffic manager 22 may also provide cell header error detection and correction via CRC checks. Most fault locations can be monitored using error detection (e.g. CRC); however, corruption of preliminary header bytes at the line ingress interface remains a risk in many current implementations. For example, Alcatel's ATM line cards actually use a number of different devices (such as ATMC and ATLAS) for this functionality, which can come from different vendors, such as Motorola or PMC Sierra. It is apparent that a fault at the memory 23 cannot be detected by calculating the CRC at manager 22, since unit 22 performs the CRC on the header bits prior to buffering the cells in memory 23. This may result in errors in routing the PDU along the correct datapath 10.
  • The ingress logic 20 also interfaces with the switching fabric 15 over a backplane interface 24, as shown by links from unit 20 to fabric 15 denoted with Bo1-Bom. A housekeeping processor 25 is used for the general tasks enabling proper operation of the interfaces 21, 24 and manager 22.
  • At the egress side of the node, the links from fabric 15 to egress logic 20′ are denoted with Bi′1-Bi′m. The traffic manager 22′ performs tasks similar to those performed by manager 22 on the PDU's received from the backplane interface 24′, namely local traffic routing, policing, OAM insertion, stripping of the internal fabric header, and addition of CRC bytes to the cell header to enable downstream header corruption detection. Traffic manager 22′ forwards the cells to the egress line interface 21′, which then routes the cells to a transmitter corresponding to an appropriate egress port Pe′1-Pe′k for conveying the traffic to the next node, where index ‘e’ indicates an egress (output) port and “k” indicates the maximum number of output ports on the respective card. It is to be noted that the number of input ports on a card may differ from the number of output ports, and that two cards may have different numbers of such ports.
  • In this example, datapath 10 uses a point-to-point (P2P) bidirectional connection, only the forward direction being illustrated for simplification. The datapath 10 at node C runs in this example from ingress port Pi1 on ingress logic 20 over link Bo2 at the exit of backplane interface 24, is switched by fabric 15 from Bo2 on link Bi′j, and then routed from backplane interface 24′ on egress port Pe′k. The term “backplane connection” is used herein for a trail from an exit pin on backplane interface 24, over fabric 15, to an input pin on backplane interface 24′.
  • According to a preferred embodiment of the invention, the traffic managers 22, 22′ assume that every cell received by the switching node must egress the switch in the case of a P2P connection 10′, or must be replicated in a case of a P2mP connection 10″. Ideally, any cell which is originated and/or terminated within the node itself should be excluded from the endpoint statistics count. This includes the OAM cells inserted by the ingress card into the fabric, and extracted by the egress card. This also includes any cells sourced/terminated by the node and used for real-time flow control.
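  • A sketch of the counting rule this implies, assuming each cell carries a flag telling the counter whether the node itself sourced or terminates it; the Cell record and its fields are hypothetical:

    from dataclasses import dataclass

    @dataclass
    class Cell:
        datapath_id: int
        node_internal: bool   # hypothetical flag: cell sourced/terminated by
                              # the node itself (OAM, real-time flow control)

    def count_endpoint_cells(cells: list[Cell], datapath_id: int) -> int:
        """Endpoint statistics for one datapath: transit cells only; cells the
        node itself originates or terminates are excluded from the count."""
        return sum(1 for c in cells
                   if c.datapath_id == datapath_id and not c.node_internal)

    cells = [Cell(10, False), Cell(10, True), Cell(11, False)]
    assert count_endpoint_cells(cells, 10) == 1   # node-internal cell excluded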
  • FIG. 3A illustrates a block diagram of the apparatus for detection of silent datapath failures according to the invention. As seen in FIG. 3A, ingress and egress traffic managers 22, 22′ collect the respective ingress and egress statistics at the connection level, such as on datapath 10, as shown by blocks 31 and 33. As discussed above, the statistics are maintained on a per datapath basis and preferably refer only to the cells received at the input of the node and respectively transmitted at an output of the node (excluding any cells sourced/terminated by the node).
  • The statistics may be collected in the respective memory 23, 23′ (see FIG. 2), or may use separate memory means. The statistics are collected over a preset period of time, denoted here with T, as shown by reset unit 34. If any egress cell count is zero over a full interval T, as determined by a decision unit 30, then the ingress interface for that datapath is determined, using e.g. the node's cross-connection information 32. The corresponding linecard for that ingress interface is contacted to determine the ingress cell count available at 33. It is to be noted that a variant where the comparison between the ingress and egress statistics is made continuously is also possible, in which case the decision unit 30 is not necessary.
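  • A sketch of this end-of-interval check, under stated assumptions: per-datapath egress counts are modelled as a plain dictionary, and the cross-connection lookup plus the query to the ingress linecard are collapsed into a single callback; all names are illustrative:

    from typing import Callable

    def end_of_interval_check(
        egress_counts: dict[int, int],
        fetch_ingress_count: Callable[[int], int],
    ) -> list[int]:
        """Run once per full statistics interval T (decision unit 30 plus the
        ingress/egress comparison).  egress_counts holds per-datapath egress
        cell counts for the interval; fetch_ingress_count stands in for looking
        up the mating ingress interface via the node's cross-connection
        information and querying that linecard for its ingress count."""
        alarmed = []
        for dp, egress in egress_counts.items():
            if egress != 0:
                continue                  # cells egressed; path not suspect
            if fetch_ingress_count(dp) > 0:
                alarmed.append(dp)        # cells entered, none left: disruption
        return alarmed

    # Toy usage: datapath 7 counted 1200 ingress cells but zero egress cells.
    ingress = {7: 1200, 8: 500}
    print(end_of_interval_check({7: 0, 8: 500}, ingress.__getitem__))  # [7]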
  • There are current implementations where ingress and egress statistics are collected over a rather large time interval T (15 minutes). The present invention may use this existing feature. Alternatively, the present invention may use shorter intervals, in which case the statistics do not need to rely on an entire such interval. Using shorter intervals would enable faster alarming of datapath disruption errors.
  • If the ingress cell count is non-zero for an egress count of zero, as determined by a comparator 40, the datapath is alarmed as shown at 50, since the complete lack of cells transmitted at the egress indicates a datapath disruption. In this document, the phrase “non-zero count of ingress cells” should be taken to mean “enough ingress cells to allow at least one to reach the egress side, given the maximum expected switch latency and the maximum expected traffic rate of the datapath”. If this threshold is too low, a datapath failure alarm could be erroneously raised when no such failure exists.
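  • One plausible way to derive that effective threshold from the two quantities named above, taking the worst case where every counted ingress cell could still be in flight inside the fabric; this formula is an illustration, not one given in the patent:

    import math

    def min_ingress_threshold(max_latency_s: float, max_rate_cells_s: float) -> int:
        """Smallest ingress count for which at least one cell must already have
        reached the egress side: anything beyond the cells that could still be
        in flight (max rate x max latency) should have egressed by now."""
        in_flight = math.ceil(max_rate_cells_s * max_latency_s)
        return in_flight + 1

    # E.g. 50 ms worst-case fabric latency at 10,000 cells/s: only ingress
    # counts above 501 with a zero egress count indicate a genuine disruption.
    print(min_ingress_threshold(0.050, 10_000))   # 501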
  • In a more sophisticated variant of the invention, shown in FIG. 3B, the comparator 40 uses a plurality of thresholds rather than just one “zero” threshold, in order to rule out cases of congestion, linecard resets, etc. The thresholds may be determined experimentally and may be provided for various states of operation of the respective node or datapath. The result of the comparison against the respective threshold is sent to the node controller, where a failure detection and diagnosis unit 40 establishes the cause of the discrepancy between the ingress and egress statistics. The datapath is alarmed as shown at 50 only after the valid causes of discrepancy have been eliminated. In addition, the decision unit 30 could employ an egress statistics threshold to decide whether or not the datapath is suspected of traffic failure, instead of checking for non-zero egress statistics.
  • For example, a discrepancy between mating ingress and egress cell counts of a datapath could be alarmed if congestion could be discounted as the cause (e.g. CBR or high-priority connections) and the discrepancy was substantial, the determination of which would depend on the connection's bandwidth. In that case, the egress cell count is not zero because the datapath disruption occurred partway through the interval. In other cases, a datapath alarm could be inhibited if a zero count of egress cells and a non-zero count of ingress cells are observed over a partial interval, and the shortened interval was due to a verifiable card reset or to creation of the connection. In some cases, another interval of statistics may be required before determining whether or not to alarm a datapath.
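  • One way such a multi-threshold comparison and diagnosis stage could be organized in code; the threshold values, flags, and verdict strings are invented for illustration and would in practice be tuned experimentally, as suggested above:

    def diagnose(ingress: int, egress: int, *, full_interval: bool,
                 card_reset: bool, new_connection: bool,
                 congestion_possible: bool,
                 substantial_fraction: float = 0.5) -> str:
        """Failure detection and diagnosis stage of FIG. 3B: classify an
        ingress/egress mismatch before raising an alarm at 50."""
        if egress == 0 and ingress > 0:
            if not full_interval and (card_reset or new_connection):
                # shortened interval has a verifiable cause: inhibit the alarm
                return "inhibit: card reset or connection creation"
            return "alarm: datapath disruption (nothing egressed)"
        shortfall = (ingress - egress) / ingress if ingress else 0.0
        if shortfall >= substantial_fraction and not congestion_possible:
            # e.g. CBR or high-priority connection, so congestion is ruled out;
            # the disruption likely occurred partway through the interval
            return "alarm: substantial mid-interval loss"
        return "no alarm: within allowable mismatch; collect another interval"

    print(diagnose(1200, 0, full_interval=True, card_reset=False,
                   new_connection=False, congestion_possible=False))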
  • Preferably, failure detection and diagnosis unit 40, which monitors the ingress/egress statistics, may be provided with additional means for detecting the type of datapath failure, as shown in dotted lines at 45. Unit 45 enables the operator to distinguish the failure type by providing specific messages for various types of silent datapath failures. For example, unit 45 enables the operator to recognize possible corruption of the ingress and/or egress statistics stored in memory 23, 23′. This occurrence, even if atypical, will result in a small number of cases where an alarm may be raised when no traffic loss is occurring, or may not be raised when real traffic loss is occurring. In this situation, unit 45 provides a specific alarm message to the operator indicating that the alarm was raised by a statistics corruption.
  • Still further, datapath failure type recognition unit 45 may use the failure data to trigger automatic recovery of the datapath, such as local connection repair, or connection re-route (if the connection is switched).
  • In the broadest sense, ingress and egress cell count statistics are collected periodically on connections provided by the node. When a datapath's mating pair of ingress and egress statistics are substantially mismatched in a way that cannot be explained by verifiable normal behavior of the node, or by traffic-affecting fault conditions for which alarms have already been raised, the datapath is alarmed at 50.

Claims (42)

1. In a switched communication network that enables establishing a datapath over a switching node of said network, an apparatus for silent datapath failure detection at said node comprising:
at an ingress port of said node associated with said datapath, an ingress statistics unit for collecting an ingress protocol data unit (PDU) count over a time interval;
at an egress port of said node associated with said datapath, an egress statistics unit for collecting an egress PDU count over said time interval;
means for comparing said ingress PDU count and said egress PDU count and generating an error signal in case of a mismatch between said ingress PDU count and said egress PDU count; and
means for alarming said mismatch and specifying a type of datapath failure associated with said error signal.
2. The apparatus of claim 1, wherein said means for comparing generates said error signal whenever said egress PDU count is zero and said ingress PDU count is non-zero.
3. The apparatus of claim 1, further comprising a decision unit for comparing said egress PDU count with a threshold, and, whenever said egress PDU count violates said threshold, identifying said ingress PDU count corresponding to said ingress port and triggering said means for comparing.
4. The apparatus of claim 3, wherein said decision unit uses a plurality of thresholds, each corresponding to a particular mode of operation of said node.
5. The apparatus of claim 3, wherein said decision unit uses an additional threshold for distinguishing congestion from a datapath failure.
6. The apparatus of claim 3, wherein said decision unit uses an additional threshold for distinguishing linecard resets from a datapath failure.
7. The apparatus of claim 1, wherein said means for comparing generates a specific error signal whenever said egress PDU count violates a specified threshold.
8. The apparatus of claim 7, wherein said alarm unit associates a specific datapath fault message to each said specific error signal, identifying the datapath fault that originated said specific error signal.
9. The apparatus of claim 1, further comprising a datapath failure type recognition unit for receiving said error signal and providing to said alarm unit additional data pertinent to a mismatch between said egress PDU count and said ingress PDU count.
10. The apparatus of claim 6, further comprising a datapath failure type recognition unit for receiving said specific error signal, correlating said ingress PDU count with said egress PDU count and providing said alarm unit with diagnostic data indicating the type of datapath fault that resulted in violation of said specified threshold.
11. The apparatus of claim 3, further comprising a datapath failure type recognition unit for providing said alarm unit with diagnostic data indicating the type of datapath fault that resulted in violation of said threshold.
12. At a switching node of a communication network for cross-connecting a datapath from an ingress port to an egress port based on routing data in a cross-connect table, a traffic manager comprising:
means for determining an egress count including all egress traffic PDU's in said datapath counted at said egress port over a period of time; and
means for comparing said egress count with an ingress count and generating an error signal in the case of a mismatch between said ingress count and said egress count,
wherein said ingress count includes all ingress traffic PDU's in said datapath counted at said ingress port over said period of time.
13. The traffic manager of claim 12, further comprising a decision unit for comparing said egress PDU count with a threshold, and, whenever said egress PDU count violates said threshold, identifying said ingress PDU count corresponding to said ingress port and triggering said means for comparing.
14. The traffic manager of claim 13, further comprising means for alarming said mismatch and associating a specific datapath fault message to said error signal for identifying a type of datapath fault that originated said error signal.
15. The traffic manager of claim 12, wherein said traffic manager further comprises means for resetting said ingress count and said egress count after said period of time.
16. The traffic manager of claim 13, wherein said decision unit uses a plurality of thresholds, each corresponding to a particular mode of operation of said node.
17. The traffic manager of claim 13, wherein said decision unit comprises means for triggering access to said cross-connect table for determining said ingress port associated to said egress port whenever said egress count violates said threshold.
18. The traffic manager of claim 12, wherein said ingress count is provided over a backplane interface whenever said ingress port and said egress port are on different linecards.
19. The apparatus of claim 14, wherein said means for comparing generates a specific error signal whenever said egress count violates a specified threshold.
20. The apparatus of claim 19, wherein said means for alarming associates a specific datapath fault message to each said specific error signal, identifying the datapath fault that originated said specific error signal.
21. At a switching node of a communication network for cross-connecting a datapath from an ingress port to an egress port of said node based on routing data in a cross-connect table, a traffic manager comprising:
means for determining an ingress count including all ingress traffic PDU's in said datapath counted at said ingress port over a period of time; and
means for comparing said ingress count with an egress count and generating an error signal in the case of a mismatch between said ingress count and said egress count,
wherein said egress count includes all egress traffic PDU's in said datapath counted at said egress port over said period of time.
22. The traffic manager of claim 21, further comprising means for alarming said mismatch and associating a specific datapath fault message to said error signal for identifying a type of datapath fault that originated said error signal.
23. The traffic manager of claim 12, wherein said egress count is provided over a backplane interface whenever said ingress port and said egress port are on different linecards.
24. The apparatus of claim 14, wherein said means for comparing generates a specific error signal whenever said egress count violates a specified threshold.
25. The apparatus of claim 24, wherein said means for alarming associates a specific datapath fault message to each said specific error signal, identifying the datapath fault that originated said specific error signal.
26. For a switched communication network that enables establishing a datapath over a switching node of said network, a method for silent datapath failure detection, comprising the steps of:
at an ingress port associated with said datapath, collecting an ingress PDU count over a time interval;
at an egress port associated with said datapath, collecting an egress PDU count over said time interval;
identifying said ingress port associated with said egress port whenever said egress PDU count violates a threshold; and
comparing said ingress PDU count with said egress PDU count and generating an error signal in case of a mismatch between said ingress PDU count and said egress PDU count.
27. The method of claim 26, further comprising resetting said ingress PDU count and said egress PDU count after said time interval.
28. The method of claim 26, further comprising alarming said particular mismatch and associating a specific datapath fault message to said error signal for identifying a type of datapath fault that originated said error signal.
29. The method of claim 26, wherein said PDU is one of an ATM cell, an Ethernet frame and a Frame Relay frame.
30. The method of claim 26, wherein said ingress PDU count and said egress PDU count are collected in a local memory on a linecard associated with said ingress port and said egress port, respectively.
31. The method of claim 26, wherein said step of comparing includes:
comparing said egress PDU count with a threshold,
whenever said egress PDU count violates said threshold, identifying said ingress PDU count corresponding to said ingress port; and
evaluating said ingress PDU count and said egress PDU count in the context of said error signal for identifying a type of datapath fault.
32. The method of claim 26, wherein said step of comparing comprises generating a specific error signal whenever said egress PDU count violates a specified threshold.
33. The method of claim 30, further comprising associating a datapath alarm message to each said specific error signal, for identifying a respective type of datapath fault.
34. The method of claim 26, further comprising automatically triggering recovery of said datapath in response to said error signal.
35. The method of claim 26, wherein said step of identifying comprises accessing a cross-connect table at said node for determining said ingress port corresponding to said egress port on said datapath.
36. The method of claim 26, wherein said threshold is established taking into account the bandwidth allocated to said datapath.
37. The method of claim 26, wherein said threshold is zero for alarming a datapath interruption.
38. The method of claim 26, wherein said step of identifying uses a plurality of thresholds, each corresponding to a type of allowable mismatch between said egress PDU count and said ingress PDU count.
39. The method of claim 38, wherein one of said thresholds is used for establishing if a datapath disruption occurred partway through said time interval.
40. The method of claim 38, wherein one of said thresholds is used for determining if said particular mismatch is due to one of a linecard reset and creation of a new datapath.
41. The method of claim 38, wherein each said threshold is associated with a specified time interval for measuring said egress PDU count.
42. The method of claim 38, wherein one of said thresholds is used for determining corruption of one of said ingress and egress PDU counts.
US10/834,129 2004-04-29 2004-04-29 Silent datapath failure detection Expired - Fee Related US7489642B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US10/834,129 US7489642B2 (en) 2004-04-29 2004-04-29 Silent datapath failure detection
EP05300327A EP1592171B1 (en) 2004-04-29 2005-04-27 Silent datapath failure detection
EP05300326A EP1655892A1 (en) 2004-04-29 2005-04-27 Silent datapath failure detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/834,129 US7489642B2 (en) 2004-04-29 2004-04-29 Silent datapath failure detection

Publications (2)

Publication Number Publication Date
US20050243731A1 true US20050243731A1 (en) 2005-11-03
US7489642B2 US7489642B2 (en) 2009-02-10

Family

ID=34942572

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/834,129 Expired - Fee Related US7489642B2 (en) 2004-04-29 2004-04-29 Silent datapath failure detection

Country Status (2)

Country Link
US (1) US7489642B2 (en)
EP (2) EP1592171B1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006003241A1 (en) * 2004-06-30 2006-01-12 Nokia Corporation Failure detection of path information corresponding to a transmission path
CN100395994C (en) * 2005-06-23 2008-06-18 华为技术有限公司 Channel failure handling method in ASON

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11127155A (en) * 1997-10-20 1999-05-11 Fujitsu Ltd Exchange
US6188674B1 (en) * 1998-02-17 2001-02-13 Xiaoqiang Chen Method and apparatus for packet loss measurement in packet networks

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5394408A (en) * 1992-02-10 1995-02-28 Nec Corporation Policing control apparatus
US6144636A (en) * 1996-12-06 2000-11-07 Hitachi, Ltd. Packet switch and congestion notification method
US6363056B1 (en) * 1998-07-15 2002-03-26 International Business Machines Corporation Low overhead continuous monitoring of network performance
US6763024B1 (en) * 1999-10-14 2004-07-13 Alcatel Canada Inc. Method and devices for cell loss detection in ATM telecommunication devices
US6831890B1 (en) * 2000-10-31 2004-12-14 Agilent Technologies, Inc. Measuring network performance parameters in data communication networks
US20040037277A1 (en) * 2002-06-04 2004-02-26 Mathews Gregory S Testing and error recovery across multiple switching fabrics

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7436774B2 (en) * 2004-05-27 2008-10-14 Alcatel Lucent Communication network connection rerouting methods and systems
US20050265231A1 (en) * 2004-05-27 2005-12-01 Gunther Brian W Communication network connection rerouting methods and systems
DE102005059021A1 (en) * 2005-12-08 2007-06-14 Volkswagen Ag Embedded systems operating method for use in motor vehicle, involves transmitting diagnosis signal to embedded system during variation of confidential value of another signal, where diagnosis signal includes confidential value information
DE102005059021B4 (en) * 2005-12-08 2016-06-30 Volkswagen Ag An embedded system and method for operating an embedded system with improved identification of erroneous signals being exchanged
WO2011020323A1 (en) * 2009-08-20 2011-02-24 中兴通讯股份有限公司 Idle exit path establishment method and device for an access service network
US9419883B2 (en) 2012-07-24 2016-08-16 Accedian Networks Inc. Automatic setup of reflector instances
US10110448B2 (en) 2012-07-24 2018-10-23 Accedian Networks Inc. Automatic setup of reflector instances
US9166900B2 (en) 2012-07-24 2015-10-20 Accedian Networks Inc. Automatic setup of reflector instances
US9306830B2 (en) * 2013-01-30 2016-04-05 Accedian Networks Inc. Layer-3 performance monitoring sectionalization
US9577913B2 (en) 2013-01-30 2017-02-21 Accedian Networks Inc. Layer-3 performance monitoring sectionalization
US20140211636A1 (en) * 2013-01-30 2014-07-31 Accedian Networks Inc. Layer-3 performance monitoring sectionalization
US10135713B2 (en) 2013-01-30 2018-11-20 Accedian Networks Inc. Layer-3 performance monitoring sectionalization
US11271776B2 (en) * 2019-07-23 2022-03-08 Vmware, Inc. Logical overlay network monitoring

Also Published As

Publication number Publication date
EP1592171B1 (en) 2012-08-29
US7489642B2 (en) 2009-02-10
EP1655892A1 (en) 2006-05-10
EP1592171A3 (en) 2006-02-15
EP1592171A2 (en) 2005-11-02

Similar Documents

Publication Publication Date Title
EP1592171B1 (en) Silent datapath failure detection
US8953456B2 (en) Ethernet OAM performance management
US8243592B2 (en) Methods and systems for automatically rerouting data in a data network
KR101360120B1 (en) Connectivity fault management (cfm) in networks with link aggregation group connections
US6654923B1 (en) ATM group protection switching method and apparatus
US20050099955A1 (en) Ethernet OAM fault isolation
US20050099951A1 (en) Ethernet OAM fault detection and verification
US20050099954A1 (en) Ethernet OAM network topography discovery
US7133367B2 (en) Embedded cell loopback method and system for testing in ATM networks
US20050099949A1 (en) Ethernet OAM domains and ethernet OAM frame format
US10015066B2 (en) Propagation of frame loss information by receiver to sender in an ethernet network
US7839795B2 (en) Carrier Ethernet with fault notification
US20060092847A1 (en) Method and apparatus for providing availability metrics for measurement and management of Ethernet services
US8295175B2 (en) Service metrics for managing services transported over circuit-oriented and connectionless networks
CA2519751A1 (en) Methods and apparatus for non-intrusive measurement of delay variation of data traffic on communication networks
US9203719B2 (en) Communicating alarms between devices of a network
Farkouh Managing ATM-based broadband networks
US8274904B1 (en) Method and apparatus for providing signature based predictive maintenance in communication networks
US6898177B1 (en) ATM protection switching method and apparatus
US7860023B2 (en) Layer 2 network rule-based non-intrusive testing verification methodology
US9306822B2 (en) Method and system for silent trunk failure detection
US7046693B1 (en) Method and system for determining availability in networks
JP4477512B2 (en) Physical line monitoring method for packet communication
Gruber Performance and fault management functions for the maintenance of SONET/SDH and ATM transport networks
KR20040063494A (en) Device for diagnosing stability of link using a feature of traffic in internet protocol network and method therof

Legal Events

Date Code Title Description
AS Assignment

Owner name: ALCATEL, FRANCE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SMITH, DESMOND GLENN;SELLARS, TERRENCE VINCENT;SHORTT, STEPHEN MICHAEL;AND OTHERS;REEL/FRAME:015277/0094

Effective date: 20040427

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: ALCATEL LUCENT, FRANCE

Free format text: CHANGE OF NAME;ASSIGNOR:ALCATEL;REEL/FRAME:021990/0751

Effective date: 20061130

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: OMEGA CREDIT OPPORTUNITIES MASTER FUND, LP, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNOR:WSOU INVESTMENTS, LLC;REEL/FRAME:043966/0574

Effective date: 20170822

AS Assignment

Owner name: WSOU INVESTMENTS, LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ALCATEL LUCENT;REEL/FRAME:044000/0053

Effective date: 20170722

AS Assignment

Owner name: BP FUNDING TRUST, SERIES SPL-VI, NEW YORK

Free format text: SECURITY INTEREST;ASSIGNOR:WSOU INVESTMENTS, LLC;REEL/FRAME:049235/0068

Effective date: 20190516

AS Assignment

Owner name: WSOU INVESTMENTS, LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OCO OPPORTUNITIES MASTER FUND, L.P. (F/K/A OMEGA CREDIT OPPORTUNITIES MASTER FUND LP);REEL/FRAME:049246/0405

Effective date: 20190516

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20210210

AS Assignment

Owner name: OT WSOU TERRIER HOLDINGS, LLC, CALIFORNIA

Free format text: SECURITY INTEREST;ASSIGNOR:WSOU INVESTMENTS, LLC;REEL/FRAME:056990/0081

Effective date: 20210528

AS Assignment

Owner name: WSOU INVESTMENTS, LLC, CALIFORNIA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:TERRIER SSC, LLC;REEL/FRAME:056526/0093

Effective date: 20210528