WO2001097469A2 - Packet transmission scheduler - Google Patents

Packet transmission scheduler

Info

Publication number
WO2001097469A2
WO2001097469A2 (PCT/NO2001/000247)
Authority
WO
WIPO (PCT)
Prior art keywords
packet
packets
list
lists
receiver
Prior art date
Application number
PCT/NO2001/000247
Other languages
French (fr)
Other versions
WO2001097469A3 (en)
Inventor
Hans Rygh
Original Assignee
Sun Microsystems, Inc.
Priority date
Filing date
Publication date
Application filed by Sun Microsystems, Inc.
Priority to AU2001282688A1 (en)
Publication of WO2001097469A2 (en)
Publication of WO2001097469A3 (en)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 47/00 Traffic control in data switching networks
    • H04L 47/10 Flow control; Congestion control
    • H04L 47/30 Flow control; Congestion control in combination with information about buffer occupancy at either end or at transit nodes

Abstract

The present invention relates to a method and an apparatus for transferring information through a communication medium. More particularly, the present invention relates to a multithreading, non-blocking packet transmission scheduler and a method of operation. The packet transmission scheduler described is based on a set of atomic operations and a number of linked lists. The scheduler uses the lists to keep track of previous and current state, like what packets have been sent and what have been rejected, so that fairness combined with optimal exploitation of the connection can be delivered.

Description

PACKET TRANSMISSION SCHEDULER
Field of the invention
The present invention relates to a method and an apparatus for transferring information through a communication medium. More particularly, the present invention relates to a packet transmission scheduler and a method of operation.
Background of the Invention
In a network where the connection resources are shared between streams of packets to and from several nodes, hotspots and an initially local lack of forward progress will very often bring performance degradation and problems to the whole system. A way of preventing this is by reducing dependencies to a minimum. This can be achieved by partitioning the traffic into threads, groups of packets that can be handled independently. One potential danger when removing dependencies is that some packets are treated unfairly and lose forward progress as a result. It is the responsibility of the packet transmission scheduler to provide fairness between packets in a thread, and between the different threads. Fairness between packets in a thread can be guaranteed when a strict ordering scheme is applied, i.e. when the first (oldest) packet in a thread is always accepted by the receiver on the link before any newer packets in the same thread.
When using a feedback based link protocol, the receiver decides whether it wishes to accept or reject a packet, and returns this decision in a message to the transmitter. This is illustrated in Figure 1, showing a point-to-point link with transmitter and receiver, the transmitter and receiver being in separate nodes. Both the transmitter and the receiver comprise buffer pools with entries holding packets. A buffer pool entry in the transmitter in Figure 1 cannot be reallocated until positive (packet accepted) feedback has arrived. On a link with a certain round trip delay, it is desirable for the transmitter to pipeline packets on the link instead of waiting for the feedback to arrive before the next packet is sent. This solution will reduce latency when the receiver is capable of processing, or passing on, packets at least at the same speed as they are received.
Sometimes the receiver does not wish to accept a packet, and a negative feedback is returned. If a strictly ordered scheme is applied, it is then a waste of bandwidth to send other packets in the same thread, since they will be rejected anyway.
The solution to the problems presented above is a multithreading, non-blocking packet transmission scheduler that may be used in networks based on point-to-point links. A thread is here defined to be an ordered sequence of packets with a set of common features. Criteria for belonging to a thread could be the destination address of the packet, whether the packet is a request or a response, whether the packet is for maintenance/diagnostics or for ordinary traffic, etc. Multithreading means here the ability to handle several threads at the same time, and non-blocking means that the lack of progress in one thread will not affect the progress of other threads.
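To illustrate the grouping criteria just listed, the short Python sketch below derives a thread key from packet properties. It is a minimal illustration only; the Packet fields and the thread_key helper are hypothetical names chosen for the example, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    dest: int             # destination address of the packet
    is_request: bool      # request (True) or response (False)
    is_maintenance: bool  # maintenance/diagnostics (True) or ordinary traffic (False)

def thread_key(p: Packet) -> tuple:
    """Packets that share this key belong to the same thread and stay strictly ordered."""
    return (p.dest, p.is_request, p.is_maintenance)
```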
The packet transmission scheduler described is especially designed for use in a system with the N x N Crossbar Switch described in US 09/520,066, and with the Virtual Channel Flow Control described in US 09/520,063, both filed March 7, 2000, both applications being assigned to the assignee of the present application and hereby incorporated in their entirety by reference. However, this should not be interpreted as limiting the invention, as the scheduler described here is also applicable in other communication systems.
For the further explanation of the present invention, the following definitions are needed:
- When a packet is sent, but its feedback is not yet returned, the packet is here said to be in progress.
- Packets that are not sent until the first-in-thread packet is accepted are here said to be on hold.
Summary of the Invention
In accordance with a first aspect the invention provides a method for scheduling packets in a communication network, the network including at least a transmitter and a receiver, the transmitter comprising a buffer pool for storing packets to be transmitted, the method comprising:
- grouping the buffered packets into independent threads, and
- applying a scheduling algorithm selecting the next packet to be transmitted.
In a preferred embodiment of the invention the packets are grouped on the basis of destination address, size or function of the packets. A number of linked lists are used for managing the packets to be transmitted.
The linked lists are controlled by functions applied whenever: i) a packet is stored in the buffer pool, ii) a new packet needs to be scheduled for transmission, and iii) feedback from a transmitted packet is returned to the transmitter.
The algorithm may include a set of atomic operations, executed one at a time.
In accordance with a second aspect the invention provides a packet transmission scheduler for scheduling packets in a communication network, the packets being grouped into independent threads, the scheduler comprising: a buffer pool with a number of entries for storing the packets to be transmitted, and a linkable packet tag element associated with each entry in the buffer pool, the packet tag element describing the state of the buffer entry, wherein the packet tag elements form a number of linked lists, the lists keeping track of the packet history, providing fairness and optimal exploitation of the connection resources.
In a preferred embodiment the lists comprise: a global list holding the packet tag elements associated with threads currently being accepted by the receiver, a global list holding the packet tag elements associated with packets that have been rejected by the receiver, a list for each thread holding the packet tag elements associated with packets currently being in progress, and a list for each thread holding the packet tag elements associated with packets currently being on hold.
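As a concrete, non-authoritative illustration of such a packet tag element, the Python sketch below models the state information discussed in this document (thread membership, first/last-in-thread markers, and whether a transmission has already been attempted). The field names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PacketTag:
    """Linkable tag describing the state of one buffer-pool entry (illustrative only)."""
    buffer_index: int                   # which buffer-pool entry this tag describes
    thread: tuple                       # the thread the associated packet belongs to
    first_in_thread: bool = False       # oldest packet of its thread in the pool?
    last_in_thread: bool = False        # newest packet of its thread in the pool?
    tried_before: bool = False          # has transmission already been attempted?
    link: Optional["PacketTag"] = None  # hardware would link tags directly; the later
                                        # sketches use Python deques for the lists instead
```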
The packet transmission scheduler described is based on a set of operations and a number of linked lists. The operations are atomic, i.e. they cannot be divided into sub-operations, and they are executed one at a time. Each entry in the buffer pool has an associated linkable packet tag element describing the state of the corresponding buffer entry. Information for each entry, such as what thread the associated packet belongs to, whether the packet is first or last in its thread, whether transmission of the packet has been attempted before, etc., is stored therein. The scheduler uses the lists to keep track of previous and current state, such as what packets have been sent and what have been rejected, so that fairness combined with optimal exploitation of the connection resources can be delivered.
Brief description of the drawings
The above and further advantages may be more fully understood by referring to the following description and accompanying drawing of which:
Figure 1 is a view showing a point-to-point link with a transmitter and a receiver.
Detailed description
The packet transmission scheduler in the present invention uses three atomic operations, Store, Next and Update, for handling the transmission of packets. These atomic operations have the following definitions:
• The store operation is invoked when a packet is stored in the buffer pool of the transmitter.
• The next operation is invoked when the transmitter is ready to send a packet.
• The update operation is invoked when the transmitter receives feedback from the receiver saying whether the packet was accepted or not.
Also, a number of, possibly empty, lists of packet tag elements are used by the scheduler to keep track of the history. The lists are defined as follows:
• The fresh list: A global list holding the packet tag elements associated with threads currently being accepted by the receiver.
• The old list: A global list holding the packet tag elements associated with packets that have been rejected by the receiver.
• The inprog lists: A list for each thread holding the packet tag elements associated with packets currently being in progress.
• The onhold lists: A list for each thread holding the packet tag elements associated with packets currently being on hold.
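Before walking through the operations, the sketch below keeps the four kinds of lists in a small Scheduler class. Deques and per-thread dictionaries stand in for the linked lists of tag elements; apart from the list names (fresh, old, inprog, onhold), every name and signature is an assumption made for the example. The three operations are filled in after the paragraphs that describe them.

```python
from collections import defaultdict, deque

class Scheduler:
    """Illustrative list layout; deques stand in for the hardware linked lists."""

    def __init__(self):
        self.fresh = deque()              # global: tags of threads currently being accepted
        self.old = deque()                # global: tags of packets rejected by the receiver
        self.inprog = defaultdict(deque)  # per thread: tags of packets currently in progress
        self.onhold = defaultdict(deque)  # per thread: tags of packets currently on hold
        self._take_old = False            # lets next() alternate between fresh and old

    def store(self, tag): ...             # sketched after the store paragraph
    def next(self): ...                   # sketched after the next paragraph
    def update(self, tag, accepted): ...  # sketched after the update paragraphs
```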
The store operation will typically append an element to the tail of the fresh list, unless the stored packet is in a thread that is currently being rejected by the receiver. In that case the element will be appended to the tail of its onhold list.
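A possible reading of the store operation, continuing the Scheduler sketch above and shown as a stand-alone function for brevity. How the transmitter knows that a thread is "currently being rejected" is not spelled out here, so the test below, based on the thread already having elements on hold or on the old list, is an assumption.

```python
def store(self, tag):
    """Invoked when a packet is stored in the transmitter's buffer pool."""
    # Assumption: a thread counts as "currently being rejected" if its oldest packet
    # sits on the old list or the thread already has packets on hold.
    thread_rejected = (
        len(self.onhold[tag.thread]) > 0
        or any(t.thread == tag.thread for t in self.old)
    )
    if thread_rejected:
        self.onhold[tag.thread].append(tag)  # hold it back behind the rejected packet
    else:
        self.fresh.append(tag)               # normal case: append to the tail of the fresh list
```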
The next operation alternates between taking an element from the head of the fresh list and from the head of the old list, and appends it to the inprog list for that element's thread. If one of the lists is empty, an element is taken from the non-empty list every time. If both lists are empty, there are no packets in the buffer pool, and the operation will not be invoked.
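Continuing the same sketch, the next operation alternates between the two global lists. The alternation flag and the error raised on an empty pool are illustrative choices, not taken from the patent.

```python
def next(self):
    """Invoked when the transmitter is ready to send; returns the tag of the packet to send."""
    if self.fresh and self.old:
        source = self.old if self._take_old else self.fresh  # every other time: fresh, old, ...
        self._take_old = not self._take_old
    elif self.fresh or self.old:
        source = self.fresh or self.old                      # only one list is non-empty
    else:
        raise RuntimeError("next() must not be invoked when the buffer pool is empty")
    tag = source.popleft()                                    # take from the head of the list
    self.inprog[tag.thread].append(tag)                       # the packet is now in progress
    return tag
```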
The update operation is by far the most complex. If a transmission was successful, i.e. the receiver has accepted a packet, the associated element is removed from its inprog list. If there are packets in the same thread that are on hold, their associated elements are taken from their onhold list and appended to the tail of the fresh list.
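The accept branch of the update operation, again continuing the sketch; _update_rejected is a hypothetical helper whose body follows the two rejection cases described next.

```python
def update(self, tag, accepted):
    """Invoked when feedback for a transmitted packet returns from the receiver."""
    if accepted:
        self.inprog[tag.thread].remove(tag)  # transmission succeeded; drop from inprog
        # Packets of the same thread that were on hold may now move forward again.
        while self.onhold[tag.thread]:
            self.fresh.append(self.onhold[tag.thread].popleft())
        return
    self._update_rejected(tag)               # rejection handling, sketched below
```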
If, on the other hand, the transmission was not successful, the rejected packet is either the oldest packet in its thread in the buffer pool, or it is placed after another already rejected packet in sequence.
In the first case, i.e. the rejected packet is the oldest packet in its thread in the buffer pool, the rejected packet's associated element is taken from its inprog list and added to the tail of the old list. Also, any elements of its inprog list will be appended to the tail of its onhold list. Finally, any element in this thread in the fresh list will be extracted, linked together in ordered sequence and appended to the tail of its onhold list.
In the second case, i.e. the rejected packet is not the oldest packet in its thread in the buffer pool, the packet must have been in progress when an update operation was performed earlier on a rejected packet in the same thread.
The now rejected packet's associated element has then already been moved to its onhold list, so no list handling needs to be done in this case.
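Finally, a sketch of the rejection handling covering both cases above. Distinguishing the cases by whether the tag is still on its inprog list is an assumption that matches the behaviour described, not wording taken from the patent.

```python
def _update_rejected(self, tag):
    """Handles negative feedback for a transmitted packet (illustrative only)."""
    if tag not in self.inprog[tag.thread]:
        # Second case: an earlier rejection in the same thread has already moved this
        # element to the onhold list, so no list handling is needed.
        return
    # First case: the rejected packet is the oldest of its thread in the buffer pool.
    self.inprog[tag.thread].remove(tag)
    self.old.append(tag)                      # retry it later via the old list
    # Any younger in-progress packets of the thread are put on hold ...
    while self.inprog[tag.thread]:
        self.onhold[tag.thread].append(self.inprog[tag.thread].popleft())
    # ... and so are any packets of the thread still waiting on the fresh list,
    # extracted in their original order.
    for t in [t for t in self.fresh if t.thread == tag.thread]:
        self.fresh.remove(t)
        self.onhold[tag.thread].append(t)
```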
Having described preferred embodiments of the invention it will be apparent to those skilled in the art that other embodiments incorporating the concepts may be used. These and other examples of the invention illustrated above are intended by way of example only and the actual scope of the invention is to be determined from the following claims.

Claims

C L A I M S
1. A method for scheduling packets in a communication network, the network including at least a transmitter and a receiver, the transmitter comprising a buffer pool for storing packets to be transmitted, the method comprising:
- grouping the buffered packets into independent threads, and
- applying a scheduling algorithm selecting the next packet to be transmitted.
2. The method according to claim 1, comprising grouping the buffered packets on the basis of destination address, size or function of the packets.
3. The method according to claim 1, comprising managing the buffered packets by using a number of linked lists.
4. The method according to claim 3, comprising controlling the linked lists by functions applied whenever: i) a packet is stored in the buffer pool, ii) a new packet needs to be scheduled for transmission, and iii) feedback from a transmitted packet is returned to the transmitter.
5. The method according to claim 1, wherein the scheduling algorithm comprises a set of atomic operations executed one at a time.
6. A packet transmission scheduler for scheduling packets in a communication network, the packets being grouped into independent threads, the scheduler comprising,
- a buffer pool with a number of entries for storing the packets to be transmitted,
- a linkable packet tag element associated with each entry in the buffer pool, the packet tag element describing the state of the buffer entry, wherein
- the packet tag elements form a number of linked lists, the lists keeping track of the packet history, providing fairness and optimal exploitation of the connection resources.
7. The scheduler according to claim 6, wherein the lists comprise:
- a global list holding the packet tag elements associated with threads currently being accepted by the receiver,
- a global list holding the packet tag elements associated with packets that have been rejected by the receiver,
- a list for each thread holding the packet tag elements associated with packets currently being in progress, and
- a list for each thread holding the packet tag elements associated with packets currently being on hold.
PCT/NO2001/000247 2000-06-14 2001-06-12 Packet transmission scheduler WO2001097469A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001282688A AU2001282688A1 (en) 2000-06-14 2001-06-12 Packet transmission scheduler

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US59345000A 2000-06-14 2000-06-14
US09/593,450 2000-06-14

Publications (2)

Publication Number Publication Date
WO2001097469A2 true WO2001097469A2 (en) 2001-12-20
WO2001097469A3 WO2001097469A3 (en) 2002-05-02

Family

ID=24374759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NO2001/000247 WO2001097469A2 (en) 2000-06-14 2001-06-12 Packet transmission scheduler

Country Status (2)

Country Link
AU (1) AU2001282688A1 (en)
WO (1) WO2001097469A2 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5996019A (en) * 1995-07-19 1999-11-30 Fujitsu Network Communications, Inc. Network link access scheduling using a plurality of prioritized lists containing queue identifiers
US5923656A (en) * 1996-10-22 1999-07-13 Board Of Trustees Of The University Of Illinois Scalable broad band input-queued ATM switch including weight driven cell scheduler
WO2000007126A1 (en) * 1998-07-30 2000-02-10 Teledyne Technologies Incorporated Aircraft flight data acquisition and transmission system
WO2000028701A1 (en) * 1998-11-09 2000-05-18 Cabletron Systems, Inc. Method and apparatus for fair and efficient scheduling of variable size data packets in input buffered switch
EP1009189A2 (en) * 1998-12-08 2000-06-14 Nec Corporation RRGS-round-robin greedy scheduling for input/output buffered terabit switches

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOGENENI K ET AL: "Low-complexity multiple access protocols for wavelength-division multiplexed photonic networks" IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS. PROTOCOLS FOR GIGABIT NETWORKS, vol. 12, no. 2, February 1993 (1993-02), pages 1-35, XP002902333 *
DUAN H ET AL: "A high-performance OC-12/OC-48 queue design prototype for input-buffered ATM switches. In: INFOCOM'97. Sixteenth annual joint conference of the IEEE computer and communications societies. Driving the information revolution " PROCEEDINGS IEEE, vol. 1, 7 - 11 April 1997, pages 20-28, XP010252017 Kobe, Japan *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG155038A1 (en) * 2001-09-28 2009-09-30 Consentry Networks Inc A multi-threaded packet processing engine for stateful packet processing

Also Published As

Publication number Publication date
WO2001097469A3 (en) 2002-05-02
AU2001282688A1 (en) 2001-12-24

Similar Documents

Publication Publication Date Title
EP1137225B1 (en) A switch and a switching method
US7558269B2 (en) Method for transmitting high-priority packets in an IP transmission network
US8542585B2 (en) Method and system for transmit scheduling for multi-layer network interface controller (NIC) operation
US5732087A (en) ATM local area network switch with dual queues
EP2050199B1 (en) Expedited communication traffic handling apparatus and methods
CN100401791C (en) Buffer management for supporting QoS ensuring and data stream control in data exchange
AU746246B2 (en) Method and apparatus for supplying requests to a scheduler in an input-buffered multiport switch
US7397809B2 (en) Scheduling methods for combined unicast and multicast queuing
US20090285231A1 (en) Priority scheduling using per-priority memory structures
US6574232B1 (en) Crossbar switch utilizing broadcast buffer and associated broadcast buffer management unit
JP4105955B2 (en) Distributed shared memory packet switch
US8199764B2 (en) Scalable approach to large scale queuing through dynamic resource allocation
US20050190779A1 (en) Scalable approach to large scale queuing through dynamic resource allocation
US8107372B1 (en) Collision compensation in a scheduling system
WO2001097469A2 (en) Packet transmission scheduler
JP2005245015A (en) Packet transfer device
JP2018505591A (en) System and method for supporting an efficient virtual output queue (VOQ) packet flushing scheme in a networking device
JP2002033749A (en) Buffer unit and switching unit
EP1797682B1 (en) Quality of service (qos) class reordering
CN104954284A (en) Probabilistic-routing-oriented DTN (delay-tolerant network) congestion avoiding method
JP2007184941A (en) Instant service method for data packet scheduling in deficit round robin manner
CA2358301A1 (en) Data traffic manager
JP2001244981A (en) Queue controller
EP1665663B1 (en) A scalable approach to large scale queuing through dynamic resource allocation
WO2009156409A1 (en) Method and device for processing data in an optical network and communication system comprising such device

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP