PACKET TRANSMISSION SCHEDULER
Field of the Invention
The present invention relates to a method and an apparatus for transferring information through a communication medium. More particularly, the present invention relates to a packet transmission scheduler and a method of operation.
Background of the Invention
In a network where the connection resources are shared between streams of packets to and from several nodes, hotspots and initially local loss of forward progress will very often degrade performance and cause problems for the whole system. One way of preventing this is to reduce dependencies to a minimum. This can be achieved by partitioning the traffic into threads: groups of packets that can be handled independently. One potential danger when removing dependencies is that some packets are treated unfairly and lose forward progress as a result. It is the responsibility of the packet transmission scheduler to provide fairness between packets in a thread, and between the different threads. Fairness between packets in a thread can be guaranteed when a strict ordering scheme is applied, i.e. when the first (oldest) packet in a thread is always accepted by the receiver on the link before any newer packets in the same thread.

When using a feedback-based link protocol, the receiver decides whether it wishes to accept or reject a packet, and returns this decision in a message to the transmitter. This is illustrated in Figure 1, showing a point-to-point link with a transmitter and a receiver, the transmitter and receiver being in separate nodes. Both the transmitter and the receiver comprise buffer pools with entries holding packets. A buffer entry in the transmitter of Figure 1 cannot be reallocated until a positive (packet accepted) feedback has arrived. On a link with a certain round-trip delay, it is desirable for the transmitter to pipeline packets on the link instead of waiting for the feedback to arrive before the next packet is sent. This solution will reduce latency when the receiver is capable of processing, or passing on, packets at least as fast as they are received.
Sometimes the receiver does not wish to accept a packet, and a negative feedback is returned. If a strictly ordered scheme is applied, it is then a waste of bandwidth to send other packets in the same thread, since they will be rejected anyway.
The solution to the problems presented above is a multithreading, non-blocking packet transmission scheduler that may be used in networks based on point-to-point links. A thread is here defined to be an ordered sequence of packets with a set of common features. Criteria for belonging to a thread could be the destination address of the packet, whether the packet is a request or a response, whether the packet is for maintenance/diagnostics or for ordinary traffic, etc. Multithreading here means the ability to handle several threads at the same time, and non-blocking means that a lack of progress in one thread will not affect the progress of other threads.
The packet transmission scheduler described is especially designed for use in a system with the N x N Crossbar Switch described in US 09/520,066, and with the Virtual Channel Flow Control as described in US 09/520,063, both filed March 7, 2000, both applications being assigned to the assignee of the present application and hereby incorporated by reference in their entirety. However, this should not be interpreted as limiting the invention, as the scheduler described here is also applicable in other communication systems.
For the further explanation of the present invention, the following definitions are needed:
- When a packet is sent, but its feedback is not yet returned, the packet is here said to be in progress.
- Packets that are not sent until the first-in-thread packet is accepted are here said to be on hold.
Summary of the Invention
In accordance with a first aspect the invention provides a method for scheduling packets in a communication network, the network including at least a transmitter and a receiver, the transmitter comprising a buffer pool for storing packets to be transmitted, the method comprising:
- grouping the buffered packets into independent threads, and
- applying a scheduling algorithm selecting the next packet to be transmitted.
In a preferred embodiment of the invention the packets are grouped on the basis of destination address, size or function of the packets. A number of linked
lists are used for managing the packets to be transmitted.
The linked lists are controlled by functions applied whenever: i) a packet is stored in the buffer pool, ii) a new packet needs to be scheduled for transmission, and iii) feedback from a transmitted packet is returned to the transmitter.
The algorithm may include a set of atomic operations, executed one at a time.
In accordance with a second aspect the invention provides a packet transmission scheduler for scheduling packets in a communication network, the packets being grouped into independent threads, the scheduler comprising: a buffer pool with a number of entries for storing the packets to be transmitted, and a linkable packet tag element associated with each entry in the buffer pool, the packet tag element describing the state of the buffer entry, wherein the packet tag elements form a number of linked lists, the lists keeping the packet history and providing fairness and optimal exploitation of the connection resources.
In a preferred embodiment the lists comprise: a global list holding the packet tag elements associated with threads currently being accepted by the receiver, a global list holding the packet tag elements associated with packets that have been rejected by the receiver, a list for each thread holding the packet tag elements associated with packets currently being in progress, and a list for each thread holding the packet tag elements associated with packets currently being on hold.
The packet transmission scheduler described is based on a set of operations and a number of linked lists. The operations are atomic, i.e. they cannot be divided into sub-operations, and they are executed one at a time. Each entry in the buffer pool has an associated linkable packet tag element describing the state of the corresponding buffer entry. Information for each entry, such as which thread the associated packet belongs to, whether the packet is first or last in its thread, whether transmission of the packet has been attempted before, etc., is stored therein. The scheduler uses the lists to keep track of previous and current state, such as which packets have been sent and which have been rejected, so that fairness combined with optimal exploitation of the connection resources can be delivered.
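By way of illustration only, such a packet tag element may be sketched as follows. The language (Python) and all field names below are assumptions made for the sketch and are not part of the invention:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch of a linkable packet tag element; the field names
# are assumptions chosen to mirror the information described in the text.
@dataclass
class PacketTag:
    entry: int                          # index of the associated buffer pool entry
    thread_id: int                      # thread the associated packet belongs to
    first_in_thread: bool = False       # whether the packet is first in its thread
    last_in_thread: bool = False        # whether the packet is last in its thread
    retried: bool = False               # whether transmission has been attempted before
    next: Optional["PacketTag"] = None  # link to the next element in its current list
```

Because each element carries a single link, the same element can be threaded into any one of the scheduler's lists at a time, which matches the list-handling operations described below.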
Brief description of the drawings
The above and further advantages may be more fully understood by referring to the following description and accompanying drawing of which:
Figure 1 is a view showing a point-to-point link with a transmitter and a receiver.
Detailed description
The packet transmission scheduler in the present invention uses three atomic operations, Store, Next and Update, for handling the transmission of packets. These atomic operations have the following definitions:
• The store operation is invoked when a packet is stored in the buffer pool of the transmitter.
• The next operation is invoked when the transmitter is ready to send a packet.
• The update operation is invoked when the transmitter receives feedback from the receiver saying whether the packet was accepted or not.
Also, a number of possibly empty lists of packet associated elements are used by the scheduler to keep track of the history. The lists are defined as follows:
• The fresh list: A global list holding the packet associated elements associated with threads currently being accepted by the receiver.
• The old list: A global list holding the packet associated elements associated with packets that have been rejected by the receiver.
• The inprog lists: A list for each thread holding the packet associated elements associated with packets currently being in progress.
• The onhold lists: A list for each thread holding the packet associated elements associated with packets currently being on hold.
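By way of illustration, the four kinds of lists may be sketched as a state container as follows; the Python representation (deques standing in for the linked lists of packet tag elements) is an assumption made for the sketch:

```python
from collections import deque, defaultdict

# Illustrative state container for the scheduler's lists. Deques stand in
# for the linked lists of packet tag elements; attribute names mirror the
# list names defined in the text.
class SchedulerLists:
    def __init__(self):
        self.fresh = deque()              # global: elements of currently accepted threads
        self.old = deque()                # global: elements of rejected packets
        self.inprog = defaultdict(deque)  # per thread: elements of packets in progress
        self.onhold = defaultdict(deque)  # per thread: elements of packets on hold
```

The fresh and old lists are global, while inprog and onhold are keyed per thread, which is why a mapping from thread to list is used for the latter two.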
The store operation will typically append an element to the tail of the fresh list, unless the stored packet is in a thread that is currently being rejected by the receiver. In that case the element will be appended to the tail of its onhold list.
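The store operation described above may be sketched as follows; the rejected_threads set is an assumed bookkeeping detail for tracking which threads are currently being rejected, and is not taken from the text:

```python
from collections import deque, defaultdict

# Hedged sketch of the store operation.
class StoreExample:
    def __init__(self):
        self.fresh = deque()              # global fresh list
        self.onhold = defaultdict(deque)  # per-thread onhold lists
        self.rejected_threads = set()     # threads currently being rejected (assumed detail)

    def store(self, tag, thread_id):
        # Append the element to the tail of the fresh list, unless the
        # packet's thread is currently being rejected by the receiver;
        # in that case it is appended to the tail of its onhold list.
        if thread_id in self.rejected_threads:
            self.onhold[thread_id].append(tag)
        else:
            self.fresh.append(tag)
```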
The next operation alternates between taking an element from the head of the fresh list and from the head of the old list, and appends it to the inprog list for this element's thread. If one of the lists is empty, an element will be taken from the non-empty list every time. If both lists are empty, there are no packets in the buffer pool, and the operation will not be invoked.
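The alternating behaviour of the next operation may be sketched as follows; the take_old flag implementing "every other time" is an assumed detail of the sketch:

```python
from collections import deque, defaultdict

# Hedged sketch of the next operation.
class NextExample:
    def __init__(self):
        self.fresh = deque()              # (tag, thread_id) pairs
        self.old = deque()                # (tag, thread_id) pairs
        self.inprog = defaultdict(deque)  # per-thread inprog lists
        self.take_old = False             # alternates between the two source lists

    def next(self):
        # Alternate between the heads of the fresh and old lists; if one
        # list is empty, take from the other. The operation is not invoked
        # when both lists are empty.
        if (self.take_old and self.old) or not self.fresh:
            src = self.old
        else:
            src = self.fresh
        self.take_old = not self.take_old
        tag, thread_id = src.popleft()
        self.inprog[thread_id].append(tag)
        return tag
```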
The update operation is by far the most complex. If a transmission was successful, i.e. the receiver has accepted a packet, the associated element is removed from its inprog list. If there are packets in the same thread that are on hold, their associated elements are taken from their onhold list and appended to the tail of the fresh list.
If, on the other hand, the transmission was not successful, the rejected packet is either the oldest packet in its thread in the buffer pool, or it is placed after another already rejected packet in sequence.
In the first case, i.e. the rejected packet is the oldest packet in its thread in the buffer pool, the rejected packet's associated element is taken from its inprog list and added to the tail of the old list. Also, any remaining elements of its inprog list will be appended to the tail of its onhold list. Finally, any elements belonging to this thread in the fresh list will be extracted, linked together in ordered sequence and appended to the tail of its onhold list.
In the second case, i.e. the rejected packet is not the oldest packet in its thread in the buffer pool, the packet must have been in progress when an update operation was performed earlier on a rejected packet in the same thread. The now rejected packet's associated element has then already been moved to its onhold list, so no list handling needs to be done in this case.
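The update operation described above may be sketched as follows, under the assumptions that the fresh and old lists hold (tag, thread) pairs while the per-thread inprog and onhold lists hold bare tags; these representational choices are not taken from the text:

```python
from collections import deque, defaultdict

# Hedged sketch of the update operation.
class UpdateExample:
    def __init__(self):
        self.fresh = deque()              # (tag, thread_id) pairs
        self.old = deque()                # (tag, thread_id) pairs
        self.inprog = defaultdict(deque)  # per-thread inprog lists
        self.onhold = defaultdict(deque)  # per-thread onhold lists

    def update(self, tag, thread_id, accepted):
        if accepted:
            # Remove the accepted packet's element from its inprog list and
            # release any on-hold packets of the thread to the fresh list.
            self.inprog[thread_id].remove(tag)
            while self.onhold[thread_id]:
                self.fresh.append((self.onhold[thread_id].popleft(), thread_id))
            return
        if self.inprog[thread_id] and self.inprog[thread_id][0] == tag:
            # First case: the rejected packet is the oldest of its thread.
            self.inprog[thread_id].popleft()
            self.old.append((tag, thread_id))
            # Remaining in-progress elements of the thread go on hold...
            while self.inprog[thread_id]:
                self.onhold[thread_id].append(self.inprog[thread_id].popleft())
            # ...followed by the thread's elements still in fresh, in order.
            kept = deque()
            for t, th in self.fresh:
                if th == thread_id:
                    self.onhold[thread_id].append(t)
                else:
                    kept.append((t, th))
            self.fresh = kept
        # Second case: the element was already moved to its onhold list by
        # an earlier update, so no list handling is needed.
```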
Having described preferred embodiments of the invention it will be apparent to those skilled in the art that other embodiments incorporating the concepts may be used. These and other examples of the invention illustrated above are intended by way of example only and the actual scope of the invention is to be determined from the following claims.