US20090187395A1 - Event-synchronization protocol for parallel simulation of large-scale wireless networks - Google Patents
- Publication number
- US20090187395A1 (application US11/123,233)
- Authority
- US
- United States
- Prior art keywords
- event
- lps
- timestamp
- incrementer
- simulation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
Definitions
- the present invention relates in general to a conservative event-synchronization protocol, time-based synchronization, which is particularly suited for parallel discrete event simulation of mobile ad hoc wireless networks.
- a mobile ad hoc wireless network is a collection of mobile wireless nodes that form a temporary network without any infrastructure or centralized control.
- MANETs used in such situations could conceivably contain several thousand nodes.
- Simulators that evaluate MANETs must therefore be capable of simulating large-scale networks.
- researchers wishing to quickly simulate such networks typically use parallel discrete-event simulators (PDES). These simulators all use conservative event-synchronization protocols to ensure that they produce the same results as a sequential simulator would.
- PDES parallel discrete-event simulators
- a parallel discrete-event simulator can be considered to be composed of N logical processes, LP0, ..., LPN−1, which communicate by sending messages containing time-stamped events.
- Each logical process has the following components: (1) the state variables that correspond to the part of the simulated physical system that the LP represents, (2) a time-ordered event queue, and (3) a local clock whose value equals the timestamp of the LP's most-recently-executed event.
- An LP in a simulation using a conservative event-synchronization protocol must obey the local causality constraint. If such an LP has an event with timestamp T at the head of its event queue, it cannot execute this event until it is sure that it will not later receive a message with a timestamp earlier than T.
- an LP receives messages via incoming-message queues (one for each LP that can send messages to the LP in question). Each such queue has a “clock,” which equals the timestamp of the last message that the destination LP removed from the queue.
- Because the messages sent on each queue are guaranteed to have non-decreasing timestamps, an LP will become sure that it can execute the event at the head of its event queue when the clocks of all of its incoming-message queues are later than or equal to T. To ensure progress, LPs send time-stamped null messages, which do not contain actual events, but instead contain an implicit promise that the senders will not send to the receivers any messages with timestamps earlier than the timestamps of the null messages.
- a MANET simulator typically uses two events per node to model a wireless transmission. The first represents the beginning of the transmission, and the second represents the end of the transmission.
- node A transmits a packet at simulated time 5 μs.
- the packet will reach nodes B and C only after some time, called the propagation delay, has elapsed. Assume that the propagation delay from node A to node B is 2 μs, and from node A to node C is 3 μs.
- LPA will send the events for nodes B and C to LPB and LPC, respectively.
- the second event at each node represents the end of the transmission.
- LPA, LPB, and LPC (the logical processes that simulate nodes A, B, and C) will schedule transmission-ending events with timestamps 105 μs, 107 μs, and 108 μs, respectively.
- Each logical process schedules its transmission-ending event itself.
- Neither LPB nor LPC would receive a message from LPA telling it to simulate the end of the transmission.
- the initial messages contain fields indicating the duration of the simulated transmission.
- an LP performs the radio calculations for a given transmission, such as determining path loss and fading, when it executes the event representing the beginning of a transmission. If an LP executes an event representing the beginning of another transmission before simulating the end of the first transmission, then it must decide whether it should simulate a collision. This decision usually depends on the signal strengths of the two transmissions and some characteristics of the simulated receiver's radio.
- a drawback to known MANET simulation protocols is that the speed at which an LP can execute an event depends on the rate at which the LP receives messages from other LPs in the network.
- such limitations are not particularly burdensome.
- such limitations can dramatically increase the amount of time necessary to run a network simulation.
- the present invention relates in general to a new conservative event-synchronization protocol called time-based synchronization (TBS).
- TBS time-based synchronization
- a time scale is employed to control operation of a network simulation such that the size of the network to be simulated is no longer a factor in determining the speed at which the simulation will run.
- TBS-based protocol enables the simulation to be carried out at speeds many times faster than real time, depending on the processor architecture used to run the simulation. With this protocol, very large MANETs, for example, can be simulated quickly.
- An LP in a simulation using TBS becomes sure that it can execute an event when a timestamp of the event is less than the elapsed real time (or scaled version thereof) since the simulation began. In other words, if the simulation has been executing for time t, then an LP can execute the event when the event's timestamp T is less than the elapsed real time (or, in the case of scaled time, less than the elapsed real time multiplied by the time scale of the simulation). Note that, unlike an LP in a simulation using the null message protocol, an LP in a TBS-based simulation does not need to know which other LPs can send it messages.
- an LP in a TBS-based simulation can determine whether an event is executable without waiting for information from other LPs.
- the ability of an LP to make this determination on its own is what allows TBS-based simulations to scale well and be unaffected by the number of LPs in the network being simulated.
- each LP of a network to be simulated has its own dedicated processor.
- 100 or more of the processors can be disposed on a single chip and interconnected with one another to form what is referred to as a network on a chip (NoC).
- NoC network on a chip
- Each of the processors is specially designed to include a timer coprocessor (or is programmed to carry out the timer coprocessor function).
- the timer coprocessor contains a set of timestamp registers and an incrementer.
- the incrementer is simply a time tracking device that keeps track of the scaled version of the current time. Whenever the value in a timestamp register is equal to the value of the incrementer, a timestamp notification occurs. If the processor core is not currently executing an event, it will jump to a particular address, depending on which timestamp register produced the timestamp notification, and begin executing an event. If the processor core is busy executing another event, then the timer coprocessor will place a token representing the current timestamp notification in a notification FIFO queue.
- when the processor executes the event specified by the timestamp notification, it can determine the current simulated time by reading the value of the timestamp register that produced the timestamp notification.
- a host computer for the NoC determines the rate of each LP processor's incrementer at compile time. If, for example, the smallest unit of simulated time is 100 nanoseconds, and the time scale is equal to one, then the incrementer will increment itself once every 100 nanoseconds. Likewise, if the time scale is two, the incrementer will increment itself once every 50 nanoseconds.
- FIG. 1 is a table illustrating a sequence of events that is typically carried out by a MANET simulator and shows the typical duration of each event;
- FIG. 2 is a schematic diagram showing how the sequence of events in the table of FIG. 1 can be rearranged for optimized performance of a simulation using the TBS-based synchronization scheme of the present invention
- FIG. 3 is a multiprocessor system that employs one or more elements known as a Network on a Chip (NoC) that is specifically designed for executing a MANET simulation using the TBS-based synchronization protocol of the present invention
- FIG. 4 is a block diagram of the elements that form one of the processors on a NoC.
- the preferred embodiment of the invention comprises a TBS-based sensor network simulation protocol and a network architecture for implementing the same.
- a sensor network can be modeled with a number of LPs.
- An LP in a simulation using TBS becomes sure that it can execute a given event when the timestamp of the event is less than the scaled version of the elapsed real time since the simulation began. In other words, if the simulation has been executing for time t, then an LP can execute the event when the inequality T < s × t is true, where T is the event's timestamp and s is the time scale of the simulation.
- an LP in a TBS-based simulation does not need to know which other LPs can send it messages. Moreover, an LP in a TBS-based simulation can determine whether an event is executable without waiting for information from other LPs. The ability of an LP to make this determination on its own is what allows TBS-based simulations to scale well.
- a simulation designer can ensure that this inequality is true by decreasing the time scale.
- the first is the value of t_latency. Decreasing the latency to send a message between LPs enables a TBS-based simulator to use a higher time scale.
- the second factor is the ability of the LP to send its messages as early as possible. If, in the example, a programmer rewrote the simulator such that M was produced during the execution of E21 instead of E22, the 28 μs in Equation 4 would change to 24 μs, allowing the time scale to be increased. Another point to note is that an LP does not have to execute an event as soon as the event becomes executable.
- the LP executes E22 when the elapsed time is 28 μs, or 6 μs after E22 becomes executable.
- the difference between the simulation time, or Clock, of a particular LP, and the elapsed real time for the entire simulation should be noted.
- Clock_i is equal to the timestamp of LPi's most-recently-executed event.
- the elapsed real time is a property of the entire simulation and is always equal for every LP. (It should be obvious that there will never exist a Clock that is greater than s × t).
- MAC medium access control
- ACK acknowledgment packet
- the important point to note is that the events are not distributed evenly. Most of the computation (other than radio calculations) in MANET simulations occurs between events that correspond to the end of transmissions and the events that correspond to the beginning of new transmissions. However, the simulated time between two such events is much shorter than the simulated time between the beginning and end of a transmission. For example, the transmission of an incoming data packet and outgoing ACK take hundreds of μs, whereas most of the computations carried out by the processor only take on the order of 10 μs as illustrated in the table of FIG. 1.
- the critical path is the time between executing RadioEndRxNoErrors and the arrival at their destination LPs of the messages sent during the execution of SendAck.
- the left half of FIG. 2 shows the unoptimized time line for executing this sequence of events.
- the first step in optimizing the way in which the simulator executes this sequence of events is to have the execution of ExaminePacket and CreateAck occur speculatively, after the execution of RadioBeginRx but before that of RadioEndRxNoErrors.
- the conventional simulator executes these events after RadioEndRxNoErrors to make sure that it will not receive any messages containing events corresponding to packets colliding with the original data packet. If such a simulated collision occurred, then the simulated data packet would have errors, the LP would execute RadioEndRxWithErrors, and it would merely simulate the receiving node dropping the packet. There would therefore be no ACK to simulate.
- the processor will be idle for a long period of time between executing RadioBeginRx and RadioEndRxNoErrors. Therefore, if ExaminePacket and CreateAck are speculatively executed during this idle time and the LP eventually simulates a collision and the dropping of the packet, this speculative execution will not have cost any time (it will have cost some energy, however). On the other hand, if the LP does not simulate a collision, then the processor will have fewer events to execute before SendAck than it would have had before the TBS optimization. This makes the subject critical path shorter.
- the processor will be idle for a fairly long time after executing TransmitAckBegin. This period corresponds to the time spent simulating the transmission of the ACK.
- the execution of UpdateRoutingTables can be easily postponed to the time after TransmitAckBegin, since the content of the ACK does not depend on the updates to these tables.
- the final sequence of events looks like the following: RadioBeginRx, SpeculativelyExaminePacket, SpeculativelyCreateAck, RadioEndRxNoErrors, SendAck, TransmitAckBegin, UpdateRoutingTables, TransmitAckEnd (see the right hand side of FIG. 2 ).
- the path is now at the point where the processor will be able to send the messages corresponding to the transmission of the ACK almost immediately after RadioEndRxNoErrors becomes executable.
- This example demonstrates the two guidelines that are followed to optimize all paths. First, perform speculatively whatever computation may influence the next outgoing message. Second, postpone whatever computation is not necessary to form a given message to the time after sending the message, when the LP will be simulating the transmission of the packet that the message represents.
- In order to determine the time scale s, Equation 2 is rewritten. If it is assumed that the event leading to the sending of messages is able to be executed as soon as it becomes executable, the constraint for correctness becomes:
- ΔT is the difference between the timestamp of the current event and the timestamp of the final message sent as a result of executing this event
- Δt is the real time between the event in question becoming executable and the first word of the last message reaching its destination.
- ΔT depends only upon the simulated MANET. It is the sum of the time, called the transmitter-turn-on time (TTOT), for a node's radio to change from sensing mode to transmitting mode, and the worst-case (longest) propagation delay between the sending node and one of the receiving nodes. These two times will be referred to as T_ttot and T_prop.
- Δt depends on two factors: the time the NoC processor needs to execute the instructions that will send all of the messages into the interconnect, and the worst-case latency for the last message sent into the interconnect to reach its destination processor.
- the latter is a function of the NoC itself; this will be referred to as t_lat.
- the former is essentially the product of the number of messages to be sent, the number of bytes per message, and the time required by the processor to send one byte into the interconnect.
- Δt is said to be equal to t_lat + t_send × n, where t_send is the time for the processor to send one reservation message into the interconnect and n is the number of messages per simulated transmission.
- the IEEE 802.11 MAC protocol specifies T_ttot as 5 μs. If one takes a worst-case value for T_prop of zero (since mobile nodes may be very close to one another) and for n of 32, and if t_send is set to be 8 ns and t_lat is set to be 100 ns, then the right-hand side of Equation 7 becomes approximately 14.0, meaning that the TBS simulator should be able to simulate MANETs with these parameters fourteen times faster than real time. Moreover, the execution time is independent of the size of the simulated MANET, as long as n remains constant.
- a single chip system 100 referred to as a Network on a Chip (NoC) contains an array of processors 102 , each of which is dedicated to a corresponding LP that forms part of a simulated network.
- the processors are interconnected by a web of interconnects 104 .
- An array 106 of interconnected NoCs is also shown, the processors for which are controlled by a host computer system 108 .
- NoC 100 enhances the performance of the TBS simulator.
- a machine is needed that allows processors to communicate efficiently.
- DSM distributed shared-memory
- NoWs networks of workstations
- TBS-based simulation of MANETs is an application with a large ratio of latency-critical communication to computation. Running such an application on a NoW containing thousands of very-powerful computers is a poor use of resources. The simulation's time scale, and therefore the performance of the simulator, will be limited by the latency to pass messages between workstations. Instead, a platform with less powerful computers but a lower message-passing latency would be preferred.
- processors that can efficiently manage event queues. To do this, the processors need a low-overhead mechanism for determining when an event has become executable. They must be able to quickly compare the timestamps of scheduled events with the scaled version of elapsed time.
- the NoC single chip multiprocessor was created. It is estimated that each chip will contain approximately 100 processors. To enable simulations of MANETs containing thousands of nodes, the NoC is designed such that one can gluelessly combine multiple chips to create a massively-parallel machine.
- the NoC processors are designed specifically to execute LPs in the TBS simulator, thus they lack virtual memory or any other hardware operating-system support. Each processor has its own private 8KB memory, and communicates with the other processors only by passing messages via a highly-pipelined interconnect.
- a simulation run on the NoC 100 is managed by the offchip workstation host 108 .
- the host 108 can send and receive messages to and from the NoC processors 102 ; it sends the processors the code they will execute during a simulation and it collects statistics when a simulation is complete.
- the NoC processors 102 lack multiply/divide and floating point units, meaning that they would require a great deal of time to perform complicated radio calculations. Therefore, instead of the NoC processors performing radio calculations during a simulation, the host performs the calculations before the simulation begins; it can do so for any simulation in which the movement patterns of the nodes are known before the simulation begins.
- the host 108 incorporates these calculations into the code that it sends to the processors 102 (the code for the radio layers has a section that is different for each processor; it tells a given processor how to simulate incoming transmissions based on the sources of the transmissions and the times at which they occur).
- processors 102 execute LPs, which in turn simulate network nodes
- a processor that simulates a given network node will need to exchange messages only with processors simulating network nodes within its node's transmission range. Therefore, if the simulation user maps LPs to processors in an intelligent way (i.e. processors that are close together on a chip simulate nodes that are close together in the simulated terrain), no messages should travel more than a few hops through the interconnect 104 .
- a mapping that is efficient at the beginning of a simulation may become quite inefficient later. Such simulations can be paused, re-mapped, and started again by the host.
- the host 108 can precompute the remappings.
- a researcher using the NoC 100 to simulate a network with high node mobility may wish to make remapping easier by using only a fraction of the NoC processors on each chip.
- FIG. 4 illustrates the details of a preferred embodiment of one of the chip processors 102 .
- the processor 102 consists of three components: (1) a timer coprocessor 118 , (2) a message coprocessor 120 , which provides an interface to the interconnect web 104 , and (3) a processor core 122 , which consists of an event queue 124 , instruction fetch 126 , decode 128 , execution units 130 , busses 132 , register file 134 , message FIFOs 136 , and memories 138 and 140 .
- the most important of the processor elements in this application for implementing a TBS-based simulation protocol is the timer coprocessor 118 , which it uses to schedule new events, and which alerts it when a previously scheduled event becomes executable.
- the timer coprocessors need to keep track of the current elapsed time; for this purpose, they each contain a time tracking device called an incrementer 142 . All incrementers in the multiprocessor system start from zero on reset and change their values at the same rate. Thus, every processor in the system has the same notion of the current elapsed time.
- the incrementer 142 essentially tracks real time.
- the rest of the timer coprocessor 118 must be fast enough to keep up with the rate at which the incrementer 142 changes. Every timestamp register must be compared against the incrementer 142 every time the incrementer's value changes.
- the software running on the processor 102 can adjust the time scale of the simulation by taking different samples of the incrementer 142 . Shifting the sample one bit to the left corresponds to doubling the time scale.
- the software uses an instruction to set a timestamp register to the timestamp of the event it wishes to schedule.
- the ISA also includes instructions that turn off timestamp registers (cancel events).
- Although the timer coprocessor 118 must compare every timestamp register against the value of the incrementer 142 every time the incrementer value changes, it does not have to finish the comparisons before the incrementer value changes again.
- the comparison process can be pipelined such that one set of comparisons (the comparisons between every timestamp register and one value of the incrementer) completes every cycle.
- the entire timer coprocessor 118 is highly-pipelined so that it can achieve throughput high enough to enable one comparison to complete every time the incrementer changes its value.
- the TBS protocol of the present invention can simulate very large sensor networks quickly through use of an event execution time tracking technique that is independent from other processes in the network to be simulated.
- Each processor in the simulation keeps track of time independently of other processors in the simulation.
- the arrangement enables a network simulation to be carried out at speeds faster than real time.
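The time-scale arithmetic quoted in the list above (T_ttot = 5 μs, T_prop = 0, n = 32, t_send = 8 ns, t_lat = 100 ns) can be checked with a short calculation. This is a sketch only: Equation 7 is not reproduced in this excerpt, so the bound s < ΔT / Δt is an assumption reconstructed from the definitions ΔT = T_ttot + T_prop and Δt = t_lat + t_send × n. The function name `max_time_scale` is hypothetical.

```python
def max_time_scale(t_ttot_ns, t_prop_ns, t_lat_ns, t_send_ns, n):
    """Upper bound on the time scale s, assuming the bound is s < delta_T / delta_t.

    All times are in nanoseconds. The form of the bound is an assumption
    reconstructed from the surrounding text, not a quotation of Equation 7.
    """
    delta_T = t_ttot_ns + t_prop_ns       # simulated-time slack per transmission
    delta_t = t_lat_ns + t_send_ns * n    # real time to send all n messages
    return delta_T / delta_t


# Parameters from the text: 5 us turn-on time, zero propagation delay,
# 100 ns interconnect latency, 8 ns per message, 32 messages.
s_bound = max_time_scale(5000, 0, 100, 8, 32)
print(round(s_bound, 1))
```

With these inputs ΔT is 5000 ns and Δt is 356 ns, which agrees with the figure of roughly fourteen times real time given in the text. Note that the bound depends on n but not on the total number of simulated nodes.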
Abstract
Description
- This application claims priority under 35 U.S.C. 119(e) on U.S. Provisional Application No. 60/568,259, which was filed on May 6, 2004, and is hereby incorporated by reference.
- This application is also related to a U.S. patent application entitled Sensor-Network Processors Using Event-Driven Architecture, which is being filed concurrently herewith on May 6, 2005.
- This invention arose out of research sponsored by a United States Government Agency, the Office of Naval Research (ONR), under ONR Contract No. N00014-00-1-0564. The Government has certain rights in the invention.
- 1. Field of the Invention
- The present invention relates in general to a conservative event-synchronization protocol, time-based synchronization, which is particularly suited for parallel discrete event simulation of mobile ad hoc wireless networks.
- 2. Description of the Background Art
- A mobile ad hoc wireless network (MANET) is a collection of mobile wireless nodes that form a temporary network without any infrastructure or centralized control. Researchers typically cite three uses for MANETs: emergency situations, military operations, and sensor networks. MANETs used in such situations could conceivably contain several thousand nodes. Simulators that evaluate MANETs must therefore be capable of simulating large-scale networks. Researchers wishing to quickly simulate such networks typically use parallel discrete-event simulators (PDES). These simulators all use conservative event-synchronization protocols to ensure that they produce the same results as a sequential simulator would.
- A parallel discrete-event simulator can be considered to be composed of N logical processes, LP0, ..., LPN−1, which communicate by sending messages containing time-stamped events. Each logical process has the following components: (1) the state variables that correspond to the part of the simulated physical system that the LP represents, (2) a time-ordered event queue, and (3) a local clock whose value equals the timestamp of the LP's most-recently-executed event.
- An LP in a simulation using a conservative event-synchronization protocol must obey the local causality constraint. If such an LP has an event with timestamp T at the head of its event queue, it cannot execute this event until it is sure that it will not later receive a message with a timestamp earlier than T. As an example, in a simulation using the known null-message protocol, an LP receives messages via incoming-message queues (one for each LP that can send messages to the LP in question). Each such queue has a “clock,” which equals the timestamp of the last message that the destination LP removed from the queue. Because the messages sent on each queue are guaranteed to have non-decreasing timestamps, an LP will become sure that it can execute the event at the head of its event queue when the clocks of all of its incoming-message queues are later than or equal to T. To ensure progress, LPs send time-stamped null messages, which do not contain actual events, but instead contain an implicit promise that the senders will not send to the receivers any messages with timestamps earlier than the timestamps of the null messages.
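The null-message rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of any particular simulator: the class `IncomingQueue` and the function `can_execute` are hypothetical names, and a null message is modeled simply as a message whose event is `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class IncomingQueue:
    """One incoming-message queue; clock = timestamp of the last message removed."""
    clock: float = 0.0
    pending: List[Tuple[float, Optional[str]]] = field(default_factory=list)

    def send(self, timestamp: float, event: Optional[str] = None) -> None:
        # Timestamps on a single queue must be non-decreasing.
        last = self.pending[-1][0] if self.pending else self.clock
        if timestamp < last:
            raise ValueError("timestamps on a queue must be non-decreasing")
        self.pending.append((timestamp, event))  # event=None models a null message

    def remove(self) -> Optional[str]:
        timestamp, event = self.pending.pop(0)
        self.clock = timestamp
        return event


def can_execute(head_timestamp: float, queues: List[IncomingQueue]) -> bool:
    """Safe only when every incoming queue's clock is at or past T."""
    return all(q.clock >= head_timestamp for q in queues)


from_b, from_c = IncomingQueue(), IncomingQueue()
from_b.send(8.0, "begin_rx")
from_b.remove()
blocked = can_execute(6.0, [from_b, from_c])  # from_c's clock is still 0
from_c.send(6.0)                              # null message: a promise, no event
from_c.remove()
ready = can_execute(6.0, [from_b, from_c])
```

In this sketch the LP stays blocked until the quiet neighbor sends a null message, which shows why progress depends on the rate at which messages arrive from other LPs.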
- A MANET simulator typically uses two events per node to model a wireless transmission. The first represents the beginning of the transmission, and the second represents the end of the transmission. Consider an example with three nodes, node A, node B, and node C, which are simulated by LPA, LPB and LPC. Say that node A transmits a packet at simulated time 5 μs. The packet will reach nodes B and C only after some time, called the propagation delay, has elapsed. Assume that the propagation delay from node A to node B is 2 μs, and from node A to node C is 3 μs. The simulator will use three events to simulate the beginning of the transmission of the packet: one for node A with timestamp 5 μs, one for node B with timestamp 5+2=7 μs, and one for node C with timestamp 5+3=8 μs. In a parallel simulator, LPA will send the events for nodes B and C to LPB and LPC, respectively.
- The second event at each node represents the end of the transmission. Say that the transmission of the packet lasts for 100 μs; LPA, LPB, and LPC will schedule transmission-ending events with timestamps 105 μs, 107 μs, and 108 μs, respectively. Each logical process schedules its transmission-ending event itself. Neither LPB nor LPC would receive a message from LPA telling it to simulate the end of the transmission. Instead, the initial messages contain fields indicating the duration of the simulated transmission.
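The timestamps in the three-node example can be reproduced with a small helper. This is an illustrative sketch; the function name `transmission_events` and the dictionary layout are assumptions made for the example, and all times are in microseconds.

```python
def transmission_events(start_us, duration_us, prop_delay_us):
    """Return {node: (begin_timestamp, end_timestamp)} for one transmission.

    prop_delay_us maps each node to its propagation delay from the sender
    (0 for the sender itself). Each LP derives its own end event from the
    begin timestamp plus the duration carried in the initial message.
    """
    events = {}
    for node, delay in prop_delay_us.items():
        begin = start_us + delay
        events[node] = (begin, begin + duration_us)
    return events


# Node A transmits at 5 us for 100 us; B and C are 2 us and 3 us away.
events = transmission_events(5, 100, {"A": 0, "B": 2, "C": 3})
for node, (begin, end) in events.items():
    print(node, begin, end)
```

The end timestamps are computed locally from the duration field, which mirrors the point that no second message from LPA is needed.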
- In most MANET simulators, an LP performs the radio calculations for a given transmission, such as determining path loss and fading, when it executes the event representing the beginning of a transmission. If an LP executes an event representing the beginning of another transmission before simulating the end of the first transmission, then it must decide whether it should simulate a collision. This decision usually depends on the signal strengths of the two transmissions and some characteristics of the simulated receiver's radio.
- A drawback to known MANET simulation protocols, such as the null-message protocol, is that the speed at which an LP can execute an event depends on the rate at which the LP receives messages from other LPs in the network. For small networks, such limitations are not particularly burdensome. However, for very large networks with hundreds or even thousands of nodes, such limitations can dramatically increase the amount of time necessary to run a network simulation.
- The present invention relates in general to a new conservative event-synchronization protocol called time-based synchronization (TBS). In TBS, a time scale is employed to control operation of a network simulation such that the size of the network to be simulated is no longer a factor in determining the speed at which the simulation will run. Furthermore, the use of a TBS-based protocol enables the simulation to be carried out at speeds many times faster than real time, depending on the processor architecture used to run the simulation. With this protocol, very large MANETs, for example, can be simulated quickly.
- The manner in which the TBS protocol works is as follows. An LP in a simulation using TBS becomes sure that it can execute an event when a timestamp of the event is less than the elapsed real time (or scaled version thereof) since the simulation began. In other words, if the simulation has been executing for time t, then an LP can execute the event when the event's timestamp T is less than the elapsed real time (or, in the case of scaled time, less than the elapsed real time multiplied by the time scale of the simulation). Note that, unlike an LP in a simulation using the null message protocol, an LP in a TBS-based simulation does not need to know which other LPs can send it messages. Moreover, an LP in a TBS-based simulation can determine whether an event is executable without waiting for information from other LPs. The ability of an LP to make this determination on its own is what allows TBS-based simulations to scale well and be unaffected by the number of LPs in the network being simulated.
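The TBS executability test needs only local information, which a short sketch makes concrete. The names `is_executable` and `run_ready_events` are hypothetical, the event queue is modeled with Python's standard `heapq` module, and timestamps and elapsed time are assumed to share one unit (microseconds here).

```python
import heapq


def is_executable(timestamp, elapsed_real_time, time_scale=1.0):
    """TBS rule: an event is executable once T < s * t."""
    return timestamp < time_scale * elapsed_real_time


def run_ready_events(event_queue, elapsed_real_time, time_scale=1.0):
    """Pop every event that is executable at the given elapsed real time.

    event_queue is a heap of (timestamp, name) pairs. No information from
    other LPs is consulted, only the local queue and the elapsed time.
    """
    executed = []
    while event_queue and is_executable(
        event_queue[0][0], elapsed_real_time, time_scale
    ):
        executed.append(heapq.heappop(event_queue))
    return executed


queue = [(7.0, "begin_rx"), (107.0, "end_rx")]
heapq.heapify(queue)
# After 5 us of real time at time scale 2, simulated time has reached 10 us.
first = run_ready_events(queue, elapsed_real_time=5.0, time_scale=2.0)
```

Because the check involves no incoming-queue clocks, its cost does not grow with the number of LPs, in contrast to the null-message rule.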
- Although the subject TBS protocol does not require a special processing system for operation, to take maximum advantage of the TBS protocol's speed capabilities, a special multiprocessor architecture has also been devised that can be used to implement the TBS simulation protocol. In this architecture, each LP of a network to be simulated has its own dedicated processor. In addition, 100 or more of the processors can be disposed on a single chip and interconnected with one another to form what is referred to as a network on a chip (NoC).
- Each of the processors is specially designed to include a timer coprocessor (or is programmed to carry out the timer coprocessor function). In the preferred embodiment, the timer coprocessor contains a set of timestamp registers and an incrementer. The incrementer is simply a time tracking device that keeps track of the scaled version of the current time. Whenever the value in a timestamp register is equal to the value of the incrementer, a timestamp notification occurs. If the processor core is not currently executing an event, it will jump to a particular address, depending on which timestamp register produced the timestamp notification, and begin executing an event. If the processor core is busy executing another event, then the timer coprocessor will place a token representing the current timestamp notification in a notification FIFO queue. When the processor executes the event specified by the timestamp notification, it can determine the current simulated time by reading the value of the timestamp register that produced the timestamp notification. A host computer for the NoC determines the rate of each LP processor's incrementer at compile time. If, for example, the smallest unit of simulated time is 100 nanoseconds, and the time scale is equal to one, then the incrementer will increment itself once every 100 nanoseconds. Likewise, if the time scale is two, the incrementer will increment itself once every 50 nanoseconds.
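The behavior of the timer coprocessor described above can be modeled in software. This is a behavioral sketch under stated assumptions, not a description of the hardware: the class `TimerCoprocessor`, its method names, and the helper `incrementer_period_ns` are hypothetical, and one call to `tick` stands for one change of the incrementer value.

```python
from collections import deque


def incrementer_period_ns(unit_ns, time_scale):
    """Real time between increments: the simulated unit divided by the scale."""
    return unit_ns / time_scale


class TimerCoprocessor:
    def __init__(self, num_registers):
        self.registers = [None] * num_registers  # None = register turned off
        self.incrementer = 0
        self.notification_fifo = deque()

    def schedule(self, reg, timestamp):
        self.registers[reg] = timestamp

    def cancel(self, reg):
        self.registers[reg] = None

    def tick(self, core_busy):
        """Advance the incrementer and compare it against every register.

        Returns the register index to dispatch immediately, or None. Matches
        that cannot be dispatched are queued in the notification FIFO.
        """
        self.incrementer += 1
        dispatched = None
        for reg, timestamp in enumerate(self.registers):
            if timestamp == self.incrementer:
                if not core_busy and dispatched is None:
                    dispatched = reg
                else:
                    self.notification_fifo.append(reg)
        return dispatched


timer = TimerCoprocessor(num_registers=2)
timer.schedule(0, 2)
timer.schedule(1, 3)
results = [timer.tick(core_busy=False), timer.tick(core_busy=False),
           timer.tick(core_busy=True)]
```

In this model the register keeps its value after the notification, so the event handler can read it to learn the current simulated time, as the text describes.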
- The various features and advantages of the invention will become apparent to those of skill in the art from the following description, taken with the accompanying drawings, in which:
-
FIG. 1 is a table illustrating a sequence of events that is typically carried out by a MANET simulator and showing the typical duration of each event;
FIG. 2 is a schematic diagram showing how the sequence of events in the table of FIG. 1 can be rearranged for optimized performance of a simulation using the TBS-based synchronization scheme of the present invention;
FIG. 3 is a multiprocessor system that employs one or more elements known as a Network on a Chip (NoC) that is specifically designed for executing a MANET simulation using the TBS-based synchronization protocol of the present invention; and
FIG. 4 is a block diagram of the elements that form one of the processors on a NoC.
- The preferred embodiment of the invention comprises a TBS-based sensor network simulation protocol and a network architecture for implementing the same. As discussed previously, a sensor network can be modeled with a number of LPs. An LP in a simulation using TBS becomes sure that it can execute a given event when the timestamp of the event is less than the scaled version of the elapsed real time since the simulation began. In other words, if the simulation has been executing for time t, then an LP can execute the event when the following inequality is true:
-
$$T < s \times t \qquad (1)$$
where s is the time scale of the simulation (for the rest of the application, t will be used to represent the elapsed real time since a simulation began, and s to represent the time scale). When this inequality is true, the event in question is said to be executable. It is easy to see that a parallel discrete event simulator (PDES) using TBS will execute correctly as long as every event arrives at its destination LP before it is executable. That is, an incoming message with timestamp T arriving at time t must satisfy Equation (2):
-
$$T \geq s \times t \qquad (2)$$
- Note that, unlike an LP in a simulation using the null message protocol, an LP in a TBS-based simulation does not need to know which other LPs can send it messages. Moreover, an LP in a TBS-based simulation can determine whether an event is executable without waiting for information from other LPs. The ability of an LP to make this determination on its own is what allows TBS-based simulations to scale well.
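The two conditions can be written as a pair of predicates. This is a minimal sketch; the function names are hypothetical, and all times are assumed to be expressed in the same unit.

```python
def is_executable(event_timestamp, time_scale, elapsed_real_time):
    """Equation (1): an event is executable once T < s * t."""
    return event_timestamp < time_scale * elapsed_real_time


def arrives_on_time(msg_timestamp, time_scale, arrival_real_time):
    """Equation (2): a message is on time if T >= s * t at its arrival."""
    return msg_timestamp >= time_scale * arrival_real_time
```

Both checks use only the LP's own view of elapsed time, which is why no information from other LPs is needed to decide whether an event may run.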
- As an example, consider an LP that has the events E10, E12, E20, and E22 in its event queue, where the subscripts indicate the events' timestamps in simulated μs (E10 has timestamp 10 μs, and so on). It is assumed that the LP requires 4 μs of real time to execute any event (this time corresponds to the time needed by whatever hardware is running the simulator). The logical process's clock, Clock, starts at zero. For simplicity, assume that s=1:
- The event E10 becomes executable when 10 μs<s×t. Because s=1, the LP will execute this event after 10 μs of real time have elapsed. Doing so takes 4 μs, after which, Clock=10 μs and t=14 μs. The LP can then immediately execute E12, since it can be sure that no messages with timestamps less than 14 μs will arrive in the future. After executing this event, Clock=12 μs and t=18 μs.
- Now imagine that a message containing an event E21 arrives while the LP is executing E12. The LP will place E21 into its event queue after it executes E12, wait until t=20 μs, and then execute E20, E21, and E22.
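The timeline in this example can be replayed with a short script. This is a sketch under stated assumptions: the function name is hypothetical, the arrival time of E21 is taken to be t=16 μs (the text only says it arrives while E12 is executing, between 14 μs and 18 μs), and an event with timestamp T is treated as executable at real time T/s, ignoring the strictness of Equation (1).

```python
import heapq


def replay(initial_timestamps, arrivals, exec_time=4.0, time_scale=1.0):
    """Return (timestamp, real_start_time) pairs for one LP.

    arrivals is a list of (arrival_real_time, timestamp) pairs for
    messages delivered by other LPs (hypothetical input format).
    """
    queue = list(initial_timestamps)
    heapq.heapify(queue)
    pending = sorted(arrivals)
    t = 0.0
    schedule = []
    while queue or pending:
        # Enqueue every message that has arrived by real time t.
        while pending and pending[0][0] <= t:
            heapq.heappush(queue, pending.pop(0)[1])
        if not queue:
            t = pending[0][0]
            continue
        start = max(t, queue[0] / time_scale)
        # A message arriving before the start must be enqueued first.
        if pending and pending[0][0] <= start:
            t = max(t, pending[0][0])
            continue
        timestamp = heapq.heappop(queue)
        schedule.append((timestamp, start))
        t = start + exec_time
    return schedule


timeline = replay([10, 12, 20, 22], arrivals=[(16.0, 21)])
```

Under these assumptions E10 starts at 10 μs, E12 at 14 μs, E20 at 20 μs, E21 at 24 μs, and E22 at 28 μs, which agrees with the figures quoted in the text.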
- Consider what happens if executing E22 results in the transmission of an outgoing message, M, with timestamp Tmsg. When the LP begins executing E22, t=20 μs+2×4 μs=28 μs (since the LP must execute E20 and E21 before executing E22). Therefore, M will arrive at its destination “on-time” (i.e., before it is executable) only if
-
$$T_{msg} \geq 28\,\mu\text{s} + t_{comp} + t_{latency} \qquad (3)$$
where $t_{comp}$ is the time spent on computation before the LP can send M (this computation is a fraction of the total computation involved in executing E22), and $t_{latency}$ is the latency to send M to the destination LP. For a general time scale, this equation becomes
-
$$T_{msg} \geq s \times (28\,\mu\text{s} + t_{comp} + t_{latency}) \qquad (4)$$
- A simulation designer can ensure that this inequality is true by decreasing the time scale.
- From this example, two factors can be seen that can dramatically affect the performance of a TBS-based simulator. The first is the value of $t_{latency}$. Decreasing the latency to send a message between LPs enables a TBS-based simulator to use a higher time scale. The second factor is the ability of the LP to send its messages as early as possible. If, in the example, a programmer rewrote the simulator such that M was produced during the execution of E21 instead of E22, the 28 μs in Equation 4 would change to 24 μs, allowing the time scale to be increased. Another point to note is that an LP does not have to execute an event as soon as the event becomes executable. For instance, in the example the LP executes E22 when the elapsed time is 28 μs, or 6 μs after E22 becomes executable. Moreover, the difference between the simulation time, or Clock, of a particular LP, and the elapsed real time for the entire simulation should be noted. Remember that, for a logical process LPi, Clocki is equal to the timestamp of LPi's most-recently-executed event. At any given real time, all of the LPs in a simulation can have different Clocks. On the other hand, the elapsed real time is a property of the entire simulation and is always equal for every LP. (It should be obvious that there will never exist a Clock that is greater than s×t).
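The effect of sending M earlier can be made concrete by solving Equation 4 for the largest admissible time scale. This is a rough sketch: the function name is hypothetical, and the values chosen for the message timestamp, the computation time, and the latency are made up for illustration; only the 28 μs and 24 μs start times come from the example.

```python
def max_scale_for_message(msg_timestamp_us, start_us, comp_us, latency_us):
    """Largest s satisfying T_msg >= s * (start + t_comp + t_latency)."""
    return msg_timestamp_us / (start_us + comp_us + latency_us)


# Hypothetical values: T_msg = 60 us, t_comp = 1 us, t_latency = 1 us.
scale_if_sent_from_e22 = max_scale_for_message(60, 28, 1, 1)
scale_if_sent_from_e21 = max_scale_for_message(60, 24, 1, 1)
```

With these assumed numbers the bound rises from 2.0 to about 2.3 when M is produced during E21, which is the kind of gain the text attributes to sending messages as early as possible.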
- Next, suppose a node in the simulated MANET receives a data packet that contains some routing protocol information that it must use to update a table. Now suppose that the medium access control (MAC) protocol that the node is using dictates that the node must examine the data packet, wait for 10 μs, and transmit an acknowledgment packet (ACK). The table in
FIG. 1 shows such a sequence of events taken from an actual MANET simulation. When the LP executes SendAck it sends messages to the other LPs that simulate nodes receiving the ACK. (SendAck is analogous to E22 in the previous example since the execution of each event leads to its LP sending messages.) - In a MANET simulation, the important point to note is that the events are not distributed evenly. Most of the computation (other than radio calculations) in MANET simulations occurs between events that correspond to the end of transmissions and the events that correspond to the beginning of new transmissions. However, the simulated time between two such events is much shorter than the simulated time between the beginning and end of a transmission. For example, the transmission of an incoming data packet and outgoing ACK take hundreds of μs, whereas most of the computations carried out by the processor only take on the order of 10 μs as illustrated in the table of
FIG. 1.
- This distribution of events may at first make TBS seem like a poor event-synchronization protocol to use for simulating MANETs. If the time scale is decreased such that 10 simulated μs scales to enough real time to do all of the necessary computation, then periods of hundreds of μs during which the LP is doing little computation will also scale, resulting in very long idle periods. Fortunately, by making a series of simple changes to the way in which a typical simulator executes events, a relatively efficient TBS-based simulator can be created.
- Initial simulations indicated that several critical paths limited the simulations' time scales. As stated earlier, the simulator will execute correctly if every message arrives at its destination before its enclosed event is executable. The critical paths are naturally then the times between executing events that lead to one or more messages being sent, and the latest time by which these messages can arrive at their destination processors without violating Equation 2.
- The inventors discovered that by changing the order in which the simulator executes events, the calculations could be moved off the critical paths. For the sequence of events in the table of
FIG. 1, the critical path is the time between executing RadioEndRxNoErrors and the arrival at their destination LPs of the messages sent during the execution of SendAck. The left half of FIG. 2 shows the unoptimized time line for executing this sequence of events.
- The first step in optimizing the way in which the simulator executes this sequence of events is to have the execution of ExaminePacket and CreateAck occur speculatively, after the execution of RadioBeginRx but before that of RadioEndRxNoErrors. The conventional simulator executes these events after RadioEndRxNoErrors to make sure that it will not receive any messages containing events corresponding to packets colliding with the original data packet. If such a simulated collision occurred, then the simulated data packet would have errors, the LP would execute RadioEndRxWithErrors, and it would merely simulate the receiving node dropping the packet. There would therefore be no ACK to simulate. In the subject TBS simulator, however, the processor will be idle for a long period of time between executing RadioBeginRx and RadioEndRxNoErrors. Therefore, if ExaminePacket and CreateAck are speculatively executed during this idle time and the LP eventually simulates a collision and the dropping of the packet, this speculative execution will not have cost any time (it will have cost some energy, however). On the other hand, if the LP does not simulate a collision, then the processor will have fewer events to execute before SendAck than it would have had before the TBS optimization. This makes the subject critical path shorter.
- Likewise, the processor will be idle for a fairly long time after executing TransmitAckBegin. This period corresponds to the time spent simulating the transmission of the ACK. The execution of UpdateRoutingTables can be easily postponed to the time after TransmitAckBegin, since the content of the ACK does not depend on the updates to these tables. After TBS optimizations, the final sequence of events looks like the following: RadioBeginRx, SpeculativelyExaminePacket, SpeculativelyCreateAck, RadioEndRxNoErrors, SendAck, TransmitAckBegin, UpdateRoutingTables, TransmitAckEnd (see the right hand side of
FIG. 2). The path is now at the point where the processor will be able to send the messages corresponding to the transmission of the ACK almost immediately after RadioEndRxNoErrors becomes executable.
- This example demonstrates the two guidelines that are followed to optimize all paths. First, perform speculatively whatever computation may influence the next outgoing message. Second, postpone whatever computation is not necessary to form a given message to the time after sending the message, when the LP will be simulating the transmission of the packet that the message represents.
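The effect of the reordering on the critical path can be checked by listing the events that sit between RadioEndRxNoErrors and SendAck in each ordering. In this sketch the optimized order is the one given in the text; the unoptimized order is an assumption, since the table of FIG. 1 is not reproduced here, and the helper name is hypothetical.

```python
def events_between(sequence, first, last):
    """Events executed strictly after `first` and before `last`."""
    return sequence[sequence.index(first) + 1:sequence.index(last)]


# Assumed unoptimized order (the exact order in FIG. 1 is not shown here).
unoptimized = [
    "RadioBeginRx", "RadioEndRxNoErrors", "ExaminePacket",
    "UpdateRoutingTables", "CreateAck", "SendAck",
    "TransmitAckBegin", "TransmitAckEnd",
]

# Optimized order as listed in the text.
optimized = [
    "RadioBeginRx", "SpeculativelyExaminePacket", "SpeculativelyCreateAck",
    "RadioEndRxNoErrors", "SendAck", "TransmitAckBegin",
    "UpdateRoutingTables", "TransmitAckEnd",
]

before = events_between(unoptimized, "RadioEndRxNoErrors", "SendAck")
after = events_between(optimized, "RadioEndRxNoErrors", "SendAck")
```

Under the assumed unoptimized order three events delay SendAck, while in the optimized order none do: the speculative work has moved into the idle period before RadioEndRxNoErrors and the table update into the idle period after TransmitAckBegin.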
- In order to determine the time scale s, Equation 2 is rewritten. If it is assumed that the event leading to the sending of messages is able to be executed as soon as it becomes executable, the constraint for correctness becomes:
-
$$\Delta T \geq s \times \Delta t \qquad (5)$$
where ΔT is the difference between the timestamp of the current event and the timestamp of the final message sent as a result of executing this event, and Δt is the real time between the event in question becoming executable and the first word of the last message reaching its destination. ΔT depends only upon the simulated MANET. It is the sum of the time, called the transmitter-turn-on time (TTOT), for a node's radio to change from sensing mode to transmitting mode, and the worst-case (longest) propagation delay between the sending node and one of the receiving nodes. These two times will be referred to as $T_{ttot}$ and $T_{prop}$.
- Δt depends on two factors: the time the NoC processor needs to execute the instructions that will send all of the messages into the interconnect, and the worst-case latency for the last message sent into the interconnect to reach its destination processor. The latter is a function of the NoC itself and will be referred to as $t_{lat}$. The former is essentially the product of the number of messages to be sent, the number of bytes per message, and the time required by the processor to send one byte into the interconnect.
- One can perform one final optimization that eliminates the dependence of Δt upon the length of messages. For each message a NoC processor would normally send, an additional reservation message is introduced, which contains only the timestamp of the original message (the original message will now be referred to as the full message). When a NoC processor simulates the transmission of a packet, it first sends reservation messages to all of the receiving processors, and then sends the full messages.
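The sending order can be sketched as follows. This is an illustrative model only: the tuple layout, the function names, and the message sizes and per-byte send time used in the comparison are all assumptions, not values from the source.

```python
def build_send_sequence(timestamp, payload, receivers):
    """Reservation messages to every receiver first, then the full messages."""
    reservations = [("reservation", r, timestamp, None) for r in receivers]
    full_messages = [("full", r, timestamp, payload) for r in receivers]
    return reservations + full_messages


def send_time_ns(num_messages, bytes_per_message, ns_per_byte):
    """Time to push messages into the interconnect: count * bytes * time per byte."""
    return num_messages * bytes_per_message * ns_per_byte


sequence = build_send_sequence(timestamp=1200, payload=b"ack", receivers=[1, 2, 3])

# Hypothetical sizes: 64-byte full messages, 4-byte reservations, 2 ns per byte.
full_only_ns = send_time_ns(32, 64, 2)
reservation_ns = send_time_ns(32, 4, 2)
```

Because a reservation carries only a timestamp, the time spent sending before every receiver has been notified no longer grows with the length of the full messages.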
- When an LP receives a reservation message, it knows that it will soon receive a full message, and so it does not execute any events with timestamps later than the reservation message's timestamp. This scheme ensures that all LPs will still execute events in order, while sending only reservation messages, which have minimal length, during the critical path. Δt is then equal to $t_{lat} + t_{send} \times n$, where $t_{send}$ is the time for the processor to send one reservation message into the interconnect and n is the number of messages per simulated transmission. Using this expression, plus the expression for ΔT, Equation 5 becomes
-
$$T_{ttot} + T_{prop} \geq s \times (t_{lat} + t_{send} \times n) \qquad (6)$$
which can be rewritten
-
$$s \leq \frac{T_{ttot} + T_{prop}}{t_{lat} + t_{send} \times n} \qquad (7)$$
- The IEEE 802.11 MAC protocol specifies $T_{ttot}$ as 5 μs. If one takes a worst-case value for $T_{prop}$ of zero (since mobile nodes may be very close to one another) and for n of 32, and if $t_{send}$ is set to be 8 ns and $t_{lat}$ is set to be 100 ns, then the right-hand side of Equation 7 becomes approximately 14.0, meaning that the TBS simulator should be able to simulate MANETs with these parameters fourteen times faster than real time. Moreover, the execution time is independent of the size of the simulated MANET, as long as n remains constant.
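The arithmetic behind this figure can be reproduced directly from Equation 7. The function and parameter names below are hypothetical; the numeric values are the ones quoted in the text, converted to nanoseconds.

```python
def max_time_scale(t_ttot_ns, t_prop_ns, t_lat_ns, t_send_ns, n):
    """Upper bound on the time scale s from Equation 7."""
    return (t_ttot_ns + t_prop_ns) / (t_lat_ns + t_send_ns * n)


# 5 us turn-on time, zero propagation delay, 100 ns latency,
# 8 ns per reservation message, 32 messages per transmission.
s_max = max_time_scale(5000, 0, 100, 8, 32)
print(round(s_max, 2))  # 14.04
```

The denominator is 100 ns + 32 × 8 ns = 356 ns, so the bound is 5000 / 356, or roughly 14. The number of simulated nodes does not appear in the expression, only n, the number of messages per simulated transmission.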
- Although the TBS-based simulation protocol of the subject invention could arguably be implemented on any simulation system, the inventors also devised a multiprocessor architecture that was specifically designed to implement the TBS-based simulation protocol and take full advantage of the protocol's speed capabilities. This system is illustrated in
FIG. 3. A single chip system 100, referred to as a Network on a Chip (NoC), contains an array of processors 102, each of which is dedicated to a corresponding LP that forms part of a simulated network. The processors are interconnected by a web of interconnects 104. An array 106 of interconnected NoCs is also shown, the processors for which are controlled by a host computer system 108.
- The use of the
NoC 100 enhances the performance of the TBS simulator. There is a one-to-one mapping between simulated network nodes and LPs, and a one-to-one mapping between LPs and processors in the hardware executing the simulator. Therefore, a machine executing the TBS simulator should have thousands of processors. Moreover, because the performance of the TBS simulator depends heavily on the latency to pass messages between LPs, a machine is needed that allows processors to communicate efficiently. Currently existing parallel platforms that a person would consider for running a TBS-based MANET simulator therefore include distributed shared-memory (DSM) machines and networks of workstations (NoWs). Unfortunately, the largest DSM machines contain only 1024 processors, making them incapable of running TBS-based simulations of MANETs containing more than 1024 nodes. NoWs are more promising: NoWs containing more than 1024 nodes certainly exist, and the message-passing latency in a NoW can be as low as 6.3 μs. Unfortunately, however, a NoW is still a bad fit. TBS-based simulation of MANETs is an application with a large ratio of latency-critical communication to computation. Running such an application on a NoW containing thousands of very powerful computers is a poor use of resources. The simulation's time scale, and therefore the performance of the simulator, will be limited by the latency to pass messages between workstations. Instead, a platform with less powerful computers but a lower message-passing latency would be preferred.
- In addition to having thousands of moderately-powered processors that can communicate quickly, a machine that executes the subject TBS simulator should have processors that can efficiently manage event queues. To do this, the processors need a low-overhead mechanism for determining when an event has become executable. They must be able to quickly compare the timestamps of scheduled events with the scaled version of elapsed time.
- With these requirements in mind, the NoC single chip multiprocessor was created. It is estimated that each chip will contain approximately 100 processors. To enable simulations of MANETs containing thousands of nodes, the NoC is designed such that one can gluelessly combine multiple chips to create a massively-parallel machine. The NoC processors are designed specifically to execute LPs in the TBS simulator, thus they lack virtual memory or any other hardware operating-system support. Each processor has its own private 8KB memory, and communicates with the other processors only by passing messages via a highly-pipelined interconnect.
- A simulation run on the
NoC 100 is managed by the off-chip workstation host 108. The host 108 can send and receive messages to and from the NoC processors 102; it sends the processors the code they will execute during a simulation and it collects statistics when a simulation is complete.
- The
NoC processors 102 lack multiply/divide and floating point units, meaning that they would require a great deal of time to perform complicated radio calculations. Therefore, instead of the NoC processors performing radio calculations during a simulation, the host performs the calculations before the simulation begins; it can do so for any simulation in which the movement patterns of the nodes are known before the simulation begins. The host 108 incorporates these calculations into the code that it sends to the processors 102 (the code for the radio layers has a section that is different for each processor; it tells a given processor how to simulate incoming transmissions based on the sources of the transmissions and the times at which they occur).
- Because the
processors 102 execute LPs, which in turn simulate network nodes, a processor that simulates a given network node will need to exchange messages only with processors simulating network nodes within its node's transmission range. Therefore, if the simulation user maps LPs to processors in an intelligent way (i.e., processors that are close together on a chip simulate nodes that are close together in the simulated terrain), no messages should travel more than a few hops through the interconnect 104. In simulations of networks with highly-mobile nodes, a mapping that is efficient at the beginning of a simulation may become quite inefficient later. Such simulations can be paused, re-mapped, and started again by the host. In simulations in which the mobility patterns of the nodes are known before the simulation begins, the host 108 can precompute the remappings. A researcher using the NoC 100 to simulate a network with high node mobility may wish to make remapping easier by using only a fraction of the NoC processors on each chip.
FIG. 4 illustrates the details of a preferred embodiment of one of the chip processors 102. The processor 102 consists of three components: (1) a timer coprocessor 118, (2) a message coprocessor 120, which provides an interface to the interconnect web 104, and (3) a processor core 122, which consists of an event queue 124, instruction fetch 126, decode 128, execution units 130, busses 132, register file 134, message FIFOs 136, and memories.
- The most important of the processor elements in this application for implementing a TBS-based simulation protocol is the
timer coprocessor 118, which the processor uses to schedule new events, and which alerts the processor when a previously scheduled event becomes executable. To determine when an event is executable, the timer coprocessors need to keep track of the current elapsed time; for this purpose, they each contain a time tracking device called an incrementer 142. All incrementers in the multiprocessor system start from zero on reset and change their values at the same rate. Thus, every processor in the system has the same notion of the current elapsed time.
- Because all of the
incrementers 142 advance independently (though at the same rate), there is no centralized control governing all of the processors in the NoC. Hence, adding more nodes to the simulated MANET, and therefore more processors to the TBS simulator, does not add any extra hardware synchronization costs. This ability of the hardware to scale well makes possible the fast simulation of large-scale MANETs. - The
incrementer 142 essentially tracks real time. The rest of the timer coprocessor 118 must be fast enough to keep up with the rate at which the incrementer 142 changes. Every timestamp register must be compared against the incrementer 142 every time the incrementer's value changes. The software running on the processor 102 can adjust the time scale of the simulation by taking different samples of the incrementer 142. Shifting the sample one bit to the left corresponds to doubling the time scale. To schedule an event, the software uses an instruction to set a timestamp register to the timestamp of the event it wishes to schedule. The ISA also includes instructions that turn off timestamp registers (cancel events).
- Although the
timer coprocessor 118 must compare every timestamp register against the value of the incrementer 142 every time the incrementer value changes, it does not have to finish the comparisons before the incrementer value changes again. The comparison process can be pipelined such that one set of comparisons (the comparisons between every timestamp register and one value of the incrementer) completes every cycle. The entire timer coprocessor 118 is highly pipelined so that it can achieve throughput high enough to enable one comparison to complete every time the incrementer changes its value.
- In summary, the TBS protocol of the present invention can simulate very large sensor networks quickly through use of an event execution time tracking technique that is independent from other processes in the network to be simulated. Each processor in the simulation keeps track of time independently of other processors in the simulation. The arrangement enables a network simulation to be carried out at speeds faster than real time.
- Although the invention has been disclosed in terms of a preferred embodiment, it will be understood that numerous variations and modifications could be made thereto without departing from the scope of the invention as defined in the following claims.
Claims (19)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/123,233 US7564809B1 (en) | 2004-05-06 | 2005-05-06 | Event-synchronization protocol for parallel simulation of large-scale wireless networks |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US56825904P | 2004-05-06 | 2004-05-06 | |
US11/123,233 US7564809B1 (en) | 2004-05-06 | 2005-05-06 | Event-synchronization protocol for parallel simulation of large-scale wireless networks |
Publications (2)
Publication Number | Publication Date |
---|---|
US7564809B1 US7564809B1 (en) | 2009-07-21 |
US20090187395A1 true US20090187395A1 (en) | 2009-07-23 |
Family
ID=40872650
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/123,233 Active 2026-12-02 US7564809B1 (en) | 2004-05-06 | 2005-05-06 | Event-synchronization protocol for parallel simulation of large-scale wireless networks |
Country Status (1)
Country | Link |
---|---|
US (1) | US7564809B1 (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: NAVY, SECRETARY OF THE, UNITED STATES OF AMERICA, Free format text: CONFIRMATORY LICENSE;ASSIGNOR:CORNELL UNIVERSITY;REEL/FRAME:017778/0238 Effective date: 20060224 |
|
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
|
AS | Assignment |
Owner name: CORNELL RESEARCH FOUNDATION, INC., NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MANOHAR, RAJIT;KELLY, CLINT;REEL/FRAME:022412/0266;SIGNING DATES FROM 20050919 TO 20050922 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
FPAY | Fee payment |
Year of fee payment: 4 |
|
FPAY | Fee payment |
Year of fee payment: 8 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2553); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 12 |