US20050071533A1 - System resource router - Google Patents

System resource router

Info

Publication number
US20050071533A1
Authority
US
United States
Prior art keywords
channel
internal
bus
data
external
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/899,988
Inventor
Lyle Adams
Billy Mills
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Palmchip Corp
Original Assignee
Palmchip Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=34382135&utm_source=google_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=US20050071533(A1) ("Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.)
Priority claimed from US09/565,282 (patent US6601126B1)
Application filed by Palmchip Corp
Priority to US10/899,988 (US20050071533A1)
Assigned to PALMCHIP CORPORATION reassignment PALMCHIP CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ADAMS, LYLE E., MILLS, BILLY D.
Publication of US20050071533A1
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7828 Architectures of general purpose stored program computers comprising a single central processing unit without memory
    • G06F15/7832 Architectures of general purpose stored program computers comprising a single central processing unit without memory on one IC chip (single chip microprocessors)
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/40 Bus structure
    • G06F13/4004 Coupling between buses
    • G06F13/4022 Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network

Definitions

  • the present invention relates generally to electronic circuits, and more specifically to functional modules on a single semiconductor integrated circuit.
  • Palmchip Corporation markets its COREFRAME™ products to be low-power, high-performance, processor independent, flexible on-chip interconnect architectures for integration of system-on-chip (SOC) blocks in a synthesis-friendly environment.
  • COREFRAME designs combine different processors, systems with resource routers and dynamic bandwidth allocation, systems with multiple clock domains, and systems with a non-memory shared resource and without a processor.
  • COREFRAME can interface between multiple bus standards, as well as fast and slow non-DMA devices on a single channel.
  • Von Neumann and Harvard are two of the most common computer types in use today.
  • a Von Neumann architecture processor uses the same external buses for instruction fetches and data operations in a shared arrangement.
  • a Harvard architecture processor uses separate buses for instruction fetches and data operations.
  • Most digital signal processor (DSP) designs today use the Harvard architecture because the performance benefits far outweigh the cost of adding extra wires and pins.
  • a simple bus architecture is shown below. This is basically a modified external bus standard using unidirectional busses for on-chip data transfers, e.g., separate read data and write data busses.
  • the peripherals may be bridged directly off the CPU, or the peripheral bus may be removed entirely and the slower peripheral targets may be mixed with the fast targets on the high-speed bus.
  • bus protocol and arbitration to try and optimize the throughput but they all basically work the same.
  • An initiator requests the bus for a transfer, the bus grants the request and the data is transferred.
  • a disadvantage is that all data passes over the same wires and there is no parallelism. With this architecture the bandwidth is determined by the width of the data path and the clock frequency. In order to increase the bandwidth, the width of the data path and/or clock frequency must be increased. But even these increases have limitations, because in a typical system, most transfers are not a full data path wide.
  • a variation of the simple bus architecture uses multiple high-speed buses with one or more bridges between them, thereby allowing some parallelism. Transactions on bus-A can proceed at the same time bus-B is busy. But this variation still has the same routing problem, because each initiator can still talk to each target across the bridge. This means the bus still routes across the chip and will have problems at high clock frequencies. In addition, when an initiator on bus-A talks to a target on bus-B, both bus-A and bus-B are tied up. The bridge also adds two levels of logic to the data, address, and control signals, making it the limiting factor for performance.
  • the point-to-point architecture is an architecture that can only be used to its fullest in on-chip designs due to package-pin limitations.
  • In this architecture, multiple initiators connect directly to each target through a switching network. Each initiator must arbitrate for the target, but once connected the transfers occur at full bandwidth. The number of target devices determines the maximum bandwidth.
  • This architecture removes many of the disadvantages of the simple bus architecture in that the unnecessary connections are eliminated and portions of the switching network are routed locally. Transactions can operate in parallel.
  • a disadvantage of the point-to-point architecture is the number of accessible target devices is limited. As more and more targets are added, the switching network becomes more difficult to implement. Changes to the switching network in the middle of the design become practically impossible.
  • the present invention is a system resource router within a system-on-chip (SOC) device that includes at least two channel sockets that provide for protocol-based connections to data-transfer initiators and at least first and second internal M-channel buses that alternately connect to one or more of the channel sockets using transfer switches.
  • SOC system-on-chip
  • Each internal M-channel bus connects to an external M-channel bus populated by one or more transaction targets using an M-channel controller.
  • the channel sockets, at least some of the data-transfer initiators, the internal M-channel buses, the external M-channel buses, and at least some of the transaction targets are all contained upon a single integrated circuit (IC) SOC device.
  • IC integrated circuit
  • one or more of the internal M-channel buses are synchronous buses, and the present invention includes synching FIFOs that synchronize data transfers over these busses.
  • data-transfer initiators, one or more of the internal M-channel busses, and transaction targets can all be running at different clock frequencies.
  • Some embodiments may provide for an internal M-channel bus that is an embedded memory channel that provides a point-to-point connection to internal or external memory.
  • the channel sockets and internal interfaces within the present invention are capable of optimizing bandwidth for individual transactions, i.e., converting a transaction from a first group of one or more bursts having a first bandwidth to a second group of one or more bursts having a second bandwidth.
  • the present invention supports split read transactions, wherein read transfers are returned across said internal M-channel buses in a different order than originally requested, and full duplex transactions, where one transaction is a read burst from one data-transfer initiator to a target and the other transaction is a simultaneous write burst from a second data-transfer initiator to a second target.
  • An advantage of the present invention is that a system resource router is provided that divides a high-speed bus into M-channel sub-busses and uses switches at initiator sockets to connect to the different M-channels.
  • Another advantage of the present invention is that dividing a single, high-speed bus into multiple M-channel sub-buses enables local routing of each M-channel sub-bus and eliminates unnecessary connections.
  • Another advantage of the present invention is that a system resource router is provided that allows different initiator-to-target or memory transactions to occur simultaneously across different M-channels.
  • a further advantage of the present invention is that a system resource router is provided that increases the bandwidth of the system without resorting to larger bus widths or higher clock frequencies.
  • FIG. 1 is a functional block diagram of a computer system embodiment of the present invention for system-on-chip with system-resource routing;
  • FIG. 2 is a functional block diagram of a system-resource router embodiment of the present invention for three initiators and targets on two channels;
  • FIG. 3 is a functional block diagram of a system-resource router embodiment of the present invention for two initiators and two channels;
  • FIG. 4 is a functional block diagram of a computer-aided design system for system-resource router designs.
  • the present invention is a system resource router for SOC applications.
  • This disclosure describes numerous specific details that include specific hardware and data structures, circuits, architectures, and logic devices and functions in order to provide a thorough understanding of the present invention.
  • One skilled in the art will appreciate that one may practice the present invention without these specific details.
  • FIG. 1 shows a computer system embodiment of the present invention, and is referred to herein by the general reference numeral 100 .
  • the system 100 comprises a Harvard-architecture processor subsystem 102 connected through a system resource router 104 to a variety of resources on several buses.
  • the system resource router 104 interfaces to a mix of bus initiators 106 , 108 , 110 , and 112 through channel sockets. It further interfaces to M-channel buses, e.g., a set of three M-channel buses 114 , 116 , and 118 .
  • the M-channel bus 114 is shown with a typical complement of resources, e.g., a PalmBus target 120 , an embedded static random access memory (SRAM) 122 , an MBUS target 124 , a VC interface (VCI) target 126 , a PVCI target 128 , and an internal read-only memory (ROM) 130 .
  • the M-channel bus 116 is shown with another typical complement of resources, e.g., an external flash memory 132 , an internal SRAM 134 , and an internal ROM 136 .
  • the third M-channel bus 118 is shown with an external double data rate (DDR) synchronous dynamic random access memory (SDRAM) 138, an internal SRAM 140, and an internal ROM 142.
  • DDR double data rate
  • SDRAM synchronous dynamic random access memory
  • the socket interfaces can incorporate industry standards, e.g., PalmBus, VCI, or PVCI.
  • the target devices could be PCI slave interfaces that allow a bridge from an initiator peripheral PCI master to connect two PCI busses.
  • the initiator sockets are preferably an MBUS initiator, an AHB master, or a VCI initiator.
  • the on-chip RAM could be used as a shared resource by the CPU or initiator devices. With the correct on-chip control, any of these sockets could be mixed or matched. Such variations are preferably implemented with conventional devices and methods.
  • the system resource router 104 allows multiple initiator devices, e.g., master or DMA devices and processors, to communicate through separate M-channel connections simultaneously with multiple target devices, e.g., slave devices and memory. Initiator and target devices connect to the M-channels through sockets 144, 146, 148, 150, 152, and 154. Each such socket handles all protocol, clock domain, address remap, and bandwidth matching issues.
  • Internal buses 162 , 164 , and 166 interface to the M-channel buses 114 , 116 , and 118 via M-channel controllers 156 , 158 , and 160 .
  • a group of associated bus transfer switches 168 , 170 , and 172 variously connect the sockets to the M-channel controllers 156 , 158 , and 160 .
  • Channel sockets 146 , 150 , and 154 are exemplary of those that are connected directly to a dedicated M-channel controller and bus.
  • Bus 166 is a synchronous bus, and therefore uses synchronizers 174, 176, and 178 to interface with devices whose clocks run asynchronously to the bus 166 clock.
  • Two-pole transfer switch 168 allows channel socket- 148 to connect to either bus 162 or synchronously to bus 166 .
  • three-pole switch 170 allows channel socket- 144 to connect to either bus 162 , bus 164 , or synchronously to bus 166 .
  • Two-pole transfer switch 172 allows channel socket- 152 to connect to either bus 162 or synchronously to bus 166 . These switches are controlled such that available buses accessible to each switch can provide a master with a data transfer path with an acceptable slave. In other instances, a particular resource on an M-channel bus 114 , 116 , or 118 is connected to the initiator by setting the switches appropriately.
  • System resource router 104 can function like a memory controller that connects external memory and routes on-chip memory. In addition it can connect target devices and other on-chip resources to initiator devices, CPUs, and DSPs. Initiator devices and CPUs (masters) supply a request and an address to a system resource router controller. Such address includes an M-channel identifier, a target device address, a memory bank address, and/or the memory-cell location address.
  • the initiator or CPU waits to be granted access before transferring data. Access is granted when the requested M-channel device is free. Another initiator peripheral or CPU can simultaneously transfer data over a different M-channel while a data transfer is in-progress on the first channel.
  • the Harvard architecture instruction cache (I-cache) in subsystem 102 in FIG. 1 can fetch instructions from the internal ROM 136 while the CPU data cache (D-cache) is simultaneously accessing data from the SRAM 122.
  • the system resource router 104 is preferably used in PalmChip (San Jose, Calif.) COREFRAME implementations for higher-bandwidth applications.
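  • The grant rule described above can be sketched in a few lines. This is a minimal model, assuming requests arrive as (initiator, channel) pairs already sorted by priority; the function name `grant_requests` and the channel labels are hypothetical and are not taken from the disclosure.

```python
def grant_requests(requests, busy_channels=()):
    """Grant each request whose M-channel is free.

    requests: list of (initiator, channel) pairs, highest priority first.
    busy_channels: channels already carrying a transfer.
    Returns a dict mapping each granted initiator to its channel.
    """
    busy = set(busy_channels)
    granted = {}
    for initiator, channel in requests:
        if channel not in busy:
            granted[initiator] = channel
            busy.add(channel)  # the channel stays busy for this transfer
    return granted


# The I-cache and D-cache target different channels, so both are granted.
# The DMA request collides with the I-cache on channel-A and has to wait.
grants = grant_requests([("i-cache", "channel-A"),
                         ("d-cache", "channel-B"),
                         ("dma", "channel-A")])
print(grants)
```

  • In this sketch a waiting initiator simply does not appear in the result; a real router would hold its request pending until the channel is released.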
  • a COREFRAME system with a 32-bit PalmBus and a 32-bit external SDRAM running at 100 MHz provides 600 MB/s of available bandwidth on-chip, i.e., 200 MB/s on the PalmBus and 400 MB/s on the M-channel.
  • Adding a separate M-channel for a 32-bit external flash provides 1.0 GB/s of total on-chip bandwidth at 100 MHz, i.e., 200 MB/s on the PalmBus, plus 400 MB/s on each of the channels.
  • Adding a 128-bit internal dual-port RAM channel and changing from a SDR SDRAM to a DDR SDRAM 64-bit DIMM channel yields 3.8 GB/s of bandwidth at 100 MHz, i.e., 200 MB/s on the PalmBus, 400 MB/s on the flash-memory port, plus 1.6 GB/s on each of the other M-channels.
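  • The three bandwidth totals above can be checked with simple arithmetic. The sketch below assumes that peak bandwidth equals the bus width in bytes times the clock rate times the number of transfers per clock, and it takes the 200 MB/s PalmBus figure as stated in the text rather than deriving it. All names are illustrative.

```python
def peak_mb_per_s(width_bits, clock_mhz, transfers_per_clock=1):
    """Peak bandwidth in MB/s for a bus of the given width and clock."""
    return width_bits // 8 * clock_mhz * transfers_per_clock


PALMBUS = 200  # MB/s, taken as stated in the text

sdr_sdram = peak_mb_per_s(32, 100)        # 32-bit SDR SDRAM channel
flash = peak_mb_per_s(32, 100)            # 32-bit external flash channel
dual_port_ram = peak_mb_per_s(128, 100)   # 128-bit internal RAM channel
ddr_dimm = peak_mb_per_s(64, 100, transfers_per_clock=2)  # 64-bit DDR channel

config_1 = PALMBUS + sdr_sdram                         # PalmBus + SDRAM
config_2 = PALMBUS + sdr_sdram + flash                 # adds the flash channel
config_3 = PALMBUS + flash + dual_port_ram + ddr_dimm  # adds RAM, swaps in DDR

print(config_1, config_2, config_3)  # 600 1000 3800
```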
  • GUI graphical user interface
  • Practical system resource routers 104 that are preferably implemented with current semiconductor technology allow up to eight separate M-channels, with as many as eight targets connected to each M-channel. This approach allows up to sixty-four target devices to be connected.
  • the configurable design and the easy to use GUI handle the implementation details.
  • the system resource router M-channel can be configured like a simple point-to-point architecture by connecting only one target to the channel. This approach is preferred when there is only external memory, and no internal memory, because it maximizes data throughput.
  • the system resource router 104 can therefore be used in COREFRAME and other system-on-chip implementations to reduce shared memory and initiator/target transfer bottlenecks.
  • With a system resource router, a CPU can execute from flash-memory while simultaneously processing data from an initiator peripheral in the SDRAM.
  • the DSP can at the same time process data from the dual-port RAM while another peripheral is transferring data to or from the RAM.
  • no changes to any blocks except the resource router are needed for the processors and initiator peripherals to take best advantage of the available bandwidth.
  • When more than one initiator or CPU wants to transfer data at the same time across the M-channel, the devices must arbitrate for the channel. The device with the highest priority will ordinarily be granted the channel. Slow and fast devices can be mixed on a single channel by using split transactions.
  • a first method of configuring the system resource router uses the chip-assembly program. Such program preferably allows the selection of the number of banks, default type of memory for each bank, and size (width and depth) of memory for each memory bank.
  • the user can select the number of M-channels to place, whether synching FIFOs or synch cells are needed to match the initiator operating frequency to the system resource router frequency, the types of bus interface to the command port needed, the types of bus interface for each initiator, the types of interface for each target, and the type of bus arbitration appropriate for each M-channel.
  • a second method of configuring the system resource router includes programming a set of configuration registers through a command port. Such allows changes to be made to memory size, memory types, and memory timing. These changes are preferably made after the device has been synthesized and delivered by simply modifying the intellectual property (IP) software.
  • IP intellectual property
  • System resource router embodiments of the present invention include channel switches, M-channels, and channel sockets.
  • the channel switches handle connections to the different M-channels which actually transfer the data.
  • the sockets do the interfacing chores and make the reuse of IP-products possible.
  • the system resource router uses a socket/channel technology that allows different protocols to be used between the initiator and target device to move the data. Optimized protocols are implemented to move certain types of data, e.g., external memory accesses or initiator-to-target, to keep the initiator/target interfaces simple. The ability to mix protocols is key to avoid having to customize initiator and target interfaces for each instantiation.
  • FIG. 2 illustrates a resource router 200 implemented as a single device 202 with two channels.
  • a set of three initiators represented by DMA devices 204 , 206 , and 208 , can variously be routed, for example, through the two channels to an on-board memory 210 and an off-board memory 212 .
  • a corresponding set of channel decoders 214 , 216 , and 218 detect initiator requests for resources and which channel is needed.
  • a pair of arbiters 220 and 222 resolve conflicts and adjust a switch fabric 224 to connect the particular initiators to their intended resource targets.
  • Channel-1 includes a bank decoder 226 , a controller 228 , and an address and data network 230 .
  • Channel-2 includes an address and data network 232 , a controller 234 , and a bank decoder 236 .
  • FIG. 3 illustrates a system-on-chip (SOC) 300 with a resource router 302 that supports two initiators 304 and 306 in accesses to a target device-A 308 , a target device-B 310 and an off-board memory 312 .
  • Two internal channels are provided, a channel-A and a channel-B.
  • a channel-A arbiter 314 resolves access conflicts to the target device-A 308 and target device-B 310 .
  • An initiator socket 316 interfaces to the initiator-A 304 .
  • a channel-B arbiter 318 resolves access conflicts to the off-board memory 312 .
  • An initiator socket 320 interfaces to the initiator-B 306 .
  • FIG. 4 represents a system resource router design system embodiment of the present invention, referred to herein by the general reference numeral 400 .
  • the design system 400 produces an intellectual property (IP) output in the form of VHDL or Verilog computer files 402 that are dependent on a set of user design choices 404. Such choices are exemplified in Tables I-VI herein.
  • the computer files 402 describe at least two channel sockets that provide for protocol-based connections to external data-transfer initiators, at least two internal M-channel buses, an M-channel controller for connection between an external M-channel bus and a corresponding one of the internal M-channel buses, and a transfer switch for providing alternative connections of at least one of the channel sockets to at least two of the internal M-channel buses.
  • a plurality of processors and other initiators respectively connected to the channel sockets can be routed with the transfer switch to operate in parallel with a plurality of peripherals and memory respectively populating the external M-channel buses.
  • the design system 400 includes a computer-aided design (CAD) platform 406 for providing a user/designer with a means to select and implement a variety of numbers of interconnected ones of the channel sockets, the internal M-channel buses, the M-channel controllers, and the transfer switches.
  • CAD computer-aided design
  • An assembly program 410 automatically chooses how many channel sockets, internal M-channel buses, M-channel controllers, and transfer switches to include from a technology library 412 in a final design based on user input through the GUI.
  • a business model embodiment of the present invention uses the design system 400 to profit from the commercial marketing of intellectual property (IP) hardware description language (HDL) files that are output by the CAD program 406 .
  • HDL hardware description language
  • Such implements the channel sockets, the internal M-channel buses, the M-channel controllers, and the transfer switches as high-level synthesis (HLS) computer files for later simulation, placement, and routing in a single-chip system-on-chip implementation.
  • HLS high-level synthesis
  • the channel switches typically decode a portion of the addresses supplied by initiators to determine the channel to which the transaction is directed. The address is decoded and the request is directed to the correct channel that will be handling the transaction. The switch will not move to another channel until the transfer of data is complete. If a request is supplied from an initiator and the address supplied does not decode to a channel, an error will be generated and the system resource router will initiate an interrupt to the CPU. The error is recorded in an initiator socket error register that is preferably accessed through a control port to tell the CPU which initiator had the error. No request can be supplied for that initiator until the error register is cleared through the control port.
  • The first type of M-channel is an external memory and embedded memory channel for point-to-point connections with only a single target, e.g., internal or external memory.
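  • The decode-and-error behaviour described above can be modelled with a small class. This is a sketch under stated assumptions: the address map layout, the class name `ChannelSwitch`, and the single error flag standing in for the initiator socket error register are all hypothetical, and the interrupt is represented only by a counter.

```python
class ChannelSwitch:
    """Toy model of the channel switch serving one initiator socket."""

    def __init__(self, address_map):
        # address_map: list of (base, size, channel_name) tuples (assumed layout)
        self.address_map = address_map
        self.error_latched = False  # stands in for the socket error register
        self.interrupts = 0         # number of interrupts signalled to the CPU

    def decode(self, address):
        """Return the channel for an address, or None on error or while blocked."""
        if self.error_latched:
            return None  # no new requests until the error register is cleared
        for base, size, channel in self.address_map:
            if base <= address < base + size:
                return channel
        self.error_latched = True  # address did not decode to any channel
        self.interrupts += 1
        return None

    def clear_error(self):
        """Model of clearing the error register through the control port."""
        self.error_latched = False


switch = ChannelSwitch([(0x0000_0000, 0x1000_0000, "channel-1"),
                        (0x1000_0000, 0x1000_0000, "channel-2")])
print(switch.decode(0x0000_0800))  # channel-1
print(switch.decode(0x4000_0000))  # None, and the error is latched
```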
  • the second type is a target and embedded memory channel that uses a configurable protocol. The user is provided with the ability to customize the protocol for particular applications.
  • Any external memory and embedded memory M-channels are preferably optimized for data transfer between external and internal memory.
  • One way to do this is to configure the channel with system 400 as point-to-point with memory as the only target.
  • Each channel can have several different memory-mapped banks of memory, and in any combination of external or embedded.
  • External memories preferably have programmable timing to allow alternative memory devices to be used in actual production.
  • Each memory bank is controlled by a memory controller for asynchronous, DRAM, or SDRAM memories.
  • a system resource router can be configured to have any reasonable combination of controllers and M-channels. Any memory bank can be programmed to use any memory controller present in that M-channel, and any memory bank can be configured to use any of those controllers as its default memory controller. If a memory type is not used on any of the memory banks in that M-channel, then that controller is not placed in the design by CAD platform 406.
  • Each bank of memory is preferably programmed in system 400 as Asynchronous, DRAM, or SDRAM. Memory controllers connected to separate M-channels run independently allowing different memories connected to different M-channels to access external memory through separate memory pins.
  • If an asynchronous bank is programmed, then it will support flash-memory, compact flash-memory, internal or external SRAM, SSRAM, SFlash, and internal or external ROM if all the control pins are brought out as pins on the part. If a bank is programmed as SDRAM, it will support PC100-compliant SDRAM and DDR SDRAM. If a bank is programmed as EDO DRAM, it will support standard EDO and Fast Page Mode EDO DRAM.
  • Target device and embedded memory channels use protocols with special extensions to optimize initiator-to-target transfers.
  • Such channel type is preferably configurable so the user can trade-off performance for gate-count, or remove extensions not needed in particular applications.
  • the channel can have several different memory-mapped target devices or embedded memories. These target devices and embedded memories are preferably mixed in any combination.
  • Full duplex uses a transaction posting system that allows an initiator-A to do burst reads to a target-A at the same time an initiator-B is doing burst writes to an embedded memory-B. This system can double the bandwidth of the channel, but only if bursting is being used, and only if different initiator-to-target reads and writes are happening at the same time. If two initiators are doing a read, this system will not help. And if the two initiators are trying to access the same target, this option will not help in any combination of reads and writes.
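  • The conditions under which full duplex helps can be written as a single predicate. The sketch below assumes each burst is described by its initiator, its target, and its direction; the `Burst` tuple and the function name are hypothetical.

```python
from collections import namedtuple

# kind is either "read" or "write"
Burst = namedtuple("Burst", "initiator target kind")


def can_overlap(a, b):
    """True when two bursts could share a full-duplex channel at the same time."""
    opposite_directions = {a.kind, b.kind} == {"read", "write"}
    different_initiators = a.initiator != b.initiator
    different_targets = a.target != b.target
    return opposite_directions and different_initiators and different_targets


read_a = Burst("initiator-A", "target-A", "read")
write_b = Burst("initiator-B", "memory-B", "write")
read_b = Burst("initiator-B", "memory-B", "read")
write_same_target = Burst("initiator-B", "target-A", "write")

print(can_overlap(read_a, write_b))            # True
print(can_overlap(read_a, read_b))             # False: both are reads
print(can_overlap(read_a, write_same_target))  # False: same target
```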
  • Split transactions allow reads to return across channels in a different order than they were requested. This allows an initiator that is reading from a fast target to jump in and read data while an initiator that is reading from a slow target is still waiting for data. If there is a conflict, e.g., two targets try to return data at the same time, the transaction that was posted first will have priority. This helps when mixing fast and slow targets on the same channel. If all the target devices are fast, this will not improve performance much. It also will not improve writes, nor will it help when two initiators are trying to read from the same target.
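  • The return ordering for split reads can be illustrated with a short function. This is a simplified sketch: it assumes all reads are posted at the same moment, that each read goes to a different target, and that a target's latency is known in cycles. The function name and the latency values are hypothetical.

```python
def return_order(posted_reads):
    """Order in which read data comes back across the channel.

    posted_reads: list of (initiator, target_latency_cycles) in posting order.
    Data that is ready earlier returns first; when two targets are ready on
    the same cycle, the transaction that was posted first has priority.
    """
    indexed = [(latency, position, initiator)
               for position, (initiator, latency) in enumerate(posted_reads)]
    return [initiator for _, _, initiator in sorted(indexed)]


# The read from the slow target is posted first, but the fast one returns first.
print(return_order([("reads-slow-target", 10), ("reads-fast-target", 2)]))
# A tie on the ready cycle is resolved in posting order.
print(return_order([("first-posted", 5), ("second-posted", 5)]))
```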
  • Each M-channel has its own arbitration, and each arbitration is preferably a different type, e.g., round-robin, fixed priority, timed priority, round-robin with one fixed priority, and time-domain slicing schemes.
  • Arbitration between initiator devices is preferably supported for each M-channel.
  • With fixed-priority arbitration, the priority is always fixed, with M-channel connection 1 having the highest priority and the highest-numbered M-channel connection having the lowest. In this priority scheme, it is important which device is attached to which initiator socket.
  • With round-robin arbitration, the initiator that is granted moves to the lowest priority, and all those that had a lower priority than the granted device move up. In this way, the device that uses the bus the least has the highest priority.
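  • The fixed scheme above reduces to picking the lowest-numbered connection that is requesting. A minimal sketch, with a hypothetical function name:

```python
def fixed_priority_grant(requesting):
    """Grant the lowest-numbered requesting M-channel connection.

    requesting: iterable of connection numbers, where 1 is the highest priority.
    Returns the granted connection number, or None when nothing is requesting.
    """
    requesting = set(requesting)
    return min(requesting) if requesting else None


print(fixed_priority_grant({3, 2, 5}))  # 2
print(fixed_priority_grant(set()))      # None
```

  • Because connection 1 always wins, a busy device on that connection can starve the others, which is why the choice of socket for each device matters.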
  • This method of arbitration is the fairest method but has the highest gate count.
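  • The rotation described above, in which the granted initiator drops to the lowest priority, can be sketched as follows. The class name and the initiator labels are hypothetical.

```python
class RotatingArbiter:
    """Arbiter in which each winner moves to the back of the priority order."""

    def __init__(self, initiators):
        self.order = list(initiators)  # index 0 holds the highest priority

    def grant(self, requesting):
        """Grant the highest-priority requester, then demote it to the lowest."""
        for name in self.order:
            if name in requesting:
                self.order.remove(name)
                self.order.append(name)  # lower-priority initiators move up
                return name
        return None


arbiter = RotatingArbiter(["A", "B", "C"])
everyone = {"A", "B", "C"}
print([arbiter.grant(everyone) for _ in range(4)])  # ['A', 'B', 'C', 'A']
```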
  • With round-robin with one fixed priority arbitration, the M-channel connection 1 will always have the highest priority. The other initiators arbitrate using the round-robin arbitration method.
  • each initiator connected to the M-channel is allowed to have only a certain number of transactions across the M-channel before another initiator takes over the M-channel.
  • the number of transactions allowed is programmable for each initiator connected to the M-channel.
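  • The per-initiator transaction budget can be illustrated by generating the resulting ownership sequence. This sketch assumes every initiator always has work pending and that turns are taken in a fixed order; both are simplifications, and the names are hypothetical.

```python
def slice_schedule(budgets, total_transactions):
    """Sequence of channel owners, one entry per transaction.

    budgets: dict mapping initiator -> transactions allowed per turn,
             visited in insertion order.
    """
    if not budgets or any(budget <= 0 for budget in budgets.values()):
        raise ValueError("each initiator needs a positive transaction budget")
    schedule = []
    while len(schedule) < total_transactions:
        for initiator, budget in budgets.items():
            schedule.extend([initiator] * budget)
    return schedule[:total_transactions]


print(slice_schedule({"cpu": 2, "dma": 1}, 7))
# ['cpu', 'cpu', 'dma', 'cpu', 'cpu', 'dma', 'cpu']
```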
  • Arbitration can follow any request removal and the completion of a current memory access, or it can follow any end-of-burst.
  • Each arbiter control register preferably has two sets of bits that are used to affect the operation of the arbiter for each M-channel.
  • One set of bits is the arbiter mask register. These bits are preferably used to mask requests from an initiator or to force requests from an initiator. This is useful for test development and for system debug.
  • a second part of the arbiter control registers includes arbiter force-request register bits, which are used to force a bus grant from the arbiter to a specific port. This can be useful for testing and system debug.
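  • The effect of the mask and force-request bits can be sketched with bit vectors. The following assumes that bit i of each register belongs to initiator port i, that a forced grant overrides the mask, and that the lowest-numbered pending port wins; none of these details are specified in the text, and the function name is hypothetical.

```python
def grant_port(raw_requests, mask_bits, force_bits):
    """Return the port index to grant, or None when nothing is pending.

    raw_requests: bit vector of requests coming from the initiators.
    mask_bits:    a set bit hides the request from that port.
    force_bits:   a set bit forces the grant to that port (assumed to
                  override both the raw requests and the mask).
    """
    pending = force_bits if force_bits else raw_requests & ~mask_bits
    if pending == 0:
        return None
    return (pending & -pending).bit_length() - 1  # index of the lowest set bit


print(grant_port(0b0110, 0b0010, 0b0000))  # 2: port 1 is masked out
print(grant_port(0b0110, 0b0000, 0b1000))  # 3: grant forced to port 3
print(grant_port(0b0010, 0b0010, 0b0000))  # None: the only request is masked
```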
  • the arbiter state registers allow the user to tell which initiator is granted by reading the register through the control port. This is typically useful for system debug.
  • a watchdog timer is preferably provided as an option for the arbitration of each M-channel.
  • Typical watchdog timers are 16-bit units that count the number of clock cycles since a device has been granted a bus request. The timer resets each time a new initiator socket or multi-master bus is granted access.
  • the watchdog timer control register allows the user to control what happens at timeout for each initiator socket. One option is nothing happens. Another is that the watchdog timer interrupt register will be set and the watchdog timer interrupt pin will go high. The watchdog timer interrupt register tells which initiator socket timed out and is cleared upon reading the register. Once the watchdog timer interrupt register is cleared, the watchdog timer interrupt pin will go low.
  • the third option for what happens on timeout is that the watchdog timer interrupt register is set, the watchdog timer interrupt pin goes high, and at the end of the next memory cycle the grant will be removed from the initiator.
  • Each initiator preferably has its own watchdog timer and timeout value register.
  • This register is typically 8 bits long, and is loaded into the most significant bits of the 16-bit timer when an initiator is granted access, thus allowing the user to set specific timeout values for different ports.
  • Sockets are a critical element in being able to design once, and then reuse the design over and over in a plug-and-play system. Sockets bring together existing IP technology-library components, new third-party IP, and new project-specific IP, even when all are built to different interface standards, and without necessitating extensive redesign. Sockets provide address remapping, FIFO, synching between different clock domains, and bus-width matching, thus allowing systems to be built without having to redesign existing or third-party IP technology-library components.
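  • The watchdog behaviour described above can be sketched as follows. The model assumes that placing the 8-bit value in the top byte of the 16-bit timer gives a timeout threshold of value × 256 cycles and that the timer counts up from the grant; the text does not state the counting direction, so this is an assumption. The action names are hypothetical labels for the three timeout options.

```python
TIMER_BITS = 16  # width of the watchdog counter, per the text
VALUE_BITS = 8   # width of the per-initiator timeout value register


def timeout_cycles(timeout_value):
    """Cycle count represented by the 8-bit value in the timer's top byte."""
    if not 0 <= timeout_value < (1 << VALUE_BITS):
        raise ValueError("timeout value must fit in 8 bits")
    return timeout_value << (TIMER_BITS - VALUE_BITS)


# The three timeout options, as (set interrupt register and pin, remove grant).
TIMEOUT_ACTIONS = {
    "nothing": (False, False),
    "interrupt": (True, False),
    "interrupt_and_remove_grant": (True, True),
}


def check_watchdog(cycles_since_grant, timeout_value, action):
    """Return (interrupt, remove_grant) for the current cycle count."""
    if cycles_since_grant < timeout_cycles(timeout_value):
        return (False, False)
    return TIMEOUT_ACTIONS[action]


print(timeout_cycles(1))                        # 256
print(check_watchdog(300, 1, "interrupt"))      # (True, False)
```

  • Under this assumption the timeout resolution is 256 clock cycles, which is the price of using an 8-bit register to program a 16-bit timer.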
  • Sockets are preferably placed at any of several interfaces in the system resource router. These include the initiator interface, the interface into the M-channel, and the interface into the target. This allows the user to do such things as have a 32-bit MBus initiator running at 66 MHz, move data over a 128-bit M-channel running at 100 MHz, to a 64-bit VCI target device running at 50 MHz, and configure the entire thing inside the system resource router without modifying either the MBus initiator or the VCI target.
  • Both the synching FIFOs and the synch cells allow a portion of the chip running at one frequency to interface to another portion running at a different frequency.
  • the synching FIFO has about three times the throughput of the synch cell, thereby allowing both sides to run at optimal throughput. However, it has a significantly higher gate count than the synch cell.
  • Each initiator, each channel, and each target can have its own synching FIFO system, allowing for as many as 104 different clock domains for each resource router (thirty-two initiator clocks, eight M-channel clocks, and sixty-four target clocks).
  • the synching FIFO consists of one FIFO for data read or write, and a four-transaction deep transaction stack.
  • the data FIFO is preferably from two words deep to 2048 words deep and should be set to the maximum burst length × 4. This setting allows for four maximum-length bursts to be pending.
  • the synching FIFO will work whether interfacing from a fast clock to a slow clock, a slow clock to a fast clock, or two same-frequency but unsynchronized clocks. This allows different parts of the chip to be put in power down mode and still be able to transfer data in the power down mode.
  • the reads and writes into the system resource router and out to the memory are performed in the same order. Consecutive reads and writes are queued into the synching FIFOs, but when switching from a read to a write the synching FIFOs wait until all the reads queued in the FIFOs are complete before queuing the next write operation.
  • the FIFO status register identifies whether the initiator socket still has reads or writes pending on this M-channel.
  • the synch cell will synchronize two clock domains with a minimal gate count. This cell is preferably slow since it must synchronize from clock domain-A to clock domain-B, and then back from clock domain-B to domain-A to complete a transfer.
  • the synching FIFO hides this by stacking multiple transfers at once.
  • the synched FIFOs work much the same way as the synching FIFO with the exception that the clock domains are assumed to be synchronized. This means that the initiator clock and the system resource router clock should be generated off the same master clock and be some multiple of each other. This multiple of the clock is supplied when the system resource router is configured.
  • the advantage of the synched FIFO over the synching FIFO is that it has fewer gates and has a lower latency between a transaction request and the subsequent read or write.
  • the system resource router supports the new VC Interface Standard. This interface is already built into the system resource router and allows VCI compliant devices to be connected without adding a bus wrapper. This includes VCI initiator, VCI target, and PVCI. This eases the integration of VCI compliant devices and allows persons familiar with VCI to connect into the system resource router without becoming familiar with another bus standard.
  • Address remapping is preferably performed at several points in the system resource router, and the remapping is preferably fixed or programmable through the control port, allowing the user a great deal of flexibility in what the initiator memory map looks like.
  • Each socket will allow several different sectors to be remapped.
  • Each sector is preferably either a fixed type or a mapped type.
  • For the fixed sector an address range is selected and the programmed value out of the remap will be fixed for those address bits regardless of the input address.
  • the user specifies the output addresses to which the input addresses are mapped.
  • the address pins compared on the input need not be the same address bits changed on the output address bits.
  • the output address bits may not overlap for a fixed and a mapped sector, but they can overlap for two mapped sectors only if the input address bits compared are the same.
  • the number of address bits compared and the number of address bits changed on the output for a sector is preferably no more than 8-bits and must be consecutive.
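  • The fixed and mapped sector rules above can be illustrated with a small software model. This is a sketch under stated assumptions, not the router's actual logic: the `Sector` fields and the `remap` function are hypothetical names, a sector with no input field is treated as a fixed sector, and fields are limited to eight consecutive bits as the text requires.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_FIELD_BITS = 8  # compared and changed fields are at most 8 consecutive bits


@dataclass(frozen=True)
class Sector:
    out_lsb: int                   # lowest output address bit that is replaced
    out_width: int                 # number of consecutive output bits replaced
    out_value: int                 # programmed value driven onto those bits
    in_lsb: Optional[int] = None   # None means a fixed sector (no comparison)
    in_width: int = 0              # number of consecutive input bits compared
    in_match: int = 0              # value the compared input bits must equal

    def __post_init__(self) -> None:
        if not 1 <= self.out_width <= MAX_FIELD_BITS:
            raise ValueError("output field must be 1 to 8 bits wide")
        if self.in_lsb is not None and not 1 <= self.in_width <= MAX_FIELD_BITS:
            raise ValueError("compared field must be 1 to 8 bits wide")


def field(addr: int, lsb: int, width: int) -> int:
    """Extract `width` consecutive bits of `addr` starting at bit `lsb`."""
    return (addr >> lsb) & ((1 << width) - 1)


def remap(addr: int, sectors: Iterable[Sector]) -> int:
    """Apply each sector to the input address and return the output address."""
    out = addr
    for s in sectors:
        # A mapped sector applies only when the compared input bits match.
        if s.in_lsb is not None and field(addr, s.in_lsb, s.in_width) != s.in_match:
            continue
        mask = ((1 << s.out_width) - 1) << s.out_lsb
        out = (out & ~mask) | ((s.out_value << s.out_lsb) & mask)
    return out
```

  • Note that the compared input bits (here `in_lsb`, `in_width`) and the changed output bits (`out_lsb`, `out_width`) are independent fields, matching the statement that the bits compared need not be the bits changed.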
  • the sockets will perform optional bandwidth matching between interfaces. For example, the socket will convert a 32-bit burst of four into a 64-bit burst of two (2× option) on the other side of the socket. This means that on the 64-bit side, only two clock cycles will be required to complete what was originally a 4-burst transfer. Going the other way, the socket will convert a 128-bit single-cycle access into a burst of four 32-bit transfers (quarter option). This capability allows initiators, channels, and targets to communicate effectively without redesigning interfaces.
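  • The burst conversion described above can be sketched as a repacking function. The function name `repack_burst` is hypothetical, and the lane ordering (the first narrow beat occupies the least significant bits of the wide word) is an assumption; the text does not specify it.

```python
from typing import List


def repack_burst(beats: List[int], src_bits: int, dst_bits: int) -> List[int]:
    """Convert a burst of src_bits-wide beats into dst_bits-wide beats."""
    if src_bits == dst_bits:
        return list(beats)

    if dst_bits > src_bits:
        # Widening: pack several narrow beats into each wide beat.
        ratio, rem = divmod(dst_bits, src_bits)
        if rem or len(beats) % ratio:
            raise ValueError("burst does not fill a whole number of wide beats")
        src_mask = (1 << src_bits) - 1
        out = []
        for i in range(0, len(beats), ratio):
            word = 0
            for lane, beat in enumerate(beats[i:i + ratio]):
                word |= (beat & src_mask) << (lane * src_bits)
            out.append(word)
        return out

    # Narrowing: split each wide beat into several narrow beats.
    ratio, rem = divmod(src_bits, dst_bits)
    if rem:
        raise ValueError("source width must be a multiple of destination width")
    dst_mask = (1 << dst_bits) - 1
    return [(beat >> (lane * dst_bits)) & dst_mask
            for beat in beats
            for lane in range(ratio)]
```

  • In this model a 32-bit burst of four becomes two 64-bit beats, and a single 128-bit beat becomes four 32-bit beats, which corresponds to the 2× and quarter options in the text.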
  • a system resource router with Dynamic Bandwidth Allocation is very similar to the above examples with the exception that internal memory and non-DMA devices or slaves are not assigned to one channel. They are assigned to multiple channels the same as the DMA or master devices. When a DMA or master requests access to an internal memory or non-DMA or slave device both devices are switched to the first available unused channel. This process continues until all the channels are in use. If another transaction is requested the DMA or master device and non-DMA or slave device or memory will be connected to a channel based on either the speed of the ongoing transfers or the priority of the DMA or master on the channel or both. The two transactions will then share the bandwidth of that channel until either one of the transactions is complete or another channel becomes available.
  • When a channel becomes available, one of the DMA or master devices with its non-DMA or slave device or internal memory will switch to the open channel. If a DMA or master device requests a non-DMA or slave device that is already being used by another DMA or master, the requesting DMA or master is switched to the channel with the non-DMA or slave device, and the two DMA or master devices arbitrate for the non-DMA or slave device. In this way the bandwidth used by the SOC is always optimal and maximum bandwidth utilization is guaranteed.
  • An additional method that is preferably used to increase bandwidth when the number of reads and writes to or from the DMA or master devices is equal is to split the channel from a Read/Write Channel to a Read Only Channel and a Write Only Channel. Because the internal channel architecture does not use bidirectional busses (low performance, high power consumption, and difficulties with using ASIC design tools) and there are separate mb_rdata and mb_wdata paths inside the system resource router, splitting the channel requires less overhead than adding a complete new channel.
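  • The dynamic bandwidth allocation policy described above can be modeled roughly as follows. This is a simplified sketch with hypothetical names (`ChannelAllocator`, `request`, `complete`). When all channels are busy it assumes the least loaded channel is shared, whereas the text selects a channel by transfer speed, master priority, or both.

```python
from typing import List, Optional, Tuple

Transaction = Tuple[str, str]  # (master or DMA device, slave device or memory)


class ChannelAllocator:
    def __init__(self, n_channels: int) -> None:
        self.channels: List[List[Transaction]] = [[] for _ in range(n_channels)]

    def _channel_of_slave(self, slave: str) -> Optional[int]:
        for idx, txns in enumerate(self.channels):
            if any(s == slave for _, s in txns):
                return idx
        return None

    def request(self, master: str, slave: str) -> int:
        """Place a transaction and return the index of the channel used."""
        idx = self._channel_of_slave(slave)
        if idx is None:
            free = [i for i, txns in enumerate(self.channels) if not txns]
            if free:
                idx = free[0]  # first available unused channel
            else:
                # Assumed sharing policy: join the least loaded channel.
                idx = min(range(len(self.channels)),
                          key=lambda i: len(self.channels[i]))
        # If the slave was already in use, the masters arbitrate on its channel.
        self.channels[idx].append((master, slave))
        return idx

    def complete(self, master: str, slave: str) -> None:
        for txns in self.channels:
            if (master, slave) in txns:
                txns.remove((master, slave))
                break
        self._rebalance()

    def _rebalance(self) -> None:
        """Move one slave and its masters from a shared channel to each open one."""
        for dst in [i for i, txns in enumerate(self.channels) if not txns]:
            for src, txns in enumerate(self.channels):
                if len({s for _, s in txns}) > 1:
                    moving = txns[-1][1]
                    self.channels[dst] = [t for t in txns if t[1] == moving]
                    self.channels[src] = [t for t in txns if t[1] != moving]
                    break
```

  • Transactions that target the same slave always stay on the same channel in this model, so masters contending for one slave arbitrate locally rather than occupying two channels.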
  • individual channels could be defined as read-only or write-only. This provides additional bandwidth in the required direction, thus optimizing system performance.
  • COREFRAME implementations generally comprise a CPU and shared memory
  • embodiments of the present invention are preferably applied to systems with shared resources, e.g., a PCI interface.
  • a support processor is needed only if the peripheral blocks are programmable. If such are programmed through a sequencer, no processor is needed.
  • System resource router embodiments of the present invention basically combine the simple bus architecture and the point-to-point architecture to exploit the advantages of each and avoid the disadvantages.
  • the system resource router is preferably CAD-configured down to a simple bus architecture implementation.
  • the system resource router is preferably CAD-configured as a point-to-point architecture implementation.
  • the present invention is a system resource router for use on an SOC device that includes at least two channel sockets that provide for protocol-based connections to data-transfer initiators and at least two internal M-channel buses that alternately connect to one or more of the channel sockets using transfer switches.
  • Each internal M-channel bus connects to an external M-channel bus populated by one or more transaction targets using an M-channel controller.
  • the channel sockets, at least some of the data-transfer initiators, the internal M-channel buses, the external M-channel buses, and at least some of the transaction targets are all contained upon a single integrated circuit (IC) SOC device.
  • one or more of the internal M-channel buses are synchronous buses, and the present invention includes synching FIFOs that synchronize data transfers over these buses.
  • data-transfer initiators, one or more of the internal M-channel buses, and transaction targets can all be running at different clock frequencies.
  • Some embodiments may provide for an internal M-channel bus that is an embedded memory channel that provides a point-to-point connection to internal or external memory.
  • the channel sockets and internal interfaces within the present invention are capable of optimizing bandwidth for individual transactions, i.e., converting a transaction from a first group of one or more bursts having a first bandwidth to a second group of one or more bursts having a second bandwidth.
  • the present invention supports split read transactions, wherein read transfers are returned across said internal M-channel buses in a different order than originally requested, and full duplex transactions, where one transaction is a read burst from one data-transfer initiator to a target and the other transaction is a simultaneous write burst from a second data-transfer initiator to a second target.

Abstract

A system resource router for SOC applications is described. Data-transfer initiators coupled to the router via one of a plurality of channel socket connections (144, 146, 148, 150, 152, 154) alternatively couple to internal M-channel buses (162, 164, 166) using transfer switches (168, 170, 172). Each internal M-channel bus connects to an external M-channel bus (114, 116, 118) populated by one or more transaction targets using an M-channel controller (156, 158, 160). The channel sockets, at least some of the data-transfer initiators, the internal M-channel buses, the external M-channel buses, and at least some of the transaction targets are all contained upon a single integrated circuit (IC) SOC device. Split reads and full duplex transactions are supported. Transactions can occur at different clock frequencies and bandwidths.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. Pat. No. 6,769,046, filed 5 Dec. 2000 (5 Dec. 2000), which is also a continuation in part of U.S. Pat. No. 6,601,126, filed 2 May 2000 (2 May 2000). Additionally, the prior U.S. Pat. No. 6,769,046 claims the benefits of the earlier filed U.S. Provisional Application No. 60/182,406 and U.S. Provisional Application No. 60/217,597. All of these documents are incorporated by reference for all purposes into this specification.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates generally to electronic circuits, and more specifically to functional modules on a single semiconductor integrated circuit.
  • 2. Description of the Related Art
  • The semiconductor art has advanced to the point where whole systems are preferably integrated onto a single-chip device. Processing speeds and architectures are such that very wide buses operated at near gigahertz speeds are routinely used to interface processors, peripherals, and memory. Single-chip system-on-chip (SOC) implementations now permit several such buses to be switched between resources. Off chip, such a bus switch architecture would be impractical.
  • Palmchip Corporation (San Jose, Calif.) markets its COREFRAME™ products to be low-power, high-performance, processor independent, flexible on-chip interconnect architectures for integration of system-on-chip (SOC) blocks in a synthesis-friendly environment. COREFRAME designs combine different processors, systems with resource routers and dynamic bandwidth allocation, systems with multiple clock domains, and systems with a non-memory shared resource and without a processor. COREFRAME can interface between multiple bus standards, as well as fast and slow non-DMA devices on a single channel.
  • Von Neumann and Harvard are two of the most common computer types in use today. A Von Neumann architecture processor uses the same external buses for instruction fetches and data operations in a shared arrangement. A Harvard architecture processor uses separate buses for instruction fetches and data operations. Most digital signal processor (DSP) designs today use the Harvard architecture because the performance benefits far outweigh the cost of adding extra wires and pins.
  • An example of a simple bus architecture is shown below. This is basically a modified external bus standard using unidirectional busses for on-chip data transfers, e.g., separate read data and write data busses. There are several variations of this basic theme; for example, the peripherals may be bridged directly off the CPU, or the peripheral bus may be removed entirely and the slower peripheral targets may be mixed with the fast targets on the high-speed bus. There are many variations in bus protocol and arbitration to try to optimize the throughput, but they all basically work the same. An initiator requests the bus for a transfer, the bus grants the request, and the data is transferred. A disadvantage is that all data passes over the same wires and there is no parallelism. With this architecture the bandwidth is determined by the width of the data path and the clock frequency. In order to increase the bandwidth, the width of the data path and/or clock frequency must be increased. But even these increases have limitations, because in a typical system, most transfers are not a full data path wide.
  • When placing and routing this architecture the high-speed bus must run to all the initiators and targets, which usually means that this bus must run all the way across the chip. In order to keep the high-speed bus running at high speed special layout techniques must be used which will kill the time to market advantages of system-on-chip design.
  • A variation of the simple bus architecture uses multiple high-speed buses with one or more bridges between them, thereby allowing some parallelism. Transactions on bus-A can proceed at the same time bus-B is busy. But this variation still has the same routing problem, because each initiator can still talk to each target across the bridge. This means the bus still routes across the chip and will have problems at high clock frequencies. In addition when an initiator on bus-A talks to a target on bus B both bus-A and bus-B are tied up. The bridge also adds two levels of logic to the data, address, and control signals making it the limiting factor for performance.
  • The point-to-point architecture is an architecture that can only be used to its fullest in on-chip designs due to package-pin limitations. In this architecture multiple initiators connect directly to each target through a switching network. Each initiator must arbitrate for the target, but once connected the transfers occur at full bandwidth. The number of target devices determines the maximum bandwidth. This architecture removes many of the disadvantages of the simple bus architecture in that the unnecessary connections are eliminated and portions of the switching network are routed locally. Transactions can operate in parallel. A disadvantage of the point-to-point architecture is the number of accessible target devices is limited. As more and more targets are added, the switching network becomes more difficult to implement. Changes to the switching network in the middle of the design become practically impossible.
  • SUMMARY OF THE INVENTION
  • It is therefore an object of the present invention to provide a system-on-chip interconnection structure and method for efficient integration of a variety of functional circuits.
  • It is a further object of the present invention to provide an on-chip interconnect architecture that standardizes how systems-on-chip are fabricated on silicon semiconductor integrated circuit chips.
  • The present invention is a system resource router within a system-on-chip (SOC) device that includes at least two channel sockets that provide for protocol-based connections to data-transfer initiators and at least first and second internal M-channel buses that alternately connect to one or more of the channel sockets using transfer switches. Each internal M-channel bus connects to an external M-channel bus populated by one or more transaction targets using an M-channel controller. The channel sockets, at least some of the data-transfer initiators, the internal M-channel buses, the external M-channel buses, and at least some of the transaction targets are all contained upon a single integrated circuit (IC) SOC device. In some embodiments, one or more of the internal M-channel buses are synchronous buses, and the present invention includes synching FIFOs that synchronize data transfers over these busses. In some embodiments, data-transfer initiators, one or more of the internal M-channel busses, and transaction targets can all be running at different clock frequencies. Some embodiments may provide for an internal M-channel bus that is an embedded memory channel that provides a point-to-point connection to internal or external memory.
  • The channel sockets and internal interfaces within the present invention are capable of optimizing bandwidth for individual transactions, i.e., converting a transaction from a first group of one or more bursts having a first bandwidth to a second group of one or more bursts having a second bandwidth. Finally, the present invention supports split read transactions, wherein read transfers are returned across said internal M-channel buses in a different order than originally requested, and full duplex transactions, where one transaction is a read burst from one data-transfer initiator to a target and the other transaction is a simultaneous write burst from a second data-transfer initiator to a second target.
  • An advantage of the present invention is that a system resource router is provided that divides a high-speed bus into M-channel sub-busses and uses switches at initiator sockets to connect to the different M-channels.
  • Another advantage of the present invention is that dividing a single high-speed bus into multiple M-channel sub-buses enables local routing of each M-channel sub-bus and eliminates unnecessary connections.
  • Another advantage of the present invention is that a system resource router is provided that allows different initiator-to-target or memory transactions to occur simultaneously across different M-channels.
  • A further advantage of the present invention is that a system resource router is provided that increases the bandwidth of the system without resorting to larger bus widths or higher clock frequencies.
  • These and other objects and advantages of the present invention will no doubt become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiment, which is illustrated in the drawings.
  • DESCRIPTION OF THE DRAWINGS
  • To further aid in understanding the invention, the attached drawings help illustrate specific features of the invention and the following is a brief description of the attached drawings:
  • FIG. 1 is a functional block diagram of computer system embodiment of the present invention for system-on-chip with system-resource routing;
  • FIG. 2 is a functional block diagram of a system-resource router embodiment of the present invention for three initiators and targets on two channels;
  • FIG. 3 is a functional block diagram of a system-resource router embodiment of the present invention for two initiators and two channels; and
  • FIG. 4 is a functional block diagram of a computer-aided design system for system-resource router designs.
  • DETAILED DESCRIPTION OF THE INVENTION
  • The present invention is a system resource router for SOC applications. This disclosure describes numerous specific details that include specific hardware and data structures, circuits, architectures, and logic devices and functions in order to provide a thorough understanding of the present invention. One skilled in the art will appreciate that one may practice the present invention without these specific details.
  • FIG. 1 shows a computer system embodiment of the present invention, and is referred to herein by the general reference numeral 100. The system 100 comprises a Harvard-architecture processor subsystem 102 connected through a system resource router 104 to a variety of resources on several buses. The system resource router 104 interfaces to a mix of bus initiators 106, 108, 110, and 112 through channel sockets. It further interfaces to M-channel buses, e.g., a set of three M-channel buses 114, 116, and 118.
  • The M-channel bus 114 is shown with a typical complement of resources, e.g., a PalmBus target 120, an embedded static random access memory (SRAM) 122, an MBUS target 124, a VC interface (VCI) target 126, a PVCI target 128, and an internal read-only memory (ROM) 130. The M-channel bus 116 is shown with another typical complement of resources, e.g., an external flash memory 132, an internal SRAM 134, and an internal ROM 136. The third M-channel bus 118 is shown with an external double data rate (DDR) synchronous dynamic random access memory (SDRAM) 138, an internal SRAM 140, and an internal ROM 142. The way the M-channel buses are populated in FIG. 1 is merely for discussion here; such examples help illustrate the many ways the M-channel buses could be populated and how they would function in embodiments of the present invention.
  • The socket interfaces can incorporate industry standards, e.g., PalmBus, VCI, or PVCI. The target devices could be PCI slave interfaces that allow a bridge from an initiator peripheral PCI master to connect two PCI busses. The initiator sockets are preferably an MBUS initiator, an AHB master, or a VCI initiator. The on-chip RAM could be used as a shared resource by the CPU or initiator devices. With the correct on-chip control, any of these sockets could be mixed or matched. Such variations are preferably implemented with conventional devices and methods.
  • The system resource router 104 allows multiple initiator devices, e.g., master or DMA devices and processors, to communicate through separate M-channel connections simultaneously with multiple target devices, e.g., slave devices and memory. Initiator and target devices connect to the M-channels through sockets 144, 146, 148, 150, 152, and 154. Each such socket handles all protocol, clock domain, address remap, and bandwidth matching issues. Internal buses 162, 164, and 166 interface to the M-channel buses 114, 116, and 118 via M-channel controllers 156, 158, and 160. A group of associated bus transfer switches 168, 170, and 172 variously connect the sockets to the M-channel controllers 156, 158, and 160. Channel sockets 146, 150, and 154 are exemplary of those that are connected directly to a dedicated M-channel controller and bus. Bus 166 is a synchronous bus, and therefore uses synchronizers 174, 176, and 178 to interface with synchronous devices running asynchronously from the bus 166 clock.
  • Two-pole transfer switch 168 allows channel socket-148 to connect to either bus 162 or synchronously to bus 166. Similarly, three-pole switch 170 allows channel socket-144 to connect to either bus 162, bus 164, or synchronously to bus 166. Two-pole transfer switch 172 allows channel socket-152 to connect to either bus 162 or synchronously to bus 166. These switches are controlled such that available buses accessible to each switch can provide a master with a data transfer path with an acceptable slave. In other instances, a particular resource on an M-channel bus 114, 116, or 118 is connected to the initiator by setting the switches appropriately.
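  • The switch connectivity described above can be written down as a small lookup model. The reference numerals come from the text; the function names are hypothetical, and the pairing of internal buses 162, 164, and 166 with external M-channel buses 114, 116, and 118 in that order is an assumption, since the text lists them without stating the correspondence.

```python
from typing import List, Optional, Set

# Internal buses reachable from each switched channel socket
# (switch 170 for socket 144, switch 168 for socket 148, switch 172 for socket 152).
SWITCH_POLES = {
    144: (162, 164, 166),
    148: (162, 166),
    152: (162, 166),
}

# Assumed one-to-one pairing of internal buses with external M-channel buses.
INTERNAL_TO_EXTERNAL = {162: 114, 164: 116, 166: 118}


def external_buses(socket: int) -> List[int]:
    """List the external M-channel buses a switched socket can reach."""
    return [INTERNAL_TO_EXTERNAL[bus] for bus in SWITCH_POLES[socket]]


def pick_bus(socket: int, busy: Set[int]) -> Optional[int]:
    """Return the first reachable internal bus that is not busy, if any."""
    for bus in SWITCH_POLES[socket]:
        if bus not in busy:
            return bus
    return None
```

  • For example, with bus 162 busy, socket 144 can still be switched to bus 164, while socket 148 can only fall back to bus 166.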
  • System resource router 104 can function like a memory controller that connects external memory and routes on-chip memory. In addition it can connect target devices and other on-chip resources to initiator devices, CPUs, and DSPs. Initiator devices and CPUs (masters) supply a request and an address to a system resource router controller. Such address includes an M-channel identifier, a target device address, a memory bank address, and/or the memory-cell location address.
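  • To make the address fields concrete, the following sketch decodes a 32-bit request address into an M-channel identifier, a target device address, a memory bank address, and a memory-cell location. The field widths and their ordering are assumptions chosen to fit the limits stated elsewhere in this description (up to eight channels, eight targets per channel, and eight memory banks); the text does not define an address layout.

```python
from typing import NamedTuple

# Assumed layout of a 32-bit address, most significant field first.
CHANNEL_BITS = 3   # up to 8 M-channels
TARGET_BITS = 3    # up to 8 targets per M-channel
BANK_BITS = 3      # up to 8 memory banks
OFFSET_BITS = 23   # remaining bits select the memory-cell location


class DecodedAddress(NamedTuple):
    channel: int
    target: int
    bank: int
    offset: int


def decode(addr: int) -> DecodedAddress:
    """Split a 32-bit address into its assumed routing fields."""
    if not 0 <= addr < (1 << 32):
        raise ValueError("address must fit in 32 bits")
    offset = addr & ((1 << OFFSET_BITS) - 1)
    addr >>= OFFSET_BITS
    bank = addr & ((1 << BANK_BITS) - 1)
    addr >>= BANK_BITS
    target = addr & ((1 << TARGET_BITS) - 1)
    addr >>= TARGET_BITS
    channel = addr & ((1 << CHANNEL_BITS) - 1)
    return DecodedAddress(channel, target, bank, offset)
```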
  • The initiator or CPU waits to be granted access before transferring data. Access is granted when the requested M-channel device is free. Another initiator peripheral or CPU can simultaneously transfer data over a different M-channel while a data transfer is in-progress on the first channel. For example, the Harvard architecture instruction cache (I-cache) in subsystem 102 in FIG. 1 can fetch instructions from the internal ROM 136 while the CPU data cache (D-cache) is simultaneously accessing data from the SDRAM 122.
  • The system resource router 104 is preferably used in Palmchip (San Jose, Calif.) COREFRAME implementations for higher-bandwidth applications. For example, a COREFRAME system with a 32-bit PalmBus and a 32-bit external SDRAM running at 100 MHz provides 600 MB/s of available bandwidth on-chip, i.e., 200 MB/s on the PalmBus and 400 MB/s on the M-channel. Adding a separate M-channel for a 32-bit external flash provides 1.0 GB/s of total on-chip bandwidth at 100 MHz, i.e., 200 MB/s on the PalmBus plus 400 MB/s on each of the channels. Adding a 128-bit internal dual-port RAM channel and changing from an SDR SDRAM to a DDR SDRAM 64-bit DIMM channel yields 3.8 GB/s of bandwidth at 100 MHz, i.e., 200 MB/s on the PalmBus, 400 MB/s on the flash-memory port, plus 1.6 GB/s on each of the other M-channels.
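  • The bandwidth totals quoted above can be checked with a short calculation. The helper name is hypothetical. The PalmBus figure of 200 MB/s for a 32-bit bus at 100 MHz is reproduced here by assuming one transfer every two clocks, and the DDR channel by assuming two transfers per clock; neither factor is stated in the text.

```python
def peak_mb_per_s(width_bits: int, clock_mhz: float,
                  transfers_per_clock: float = 1.0) -> float:
    """Peak bandwidth in MB/s for a bus of the given width and clock."""
    return width_bits / 8 * clock_mhz * transfers_per_clock


CLOCK_MHZ = 100

palmbus = peak_mb_per_s(32, CLOCK_MHZ, 0.5)      # assumed 2 clocks per transfer
sdr_sdram = peak_mb_per_s(32, CLOCK_MHZ)         # 32-bit external SDRAM
flash = peak_mb_per_s(32, CLOCK_MHZ)             # 32-bit external flash
dual_port_ram = peak_mb_per_s(128, CLOCK_MHZ)    # 128-bit internal RAM
ddr_dimm = peak_mb_per_s(64, CLOCK_MHZ, 2.0)     # 64-bit DDR, 2 transfers/clock

one_channel = palmbus + sdr_sdram
two_channels = palmbus + sdr_sdram + flash
three_channels = palmbus + flash + dual_port_ram + ddr_dimm

print(one_channel, two_channels, three_channels)  # 600.0 1000.0 3800.0
```

  • The increase comes entirely from adding channels that operate in parallel; the clock stays at 100 MHz in all three configurations.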
  • A graphical user interface (GUI) is included in some embodiments of the present invention that assists in system-on-chip design implementations that include a system resource router 104. Practical system resource routers 104 that are preferably implemented with current semiconductor technology allow up to eight separate M-channels with as many as eight targets to be connected to each M-channel. This approach allows up to sixty-four target devices to be connected. The configurable design and the easy-to-use GUI handle the implementation details. The system resource router M-channel can be configured like a simple point-to-point architecture by connecting only one target to the channel. This approach is preferred when there is only external memory, and no internal memory, because it maximizes data throughput.
  • The system resource router 104 can therefore be used in COREFRAME and other system-on-chip implementations to reduce shared memory and initiator/target transfer bottlenecks. With a system resource router, a CPU can execute from flash-memory while simultaneously processing data from an initiator peripheral in the SDRAM. The DSP can at the same time process data from the dual-port RAM while another peripheral is transferring data to or from the RAM. With a resource router, no changes to any blocks except the resource router are needed for the processors and initiator peripherals to take best advantage of the available bandwidth.
  • When more than one initiator or CPU wants to transfer data at the same time across the M-channel, the devices must arbitrate for the channel. The device with the highest priority will ordinarily be granted the channel. Slow and fast devices can be mixed on a single channel by using split transactions.
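  • Two of the arbitration types offered for an M-channel, fixed priority and round-robin, can be sketched as simple grant functions. The function names and the representation of requests as a set of initiator indices are illustrative assumptions, not the router's arbiter logic.

```python
from typing import Optional, Sequence, Set


def fixed_priority(requests: Set[int], priority: Sequence[int]) -> Optional[int]:
    """Grant the requesting initiator that appears first in the priority list."""
    for initiator in priority:
        if initiator in requests:
            return initiator
    return None


def round_robin(requests: Set[int], n_initiators: int,
                last_granted: int) -> Optional[int]:
    """Grant the next requesting initiator after the one granted last."""
    for step in range(1, n_initiators + 1):
        candidate = (last_granted + step) % n_initiators
        if candidate in requests:
            return candidate
    return None
```

  • Fixed priority always favors the same initiator under contention, while round-robin rotates the starting point so that every requesting initiator is eventually granted the channel.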
  • Computer automated design (CAD) tools are a modern necessity for complex system-on-chip designs. In order to allow the user the necessary flexibility and ease of use to design system-on-chip ICs, preferred embodiments of the system resource router provide several methods to configure it to exactly what the user desires. A first method of configuring the system resource router uses the chip-assembly program. Such a program preferably allows the selection of the number of banks, the default type of memory for each bank, and the size (width and depth) of memory for each memory bank. The user can select the number of M-channels to place, whether synching FIFOs or synched FIFOs are needed to match the initiator operating frequency to the system resource router frequency, the types of bus interface to the command port needed, the types of bus interface for each initiator, the types of interface for each target, and the type of bus arbitration appropriate for each M-channel.
  • A second method of configuring the system resource router includes programming a set of configuration registers through a command port. Such allows changes to be made to memory size, memory types, and memory timing. These changes are preferably made after the device has been synthesized and delivered by simply modifying the intellectual property (IP) software.
  • Detailed user/designer options for a system-resource-router assembly program embodiment of the present invention are summarized in Tables I-VI. Users are preferably allowed to modify system resource router configurations to meet changing application requirements even after the device has been delivered to the end user. In any event, all embodiments of the present invention must provide designs that are compact and easy to use.
    TABLE I
    Initiator Socket Options
    Initiator Sockets: 1 to 32 sockets
    Data Width: 16, 32, 64, or 128 bits
    Socket Interface: COREFRAME MBus; VCI; AHB; multi-master or DMA
    Bandwidth Matching: ½, ¼, 2, 4
    Address Remapping: 8 separate sectors (programmable optional)
    Clock Domain Synching: asynchronous clock domain; synchronous clock domain
  • TABLE II
    Target Socket Options
    Initiator Sockets: 1 to 8 sockets
    Port Width: 16, 32, 64, or 128 bits
    Bandwidth Matching: ½, ¼, 2, 4
    Address Remapping: 8 separate sectors (programmable optional)
    Clock Domain Synching: asynchronous clock domain; synchronous clock domain
  • TABLE III
    M-channel Socket Options
    Initiator Sockets: 1 to 8 sockets
    Port Width: 16, 32, 64, or 128 bits
    Bandwidth Matching: ½, ¼, 2, 4
    Address Remapping: 8 separate sectors (programmable optional)
    Clock Domain Synching: asynchronous clock domain; synchronous clock domain
  • TABLE IV
    M-channel Options
    Number of Channels: 1 to 8 full-bandwidth channels
    Channel Type: external and embedded memory; embedded memory and target
    Starting Address: user selectable
    Channel Width: 16, 32, 64, or 128 bits
    Arbitration Type: round-robin; fixed; timed, round-robin with 1 fixed, time domain slicing
    Watchdog Timer: selectable
    Clock Domain Synching: asynchronous clock domain; synchronous clock domain
    Address Remapping: 8 separate sectors (programmable optional)
    If the embedded and target channel type is selected, then full duplex protocol and split transactions become options
  • TABLE V
    Command Port Options
    Bus Width: 16 or 32 bits
    Bus Interface: COREFRAME PalmBus™; PVCI; APB
  • TABLE VI
    Memory Bank
    Memory Banks: 1-8 memory banks
    Starting address of each memory bank
    Memory Width: 8 bits (all Asynch memory banks); 16 bits; 32 bits; 64 bits; 72 bits ECC DDR SDRAM
    Memory Depth: 128 Kb-128 Mb
    Memory Type: SDR or DDR SDRAM; EDO; flash-memory or SFlash; compact flash-memory; internal or external SRAM or SSRAM; internal or external ROM; SIMM; DIMM
    If DDR SDRAM is selected, 72-bit ECC is an option
    If SDRAM or EDO are selected, Refresh timer is an option
    If SIMM is selected, Presence Detect is an option
    If SDRAM is selected, a Serial Presence Detect port is an option
  • System resource router embodiments of the present invention include channel switches, M-channels, and channel sockets. The channel switches handle connections to the different M-channels which actually transfer the data. The sockets do the interfacing chores and make the reuse of IP-products possible. The system resource router uses a socket/channel technology that allows different protocols to be used between the initiator and target device to move the data. Optimized protocols are implemented to move certain types of data, e.g., external memory accesses or initiator-to-target, to keep the initiator/target interfaces simple. The ability to mix protocols is key to avoid having to customize initiator and target interfaces for each instantiation.
  • FIG. 2 illustrates a resource router 200 implemented as a single device 202 with two channels. A set of three initiators, represented by DMA devices 204, 206, and 208, can variously be routed, for example, through the two channels to an on-board memory 210 and an off-board memory 212. A corresponding set of channel decoders 214, 216, and 218 detects initiator requests for resources and determines which channel is needed. A pair of arbiters 220 and 222 resolve conflicts and adjust a switch fabric 224 to connect the particular initiators to their intended resource targets. Channel-1 includes a bank decoder 226, a controller 228, and an address and data network 230. Channel-2 includes an address and data network 232, a controller 234, and a bank decoder 236.
  • FIG. 3 illustrates a system-on-chip (SOC) 300 with a resource router 302 that supports two initiators 304 and 306 in accesses to a target device-A 308, a target device-B 310 and an off-board memory 312. Two internal channels are provided, a channel-A and a channel-B. A channel-A arbiter 314 resolves access conflicts to the target device-A 308 and target device-B 310. An initiator socket 316 interfaces to the initiator-A 304. A channel-B arbiter 318 resolves access conflicts to the off-board memory 312. An initiator socket 320 interfaces to the initiator-B 306.
  • FIG. 4 represents a system resource router design system embodiment of the present invention, referred to herein by the general reference numeral 400. The design system 400 produces an intellectual property (IP) output in the form of VHDL or Verilog computer files 402 that are dependent on a set of user design choices 404. Such choices are exemplified in Tables I-VI herein. The computer files 402 describe at least two channel sockets that provide for protocol-based connections to external data-transfer initiators, at least two internal M-channel buses, an M-channel controller for connection between an external M-channel bus and a corresponding one of the internal M-channel buses, and a transfer switch for providing alternative connections of at least one of the channel sockets to at least two of the internal M-channel buses. A plurality of processors and other initiators respectively connected to the channel sockets can be routed with the transfer switch to operate in parallel with a plurality of peripherals and memory respectively populating the external M-channel buses.
  • The design system 400 includes a computer-aided design (CAD) platform 406 for providing a user/designer with a means to select and implement a variety of numbers of interconnected ones of the channel sockets, the internal M-channel buses, the M-channel controllers, and the transfer switches. A graphical user interface (GUI) 408 is preferably included to collect basic information about a design application. An assembly program 410 automatically chooses how many channel sockets, internal M-channel buses, M-channel controllers, and transfer switches to include from a technology library 412 in a final design based on user input through the GUI.
  • A business model embodiment of the present invention uses the design system 400 to profit from the commercial marketing of intellectual property (IP) hardware description language (HDL) files that are output by the CAD platform 406. Such files implement the channel sockets, the internal M-channel buses, the M-channel controllers, and the transfer switches as high-level synthesis (HLS) computer files for later simulation, placement, and routing in a single-chip system-on-chip implementation.
  • The channel switches typically decode a portion of the addresses supplied by initiators to determine the channel to which each transaction is directed. The request is then routed to the channel that will handle the transaction. The switch will not move to another channel until the transfer of data is complete. If an initiator supplies a request whose address does not decode to a channel, an error is generated and the system resource router initiates an interrupt to the CPU. The error is recorded in an initiator socket error register that is preferably accessed through a control port to tell the CPU which initiator had the error. No further request can be supplied for that initiator until the error register is cleared through the control port.
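  • The decode-and-error behavior described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the class name `ChannelSwitch`, the choice of decoding the upper four bits of a 32-bit address, and the example address map are all hypothetical and do not come from the disclosure.

```python
class ChannelSwitch:
    """Toy model: decode upper address bits to select an M-channel."""

    def __init__(self, channel_map, addr_bits=32, decode_bits=4):
        # channel_map: decoded field value -> channel number (hypothetical layout)
        self.channel_map = dict(channel_map)
        self.shift = addr_bits - decode_bits
        self.error_register = set()    # initiators with an uncleared decode error
        self.interrupt_pending = False

    def request(self, initiator, address):
        """Return the channel for this request, or None if it is refused."""
        if initiator in self.error_register:
            return None                # blocked until the error is cleared
        channel = self.channel_map.get(address >> self.shift)
        if channel is None:
            # Address decodes to no channel: record the error, raise interrupt.
            self.error_register.add(initiator)
            self.interrupt_pending = True
            return None
        return channel

    def clear_error(self, initiator):
        """Models clearing the error register through the control port."""
        self.error_register.discard(initiator)
        self.interrupt_pending = bool(self.error_register)
```

    In this sketch, a caller would build the switch with a map such as `{0x0: 1, 0x8: 2}`, and an initiator that hits an unmapped address stays blocked until `clear_error` is called for it.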
  • There are typically two types of M-channels used in system resource router embodiments. The first type is an external memory and embedded memory channel for point-to-point connections with only a single target, e.g., internal or external memory. The second type is a target and embedded memory channel that uses a configurable protocol. The user is provided with the ability to customize the protocol for particular applications.
  • Any external memory and embedded memory M-channels are preferably optimized for data transfer between external and internal memory. One way to do this is to configure the channel with system 400 as point-to-point with memory as the only target. Each channel can have several different memory-mapped banks of memory, and in any combination of external or embedded. External memories preferably have programmable timing to allow alternative memory devices to be used in actual production.
  • Each memory bank is controlled by a memory controller for asynchronous, DRAM, or SDRAM memories. A system resource router can be configured to have any reasonable combination of controllers and M-channels. Any memory bank can be programmed to use any memory controller present in that M-channel, and any such controller can be configured as the default memory controller for that bank. If a memory type is not used on any of the memory banks in that M-channel, then that controller is not placed in the design by CAD platform 406. Each bank of memory is preferably programmed in system 400 as Asynchronous, DRAM, or SDRAM. Memory controllers connected to separate M-channels run independently, allowing different memories connected to different M-channels to access external memory through separate memory pins.
  • If an asynchronous bank is programmed, then it will support flash-memory, compact flash-memory, internal or external SRAM, SSRAM, SFlash, and internal or external ROM if all the control pins are brought out as pins on the part. If a bank is programmed as SDRAM, it will support PC100-compliant SDRAM and DDR SDRAM. If EDO DRAM is programmed, it will support standard EDO and Fast Page Mode EDO DRAM.
  • Target device and embedded memory channels use protocols with special extensions to optimize initiator-to-target transfers. Such a channel type is preferably configurable so the user can trade off performance for gate count, or remove extensions not needed in particular applications. The channel can have several different memory-mapped target devices or embedded memories. These target devices and embedded memories are preferably mixed in any combination.
  • Special extensions include full duplex and split transactions. Full duplex uses a transaction posting system that allows an initiator-A to do burst reads to a target-A at the same time an initiator-B is doing burst writes to an embedded memory-B. This system can double the bandwidth of the channel, but only if bursting is being used, and only if different initiator-to-target reads and writes are happening at the same time. If two initiators are doing a read, this system will not help. And if the two initiators are trying to access the same target, this option will not help in any combination of reads and writes.
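  • The conditions under which full duplex can help, as listed above, can be written as a single predicate. The following Python sketch is illustrative only; the `Transaction` record and the function name `full_duplex_helps` are assumptions introduced here.

```python
from collections import namedtuple

# Hypothetical record describing one transaction on an M-channel.
Transaction = namedtuple("Transaction", "initiator target is_read is_burst")


def full_duplex_helps(a, b):
    """True if two transactions can overlap under the conditions in the text."""
    return (
        a.is_burst and b.is_burst        # bursting must be in use
        and a.is_read != b.is_read       # one read and one write
        and a.initiator != b.initiator   # two different initiators
        and a.target != b.target         # two different targets
    )
```

    Two reads, or any pair of transactions aimed at the same target, return False here, matching the cases the paragraph says gain nothing from the option.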
  • Split transactions allow reads to return across channels in a different order than they were requested. This allows an initiator that is reading from a fast target to jump in and read data while an initiator that is reading from a slow target is still waiting for data. If there is a conflict, e.g., two targets try to return data at the same time, the transaction that was posted first will have priority. This helps when mixing fast and slow targets on the same channel. If all the target devices are fast, this will not improve performance much. It also will not improve writes, nor will it help when two initiators are trying to read from the same target.
  • Several arbitration options are preferably offered to users of the system resource router to allow throughput customization. Each M-channel has its own arbitration, and the arbitration for each M-channel can be of a different type, e.g., round-robin, fixed priority, timed priority, round-robin with one fixed priority, and time-domain slicing schemes. Arbitration between initiator devices is preferably supported for each M-channel.
  • With round-robin arbitration, priority is passed from initiator device to initiator device starting at initiator socket-1 in a round-robin fashion until the initiator socket-1 has priority again.
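  • A minimal Python sketch of round-robin arbitration follows. It assumes requests are presented as a list of booleans indexed from initiator socket-1 at index 0; the function name and the `-1` reset convention are illustrative choices, not part of the disclosure.

```python
def round_robin_grant(requests, last_granted):
    """Pick the next requesting socket after last_granted, wrapping around.

    requests: list of booleans; index 0 stands for initiator socket-1.
    last_granted: index of the previously granted socket, or -1 at reset
                  so that the search starts at socket-1.
    Returns the granted index, or None if no socket is requesting.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        candidate = (last_granted + offset) % n
        if requests[candidate]:
            return candidate
    return None
```

    Calling the function repeatedly with the previous result as `last_granted` passes priority from socket to socket until it returns to socket-1.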
  • With fixed priority arbitration, the priority is always fixed with M-channel connection 1 having the highest priority, and the highest-numbered M-channel connection having the lowest. In this priority scheme, it is important which device is attached to which initiator socket.
  • With timed priority arbitration, the initiator that is granted moves to the lowest priority and all those that had a lower priority than the granted device move up. In this way, the device that uses the bus the least has the highest priority. This method of arbitration is the fairest method but has the highest gate count.
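  • Timed priority arbitration as described above behaves like a least-recently-granted queue. The Python sketch below models it with an ordered list; the class name and the initial priority order are assumptions made for illustration.

```python
class TimedPriorityArbiter:
    """Granted initiator drops to lowest priority; the others move up."""

    def __init__(self, num_initiators):
        # Front of the list is highest priority. Initial order is assumed.
        self.order = list(range(num_initiators))

    def grant(self, requesting):
        """Grant the highest-priority requester and demote it to the back."""
        for initiator in self.order:
            if initiator in requesting:
                self.order.remove(initiator)
                self.order.append(initiator)
                return initiator
        return None
```

    Because every grant moves the winner to the back, an initiator that rarely uses the bus drifts toward the front of `order` and wins the next time it asks.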
  • With round-robin with one fixed priority arbitration, the M-channel connection 1 will always have the highest priority. The other initiators arbitrate using the round-robin arbitration method.
  • With time domain slicing, each initiator connected to the M-channel is allowed to have only a certain number of transactions across the M-channel before another initiator takes over the M-channel. The number of transactions allowed is programmable for each initiator connected to the M-channel.
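  • Time domain slicing can be sketched as a per-initiator transaction quota. In the Python model below, the class name, the rotation order used to pick the next owner, and the rule that an idle owner gives up its slice early are assumptions; the text only specifies that the transaction count is programmable per initiator.

```python
class TimeSliceArbiter:
    """Each initiator gets a programmable number of transactions per slice."""

    def __init__(self, quotas):
        self.quotas = list(quotas)  # allowed transactions per initiator
        self.owner = 0              # initiator currently holding the M-channel
        self.used = 0               # transactions consumed in the current slice

    def next_transaction(self, requesting):
        """Return the initiator allowed to run the next transaction, or None."""
        n = len(self.quotas)
        # Hand over when the quota is spent or the owner has nothing to do.
        if self.used >= self.quotas[self.owner] or self.owner not in requesting:
            for step in range(1, n + 1):
                candidate = (self.owner + step) % n
                if candidate in requesting and self.quotas[candidate] > 0:
                    self.owner, self.used = candidate, 0
                    break
            else:
                return None         # nobody eligible is requesting
        self.used += 1
        return self.owner
```

    With quotas of two and one for two always-requesting initiators, the grants repeat in the pattern 0, 0, 1.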
  • Arbitration can follow any request removal and the completion of a current memory access, or it can follow any end-of-burst.
  • Each arbiter control register preferably has two sets of bits that are used to affect the operation of the arbiter for each M-channel. One set of bits is the arbiter mask register. These bits are preferably used to mask the requests from an initiator or to force requests from an initiator. This is useful for test development and for system debug. The second set is the arbiter force-request register bits, which are used to force a bus grant from the arbiter to a specific port. This can also be useful for testing and system debug. The arbiter state registers allow the user to tell which initiator is granted by reading the register through the control port. This is typically useful for system debug.
  • A watchdog timer is preferably provided as an option for the arbitration of each M-channel. Typical watchdog timers are 16-bit units that count the number of clock cycles since a device has been granted a bus request. The timer resets each time a new initiator socket or multi-master bus is granted access. The watchdog timer control register allows the user to control what happens at timeout for each initiator socket. One option is nothing happens. Another is that the watchdog timer interrupt register will be set and the watchdog timer interrupt pin will go high. The watchdog timer interrupt register tells which initiator socket timed out and is cleared upon reading the register. Once the watchdog timer interrupt register is cleared, the watchdog timer interrupt pin will go low. The third option for what happens on timeout is that the watchdog timer interrupt register is set, the watchdog timer interrupt pin goes high, and at the end of the next memory cycle the grant will be removed from the initiator.
  • Each initiator preferably has its own watchdog timer and timeout value register. This register is typically 8-bits long, and is loaded into the most significant bits of the 16-bit timer when an initiator is granted access, thus allowing the user to set specific timeout values for different ports.
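  • The watchdog behavior can be sketched in Python as follows. The disclosure does not say whether the timer counts up or down, so this model assumes a down-counter loaded with the 8-bit value shifted into the upper byte, which gives a timeout of value × 256 clocks. The class, constant, and attribute names are hypothetical.

```python
class WatchdogTimer:
    """Toy model of a per-M-channel watchdog with three timeout actions."""

    ACTION_NONE, ACTION_INTERRUPT, ACTION_INTERRUPT_AND_REVOKE = range(3)

    def __init__(self, timeout_values, actions):
        self.timeout_values = timeout_values  # 8-bit value per initiator socket
        self.actions = actions                # timeout action per initiator socket
        self.timer = 0
        self.granted = None
        self.interrupt_register = set()       # sockets that have timed out
        self.revoke_grant = False

    def grant(self, initiator):
        """Reload the 16-bit timer; the 8-bit value fills the upper byte."""
        self.granted = initiator
        self.timer = (self.timeout_values[initiator] & 0xFF) << 8
        self.revoke_grant = False

    def tick(self):
        """Advance one clock; return True on the cycle the timer expires."""
        if self.granted is None or self.timer == 0:
            return False
        self.timer -= 1
        if self.timer > 0:
            return False
        action = self.actions[self.granted]
        if action != self.ACTION_NONE:
            self.interrupt_register.add(self.granted)
        if action == self.ACTION_INTERRUPT_AND_REVOKE:
            self.revoke_grant = True          # grant removed after next cycle
        return True

    @property
    def interrupt_pin(self):
        return bool(self.interrupt_register)

    def read_interrupt_register(self):
        """Reading clears the register, which also drops the interrupt pin."""
        value = set(self.interrupt_register)
        self.interrupt_register.clear()
        return value
```

    Under these assumptions, a timeout value of 1 expires after 256 clocks, and a value of 0 never expires.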
  • Sockets are a critical element in being able to design once, and then reuse the design over and over in a plug-and-play system. Sockets bring together existing IP technology-library components, new third-party IP, and new project-specific IP, even when all are built to different interface standards, and without necessitating extensive redesign. Sockets provide address remapping, FIFO, synching between different clock domains, and bus-width matching, thus allowing systems to be built without having to redesign existing or third-party IP technology-library components.
  • Sockets are preferably placed at any of several interfaces in the system resource router. These include the initiator interface, the interface into the M-channel, and the interface into the target. This allows the user to do such things as have a 32-bit MBus initiator running at 66 MHz, move data over a 128-bit M-channel running at 100 MHz, to a 64-bit VCI target device running at 50 MHz, and configure the entire thing inside the system resource router without modifying either the MBus initiator or the VCI target.
  • Both the synching FIFOs and the synch cells allow a portion of the chip running at one frequency to interface to another portion running at a different frequency. The synching FIFO has about three times the throughput of the synch cell, thereby allowing both sides to run at optimal throughput. However, it has a significantly higher gate count than the synch cell.
  • Each initiator, each channel, and each target can have its own synching FIFO system, allowing for as many as 136 different clock domains for each resource router (thirty-two initiator clocks, eight M-channel clocks, and sixty-four target clocks).
  • The synching FIFO consists of one FIFO for data read or write, and a four-transaction deep transaction stack. The data FIFO is preferably from two words deep to 2048 words deep and should be set to the maximum burst length×4. This setting allows for four maximum length bursts to be pending. The synching FIFO will work whether interfacing from a fast clock to a slow clock, a slow clock to a fast clock, or two same-frequency but unsynchronized clocks. This allows different parts of the chip to be put in power down mode and still be able to transfer data in the power down mode.
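  • The sizing rule for the data FIFO (maximum burst length × 4, within the two-word to 2048-word range) can be expressed as a short calculation. The Python function below is a sketch; its name and parameter names are illustrative.

```python
def synching_fifo_depth(max_burst_length, pending_bursts=4,
                        min_depth=2, max_depth=2048):
    """Depth in words: room for four maximum-length bursts, clamped to range."""
    depth = max_burst_length * pending_bursts
    return max(min_depth, min(max_depth, depth))
```

    For example, a maximum burst length of 16 words gives a depth of 64 words, while very long bursts are capped at 2048 words.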
  • In order to prevent any data coherency problems, the reads and writes into the system resource router and out to the memory are performed in the same order. Consecutive reads and writes are queued into the synching FIFOs, but when switching from a read to a write the synching FIFOs wait until all the reads queued in the FIFOs are complete before queuing the next write operation. The FIFO status register identifies whether the initiator socket still has reads or writes pending on this M-channel.
  • The synch cell will synchronize two clock domains with a minimal gate count. This cell is relatively slow since it must synchronize from clock domain-A to clock domain-B, and then back from clock domain-B to domain-A to complete a transfer. The synching FIFO hides this by stacking multiple transfers at once.
  • The synched FIFOs work much the same way as the synching FIFO with the exception that the clock domains are assumed to be synchronized. This means that the initiator clock and the system resource router clock should be generated off the same master clock and be some multiple of each other. This multiple of the clock is supplied when the system resource router is configured. The advantage of the synched FIFO over the synching FIFO is that it has fewer gates and has a lower latency between a transaction request and the subsequent read or write.
  • The system resource router supports the new VC Interface Standard. This interface is already built into the system resource router and allows VCI compliant devices to be connected without adding a bus wrapper. This includes VCI initiator, VCI target, and PVCI. This eases the integration of VCI compliant devices and allows persons familiar with VCI to connect into the system resource router without becoming familiar with another bus standard.
  • Address remapping is preferably performed at several points in the system resource router, and the remapping is preferably fixed or programmable through the control port, allowing the user a great deal of flexibility in what the initiator memory map looks like. Each socket will allow several different sectors to be remapped. Each sector is preferably either a fixed type or a mapped type. For the fixed sector, an address range is selected and the programmed value out of the remap will be fixed for those address bits regardless of the input address. For the mapped sector, the user specifies the output addresses to which the input addresses are mapped. The address bits compared on the input need not be the same address bits changed on the output. The output address bits may not overlap for a fixed and a mapped sector, but they can overlap for two mapped sectors only if the input address bits compared are the same. The number of address bits compared and the number of address bits changed on the output for a sector is preferably no more than 8-bits, and the bits must be consecutive.
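  • One possible reading of the fixed and mapped sector types is sketched below in Python. The dictionary layout (`kind`, `in_lsb`, `out_lsb`, `width`, `match`, `value`) is entirely hypothetical; it assumes a sector works on a field of up to eight consecutive bits, that a fixed sector always forces its output field, and that a mapped sector rewrites its output field only when the compared input field matches.

```python
def remap(address, sectors):
    """Apply fixed and mapped remap sectors to an input address."""
    out = address
    for s in sectors:
        if not 1 <= s["width"] <= 8:
            raise ValueError("a sector may cover at most 8 consecutive bits")
        mask = (1 << s["width"]) - 1
        if s["kind"] == "mapped":
            # Compare the selected input bits; skip the sector on a mismatch.
            if (address >> s["in_lsb"]) & mask != s["match"]:
                continue
        # Fixed sectors always reach this point; mapped ones only on a match.
        out &= ~(mask << s["out_lsb"])
        out |= (s["value"] & mask) << s["out_lsb"]
    return out
```

    A fixed sector such as `{"kind": "fixed", "out_lsb": 28, "width": 4, "value": 0xA}` would place every access from that socket in the same 256 MB region, while a mapped sector relocates only the addresses whose compared bits match.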
  • The sockets will perform optional bandwidth matching between interfaces. For example, the socket will convert a 32-bit burst of four into a 64-bit burst of two (2× option) on the other side of the socket. This means that on the 64-bit side, only two clock cycles will be required to complete what was originally a 4-burst transfer. Going the other way, the socket will convert a 128-bit single-cycle access into a burst of four 32-bit transfers (quarter option). This capability allows initiators, channels, and targets to communicate effectively without redesigning interfaces.
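  • The burst conversion arithmetic in the two examples above can be checked with a few lines of Python. This sketch assumes the socket simply repacks the same number of data bits into beats of the destination width; the function name is illustrative.

```python
def convert_burst(src_width_bits, src_burst_len, dst_width_bits):
    """Return the burst length needed on the destination side of a socket."""
    total_bits = src_width_bits * src_burst_len
    # Ceiling division so a partial final beat still counts as a transfer.
    return -(-total_bits // dst_width_bits)
```

    A 32-bit burst of four becomes a 64-bit burst of two, and a single 128-bit access becomes a burst of four 32-bit transfers.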
  • A system resource router with Dynamic Bandwidth Allocation is very similar to the above examples with the exception that internal memory and non-DMA devices or slaves are not assigned to one channel. They are assigned to multiple channels the same as the DMA or master devices. When a DMA or master requests access to an internal memory or non-DMA or slave device both devices are switched to the first available unused channel. This process continues until all the channels are in use. If another transaction is requested the DMA or master device and non-DMA or slave device or memory will be connected to a channel based on either the speed of the ongoing transfers or the priority of the DMA or master on the channel or both. The two transactions will then share the bandwidth of that channel until either one of the transactions is complete or another channel becomes available. If a channel becomes available one of the DMA or master devices with its non-DMA or slave device or internal memory will switch to the open channel. If a DMA or master device requests a non-DMA or slave device that is already being used by another DMA or master the requesting DMA or master is switched to the channel with the non-DMA or slave device and the two DMA or master devices arbitrate for the non-DMA or slave device. In this way the bandwidth used by the SOC is always optimal and maximum bandwidth utilization is guaranteed.
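  • The channel assignment rules for Dynamic Bandwidth Allocation can be sketched as a simplified Python model. The class and method names are hypothetical. Where the text selects a shared channel by transfer speed or master priority, this sketch substitutes a plain fewest-transactions heuristic, which is a stand-in and not the method described.

```python
class DynamicAllocator:
    """Toy model: assign (master, slave) transactions to M-channels."""

    def __init__(self, num_channels):
        # Each channel holds the (master, slave) pairs active on it.
        self.channels = [[] for _ in range(num_channels)]

    def start(self, master, slave):
        """Place a new transaction and return the channel index used."""
        # A slave already in use pins the request to that slave's channel,
        # where the masters arbitrate for it.
        for idx, active in enumerate(self.channels):
            if any(s == slave for _, s in active):
                active.append((master, slave))
                return idx
        # Otherwise take the first unused channel.
        for idx, active in enumerate(self.channels):
            if not active:
                active.append((master, slave))
                return idx
        # All channels busy: share one (stand-in heuristic, see lead-in).
        idx = min(range(len(self.channels)),
                  key=lambda i: len(self.channels[i]))
        self.channels[idx].append((master, slave))
        return idx

    def finish(self, master, slave):
        """Retire a transaction, then spread sharers onto any freed channel."""
        for active in self.channels:
            if (master, slave) in active:
                active.remove((master, slave))
                break
        for free in self.channels:
            if free:
                continue
            for busy in self.channels:
                if len({s for _, s in busy}) > 1:
                    # Move every transaction for one slave so it stays together.
                    moved = busy[-1][1]
                    free.extend(t for t in busy if t[1] == moved)
                    busy[:] = [t for t in busy if t[1] != moved]
                    break
```

    The model keeps all masters that target the same slave on one channel, which mirrors the statement that such masters arbitrate for the slave rather than reaching it over separate channels.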
  • An additional method that is preferably used to increase bandwidth when the number of reads and writes to or from the DMA or Master devices is equal is to split the channel from a Read/Write Channel into a Read Only Channel and a Write Only Channel. Because the internal channel architecture does not use bidirectional busses (low performance, high power consumption, and difficulties with using ASIC design tools) and there are separate mb_rdata and mb_wdata paths inside the system resource router, splitting the channel requires less overhead than adding a complete new channel.
  • For systems that require higher bandwidth in the write or read direction, individual channels could be defined as read-only or write-only. This provides additional bandwidth in the required direction, thus optimizing system performance.
  • While COREFRAME implementations generally comprise a CPU and shared memory, embodiments of the present invention are preferably applied to systems with shared resources, e.g., a PCI interface. A support processor is needed only if the peripheral blocks are programmable. If such are programmed through a sequencer, no processor is needed.
  • There are three basic methods of interconnect for on-chip designs, (1) simple bus architecture, (2) simple bus with bridge architecture, and (3) point-to-point architecture. All three have advantages and disadvantages. System resource router embodiments of the present invention basically combine the simple bus architecture and the point-to-point architecture to exploit the advantages of each and avoid the disadvantages. For systems requiring low bandwidth, the system resource router is preferably CAD-configured down to a simple bus architecture implementation. For high bandwidth, the system resource router is preferably CAD-configured as a point-to-point architecture implementation. As this disclosure describes, the present invention allows practitioners to configure a low-bandwidth implementation and a high-bandwidth implementation, as well as several shades of architectural mixes in between.
  • In summary, the present invention is a system resource router for use on an SOC device that includes at least two channel sockets that provide for protocol-based connections to data-transfer initiators and at least two internal M-channel buses that alternately connect to one or more of the channel sockets using transfer switches. Each internal M-channel bus connects to an external M-channel bus populated by one or more transaction targets using an M-channel controller. The channel sockets, at least some of the data-transfer initiators, the internal M-channel buses, the external M-channel buses, and at least some of the transaction targets are all contained upon a single integrated circuit (IC) SOC device. In some embodiments, one or more of the internal M-channel buses are synchronous buses, and the present invention includes synching FIFOs that synchronize data transfers over these buses. In some embodiments, data-transfer initiators, one or more of the internal M-channel buses, and transaction targets can all be running at different clock frequencies. Some embodiments may provide for an internal M-channel bus that is an embedded memory channel that provides a point-to-point connection to internal or external memory.
  • The channel sockets and internal interfaces within the present invention are capable of optimizing bandwidth for individual transactions, i.e., converting a transaction from a first group of one or more bursts having a first bandwidth to a second group of one or more bursts having a second bandwidth. Finally, the present invention supports split read transactions, wherein read transfers are returned across said internal M-channel buses in a different order than originally requested, and full duplex transactions, where one transaction is a read burst from one data-transfer initiator to a target and the other transaction is a simultaneous write burst from a second data-transfer initiator to a second target.
  • Although the present invention has been described in terms of the presently preferred embodiments, it is to be understood that this disclosure is not to be interpreted as limiting. Various alterations and modifications will no doubt become apparent to those skilled in the art after having read the above disclosure. Accordingly, it is intended that all appended claims be interpreted as covering all alterations and modifications as falling within the true spirit and scope of the invention.

Claims (9)

1. A system resource router within a system-on-chip (SOC) device, comprising
at least two channel sockets, wherein each said channel socket provides for protocol-based connections to data-transfer initiators;
first and second internal M-channel buses that alternately connect to at least one of said channel sockets using a transfer switch; and
a first M-channel controller that connects said first internal M-channel bus to a first external M-channel bus populated by one or more transaction targets; and
a second M-channel controller that connects said second internal M-channel bus to a second external M-channel bus populated by one or more transaction targets;
wherein said transfer switch operatively couples a data-transfer initiator connected to said channel socket to a transaction target using either said first internal M-channel bus, said first M-channel controller, and said first external M-channel bus, or said second internal M-channel bus, said second M-channel controller, and said second external M-channel bus; and
wherein said channel sockets, said data-transfer initiators, said first and second internal M-channel buses, said first and second external M-channel buses, and at least some of said one or more transaction targets are all contained upon a single integrated circuit (IC) device.
2. A resource routing system, comprising
at least two channel sockets, wherein each said channel socket provides for protocol-based connections to data-transfer initiators;
first and second internal M-channel buses that alternately connect to at least one of said channel sockets using a transfer switch; and
a first M-channel controller that connects said first internal M-channel bus to a first external M-channel bus populated by one or more transaction targets; and
a second M-channel controller that connects said second internal M-channel bus to a second external M-channel bus populated by one or more transaction targets;
wherein said transfer switch operatively couples a data-transfer initiator connected to said channel socket to a transaction target using either said first internal M-channel bus, said first M-channel controller, and said first external M-channel bus, or said second internal M-channel bus, said second M-channel controller, and said second external M-channel bus; and
wherein said channel sockets, said data-transfer initiators, said first and second internal M-channel buses, said first and second external M-channel buses, and at least some of said one or more transaction targets are all contained upon a single integrated circuit (IC) device.
3. A method that makes a system resource router on a system-on-chip (SOC) device, comprising
providing at least two channel sockets, wherein each said channel socket provides for protocol-based connections to data-transfer initiators;
providing first and second internal M-channel buses that alternately connect to at least one of said channel sockets using a transfer switch; and
providing a first M-channel controller that connects said first internal M-channel bus to a first external M-channel bus populated by one or more transaction targets; and
providing a second M-channel controller that connects said second internal M-channel bus to a second external M-channel bus populated by one or more transaction targets;
wherein said transfer switch operatively couples a data-transfer initiator connected to said channel socket to a transaction target using either said first internal M-channel bus, said first M-channel controller, and said first external M-channel bus, or said second internal M-channel bus, said second M-channel controller, and said second external M-channel bus; and
wherein said channel sockets, said data-transfer initiators, said first and second internal M-channel buses, said first and second external M-channel buses, and at least some of said one or more transaction targets are all contained upon a single integrated circuit (IC) device.
4. A method that uses a system resource router within a system-on-chip (SOC) device, comprising
operatively coupling a data-transfer initiator to one of at least two channel sockets, that provides for protocol-based connections;
operatively coupling said data-transfer initiator to one of first and second internal M-channel buses using a transfer switch that provides alternative connections from said one of at least two channel sockets to said first and second internal M-channel buses; and
operatively coupling to a transaction target through one of the following: a first M-channel controller that connects said first internal M-channel bus to a first external M-channel bus populated by one or more transaction targets, or
a second M-channel controller that connects said second internal M-channel bus to a second external M-channel bus populated by one or more transaction targets;
wherein said channel sockets, said data-transfer initiator, said first and second internal M-channel buses, said first and second external M-channel buses, and at least some of said one or more transaction targets are all contained upon a single integrated circuit (IC) device.
5. A dependent claim according to claim 1, 2, 3, or 4 wherein either said first or said second internal M-channel bus further comprises a synchronous bus, and said transfer switch couples said data-transfer initiator to said synchronous bus through a synching FIFO.
6. A dependent claim according to claim 5, wherein said data-transfer initiator, said synchronous bus, and said transaction target are all running at different clock frequencies.
7. A dependent claim according to claim 1, 2, 3, or 4, wherein one of said first and second internal M-channel buses further comprises an embedded memory channel that provides a point-to-point connection to internal or external memory.
8. A dependent claim according to claim 1, 2, 3, or 4 wherein said channel sockets and said M-channel controllers convert a transaction from a first group of one or more bursts having a first bandwidth to a second group of one or more bursts having a second bandwidth.
9. A dependent claim according to claim 1, 2, 3, or 4 wherein two transactions between two data-transfer initiators and two targets occurring over one of said first and second internal M-channel buses comprise one of the following: split read transactions, wherein read transfers are returned across said internal M-channel buses in a different order than originally requested, or full duplex transactions, where one transaction is a read burst from one data-transfer initiator to a target and the other transaction is a simultaneous write burst from a second data-transfer initiator to a second target.
US10/899,988 2000-02-14 2004-07-27 System resource router Abandoned US20050071533A1 (en)

Priority Applications (1)

Application Number | Publication Number | Priority Date | Filing Date | Title
US10/899,988 | US20050071533A1 (en) | 2000-02-14 | 2004-07-27 | System resource router

Applications Claiming Priority (5)

Application Number | Publication Number | Priority Date | Filing Date | Title
US18240600P | | 2000-02-14 | 2000-02-14 |
US09/565,282 | US6601126B1 (en) | 2000-01-20 | 2000-05-02 | Chip-core framework for systems-on-a-chip
US21759700P | | 2000-07-11 | 2000-07-11 |
US09/731,070 | US6769046B2 (en) | 2000-02-14 | 2000-12-05 | System-resource router
US10/899,988 | US20050071533A1 (en) | 2000-02-14 | 2004-07-27 | System resource router

Related Parent Applications (1)

Application Number | Relation | Publication Number | Priority Date | Filing Date | Title
US09/731,070 | Continuation | US6769046B2 (en) | 2000-02-14 | 2000-12-05 | System-resource router

Publications (1)

Publication Number | Publication Date
US20050071533A1 (en) | 2005-03-31

Family

ID=34382135

Family Applications (2)

Application Number | Status | Publication Number | Priority Date | Filing Date | Title
US09/731,070 | Expired - Lifetime | US6769046B2 (en) | 2000-02-14 | 2000-12-05 | System-resource router
US10/899,988 | Abandoned | US20050071533A1 (en) | 2000-02-14 | 2004-07-27 | System resource router

Country Status (1)

Country Link
US (2) US6769046B2 (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7366769B2 (en) * 2000-10-02 2008-04-29 Schlumberger Technology Corporation System, method and computer program product for a universal communication connector
US7039750B1 (en) * 2001-07-24 2006-05-02 Plx Technology, Inc. On-chip switch fabric
US7257661B2 (en) * 2001-09-21 2007-08-14 Nxp B.V. Scalable home control platform and architecture
US6918001B2 (en) * 2002-01-02 2005-07-12 Intel Corporation Point-to-point busing and arrangement
EP1333380A1 (en) * 2002-01-30 2003-08-06 STMicroelectronics Limited DMA access generator
US7155618B2 (en) * 2002-03-08 2006-12-26 Freescale Semiconductor, Inc. Low power system and method for a data processing system
US7107365B1 (en) * 2002-06-25 2006-09-12 Cypress Semiconductor Corp. Early detection and grant, an arbitration scheme for single transfers on AMBA advanced high-performance bus
KR100475735B1 (en) * 2002-07-12 2005-03-10 Samsung Electronics Co., Ltd. Method and device for arbitrating common bus by using urgent channel
US7769893B2 (en) * 2002-10-08 2010-08-03 Koninklijke Philips Electronics N.V. Integrated circuit and method for establishing transactions
AU2002368402A1 (en) * 2002-12-05 2004-06-23 Nokia Corporation Device and method for operating memory components
EP1625504A1 (en) * 2003-05-08 2006-02-15 Koninklijke Philips Electronics N.V. Processing system and method for communicating data
US20040255070A1 (en) * 2003-06-12 2004-12-16 Larson Thane M. Inter-integrated circuit router for supporting independent transmission rates
US7412588B2 (en) 2003-07-25 2008-08-12 International Business Machines Corporation Network processor system on chip with bridge coupling protocol converting multiprocessor macro core local bus to peripheral interfaces coupled system bus
US7353362B2 (en) * 2003-07-25 2008-04-01 International Business Machines Corporation Multiprocessor subsystem in SoC with bridge between processor clusters interconnetion and SoC system bus
KR100881416B1 (en) * 2003-10-10 2009-02-05 Nokia Corporation Microcontrol architecture for a system on a chip SOC
US7213084B2 (en) * 2003-10-10 2007-05-01 International Business Machines Corporation System and method for allocating memory allocation bandwidth by assigning fixed priority of access to DMA machines and programmable priority to processing unit
US8856401B2 (en) * 2003-11-25 2014-10-07 Lsi Corporation Universal controller for peripheral devices in a computing system
US7028106B2 (en) * 2003-12-05 2006-04-11 Hewlett-Packard Development Company, L.P. Remapping routing information entries in an expander
US7151709B2 (en) * 2004-08-16 2006-12-19 Micron Technology, Inc. Memory device and method having programmable address configurations
US7500129B2 (en) * 2004-10-29 2009-03-03 Hoffman Jeffrey D Adaptive communication interface
WO2007029053A1 (en) * 2005-09-09 2007-03-15 Freescale Semiconductor, Inc. Interconnect and a method for designing an interconnect
US8396041B2 (en) * 2005-11-08 2013-03-12 Microsoft Corporation Adapting a communication network to varying conditions
US8381047B2 (en) 2005-11-30 2013-02-19 Microsoft Corporation Predicting degradation of a communication channel below a threshold based on data transmission errors
TWI321731B (en) * 2006-09-18 2010-03-11 Quanta Comp Inc Device connection system and device connection method
US7934046B2 (en) * 2008-07-02 2011-04-26 International Business Machines Corporation Access table lookup for bus bridge
CH699208B1 (en) * 2008-07-25 2019-03-29 Em Microelectronic Marin Sa Shared memory processor circuit and buffer system.
WO2014073188A1 (en) * 2012-11-08 2014-05-15 Panasonic Corporation Semiconductor circuit bus system
US9602359B2 (en) 2013-03-15 2017-03-21 Custom Microwave Components, Inc. Methods, systems, and computer program product for providing graphical cross connectivity and dynamic configurability
GB2513979B (en) * 2013-03-15 2021-03-31 Custom Microwave Components Inc Methods, systems and computer program product for providing graphical cross connectivity and dynamic configurability
US9728526B2 (en) 2013-05-29 2017-08-08 Sandisk Technologies Llc Packaging of high performance system topology for NAND memory systems
US9703702B2 (en) * 2013-12-23 2017-07-11 Sandisk Technologies Llc Addressing auto address assignment and auto-routing in NAND memory network
CN112181493B (en) * 2020-09-24 2022-09-13 Chengdu Haiguang Integrated Circuit Design Co., Ltd. Register network architecture and register access method

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4001784A (en) * 1973-12-27 1977-01-04 Honeywell Information Systems Italia Data processing system having a plurality of input/output channels and physical resources dedicated to distinct and interruptible service levels
US4536873A (en) * 1984-03-19 1985-08-20 Honeywell Inc. Data transmission system
US5261059A (en) * 1990-06-29 1993-11-09 Digital Equipment Corporation Crossbar interface for data communication network
US5457679A (en) * 1993-12-08 1995-10-10 At&T Corp. Channel sharing and memory sharing in a packet switching system
US5574849A (en) * 1992-12-17 1996-11-12 Tandem Computers Incorporated Synchronized data transmission between elements of a processing system
US5604865A (en) * 1991-07-08 1997-02-18 Seiko Epson Corporation Microprocessor architecture with a switch network for data transfer between cache, memory port, and IOU
US5729763A (en) * 1995-08-15 1998-03-17 Emc Corporation Data storage system
US5815680A (en) * 1993-09-27 1998-09-29 Ntt Mobile Communications Network, Inc. SIMD multiprocessor with an interconnection network to allow a datapath element to access local memories
US5887187A (en) * 1993-10-20 1999-03-23 Lsi Logic Corporation Single chip network adapter apparatus
US5909594A (en) * 1997-02-24 1999-06-01 Silicon Graphics, Inc. System for communications where first priority data transfer is not disturbed by second priority data transfer and where allocated bandwidth is removed when process terminates abnormally
US6009106A (en) * 1997-11-19 1999-12-28 Digi International, Inc. Dynamic bandwidth allocation within a communications channel
US6078953A (en) * 1997-12-29 2000-06-20 Ukiah Software, Inc. System and method for monitoring quality of service over network
US20010014923A1 (en) * 1991-12-06 2001-08-16 Yasuo Inoue Method for connecting caches in external storage subsystem
US6332165B1 (en) * 1997-09-05 2001-12-18 Sun Microsystems, Inc. Multiprocessor computer system employing a mechanism for routing communication traffic through a cluster node having a slice of memory directed for pass through transactions
US20020095549A1 (en) * 1998-12-22 2002-07-18 Hitachi, Ltd. Disk storage system
US20020129188A1 (en) * 1998-11-13 2002-09-12 Siemens Microelectronics, Inc. Data processing device with memory coupling unit
US6523088B2 (en) * 1998-06-19 2003-02-18 Hitachi, Ltd. Disk array controller with connection path formed on connection request queue basis
US6535960B1 (en) * 1994-12-12 2003-03-18 Fujitsu Limited Partitioned cache memory with switchable access paths
US6542954B1 (en) * 1999-02-02 2003-04-01 Hitachi, Ltd. Disk subsystem
US6574687B1 (en) * 1999-12-29 2003-06-03 Emc Corporation Fibre channel data storage system
US6633296B1 (en) * 2000-05-26 2003-10-14 Ati International Srl Apparatus for providing data to a plurality of graphics processors and method thereof
US6898235B1 (en) * 1999-12-10 2005-05-24 Argon St Incorporated Wideband communication intercept and direction finding device using hyperchannelization

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040255071A1 (en) * 2003-06-12 2004-12-16 Larson Thane M. Inter-integrated circuit bus router for providing increased security
US7398345B2 (en) * 2003-06-12 2008-07-08 Hewlett-Packard Development Company, L.P. Inter-integrated circuit bus router for providing increased security
US20050091432A1 (en) * 2003-10-28 2005-04-28 Palmchip Corporation Flexible matrix fabric design framework for multiple requestors and targets in system-on-chip designs
US20050251608A1 (en) * 2004-05-10 2005-11-10 Fehr Walton L Vehicle network with interrupted shared access bus
JP2009505241A (en) * 2005-08-11 2009-02-05 P.A. Semi, Inc. Partially populated hierarchical crossbar
US20070038796A1 (en) * 2005-08-11 2007-02-15 P.A. Semi, Inc. Partially populated, hierarchical crossbar
WO2007022019A2 (en) * 2005-08-11 2007-02-22 P.A. Semi, Inc. Partially populated, hierarchical crossbar
WO2007022019A3 (en) * 2005-08-11 2007-05-31 Pa Semi Inc Partially populated, hierarchical crossbar
US7269682B2 (en) 2005-08-11 2007-09-11 P.A. Semi, Inc. Segmented interconnect for connecting multiple agents in a system
US7426601B2 (en) 2005-08-11 2008-09-16 P.A. Semi, Inc. Segmented interconnect for connecting multiple agents in a system
US8107492B2 (en) 2006-02-24 2012-01-31 Qualcomm Incorporated Cooperative writes over the address channel of a bus
US20070233904A1 (en) * 2006-02-24 2007-10-04 Richard Gerard Hofmann Auxiliary Writes Over Address Channel
US8108563B2 (en) * 2006-02-24 2012-01-31 Qualcomm Incorporated Auxiliary writes over address channel
KR101202317B1 (en) 2006-02-24 2012-11-16 Qualcomm Incorporated Auxiliary writes over address channel
US8521914B2 (en) 2006-02-24 2013-08-27 Qualcomm Incorporated Auxiliary writes over address channel
US8675679B2 (en) 2006-02-24 2014-03-18 Qualcomm Incorporated Cooperative writes over the address channel of a bus
US20080270643A1 (en) * 2007-04-24 2008-10-30 Hitachi, Ltd. Transfer system, initiator device, and data transfer method
US20080313365A1 (en) * 2007-06-14 2008-12-18 Arm Limited Controlling write transactions between initiators and recipients via interconnect logic
US20090228616A1 (en) * 2008-03-05 2009-09-10 Microchip Technology Incorporated Sharing Bandwidth of a Single Port SRAM Between at Least One DMA Peripheral and a CPU Operating with a Quadrature Clock
US7739433B2 (en) * 2008-03-05 2010-06-15 Microchip Technology Incorporated Sharing bandwidth of a single port SRAM between at least one DMA peripheral and a CPU operating with a quadrature clock
US20140086247A1 (en) * 2012-09-25 2014-03-27 Arteris SAS Network on a chip socket protocol
US9225665B2 (en) * 2012-09-25 2015-12-29 Qualcomm Technologies, Inc. Network on a chip socket protocol

Also Published As

Publication number Publication date
US6769046B2 (en) 2004-07-27
US20010042147A1 (en) 2001-11-15

Similar Documents

Publication Publication Date Title
US6769046B2 (en) System-resource router
US6601126B1 (en) Chip-core framework for systems-on-a-chip
US5819096A (en) PCI to ISA interrupt protocol converter and selection mechanism
US7793008B2 (en) AMBA modular memory controller
EP1239374B1 (en) Shared program memory for use in multicore DSP devices
US8296526B2 (en) Shared memory having multiple access configurations
US6653859B2 (en) Heterogeneous integrated circuit with reconfigurable logic cores
Nowick et al. Practical asynchronous controller design
US7475182B2 (en) System-on-a-chip mixed bus architecture
US6587905B1 (en) Dynamic data bus allocation
US20050091432A1 (en) Flexible matrix fabric design framework for multiple requestors and targets in system-on-chip designs
EP1652058A1 (en) Switch/network adapter port incorporating selectively accessible shared memory resources
JP2002049576A (en) Bus architecture for system mounted on chip
Sharma et al. Wishbone bus architecture-a survey and comparison
US7007111B2 (en) DMA port sharing bandwidth balancing logic
US11182110B1 (en) On-chip memory block circuit
Wingard et al. Integration architecture for system-on-a-chip design
WO2001024015A2 (en) Asynchronous centralized multi-channel dma controller
US6954869B2 (en) Methods and apparatus for clock domain conversion in digital processing systems
WO2009009133A2 (en) Dual bus system and method
CN114746853A (en) Data transfer between memory and distributed computing array
Remaklus On-chip bus structure for custom core logic designs
JPH052555A (en) Internal bus for workstation interface device
JPH09153009A (en) Arbitration method for hierarchical constitution bus
Acasandrei et al. Open library of IP module interfaces for AMBA bus

Legal Events

Date Code Title Description
AS Assignment

Owner name: PALMCHIP CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ADAMS, LYLE E.;MILLS, BILLY D.;REEL/FRAME:015771/0660;SIGNING DATES FROM 20041110 TO 20041111

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION