WO2004068271A2 - A reconfigurable semantic processor - Google Patents

A reconfigurable semantic processor

Info

Publication number
WO2004068271A2
Authority
WO
WIPO (PCT)
Prior art keywords
parser
production
semantic code
symbol
symbols
Prior art date
Application number
PCT/US2003/036225
Other languages
French (fr)
Other versions
WO2004068271A3 (en)
Inventor
Somsubhra Sikdar
Original Assignee
Mistletoe Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mistletoe Technologies, Inc.
Priority to AU2003290817A (patent AU2003290817A1)
Priority to CA002513097A (patent CA2513097A1)
Priority to JP2004567407A (patent JP4203023B2)
Priority to EP03783401A (patent EP1590744A4)
Publication of WO2004068271A2
Publication of WO2004068271A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504 Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508 Runtime interpretation or emulation, e.g. emulator loops, bytecode interpretation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/427 Parsing

Definitions

  • This invention relates generally to digital processors and processing, and more specifically to digital semantic processors for data stream processing.
  • VN von Neumann
  • The VN architecture, in its simplest form, comprises a central processing unit (CPU) and attached memory, usually with some form of input/output to allow useful operations.
  • Figure 1 shows a computer 20 comprising a CPU 30, a memory controller 40, memory 50, and input/output (I/O) devices 60.
  • CPU 30 sends data requests to memory controller 40 over address/control bus 42; the data itself passes over a data bus 44.
  • Memory controller 40 communicates with memory 50 and I/O devices 60 to perform data reads and writes as requested by CPU 30 (or possibly by the I/O devices).
  • I/O input/output
  • memory 50 stores both program instructions and data.
  • CPU 30 fetches program instructions from the memory and executes the commands contained therein — typical instructions instruct the CPU to load data from memory to a register, write data to memory from a register, perform an arithmetic or logical operation using data in its onboard registers, or branch to a different instruction and continue execution.
  • CPU 30 spends a great deal of time fetching instructions, fetching data, or writing data over data bus 44.
  • Although elaborate (and usually costly) schemes can be implemented to cache data and instructions that might be useful, implement pipelining, and decrease average memory cycle time, data bus 44 is ultimately a bottleneck on processor performance.
  • The VN architecture is attractive, as compared to gate logic, because it can be made "general-purpose" and can be reconfigured relatively quickly; by merely loading a new set of program instructions, the function of a VN machine can be altered to perform even very complex functions, given enough time.
  • The tradeoffs for the flexibility of the VN architecture are complexity and inefficiency. Thus the ability to do almost anything comes at the cost of being able to do a few simple things efficiently.
  • Such a device is preferably reconfigurable like a VN machine, as its processing depends on its "programming", although as will be seen this "programming" is unlike conventional machine code used by a VN machine.
  • A VN machine always executes a set of machine instructions that check for various data conditions sequentially.
  • the RSP responds directly to the semantics of an input stream.
  • The "code" that the RSP executes is selected by its input.
  • the RSP is ideally suited to fast and efficient packet processing.
  • Some embodiments described herein use a table-driven predictive parser to drive direct execution of the protocols of a network grammar, e.g., an LL (Left-to-right parsing by identifying the Left-most production) parser.
  • LL Left-to-right parsing by identifying the Left-most production
  • Other parsing techniques, e.g., recursive descent, LR (Left-to-right parsing by identifying the Right-most production), and LALR (Look Ahead LR), may also be used in embodiments of the invention.
  • the parser responds to its input by launching microinstruction code segments on a simple execution unit.
  • When the tables are placed in rewritable storage, the RSP can be easily reconfigured, and thus a single RSP design can be useful in a variety of applications. In many applications, the entire RSP, including the tables necessary for its operation, can be implemented on a single, low-cost, low-power integrated circuit.
  • a number of optional features can increase the usefulness of such a device.
  • a bank of execution units can be used to execute different tasks, allowing parallel processing.
  • An exception unit, which can be essentially a small VN machine, can be connected and used to perform tasks that are, e.g., complex but infrequent or without severe time pressure.
  • machine-context memory interfaces can be made available to the execution units, so that the execution units do not have to understand the underlying format of the memory units — thus greatly simplifying the code executed by the execution units.
  • Figure 1 contains a block diagram for a typical von Neumann machine
  • Figure 2 contains a block diagram for a predictive parser pattern recognizer previously patented by the inventor of the present invention
  • Figure 3 illustrates, in block form, a semantic processor according to an embodiment of the invention
  • Figure 4 shows one possible parser table construct useful with embodiments of the invention
  • Figure 5 shows one possible production rule table organization useful with embodiments of the invention
  • FIG. 6 illustrates, in block form, one implementation for a direct execution parser (DXP) useful with embodiments of the present invention
  • Figure 7 contains a flowchart for the operation of the DXP shown in Figure 6;
  • DXP direct execution parser
  • Figure 8 shows a block diagram for a reconfigurable semantic processor according to an embodiment of the invention
  • Figure 9 shows the block organization of a semantic code execution engine useful with embodiments of the invention.
  • Figure 10 shows the format of an Address Resolution Protocol packet;
  • Figure 11 illustrates an alternate parser table implementation using a Content-Addressable Memory (CAM).
  • CAM Content-Addressable Memory
  • Predictive parser 84 examines each value (octet) that is passed to it. First, parser 84 performs a table lookup using the value and the offset of that value's location from the beginning of packet 70 as an index into parser table 88. Parser table 88 stores, for each combination of value and offset, one of four possible values: 'A', meaning accept the value at that offset; 'D', meaning that the combination of value and offset is a "don't care"; 'F', meaning failure as the value at the offset is not part of the pattern to be recognized; and '$', for an end symbol.
  • Parser stack 86 is not a true "stack" in the normal meaning of the word (or as applied to the invention embodiments to be described shortly); it merely keeps a state variable for each "filter" that parser 84 is trying to match.
  • Each state variable is initialized to an entry state.
  • The stack updates each stack variable. For instance, if an 'A' is returned for a stack variable, that stack variable moves from the entry state to a partial match state. If an 'F' is returned, that stack variable moves from either the entry state or the partial match state to a failure state. If a 'D' is returned, that stack variable maintains its current state. And if a '$' is returned while the state variable is in the entry state or the partial match state, the state variable transitions to the match state.
  • parser 84 returns a match value based on the parser stack states. Semantic engine 82 then takes some output action depending on the success or failure of the match. It should be noted that the parser does not control or coordinate the device function, but instead merely acts as an ancillary pattern matcher to a larger system. Each possible pattern to be distinguished requires a new column in the parser table, such that in a hardware implementation device 80 can match only a limited number of input patterns. And a parser table row is required for each input octet position, even if that input octet position cannot affect the match outcome.
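The state updates described for the prior-art recognizer of Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `State`, `update_state`, and `run_filter` are hypothetical, and keeping the state unchanged when an 'A' arrives in the partial match state is an assumption, since the text only describes the transition out of the entry state.

```python
from enum import Enum


class State(Enum):
    ENTRY = "entry"
    PARTIAL = "partial match"
    FAILURE = "failure"
    MATCH = "match"


def update_state(state, code):
    """Apply one parser-table result ('A', 'D', 'F' or '$') to a filter state."""
    if code == "A" and state is State.ENTRY:
        return State.PARTIAL
    if code == "F" and state in (State.ENTRY, State.PARTIAL):
        return State.FAILURE
    if code == "$" and state in (State.ENTRY, State.PARTIAL):
        return State.MATCH
    # 'D' (don't care), or a transition the text does not describe (assumed no-op)
    return state


def run_filter(codes):
    """Feed a sequence of table results to one filter, starting in the entry state."""
    state = State.ENTRY
    for code in codes:
        state = update_state(state, code)
    return state
```

Once a filter reaches the failure state in this sketch, a later '$' no longer produces a match, which mirrors the restriction in the text that '$' only acts on the entry and partial match states.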
  • FIG. 3 shows a semantic processor 100 according to an embodiment of the invention.
  • semantic processor 100 contains a direct execution parser (DXP) 200 that controls the processing of input packets.
  • As DXP 200 parses data received at the input port 102, it expands and executes actual grammar productions in response to the input, and instructs semantic code execution engine (SEE) 300 to process segments of the input, or perform other operations, as the grammar executes.
  • SEE semantic code execution engine
  • the semantic processor is reconfigurable, and thus has the appeal of a VN machine without the high overhead. Because the semantic processor only responds to the input it is given, it can operate efficiently with a smaller instruction set than a VN machine. The instruction set also benefits because the semantic processor allows processing in a machine context.
  • Semantic processor 100 uses at least three tables. Code segments for SEE 300 are stored in semantic code table 160. Complex grammatical production rules are stored in a production rule table 140. Codes for retrieving those production rules are stored in a parser table 120. The codes in parser table 120 also allow DXP 200 to detect whether, for a given production rule, a code segment from semantic code table 160 should be loaded and executed by SEE 300.
  • Figure 4 shows a general block diagram for a parser table 120.
  • a production rule code memory 122 stores table values, e.g., in a row-column format. The rows of the table are indexed by a non-terminal code. The columns of the table are indexed by an input data value.
  • codes for many different grammars can exist at the same time in production rule code memory 122. For instance, as shown, one set of codes can pertain to MAC (Media Access Control) packet header format parsing, and other sets of codes can pertain to Address Resolution Protocol (ARP) packet processing, Internet Protocol (IP) packet processing, Transmission Control Protocol (TCP) packet processing, Real-time Transport Protocol (RTP) packet processing, etc.
  • Non-terminal codes need not be assigned in any particular order in production rule code memory 122, nor in blocks pertaining to a particular protocol as shown.
  • Addressor 124 receives non-terminal (NT) codes and data values from DXP 200. Addressor 124 translates [NT code, data value] pairs into a physical location in production rule code memory 122, retrieves the production rule (PR) code stored at that location, and returns the PR code to the DXP.
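As a rough sketch of what addressor 124 does, the following Python models production rule code memory 122 as a flat list and maps an [NT code, data value] pair to a physical location. The row-major layout, the 256-column width (one column per octet value), and the `NO_RULE` code are assumptions made for illustration; the source does not specify them.

```python
NUM_VALUES = 256  # assumed: one column per possible octet value
NO_RULE = 0       # assumed: PR code meaning "no production rule applies"


def make_code_memory(num_nt_codes):
    """Create an empty production rule code memory with one row per NT code."""
    return [NO_RULE] * (num_nt_codes * NUM_VALUES)


def address_of(nt_code, data_value):
    """Translate an [NT code, data value] pair into a physical location."""
    return nt_code * NUM_VALUES + data_value


def store_pr_code(memory, nt_code, data_value, pr_code):
    memory[address_of(nt_code, data_value)] = pr_code


def lookup_pr_code(memory, nt_code, data_value):
    """Return the PR code stored for this NT code and input data value."""
    return memory[address_of(nt_code, data_value)]
```

A dense table like this wastes space when most combinations have no rule, which is presumably one motivation for the CAM-based alternative of Figure 11.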
  • PR production rule
  • Parser table 120 can be located on or off-chip, when DXP 200 and SEE 300 are integrated together in a circuit.
  • a static RAM located on-chip can serve as parser table 120.
  • off-chip DRAM storage can store parser table 120, with addressor 124 serving as or communicating with a memory controller for the DRAM.
  • the parser table can be located in off-chip memory, with an on-chip cache capable of holding a section of the parser table. Addressor 124 may not be necessary in some implementations, but when used can be part of parser 200, part of parser table 120, or an intermediate functional block.
  • Figure 5 illustrates one possible implementation for production rule table 140.
  • Production rule memory 142 stores the actual production rule sequences of terminal and nonterminal symbols, e.g., as null-terminated chains of consecutive memory addresses.
  • An addressor 144 receives PR codes, either from DXP 200 or directly from parser table 120.
  • Because production rules can have various lengths, it is preferable to take an approach that allows easy indexing into memory 142.
  • the PR code could be arithmetically manipulated to determine a production rule's physical memory starting address (this would be possible, for instance, if the production rules were sorted by expanded length, and then PR codes were assigned according to a rule's sorted position).
  • the PR code could also be the actual PR starting address, although in some applications this may make the PR codes unnecessarily lengthy.
  • a pointer table 150 is populated with a PR starting address for each PR code. Addressor 144 retrieves a production rule by querying pointer table 150 using the PR code as an address. Pointer table 150 returns a PR starting address PR_ADD. Addressor 144 then retrieves PR data from production rule memory 142 using this starting address. Addressor 144 increments the starting address and continues to retrieve PR data until a NULL character is detected.
  • Figure 5 shows a second column in table 150, which is used to store a semantic code (SC) starting address.
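The pointer-table lookup can be sketched as follows. The memory contents, the PR codes, and the semantic code addresses are made-up illustration values; only the mechanism (look up PR_ADD, then read consecutive locations until NULL) comes from the description above.

```python
NULL = 0  # terminator marking the end of each production rule

# Production rule memory: null-terminated chains stored back to back.
production_rule_memory = [0x11, 0x12, NULL, 0x21, NULL, 0x31, 0x32, 0x33, NULL]

# Pointer table: PR code -> (PR starting address, SC starting address or None).
pointer_table = {
    1: (0, 0x40),
    2: (3, None),
    3: (5, 0x80),
}


def fetch_production_rule(pr_code):
    """Return the symbols of the production rule selected by pr_code."""
    addr, _sc_addr = pointer_table[pr_code]  # addr plays the role of PR_ADD
    symbols = []
    while production_rule_memory[addr] != NULL:
        symbols.append(production_rule_memory[addr])
        addr += 1  # increment the address until a NULL is detected
    return symbols


def fetch_semcode_address(pr_code):
    """Return the semantic code starting address, or None if the rule has none."""
    return pointer_table[pr_code][1]
```

Because the rules are null-terminated, NULL itself cannot appear as a symbol inside a rule in this sketch.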
  • SC semantic code
  • FIG. 6 shows one possible block implementation for DXP 200.
  • Parser control finite state machine (FSM) 210 controls and sequences overall DXP operation, based on inputs from the other logical blocks in Figure 6.
  • Stack handler 220 and stack 222 store and sequence the production rules executed by DXP 200.
  • Parser table interface 230 allows DXP 200 to retrieve PR codes from an attached parser table.
  • Production rule table interface 240 allows DXP 200 to retrieve production rules from an attached production rule table.
  • semcode table interface 250 allows DXP 200 to identify the memory location of semantic code segments associated with production rules (in the illustrated embodiment, interfaces 240 and 250 are partially combined).
  • Input stream sequence control 260 and register 262 retrieve input data symbols from the Si-Bus. Comparator 270 compares input symbols with symbols from parser stack 222. Finally, SEE interface 280 is used to dispatch tasks to one or more SEEs communicating with DXP 200 on the Sx-Bus.
  • stack handler 220 retrieves a production symbol pX pointed to by its top-of-stack pointer psp.
  • the production symbol pX is split into two constituent parts, a prefix p and a symbol X. Prefix p codes the type of the symbol X, e.g., according to the following mapping for a two-bit prefix:
  • the prefix can indicate a masked terminal symbol.
  • a masked terminal symbol allows the specification of a bit mask for the input symbol, i.e., some (or all) bits of the terminal symbol are "don't care" bits.
  • the masked terminal symbol construct can be useful, e.g., for parsing packet flag fields such as occur in many network protocols.
  • Input stream sequence control 260 also loads the current input stream value pointed to by input pointer ip into aReg register 262. This step may not be necessary if the previous parsing cycle did not advance input pointer ip.
  • When parser control FSM 210 receives the new prefix code p from stack handler 220, it determines (flowchart block 402) which of three possible logic paths to take for this parsing cycle. If the prefix code indicates that X is a terminal symbol, path 410 is taken. If the prefix code indicates that X will match any input symbol, path 420 is taken. And if the prefix code indicates that X is a non-terminal symbol, path 430 is taken. The processing associated with each path will be explained in turn.
  • parser control FSM 210 makes another path branch, based on the symbol match signal M supplied by comparator 270.
  • Comparator 270 compares input symbol a to stack symbol X; if the two are identical, signal M is asserted. If masked terminal symbols are allowed and a masked terminal symbol is supplied, comparator 270 applies the mask such that signal M depends only on the unmasked stack symbol bits.
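The comparator's behavior with a masked terminal symbol can be illustrated with a one-line bitwise test. The function name `symbol_match` and the mask polarity (a 1 bit means "compare this bit", a 0 bit means "don't care") are assumptions; the source only states that M depends on the unmasked bits.

```python
def symbol_match(input_symbol, stack_symbol, mask=0xFF):
    """Return signal M: True when both symbols agree on every compared bit.

    mask bit = 1: the bit takes part in the comparison (assumed polarity)
    mask bit = 0: the bit is a "don't care"
    """
    return (input_symbol & mask) == (stack_symbol & mask)
```

With the default mask of 0xFF the function reduces to a plain equality test on an octet, which corresponds to an ordinary terminal symbol.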
  • parser control FSM 210 enters an error recovery mode at block 414.
  • error recovery will flush the remainder of the packet from the input (e.g., by matching the input with an end of frame (EOF) symbol until a match is detected) and pop the remaining symbols off the stack.
  • EOF end of frame
  • a semCode segment may also be dispatched to a SEE to clean up any machine state data related to the errant packet.
  • Processing path 420 accomplishes two tasks, shown as blocks 422 and 424 in Figure 7.
  • parser control FSM 210 signals stack handler 220 to "pop" the current value of X off of stack 222, e.g., by decrementing the stack pointer psp.
  • parser control FSM 210 signals input stream sequence control 260 to increment the input pointer ip to the next symbol in the input stream.
  • Processing path 430 processes non-terminal symbols appearing on stack 222.
  • processing blocks 432, 434, 438, and 440 expand the non-terminal symbol into its corresponding production rule.
  • parser control FSM 210 replaces X on stack 222 with its expanded production rule.
  • Parser control FSM 210 signals production rule table (PRT) interface 240 and SemCode table (SCT) interface 250 to perform lookups using PR code y.
  • PRT production rule table
  • SCT SemCode table
  • Parser control FSM 210 also signals stack handler 220 to pop the current value of X off of stack 222.
  • PRT interface 240 returns production rule PR[y]
  • parser control FSM 210 signals stack handler 220 to push PR[y] onto stack 222.
  • this length must be accounted for in the push, i.e. some expansions may require multiple symbol transfers from the production rule table (the path width from the table to the stack handler may, of course, be more than one symbol wide).
  • SCT interface 250 has returned a corresponding SemCode address code
  • the address code SCT[y] may contain an actual physical address for the first SemCode microinstruction corresponding to PR code y, or some abstraction that allows a SEE to load that microinstruction.
  • the address code SCT[y] may contain other information as well, such as an indication of which SEE (in a multiple-SEE system) should receive the code segment.
  • When commanded by parser control FSM 210, SEE interface 280 examines SCT[y] and determines whether a code segment needs to be dispatched to a SEE. As shown by decision block 442 in Figure 7, no microinstruction execution is necessary if SCT[y] is not "valid", i.e., a NULL value is represented. Otherwise, SEE interface 280 determines (decision block 444) whether a SEE is currently available. SEE interface 280 examines a semaphore register (not shown) to determine SEE availability. If a particular SEE is indicated by SCT[y], SEE interface 280 examines the semaphore for that SEE.
  • SEE interface 280 enters wait state 446 until the semaphore clears. If any SEE may execute the SemCode segment, SEE interface 280 can simply select one with a clear semaphore.
  • SEE interface 280 captures the SX-bus and transmits SCT[y] to the selected SEE.
  • the selected SEE sets its semaphore to indicate that it has received the request.
  • When parser control FSM 210 first commands SEE interface 280 to dispatch SCT[y], SEE interface 280 deasserts the SEE status line to suspend further parsing, thereby preventing parser control FSM 210 from exiting the current parsing cycle until SCT[y] is dispatched (the stack push of the expanded production rule PR[y] can continue in parallel while the SEE status line is deasserted). Whether or not DXP 200 continues to suspend parsing once SCT[y] has been transferred to the selected SEE can be dependent on SCT[y]. For instance, SCT[y] can also code how long the corresponding SemCode segment should block further processing by parser control FSM 210.
  • the DXP can be released: as soon as SCT[y] is dispatched; as soon as the SEE sets its semaphore; a programmable number of clock cycles after the SEE sets its semaphore; or not until the SEE sets and clears its semaphore.
  • the SEE can have different semaphore states corresponding to these different possibilities.
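The SEE selection step (decision block 444 and wait state 446) can be sketched as a small function over a semaphore register. Representing the register as a list of booleans and returning `None` to stand for the wait state are modeling assumptions, and `select_see` is a hypothetical name.

```python
def select_see(semaphores, requested=None):
    """Pick a SEE for a SemCode dispatch.

    semaphores: list of booleans, True meaning that SEE is busy
    requested:  index of the SEE named by SCT[y], or None if any SEE will do
    Returns the index of the selected SEE, or None to signal "wait".
    """
    if requested is not None:
        # A particular SEE is indicated: only its semaphore matters.
        return None if semaphores[requested] else requested
    # Otherwise simply take the first SEE with a clear semaphore.
    for index, busy in enumerate(semaphores):
        if not busy:
            return index
    return None
```

A caller modeling wait state 446 would call the function again until it returns an index.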
  • stack handler 220 will assert stack empty signal SE to parser control FSM 210 if the stack is empty.
  • SE stack empty signal
  • parser control FSM 210 resets its states to wait for the beginning of the next input packet. As long as the stack is not empty, however, the parser control FSM returns to block 400 and begins a new parsing cycle.
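The three logic paths of the parsing cycle can be put together in a short software model. This is a sketch under several assumptions, not the hardware design: stack symbols are modeled as `(kind, value)` tuples with kinds "T", "ANY", and "NT" standing in for the prefix codes, error recovery is reduced to returning `False`, dispatching a SemCode segment is reduced to recording its name, and the toy grammar at the bottom is invented for illustration.

```python
def parse(input_symbols, start_symbol, parser_table, production_rules, semcode_table):
    """Run parsing cycles until the stack is empty.

    Returns (accepted, dispatched), where dispatched lists the SemCode
    entries that would have been sent to a SEE.
    """
    stack = [start_symbol]
    ip = 0  # input pointer
    dispatched = []
    while stack:
        kind, x = stack[-1]  # symbol at the top of the stack
        a = input_symbols[ip] if ip < len(input_symbols) else None
        if kind == "T":  # path 410: terminal symbol must match the input
            if a != x:
                return False, dispatched  # simplified error recovery
            stack.pop()
            ip += 1
        elif kind == "ANY":  # path 420: matches any input symbol
            if a is None:
                return False, dispatched  # assumed: running out of input is an error
            stack.pop()
            ip += 1
        else:  # path 430: expand the non-terminal symbol
            y = parser_table.get((x, a))  # PR code for [NT code, data value]
            if y is None:
                return False, dispatched
            stack.pop()
            # Push the rule in reverse so its first symbol ends up on top.
            stack.extend(reversed(production_rules[y]))
            if semcode_table.get(y) is not None:
                dispatched.append(semcode_table[y])
    return True, dispatched


# Toy tables (illustrative values only).
toy_parser_table = {("OP", 0x01): 1, ("OP", 0x02): 2}
toy_production_rules = {
    1: [("T", 0x01), ("ANY", None), ("ANY", None)],
    2: [("T", 0x02), ("ANY", None)],
}
toy_semcode_table = {1: "s-code 1", 2: None}
```

Note that the non-terminal path does not advance the input pointer: the input symbol is only used as look-ahead to choose the production rule, and it is consumed later when the matching terminal reaches the top of the stack.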
  • FIG. 8 shows a second RSP embodiment 500 with expanded capability.
  • RSP 500 incorporates N+1 SEEs 300-0 to 300-N.
  • RSP 500 also contains several other significant additions: an exception processing unit (EPU) 600, an array machine-context data memory (AMCD) 700, and a variable machine-context data memory (VMCD) 800.
  • EPU exception processing unit
  • AMCD array machine-context data memory
  • VMCD variable machine-context data memory
  • SEE 300-0 includes an arithmetic logic unit (ALU) 310, a set of pipeline registers 320, and a semCode (or s-code) instruction decoder 330.
  • An s-code queue 340 stores microinstructions to be executed by the SEE. The microinstructions themselves are stored in semCode table 160 and received by the SEE S-bus interface 360.
  • SEE control finite state machine (FSM) 350 coordinates the operation of the SEE blocks shown.
  • SEE 300-0 sits idle until it receives an execution request (from DXP 200) on the Sx-bus.
  • SEE control FSM 350 examines traffic on the Sx-bus, waiting for a request directed to SEE 300-0 (for instance, up to 16 SEEs can be addressed with four Sx-bus address lines, each SEE having a unique address).
  • the request contains, e.g., a starting SemCode address.
  • SEE control FSM 350 responds to the request by: setting its semaphore to acknowledge that it is now busy; and instructing S-bus interface 360 to drive a request on the S-bus to retrieve the microinstruction code segment beginning with the received starting SemCode address.
  • S-bus interface 360 is tasked with placing s-code instructions in queue 340 before s-code instruction decoder 330 needs them.
  • S-bus interface 360 does have to contend with other SEE S-bus interfaces for access to the S-bus; therefore, it may be beneficial to download multiple sequential instructions at a time in a burst.
  • S-bus interface 360 maintains an s-code address counter (not shown) and continues to download instructions sequentially unless directed otherwise by SEE control FSM 350.
  • S-code microinstruction decoder 330 executes the code segment requested by the
  • ALU 310 can be conventional, e.g., having the capability to perform addition, comparison, shifting, etc., using its own register values and/or values from pipeline register 320.
  • Pipeline registers 320 allow machine-context access to data.
  • the preferred SEE embodiments have no notion of the physical data storage structure used for the data that they operate on. Instead, accesses to data take a machine-context transactional form.
  • Variable (e.g., scalar) data is accessed on the V-bus; array data is accessed on the A-bus; and input stream data is accessed on the Si-bus.
  • the instruction decoder 330 prompts the V-bus interface to issue a bus request {read, mct, offset, m}.
  • the context mct refers to the master context of the RSP; other sub-contexts will usually be created and destroyed as the RSP processes input data, such as a sub-context for a current TCP packet or active session.
  • Once a pipeline register has been issued a command, it handles the data transfer process. If multiple bus transfers are required to read or write m octets, the pipeline register tracks the transaction to completion. As an example, a six-octet field can be transferred from the stream input to a machine-context variable using two microinstructions: a first instruction reads six octets from the Si-bus to a pipeline register; a second instruction then writes the six octets from the register to the machine-context variable across the V-bus.
  • the register interfaces perform however many bus data cycles are required to effect the transfer.
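The six-octet example can be mimicked in software to show how one microinstruction may span several bus data cycles. The four-octet bus width is purely an assumption (the source does not give bus widths), and `transfer` is a hypothetical helper.

```python
def transfer(data, m, bus_width=4):
    """Move m octets from data in chunks of at most bus_width octets.

    Returns (octets_moved, bus_cycles). bus_width=4 is an assumed value.
    """
    if len(data) < m:
        raise ValueError("source holds fewer than m octets")
    moved = bytearray()
    cycles = 0
    while len(moved) < m:
        end = min(len(moved) + bus_width, m)
        moved += data[len(moved):end]  # one bus data cycle
        cycles += 1
    return bytes(moved), cycles


# First microinstruction: read six octets from the input stream to a register.
stream = bytes([0x08, 0x01, 0x02, 0x03, 0x04, 0x05, 0xFF])
register, read_cycles = transfer(stream, 6)

# Second microinstruction: write the register to a machine-context variable.
variables = {}
variables["curr_SA"], write_cycles = transfer(register, 6)
```

Under the assumed bus width, each of the two microinstructions takes two bus cycles, but the code issuing them never needs to know that.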
  • VMCD 800 serves the requests initiated on the V-bus.
  • VMCD 800 has the capability to translate machine-context variable data requests to physical memory transactions.
  • VMCD 800 preferably maintains a translation table referencing machine context identifiers to physical starting addresses, contains a mechanism for allocating and deallocating contexts, allows contexts to be locked by a given SEE, and ensures that requested transactions do not fall outside of the requested context's boundaries.
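A toy model may help make the listed VMCD responsibilities concrete: a translation table from context identifiers to physical addresses, allocation and deallocation, per-SEE locking, and bounds checking. Everything here is an illustrative assumption, including the class and method names, the bump allocator that never reuses freed space, and the choice of exceptions.

```python
class VMCD:
    """Toy variable machine-context data memory (names are illustrative)."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.table = {}     # context id -> (physical start address, length)
        self.locks = {}     # context id -> id of the SEE holding the lock
        self.next_free = 0  # simple bump allocator; freed space is not reused
        self.next_id = 0

    def allocate(self, length):
        if self.next_free + length > len(self.memory):
            raise MemoryError("no room for a new context")
        ctx = self.next_id
        self.next_id += 1
        self.table[ctx] = (self.next_free, length)
        self.next_free += length
        return ctx

    def deallocate(self, ctx):
        del self.table[ctx]
        self.locks.pop(ctx, None)

    def lock(self, ctx, see_id):
        if self.locks.get(ctx, see_id) != see_id:
            raise PermissionError("context is locked by another SEE")
        self.locks[ctx] = see_id

    def _translate(self, ctx, offset, m, see_id):
        start, length = self.table[ctx]
        if self.locks.get(ctx, see_id) != see_id:
            raise PermissionError("context is locked by another SEE")
        if offset < 0 or m < 0 or offset + m > length:
            raise IndexError("transaction falls outside the context")
        return start + offset  # physical address

    def read(self, ctx, offset, m, see_id=None):
        addr = self._translate(ctx, offset, m, see_id)
        return bytes(self.memory[addr:addr + m])

    def write(self, ctx, offset, data, see_id=None):
        addr = self._translate(ctx, offset, len(data), see_id)
        self.memory[addr:addr + len(data)] = data
```

The caller only ever names a context, an offset, and a length, which is the point of the machine-context interface: the physical layout can change without touching the code that issues the requests.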
  • the actual storage mechanism employed can vary based on application: the memory could be completely internal, completely external, a mix of the two, a cache with a large external memory, etc.
  • An external memory can be shared with external memory for other memory sections, such as the AMCD, e-code table, input buffer, parser table, production rule table, and semCode table, in a given implementation.
  • the A-bus interface and AMCD 700 operate similarly, but with an array machine context organization.
  • different types of arrays and tables can be allocated, resized, deallocated, written to, read from, searched, and possibly even hashed or sorted using simple bus requests.
  • the actual underlying physical memory can differ for different types of arrays and tables, including for example fast onboard RAM, external RAM or ROM, content-addressable memory, etc.
  • each SEE can access input data from buffer 510 across the Si-bus. And each SEE has access to the P-bus and the current symbol on top of the parser stack; this can be useful, e.g., where the same s-code is used with multiple production rules, but its outcome depends on the production rule that initiated it.
  • the pipeline registers of some SEEs can be specialized. For instance, SEE 300-1 in Figure 8 communicates with local I/O block 520 to provide a data path to/from, e.g., local USB or serial ATA devices connected to local I/O block 520. And SEE 300-2 in Figure 8 communicates with EPU 600 to provide a data path to/from an exception unit.
  • Although each SEE could connect separately with each of these devices, in practice the device is simplified and suffers little performance penalty by pairing certain SEEs with certain other functions.
  • Exception processing unit 600 can be a standard von Neumann central processing unit (CPU), although in many applications it can be a very rudimentary one.
  • EPU 600 is preferably used to handle complex code that either runs infrequently or is not timing-critical. Examples are a user log-on procedure, a request to make a local drive available remotely, error logging and recovery, table loading at system startup, and system configuration.
  • EPU 600 responds to DXP requests indirectly, through s-code segments loaded into SEE 300-2.
  • EPU 600 can also call upon SEE 300-2 to perform functions for it, such as reading or writing to AMCD 700 or VMCD 800.
  • An e-code table 610 is preferably available to EPU 600.
  • the e-code table contains boot instructions for the device, and may contain executable instructions for performing other functions requested by the DXP.
  • e-code table 610 may contain a table for translating s-code requests into instruction addresses for code to be executed, with the instruction addresses located in a conventional external memory space.
  • ARP Address Resolution Protocol
  • This example walks through the creation of production rules, parser table entries, and the functional substance of s-code for handling received ARP packets.
  • ARP packets allow local network nodes to associate each peer's link-layer (hardware) address with a network (protocol) address for one or more network protocols.
  • This example assumes that the hardware protocol is Ethernet, and that the network protocol is Internet Protocol (IP or IPv4).
  • IP or IPv4 Internet Protocol
  • When the opcode field is set to 1, the sender is trying to discover the target hardware address associated with the target protocol address, and is requesting an ARP reply packet.
  • When the opcode field is set to 2, the sender is replying to an ARP request; in this case, the sender's hardware address is the target hardware address that the original sender was looking for.
  • A $ indicates the beginning of a production rule, and {} enclose s-code to be performed by a SEE:
  • $ARP_OP : 0x01 ARP_REQ_ADDR | 0x02 ARP_REPLY_ADDR
  • $ARP_REQ_ADDR : ARP_SENDER_HW ARP_SENDER_PROT ARP_TARGET_HW
  • $ARP_REPLY_ADDR : ARP_SENDER_HW ARP_SENDER_PROT ARP_TARGET_HW
  • This example only processes a limited set of all possible ARP packets, namely those properly indicating fields consistent with an Ethernet hardware type and an IP protocol type; all others will fail to parse and will be rejected.
  • This grammar also leaves a hook for processing IP packets ($IP_BODY) and thus will not reject IP packets, but a corresponding IP grammar is not part of this example.
  • $MAC_PDU merely defines the MAC frame format.
  • Two destination MAC addresses are allowed by $MAC_DA: a specific hardware address (0x08 0x01 0x02 0x03 0x04 0x05) and a broadcast address of all 1's. All other MAC addresses are automatically rejected, as a packet without one of these two addresses will fail to parse.
  • Any source address is accepted by $MAC_SA; a SEE is called to save the source address to a master context table variable mct->curr_SA on the VMCD.
  • $MAC_PAYLOAD and $ET2 combine to ensure that only two types of payloads are parsed, an ARP payload and an IP payload (further parsing of an IP payload is not illustrated herein). Of course, other packet types can be added by expanding these productions.
  • the first four elements of the ARP body (hardware and protocol types and address lengths) are shown fixed; if ARP were implemented for another protocol as well as IP, these elements could be generalized (note that the generalization of the length fields might allow different sizes for the address fields that follow, a condition that would have to be accounted for in the production rules).
  • Two values for $ARP_OP are possible, a 1 for a request and a 2 for a reply.
  • although address parsing does not differ for the two values of ARP_OP, the s-code to be executed in each case does.
  • S-code segment 1, which is executed for ARP requests, compares the target protocol to the local IP address stored in the master context table on the VMCD. When these are equal, a SEE generates an ARP reply packet to the sender's hardware and IP addresses.
  • S-code segment 2 executes for both ARP requests and ARP replies — this segment updates an ArpCache array stored in the AMCD with the sender's hardware and protocol addresses and the time received.
  • the "update" command to mct->ArpCache includes a flag or mask to identify which data in ArpCache should be used to perform the update; normally, the cache would be indexed at least by IP address.
  • ARP_PADDING will be 18 octets in length.
  • ARP_PADDING production rule shown here fits any number of octets.
  • an s-code segment is called to calculate the padding length and "throw away" that many octets, e.g., by advancing the input pointer.
  • the parser could use a five-octet look-ahead to the EoFrame token in the input; when the token is found, the preceding four octets are the FCS.
  • the MAC_FCS production indicates that a SEE is to check the FCS attached to the packet.
  • a SEE may actually compute the checksum, or the checksum may be computed by input buffer or other hardware, in which case the SEE would just compare the packet value to the calculated value and reject the packet if no match occurs.
  • exemplary production rule table and parser table values will now be given and explained. First, production rules will be shown, wherein hexadecimal notation illustrates a terminal value, decimal notation indicates a production rule, and "octet" will match any octet found at the head of an input stream.
  • a non-terminal (NT) code is used as an index to the parser table; a production rule (PR) code is stored in the parser table, and indicates which production rule applies to a given combination of NT code and input value.
  • the RHS Non-terminal Values, e.g., with a special end-of-rule symbol attached, are what get stored in the RSP's production rule table.
  • the production rule codes are "pointers" to the corresponding production rules; it is the PR codes that actually get stored in the parser table.
  • the following parser table segment illustrates the relationship between PR and PR code:
  • the combination of an NT code and a "Head of Input Stream Data Value" index the parser table values in the RSP.
  • the start symbol S, EoFrame symbol, and bottom of stack symbol $ are special cases — the parser control FSM can be implemented to not reference the parser table for these symbols.
  • the table produces the same PR code regardless of the data value occupying the head of the input stream.
  • all other NT codes have valid values for only one or two head of input stream values (a blank value in a cell represents an invalid entry).
  • This information can be coded in a matrix format, with each cell filled in, or can be coded in some other more economical format.
  • the DXP is stepped by parser cycles, corresponding to one "loop" through the flowchart in Figure 7.
  • the following machine states are tracked: the input pointer ip, indicating the byte address of the current stream input symbol being parsed; the input symbol pointed to by the input pointer, *ip; the parser stack pointer psp, indicating which stack value is pointed to at the beginning of the parser cycle; the top-of-parser-stack symbol at the beginning of that parser cycle, *psp, where non-terminal symbols are indicated by the prefix "nt", and the terminal symbol t.
  • xx matches any input symbol; PT[*ip, *psp], the currently indexed value of the parser table; PRT[PT], the production rule pointed to by PT[*ip, *psp]; SCT[PT], the s-code segment pointed to by PT
  • the results for parsing this example packet are shown below in tabular format, followed by a brief explanation. Although the example is lengthy, it is instructive as it exercises most of the basic functions of the RSP.
  • the detailed example above illustrates how production rules are expanded onto the parser stack and then processed individually, either by: matching a terminal symbol with an input symbol (see, e.g., parser cycles 2-7); matching a terminal don't care symbol t.xx with an input symbol (see, e.g., parser cycles 9-14); further expanding a non-terminal symbol either irrespective of input (see, e.g., parser cycle 8) or based on the current input symbol (see, e.g., parser cycles 0, 1, 17); or executing a null cycle, in this case to allow a SEE to adjust the input pointer to "skip" parsing for a padding field (parser cycle 63).
  • This example also illustrates the calls to s-code segments at appropriate points during the parsing process, depending on which production rules get loaded onto the stack (parser cycles 8, 33, 62, 64). It can be appreciated that some of these code segments can execute in parallel with continued parsing.
  • the exemplary grammar given above is merely one way of implementing an ARP grammar according to an embodiment of the invention. Some cycle inefficiencies could be reduced by explicitly expanding some of the non-terminals into their parent production rules, for example.
  • the ARP grammar could also be generalized considerably to handle more possibilities.
  • the coding selected, however, is meant to illustrate basic principles and not all possible optimizations or ARP features. Explicit expansions may also be limited by the chosen stack size for a given implementation.
  • DXP 200 can implement an LL(f(X)) parser, where the look-ahead value f(X) is coded in a stack symbol, such that each stack symbol can specify its own look-ahead.
  • the look-ahead value is coded into the production rule table, such that when the rule is executed DXP 200 looks up (X, a+5) in the production rule table.
  • variable look-ahead capability can also be used to indicate that multiple input symbols are to be used in a table lookup. For instance, the production rule for MAC_DA could be specified as
  • the parser table contains two entries that match six symbols each, e.g., at parser table locations (X, a): (130, 0x08 0x01 0x02 0x03 0x04 0x05) and (130, 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF).
  • CAM is shown in Figure 11.
  • Ternary CAM 900 of Figure 11 is loaded with a table of match addresses and corresponding production rule codes.
  • Each match address comprises a one-octet stack symbol X and six octets of input symbols a1, a2, a3, a4, a5, a6.
  • when a match address is supplied to CAM 900, it determines whether a match exists in its parser table entries. If a match exists, the corresponding production rule code is returned (alternately, the address of the table entry that caused a match is returned, which can be used as an index into a separate table of production rule codes or pointers).
  • one advantage of the parser table implementation of Figure 11 is that it is more efficient than a matrix approach, as entries are only created for valid combinations of stack and input symbols. This same efficiency allows for longer input symbol strings to be parsed in one parser cycle (up to six input symbols are shown, but a designer could use whatever length is convenient), thus a MAC or IP address can be parsed in one parser cycle. Further, look-ahead capability can be implicitly coded into the CAM, e.g., the next six input symbols can always be supplied to the table.
  • the CAM bits corresponding to a2, a3, a4, a5, a6 on that row are set to a "don't care" value xx, and merely do not contribute to the lookup.
  • the CAM bits corresponding to a3, a4, a5, a6 on those rows are set to xx.
  • a binary CAM can also function in a parser table implementation. The primary difference is that the binary CAM cannot store "don't care" information explicitly, thus leaving the parser state machine (or some other mechanism) responsible for handling any "don't care" functionality in some other manner.
  • the concepts taught herein can be tailored to a particular application in many other advantageous ways. For instance, many variations on the codes and addressing schemes presented are possible.
  • a microinstruction code segment ends with a NULL instruction — the occurrence of the NULL instruction can be detected either by the S-bus interface of a SEE, by the microinstruction decoder, or even by an s-code table function.
  • the s-code addresses do not necessarily have to be known to the SEEs; it is possible for the SCT to track instruction pointers for each SEE, with the instruction pointers for each SEE set by the DXP. Although multiple memory storage areas with different interfaces are illustrated, several of the interfaces can share access to a common memory storage area that serves as a physical storage space for both. Those skilled in the art will recognize that some components, such as the exception processing unit, can either be integrated with the RSP or connect to the RSP as a separate unit.
  • the parser table, production rule table, and s-code table are populated for a given set of grammars; the population can be achieved, for example, through an EPU, a boot-code segment on one of the SEEs, or a boot-grammar segment with the table population instructions provided at the input port.
  • the tables can also, of course, be implemented with non-volatile memory so that table reloading is not required at every power-up.
  • the flowchart illustrating the operation of the DXP is merely illustrative — for instance, it is recognized herein that a given state machine implementation may accomplish many tasks in parallel that are shown here as sequential tasks, and may perform many operations speculatively.
  • an input port merely acknowledges that at least one port exists.
  • the physical port arrangement can be varied depending on application. For instance, depending on port bandwidth and parser performance, several input ports may be multiplexed to the same direct execution parser.
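The ternary CAM parser table described above for Figure 11 can be sketched in software as follows. This is only an illustrative Python model, not the hardware design: the class name `TernaryCam`, the use of `None` for a "don't care" octet, the PR codes 1 to 3, and the entry for stack symbol 131 are assumptions made for the example. Only the stack symbol 130 and the two six-octet MAC addresses come from the text.

```python
from typing import List, Optional, Sequence, Tuple

DONT_CARE = None  # stands in for a ternary "xx" octet (assumption of this sketch)


class TernaryCam:
    """Software stand-in for a ternary CAM holding parser table entries."""

    def __init__(self) -> None:
        # Each entry: (stack symbol X, six input octets or DONT_CARE, PR code)
        self.entries: List[Tuple[int, Tuple[Optional[int], ...], int]] = []

    def add(self, stack_symbol: int, pattern: Sequence[Optional[int]], pr_code: int) -> None:
        # Pad short patterns with "don't care" octets up to six input symbols.
        padded = tuple(pattern) + (DONT_CARE,) * (6 - len(pattern))
        self.entries.append((stack_symbol, padded, pr_code))

    def lookup(self, stack_symbol: int, inputs: Sequence[int]) -> Optional[int]:
        """Return the PR code of the first matching entry, or None if no entry matches."""
        for sym, pattern, pr_code in self.entries:
            if sym != stack_symbol:
                continue
            if all(p is DONT_CARE or p == a for p, a in zip(pattern, inputs)):
                return pr_code
        return None


cam = TernaryCam()
cam.add(130, [0x08, 0x01, 0x02, 0x03, 0x04, 0x05], pr_code=1)  # specific hardware address
cam.add(130, [0xFF] * 6, pr_code=2)                            # broadcast address
cam.add(131, [0x08], pr_code=3)  # hypothetical rule that looks at one octet only
```

Because only valid combinations are stored, a lookup for stack symbol 130 with any other six-octet address returns no match, which corresponds to the parse failure described for $MAC_DA.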

Abstract

Data processors and methods for their configuration and use are disclosed. As opposed to traditional von Neumann microprocessors, the disclosed processors are semantic processors (100): they parse an input stream and direct one or more semantic execution engines (300) to execute code segments, depending on what is being parsed. For defined-structure input streams such as packet data streams, these semantic processors can be both economical and fast as compared to a von Neumann system. Several optional components can augment device operation. For instance, a machine context data interface relieves the semantic execution engines (300) from managing physical memory, allows the orderly access to memory by multiple engines, and implements common access operations. Further, a simple von Neumann exception-processing unit can be attached to a semantic execution engine to execute more complicated, but infrequent or non-time-critical operations.

Description

A RECONFIGURABLE SEMANTIC PROCESSOR
FIELD OF THE INVENTION This invention relates generally to digital processors and processing, and more specifically to digital semantic processors for data stream processing.
BACKGROUND OF THE INVENTION Traditional programmable computers use a von Neumann, or VN, architecture. The VN architecture, in its simplest form, comprises a central processing unit (CPU) and attached memory, usually with some form of input/output to allow useful operations. For example, Figure 1 shows a computer 20 comprising a CPU 30, a memory controller 40, memory 50, and input/output (I/O) devices 60. CPU 30 sends data requests to memory controller 40 over address/control bus 42; the data itself passes over a data bus 44. Memory controller 40 communicates with memory 50 and I/O devices 60 to perform data reads and writes as requested by CPU 30 (or possibly by the I/O devices). Although not shown, the capability exists for various devices to "interrupt" the CPU and cause it to switch tasks.
In a VN machine, memory 50 stores both program instructions and data. CPU 30 fetches program instructions from the memory and executes the commands contained therein; typical instructions instruct the CPU to load data from memory to a register, write data to memory from a register, perform an arithmetic or logical operation using data in its onboard registers, or branch to a different instruction and continue execution. As can be appreciated, CPU 30 spends a great deal of time fetching instructions, fetching data, or writing data over data bus 44. Although elaborate (and usually costly) schemes can be implemented to cache data and instructions that might be useful, implement pipelining, and decrease average memory cycle time, data bus 44 is ultimately a bottleneck on processor performance. The VN architecture is attractive, as compared to gate logic, because it can be made "general-purpose" and can be reconfigured relatively quickly; by merely loading a new set of program instructions, the function of a VN machine can be altered to perform even very complex functions, given enough time. The tradeoffs for the flexibility of the VN architecture are complexity and inefficiency. Thus the ability to do almost anything comes at the cost of being able to do a few simple things efficiently.
SUMMARY OF THE INVENTION Many digital devices either in service or on the near horizon fall into the general category of packet processors. In other words, these devices communicate with another device or devices using packets, e.g., over a cable, fiber, or wireless networked or point-to-point connection, a backplane, etc. In many such devices, what is done with the data received is straightforward, but the packet protocol and packet processing are too complex to warrant the design of special-purpose hardware. Instead, such devices use a VN machine to implement the protocols. It is recognized herein that a different and attractive approach exists for packet processors, an approach that can be described more generally as a reconfigurable semantic processor (RSP). Such a device is preferably reconfigurable like a VN machine, as its processing depends on its "programming", although as will be seen this "programming" is unlike conventional machine code used by a VN machine. Whereas a VN machine always executes a set of machine instructions that check for various data conditions sequentially, the RSP responds directly to the semantics of an input stream. In other words, the "code" that the RSP executes is selected by its input. Thus for packet input, with a defined grammar, the RSP is ideally suited to fast and efficient packet processing.
Some embodiments described herein use a table-driven predictive parser to drive direct execution of the protocols of a network grammar, e.g., an LL (Left-to-right parsing by identifying the Left-most production) parser. Other parsing techniques, e.g., recursive descent, LR (Left-to-right parsing by identifying the Right-most production), and LALR (Look Ahead LR) may also be used in embodiments of the invention. In each case, the parser responds to its input by launching microinstruction code segments on a simple execution unit. When the tables are placed in rewritable storage, the RSP can be easily reconfigured, and thus a single RSP design can be useful in a variety of applications. In many applications, the entire RSP, including the tables necessary for its operation, can be implemented on a single, low-cost, low-power integrated circuit.
A number of optional features can increase the usefulness of such a device. A bank of execution units can be used to execute different tasks, allowing parallel processing. An exception unit, which can be essentially a small VN machine, can be connected and used to perform tasks that are, e.g., complex but infrequent or without severe time pressure. And machine-context memory interfaces can be made available to the execution units, so that the execution units do not have to understand the underlying format of the memory units, thus greatly simplifying the code executed by the execution units.
BRIEF DESCRIPTION OF THE DRAWING The invention may be best understood by reading the disclosure with reference to the drawing, wherein:
Figure 1 contains a block diagram for a typical von Neumann machine; Figure 2 contains a block diagram for a predictive parser pattern recognizer previously patented by the inventor of the present invention;
Figure 3 illustrates, in block form, a semantic processor according to an embodiment of the invention;
Figure 4 shows one possible parser table construct useful with embodiments of the invention; Figure 5 shows one possible production rule table organization useful with embodiments of the invention;
Figure 6 illustrates, in block form, one implementation for a direct execution parser (DXP) useful with embodiments of the present invention; Figure 7 contains a flowchart for the operation of the DXP shown in Figure 6;
Figure 8 shows a block diagram for a reconfigurable semantic processor according to an embodiment of the invention;
Figure 9 shows the block organization of a semantic code execution engine useful with embodiments of the invention; Figure 10 shows the format of an Address Resolution Protocol packet; and
Figure 11 illustrates an alternate parser table implementation using a Content-Addressable Memory (CAM).
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The inventor of the present application is a co-inventor on a previous patent entitled "Pattern Recognition in Data Communications Using Predictive Parsers", U.S. Patent No. 5,916,305, issued June 29, 1999. Although overall the device described in the '305 patent is quite different from the present invention, it is instructive as a general introduction to the use of a rudimentary predictive parser in conjunction with a network protocol, as a pattern matcher. Figure 2 shows a block diagram of a device 80 as described in the '305 patent. A semantic engine 82 reads a packet 70, and passes the packet data octets as values to predictive parser 84. Predictive parser 84 examines each value (octet) that is passed to it. First, parser 84 performs a table lookup using the value and the offset of that value's location from the beginning of packet 70 as an index into parser table 88. Parser table 88 stores, for each combination of value and offset, one of four possible values: 'A', meaning accept the value at that offset; 'D', meaning that the combination of value and offset is a "don't care"; 'F', meaning failure as the value at the offset is not part of the pattern to be recognized; and '$', for an end symbol.
Parser stack 86 is not a true "stack" in the normal meaning of the word (or as applied to the invention embodiments to be described shortly); it merely keeps a state variable for each "filter" that parser 84 is trying to match. Each state variable is initialized to an entry state. As table entries are subsequently returned for each value and offset, the stack updates each stack variable. For instance, if an 'A' is returned for a stack variable, that stack variable moves from the entry state to a partial match state. If an 'F' is returned, that stack variable moves from either the entry state or the partial match state to a failure state. If a 'D' is returned, that stack variable maintains its current state. And if a '$' is returned while the state variable is in the entry state or the partial match state, the state variable transitions to the match state.
Once semantic engine 82 has passed all packet values to predictive parser 84, parser 84 returns a match value based on the parser stack states. Semantic engine 82 then takes some output action depending on the success or failure of the match. It should be noted that the parser does not control or coordinate the device function, but instead merely acts as an ancillary pattern matcher to a larger system. Each possible pattern to be distinguished requires a new column in the parser table, such that in a hardware implementation device 80 can match only a limited number of input patterns. And a parser table row is required for each input octet position, even if that input octet position cannot affect the match outcome.
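The per-filter state variable of the '305 device described above can be sketched as a small state machine. The following Python is a minimal illustration under stated assumptions: the names `step` and `match_packet`, the treatment of missing table entries as 'F', the rule that the failure and match states are never left, and the example filter are all hypothetical and are not taken from the '305 patent.

```python
from typing import Dict, Tuple

ENTRY, PARTIAL, FAILURE, MATCH = "entry", "partial", "failure", "match"


def step(state: str, code: str) -> str:
    """Apply one table result ('A', 'D', 'F' or '$') to a filter's state variable."""
    if state in (FAILURE, MATCH):
        return state      # assumption: failure and match states are final
    if code == "A":
        return PARTIAL    # accept: entry or partial match -> partial match
    if code == "F":
        return FAILURE    # value at this offset is not part of the pattern
    if code == "$":
        return MATCH      # end symbol reached without a failure
    return state          # 'D': don't care, keep the current state


def match_packet(table: Dict[Tuple[int, int], str], packet: bytes) -> str:
    """Run one filter over a packet; the table is indexed by (offset, value)."""
    state = ENTRY
    for offset, value in enumerate(packet):
        # Missing entries are treated as 'F' (assumption of this sketch).
        state = step(state, table.get((offset, value), "F"))
    return state


# Hypothetical filter: octets 0x08 0x06 at offsets 0 and 1, end symbol at offset 2.
table = {(0, 0x08): "A", (1, 0x06): "A"}
table.update({(2, v): "$" for v in range(256)})
```

The sketch also makes the limitation noted above visible: the table needs an entry for every offset and value, even where the octet cannot affect the outcome.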
The embodiments described herein take a decidedly different approach to data processing. Figure 3 shows a semantic processor 100 according to an embodiment of the invention. Rather than merely matching specific input patterns to specific stored patterns, semantic processor 100 contains a direct execution parser (DXP) 200 that controls the processing of input packets. As DXP 200 parses data received at the input port 102, it expands and executes actual grammar productions in response to the input, and instructs semantic code execution engine (SEE) 300 to process segments of the input, or perform other operations, as the grammar executes. This structure, with a sophisticated grammar parser that assigns machine context tasks to an execution engine, as the data requires, is both flexible and powerful. In preferred embodiments, the semantic processor is reconfigurable, and thus has the appeal of a VN machine without the high overhead. Because the semantic processor only responds to the input it is given, it can operate efficiently with a smaller instruction set than a VN machine. The instruction set also benefits because the semantic processor allows processing in a machine context.
Semantic processor 100 uses at least three tables. Code segments for SEE 300 are stored in semantic code table 160. Complex grammatical production rules are stored in a production rule table 140. Codes for retrieving those production rules are stored in a parser table 120. The codes in parser table 120 also allow DXP 200 to detect whether, for a given production rule, a code segment from semantic code table 160 should be loaded and executed by SEE 300.
Some embodiments of the present invention contain many more elements than those shown in Figure 3, but these essential elements appear in every system or software embodiment. A description of each block in Figure 3 will thus be given before more complex embodiments are addressed.
Figure 4 shows a general block diagram for a parser table 120. A production rule code memory 122 stores table values, e.g., in a row-column format. The rows of the table are indexed by a non-terminal code. The columns of the table are indexed by an input data value. Practically, codes for many different grammars can exist at the same time in production rule code memory 122. For instance, as shown, one set of codes can pertain to MAC (Media Access Control) packet header format parsing, and other sets of codes can pertain to Address Resolution Protocol (ARP) packet processing, Internet Protocol (IP) packet processing, Transmission Control Protocol (TCP) packet processing, Real-time Transport Protocol (RTP) packet processing, etc. Non-terminal codes need not be assigned in any particular order in production rule code memory 122, nor in blocks pertaining to a particular protocol as shown.
Addressor 124 receives non-terminal (NT) codes and data values from DXP 200. Addressor 124 translates [NT code, data value] pairs into a physical location in production rule code memory 122, retrieves the production rule (PR) code stored at that location, and returns the PR code to the DXP. Although conceptually it is often useful to view the structure of production rule code memory 122 as a matrix with one PR code stored for each unique combination of NT code and data value, the present invention is not so limited. Different types of memory and memory organization may be appropriate for different applications (one of which is illustrated in Figure 11).
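When the production rule code memory is viewed as a matrix, the translation performed by the addressor amounts to simple address arithmetic. The Python sketch below illustrates this under assumptions that are not in the text: the class name `MatrixParserTable`, a 256-column row per NT code, and the use of PR code 0 to mark an invalid (blank) cell.

```python
from typing import List, Optional

NUM_VALUES = 256  # one column per possible octet value
INVALID = 0       # PR code 0 marks an empty cell (assumption of this sketch)


class MatrixParserTable:
    """Row-column parser table stored in one flat memory array."""

    def __init__(self, num_nt_codes: int) -> None:
        self.memory: List[int] = [INVALID] * (num_nt_codes * NUM_VALUES)

    @staticmethod
    def address(nt_code: int, data_value: int) -> int:
        # Rows are indexed by NT code, columns by input data value.
        return nt_code * NUM_VALUES + data_value

    def store(self, nt_code: int, data_value: int, pr_code: int) -> None:
        self.memory[self.address(nt_code, data_value)] = pr_code

    def lookup(self, nt_code: int, data_value: int) -> Optional[int]:
        """Return the PR code for [NT code, data value], or None if the cell is invalid."""
        pr_code = self.memory[self.address(nt_code, data_value)]
        return None if pr_code == INVALID else pr_code


pt = MatrixParserTable(num_nt_codes=4)
pt.store(2, 0x01, 7)  # hypothetical: NT code 2 with input 0x01 selects PR code 7
```

Most cells in such a matrix stay invalid, which is one reason the text notes that other memory organizations may be more appropriate.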
Parser table 120 can be located on or off-chip, when DXP 200 and SEE 300 are integrated together in a circuit. For instance, a static RAM located on-chip can serve as parser table 120. Alternately, off-chip DRAM storage can store parser table 120, with addressor 124 serving as or communicating with a memory controller for the DRAM. In other embodiments, the parser table can be located in off-chip memory, with an on-chip cache capable of holding a section of the parser table. Addressor 124 may not be necessary in some implementations, but when used can be part of parser 200, part of parser table 120, or an intermediate functional block. Note that it is possible to implement a look-ahead capability for parser table 120, by giving addressor 124 visibility into the next input value on the input stream and the next value on the DXP's parser stack.
Figure 5 illustrates one possible implementation for production rule table 140. Production rule memory 142 stores the actual production rule sequences of terminal and nonterminal symbols, e.g., as null-terminated chains of consecutive memory addresses. An addressor 144 receives PR codes, either from DXP 200 or directly from parser table 120. As production rules can have various lengths, it is preferable to take an approach that allows easy indexing into memory 142. In one approach, the PR code could be arithmetically manipulated to determine a production rule's physical memory starting address (this would be possible, for instance, if the production rules were sorted by expanded length, and then PR codes were assigned according to a rule's sorted position). The PR code could also be the actual PR starting address, although in some applications this may make the PR codes unnecessarily lengthy. In the approach shown in Figure 5, a pointer table 150 is populated with a PR starting address for each PR code. Addressor 144 retrieves a production rule by querying pointer table 150 using the PR code as an address.
Pointer table 150 returns a PR starting address PR_ADD. Addressor 144 then retrieves PR data from production rule memory 142 using this starting address. Addressor 144 increments the starting address and continues to retrieve PR data until a NULL character is detected.
Figure 5 shows a second column in table 150, which is used to store a semantic code (SC) starting address. When DXP 200 queries addressor 144 with a PR code, the addressor not only returns the corresponding production rule, but also the SC starting address for a SEE task to be performed. Where no SEE task is needed for a given production rule, the SC starting address is set to a NULL address.
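The pointer-table approach of Figure 5 can be sketched as follows. This Python model is an illustration only: the class name `ProductionRuleTable`, the use of 0 both as the NULL terminator and as the NULL semantic code address, and the example symbol values are assumptions, so symbols stored in a rule must be nonzero in this sketch.

```python
from typing import Dict, List, Optional, Tuple

NULL = 0  # rule terminator and "no SEE task" address (assumption of this sketch)


class ProductionRuleTable:
    """Production rule memory plus a pointer table indexed by PR code."""

    def __init__(self) -> None:
        self.memory: List[int] = []                     # null-terminated rule chains
        self.pointers: Dict[int, Tuple[int, int]] = {}  # PR code -> (PR_ADD, SC address)

    def add_rule(self, pr_code: int, symbols: List[int], sc_addr: int = NULL) -> None:
        pr_add = len(self.memory)       # rule starts at the next free address
        self.memory.extend(symbols)
        self.memory.append(NULL)        # terminate the chain
        self.pointers[pr_code] = (pr_add, sc_addr)

    def fetch(self, pr_code: int) -> Tuple[List[int], Optional[int]]:
        """Return (rule symbols, SC starting address or None when no SEE task exists)."""
        addr, sc_addr = self.pointers[pr_code]
        symbols: List[int] = []
        while self.memory[addr] != NULL:  # read consecutive addresses up to NULL
            symbols.append(self.memory[addr])
            addr += 1
        return symbols, (None if sc_addr == NULL else sc_addr)


prt = ProductionRuleTable()
prt.add_rule(5, [0x101, 0x102, 0x103], sc_addr=0x40)  # hypothetical rule with s-code
prt.add_rule(6, [0x201])                              # hypothetical rule without s-code
```

Variable-length rules pack densely in memory here, at the cost of one extra lookup through the pointer table.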
Figure 6 shows one possible block implementation for DXP 200. Parser control finite state machine (FSM) 210 controls and sequences overall DXP operation, based on inputs from the other logical blocks in Figure 6. Stack handler 220 and stack 222 store and sequence the production rules executed by DXP 200. Parser table interface 230 allows DXP 200 to retrieve PR codes from an attached parser table. Production rule table interface 240 allows DXP 200 to retrieve production rules from an attached production rule table. And semcode table interface 250 allows DXP 200 to identify the memory location of semantic code segments associated with production rules (in the illustrated embodiment, interfaces 240 and 250 are partially combined).
Input stream sequence control 260 and register 262 retrieve input data symbols from the Si-Bus. Comparator 270 compares input symbols with symbols from parser stack 222. Finally, SEE interface 280 is used to dispatch tasks to one or more SEEs communicating with DXP 200 on the Sx-Bus.
The basic operation of the blocks in Figure 6 will now be described with reference to the flowchart in Figure 7. At the beginning of each parsing cycle (flowchart block 400), stack handler 220 retrieves a production symbol pX pointed to by its top-of-stack pointer psp. The production symbol pX is split into two constituent parts, a prefix p and a symbol X. Prefix p codes the type of the symbol X, e.g., according to the following mapping for a two-bit prefix:
Table 1: mapping of the two-bit prefix p to the type of symbol X (table image imgf000011_0001 not reproduced)
Note that instead of a prefix for a "don't care" terminal symbol, the prefix can indicate a masked terminal symbol. A masked terminal symbol allows the specification of a bit mask for the input symbol, i.e., some (or all) bits of the terminal symbol are "don't care" bits. The masked terminal symbol construct can be useful, e.g., for parsing packet flag fields such as occur in many network protocols.
Input stream sequence control 260 also loads the current input stream value pointed to by input pointer ip into aReg register 262. This step may not be necessary if the previous parsing cycle did not advance input pointer ip.
When parser control FSM 210 receives the new prefix code p from stack handler 220, it determines (flowchart block 402) which of three possible logic paths to take for this parsing cycle. If the prefix code indicates that X is a terminal symbol, path 410 is taken. If the prefix code indicates that X will match any input symbol, path 420 is taken. And if the prefix code indicates that X is a non-terminal symbol, path 430 is taken. The processing associated with each path will be explained in turn.
When path 410 is taken, parser control FSM 210 makes another path branch, based on the symbol match signal M supplied by comparator 270. Comparator 270 compares input symbol a to stack symbol X; if the two are identical, signal M is asserted. If masked terminal symbols are allowed and a masked terminal symbol is supplied, comparator 270 applies the mask such that signal M depends only on the unmasked stack symbol bits.
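The comparator behavior for plain and masked terminal symbols can be sketched in a few lines of Python. The function name `symbol_match` and the mask convention (a 1 bit means the bit is compared, a 0 bit is "don't care") are assumptions of this sketch; the text does not specify how the mask is encoded.

```python
def symbol_match(a: int, x: int, care_mask: int = 0xFF) -> bool:
    """Return signal M: True when input octet a equals stack symbol x on every compared bit.

    care_mask selects the bits that take part in the comparison (assumed convention);
    the default 0xFF compares all eight bits, i.e. an ordinary terminal symbol.
    """
    return (a & care_mask) == (x & care_mask)


# Ordinary terminal symbol: every bit must agree.
exact = symbol_match(0x08, 0x08)

# Masked terminal symbol: only the upper four bits are compared, so any octet
# whose high nibble is 0x4 matches regardless of its low nibble.
masked = symbol_match(0x45, 0x40, care_mask=0xF0)
```

A mask of 0x00 would make every bit "don't care", which reproduces the behavior of a terminal symbol that matches any input.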
When a particular input symbol is expected and not found, parser control FSM 210 enters an error recovery mode at block 414. Generally, error recovery will flush the remainder of the packet from the input (e.g., by matching the input with an end of frame (EOF) symbol until a match is detected) and pop the remaining symbols off the stack. A semCode segment may also be dispatched to a SEE to clean up any machine state data related to the errant packet. These and other actions may depend on the particular grammar being parsed at the time of the error.
Assuming that a match between a and X is found at block 412, further processing joins the processing path 420.
Processing path 420 accomplishes two tasks, shown as blocks 422 and 424 in Figure 7. First, parser control FSM 210 signals stack handler 220 to "pop" the current value of X off of stack 222, e.g., by decrementing the stack pointer psp. Second, parser control FSM 210 signals input stream sequence control 260 to increment the input pointer ip to the next symbol in the input stream.
Processing path 430 processes non-terminal symbols appearing on stack 222. When a non-terminal symbol X reaches the top of the stack, processing blocks 432, 434, 438, and 440 expand the non-terminal symbol into its corresponding production rule. Parser control FSM 210 first signals parser table interface 230 to return a production rule code y = PT[X, a]. If y is invalid, parser control FSM 210 performs error recovery (block 436), e.g., as described above.
Assuming that PR code y is valid, parser control FSM 210 replaces X on stack 222 with its expanded production rule. Parser control FSM 210 signals production rule table (PRT) interface 240 and SemCode table (SCT) interface 250 to perform lookups using PR code y.
Parser control FSM 210 also signals stack handler 220 to pop the current value of X off of stack 222. When PRT interface 240 returns production rule PR[y], parser control FSM 210 signals stack handler 220 to push PR[y] onto stack 222. As each expanded production rule has a corresponding length, this length must be accounted for in the push, i.e. some expansions may require multiple symbol transfers from the production rule table (the path width from the table to the stack handler may, of course, be more than one symbol wide).
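The three logic paths of the parsing cycle (410, 420 and 430) can be sketched as a software loop. The Python below is a simplified illustration, not the DXP design: the function name `parse`, the string prefixes "t", "any" and "nt", the dictionary-based tables, and the small example grammar are assumptions. Error recovery is reduced to returning False, s-code dispatch is omitted, and end-of-frame handling is simplified to requiring that the stack and the input run out together.

```python
from typing import Dict, List, Tuple

Symbol = Tuple[str, int]  # (prefix p, symbol X); prefix is "t", "any" or "nt"


def parse(data: bytes, start: Symbol,
          parser_table: Dict[Tuple[int, int], int],
          rules: Dict[int, List[Symbol]]) -> bool:
    """Return True when the input is consumed exactly as the grammar predicts."""
    stack: List[Symbol] = [start]
    ip = 0                                # input pointer
    while stack:
        prefix, x = stack[-1]             # top-of-stack production symbol pX
        if ip >= len(data):
            return False                  # input ran out before the stack emptied
        a = data[ip]                      # current input symbol
        if prefix == "nt":                # path 430: expand a non-terminal
            y = parser_table.get((x, a))  # y = PT[X, a]
            if y is None:
                return False              # invalid PR code: error recovery
            stack.pop()
            stack.extend(reversed(rules[y]))  # leftmost rule symbol ends up on top
        else:
            if prefix == "t" and a != x:  # path 410: terminal must match
                return False
            stack.pop()                   # path 420: pop X ...
            ip += 1                       # ... and advance the input pointer
    return ip == len(data)


# Hypothetical grammar: a 0x00 octet, then an opcode octet selecting the body length.
parser_table = {(1, 0x00): 10, (2, 0x01): 20, (2, 0x02): 21}
rules = {
    10: [("t", 0x00), ("nt", 2)],
    20: [("t", 0x01), ("any", 0)],
    21: [("t", 0x02), ("any", 0), ("any", 0)],
}
```

In this sketch a non-terminal is popped and its rule is pushed in one step, whereas the text describes the pop and the multi-symbol push as separate signals to the stack handler.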
Meanwhile, SCT interface 250 has returned a corresponding SemCode address code
SCTjjμ] for production rule PR[y]. The address code SCT[y] may contain an actual physical address for the first SemCode microinstruction corresponding to PR code y, or some abstraction that allows a SEE to load that microinstruction. The address code SCT[y] may contain other information as well, such as an indication of which SEE (in a multiple-SEE system) should receive the code segment.
When commanded by parser control FSM 210, SEE interface 280 examines SCT[y] and determines whether a code segment needs to be dispatched to a SEE. As shown by decision block 442 in Figure 7, no microinstruction execution is necessary if SCT[y] is not "valid", i.e., a NULL value is represented. Otherwise, SEE interface 280 determines (decision block 444) whether a SEE is currently available. SEE interface 280 examines a semaphore register (not shown) to determine SEE availability. If a particular SEE is indicated by SCT[y], SEE interface 280 examines the semaphore for that SEE. If the semaphore indicates that the requested SEE is busy, SEE interface 280 enters wait state 446 until the semaphore clears. If any SEE may execute the SemCode segment, SEE interface 280 can simply select one with a clear semaphore.
When the semaphore is clear for the selected SEE, SEE interface 280 captures the Sx-bus and transmits SCT[y] to the selected SEE. The selected SEE sets its semaphore to indicate that it has received the request.
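The SEE selection step described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the hardware logic: the function name `select_see`, the representation of the semaphore register as a list of booleans, and the lowest-index-first choice among idle SEEs are all assumptions made for the example.

```python
from typing import List, Optional


def select_see(semaphores: List[bool], requested: Optional[int] = None) -> Optional[int]:
    """Pick a SEE for dispatch, or return None if the interface must wait.

    semaphores[i] is True while SEE i is busy (assumed encoding).
    requested is the SEE index named by SCT[y], or None if any SEE will do.
    """
    if requested is not None:
        # A specific SEE was indicated: wait unless its semaphore is clear.
        return None if semaphores[requested] else requested
    # Any SEE may run the segment: take the first one with a clear semaphore.
    for index, busy in enumerate(semaphores):
        if not busy:
            return index
    return None  # all SEEs busy; corresponds to wait state 446
```

A `None` result corresponds to remaining in the wait state until a semaphore clears; a real SEE interface would poll or be signaled rather than return.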
When parser control FSM 210 first commands SEE interface 280 to dispatch SCT[y], SEE interface 280 deasserts the SEE status line to suspend further parsing, thereby preventing parser control FSM 210 from exiting the current parsing cycle until SCT[y] is dispatched (the stack push of the expanded production rule PR[y] can continue in parallel while the SEE status line is deasserted). Whether or not DXP 200 continues to suspend parsing once SCT[y] has been transferred to the selected SEE can be dependent on SCT[y]. For instance, SCT[y] can also code how long the corresponding SemCode segment should block further processing by parser control FSM 210. In one embodiment, the DXP can be released: as soon as SCT[y] is dispatched; as soon as the SEE sets its semaphore; a programmable number of clock cycles after the SEE sets its semaphore; or not until the SEE sets and clears its semaphore. Alternately, the SEE can have different semaphore states corresponding to these different possibilities.
At the end of each parser cycle (decision block 460 in Figure 7), stack handler 220 will assert stack empty signal SE to parser control FSM 210 if the stack is empty. Upon the assertion of the SE signal, parser control FSM 210 resets its states to wait for the beginning of the next input packet. As long as the stack is not empty, however, the parser control FSM returns to block 400 and begins a new parsing cycle.
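One way to picture the parser cycle of Figure 7 is as a small table-driven loop. The following Python sketch is a software model under stated assumptions, not the DXP hardware: the names `parse`, `ANY`, and `OCTET` are hypothetical, the tables are plain dictionaries, the wildcard column `ANY` stands in for rules that apply irrespective of input, s-code dispatch is modeled as an inline function call (no SEE parallelism or semaphores), and a terminal mismatch simply fails instead of invoking error recovery.

```python
from typing import Callable, Dict, List, Sequence, Tuple, Union

Symbol = Union[int, str]
ANY = "*"        # hypothetical wildcard column: rule applies to any input symbol
OCTET = "octet"  # terminal that matches any input octet


def parse(stream: Sequence[int],
          start: str,
          parser_table: Dict[Tuple[str, Symbol], int],
          prod_rules: Dict[int, List[Symbol]],
          semcode: Dict[int, Callable[[int], None]]) -> bool:
    """Run parser cycles until the stack is empty; return False on a failed parse."""
    nonterminals = {x for (x, _) in parser_table}
    stack: List[Symbol] = [start]   # top of stack is the end of the list
    ip = 0                          # input pointer
    while stack:                    # an empty stack models the SE signal
        x = stack[-1]
        a = stream[ip] if ip < len(stream) else None
        if x in nonterminals:
            # Non-terminal path: y = PT[X, a], else an input-independent entry.
            y = parser_table.get((x, a), parser_table.get((x, ANY)))
            if y is None:
                return False        # invalid entry (error recovery not modeled)
            stack.pop()                            # pop X
            stack.extend(reversed(prod_rules[y]))  # push PR[y], first symbol on top
            action = semcode.get(y)                # SCT[y]; missing models NULL
            if action is not None:
                action(ip)                         # "dispatch" the s-code segment
        elif a is not None and (x == OCTET or x == a):
            # Terminal path: pop X and advance the input pointer.
            stack.pop()
            ip += 1
        else:
            return False            # terminal mismatch
    return True


# Tiny hypothetical grammar: PAYLOAD := 0x08 ET2; ET2 := 0x06 BODY; BODY := octet octet
PT = {("PAYLOAD", 0x08): 1, ("ET2", 0x06): 2, ("BODY", ANY): 3}
PR = {1: [0x08, "ET2"], 2: [0x06, "BODY"], 3: [OCTET, OCTET]}
calls: List[int] = []
SCT = {3: calls.append}             # record the input pointer when rule 3 fires
ok = parse(bytes([0x08, 0x06, 0xAA, 0xBB]), "PAYLOAD", PT, PR, SCT)
```

The PR codes 1 to 3 and the symbol names in the usage lines are invented for the illustration and do not correspond to the ARP tables given later.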
Figure 8 shows a second RSP embodiment 500 with expanded capability. Instead of the single SEE 300 shown in Figure 3, RSP 500 incorporates N+1 SEEs 300-0 to 300-N. RSP 500 also contains several other significant additions: an exception processing unit (EPU) 600, an array machine-context data memory (AMCD) 700, and a variable machine-context data memory (VMCD) 800. The function of each block in Figure 8 will now be explained in context.
Figure 9 illustrates the basic functional blocks of SEE 300-0. At the heart of SEE 300-0 are an arithmetic logic unit (ALU) 310, a set of pipeline registers 320, and a semCode (or s-code) instruction decoder 330. An s-code queue 340 stores microinstructions to be executed by the SEE. The microinstructions themselves are stored in semCode table 160 and received by the SEE S-bus interface 360. SEE control finite state machine (FSM) 350 coordinates the operation of the SEE blocks shown.
SEE 300-0 sits idle until it receives an execution request (from DXP 200) on the Sx-bus. SEE control FSM 350 examines traffic on the Sx-bus, waiting for a request directed to SEE 300-0 (for instance, up to 16 SEEs can be addressed with four Sx-bus address lines, each SEE having a unique address). When a request is directed to SEE 300-0, the request contains, e.g., a starting SemCode address. SEE control FSM 350 responds to the request by: setting its semaphore to acknowledge that it is now busy; and instructing S-bus interface 360 to drive a request on the S-bus to retrieve the microinstruction code segment beginning with the received starting SemCode address.
S-bus interface 360 is tasked with placing s-code instructions in queue 340 before s-code instruction decoder 330 needs them. The S-bus interface does have to contend with other SEE S-bus interfaces for access to the S-bus; therefore it may be beneficial to download multiple sequential instructions at a time in a burst. S-bus interface 360 maintains an s-code address counter (not shown) and continues to download instructions sequentially unless directed otherwise by SEE control FSM 350.
S-code microinstruction decoder 330 executes the code segment requested by the DXP on ALU 310 and pipeline registers 320. Although preferably a branching capability exists within instruction decoder 330, many code segments will require little or no branching due to the overall structure of the RSP.
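The sequential burst download performed by the S-bus interface can be sketched as a generator. This is a simplified model with assumed names and parameters: `fetch_segment`, the burst size of four instructions, and the use of Python's `None` as the NULL instruction are illustrative choices (the description later notes that, in the described embodiments, a code segment ends with a NULL instruction). Branching and S-bus arbitration are not modeled.

```python
from typing import Iterator, List, Optional

NULL = None  # stand-in for the NULL instruction that ends a code segment


def fetch_segment(semcode_table: List[Optional[str]],
                  start: int,
                  burst: int = 4) -> Iterator[str]:
    """Yield instructions from `start` in bursts until NULL (or end of table)."""
    addr = start                                     # s-code address counter
    while True:
        block = semcode_table[addr:addr + burst]     # one burst transfer
        if not block:
            return                                   # ran off the end of the table
        for instruction in block:
            if instruction is NULL:
                return                               # end of this code segment
            yield instruction                        # place in the s-code queue
        addr += len(block)


# Hypothetical table contents: one segment at address 0, another at address 6.
table = ["ld", "cmp", "st", "add", "wr", NULL, "nop"]
segment = list(fetch_segment(table, 0))
```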
ALU 310 can be conventional, e.g., having the capability to perform addition, comparison, shifting, etc., using its own register values and/or values from pipeline register 320.
Pipeline registers 320 allow machine-context access to data. As opposed to a standard CPU, the preferred SEE embodiments have no notion of the physical data storage structure used for the data that they operate on. Instead, accesses to data take a machine-context transactional form. Variable (e.g., scalar) data is accessed on the V-bus; array data is accessed on the A-bus; and input stream data is accessed on the Si-bus. For instance, to read a scalar data element of length m octets located at a given location offset within a data context ct, the instruction decoder 330 prompts the V-bus interface to issue a bus request {read, ct, offset, m}. The context mct refers to the master context of the RSP; other sub-contexts will usually be created and destroyed as the RSP processes input data, such as a sub-context for a current TCP packet or active session.
Once a pipeline register has been issued a command, it handles the data transfer process. If multiple bus transfers are required to read or write m octets, the pipeline register tracks the transaction to completion. As an example, a six-octet field can be transferred from the stream input to a machine-context variable using two microinstructions: a first instruction reads six octets from the Si-bus to a pipeline register; a second instruction then writes the six octets from the register to the machine-context variable across the V-bus. The register interfaces perform however many bus data cycles are required to effect the transfer.
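As a rough illustration of how a pipeline register might track a transfer to completion, the sketch below splits a request for m octets into individual bus data cycles. The function name `bus_cycles` and the four-octet bus width are assumptions for the example; the source does not state a bus width.

```python
from typing import List, Tuple


def bus_cycles(offset: int, m: int, bus_width: int = 4) -> List[Tuple[int, int]]:
    """Split a request for m octets at `offset` into (offset, length) data cycles."""
    cycles = []
    end = offset + m
    while offset < end:
        n = min(bus_width, end - offset)   # last cycle may be partial
        cycles.append((offset, n))
        offset += n
    return cycles
```

Under these assumptions, the six-octet field in the example above would take two data cycles on each bus, one of four octets and one of two.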
VMCD 800 serves the requests initiated on the V-bus. VMCD 800 has the capability to translate machine-context variable data requests to physical memory transactions. Thus VMCD 800 preferably maintains a translation table referencing machine context identifiers to physical starting addresses, contains a mechanism for allocating and deallocating contexts, allows contexts to be locked by a given SEE, and ensures that requested transactions do not fall outside of the requested context's boundaries. The actual storage mechanism employed can vary based on application: the memory could be completely internal, completely external, a mix of the two, a cache with a large external memory, etc. An external memory can be shared with external memory for other memory sections, such as the AMCD, e-code table, input buffer, parser table, production rule table, and semCode table, in a given implementation. The A-bus interface and AMCD 700 operate similarly, but with an array machine context organization. Preferably, different types of arrays and tables can be allocated, resized, deallocated, written to, read from, searched, and possibly even hashed or sorted using simple bus requests. The actual underlying physical memory can differ for different types of arrays and tables, including for example fast onboard RAM, external RAM or ROM, content-addressable memory, etc.
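The translation and bounds checking attributed to VMCD 800 can be modeled in software as follows. This is a toy sketch under several assumptions: the class and method names are hypothetical, contexts are placed with a simple bump allocator that never reclaims storage, and context locking by a SEE is omitted.

```python
from typing import Dict, Tuple


class VMCD:
    """Toy variable machine-context data memory with bounds-checked access."""

    def __init__(self, size: int) -> None:
        self.mem = bytearray(size)                   # backing "physical" memory
        self.table: Dict[str, Tuple[int, int]] = {}  # context id -> (start, length)
        self.next_free = 0

    def allocate(self, ct: str, length: int) -> None:
        if ct in self.table:
            raise KeyError(f"context {ct!r} already allocated")
        if self.next_free + length > len(self.mem):
            raise MemoryError("no room for context")
        self.table[ct] = (self.next_free, length)
        self.next_free += length

    def deallocate(self, ct: str) -> None:
        del self.table[ct]          # storage is not reclaimed in this sketch

    def _translate(self, ct: str, offset: int, m: int) -> int:
        start, length = self.table[ct]
        if offset < 0 or m < 0 or offset + m > length:
            raise IndexError("request falls outside the context's boundaries")
        return start + offset       # physical address of the first octet

    def read(self, ct: str, offset: int, m: int) -> bytes:
        addr = self._translate(ct, offset, m)
        return bytes(self.mem[addr:addr + m])

    def write(self, ct: str, offset: int, data: bytes) -> None:
        addr = self._translate(ct, offset, len(data))
        self.mem[addr:addr + len(data)] = data
```

A request such as {read, ct, offset, m} maps onto `read(ct, offset, m)`; the caller never sees the physical address, which is the point of the machine-context abstraction.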
Returning to the description of SEE 300-0 and its pipeline registers, each SEE can access input data from buffer 510 across the Si-bus. And each SEE has access to the P-bus and the current symbol on top of the parser stack; this can be useful, e.g., where the same s-code is used with multiple production rules, but its outcome depends on the production rule that initiated it. Finally, the pipeline registers of some SEEs can be specialized. For instance, SEE 300-1 in Figure 8 communicates with local I/O block 520 to provide a data path to/from, e.g., local USB or serial ATA devices connected to local I/O block 520. And SEE 300-2 in Figure 8 communicates with EPU 600 to provide a data path to/from an exception unit. Although in theory each SEE could connect separately with each of these devices, in practice the device is simplified and suffers little performance penalty by pairing certain SEEs with certain other functions.
Exception processing unit 600 can be a standard von Neumann central processing unit (CPU), although in many applications it can be a very rudimentary one. When included, EPU 600 is preferably used to handle complex code that either runs infrequently or is not timing-critical. Examples are a user log-on procedure, a request to make a local drive available remotely, error logging and recovery, table loading at system startup, and system configuration. EPU 600 responds to DXP requests indirectly, through s-code segments loaded into SEE 300-2. Preferably, EPU 600 can also call upon SEE 300-2 to perform functions for it, such as reading or writing to AMCD 700 or VMCD 800. An e-code table 610 is preferably available to EPU 600. The e-code table contains boot instructions for the device, and may contain executable instructions for performing other functions requested by the DXP. Optionally, e-code table 610 may contain a table for translating s-code requests into instruction addresses for code to be executed, with the instruction addresses located in a conventional external memory space.
An Example
In order to better illustrate operation of RSP 500, an example for an implementation of the Address Resolution Protocol (ARP), as described in IETF RFC 826, is presented. This example walks through the creation of production rules, parser table entries, and the functional substance of s-code for handling received ARP packets. Briefly, ARP packets allow local network nodes to associate each peer's link-layer (hardware) address with a network (protocol) address for one or more network protocols. This example assumes that the hardware protocol is Ethernet, and that the network protocol is Internet Protocol (IP or IPv4). Accordingly, ARP packets have the format shown in Figure 10. When the opcode field is set to 1, the sender is trying to discover the target hardware address associated with the target protocol address, and is requesting an ARP reply packet. When the opcode field is set to 2, the sender is replying to an ARP request — in this case, the sender's hardware address is the target hardware address that the original sender was looking for.
The following exemplary grammar describes one way in which RSP 500 can process ARP packets received at the input port. A $ indicates the beginning of a production rule, and {} enclose s-code to be performed by a SEE:
$MAC_PDU := MAC_DA MAC_SA MAC_PAYLOAD MAC_FCS EoFrame
$MAC_DA := 0x08 0x01 0x02 0x03 0x04 0x05
    | 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF
$MAC_SA := etherAddType {s0: mct->curr_SA = MAC_SA}
$MAC_PAYLOAD := 0x08 ET2
$ET2 := 0x06 ARP_BODY | 0x00 IP_BODY
$ARP_BODY := ARP_HW_TYPE ARP_PROT_TYPE ARP_HW_ADD_LEN ARP_PROT_ADD_LEN ARP_OP ARP_PADDING
$ARP_HW_TYPE := 0x0001
$ARP_PROT_TYPE := 0x0800
$ARP_HW_ADD_LEN := 0x06
$ARP_PROT_ADD_LEN := 0x04 0x00
$ARP_OP := 0x01 ARP_REQ_ADDR | 0x02 ARP_REPLY_ADDR
$ARP_REQ_ADDR := ARP_SENDER_HW ARP_SENDER_PROT ARP_TARGET_HW ARP_TARGET_PROT {s1: s-code seg1}
$ARP_REPLY_ADDR := ARP_SENDER_HW ARP_SENDER_PROT ARP_TARGET_HW ARP_TARGET_PROT {s2: s-code seg2}
$ARP_SENDER_HW := etherAddType
$ARP_SENDER_PROT := ipAddType
$ARP_TARGET_HW := etherAddType
$ARP_TARGET_PROT := ipAddType
$ARP_PADDING := octet | null {s3: calc. length; throw away}
$IP_BODY := // unresolved by this example
$MAC_FCS := octet octet octet octet {s4: check FCS}
$etherAddType := octet octet octet octet octet octet
$ipAddType := octet octet octet octet
{s-code seg1 := if ARP_TARGET_PROT == mct->myIPAddress then generate ARP reply to mct->curr_SA; s-code seg2}
{s-code seg2 := update mct->ArpCache with ARP_SENDER_HW, ARP_SENDER_PROT, mct->time}
This example only processes a limited set of all possible ARP packets, namely those properly indicating fields consistent with an Ethernet hardware type and an IP protocol type; all others will fail to parse and will be rejected. This grammar also leaves a hook for processing IP packets ($IP_BODY) and thus will not reject IP packets, but a corresponding IP grammar is not part of this example.
Stepping through the productions, $MAC_PDU merely defines the MAC frame format. Two destination MAC addresses are allowed by $MAC_DA: a specific hardware address (0x08 0x01 0x02 0x03 0x04 0x05) and a broadcast address of all 1's. All other MAC addresses are automatically rejected, as a packet without one of these two addresses will fail to parse. Any source address is accepted by $MAC_SA; a SEE is called to save the source address to a master context table variable mct->curr_SA on the VMCD. $MAC_PAYLOAD and $ET2 combine to ensure that only two types of payloads are parsed, an ARP payload and an IP payload (further parsing of an IP payload is not illustrated herein). Of course, other packet types can be added by expanding these productions.
When the first two bytes of the MAC_PAYLOAD indicate an ARP packet (type = 0x0806), the parser next tries to parse $ARP_BODY. For simplicity, the first four elements of the ARP body (hardware and protocol types and address lengths) are shown fixed; if ARP were implemented for another protocol as well as IP, these elements could be generalized (note that the generalization of the length fields might allow different sizes for the address fields that follow, a condition that would have to be accounted for in the production rules). Two values for $ARP_OP are possible, a 1 for a request and a 2 for a reply. Although address parsing does not differ for the two values of ARP_OP, the s-code to be executed in each case does. S-code segment 1, which is executed for ARP requests, compares the target protocol to the local IP address stored in the master context table on the VMCD. When these are equal, a SEE generates an ARP reply packet to the sender's hardware and IP addresses. S-code segment 2 executes for both ARP requests and ARP replies; this segment updates an ArpCache array stored in the AMCD with the sender's hardware and protocol addresses and the time received. The "update" command to mct->ArpCache includes a flag or mask to identify which data in ArpCache should be used to perform the update; normally, the cache would be indexed at least by IP address.
In an Ethernet/IP ARP packet, ARP_PADDING will be 18 octets in length. The ARP_PADDING production rule shown here, however, fits any number of octets. In this example, an s-code segment is called to calculate the padding length and "throw away" that many octets, e.g., by advancing the input pointer. Alternately, the parser could use a five-octet look-ahead to the EoFrame token in the input; when the token is found, the preceding four octets are the FCS. An alternate embodiment where the parser has a variable symbol look-ahead capability will be explained at the conclusion of this example.
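The functional substance of s-code segments 1 and 2 and of the padding segment s3 can be sketched in Python. This is an illustration only, not actual s-code: the master context is modeled as a dictionary, the ARP cache as a dictionary keyed by sender protocol address, reply generation as appending to a list, and the function names are hypothetical. The padding calculation additionally assumes that the total frame length is known to the SEE and that the FCS is four octets.

```python
from typing import Dict, List, Tuple


def seg2_update_cache(mct: Dict, sender_hw: bytes, sender_prot: bytes) -> None:
    """s-code seg2: record the sender's addresses and the time received."""
    mct["ArpCache"][sender_prot] = (sender_hw, mct["time"])


def seg1_request(mct: Dict, sender_hw: bytes, sender_prot: bytes,
                 target_prot: bytes, replies: List[Tuple[bytes, bytes]]) -> None:
    """s-code seg1: reply if the request targets our address, then run seg2."""
    if target_prot == mct["myIPAddress"]:
        # "Generate ARP reply to mct->curr_SA", modeled as a queued tuple.
        replies.append((mct["curr_SA"], sender_prot))
    seg2_update_cache(mct, sender_hw, sender_prot)


def s3_skip_padding(ip: int, frame_len: int, fcs_len: int = 4) -> Tuple[int, int]:
    """s3: return (new input pointer, padding length) so parsing resumes at the FCS."""
    padding = frame_len - fcs_len - ip
    if padding < 0:
        raise ValueError("frame too short to hold an FCS")
    return ip + padding, padding
```

For a 64-octet frame in which the ARP body ends at offset 42, `s3_skip_padding(42, 64)` yields a padding length of 18 octets, consistent with the figure quoted above.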
The MAC_FCS production indicates that a SEE is to check the FCS attached to the packet. A SEE may actually compute the checksum, or the checksum may be computed by input buffer or other hardware, in which case the SEE would just compare the packet value to the calculated value and reject the packet if no match occurs. To further illustrate how the RSP 500 is configured to execute the ARP grammar above, exemplary production rule table and parser table values will now be given and explained. First, production rules will be shown, wherein hexadecimal notation illustrates a terminal value, decimal notation indicates a production rule, and "octet" will match any octet found at the head of an input stream. A non-terminal (NT) code is used as an index to the parser table; a production rule (PR) code is stored in the parser table, and indicates which production rule applies to a given combination of NT code and input value.
ARP Production Rules
[Table reproduced as image imgf000022_0001 in the original publication]
In the ARP production rule table above, the RHS Non-terminal Values, e.g., with a special end-of-rule symbol attached, are what get stored in the RSP's production rule table. The production rule codes are "pointers" to the corresponding production rules; it is the PR codes that actually get stored in the parser table. The following parser table segment illustrates the relationship between PR and PR code:
ARP Parser Table Values
[Table reproduced as image imgf000023_0001 in the original publication]
The combination of an NT code and a "Head of Input Stream Data Value" index the parser table values in the RSP. Note that the start symbol S, EoFrame symbol, and bottom of stack symbol $ are special cases — the parser control FSM can be implemented to not reference the parser table for these symbols. For many NT codes, the table produces the same PR code regardless of the data value occupying the head of the input stream. In this example, all other NT codes have valid values for only one or two head of input stream values (a blank value in a cell represents an invalid entry). This information can be coded in a matrix format, with each cell filled in, or can be coded in some other more economical format.
Given the tables above, an example of RSP execution for an Ethernet/ARP packet is now presented. In this example, the DXP is stepped by parser cycles, corresponding to one "loop" through the flowchart in Figure 7. At each cycle, the following machine states are tracked: the input pointer ip, indicating the byte address of the current stream input symbol being parsed; the input symbol pointed to by the input pointer, *ip; the parser stack pointer psp, indicating which stack value is pointed to at the beginning of the parser cycle; the top-of-parser-stack symbol at the beginning of that parser cycle, *psp, where non-terminal symbols are indicated by the prefix "nt", and the terminal symbol t.xx matches any input symbol; PT[*ip, *psp], the currently indexed value of the parser table; PRT[PT], the production rule pointed to by PT[*ip, *psp]; SCT[PT], the s-code segment pointed to by PT[*ip, *psp]; and *ps, the entire contents of the parser stack.
The following ARP packet will be used in the example, where all values are stated in hexadecimal notation:
0x0000 FF FF FF FF FF FF 00 02 3F 11 6D 9E 08 06 00 01
0x0010 08 00 06 04 00 01 00 02 3F 11 6D 9E C0 A8 00 04
0x0020 00 00 00 00 00 00 C0 A8 00 06 3A 20 33 0D 0A 53
0x0030 54 3A 20 15 12 6E 3A 13 63 68 65 6D EF 13 84 CC
This is an ARP request packet sent to a broadcast MAC address, requesting the hardware address associated with a network address 192.168.0.6, which in this example is a network address assigned to the RSP. The results for parsing this example packet are shown below in tabular format, followed by a brief explanation. Although the example is lengthy, it is instructive as it exercises most of the basic functions of the RSP.
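As a cross-check on the packet description above, the example frame can be decoded in conventional software. The sketch below uses Python's standard `struct` module and the Ethernet/IPv4 ARP field layout of RFC 826; the function and dictionary key names are chosen for the illustration, and the treatment of the last four octets as the FCS follows the grammar rather than any verification of the checksum.

```python
import struct

# The 64-octet example frame from the hex dump above.
FRAME = bytes.fromhex(
    "FF FF FF FF FF FF 00 02 3F 11 6D 9E 08 06 00 01 "
    "08 00 06 04 00 01 00 02 3F 11 6D 9E C0 A8 00 04 "
    "00 00 00 00 00 00 C0 A8 00 06 3A 20 33 0D 0A 53 "
    "54 3A 20 15 12 6E 3A 13 63 68 65 6D EF 13 84 CC"
)


def dotted(addr: bytes) -> str:
    """Format a 4-octet protocol address in dotted-decimal notation."""
    return ".".join(str(octet) for octet in addr)


def decode_arp_frame(frame: bytes) -> dict:
    """Split an Ethernet frame carrying an Ethernet/IPv4 ARP packet into fields."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    (hw_type, prot_type, hw_len, prot_len, opcode,
     sender_hw, sender_prot, target_hw, target_prot) = struct.unpack(
        "!HHBBH6s4s6s4s", frame[14:42])
    return {
        "dst": dst, "src": src, "ethertype": ethertype,
        "hw_type": hw_type, "prot_type": prot_type,
        "hw_len": hw_len, "prot_len": prot_len, "opcode": opcode,
        "sender_hw": sender_hw, "sender_prot": sender_prot,
        "target_hw": target_hw, "target_prot": target_prot,
        "padding": frame[42:-4],   # everything between the ARP body and the FCS
        "fcs": frame[-4:],
    }


fields = decode_arp_frame(FRAME)
```

Decoding confirms the description: the opcode is 1 (request), the sender's protocol address is 192.168.0.4, the target protocol address is 192.168.0.6, and 18 padding octets precede the four-octet FCS.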
ARP Packet Parser Cycle Example
[Table reproduced as images imgf000025_0001, imgf000026_0001, and imgf000027_0001 in the original publication]
Generally, the detailed example above illustrates how production rules are expanded onto the parser stack and then processed individually, either by: matching a terminal symbol with an input symbol (see, e.g., parser cycles 2-7); matching a terminal don't care symbol t.xx with an input symbol (see, e.g., parser cycles 9-14); further expanding a non-terminal symbol either irrespective of input (see, e.g., parser cycle 8) or based on the current input symbol (see, e.g., parser cycles 0, 1, 17); or executing a null cycle, in this case to allow a SEE to adjust the input pointer to "skip" parsing for a padding field (parser cycle 63). This example also illustrates the calls to s-code segments at appropriate points during the parsing process, depending on which production rules get loaded onto the stack (parser cycles 8, 33, 62, 64). It can be appreciated that some of these code segments can execute in parallel with continued parsing.
The exemplary grammar given above is merely one way of implementing an ARP grammar according to an embodiment of the invention. Some cycle inefficiencies could be reduced by explicitly expanding some of the non-terminals into their parent production rules, for example. The ARP grammar could also be generalized considerably to handle more possibilities. The coding selected, however, is meant to illustrate basic principles and not all possible optimizations or ARP features. Explicit expansions may also be limited by the chosen stack size for a given implementation.
In an alternate embodiment, DXP 200 can implement an LL(f(X)) parser, where the look-ahead value f(X) is coded in a stack symbol, such that each stack symbol can specify its own look-ahead. As an example, the production rule for ARP_PADDING in the previous example could be specified as
$ARP_PADDING := octet ARP_PADDING | EoFrame, (LA5)
where (LA5) indicates an input symbol look-ahead of 5 symbols for this rule. The look-ahead value is coded into the production rule table, such that when the rule is executed DXP 200 looks up (X, a+5) in the production rule table.
A variable look-ahead capability can also be used to indicate that multiple input symbols are to be used in a table lookup. For instance, the production rule for MAC_DA could be specified as
$MAC_DA := 0x08 0x01 0x02 0x03 0x04 0x05
    | 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF, (LA6)
Instead of creating two production rules 52 and 53 with six terminal symbols each, the parser table contains two entries that match six symbols each, e.g., at parser table locations (X, a) = (130, 0x08 0x01 0x02 0x03 0x04 0x05) and (130, 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF).
With such an approach, a standard row, column matrix parser table could prove very wasteful due to the number of addressable columns needed for up to a six-octet input symbol width, and the sparsity of such a matrix. One alternate implementation, using a ternary CAM, is shown in Figure 11. Ternary CAM 900 of Figure 11 is loaded with a table of match addresses and corresponding production rule codes. Each match address comprises a one-octet stack symbol X and six octets of input symbols a1, a2, a3, a4, a5, a6. When a match address is supplied to CAM 900, it determines whether a match exists in its parser table entries. If a match exists, the corresponding production rule code is returned (alternately, the address of the table entry that caused a match is returned, which can be used as an index into a separate table of production rule codes or pointers).
One advantage of the parser table implementation of Figure 11 is that it is more efficient than a matrix approach, as entries are only created for valid combinations of stack and input symbols. This same efficiency allows for longer input symbol strings to be parsed in one parser cycle (up to six input symbols are shown, but a designer could use whatever length is convenient), thus a MAC or IP address can be parsed in one parser cycle. Further, look-ahead capability can be implicitly coded into the CAM, e.g., the next six input symbols can always be supplied to the table. For production rules corresponding to LL(1) parsing (such as the row for X = 136 in CAM 900), the CAM bits corresponding to a2, a3, a4, a5, a6 on that row are set to a "don't care" value xx, and merely do not contribute to the lookup. For production rules corresponding to LL(2) parsing (such as the rows for X = 134 and 135, which match a two-octet packet type field for ARP and IP packets, respectively), the CAM bits corresponding to a3, a4, a5, a6 on those rows are set to xx. Up to LL(6) parsing can be entered in the table, as is shown in the two MAC address entries for X = 129. Note that if a1, a2, a3, a4, a5 were set to xx, a true six-symbol look-ahead can also be implemented. One last observation is that with a ternary CAM, each bit can be set independently to a "don't care" state, thus production rules can also be set to ignore certain bits, e.g., in a flag field.
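The ternary match described above can be modeled in software as a value/mask comparison. The following Python sketch is a functional illustration only: the class name, the linear first-match search, and the PR codes 1 to 3 are assumptions (the actual codes appear in Figure 11, which is not reproduced here), and a hardware CAM would compare all rows in parallel. Masks are stored per bit, although the helper `add` only sets whole symbols to "don't care".

```python
from typing import List, Optional, Sequence, Tuple

WIDTH = 6  # input symbols per match address, as in the six-octet example


class TernaryCAM:
    """Software model of a ternary CAM keyed by (stack symbol, input symbols)."""

    def __init__(self) -> None:
        self.rows: List[Tuple[bytes, bytes, int]] = []  # (value, mask, PR code)

    def add(self, x: int, inputs: Sequence[Optional[int]], pr_code: int) -> None:
        """Add a row; None (or a missing trailing symbol) means "don't care"."""
        padded = list(inputs) + [None] * (WIDTH - len(inputs))
        value = bytes([x] + [0 if s is None else s for s in padded])
        mask = bytes([0xFF] + [0x00 if s is None else 0xFF for s in padded])
        self.rows.append((value, mask, pr_code))

    def lookup(self, x: int, inputs: Sequence[int]) -> Optional[int]:
        """Return the PR code of the first matching row, or None if none match."""
        key = bytes([x]) + bytes(inputs[:WIDTH]).ljust(WIDTH, b"\x00")
        for value, mask, pr_code in self.rows:
            # A row matches when every unmasked bit of the key equals the value.
            if all((k ^ v) & m == 0 for k, v, m in zip(key, value, mask)):
                return pr_code
        return None


cam = TernaryCAM()
cam.add(129, [0x08, 0x01, 0x02, 0x03, 0x04, 0x05], 1)  # LL(6): specific MAC address
cam.add(129, [0xFF] * 6, 2)                            # LL(6): broadcast address
cam.add(134, [0x08, 0x06], 3)                          # LL(2): ARP packet type
```

In this model the LL(2) row for X = 134 matches regardless of the four trailing input symbols, because their mask bits are zero.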
A binary CAM can also function in a parser table implementation. The primary difference is that the binary CAM cannot store "don't care" information explicitly, thus leaving the parser state machine (or some other mechanism) responsible for handling any "don't care" functionality in some other manner. One of ordinary skill in the art will recognize that the concepts taught herein can be tailored to a particular application in many other advantageous ways. For instance, many variations on the codes and addressing schemes presented are possible. In the described embodiments, a microinstruction code segment ends with a NULL instruction; the occurrence of the NULL instruction can be detected either by the S-bus interface of a SEE, by the microinstruction decoder, or even by an s-code table function. The s-code addresses do not necessarily have to be known to the SEEs; it is possible for the SCT to track instruction pointers for each SEE, with the instruction pointers for each SEE set by the DXP. Although multiple memory storage areas with different interfaces are illustrated, several of the interfaces can share access to a common memory storage area that serves as a physical storage space for both. Those skilled in the art will recognize that some components, such as the exception processing unit, can either be integrated with the RSP or connect to the RSP as a separate unit.
It is not critical how the parser table, production rule table, and s-code table are populated for a given set of grammars; the population can be achieved, for example, through an EPU, a boot-code segment on one of the SEEs, or a boot-grammar segment with the table population instructions provided at the input port. The tables can also, of course, be implemented with non-volatile memory so that table reloading is not required at every power-up. The flowchart illustrating the operation of the DXP is merely illustrative; for instance, it is recognized herein that a given state machine implementation may accomplish many tasks in parallel that are shown here as sequential tasks, and may perform many operations speculatively.
Although several embodiments have been shown and described with a single input port, the description of "an" input port merely acknowledges that at least one port exists. The physical port arrangement can be varied depending on application. For instance, depending on port bandwidth and parser performance, several input ports may be multiplexed to the same direct execution parser.
Those skilled in the art recognize that other functional partitions are possible within the scope of the invention. Further, what functions are and are not implemented on a common integrated circuit (for a hardware implementation) is a design choice, and can vary depending on application. It is also recognized that the described parser functions can be implemented on a general-purpose processor, using conventional software techniques, although this may defeat some of the advantages present with the hardware embodiments. Finally, although the specification may refer to "an", "one", "another", or "some" embodiment(s) in several locations, this does not necessarily mean that each such reference is to the same embodiment(s), or that the feature only applies to a single embodiment.

Claims

WHAT IS CLAIMED IS:
1. A data processing system comprising: an input port to receive data symbols; a direct execution parser having a stack to store stack symbols, the parser capable of processing stack symbols in response to the received data symbols; a parser table accessible by the parser, the parser table capable of population with production rule codes indexable by the combination of at least one received data symbol and a stack symbol supplied by the parser; a production rule table accessible by the parser, the production rule table capable of population with production rules indexable by production rule codes; a first semantic code execution engine capable of executing machine instructions when prompted by the direct execution parser, using machine instruction segments indicated by the parser; and a semantic code table accessible by the semantic code execution engine, the semantic code table capable of population with machine instruction segments indexable by production rule codes.
2. The system of claim 1, further comprising a second semantic code execution engine capable of executing machine instructions when prompted by the direct execution parser, using machine instructions indicated by the parser, the first and second semantic code execution engines capable of parallel machine instruction execution.
3. The system of claim 2, further comprising an exception processing unit having a microprocessor and associated memory, the exception processing unit capable of performing tasks at the request of at least one of the semantic code execution engines.
4. The system of claim 2, further comprising a block input/output port connected to at least one of the semantic code execution engines, the block input/output port capable of initiating block input/output operations under control of the at least one semantic code execution engine.
5. The system of claim 2, wherein a production rule code allows the direct execution parser to determine whether a corresponding segment of semantic code table machine instructions can be directed to any available semantic code execution engine, or whether that segment should be directed to a specific semantic code execution engine.
6. The system of claim 1, further comprising an interface between the direct execution parser and the semantic code execution engine, the interface having the capability to suspend stack symbol processing by the direct execution parser when directed by the semantic code execution engine.
7. The system of claim 1, wherein the parser table, production rule table, and semantic code table at least partially reside in reprogrammable storage.
8. The system of claim 7, wherein the system processes data packets, each data packet formatted according to one or more network protocols, the parser table, production rule table, and semantic code table reprogrammable to support parsing for different network protocols.
9. The system of claim 8, wherein the system can load parser table reprogrammable storage with a network protocol while the system is processing data packets.
10. The system of claim 1, further comprising a machine context data interface connected to a data storage area and accessible by the semantic code execution engine, the machine context data interface managing the data storage area and performing data operations in response to machine context instructions issued by the semantic code execution engine.
11. The system of claim 10, the machine context data interface comprising a variable machine context data interface and an array machine context data interface, the array machine context data interface capable of managing and performing data operations on array data.
12. The system of claim 11, wherein the array machine context data interface accesses at least one data storage area with a data access format different from that of the data storage area accessed by the variable machine context data interface.
13. The system of claim 1, wherein at least the direct execution parser, the parser table, and the production rule table are implemented using software to configure a microprocessor and its attached memory.
14. The system of claim 1, wherein the production rule table is capable of storing bitmasked terminal symbols, each bitmasked terminal symbol capable of indicating that selected bits in a corresponding input symbol are "don't care" bits.
15. The system of claim 1, wherein the direct execution parser performs a parsing method selected from the group of methods including LL parsing, LR parsing, LALR parsing, and recursive descent parsing.
16. The system of claim 1, wherein the direct execution parser is capable of parsing input symbols using a variable input symbol look-ahead that can be varied for each stack symbol.
17. The system of claim 16, wherein the variable input symbol look-ahead can be stored as a value in the production rule table along with the production rules, and wherein the direct execution parser loads the variable input symbol look-ahead when it loads a production rule into the stack.
18. The system of claim 16, wherein the parser table comprises a binary or ternary content-addressable memory (CAM) with a word size capable of storing entries corresponding to the combination of a stack symbol and up to N input symbols.
19. The system of claim 18, wherein the parser supplies N input symbols to the parser table on each access, each CAM entry determining which of the N input symbols affect the lookup for that CAM entry.
20. An integrated circuit comprising: an input port to receive data symbols; a direct execution parser having a stack to store stack symbols, the parser capable of processing stack symbols in response to the received data symbols; a parser table accessible by the parser, the parser table capable of population with production rule codes indexable by the combination of a received data symbol and a stack symbol supplied by the parser; a production rule table accessible by the parser, the production rule table capable of population with production rules indexable by production rule codes; a first semantic code execution engine capable of executing machine instructions when prompted by the direct execution parser, using machine instruction segments indicated by the parser; and a semantic code table accessible by the semantic code execution engine, the semantic code table capable of population with machine instruction segments indexable by production rule codes.
21. The integrated circuit of claim 20, further comprising a second semantic code execution engine capable of executing machine instructions when prompted by the direct execution parser, using machine instructions indicated by the parser, the first and second semantic code execution engines capable of parallel machine instruction execution.
22. The integrated circuit of claim 21, further comprising an exception processing unit having a microprocessor, the exception processing unit capable of performing programmable tasks at the request of at least one of the semantic code execution engines.
23. The integrated circuit of claim 21, further comprising a block input/output port connected to at least one of the semantic code execution engines, the block input/output port capable of initiating block input/output operations under control of the at least one semantic code execution engine.
24. The integrated circuit of claim 21, wherein a production rule code allows the direct execution parser to determine whether a corresponding segment of semantic code table machine instructions can be directed to any available semantic code execution engine, or whether that segment should be directed to a specific semantic code execution engine.
25. The integrated circuit of claim 20, further comprising an interface between the direct execution parser and the semantic code execution engine, the interface having the capability to suspend stack symbol processing by the direct execution parser when directed by the semantic code execution engine.
26. The integrated circuit of claim 20, wherein the parser table, production rule table, and semantic code table at least partially reside in reprogrammable storage.
27. The integrated circuit of claim 26, wherein the parser table, production rule table, and semantic code table comprise caches for larger tables residing in memory separate from the integrated circuit.
28. The integrated circuit of claim 20, further comprising a machine context data interface connectable to a data storage area and accessible by the semantic code execution engine, the machine context data interface managing the data storage area and performing data operations in response to machine context instructions issued by the semantic code execution engine.
29. The integrated circuit of claim 28, wherein the data storage area is at least partially integrated on the integrated circuit.
30. The integrated circuit of claim 28, the machine context data interface comprising a variable machine context data interface and an array machine context data interface, the array machine context data interface capable of managing and performing data operations on array data.
31. The integrated circuit of claim 30, wherein the array machine context data interface accesses at least one data storage area with a data access format different from that of the data storage area accessed by the variable machine context data interface.
32. An integrated circuit comprising: an input port to receive data symbols; a direct execution parser having a stack to store stack symbols, the parser capable of processing stack symbols in response to the received data symbols; a parser table accessible by the parser, the parser table capable of population with production rule codes indexable by the combination of a received data symbol and a stack symbol supplied by the parser; a production rule table accessible by the parser, the production rule table capable of population with production rules indexable by production rule codes; multiple semantic code execution engines, each capable of executing machine instructions when prompted by the direct execution parser, using machine instruction segments indicated by the parser; a semantic code table accessible by the semantic code execution engines, the semantic code table capable of population with machine instruction segments indexable by production rule codes; and a machine context data interface connectable to a data storage area and accessible by the semantic code execution engines, the machine context data interface managing the data storage area and performing data operations in response to machine context instructions issued by the semantic code execution engines.
33. The integrated circuit of claim 32, further comprising: a first bus between the semantic code execution engines and the semantic code table; and a second bus between the semantic code execution engines and the machine context data interface.
34. The integrated circuit of claim 33, further comprising an input bus to allow the semantic code execution engines access to the data symbols.
35. The integrated circuit of claim 32, further comprising an interface between the direct execution parser and the semantic code execution engines, the interface having access to status information for each semantic code execution engine and having the capability to suspend stack symbol processing by the direct execution parser based on the status of a semantic code execution engine.
36. The integrated circuit of claim 35, wherein the status information comprises a set of semaphores corresponding to the semantic code execution engines and settable by the corresponding semantic code execution engines.
37. An integrated circuit comprising: an input port to receive data symbols; a direct execution parser comprising a stack to store stack symbols, the parser capable of processing stack symbols in response to the received data symbols, a parser table interface to allow the parser to access a parser table capable of population with production rule codes, each code indexable by the combination of a received data symbol and a stack symbol supplied by the parser, a production rule table interface to allow the parser to access a production rule table capable of population with production rules, each rule indexable by production rule codes; and a first semantic code execution engine capable of executing machine instructions when prompted by the direct execution parser, using machine instruction segments indicated by the parser; and a semantic code table interface to allow the semantic code execution engine to access a semantic code table capable of population with machine instruction segments corresponding to production rules.
38. The integrated circuit of claim 36, further comprising at least a section of the parser table, production rule table, and semantic code table integrated on the circuit.
39. The integrated circuit of claim 36, wherein the production rule table is capable of storing bitmasked terminal symbols, each bitmasked terminal symbol capable of indicating that selected bits in a corresponding input symbol are "don't care" bits.
40. The integrated circuit of claim 36, wherein the direct execution parser performs a parsing method selected from the group of methods including LL parsing, LR parsing, LALR parsing, and recursive descent parsing.
41. The integrated circuit of claim 36, wherein the direct execution parser is capable of parsing input symbols using a variable input symbol look-ahead that can be varied for each stack symbol.
42. The integrated circuit of claim 41, wherein the variable input symbol look-ahead can be stored as a value in the production rule table along with the production rules, and wherein the direct execution parser loads the variable input symbol look-ahead when it loads a production rule into the stack.
43. The integrated circuit of claim 41, wherein the parser table comprises a ternary content-addressable memory (CAM) with a word size capable of storing entries corresponding to the combination of a stack symbol and up to N input symbols.
44. The integrated circuit of claim 43, wherein the parser supplies N input symbols to the parser table on each access, each CAM entry determining which of the N input symbols affect the lookup for that CAM entry.
45. A method of configuring a data processor to process a datagram data input stream, the method comprising: storing a set of production rules, for interpreting datagrams, in a production rule table, each rule comprising one or more symbols; storing a set of semantic execution engine instructions in a semantic code table, the semantic execution engine instructions including code segments associated with at least some of the production rules; and storing a set of production rule codes, referencing the production rules, in a parser table.
46. The method of claim 45, further comprising initializing a direct execution parser to begin parsing a datagram, according to the stored production rules, upon receipt of a start symbol in the datagram data input stream.
47. A method of operating a network processor, the method comprising: detecting, at an input port, reception of the start of a datagram comprising multiple data symbols; directing a direct execution parser to parse data symbols from the datagram according to a set of stored production rules; and at least once during the parsing process, directing a semantic code execution engine to execute a code segment associated with a production rule.
48. The method of claim 47, further comprising during execution of the code segment, executing an instruction that generates a machine-context data request to an attached machine context data interface, and translating the machine context data request to at least one physical memory operation.
49. The method of claim 47, further comprising: detecting the occurrence of datagram content that cannot be processed by a semantic code execution engine; and directing an exception processing unit to process the datagram content.
50. The method of claim 47, wherein executing the code segment comprises directing a block input/output data operation to a block input/output port.
51. A method of implementing a network packet protocol, the method comprising: dividing the protocol into a set of parseable grammatical production rules, each comprising at least one symbol selected from the group of terminal and nonterminal symbols, and a set of machine context tasks to be performed for at least some of the production rules by an execution engine; assigning a non-terminal code and a production rule code to each production rule; organizing the grammatical production rules in a machine-storable format, indexable by production rule code; organizing the machine context tasks in an execution-engine instruction code format, indexable by the production rule code associated with the corresponding production rule; and generating a parser table of production rule codes in machine-storable format, indexable by the combination of a non-terminal symbol and at least one symbol appearing in a packet to be parsed by the network packet protocol.
52. The method of claim 51, further comprising affixing a prefix code to the symbols in the machine-storable format production rules, the prefix code indicating whether each symbol is a terminal or non-terminal symbol.
53. The method of claim 52, the prefix code further indicating whether a terminal symbol can match any network packet protocol symbol that it is paired with.
54. The method of claim 51, further comprising, for at least one terminal symbol, assigning a bitmask to that symbol and storing the bitmask with the production rule containing that symbol.
55. The method of claim 51, further comprising setting at least some indices in the parser table based on the combination of a non-terminal symbol and multiple input symbols.
56. The method of claim 55, wherein each index in the parser table can be based on up to N input symbols in N index positions, and wherein setting at least some indices in the parser table comprises, for each index, using between 1 and N index positions and setting the remainder of the index positions, if any, to a "don't care" condition.
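
The three-table arrangement recited in claims 51 to 56 can be illustrated with a small software model. The following is a minimal sketch only, not the claimed hardware: the toy grammar, the byte values, the look-ahead depth `N = 2`, and all names (`PARSER_TABLE`, `PRODUCTION_RULES`, `SEMANTIC_CODE`, `lookup`, `parse`) are assumptions made for illustration. The parser table is modeled as an ordered list of ternary entries (in the spirit of the CAM of claims 18 and 43), terminals carry a bitmask whose zero bits are "don't care" bits, and a plain dictionary stands in for the machine context.

```python
from typing import Callable, Dict, List, Optional, Tuple

N = 2  # assumed maximum look-ahead, in input symbols (bytes in this sketch)

# A terminal is ("T", value, mask); mask bits set to 0 are "don't care" bits.
# A non-terminal is ("NT", name). ANY matches every input byte.
ANY = ("T", 0x00, 0x00)

# Production rule table: production rule code -> right-hand-side symbols.
PRODUCTION_RULES: Dict[int, List[tuple]] = {
    1: [("T", 0x40, 0xF0), ("NT", "BODY")],   # PKT  -> 0x4? BODY
    2: [("T", 0x01, 0xFF), ANY],              # BODY -> 0x01 any
    3: [("T", 0x02, 0xFF), ANY, ANY],         # BODY -> 0x02 any any
    4: [("T", 0x02, 0xFF), ("T", 0xFF, 0xFF)],  # BODY -> 0x02 0xFF
}

# Parser table as ordered ternary entries:
# (non-terminal, up to N (value, mask) look-ahead positions, rule code).
# A position with mask 0 is unused ("don't care"); the first match wins.
PARSER_TABLE: List[Tuple[str, List[Tuple[int, int]], int]] = [
    ("PKT",  [(0x40, 0xF0), (0x00, 0x00)], 1),
    ("BODY", [(0x01, 0xFF), (0x00, 0x00)], 2),
    ("BODY", [(0x02, 0xFF), (0xFF, 0xFF)], 4),  # uses both index positions
    ("BODY", [(0x02, 0xFF), (0x00, 0x00)], 3),
]

# Semantic code table: production rule code -> task run on the context.
SEMANTIC_CODE: Dict[int, Callable[[dict, bytes, int], None]] = {
    1: lambda ctx, data, pos: ctx.update(version_low=data[pos] & 0x0F),
    2: lambda ctx, data, pos: ctx.update(
        kind="short", value=int.from_bytes(data[pos + 1:pos + 2], "big")),
    3: lambda ctx, data, pos: ctx.update(
        kind="long", value=int.from_bytes(data[pos + 1:pos + 3], "big")),
    4: lambda ctx, data, pos: ctx.update(kind="empty"),
}


def lookup(nonterminal: str, window: bytes) -> Optional[int]:
    """Return the rule code of the first ternary entry matching the window."""
    for name, pattern, code in PARSER_TABLE:
        if name != nonterminal:
            continue
        matched = True
        for i, (value, mask) in enumerate(pattern):
            if mask == 0:
                continue  # this index position is "don't care"
            if i >= len(window) or (window[i] & mask) != (value & mask):
                matched = False
                break
        if matched:
            return code
    return None


def parse(data: bytes, start: str = "PKT") -> dict:
    """Stack-driven parse; returns the context filled in by semantic code."""
    context: dict = {}
    stack: List[tuple] = [("NT", start)]
    pos = 0
    while stack:
        symbol = stack.pop()
        if symbol[0] == "NT":
            code = lookup(symbol[1], data[pos:pos + N])
            if code is None:
                raise ValueError(f"no rule for {symbol[1]} at offset {pos}")
            # Push the rule reversed so its first symbol is on top.
            stack.extend(reversed(PRODUCTION_RULES[code]))
            action = SEMANTIC_CODE.get(code)
            if action is not None:
                action(context, data, pos)
        else:
            _, value, mask = symbol
            if pos >= len(data) or (data[pos] & mask) != (value & mask):
                raise ValueError(f"terminal mismatch at offset {pos}")
            pos += 1
    if pos != len(data):
        raise ValueError("trailing input after parse")
    return context
```

With these assumed tables, `parse(bytes([0x45, 0x01, 0x7F]))` selects rules 1 and 2, while `parse(bytes([0x40, 0x02, 0xFF]))` selects rule 4 because that entry constrains both look-ahead positions and is listed ahead of the more general rule 3 entry. Ordering entries from most to least specific is one simple way to resolve overlapping ternary matches in software; the claims do not prescribe it.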
PCT/US2003/036225 2003-01-24 2003-11-12 A reconfigurable semantic processor WO2004068271A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU2003290817A AU2003290817A1 (en) 2003-01-24 2003-11-12 A reconfigurable semantic processor
CA002513097A CA2513097A1 (en) 2003-01-24 2003-11-12 A reconfigurable semantic processor
JP2004567407A JP4203023B2 (en) 2003-01-24 2003-11-12 Reconfigurable semantic processor
EP03783401A EP1590744A4 (en) 2003-01-24 2003-11-12 A reconfigurable semantic processor

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/351,030 US7130987B2 (en) 2003-01-24 2003-01-24 Reconfigurable semantic processor
US10/351,030 2003-01-24

Publications (2)

Publication Number Publication Date
WO2004068271A2 true WO2004068271A2 (en) 2004-08-12
WO2004068271A3 WO2004068271A3 (en) 2005-02-10

Family

ID=32735705

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/036225 WO2004068271A2 (en) 2003-01-24 2003-11-12 A reconfigurable semantic processor

Country Status (9)

Country Link
US (2) US7130987B2 (en)
EP (1) EP1590744A4 (en)
JP (1) JP4203023B2 (en)
KR (1) KR20050106591A (en)
CN (1) CN1742272A (en)
AU (1) AU2003290817A1 (en)
CA (1) CA2513097A1 (en)
TW (1) TWI239475B (en)
WO (1) WO2004068271A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9286892B2 (en) 2014-04-01 2016-03-15 Google Inc. Language modeling in speech recognition

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7415596B2 (en) * 2003-01-24 2008-08-19 Gigafin Networks, Inc. Parser table/production rule table configuration using CAM and SRAM
US7082044B2 (en) * 2003-03-12 2006-07-25 Sensory Networks, Inc. Apparatus and method for memory efficient, programmable, pattern matching finite state machine hardware
US7751440B2 (en) * 2003-12-04 2010-07-06 Intel Corporation Reconfigurable frame parser
US7219319B2 (en) * 2004-03-12 2007-05-15 Sensory Networks, Inc. Apparatus and method for generating state transition rules for memory efficient programmable pattern matching finite state machine hardware
US20050223369A1 (en) * 2004-03-31 2005-10-06 Intel Corporation Method and system for programming a reconfigurable processing element
US7251722B2 (en) * 2004-05-11 2007-07-31 Mistletoe Technologies, Inc. Semantic processor storage server architecture
US20070027991A1 (en) * 2005-07-14 2007-02-01 Mistletoe Technologies, Inc. TCP isolation with semantic processor TCP state machine
JP2008509484A (en) * 2004-08-05 2008-03-27 ミスルトウ テクノロジーズ, インコーポレイテッド Data context switching in the semantic processor
US20070022275A1 (en) * 2005-07-25 2007-01-25 Mistletoe Technologies, Inc. Processor cluster implementing conditional instruction skip
US8387029B2 (en) * 2005-07-25 2013-02-26 Hercules Software, Llc Direct execution virtual machine
KR100697536B1 (en) 2005-11-08 2007-03-20 전자부품연구원 Method of providing personal information based search by get_data operation in tv-anytime service
KR100681199B1 (en) * 2006-01-11 2007-02-09 삼성전자주식회사 Method and apparatus for interrupt handling in coarse grained array
US20080022401A1 (en) * 2006-07-21 2008-01-24 Sensory Networks Inc. Apparatus and Method for Multicore Network Security Processing
US8117530B2 (en) * 2007-02-19 2012-02-14 International Business Machines Corporation Extensible markup language parsing using multiple XML parsers
US9424339B2 (en) 2008-08-15 2016-08-23 Athena A. Smyros Systems and methods utilizing a search engine
CN110737628A (en) * 2019-10-17 2020-01-31 辰芯科技有限公司 reconfigurable processor and reconfigurable processor system
CN114679504B (en) * 2022-05-27 2022-09-06 成都数联云算科技有限公司 UDP message parsing method and device and computer equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5916305A (en) 1996-11-05 1999-06-29 Shomiti Systems, Inc. Pattern recognition in data communications using predictive parsers

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4837735A (en) * 1987-06-09 1989-06-06 Martin Marietta Energy Systems, Inc. Parallel machine architecture for production rule systems
US5193192A (en) * 1989-12-29 1993-03-09 Supercomputer Systems Limited Partnership Vectorized LR parsing of computer programs
US5487147A (en) * 1991-09-05 1996-01-23 International Business Machines Corporation Generation of error messages and error recovery for an LL(1) parser
US5805808A (en) * 1991-12-27 1998-09-08 Digital Equipment Corporation Real time parser for data packets in a communications network
US5632034A (en) * 1993-06-01 1997-05-20 International Business Machines Corporation Controlling method invocation sequence through virtual functions in an object-oriented class library
US5581696A (en) * 1995-05-09 1996-12-03 Parasoft Corporation Method using a computer for automatically instrumenting a computer program for dynamic debugging
US6493761B1 (en) * 1995-12-20 2002-12-10 Nb Networks Systems and methods for data processing using a protocol parsing engine
US5793954A (en) * 1995-12-20 1998-08-11 Nb Networks System and method for general purpose network analysis
US6034963A (en) * 1996-10-31 2000-03-07 Iready Corporation Multiple network protocol encoder/decoder and data processor
US6330659B1 (en) * 1997-11-06 2001-12-11 Iready Corporation Hardware accelerator for an object-oriented programming language
WO1998050852A1 (en) * 1997-05-08 1998-11-12 Iready Corporation Hardware accelerator for an object-oriented programming language
US6122757A (en) * 1997-06-27 2000-09-19 Agilent Technologies, Inc Code generating system for improved pattern matching in a protocol analyzer
US5991539A (en) * 1997-09-08 1999-11-23 Lucent Technologies, Inc. Use of re-entrant subparsing to facilitate processing of complicated input data
US6341130B1 (en) * 1998-02-09 2002-01-22 Lucent Technologies, Inc. Packet classification method and apparatus employing two fields
US6208649B1 (en) * 1998-03-11 2001-03-27 Cisco Technology, Inc. Derived VLAN mapping technique
US6119215A (en) * 1998-06-29 2000-09-12 Cisco Technology, Inc. Synchronization and control system for an arrayed processing engine
US6145073A (en) * 1998-10-16 2000-11-07 Quintessence Architectures, Inc. Data flow integrated circuit architecture
US6356950B1 (en) * 1999-01-11 2002-03-12 Novilit, Inc. Method for encoding and decoding data according to a protocol specification
US6763499B1 (en) * 1999-07-26 2004-07-13 Microsoft Corporation Methods and apparatus for parsing extensible markup language (XML) data streams
US6549916B1 (en) * 1999-08-05 2003-04-15 Oracle Corporation Event notification system tied to a file system
US6772413B2 (en) * 1999-12-21 2004-08-03 Datapower Technology, Inc. Method and apparatus of data exchange using runtime code generator and translator
US6985964B1 (en) * 1999-12-22 2006-01-10 Cisco Technology, Inc. Network processor system including a central processor and at least one peripheral processor
US6892237B1 (en) * 2000-03-28 2005-05-10 Cisco Technology, Inc. Method and apparatus for high-speed parsing of network messages
US6952666B1 (en) * 2000-07-20 2005-10-04 Microsoft Corporation Ranking parser for a natural language processing system
JP3690730B2 (en) * 2000-10-24 2005-08-31 インターナショナル・ビジネス・マシーンズ・コーポレーション Structure recovery system, syntax analysis system, conversion system, computer device, syntax analysis method, and storage medium
US20020116527A1 (en) * 2000-12-21 2002-08-22 Jin-Ru Chen Lookup engine for network devices
US20020083331A1 (en) * 2000-12-21 2002-06-27 802 Systems, Inc. Methods and systems using PLD-based network communication protocols
US7379475B2 (en) * 2002-01-25 2008-05-27 Nvidia Corporation Communications processor
US8218555B2 (en) * 2001-04-24 2012-07-10 Nvidia Corporation Gigabit ethernet adapter
US6587750B2 (en) * 2001-09-25 2003-07-01 Intuitive Surgical, Inc. Removable infinite roll master grip handle and touch sensor for robotic surgery
US6920154B1 (en) * 2001-12-17 2005-07-19 Supergate Technology Usa, Inc. Architectures for a modularized data optimization engine and methods therefor
US7535913B2 (en) * 2002-03-06 2009-05-19 Nvidia Corporation Gigabit ethernet adapter supporting the iSCSI and IPSEC protocols
US7426634B2 (en) * 2003-04-22 2008-09-16 Intruguard Devices, Inc. Method and apparatus for rate based denial of service attack detection and prevention

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
AHO A V ET AL.: "Compilers: Principles, Techniques, and Tools", ADDISON-WESLEY PUBLISHING CO.
CHISVIN L ET AL.: "Content-addressable and associative memory: alternatives to the ubiquitous RAM", COMPUTER, IEEE SERVICE CENTER, vol. 22, no. 7, XP000039034, DOI: doi:10.1109/2.30732
See also references of EP1590744A4

Also Published As

Publication number Publication date
TW200419443A (en) 2004-10-01
WO2004068271A3 (en) 2005-02-10
EP1590744A4 (en) 2007-12-05
US20040148415A1 (en) 2004-07-29
CA2513097A1 (en) 2004-08-12
JP4203023B2 (en) 2008-12-24
AU2003290817A8 (en) 2004-08-23
TWI239475B (en) 2005-09-11
CN1742272A (en) 2006-03-01
EP1590744A2 (en) 2005-11-02
JP2006513667A (en) 2006-04-20
KR20050106591A (en) 2005-11-10
US7130987B2 (en) 2006-10-31
AU2003290817A1 (en) 2004-08-23
US20070083858A1 (en) 2007-04-12

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2513097

Country of ref document: CA

Ref document number: 2003783401

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 1020057013618

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2004567407

Country of ref document: JP

Ref document number: 20038A92138

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2003783401

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020057013618

Country of ref document: KR