US20050289323A1 - Barrel shifter for a microprocessor - Google Patents
Barrel shifter for a microprocessor
- Publication number
- US20050289323A1 (application US 11/132,448)
- Authority
- US
- United States
- Prior art keywords
- bit
- bits
- shifter
- shift
- barrel shifter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3861—Recovery, e.g. branch miss-prediction, exception handling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F5/00—Methods or arrangements for data conversion without changing the order or content of the data handled
- G06F5/01—Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3648—Software debugging using additional hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30032—Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/30149—Instruction analysis, e.g. decoding, instruction word fields of variable length instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. by incrementing the instruction counter
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. by incrementing the instruction counter
- G06F9/322—Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
- G06F9/325—Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for loops, e.g. loop detection or loop counter
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
- G06F9/3806—Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3816—Instruction alignment, e.g. cache line crossing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3844—Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3846—Speculative instruction execution using static prediction, e.g. branch taken strategy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- This invention relates generally to microprocessor architecture and more specifically to a design for a rotation and shifting logic element of a microprocessor.
- Data within a computer or other digital circuit is typically organized into one or more standard data sizes, referred to as data words.
- a very common data word size contains 32 bits of binary data (zeros and ones).
- the size of the data word affects precision and/or resolution of the information contained within the digital circuit, with larger data sizes allowing greater precision and/or resolution because they can represent more values. Larger data words, however, require larger digital circuits to manipulate the data, leading to greater cost and complexity.
- many digital circuits also allow data of smaller, evenly divided sizes to be manipulated. For example, a digital circuit with a maximum data word size of 32 bits might also manipulate 8-bit or 16-bit data.
- a data operand that is half the size of the maximum data word is typically called a half-word.
- manipulating smaller data operands may provide advantages such as requiring less memory to store the data or allowing multiple data operands to be manipulated simultaneously by the same circuit.
- the bits of data within a data word are arranged in a fixed order, typically from most significant bit (MSB) in the leftmost position to least significant bit (LSB) in the rightmost position.
- the rotation operation takes a data word as an input operand and rearranges the order of the bits within that data word by moving bit values to the left or the right by a number of bit positions which may be fixed or may be specified by a second input operand.
- bit values that are moved past the MSB bit position are inserted into the right side bit positions which have been left vacant by the other bits being moved to the left.
- bits that are moved past the LSB bit position are inserted into the left side bit positions in the same manner. For example, consider a 32-bit data word:
- the shift operation also takes a data word as an input operand and rearranges the order of the bits within that data word by moving bit values to the left or the right by a number of bit positions which may be fixed or may be specified by a second input operand.
- a shift operation discards the bit values that are moved past the MSB or LSB bit positions.
- the bit positions that are left empty by the shift operation are filled with a fixed value, most commonly either with all 0s or all 1s. As an example, consider a 32-bit data word:
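As a minimal illustration of the two operations described above, the following Python sketch models rotation and shifting on a 32-bit word. The function names, the `fill` parameter and the sample value 0x12345678 are assumptions chosen for illustration; they do not come from the specification.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1  # 0xFFFFFFFF


def rotate_right(word, distance):
    """Rotate: bits moved past the LSB re-enter on the MSB side."""
    distance %= WIDTH
    doubled = (word << WIDTH) | word          # two copies of the word, back to back
    return (doubled >> distance) & MASK


def rotate_left(word, distance):
    """Rotate: bits moved past the MSB re-enter on the LSB side."""
    distance %= WIDTH
    doubled = (word << WIDTH) | word
    return (doubled >> (WIDTH - distance)) & MASK


def shift_right(word, distance, fill=0):
    """Shift: bits moved past the LSB are discarded; vacated bits take `fill`."""
    result = (word & MASK) >> distance
    if fill:
        result |= MASK & ~(MASK >> distance)  # set the vacated high-order bits to 1
    return result


def shift_left(word, distance, fill=0):
    """Shift: bits moved past the MSB are discarded; vacated bits take `fill`."""
    result = (word << distance) & MASK
    if fill:
        result |= (1 << distance) - 1         # set the vacated low-order bits to 1
    return result
```

With the assumed sample word, an 8-bit right rotation keeps all eight hexadecimal digits and only reorders them, whereas an 8-bit right shift loses the lowest two digits.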
- barrel shifter for effecting bitwise shifts of binary numbers.
- Barrel shifters permit shifting of an N bit word either to the left or to the right by 0, 1, 2, ..., N-1 bits.
- a typical 32-bit barrel shifter will consist of a series of multiplexers. Referring to FIG. 1 , a conventional right and left barrel shifter structure 100 is shown. In order to permit bi-directional shifting, duplicative hardware is used in parallel, with one side performing leftward shifts and the other performing rightward shifts. A single 5-bit control line will tell each stage of the multiplexer to effect a shift.
- any combination of shifts between 0 and 31 bits may be effected by enabling various combinations of the 5 multiplexer stages. For example, a nine-bit shift would have a control signal of 01001, enabling the 1st and the 4th multiplexers while disabling the others. One of the parallel shifters will perform a right directional shift while the other performs a left directional shift. Selection logic at the output of the last multiplexer stage of each parallel shifter will select the appropriate result.
- the conventional barrel shifter is effective at shifting; however, it is a less than ideal solution because the redundant hardware structure occupies extra space on the chip, consumes additional power and complicates the hardware design.
- the hardware complexity of this 32-bit barrel shifter can be characterised by the number of 2:1 multiplexers required to implement its functionality. In this case, 5 stages of 32 2:1 multiplexers each are required, resulting in 160 2:1 multiplexers. In general, the number of 2:1 multiplexers required to implement an N-bit barrel shifter, where N is a positive integer and a power of 2, is N·log2(N). As noted above, a typical processor needs two such barrel shifters to implement both left and right shifts. In the case of a 32-bit processor, this requires 320 2:1 multiplexers.
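The staged structure and the multiplexer count described above can be sketched as follows. This is a behavioural model of one direction (right) of a conventional shifter only, written under the assumption that stage k shifts by 2^k bits when control bit k is set; the function names are hypothetical.

```python
def conventional_shift_right(word, control, width=32):
    """Model log2(width) multiplexer stages; stage k shifts by 2**k if control bit k is 1."""
    stages = width.bit_length() - 1           # log2(width) for a power-of-two width
    for stage in range(stages):
        if (control >> stage) & 1:            # this stage selects its shifted input
            word >>= 1 << stage
    return word


def mux_count(width):
    """Number of 2:1 multiplexers in one N-bit shifter: N * log2(N)."""
    return width * (width.bit_length() - 1)
```

A control value of 0b01001 enables the 1-bit and 8-bit stages, giving a nine-bit shift, and `mux_count(32)` gives 160 for one shifter, or 320 for the left and right pair.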
- the rotation operation can also be implemented with additional logic to compute the effective shift distance required in each shifter and then combining the results of the shift operations.
- This can be illustrated by way of an example of rotating a 32-bit number to the right by 4 bit positions.
- the right shifter has to shift the input data by 4 bit positions and the left shifter has to shift the input data by 28 bit positions.
- the rotation result can then be obtained by combining the two shifter outputs using the bitwise logical OR operation.
- a shift distance of D is applied to the shifter of the same direction as the rotation and a shift distance of (N-D) is applied to the shifter of the opposite direction.
- further additional logic is required to compute the absolute value of a negative shift distance and apply it to the shifter with a shift direction opposite to the specified one.
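A rough sketch of the conventional rotation scheme just described is given below: one shifter receives distance D, the opposite shifter receives N-D, and the two outputs are OR-ed. The handling of a negative distance by flipping the direction follows the preceding paragraph; the function name, the string-valued `direction` argument and the sample values are assumptions for illustration.

```python
def rotate_with_two_shifters(word, distance, direction, width=32):
    """Rotate by combining a right shifter and a left shifter with a bitwise OR."""
    mask = (1 << width) - 1
    if distance < 0:
        # A negative distance is applied, as an absolute value, in the opposite direction.
        distance = -distance
        direction = "left" if direction == "right" else "right"
    distance %= width
    if direction == "right":
        right_amount, left_amount = distance, width - distance
    else:
        right_amount, left_amount = width - distance, distance
    right_result = (word & mask) >> right_amount   # output of the right shifter
    left_result = (word << left_amount) & mask     # output of the left shifter
    return right_result | left_result
```

For the 4-bit right rotation used as the example above, the right shifter is given 4 and the left shifter 28.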
- the barrel shifter comprises a 64 bit right-shifting barrel shifter capable of right and left directional shifts of a 32 bit input with positive or negative shift distance.
- a right shift of n bits (n < 32) is equivalent to a negative left shift of 32-n bits
- a left shift by n bits is equivalent to a negative right shift of 32-n bits.
- the barrel shifter comprises five multiplexers connected in series, shifting by distances of 1, 2, 4, 8 and 16 bits respectively.
- the barrel shifter also takes advantage of the fact that all bits between the bit length of the multiplexer stage and the 64th bit are zero. As a result, no hardware is necessary to keep track of these bits. Thus, five series multiplexers having lengths of 33, 35, 39, 47 and 63 bits respectively can be employed, with a reduced hardware footprint compared to the five 64-bit multiplexers or dual 32-bit multiplexers that are typically employed. Such a barrel shifter also permits rotation functions with minimal additional hardware logic.
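The stage lengths quoted above can be checked with a small simulation. The sketch below assumes the N-bit input sits in the upper half of a 2N-bit word and that stage k either passes its input or shifts it right by 2^k bits; it tracks which bit positions could ever be non-zero after each stage. The function name is hypothetical.

```python
def occupied_stage_widths(width=32):
    """Worst-case count of high-order bit positions that can be non-zero after each stage."""
    total = 2 * width
    reach = ((1 << width) - 1) << width       # input may occupy only the upper half
    widths = []
    for stage in range(width.bit_length() - 1):
        reach |= reach >> (1 << stage)        # union of the "pass" and "shift" outcomes
        lowest_set = (reach & -reach).bit_length() - 1
        widths.append(total - lowest_set)     # positions below lowest_set are always zero
    return widths
```

For a 32-bit input this yields 33, 35, 39, 47 and 63, matching the lengths stated above.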
- At least one embodiment of the invention provides a barrel shifter comprising a 2N bit shifter having an upper N bit portion for receiving an N bit input and a lower N bit portion, wherein an X-bit right shift, X < N, of a number yields an X bit right shift in the upper portion and an N-X bit left shift in the lower portion of the 2N bit barrel shifter, and further wherein N is an integer power of 2.
- At least one other embodiment of the invention provides a 2N bit right only barrel shifter, where N is an integer multiple of 2.
- the 2N bit right only barrel shifter according to this embodiment may comprise a number of multiplexer stages corresponding to log2 N, wherein each successive stage of the multiplexer adds 2^x additional bits to the number of bits in the preceding stage, where x increments from 0 to (log2 N - 1).
- An additional embodiment of the invention provides a method of performing a positive X bit right shift with a 2N bit right only shifter, 0 < X < N, wherein X is an integer and N is a word length in bits.
- the method of performing a positive X bit right shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the shifter, shifting the input by X bits, and retrieving the results from the upper N bit portion of the 2N bit shifter.
- Yet another embodiment of the invention provides a method of performing a negative X bit right shift with a 2N bit right only shifter, 0 < X < N, wherein X is an integer and N is a word length in bits.
- the method of performing a negative X bit right shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the shifter, shifting the input by X bits, and retrieving the results from the lower N bit portion of the 2N bit shifter.
- a further embodiment of the invention provides a method of performing a positive X bit left shift with a 2N bit right only shifter, 0 < X < N, wherein X is an integer and N is a word length in bits.
- the method of performing a positive X bit left shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the 2N bit right only shifter, determining a bitwise inverse of X, shifting the input by (1 + inverse of X) bits, and retrieving the results from the lower N bit portion of the 2N bit shifter.
- Still another embodiment of the invention provides a method of performing a negative X bit left shift with a 2N bit right only shifter, 0 < X < N, wherein X is an integer and N is a word length in bits.
- the method of performing a negative X bit left shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the 2N bit right only shifter, determining a bitwise inverse of X, shifting the input by (1 + inverse of X) bits, and retrieving the results from the upper N bit portion of the 2N bit shifter.
- Yet another additional embodiment of the invention provides a method of performing an X bit right rotation of an N bit number with a 2N bit right only barrel shifter where 0 < X < N.
- the method of performing an X bit right rotation of an N bit number with a 2N bit right only barrel shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the 2N bit barrel shifter, right shifting the N bit data input by X bits into the 2N bit barrel shifter, and performing a logical OR of the contents of the upper N bit portion and lower N bit portion of the 2N bit barrel shifter.
- Still another additional embodiment of the invention provides a method of performing an X bit left rotation of an N bit number with a 2N bit right only barrel shifter where 0 < X < N.
- the method of performing an X bit left rotation of an N bit number with a 2N bit right only barrel shifter may comprise receiving an N bit data input in an upper N bit portion of the 2N bit barrel shifter, determining a bitwise inverse of X, shifting the input by (1 + inverse of X) bits, and performing a logical OR of the contents of the upper N bit portion and lower N bit portion of the 2N bit barrel shifter.
- a further embodiment of the invention provides a 2N bit barrel shifter.
- the 2N bit barrel shifter according to this embodiment may comprise a pair of upper and lower N bit shifter portions, wherein an X bit right shift of an N bit number yields a X bit right shift in the upper N bit shifter portion and an N-X bit left shift in the lower N bit shifter portion.
- FIG. 1 is a schematic diagram illustrating a conventional parallel right and left shifting 32-bit barrel shifter comprising five series multiplexers each;
- FIG. 2 is a schematic diagram illustrating a 64-bit right shifting only barrel shifter capable of signed right or left binary shifts in accordance with at least one embodiment of this invention;
- FIG. 3 is a table illustrating right and left positive and negative shifts as performed with the barrel shifter according to various embodiments of this invention;
- FIG. 4 is a table showing the results of right and left positive and negative shifts for a given input as performed with the barrel shifter according to various embodiments of this invention;
- FIG. 5 is a table illustrating the results of an 8 bit rotation performed with a barrel shifter according to various embodiments of this invention.
- FIG. 6 is a diagram illustrating the stages of a multiplexer-based barrel shifter according to various embodiments of this invention.
- Referring now to FIG. 2, an exemplary embodiment of an improved barrel shifter architecture is shown for a microprocessor, wherein a 64-bit right only shifter 200 can provide the same functionality as the conventional double 32-bit shifter 100 shown in FIG. 1, while having a reduced circuit complexity and reduced power consumption.
- if the 64-bit shifter 200 is viewed as two side-by-side 32-bit shifters, the following principles may be exploited. Assume that the binary number (up to 32 bits) to be shifted resides in the left side of the 64-bit shifter, labeled A in FIG. 2.
- after a 2-bit right shift, the left half of the 64-bit result contains the number A shifted right by two bits, or in other words, the number A with its two rightmost (least significant) bits truncated and two leading zeros prepended at the most significant bit (MSB) end.
- the right half of the 64-bit result contains only bits 1 and 0 of A, which were shifted out of the left half, followed by 30 zeros.
- performing a 2-bit right shift to A in the left half of the 64-bit shifter results in a 30-bit left shift in the right half of the 64-bit register.
- bi-directional shifting is possible with a 64-bit, right only shifter.
- a left shift is equivalent to a right shift of the same number of bit positions but in the opposite direction and vice versa.
- a left shift is equivalent to a right shift of the same absolute shift distance but of the opposite sign.
- an implicit shift distance of 32 is added to the actual number of bit positions shifted right.
- the 2-bit right shift can be viewed as a negative 2-bit left shift. Selecting the right half of the 64-bit result is equivalent to adding 32 to the shift distance of -2. Hence, it is equivalent to a left shift of 30 bit positions.
- a negative right shift is equivalent to a positive left shift of the same shift distance
- a negative right shift of 30 bit positions can be obtained by selecting the right half of the result of the 64-bit shifter having performed a 2-bit right shift.
- a negative left shift of 2 bit positions is performed by selecting the left half of the above 64-bit result. That is, the 64-bit right only shifter can be used to compute left and right shifts of up to 32 bit positions in either the positive or the negative sense.
- because each multiplexer can only shift right by 1, 2, 4, 8 and 16 bits respectively and the right half of the 64-bit input is always zero, the required length of each multiplexer need only be 33, 35, 39, 47 and 63 bits respectively. This is because the other bits are all zeros and can be left out of the hardware logic. This results in a simplified design, a reduction in chip area and reduced power consumption.
- Referring now to FIG. 3, a table illustrating the way in which right and left positive and negative shifts are performed with the barrel shifter according to various embodiments of this invention is depicted.
- the table 300 in FIG. 3 shows that positive and negative right shifts, that is, shifts with shift distance D>0 and D<0, are performed with the 64 bit barrel shifter according to various embodiments by taking the upper portion and lower portion of the shifter respectively after performing a shift of shift distance D. Positive and negative left shifts are performed by taking the lower portion and upper portion of the shifter respectively after performing a shift of shift distance equal to (~D)+1, that is, the bitwise inverse of D plus one. For right shifts, the specified shift distance is applied directly to the shifter.
- the upper portion 310u would be the upper 32 bits and the lower portion 310l would be the lower 32 bits.
- Left directional positive and negative shifts are performed by taking the lower and upper portions respectively after shifting the negation or inverse of the shift distance plus one bit.
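The selection rules of table 300 can be summarised in a short behavioural model. This is a sketch under stated assumptions, not the circuit itself: the shift amount is taken modulo N (as a 5-bit control field would for N = 32), only non-zero distances with magnitude below N are accepted, and the function names and the string-valued `direction` argument are hypothetical.

```python
def right_only_shift(word, amount, width=32):
    """2N-bit right-only shifter: input enters the upper half; returns (upper, lower)."""
    mask = (1 << width) - 1
    wide = (word & mask) << width             # place the input in the upper N bits
    wide >>= amount & (width - 1)             # only log2(N) control bits are used
    return wide >> width, wide & mask


def signed_shift(word, distance, direction, width=32):
    """Apply the table: pick the shift amount, then pick the upper or lower half."""
    if distance == 0 or abs(distance) >= width:
        raise ValueError("sketch only covers 0 < |distance| < width")
    if direction == "right":
        amount = distance                     # applied directly to the shifter
        take_upper = distance > 0
    else:
        amount = ~distance + 1                # bitwise inverse of the distance, plus one
        take_upper = distance < 0
    upper, lower = right_only_shift(word, amount, width)
    return upper if take_upper else lower
```

With the input AABBCCDD, a positive right shift or negative left shift of eight bits returns 00AABBCC from the upper portion, while a positive left shift or negative right shift of eight bits returns BBCCDD00 from the lower portion.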
- Referring now to FIG. 4, an example illustrating the results of an eight bit shift according to the procedures set forth in the table 300 of FIG. 3 is shown.
- for the input AABBCCDD, the result of a positive right shift is simply 00AABBCC.
- the last eight bits of the input, DD, are truncated off by the shift operation.
- two leading hexadecimal zeros (eight zero bits) are prepended to the leading portion of the input.
- the result remaining in the upper portion 310u of the 64 bit shifter 300 will be 00AABBCC.
- the lower portion 310l will contain the number DD followed by six zeros. This is equivalent to a negative right shift of 24 bits or, analogously, a left shift of 24 bits.
- a positive left shift of eight bits is derived by shifting (~D)+1, where ~D is the bitwise inverse of D, and taking the contents of the lower portion 310l.
- D = 8, or 01000 in binary.
- the inverse of this is 10111. Adding 1 yields 11000 or 24 in decimal. So performing a 24 bit right shift in the 64-bit right shifter yields BBCCDD00 in the lower 32 bit portion of the shifter. This is the same as if the input AABBCCDD had actually been shifted left by 8 bits.
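The arithmetic of this worked example can be reproduced step by step. The sketch below assumes a 5-bit control field, as implied by the five multiplexer stages; the helper name `left_shift_control` is hypothetical.

```python
def left_shift_control(distance, control_bits=5):
    """Control value for a left shift: bitwise inverse of the distance, plus one."""
    field = (1 << control_bits) - 1           # 0b11111 for a 5-bit field
    inverse = ~distance & field               # 01000 -> 10111
    return (inverse + 1) & field              # 10111 + 1 -> 11000


control = left_shift_control(8)               # right-shift amount used for a left shift by 8
wide_input = 0xAABBCCDD << 32                 # input placed in the upper 32 bits
wide_result = wide_input >> control           # one 64-bit right shift
lower_half = wide_result & 0xFFFFFFFF         # the left-shift result is read from here
```

Printing `format(control, "05b")` and `format(wide_result, "016X")` shows the intermediate values for the example input.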
- the 64-bit right only barrel shifter 200 of FIG. 2 can also be used to perform the rotation operation.
- the rotation of a 32-bit quantity can be obtained by combining the result of a left shift and a right shift.
- rotating a 32-bit quantity by two bit positions to the right can be implemented by combining the results of right shifting the quantity by two bit positions and that of left shifting the same quantity by 30 bit positions.
- these two shift results are available respectively as the left and right halves of the 64-bit right shifter.
- the same underlying shifter design can be used to compute 32-bit rotation to the right. Rotation to the left can be supported similarly.
- Referring now to FIG. 5, a table illustrates the results of an eight bit rotation to the right performed with a barrel shifter according to various embodiments of this invention.
- a shift by eight bits is performed on the input. This will leave the results of an eight bit right shift in the upper portion and the results of a 24 bit left shift in the lower portion, 00AABBCC and DD000000 respectively. These two numbers are logically OR-ed together to yield DDAABBCC.
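Rotation with the 2N-bit right-only shifter reduces to one wide shift followed by an OR of the two halves. The following is a minimal sketch of that idea; the function names are assumptions, and the left rotation reuses the inverse-plus-one shift amount described for left shifts.

```python
def rotate_right_2n(word, distance, width=32):
    """Right rotation: shift once in a 2N-bit word, then OR the upper and lower halves."""
    mask = (1 << width) - 1
    wide = (word & mask) << width             # input in the upper N bits
    wide >>= distance & (width - 1)           # single right-only shift
    upper, lower = wide >> width, wide & mask
    return upper | lower


def rotate_left_2n(word, distance, width=32):
    """Left rotation: same datapath, with the amount set to (~distance) + 1."""
    return rotate_right_2n(word, ~distance + 1, width)
```

For the input AABBCCDD, `rotate_right_2n` with a distance of 8 combines 00AABBCC and DD000000 into DDAABBCC.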
- FIG. 6 is a diagram illustrating the stages of a multiplexer-based barrel shifter according to various embodiments of this invention.
- the diagram shows the input as a 32 bit number in the upper 32 bit portion of the 64 bit multiplexer 600 .
- the multiplexer 600 shown in FIG. 6 is comprised of five stages.
- N is the instruction word length of the processor
- the length of the barrel shifter is 2N
- the number of multiplexer stages required is Log2N.
- the first stage of the multiplexer performs a one bit shift and thus is 33 bits in length.
- the second stage performs a two bit shift and thus is 33+2 or 35 bits in length.
- the third stage performs a four bit shift and is thus 35+4 or 39 bits in length.
- the fourth stage performs an eight bit shift and is thus 39+8 or 47 bits in length.
- the fifth stage performs a 16 bit shift and is 47+16 or 63 bits in length.
- the 64 bit right only barrel shifter in accordance with various embodiments of the invention requires only 217 bits of multiplexer logic. This is a logic savings of about 32%, and a comparable saving applies for differing word sizes, such as, for example, 16 bit, 32 bit, 64 bit, 128 bit, etc.
Abstract
A 2N bit right only barrel shifter for a microprocessor comprising upper and lower N bit shifter portions. An N bit input is placed in the upper portion. An X bit right shift of the N bit number yields the result of the X bit right shift in the N bit upper portion and the result of an N-X bit left shift in the lower portion. The shifter is comprised of a Log2N stage multiplexer wherein each successive stage of the multiplexer adds 2^x additional bits, where x increments from 0 to (Log2N-1).
Description
- This application claims priority to provisional application No. 60/572,238 filed May 19, 2004, entitled “Microprocessor Architecture,” hereby incorporated by reference in its entirety.
- This invention relates generally to microprocessor architecture and more specifically to a design for a rotation and shifting logic element of a microprocessor.
- Data within a computer or other digital circuit is typically organized into one or more standard data sizes, referred to as data words. For example, a very common data word size contains 32 bits of binary data (zeros and ones). The size of the data word affects precision and/or resolution of the information contained within the digital circuit, with larger data sizes allowing greater precision and/or resolution because they can represent more values. Larger data words, however, require larger digital circuits to manipulate the data, leading to greater cost and complexity. In addition to manipulating data of a maximum data size, many digital circuits also allow data of smaller, evenly divided sizes to be manipulated. For example, a digital circuit with a maximum data word size of 32 bits might also manipulate 8-bit or 16-bit data. A data operand that is half the size of the maximum data word is typically called a half-word. When the extra precision is not required, manipulating smaller data operands may provide advantages such as requiring less memory to store the data or allowing multiple data operands to be manipulated simultaneously by the same circuit.
- Two manipulation operations that have proven to be useful when working with digital data are rotation and shifting. The bits of data within a data word are arranged in a fixed order, typically from most significant bit (MSB) in the leftmost position to least significant bit (LSB) in the rightmost position. The rotation operation takes a data word as an input operand and rearranges the order of the bits within that data word by moving bit values to the left or the right by a number of bit positions which may be fixed or may be specified by a second input operand. When rotating to the left, bit values that are moved past the MSB bit position are inserted into the right side bit positions which have been left vacant by the other bits being moved to the left. When rotating to the right, bits that are moved past the LSB bit position are inserted into the left side bit positions in the same manner. For example, consider a 32-bit data word:
- 0101 0001 0000 0000 0000 0000 1010 1110
An instruction to rotate this data word left by four bits results in the new value:
- 0001 0000 0000 0000 0000 1010 1110 0101
Since the values of the bits that are being rotated out the top or bottom of the data word are wrapped around and inserted at the other end of the data word, no bit values are lost.
- The second operation, shifting, also takes a data word as an input operand and rearranges the order of the bits within that data word by moving bit values to the left or the right by a number of bit positions which may be fixed or may be specified by a second input operand. A shift operation, however, discards the bit values that are moved past the MSB or LSB bit positions. The bit positions that are left empty by the shift operation are filled with a fixed value, most commonly either with all 0s or all 1s. As an example, consider a 32-bit data word:
- 0101 0001 0000 0000 0000 0000 1010 1110
An instruction to shift this word left by four bits results in the new value:
- 0001 0000 0000 0000 0000 1010 1110 0000
It is also common when shifting to the right to use the value of the input at the MSB bit position to fill the bit positions that are left empty. For signed binary numbers, this has the property of ensuring that the number keeps the same sign.
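The rotation and shift operations described above can be sketched in a few lines of Python. This is an illustrative model only, not part of the disclosed hardware; the helper names `rotl32`, `shl32` and `sar32` are hypothetical, and a 32-bit data word is assumed throughout.

```python
MASK32 = 0xFFFFFFFF  # keeps every result inside a 32-bit data word (assumed width)


def rotl32(x, n):
    """Rotate left: bits moved past the MSB re-enter at the LSB end."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def shl32(x, n):
    """Shift left: bits moved past the MSB are discarded, zeros fill the LSB end."""
    return (x << n) & MASK32


def sar32(x, n):
    """Shift right, filling vacated positions with the input's MSB (sign-preserving)."""
    fill = MASK32 if x & 0x80000000 else 0
    return ((x >> n) | (fill << (32 - n))) & MASK32


# The 32-bit example word used in the text above.
word = 0b0101_0001_0000_0000_0000_0000_1010_1110

rotated = rotl32(word, 4)   # 0001 0000 0000 0000 0000 1010 1110 0101
shifted = shl32(word, 4)    # 0001 0000 0000 0000 0000 1010 1110 0000
```

The two results differ only in the four least significant bits: the rotation re-inserts the discarded 0101, while the shift fills those positions with zeros.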
- As noted above, shifting and rotation are manipulation functions frequently performed in the execution stage of a microprocessor pipeline. Most microprocessors employ a logic unit known as a barrel shifter for effecting bitwise shifts of binary numbers. Barrel shifters permit shifting of an N bit word either to the left or to the right by 0, 1, 2, . . . N-1 bits. A typical 32-bit barrel shifter consists of a series of multiplexers. Referring to FIG. 1, a conventional right and left barrel shifter structure 100 is shown. In order to permit bi-directional shifting, duplicative hardware is used in parallel, with one side performing leftward shifts and the other performing rightward shifts. A single 5-bit control line will tell each stage of the multiplexer to effect a shift. In this manner, any combination of shifts between 0 and 31 bits may be effected by enabling various combinations of the 5 multiplexer stages. For example, a nine-bit shift would have a control signal of 01001, enabling the 1st and the 4th multiplexers while disabling the others. One of the parallel shifters will perform a right directional shift while the other performs a left directional shift. Selection logic at the output of the last of each parallel multiplexer will select the appropriate result.
- The conventional barrel shifter is effective at shifting; however, it is a less than ideal solution because the redundant hardware structure occupies extra space on the chip, consumes additional power and complicates the hardware design. The hardware complexity of this 32-bit barrel shifter can be characterised by the number of 2:1 multiplexers required to implement its functionalities. In this case, 5 stages each of 32 2:1 multiplexers are required, resulting in 160 2:1 multiplexers. In general, the number of 2:1 multiplexers required to implement an N-bit barrel shifter, where N is a positive integer and a power of 2, is N log2(N). As noted above, a typical processor needs two such barrel shifters to implement both left and right shifts. In the case of a 32-bit processor, this requires 320 2:1 multiplexers. With two such barrel shifters working in parallel on the same input data, the rotation operation can also be implemented with additional logic to compute the effective shift distance required in each shifter and then combining the results of the shift operations. This can be illustrated by way of an example of rotating a 32-bit number to the right by 4 bit positions. In this case, the right shifter has to shift the input data by 4 bit positions and the left shifter has to shift the input data by 28 bit positions. The rotation result can then be obtained by combining the two shifter outputs using the bitwise logical OR operation. In general, to rotate the input data by D bit positions, where D is a non-negative integer less than the data word length N, a shift distance of D is applied to the shifter of the same direction as the rotation and a shift distance of (N-D) is applied to the shifter of the opposite direction. In a processor that supports negative shift distance, further additional logic is required to compute the absolute value of a negative shift distance and apply it to the shifter with a shift direction opposite to the specified one.
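The conventional arrangement described above can be modelled with a short Python sketch. The function names are hypothetical and the code is a behavioural illustration under the stated assumptions (N a power of 2, one 2:1 multiplexer per bit per stage), not a description of any particular circuit.

```python
def enabled_stages(distance, n=32):
    """Return the 1-based multiplexer stages switched on by the control word.

    Stage k shifts by 2**(k-1) bits, so the control word is simply the
    binary representation of the shift distance.
    """
    stages = n.bit_length() - 1          # log2(n) for n a power of 2
    return [k + 1 for k in range(stages) if (distance >> k) & 1]


def conventional_mux_count(n=32, shifters=2):
    """2:1 multiplexers needed: n * log2(n) per shifter, times the number of shifters."""
    return shifters * n * (n.bit_length() - 1)


def rotate_right_two_shifters(value, d, n=32):
    """Rotate right by d using a right shifter (distance d) and a left shifter (distance n - d)."""
    mask = (1 << n) - 1
    right = (value & mask) >> d           # output of the right shifter
    left = (value << (n - d)) & mask      # output of the left shifter
    return right | left                   # combined with a bitwise OR
```

For example, `enabled_stages(9)` gives `[1, 4]`, matching the 01001 control signal, and `conventional_mux_count(32)` gives the 320 multiplexers quoted for a 32-bit processor.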
- It should be appreciated that the description herein of various advantages and disadvantages associated with known apparatus, methods, and materials is not intended to limit the scope of the invention to their exclusion. Indeed, various embodiments of the invention may include one or more of the known apparatus, methods, and materials without suffering from their disadvantages.
- As background to the techniques discussed herein, the following references are incorporated herein by reference: U.S. Pat. No. 6,862,563 issued Mar. 1, 2005 entitled “Method And Apparatus For Managing The Configuration And Functionality Of A Semiconductor Design” (Hakewill et al.); U.S. Ser. No. 10/423,745 filed Apr. 25, 2003, entitled “Apparatus and Method for Managing Integrated Circuit Designs”; and U.S. Ser. No. 10/651,560 filed Aug. 29, 2003, entitled “Improved Computerized Extension Apparatus and Methods”, all assigned to the assignee of the present invention.
- Thus, there exists a need for a barrel shifter that ameliorates and/or eliminates one or more of the above noted problems. In particular, there exists a need for a barrel shifter with reduced power consumption, improved performance and/or reduction of silicon footprint as compared with conventional barrel shifter devices.
- In various embodiments, this is accomplished through a microprocessor architecture that utilizes a barrel shifter characterized by reduction in complexity, reduced power consumption and enhanced capability over conventional barrel shifter designs. In various exemplary embodiments, the barrel shifter comprises a 64 bit right-shifting barrel shifter capable of right and left directional shifts of a 32 bit input with positive or negative shift distance. In various exemplary embodiments, a right shift of n bits (n<32) is equivalent to a negative left shift of 32-n bits, and a left shift by n bits is equivalent to a negative right shift of 32-n bits. In various exemplary embodiments, the barrel shifter is comprised of five multiplexers connected in series, shifting by distances of 1, 2, 4, 8 and 16 bits respectively. The barrel shifter also takes advantage of the fact that all bits between the bit length of the multiplexer stage and the 64th bit are zero. As a result, no hardware is necessary to keep track of these bits. Thus, five series multiplexers having lengths of 33, 35, 39, 47 and 63 bits respectively can be employed, giving a reduced hardware footprint as compared to five 64-bit multiplexers or dual 32 bit multiplexers as are typically employed. Such a barrel shifter also permits rotation functions with minimal additional hardware logic.
- At least one embodiment of the invention provides a barrel shifter comprising a 2N bit shifter having an upper N bit portion for receiving an N bit input and a lower N bit portion, wherein an X-bit right shift, X<N, of a number yields an X bit right shift in the upper portion and an N-X bit left shift in the lower portion of the 2N bit barrel shifter, and further wherein N is an integer power of 2.
- At least one other embodiment of the invention provides a 2N bit right only barrel shifter, where N is an integer multiple of 2. The 2N bit right only barrel shifter according to this embodiment may comprise a number of multiplexer stages corresponding to Log2N, wherein each successive stage of the multiplexer adds 2^x additional bits to the number of bits in the preceding stage, where x increments from 0 to (Log2N-1).
- An additional embodiment of the invention provides a method of performing a positive X bit right shift with a 2N bit right only shifter, 0<X<N, wherein X is an integer and N is a word length in bits. The method of performing a positive X bit right shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the shifter, shifting the input by X bits, and retrieving the results from the upper N bit portion of the 2N bit shifter.
- Yet another embodiment of the invention provides a method of performing a negative X bit right shift with a 2N bit right only shifter, 0<X<N, wherein X is an integer and N is a word length in bits. The method of performing a negative X bit right shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the shifter, shifting the input by X bits, and retrieving the results from the lower N bit portion of the 2N bit shifter.
- A further embodiment of the invention provides a method of performing a positive X bit left shift with a 2N bit right only shifter, 0<X<N wherein X is an integer and N is a word length in bits. The method of performing a positive X bit left shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the 2N bit right only shifter, determining a bit wise inverse of X, shifting the input by (1+inverse of X) bits, and retrieving the results from the lower N bit portion of the 2N bit shifter.
- Still another embodiment of the invention provides a method of performing a negative X bit shift with a 2N bit right only shifter, 0<X<N wherein X is an integer and N is a word length in bits. The method of performing a negative X bit shift with a 2N bit right only shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the 2N bit right only shifter, determining a bit wise inverse of X, shifting the input by (1+inverse of X) bits, and retrieving the results from the upper N bit portion of the 2N bit shifter.
- Yet another additional embodiment of the invention provides a method of performing an X bit right rotation of an N bit number with a 2N bit right only barrel shifter where 0<X<N. The method of performing an X bit right rotation of an N bit number with a 2N bit right only barrel shifter according to this embodiment may comprise receiving an N bit data input in an upper N bit portion of the 2N bit barrel shifter, right shifting the N bit data input by X bits into the N bit barrel shifter, and performing a logical OR of the contents of the upper N bit portion and lower N bit portion of the 2N bit barrel shifter.
- Still another additional embodiment of the invention provides a method of performing an X bit left rotation of an N bit number with a 2N bit right only barrel shifter where 0<X<N. The method of performing an X bit left rotation of an N bit number with a 2N bit right only barrel shifter may comprise receiving an N bit data input in an upper N bit portion of the 2N bit barrel shifter, determining a bit wise inverse of X, shifting the input by (1+inverse of X) bits, and performing a logical OR of the contents of the upper N bit portion and lower N bit portion of the 2N bit barrel shifter.
- A further embodiment of the invention provides a 2N bit barrel shifter. The 2N bit barrel shifter according to this embodiment may comprise a pair of upper and lower N bit shifter portions, wherein an X bit right shift of an N bit number yields a X bit right shift in the upper N bit shifter portion and an N-X bit left shift in the lower N bit shifter portion.
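The "(1+inverse of X)" shift distance used in the left shift and left rotation embodiments above can be checked with a short worked derivation. It assumes, as an illustration, that N = 2^k and that the bitwise inverse is taken over the k-bit control word; the symbols A, s, k, U and L are introduced here only for this sketch.

```latex
% Assumptions: N = 2^k, 0 < X < N, A is the N-bit input, s is the right-shift distance.
\begin{align*}
\sim X &= (2^k - 1) - X            && \text{(bitwise inverse over } k \text{ bits)} \\
s = (\sim X) + 1 &= 2^k - X = N - X \\
W &= \left\lfloor \frac{A \cdot 2^{N}}{2^{s}} \right\rfloor
                                    && \text{(input in the upper half, shifted right by } s\text{)} \\
L = W \bmod 2^{N} &= \left(A \cdot 2^{N-s}\right) \bmod 2^{N}
   = \left(A \cdot 2^{X}\right) \bmod 2^{N}
                                    && \text{(lower half: left shift by } X\text{)} \\
U = \left\lfloor W / 2^{N} \right\rfloor &= \left\lfloor A / 2^{N-X} \right\rfloor
                                    && \text{(upper half: right shift by } N - X\text{)}
\end{align*}
```

Under these assumptions the lower half holds the X bit left shift, and OR-ing the upper and lower halves gives the X bit left rotation.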
- Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
- FIG. 1 is a schematic diagram illustrating a conventional parallel right and left shifting 32-bit barrel shifter comprising five series multiplexers each;
- FIG. 2 is a schematic diagram illustrating a 64-bit right shifting only barrel shifter capable of signed right or left binary shifts in accordance with at least one embodiment of this invention;
- FIG. 3 is a table illustrating right and left positive and negative shifts as performed with the barrel shifter according to various embodiments of this invention;
- FIG. 4 is a table showing the results of right and left positive and negative shifts for a given input as performed with the barrel shifter according to various embodiments of this invention;
- FIG. 5 is a table illustrating the results of an 8 bit rotation performed with a barrel shifter according to various embodiments of this invention; and
- FIG. 6 is a diagram illustrating the stages of a multiplexer-based barrel shifter according to various embodiments of this invention.
- The following description is intended to convey a thorough understanding of the invention by providing specific embodiments and details involving various aspects of a new and useful microprocessor architecture. It is understood, however, that the invention is not limited to these specific embodiments and details, which are exemplary only. It further is understood that one possessing ordinary skill in the art, in light of known systems and methods, would appreciate the use of the invention for its intended purposes and benefits in any number of alternative embodiments, depending upon specific design and other needs.
- Referring now to FIG. 2, an exemplary embodiment of an improved barrel shifter architecture is shown for a microprocessor, wherein a 64-bit right only shifter 200 can provide the same functionality as the conventional double 32-bit shifter 100 shown in FIG. 1, while having a reduced circuit complexity and reduced power consumption. If the 64-bit shifter 200 is configured as two side-by-side 32-bit shifters, the following principles may be exploited. Assume that the binary number (up to 32 bits) to be shifted resides in the left side of the 64-bit shifter, labeled A in FIG. 2. Now, when, for example, a two bit right shift operation is performed, the left half of the 64-bit result contains the number A shifted right by two bits, or in other words, the number A with the two rightmost (least significant) bits truncated off and two leading zeros appended at the front (most significant) end. However, the right half of the 64-bit shifter contains the number A characterized by only bits
- The above behavior can be explained by two facts. Firstly, a left shift is equivalent to a right shift of the same number of bit positions but in the opposite direction and vice versa. In other words, a left shift is equivalent to a right shift of the same absolute shift distance but of the opposite sign. Secondly, by selecting the right half of the 64-bit result, an implicit left shift distance of 32 is added to the actual number of bit positions shifted right. In the above example shown in FIG. 2, the 2-bit right shift can be viewed as a negative 2-bit left shift. Selecting the right half of the 64-bit result is equivalent to adding 32 to the shift distance of −2. Hence, it is equivalent to a left shift of 30 bit positions. Also, since a negative right shift is equivalent to a positive left shift of the same shift distance, a negative right shift of 30 bit positions can be obtained by selecting the right half of the result of the 64-bit shifter having performed a 2-bit right shift. Similarly, a negative left shift of 2 bit positions is performed by selecting the left half of the above 64-bit result. That is, the 64-bit right only shifter can be used to compute left and right shifts of up to 32 bit positions in either the positive or the negative sense.
- Another advantage of the barrel shifter illustrated in FIG. 2 is that, because the multiplexers can only shift right by 1, 2, 4, 8 and 16 bits respectively and the right half of the input is always zero, the required length of each multiplexer need only be 33, 35, 39, 47 and 63 bits respectively. This is because the other bits are all zeros and can be left out of the hardware logic. This results in simplified design, reduction of chip area and reduced power consumption.
- Referring now to FIG. 3, a table illustrating the way in which right and left positive and negative shifts are performed with the barrel shifter according to various embodiments of this invention is depicted. The table 300 in FIG. 3 shows that positive and negative right shifts, that is, shift distance D>0 and D<0, are performed with the 64 bit barrel shifter according to various embodiments by taking the upper portion and lower portion of the shifter respectively after performing a shift of shift distance D. Also, positive and negative left shifts are performed by taking the lower portion and upper portion of the shifter respectively after performing a shift of shift distance equal to the bitwise inverse of D, plus one. For right shifts, the specified shift distance is applied directly to the shifter. In the case of the 64 bit shifter 310 depicted in FIG. 3, the upper portion 310 u would be the upper 32 bits and the lower portion 310 l would be the lower 32 bits. Left directional positive and negative shifts are performed by taking the lower and upper portions respectively after shifting by the negation of the shift distance, that is, its bitwise inverse plus one.
- Referring now to FIG. 4, an example illustrating the results of an eight bit shift according to the procedures set forth in the table 300 of FIG. 3 is illustrated. Taking the 32 bit hex number AABBCCDD as an input, the result of a positive right shift is simply 00AABBCC. The last eight bits of the input, DD, are truncated off by the shift operation. Likewise, two leading zeros are appended to the leading portion of the input. Thus, the result remaining in the upper portion 310 u of the 64 bit shifter 310 will be 00AABBCC. The lower portion 310 l will contain the number DD followed by six zeros. This is equivalent to a negative right shift of 24 bits or, analogously, a left shift of 24 bits.
- A positive left shift of eight bits is derived by shifting (~D)+1, where ~D is the bitwise inverse of D, and taking the contents of the lower portion 310 l. Thus, in this case, D=8 or 01000 in binary. The inverse of this is 10111. Adding 1 yields 11000 or 24 in decimal. So performing a 24 bit right shift in the 64-bit right shifter yields BBCCDD00 in the lower 32 bit portion of the shifter. This is the same as if the input AABBCCDD had actually shifted left by 8 bits. Similarly, a negative left shift of 24 bit positions, that is D=−24, is accomplished by right shifting by the inverse of D plus 1, or 24 bits, and taking the contents of the upper portion 310 u, or 000000AA.
- The 64-bit right only barrel shifter 200 of FIG. 2 can also be used to perform the rotation operation. The rotation of a 32-bit quantity can be obtained by combining the result of a left shift and a right shift. As an example, rotating a 32-bit quantity by two bit positions to the right can be implemented by combining the results of right shifting the quantity by two bit positions and that of left shifting the same quantity by 30 bit positions. As demonstrated above, these two shift results are available respectively as the left and right halves of the 64-bit right shifter. Hence the same underlying shifter design can be used to compute 32-bit rotation to the right. Rotation to the left can be supported similarly.
- Referring now to FIG. 5, a table illustrating the results of an eight bit rotation to the right performed with a barrel shifter according to various embodiments of this invention is shown. In order to perform an eight bit rotation using the 64 bit right only shifter according to various embodiments of this invention, a shift by eight bits is performed on the input. This will leave the results of an eight bit right shift in the upper portion and the results of a 24 bit left shift in the lower portion, 00AABBCC and DD000000 respectively. These two numbers are logically OR-ed together to yield DDAABBCC.
- FIG. 6 is a diagram illustrating the stages of a multiplexer-based barrel shifter according to various embodiments of this invention. The diagram shows the input as a 32 bit number in the upper 32 bit portion of the 64 bit multiplexer 600. The multiplexer 600 shown in FIG. 6 is comprised of five stages. However, it should be appreciated that the principles set forth herein are applicable to barrel shifters of different lengths, for example, 128 bits, 256 bits, etc., where N is the instruction word length of the processor, the length of the barrel shifter is 2N and the number of multiplexer stages required is Log2N.
- With continued reference to FIG. 6, the first stage of the multiplexer performs a one bit shift and thus is 33 bits in length. The second stage performs a two bit shift and thus is 33+2 or 35 bits in length. The third stage performs a four bit shift and is thus 35+4 or 39 bits in length. The fourth stage performs an eight bit shift and is thus 39+8 or 47 bits in length. Finally, the fifth stage performs a 16 bit shift and is 47+16 or 63 bits in length. With this combination of stages, a shift of any length from 0 to 31 bits can be performed using various possible combinations of the stages, with a five bit control line having the most significant bit activating the fifth stage and the least significant bit activating the first stage. Furthermore, as compared to the prior art using dual parallel 32 bit shifters, which requires 320 bits of multiplexer logic, the 64 bit right only barrel shifter in accordance with various embodiments of the invention requires only 217 bits of multiplexer logic. This is a logic savings of about 32%, and a comparable saving applies for differing word sizes, such as, for example, 16 bit, 32 bit, 64 bit, 128 bit, etc.
- While the foregoing description includes many details and specificities, it is to be understood that these have been included for purposes of explanation only, and are not to be interpreted as limitations of the present invention. Many modifications to the embodiments described above can be made without departing from the spirit and scope of the invention.
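The behaviour described for FIGS. 3 to 6 can be reproduced with a small software model. The Python sketch below is an illustration under stated assumptions, not the disclosed circuit: all function names are hypothetical, N = 32 is assumed, and shift distances are restricted to 0 < d < N as in the method embodiments.

```python
N = 32                     # word length assumed for this sketch
MASK_N = (1 << N) - 1      # selects one N-bit half of the 2N-bit shifter


def right_only_shift(a, distance):
    """Place `a` in the upper half, shift right, return (upper half, lower half)."""
    wide = (a & MASK_N) << N           # input occupies the upper N bits
    wide >>= distance & (N - 1)        # control word is log2(N) bits wide
    return wide >> N, wide & MASK_N


def inverse_plus_one(d):
    """Bitwise inverse of d plus one, truncated to the control word width."""
    return (~d + 1) & (N - 1)


def shift_right(a, d):
    return right_only_shift(a, d)[0]                      # take the upper portion


def shift_left(a, d):
    return right_only_shift(a, inverse_plus_one(d))[1]    # take the lower portion


def rotate_right(a, d):
    upper, lower = right_only_shift(a, d)
    return upper | lower                                  # OR the two portions


def rotate_left(a, d):
    upper, lower = right_only_shift(a, inverse_plus_one(d))
    return upper | lower


def stage_widths(n=N):
    """Width of each multiplexer stage: stage x adds 2**x bits to the previous width."""
    widths, width = [], n
    for x in range(n.bit_length() - 1):    # log2(n) stages for n a power of 2
        width += 1 << x
        widths.append(width)
    return widths
```

With the input AABBCCDD, `right_only_shift(0xAABBCCDD, 8)` returns the pair 00AABBCC and DD000000, `shift_left(0xAABBCCDD, 8)` returns BBCCDD00, and `rotate_right(0xAABBCCDD, 8)` returns DDAABBCC, in line with the worked example. `stage_widths(32)` gives 33, 35, 39, 47 and 63, which sum to 217.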
Claims (18)
1. A barrel shifter comprising:
a 2N bit shifter having an upper N bit portion for receiving an N bit input and a lower N bit portion, wherein an X-bit right shift, X<N of a number yields an X bit right shift in the upper portion and an N-X bit left shift in the lower portion of the 2N bit barrel shifter, and further wherein N is an integer power of 2.
2. The barrel shifter according to claim 1 , wherein the 2N bit shifter is a right direction only shifter.
3. The barrel shifter according to claim 1 , wherein an X bit rotation, X<N, of an input is achieved by a bit-wise logical OR of the contents of the upper N bit portion and lower N bit portion after performing an X bit right shift.
4. The barrel shifter according to claim 1 , wherein the 2N bit barrel shifter comprises a Log2N stage multiplexer having Log2N bit control line, wherein each bit of the control line is connected to a respective stage of the multiplexer.
5. The barrel shifter according to claim 4 , wherein N=32 and the multiplexer comprises 5 stages having 33-bits, 35-bits, 39-bits, 47-bits and 63-bits respectively.
6. A 2N bit right only barrel shifter, where N is an integer multiple of 2, comprising:
a number of multiplexer stages corresponding to Log2N, wherein each successive stage of the multiplexer adds 2^x additional bits to the number of bits in the preceding stage where x increments from 0 to (Log2N-1).
7. The 2N bit right only barrel shifter according to claim 6 , wherein, N=32, the first multiplexer stage is 33 bits, the second multiplexer stage is 35 bits, the third multiplexer stage is 39 bits, the fourth multiplexer stage is 47 bits and the fifth multiplexer stage is 63 bits, wherein the 5 stages are adapted to perform 1-bit, 2-bit, 4-bit, 8-bit and 16-bit shifts respectively.
8. A method of performing a positive X bit right shift with a 2N bit right only shifter, 0<X<N, wherein X is an integer and N is a word length in bits, comprising:
receiving an N bit data input in an upper N bit portion of the shifter;
shifting the input by X bits; and
retrieving the results from the upper N bit portion of the 2N bit shifter.
9. A method of performing a negative X bit right shift with a 2N bit right only shifter, 0<X<N, wherein X is an integer and N is a word length in bits, comprising:
receiving an N bit data input in an upper N bit portion of the shifter;
shifting the input by X bits; and
retrieving the results from the lower N bit portion of the 2N bit shifter.
10. A method of performing a positive X bit left shift with a 2N bit right only shifter, 0<X<N wherein X is an integer and N is a word length in bits, comprising:
receiving an N bit data input in an upper N bit portion of the 2N bit right only shifter;
determining a bit wise inverse of X;
shifting the input by (1+inverse of X) bits; and
retrieving the results from the lower N bit portion of the 2N bit shifter.
11. The method according to claim 10 , wherein shifting comprises sending a Log2N bit control signal to a Log2N stage multiplexer.
12. The method according to claim 11 , wherein N=32, and the 5 stage multiplexer comprises 33-bit, 35-bit, 39-bit, 47-bit and 63-bit stages shifting 1 bit, 2 bits, 4 bits, 8 bits and 16 bits respectively.
13. A method of performing a negative X bit shift with a 2N bit right only shifter, 0<X<N wherein X is an integer and N is a word length in bits, comprising:
receiving an N bit data input in an upper N bit portion of the 2N bit right only shifter;
determining a bit wise inverse of X;
shifting the input by (1+inverse of X) bits; and
retrieving the results from the upper N bit portion of the 2N bit shifter.
14. The method according to claim 13 , wherein shifting comprises sending a Log2N bit control signal to a Log2N stage multiplexer.
15. The method according to claim 14 , wherein N=32, and the 5 stage multiplexer comprises 33-bit, 35-bit, 39-bit, 47-bit and 63-bit stages shifting 1 bit, 2 bits, 4 bits, 8 bits and 16 bits respectively.
16. A method of performing an X bit right rotation of an N bit number with a 2N bit right only barrel shifter where 0<X<N comprising:
receiving an N bit data input in an upper N bit portion of the 2N bit barrel shifter;
right shifting the N bit data input by X bits into the N bit barrel shifter; and
performing a logical OR of the contents of the upper N bit portion and lower N bit portion of the 2N bit barrel shifter.
17. A method of performing an X bit left rotation of an N bit number with a 2N bit right only barrel shifter where 0<X<N comprising:
receiving an N bit data input in an upper N bit portion of the 2N bit barrel shifter;
determining a bit wise inverse of X;
shifting the input by (1+inverse of X) bits; and
performing a logical OR of the contents of the upper N bit portion and lower N bit portion of the 2N bit barrel shifter.
18. A 2N bit barrel shifter comprising:
a pair of upper and lower N bit shifter portions, wherein an X bit right shift of an N bit number yields a X bit right shift in the upper N bit shifter portion and an N-X bit left shift in the lower N bit shifter portion.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/132,448 US20050289323A1 (en) | 2004-05-19 | 2005-05-19 | Barrel shifter for a microprocessor |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US57223804P | 2004-05-19 | 2004-05-19 | |
US11/132,448 US20050289323A1 (en) | 2004-05-19 | 2005-05-19 | Barrel shifter for a microprocessor |
Publications (1)
Publication Number | Publication Date |
---|---|
US20050289323A1 (en) | 2005-12-29 |
Family
ID=35429033
Family Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/132,428 Abandoned US20050278517A1 (en) | 2004-05-19 | 2005-05-19 | Systems and methods for performing branch prediction in a variable length instruction set microprocessor |
US11/132,447 Abandoned US20050278505A1 (en) | 2004-05-19 | 2005-05-19 | Microprocessor architecture including zero impact predictive data pre-fetch mechanism for pipeline data memory |
US11/132,448 Abandoned US20050289323A1 (en) | 2004-05-19 | 2005-05-19 | Barrel shifter for a microprocessor |
US11/132,423 Abandoned US20050278513A1 (en) | 2004-05-19 | 2005-05-19 | Systems and methods of dynamic branch prediction in a microprocessor |
US11/132,432 Abandoned US20050273559A1 (en) | 2004-05-19 | 2005-05-19 | Microprocessor architecture including unified cache debug unit |
US11/132,424 Active 2031-02-12 US8719837B2 (en) | 2004-05-19 | 2005-05-19 | Microprocessor architecture having extendible logic |
US14/222,194 Active US9003422B2 (en) | 2004-05-19 | 2014-03-21 | Microprocessor architecture having extendible logic |
Country Status (5)
Country | Link |
---|---|
US (7) | US20050278517A1 (en) |
CN (1) | CN101002169A (en) |
GB (1) | GB2428842A (en) |
TW (1) | TW200602974A (en) |
WO (1) | WO2005114441A2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050278505A1 (en) * | 2004-05-19 | 2005-12-15 | Lim Seow C | Microprocessor architecture including zero impact predictive data pre-fetch mechanism for pipeline data memory |
US7971042B2 (en) | 2005-09-28 | 2011-06-28 | Synopsys, Inc. | Microprocessor system and method for instruction-initiated recording and execution of instruction sequences in a dynamically decoupleable extended instruction pipeline |
US9959247B1 (en) | 2017-02-17 | 2018-05-01 | Google Llc | Permuting in a matrix-vector processor |
Families Citing this family (68)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7577795B2 (en) * | 2006-01-25 | 2009-08-18 | International Business Machines Corporation | Disowning cache entries on aging out of the entry |
US20070260862A1 (en) * | 2006-05-03 | 2007-11-08 | Mcfarling Scott | Providing storage in a memory hierarchy for prediction information |
US7752468B2 (en) | 2006-06-06 | 2010-07-06 | Intel Corporation | Predict computing platform memory power utilization |
US7555605B2 (en) * | 2006-09-28 | 2009-06-30 | Freescale Semiconductor, Inc. | Data processing system having cache memory debugging support and method therefor |
US7716460B2 (en) * | 2006-09-29 | 2010-05-11 | Qualcomm Incorporated | Effective use of a BHT in processor having variable length instruction set execution modes |
US7529909B2 (en) * | 2006-12-28 | 2009-05-05 | Microsoft Corporation | Security verified reconfiguration of execution datapath in extensible microcomputer |
US7779241B1 (en) | 2007-04-10 | 2010-08-17 | Dunn David A | History based pipelined branch prediction |
US8166277B2 (en) * | 2008-02-01 | 2012-04-24 | International Business Machines Corporation | Data prefetching using indirect addressing |
US8209488B2 (en) * | 2008-02-01 | 2012-06-26 | International Business Machines Corporation | Techniques for prediction-based indirect data prefetching |
US9519480B2 (en) * | 2008-02-11 | 2016-12-13 | International Business Machines Corporation | Branch target preloading using a multiplexer and hash circuit to reduce incorrect branch predictions |
US9201655B2 (en) * | 2008-03-19 | 2015-12-01 | International Business Machines Corporation | Method, computer program product, and hardware product for eliminating or reducing operand line crossing penalty |
US8181003B2 (en) * | 2008-05-29 | 2012-05-15 | Axis Semiconductor, Inc. | Instruction set design, control and communication in programmable microprocessor cores and the like |
US8131982B2 (en) * | 2008-06-13 | 2012-03-06 | International Business Machines Corporation | Branch prediction instructions having mask values involving unloading and loading branch history data |
US8225069B2 (en) * | 2009-03-31 | 2012-07-17 | Intel Corporation | Control of on-die system fabric blocks |
US10338923B2 (en) * | 2009-05-05 | 2019-07-02 | International Business Machines Corporation | Branch prediction path wrong guess instruction |
JP5423156B2 (en) * | 2009-06-01 | 2014-02-19 | 富士通株式会社 | Information processing apparatus and branch prediction method |
US8954714B2 (en) * | 2010-02-01 | 2015-02-10 | Altera Corporation | Processor with cycle offsets and delay lines to allow scheduling of instructions through time |
US8521999B2 (en) * | 2010-03-11 | 2013-08-27 | International Business Machines Corporation | Executing touchBHT instruction to pre-fetch information to prediction mechanism for branch with taken history |
US8495287B2 (en) * | 2010-06-24 | 2013-07-23 | International Business Machines Corporation | Clock-based debugging for embedded dynamic random access memory element in a processor core |
US10866807B2 (en) | 2011-12-22 | 2020-12-15 | Intel Corporation | Processors, methods, systems, and instructions to generate sequences of integers in numerical order that differ by a constant stride |
WO2013095554A1 (en) | 2011-12-22 | 2013-06-27 | Intel Corporation | Processors, methods, systems, and instructions to generate sequences of consecutive integers in numerical order |
US10223111B2 (en) | 2011-12-22 | 2019-03-05 | Intel Corporation | Processors, methods, systems, and instructions to generate sequences of integers in which integers in consecutive positions differ by a constant integer stride and where a smallest integer is offset from zero by an integer offset |
US9639354B2 (en) | 2011-12-22 | 2017-05-02 | Intel Corporation | Packed data rearrangement control indexes precursors generation processors, methods, systems, and instructions |
US9395994B2 (en) | 2011-12-30 | 2016-07-19 | Intel Corporation | Embedded branch prediction unit |
US9851973B2 (en) * | 2012-03-30 | 2017-12-26 | Intel Corporation | Dynamic branch hints using branches-to-nowhere conditional branch |
US9135012B2 (en) | 2012-06-14 | 2015-09-15 | International Business Machines Corporation | Instruction filtering |
US9152424B2 (en) * | 2012-06-14 | 2015-10-06 | International Business Machines Corporation | Mitigating instruction prediction latency with independently filtered presence predictors |
EP2862061A4 (en) * | 2012-06-15 | 2016-12-21 | Soft Machines Inc | A virtual load store queue having a dynamic dispatch window with a unified structure |
US9378017B2 (en) | 2012-12-29 | 2016-06-28 | Intel Corporation | Apparatus and method of efficient vector roll operation |
CN103425498B (en) * | 2013-08-20 | 2018-07-24 | 复旦大学 | A kind of long instruction words command memory of low-power consumption and its method for optimizing power consumption |
US10372590B2 (en) | 2013-11-22 | 2019-08-06 | International Business Machines Corporation | Determining instruction execution history in a debugger |
US9870226B2 (en) * | 2014-07-03 | 2018-01-16 | The Regents Of The University Of Michigan | Control of switching between executed mechanisms |
US9910670B2 (en) | 2014-07-09 | 2018-03-06 | Intel Corporation | Instruction set for eliminating misaligned memory accesses during processing of an array having misaligned data rows |
US9740607B2 (en) | 2014-09-03 | 2017-08-22 | Micron Technology, Inc. | Swap operations in memory |
TWI569207B (en) * | 2014-10-28 | 2017-02-01 | 上海兆芯集成電路有限公司 | Fractional use of prediction history storage for operating system routines |
US9665374B2 (en) * | 2014-12-18 | 2017-05-30 | Intel Corporation | Binary translation mechanism |
CN114546492A (en) * | 2015-04-24 | 2022-05-27 | 优创半导体科技有限公司 | Computer processor implementing pre-translation of virtual addresses using target registers |
US10346168B2 (en) * | 2015-06-26 | 2019-07-09 | Microsoft Technology Licensing, Llc | Decoupled processor instruction window and operand buffer |
US10776115B2 (en) * | 2015-09-19 | 2020-09-15 | Microsoft Technology Licensing, Llc | Debug support for block-based processor |
US10664280B2 (en) | 2015-11-09 | 2020-05-26 | MIPS Tech, LLC | Fetch ahead branch target buffer |
US10599428B2 (en) | 2016-03-23 | 2020-03-24 | Arm Limited | Relaxed execution of overlapping mixed-scalar-vector instructions |
GB2548601B (en) * | 2016-03-23 | 2019-02-13 | Advanced Risc Mach Ltd | Processing vector instructions |
US10192281B2 (en) | 2016-07-07 | 2019-01-29 | Intel Corporation | Graphics command parsing mechanism |
CN109690536B (en) * | 2017-02-16 | 2021-03-23 | 华为技术有限公司 | Method and system for fetching multi-core instruction traces from virtual platform simulator to performance simulation model |
CN107179895B (en) * | 2017-05-17 | 2020-08-28 | 北京中科睿芯科技有限公司 | Method for accelerating instruction execution speed in data stream structure by applying composite instruction |
US10902348B2 (en) | 2017-05-19 | 2021-01-26 | International Business Machines Corporation | Computerized branch predictions and decisions |
GB2564390B (en) * | 2017-07-04 | 2019-10-02 | Advanced Risc Mach Ltd | An apparatus and method for controlling use of a register cache |
US11243880B1 (en) * | 2017-09-15 | 2022-02-08 | Groq, Inc. | Processor architecture |
US11360934B1 (en) | 2017-09-15 | 2022-06-14 | Groq, Inc. | Tensor streaming processor architecture |
US11868804B1 (en) | 2019-11-18 | 2024-01-09 | Groq, Inc. | Processor instruction dispatch configuration |
US11114138B2 (en) | 2017-09-15 | 2021-09-07 | Groq, Inc. | Data structures with multiple read ports |
US10372459B2 (en) | 2017-09-21 | 2019-08-06 | Qualcomm Incorporated | Training and utilization of neural branch predictor |
US11170307B1 (en) | 2017-09-21 | 2021-11-09 | Groq, Inc. | Predictive model compiler for generating a statically scheduled binary with known resource constraints |
US20200065112A1 (en) * | 2018-08-22 | 2020-02-27 | Qualcomm Incorporated | Asymmetric speculative/nonspeculative conditional branching |
US11301546B2 (en) | 2018-11-19 | 2022-04-12 | Groq, Inc. | Spatial locality transform of matrices |
US11163577B2 (en) | 2018-11-26 | 2021-11-02 | International Business Machines Corporation | Selectively supporting static branch prediction settings only in association with processor-designated types of instructions |
US11086631B2 (en) | 2018-11-30 | 2021-08-10 | Western Digital Technologies, Inc. | Illegal instruction exception handling |
CN109783384A (en) * | 2019-01-10 | 2019-05-21 | 未来电视有限公司 | Log use-case test method, log use-case test device and electronic equipment |
US11182166B2 (en) | 2019-05-23 | 2021-11-23 | Samsung Electronics Co., Ltd. | Branch prediction throughput by skipping over cachelines without branches |
CN110442382B (en) * | 2019-07-31 | 2021-06-15 | 西安芯海微电子科技有限公司 | Prefetch cache control method, device, chip and computer readable storage medium |
CN110727463B (en) * | 2019-09-12 | 2021-08-10 | 无锡江南计算技术研究所 | Zero-level instruction circular buffer prefetching method and device based on dynamic credit |
US11392535B2 (en) | 2019-11-26 | 2022-07-19 | Groq, Inc. | Loading operands and outputting results from a multi-dimensional array using only a single side |
CN112015490A (en) * | 2020-11-02 | 2020-12-01 | 鹏城实验室 | Method, apparatus and medium for programmable device implementing and testing reduced instruction set |
CN113076277A (en) * | 2021-03-26 | 2021-07-06 | 大唐微电子技术有限公司 | Method and device for realizing pipeline scheduling, computer storage medium and terminal |
US11599358B1 (en) | 2021-08-12 | 2023-03-07 | Tenstorrent Inc. | Pre-staged instruction registers for variable length instruction set machine |
US11663007B2 (en) * | 2021-10-01 | 2023-05-30 | Arm Limited | Control of branch prediction for zero-overhead loop |
CN115495155B (en) * | 2022-11-18 | 2023-03-24 | 北京数渡信息科技有限公司 | Hardware circulation processing device suitable for general processor |
CN117193861A (en) * | 2023-11-07 | 2023-12-08 | 芯来智融半导体科技(上海)有限公司 | Instruction processing method, apparatus, computer device and storage medium |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4829460A (en) * | 1986-10-15 | 1989-05-09 | Fujitsu Limited | Barrel shifter |
US4831571A (en) * | 1986-08-11 | 1989-05-16 | Kabushiki Kaisha Toshiba | Barrel shifter for rotating data with or without carry |
US5155698A (en) * | 1990-08-29 | 1992-10-13 | Nec Corporation | Barrel shifter circuit having rotation function |
US5423011A (en) * | 1992-06-11 | 1995-06-06 | International Business Machines Corporation | Apparatus for initializing branch prediction information |
US5493687A (en) * | 1991-07-08 | 1996-02-20 | Seiko Epson Corporation | RISC microprocessor architecture implementing multiple typed register sets |
US5530825A (en) * | 1994-04-15 | 1996-06-25 | Motorola, Inc. | Data processor with branch target address cache and method of operation |
US5808876A (en) * | 1997-06-20 | 1998-09-15 | International Business Machines Corporation | Multi-function power distribution system |
US5896305A (en) * | 1996-02-08 | 1999-04-20 | Texas Instruments Incorporated | Shifter circuit for an arithmetic logic unit in a microprocessor |
US5920711A (en) * | 1995-06-02 | 1999-07-06 | Synopsys, Inc. | System for frame-based protocol, graphical capture, synthesis, analysis, and simulation |
US5978909A (en) * | 1997-11-26 | 1999-11-02 | Intel Corporation | System for speculative branch target prediction having a dynamic prediction history buffer and a static prediction history buffer |
US6292879B1 (en) * | 1995-10-25 | 2001-09-18 | Anthony S. Fong | Method and apparatus to specify access control list and cache enabling and cache coherency requirement enabling on individual operands of an instruction of a computer |
US6550056B1 (en) * | 1999-07-19 | 2003-04-15 | Mitsubishi Denki Kabushiki Kaisha | Source level debugger for debugging source programs |
US6609194B1 (en) * | 1999-11-12 | 2003-08-19 | Ip-First, Llc | Apparatus for performing branch target address calculation based on branch type |
US6622240B1 (en) * | 1999-06-18 | 2003-09-16 | Intrinsity, Inc. | Method and apparatus for pre-branch instruction |
US6774832B1 (en) * | 2003-03-25 | 2004-08-10 | Raytheon Company | Multi-bit output DDS with real time delta sigma modulation look up from memory |
US6823444B1 (en) * | 2001-07-03 | 2004-11-23 | Ip-First, Llc | Apparatus and method for selectively accessing disparate instruction buffer stages based on branch target address cache hit and instruction stage wrap |
US6963554B1 (en) * | 2000-12-27 | 2005-11-08 | National Semiconductor Corporation | Microwire dynamic sequencer pipeline stall |
US20050273559A1 (en) * | 2004-05-19 | 2005-12-08 | Aris Aristodemou | Microprocessor architecture including unified cache debug unit |
Family Cites Families (204)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4342082A (en) | 1977-01-13 | 1982-07-27 | International Business Machines Corp. | Program instruction mechanism for shortened recursive handling of interruptions |
US4216539A (en) | 1978-05-05 | 1980-08-05 | Zehntel, Inc. | In-circuit digital tester |
US4400773A (en) | 1980-12-31 | 1983-08-23 | International Business Machines Corp. | Independent handling of I/O interrupt requests and associated status information transfers |
US4594659A (en) | 1982-10-13 | 1986-06-10 | Honeywell Information Systems Inc. | Method and apparatus for prefetching instructions for a central execution pipeline unit |
US4905178A (en) | 1986-09-19 | 1990-02-27 | Performance Semiconductor Corporation | Fast shifter method and structure |
US4914622A (en) | 1987-04-17 | 1990-04-03 | Advanced Micro Devices, Inc. | Array-organized bit map with a barrel shifter |
US4962500A (en) | 1987-08-28 | 1990-10-09 | Nec Corporation | Data processor including testing structure for a barrel shifter |
KR970005453B1 (en) * | 1987-12-25 | 1997-04-16 | 가부시기가이샤 히다찌세이사꾸쇼 | Data processing apparatus for high speed processing |
US4926323A (en) | 1988-03-03 | 1990-05-15 | Advanced Micro Devices, Inc. | Streamlined instruction processor |
JPH01263820A (en) | 1988-04-15 | 1989-10-20 | Hitachi Ltd | Microprocessor |
DE3886739D1 (en) | 1988-06-02 | 1994-02-10 | Itt Ind Gmbh Deutsche | Device for digital signal processing. |
GB2229832B (en) | 1989-03-30 | 1993-04-07 | Intel Corp | Byte swap instruction for memory format conversion within a microprocessor |
DE69032318T2 (en) * | 1989-08-31 | 1998-09-24 | Canon Kk | Image processing device |
JPH03185530A (en) * | 1989-12-14 | 1991-08-13 | Mitsubishi Electric Corp | Data processor |
EP0436341B1 (en) * | 1990-01-02 | 1997-05-07 | Motorola, Inc. | Sequential prefetch method for 1, 2 or 3 word instructions |
JPH03248226A (en) | 1990-02-26 | 1991-11-06 | Nec Corp | Microprocessor |
JP2560889B2 (en) | 1990-05-22 | 1996-12-04 | 日本電気株式会社 | Microprocessor |
US5155843A (en) * | 1990-06-29 | 1992-10-13 | Digital Equipment Corporation | Error transition mode for multi-processor system |
US5778423A (en) * | 1990-06-29 | 1998-07-07 | Digital Equipment Corporation | Prefetch instruction for improving performance in reduced instruction set processor |
CA2045790A1 (en) | 1990-06-29 | 1991-12-30 | Richard Lee Sites | Branch prediction in high-performance processor |
US5636363A (en) | 1991-06-14 | 1997-06-03 | Integrated Device Technology, Inc. | Hardware control structure and method for off-chip monitoring entries of an on-chip cache |
EP0523898B1 (en) * | 1991-07-08 | 1999-05-06 | Canon Kabushiki Kaisha | Color expressing method, color image reading apparatus and color image processing apparatus |
US5539911A (en) | 1991-07-08 | 1996-07-23 | Seiko Epson Corporation | High-performance, superscalar-based computer system with out-of-order instruction execution |
US5450586A (en) * | 1991-08-14 | 1995-09-12 | Hewlett-Packard Company | System for analyzing and debugging embedded software through dynamic and interactive use of code markers |
CA2073516A1 (en) | 1991-11-27 | 1993-05-28 | Peter Michael Kogge | Dynamic multi-mode parallel processor array architecture computer system |
US5485625A (en) | 1992-06-29 | 1996-01-16 | Ford Motor Company | Method and apparatus for monitoring external events during a microprocessor's sleep mode |
US5274770A (en) | 1992-07-29 | 1993-12-28 | Tritech Microelectronics International Pte Ltd. | Flexible register-based I/O microcontroller with single cycle instruction execution |
US5294928A (en) | 1992-08-31 | 1994-03-15 | Microchip Technology Incorporated | A/D converter with zero power mode |
US5333119A (en) | 1992-09-30 | 1994-07-26 | Regents Of The University Of Minnesota | Digital signal processor with delayed-evaluation array multipliers and low-power memory addressing |
US5542074A (en) | 1992-10-22 | 1996-07-30 | Maspar Computer Corporation | Parallel processor system with highly flexible local control capability, including selective inversion of instruction signal and control of bit shift amount |
US5696958A (en) * | 1993-01-11 | 1997-12-09 | Silicon Graphics, Inc. | Method and apparatus for reducing delays following the execution of a branch instruction in an instruction pipeline |
GB2275119B (en) | 1993-02-03 | 1997-05-14 | Motorola Inc | A cached processor |
US5577217A (en) * | 1993-05-14 | 1996-11-19 | Intel Corporation | Method and apparatus for a branch target buffer with shared branch pattern tables for associated branch predictions |
JPH06332693A (en) | 1993-05-27 | 1994-12-02 | Hitachi Ltd | Issuing system of suspending instruction with time-out function |
US5454117A (en) | 1993-08-25 | 1995-09-26 | Nexgen, Inc. | Configurable branch prediction for a processor performing speculative execution |
US5584031A (en) | 1993-11-09 | 1996-12-10 | Motorola Inc. | System and method for executing a low power delay instruction |
JP2801135B2 (en) | 1993-11-26 | 1998-09-21 | 富士通株式会社 | Instruction reading method and instruction reading device for pipeline processor |
US5590350A (en) | 1993-11-30 | 1996-12-31 | Texas Instruments Incorporated | Three input arithmetic logic unit with mask generator |
US6116768A (en) | 1993-11-30 | 2000-09-12 | Texas Instruments Incorporated | Three input arithmetic logic unit with barrel rotator |
US5509129A (en) | 1993-11-30 | 1996-04-16 | Guttag; Karl M. | Long instruction word controlling plural independent processor operations |
US5590351A (en) | 1994-01-21 | 1996-12-31 | Advanced Micro Devices, Inc. | Superscalar execution unit for sequential instruction pointer updates and segment limit checks |
TW253946B (en) * | 1994-02-04 | 1995-08-11 | Ibm | Data processor with branch prediction and method of operation |
JPH07253922A (en) * | 1994-03-14 | 1995-10-03 | Texas Instr Japan Ltd | Address generating circuit |
US5517436A (en) | 1994-06-07 | 1996-05-14 | Andreas; David C. | Digital signal processor for audio applications |
US5809293A (en) * | 1994-07-29 | 1998-09-15 | International Business Machines Corporation | System and method for program execution tracing within an integrated processor |
US5566357A (en) | 1994-10-06 | 1996-10-15 | Qualcomm Incorporated | Power reduction in a cellular radiotelephone |
US5692168A (en) * | 1994-10-18 | 1997-11-25 | Cyrix Corporation | Prefetch buffer using flow control bit to identify changes of flow within the code stream |
JPH08202469A (en) | 1995-01-30 | 1996-08-09 | Fujitsu Ltd | Microcontroller unit equipped with universal asynchronous transmitting and receiving circuit |
US5600674A (en) | 1995-03-02 | 1997-02-04 | Motorola Inc. | Method and apparatus of an enhanced digital signal processor |
US5655122A (en) * | 1995-04-05 | 1997-08-05 | Sequent Computer Systems, Inc. | Optimizing compiler with static prediction of branch probability, branch frequency and function frequency |
US5835753A (en) | 1995-04-12 | 1998-11-10 | Advanced Micro Devices, Inc. | Microprocessor with dynamically extendable pipeline stages and a classifying circuit |
US5659752A (en) * | 1995-06-30 | 1997-08-19 | International Business Machines Corporation | System and method for improving branch prediction in compiled program code |
US5842004A (en) | 1995-08-04 | 1998-11-24 | Sun Microsystems, Inc. | Method and apparatus for decompression of compressed geometric three-dimensional graphics data |
US5768602A (en) | 1995-08-04 | 1998-06-16 | Apple Computer, Inc. | Sleep mode controller for power management |
US5727211A (en) * | 1995-11-09 | 1998-03-10 | Chromatic Research, Inc. | System and method for fast context switching between tasks |
US5774709A (en) | 1995-12-06 | 1998-06-30 | Lsi Logic Corporation | Enhanced branch delay slot handling with single exception program counter |
US5778438A (en) | 1995-12-06 | 1998-07-07 | Intel Corporation | Method and apparatus for maintaining cache coherency in a computer system with a highly pipelined bus and multiple conflicting snoop requests |
US5996071A (en) | 1995-12-15 | 1999-11-30 | Via-Cyrix, Inc. | Detecting self-modifying code in a pipelined processor with branch processing by comparing latched store address to subsequent target address |
JP3663710B2 (en) * | 1996-01-17 | 2005-06-22 | ヤマハ株式会社 | Program generation method and processor interrupt control method |
JPH09261490A (en) * | 1996-03-22 | 1997-10-03 | Minolta Co Ltd | Image forming device |
US5752014A (en) * | 1996-04-29 | 1998-05-12 | International Business Machines Corporation | Automatic selection of branch prediction methodology for subsequent branch instruction based on outcome of previous branch prediction |
US5784636A (en) | 1996-05-28 | 1998-07-21 | National Semiconductor Corporation | Reconfigurable computer architecture for use in signal processing applications |
US20010025337A1 (en) | 1996-06-10 | 2001-09-27 | Frank Worrell | Microprocessor including a mode detector for setting compression mode |
US5826079A (en) | 1996-07-05 | 1998-10-20 | Ncr Corporation | Method for improving the execution efficiency of frequently communicating processes utilizing affinity process scheduling by identifying and assigning the frequently communicating processes to the same processor |
US5805876A (en) | 1996-09-30 | 1998-09-08 | International Business Machines Corporation | Method and system for reducing average branch resolution time and effective misprediction penalty in a processor |
US5964884A (en) | 1996-09-30 | 1999-10-12 | Advanced Micro Devices, Inc. | Self-timed pulse control circuit |
US5848264A (en) * | 1996-10-25 | 1998-12-08 | S3 Incorporated | Debug and video queue for multi-processor chip |
GB2320388B (en) | 1996-11-29 | 1999-03-31 | Sony Corp | Image processing apparatus |
US6061521A (en) | 1996-12-02 | 2000-05-09 | Compaq Computer Corp. | Computer having multimedia operations executable as two distinct sets of operations within a single instruction cycle |
US5909572A (en) | 1996-12-02 | 1999-06-01 | Compaq Computer Corp. | System and method for conditionally moving an operand from a source register to a destination register |
EP0855645A3 (en) * | 1996-12-31 | 2000-05-24 | Texas Instruments Incorporated | System and method for speculative execution of instructions with data prefetch |
KR100236533B1 (en) | 1997-01-16 | 2000-01-15 | 윤종용 | Digital signal processor |
EP0855718A1 (en) | 1997-01-28 | 1998-07-29 | Hewlett-Packard Company | Memory low power mode control |
US6185732B1 (en) | 1997-04-08 | 2001-02-06 | Advanced Micro Devices, Inc. | Software debug port for a microprocessor |
US6154857A (en) | 1997-04-08 | 2000-11-28 | Advanced Micro Devices, Inc. | Microprocessor-based device incorporating a cache for capturing software performance profiling data |
US6584525B1 (en) | 1998-11-19 | 2003-06-24 | Edwin E. Klingman | Adaptation of standard microprocessor architectures via an interface to a configurable subsystem |
US6021500A (en) | 1997-05-07 | 2000-02-01 | Intel Corporation | Processor with sleep and deep sleep modes |
US5931950A (en) | 1997-06-17 | 1999-08-03 | Pc-Tel, Inc. | Wake-up-on-ring power conservation for host signal processing communication system |
US5950120A (en) | 1997-06-17 | 1999-09-07 | Lsi Logic Corporation | Apparatus and method for shutdown of wireless communications mobile station with multiple clocks |
US6035374A (en) | 1997-06-25 | 2000-03-07 | Sun Microsystems, Inc. | Method of executing coded instructions in a multiprocessor having shared execution resources including active, nap, and sleep states in accordance with cache miss latency |
US6088786A (en) | 1997-06-27 | 2000-07-11 | Sun Microsystems, Inc. | Method and system for coupling a stack based processor to register based functional unit |
US5878264A (en) | 1997-07-17 | 1999-03-02 | Sun Microsystems, Inc. | Power sequence controller with wakeup logic for enabling a wakeup interrupt handler procedure |
US6026478A (en) | 1997-08-01 | 2000-02-15 | Micron Technology, Inc. | Split embedded DRAM processor |
US6760833B1 (en) | 1997-08-01 | 2004-07-06 | Micron Technology, Inc. | Split embedded DRAM processor |
US6226738B1 (en) | 1997-08-01 | 2001-05-01 | Micron Technology, Inc. | Split embedded DRAM processor |
US6157988A (en) * | 1997-08-01 | 2000-12-05 | Micron Technology, Inc. | Method and apparatus for high performance branching in pipelined microsystems |
JPH1185515A (en) * | 1997-09-10 | 1999-03-30 | Ricoh Co Ltd | Microprocessor |
JPH11143571A (en) | 1997-11-05 | 1999-05-28 | Mitsubishi Electric Corp | Data processor |
US6044458A (en) * | 1997-12-12 | 2000-03-28 | Motorola, Inc. | System for monitoring program flow utilizing fixwords stored sequentially to opcodes |
US6014743A (en) | 1998-02-05 | 2000-01-11 | Integrated Device Technology, Inc. | Apparatus and method for recording a floating point error pointer in zero cycles |
US6151672A (en) * | 1998-02-23 | 2000-11-21 | Hewlett-Packard Company | Methods and apparatus for reducing interference in a branch history table of a microprocessor |
US6374349B2 (en) | 1998-03-19 | 2002-04-16 | Mcfarling Scott | Branch predictor with serially connected predictor stages for improving branch prediction accuracy |
US6289417B1 (en) | 1998-05-18 | 2001-09-11 | Arm Limited | Operand supply to an execution unit |
US6308279B1 (en) | 1998-05-22 | 2001-10-23 | Intel Corporation | Method and apparatus for power mode transition in a multi-thread processor |
JPH11353225A (en) | 1998-05-26 | 1999-12-24 | Internatl Business Mach Corp <Ibm> | Memory that processor addressing gray code system in sequential execution style accesses and method for storing code and data in memory |
US6466333B2 (en) * | 1998-06-26 | 2002-10-15 | Canon Kabushiki Kaisha | Streamlined tetrahedral interpolation |
US20020053015A1 (en) | 1998-07-14 | 2002-05-02 | Sony Corporation And Sony Electronics Inc. | Digital signal processor particularly suited for decoding digital audio |
US6327651B1 (en) | 1998-09-08 | 2001-12-04 | International Business Machines Corporation | Wide shifting in the vector permute unit |
US6253287B1 (en) * | 1998-09-09 | 2001-06-26 | Advanced Micro Devices, Inc. | Using three-dimensional storage to make variable-length instructions appear uniform in two dimensions |
US6240521B1 (en) | 1998-09-10 | 2001-05-29 | International Business Machines Corp. | Sleep mode transition between processors sharing an instruction set and an address space |
US6347379B1 (en) | 1998-09-25 | 2002-02-12 | Intel Corporation | Reducing power consumption of an electronic device |
US6339822B1 (en) * | 1998-10-02 | 2002-01-15 | Advanced Micro Devices, Inc. | Using padded instructions in a block-oriented cache |
US6862563B1 (en) | 1998-10-14 | 2005-03-01 | Arc International | Method and apparatus for managing the configuration and functionality of a semiconductor design |
US6671743B1 (en) * | 1998-11-13 | 2003-12-30 | Creative Technology, Ltd. | Method and system for exposing proprietary APIs in a privileged device driver to an application |
WO2000031652A2 (en) * | 1998-11-20 | 2000-06-02 | Altera Corporation | Reconfigurable programmable logic device computer system |
US6189091B1 (en) * | 1998-12-02 | 2001-02-13 | Ip First, L.L.C. | Apparatus and method for speculatively updating global history and restoring same on branch misprediction detection |
US6341348B1 (en) | 1998-12-03 | 2002-01-22 | Sun Microsystems, Inc. | Software branch prediction filtering for a microprocessor |
US6957327B1 (en) * | 1998-12-31 | 2005-10-18 | Stmicroelectronics, Inc. | Block-based branch target buffer |
US6826748B1 (en) * | 1999-01-28 | 2004-11-30 | Ati International Srl | Profiling program execution into registers of a computer |
US6477683B1 (en) | 1999-02-05 | 2002-11-05 | Tensilica, Inc. | Automated processor generation system for designing a configurable processor and method for the same |
US6418530B2 (en) | 1999-02-18 | 2002-07-09 | Hewlett-Packard Company | Hardware/software system for instruction profiling and trace selection using branch history information for branch predictions |
US6499101B1 (en) * | 1999-03-18 | 2002-12-24 | I.P. First L.L.C. | Static branch prediction mechanism for conditional branch instructions |
US6427206B1 (en) * | 1999-05-03 | 2002-07-30 | Intel Corporation | Optimized branch predictions for strongly predicted compiler branches |
US6560754B1 (en) | 1999-05-13 | 2003-05-06 | Arc International Plc | Method and apparatus for jump control in a pipelined processor |
US6438700B1 (en) | 1999-05-18 | 2002-08-20 | Koninklijke Philips Electronics N.V. | System and method to reduce power consumption in advanced RISC machine (ARM) based systems |
US6772325B1 (en) * | 1999-10-01 | 2004-08-03 | Hitachi, Ltd. | Processor architecture and operation for exploiting improved branch control instruction |
US6571333B1 (en) | 1999-11-05 | 2003-05-27 | Intel Corporation | Initializing a memory controller by executing software in second memory to wakeup a system |
US6546481B1 (en) | 1999-11-05 | 2003-04-08 | Ip - First Llc | Split history tables for branch prediction |
US6909744B2 (en) | 1999-12-09 | 2005-06-21 | Redrock Semiconductor, Inc. | Processor architecture for compression and decompression of video and images |
KR100395763B1 (en) | 2000-02-01 | 2003-08-25 | 삼성전자주식회사 | A branch predictor for microprocessor having multiple processes |
US6412038B1 (en) | 2000-02-14 | 2002-06-25 | Intel Corporation | Integral modular cache for a processor |
JP2001282548A (en) | 2000-03-29 | 2001-10-12 | Matsushita Electric Ind Co Ltd | Communication equipment and communication method |
US6519696B1 (en) | 2000-03-30 | 2003-02-11 | I.P. First, Llc | Paired register exchange using renaming register map |
US6681295B1 (en) * | 2000-08-31 | 2004-01-20 | Hewlett-Packard Development Company, L.P. | Fast lane prefetching |
US6718460B1 (en) | 2000-09-05 | 2004-04-06 | Sun Microsystems, Inc. | Mechanism for error handling in a computer system |
US20030070013A1 (en) | 2000-10-27 | 2003-04-10 | Daniel Hansson | Method and apparatus for reducing power consumption in a digital processor |
US6948054B2 (en) | 2000-11-29 | 2005-09-20 | Lsi Logic Corporation | Simple branch prediction and misprediction recovery method |
TW477954B (en) | 2000-12-05 | 2002-03-01 | Faraday Tech Corp | Memory data accessing architecture and method for a processor |
US20020073301A1 (en) | 2000-12-07 | 2002-06-13 | International Business Machines Corporation | Hardware for use with compiler generated branch information |
US7139903B2 (en) | 2000-12-19 | 2006-11-21 | Hewlett-Packard Development Company, L.P. | Conflict free parallel read access to a bank interleaved branch predictor in a processor |
US6877089B2 (en) * | 2000-12-27 | 2005-04-05 | International Business Machines Corporation | Branch prediction apparatus and process for restoring replaced branch history for use in future branch predictions for an executing program |
US20020087851A1 (en) | 2000-12-28 | 2002-07-04 | Matsushita Electric Industrial Co., Ltd. | Microprocessor and an instruction converter |
US8285976B2 (en) * | 2000-12-28 | 2012-10-09 | Micron Technology, Inc. | Method and apparatus for predicting branches using a meta predictor |
US7039901B2 (en) | 2001-01-24 | 2006-05-02 | Texas Instruments Incorporated | Software shared memory bus |
US6925634B2 (en) * | 2001-01-24 | 2005-08-02 | Texas Instruments Incorporated | Method for maintaining cache coherency in software in a shared memory system |
US6823447B2 (en) * | 2001-03-01 | 2004-11-23 | International Business Machines Corporation | Software hint to improve the branch target prediction accuracy |
CA2478570A1 (en) | 2001-03-02 | 2002-09-12 | Atsana Semiconductor Corp. | Data processing apparatus and system and method for controlling memory access |
JP3890910B2 (en) | 2001-03-21 | 2007-03-07 | 株式会社日立製作所 | Instruction execution result prediction device |
US7010558B2 (en) | 2001-04-19 | 2006-03-07 | Arc International | Data processor with enhanced instruction execution and method |
US20020194461A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative branch target address cache |
US6886093B2 (en) * | 2001-05-04 | 2005-04-26 | Ip-First, Llc | Speculative hybrid branch direction predictor |
US7165169B2 (en) * | 2001-05-04 | 2007-01-16 | Ip-First, Llc | Speculative branch target address cache with selective override by secondary predictor based on branch instruction type |
US7165168B2 (en) | 2003-01-14 | 2007-01-16 | Ip-First, Llc | Microprocessor with branch target address cache update queue |
US7200740B2 (en) * | 2001-05-04 | 2007-04-03 | Ip-First, Llc | Apparatus and method for speculatively performing a return instruction in a microprocessor |
US20020194462A1 (en) | 2001-05-04 | 2002-12-19 | Ip First Llc | Apparatus and method for selecting one of multiple target addresses stored in a speculative branch target address cache per instruction cache line |
GB0112275D0 (en) | 2001-05-21 | 2001-07-11 | Micron Technology Inc | Method and circuit for normalization of floating point significands in a simd array mpp |
GB0112269D0 (en) | 2001-05-21 | 2001-07-11 | Micron Technology Inc | Method and circuit for alignment of floating point significands in a simd array mpp |
JP3805339B2 (en) * | 2001-06-29 | 2006-08-02 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Method for predicting branch target, processor, and compiler |
US7162619B2 (en) * | 2001-07-03 | 2007-01-09 | Ip-First, Llc | Apparatus and method for densely packing a branch instruction predicted by a branch target address cache and associated target instructions into a byte-wide instruction buffer |
US7010675B2 (en) * | 2001-07-27 | 2006-03-07 | Stmicroelectronics, Inc. | Fetch branch architecture for reducing branch penalty without branch prediction |
US7191445B2 (en) | 2001-08-31 | 2007-03-13 | Texas Instruments Incorporated | Method using embedded real-time analysis components with corresponding real-time operating system software objects |
US6751331B2 (en) | 2001-10-11 | 2004-06-15 | United Global Sourcing Incorporated | Communication headset |
JP2003131902A (en) | 2001-10-24 | 2003-05-09 | Toshiba Corp | Software debugger, system-level debugger, debug method and debug program |
US7051239B2 (en) * | 2001-12-28 | 2006-05-23 | Hewlett-Packard Development Company, L.P. | Method and apparatus for efficiently implementing trace and/or logic analysis mechanisms on a processor chip |
KR100718754B1 (en) | 2002-01-31 | 2007-05-15 | 에이알씨 인터내셔널 | Configurable data processor with multi-length instruction set architecture |
US7168067B2 (en) | 2002-02-08 | 2007-01-23 | Agere Systems Inc. | Multiprocessor system with cache-based software breakpoints |
US7181596B2 (en) | 2002-02-12 | 2007-02-20 | Ip-First, Llc | Apparatus and method for extending a microprocessor instruction set |
US7529912B2 (en) | 2002-02-12 | 2009-05-05 | Via Technologies, Inc. | Apparatus and method for instruction-level specification of floating point format |
US7328328B2 (en) | 2002-02-19 | 2008-02-05 | Ip-First, Llc | Non-temporal memory reference control mechanism |
US7315921B2 (en) | 2002-02-19 | 2008-01-01 | Ip-First, Llc | Apparatus and method for selective memory attribute control |
US7546446B2 (en) | 2002-03-08 | 2009-06-09 | Ip-First, Llc | Selective interrupt suppression |
US7395412B2 (en) | 2002-03-08 | 2008-07-01 | Ip-First, Llc | Apparatus and method for extending data modes in a microprocessor |
US7302551B2 (en) | 2002-04-02 | 2007-11-27 | Ip-First, Llc | Suppression of store checking |
US7380103B2 (en) | 2002-04-02 | 2008-05-27 | Ip-First, Llc | Apparatus and method for selective control of results write back |
US7185180B2 (en) | 2002-04-02 | 2007-02-27 | Ip-First, Llc | Apparatus and method for selective control of condition code write back |
US7373483B2 (en) | 2002-04-02 | 2008-05-13 | Ip-First, Llc | Mechanism for extending the number of registers in a microprocessor |
US7155598B2 (en) | 2002-04-02 | 2006-12-26 | Ip-First, Llc | Apparatus and method for conditional instruction execution |
US7380109B2 (en) | 2002-04-15 | 2008-05-27 | Ip-First, Llc | Apparatus and method for providing extended address modes in an existing instruction set for a microprocessor |
US20030204705A1 (en) | 2002-04-30 | 2003-10-30 | Oldfield William H. | Prediction of branch instructions in a data processing apparatus |
KR100450753B1 (en) | 2002-05-17 | 2004-10-01 | 한국전자통신연구원 | Programmable variable length decoder including interface of CPU processor |
US6938151B2 (en) | 2002-06-04 | 2005-08-30 | International Business Machines Corporation | Hybrid branch prediction using a global selection counter and a prediction method comparison table |
US6718504B1 (en) * | 2002-06-05 | 2004-04-06 | Arc International | Method and apparatus for implementing a data processor adapted for turbo decoding |
US7493480B2 (en) * | 2002-07-18 | 2009-02-17 | International Business Machines Corporation | Method and apparatus for prefetching branch history information |
US7000095B2 (en) * | 2002-09-06 | 2006-02-14 | Mips Technologies, Inc. | Method and apparatus for clearing hazards using jump instructions |
US20050125634A1 (en) | 2002-10-04 | 2005-06-09 | Fujitsu Limited | Processor and instruction control method |
US6968444B1 (en) | 2002-11-04 | 2005-11-22 | Advanced Micro Devices, Inc. | Microprocessor employing a fixed position dispatch unit |
US7266676B2 (en) * | 2003-03-21 | 2007-09-04 | Analog Devices, Inc. | Method and apparatus for branch prediction based on branch targets utilizing tag and data arrays |
US20040193855A1 (en) | 2003-03-31 | 2004-09-30 | Nicolas Kacevas | System and method for branch prediction access |
US7590829B2 (en) | 2003-03-31 | 2009-09-15 | Stretch, Inc. | Extension adapter |
US7174444B2 (en) | 2003-03-31 | 2007-02-06 | Intel Corporation | Preventing a read of a next sequential chunk in branch prediction of a subject chunk |
US20040225870A1 (en) * | 2003-05-07 | 2004-11-11 | Srinivasan Srikanth T. | Method and apparatus for reducing wrong path execution in a speculative multi-threaded processor |
US7010676B2 (en) * | 2003-05-12 | 2006-03-07 | International Business Machines Corporation | Last iteration loop branch prediction upon counter threshold and resolution upon counter one |
US20040255104A1 (en) * | 2003-06-12 | 2004-12-16 | Intel Corporation | Method and apparatus for recycling candidate branch outcomes after a wrong-path execution in a superscalar processor |
US7668897B2 (en) | 2003-06-16 | 2010-02-23 | Arm Limited | Result partitioning within SIMD data processing systems |
US7783871B2 (en) | 2003-06-30 | 2010-08-24 | Intel Corporation | Method to remove stale branch predictions for an instruction prior to execution within a microprocessor |
US7373642B2 (en) | 2003-07-29 | 2008-05-13 | Stretch, Inc. | Defining instruction extensions in a standard programming language |
US20050027974A1 (en) | 2003-07-31 | 2005-02-03 | Oded Lempel | Method and system for conserving resources in an instruction pipeline |
US7133950B2 (en) | 2003-08-19 | 2006-11-07 | Sun Microsystems, Inc. | Request arbitration in multi-core processor |
JP2005078234A (en) | 2003-08-29 | 2005-03-24 | Renesas Technology Corp | Information processor |
US7237098B2 (en) * | 2003-09-08 | 2007-06-26 | Ip-First, Llc | Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence |
US20050066305A1 (en) * | 2003-09-22 | 2005-03-24 | Lisanke Robert John | Method and machine for efficient simulation of digital hardware within a software development environment |
KR100980076B1 (en) | 2003-10-24 | 2010-09-06 | 삼성전자주식회사 | System and method for branch prediction with low-power consumption |
US7363544B2 (en) | 2003-10-30 | 2008-04-22 | International Business Machines Corporation | Program debug method and apparatus |
US8069336B2 (en) | 2003-12-03 | 2011-11-29 | Globalfoundries Inc. | Transitioning from instruction cache to trace cache on label boundaries |
US7219207B2 (en) * | 2003-12-03 | 2007-05-15 | Intel Corporation | Reconfigurable trace cache |
US7401328B2 (en) | 2003-12-18 | 2008-07-15 | Lsi Corporation | Software-implemented grouping techniques for use in a superscalar data processing system |
US7293164B2 (en) * | 2004-01-14 | 2007-11-06 | International Business Machines Corporation | Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions |
US8607209B2 (en) | 2004-02-04 | 2013-12-10 | Bluerisc Inc. | Energy-focused compiler-assisted branch prediction |
US7613911B2 (en) | 2004-03-12 | 2009-11-03 | Arm Limited | Prefetching exception vectors by early lookup exception vectors within a cache memory |
US20050216713A1 (en) * | 2004-03-25 | 2005-09-29 | International Business Machines Corporation | Instruction text controlled selectively stated branches for prediction via a branch target buffer |
US7281120B2 (en) * | 2004-03-26 | 2007-10-09 | International Business Machines Corporation | Apparatus and method for decreasing the latency between an instruction cache and a pipeline processor |
US20050223202A1 (en) | 2004-03-31 | 2005-10-06 | Intel Corporation | Branch prediction in a pipelined processor |
US20060015706A1 (en) | 2004-06-30 | 2006-01-19 | Chunrong Lai | TLB correlated branch predictor and method for use thereof |
TWI305323B (en) | 2004-08-23 | 2009-01-11 | Faraday Tech Corp | Method for verification branch prediction mechanisms and readable recording medium for storing program thereof |
2005
- 2005-05-19: US application US11/132,428, published as US20050278517A1 (not active, Abandoned)
- 2005-05-19: GB application GB0622477A, published as GB2428842A (not active, Withdrawn)
- 2005-05-19: US application US11/132,447, published as US20050278505A1 (not active, Abandoned)
- 2005-05-19: US application US11/132,448, published as US20050289323A1 (not active, Abandoned)
- 2005-05-19: TW application TW094116302A, published as TW200602974A (status unknown)
- 2005-05-19: CN application CNA2005800215322A, published as CN101002169A (active, Pending)
- 2005-05-19: US application US11/132,423, published as US20050278513A1 (not active, Abandoned)
- 2005-05-19: WO application PCT/US2005/017586, published as WO2005114441A2 (active, Application Filing)
- 2005-05-19: US application US11/132,432, published as US20050273559A1 (not active, Abandoned)
- 2005-05-19: US application US11/132,424, published as US8719837B2 (Active)

2014
- 2014-03-21: US application US14/222,194, published as US9003422B2 (Active)
Patent Citations (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4831571A (en) * | 1986-08-11 | 1989-05-16 | Kabushiki Kaisha Toshiba | Barrel shifter for rotating data with or without carry |
US4829460A (en) * | 1986-10-15 | 1989-05-09 | Fujitsu Limited | Barrel shifter |
US5155698A (en) * | 1990-08-29 | 1992-10-13 | Nec Corporation | Barrel shifter circuit having rotation function |
US5493687A (en) * | 1991-07-08 | 1996-02-20 | Seiko Epson Corporation | RISC microprocessor architecture implementing multiple typed register sets |
US5423011A (en) * | 1992-06-11 | 1995-06-06 | International Business Machines Corporation | Apparatus for initializing branch prediction information |
US5530825A (en) * | 1994-04-15 | 1996-06-25 | Motorola, Inc. | Data processor with branch target address cache and method of operation |
US5920711A (en) * | 1995-06-02 | 1999-07-06 | Synopsys, Inc. | System for frame-based protocol, graphical capture, synthesis, analysis, and simulation |
US6292879B1 (en) * | 1995-10-25 | 2001-09-18 | Anthony S. Fong | Method and apparatus to specify access control list and cache enabling and cache coherency requirement enabling on individual operands of an instruction of a computer |
US5896305A (en) * | 1996-02-08 | 1999-04-20 | Texas Instruments Incorporated | Shifter circuit for an arithmetic logic unit in a microprocessor |
US5808876A (en) * | 1997-06-20 | 1998-09-15 | International Business Machines Corporation | Multi-function power distribution system |
US5978909A (en) * | 1997-11-26 | 1999-11-02 | Intel Corporation | System for speculative branch target prediction having a dynamic prediction history buffer and a static prediction history buffer |
US6622240B1 (en) * | 1999-06-18 | 2003-09-16 | Intrinsity, Inc. | Method and apparatus for pre-branch instruction |
US6550056B1 (en) * | 1999-07-19 | 2003-04-15 | Mitsubishi Denki Kabushiki Kaisha | Source level debugger for debugging source programs |
US6609194B1 (en) * | 1999-11-12 | 2003-08-19 | Ip-First, Llc | Apparatus for performing branch target address calculation based on branch type |
US6963554B1 (en) * | 2000-12-27 | 2005-11-08 | National Semiconductor Corporation | Microwire dynamic sequencer pipeline stall |
US6823444B1 (en) * | 2001-07-03 | 2004-11-23 | Ip-First, Llc | Apparatus and method for selectively accessing disparate instruction buffer stages based on branch target address cache hit and instruction stage wrap |
US6774832B1 (en) * | 2003-03-25 | 2004-08-10 | Raytheon Company | Multi-bit output DDS with real time delta sigma modulation look up from memory |
US20050273559A1 (en) * | 2004-05-19 | 2005-12-08 | Aris Aristodemou | Microprocessor architecture including unified cache debug unit |
US20050278517A1 (en) * | 2004-05-19 | 2005-12-15 | Kar-Lik Wong | Systems and methods for performing branch prediction in a variable length instruction set microprocessor |
US20050278513A1 (en) * | 2004-05-19 | 2005-12-15 | Aris Aristodemou | Systems and methods of dynamic branch prediction in a microprocessor |
US20050278505A1 (en) * | 2004-05-19 | 2005-12-15 | Lim Seow C | Microprocessor architecture including zero impact predictive data pre-fetch mechanism for pipeline data memory |
US20050289321A1 (en) * | 2004-05-19 | 2005-12-29 | James Hakewill | Microprocessor architecture having extendible logic |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050278505A1 (en) * | 2004-05-19 | 2005-12-15 | Lim Seow C | Microprocessor architecture including zero impact predictive data pre-fetch mechanism for pipeline data memory |
US8719837B2 (en) | 2004-05-19 | 2014-05-06 | Synopsys, Inc. | Microprocessor architecture having extendible logic |
US9003422B2 (en) | 2004-05-19 | 2015-04-07 | Synopsys, Inc. | Microprocessor architecture having extendible logic |
US7971042B2 (en) | 2005-09-28 | 2011-06-28 | Synopsys, Inc. | Microprocessor system and method for instruction-initiated recording and execution of instruction sequences in a dynamically decoupleable extended instruction pipeline |
US9959247B1 (en) | 2017-02-17 | 2018-05-01 | Google Llc | Permuting in a matrix-vector processor |
US10216705B2 (en) | 2017-02-17 | 2019-02-26 | Google Llc | Permuting in a matrix-vector processor |
US10592583B2 (en) | 2017-02-17 | 2020-03-17 | Google Llc | Permuting in a matrix-vector processor |
US10614151B2 (en) | 2017-02-17 | 2020-04-07 | Google Llc | Permuting in a matrix-vector processor |
US10956537B2 (en) | 2017-02-17 | 2021-03-23 | Google Llc | Permuting in a matrix-vector processor |
US11748443B2 (en) | 2017-02-17 | 2023-09-05 | Google Llc | Permuting in a matrix-vector processor |
Also Published As
Publication number | Publication date |
---|---|
WO2005114441A3 (en) | 2007-01-18 |
CN101002169A (en) | 2007-07-18 |
US9003422B2 (en) | 2015-04-07 |
US20050273559A1 (en) | 2005-12-08 |
TW200602974A (en) | 2006-01-16 |
US20050278505A1 (en) | 2005-12-15 |
US20050278513A1 (en) | 2005-12-15 |
GB2428842A (en) | 2007-02-07 |
US8719837B2 (en) | 2014-05-06 |
WO2005114441A2 (en) | 2005-12-01 |
US20050278517A1 (en) | 2005-12-15 |
US20050289321A1 (en) | 2005-12-29 |
US20140208087A1 (en) | 2014-07-24 |
GB0622477D0 (en) | 2006-12-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20050289323A1 (en) | Barrel shifter for a microprocessor | |
KR100236525B1 (en) | Multifunction data aligner in wide data width processor | |
US5995122A (en) | Method and apparatus for parallel conversion of color values from a single precision floating point format to an integer format | |
US7685408B2 (en) | Methods and apparatus for extracting bits of a source register based on a mask and right justifying the bits into a target register | |
US6502115B2 (en) | Conversion between packed floating point data and packed 32-bit integer data in different architectural registers | |
US20090100247A1 (en) | Simd permutations with extended range in a data processor | |
US6247116B1 (en) | Conversion from packed floating point data to packed 16-bit integer data in different architectural registers | |
US7962718B2 (en) | Methods for performing extended table lookups using SIMD vector permutation instructions that support out-of-range index values | |
KR20080049825A (en) | Fast rotator with embeded masking and method therefor | |
EP2130132B1 (en) | Dsp including a compute unit with an internal bit fifo circuit | |
US6877019B2 (en) | Barrel shifter | |
EP0264048B1 (en) | Thirty-two bit bit-slice | |
US9996317B2 (en) | High speed and low power circuit structure for barrel shifter | |
US20120005458A1 (en) | Fast Static Rotator/Shifter with Non Two's Complemented Decode and Fast Mask Generation | |
US5903779A (en) | System and method for efficient packing data into an output buffer | |
US6332188B1 (en) | Digital signal processor with bit FIFO | |
US20020065860A1 (en) | Data processing apparatus and method for saturating data values | |
US5572682A (en) | Control logic for a sequential data buffer using byte read-enable lines to define and shift the access window | |
US5416731A (en) | High-speed barrel shifter | |
US5822231A (en) | Ternary based shifter that supports multiple data types for shift functions | |
US6687262B1 (en) | Distributed MUX scheme for bi-endian rotator circuit | |
US5559730A (en) | Shift operation unit and shift operation method | |
US7014122B2 (en) | Method and apparatus for performing bit-aligned permute | |
US20050256996A1 (en) | Register read circuit using the remainders of modulo of a register number by the number of register sub-banks | |
JPH0450615B2 (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: ARC INTERNATIONAL, UNITED KINGDOM. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: WONG, KAR-LIK; TOPHAM, NIGEL; REEL/FRAME: 016910/0262; SIGNING DATES FROM 20050714 TO 20050721 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |