US5870320A - Method for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow - Google Patents

Publication number
US5870320A
Authority
US
United States
Prior art keywords
signal
mask
bit
bit positions
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US08/881,720
Inventor
Vladimir Y. Volkonsky
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle America Inc
Original Assignee
Sun Microsystems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Microsystems Inc
Priority to US08/881,720 (US5870320A)
Priority to US08/881,510 (US5917740A)
Assigned to SUN MICROSYSTEMS, INC. Assignment of assignors interest (see document for details). Assignors: VOLKONSKY, VLADIMIR YU
Application granted
Publication of US5870320A
Assigned to Oracle America, Inc. Merger and change of name (see document for details). Assignors: Oracle America, Inc., ORACLE USA, INC., SUN MICROSYSTEMS, INC.
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/499 Denomination or exception handling, e.g. rounding or overflow
    • G06F7/49905 Exception handling
    • G06F7/4991 Overflow or underflow
    • G06F7/49921 Saturation, i.e. clipping the result to a minimum or maximum value

Definitions

  • This invention relates to a method for optimizing overflow checking and reduction of data signals represented as signed 16-bit integers.
  • Multimedia signals include audio and pixel ("picture element") signals, among other things, which may be sufficiently represented using binary data of no more than eight bits of resolution. Binary data having greater widths may also be used but are often limited to intermediate results for advanced data manipulation, since such data formats increase the load on instruction execution, resulting in slower rates of data manipulation by a processor.
  • a computer system running a video application may represent color pixels through four signed 16-bit signals, one for each of the three primary color values of red, green, and blue and one for an intensity value. This translates to a large number of data signals required to represent an image for display on a computer screen, even when restricting pixel data widths to eight bits. For example, displaying a digital NTSC video signal in real time on a computer monitor requires a pixel rate of 10.4 million pixels per second. With three data signals to manipulate per pixel, this translates to about 30 million pieces of data to manipulate per second. A processor with a clock rate of 200 MHz would have only about 20 clock cycles available for processing each pixel, which is less than seven clock cycles for each primary color value.
  • Manipulating pixel data that are represented using signed 16-bit integers usually requires that the resulting pixel data remain within the maximum negative and positive boundaries of a signed 16-bit integer.
  • a signed 16-bit integer in two's complement format has a maximum range boundary of 32767 and a minimum range boundary of -32768.
  • the resulting pixel data is reduced to a value supported by the data format in which the pixel is represented. If either of the range boundaries is exceeded by the resulting pixel data, the resulting pixel data is reduced to within the maximum or minimum range boundaries of 32767 and -32768, respectively.
  • the resulting pixel data which is represented as the variable "dst," is compared with the upper range boundary of 32767. If the resulting pixel data exceeds the upper range boundary, then the upper range boundary value is transferred into the resulting pixel data. Otherwise, a conditional branch occurs which bypasses the execution of the second operation.
  • the third operation compares the resulting pixel data with the lower range boundary of -32768. If the resulting pixel data falls below the lower range boundary, then the lower range boundary value is transferred into the resulting pixel data. Otherwise, another conditional branch occurs which bypasses the execution of the third operation, i.e., the resulting pixel data falls within the range boundaries.
  • conditional branches in a superscalar pipelined processor decrease processor execution throughput because the branches interrupt the pipeline processing of instructions. Also, conditional branches usually require processors to perform a memory fetch from intermediate or main memory in the event of a cache miss. Since intermediate or main memory is typically much slower than an instruction register which is used to process the instructions, conditional branches take much longer to complete than instructions that do not require fetches from intermediate or main memory. Thus, not only does the processor incur an increase in fetch latency, but it also loses efficiency because the pipelining of instructions has been interrupted by the branches.
  • the present invention is directed to checking and reducing an intermediate signal arising from a manipulation of 16-bit signed data signals without using conditional branches, thereby improving instruction processing in a superscalar pipelined processor or an arithmetic unit that can execute several arithmetic operations concurrently.
  • the data signals are represented as signed 16-bit binary values in a two's complement format.
  • An intermediate register that is greater than 16-bits wide is used, allowing for the proper checking of an overflow condition. It is presently contemplated that the present invention include using a processor operating under program control. The program determines whether the intermediate signal is in a positive overflow state or a negative overflow state.
  • the program sets a first mask signal to have 16 lower bits in an ON position when the intermediate signal is within the range boundary of a 16-bit signed integer. If the intermediate signal exceeds the maximum range boundary of 32767, then the first mask signal is set to have its 31st through 17th bits in the ON position, while the remaining bit positions are placed in the OFF position. If the intermediate signal falls below the minimum range boundary of -32768, then the first mask signal is set to have its 32nd bit position set to the ON position, while the remaining bit positions are placed in the OFF position. The second mask signal has a value that is equal to the first mask value bit-shifted 16 bit positions to the right. Finally, the program bitwise ANDs the original intermediate result with the first mask signal to obtain a translated data signal, and bitwise ORs the translated data signal with the second mask signal.
  • FIG. 1 is a schematic block diagram illustrating a computer system.
  • FIG. 2 is a schematic block diagram illustrating a processor used in accordance with a preferred embodiment of the present invention.
  • FIG. 3 is a process flow diagram showing the method of operation in accordance with a preferred embodiment of the present invention.
  • FIG. 4 is a block diagram of registers sequentially addressed in an 8-bits per byte address space.
  • FIG. 5 is an implementation in a processor operating under program control using the "C" programming language in accordance with a preferred embodiment of the present invention.
  • the present invention is directed to checking and reducing a 16-bit intermediate signal arising from a manipulation of data signals without using conditional branches, thereby improving instruction processing in a superscalar pipelined processor or an arithmetic unit that can execute several arithmetic operations concurrently.
  • the data signals are represented as 16-bit signed integers which requires the intermediate signal to be stored in a register that is greater than 16-bits wide. This allows for the proper checking of an overflow condition.
  • whether an overflow condition exists is determined from the state of the 16th and 17th bits of the intermediate result data. Consequently, intermediate registers need to have a 16th and 17th bit position but may be conventionally sized as 32-bit wide registers to fit within the processing scheme of a processor.
  • Data manipulations of 16-bit signed integers that require overflow checking arise from many different types of signal processing applications. Images may be superimposed, requiring pixel signals from the image sources to be processed so that when displayed, one image is semi-transparent over the other image. If the images are comprised of pixels represented as 16-bit signed integers, this results in combining pixels that occupy the same pixel space into a single intermediate signal. To ensure that the intermediate signal is within the 16-bit signed integer data scheme, its overflow condition must be checked and, if it is in the overflow condition, the signal must be reduced to within the range boundaries of a signed 16-bit integer.
  • the range boundaries of a signed 16-bit integer in a two's complement format include a maximum positive range boundary of 32767 and a maximum negative range boundary of -32768.
  • FIG. 1 is a schematic block diagram illustrating a computer system 8 in which the presently preferred invention would have application.
  • Computer system 8 includes a superscalar pipelined processor 10, an I/O adapter 12 coupled to a data store 14, a display adapter 16, a ROM 18, and RAM devices 20.
  • a data and addressing bus 22 is also shown coupled to each of the items included with the computer system.
  • FIG. 2 is a schematic block diagram illustrating an example superscalar pipelined processor 24 used in accordance with a preferred embodiment of the present invention.
  • a Prefetch and Dispatch Unit (PDU) 26, an Integer Execution Unit (IEU) 28, a Floating-Point Unit (FPU) 30, a Memory Management Unit (MMU) 32, a Load and Store Unit (LSU) 34, an External Cache Unit (ECU) 36, a Graphics Unit (GRU) 38, an Instruction Cache 40, and a Data Cache 42 are shown in FIG. 2.
  • the PDU 26 ensures that all execution units remain busy by fetching instructions before they are needed in the pipeline. Instructions can be prefetched from all levels of the memory hierarchy, including instruction cache 40, external cache 36, and main memory.
  • the PDU 26 provides a 12-entry prefetch buffer which minimizes pipeline stalls.
  • the PDU 26 has a 9-stage instruction pipeline to minimize latency and dynamic branch prediction to allow for greater prediction accuracy.
  • the pipeline is a double-instruction-issue pipeline with nine stages: fetch, decode, grouping, execution, cache access, load miss, integer pipe wait, trap resolution, and writeback. These stages imply that the latency (time from start to end of execution) of most instructions is nine clock cycles. However, at any given time, as many as nine instructions can execute simultaneously, producing an overall rate of execution of one clock per instruction in many cases. Some instructions may still require more than one cycle to execute, whether because of the nature of the instruction (such as a branch instruction), a cache miss, or other resource contention.
  • the first stage of the pipeline is a fetch from instruction cache 40.
  • instructions are decoded and placed in the instruction buffer.
  • the third stage, grouping, groups and dispatches up to four instructions.
  • integer instructions are executed and virtual addresses calculated during the execution stage.
  • data cache 42 is accessed. Cache hits and misses are determined, and branches are resolved. If a cache miss was detected, the loaded miss enters the load buffer. At this point, the integer pipe waits for the floating-point/graphics pipe to fill and traps are resolved.
  • during writeback, all results are written to the register files and instructions are committed.
  • IEU 28 includes two arithmetic logic units (ALUs) for arithmetic, logical, and shift operations, an eight-window register file, result bypassing, and a Completion Unit which allows a nine-stage pipeline with minimal bypasses.
  • FPU 30 is a pipelined floating-point processor that consists of five separate functional units to support floating-point and multimedia operations. The separation of execution units allows the issuance and execution of two floating-point instructions per cycle. Source and data results are stored in a 32-entry register file in either 8, 16 or 32 bit lengths. Most floating-point instructions have a throughput of one cycle, a latency of three cycles, and are fully pipelined.
  • the FPU is able to operate on both single precision (32-bit), and double-precision (64-bit) numbers, normalized or denormalized, in hardware, and quad-precision (128-bit) operands in software.
  • FPU 30 is tightly coupled to the integer pipeline and is capable of seamlessly executing a floating-point memory event and a floating-point operation.
  • IEU 28 and FPU 30 have a dedicated control interface which includes the dispatch of operations fetched by the PDU 26 to the FPU 30. Once in the queue, the PDU 26 is responsible for distribution of instructions to the FPU 30.
  • IEU 28 controls the data cache portion of the operation, while the FPU 30 decides how to manipulate the data.
  • the IEU 28 and FPU 30 cooperatively detect floating-point data dependencies.
  • the interface also includes IEU 28 and FPU 30 handshaking for floating-point exceptions.
  • the FPU 30 performs all floating-point operations and implements a 3-entry floating-point instruction queue to reduce the impact of bottlenecks at the IEU 28 and improve overall performance.
  • MMU 32 handles all memory operations as well as arbitration between data stores and memory.
  • GRU 38 relies on integer registers of varying bit lengths for addressing image data, and floating point registers for manipulating that data. This division of labor enables the processor to make full use of all available internal registers so as to maximize graphical throughput.
  • FIG. 3 is a process flow diagram showing a method of operation in accordance with a preferred embodiment of the present invention.
  • FIG. 4 is a block diagram of registers sequentially addressed in an 8-bits per byte address space in accordance with a preferred embodiment of the present invention.
  • FIG. 5 is an implementation in a processor operating under program control using the "C" programming language in accordance with a preferred embodiment of the present invention.
  • registers are initialized to their initial values.
  • the registers may be in an array form having elements for storing mask signals represented as 32-bit binary values.
  • the registers 46 are sequentially addressed so that they are four (4) bytes apart from each other in an 8-bits per byte address space.
  • the first 48, second 50, third 52, and fourth 54 registers are initialized to hold a signal represented as 0000FFFF (hex), 7FFF0000 (hex), 80000000 (hex), and 0000FFFF (hex), respectively.
  • the registers may be implemented in the "C" programming language as an array of unsigned integers 56.
  • This array may be formed in system memory or any memory available to a processor operating under program control.
  • a processor determines whether an intermediate signal is in a positive overflow state or negative overflow state. Specifically, the processor is directed under program control to bitwise shift the intermediate signal thirteen bit positions to the right; the shifted signal is then bitwise ANDed ("masked") with a mask signal having a value C (hex) to obtain an offset signal.
  • the intermediate signal may be bit-shifted 15 positions to the right and masked with a mask signal having a value 3 (hex) to obtain an offset signal.
  • This alternative approach is applicable in processors having compilers that transform a two-bit index into a byte address, but it incurs an added step of bitwise shifting the offset signal two places to the left to obtain a proper byte address.
  • Step 44b results in interpreting the 17th and 16th bit positions of the original intermediate signal in the following manner.
  • An overflow status of "00" or "11" in the 17th and 16th bit positions of the intermediate signal results in an offset signal having the value of "0" or "12", respectively, indicating that the intermediate signal is within the boundary range of a 16-bit signed integer.
  • An overflow status of "01" in the 17th and 16th bit positions results in an offset signal having the value of "4", indicating that the intermediate signal exceeds the maximum boundary of a 16-bit signed integer.
  • An overflow status of "10" in the 17th and 16th bit positions results in an offset signal having the value of "8", indicating that the intermediate signal exceeds the minimum boundary of a 16-bit signed integer.
  • a first mask signal receives a mask value stored in one of the registers.
  • a register is chosen according to the value assigned to the offset signal in step 44b. For example, an offset signal having a value of "0", "4", "8", or "12" chooses the first, second, third, or fourth register, respectively.
  • the value C (hex) used as a mask signal in step 44b may be appropriately filled with additional binary signals of the value zero, depending on the size of the intermediate register used. This provides a level of scalability as to the size of the intermediate register.
  • a bitwise AND operation passes through all bits that are ON in bit positions that correspond to bit positions in the first mask signal having an ON bit. All other bit positions, whether ON or OFF, that correspond to bit positions in the first mask signal having an OFF bit are set to the OFF position.
  • a second mask signal receives the operational result of bitwise shifting the first mask signal 16 bit positions to the right.
  • the processor masks the intermediate signal with the first mask signal to obtain a translated data signal.
  • this mask operation includes bitwise ANDing the intermediate signal with the first mask signal.
  • the processor performs a bitwise OR operation on the translated data signal using the second mask signal.
  • a bitwise OR operation turns all bits that are OFF to ON in bit positions that correspond to bit positions in the second mask signal having an ON bit. All other bit positions, whether ON or OFF, that correspond to bit positions in the second mask signal having an OFF bit are passed through without change.
  • the above method of operation essentially is based on the analysis of the 17th and 16th bits of the intermediate signal which are interpreted in the following way.
  • An overflow status of "00" or "11" indicates that the original intermediate signal is inside the range of a signed 16-bit integer in a two's complement format, which is a range between 32767 and -32768.
  • the first mask signal is set to have 16 lower bits in an ON position when the intermediate signal is within the range boundary of a 16-bit signed integer, i.e., the first mask signal may have a mask value of 0000FFFF (hex).
  • An overflow status of "01" indicates a positive overflow state with the intermediate signal exceeding the maximum range boundary of 32767 for a 16-bit signed integer in a two's complement format. If so, the first mask signal is set to have its 31st through 17th bits in the ON position. Thus, the first mask signal may have a mask value of 7FFF0000 (hex).
  • An overflow status of "10" indicates a negative overflow state with the intermediate signal falling below the minimum range boundary of -32768 for a 16-bit signed integer in a two's complement format.
  • the first mask signal is set to have its 32nd bit position set to an ON state, while the remaining bit positions are set to have an OFF state.
  • an overflow status of "10" results in the first mask signal receiving a mask value of 80000000 (hex).
  • the second mask signal has a value that is equal to the first mask value bit-shifted 16 bit positions to the right. All bit positions that are left vacant by the bitwise shift operation are filled with zero.
  • the first mask signal is bitwise ANDed with the original intermediate signal, with the resulting value bitwise ORed with the second mask signal. This operation either reduces or passes through the original intermediate signal depending on its value.
  • Alternative embodiments of the present invention may include embedding the steps of the method of the present invention into a Field Programmable Gate Array (FPGA) as is well-known in the art, or using a hardware description language such as VHDL to describe the method, thus hard coding the method in an application-specific integrated circuit (ASIC).
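
The steps listed above (initialize the four mask registers, derive a byte offset from the intermediate signal, load the first mask, shift it to form the second mask, then AND and OR) can be sketched in C as follows. This is a minimal sketch under stated assumptions, not the listing of FIG. 5: the names sat_masks and saturate16 are hypothetical, and the intermediate value is assumed to fit in 17 signed bits (for example, the sum of two signed 16-bit values), because only the 16th and 17th bits are inspected.

```c
#include <assert.h>
#include <stdint.h>

/* Mask table as described: four 32-bit entries, 4 bytes apart. */
static const uint32_t sat_masks[4] = {
    0x0000FFFFu, /* status 00: in range, keep the low 16 bits      */
    0x7FFF0000u, /* status 01: positive overflow                   */
    0x80000000u, /* status 10: negative overflow                   */
    0x0000FFFFu  /* status 11: in range (negative), keep low bits  */
};

/* Hypothetical helper: reduce a 17-bit signed intermediate to 16 bits. */
static int16_t saturate16(int32_t intermediate)
{
    /* Assumption: only bits 16 and 17 (counting from 1) carry overflow. */
    assert(intermediate >= -65536 && intermediate <= 65535);

    uint32_t x = (uint32_t)intermediate;

    /* Shift right 13 and mask with C (hex): byte offset 0, 4, 8, or 12. */
    uint32_t offset = (x >> 13) & 0xCu;

    /* Entries are 4 bytes apart, so offset / 4 is the array index. */
    uint32_t mask1 = sat_masks[offset >> 2];
    uint32_t mask2 = mask1 >> 16;          /* vacated bits fill with zero */

    /* AND with the first mask, then OR with the second mask. */
    uint16_t low = (uint16_t)((x & mask1) | mask2);

    /* Reinterpret the low 16 bits as signed without relying on
       implementation-defined narrowing. */
    return (int16_t)((int32_t)(low ^ 0x8000u) - 0x8000);
}
```

With these masks, a positive overflow yields 7FFF (hex), i.e. 32767, and a negative overflow yields 8000 (hex), which reads as -32768 in 16 bits; in-range values pass through unchanged. No comparison or branch is needed on the data path once the assert is compiled out.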

Abstract

The present invention is directed to checking and reducing an intermediate signal arising from a manipulation of 16-bit signed data signals without using conditional branches, thereby improving instruction processing in a superscalar pipelined processor or an arithmetic unit that can execute several arithmetic operations concurrently. In the preferred embodiment of the present invention, the data signals are represented as signed 16-bit binary values in a two's complement format. An intermediate register that is greater than 16 bits in width is used to hold the intermediate signal, allowing for the proper checking of an overflow condition. It is presently contemplated that the present invention include using a processor operating under program control. The program determines whether the intermediate signal is in a positive or negative overflow state. The program sets a first mask signal to have 16 lower bits in an ON position when the intermediate signal is within the range boundary of a 16-bit signed integer. If the intermediate signal exceeds the maximum range boundary of 32767, the first mask signal is set to have its 31st through 17th bits in the ON position. If the intermediate signal falls below the minimum range boundary of -32768, the first mask signal is set to have its 32nd bit position set to the ON position. The second mask signal has a value that is equal to the first mask value bit-shifted 16 bit positions to the right. Finally, the program bitwise ANDs the original intermediate signal with the first mask signal to obtain a translated data signal, and bitwise ORs the translated data signal with the second mask signal.

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a method for optimizing overflow checking and reduction of data signals represented as signed 16-bit integers.
2. Description of Related Art
The explosion of graphics, audio, and video ("multimedia") related applications in computer systems has fueled efforts in improving processor efficiency with regard to processing multimedia signals. Multimedia signals include audio and pixel ("picture element") signals, among other things, which may be sufficiently represented using binary data of no more than eight bits of resolution. Binary data having greater widths may also be used but are often limited to intermediate results for advanced data manipulation, since such data formats increase the load on instruction execution, resulting in slower rates of data manipulation by a processor.
A computer system running a video application may represent color pixels through four signed 16-bit signals, one for each of the three primary color values of red, green, and blue and one for an intensity value. This translates to a large number of data signals required to represent an image for display on a computer screen, even when restricting pixel data widths to eight bits. For example, displaying a digital NTSC video signal in real time on a computer monitor requires a pixel rate of 10.4 million pixels per second. With three data signals to manipulate per pixel, this translates to about 30 million pieces of data to manipulate per second. A processor with a clock rate of 200 MHz would have only about 20 clock cycles available for processing each pixel, which is less than seven clock cycles for each primary color value.
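
The cycle budget quoted above follows from dividing the clock rate by the data rate. A minimal sketch of that arithmetic, using a hypothetical helper name (cycles_per_item) and the figures from the paragraph above:

```c
#include <assert.h>

/* Hypothetical helper: clock cycles available per item processed. */
static double cycles_per_item(double clock_hz, double items_per_second)
{
    assert(items_per_second > 0.0);   /* guard the division */
    return clock_hz / items_per_second;
}
```

Calling cycles_per_item(200e6, 10.4e6) gives roughly 19.2 cycles per pixel, and cycles_per_item(200e6, 3 * 10.4e6) gives roughly 6.4 cycles per color value, consistent with the "about 20" and "less than seven" figures in the text.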
Manipulating pixel data that are represented using signed 16-bit integers usually requires that the resulting pixel data remain within the maximum negative and positive boundaries of a signed 16-bit integer. A signed 16-bit integer in two's complement format has a maximum range boundary of 32767 and a minimum range boundary of -32768. For example, when scaling or rotating images, it is necessary to combine the incoming signal being processed with other internally generated signal data in order to obtain the resulting pixel data. This ensures that if an overflow state does occur, the resulting pixel data is reduced to a value supported by the data format in which the pixel is represented. If either of the range boundaries is exceeded by the resulting pixel data, the resulting pixel data is reduced to within the maximum or minimum range boundaries of 32767 and -32768, respectively.
In the past, checking resulting pixel data for an overflow condition included using conditional branches. For example, in one such method, branch operations in the programming language "C" are used in the following manner.
int dst;
if (dst > 32767) dst = 32767;
if (dst < -32768) dst = -32768;
The resulting pixel data, which is represented as the variable "dst," is compared with the upper range boundary of 32767. If the resulting pixel data exceeds the upper range boundary, then the upper range boundary value is transferred into the resulting pixel data. Otherwise, a conditional branch occurs which bypasses the execution of the second operation. The third operation compares the resulting pixel data with the lower range boundary of -32768. If the resulting pixel data falls below the lower range boundary, then the lower range boundary value is transferred into the resulting pixel data. Otherwise, another conditional branch occurs which bypasses the execution of the third operation, i.e., the resulting pixel data falls within the range boundaries.
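
For reference, the branch-based check described above can be wrapped in a small function. This is only an illustrative sketch: the name clamp16_branch is hypothetical, and it is written against the full two's complement range of a signed 16-bit integer (INT16_MIN is -32768 and INT16_MAX is 32767 in <stdint.h>).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp a wider intermediate with two comparisons.
   Each "if" typically compiles to a conditional branch or a conditional
   move, depending on the compiler and target. */
static int16_t clamp16_branch(int32_t dst)
{
    if (dst > INT16_MAX) dst = INT16_MAX;   /* upper boundary, 32767  */
    if (dst < INT16_MIN) dst = INT16_MIN;   /* lower boundary, -32768 */

    assert(dst >= INT16_MIN && dst <= INT16_MAX);
    return (int16_t)dst;
}
```

For example, clamp16_branch(40000) returns 32767 and clamp16_branch(-40000) returns -32768, while in-range inputs are returned unchanged.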
The use of conditional branches in a superscalar pipelined processor decreases processor execution throughput because the branches interrupt the pipeline processing of instructions. Also, conditional branches usually require processors to perform a memory fetch from intermediate or main memory in the event of a cache miss. Since intermediate or main memory is typically much slower than an instruction register which is used to process the instructions, the time to process the conditional branches takes much longer to complete than instructions that do not require fetches from intermediate or main memory. Thus, not only does the processor incur an increase in fetch latency but it also takes an efficiency hit due to the fact that the pipelining of instructions has been interrupted by the branches.
Accordingly, it would be desirable to provide a method that ensures resulting pixel data remain within the range boundaries of a signed 16-bit integer without the use of conditional branches in an instruction. This advantage is achieved by performing two shift operations, two logic multiplications, one addition, one load, and one logic addition to obtain a result that is within the range boundaries of a signed 16-bit integer, improving the instruction throughput of a processor.
SUMMARY OF THE INVENTION
The present invention is directed to checking and reducing an intermediate signal arising from a manipulation of 16-bit signed data signals without using conditional branches, thereby improving instruction processing in a superscalar pipelined processor or an arithmetic unit that can execute several arithmetic operations concurrently. In the preferred embodiment of the present invention, the data signals are represented as signed 16-bit binary values in a two's complement format. An intermediate register that is greater than 16 bits wide is used, allowing for the proper checking of an overflow condition. It is presently contemplated that the present invention include using a processor operating under program control. The program determines whether the intermediate signal is in a positive overflow state or a negative overflow state. The program sets a first mask signal to have 16 lower bits in an ON position when the intermediate signal is within the range boundary of a 16-bit signed integer. If the intermediate signal exceeds the maximum range boundary of 32767, then the first mask signal is set to have its 31st through 17th bits in the ON position, while the remaining bit positions are placed in the OFF position. If the intermediate signal falls below the minimum range boundary of -32768, then the first mask signal is set to have its 32nd bit position set to the ON position, while the remaining bit positions are placed in the OFF position. The second mask signal has a value that is equal to the first mask value bit-shifted 16 bit positions to the right. Finally, the program bitwise ANDs the original intermediate result with the first mask signal to obtain a translated data signal, and bitwise ORs the translated data signal with the second mask signal.
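
As an illustration of where such a routine might be applied, the sketch below adds two buffers of signed 16-bit samples and reduces each sum with the first-mask and second-mask steps summarized above. The function name add_saturate16 and the table name first_mask are hypothetical. The table is indexed here by shifting the sum 15 positions to the right and masking with 3, which selects the same 16th and 17th bits; the sum of two 16-bit values always fits in 17 signed bits, which is the assumption the method relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First-mask values for overflow status 00, 01, 10, 11. */
static const uint32_t first_mask[4] = {
    0x0000FFFFu, 0x7FFF0000u, 0x80000000u, 0x0000FFFFu
};

/* Hypothetical helper: dst[i] = saturate(a[i] + b[i]) with no
   data-dependent comparison inside the loop body. */
static void add_saturate16(const int16_t *a, const int16_t *b,
                           int16_t *dst, size_t n)
{
    assert(a != NULL && b != NULL && dst != NULL);

    for (size_t i = 0; i < n; ++i) {
        /* 32-bit intermediate holding a 17-bit signed sum. */
        uint32_t x = (uint32_t)((int32_t)a[i] + (int32_t)b[i]);

        uint32_t m1 = first_mask[(x >> 15) & 3u];  /* first mask  */
        uint32_t m2 = m1 >> 16;                    /* second mask */

        uint16_t low = (uint16_t)((x & m1) | m2);  /* AND, then OR */

        /* Portable reinterpretation of the low 16 bits as signed. */
        dst[i] = (int16_t)((int32_t)(low ^ 0x8000u) - 0x8000);
    }
}
```

Whether this outperforms a comparison-based clamp depends on the processor and compiler; the sketch only shows that the reduction itself needs a table load, two shifts, two ANDs, and one OR per element.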
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic block diagram illustrating a computer system.
FIG. 2 is a schematic block diagram illustrating a processor used in accordance with a preferred embodiment of the present invention.
FIG. 3 is a process flow diagram showing the method of operation in accordance with a preferred embodiment of the present invention.
FIG. 4 is a block diagram of registers sequentially addressed in an 8-bits per byte address space.
FIG. 5 is an implementation in a processor operating under program control using the "C" programming language in accordance with a preferred embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
In the following description, a preferred embodiment of the invention is described with regard to preferred process steps and data structures. Those skilled in the art would recognize after perusal of this application that embodiments of the invention can be implemented using one or more general purpose processors operating under program control, or special purpose processors adapted to particular process steps and data structures, and that implementation of the process steps and data structures described herein would not require undue experimentation or further invention.
The present invention is directed to checking and reducing an intermediate signal arising from a manipulation of 16-bit signed data signals without using conditional branches, thereby improving instruction processing in a superscalar pipelined processor or an arithmetic unit that can execute several arithmetic operations concurrently.
The data signals are represented as 16-bit signed integers, which requires the intermediate signal to be stored in a register that is more than 16 bits wide. This allows for the proper checking of an overflow condition. In the preferred embodiment, the determination of whether an overflow condition exists focuses on the state of the 16th and 17th bits of the intermediate result data. Consequently, intermediate registers need to have a 16th and 17th bit position but may be conventionally sized as 32-bit wide registers to fit within the processing scheme of a processor.
Data manipulations of 16-bit signed integers that require overflow checking arise in many different types of signal processing applications. For example, images may be superimposed, requiring pixel signals from the image sources to be processed so that, when displayed, one image is semi-transparent over the other image. If the images are composed of pixels represented as 16-bit signed integers, pixels that occupy the same pixel space are combined into a single intermediate signal. To ensure that the intermediate signal is within the 16-bit signed integer data scheme, its overflow condition must be checked and, if it is in the overflow condition, it must be reduced to within the range boundaries of a signed 16-bit integer. The range boundaries of a signed 16-bit integer in a two's complement format include a maximum positive range boundary of 32767 and a minimum negative range boundary of -32768.
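For contrast, the conventional reduction relies on comparisons and conditional branches. The following is a minimal C sketch of that baseline, offered only as an illustration; it is not taken from the patent, and the function name clamp16_branching is hypothetical.

```c
#include <stdint.h>

/* Baseline for comparison only: clamp a wider intermediate value to the
 * signed 16-bit range using conditional branches. The function name is
 * illustrative and does not appear in the patent. */
static int16_t clamp16_branching(int32_t intermediate)
{
    if (intermediate > INT16_MAX) {
        return INT16_MAX;          /* positive overflow: 32767 */
    }
    if (intermediate < INT16_MIN) {
        return INT16_MIN;          /* negative overflow: -32768 */
    }
    return (int16_t)intermediate;  /* already in range */
}
```

For example, combining pixel values 30000 and 10000 yields an intermediate value of 40000, which this baseline reduces to 32767.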
FIG. 1 is a schematic block diagram illustrating a computer system 8 in which the presently preferred invention would have application. Computer system 8 includes a superscalar pipelined processor 10, an I/O adapter 12 coupled to a data store 14, a display adapter 16, a ROM 18, and RAM devices 20. A data and addressing bus 22 is also shown coupled to each of the items included with the computer system.
FIG. 2 is a schematic block diagram illustrating an example superscalar pipelined processor 24 used in accordance with a preferred embodiment of the present invention. A Prefetch and Dispatch Unit (PDU) 26, an Integer Execution Unit (IEU) 28, a Floating-Point Unit (FPU) 30, a Memory Management Unit (MMU) 32, a Load and Store Unit (LSU) 34, an External Cache Unit (ECU) 36, a Graphics Unit (GRU) 38, an Instruction Cache 40, and a Data Cache 42 are shown in FIG. 2. Superscalar pipelined processors are known in the art of computer architecture. Consequently, those of ordinary skill in the art will readily recognize that processor 24 is capable of performing arithmetic and logical operations that include bitwise shifting, bitwise AND operations, bitwise OR operations, and addition operations when operating under program control.
The PDU 26 ensures that all execution units remain busy by fetching instructions before they are needed in the pipeline. Instructions can be prefetched from all levels of the memory hierarchy, including instruction cache 40, external cache 36, and main memory. The PDU 26 provides a 12-entry prefetch buffer which minimizes pipeline stalls. In addition, the PDU 26 has a 9-stage instruction pipeline to minimize latency and dynamic branch prediction to allow for greater prediction accuracy.
The pipeline is a double-instruction-issue pipeline with nine stages: fetch, decode, grouping, execution, cache access, load miss, integer pipe wait, trap resolution, and writeback. These stages imply that the latency (time from start to end of execution) of most instructions is nine clock cycles. At any given time, however, as many as nine instructions can execute simultaneously, producing an overall rate of execution of one clock per instruction in many cases. Some instructions may require more than one cycle to execute, either because of the nature of the instruction (such as a branch instruction) or because of a cache miss or other resource contention.
The first stage of the pipeline is a fetch from instruction cache 40. In the second stage, instructions are decoded and placed in the instruction buffer. The third stage, grouping, groups and dispatches up to four instructions. Next, integer instructions are executed and virtual addresses are calculated during the execution stage. In the fifth stage, data cache 42 is accessed. Cache hits and misses are determined, and branches are resolved. If a cache miss was detected, the load miss enters the load buffer. At this point, the integer pipe waits for the floating-point/graphics pipe to fill and traps are resolved. In the final stage, writeback, all results are written to the register files and instructions are committed.
IEU 28 includes two ALUs (arithmetic logic units) for arithmetic, logical, and shift operations, an eight-window register file, result bypassing, and a Completion Unit which allows a nine-stage pipeline with minimal bypasses.
FPU 30 is a pipelined floating-point processor that consists of five separate functional units to support floating-point and multimedia operations. The separation of execution units allows the issuance and execution of two floating-point instructions per cycle. Source and data results are stored in a 32-entry register file in either 8, 16 or 32 bit lengths. Most floating-point instructions have a throughput of one cycle, a latency of three cycles, and are fully pipelined. The FPU is able to operate on both single precision (32-bit), and double-precision (64-bit) numbers, normalized or denormalized, in hardware, and quad-precision (128-bit) operands in software.
FPU 30 is tightly coupled to the integer pipeline and is capable of seamlessly executing a floating-point memory event and a floating-point operation. IEU 28 and FPU 30 have a dedicated control interface which includes the dispatch of operations fetched by the PDU 26 to the FPU 30. Once instructions are in the queue, the PDU 26 is responsible for distributing them to the FPU 30. IEU 28 controls the data cache portion of the operation, while the FPU 30 decides how to manipulate the data. The IEU 28 and FPU 30 cooperatively detect floating-point data dependencies. The interface also includes IEU 28 and FPU 30 handshaking for floating-point exceptions. The FPU 30 performs all floating-point operations and implements a 3-entry floating-point instruction queue to reduce the impact of bottlenecks at the IEU 28 and improve overall performance.
MMU 32 handles all memory operations as well as arbitration between data stores and memory.
GRU 38 relies on integer registers of varying bit lengths for addressing image data, and floating-point registers for manipulating that data. This division of labor enables the processor to make full use of all available internal registers so as to maximize graphical throughput.
Method of Operation
FIG. 3 is a process flow diagram showing a method of operation in accordance with a preferred embodiment of the present invention.
FIG. 4 is a block diagram of registers sequentially addressed in an 8-bits per byte address space in accordance with a preferred embodiment of the present invention.
FIG. 5 is an implementation in a processor operating under program control using the "C" programming language in accordance with a preferred embodiment of the present invention.
Referring now to FIG. 3, at step 44a registers are initialized to their initial values. In the preferred embodiment, the registers may be in an array form having elements for storing mask signals represented as 32-bit binary values. As seen in FIG. 4, the registers 46 are sequentially addressed so that they are four (4) bytes apart from each other in an 8-bits per byte address space. The first 48, second 50, third 52, and fourth 54 registers are initialized to hold a signal represented as 0000FFFF (hex), 7FFF0000 (hex), 80000000 (hex), and 0000FFFF (hex), respectively.
Alternatively, as seen in FIG. 5, the registers may be implemented in the "C" programming language as an array of unsigned integers 56. This array may be formed in system memory or any memory available to a processor operating under program control.
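Since FIG. 5 is not reproduced here, the following is only a sketch of how such an array might be declared, assuming a 32-bit unsigned element type from <stdint.h>; the name mask_table is hypothetical.

```c
#include <stdint.h>

/* Mask values from the description, one per overflow status (00, 01, 10, 11).
 * Consecutive elements are four bytes apart, as described for FIG. 4.
 * The array name is an assumption made for illustration. */
static const uint32_t mask_table[4] = {
    0x0000FFFFu,  /* first register:  in range, keep the low 16 bits */
    0x7FFF0000u,  /* second register: positive overflow              */
    0x80000000u,  /* third register:  negative overflow              */
    0x0000FFFFu   /* fourth register: in range, keep the low 16 bits */
};
```

Using a fixed-width type keeps the four-byte spacing independent of the platform's native unsigned int size.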
Referring back to FIG. 3, at step 44b, a processor determines whether an intermediate signal is in a positive overflow state or a negative overflow state. Specifically, the processor is directed under program control to bitwise shift the intermediate signal thirteen bit positions to the right; the shifted signal is then bitwise ANDed ("masked") with a mask signal having a value C (hex) to obtain an offset signal.
Alternatively, the intermediate signal may be bit-shifted 15 positions to the right and masked with a mask signal having a value 3 (hex) to obtain an offset signal. This alternative approach is applicable to processors whose compilers transform a two-bit index into a byte address, but it incurs an added step of bitwise shifting the offset signal two places to the left to obtain a proper byte address.
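The two ways of forming the offset signal described above can be sketched in C as follows. This is an illustrative sketch, not the code of FIG. 5; the function names are hypothetical, and the intermediate value is converted to an unsigned type so that the right shift is well defined for negative inputs.

```c
#include <stdint.h>

/* Byte offset obtained directly: shift right 13, then mask with C (hex). */
static uint32_t offset_shift13(int32_t intermediate)
{
    uint32_t u = (uint32_t)intermediate;  /* unsigned, so the shift is logical */
    return (u >> 13) & 0xCu;
}

/* Alternative: form a two-bit index with a shift of 15 and a mask of 3 (hex),
 * then shift left by two to scale the index to a byte offset. */
static uint32_t offset_shift15(int32_t intermediate)
{
    uint32_t u = (uint32_t)intermediate;
    uint32_t index = (u >> 15) & 0x3u;
    return index << 2;
}
```

Both functions place the 17th and 16th bits of the intermediate value into bit positions 4 and 3 of the result, so they return the same offset (0, 4, 8, or 12) for every input.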
Step 44b results in interpreting the 17th and 16th bit positions of the original intermediate signal in the following manner. An overflow status of "00" or "11" in the 17th and 16th bit positions of the intermediate signal results in an offset signal having the value of "0" or "12", respectively, indicating that the intermediate signal is within the boundary range of a 16-bit signed integer.
An overflow status of "01" in the 17th and 16th bit positions results in an offset signal having the value of "4", indicating that the intermediate signal exceeds the maximum boundary of a 16-bit signed integer.
An overflow status of "10" in the 17th and 16th bit positions results in an offset signal having the value of "8", indicating that the intermediate signal exceeds the minimum boundary of a 16-bit signed integer.
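The interpretation of the two status bits can be checked at the boundary values with a short C sketch. The enumeration and function names below are hypothetical. The sketch also assumes that the intermediate value fits in 17 bits (-65536 through 65535), as is the case for the sum of two signed 16-bit values; for larger magnitudes the two bits alone would not identify an overflow.

```c
#include <stdint.h>

/* Hypothetical names for the four values of the 17th and 16th bits. */
enum overflow_status {
    IN_RANGE_POSITIVE = 0,  /* "00": 0 through 32767        */
    POSITIVE_OVERFLOW = 1,  /* "01": 32768 through 65535    */
    NEGATIVE_OVERFLOW = 2,  /* "10": -65536 through -32769  */
    IN_RANGE_NEGATIVE = 3   /* "11": -32768 through -1      */
};

/* Extract the 17th and 16th bits (counting from 1) as a two-bit status. */
static enum overflow_status classify(int32_t intermediate)
{
    uint32_t u = (uint32_t)intermediate;
    return (enum overflow_status)((u >> 15) & 0x3u);
}
```

For instance, 32767 is classified as in range while 32768 is a positive overflow, and -32768 is in range while -32769 is a negative overflow.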
At step 44c, a first mask signal receives a mask value stored in one of the registers. A register is chosen according to the value assigned to the offset signal in step 44b. For example, an offset signal having a value of "0", "4", "8", or "12" selects the first, second, third, or fourth register, respectively.
The value C (hex) used as a mask signal in step 44b may be appropriately filled with additional binary signals of the value zero, depending on the size of the intermediate register used. This provides a level of scalability as to the size of the intermediate register.
As known in the art, a bitwise AND operation passes through all bits that are ON in bit positions that correspond to bit positions in the first mask signal having an ON bit. All other bit positions, whether ON or OFF, that correspond to bit positions in the first mask signal having an OFF bit are set to the OFF position.
At step 44d, a second mask signal receives the operational result of bitwise shifting the first mask signal 16 bit positions to the right.
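As a small illustration of step 44d, the second mask can be derived in C with a single unsigned shift. The function name is hypothetical; an unsigned type is assumed so that vacated bit positions are filled with zeros.

```c
#include <stdint.h>

/* Second mask: the first mask shifted right by 16 bit positions.
 * With an unsigned operand, C guarantees a zero-filling (logical) shift. */
static uint32_t derive_second_mask(uint32_t first_mask)
{
    return first_mask >> 16;
}
```

Applied to the three distinct mask values, this gives 0 for 0000FFFF (hex), 7FFF (hex), which is 32767, for 7FFF0000 (hex), and 8000 (hex), the 16-bit pattern of -32768, for 80000000 (hex).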
At step 44e, the processor masks the intermediate signal with the first mask signal to obtain a translated data signal. Specifically, this mask operation includes bitwise ANDing the intermediate signal with the first mask signal.
At step 44f, the processor performs a bitwise OR operation on the translated data signal using the second mask signal. As known in the art, a bitwise OR operation turns all bits that are OFF to ON in bit positions that correspond to bit positions in the second mask signal having an ON bit. All other bit positions, whether ON or OFF, that correspond to bit positions in the second mask signal having an OFF bit are passed through without change.
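Steps 44e and 44f can be sketched together in C. The function and parameter names are hypothetical, and the masks are passed in explicitly so that the two logic operations can be examined on their own.

```c
#include <stdint.h>

/* Apply the first mask with AND (step 44e), then the second with OR (step 44f). */
static uint32_t apply_masks(int32_t intermediate, uint32_t mask1, uint32_t mask2)
{
    uint32_t translated = (uint32_t)intermediate & mask1;  /* translated data signal */
    return translated | mask2;                             /* reduced data signal    */
}
```

With an intermediate value of 40000 and masks 7FFF0000 (hex) and 7FFF (hex), the result is 7FFF (hex). With -40000 and masks 80000000 (hex) and 8000 (hex), the result is 80008000 (hex), so the reduced value is found in the low 16 bits of the result; this sketch assumes only that lower half is stored.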
The above method of operation is essentially based on an analysis of the 17th and 16th bits of the intermediate signal, which are interpreted in the following way. An overflow status of "00" or "11" indicates that the original intermediate signal is inside the range of a signed 16-bit integer in a two's complement format, which is the range from -32768 to 32767. Thus, the first mask signal is set to have its 16 lower bits in an ON position when the intermediate signal is within the range boundaries of a 16-bit signed integer, i.e., the first mask signal may have a mask value of 0000FFFF (hex).
An overflow status of "01" indicates a positive overflow state with the intermediate signal exceeding the maximum range boundary of 32767 for a 16-bit signed integer in a two's complement format. If so, the first mask signal is set to have its 31st through 17th bits in the ON position. Thus, the first mask signal may have a mask value of 7FFF0000 (hex).
An overflow status of "10" indicates a negative overflow state with the intermediate signal falling below the minimum range boundary of -32768 for a 16-bit signed integer in a two's complement format. The first mask signal is set to have its 32nd bit position in the ON state, while the remaining bit positions are set to the OFF state. Thus, an overflow status of "10" results in the first mask signal receiving a mask value of 80000000 (hex).
The second mask signal has a value that is equal to the first mask value bit-shifted 16 bit positions to the right. All bit positions that are left vacant by the bitwise shift operation are filled with zero.
The first mask signal is bitwise ANDed with the original intermediate signal, and the resulting value is bitwise ORed with the second mask signal. This operation either reduces or passes through the original intermediate signal depending on its value.
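Putting the steps together, the whole method can be sketched as one branch-free C function. This is an illustration under stated assumptions rather than the code of FIG. 5: the names masks and saturate16 are hypothetical, the intermediate value is assumed to fit in 17 bits (-65536 through 65535), and the final sign extension of the low 16 bits is an addition made here so the result can be returned as an int16_t.

```c
#include <stdint.h>

/* Mask values from the description, four bytes apart. */
static const uint32_t masks[4] = {
    0x0000FFFFu, 0x7FFF0000u, 0x80000000u, 0x0000FFFFu
};

/* Reduce an intermediate value to the signed 16-bit range without branches.
 * Assumes -65536 <= intermediate <= 65535. */
static int16_t saturate16(int32_t intermediate)
{
    uint32_t u      = (uint32_t)intermediate;
    uint32_t offset = (u >> 13) & 0xCu;              /* byte offset 0, 4, 8, 12 */
    /* One addition and one load: fetch the mask at base address + offset. */
    uint32_t mask1  = *(const uint32_t *)((const char *)masks + offset);
    uint32_t mask2  = mask1 >> 16;                   /* zero-filling shift      */
    uint32_t result = (u & mask1) | mask2;           /* AND, then OR            */
    int32_t  low    = (int32_t)(result & 0xFFFFu);   /* keep the low 16 bits    */
    /* Reinterpret the low 16 bits as a signed value, still without a branch. */
    return (int16_t)((low ^ 0x8000) - 0x8000);
}
```

For example, saturate16(40000) yields 32767, saturate16(-40000) yields -32768, and in-range values such as 1234 or -1234 are passed through unchanged.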
Alternative Embodiments
Alternative embodiments of the present invention may include embedding the steps of the method of the present invention into a Field Programmable Gate Array (FPGA) as is well-known in the art, or using a hardware description language such as VHDL to describe the method, thus hard coding the method in an application-specific integrated circuit (ASIC). The skill necessary to perform such embedding and hard coding is well-known to those of ordinary skill in the art.
While preferred embodiments are disclosed herein, many variations are possible which remain within the concept and scope of the invention, and these variations would become clear to one of ordinary skill in the art after perusal of the specification, drawings and claims herein.

Claims (7)

What is claimed is:
1. A method for optimally checking and reducing a signal to a specified upper or lower threshold signal in a computer system, the method comprising the steps of:
processing 16-bit signed data signals to obtain an intermediate signal having a 17th and 16th bit position, said 17th and 16th bit positions providing an overflow status;
initializing a first register, a second register, a third register, and a fourth register to hold mask values of 0000FFFF (hex), 7FFF0000 (hex), 80000000 (hex), and 0000FFFF (hex), respectively, said registers sequentially addressed so that they are four bytes apart from each other in an 8-bits per byte address space;
shifting said intermediate signal thirteen bit positions to the right and masking said shifted signal with a signal having a value C (hex) to obtain an offset signal;
providing a mask value to a first mask signal from one of said first, second, third, or fourth registers in response to said offset signal;
setting a second mask signal to have a value equal to said first mask signal that is shifted 16 bit positions to the right;
masking said intermediate signal by said first mask signal to obtain a masked result; and
turning all bits ON in said masked result that are in bit positions that correspond to bit positions in said second mask signal that are ON to obtain a reduced 16-bit signed data signal.
2. The method of claim 1, where said step of providing a mask signal includes the steps of:
setting said first mask signal to said mask value from said first register when said overflow status of said intermediate signal is equal to "00";
setting said first mask signal to said mask value from said second register when said overflow status of said intermediate signal is equal to "01";
setting said first mask signal to said mask value from said third register when said overflow status of said intermediate signal is equal to "10"; and
setting said first mask signal to said mask value from said fourth register when said overflow status of said intermediate signal is equal to "11".
3. A method of increasing the processing throughput of a processor, the method comprising the steps of:
processing 16-bit signed data signals to obtain an unsigned data signal having a 17th and 16th bit position, said 17th and 16th bit positions indicating a within range status when said 17th and 16th bit positions both have an ON state or an OFF state, said 17th and 16th bit positions indicating a positive overflow status when said 17th and 16th bit positions have an OFF and ON state, respectively, and said 17th and 16th bit positions indicating a negative overflow status when said 17th and 16th bit positions have an ON and OFF state, respectively;
setting a first mask signal to have a mask value having 16 lower bits in an ON position in response to a within range status;
setting said first mask signal to have a mask value having bit positions 31 through 17 in the ON position and bit positions 32 and 16 through 1 set to an OFF position in response to a positive overflow status;
setting said first mask signal to have a mask value having bit positions 32 in the ON position and bit positions 31 through 1 set to an OFF position in response to a negative overflow status;
setting a second mask signal to have a value equal to said first mask signal shifted 16 bit positions to the right;
bitwise ANDing said unsigned data signal with said first mask signal to obtain a translated data signal; and
bitwise ORing said translated data signal with said second mask signal.
4. A method of increasing the processing throughput of a processor, the method comprising the steps of:
processing signed data signals to obtain an intermediate signal having a 17th and 16th bit position, said 17th and 16th bit positions providing an overflow status;
providing a mask value to a first mask signal in response to said overflow status;
setting a second mask signal to have a value equal to said first mask signal that is shifted 16 bit positions to the right;
masking said intermediate signal by said first mask signal to obtain a masked result; and
turning all bits ON in said masked result that are in bit positions that correspond to bit positions in said second mask signal that are ON to obtain a reduced signed 16-bit data signal.
5. The method of claim 4, wherein said step of providing a mask value to said first mask signal includes the steps of:
initializing a first register, a second register, a third register, and a fourth register to hold mask values of 0000FFFF(hex), 7FFF0000(hex), 80000000(hex), and 0000FFFF(hex), respectively, said registers sequentially addressed so that they are four bytes apart from each other in an 8-bits per byte address space;
shifting said intermediate signal thirteen bit positions to the right and masking said shifted signal with a signal having a value C(hex) to obtain an offset signal; and
using said offset signal to select said mask value from one of said first, second, third, or fourth registers, said mask value for transfer to said first mask signal.
6. The method of claim 4, wherein said step of providing a mask value to said first mask signal includes the steps of:
initializing a first register, a second register, a third register, and a fourth register to hold mask values of 0000FFFF(hex), 7FFF0000(hex), 80000000(hex), and 0000FFFF(hex), respectively, said registers sequentially addressed so that they are four bytes apart from each other in an 8-bits per byte address space;
shifting said unsigned data signal fifteen bit positions to the right and masking said shifted signal with a signal having a value 3 (hex) to obtain an offset signal;
shifting said offset signal two positions to the left to obtain a shifted offset signal; and
using said offset signal to select a mask value from one of said first, second, third, or fourth registers, said mask value for transfer to said first mask signal.
7. A computer program for increasing the processing throughput of a processor, the program recorded in a computer-readable medium for causing a computer to perform the steps of:
processing data signals to obtain an intermediate signal having a 17th and 16th bit position, said 17th and 16th bit positions providing an overflow status;
initializing a first register, a second register, a third register, and a fourth register to hold mask values of 0000FFFF(hex), 7FFF0000(hex), 80000000(hex), and 0000FFFF(hex), respectively, said registers sequentially addressed so that they are four bytes apart from each other in an 8-bits per byte address space;
shifting said intermediate signal thirteen bit positions to the right and masking said shifted signal with a signal having a value C(hex) to obtain an offset signal;
using said offset signal to determine which of said first, second, third, and fourth registers to provide said mask values to said first mask signal;
setting a second mask signal to have a value equal to said first mask signal shifted 16 bit positions to the right;
masking said intermediate signal with said first mask signal to obtain a masked result signal; and
bitwise ORing said masked result signal with said second mask signal.
US08/881,720 1997-06-23 1997-06-23 Method for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow Expired - Lifetime US5870320A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US08/881,720 US5870320A (en) 1997-06-23 1997-06-23 Method for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow
US08/881,510 US5917740A (en) 1997-06-23 1997-06-24 Apparatus for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/881,720 US5870320A (en) 1997-06-23 1997-06-23 Method for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US08/881,510 Division US5917740A (en) 1997-06-23 1997-06-24 Apparatus for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow

Publications (1)

Publication Number Publication Date
US5870320A true US5870320A (en) 1999-02-09

Family

ID=25379056

Family Applications (2)

Application Number Title Priority Date Filing Date
US08/881,720 Expired - Lifetime US5870320A (en) 1997-06-23 1997-06-23 Method for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow
US08/881,510 Expired - Lifetime US5917740A (en) 1997-06-23 1997-06-24 Apparatus for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow

Family Applications After (1)

Application Number Title Priority Date Filing Date
US08/881,510 Expired - Lifetime US5917740A (en) 1997-06-23 1997-06-24 Apparatus for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow

Country Status (1)

Country Link
US (2) US5870320A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050210089A1 (en) * 2004-03-19 2005-09-22 Arm Limited Saturating shift mechanisms within data processing systems
WO2006122179A2 (en) * 2005-05-10 2006-11-16 Par Technologies Llc Fluid container with integrated valve
US20070286078A1 (en) * 2005-11-28 2007-12-13 International Business Machines Corporation Method and system for providing frame start indication in a memory system having indeterminate read data latency
US20080040571A1 (en) * 2004-10-29 2008-02-14 International Business Machines Corporation System, method and storage medium for bus calibration in a memory subsystem
US20080183977A1 (en) * 2007-01-29 2008-07-31 International Business Machines Corporation Systems and methods for providing a dynamic memory bank page policy
US7603526B2 (en) 2007-01-29 2009-10-13 International Business Machines Corporation Systems and methods for providing dynamic memory pre-fetch
US7610423B2 (en) 2004-10-29 2009-10-27 International Business Machines Corporation Service interface to a memory system
US7640386B2 (en) 2006-05-24 2009-12-29 International Business Machines Corporation Systems and methods for providing memory modules with multiple hub devices
US7669086B2 (en) 2006-08-02 2010-02-23 International Business Machines Corporation Systems and methods for providing collision detection in a memory system
US7721140B2 (en) 2007-01-02 2010-05-18 International Business Machines Corporation Systems and methods for improving serviceability of a memory system
US7870459B2 (en) 2006-10-23 2011-01-11 International Business Machines Corporation High density high reliability memory module with power gating and a fault tolerant address and command bus
US7934115B2 (en) 2005-10-31 2011-04-26 International Business Machines Corporation Deriving clocks in a memory system
US8140942B2 (en) 2004-10-29 2012-03-20 International Business Machines Corporation System, method and storage medium for providing fault detection and correction in a memory subsystem
US8296541B2 (en) 2004-10-29 2012-10-23 International Business Machines Corporation Memory subsystem with positional read data latency

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3790619B2 (en) * 1996-11-29 2006-06-28 松下電器産業株式会社 Processor capable of suitably performing rounding processing including positive value processing and saturation calculation processing
US6078940A (en) * 1997-01-24 2000-06-20 Texas Instruments Incorporated Microprocessor with an instruction for multiply and left shift with saturate
US6532486B1 (en) * 1998-12-16 2003-03-11 Texas Instruments Incorporated Apparatus and method for saturating data in register
US7580967B2 (en) * 2002-02-28 2009-08-25 Texas Instruments Incorporated Processor with maximum and minimum instructions
KR100493053B1 (en) * 2003-02-26 2005-06-02 삼성전자주식회사 Apparatus for carrying out a saturation of digital data
US20050286380A1 (en) * 2004-06-24 2005-12-29 Cirrus Logic, Inc. Digital adaptive hysteresis system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4945507A (en) * 1988-06-10 1990-07-31 Nec Corporation Overflow correction circuit
US5402368A (en) * 1992-12-10 1995-03-28 Fujitsu Limited Computing unit and digital signal processor using the same
US5508951A (en) * 1993-11-12 1996-04-16 Matsushita Electric Industrial Co., Ltd. Arithmetic apparatus with overflow correction means
US5539685A (en) * 1992-08-18 1996-07-23 Kabushiki Kaisha Toshiba Multiplier device with overflow detection function

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"The UltraSPARC Processor--Technology White Paper; The UltraSPARC Architecture," pp. 1-10, Copyright 1994-1997 Sun Microsystems, Inc., Palo Alto, CA.

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050210089A1 (en) * 2004-03-19 2005-09-22 Arm Limited Saturating shift mechanisms within data processing systems
US8589769B2 (en) 2004-10-29 2013-11-19 International Business Machines Corporation System, method and storage medium for providing fault detection and correction in a memory subsystem
US20080040571A1 (en) * 2004-10-29 2008-02-14 International Business Machines Corporation System, method and storage medium for bus calibration in a memory subsystem
US8296541B2 (en) 2004-10-29 2012-10-23 International Business Machines Corporation Memory subsystem with positional read data latency
US8140942B2 (en) 2004-10-29 2012-03-20 International Business Machines Corporation System, method and storage medium for providing fault detection and correction in a memory subsystem
US7610423B2 (en) 2004-10-29 2009-10-27 International Business Machines Corporation Service interface to a memory system
WO2006122179A2 (en) * 2005-05-10 2006-11-16 Par Technologies Llc Fluid container with integrated valve
US20060255064A1 (en) * 2005-05-10 2006-11-16 Par Technologies, Llc Fluid container with integrated valve
US20060264829A1 (en) * 2005-05-10 2006-11-23 Par Technologies, Llc Disposable fluid container with integrated pump motive assembly
WO2006122179A3 (en) * 2005-05-10 2007-09-27 Par Technologies Llc Fluid container with integrated valve
US7934115B2 (en) 2005-10-31 2011-04-26 International Business Machines Corporation Deriving clocks in a memory system
US8145868B2 (en) 2005-11-28 2012-03-27 International Business Machines Corporation Method and system for providing frame start indication in a memory system having indeterminate read data latency
US7685392B2 (en) 2005-11-28 2010-03-23 International Business Machines Corporation Providing indeterminate read data latency in a memory system
US8151042B2 (en) 2005-11-28 2012-04-03 International Business Machines Corporation Method and system for providing identification tags in a memory system having indeterminate data response times
US8327105B2 (en) 2005-11-28 2012-12-04 International Business Machines Corporation Providing frame start indication in a memory system having indeterminate read data latency
US8495328B2 (en) 2005-11-28 2013-07-23 International Business Machines Corporation Providing frame start indication in a memory system having indeterminate read data latency
US20070286078A1 (en) * 2005-11-28 2007-12-13 International Business Machines Corporation Method and system for providing frame start indication in a memory system having indeterminate read data latency
US7640386B2 (en) 2006-05-24 2009-12-29 International Business Machines Corporation Systems and methods for providing memory modules with multiple hub devices
US7669086B2 (en) 2006-08-02 2010-02-23 International Business Machines Corporation Systems and methods for providing collision detection in a memory system
US7870459B2 (en) 2006-10-23 2011-01-11 International Business Machines Corporation High density high reliability memory module with power gating and a fault tolerant address and command bus
US7721140B2 (en) 2007-01-02 2010-05-18 International Business Machines Corporation Systems and methods for improving serviceability of a memory system
US7606988B2 (en) 2007-01-29 2009-10-20 International Business Machines Corporation Systems and methods for providing a dynamic memory bank page policy
US7603526B2 (en) 2007-01-29 2009-10-13 International Business Machines Corporation Systems and methods for providing dynamic memory pre-fetch
US20080183977A1 (en) * 2007-01-29 2008-07-31 International Business Machines Corporation Systems and methods for providing a dynamic memory bank page policy

Also Published As

Publication number Publication date
US5917740A (en) 1999-06-29

Similar Documents

Publication Publication Date Title
US5870320A (en) Method for reducing a computational result to the range boundaries of a signed 16-bit integer in case of overflow
US5673407A (en) Data processor having capability to perform both floating point operations and memory access in response to a single instruction
US5487022A (en) Normalization method for floating point numbers
EP2241968B1 (en) System with wide operand architecture, and method
US6295599B1 (en) System and method for providing a wide operand architecture
US6573846B1 (en) Method and apparatus for variable length decoding and encoding of video streams
US5761103A (en) Left and right justification of single precision mantissa in a double precision rounding unit
US5630160A (en) Floating point exponent compare using repeated two bit compare cell
US6877020B1 (en) Method and apparatus for matrix transposition
US7548248B2 (en) Method and apparatus for image blending
US6693643B1 (en) Method and apparatus for color space conversion
US6697076B1 (en) Method and apparatus for address re-mapping
US7487338B2 (en) Data processor for modifying and executing operation of instruction code according to the indication of other instruction code
US7681013B1 (en) Method for variable length decoding using multiple configurable look-up tables
EP0465322A2 (en) In-register data manipulation in reduced instruction set processor
EP0463975A2 (en) Byte-compare operation for high-performance processor
EP0463973A2 (en) Branch prediction in high performance processor
JPH113226A (en) Visual instruction set for cpu having integrated graphics function
US7546442B1 (en) Fixed length memory to memory arithmetic and architecture for direct memory access using fixed length instructions
US5887181A (en) Method and apparatus for reducing a computational result to the range boundaries of an unsigned 8-bit integer in case of overflow
US5704052A (en) Bit processing unit for performing complex logical operations within a single clock cycle
EP2309382B1 (en) System with wide operand architecture and method
US7467287B1 (en) Method and apparatus for vector table look-up
US5502827A (en) Pipelined data processor for floating point and integer operation with exception handling
US5954786A (en) Method for directing a parallel processing computing device to form an absolute valve of a signed valve

Legal Events

Date Code Title Description
AS Assignment

Owner name: SUN MICROSYSTEMS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:VOLKONSKY, VLADIMIR YU;REEL/FRAME:008832/0349

Effective date: 19970826

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

REMI Maintenance fee reminder mailed

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12

AS Assignment

Owner name: ORACLE AMERICA, INC., CALIFORNIA

Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:ORACLE USA, INC.;SUN MICROSYSTEMS, INC.;ORACLE AMERICA, INC.;REEL/FRAME:037270/0148

Effective date: 20100212