US20050114419A1 - Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus - Google Patents
- Publication number
- US20050114419A1 (application No. US 10/676,051)
- Authority
- US
- United States
- Prior art keywords
- data
- discrete cosine
- input
- transformation
- processor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/147—Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/007—Transform coding, e.g. discrete cosine transform
Definitions
- the present invention relates to a discrete cosine transformation (DCT) apparatus and an inverse discrete cosine transformation (IDCT) apparatus which are often employed for compression and decompression of picture data and particularly to a discrete cosine transformation apparatus and an inverse discrete cosine transformation apparatus for allowing a two-dimensional transformation to be carried out in a one-dimensional transformation circuit.
- DCT discrete cosine transformation
- IDCT inverse discrete cosine transformation
- the discrete cosine transformation is generally used for video compression such as in a digital television broadcast system.
- Such a scheme of the circuit arrangement contributes to the scale down of the entire circuit size of an LSI, hence permitting the price to be reduced.
- one-dimensional processing is shifted to two-dimensional processing over every input of less than eight-point data, such as one-point (one pixel or one coefficient) unit or a two-point unit
- the register has a significant size substantially equal to the scale of a two-dimensional transformation circuit, hence failing to minimize the overall circuit size.
- FIG. 18 illustrates a related technique of switching each block of data between the one-dimensional processing and the two-dimensional processing with the use of an eight-point transformation processor which receives the data at a rate of two units of data per clock period and outputs two eight-point transformed data for every one clock period.
- the transposed output is enabled only after the four clock periods from the completion of input of one-dimensional transformed data. More specifically, the transformation of one block yields an invalid operation of four clock periods.
- the transposition memory has to be implemented by two-port RAM (random access memory) and its area size will hardly be reduced. Furthermore, the input and output are discontinuous from one block to another. For smoothing the operation at one data per clock period, the input and the output of the data require a memory size of 32 coefficients, respectively.
- FIG. 19 illustrates another related technique of switching each block between the one-dimensional processing and the two-dimensional processing with the use of a one-port RAM as the transposition memory, hence reducing the RAM area to a half.
- the start of the read is further delayed by four clock periods from that shown in FIG. 18. This extends the invalid operation per block to eight clock periods, thus lowering the operational efficiency.
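The cost of these invalid periods can be estimated with a little arithmetic. The sketch below is only an illustration: it assumes that one 8×8 block occupies the transformer for 64 valid clock periods (64 data, processed once in each of the two passes, at two data per clock period), a figure derived here from the stated data rate rather than quoted from the text. The function name is hypothetical.

```python
def operational_efficiency(valid_clocks: int, invalid_clocks: int) -> float:
    """Fraction of clock periods in which the transformer does useful work."""
    return valid_clocks / (valid_clocks + invalid_clocks)


# Assumption: 64 data per block, two passes, two data per clock period.
VALID_CLOCKS_PER_BLOCK = 2 * (64 // 2)

fig18 = operational_efficiency(VALID_CLOCKS_PER_BLOCK, 4)  # four invalid clocks per block
fig19 = operational_efficiency(VALID_CLOCKS_PER_BLOCK, 8)  # eight invalid clocks per block

print(f"FIG. 18 scheme: {fig18:.3f}")
print(f"FIG. 19 scheme: {fig19:.3f}")
```

Under that assumption, halving the RAM area in the FIG. 19 scheme is paid for with a lower share of useful clock periods than in the FIG. 18 scheme.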
- the input and the output are discontinuous from one block to another.
- the memory size of 32 coefficients may be required for the input and output operation, respectively.
- FIG. 20 illustrates yet another related technique of switching in every two blocks between the one-dimensional processing and the two-dimensional processing in order to eliminate the invalid operation period generated in processing every block.
- the transposition memory requires a memory capacity of two blocks since the one-dimensional processing and the two-dimensional processing are switched in every two blocks.
- the transposition memory may be implemented by a two-port type RAM hence increasing the memory area size to four times greater than that shown in FIG. 19 .
- the input and output of data are discontinuous on the basis of two blocks.
- the memory size of 64 coefficients may be needed for the input and output, respectively.
- the read and the write are executed at one time.
- the transposition RAM area will hardly be decreased or the operational efficiency will decline.
- a significant size of the data memory is required. More specifically, while the one-dimensional transformation circuit does not increase in size, the transposition memory may increase in size or its operational efficiency may decline.
- an orthogonal transformation apparatus such as a discrete cosine transformation apparatus or an inverse discrete cosine transformation apparatus
- a discrete cosine transformation apparatus comprising a transposition section which transposes input picture signal of N×N pixels between one-dimensional processing and two-dimensional processing, and a transformation section which subjects an output of the transposition section to a discrete cosine transformation.
- an inverse discrete cosine transformation apparatus comprising a transposition section which transposes input DCT coefficients of N×N in every N coefficients between one-dimensional processing and two-dimensional processing, and a transformation section which subjects an output of the transposition section to an inverse discrete cosine transformation.
- a discrete cosine transformation/inverse discrete cosine transformation apparatus comprising a single N-point transformation processor which switches in every N points between the one-dimensional processing and the two-dimensional processing to perform orthogonal transformation of N×N points.
- a discrete cosine transformation apparatus comprising an input processor which outputs data input one by one, at a rate of 2M data per clock period for M clock periods, an N-point transformation section which N-point transforms data input at the rate of 2M data per clock period from the input processor and outputs the transformed data at the rate of 2M data per clock period, an output processor which continuously outputs the one-dimensionally transformed data input at the rate of 2M data per clock period from the N-point transformation processor at the rate of 2M data per clock period for every N/2M clock periods while rounding N two-dimensionally transformed data input at the rate of 2M data per clock period in the succeeding N/2M clock periods, and a transposition processor which transposes N×N data input continuously at the rate of 2M data per clock period in every M clock periods and reading them continuously at the rate of 2M data per clock period in every M clock periods.
- the single eight-point transformation processor switches the one-dimensional processing and the two-dimensional processing alternately in every eight points to perform a discrete cosine transformation or an inverse discrete cosine transformation of 8×8 data, hence preventing its overall size from increasing and particularly reducing the circuit arrangement of its transposition RAM to a half.
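The idea of obtaining a two-dimensional transform from a single one-dimensional transformer plus a transposition can be illustrated in software. The following is a minimal sketch, not the circuit described here: it assumes the orthonormal DCT-II scaling (the apparatus may scale differently), works on whole blocks rather than interleaving eight points at a time, and all function names are hypothetical.

```python
import math

N = 8


def dct_1d(v):
    """Orthonormal N-point DCT-II of a sequence of length N (assumed scaling)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        acc = sum(v[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N))
        out.append(scale * acc)
    return out


def transpose(block):
    """Swap rows and columns of a square block."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    """Two-dimensional DCT built from the same one-dimensional routine used twice."""
    first_pass = [dct_1d(row) for row in block]                    # one-dimensional pass on rows
    second_pass = [dct_1d(row) for row in transpose(first_pass)]   # same routine on columns
    return transpose(second_pass)                                  # back to [vertical][horizontal]


flat = [[16.0] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # only the DC term is non-zero for a flat block
```

Because `dct_1d` is the only arithmetic routine, the transposition step is what turns a row transformer into a two-dimensional one, which is the role the transposition memory plays in hardware.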
- FIG. 1 is a block diagram showing a circuit arrangement of one embodiment of the present invention
- FIG. 2 is a diagram schematically showing control operation in the embodiment
- FIG. 3 is a block diagram showing a circuit arrangement of an input processor 1 in the embodiment
- FIGS. 4A and 4B are diagrams schematically showing a DCT processing operation of the input processor 1 in the embodiment
- FIGS. 5A and 5B are diagrams schematically showing an IDCT processing operation of the input processor 1 in the embodiment
- FIG. 6 is a block diagram showing a circuit arrangement of a one-dimensional DCT/IDCT processor 2 in the embodiment
- FIGS. 7A and 7B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment
- FIGS. 8A and 8B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention.
- FIGS. 9A and 9B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention.
- FIGS. 10A and 10B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention.
- FIGS. 11A and 11B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention.
- FIGS. 12A and 12B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention.
- FIGS. 13A and 13B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention.
- FIG. 14 is a block diagram showing a circuit arrangement of an output processor 3 in the embodiment of the present invention.
- FIG. 15 is a diagram showing a circuit arrangement of a round-off/maximum limiting section 33 a or 33 b in the embodiment of the present invention.
- FIGS. 16A and 16B are diagrams schematically showing an action of the round-off/maximum limiting section 33 a and 33 b in the embodiment of the present invention.
- FIG. 17 is a diagram showing a circuit arrangement of a transposition processor 4 in the embodiment.
- FIG. 18 is a diagram showing a first processing timing in the prior art
- FIG. 19 is a diagram showing a second processing timing in the prior art.
- FIG. 20 is a diagram showing a third processing timing in the prior art.
- FIG. 1 is a block diagram of a two-dimensional orthogonal transformation apparatus for carrying out both a DCT processing of 8×8 points and an IDCT processing of 8×8 points in a single eight-point transformation processor, showing one embodiment of the present invention.
- FIG. 2 schematically illustrates an operation of the apparatus.
- Table 1 illustrates an input sequence of an 8×8 pixel array which is input into the apparatus for DCT processing, where {x0, x1, ..., x6, x7} represent horizontal pixel positions and {y0, y1, ..., y6, y7} represent vertical pixel positions.
- Table 2 illustrates an output sequence of DCT transformed data (an 8×8 array of DCT coefficients) output from the apparatus, where {f0, f1, ..., f6, f7} represent horizontal frequency components and {g0, g1, ..., g6, g7} represent vertical frequency components.
- f0 and g0 are a horizontal DC component and a vertical DC component respectively.
- f7 and g7 are the largest horizontal frequency component and the largest vertical frequency component of the eight-point DCT respectively.
- Table 3 illustrates an input sequence of an 8×8 array of DCT coefficients which are input into the apparatus for the IDCT processing.
- Table 4 illustrates an output sequence of IDCT transformed data (an 8×8 array of pixels) output from the apparatus.
- An array of pixels to be subjected to DCT are input in the sequence shown in Table 1 at a rate of one data per clock period into an input terminal 100 of the two-dimensional orthogonal transformation apparatus.
- DCT coefficients are introduced in the sequence shown in Table 3 at a rate of one data per clock period to the input terminal 100 .
- An input processor 1 outputs data dti[11:0] input from the input terminal 100 by two units of data (ido[31:0]) in every clock period, as shown in FIG. 2 .
- the input processor 1 outputs the unit data for four clock periods and then, for the succeeding four clock periods, selectively outputs data (ido[31:0]) output as two units of data (rdo[31:0]) in every clock period from a transposition processor 4 .
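The alternation between the two data sources can be pictured as a repeating eight-clock pattern. The sketch below is only a timing illustration; the phase alignment (the pattern starting with the input terminal at clock 0) and the names used are assumptions, not taken from FIG. 2.

```python
def input_source(clock: int) -> str:
    """Which source feeds the eight-point processor in a given clock period.

    Assumed pattern: four clocks of fresh input data (one-dimensional pass),
    then four clocks of data read back from the transposition processor
    (two-dimensional pass), repeating every eight clocks.
    """
    return "input_terminal" if clock % 8 < 4 else "transposition"


timeline = [input_source(c) for c in range(16)]
for clock, source in enumerate(timeline):
    print(clock, source)
```

In this picture the eight-point processor is never idle: half of its cycles serve the first pass of a new row and the other half serve the second pass of data that has already been transposed.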
- when a one-dimensional DCT/IDCT processor 2 (i.e., an eight-point transformation processor in this embodiment) receives two units of data in every clock period, it outputs eight-point transformed data at a rate of two units of data per clock period.
- the delay between the input (ido) and the output (odi) of the processor is set to seven clocks.
- An output processor 3 outputs one-dimensional transformed data (odi[31:0]), which have been input at the rate of two units of data per clock period from the eight-point transformation processor 2, as rdi[31:0] at a rate of two units of data per clock period to the transposition processor 4 for four clock periods. Also, the output processor 3 rounds eight two-dimensional transformed data input as two units of data per clock period from the eight-point transformation processor 2 for the succeeding four clock periods and outputs them as dto[11:0] at a rate of one data per clock period from an output terminal 305, the total output being extended over eight clock periods.
- the transposition processor 4 transposes 64 units of data written by two units of data (rdi[31:0]) per clock period for four clock periods and outputs transposed data by two units of data per clock period for four clock periods. As shown in FIG. 2 , the data read out from the transposition memory is delayed by one clock period with respect to a readout control signal, hence allowing the write of rdi[31:0] and the read action of rdo[31:0] not to be executed at one time.
- a control processor 5 controls the action of the input processor 1 , the eight-point orthogonal transformation processor 2 , the output processor 3 , and the transposition processor 4 and generates an input/output interface control signal for the two-dimensional orthogonal transformation apparatus.
- the input/output interface control signal includes a signal dtack (an output terminal 501 ) and a signal dtosync (an output terminal 502 ) indicative of the head of output block data.
- the signal dtack does not limit the timing of starting the fetch of data input to the input terminal 100 when all the one-dimensional transformed data have been completely input to the eight-point orthogonal transformation processor 2; when they have not all been input, it limits the timing of starting the fetch of data input to the input terminal 100 to every eight clock periods.
- a one-port RAM of 64 data storage capacity can be employed as the transposition memory hence reducing the overall memory circuit size to a half.
- the eight-point orthogonal transformation processor 2 generates no invalid operation periods when the block data can be continuously input. If the block data can not be continuously input and there is a space of less than 64 clock periods between two units of block data, the timing of starting the input may be limited by eight clock periods. This generates an invalid operation duration of less than eight clock periods.
- the compression and decompression of picture data is commonly performed over a unit of six blocks and no actual drawback in the operation will be expected.
- FIG. 3 is a block diagram showing an arrangement example of the input processor 1 .
- FIGS. 4A and 4B are diagrams showing the timing of DCT processing in the input processor 1 .
- FIGS. 5A and 5B are diagrams showing the timing of IDCT processing in the input processor 1 .
- an input register 11 (dfa) fetches data dti[11:0] from the input terminal 100 in every clock period.
- a shifter 12 is a selector responsive to a control signal (dct) input from an input terminal 101. In the DCT processing, where the lower nine bits of the data are valid, it outputs the output of the register 11 shifted three bits to the left (the lower three bits being zeros); in the IDCT processing, it outputs the output of the register 11 directly, without bit shifting.
- a group of registers 13 a , 13 b , 13 c , and 13 d are responsive to a control signal (idfena) received from an input terminal 102 for updating the register output in each clock period and holding the data throughout five clock periods (as denoted by dfb, dfc, dfd, and dfe in FIGS. 4A to 5B).
- a selector 14 is responsive to a control signal (isela) input from an input terminal 103 for releasing the data held in the registers 13 a , 13 b , 13 c , and 13 d in a reverse of the input sequence (as denoted by sela in FIGS. 4A to 5 B).
- Selectors 15 a and 15 b are arranged responsive to a control signal (idfela) input from the input terminal 102 for selecting the output of the shifter 12 and the output of the selector 14 respectively in every four clock periods.
- a control signal idfela
- eight data input by one data per clock period from the input terminal 100 are output by two units of data per clock period in four clock periods.
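The serial-to-paired conversion can be sketched as follows. This is a hedged illustration only: it assumes that the first four samples are held (as in the registers 13a to 13d) and released in reverse order alongside the last four samples, which is the pairing a DCT butterfly would need. The actual lane assignment and ordering are set by control patterns not reproduced in this text, and the function name is hypothetical.

```python
def pair_samples(samples):
    """Turn eight serial samples into four pairs, one pair per clock period.

    Assumption: samples 0..3 are held and released last-in first-out while
    samples 4..7 arrive, giving the pairs (x4, x3), (x5, x2), (x6, x1), (x7, x0).
    """
    samples = list(samples)
    if len(samples) != 8:
        raise ValueError("expected exactly eight samples")
    held = samples[:4]        # models the holding registers
    incoming = samples[4:]    # arrives while the held values are released
    return [(new, held.pop()) for new in incoming]


print(pair_samples(range(8)))
```

Each pair contains the two samples x(i) and x(7-i), so an adder/subtractor placed after the input processor can form their sum and difference in the same clock period.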
- the transposition processor output data (rdo[31:0]) input from the input terminals 104 a and 104 b are output at the rate of two data per clock period (as denoted by selb[31:16] and selb[15:0] in FIGS. 4A to 5B).
- the output of the shifter 12 and the output of the selector 14 are shifted three bits to the left (the lower three bits being zeros) by the selectors 15 a and 15 b for one bit code expansion and output as 16-bits data.
- Selectors 16 a and 16 b are responsive to a control signal (iselc) input from an input terminal 105 for modifying the outputs of the selectors 15 a and 15 b so that the sequence is suitable for the arithmetic operation in the eight-point orthogonal transformation processor and outputting them as ido[31:0]. As shown in FIGS.
- control for selectively outputting the input from the transposition processor 4 is identical between the DCT processing and the IDCT processing while the control for selectively outputting the input from the input terminal 100 is different between the DCT processing and the IDCT processing.
- FIG. 6 is a block diagram showing an arrangement example of the eight-point orthogonal transformation processor 2 which comprises a DCT addition/subtraction processor 21 , a sum-of-products processor 22 for fixed multiply (16 bits input and 21 bits output), and an IDCT addition/subtraction processor 23 .
- the fixed multipliers used in the arrangement are classified into six different types as shown in Table 7. The total number is eight as each of the multipliers c2 and c6 is provided two units for the function of the DCT and IDCT processings.
- FIGS. 7A to 13B schematically illustrate an operation of DCT and IDCT processing of 8×8 data as switching between the two processings on the basis of a block.
- the DCT addition/subtraction processor 21 includes DFFs (D-type flip-flops) 21 a and 21 b connected to input terminals 200 a and 200 b , and adders 213 and 214 connected to the outputs of the two DFFs 21 a and 21 b respectively.
- the outputs of the DFFs 21 a and 21 b are also connected via an AND gate 215 and a NOR gate 216 to the adder 214 and the adder 213 , respectively.
- a control terminal 217 is connected directly to the adder 213 and the AND gate 215 and via an inverter 218 to the NOR gate 216 .
- DCT intermediate signals z(0), z(1), . . . , z(7) are generated and then output in the sequence shown in Table 9.
- DCT coefficients f(0), f(1), . . . , f(7) input from the input terminals 200 a and 200 b are directly output in the sequence as shown in Table 10.
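The purpose of an addition/subtraction stage in front of the multipliers can be shown with the standard even/odd decomposition of the eight-point DCT. The sketch below uses an unnormalized DCT-II and is an assumption about the general technique; the exact definition and ordering of the intermediate signals z(0) to z(7) are given in tables not reproduced here, and the function names are hypothetical.

```python
import math


def dct8_direct(x):
    """Unnormalized eight-point DCT-II computed directly (64 multiplications)."""
    return [sum(x[n] * math.cos((2 * n + 1) * k * math.pi / 16) for n in range(8))
            for k in range(8)]


def butterfly(x):
    """Addition/subtraction stage: sums and differences of mirrored samples."""
    sums = [x[i] + x[7 - i] for i in range(4)]
    diffs = [x[i] - x[7 - i] for i in range(4)]
    return sums, diffs


def dct8_via_butterfly(x):
    """Same transform using the butterfly first (32 multiplications)."""
    sums, diffs = butterfly(x)
    f = [0.0] * 8
    for k in range(8):
        half = sums if k % 2 == 0 else diffs   # even outputs use sums, odd use differences
        f[k] = sum(half[i] * math.cos((2 * i + 1) * k * math.pi / 16) for i in range(4))
    return f


sample = [52, 55, 61, 66, 70, 61, 64, 73]
print(max(abs(a - b) for a, b in zip(dct8_direct(sample), dct8_via_butterfly(sample))))
```

The decomposition works because cos((2(7-n)+1)kπ/16) equals (-1)^k cos((2n+1)kπ/16), so each output needs only four products. In the IDCT direction the additions and subtractions come after the products instead, which is consistent with the coefficients being passed through this stage unchanged.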
- the sum-of-products processor 22 includes first groups of DFFs 221 and 222 connected to the outputs of the adders 213 and 214 of the DCT addition/subtraction processor 21 and second groups of DFFs 223 and 224 .
- the DFFs 221 and 222 in the first group are connected one another in three steps.
- the DFFs 223 and 224 of the second groups include DFFs connected to the adders 213 and 214 respectively and the DFFs connected to the outputs of the DFFs of the first group.
- a control signal edfena is input to the DFFs 223 and 224 of the second groups.
- the DFFs 223 and 224 of the second group are selectively connected to selectors (MUX) 225 and 226 . More particularly, outputs of the DFF 223 are connected to all inputs of the selectors 225 while outputs of the DFF 224 are connected to three inputs of the selectors 226 .
- the output of the selector 225 is connected via a multiplier 227 to a DFF 229 .
- the output of the selector 226 is connected via a multiplier 228 to one of two inputs of a selector 230 and directly to the other input of the selector 230 .
- a control signal dctsel [1] is input to the selectors 230 a and 230 b
- a control signal dctsel [0] is input to the selectors 230 c and 230 d.
- the DFF 229 a of the DFFs 229 is connected via an OR gate 231 a to an adder 232 a .
- the DFF 229 b is connected directly to the adder 232 a .
- the DFF 229 c is connected via an OR gate 231 b to an adder 232 b while the DFF 229 d is connected directly to the adder 232 b.
- the output of the selector 230 is connected to an input of the DFF 233 .
- the DFF 233 b of the DFFs 233 is connected via an OR gate 234 a to an adder 235 a .
- the DFF 233 a is connected directly to the adder 235 a .
- the DFF 233 d is connected via an OR gate 234 b to an adder 235 b while the DFF 233 c is connected directly to the adder 235 b .
- the adder 235 a is connected directly to an adder 236 while the adder 235 b is connected via an OR gate 237 to the adder 236 .
- the adder 232 a is connected directly to an adder 238 while the adder 232 b is connected via an OR gate 239 to the adder 238 .
- the outputs of the adders 236 and 238 are connected via bit shifters (SFT) 241 and 240 to adder 243 and 242 , respectively.
- SFT bit shifters
- the input DCT intermediate signals z(0), z(1), . . . , z(7) are subjected to the sum-of-products operation shown in Table 11 and the results are output as f(0), f(1), . . . , f(7).
- the transformation results are output by inputting the DCT intermediate values into the multiplier as shown in Table 12.
- Table 13 illustrates a control example of selecting the registers for the transformation.
- the transformation intermediate signals are output by inputting the DCT coefficients f(0), f(1), . . . , f(7) into the corresponding multiplier as shown in Table 15.
- Table 16 illustrates a control example of selecting the registers for the transformation intermediate processing.
- the fixed multipliers are designed for converting 16-bit input to 21-bit output, and the selectors 230 a , 230 b , 230 c , and 230 d , which select between the input and the output of the fixed multiplier, output the fixed multiplier input data shifted four bits to the left (the lower four bits being zeros) with one-bit code expansion.
- Table 17 illustrates a definition example of control signals for selecting the registers.
- Tables 18 and 19 illustrate a control example of selecting the registers for the DCT and IDCT processings based on the definition.
- Table 20 shows a pattern of four clock periods of the register selection control signals for the DCT and IDCT processings.
- Table 21 illustrates a pattern of four clock periods of control signals for addition and subtraction and bit shift processing for the DCT and IDCT processings.
- 16-bit data produced by eliminating the lower six bits of the output of the adder are one-bit code expanded for the DCT processing and, for the IDCT processing, the elimination of the upper two bits and the lower three bits from the output of the adder yields 17-bit data.
- the adders 242 and 243 are round-off circuits for rounding off the 17-bit data input from the bit shifters 240 and 241 in the positive direction to eliminate the lower one bit and outputting resultant 16-bit data.
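Rounding in the positive direction while dropping one bit amounts to adding one and shifting right by one. The sketch below illustrates that arithmetic on Python integers; it does not model the 17-bit and 16-bit word widths or any overflow handling, and the function name is hypothetical.

```python
def round_drop_lsb(value: int) -> int:
    """Drop the lowest bit, rounding exact halves toward positive infinity.

    Python's >> is an arithmetic shift, so this also holds for negative values.
    """
    return (value + 1) >> 1


for v in (3, 4, -3, -4):
    print(v, "->", round_drop_lsb(v))
```

Note the asymmetry: 3 (that is, 1.5) rounds to 2, while -3 (that is, -1.5) rounds to -1. This is what distinguishes rounding in the positive direction from the symmetric rounding applied later in the output processor.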
- FIGS. 8A to 12B illustrate the timing of operation in the sum-of-products processor 22.
- TABLE 23
  edo[31:16]  f(0)  f(6)  f(2)  f(4)
  edo[15:0]   f(7)  f(1)  f(5)  f(3)
  odi[31:16]  f(0)  f(6)  f(2)  f(4)
  odi[15:0]   f(7)  f(1)  f(5)  f(3)
- the IDCT addition/subtraction processor 23 includes DFFs 251 and 252 connected to the outputs of the adders 242 and 243 of the sum-of-products processor 22 respectively, and adders 253 and 254 connected to the outputs of the DFFs 251 and 252 respectively. Also, the output of the DFF 251 is connected via an AND gate 255 to the adder 254 while the DFF 252 is connected via a NOR gate 256 to the adder 253 . A control signal idctl2d is input to the adder 253 and the AND gate 255 , and supplied via an inverter 257 to the NOR gate 256 .
- from the IDCT intermediate signals z(0), z(1), . . . , z(7), the operation shown in Table 22 generates real signals (of pixel data) x(0), x(1), . . . , x(7), which are the transformation results and are then output in the sequence shown in Table 24.
- when one of the inputs of the adder is controlled to zero, the input data f(0), f(1), . . . , f(7) are directly output in the sequence shown in Table 23.
- FIGS. 13A and 13B illustrate the timing of operation in the IDCT addition/subtraction processor 23 .
- FIG. 14 is a block diagram showing an arrangement example of the output processor 3 .
- FIGS. 16A and 16B illustrate the timing of operation in the output processor 3 .
- selectors 31 a and 31 b perform interchange of the data over four clock periods of the one-dimensional processing of data input from the input terminals 300 a and 300 b by two units of data per clock period to output the interchanged data as rdi[15:0] and rdi[31:16] to output terminal 306 a and 306 b . They also perform interchange of the data over another four clock periods of the two-dimensional transformation processing to output the interchanged data to registers 32 a and 32 b.
- the round-off/maximum limiting sections 33 a and 33 b perform the positive and negative symmetric rounding off and the maximum limiting for the two-dimensional processing result input every clock period via the registers 32 a and 32 b .
- Resultant data are output as odo[11:0] and odo[23:12].
- FIG. 15 illustrates a circuit example of the round-off/maximum limiting section 33 a or 33 b .
- a round processor 331 is responsive to a control signal (dct81d) input from an input terminal 302 for rounding the lower three bits of the two's-complement data input from an input terminal 33 i for the DCT processing, and for rounding the lower six bits of the data for the IDCT processing, thus outputting the upper 13 bits as b[12:0]. More specifically, the adder for rounding is shared over the upper bits between the DCT processing and the IDCT processing, which makes effective use of the operation bit width. In the DCT processing, the output is an integer of 13 bits.
- the lower three bits (b[2:0]) are output as invalid data in the decimal place.
- a maximum limiting section 332, when the data b[12:0] input from the round processor 331 is a negative value smaller than 1800h in hexadecimal notation, outputs 12-bit data of 800h. When the data b is a positive value greater than 07ffh, the section 332 outputs 12-bit data of 7ffh. Because the output of the round processor 331 is the upper portion of the bits, the maximum limiting section 332 performs the same operation for both the DCT processing and the IDCT processing.
- a bit shift processor 333 is responsive to a control signal (dct81d) input from the input terminal 302 for outputting the data output from the maximum limiting section 332 directly for the DCT processing, and for shifting the data output of the maximum limiting section 332 by three bits to the right (the upper three bits being code expanded) for the IDCT processing, from the output terminal 33 o.
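The two operations of this section, symmetric rounding and limiting to the 12-bit range 800h to 7ffh, can be sketched in a few lines. This is a simplified illustration of the DCT case only (three fraction bits rounded away); the IDCT path, which rounds six bits and shifts after limiting, is not modelled, and the exact tie-breaking rule of the circuit is an assumption. Function names are hypothetical.

```python
def round_symmetric(value: int, drop_bits: int) -> int:
    """Drop `drop_bits` low bits, rounding halves away from zero.

    Positive and negative inputs of equal magnitude round to results of
    equal magnitude (assumed meaning of 'symmetric rounding').
    """
    half = 1 << (drop_bits - 1)
    magnitude = (abs(value) + half) >> drop_bits
    return magnitude if value >= 0 else -magnitude


def saturate_12bit(value: int) -> int:
    """Limit to the signed 12-bit range, i.e. 800h (-2048) to 7ffh (2047)."""
    return max(-2048, min(2047, value))


def round_and_limit_dct(value: int) -> int:
    """DCT case: round away three fraction bits, then limit to 12 bits."""
    return saturate_12bit(round_symmetric(value, 3))


for v in (12, -12, 11, 40000, -40000):
    print(v, "->", round_and_limit_dct(v))
```

Applying both steps before the data are serialized means every downstream register only needs to hold 12 bits.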
- a group of registers 34 a , 34 b , 34 c , and 34 d are responsive to a control signal (odfena) input from an input terminal 303 for receiving output from the round-off/maximum limiting section 33 b and updating each register output in every clock period and saving the data for five clock periods (as denoted by dfb, dfc, dfd, and dfe in FIGS. 16A and 16B ).
- a selector 35 (selb) is a selector (selb shown in FIGS.
- a selector 36 is responsive to a control signal (odfena) input from the input terminal 303 for switching between the output of the round-off/maximum limiting section 33 a and the output of the selector 35 in every four clock periods to process eight data input by two units of data per clock period via the registers 32 a and 32 b for the succeeding four clock periods and outputting them by one data per clock period for eight clock periods via an output register 37 from an output terminal 305 (as selc[11:0] shown in FIGS. 16A and 16B ).
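The smoothing from two data per clock period to one data per clock period can be sketched as a parallel-to-serial conversion. The sketch assumes that one lane is passed straight through during the first four clock periods while the other lane is held and then released, in the order it was stored, during the next four; the real release order is set by a selector whose control pattern is not reproduced here. The function name is hypothetical.

```python
def serialize_pairs(pairs):
    """Four (lane_a, lane_b) pairs in, eight single values out.

    Assumption: lane_a values are output as they arrive; lane_b values are
    held (as in registers 34a to 34d) and output afterwards in stored order.
    """
    pairs = list(pairs)
    if len(pairs) != 4:
        raise ValueError("expected exactly four pairs")
    lane_a = [a for a, _ in pairs]
    lane_b = [b for _, b in pairs]   # held while lane_a is being output
    return lane_a + lane_b


print(serialize_pairs([(0, 4), (1, 5), (2, 6), (3, 7)]))
```

Only four values ever need to be held at once, which matches the use of four holding registers.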
- because the rounding off and the maximum limiting are carried out prior to smoothing of the output (one data per clock), the number of bits of registers can be reduced as compared with conducting the rounding off and the maximum limiting after the smoothing operation, hence minimizing the overall circuit arrangement.
- FIG. 17 is a block diagram showing an arrangement of the transposition processor 4 .
- because the data input by two units of data per clock period are also read out in units of two data, two RAMs of 16 bits by 32 words are employed so that the two RAM address controls (adra[4:0] and adrb[4:0]) can differ from each other.
- both the RAMs are of a one-port type and the write control signal wenan and the read control signal renan for the RAMs are common.
- the address order for writing the data (rdi[31:0]) input from the output processor 3 into the transposition RAM is the same in the DCT processing and the IDCT processing; the address orders shown in Tables 27 and 28 are used alternately every block. Also, the address order for reading the data from the transposition RAM is the same in the DCT processing and the IDCT processing; the address orders shown in Tables 29 and 30 are used alternately every block.
- the address control patterns are shown in Table 31.
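One way to see why alternating the address order every block lets a single one-block memory serve as the transposition buffer is the in-place scheme sketched below. This is an interpretation under stated assumptions, not the address tables of the embodiment: it uses one 64-word memory instead of two 16-bit by 32-word RAMs, handles one word at a time instead of two data per clock period, and reads and writes in the same step rather than in separate four-clock bursts. The class and method names are hypothetical.

```python
N = 8


class TransposeBuffer:
    """Single one-block memory reused for every block (illustrative model)."""

    def __init__(self):
        self.mem = [None] * (N * N)
        self.column_major = False   # address order, toggled after each block

    def _address(self, i):
        row, col = divmod(i, N)
        return col * N + row if self.column_major else i

    def exchange_block(self, block):
        """Store a new 64-entry block; return the previous block transposed."""
        out = []
        for i, value in enumerate(block):
            addr = self._address(i)
            out.append(self.mem[addr])   # read the old word first...
            self.mem[addr] = value       # ...then reuse the freed location
        self.column_major = not self.column_major
        return out


buf = TransposeBuffer()
block_a = list(range(64))
block_b = [100 + v for v in range(64)]
buf.exchange_block(block_a)                # first block: nothing valid to read yet
transposed_a = buf.exchange_block(block_b)
print(transposed_a[:8])                    # first row of the transpose of block_a
```

Each location is read just before it is overwritten, so a second block-sized memory is never needed. The price is that the address order must flip between row order and column order from one block to the next.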
- the present invention permits not only the operating circuit to be reduced to substantially a half in size but also the writing and reading of the transposition memory to be carried out exclusively within one block area of the transposition RAM, thereby reducing the transposition RAM area to a half.
- the registers of 4-word type can be used thus minimizing the overall circuit dimensions.
- the eight-point orthogonal transformation processor 2 inputs and outputs two units of data in every one clock period, it may equally handle four data per clock period with the one-dimensional processing and the two-dimensional processing switched from one to the other in every two clock periods.
Abstract
A discrete cosine transformation apparatus comprises a transposition section that transposes input picture signal of N×N pixels in every N pixels between the one-dimensional processing and the two-dimensional processing and a transformation section that subjects an output of the transposition section to a discrete cosine transformation.
Description
- This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 11-280673, filed on Sep. 30, 1999, the entire contents of which are incorporated herein by reference.
- The present invention relates to a discrete cosine transformation (DCT) apparatus and an inverse discrete cosine transformation (IDCT) apparatus which are often employed for compression and decompression of picture data and particularly to a discrete cosine transformation apparatus and an inverse discrete cosine transformation apparatus for allowing a two-dimensional transformation to be carried out in a one-dimensional transformation circuit.
- The discrete cosine transformation is generally used for video compression such as in a digital television broadcast system. Conventionally, the application of higher operating clock frequencies was not easy. As the operating clock in LSIs has successfully been shifted to higher frequencies, two-dimensional transformation is now feasible with the use of a single one-dimensional DCT or IDCT circuit operated two times for video compression/decompression of e.g. a high-definition TV system. Such a scheme of the circuit arrangement contributes to the scale down of the entire circuit size of an LSI, hence permitting the price to be reduced.
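The idea of obtaining a two-dimensional transformation by operating a single one-dimensional circuit twice can be sketched in software as a row-column decomposition. The Python below is a minimal sketch, not the patented circuit: the function names are hypothetical, and the orthonormal DCT-II scaling is an assumption that differs from the fixed multipliers used in the embodiment.

```python
import math

N = 8  # transform size used throughout the description


def dct_1d(x):
    """Orthonormal N-point DCT-II of one row (the scaling is an assumption)."""
    out = []
    for u in range(N):
        scale = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
        acc = sum(x[n] * math.cos((2 * n + 1) * u * math.pi / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def transpose(block):
    """Swap rows and columns of an N x N block."""
    return [list(col) for col in zip(*block)]


def dct_2d_row_column(block):
    """Run the same 1-D transform twice with a transposition in between.

    The result is left in transposed order (indexed [u][v], horizontal
    frequency first) because no transposition follows the second pass.
    """
    first_pass = [dct_1d(row) for row in block]   # one-dimensional processing
    transposed = transpose(first_pass)            # role of the transposition memory
    return [dct_1d(row) for row in transposed]    # two-dimensional processing
```

For a block of all ones, `dct_2d_row_column` returns 8 in the DC position and values that are numerically zero elsewhere.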
- However, when one-dimensional processing is shifted to two-dimensional processing over every input of less than eight-point data, such as one-point (one pixel or one coefficient) unit or a two-point unit, it is necessary to provide in the one-dimensional transformation circuit a register for saving the results of intermediate operation between the one-dimensional processing and the two-dimensional processing. The register has a significant size substantially equal to the scale of a two-dimensional transformation circuit, hence failing to minimize the overall circuit size.
- FIG. 18 illustrates a related technique of switching each block of data between the one-dimensional processing and the two-dimensional processing with the use of an eight-point transformation processor which receives the data at a rate of two units of data per clock period and outputs two eight-point transformed data for every one clock period. As the delay of output due to the arithmetic operation extends throughout substantially seven clock periods, the transposed output is enabled only after four clock periods from the completion of input of one-dimensional transformed data. More specifically, the transformation of one block yields an invalid operation of four clock periods. Also, as the write (output of one-dimensional transformed data) and the read (input of one-dimensional transformed data for two-dimensional transformation) are executed simultaneously in substantially four clock periods for every 68 clocks, the transposition memory has to be implemented by a two-port RAM (random access memory) and its area size can hardly be reduced. Furthermore, the input and output are discontinuous from one block to another. For smoothing the operation at one data per clock period, the input and the output of the data require a memory size of 32 coefficients, respectively.
- FIG. 19 illustrates another related technique of switching each block between the one-dimensional processing and the two-dimensional processing with the use of a one-port RAM as the transposition memory, hence reducing the RAM area to a half. For preventing the read and the write from occurring simultaneously on the transposition memory, the start of the read is further delayed by four clock periods from that shown in FIG. 18. This will extend the invalid operation per block to eight clock periods, thus lowering the operational efficiency. Similar to the operation shown in FIG. 18, the input and the output are discontinuous from one block to another. For smoothing the input and output data to one data per clock period, a memory size of 32 coefficients may be required for the input and output operation, respectively.
- FIG. 20 illustrates a further related technique of switching in every two blocks between the one-dimensional processing and the two-dimensional processing in order to eliminate the invalid operation period generated in processing every block. However, the transposition memory requires a memory capacity of two blocks since the one-dimensional processing and the two-dimensional processing are switched in every two blocks. Also, as the read and the write are executed at once, like the related technique shown in FIG. 18, the transposition memory may be implemented by a two-port type RAM, hence increasing the memory area size to four times greater than that shown in FIG. 19.
- In that case, the input and output of data are discontinuous on the basis of two blocks. For smoothing the input and output data to one data per clock period, a memory size of 64 coefficients may be needed for the input and output, respectively.
- When switching between the one-dimensional processing and the two-dimensional processing is conducted in every one block or every two blocks, the read and the write are executed at one time. As a result, the transposition RAM area can hardly be decreased or the operational efficiency declines. Also, to keep the input and output of data from being discontinuous, a data memory of significant size is required. More specifically, while the one-dimensional transformation circuit itself does not increase in size, the transposition memory may increase in size or the operational efficiency may decline.
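The trade-off among the three related techniques can be summarized with a small calculation. The sketch below assumes that a block needs 64 useful clocks (two passes of 64 data at two data per clock, which agrees with the 68-clock period quoted for FIG. 18) and takes the one-port, one-block RAM of FIG. 19 as the unit of area. The dictionary layout and names are illustrative only.

```python
# Useful clocks per 8x8 block: two passes of 64 data at two data per clock.
USEFUL_CLOCKS = 2 * (64 // 2)

# Invalid clocks per block and transposition memory, as stated in the text.
# "relative_area" takes the one-port, one-block RAM of FIG. 19 as 1.0.
related_techniques = {
    "FIG. 18": {"invalid_clocks": 4, "ports": 2, "blocks": 1, "relative_area": 2.0},
    "FIG. 19": {"invalid_clocks": 8, "ports": 1, "blocks": 1, "relative_area": 1.0},
    "FIG. 20": {"invalid_clocks": 0, "ports": 2, "blocks": 2, "relative_area": 4.0},
}


def efficiency(invalid_clocks: int) -> float:
    """Fraction of clocks in which the transformation processor does useful work."""
    return USEFUL_CLOCKS / (USEFUL_CLOCKS + invalid_clocks)


for name, t in related_techniques.items():
    print(f"{name}: efficiency {efficiency(t['invalid_clocks']):.3f}, "
          f"relative RAM area {t['relative_area']}")
```

Under these assumptions, no one of the three techniques achieves both full efficiency and the smallest memory, which is the problem the passage describes.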
- It is an object of the present invention to provide an orthogonal transformation apparatus, such as a discrete cosine transformation apparatus or an inverse discrete cosine transformation apparatus, in which the decline in operational efficiency can be minimized even when data blocks cannot be input at predetermined intervals and two-dimensional orthogonal transformation can be performed with the use of a small circuit arrangement.
- According to the present invention, there is provided a discrete cosine transformation apparatus comprising a transposition section which transposes input picture signal of N×N pixels between one-dimensional processing and two-dimensional processing, and a transformation section which subjects an output of the transposition section to a discrete cosine transformation.
- According to the present invention, there is provided an inverse discrete cosine transformation apparatus comprising a transposition section which transposes input DCT coefficients of N×N in every N coefficients between one-dimensional processing and two-dimensional processing, and a transformation section which subjects an output of the transposition section to an inverse discrete cosine transformation.
- According to the present invention, there is provided a discrete cosine transformation/inverse discrete cosine transformation apparatus comprising a single N-point transformation processor which switches in every N points between the one-dimensional processing and the two-dimensional processing to perform orthogonal transformation of N×N points.
- According to the present invention, there is provided a discrete cosine transformation apparatus comprising an input processor which outputs data input one by one, at a rate of 2M data per clock period for M clock periods, an N-point transformation section which N-point transforms data input at the rate of 2M data per clock period from the input processor and outputs the transformed data at the rate of 2M data per clock period, an output processor which continuously outputs the one-dimensionally transformed data input at the rate of 2M data per clock period from the N-point transformation processor at the rate of 2M data per clock period for every N/2M clock periods while rounding N two-dimensionally transformed data input at the rate of 2M data per clock period in the succeeding N/2M clock periods, and a transposition processor which transposes N×N data input continuously at the rate of 2M data per clock period in every M clock periods and reading them continuously at the rate of 2M data per clock period in every M clock periods.
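The relation between N, M, and the N/2M clock periods mentioned above can be checked with a one-line helper. This is only an arithmetic sketch; the function name is hypothetical.

```python
def clocks_per_n_points(n: int, m: int) -> int:
    """Clock periods needed to move N data at a rate of 2M data per clock."""
    data_per_clock = 2 * m
    if n % data_per_clock != 0:
        raise ValueError("N must be a multiple of 2M")
    return n // data_per_clock
```

With N = 8 and M = 1 (two data per clock) this gives four clock periods; with M = 2 (four data per clock) it gives two.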
- According to the present invention, the single eight-point transformation processor switches the one-dimensional processing and the two-dimensional processing alternately in every eight points to perform a discrete cosine transformation or an inverse discrete cosine transformation of 8×8 data, hence preventing its overall size from increasing and particularly reducing the circuit arrangement of its transposition RAM to a half.
- Additional objects and advantages of the present invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the present invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out hereinafter.
- The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate presently preferred embodiments of the present invention, and together with the general description given above and the detailed description of the preferred embodiments given below, serve to explain the principles of the invention;
- FIG. 1 is a block diagram showing a circuit arrangement of one embodiment of the present invention;
- FIG. 2 is a diagram schematically showing control operation in the embodiment;
- FIG. 3 is a block diagram showing a circuit arrangement of an input processor 1 in the embodiment;
- FIGS. 4A and 4B are diagrams schematically showing a DCT processing operation of the input processor 1 in the embodiment;
- FIGS. 5A and 5B are diagrams schematically showing an IDCT processing operation of the input processor 1 in the embodiment;
- FIG. 6 is a block diagram showing a circuit arrangement of a one-dimensional DCT/IDCT processor 2 in the embodiment;
- FIGS. 7A and 7B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment;
- FIGS. 8A and 8B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention;
- FIGS. 9A and 9B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention;
- FIGS. 10A and 10B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention;
- FIGS. 11A and 11B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention;
- FIGS. 12A and 12B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention;
- FIGS. 13A and 13B are diagrams schematically showing an action of the one-dimensional DCT/IDCT processor 2 in the embodiment of the present invention;
- FIG. 14 is a block diagram showing a circuit arrangement of an output processor 3 in the embodiment of the present invention;
- FIG. 15 is a diagram showing a circuit arrangement of a round-off/maximum limiting section;
- FIGS. 16A and 16B are diagrams schematically showing an action of the round-off/maximum limiting section;
- FIG. 17 is a diagram showing a circuit arrangement of a transposition processor 4 in the embodiment;
- FIG. 18 is a diagram showing a first processing timing in the prior art;
- FIG. 19 is a diagram showing a second processing timing in the prior art; and
- FIG. 20 is a diagram showing a third processing timing in the prior art.
FIG. 1 is a block diagram of a two-dimensional orthogonal transformation apparatus for carrying out both a DCT processing of 8×8 points and an IDCT processing of 8×8 points in a single eight-point transformation processor, showing one embodiment of the present invention. FIG. 2 schematically illustrates an operation of the apparatus.

TABLE 1
V \ H   x0  x1  x2  x3  x4  x5  x6  x7
y0       0   1   2   3   4   5   6   7
y1       8   9  10  11  12  13  14  15
y2      16  17  18  19  20  21  22  23
y3      24  25  26  27  28  29  30  31
y4      32  33  34  35  36  37  38  39
y5      40  41  42  43  44  45  46  47
y6      48  49  50  51  52  53  54  55
y7      56  57  58  59  60  61  62  63

TABLE 2
V \ H   f0  f1  f2  f3  f4  f5  f6  f7
g0       0   8  16  24  32  40  48  56
g1       1   9  17  25  33  41  49  57
g2       2  10  18  26  34  42  50  58
g3       3  11  19  27  35  43  51  59
g4       4  12  20  28  36  44  52  60
g5       5  13  21  29  37  45  53  61
g6       6  14  22  30  38  46  54  62
g7       7  15  23  31  39  47  55  63

- Table 1 illustrates an input sequence of an 8×8 pixel array which is input into the apparatus for DCT processing, where {x0, x1, . . . , x6, x7} represent horizontal pixel positions and {y0, y1, . . . , y6, y7} represent vertical pixel positions. Table 2 illustrates an output sequence of DCT transformed data (an 8×8 array of DCT coefficients) output from the apparatus, where {f0, f1, . . . , f6, f7} represent horizontal frequency components and {g0, g1, . . . , g6, g7} represent vertical frequency components. f0 and g0 are a horizontal DC component and a vertical DC component respectively. f7 and g7 are the largest horizontal frequency component and the largest vertical frequency component of the eight-point DCT respectively. Table 3 illustrates an input sequence of an 8×8 array of DCT coefficients which are input into the apparatus for the IDCT processing. Table 4 illustrates an output sequence of IDCT transformed data (an 8×8 array of pixels) output from the apparatus.

TABLE 3
V \ H   f0  f1  f2  f3  f4  f5  f6  f7
g0       0   8  16  24  32  40  48  56
g1       1   9  17  25  33  41  49  57
g2       2  10  18  26  34  42  50  58
g3       3  11  19  27  35  43  51  59
g4       4  12  20  28  36  44  52  60
g5       5  13  21  29  37  45  53  61
g6       6  14  22  30  38  46  54  62
g7       7  15  23  31  39  47  55  63

TABLE 4
V \ H   x0  x1  x2  x3  x4  x5  x6  x7
y0       0   1   2   3   4   5   6   7
y1       8   9  10  11  12  13  14  15
y2      16  17  18  19  20  21  22  23
y3      24  25  26  27  28  29  30  31
y4      32  33  34  35  36  37  38  39
y5      40  41  42  43  44  45  46  47
y6      48  49  50  51  52  53  54  55
y7      56  57  58  59  60  61  62  63

- An array of pixels to be subjected to DCT is input in the sequence shown in Table 1 at a rate of one data per clock period into an input terminal 100 of the two-dimensional orthogonal transformation apparatus. For the IDCT processing, DCT coefficients are introduced in the sequence shown in Table 3 at a rate of one data per clock period to the input terminal 100. An input processor 1 outputs data dti[11:0] input from the input terminal 100 by two units of data (ido[31:0]) in every clock period, as shown in FIG. 2. The input processor 1 outputs the unit data for four clock periods and then, for the succeeding four clock periods, selectively outputs as ido[31:0] the data output as two units of data (rdo[31:0]) in every clock period from a transposition processor 4.
- When a one-dimensional DCT/IDCT processor 2, i.e., an eight-point transformation processor in this embodiment, receives the two units of data in every one clock period, it outputs eight-point transformed data at a rate of two units of data per clock period. As shown in FIG. 2, the delay between the input (ido) and the output (odi) is set to seven clocks.
- An output processor 3 outputs one-dimensional transformed data (odi[31:0]), which have been input at the rate of two units of data per clock period from the eight-point transformation processor 2, as rdi[31:0] at a rate of two units of data per clock period to the transposition processor 4 for four clock periods. Also, the output processor 3 rounds eight two-dimensional transformed data, input as two units of data per clock period from the eight-point transformation processor 2 for the succeeding four clock periods, and outputs them as dto[11:0] at a rate of one data per clock period from an output terminal 305, the total output extending over eight clock periods.
- The transposition processor 4 transposes 64 units of data written by two units of data (rdi[31:0]) per clock period for four clock periods and outputs transposed data by two units of data per clock period for four clock periods. As shown in FIG. 2, the data read out from the transposition memory is delayed by one clock period with respect to a readout control signal, hence allowing the write of rdi[31:0] and the read of rdo[31:0] not to be executed at the same time.
- A control processor 5 controls the action of the input processor 1, the eight-point orthogonal transformation processor 2, the output processor 3, and the transposition processor 4 and generates an input/output interface control signal for the two-dimensional orthogonal transformation apparatus. The input/output interface control signal includes a signal dtack (an output terminal 501) and a signal dtosync (an output terminal 502) indicative of the head of output block data. The signal dtack does not limit the timing of starting the fetch of data input to the input terminal 100 when all the one-dimensional transformed data have been completely input to the eight-point orthogonal transformation processor 2, but limits that timing to every eight clock periods when the one-dimensional transformed data have not all been input to the eight-point orthogonal transformation processor 2.
- In this embodiment, as the write and the read of the transposition memory in the transposition processor 4 are not executed at the same time, a one-port RAM of 64 data storage capacity can be employed as the transposition memory, hence reducing the overall memory circuit size to a half. Also, the eight-point orthogonal transformation processor 2 generates no invalid operation periods when the block data can be continuously input. If the block data cannot be continuously input and there is a space of less than 64 clock periods between two units of block data, the timing of starting the input may be limited to every eight clock periods. This generates an invalid operation duration of less than eight clock periods. However, the compression and decompression of picture data is commonly performed over a unit of six blocks and no actual drawback in the operation is expected.
- More details of the components are now explained.
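The alternation between new input data and data read back from the transposition processor, four clock periods each, can be sketched as a simple schedule. The Python below is an idealized model under stated assumptions: it ignores the seven-clock latency of the transformation processor and any stalls signalled through dtack, and the function name and the returned tuple layout are hypothetical.

```python
def source_for_clock(t: int):
    """Return which stream feeds the eight-point processor at clock t.

    Clocks are grouped into slots of four. Even slots take a line of new
    input data (one-dimensional pass); odd slots take a line read back from
    the transposition processor (two-dimensional pass).
    Returns (source, line index within the block, clock within the slot).
    """
    slot, phase = divmod(t, 4)
    line = (slot // 2) % 8   # which of the eight lines of a block
    if slot % 2 == 0:
        return ("input", line, phase)
    return ("transposition", line, phase)
```

In this model, 64 consecutive clocks carry eight input lines and eight transposed lines, so the processor is never idle while blocks arrive back to back.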
-
FIG. 3 is a block diagram showing an arrangement example of theinput processor 1.FIGS. 4A and 4B are diagrams showing the timing of DCT processing in theinput processor 1.FIGS. 5A and 5B are diagrams showing the timing of IDCT processing in theinput processor 1. As shown inFIG. 3 , an input register 11 (dfa) fetches data dti[11:0] from theinput terminal 100 in every clock period. A shifter 12 (sft) is a selector arranged responsive to a control signal (dct) input from aninput terminal 101 for outputting the output of theregister 11 three bits to the left (the lower three bits being zeros) in the DCT processing, because the lower nine bits of the data are valid, or for directly outputting the output of theregister 11 in the IDCT processing without bit shifting. A group ofregisters input terminal 102 for updating the register output in each clock periods and holding the data throughout five clock periods (as denoted by dfb, dfc, dfd, and dfe inFIGS. 4A to 5B). A selector 14 (sela) is responsive to a control signal (isela) input from aninput terminal 103 for releasing the data held in theregisters FIGS. 4A to 5B). -
Selectors 15 a and 15 b are arranged responsive to a control signal (idfela) input from theinput terminal 102 for selecting the output of theshifter 12 and the output of theselector 14 respectively in every four clock periods. As a result, eight data input by one data per clock period from theinput terminal 100 are output by two units of data per clock period in four clock periods. In the succeeding four clock periods, the transposition processor output data (rdo[31:0]) input from theinput terminals FIGS. 4A to 5B). The output of theshifter 12 and the output of theselector 14 are shifted three bits to the left (the lower three bits being zeros) by theselectors 15 a and 15 b for one bit code expansion and output as 16-bits data.Selectors input terminal 105 for modifying the outputs of theselectors FIGS. 4A to 5B as well as Tables 5 and 6, the control for selectively outputting the input from thetransposition processor 4 is identical between the DCT processing and the IDCT processing while the control for selectively outputting the input from theinput terminal 100 is different between the DCT processing and the IDCT processing. -
FIG. 6 is a block diagram showing an arrangement example of the eight-point orthogonal transformation processor 2 which comprises a DCT addition/subtraction processor 21, a sum-of-products processor 22 for fixed multiply (16 bits input and 21 bits output), and an IDCT addition/subtraction processor 23. The fixed multipliers used in the arrangement are classified into six different types as shown in Table 7. The total number is eight, as each of the multipliers c2 and c6 is provided in two units for the function of the DCT and IDCT processings. FIGS. 7A to 13B schematically illustrate an operation of DCT and IDCT processing of 8×8 data as switching between the two processings on the basis of a block.

TABLE 7
Multiplier   Formula
c2           √2 cos(π/8)
c6           √2 sin(π/8)
c1           √2 cos(π/16)
c7           √2 sin(π/16)
c3           √2 cos(3π/16)
c5           √2 sin(3π/16)
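For reference, the six fixed multiplier values of Table 7 can be evaluated numerically. The sketch below only computes the formulas as written; the dictionary names are illustrative, and the instance counts reflect the statement that c2 and c6 are each provided twice.

```python
import math

SQRT2 = math.sqrt(2.0)

# Fixed multiplier values from Table 7.
MULTIPLIERS = {
    "c1": SQRT2 * math.cos(math.pi / 16),
    "c2": SQRT2 * math.cos(math.pi / 8),
    "c3": SQRT2 * math.cos(3 * math.pi / 16),
    "c5": SQRT2 * math.sin(3 * math.pi / 16),
    "c6": SQRT2 * math.sin(math.pi / 8),
    "c7": SQRT2 * math.sin(math.pi / 16),
}

# c2 and c6 are each instantiated twice, giving eight multipliers in total.
INSTANCES = {name: (2 if name in ("c2", "c6") else 1) for name in MULTIPLIERS}

for name, value in MULTIPLIERS.items():
    print(f"{name} = {value:.6f} (x{INSTANCES[name]})")
```

Each cosine/sine pair (c2 and c6, c1 and c7, c3 and c5) satisfies a² + b² = 2, which follows directly from the √2 scaling.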
TABLE 9
ido[31:16]     x(4)  x(2)  x(6)  x(0)
ido[15:0]      x(3)  x(5)  x(1)  x(7)
add0a[15:0]    z(4)  z(2)  z(6)  z(0)
add0b[15:0]    z(3)  z(5)  z(1)  z(7)

TABLE 10
ido[31:16]     f(4)  f(2)  f(6)  f(0)
ido[15:0]      f(3)  f(5)  f(1)  f(7)
add0a[15:0]    f(4)  f(2)  f(6)  f(0)
add0b[15:0]    f(3)  f(5)  f(1)  f(7)

- The DCT addition/subtraction processor 21 includes DFFs (D-type flip-flops) 21 a and 21 b connected to input terminals adders gate 215 and a NOR gate 216 to the adder 214 and the adder 213, respectively. A control terminal 217 is connected directly to the adder 213 and the AND gate 215 and via an inverter 218 to the NOR gate 216. - For the DCT processing in the DCT addition/
subtraction processor 21, for pixel data x(0), x(1), . . . , x(7) input from theinput terminals input terminals FIGS. 7A and 7B illustrate the timing of operation in the DCT addition/subtraction processor 21.TABLE 11 Intermediate signal DCT z(0) z(2) z(4) z(6) F(0) 1 +1 +1 +1 F(6) c6 +c2 −c6 −c2 F(2) c2 −c6 −c2 +c6 F(4) 1 −1 +1 −1 Intermediate signal DCT z(7) z(5) z(3) z(1) f(7) c7 +c3 +c1 +c5 f(1) c1 +c5 c7 c3 f(5) c5 +c7 −c3 +c1 f(3) c3 −c1 +c5 +c7 -
TABLE 12 Multiply coefficient 1 1 1 1 DCT c2 c6 c2 c6 ) f(0) z(2) +z(0) +( z(6) +z(4) ) f(6) z(2) +z(0) −( z(6) +z(4) ) f(2) z(0) −z(2) −( z(4) −z(6) ) f(4) z(0) −z(2) +( z(4) −z(6) ) Multiply coefficient DCT c5 c3 c1 c7 f(7) z(1) +z(5) +( z(3) +z(7) ) f(1) z(5) −z(1) +( z(7) −z(3) ) f(5) z(7) −z(3) +( z(1) +z(5) ) f(3) z(3) +z(7) −( z(5) −z(1) ) -
TABLE 13 Multiply coefficient 1 1 1 1 DCT c2 c6 c2 c(6) f(0) df5a df7a df6a df4a f(6) df5a df7a df6a df4a f(2) df7a df5a df4a df6a f(4) df7a df5a df4a df6a Multiply coefficient DCT c5 c3 c1 c7 f(7) df6b df5b df4b df7b f(1) df5b df6b df7b df4b f(5) df7b df4b df6b df5b f(3) df4b df7b df5b df6b -
TABLE 14 DCT Itermediate signal f(0) f(2) f(4) f(6) z(0) 1 +c2 +1 +c6 z(6) 1 +c6 −1 −c2 z(2) 1 −c6 −1 +c2 z(4) 1 −c2 +1 −c6 DCT Itermediate signal f(1) f(3) f(5) f(7) z(7) +c1 +c3 +c5 +c7 z(1) −c3 +c7 +c1 +c5 z(5) +c5 −c1 +c7 +c3 z(3) −c7 +c5 −c3 +c1 - For the DCT processing, the sum-of-
products processor 22 includes first groups ofDFFs adders subtraction processor 21 and second groups ofDFFs DFFs DFFs adders DFFs - The
DFFs DFF 223 are connected to all inputs of theselectors 225 while outputs of theDFF 224 are connected to three inputs of theselectors 226. The output of theselector 225 is connected via a multiplier 227 to a DFF 229. The output of theselector 226 is connected via amultiplier 228 to one of two inputs of a selector 230 and directly to the other input of the selector 230. A control signal dctsel [1] is input to theselectors 230 a and 230 b, and a control signal dctsel [0] is input to theselectors 230 c and 230 d. - The
DFF 229 a of the DFFs 290 is connected via an OR gate 231 a to anadder 232 a. TheDFF 229 b is connected directly to theadder 232 a. Similarly, theDFF 229 c is connected via anOR gate 231 b to anadder 232 b while theDFF 229 d is connected directly to theadder 232 b. - The output of the selector 230 is connected to an input of the DFF 233. The
DFF 233 b of the DFFs 233 is connected via anOR gate 234 a to anadder 235 a. TheDFF 233 a is connected directly to theadder 235 a. Similarly, theDFF 233 d is connected via anOR gate 234 b to anadder 235 b while the DFF 233 c is connected directly to theadder 235 b. Theadder 235 a is connected directly to anadder 236 while theadder 235 b is connected via anOR gate 237 to theadder 236. - The
adder 232 a is connected directly to anadder 238 while theadder 232 b is connected via anOR gate 239 to theadder 238. The outputs of theadders - For the DCT processing in the sum-of-
products processor 22, the input DCT intermediate signals z(0), z(1), . . . , z(7) are subjected to the sum-of-products operation shown in Table 11 and the results are output as f(0), f(1), . . . , f(7). As the multiply coefficients of the multipliers are fixed in this arrangement example, the transformation results are output by inputting the DCT intermediate values into the multiplier as shown in Table 12. Table 13 illustrates a control example of selecting the registers for the transformation. For the IDCT processing in the sum-of-products processor 22, the input DCT coefficients f(0), f(1), . . . , f(7) are subjected to the sum-of-products operation shown in Table 14 and the results are output as the transformation intermediate signals z(0), z(1), . . . , z(7). As the multiply coefficients of the multipliers are fixed in this arrangement, the transformation intermediate signals are output by inputting the DCT coefficients f(0), f(1), . . . , f(7) into the corresponding multiplier as shown in Table 15.TABLE 15 Multiply coefficient Intermediate signal 1 1 c2 c6 z(0) f(0) +f(4) +( f(2) +f(6) ) z(6) f(0) −f(4) −( f(6) −f(2) ) z(2) f(0) −f(4) +( f(6) −f(2) ) z(4) f(0) +f(4) −( f(2) +f(6) ) Multiply coefficient Intermediate signal c5 c3 c1 c7 z(7) f(5) +f(3) +( f(1) +f(7) ) z(1) f(7) −f(1) +( f(5) +f(3) ) z(5) f(1) +f(7) −( f(3) −f(5) ) z(3) f(3) −f(5) +( f(7) −f(1) ) -
TABLE 16 Multiply coefficient 1 1 c2 c6 Intermediate signal z(0) df7a df4a df5a df6a z(6) df7a df4a df6a df5a z(2) df7a df4a df6a df5a z(4) df7a df4a df5a df6a Multiply coefficient c5 c3 c1 c7 Intermediate signal z(7) df5a df4a df6a df7a z(1) df7a df6a df5a df4a z(5) df6a df7a df4a df5a z(3) df4a df5a df7a df6a - Table 16 illustrates a control example of selecting the registers for the transformation intermediate processing. Assuming that the fixed multipliers is designed for converting 16-bit input to 21-bit output and also the
selectors TABLE 17 Multiply coefficient 1 1 1 1 Select signal (c2) (c6) c2 c6 00 df7a df5a df5a df5a 01 df5a df7a 10 df4a df6a df6a 11 df4a df4a Multiply coefficient c6 c3 c1 c17 Select signal 00 df6b df5b df4b df7b 01 df5b df6b df7b df4b 10 df7b df4b df6b df5b 11 df4b df7b df5b df6b -
TABLE 18 Multiply coefficient 1 1 1 1 DCT (c2) (c6) c2 c6 f(0) 1 01 10 11 f(6) 1 01 10 11 f(2) 0 00 11 10 f(4) 0 00 11 10 Multiply coefficient c5 c3 c1 c7 DCT f(7) 00 00 00 00 f(1) 01 01 01 01 f(5) 10 10 10 10 f(3) 11 11 11 11 -
TABLE 19 Multiply coefficient 1 1 c2 c6 Intermediate signal z(0) 0 10 00 10 z(6) 0 10 10 00 z(2) 0 10 10 00 z(4) 0 10 00 10 Multiply coefficient c5 c3 c1 c7 Intermediate signal z(7) 01 10 10 00 z(1) 10 01 11 01 z(5) 00 11 00 10 z(3) 11 00 01 11 -
TABLE 20 8 point DCT 8 point IDCT timing 0 1 2 3 0 1 2 3 esela[2] 1 1 1 1 1 0 0 1 esela[1] 0 0 0 0 1 1 1 1 esela[0] 1 1 0 0 0 0 0 0 eselb[1] 1 1 1 1 0 1 1 0 eselb[0] 0 0 1 1 0 0 0 0 eselc[1] 0 0 1 1 0 1 0 1 eselc[0] 0 1 0 1 1 0 0 1 eseld[1] 0 0 1 1 1 0 1 0 eseld[0] 0 1 0 1 0 1 1 0 esele[2] 0 0 1 1 1 1 0 0 esele[1] 0 0 1 1 0 0 1 1 esele[0] 0 1 0 1 0 1 0 1 dctsel[1] 0 1 1 0 0 0 0 0 dctsel[0] 1 0 0 1 0 0 0 0 -
TABLE 21 8 Point DCT 8 Point IDCT Timing 0 1 2 3 0 1 2 3 suba[1] 0 1 1 0 0 1 0 1 suba[0] 0 0 1 1 0 1 1 0 subb[2] 0 0 0 1 0 0 1 0 subb[1] 0 1 1 0 0 1 0 1 subb[0] 0 1 0 1 0 0 1 1 dct11d 1 1 1 1 0 0 0 0 - Table 17 illustrates a definition example of control signals for selecting the registers. Tables 18 and 19 illustrate a control example of selecting the registers for the DCT and IDCT processings based on the definition. Table 20 shows a pattern of four clock periods of the register selection control signals for the DCT and IDCT processings. Table 21 illustrates a pattern of four clock periods of control signals for addition and subtraction and bit shift processing for the DCT and IDCT processings. In the
bit shifters adders bit shifters FIGS. 8A to 12B illustrate the timing of operation in the sum-of-products processor 22.TABLE 23 edo[31:16] f(0) f(6) f(2) f(4) edo[15:0] f(7) f(1) f(5) f(3) odi[31:16] f(0) f(6) f(2) f(4) odi[15:0] f(7) f(1) f(5) f(3) -
TABLE 24 edo[31:16] z(0) z(6) z(2) z(4) edo[15:0] z(7) z(1) z(5) z(3) odi[31:16] x(0) x(6) x(2) x(4) odi[15:0] x(7) x(1) x(5) x(3) - The IDCT addition/
subtraction processor 23 includesDFFs adders products processor 22 respectively, andadders DFFs DFF 251 is connected via an ANDgate 255 to theadder 254 while theDFF 252 is connected via a NORgate 256 to theadder 253. A control signal idctl2d is input to theadder 253 and the ANDgate 255, and supplied via aninverter 257 to the NORgate 256. - For the IDCT processing in the IDCT addition/
subtraction processor 23, the IDCT intermediate signals z(0), z(1), . . . , z(7) are generated, by the operation shown in Table 22, real signals (of pixel data) x(0), x(1), . . . , x(7) which are the transformation results and are then output in the sequence shown in Table 24. For the DCT processing, one of the inputs of the adder is controlled to zero, the input data f(0), f(1), . . . , f(7) are directly output in the sequence shown in Table 23.FIGS. 13A and 13B illustrate the timing of operation in the IDCT addition/subtraction processor 23. -
FIG. 14 is a block diagram showing an arrangement example of the output processor 3. FIGS. 16A and 16B illustrate the timing of operation in the output processor 3.
As shown in FIG. 14, selectors interchange the data input from input terminals 300a and 300b by two units of data per clock period and output the interchanged data as rdi[15:0] and rdi[31:16] to output terminal registers
The round-off/maximum limiting sections registers
FIG. 15 illustrates a circuit example of the round-off/maximum limiting section. A round processor 331 is responsive to a control signal (dct81d) input from an input terminal 302: it rounds the lower three bits of the two's-complement data input from an input terminal 33i for the DCT processing, and the lower six bits for the IDCT processing, and outputs the upper 13 bits as b[12:0]. More specifically, the adder for rounding is shared over the upper bits between the DCT processing and the IDCT processing, which makes effective use of the operation bit width. In the DCT processing, the output is a 13-bit integer. In the IDCT processing, the lower three bits (b[2:0]) are output as invalid data in the fractional part. A maximum limiting section 332 outputs 800h as 12-bit data when the data b[12:0] input from the round processor 331 is a negative value smaller than 1800h in hexadecimal notation, and outputs 7ffh as 12-bit data when the data b is a positive value greater than 07ffh. Because the output of the round processor 331 is the upper portion of the bits, the maximum limiting section 332 performs the same operation for both the DCT processing and the IDCT processing. A bit shift processor 333 is responsive to the control signal (dct81d) input from the input terminal 302: for the DCT processing it outputs the data from the maximum limiting section 332 directly, and for the IDCT processing it shifts that data three bits to the right (the upper three bits being sign-extended), delivering the result from the output terminal 33o.
A group of registers is responsive to a control signal input from an input terminal 303 for receiving the output from the round-off/maximum limiting section 33b, updating each register output in every clock period, and saving the data for five clock periods (as denoted by dfb, dfc, dfd, and dfde in FIGS. 16A and 16B). A selector 35 (selb in FIGS. 16A and 16B) outputs the data saved in the registers (FIGS. 16A and 16B) under the control signal input from the input terminal 304.
A selector 36 is responsive to a control signal (odfena) input from the input terminal 303 for switching between the output of the round-off/maximum limiting section 33a and the output of the selector 35 in every four clock periods, so that the eight data input at two units of data per clock period are delivered via an output register 37 from an output terminal 305 (as selc[11:0] shown in FIGS. 16A and 16B).
Because the rounding off and the maximum limiting are carried out prior to smoothing of the output (one data per clock), the registers need fewer bits than they would if the rounding off and the maximum limiting followed the smoothing operation, which reduces the overall circuit size.
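The round-off, maximum limiting, bit shift, and smoothing steps described above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function names `round_and_limit` and `smooth_two_lanes` are hypothetical, the rounding is assumed to be round-half-up by adding a constant before truncation, a Python integer stands in for the two's-complement input word, and the held lane is assumed to be read back in arrival order.

```python
def round_and_limit(value: int, is_dct: bool) -> int:
    """Model one signed sample passing through sections 331, 332 and 333.

    Assumptions (not stated in the text): round-half-up via an added
    constant, and a Python int standing in for the two's-complement word.
    """
    # Round processor: DCT rounds away the lower 3 bits, IDCT the lower 6.
    rounding_constant = 4 if is_dct else 32
    # In both modes the bits above bit 2 are kept as b[12:0]; for the IDCT
    # the low three bits of b are leftover fractional bits.
    b = (value + rounding_constant) >> 3
    # Maximum limiting to the 12-bit range 800h..7ffh (-2048..2047).
    b = max(-2048, min(2047, b))
    # Bit shift processor: pass through for DCT, arithmetic >> 3 for IDCT.
    return b if is_dct else b >> 3


def smooth_two_lanes(pairs):
    """Turn four (lane_a, lane_b) pairs into eight one-per-clock outputs.

    Lane a is forwarded immediately; lane b is held in registers and
    drained afterwards (drain order is an assumption).
    """
    held = []
    out = []
    for lane_a, lane_b in pairs:
        out.append(lane_a)   # first four clocks: direct path
        held.append(lane_b)  # captured in the register group
    out.extend(held)         # next four clocks: held data
    return out


if __name__ == "__main__":
    print(round_and_limit(20000, True))    # clipped DCT coefficient
    print(round_and_limit(32, False))      # IDCT sample with 6 fractional bits
    print(smooth_two_lanes([(0, 7), (6, 1), (2, 5), (4, 3)]))
```

Because limiting happens before the lanes are serialized, the held values in this model are already 12 bits wide, which mirrors the register-width saving the text describes.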
FIG. 17 is a block diagram showing an arrangement of the transposition processor 4. Because the data input at two units of data per clock period is also read out two units at a time, two RAMs of 16 bits by 32 words are employed, and the two RAM address controls (adra[4:0] and adrb[4:0]) differ from each other. However, since the write and the read are not executed simultaneously but are switched from one to the other in every four clock periods, both the RAMs are of a one-port type, and the write control signal wenan and the read control signal renan are common to the RAMs.
The address order for writing the data (rdi[31:0]) input from the output processor 3 into the transposition RAM is the same for the DCT processing and the IDCT processing; the address orders shown in Tables 27 and 28 are used alternately every block. Likewise, the address order for reading the data from the transposition RAM is the same for the DCT processing and the IDCT processing; the address orders shown in Tables 29 and 30 are used alternately every block. The address control patterns are shown in Table 31.
As set forth above, the present invention not only reduces the operating circuit to substantially half its size but also makes the timing of writing to and reading from the transposition memory mutually exclusive over one block area of the transposition RAM, thereby reducing the transposition RAM area to a half. For smoothing the input and output, registers of a 4-word type can be used, thus minimizing the overall circuit dimensions. When the single eight-point transformation processor carries out the operation at two pixels per clock period, the interval between block data inputs can be determined over one block in every eight clock periods or over two or more blocks in every one clock period, hence minimizing any decline in operational efficiency.
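The idea of reusing one eight-point transformation unit for both passes, with a transposition in between, can be illustrated by the following sketch. It is not the claimed circuit: it assumes an orthonormal DCT-II in floating point, ignores the two-data-per-clock timing and the RAM address orders, and the names `dct_1d`, `transpose`, and `dct_2d` are hypothetical.

```python
import math

N = 8


def dct_1d(x):
    """Orthonormal 8-point DCT-II of one row (assumed normalization)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(
            x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
            for n in range(N)
        )
        out.append(scale * acc)
    return out


def transpose(block):
    """Swap rows and columns, standing in for the transposition RAM."""
    return [list(col) for col in zip(*block)]


def dct_2d(block):
    """8x8 DCT using the same 1-D routine for both passes."""
    first_pass = [dct_1d(row) for row in block]
    # The transposed intermediate data is fed back into the same unit.
    second_pass = [dct_1d(row) for row in transpose(first_pass)]
    # Transpose once more so out[v][u] has v as the vertical frequency.
    return transpose(second_pass)


if __name__ == "__main__":
    flat = [[1.0] * N for _ in range(N)]
    coeffs = dct_2d(flat)
    print(round(coeffs[0][0], 6))  # only the DC term is non-zero
```

With a constant input block, all energy ends up in the DC coefficient, which gives a quick sanity check of the row pass, the transposition, and the column pass together.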
Although the eight-point orthogonal transformation processor 2 has been described as inputting and outputting two units of data in every clock period, it may equally handle four data per clock period with the one-dimensional processing and the two-dimensional processing switched from one to the other in every two clock periods.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the present invention in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.
Claims (16)
1. A discrete cosine transformation apparatus comprising:
a transposition section which transposes data between a one-dimensional processing and a two-dimensional processing in every N pixels of an input picture signal of N×N pixels to produce a transposed output; and
a transformation section which subjects the transposed output of the transposition section to a discrete cosine transformation.
2. The discrete cosine transformation apparatus according to claim 1, wherein the transposition section transposes the picture signal of 8×8 pixels in every eight pixels.
3. The discrete cosine transformation apparatus according to claim 2, further comprising:
an input processor which outputs data input in units of L, at a rate of 2M2 L data per clock period for 4(N/M)·N/2 L periods and which, for the succeeding 4(N/M) periods, selects and outputs data output at 2M data per clock period from the transposition section to the transformation section.
4. The discrete cosine transformation apparatus according to claim 3, wherein the transposition section has a transposition memory in which N×N data are written at the rate of 2M data per clock period for 4(N/M)·N/2 L periods, then transposed, and read out at the rate of 2M data per clock period for four clock periods.
5. The discrete cosine transformation apparatus according to claim 1, further comprising a control section which produces control signals including a first signal and a second signal, the first signal being for limiting in every N/M clock periods the timing of starting the fetch of data input at the input terminal when the input of all the one-dimensional transformed data to the transformation section is not completed, but not limiting the timing of starting the fetch of data input to the input terminal when all the one-dimensional transformed data is completely input to the transformation section, and the second signal being indicative of a head of output block data.
6. An inverse discrete cosine transformation apparatus comprising:
a transposition section which transposes input DCT coefficients of N×N in every N coefficients between one-dimensional processing and two-dimensional processing; and
a transformation section which subjects an output of the transposition section to an inverse discrete cosine transformation.
7. An inverse discrete cosine transformation apparatus according to claim 6, wherein the transposition section transposes the picture signal of 8×8 pixels in every eight pixels.
8. An inverse discrete cosine transformation apparatus according to claim 6, further comprising:
an input processor which outputs first data input one by one, at a rate of two units of data per clock period for four clock periods and which, for the succeeding four clock periods, selects and outputs second data output at two units of data per clock period from the transposition section to the transformation section.
9. A discrete cosine transformation/inverse discrete cosine transformation apparatus comprising:
a single N-point transformation processor which switches in every N points between the one-dimensional processing and the two-dimensional processing to perform orthogonal transformation of N×N points.
10. A discrete cosine transformation/inverse discrete cosine transformation apparatus according to claim 9, wherein the N-point transformation processor incorporates a single eight-pixel transformation processor which switches in every eight pixels between one-dimensional processing and two-dimensional processing to perform orthogonal transformation of 8×8 pixels.
11. (canceled)
12. A discrete cosine transformation/inverse discrete cosine transformation apparatus according to claim 10, wherein the eight-pixel transformation processor comprises a first addition/subtraction processor for the discrete cosine transformation, a sum-of-products processor and a second addition/subtraction processor for the inverse discrete cosine transformation,
the first addition/subtraction processor includes a section which generates a discrete cosine intermediate signal of the pixel data input from the input terminal for the discrete cosine transformation and directly outputs the discrete cosine coefficients input from the input terminal with one of inputs of the adder controlled to zero for the inverse cosine transformation,
the sum-of-products processor includes a section which subjects the input discrete cosine intermediate signal to a sum-of-products operation to output a transformed result for the discrete cosine transformation and subjects the input discrete cosine transform coefficients to a sum-of-products operation to output a transformation intermediate signal, and
the second addition/subtraction processor includes a section which generates a real signal as the transformed result from the inverse discrete cosine intermediate signal for the inverse discrete cosine transformation and directly outputs the input data with one of inputs of the adder controlled to zero for the discrete cosine transformation.
13. (canceled)
14. The discrete cosine transformation/inverse discrete cosine transformation apparatus according to claim 10, wherein the N-point transformation processor incorporates a single eight-coefficient transformation section configured to switch in every eight coefficients between the one-dimensional processing and the two-dimensional processing and subject 8×8 coefficients of data to a discrete cosine transformation or an inverse discrete cosine transformation.
15. A discrete cosine transformation apparatus comprising:
an input processor which outputs data input one by one, at a rate of L data per clock period for M clock periods;
an N-point transformation processor which N-point-transforms the data input at the rate of L data per clock period from the input processor and outputs the transformed data at the rate of L data per clock period;
an output processor which continuously outputs the one-dimensionally transformed data input at the rate of L data per clock period from the N-point transformation processor at the rate of L data per clock period for every M clock periods while rounding N two-dimensionally transformed data input at the rate of L data per clock period in the succeeding M clock periods; and
a transposition section which transposes N×N data input continuously at the rate of L data per clock period in every M clock periods and reading them continuously at the rate of L data per clock period in every M clock periods.
16. A discrete cosine transformation apparatus according to claim 15, wherein the input processor outputs the data at a rate of two data per clock period, and the N-point transformation processor eight-point transforms the data received at the rate of two data per clock period from the input processor and outputs the transformed data at the rate of two data per clock period.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/676,051 US20050114419A1 (en) | 1999-09-30 | 2003-10-02 | Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP28067399A JP3934290B2 (en) | 1999-09-30 | 1999-09-30 | Discrete cosine transform processing device, inverse discrete cosine transform processing device, discrete cosine transform processing device, and inverse discrete cosine transform processing device |
JP11-280673 | 1999-09-30 | ||
US09/664,573 US6732131B1 (en) | 1999-09-30 | 2000-09-18 | Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus |
US10/676,051 US20050114419A1 (en) | 1999-09-30 | 2003-10-02 | Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/664,573 Continuation US6732131B1 (en) | 1999-09-30 | 2000-09-18 | Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
US20050114419A1 true US20050114419A1 (en) | 2005-05-26 |
Family
ID=17628345
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/664,573 Expired - Fee Related US6732131B1 (en) | 1999-09-30 | 2000-09-18 | Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus |
US10/676,051 Abandoned US20050114419A1 (en) | 1999-09-30 | 2003-10-02 | Discrete cosine transformation apparatus, inverse discrete cosine transformation apparatus, and orthogonal transformation apparatus |
Country Status (2)
Country | Link |
---|---|
US (2) | US6732131B1 (en) |
JP (1) | JP3934290B2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7054897B2 (en) * | 2001-10-03 | 2006-05-30 | Dsp Group, Ltd. | Transposable register file |
US20030126171A1 (en) * | 2001-12-13 | 2003-07-03 | Yan Hou | Temporal order independent numerical computations |
TWI224931B (en) * | 2003-07-04 | 2004-12-01 | Mediatek Inc | Scalable system for inverse discrete cosine transform and method thereof |
US20070009166A1 (en) * | 2005-07-05 | 2007-01-11 | Ju Chi-Cheng | Scalable system for discrete cosine transform and method thereof |
JP5097138B2 (en) * | 2009-01-15 | 2012-12-12 | シャープ株式会社 | Arithmetic circuit and encryption circuit for Montgomery multiplication |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5471412A (en) * | 1993-10-27 | 1995-11-28 | Winbond Electronic Corp. | Recycling and parallel processing method and apparatus for performing discrete cosine transform and its inverse |
US5550765A (en) * | 1994-05-13 | 1996-08-27 | Lucent Technologies Inc. | Method and apparatus for transforming a multi-dimensional matrix of coefficents representative of a signal |
US5610849A (en) * | 1995-06-23 | 1997-03-11 | United Microelectronics Corporation | Real time two-dimensional discrete cosine transform/inverse discrete cosine transform circuit |
US5668748A (en) * | 1995-04-15 | 1997-09-16 | United Microelectronics Corporation | Apparatus for two-dimensional discrete cosine transform |
US5737256A (en) * | 1994-10-13 | 1998-04-07 | Fujitsu Limited | Inverse discrete cosine transform apparatus |
USRE36183E (en) * | 1989-07-03 | 1999-04-06 | Sgs-Thomson Microelectronics S.A. | System for rearranging sequential data words from an initial order to an arrival order in a predetermined order |
US6195674B1 (en) * | 1997-04-30 | 2001-02-27 | Canon Kabushiki Kaisha | Fast DCT apparatus |
US6237012B1 (en) * | 1997-11-07 | 2001-05-22 | Matsushita Electric Industrial Co., Ltd. | Orthogonal transform apparatus |
US6327602B1 (en) * | 1998-07-14 | 2001-12-04 | Lg Electronics Inc. | Inverse discrete cosine transformer in an MPEG decoder |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IT1207346B (en) * | 1987-01-20 | 1989-05-17 | Cselt Centro Studi Lab Telecom | DISCREET DISCREET COSE COEFFI CIRCUIT FOR THE CALCULATION OF THE QUANTITIES OF NUMERICAL SIGNAL SAMPLES |
JP2866754B2 (en) * | 1991-03-27 | 1999-03-08 | 三菱電機株式会社 | Arithmetic processing unit |
JP3697717B2 (en) * | 1993-09-24 | 2005-09-21 | ソニー株式会社 | Two-dimensional discrete cosine transform device and two-dimensional inverse discrete cosine transform device |
US5583803A (en) * | 1993-12-27 | 1996-12-10 | Matsushita Electric Industrial Co., Ltd. | Two-dimensional orthogonal transform processor |
US5805482A (en) * | 1995-10-20 | 1998-09-08 | Matsushita Electric Corporation Of America | Inverse discrete cosine transform processor having optimum input structure |
KR0182511B1 (en) | 1996-02-24 | 1999-05-01 | 김광호 | Two-dimensional inverse discrete cosine transform apparatus |
US5894430A (en) * | 1996-05-20 | 1999-04-13 | Matsushita Electric Industrial Co., Ltd. | Orthogonal transform processor |
US6295320B1 (en) * | 1997-12-31 | 2001-09-25 | Lg Electronics Inc. | Inverse discrete cosine transforming system for digital television receiver |
Also Published As
Publication number | Publication date |
---|---|
US6732131B1 (en) | 2004-05-04 |
JP2001102934A (en) | 2001-04-13 |
JP3934290B2 (en) | 2007-06-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |