US20050125480A1 - Method and apparatus for multiplying based on Booth's algorithm - Google Patents

Method and apparatus for multiplying based on Booth's algorithm

Publication number
US20050125480A1
Authority
US
United States
Prior art keywords
multiplier
booth
algorithm according
product
multiplying
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/986,095
Inventor
Ting-Kun Yeh
James Tsai
David Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Via Technologies Inc
Original Assignee
Via Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Via Technologies Inc
Priority to US10/986,095
Assigned to VIA TECHNOLOGIES, INC. Assignment of assignors' interest (see document for details). Assignors: TSAI, JAMES; WANG, DAVID; YEH, TING-KUN
Publication of US20050125480A1
Legal status: Abandoned

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H04N19/426: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements using memory downsizing methods
    • H04N19/427: Display on the fly, e.g. simultaneous writing to and reading from decoding memory


Abstract

A multiplying apparatus and method based on Booth's algorithm are disclosed. According to a multiplier index, one of several predetermined multiplier coefficient sets is chosen. Each multiplier coefficient set contains several multiplier coefficients that are generated from a predetermined multiplier value by Booth's algorithm. The multiplier coefficients are then used with a multiplicand to generate partial products according to Booth's algorithm. By summing all of the partial products, an output value is generated.

Description

    BACKGROUND OF THE PRESENT INVENTION
  • 1. Field of the Invention
  • The invention relates to an apparatus and method for multiplying, and more particularly, to an apparatus and method for multiplying based on Booth's algorithm.
  • 2. Description of the Prior Art
  • Discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) are used for data compression and data decompression respectively. One well-known DCT and IDCT technique is a Fast Fourier Transform (FFT) based on Lee's algorithm. FIG. 1A is a diagram of a shuffle-exchange circuit based on Lee's algorithm for the DCT. The DCT is divided into a first, a second, a third and a fourth stage computation, whereby eight parallel output values (Y0, Y1, . . . , Y7) are evaluated by the four stage computations from eight parallel input values (X0, X1, . . . , X7). There are two function blocks in FIG. 1A: DCT processor 1 and post-processor 2. DCT processor 1 is constructed from 12 similar process means 3 designed as butterfly circuits, and the post-processor 2, constructed from five adding means 4 and a fixed coefficient multiplication means 5, is connected after the DCT processor 1. Each process means 3 comprises an adder 31, a subtractor 32 and a fixed coefficient multiplier 5. Among the fixed coefficient multipliers 5 of all process means 3, four are labeled A, two B, two C, and one each D, E, F and G. The coefficients labeled A, B, C, D, E, F and G are $\frac{1}{2\cos(\pi/4)}$, $\frac{1}{2\cos(\pi/8)}$, $\frac{1}{2\cos(3\pi/8)}$, $\frac{1}{2\cos(\pi/16)}$, $\frac{1}{2\cos(3\pi/16)}$, $\frac{1}{2\cos(7\pi/16)}$ and $\frac{1}{2\cos(5\pi/16)}$ respectively. If the number of adders, subtractors and multipliers is of no concern, no control means is needed in FIG. 1A. Because its data-flow dependence requires no control means, the DCT can be designed as a data-flow architecture.
  • Corresponding to FIG. 1A, FIG. 1B is a diagram of an IDCT circuit based on Lee's algorithm. The IDCT is divided into a first, a second, a third and a fourth stage computation, whereby eight parallel output values (X0, X1, . . . , X7) are evaluated by the four stage computations from eight parallel input values (Z0, Z1, . . . , Z7). There are two function blocks in FIG. 1B: IDCT processor 7 and pre-processor 6. IDCT processor 7 is constructed from 12 similar process means 8 designed as butterfly circuits, and the pre-processor 6, constructed from five adding means 9 and a fixed coefficient multiplication means 10, is connected before the IDCT processor 7. Each process means 8 comprises an adder 81, a subtractor 82 and a fixed coefficient multiplier 10. Among the fixed coefficient multipliers 10 of all process means 8, four are labeled A, two B, two C, and one each D, E, F and G. The coefficients of the fixed coefficient multipliers 10 labeled A, B, C, D, E, F and G are the same as the corresponding coefficients in FIG. 1A. Another well-known DCT/IDCT algorithm is Chen's algorithm. For further details of Lee's algorithm and Chen's algorithm, see U.S. Pat. No. 5,452,466 and U.S. Pat. No. 5,841,682.
  • Generally speaking, a multiplier costs more space and computing time than an adder; in particular, the hardware cost of implementing a multiplier is much higher than that of implementing an adder. Therefore, most of the cost of the DCT and IDCT is spent on multiplication, and many improved multipliers have been applied to the DCT and IDCT. One of the improved multipliers is based on Booth's algorithm, details of which can be found in U.S. Pat. No. 5,485,413. By Booth's algorithm, referring to FIG. 2A, the multiplier is transformed in step 220 into a coefficient set comprising a plurality of multiplier coefficients. Step 240 then generates a plurality of partial products by multiplying a multiplicand with the coefficient set. Finally, in step 260, adding all partial products generates the product. A multiplier circuit can be designed according to this multiplication method. Referring to FIG. 2B, the multiplier 211 is transformed by a coefficient generation means 22 into the coefficient set 221, which contains a plurality of multiplier coefficients 222. Next, the partial products generation means 24 generates a plurality of partial products 242 according to the multiplier coefficients 222 and a multiplicand 212. Finally, the partial products 242 are added by the summating means 26 to generate the product 262 of the multiplier 211 and the multiplicand 212. Because the number of multiplier coefficients is less than the number of bits of the multiplier, the number of partial products 242 is smaller too. Therefore, cost and performance can be much improved.
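  • The conventional flow of steps 220, 240 and 260 can be sketched in software as follows. This is only an illustrative sketch: it assumes the common radix-4 (modified) Booth recoding with digits in {-2, -1, 0, 1, 2}, which the text does not specify, and the function names are hypothetical.

```python
def booth_radix4_digits(multiplier: int, bits: int) -> list[int]:
    """Recode a signed two's-complement multiplier of width `bits` into
    radix-4 Booth digits in {-2, -1, 0, 1, 2}, least significant first.

    Assumption: radix-4 recoding; the source only says "Booth's algorithm".
    """
    if bits % 2:
        bits += 1  # sign-extend to an even width
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        # Python ints act as sign-extended two's complement under >> and &.
        b0 = (multiplier >> i) & 1
        b1 = (multiplier >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplier: int, multiplicand: int, bits: int = 16) -> int:
    """Multiply via Booth digits, mirroring steps 220, 240 and 260."""
    digits = booth_radix4_digits(multiplier, bits)               # step 220
    partials = [(d * multiplicand) << (2 * i)                    # step 240
                for i, d in enumerate(digits)]
    return sum(partials)                                         # step 260
```

  • With this recoding, a 16-bit multiplier yields 8 coefficients and hence 8 partial products instead of 16, which is the saving the paragraph above describes.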
  • In the conventional technology, the DCT and IDCT use seven similar multipliers, but almost all of them involve many computation steps and therefore a high computing cost. The fewer computations the multiplication requires, the lower the cost.
  • SUMMARY OF THE PRESENT INVENTION
  • Accordingly, the present invention provides a method and apparatus for multiplying based on Booth's algorithm to simplify the computation by decreasing the multiplier coefficients.
  • The present invention provides a method for multiplying in which a multiplication coefficient set is chosen from a plurality of multiplication coefficient sets according to a multiplication index. Each multiplication coefficient set comprises a plurality of multiplication coefficients transformed from a determined multiplier. Multiplying the multiplication coefficients with a multiplicand then generates a plurality of partial products, and finally an output value is generated by summing all partial products.
  • The present invention also provides a multiplication apparatus, comprising: a coefficient generation means, in which one of a plurality of coefficient sets, each with a plurality of coefficients generated by Booth's algorithm, is chosen in accordance with a multiplier index; a partial products generation means, in which partial products are generated by multiplying the chosen coefficient set with a multiplicand; and a summing means for generating an output value by summing all partial products.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • A better understanding of the present invention can be obtained when the following Detailed Description is considered in conjunction with the following drawings, in which:
  • FIG. 1A and FIG. 1B are the structure block diagrams of the prior art;
  • FIG. 2A and FIG. 2B are the function block diagrams based on Booth's algorithm of one embodiment of the present invention;
  • FIG. 3 is the flowchart diagram of another embodiment of the present invention; and
  • FIG. 4 is the functional block diagram of a further embodiment of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
  • The feature of Booth's algorithm is to replace the multiplier with a plurality of multiplier coefficients, which are multiplied with the multiplicand to generate partial products; the product is then generated by summing all partial products. The present invention improves on this feature of Booth's algorithm in the specific condition that all of the possible multipliers are predetermined (that is, each possible multiplier is chosen from a fixed group). Namely, all possible multipliers are known. Therefore, each possible multiplier can be replaced with a multiplier index corresponding to a multiplier coefficient set with a plurality of multiplier coefficients. Accordingly, the corresponding multiplier coefficient set can be indexed directly, and the cost is reduced because, unlike in the prior art, no transformation is needed to generate the multiplier coefficients after the multiplier is determined. Besides, because all possible multipliers are determined, each of them can be identified with fewer bits. For example, if there are 8 possible multipliers, they can be identified by 3 bits, even if each multiplier has 16 bits with $2^{16}$ possible values.
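  • As an illustration of the indexing idea, the sketch below precomputes the coefficient sets for a fixed group of eight 16-bit multipliers, so that a 3-bit index suffices at run time. The multiplier values, the names and the radix-4 recoding are assumptions made for the example only; the text does not prescribe them.

```python
def booth_radix4_digits(multiplier: int, bits: int) -> list[int]:
    """Radix-4 Booth digits, least significant first (assumed recoding)."""
    if bits % 2:
        bits += 1
    digits, prev = [], 0
    for i in range(0, bits, 2):
        b0 = (multiplier >> i) & 1
        b1 = (multiplier >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


MULTIPLIER_BITS = 16

# Hypothetical fixed group of possible multipliers (signed 16-bit values).
POSSIBLE_MULTIPLIERS = [181, 362, 724, 1448, 2896, 5793, 11585, 23170]

# Bits needed to identify one member of the group: 3 for 8 multipliers.
INDEX_BITS = (len(POSSIBLE_MULTIPLIERS) - 1).bit_length()

# Coefficient sets are transformed once, ahead of time, not per multiplication.
COEFFICIENT_SETS = [booth_radix4_digits(m, MULTIPLIER_BITS)
                    for m in POSSIBLE_MULTIPLIERS]
```

  • Each entry of `COEFFICIENT_SETS` reconstructs its multiplier as the sum of digit times 4 to the power of its position, so no recoding logic is needed once the index is known.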
  • Moreover, the output may be part of the product rather than the whole product. For instance, the output value may take the integer part and some of the fractional places, or only some of the integer bits, of the product. For example, even if the product is represented by 40 bits, the values of all possible products may fall within 23 bits. In this case, outputting only 23 bits is sufficient.
  • Furthermore, if the output is the above-mentioned part of the product, the bits from the most significant bit down to the last bit to be output are assigned to a high bit set, and the rest of the product is assigned to a low bit set. In the summing, the high bit sets and the low bit sets of all partial products are summed into a high bit product and a low bit product respectively, wherein the low bit product comprises a carry out value, represented by the bits beyond those of the low bit set, to be summed with the high bit product. The sum of the high bit product and the carry out value can be the output. Accordingly, apart from the carry out value, the bits of the low bit product need not be kept, which lowers the cost.
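  • The split summation can be checked with a small sketch. It assumes the output is simply the product with its lowest `low_bits` bits dropped (for example, 17 low bits of a 40-bit product, leaving 23 bits); that choice and the function name are hypothetical.

```python
def sum_high_with_carry(partial_products: list[int], low_bits: int) -> int:
    """Sum high and low bit sets separately, keeping only the carry out
    of the low bit product.

    Assumption: the output is the full sum with its lowest `low_bits`
    bits discarded.
    """
    low_mask = (1 << low_bits) - 1
    # High bit product: sum of the high bit sets of all partial products.
    high_product = sum(p >> low_bits for p in partial_products)
    # Low bit product: sum of the low bit sets of all partial products.
    low_product = sum(p & low_mask for p in partial_products)
    # Carry out: the bits of the low bit product beyond the low bit set.
    carry_out = low_product >> low_bits
    return high_product + carry_out
```

  • Because Python's `>>` and `&` treat negative integers as sign-extended two's complement, the result equals `sum(partial_products) >> low_bits` for negative partial products as well, while the low-order result bits are never stored.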
  • The present invention can be used for integers, floating point values, fixed point values or other value types. Besides, the values can be represented in binary, nibble, decimal, hexadecimal and so on; the type of value and the way it is represented are not limited in the present invention.
  • Besides, a multiplier coefficient set can be chosen by indexing a lookup table. The lookup table records the correspondence between each multiplier index and its multiplier coefficient set and can be implemented in memories, state-latched circuits or other storage media. The multiplier index can serve as the address or as the control signals for indexing the multiplier coefficients of the corresponding multiplier coefficient set, and all of this can be integrated in a logic circuit. The lookup table is described only to make the manner of choosing the multiplier coefficients clear, not to confine the implementation of the present invention. The present invention does not limit how the multiplier coefficient set is chosen by the multiplier index.
  • Accordingly, referring to FIG. 3, one embodiment of the present invention is a method for multiplying based on Booth's algorithm. First, step 320 chooses the corresponding one of the determined multiplier coefficient sets according to a multiplier index. Each of the multiplier coefficient sets comprises a plurality of multiplier coefficients according to Booth's algorithm. Namely, all possible values of the multiplier are determined, and each possible value of the multiplier corresponds to a set of multiplier coefficients, transformed according to Booth's algorithm and indexed by a respective multiplier index. That is, each multiplier index corresponds to a set of multiplier coefficients. Besides, the values corresponding to different multiplier indexes may be the same.
  • Next, step 340 generates a plurality of partial products by multiplying the multiplier coefficients with a multiplicand according to Booth's algorithm.
  • Finally, step 360 sums all partial products to generate an output value. The output value can be the whole product of the multiplication or the above-mentioned part of the product. The other details of the present invention are described above and are not repeated here.
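  • Steps 320, 340 and 360 can be sketched as follows. The table contents are hypothetical: it holds hand-written radix-4 Booth digit sets (least significant first) for the multipliers 3, 7 and 10, and the radix-4 weighting of the partial products is likewise an assumption of this sketch.

```python
# Hypothetical lookup table: multiplier index -> precomputed Booth digits.
COEFFICIENT_TABLE = {
    0: [-1, 1],       # multiplier 3  = -1 + 1*4
    1: [-1, 2],       # multiplier 7  = -1 + 2*4
    2: [-2, -1, 1],   # multiplier 10 = -2 - 1*4 + 1*16
}


def choose_coefficient_set(multiplier_index: int) -> list[int]:
    """Step 320: choose the coefficient set by index, with no recoding."""
    return COEFFICIENT_TABLE[multiplier_index]


def generate_partial_products(coefficients: list[int],
                              multiplicand: int) -> list[int]:
    """Step 340: one partial product per coefficient, weighted by 4**i."""
    return [(c * multiplicand) << (2 * i)
            for i, c in enumerate(coefficients)]


def multiply_by_index(multiplier_index: int, multiplicand: int) -> int:
    """Step 360: sum all partial products to generate the output value."""
    coefficients = choose_coefficient_set(multiplier_index)
    return sum(generate_partial_products(coefficients, multiplicand))
```

  • For instance, `multiply_by_index(1, -6)` forms the partial products 6 and -48 and returns -42, the product of 7 and -6.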
  • Another embodiment of the present invention is an apparatus for multiplying based on Booth's algorithm, referring to FIG. 4, comprising a coefficient generation means 42, a partial product generation means 24 and a summing means 46. The coefficient generation means 42 chooses one of a plurality of coefficient sets to be a multiplier coefficient set 221, wherein each coefficient set comprises a plurality of multiplier coefficients 222 transformed from a determined multiplier based on Booth's algorithm. Next, the partial product generation means 24 generates the partial products 242 according to the multiplier coefficient set 221 and a multiplicand 212 based on Booth's algorithm. Finally, the summing means 46 sums all partial products 242 to generate an output value 463. As mentioned above, the output value can be generated from the product obtained by summing the high bit product 441 and the low bit product 442. The high bit product 441 and the low bit product 442 can be summed from the high bits 2421 and the low bits 2422 of the partial products 242 respectively, wherein the low bit product 442 comprises the foregoing carry out value 4421, and the output value 443 can further be generated according to the high bit product 441 and the carry out value 4421. The other details of the embodiment are described above and are not repeated here. Significantly, since the function of each of the coefficient generation means 42, the partial product generation means 24 and the summing means 46 is clear, anyone skilled in the art could implement them without difficulty. For example, the coefficient generation means 42 might be a hardware circuit, such as a combination of a multiplexer and some additional units; the partial product generation means might be a hardware circuit that combines several multipliers; and the summing means might be a hardware circuit that combines several adders.
  • Accordingly, a further embodiment of the present invention is an apparatus for multiplying based on Booth's algorithm in the DCT/IDCT, e.g. the fixed point multiplication means in Lee's algorithm. The multiplicands in the multiplications of Lee's algorithm are fixed point values, and the values can be based on cosine or sine values, e.g. $\frac{1}{2\cos(\pi/4)}$, $\frac{1}{2\cos(\pi/8)}$, $\frac{1}{2\cos(3\pi/8)}$, $\frac{1}{2\cos(\pi/16)}$, $\frac{1}{2\cos(3\pi/16)}$, $\frac{1}{2\cos(7\pi/16)}$ and $\frac{1}{2\cos(5\pi/16)}$. Moreover, the embodiment can also be the fixed point multiplication means in Chen's algorithm. Besides, the foregoing DCT/IDCT can further be applied in digital multimedia apparatuses, e.g. VCD players, DVD players, HDTV and so forth. The other details of the embodiment are described above and are not repeated here.
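  • A fixed group of DCT coefficients of this kind could be prepared as fixed point values as sketched below. The sketch reads the coefficients as 1/(2 cos θ), as in Lee's algorithm, and assumes 13 fractional bits; both the format and the names are hypothetical choices for illustration.

```python
import math

FRACTION_BITS = 13  # hypothetical fixed point format

# Angles of the seven coefficients labeled A to G in the text.
ANGLES = {
    "A": math.pi / 4,
    "B": math.pi / 8,
    "C": 3 * math.pi / 8,
    "D": math.pi / 16,
    "E": 3 * math.pi / 16,
    "F": 7 * math.pi / 16,
    "G": 5 * math.pi / 16,
}


def to_fixed(value: float, fraction_bits: int = FRACTION_BITS) -> int:
    """Quantize a real value to an integer with `fraction_bits` fraction bits."""
    return round(value * (1 << fraction_bits))


# Assumption: each coefficient is 1 / (2 * cos(angle)).
FIXED_COEFFICIENTS = {name: to_fixed(1.0 / (2.0 * math.cos(angle)))
                      for name, angle in ANGLES.items()}

# Seven possible values can be identified by a 3-bit index.
INDEX_BITS = (len(FIXED_COEFFICIENTS) - 1).bit_length()
```

  • Since only seven values ever occur, their Booth coefficient sets can be prepared in advance and selected by a 3-bit index rather than recoded for every multiplication.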
  • What is described above are only preferred embodiments of the invention and is not intended to confine the claims of the invention. Those who are familiar with the present technical field can understand the description above and put it into practice; therefore, any equal-effect variations or modifications made within the spirit disclosed by the invention should be included in the appended claims.

Claims (18)

1. A method for multiplying based on Booth's algorithm, comprising:
choosing a multiplier coefficient set according to a multiplier index, wherein said multiplier coefficient set comprises a plurality of multiplier coefficients transformed by Booth's algorithm according to a determined multiplier corresponding to said multiplier index;
generating a plurality of partial products by multiplying said multiplier coefficients with a multiplicand while using a Booth's algorithm; and
summing said partial products to generate an output value.
2. The method for multiplying based on Booth's algorithm according to claim 1, wherein said multiplier coefficients are indexed by said multiplier index in a lookup table, wherein said lookup table comprises the correspondent relations of a plurality of multiplier indexes and a plurality of multiplier coefficient sets.
3. The method for multiplying based on Booth's algorithm according to claim 1, wherein the sum of said partial products is a product of the multiplication of said determined multiplier and said multiplicand.
4. The method for multiplying based on Booth's algorithm according to claim 1, wherein said product is a set of binary bits and said output value is formed by partial bits of said product.
5. The method for multiplying based on Booth's algorithm according to claim 1, wherein each of said partial products is a set of binary bits comprising a high bit set and a low bit set, wherein the sum of said high bit sets of said partial products is a high bit product and said product is the sum of said high bit product and a carry out value consisting of the bits, other than those of said low bit set, in the sum of said low bit sets of said partial products.
6. The method for multiplying based on Booth's algorithm according to claim 1, wherein said output value is a sum of said high bit product and said carry out value.
7. The method for multiplying based on Booth's algorithm according to claim 1, wherein said multiplier index is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
8. The method for multiplying based on Booth's algorithm according to claim 1, wherein said multiplicand is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
9. An apparatus for multiplying based on Booth's algorithm, comprising:
a coefficient generation means for choosing one of a plurality of coefficient sets to be a multiplier coefficient set comprising a plurality of multiplier coefficients transformed by Booth's algorithm according to a determined multiplier corresponding to said multiplier index;
a partial product generation means for generating a plurality of partial products by multiplying said multiplier coefficients with a multiplicand; and
a summing means for summing said partial products to generate an output value.
10. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said multiplier coefficients are indexed by said multiplier index in a lookup table, wherein said lookup table comprises the correspondent relations of a plurality of multiplier indexes and a plurality of multiplier coefficient sets.
11. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein the sum of said partial products is a product of the multiplication of said determined multiplier and said multiplicand.
12. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said product is a set of binary bits and said output value is formed by partial bits of said product.
13. The apparatus for multiplying based on Booth's algorithm according to claim 12, wherein each of said partial products is a set of binary bits comprising a high bit set and a low bit set, wherein the sum of said high bit set of said partial products is a high bit product and said product is the sum of said high bit product and a carry out value that is the rest bits except said low bit set in the sum of said low bit set of said partial products.
14. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said output value is formed by partial bits of the sum of said high bit product and said carry out value.
15. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said multiplier index is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
16. The apparatus for multiplying based on Booth's algorithm according to claim 9, wherein said multiplicand is chosen from the floating point value and the fixed point value, and is identified by the following representations: binary, nibble, decimal and hexadecimal.
17. The apparatus for multiplying based on Booth's algorithm according to claim 9 is applied in discrete cosine transform/inverse discrete cosine transform and said multiplier index is a cosine value.
18. The apparatus for multiplying based on Booth's algorithm according to claim 17, wherein said cosine value is one of the following group comprising:
$\tfrac{1}{2}\cos(\pi/4)$, $\tfrac{1}{2}\cos(\pi/8)$, $\tfrac{1}{2}\cos(3\pi/8)$, $\tfrac{1}{2}\cos(\pi/16)$, $\tfrac{1}{2}\cos(3\pi/16)$, $\tfrac{1}{2}\cos(7\pi/16)$ and $\tfrac{1}{2}\cos(5\pi/16)$.
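
The claimed flow (look up a Booth-recoded coefficient set by multiplier index, form partial products against the multiplicand, sum them, and keep only the upper bits) can be illustrated with a minimal Python sketch. It is not taken from the patent. It assumes radix-4 Booth recoding, a 14-bit two's-complement multiplier, and 12 fractional bits for the cosine constants; none of these parameters are stated in the claims. The names `FRAC_BITS`, `WIDTH`, `ANGLES`, `COEFF_LUT`, `booth_radix4_digits`, `booth_multiply` and `truncated_output` are hypothetical.

```python
import math

FRAC_BITS = 12   # assumed fixed-point precision of the cosine constants
WIDTH = 14       # assumed multiplier width in bits (even, two's complement)

def booth_radix4_digits(value, width=WIDTH):
    """Recode a two's-complement integer into radix-4 Booth digits in {-2..2}.

    Digit i carries weight 4**i; the least significant digit comes first.
    """
    assert width % 2 == 0
    assert -(1 << (width - 1)) <= value < (1 << (width - 1))
    bits = value & ((1 << width) - 1)   # two's-complement bit pattern
    digits = []
    prev = 0                            # implicit bit to the right of bit 0
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

# Angles as multiples of pi, in the order listed in claim 18.
ANGLES = [1 / 4, 1 / 8, 3 / 8, 1 / 16, 3 / 16, 7 / 16, 5 / 16]

# Lookup table: multiplier index -> pre-recoded multiplier coefficient set.
COEFF_LUT = {
    index: booth_radix4_digits(
        round(0.5 * math.cos(a * math.pi) * (1 << FRAC_BITS)))
    for index, a in enumerate(ANGLES)
}

def booth_multiply(index, multiplicand):
    """Return (full product, list of partial products) for one LUT entry."""
    digits = COEFF_LUT[index]
    partial_products = [(d * multiplicand) << (2 * i)
                        for i, d in enumerate(digits)]
    return sum(partial_products), partial_products

def truncated_output(partial_products, low_bits=FRAC_BITS):
    """Sum the high bit sets and add the carry out of the summed low bit sets."""
    mask = (1 << low_bits) - 1
    high_product = sum(pp >> low_bits for pp in partial_products)
    carry_out = sum(pp & mask for pp in partial_products) >> low_bits
    return high_product + carry_out
```

Under these assumptions, `booth_multiply(0, 100)` returns a full product of 144800 (the constant for index 0 is 1448), and `truncated_output` on its partial products returns 35, the same value as shifting the full product right by `FRAC_BITS`. Because the coefficient sets are recoded once when the table is built, no Booth encoding is needed at multiplication time.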
US10/986,095 2003-12-03 2004-11-12 Method and apparatus for multiplying based on booth's algorithm Abandoned US20050125480A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/986,095 US20050125480A1 (en) 2003-12-03 2004-11-12 Method and apparatus for multiplying based on booth's algorithm

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US52629403P 2003-12-03 2003-12-03
US10/986,095 US20050125480A1 (en) 2003-12-03 2004-11-12 Method and apparatus for multiplying based on booth's algorithm

Publications (1)

Publication Number Publication Date
US20050125480A1 (en) 2005-06-09

Family

ID=34375610

Family Applications (6)

Application Number Title Priority Date Filing Date
US10/986,095 Abandoned US20050125480A1 (en) 2003-12-03 2004-11-12 Method and apparatus for multiplying based on booth's algorithm
US10/992,849 Abandoned US20050123046A1 (en) 2003-12-03 2004-11-22 Method and device for sharing MPEG frame buffers
US10/992,814 Abandoned US20050125475A1 (en) 2003-12-03 2004-11-22 Circuit sharing of MPEG and JPEG on IDCT
US11/001,636 Abandoned US20050152609A1 (en) 2003-12-03 2004-12-02 Video decoder
US11/000,885 Active 2026-10-13 US7558431B2 (en) 2003-12-03 2004-12-02 Method and system for discrete cosine transforms/inverse discrete cosine transforms based on pipeline architecture
US12/976,012 Abandoned US20110091124A1 (en) 2003-12-03 2010-12-22 System for multi-byte reading

Country Status (3)

Country Link
US (6) US20050125480A1 (en)
CN (7) CN100527071C (en)
TW (5) TWI227840B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI245548B (en) * 2004-10-20 2005-12-11 Inst Information Industry Method and device for video decoding
US8599841B1 (en) 2006-03-28 2013-12-03 Nvidia Corporation Multi-format bitstream decoding engine
US8593469B2 (en) 2006-03-29 2013-11-26 Nvidia Corporation Method and circuit for efficient caching of reference video data
TW200816787A (en) * 2006-09-25 2008-04-01 Sunplus Technology Co Ltd Method and system of image decoding and image recoding
CN101246468B (en) * 2007-02-13 2010-05-19 扬智科技股份有限公司 Modification type discrete cosine inverse transform method
CN101064515B (en) * 2007-04-18 2011-05-11 威盛电子股份有限公司 Method for enhancing decoding efficiency
US8477852B2 (en) * 2007-06-20 2013-07-02 Nvidia Corporation Uniform video decoding and display
CN100588254C (en) * 2007-06-28 2010-02-03 威盛电子股份有限公司 Back-discrete cosine inverting circuit
US8502709B2 (en) 2007-09-17 2013-08-06 Nvidia Corporation Decoding variable length codes in media applications
US9110849B2 (en) 2009-04-15 2015-08-18 Qualcomm Incorporated Computing even-sized discrete cosine transforms
US9117060B2 (en) * 2009-05-07 2015-08-25 Cadence Design Systems, Inc. System and method for preventing proper execution of an application program in an unauthorized processor
US9069713B2 (en) 2009-06-05 2015-06-30 Qualcomm Incorporated 4X4 transform for media coding
US9118898B2 (en) 2009-06-24 2015-08-25 Qualcomm Incorporated 8-point transform for media data coding
US9081733B2 (en) 2009-06-24 2015-07-14 Qualcomm Incorporated 16-point transform for media data coding
US9075757B2 (en) 2009-06-24 2015-07-07 Qualcomm Incorporated 16-point transform for media data coding
US9824066B2 (en) 2011-01-10 2017-11-21 Qualcomm Incorporated 32-point transform for media data coding
TW201609796A (en) 2013-12-13 2016-03-16 賽諾菲公司 Non-acylated EXENDIN-4 peptide analogues
KR102459917B1 (en) * 2015-02-23 2022-10-27 삼성전자주식회사 Image signal processor and devices having the same
CN105868554B (en) * 2016-03-28 2018-03-27 朱洲森 A kind of relay drainage method based on big data complex calculation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5452466A (en) * 1993-05-11 1995-09-19 Teknekron Communications Systems, Inc. Method and apparatus for preforming DCT and IDCT transforms on data signals with a preprocessor, a post-processor, and a controllable shuffle-exchange unit connected between the pre-processor and post-processor
US5485413A (en) * 1993-09-24 1996-01-16 Nec Corporation Multiplier utilizing the booth algorithm
US5748517A (en) * 1995-02-24 1998-05-05 Mitsubishi Denki Kabushiki Kaisha Multiplier circuit
US5781462A (en) * 1994-11-29 1998-07-14 Mitsubishi Denki Kabushiki Kaisha Multiplier circuitry with improved storage and transfer of booth control coefficients
US5841682A (en) * 1995-12-13 1998-11-24 Samsung Electronics Co., Ltd. Inverse discrete cosine transformation system using Lee's algorithm
US6943579B1 (en) * 2002-12-20 2005-09-13 Altera Corporation Variable fixed multipliers using memory blocks

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69131808T2 (en) * 1990-07-31 2000-03-16 Fujitsu Ltd Process and device for image data processing
US7095783B1 (en) * 1992-06-30 2006-08-22 Discovision Associates Multistandard video decoder and decompression system for processing encoded bit streams including start codes and methods relating thereto
US5603012A (en) * 1992-06-30 1997-02-11 Discovision Associates Start code detector
JP3546437B2 (en) 1993-03-31 2004-07-28 ソニー株式会社 Adaptive video signal processing unit
US5509129A (en) * 1993-11-30 1996-04-16 Guttag; Karl M. Long instruction word controlling plural independent processor operations
JPH07200539A (en) * 1993-12-28 1995-08-04 Matsushita Electric Ind Co Ltd Two-dimensional dct arithmetic unit
US5854757A (en) * 1996-05-07 1998-12-29 Lsi Logic Corporation Super-compact hardware architecture for IDCT computation
US6026217A (en) * 1996-06-21 2000-02-15 Digital Equipment Corporation Method and apparatus for eliminating the transpose buffer during a decomposed forward or inverse 2-dimensional discrete cosine transform through operand decomposition storage and retrieval
US6144771A (en) * 1996-06-28 2000-11-07 Competitive Technologies Of Pa, Inc. Method and apparatus for encoding and decoding images
JPH1079940A (en) * 1996-09-05 1998-03-24 Mitsubishi Electric Corp Image encoder
US6128340A (en) * 1997-03-14 2000-10-03 Sony Corporation Decoder system with 2.53 frame display buffer
TW364269B (en) 1998-01-02 1999-07-11 Winbond Electronic Corp Discreet cosine transform/inverse discreet cosine transform circuit
WO1999039303A1 (en) * 1998-02-02 1999-08-05 The Trustees Of The University Of Pennsylvania Method and system for computing 8x8 dct/idct and a vlsi implementation
JP2000125136A (en) * 1998-10-19 2000-04-28 Internatl Business Mach Corp <Ibm> Image data compression device and its method
US6507614B1 (en) * 1999-10-19 2003-01-14 Sony Corporation Efficient de-quantization in a digital video decoding process using a dynamic quantization matrix for parallel computations
CN1848941A (en) * 1999-12-15 2006-10-18 三洋电机株式会社 Image reproducing method and image processing method, and image reproducing device, image processing device, and television receiver capable of using the methods
TW502532B (en) * 1999-12-24 2002-09-11 Sanyo Electric Co Digital still camera, memory control device therefor, apparatus and method for image processing
US6675185B1 (en) * 2000-06-07 2004-01-06 International Business Machines Corporation Hybrid domain processing of multi-dimensional transformed data
JP3639517B2 (en) * 2000-10-04 2005-04-20 三洋電機株式会社 Moving picture decoding apparatus and moving picture decoding method
US7599434B2 (en) * 2001-09-26 2009-10-06 Reynolds Jodie L System and method for compressing portions of a media signal using different codecs

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090228540A1 (en) * 2008-03-05 2009-09-10 Nec Electronics Corporation Filter operation unit and motion-compensating device
US8364741B2 (en) * 2008-03-05 2013-01-29 Renesas Electronics Corporation Motion-compensating device with booth multiplier that reduces power consumption without increasing the circuit size
US20220171602A1 (en) * 2020-12-02 2022-06-02 Samsung Electronics Co., Ltd. Integrated circuit for constant multiplication and device including the same

Also Published As

Publication number Publication date
TW200520536A (en) 2005-06-16
US20050125469A1 (en) 2005-06-09
US20110091124A1 (en) 2011-04-21
CN100527071C (en) 2009-08-12
TWI233267B (en) 2005-05-21
CN100531393C (en) 2009-08-19
US20050152609A1 (en) 2005-07-14
CN1617594A (en) 2005-05-18
CN101060631A (en) 2007-10-24
US20050125475A1 (en) 2005-06-09
CN100539699C (en) 2009-09-09
CN1305313C (en) 2007-03-14
CN1595994A (en) 2005-03-16
CN1598876A (en) 2005-03-23
TWI295787B (en) 2008-04-11
TWI227840B (en) 2005-02-11
TW200520399A (en) 2005-06-16
US20050123046A1 (en) 2005-06-09
CN1282368C (en) 2006-10-25
CN1555199A (en) 2004-12-15
CN1630373A (en) 2005-06-22
US7558431B2 (en) 2009-07-07
CN1591319A (en) 2005-03-09
TW200519633A (en) 2005-06-16
TWI240560B (en) 2005-09-21
TW200520535A (en) 2005-06-16
TWI289992B (en) 2007-11-11

Similar Documents

Publication Publication Date Title
US20050125480A1 (en) Method and apparatus for multiplying based on booth's algorithm
CN106909970B (en) Approximate calculation-based binary weight convolution neural network hardware accelerator calculation device
Hu et al. An angle recoding method for CORDIC algorithm implementation
CN1892589B (en) Apparatus for performing multimedia application operation, system and method for implementing the operation
US7065543B2 (en) Apparatus and method for 2-D discrete transform using distributed arithmetic module
EP0353223B1 (en) Two-dimensional discrete cosine transform processor
US5815422A (en) Computer-implemented multiplication with shifting of pattern-product partials
US5737253A (en) Method and apparatus for direct digital frequency synthesizer
US20130262547A1 (en) Processor for performing multiply-add operations on packed data
US5951629A (en) Method and apparatus for log conversion with scaling
Toivonen et al. Video filtering with Fermat number theoretic transforms using residue number system
CA2230108C (en) An apparatus for performing multiply-add operations on packed data
US5941939A (en) Logarithm/inverse-logarithm converter and method of using same
WO1995033241A1 (en) High-speed arithmetic unit for discrete cosine transform and associated operation
Dimitrov et al. Loading the bases: A new number representation with applications
KR19980701803A (en) Log / Inverse Log Converter, Calculation Device and Log Value Generation Method
Vassiliadis et al. A general proof for overlapped multiple-bit scanning multiplications
US7020671B1 (en) Implementation of an inverse discrete cosine transform using single instruction multiple data instructions
Lim et al. A serial-parallel architecture for two-dimensional discrete cosine and inverse discrete cosine transforms
US6574649B2 (en) Efficient convolution method and apparatus
US20060106905A1 (en) Method for reducing memory size in logarithmic number system arithmetic units
US7606850B2 (en) Method and apparatus for providing a base-2 logarithm approximation to a binary number
US5477478A (en) Orthogonal transform processor
Nussbaumer New polynomial transform algorithms for multidimensional DFT's and convolutions
US5825420A (en) Processor for performing two-dimensional inverse discrete cosine transform

Legal Events

Date Code Title Description
AS Assignment

Owner name: VIA TECHNOLOGIES, INC., TAIWAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YEH, TING-KUN;TSAI, JAMES;WANG, DAVID;REEL/FRAME:015985/0121

Effective date: 20040427

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION