Embodiment
Fig. 1 is a structural diagram of a system having a tile-based FPGA (Field Programmable Gate Array) and its interconnect units. As shown in Fig. 1, the system comprises a configurable logic array, embedded multiply-accumulate units (MAC), embedded memory blocks (EMB), phase-locked loops (PLL), input/output (IO) and the like. Some systems-on-chip (SOC) may further comprise an embedded processor (ARM/8051/MIPS), code/data memory (SRAM/Flash) and the like.
The enlarged portion of the figure shows a typical tile-based FPGA and its interconnect structure. The tile-based FPGA and interconnect structure are built from a basic tile unit, the PLB (programmable logic block). A PLB consists of a basic logic unit (LE) and a basic interconnect unit (xbar). An LE consists of, for example, 4 LPs (logic parcels). Using the PLB as the elementary unit, a programmable logic array of arbitrary size can be assembled; adding IP blocks with specific functions, such as embedded memory (EMB), embedded multiply-accumulate units (MAC) and special-purpose IO, yields a typical FPGA system.
Fig. 2 is a diagram of the basic structure of the basic logic unit LE. An LE consists of, for example, 4 LPs (logic parcels), a carry skip input unit (carryskipin), a carry skip output unit (carryskipout) and an LBUF.
Each LP comprises, for example, 2 LUT4s, 1 LUT4C (a LUT4 with a carry chain) and 2 registers. As shown in the figure, LP0, LP1, LP2 and LP3 each comprise two LUT4s, one LUT4C and 2 registers (REG). An LE therefore has, for example, 12 four-input LUTs (counting the LUT4Cs) and 8 registers in total, a LUT-to-register ratio of, for example, 3:2. In most FPGA applications the combinational logic consumes more resources than the sequential logic, so providing somewhat more LUT4 resources than register resources saves area and at the same time improves chip utilization.
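The resource count above can be checked with a few lines of arithmetic. The following is only an illustrative sketch; the constant and function names are hypothetical and do not appear in the figures.

```python
# Per-LP resources as stated in the text; all names here are illustrative only.
LUT4_PER_LP = 2    # plain 4-input look-up tables
LUT4C_PER_LP = 1   # 4-input look-up table with carry chain
REGS_PER_LP = 2
LPS_PER_LE = 4


def le_resources(lps=LPS_PER_LE):
    """Return (total 4-input LUTs, total registers) for one LE."""
    luts = lps * (LUT4_PER_LP + LUT4C_PER_LP)
    regs = lps * REGS_PER_LP
    return luts, regs


luts, regs = le_resources()
print(luts, regs)  # 12 8, i.e. a ratio of 3:2
```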
The carry skip input unit and the carry skip output unit are used to realize the carry skip chain function. The LBUF is used to generate the control signals and clocks for the registers in the LE; it has no direct relation to the present invention and is not described further here.
It should be noted that, in this specification, the logic parcel is only one way of subdividing a logic unit. The present invention is not limited to integrated circuits having logic parcels; it also covers integrated circuits with any kind of logic unit that embodies the idea of the present invention.
Fig. 3 illustrates the basic principle of the carry skip (carryskip) of the present invention. The upper half of the figure shows the basic structure of a 4-bit full adder. The full adder comprises adders FA0, FA1, FA2 and FA3, which are linked by a carry chain.
Specifically, adder FA0 receives at its inputs the carry generate signal G0, the carry propagate signal P0 and the carry signal Ci0 from the preceding full adder, and produces a new carry signal Co0 from these signals; Co0 is input to adder FA1. Adder FA1 receives the carry generate signal G1, the carry propagate signal P1 and the carry signal Co0, and produces a new carry signal Co1; Co1 is input to adder FA2. Adder FA2 receives the carry generate signal G2, the carry propagate signal P2 and the carry signal Co1, and produces a new carry signal Co2; Co2 is input to adder FA3. Adder FA3 receives the carry generate signal G3, the carry propagate signal P3 and the carry signal Co2, and produces a new carry signal Co3, which is output as the carry output of the 4-bit full adder. In this structure, the output of each adder depends on the carry chain output of the preceding adder.
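Co_i = G_i + P_i·C_i
G_i = A_i·B_i, P_i = A_i ^ B_i
where A_i and B_i are the operand bits of stage i, C_i is the carry input to stage i, Co_i is its carry output, and ^ denotes exclusive OR.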
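As a minimal sketch of the ripple structure described above, the following Python function applies the two equations bit by bit. The function name and the integer encoding of the operands are assumptions made for illustration only.

```python
def ripple_carry_add4(a, b, c_in=0):
    """Add two 4-bit values stage by stage using Co_i = G_i + P_i*C_i."""
    carry = c_in
    total = 0
    for i in range(4):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi                  # carry generate  G_i = A_i * B_i
        p = ai ^ bi                  # carry propagate P_i = A_i ^ B_i
        total |= (p ^ carry) << i    # sum bit of stage i
        carry = g | (p & carry)      # carry out of stage i feeds stage i+1
    return total, carry
```

Each iteration needs the carry of the previous one, which is exactly the dependency that makes the plain ripple chain slow.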
If a multiplexer (MUX) is added to this full adder structure, the structure shown in the lower half of the figure is obtained. When P0, P1, P2 and P3 are all 1, the carry chain output is identical to the carry chain input, and the MUX selects the original input as the output of the carry chain. This property can thus be used to realize a carry chain structure that skips.
Because the carry skip signal in the above operation depends only on G_i and P_i and does not require Co_i to be available first, operation speed is improved and delay is significantly reduced.
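The skip principle can be sketched as follows. This is an illustrative model only, with hypothetical function names: the bypass multiplexer forwards the carry input whenever every propagate bit is 1, and otherwise falls back to the rippled carry.

```python
def propagate_bits(a, b, width=4):
    """Return the list of propagate signals P_i = A_i ^ B_i."""
    return [((a >> i) ^ (b >> i)) & 1 for i in range(width)]


def ripple_carry_out(a, b, c_in, width=4):
    """Carry out of a plain ripple chain."""
    carry = c_in
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        carry = (ai & bi) | ((ai ^ bi) & carry)
    return carry


def skip_carry_out(a, b, c_in, width=4):
    """Carry out with a bypass MUX: if all P_i are 1, forward c_in directly."""
    if all(propagate_bits(a, b, width)):
        return c_in  # skip path: decided by the P_i alone, no Co_i needed
    return ripple_carry_out(a, b, c_in, width)
```

Both functions return the same value for every input; the difference in hardware is that the skip path does not have to wait for the intermediate carries.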
It should be noted that, although this specification describes the invention mainly in terms of addition, the present invention is equally applicable to subtraction.
Fig. 4 shows the carry skip chain structure of the LE. As shown in Fig. 4, the carry skip chain comprises three parts: the carry skip input unit, the carry skip output unit, and the carry chain between the LUT4Cs within the LE. Four LUT4Cs are illustrated in the figure. The 4 LUT4Cs can be connected by a ripple carry chain (ripplecarrychain), and the LEs are connected to one another by the carry skip chain, so that a fast carry across, for example, 4 bits and across, for example, 8 bits can be realized. For two vertically adjacent LEs, the output of the carry skip output unit of the lower LE can be connected directly to the input of the carry skip input unit of the upper LE.
Like a conventional look-up table, a LUT4C generates an output signal, which is output through multiplexer mux_dy on the corresponding one of the ports dy[0], dy[1], dy[2] and dy[3]. The LUT4C also passes the carry obtained from the addition or subtraction along the carry chain, where it is output through c4_out.
The role of the carry skip input unit is to select a suitable carry input for the current LE. Its inputs fall into three groups. The first group is the local carry input of the current LE, comprising ground (gnd) and two external inputs (byp[4], byp[16]). The second group is the carry chain input from the LE below: c4_in, c_skip4_in and c_skip8_in. The third group is the output of the carry skip output unit of the LE below: r4_in_b, p4_in_b and p8_in_b, which serve as select signals for the carry chain input of the current LE. The output of the carry skip input unit is the carry chain input of the current LE. Under control of the select signals, the unit selects one signal, from the local carry inputs and the carry chain inputs from the LE below, as the carry input of the current LE.
The role of the carry skip output unit is to provide carry select signals to the adjacent LE above (that is, downstream in the carry chain). Its inputs fall into two groups: one is the outputs of the 4 LUT4Cs; the other is an input signal shared with the carry skip input unit, p4_in_b, which indicates whether a 4-bit carry skip has occurred in the adjacent LE below. Its outputs serve as input signals of the carry skip input unit of the LE above, namely the carry select signal r4_in_b, the skip-4 select signal p4_in_b and the skip-8 select signal p8_in_b (the carry skip select signals).
When the outputs of the four LUT4Cs of the current LE after the addition are all 1, the carry skip output unit produces an active select signal p4_in_b. When this select signal is inactive, the carry skip input unit of the LE above selects c4_out of the current LE as the carry input of that LE; when this select signal is active, the carry skip input unit of the LE above selects c_skip4_out or c_skip8_out of the current LE as the carry input of that LE.
Fig. 5 is a schematic diagram of the basic structure of a LUT4C with a carry chain. As shown in Fig. 5, the carry chain between LUT4Cs is a ripple carry chain. The carry chain structure mainly comprises an XOR gate (xor) for obtaining the sum output, a multiplexer mux_co and a multiplexer mux_ca.
LUT40, the XOR gate and multiplexer mux_co realize addition and subtraction with carry. The f2 input of LUT40 is coupled to the first input of multiplexer mux_co; the input carry signal ci is fed to the second input of multiplexer mux_co and to the second input of the XOR gate; the first input of the XOR gate receives the output signal of LUT40. The output signal of LUT40 also serves as the select control input of multiplexer mux_co.
The input of multiplexer mux_ca can be GND, a direct input, the output of LUT0, or an input of LUT40. Different functions can be realized depending on the selection made by mux_ca. When the direct input or the input of LUT40 is selected, basic addition and subtraction can be realized; when GND or the output of LUT0 is selected, multi-input AND/OR functions can be realized.
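The adder mode of one LUT4C cell can be modeled as below. This is a sketch under stated assumptions, not the circuit of Fig. 5: it assumes LUT40 is programmed as the XOR of the two operand bits, that operand b drives the f2 pin, and that mux_co passes ci when its select input is 1 and the f2 value otherwise. The function name is hypothetical.

```python
def lut4c_add_cell(a, b, ci):
    """One-bit add through a LUT4C-style cell; returns (sum, carry out)."""
    lut_out = a ^ b               # assumed LUT40 configuration: propagate P = a ^ b
    s = lut_out ^ ci              # XOR gate produces the sum bit
    f2 = b                        # assumed: operand b is wired to the f2 pin
    co = ci if lut_out else f2    # mux_co: select = LUT40 output (polarity assumed)
    return s, co
```

When the LUT output is 0 the two operand bits are equal, so forwarding f2 yields the generate term a·b, which is why a single multiplexer suffices for the carry.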
It should be noted that embodiments of the present invention may also use LUTs with carry chains of forms other than that shown in Fig. 5.
Fig. 6 is a detailed structural diagram of the carry skip chain. The left side shows the basic structure of the carry skip input unit, and the right side shows the basic structure of the carry skip output unit.
In the carry skip input unit, multiplexer mux0 selects whether the carry chain function is realized by the carry skip chain or by the ripple carry chain. For multi-bit addition and subtraction, the carry skip chain can be selected. When a multi-input AND/OR function is realized through the carry chain, the ripple carry chain can be used.
When the carry skip chain is selected, the carry select signal r4_in_b, the skip-4 select signal p4_in_b and the skip-8 select signal p8_in_b determine whether the carry signal c4_in, the skip-4 carry signal c_skip4_in or the skip-8 carry signal c_skip8_in is used as the carry chain input. In addition, for the initial input of the carry chain, multiplexers mux1 and mux5 can select either a constant or a direct input as the carry input of the least significant bit, and multiplexer mux4 can determine whether the carry chain input is inverted.
When the ripple carry chain is selected, multiplexers mux2, mux4 and mux5 determine whether a constant or a direct input is used as the carry input of the least significant bit, and whether the carry input is inverted.
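The selection made in carry skip mode can be sketched as a small function. This models only the choice among c4_in, c_skip4_in and c_skip8_in; the initial-input and inversion multiplexers are left out. It assumes the "_b" select signals are active low and that at most one of them is active at a time; the priority order and the function name are illustrative assumptions, not taken from Fig. 6.

```python
def skip_input_select(r4_in_b, p4_in_b, p8_in_b, c4_in, c_skip4_in, c_skip8_in):
    """Pick the carry input of the current LE (select signals assumed active low)."""
    if p8_in_b == 0:
        return c_skip8_in   # skip-8 carry from the LE below
    if p4_in_b == 0:
        return c_skip4_in   # skip-4 carry from the LE below
    return c4_in            # r4_in_b active: rippled carry from the LE below
```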
In the carry skip output unit, the inputs are the outputs of the 4 LUT4Cs connected by the ripple carry chain and the skip-4 select input signal p4_in_b shared with the carry skip input unit. When the outputs of the 4 LUT4Cs are all 1 and the skip-4 select input signal p4_in_b is high (the skip-4 carry signal of the preceding carry chain is inactive), the skip-4 select output signal p4_out_b is active and a 4-bit carry skip can be realized. When the outputs of the 4 LUT4Cs are all 1 and the skip-4 select input signal p4_in_b is low (the skip-4 carry signal of the preceding carry chain is active), the skip-8 select output signal p8_out_b is active and an 8-bit carry skip can be realized. When the outputs of the 4 LUT4Cs are not all 1, the carry select output signal r4_out_b is active and ripple carry is realized.
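The three cases above amount to a small truth table, sketched below. The text states that p4_in_b is inactive when high; the sketch additionally assumes that the three output signals are likewise active low (0 means active). The function name is hypothetical.

```python
def skip_output_unit(lut4c_outs, p4_in_b):
    """Return (r4_out_b, p4_out_b, p8_out_b); 0 means active (assumed polarity)."""
    all_ones = all(lut4c_outs)
    p4_active = all_ones and p4_in_b == 1   # previous LE did not skip: skip 4 bits
    p8_active = all_ones and p4_in_b == 0   # previous LE skipped too: skip 8 bits
    r4_active = not all_ones                # otherwise use the rippled carry

    def to_b(active):
        return 0 if active else 1

    return to_b(r4_active), to_b(p4_active), to_b(p8_active)
```

Exactly one of the three outputs is active for any input combination, so the carry skip input unit of the LE above always has an unambiguous choice.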
It should be noted that Fig. 6 shows only one example of the carry skip input unit and the carry skip output unit. Those skilled in the art will recognize that other circuits can be substituted to realize the carry skip input unit and the carry skip output unit.
Combining the carry skip input unit, the carry skip output unit and the 4-bit ripple carry chain realizes the ordinary ripple carry chain function as well as 4-bit and 8-bit carry skip. Carry skip is important both for improving the performance of addition and subtraction and for improving overall system performance.
Fig. 7 shows an example of realizing a 12-bit addition with the carry skip chain, together with an analysis of its critical path. As shown in the figure, a 12-bit addition requires 3 LEs: LE0, LE1 and LE2. Each LE realizes a 4-bit addition.
LE0 adds bits 0-3: A[3:0] and B[3:0] are added to give Sum[3:0]. LE1 adds bits 4-7: A[7:4] and B[7:4] are added to give Sum[7:4]. LE2 adds bits 8-11: A[11:8] and B[11:8] are added to give Sum[11:8].
The 4 LUT4Cs inside each LE are connected by a ripple carry chain; the right side of the figure shows an enlarged detail of LE0. The LEs are connected to one another by the carry skip chain structure formed by the carry skip input units and carry skip output units.
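The partitioning into three 4-bit LEs can be sketched behaviorally as follows. This is an illustrative model with hypothetical function names; it reproduces only the arithmetic and the skip decision, not the timing or the actual signal wiring of Fig. 7.

```python
def le_add4(a4, b4, c_in):
    """One LE: 4-bit ripple add. Returns (sum4, rippled carry out, all-propagate flag)."""
    carry, s, all_p = c_in, 0, True
    for i in range(4):
        ai, bi = (a4 >> i) & 1, (b4 >> i) & 1
        p = ai ^ bi
        s |= (p ^ carry) << i
        carry = (ai & bi) | (p & carry)
        all_p = all_p and p == 1
    return s, carry, all_p


def add12(a, b, c_in=0):
    """12-bit add built from LE0, LE1 and LE2 with a skip decision per LE."""
    total, carry = 0, c_in
    for k in range(3):
        a4 = (a >> (4 * k)) & 0xF
        b4 = (b >> (4 * k)) & 0xF
        s, c_ripple, all_p = le_add4(a4, b4, carry)
        total |= s << (4 * k)
        # Skip path: when every bit propagates, the carry into this LE
        # is forwarded unchanged to the next LE.
        carry = carry if all_p else c_ripple
    return total, carry
```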
The 12-bit addition is realized with the carry skip chain structure, and its critical path is as shown in the figure. The delay of the critical path consists of two 4-stage ripple carry chain delays and one 4-bit carry skip chain delay (carryskip4delay, hereinafter the skip-4 carry delay). Compared with the delay of a 12-stage ripple carry chain, performance is greatly improved.
Fig. 8 shows an example of realizing a 16-bit addition with the carry skip chain, together with an analysis of its critical path. As can be seen, a 16-bit addition requires 4 LEs, each of which realizes a 4-bit addition. The 4 LUT4Cs inside each LE are connected by a ripple carry chain. The LEs are connected to one another by the carry skip chain structure formed by the carry skip input units and carry skip output units.
The 16-bit addition is realized with the carry skip chain structure. The delay of the critical path consists of two 4-stage ripple carry chain delays and one 8-bit carry skip chain delay (carryskip8delay, i.e. the skip-8 carry delay). Compared with the delay of a 16-stage ripple carry chain, performance is greatly improved.
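The two critical-path figures can be compared with a simple delay model. The delay values below are hypothetical placeholders chosen only to make the arithmetic concrete; the specification gives no numeric delays, and the function names are illustrative.

```python
def skip_chain_delay(t_ripple_stage, t_skip, ripple_stages=8):
    """Critical path: two 4-stage ripple chains (8 stages) plus one skip delay."""
    return ripple_stages * t_ripple_stage + t_skip


def ripple_chain_delay(t_ripple_stage, bits):
    """Critical path of a plain ripple carry chain of the given width."""
    return bits * t_ripple_stage


# Hypothetical unit delays, for illustration only.
T_RIPPLE = 1.0   # one ripple carry stage
T_SKIP4 = 1.5    # skip-4 carry delay
T_SKIP8 = 2.0    # skip-8 carry delay

d12 = skip_chain_delay(T_RIPPLE, T_SKIP4)   # 12-bit adder with skip-4
d16 = skip_chain_delay(T_RIPPLE, T_SKIP8)   # 16-bit adder with skip-8
print(d12, ripple_chain_delay(T_RIPPLE, 12))  # 9.5 12.0
print(d16, ripple_chain_delay(T_RIPPLE, 16))  # 10.0 16.0
```

Under these assumed values the rippled portion of the path stays at 8 stages in both cases, so widening the adder from 12 to 16 bits only swaps the skip-4 delay for the skip-8 delay.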
Fig. 9 is a structural diagram of realizing a multi-input AND function through the carry chain. According to the LUT4C structure shown in Fig. 9, the output of LUT0 can serve as an input of multiplexer mux_ca, so that LUT40 and LUT0 can be joined through mux_co to realize an 8-input AND/OR function.
As shown in the figure, the result of the 8-input AND function is output through the carry chain: in LP0, the two LUT4s are combined with multiplexers mux_ca and mux_co, and the result then passes through multiplexers mux_sc and mux_dy of LP1 to give the 8-input AND output AND8.
Fig. 9 gives an implementation structure for a 16-input AND function. Within one LP, an 8-input AND/OR function can be realized through the carry chain structure, and within one LE an AND/OR function of at most 20 inputs can be realized. All such AND/OR functions are output through the carry chain, however, and therefore all need to use multiplexers mux_sc and mux_dy of the adjacent LP for output.
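The general idea of chaining an AND function through a carry chain can be sketched as follows. This is a simplified model and not the circuit of Fig. 9: it uses the GND option of mux_ca mentioned in the description of the LUT4C, assumes each LUT is programmed as a 4-input AND, assumes the chain starts from a constant 1, and assumes mux_co passes the incoming carry when its select input is 1. All function names are hypothetical.

```python
def and4(bits):
    """A LUT assumed to be programmed as a 4-input AND."""
    return int(all(bits))


def mux_co(select, in0, in1):
    """Two-input multiplexer; select = 1 passes in1 (assumed polarity)."""
    return in1 if select else in0


def wide_and(bits):
    """AND of len(bits) inputs (a multiple of 4), chained through the carry path."""
    carry = 1  # assumed constant 1 at the start of the chain
    for k in range(0, len(bits), 4):
        lut_out = and4(bits[k:k + 4])
        # in0 is GND (selected via mux_ca): any LUT output of 0 forces the chain to 0.
        carry = mux_co(lut_out, 0, carry)
    return carry
```

For example, `wide_and([1] * 8)` models an 8-input AND spanning two chained stages, and passing 16 inputs models the 16-input case.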
The embodiments above further describe the objects, technical solutions and beneficial effects of the present invention. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.