US20040230924A1 - Method for tuning a digital design for synthesized random logic circuit macros in a continuous design space with optional insertion of multiple threshold voltage devices - Google Patents

Info

Publication number
US20040230924A1
Authority
US
United States
Prior art keywords
design
tuning
timing
gates
gate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US10/842,589
Other versions
US7093208B2 (en)
Inventor
Patrick Williams
Ee Cho
David Hathaway
Mei-Ting Hsu
Lawrence Lange
Gregory Northrop
Chandramouli Visweswariah
Cindy Washburn
Jun Zhou
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GlobalFoundries Inc
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/435,824 external-priority patent/US7003747B2/en
Priority claimed from US10/436,213 external-priority patent/US7010763B2/en
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US10/842,589 priority Critical patent/US7093208B2/en
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: VISWESWARIAH, CHANDRAMOULI, NORTHROP, GREGORY A., ZHOU, JUN, CHO, EE K., HATHAWAY, DAVID J., HSU, MEI-TING, LANGE, LAWRENCE K., WASHBURN, CINDY SHUIKING, WILLIAMS, PATRICK M.
Publication of US20040230924A1 publication Critical patent/US20040230924A1/en
Application granted granted Critical
Publication of US7093208B2 publication Critical patent/US7093208B2/en
Assigned to GLOBALFOUNDRIES U.S. 2 LLC reassignment GLOBALFOUNDRIES U.S. 2 LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION
Assigned to GLOBALFOUNDRIES INC. reassignment GLOBALFOUNDRIES INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: GLOBALFOUNDRIES U.S. 2 LLC, GLOBALFOUNDRIES U.S. INC.
Adjusted expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00: Computer-aided design [CAD]
    • G06F30/30: Circuit design
    • G06F30/32: Circuit design at the digital level
    • G06F30/327: Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist

Definitions

  • IBM® is a registered trademark of International Business Machines Corporation, Armonk, N.Y., U.S.A. and other names used herein may be registered trademarks, trademarks or product names of International Business Machines Corporation or other companies.
  • This invention relates to methods for tuning the digital design and design automation of high-performance digital integrated circuits.
  • the invention particularly is directed to the problem of developing an integrated circuit design optimization methodology which exploits circuit tuning of individual macros.
  • the tuning of individual macros is conducted by optimizing transistor sizes over a defined continuous design space. To further optimize for performance, circuits with low or high threshold voltage transistors are selectively substituted for regular threshold ones.
  • the datapath and array design sections of a high-speed microprocessor design are logically well-defined.
  • these circuit sections are typically custom-designed circuits which are electrically and physically designed much earlier in the design cycle to assure high clocking performance.
  • The remaining sections of the design, the control logic sections, are often changed late in the microprocessor design cycle to reach the required logical function. Other objectives such as timing closure put an additional constraint on the system, making automated optimal design closure techniques extremely valuable.
  • Automated techniques deliver several advantages such as improved circuit performance, higher quality and correctness, and enhanced time-to-market.
  • the control logic is contained in physical entities called random logic macros or RLMs, where the term random does not imply true randomness, but instead a lack of regular structure as is found in datapath and array circuitry. Due to the unstructured nature of the logic, synthesis and place/route tools are employed to read in a logical description and transform it into primitive logical gates.
  • the term gate is understood to include a collection of transistors which is to be treated as a single logical circuit element. These gates are adjusted in drive strength to achieve timing objectives and placed legally while a wiring tool routes the connections between these gates to complete the physical design.
  • Prior-art methods of circuit tuning towards timing closure are illustrated in FIG. 1 (flow 100) in the case of a “flat” design methodology and FIG. 2 (flow 200) in the case of a “hierarchical” design methodology.
  • FIG. 1 is typically employed where designs are analyzed and optimized in a flat representation.
  • the flow starts with a logic design specification (labeled “Customer Logic Drop” in box 110 ) whereby a logic synthesis tool (box 120 ) synthesizes the input logic description into proper logical gates and structures to assure logical correctness.
  • Logic restructuring of the gates (box 130 ) maps the logical gates into technology-approved gates required for manufacturing.
  • An initial timing analysis of the design is typically applied whereby the gates are put through a process of powering up and down (box 140 ), depending on current drive needed to move towards cycle-time goals.
  • This process continues with the physical design (box 150 ) which entails the physical placement of the gates and global routing of the design which is needed to provide a rough understanding of the real wiring of the design.
  • The global routing provides parasitic extraction details which, along with the placed design, are provided as input to static timing analysis (box 160) to predict the performance of the current design. Many iterations of this process (box 160 to box 120) are attempted with different adjustments of the entire process to achieve timing closure.
  • the design may take on new engineering logic changes (box 170 ), and if so the design is started once again through the synthesis, placement, timing process as described (boxes 120 - 160 ). If no logic changes are required, the gates of the design are connected via routing, the electrical parasitics of the routing are extracted, and the design proceeds through a final static timing (box 180 ). Other checks are applied at this time but typically the design must be adjusted logically to achieve cycle time goals and the design is presented back to the synthesis, placement and timing process of boxes 120 - 160 . Upon final acceptance of the design, it is sent to manufacturing (box 190 ).
  • The other prior-art methodology to achieve timing closure is illustrated in FIG. 2 (flow 200) and is usually implemented for more complex and dense semiconductor designs such as a microprocessor.
  • This hierarchical methodology presents many benefits in the spirit of “divide and conquer.” It permits a parallel approach to tackling the complexity of the design by enabling the design team to work on macro partitions of the overall design.
  • Each macro partition has a space budget and a timing budget associated with it.
  • Once the macro partition has been designed its actual timing is represented as a timing abstract, which is a simplified representation of its salient overall timing characteristics. This method improves design turnaround time and time-to-market. It also permits an analysis of the performance of the chip at any stage of the design, substituting full timing abstract models for portions of the chips that have been designed for the estimated or budgeted timing models created during high-level design planning for the portions that haven't yet been designed.
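  • The substitution described above can be sketched as a lookup that prefers a macro's timing abstract and falls back to its budgeted value. This is only a minimal sketch: the names `macro_delay`, `path_delay`, `abstracts` and `budgets`, the macro names, and the reduction of an abstract to a single delay number are illustrative assumptions, not part of the described flow.

```python
def macro_delay(name, abstracts, budgets):
    """Use the timing abstract if the macro has been designed, else its budget."""
    return abstracts.get(name, budgets[name])

def path_delay(path, abstracts, budgets):
    """Sum per-macro delays along a chip-level path (a list of macro names)."""
    return sum(macro_delay(m, abstracts, budgets) for m in path)

budgets = {"alu": 120.0, "ctl": 80.0, "lsu": 100.0}  # ps, hypothetical budgets
abstracts = {"alu": 110.0}                            # only "alu" is designed so far
total = path_delay(["alu", "ctl", "lsu"], abstracts, budgets)
```

  • As more macros are completed, entries move from `budgets` into `abstracts` and the same chip-level analysis can be rerun without any other change.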
  • The boxes on the left of FIG. 2 (boxes 205 to 230) pertain to a custom design flow. Custom design is typically applied to arithmetic or dataflow circuits, or other circuits requiring careful hand-design.
  • the boxes on the right of FIG. 2 (boxes 235 - 265 ) pertain to a design flow for automatically synthesized and placed random logic macros.
  • the boxes in the middle (boxes 270 to 290 ) pertain to the steps in the methodology that bring together the custom and random logic macros for global integration and timing of the chip.
  • flow 200 of FIG. 2 embodies two levels of timing closure: one at the individual macro level, and another at the global chip level. The following four paragraphs explain the three main sections of this prior-art hierarchical design methodology.
  • A timing abstract is a simplified model that represents the timing behavior of the entire macro.
  • the timing abstract typically contains the timing behavior of timing arcs to and from macro boundary pins to latch points within the macro design.
  • These timing abstracts are incorporated at the global level to enable the chip-level timing analysis (box 270 ) for the entire chip design.
  • Timing assertions (box 230 ) or constraints are fed back and applied on the macro pins from the global timing analysis. These timing assertions when placed on the custom macro during timing analysis could result in the need for additional logic restructuring for timing optimization and the process is started again (box 225 back to box 210 ). To help speed this process it is extremely effective to produce the timing abstracts in the beginning of the design from the schematic timing analysis (dotted arrow from box 220 to box 275 ).
  • the Random Logic Macro (RLM) flow starts with an initial logic specification (labeled “RLM Logic Drop” in box 235 ).
  • This part of the flow is very similar to FIG. 1 (flow 100 ).
  • the logic is synthesized (box 240 ), logic is restructured for technology adaptation (box 245 ), gates are powered up or down for required drive strengths (box 250 ), and the resulting netlist is physically designed (box 255 ) for minimal wire length on critical paths.
  • Parasitic extraction and timing analysis are executed (box 260) on the design. Constraints are typically adjusted throughout this process and the loop is iterated many times (box 260 back to box 240) to produce optimal performance.
  • timing abstracts (box 275 ) for the RLMs are produced for global timing analysis. Timing assertions (box 265 ) are fed back to the RLM design loop as additional constraints on the design.
  • The third main component of flow 200 comprises the boxes in the middle of FIG. 2, in which the various custom and RLM macros are integrated to obtain a final chip design.
  • Global timing results (box 270 ) and new logic changes (box 280 ) would require either custom or RLM macros or both to be redesigned depending on the change required.
  • Evaluation of a final timing run, physical design requirements and other checks (box 285 ) might also require a loop through either the custom or RLM flows. Until all checks and analysis are satisfied, the chip cannot be released for manufacturing (box 290 ).
  • a flat design flow as in FIG. 1 implies very large optimization problems with very large design spaces.
  • CAD: computer-aided design
  • heuristic methods have been favored in these optimization tools.
  • the use of heuristic methods leads to a number of difficulties including problems in achieving performance, a large number of design iterations, incomplete exploitation of the technology, all of which lead to long design times and sub-optimal designs.
  • Although the hierarchical flow of FIG. 2 (flow 200) controls the design complexity with the use of macro partitions, timing closure must now be accounted for at two levels of the design, both locally within the macro and globally.
  • the difficulty with this flow is the management of the timing budgets between macros and global connections. The process requires careful management of these budgets through the use of the wiring and buffering solutions of the global nets and careful logic implementation of the macro paths connecting between these global paths.
  • Cycle time closure on the critical paths could require logic re-design to reduce overall delays of these timing paths.
  • This redesign could affect connections at the global level, and new solutions of wiring and buffering must be implemented. Therefore logic and circuit solutions at both the global and macro levels must be analyzed and developed to achieve optimum cycle time performance.
  • Another potential problem is that typically only the top most critical path or paths are searched, broken down and analyzed for the required solution, but as soon as that is done, other sub-critical paths which were not exposed before may now be limiting. The procedure is therefore time-consuming and not guaranteed to produce a design that meets the required system-level performance.
  • the hierarchical design flow of FIG. 2 suffers from two main problems.
  • the first is the difficulty of iteratively adjusting the budgets of the individual macros so as to meet global timing requirements, while giving individual macros reachable targets.
  • the second is that focusing on just a few critical paths is not sufficient to meet overall timing and leads to a great deal of re-design.
  • Both flows limit the progression of the design due to the nature of the heuristic iterations that each flow applies, and therefore retard the circuit performance that the flows can achieve.
  • This slow convergence rate towards cycle time objectives limits the flexibility of the design team to introduce functional and timing changes throughout the design process, and in particular during the crucial period late in the design cycle. In all cases, only a small sub-set of critical paths is exposed towards an optimal solution.
  • Low threshold voltage (Low Vt) transistors offer faster performance, but at the cost of increased leakage power.
  • High threshold voltage (High Vt) transistors offer significant reduction in leakage power, but at the cost of lower performance. It is therefore beneficial to sparingly use Low Vt devices on the critical paths to achieve higher performance, but limit the usage of such devices to limit leakage power. It is also beneficial to use High Vt devices on the non-critical paths to reduce leakage power, but not to the extent that the non-critical paths slow down and turn into critical paths.
  • Physical synthesis CAD tools employ heuristic methods to introduce multiple threshold devices with the dual objectives of achieving higher performance and limiting leakage power.
  • the prior-art heuristic methods limit the ability to optimally adjust the performance and leakage of the circuits.
  • Prior art flows do not include continuous optimization techniques during RLM design, despite the ability of such techniques to obtain optimal solutions. Further, prior-art flows do not include specialized techniques to deal with the relatively low capacity and high run times of such continuous optimization methods.
  • the novel methodology includes the use of a tuner on random logic macros that adjusts transistor sizes in a continuous domain. To accommodate this tuning, logic gates are mapped to parameterized cells for the tuning process and then back to fixed gates after tuning. Tuning is constrained in such a way as to minimize “binning errors” when the design is mapped back to fixed cells. Further, the critical sections of the circuit are marked in order to make the optimization more effective and to fit within the problem-size constraints of the tuner.
  • a specially formulated objective function is employed during the tuning to promote faster global timing convergence, despite possibly incorrect initial timing budgets. The specially formulated objective function targets all paths that are failing timing, with appropriate weighting, rather than just targeting the most critical path. Finally, the addition of multiple threshold voltage gates allows for increased performance while limiting leakage power.
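  • The difference between targeting only the most critical path and targeting all failing paths can be illustrated with a small sketch. The text does not give the actual objective function, so the squared-violation penalty, the optional `weight` callback and all function names below are assumptions chosen purely for illustration.

```python
def worst_slack_objective(slacks):
    """Looks only at the single most critical path."""
    return max(0.0, -min(slacks))

def all_failing_paths_objective(slacks, weight=lambda s: 1.0):
    """Penalizes every path with negative slack; larger violations cost more."""
    return sum(weight(s) * s * s for s in slacks if s < 0.0)

slacks = [-30.0, -25.0, -5.0, 12.0]  # ps, hypothetical path slacks
before = all_failing_paths_objective(slacks)
# Fixing only the worst path still leaves the other failing paths in the objective.
after = all_failing_paths_objective([0.0, -25.0, -5.0, 12.0])
```

  • With an objective of the second kind, sub-critical failing paths contribute to the cost from the start instead of surfacing only after the worst path has been fixed.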
  • FIG. 1 illustrates a typical prior-art “flat methodology” iterative procedure for achieving timing closure of a typical semiconductor digital integrated circuit.
  • FIG. 2 illustrates a typical prior-art “hierarchical methodology” iterative procedure for achieving timing closure of a high-performance digital integrated circuit or functional unit of a high-performance digital integrated circuit.
  • FIG. 3 illustrates the inventive methodology flow for the tuning of circuits within synthesized random logic macros with the optional substitution of multiple threshold voltage devices. For completeness, the figure also includes the final stages of routing, extraction and final timing.
  • FIG. 4 illustrates the transistor-level schematic of a two-way NAND gate parameterized cell.
  • FIG. 5 illustrates a plot of PPW and PNW parameters of a set of predefined gates of a given function which may be mapped to a single parameterized cell.
  • An inventive design and optimization methodology for random logic macros (RLMs) is shown in FIG. 3 (flow 300). The salient features of the preferred components of this methodology and how they fit together are explained below.
  • the first step is to import design data (box 335 ) that comes out of a logical or physical synthesis program, which is known to be logically correct.
  • This design data is at the gate-level, i.e., it typically consists of a collection of logic gate primitives such as NAND gates, NOR gates, etc. These primitives are called “fixed cells” since they are available in fixed transistor size combinations in the library of logic gates.
  • the inventive methodology maps the fixed-cell gate-level design to an equivalent “parameterized cell” transistor-level design in order to apply sophisticated mathematical optimization techniques, and then finally returns the design to a fixed-cell gate-level description.
  • a “parameterized cell” is a primitive cell which employs continuous variables for the strengths or widths of each transistor that makes up that cell, to be further described later.
  • the next step after importing the design is to strip any threshold voltage assignments and map the fixed-cell logic gates to parameterized gates (box 340 ). Since the identities of the timing-critical paths will change and are not known until later in the optimization flow, it is not advantageous to accept the threshold voltage assignments suggested by the logical or physical synthesis program. Instead, all transistors are set to a “regular” or “normal” threshold voltage for the particular technology at hand, where the “regular” threshold voltage is chosen among a plurality of choices available in the implementation technology as one which provides a good trade-off between leakage current and performance on typical logic paths in the design. Next, the logic gates are mapped to members of a parameterized library [see G. A. Northrop and P-F.
  • a parameterized cell is “in-between” a fixed-cell with no transistor-size flexibility and a full-custom cell in which the width of each transistor can be adjusted arbitrarily and independently.
  • the PFETs are typically “grouped,” i.e., they are all controlled by one parameter called PPW in the sequel.
  • all the NFETs are controlled by one parameter called PNW.
  • a 3-input NAND gate has 6 transistors, but is parameterized by just two variables: PPW and PNW. Referring to FIG.
  • schematic 400 is of a two-way NAND gate parameterized cell comprising PFETs 410 and 420 and NFETs 430 and 440 .
  • Parameter PPW for this cell controls the widths of PFETs 410 and 420 , forcing the ratio between their widths (typically 1/1) to remain constant.
  • parameter PNW for this cell controls the widths of NFETs 430 and 440 , forcing the ratio between their widths to remain constant.
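  • The grouping of transistor widths under PPW and PNW can be sketched as a small data structure. The class name `ParameterizedCell`, its fields and the numeric values are hypothetical; only the idea that each parameter scales a group of transistors at a fixed ratio comes from the text.

```python
from dataclasses import dataclass

@dataclass
class ParameterizedCell:
    """Two parameters drive all transistor widths through fixed ratios."""
    pfet_ratios: tuple  # relative PFET widths, e.g. (1.0, 1.0) for a 1/1 ratio
    nfet_ratios: tuple  # relative NFET widths

    def widths(self, ppw, pnw):
        """Return (PFET widths, NFET widths) for given PPW and PNW values."""
        pfets = [ppw * r for r in self.pfet_ratios]
        nfets = [pnw * r for r in self.nfet_ratios]
        return pfets, nfets

# Two-way NAND: two PFETs and two NFETs, each group held at a 1/1 ratio.
nand2 = ParameterizedCell(pfet_ratios=(1.0, 1.0), nfet_ratios=(1.0, 1.0))
pfets, nfets = nand2.widths(ppw=2.0, pnw=3.0)
```

  • However many transistors the cell contains, an optimizer working on it sees only two independent variables.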
  • point 500 represents a particular fixed cell implementation of a function with PPW and PNW values determined by the location of the point with respect to plot axes 510 .
  • the collection of points 520 represents all available fixed cells for this function.
  • Arrow 530 represents a direction of variation for parameters PPW and PNW corresponding to increasing or decreasing drive strength of the cell.
  • Arrow 540 represents a direction of variation of parameters PPW and PNW corresponding to increasing or decreasing the beta ratio (the ratio between the PFET and NFET strengths of the gate).
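  • The two directions of variation can be sketched numerically. Two simplifying assumptions are made here: the beta ratio is approximated by the width ratio PPW/PNW, and a beta change is taken at constant PPW + PNW. Neither is stated in the text, and the function names are illustrative.

```python
def beta_ratio(ppw, pnw):
    """PFET-to-NFET ratio, approximated here by the ratio of width parameters."""
    return ppw / pnw

def scale_drive(ppw, pnw, factor):
    """Drive-strength direction: both parameters scale together, beta unchanged."""
    return ppw * factor, pnw * factor

def set_beta(ppw, pnw, new_beta):
    """Beta direction: redistribute width while holding PPW + PNW fixed (assumption)."""
    total = ppw + pnw
    new_pnw = total / (1.0 + new_beta)
    return total - new_pnw, new_pnw

p, n = scale_drive(2.0, 1.0, 2.0)  # stronger cell with the same beta ratio
p2, n2 = set_beta(p, n, 1.0)       # same total width with a different beta ratio
```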
  • Parameterized cell libraries possess numerous advantages. For optimization purposes, the adoption of parameterized cells converts the problem from one of discrete optimization to a continuous optimization problem. Parameterized cells also give rise to optimization problems with fewer independent variables than full-custom library cells.
  • the continuous optimization process may be forced to consider only solutions which fall close in both PPW and PNW values to one of the fixed cells in the library.
  • Parameterized cells are also amenable to automated layout techniques [see G. A. Northrop and P-F. Lu, “A semi-custom design flow in high-performance microprocessor design,” Proc. 2001 Design Automation Conference, Las Vegas, Nev., June 2001, pages 426-431] and therefore allow library design flows to be more automated with relatively small loss of performance compared to full-custom design.
  • each fixed cell of the original design is replaced by a parameterized cell of equivalent logical function, and PPW and PNW are chosen for each cell so as to match the original transistor sizes as closely as possible.
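  • One way to read "as closely as possible" is a least-squares fit of each parameter to the original widths in its group. The text does not specify the matching criterion, so the least-squares choice and the name `fit_parameter` are assumptions.

```python
def fit_parameter(ratios, widths):
    """Least-squares value p minimizing sum((p * r - w) ** 2) over one group.

    Setting the derivative to zero gives p = sum(r * w) / sum(r * r).
    """
    num = sum(r * w for r, w in zip(ratios, widths))
    den = sum(r * r for r in ratios)
    return num / den

# Hypothetical fixed NAND2 whose PFETs are slightly mismatched.
ppw = fit_parameter((1.0, 1.0), (2.0, 2.2))
pnw = fit_parameter((1.0, 1.0), (3.0, 3.0))
```

  • When the fixed cell already respects the parameterized cell's ratios, as the NFETs do here, the fit reproduces the original widths exactly.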
  • the next step (box 345 ) is to conduct a “baseline” timing of the circuit to determine its performance for comparison purposes later in the design flow.
  • This baseline timing is typically performed by means of a transistor-level static timer [see V. B. Rao, J. P. Soreff, T. B. Brodnax and R. E. Mains, “EinsTLT: transistor-level timing with EinsTimer,” Proc. TAU ACM/IEEE workshop on timing issues in the specification and synthesis of digital systems, Austin, Tex., December, 1999].
  • one of the main inputs to the transistor-level timing program is a set of timing assertions (box 310 ), which provide the time at which input signals arrive, the time at which output signals are required, the external capacitive load driven by the outputs, and so on.
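  • The contents of such assertions, and how they yield slack at the macro outputs, can be sketched as follows. The `Assertions` container, its field names, the units and the pin names are hypothetical and do not reflect the format of any particular timer.

```python
from dataclasses import dataclass

@dataclass
class Assertions:
    arrival: dict   # primary input -> arrival time (ps)
    required: dict  # primary output -> required time (ps)
    load: dict      # primary output -> external capacitive load (fF)

def output_slacks(assertions, computed_arrival):
    """Slack at each output = required time - computed arrival time."""
    return {o: assertions.required[o] - computed_arrival[o]
            for o in assertions.required}

a = Assertions(arrival={"in0": 0.0}, required={"out0": 200.0}, load={"out0": 15.0})
slacks = output_slacks(a, {"out0": 230.0})  # negative slack: out0 fails timing
```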
  • the result of this timing step is a timing report (box 325 ) which is stored in a database for future comparison purposes.
  • One of the main goals of the inventive flow is to use continuous transistor-level optimization techniques to improve the performance of the circuit.
  • Continuous transistor-level optimizers (called “tuners” in the sequel) use sophisticated mathematical techniques [see A. R. Conn, N. I. M. Gould and Ph. L. Toint, “LANCELOT: A Fortran package for large-scale nonlinear optimization (Release A),” Springer Verlag, 1992] to obtain an optimal solution to the transistor sizing problem [see A. R. Conn, I. M. Elfadel, W. W. Molzen Jr., P. R. O'Brien, P. N. Strenski, C. Visweswariah and C. B.
  • the marking step marks all parts of the circuit that are considered to have a chance of being timing-critical as “tunable” and the rest as “untunable” to reduce the size of the optimization problem which also includes latches and connecting clock circuitry. If we are also interested in reducing power or area, the least timing-critical sections could be marked tunable so that the optimizer can take advantage and reduce the area and power in these non-critical sections by downsizing transistors as appropriate.
  • The co-pending application D. J. Hathaway, L. K. Lange, C. Visweswariah and P. M. Williams, “Method of Optimizing and Analyzing Selected Portions of a Digital Integrated Circuit,” U.S. patent application Ser. No. 10/436,213 referenced above illustrates a preferred method of marking the circuit to produce a smaller optimization problem, at the same time giving the tuner maximum flexibility to improve the circuit's performance.
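  • A much simpler stand-in for the marking step is a slack threshold. The sketch below is not the preferred method of the co-pending application; the function `mark_tunable`, the `margin` parameter and the gate names are assumptions used only to show the tunable/untunable split.

```python
def mark_tunable(gate_slacks, margin):
    """Mark gates whose worst slack is at or below `margin` as tunable."""
    return {g: ("tunable" if s <= margin else "untunable")
            for g, s in gate_slacks.items()}

# Hypothetical worst slack per gate, in ps.
marks = mark_tunable({"g1": -12.0, "g2": 4.0, "g3": 60.0}, margin=10.0)
n_tunable = sum(1 for m in marks.values() if m == "tunable")
```

  • A larger `margin` keeps more sub-critical gates in the optimization problem at the cost of a larger problem size.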
  • a design constraint is that the input loading capacitance at each primary input must be maintained at its starting value, so as not to unduly load the macro from which that input arrives.
  • An example of a library constraint is an upper bound on PPW or PNW to reflect the range of cell sizes available in the fixed-cell library.
  • Another example of a library constraint is a beta ratio constraint (i.e., upper or lower bound on the PFET to NFET strength ratio) in order to stay within the range of beta ratios available in the fixed-cell library. It may also be desirable to have a constraint on the total area of the macro, often represented by the total tunable transistor width. Design projects will often impose slew (rise/fall time) limits so as to prevent noise problems.
  • the tuner preferably accepts these constraints on a cell-type basis. In other words, a certain specified constraint is automatically and efficiently applied to all instances of a specified cell type.
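  • Applying library constraints on a cell-type basis can be sketched as a single table of limits keyed by cell type and checked against every instance. The limit names, the numeric values and the use of PPW/PNW as the beta ratio are illustrative assumptions.

```python
def violations(cells, limits):
    """Check each cell instance against the limits for its cell type.

    cells:  list of (instance, cell_type, ppw, pnw)
    limits: cell_type -> dict with max_ppw, max_pnw, min_beta, max_beta
    """
    bad = []
    for inst, ctype, ppw, pnw in cells:
        lim = limits[ctype]
        beta = ppw / pnw
        if ppw > lim["max_ppw"] or pnw > lim["max_pnw"]:
            bad.append((inst, "size"))
        if not lim["min_beta"] <= beta <= lim["max_beta"]:
            bad.append((inst, "beta"))
    return bad

limits = {"NAND2": {"max_ppw": 8.0, "max_pnw": 8.0, "min_beta": 0.5, "max_beta": 2.0}}
cells = [("u1", "NAND2", 4.0, 4.0), ("u2", "NAND2", 9.0, 3.0)]
result = violations(cells, limits)
```

  • In an actual tuner these limits would be imposed as bounds and constraints during optimization rather than checked afterwards; the check above only shows how one entry per cell type covers all of its instances.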
  • the next step is to carry out the actual circuit tuning (box 350 ). Since the circuit to be tuned now consists of parameterized cells, the parameters controlling the transistor sizes of the tunable cells are treated as tunable parameters. In other words, the ratio-ing inherent in parameterized cells is respected during the tuning. Between the marking step in box 345 and the use of parameterized cells, the size of the optimization problem is vastly reduced and therefore the mathematical continuous transistor-level tuner can complete a high-quality tuning run in a practical amount of run time (practical usually implies a run time that can be accomplished in an overnight computer run). If the input design is hierarchical, it is flattened to the gate-level at this stage so as to improve the chances of obtaining performance improvement from the tuner.
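  • The idea of tuning over a continuous range can be illustrated on a toy problem with a single tunable width. The delay model `w / w_in + c_load / w` and the ternary search below are assumptions chosen for brevity; they stand in for, and are far simpler than, the nonlinear optimizer referred to in the text.

```python
def delay(w, w_in, c_load):
    """Toy two-stage delay: a driver of width w_in drives a gate of width w,
    which in turn drives a load c_load (arbitrary units)."""
    return w / w_in + c_load / w

def tune_width(w_in, c_load, lo, hi, iters=200):
    """Ternary search over [lo, hi]; valid because this delay is unimodal in w."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if delay(m1, w_in, c_load) < delay(m2, w_in, c_load):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

# The bounds lo and hi play the role of library size constraints.
w_opt = tune_width(w_in=1.0, c_load=16.0, lo=0.5, hi=32.0)
```

  • For this model the optimum is the geometric mean of the driver width and the load, here 4.0, a value that may fall between the sizes a fixed-cell library offers.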
  • a timing report is generated before and after the tuning step and placed in the database of box 325 .
  • These reports, in conjunction with some simple report-comparison scripts (box 325), help in auditing exactly which steps resulted in performance improvements and which did not.
  • The true identities of the critical paths are known, and multiple threshold devices can now be re-inserted to improve the performance and leakage power characteristics of the macro.
  • This procedure is facilitated if the library consists of sets of equivalent cells that are identical except for containing all transistors of a different type (e.g., low Vt, regular Vt or high Vt). In such a situation, an entire gate can be swapped for another gate with a different type of transistors, but equivalent size, logical function and layout.
  • the threshold level closest to the regular threshold level is used preferentially, where it will provide sufficient improvement to meet timing requirements, with threshold levels farther from the regular threshold level being used with decreasing frequency.
  • the second part of the re-insertion is substitution of high threshold voltage devices for regular threshold voltage devices. If the signal driven by a gate is nowhere near being timing-critical, and if the gate handily meets its slew limit, it is a good candidate for high threshold voltage substitution.
  • substitution reduces the leakage power of the macro, and since the gate is far from critical in a timing sense, the increase in gate delay does not have any impact on the timing characteristics of the overall macro. If more than one threshold level is available which is higher than the regular threshold level at which the circuit was tuned, the highest threshold level which will provide sufficient drive strength to meet timing requirements is used preferentially in each case.
  • Methods for such substitutions are well-known in the literature [see M. Ketkar and S. S. Sapatnekar, “Standby power optimization via transistor sizing and dual threshold voltage assignment,” Proc. International Conference on Computer-Aided Design (ICCAD), San Jose, Calif., pages 375-378, November 2002 and W. Liqiong, C. Zhanping, M. Johnson, K. Roy and V. De, “Design and optimization of low-voltage high-performance dual-threshold CMOS circuits,” Proc. 1998 Design Automation Conference, San Francisco, Calif., pages 489-494, June 1998].
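  • The two-part re-insertion can be sketched as a per-gate rule based on slack and slew. This is a simplified heuristic for illustration, not one of the cited methods; the function `assign_vt` and all margins are assumptions.

```python
def assign_vt(slack, slew, slew_limit,
              crit_margin=0.0, safe_margin=50.0, slew_headroom=0.8):
    """Per-gate threshold choice; all margins are illustrative values in ps."""
    if slack < crit_margin:
        return "low"      # timing-critical: faster, leakier devices
    if slack > safe_margin and slew < slew_headroom * slew_limit:
        return "high"     # far from critical and comfortably within slew limit
    return "regular"

choices = [
    assign_vt(-5.0, 40.0, 100.0),  # failing path
    assign_vt(80.0, 40.0, 100.0),  # large slack, good slew
    assign_vt(80.0, 95.0, 100.0),  # large slack but slew too close to the limit
    assign_vt(10.0, 40.0, 100.0),  # neither critical nor clearly safe
]
```

  • Keeping a gap between `crit_margin` and `safe_margin` leaves near-critical gates at the regular threshold, so a high threshold substitution does not turn them into new critical paths.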
  • an optional step (not shown) of retuning the macro with the multiple threshold devices in place may now be performed to further optimize the sizes of transistors.
  • This retuning may also be done repeatedly, between iterations of mapping to a lower threshold only a subset of the transistors required to meet timing requirements, and mapping to a higher threshold only a subset of those transistors which can be so mapped. Because of the additional processing time required to repeat the tuning step, the retuning step is often skipped, or is exercised only during what is expected to be the final pass of the design process.
  • step 340 the value of each such device type parameter may be set to its “regular” value for each gate, and a gate with all its device type parameters set to their regular values will be considered a “regular” gate.
  • the tuning may then be performed with the parameter at this regular value, and in step 350 , after continuous tuning, an alternative device type (i.e., a non-regular value for one or more of its device type parameters) may be assigned to selected gates.
  • the gate library table (box 320 ) which is a table containing the type and sizes of all cells in the fixed cell library.
  • Each parameterized cell in the tuned circuit is matched to a fixed cell with the same logical function and the closest available size. For example, the sum of the squares of the differences in transistor sizes between the fixed cell and the parameterized cell can be used as a measure to be minimized while mapping the parameterized cells back to a fixed cell library. Once this mapping is complete, the design once again consists of a collection of members of the fixed cell library, and hence looks like a familiar random logic macro for the purposes of all the downstream software tools in the design methodology.
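  • The sum-of-squares matching mentioned above can be sketched directly. For brevity the comparison is made on the PPW and PNW values rather than on every individual transistor, which is an assumption; the library contents and cell names are hypothetical.

```python
def nearest_fixed_cell(ppw, pnw, library):
    """Pick the fixed cell minimizing the sum of squared size differences.

    library: list of (cell_name, fixed_ppw, fixed_pnw) for one logical function.
    """
    return min(library, key=lambda c: (c[1] - ppw) ** 2 + (c[2] - pnw) ** 2)

nand2_lib = [("NAND2_A", 1.0, 1.0), ("NAND2_B", 2.0, 2.0), ("NAND2_C", 4.0, 3.0)]
chosen = nearest_fixed_cell(2.6, 2.3, nand2_lib)
```

  • The distance between the tuned point and the chosen cell is what gives rise to binning error, so a denser library reduces that distance.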
  • the next step is to invoke transistor-level timing (box 360 ) on the tuned and re-mapped circuit.
  • the timing report is again stored in the database for auditing and comparison purposes. By comparing this timing report to the previous one from box 350 , for example, the exact nature and magnitude of timing differences due to “binning errors” can be determined. If the binning errors are large, for example, the constraint generation in box 315 could be revisited to generate tighter constraints, or a richer fixed-cell library may be considered.
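  • Comparing the two stored timing reports to quantify binning errors can be sketched as a per-endpoint slack difference. Representing a timing report as a dictionary of endpoint slacks is an assumption, as are the names and values used.

```python
def binning_errors(tuned, remapped):
    """Per-endpoint slack change caused by snapping to fixed cells."""
    return {k: remapped[k] - tuned[k] for k in tuned}

tuned = {"out0": 5.0, "out1": 2.0}      # slack after tuning (ps), hypothetical
remapped = {"out0": 3.0, "out1": -1.0}  # slack after mapping to fixed cells
errors = binning_errors(tuned, remapped)
worst = min(errors.values())            # most negative = largest degradation
```

  • A large magnitude of `worst` would be the signal to tighten the tuning constraints or to consider a richer fixed-cell library.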
  • a “Timing Abstract” (box 365 ) is developed, which is a macro-model that represents the timing behavior of the entire macro.
  • the abstract feeds into global timing (box 330 ) which helps to judge whether the tuning improvements on the particular macro that has been tuned thus far help to meet timing budgets at the next level up in the hierarchy, such as unit-level design or chip-level design. If the timing is acceptable, the design flow continues with the physical design steps (boxes 370 to 385 ). Otherwise, some amount of re-design is necessary. The simplest re-design at this stage is to apportion delay differently between macros, which would result in updated assertions (box 310 ).
  • the synthesis step could be repeated (box 335 ), or just the tuning steps could be repeated (boxes 345 to 365 ).
  • the TPS mode of tuning helps by reducing the delay of sub-critical paths, thereby making the delay-apportionment problem easier and accelerating overall timing convergence.
  • the next step in the design flow is to enter physical design. Since the imported design data in box 335 typically comes from a physical synthesis tool, placement information is already available for the various gates. But the sizes of gates have been changed by the tuner, so the placement may have to be adjusted slightly to make place for the gates that grew larger during tuning and take advantage of the space released by gates that became smaller. This change of placement to accommodate size changes is preferably carried out by an incremental “Engineering Change” (EC) placement. The placement update could also be carried out from scratch, but then the estimated wire parasitics could change drastically, causing unwanted timing changes.
  • the routing step completes the detailed connections between gates to completely wire the design with various levels of metal wires.
  • Each logic gate is replaced by its physical design (or layout) view in box 380 to fully assemble the detailed layout of the macro.
  • the electrical parasitics of the layout are also extracted to create a detailed model of the transistors and the wires that corresponds to the physical layout. This detailed model is timed in a final transistor-level timing run (box 390 ), using the updated version of the assertions if necessary.
  • the resulting timing report is placed in the timing report database (box 325 ) for comparison, audit and debugging purposes.
  • the timing abstract coming out of the transistor-level timing is now the most physically-aware and accurate picture of the timing of the macro and is used to update the abstract that feeds into global timing at the next level up in the hierarchy (unit-level or chip-level).
  • gate-level timing can be used instead of transistor-level timing if sufficiently accurate delay models are available for the fixed library cells used.
  • a mixed-level timing analysis may be performed in which gate level models are used for those gates which are not expected to be close to timing-critical (e.g., based on a timing analysis at an earlier design stage, or on a complete gate-level timing analysis), and transistor-level delay modeling is performed for those gates which are expected to be close to timing-critical. It is also possible to perform the marking before mapping to parameterized cells, and then mapping only tunable gates to their parameterized cell equivalents.

Abstract

A digital design method, which may be automated, is provided for obtaining timing closure in the design of large, complex, high-performance digital integrated circuits. The method includes the use of a tuner on random logic macros that adjusts transistor sizes in a continuous domain. To accommodate this tuning, logic gates are mapped to parameterized cells for the tuning and then back to fixed gates after the tuning. Tuning is constrained in such a way as to minimize “binning errors” when the design is mapped back to fixed cells. Further, the critical sections of the circuit are marked in order to make the optimization more effective and to fit within the problem-size constraints of the tuner. A specially formulated objective function is employed during the tuning to promote faster global timing convergence, despite possibly incorrect initial timing budgets. The specially formulated objective function targets all paths that are failing timing, with appropriate weighting, rather than just targeting the most critical path. Finally, the addition of multiple threshold voltage gates allows for increased performance while limiting leakage power.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation in part of the patent applications referenced below and contains subject matter which is related to the subject matter of the following co-pending applications, each of which is assigned to the same assignee as this application, International Business Machines Corporation of Armonk, New York. Each of the two patent applications from which priority is claimed is listed below and is hereby incorporated herein by reference in its entirety: [0001]
  • D. J. Hathaway, L. K. Lange, C. Visweswariah and P. M. Williams, “Method of Optimizing and Analyzing Selected Portions of a Digital Integrated Circuit,” U.S. patent application Ser. No. 10/436,213, filed on May 12, 2003, assigned to IBM. [0002]
  • D. J. Hathaway, C. Visweswariah, P. M. Williams, J. Zhou, “Method of Achieving Timing Closure in Digital Integrated Circuits by Optimizing Individual Macros,” U.S. patent application Ser. No. 10/435,824, filed on May 12, 2003, assigned to IBM. [0003]
  • TRADEMARKS
  • IBM® is a registered trademark of International Business Machines Corporation, Armonk, N.Y., U.S.A. and other names used herein may be registered trademarks, trademarks or product names of International Business Machines Corporation or other companies. [0004]
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0005]
  • This invention relates to methods for tuning the digital design and design automation of high-performance digital integrated circuits. The invention particularly is directed to the problem of developing an integrated circuit design optimization methodology which exploits circuit tuning of individual macros. The tuning of individual macros is conducted by optimizing transistor sizes over a defined continuous design space. To further optimize for performance, circuits with low or high threshold voltage transistors are selectively substituted for regular threshold ones. [0006]
  • 2. Description of Background [0007]
  • Typically the datapath and array design sections of a high-speed microprocessor design are logically well-defined. In addition, these circuit sections are typically custom-designed circuits which are electrically and physically designed much earlier in the design cycle to assure high clocking performance. The remaining sections of the design, the control logic sections, are often changed late in the microprocessor design cycle to reach the required logical function. Other objectives such as timing closure put an additional constraint on the system, making automated optimal design closure techniques extremely valuable. Automated techniques deliver several advantages such as improved circuit performance, higher quality and correctness, and enhanced time-to-market. [0008]
  • The control logic is contained in physical entities called random logic macros or RLMs, where the term random does not imply true randomness, but instead a lack of regular structure as is found in datapath and array circuitry. Due to the unstructured nature of the logic, synthesis and place/route tools are employed to read in a logical description and transform it into primitive logical gates. Hereafter the term gate is understood to include a collection of transistors which is to be treated as a single logical circuit element. These gates are adjusted in drive strength to achieve timing objectives and placed legally while a wiring tool routes the connections between these gates to complete the physical design. [0009]
  • Prior-art methods of circuit tuning towards timing closure are illustrated in FIG. 1 (flow [0010] 100) in the case of a “flat” design methodology and FIG. 2 (flow 200) in the case of a “hierarchical” design methodology. For the majority of semiconductor designs, FIG. 1 is typically employed where designs are analyzed and optimized in a flat representation. The flow starts with a logic design specification (labeled “Customer Logic Drop” in box 110) whereby a logic synthesis tool (box 120) synthesizes the input logic description into proper logical gates and structures to assure logical correctness. Logic restructuring of the gates (box 130) maps the logical gates into technology-approved gates required for manufacturing. An initial timing analysis of the design is typically applied whereby the gates are put through a process of powering up and down (box 140), depending on current drive needed to move towards cycle-time goals. This process continues with the physical design (box 150) which entails the physical placement of the gates and global routing of the design which is needed to provide a rough understanding of the real wiring of the design. The global routing provides parasitic extraction details which, along with the placed design, is provided as input to static timing analysis (box 160) to predict the performance of the current design. Many iterations of this process (box 160 to box 120) are attempted with different adjustments of the entire process to achieve timing closure. If cycle time or the physical constraints are not met via this process, the design may take on new engineering logic changes (box 170), and if so the design is started once again through the synthesis, placement, timing process as described (boxes 120-160). If no logic changes are required, the gates of the design are connected via routing, the electrical parasitics of the routing are extracted, and the design proceeds through a final static timing (box 180). 
Other checks are applied at this time but typically the design must be adjusted logically to achieve cycle time goals and the design is presented back to the synthesis, placement and timing process of boxes 120-160. Upon final acceptance of the design, it is sent to manufacturing (box 190).
  • The other prior-art methodology to achieve timing closure is illustrated in FIG. 2 (flow 200) and is usually implemented for more complex and dense semiconductor designs such as a microprocessor. This hierarchical methodology presents many benefits in the spirit of “divide and conquer.” It permits a parallel approach to tackling the complexity of the design by enabling the design team to work on macro partitions of the overall design. Each macro partition has a space budget and a timing budget associated with it. Once the macro partition has been designed, its actual timing is represented as a timing abstract, which is a simplified representation of its salient overall timing characteristics. This method improves design turnaround time and time-to-market. It also permits an analysis of the performance of the chip at any stage of the design, substituting full timing abstract models for portions of the chips that have been designed for the estimated or budgeted timing models created during high-level design planning for the portions that haven't yet been designed. [0011]
  • The boxes on the left of FIG. 2 (boxes 205 to 230) pertain to a custom design flow. Custom design is typically applied to arithmetic or dataflow circuits, or other circuits requiring careful hand-design. The boxes on the right of FIG. 2 (boxes 235-265) pertain to a design flow for automatically synthesized and placed random logic macros. The boxes in the middle (boxes 270 to 290) pertain to the steps in the methodology that bring together the custom and random logic macros for global integration and timing of the chip. Thus flow 200 of FIG. 2 embodies two levels of timing closure: one at the individual macro level, and another at the global chip level. The following four paragraphs explain the three main sections of this prior-art hierarchical design methodology. [0012]
  • Functions like adders and arrays are typically implemented as custom macros. It is widely known that efficient implementations of such macros cannot be totally developed by automatic computer-aided design (CAD) software and that in-depth engineering is required in many steps to assure performance. Therefore a custom macro's logic architecture (box 205) is described but the architecture is restructured logically and modified by hand for optimal timing performance (box 210). A rough physical placement is constructed of the major building blocks (box 215) to estimate overall size and to minimize parasitic element constraints within the design. The schematic is developed with these constraints and static or dynamic timing analysis is performed (box 220) by employing a static timing analysis tool or circuit simulator, respectively. The macro design is physically engineered, parasitics are extracted and the design is timed again (box 225). As necessary, the processes of boxes 210 to 225 are iterated to improve the timing characteristics of the macro. [0013]
  • For the purposes of hierarchical analysis, a timing macro model, called a timing abstract (box 275), is produced. A timing abstract is a simplified model that represents the timing behavior of the entire macro. The timing abstract typically contains the timing behavior of timing arcs to and from macro boundary pins to latch points within the macro design. These timing abstracts are incorporated at the global level to enable the chip-level timing analysis (box 270) for the entire chip design. Timing assertions (box 230) or constraints are fed back and applied on the macro pins from the global timing analysis. These timing assertions when placed on the custom macro during timing analysis could result in the need for additional logic restructuring for timing optimization and the process is started again (box 225 back to box 210). To help speed this process it is extremely effective to produce the timing abstracts in the beginning of the design from the schematic timing analysis (dotted arrow from box 220 to box 275). [0014]
  • In parallel to the custom flow, the Random Logic Macro (RLM) flow starts with an initial logic specification (labeled “RLM Logic Drop” in box 235). This part of the flow is very similar to FIG. 1 (flow 100). The logic is synthesized (box 240), logic is restructured for technology adaptation (box 245), gates are powered up or down for required drive strengths (box 250), and the resulting netlist is physically designed (box 255) for minimal wire length on critical paths. Once completed, parasitic extraction and timing analysis are executed (box 260) on the design. Constraints are typically adjusted throughout this process and the loop is iterated many times (box 260 back to box 240) to produce optimal performance. As with custom designs, timing abstracts (box 275) for the RLMs are produced for global timing analysis. Timing assertions (box 265) are fed back to the RLM design loop as additional constraints on the design. [0015]
  • The third main component of flow 200 comprises the boxes in the middle of FIG. 2 in which the various custom and RLM macros are integrated to obtain a final chip design. Global timing results (box 270) and new logic changes (box 280) would require either custom or RLM macros or both to be redesigned depending on the change required. Evaluation of a final timing run, physical design requirements and other checks (box 285) might also require a loop through either the custom or RLM flows. Until all checks and analysis are satisfied, the chip cannot be released for manufacturing (box 290). [0016]
  • The two described prior-art methodology flows have various strengths and weaknesses, which are discussed below. Analyzing a flat design as in FIG. 1 (flow 100) enables a small design team to close in on the cycle time quickly with the help of a fully automated chip development system. Typically the optimal cycle time for a given technology is not achieved due to the imperfections within this system including approximations made along the way and short cuts taken throughout the process to achieve a quick turnaround from the design system. Inaccuracies include, for example, the use of approximate timing models rather than transistor-level simulation. Short cuts include sub-optimal and heuristic optimization methods, a relatively small library to keep synthesis turnaround time to a minimum and relatively simplistic modeling and optimization techniques. These limitations or imperfections force the system in FIG. 1 (flow 100) to iterate many times before a successful conclusion on the achievable performance. Detailed “cross-section analysis” of the worst case paths of the design is typically not conducted in such a flow. [0017]
  • A flat design flow as in FIG. 1 (flow 100) implies very large optimization problems with very large design spaces. Various commercial and in-house CAD (computer-aided design) tools have been developed to solve the optimization problems in this flow. Due to the large number of optimization variables and the discrete nature of these variables, heuristic methods have been favored in these optimization tools. The use of heuristic methods leads to a number of difficulties including problems in achieving performance, a large number of design iterations, incomplete exploitation of the technology, all of which lead to long design times and sub-optimal designs. [0018]
  • As digital designs become more complex and dense as in leading-edge microprocessors and SoCs (systems on a chip), FIG. 1 (flow 100) is insufficient to obtain timing closure and system objectives, hence a hierarchical methodology like that of FIG. 2 (flow 200) is applied. While FIG. 2 (flow 200) controls the design complexity with the use of macro partitions, timing closure must now be accounted for at two levels of the design, both locally within the macro and globally. Referring back to FIG. 2 (flow 200), the difficulty with this flow is the management of the timing budgets between macros and global connections. The process requires careful management of these budgets through the use of the wiring and buffering solutions of the global nets and careful logic implementation of the macro paths connecting between these global paths. Cycle time closure on the critical paths could require logic re-design to reduce overall delays of these timing paths. This redesign could affect connections at the global level and new solutions of wiring and buffering must be implemented. Therefore logic and circuit solutions at both the global and macro levels must be analyzed and developed to achieve optimum cycle time performance. Another potential problem is that typically only the top most critical path or paths are searched, broken down and analyzed for the required solution, but as soon as that is done, other sub-critical paths which were not exposed before may now be limiting. The procedure is therefore time-consuming and not guaranteed to produce a design that meets the required system-level performance. [0019]
  • Thus, the hierarchical design flow of FIG. 2 (flow 200) suffers from two main problems. The first is the difficulty of iteratively adjusting the budgets of the individual macros so as to meet global timing requirements, while giving individual macros reachable targets. The second is that focusing on just a few critical paths is not sufficient to meet overall timing and leads to a great deal of re-design. [0020]
  • As described in the previous two paragraphs, both flows limit the progression of the design due to the nature of the heuristic iterations that each flow applies, and therefore retard the circuit performance that can be achieved. This slow convergence rate towards cycle time objectives limits the flexibility of the design team to introduce functional and timing changes throughout the design process, and in particular during the crucial period late in the design cycle. In all cases, only a small subset of critical paths is exposed towards an optimal solution. [0021]
  • Modern technologies allow multiple threshold voltage transistors, whereby transistors with different threshold voltages can be integrated on the same chip. Low threshold voltage (Low Vt) transistors offer faster performance, but at the cost of increased leakage power. High threshold voltage (High Vt) transistors offer significant reduction in leakage power, but at the cost of lower performance. It is therefore beneficial to sparingly use Low Vt devices on the critical paths to achieve higher performance, but limit the usage of such devices to limit leakage power. It is also beneficial to use High Vt devices on the non-critical paths to reduce leakage power, but not to the extent that the non-critical paths slow down and turn into critical paths. [0022]
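  • The trade-off described above can be illustrated with a deliberately naive slack-threshold rule. This sketch is not the method of the invention; the function name assign_vt, the margin parameters and their default values are hypothetical, and slack is assumed to be given per gate in the same time units as the margins.

```python
def assign_vt(slacks, low_margin=0.0, high_margin=50.0):
    """Pick a threshold-voltage class per gate from its timing slack.

    slacks: dict mapping gate name -> slack (negative means failing).
    Gates with slack below low_margin get "low" Vt (faster, leakier);
    gates with slack above high_margin get "high" Vt (slower, less
    leakage); everything else stays at the "regular" Vt.
    """
    vt = {}
    for gate, slack in slacks.items():
        if slack < low_margin:
            vt[gate] = "low"
        elif slack > high_margin:
            vt[gate] = "high"
        else:
            vt[gate] = "regular"
    return vt
```

  Because a High Vt gate is slower, a practical flow would need to re-time the design after such an assignment to confirm that formerly non-critical paths have not become critical.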
  • Physical synthesis CAD tools employ heuristic methods to introduce multiple threshold devices with the dual objectives of achieving higher performance and limiting leakage power. The prior-art heuristic methods limit the ability to optimally adjust the performance and leakage of the circuits. [0023]
  • The prior art therefore suffers from several problems and weaknesses as summarized below: [0024]
  • a) Prior art design flows generally require excessive iteration. [0025]
  • b) Prior art design flows focus on optimizing only the single most critical path or a very small subset of the most critical paths, which in turn leads to long design times and slow convergence towards cycle time objectives. [0026]
  • c) In prior art hierarchical flows, the interaction between imperfect budgeting across the hierarchy at the global level and imperfect optimization at the macro level can lead to poor circuit performance and excessive redesign effort. [0027]
  • d) Prior art flows do not include continuous optimization techniques during RLM design, despite the ability of such techniques to obtain optimal solutions. Further, prior-art flows do not include specialized techniques to deal with the relatively low capacity and high run times of such continuous optimization methods. [0028]
  • SUMMARY OF THE INVENTION
  • Disclosed is an efficient and effective methodology for obtaining timing closure in the design of large, complex, high-performance digital integrated circuits. The novel methodology includes the use of a tuner on random logic macros that adjusts transistor sizes in a continuous domain. To accommodate this tuning, logic gates are mapped to parameterized cells for the tuning process and then back to fixed gates after tuning. Tuning is constrained in such a way as to minimize “binning errors” when the design is mapped back to fixed cells. Further, the critical sections of the circuit are marked in order to make the optimization more effective and to fit within the problem-size constraints of the tuner. A specially formulated objective function is employed during the tuning to promote faster global timing convergence, despite possibly incorrect initial timing budgets. The specially formulated objective function targets all paths that are failing timing, with appropriate weighting, rather than just targeting the most critical path. Finally, the addition of multiple threshold voltage gates allows for increased performance while limiting leakage power. [0029]
  • These and other improvements are set forth in the following detailed description. For a better understanding of the invention with advantages and features, please refer to the detailed description and to the drawings. [0030]
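  • The difference between targeting only the most critical path and targeting all failing paths can be sketched as follows. The summary does not give the form of the objective function, so the power-law penalty below is only an assumed example of "appropriate weighting"; the function names and the exponent are hypothetical.

```python
def worst_slack(slacks):
    """Objective that looks only at the single most critical path."""
    return min(slacks)


def weighted_slack_penalty(slacks, power=2.0):
    """Assumed example objective: every failing path (negative slack)
    contributes, and paths that fail by more are weighted more heavily.
    Paths that meet timing contribute nothing."""
    return sum((-s) ** power for s in slacks if s < 0.0)


# Three paths: two failing, one passing (arbitrary time units).
before = [-3.0, -1.0, 2.0]
# Suppose tuning fixes only the worst path.
after = [0.0, -1.0, 2.0]
```

  For the example data, the penalty drops from 10.0 to 1.0 but stays nonzero after the worst path is fixed, so the remaining failing path continues to drive the optimization, whereas a worst-slack objective sees only one path at a time.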
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which: [0031]
  • FIG. 1 illustrates a typical prior-art “flat methodology” iterative procedure for achieving timing closure of a typical semiconductor digital integrated circuit. [0032]
  • FIG. 2 illustrates a typical prior-art “hierarchical methodology” iterative procedure for achieving timing closure of a high-performance digital integrated circuit or functional unit of a high-performance digital integrated circuit. [0033]
  • FIG. 3 illustrates the inventive methodology flow for the tuning of circuits within synthesized random logic macros with the optional substitution of multiple threshold voltage devices. For completeness, the figure also includes the final stages of routing, extraction and final timing. [0034]
  • FIG. 4 illustrates the transistor-level schematic of a two-way NAND gate parameterized cell. [0035]
  • FIG. 5 illustrates a plot of PPW and PNW parameters of a set of predefined gates of a given function which may be mapped to a single parameterized cell. [0036]
  • The detailed description explains the preferred embodiments of the invention, together with advantages and features, by way of example with reference to the drawings. [0037]
  • DETAILED DESCRIPTION OF THE INVENTION
  • An inventive design and optimization methodology for random logic macros (RLMs) is shown in FIG. 3 (flow 300). The salient features of the preferred components of this methodology and how they fit together are explained below. [0038]
  • Referring to FIG. 3, the first step is to import design data (box 335) that comes out of a logical or physical synthesis program, which is known to be logically correct. This design data is at the gate-level, i.e., it typically consists of a collection of logic gate primitives such as NAND gates, NOR gates, etc. These primitives are called “fixed cells” since they are available in fixed transistor size combinations in the library of logic gates. As will be described in detail below, the inventive methodology maps the fixed-cell gate-level design to an equivalent “parameterized cell” transistor-level design in order to apply sophisticated mathematical optimization techniques, and then finally returns the design to a fixed-cell gate-level description. A “parameterized cell” is a primitive cell which employs continuous variables for the strengths or widths of each transistor that makes up that cell, to be further described later. [0039]
  • The next step after importing the design is to strip any threshold voltage assignments and map the fixed-cell logic gates to parameterized gates (box [0040] 340). Since the identities of the timing-critical paths will change and are not known until later in the optimization flow, it is not advantageous to accept the threshold voltage assignments suggested by the logical or physical synthesis program. Instead, all transistors are set to a “regular” or “normal” threshold voltage for the particular technology at hand, where the “regular” threshold voltage is chosen among a plurality of choices available in the implementation technology as one which provides a good trade-off between leakage current and performance on typical logic paths in the design. Next, the logic gates are mapped to members of a parameterized library [see G. A. Northrop and P-F. Lu, “A semi-custom design flow in high-performance microprocessor design,” Proc. 2001 Design Automation Conference, Las Vegas, Nev., June 2001, pages 426-431]. A parameterized cell is “in-between” a fixed-cell with no transistor-size flexibility and a full-custom cell in which the width of each transistor can be adjusted arbitrarily and independently. In parameterized cells, the PFETs are typically “grouped,” i.e., they are all controlled by one parameter called PPW in the sequel. Likewise, all the NFETs are controlled by one parameter called PNW. Thus a 3-input NAND gate has 6 transistors, but is parameterized by just two variables: PPW and PNW. Referring to FIG. 4, schematic 400 is of a two-way NAND gate parameterized cell comprising PFETs 410 and 420 and NFETs 430 and 440. Parameter PPW for this cell controls the widths of PFETs 410 and 420, forcing the ratio between their widths (typically 1/1) to remain constant. Similarly, parameter PNW for this cell controls the widths of NFETs 430 and 440, forcing the ratio between their widths to remain constant. Referring to FIG. 
5, point 500 represents a particular fixed cell implementation of a function with PPW and PNW values determined by the location of the point with respect to plot axes 510. The collection of points 520 represents all available fixed cells for this function. Arrow 530 represents a direction of variation for parameters PPW and PNW corresponding to increasing or decreasing drive strength of the cell. Arrow 540 represents a direction of variation of parameters PPW and PNW corresponding to increasing or decreasing the beta ratio (the ratio between the PFET and NFET strengths of the gate). Parameterized cell libraries possess numerous advantages. For optimization purposes, the adoption of parameterized cells converts the problem from one of discrete optimization to a continuous optimization problem. Parameterized cells also give rise to optimization problems with fewer independent variables than full-custom library cells. By the addition of appropriate constraints in the continuous optimization problem (e.g., minimum and maximum value constraints on PNW or PPW, and minimum and maximum value constraints on PPW/PNW), the continuous optimization process may be forced to consider only solutions which fall close in both PPW and PNW values to one of the fixed cells in the library. Parameterized cells are also amenable to automated layout techniques [see G. A. Northrop and P-F. Lu, “A semi-custom design flow in high-performance microprocessor design,” Proc. 2001 Design Automation Conference, Las Vegas, Nev., June 2001, pages 426-431] and therefore allow library design flows to be more automated with relatively small loss of performance compared to full-custom design.
  • Returning to FIG. 3, in the mapping step of box 340, each fixed cell of the original design is replaced by a parameterized cell of equivalent logical function, and PPW and PNW are chosen for each cell so as to match the original transistor sizes as closely as possible. [0041]
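  • One way to choose a group parameter that matches the original transistor sizes "as closely as possible" is a least-squares fit. The sketch below assumes that each transistor width in a group equals a fixed ratio times the group parameter (PPW for PFETs, PNW for NFETs); that model, the function name fit_group_parameter and the example numbers are assumptions made for illustration. Minimizing the sum of (w_i - r_i P)^2 over P gives P = sum(r_i w_i) / sum(r_i^2).

```python
def fit_group_parameter(widths, ratios):
    """Least-squares value of a group parameter P such that
    ratios[i] * P best matches widths[i] for every transistor."""
    if not widths or len(widths) != len(ratios):
        raise ValueError("widths and ratios must be non-empty and equal length")
    numerator = sum(r * w for r, w in zip(ratios, widths))
    denominator = sum(r * r for r in ratios)
    return numerator / denominator


# Two-way NAND with a 1/1 PFET ratio: fixed-cell PFET widths of 3.0 and
# 5.0 (arbitrary units) cannot both be matched, so the fit lands between.
ppw = fit_group_parameter([3.0, 5.0], [1.0, 1.0])
```

  When the fixed cell already obeys the ratio exactly, the fit reproduces the original widths; otherwise the residual indicates how much the parameterized cell deviates from the fixed cell it replaces.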
  • The next step (box 345) is to conduct a “baseline” timing of the circuit to determine its performance for comparison purposes later in the design flow. This baseline timing is typically performed by means of a transistor-level static timer [see V. B. Rao, J. P. Soreff, T. B. Brodnax and R. E. Mains, “EinsTLT: transistor-level timing with EinsTimer,” Proc. TAU ACM/IEEE workshop on timing issues in the specification and synthesis of digital systems, Austin, Tex., December, 1999]. Other than the input design and technology models, one of the main inputs to the transistor-level timing program is a set of timing assertions (box 310), which provide the time at which input signals arrive, the time at which output signals are required, the external capacitive load driven by the outputs, and so on. The result of this timing step is a timing report (box 325) which is stored in a database for future comparison purposes. [0042]
  • [0043] One of the main goals of the inventive flow is to use continuous transistor-level optimization techniques to improve the performance of the circuit. Continuous transistor-level optimizers (called “tuners” in the sequel) use sophisticated mathematical techniques [see A. R. Conn, N. I. M. Gould and Ph. L. Toint, “LANCELOT: A Fortran package for large-scale nonlinear optimization (Release A),” Springer Verlag, 1992] to obtain an optimal solution to the transistor sizing problem [see A. R. Conn, I. M. Elfadel, W. W. Molzen Jr., P. R. O'Brien, P. N. Strenski, C. Visweswariah and C. B. Whan, “Gradient-based optimization of custom circuits using a static-timing formulation,” Proc. 1999 Design Automation Conference, New Orleans, La., pages 425-429, June 1999]. As a result, they are able to gain tremendous performance improvement. Unfortunately, they typically cannot handle the large size of a random logic macro. Since it is impractical to tune the entire macro, a “marking” step is undertaken (box 345) mainly to reduce the size of the problem to a size that is practical for a tuner to tackle. If we are mainly interested in performance, then, since the performance is limited by the most critical paths, the marking step marks all parts of the circuit that are considered to have a chance of being timing-critical as “tunable” and the rest as “untunable.” This reduces the size of the optimization problem, which also includes latches and connecting clock circuitry. If we are also interested in reducing power or area, the least timing-critical sections could be marked tunable so that the optimizer can take advantage and reduce the area and power in these non-critical sections by downsizing transistors as appropriate. The co-pending application D. J. Hathaway, L. K. Lange, C. Visweswariah and P. M. Williams, “Method of Optimizing and Analyzing Selected Portions of a Digital Integrated Circuit,” U.S. patent application Ser. No. 10/436,213 referenced above illustrates a preferred method of marking the circuit to produce a smaller optimization problem, at the same time giving the tuner maximum flexibility to improve the circuit's performance.
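The marking idea can be illustrated with a minimal sketch. It is not the preferred method of the co-pending application, whose details are not given here; it simply assumes that each gate carries a worst slack taken from the baseline timing report and that fixed slack thresholds decide what is tunable. The `Gate` class, the field `slack_ps`, and both threshold parameters are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    slack_ps: float  # assumed: worst slack through the gate, from the baseline timing report


def mark_tunable(gates, critical_margin_ps, relaxed_margin_ps=None):
    """Return {gate name: True if marked tunable}.

    Gates with slack at or below critical_margin_ps are treated as potentially
    timing-critical. If relaxed_margin_ps is given, gates with slack at or above
    it are also marked, so that a tuner could downsize them to save area or power.
    """
    marks = {}
    for g in gates:
        critical = g.slack_ps <= critical_margin_ps
        relaxed = relaxed_margin_ps is not None and g.slack_ps >= relaxed_margin_ps
        marks[g.name] = critical or relaxed
    return marks


gates = [Gate("u1", -12.0), Gate("u2", 120.0), Gate("u3", 400.0)]
perf_marks = mark_tunable(gates, critical_margin_ps=50.0)
power_marks = mark_tunable(gates, critical_margin_ps=50.0, relaxed_margin_ps=300.0)
```

In the performance-only call, just `u1` is tunable; when a relaxed margin is supplied, the far-from-critical `u3` is marked as well, mirroring the two marking goals described above.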
  • [0044] Before the tuner can be invoked, there is another step, that of generating design and library constraints (box 315). These are constraints whose general goal is to keep the optimizer in the fixed-cell region of FIG. 5 during continuous optimization such that “binning errors” are minimized. The final goal after tuning in a continuous space is to map the cells back to a fixed-cell library, a procedure called “binning.” Since the tuner finds an optimum solution by mathematical methods, the “binning” procedure invariably causes a loss of performance. The constraints generated in box 315 are a pro-active step to minimize the potential loss of performance due to a (yet-to-be-performed) binning step. One example of a design constraint is that the input loading capacitance at each primary input must be maintained at its starting value, so as not to unduly load the macro from which that input arrives. An example of a library constraint is an upper bound on PPW or PNW to reflect the range of cell sizes available in the fixed-cell library. Another example of a library constraint is a beta ratio constraint (i.e., upper or lower bound on the PFET to NFET strength ratio) in order to stay within the range of beta ratios available in the fixed-cell library. It may also be desirable to have a constraint on the total area of the macro, often represented by the total tunable transistor width. Design projects will often impose slew (rise/fall time) limits so as to prevent noise problems.
  • [0045] To make the generation of the design and library constraints efficient, the tuner preferably accepts these constraints on a cell-type basis. In other words, a certain specified constraint is automatically and efficiently applied to all instances of a specified cell type.
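As a rough illustration of cell-type-based constraints, the sketch below expands one set of bounds per cell type onto every instance of that type and then checks a candidate sizing against them. The data layout, the function names, and the use of PPW/PNW as a stand-in for the beta ratio are assumptions made for this example only; the actual tuner input format is not described in the text.

```python
def expand_constraints(instances, per_type):
    """Apply per-cell-type bounds to every instance of that type.

    instances: {instance name: cell type}
    per_type:  {cell type: {parameter: (lower bound, upper bound)}}
    """
    expanded = {}
    for inst, cell_type in instances.items():
        if cell_type in per_type:
            expanded[inst] = dict(per_type[cell_type])
    return expanded


def satisfied(ppw, pnw, bounds):
    """Check PPW, PNW and an assumed beta ratio (PPW / PNW) against the bounds."""
    values = {"PPW": ppw, "PNW": pnw, "beta": ppw / pnw}
    return all(lo <= values[key] <= hi for key, (lo, hi) in bounds.items())


# Hypothetical bounds reflecting the size range of a fixed-cell library.
per_type = {"NAND2": {"PPW": (1.0, 8.0), "PNW": (1.0, 8.0), "beta": (1.0, 2.5)}}
instances = {"u1": "NAND2", "u2": "NAND2", "u3": "INV"}

expanded = expand_constraints(instances, per_type)
ok = satisfied(4.0, 2.0, expanded["u1"])        # beta = 2.0, inside the bounds
too_skewed = satisfied(6.0, 2.0, expanded["u1"])  # beta = 3.0, outside the bounds
```

Writing the bounds once per cell type keeps the constraint file small even when a macro contains many instances of the same cell.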
  • [0046] Once the constraints have been generated, the next step is to carry out the actual circuit tuning (box 350). Since the circuit to be tuned now consists of parameterized cells, the parameters controlling the transistor sizes of the tunable cells are treated as tunable parameters. In other words, the ratio-ing inherent in parameterized cells is respected during the tuning. Between the marking step in box 345 and the use of parameterized cells, the size of the optimization problem is vastly reduced and therefore the mathematical continuous transistor-level tuner can complete a high-quality tuning run in a practical amount of run time (practical usually implies a run time that can be accomplished in an overnight computer run). If the input design is hierarchical, it is flattened to the gate-level at this stage so as to improve the chances of obtaining performance improvement from the tuner. Although it is not shown in FIG. 3, a timing report is generated before and after the tuning step and placed in the database of box 325. These reports, in conjunction with some simple report-comparison scripts (box 325), help in auditing exactly which steps resulted in performance improvements and which did not.
  • [0047] Since a mathematical optimizer is employed in box 350, there is considerable flexibility in choosing the objective function of the optimizer. For example, the worst path delay of the circuit could be minimized subject to area and other constraints. Or the total transistor width of the circuit could be minimized subject to delay and other constraints. In the case when delay is minimized, the optimizer tends to try to improve the most critical path or paths to the exclusion of other paths. This behavior is not conducive to obtaining global timing convergence at the next higher level of hierarchy, such as at the unit-level or chip-level. Rather, it is beneficial to try to improve the timing of all paths that can potentially cause timing problems. The preferred method of formulating an objective function in this manner is taught in the co-pending application D. J. Hathaway, C. Visweswariah, P. M. Williams, J. Zhou, “Method of Achieving Timing Closure in Digital Integrated Circuits by Optimizing Individual Macros,” U.S. patent application Ser. No. 10/435,824 referenced above, in which a mode of tuning called “Total Positive Slack” (or TPS mode) is employed. The benefit of such a formulation of the objective function is that in addition to critical path delays, sub-critical path delays are also improved, making assertion updating easier and global timing convergence quicker.
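The first two objective choices mentioned above can be written schematically as follows. The notation is an assumption introduced only for illustration: w is the vector of tunable size parameters, P the set of timed paths, d_p(w) the delay of path p, A_max an area (total width) budget, and T a required delay. The TPS formulation itself is defined in the co-pending application and is not reproduced here.

```latex
% Delay minimization subject to an area budget and size bounds
\min_{w}\ \max_{p \in P} d_p(w)
\quad \text{s.t.} \quad \sum_i w_i \le A_{\max}, \qquad w_i^{\min} \le w_i \le w_i^{\max}

% Area (total width) minimization subject to a delay requirement and size bounds
\min_{w}\ \sum_i w_i
\quad \text{s.t.} \quad \max_{p \in P} d_p(w) \le T, \qquad w_i^{\min} \le w_i \le w_i^{\max}
```

The first form concentrates all effort on whichever path currently attains the maximum, which is the behavior the text describes as unhelpful for global timing convergence.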
  • [0048] Once the results of the tuning have been obtained, the true identity of the critical paths is known and multiple threshold devices can now be re-inserted to improve the performance and leakage power characteristics of the macro. This procedure is facilitated if the library consists of sets of equivalent cells that are identical except for containing all transistors of a different type (e.g., low Vt, regular Vt or high Vt). In such a situation, an entire gate can be swapped for another gate with a different type of transistors, but equivalent size, logical function and layout. There are two parts to the re-insertion of multiple threshold devices. On the critical paths, low threshold voltage devices are inserted in such a way as to improve the delay of the critical path as much as possible. If more than one threshold level is available which is lower than the regular threshold level at which the circuit was tuned, the threshold level closest to the regular threshold level is used preferentially, where it will provide sufficient improvement to meet timing requirements, with threshold levels farther from the regular threshold level being used with decreasing frequency. There are several well-known techniques for substituting low threshold voltage devices for regular threshold voltage devices in order to improve the performance of the macro, while keeping a limit on the total transistor width of the low threshold voltage transistors in order to limit the amount of leakage power of the macro. The second part of the re-insertion is substitution of high threshold voltage devices for regular threshold voltage devices. If the signal driven by a gate is nowhere near being timing-critical, and if the gate handily meets its slew limit, it is a good candidate for high threshold voltage substitution. The substitution reduces the leakage power of the macro, and since the gate is far from critical in a timing sense, the increase in gate delay does not have any impact on the timing characteristics of the overall macro. If more than one threshold level is available which is higher than the regular threshold level at which the circuit was tuned, the highest threshold level which will provide sufficient drive strength to meet timing requirements is used preferentially in each case. Methods for such substitutions are well-known in the literature [see M. Ketkar and S. S. Sapatnekar, “Standby power optimization via transistor sizing and dual threshold voltage assignment,” Proc. International Conference on Computer-Aided Design (ICCAD), San Jose, Calif., pages 375-378, November 2002 and W. Liqiong, C. Zhanping, M. Johnson, K. Roy and V. De, “Design and optimization of low-voltage high-performance dual-threshold CMOS circuits,” Proc. 1998 Design Automation Conference, San Francisco, Calif., pages 489-494, June 1998].
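The preference order described above can be sketched per gate as follows. This is a deliberately simplified illustration, not one of the cited substitution methods: it assumes each gate has a single slack value and a known delay change for each available threshold level, it treats gates independently even though gates on one path share slack, and it omits the slew check and the limit on total low-threshold transistor width. The function name, the level names, and the guard band are hypothetical.

```python
def choose_vt(slack_ps, delay_delta_ps, guard_ps=0.0):
    """Pick a threshold level for one gate.

    delay_delta_ps: {level name: gate delay change relative to the regular level},
                    negative for lower thresholds, positive for higher ones.
                    The regular level is assumed to be named "rvt".
    """
    faster = sorted((d, vt) for vt, d in delay_delta_ps.items() if d < 0)
    slower = sorted((d, vt) for vt, d in delay_delta_ps.items() if d > 0)
    if slack_ps < 0:
        # Critical gate: try the lower threshold closest to regular first.
        for d, vt in reversed(faster):
            if slack_ps - d >= 0:
                return vt
        # Nothing fully fixes the slack, so fall back to the fastest level.
        return faster[0][1] if faster else "rvt"
    # Non-critical gate: try the highest threshold first, keeping a guard band.
    for d, vt in reversed(slower):
        if slack_ps - d >= guard_ps:
            return vt
    return "rvt"


deltas = {"ulvt": -15.0, "lvt": -8.0, "rvt": 0.0, "hvt": 12.0, "uhvt": 25.0}
mildly_critical = choose_vt(-5.0, deltas)
very_critical = choose_vt(-10.0, deltas)
some_margin = choose_vt(30.0, deltas, guard_ps=10.0)
large_margin = choose_vt(100.0, deltas, guard_ps=10.0)
thin_margin = choose_vt(5.0, deltas, guard_ps=10.0)
```

A production flow would re-time the macro after each batch of swaps rather than trusting per-gate slack, since a swap on one gate changes the slack seen by its neighbors.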
  • [0049] Since insertion of multiple threshold devices will alter the timing characteristics of the macro from those assumed in the preceding tuning process, an optional step (not shown) of retuning the macro with the multiple threshold devices in place may now be performed to further optimize the sizes of transistors. This retuning may also be done repeatedly, between iterations of mapping to a lower threshold only a subset of the transistors required to meet timing requirements, and mapping to a higher threshold only a subset of those transistors which can be so mapped. Because of the additional processing time required to repeat the tuning step, the retuning step is often skipped, or is exercised only during what is expected to be the final pass of the design process.
  • [0050] Other device type substitutions such as selection between alternative gate insulator thicknesses may also have effects on gate performance, leakage, and other characteristics of interest. Substitution between different alternatives in these spaces may be handled in a manner similar to the multiple threshold processing described above. Specifically, in step 340 the value of each such device type parameter may be set to its “regular” value for each gate, and a gate with all its device type parameters set to their regular values will be considered a “regular” gate. The tuning may then be performed with the parameter at this regular value, and in step 350, after continuous tuning, an alternative device type (i.e., a non-regular value for one or more of its device type parameters) may be assigned to selected gates.
  • [0051] At this stage, we have an optimized circuit consisting of parameterized cells, and some low threshold voltage and some high threshold voltage gates. One option would be to treat this as a final circuit and proceed to physical design. However, the problem is that each parameterized cell defines its own unique transistor sizes, and therefore requires its own unique layout. Although these unique layouts could be generated by automated techniques, one problem is that of explosion of data volume in representing and manipulating the chip design. The second problem is that various downstream software tools in the random logic macro design flow expect the circuit to consist of a collection of fixed cells. For these reasons, the parameterized cells are typically mapped back to members of the fixed cell library (box 355), a procedure called “binning.” The mapping is a simple procedure. It takes as input the gate library table (box 320) which is a table containing the type and sizes of all cells in the fixed cell library. Each parameterized cell in the tuned circuit is matched to a fixed cell with the same logical function and the closest available size. For example, the sum of the squares of the differences in transistor sizes between the fixed cell and the parameterized cell can be used as a measure to be minimized while mapping the parameterized cells back to a fixed cell library. Once this mapping is complete, the design once again consists of a collection of members of the fixed cell library, and hence looks like a familiar random logic macro for the purposes of all the downstream software tools in the design methodology.
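The sum-of-squares matching mentioned above can be sketched in a few lines. The layout of the gate library table (a list of name, logical function, and size tuple) and the cell names are assumptions made for this example; the text only states that the table holds the type and sizes of all fixed cells.

```python
def bin_cell(function, tuned_sizes, library):
    """Return the name of the fixed cell closest to a tuned parameterized cell.

    library: list of (cell name, logical function, tuple of transistor sizes).
    Closeness is the sum of squared differences between corresponding sizes,
    restricted to fixed cells with the same logical function.
    """
    candidates = [cell for cell in library if cell[1] == function]
    if not candidates:
        raise ValueError(f"no fixed cell implements {function}")

    def squared_distance(cell):
        return sum((fixed - tuned) ** 2 for fixed, tuned in zip(cell[2], tuned_sizes))

    return min(candidates, key=squared_distance)[0]


# Hypothetical gate library table: (name, function, (PFET size, NFET size)).
library = [
    ("NAND2_A", "NAND2", (2.0, 2.0)),
    ("NAND2_B", "NAND2", (4.0, 3.0)),
    ("NAND2_C", "NAND2", (8.0, 6.0)),
    ("INV_B", "INV", (4.0, 3.0)),
]

binned = bin_cell("NAND2", (4.6, 3.9), library)
```

Here the tuned sizes (4.6, 3.9) land on `NAND2_B`, whose squared distance of 1.17 is well below that of the neighboring sizes; the residual difference is the source of the binning error discussed in the text.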
  • [0052] The next step is to invoke transistor-level timing (box 360) on the tuned and re-mapped circuit. The timing report is again stored in the database for auditing and comparison purposes. By comparing this timing report to the previous one from box 350, for example, the exact nature and magnitude of timing differences due to “binning errors” can be determined. If the binning errors are large, for example, the constraint generation in box 315 could be revisited to generate tighter constraints, or a richer fixed-cell library may be considered.
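A report comparison of this kind might look like the sketch below. It assumes, purely for illustration, that each timing report has already been reduced to a mapping from endpoint name to slack in picoseconds; the real report format and database of box 325 are not described in the text, and the function and endpoint names are hypothetical.

```python
def binning_error(before, after):
    """Compare slack per endpoint before and after binning.

    before, after: {endpoint name: slack in ps}.
    Returns the per-endpoint slack change and the endpoint that degraded most.
    """
    deltas = {name: after[name] - before[name] for name in before if name in after}
    worst = min(deltas, key=deltas.get)
    return deltas, worst


tuned_report = {"out0": 12.0, "out1": 3.0}    # after tuning (box 350)
binned_report = {"out0": 10.5, "out1": -1.0}  # after binning and re-timing (box 360)

deltas, worst = binning_error(tuned_report, binned_report)
```

A large negative change on any endpoint would be the signal to revisit the constraint generation or consider a richer fixed-cell library.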
  • [0053] Out of the transistor-level timing (box 360) a “Timing Abstract” (box 365) is developed, which is a macro-model that represents the timing behavior of the entire macro. The abstract feeds into global timing (box 330) which helps to judge whether the tuning improvements on the particular macro that has been tuned thus far help to meet timing budgets at the next level up in the hierarchy, such as unit-level design or chip-level design. If the timing is acceptable, the design flow continues with the physical design steps (boxes 370 to 385). Otherwise, some amount of re-design is necessary. The simplest re-design at this stage is to apportion delay differently between macros, which would result in updated assertions (box 310). Depending on the severity of the changes in the assertions, the synthesis step could be repeated (box 335), or just the tuning steps could be repeated (boxes 345 to 365). The TPS mode of tuning helps by reducing the delay of sub-critical paths, thereby making the delay-apportionment problem easier and accelerating overall timing convergence.
  • [0054] The next step in the design flow is to enter physical design. Since the imported design data in box 335 typically comes from a physical synthesis tool, placement information is already available for the various gates. But the sizes of gates have been changed by the tuner, so the placement may have to be adjusted slightly to make place for the gates that grew larger during tuning and take advantage of the space released by gates that became smaller. This change of placement to accommodate size changes is preferably carried out by an incremental “Engineering Change” (EC) placement. The placement update could also be carried out from scratch, but then the estimated wire parasitics could change drastically, causing unwanted timing changes.
  • [0055] The rest of the physical design flow is shown in FIG. 3 for completeness. The routing step (box 375) completes the detailed connections between gates to completely wire the design with various levels of metal wires. Each logic gate is replaced by its physical design (or layout) view in box 380 to fully assemble the detailed layout of the macro. Once the physical design is complete, it goes through checking (box 385) to make sure no physical or electrical design rules have been violated. The electrical parasitics of the layout are also extracted to create a detailed model of the transistors and the wires that corresponds to the physical layout. This detailed model is timed in a final transistor-level timing run (box 390), using the updated version of the assertions if necessary. The resulting timing report is placed in the timing report database (box 325) for comparison, audit and debugging purposes. The timing abstract coming out of the transistor-level timing is now the most physically-aware and accurate picture of the timing of the macro and is used to update the abstract that feeds into global timing at the next level up in the hierarchy (unit-level or chip-level).
  • [0056] It is to be understood that one of ordinary skill in the art can use these teachings in a variety of ways to customize the inventive design methodology. Certain steps could be skipped, or certain other steps iteratively repeated till the required result is obtained, or certain other steps invoked in a different order. At certain steps of the design, gate-level timing can be used instead of transistor-level timing if sufficiently accurate delay models are available for the fixed library cells used. Or a mixed-level timing analysis may be performed in which gate level models are used for those gates which are not expected to be close to timing-critical (e.g., based on a timing analysis at an earlier design stage, or on a complete gate-level timing analysis), and transistor-level delay modeling is performed for those gates which are expected to be close to timing-critical. It is also possible to perform the marking before mapping to parameterized cells, and then mapping only tunable gates to their parameterized cell equivalents.

Claims (19)

What is claimed is:
1. A method of tuning a digital design comprising the steps of:
a) mapping at least one gate of said digital design to a parameterized cell;
b) generating constraints to keep said at least one mapped gate within a range of sizes close to sizes of available fixed library cells;
c) tuning said digital design in a continuous space subject to said constraints; and
d) binning the at least one said gate of said digital design to a fixed library cell.
2. The method of claim 1, further comprising the step of assigning alternative device types to gates of said design.
3. The method of claim 2, wherein said parameterized cells are distinguished from gates of said alternative device type by at least one of threshold voltage and gate insulator thickness.
4. The method of claim 1, wherein said tuning is applied to gates of said design which are most timing-critical.
5. The method of claim 4, wherein said tuning is further applied to gates of said design which are least timing-critical.
6. The method of claim 4, wherein identification of said timing-critical gates comprises an initial transistor-level static timing analysis.
7. The method of claim 2, wherein said mapping comprises setting a threshold voltage for said at least one gate to a selected value.
8. The method of claim 2, wherein said mapping comprises setting an oxide thickness for said at least one gate of said digital design to a selected value.
9. The method of claim 1, wherein said generated constraints include bounds on beta ratios of gates.
10. The method of claim 1, wherein said generated constraints include bounds on a size parameter of transistors.
11. The method of claim 1, further comprising the steps of:
a) performing a final static timing analysis; and
b) generating a timing abstract of said digital design.
12. The method of claim 1, wherein said digital design is a random logic macro portion of a higher level digital design.
13. The method of claim 1, wherein said tuning step is repeated subsequent to said assigning step.
14. The method of claim 13, wherein said tuning and assigning steps are repeated, and wherein each repetition of said assigning step assigns an alternative device type to additional gates of said digital design.
15. The method of claim 1, wherein said tuning step comprises Total Positive Slack mode (TPS mode) tuning.
16. The method of claim 1, wherein said binning step comprises determining an element of a fixed library whose parameters are closest to those of said at least one gate.
17. The method of claim 16, wherein said determination of closeness is made by using a least squares calculation.
18. A program storage device readable by a machine, tangibly embodying a program of instructions executable by said machine to perform method steps for tuning a digital design, said method steps comprising:
a) mapping at least one gate of said digital design to a parameterized cell;
b) generating constraints to keep said at least one mapped gate within a range of sizes close to sizes of available fixed library cells;
c) tuning said digital design in a continuous space subject to said constraints; and
d) binning the at least one said gate of said design to a fixed library cell.
19. A program storage device of claim 16 readable by a machine, tangibly embodying a program of instructions executable by said machine to perform method steps for tuning a digital design, said method steps further comprising assigning alternative device types to gates of said digital design.
US10/842,589 2003-05-12 2004-05-10 Method for tuning a digital design for synthesized random logic circuit macros in a continuous design space with optional insertion of multiple threshold voltage devices Expired - Fee Related US7093208B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/842,589 US7093208B2 (en) 2003-05-12 2004-05-10 Method for tuning a digital design for synthesized random logic circuit macros in a continuous design space with optional insertion of multiple threshold voltage devices

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US10/435,824 US7003747B2 (en) 2003-05-12 2003-05-12 Method of achieving timing closure in digital integrated circuits by optimizing individual macros
US10/436,213 US7010763B2 (en) 2003-05-12 2003-05-12 Method of optimizing and analyzing selected portions of a digital integrated circuit
US10/842,589 US7093208B2 (en) 2003-05-12 2004-05-10 Method for tuning a digital design for synthesized random logic circuit macros in a continuous design space with optional insertion of multiple threshold voltage devices

Related Parent Applications (1)

Application Number Relation Priority Date Filing Date Title
US10/436,213 Continuation-In-Part US7010763B2 (en) 2003-05-12 2003-05-12 Method of optimizing and analyzing selected portions of a digital integrated circuit

Publications (2)

Publication Number Publication Date
US20040230924A1 US20040230924A1 (en) 2004-11-18
US7093208B2 US7093208B2 (en) 2006-08-15

Family

ID=36926415

Family Applications (1)

Application Number Status Priority Date Filing Date Title
US10/842,589 Expired - Fee Related US7093208B2 (en) 2003-05-12 2004-05-10 Method for tuning a digital design for synthesized random logic circuit macros in a continuous design space with optional insertion of multiple threshold voltage devices

Country Status (1)

Country Link
US (1) US7093208B2 (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050044515A1 (en) * 2003-08-22 2005-02-24 International Business Machines Corporation Method for determining and using leakage current sensitivities to optimize the design of an integrated circuit
US20060041774A1 (en) * 2004-08-20 2006-02-23 Matsushita Electric Industrial Co., Ltd. Semiconductor integrated circuit and semiconductor integrated circuit manufacturing method
US7032200B1 (en) * 2003-09-09 2006-04-18 Sun Microsystems, Inc. Low threshold voltage transistor displacement in a semiconductor device
US20060273821A1 (en) * 2005-06-01 2006-12-07 Correale Anthony Jr System and method for creating a standard cell library for reduced leakage and improved performance
US7188325B1 (en) * 2004-10-04 2007-03-06 Advanced Micro Devices, Inc. Method for selecting transistor threshold voltages in an integrated circuit
US20070106968A1 (en) * 2005-11-08 2007-05-10 International Business Machines Corporation Opc trimming for performance
US7243312B1 (en) * 2003-10-24 2007-07-10 Xilinx, Inc. Method and apparatus for power optimization during an integrated circuit design process
US20070180415A1 (en) * 2006-01-03 2007-08-02 Shrikrishna Pundoor Method of Leakage Optimization in Integrated Circuit Design
US20070192752A1 (en) * 2006-02-15 2007-08-16 International Business Machines Corporation Influence-based circuit design
US20070256044A1 (en) * 2006-04-26 2007-11-01 Gary Coryer System and method to power route hierarchical designs that employ macro reuse
US20080320420A1 (en) * 2007-06-20 2008-12-25 Sferrazza Benjamin S Efficient cell swapping system for leakage power reduction in a multi-threshold voltage process
US20090172608A1 (en) * 2007-12-28 2009-07-02 Hopkins Jeremy T Techniques for Selecting Spares to Implement a Design Change in an Integrated Circuit
US20090212819A1 (en) * 2008-02-26 2009-08-27 International Business Machines Corporation Method and system for changing circuits in an integrated circuit
US20090267671A1 (en) * 2008-04-29 2009-10-29 Jeffrey Scott Brown Optimization of library slew ratio based circuit
US7716612B1 (en) * 2005-12-29 2010-05-11 Tela Innovations, Inc. Method and system for integrated circuit optimization by using an optimized standard-cell library
US8464198B1 (en) * 2008-07-30 2013-06-11 Lsi Corporation Electronic design automation tool and method for employing unsensitized critical path information to reduce leakage power in an integrated circuit
US8560983B2 (en) 2011-12-06 2013-10-15 International Business Machines Corporation Incorporating synthesized netlists as subcomponents in a hierarchical custom design
US8863058B2 (en) 2012-09-24 2014-10-14 Atrenta, Inc. Characterization based buffering and sizing for system performance optimization
US20160321388A1 (en) * 2015-04-30 2016-11-03 Taiwan Semiconductor Manufacturing Company, Ltd. Method for library having base cell and vt-related cell
US9519746B1 (en) * 2015-06-11 2016-12-13 International Business Machines Corporation Addressing early mode slack fails by book decomposition
US9703910B2 (en) * 2015-07-09 2017-07-11 International Business Machines Corporation Control path power adjustment for chip design
US10565347B2 (en) 2018-03-29 2020-02-18 International Business Machines Corporation Global routing optimization

Families Citing this family (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2003303961A1 (en) * 2003-12-29 2005-07-21 Motorola, Inc. Circuit layout compaction using reshaping
US7254802B2 (en) * 2004-05-27 2007-08-07 Verisilicon Holdings, Co. Ltd. Standard cell library having cell drive strengths selected according to delay
JP2006093631A (en) * 2004-09-27 2006-04-06 Matsushita Electric Ind Co Ltd Method and device for manufacturing semiconductor integrated circuit
US7730437B1 (en) * 2004-10-27 2010-06-01 Cypress Semiconductor Corporation Method of full semiconductor chip timing closure
US20070006106A1 (en) * 2005-06-30 2007-01-04 Texas Instruments Incorporated Method and system for desensitization of chip designs from perturbations affecting timing and manufacturability
US7644383B2 (en) * 2005-06-30 2010-01-05 Texas Instruments Incorporated Method and system for correcting signal integrity crosstalk violations
US8448102B2 (en) 2006-03-09 2013-05-21 Tela Innovations, Inc. Optimizing layout of irregular structures in regular layout context
US8653857B2 (en) 2006-03-09 2014-02-18 Tela Innovations, Inc. Circuitry and layouts for XOR and XNOR logic
US9563733B2 (en) 2009-05-06 2017-02-07 Tela Innovations, Inc. Cell circuit and layout with linear finfet structures
US8225261B2 (en) 2006-03-09 2012-07-17 Tela Innovations, Inc. Methods for defining contact grid in dynamic array architecture
US8658542B2 (en) 2006-03-09 2014-02-25 Tela Innovations, Inc. Coarse grid design methods and structures
US7446352B2 (en) 2006-03-09 2008-11-04 Tela Innovations, Inc. Dynamic array architecture
US9009641B2 (en) 2006-03-09 2015-04-14 Tela Innovations, Inc. Circuits with linear finfet structures
US8245180B2 (en) 2006-03-09 2012-08-14 Tela Innovations, Inc. Methods for defining and using co-optimized nanopatterns for integrated circuit design and apparatus implementing same
US8225239B2 (en) 2006-03-09 2012-07-17 Tela Innovations, Inc. Methods for defining and utilizing sub-resolution features in linear topology
US7956421B2 (en) 2008-03-13 2011-06-07 Tela Innovations, Inc. Cross-coupled transistor layouts in restricted gate level layout architecture
US7943967B2 (en) 2006-03-09 2011-05-17 Tela Innovations, Inc. Semiconductor device and associated layouts including diffusion contact placement restriction based on relation to linear conductive segments
US8247846B2 (en) 2006-03-09 2012-08-21 Tela Innovations, Inc. Oversized contacts and vias in semiconductor chip defined by linearly constrained topology
US8839175B2 (en) 2006-03-09 2014-09-16 Tela Innovations, Inc. Scalable meta-data objects
US7763534B2 (en) 2007-10-26 2010-07-27 Tela Innovations, Inc. Methods, structures and designs for self-aligning local interconnects used in integrated circuits
US8541879B2 (en) 2007-12-13 2013-09-24 Tela Innovations, Inc. Super-self-aligned contacts and method for making the same
US9035359B2 (en) 2006-03-09 2015-05-19 Tela Innovations, Inc. Semiconductor chip including region including linear-shaped conductive structures forming gate electrodes and having electrical connection areas arranged relative to inner region between transistors of different types and associated methods
US7932545B2 (en) 2006-03-09 2011-04-26 Tela Innovations, Inc. Semiconductor device and associated layouts including gate electrode level region having arrangement of six linear conductive segments with side-to-side spacing less than 360 nanometers
US9230910B2 (en) 2006-03-09 2016-01-05 Tela Innovations, Inc. Oversized contacts and vias in layout defined by linearly constrained topology
US7716618B2 (en) * 2006-05-31 2010-05-11 Stmicroelectronics, S.R.L. Method and system for designing semiconductor circuit devices to reduce static power consumption
US8286107B2 (en) 2007-02-20 2012-10-09 Tela Innovations, Inc. Methods and systems for process compensation technique acceleration
US7979829B2 (en) 2007-02-20 2011-07-12 Tela Innovations, Inc. Integrated circuit cell library with cell-level process compensation technique (PCT) application and associated methods
US8667443B2 (en) 2007-03-05 2014-03-04 Tela Innovations, Inc. Integrated circuit cell library for multiple patterning
US7888705B2 (en) 2007-08-02 2011-02-15 Tela Innovations, Inc. Methods for defining dynamic array section with manufacturing assurance halo and apparatus implementing the same
US20080307374A1 (en) * 2007-06-05 2008-12-11 International Business Machines Corporation Method, system, and computer program product for mapping a logical design onto an integrated circuit with slack apportionment
US8453094B2 (en) 2008-01-31 2013-05-28 Tela Innovations, Inc. Enforcement of semiconductor structure regularity for localized transistors and interconnect
US7836418B2 (en) * 2008-03-24 2010-11-16 International Business Machines Corporation Method and system for achieving power optimization in a hierarchical netlist
US7939443B2 (en) 2008-03-27 2011-05-10 Tela Innovations, Inc. Methods for multi-wire routing and apparatus implementing same
US8499230B2 (en) 2008-05-07 2013-07-30 Lsi Corporation Critical path monitor for an integrated circuit and method of operation thereof
KR101749351B1 (en) 2008-07-16 2017-06-20 텔라 이노베이션스, 인코포레이티드 Methods for cell phasing and placement in dynamic array architecture and implementation of the same
US9122832B2 (en) 2008-08-01 2015-09-01 Tela Innovations, Inc. Methods for controlling microloading variation in semiconductor wafer layout and fabrication
US20100250187A1 (en) * 2009-03-25 2010-09-30 Imec Method and system for analyzing performance metrics of array type circuits under process variability
US8239805B2 (en) * 2009-07-27 2012-08-07 Lsi Corporation Method for designing integrated circuits employing a partitioned hierarchical design flow and an apparatus employing the method
US8302056B2 (en) * 2009-08-07 2012-10-30 International Business Machines Corporation Method and system for placement of electronic circuit components in integrated circuit design
US8661392B2 (en) 2009-10-13 2014-02-25 Tela Innovations, Inc. Methods for cell boundary encroachment and layouts implementing the Same
US8656334B2 (en) 2010-07-08 2014-02-18 International Business Machines Corporation Multiple threshold voltage cell families based integrated circuit design
US9159627B2 (en) 2010-11-12 2015-10-13 Tela Innovations, Inc. Methods for linewidth modification and apparatus implementing the same
US8448121B2 (en) 2011-08-11 2013-05-21 International Business Machines Corporation Implementing Z directional macro port assignment
US9041428B2 (en) 2013-01-15 2015-05-26 International Business Machines Corporation Placement of storage cells on an integrated circuit
US9201727B2 (en) 2013-01-15 2015-12-01 International Business Machines Corporation Error protection for a data bus
US9021328B2 (en) 2013-01-15 2015-04-28 International Business Machines Corporation Shared error protection for register banks
US9043683B2 (en) 2013-01-23 2015-05-26 International Business Machines Corporation Error protection for integrated circuits
US9171125B2 (en) 2014-02-26 2015-10-27 Globalfoundries U.S. 2 Llc Limiting skew between different device types to meet performance requirements of an integrated circuit
US9652580B2 (en) 2014-07-23 2017-05-16 Samsung Electronics Co., Ltd. Integrated circuit layout design system and method
US9418198B1 (en) 2015-02-11 2016-08-16 International Business Machines Corporation Method for calculating an effect on timing of moving a pin from an edge to an inboard position in processing large block synthesis (LBS)
US9679092B1 (en) * 2015-11-03 2017-06-13 Xilinx, Inc. Constraint handling for parameterizable hardware description language

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4827428A (en) * 1985-11-15 1989-05-02 American Telephone And Telegraph Company, At&T Bell Laboratories Transistor sizing system for integrated circuits
US5392221A (en) * 1991-06-12 1995-02-21 International Business Machines Corporation Procedure to minimize total power of a logic network subject to timing constraints
US5508937A (en) * 1993-04-16 1996-04-16 International Business Machines Corporation Incremental timing analysis
US6202192B1 (en) * 1998-01-09 2001-03-13 International Business Machines Corporation Distributed static timing analysis
US6460166B1 (en) * 1998-12-16 2002-10-01 International Business Machines Corporation System and method for restructuring of logic circuitry
US6574779B2 (en) * 2001-04-12 2003-06-03 International Business Machines Corporation Hierarchical layout method for integrated circuits
US20030233628A1 (en) * 2002-06-17 2003-12-18 Rana Amar Pal Singh Technology dependent transformations in CMOS and silicon-on-insulator during digital design synthesis
US6701289B1 (en) * 1997-01-27 2004-03-02 Unisys Corporation Method and apparatus for using a placement tool to manipulate cell substitution lists
US6745371B2 (en) * 2002-03-15 2004-06-01 Sun Microsystems, Inc. Low Vt transistor substitution in a semiconductor device
US20040196684A1 (en) * 1997-12-26 2004-10-07 Renesas Technology Corp. Semiconductor integrated circuit device, storage medium on which cell library is stored and designing method for semiconductor intergrated circuit
US20040230929A1 (en) * 2003-05-12 2004-11-18 Jun Zhou Method of achieving timing closure in digital integrated circuits by optimizing individual macros
US20040230921A1 (en) * 2003-05-12 2004-11-18 International Business Machines Corporation Method of optimizing and analyzing selected portions of a digital integrated circuit
US20050050497A1 (en) * 2003-08-27 2005-03-03 Alexander Tetelbaum Method of clock driven cell placement and clock tree synthesis for integrated circuit design
US20050114815A1 (en) * 2003-11-24 2005-05-26 Correale Anthony Jr. Method and program product of level converter optimization
US20050114814A1 (en) * 2003-11-24 2005-05-26 Correale Anthony Jr. Multiple voltage integrated circuit and design method therefor

Patent Citations (16)

Publication number Priority date Publication date Assignee Title
US4827428A (en) * 1985-11-15 1989-05-02 American Telephone And Telegraph Company, At&T Bell Laboratories Transistor sizing system for integrated circuits
US5392221A (en) * 1991-06-12 1995-02-21 International Business Machines Corporation Procedure to minimize total power of a logic network subject to timing constraints
US5508937A (en) * 1993-04-16 1996-04-16 International Business Machines Corporation Incremental timing analysis
US6701289B1 (en) * 1997-01-27 2004-03-02 Unisys Corporation Method and apparatus for using a placement tool to manipulate cell substitution lists
US20040196684A1 (en) * 1997-12-26 2004-10-07 Renesas Technology Corp. Semiconductor integrated circuit device, storage medium on which cell library is stored and designing method for semiconductor intergrated circuit
US6202192B1 (en) * 1998-01-09 2001-03-13 International Business Machines Corporation Distributed static timing analysis
US6557151B1 (en) * 1998-01-09 2003-04-29 International Business Machines Corporation Distributed static timing analysis
US6460166B1 (en) * 1998-12-16 2002-10-01 International Business Machines Corporation System and method for restructuring of logic circuitry
US6574779B2 (en) * 2001-04-12 2003-06-03 International Business Machines Corporation Hierarchical layout method for integrated circuits
US6745371B2 (en) * 2002-03-15 2004-06-01 Sun Microsystems, Inc. Low Vt transistor substitution in a semiconductor device
US20030233628A1 (en) * 2002-06-17 2003-12-18 Rana Amar Pal Singh Technology dependent transformations in CMOS and silicon-on-insulator during digital design synthesis
US20040230929A1 (en) * 2003-05-12 2004-11-18 Jun Zhou Method of achieving timing closure in digital integrated circuits by optimizing individual macros
US20040230921A1 (en) * 2003-05-12 2004-11-18 International Business Machines Corporation Method of optimizing and analyzing selected portions of a digital integrated circuit
US20050050497A1 (en) * 2003-08-27 2005-03-03 Alexander Tetelbaum Method of clock driven cell placement and clock tree synthesis for integrated circuit design
US20050114815A1 (en) * 2003-11-24 2005-05-26 Correale Anthony Jr. Method and program product of level converter optimization
US20050114814A1 (en) * 2003-11-24 2005-05-26 Correale Anthony Jr. Multiple voltage integrated circuit and design method therefor

Cited By (37)

Publication number Priority date Publication date Assignee Title
US20050044515A1 (en) * 2003-08-22 2005-02-24 International Business Machines Corporation Method for determining and using leakage current sensitivities to optimize the design of an integrated circuit
US7137080B2 (en) * 2003-08-22 2006-11-14 International Business Machines Corporation Method for determining and using leakage current sensitivities to optimize the design of an integrated circuit
US7032200B1 (en) * 2003-09-09 2006-04-18 Sun Microsystems, Inc. Low threshold voltage transistor displacement in a semiconductor device
US7243312B1 (en) * 2003-10-24 2007-07-10 Xilinx, Inc. Method and apparatus for power optimization during an integrated circuit design process
US7412679B2 (en) * 2004-08-20 2008-08-12 Matsushita Electric Industrial Co., Ltd. Semiconductor integrated circuit and semiconductor integrated circuit manufacturing method
US20060041774A1 (en) * 2004-08-20 2006-02-23 Matsushita Electric Industrial Co., Ltd. Semiconductor integrated circuit and semiconductor integrated circuit manufacturing method
US7188325B1 (en) * 2004-10-04 2007-03-06 Advanced Micro Devices, Inc. Method for selecting transistor threshold voltages in an integrated circuit
US20060273821A1 (en) * 2005-06-01 2006-12-07 Correale Anthony Jr System and method for creating a standard cell library for reduced leakage and improved performance
US7784012B2 (en) 2005-06-01 2010-08-24 International Business Machines Corporation System and method for creating a standard cell library for use in circuit designs
US7340712B2 (en) 2005-06-01 2008-03-04 International Business Machines Corporation System and method for creating a standard cell library for reduced leakage and improved performance
US20070106968A1 (en) * 2005-11-08 2007-05-10 International Business Machines Corporation Opc trimming for performance
US7627836B2 (en) 2005-11-08 2009-12-01 International Business Machines Corporation OPC trimming for performance
US7716612B1 (en) * 2005-12-29 2010-05-11 Tela Innovations, Inc. Method and system for integrated circuit optimization by using an optimized standard-cell library
US20070180415A1 (en) * 2006-01-03 2007-08-02 Shrikrishna Pundoor Method of Leakage Optimization in Integrated Circuit Design
US7448009B2 (en) * 2006-01-03 2008-11-04 Texas Instruments Incorporated Method of leakage optimization in integrated circuit design
US20070192752A1 (en) * 2006-02-15 2007-08-16 International Business Machines Corporation Influence-based circuit design
US7500207B2 (en) 2006-02-15 2009-03-03 International Business Machines Corporation Influence-based circuit design
US20070256044A1 (en) * 2006-04-26 2007-11-01 Gary Coryer System and method to power route hierarchical designs that employ macro reuse
US20080320420A1 (en) * 2007-06-20 2008-12-25 Sferrazza Benjamin S Efficient cell swapping system for leakage power reduction in a multi-threshold voltage process
US7849422B2 (en) * 2007-06-20 2010-12-07 Lsi Corporation Efficient cell swapping system for leakage power reduction in a multi-threshold voltage process
US20090172608A1 (en) * 2007-12-28 2009-07-02 Hopkins Jeremy T Techniques for Selecting Spares to Implement a Design Change in an Integrated Circuit
US8166439B2 (en) * 2007-12-28 2012-04-24 International Business Machines Corporation Techniques for selecting spares to implement a design change in an integrated circuit
US20090212819A1 (en) * 2008-02-26 2009-08-27 International Business Machines Corporation Method and system for changing circuits in an integrated circuit
US8103989B2 (en) * 2008-02-26 2012-01-24 International Business Machines Corporation Method and system for changing circuits in an integrated circuit
US20090267671A1 (en) * 2008-04-29 2009-10-29 Jeffrey Scott Brown Optimization of library slew ratio based circuit
US8418102B2 (en) * 2008-04-29 2013-04-09 Lsi Corporation Optimization of library slew ratio based circuit
US20130152036A1 (en) * 2008-04-29 2013-06-13 Lsi Corporation Optimization of library slew ratio based circuit
US8667438B2 (en) * 2008-04-29 2014-03-04 Lsi Corporation Optimization of library slew ratio based circuit
US8464198B1 (en) * 2008-07-30 2013-06-11 Lsi Corporation Electronic design automation tool and method for employing unsensitized critical path information to reduce leakage power in an integrated circuit
US8560983B2 (en) 2011-12-06 2013-10-15 International Business Machines Corporation Incorporating synthesized netlists as subcomponents in a hierarchical custom design
US8863058B2 (en) 2012-09-24 2014-10-14 Atrenta, Inc. Characterization based buffering and sizing for system performance optimization
US20160321388A1 (en) * 2015-04-30 2016-11-03 Taiwan Semiconductor Manufacturing Company, Ltd. Method for library having base cell and vt-related cell
US9703911B2 (en) * 2015-04-30 2017-07-11 Taiwan Semiconductor Manufacturing Company, Ltd. Method for library having base cell and VT-related
US9519746B1 (en) * 2015-06-11 2016-12-13 International Business Machines Corporation Addressing early mode slack fails by book decomposition
US9703910B2 (en) * 2015-07-09 2017-07-11 International Business Machines Corporation Control path power adjustment for chip design
US9734270B2 (en) * 2015-07-09 2017-08-15 International Business Machines Corporation Control path power adjustment for chip design
US10565347B2 (en) 2018-03-29 2020-02-18 International Business Machines Corporation Global routing optimization

Also Published As

Publication number Publication date
US7093208B2 (en) 2006-08-15

Similar Documents

Publication Publication Date Title
US7093208B2 (en) Method for tuning a digital design for synthesized random logic circuit macros in a continuous design space with optional insertion of multiple threshold voltage devices
US7360198B2 (en) Technology dependent transformations for CMOS in digital design synthesis
US8631369B1 (en) Methods, systems, and apparatus for timing and signal integrity analysis of integrated circuits with semiconductor process variations
US6453446B1 (en) Timing closure methodology
US6851095B1 (en) Method of incremental recharacterization to estimate performance of integrated disigns
US5724250A (en) Method and apparatus for performing drive strength adjust optimization in a circuit design
US7340698B1 (en) Method of estimating performance of integrated circuit designs by finding scalars for strongly coupled components
US6131182A (en) Method and apparatus for synthesizing and optimizing control logic based on SRCMOS logic array macros
US7743355B2 (en) Method of achieving timing closure in digital integrated circuits by optimizing individual macros
US7003738B2 (en) Process for automated generation of design-specific complex functional blocks to improve quality of synthesized digital integrated circuits in CMOS using altering process
US6334205B1 (en) Wavefront technology mapping
US7818158B2 (en) Method for symbolic simulation of circuits having non-digital node voltages
US6304836B1 (en) Worst case design parameter extraction for logic technologies
US20050268268A1 (en) Methods and systems for structured ASIC electronic design automation
Moreira et al. A 65nm standard cell set and flow dedicated to automated asynchronous circuits design
Northrop et al. A semi-custom design flow in high-performance microprocessor design
US8813006B1 (en) Accelerated characterization of circuits for within-die process variations
Newton et al. CAD tools for ASIC design
US6502223B1 (en) Method for simulating noise on the input of a static gate and determining noise on the output
US20220114321A1 (en) Systems And Methods For Generating Placements For Circuit Designs Using Pyramidal Flows
Karkowski Performance driven synthesis of digital systems
US6496031B1 (en) Method for calculating the P/N ratio of a static gate based on input voltages
Chappell et al. A system-level solution to domino synthesis with 2 GHz application
Bortolon Static noise margin analysis for CMOS logic cells in near-threshold
Santos et al. Effects of using a pin-to-pin delay model on a library-free transistor/gate sizing scheme

Legal Events

Date Code Title Description
AS Assignment
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WILLIAMS, PATRICK M.;CHO, EE K.;HATHAWAY, DAVID J.;AND OTHERS;REEL/FRAME:015321/0622;SIGNING DATES FROM 20040506 TO 20040510

FPAY Fee payment
Year of fee payment: 4

REMI Maintenance fee reminder mailed

LAPS Lapse for failure to pay maintenance fees

STCH Information on status: patent discontinuation
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Expired due to failure to pay maintenance fee
Effective date: 20140815

AS Assignment
Owner name: GLOBALFOUNDRIES U.S. 2 LLC, NEW YORK
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:036550/0001
Effective date: 20150629

AS Assignment
Owner name: GLOBALFOUNDRIES INC., CAYMAN ISLANDS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GLOBALFOUNDRIES U.S. 2 LLC;GLOBALFOUNDRIES U.S. INC.;REEL/FRAME:036779/0001
Effective date: 20150910