WO2008125509A2 - Method and system for analog frequency clocking in processor cores - Google Patents

Method and system for analog frequency clocking in processor cores

Info

Publication number
WO2008125509A2
Authority
WO
WIPO (PCT)
Prior art keywords
frequency
analog
core
chip
processor
Prior art date
Application number
PCT/EP2008/054011
Other languages
French (fr)
Other versions
WO2008125509A3 (en)
Inventor
Lawrence Jacobowitz
Mark Ritter
Daniel Stigliani Jr
Original Assignee
International Business Machines Corporation
Ibm United Kingdom Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corporation, Ibm United Kingdom Limited filed Critical International Business Machines Corporation
Priority to JP2010502492A priority Critical patent/JP5306319B2/en
Priority to CN2008800115703A priority patent/CN101652737B/en
Publication of WO2008125509A2 publication Critical patent/WO2008125509A2/en
Publication of WO2008125509A3 publication Critical patent/WO2008125509A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/08 Clock generators with changeable or programmable clock frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/06 Clock generators producing several clock signals
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03L AUTOMATIC CONTROL, STARTING, SYNCHRONISATION, OR STABILISATION OF GENERATORS OF ELECTRONIC OSCILLATIONS OR PULSES
    • H03L7/00 Automatic control of frequency or phase; Synchronisation
    • H03L7/06 Automatic control of frequency or phase; Synchronisation using a reference signal applied to a frequency- or phase-locked loop
    • H03L7/16 Indirect frequency synthesis, i.e. generating a desired one of a number of predetermined frequencies using a frequency- or phase-locked loop
    • H03L7/18 Indirect frequency synthesis, i.e. generating a desired one of a number of predetermined frequencies using a frequency- or phase-locked loop using a frequency divider or counter in the loop
    • H03L7/197 Indirect frequency synthesis, i.e. generating a desired one of a number of predetermined frequencies using a frequency- or phase-locked loop using a frequency divider or counter in the loop a time difference being used for locking the loop, the counter counting between numbers which are variable in time or the frequency divider dividing by a factor variable in time, e.g. for obtaining fractional frequency division
    • H03L7/1974 Indirect frequency synthesis, i.e. generating a desired one of a number of predetermined frequencies using a frequency- or phase-locked loop using a frequency divider or counter in the loop a time difference being used for locking the loop, the counter counting between numbers which are variable in time or the frequency divider dividing by a factor variable in time, e.g. for obtaining fractional frequency division for fractional frequency division
    • H03L7/1976 Indirect frequency synthesis, i.e. generating a desired one of a number of predetermined frequencies using a frequency- or phase-locked loop using a frequency divider or counter in the loop a time difference being used for locking the loop, the counter counting between numbers which are variable in time or the frequency divider dividing by a factor variable in time, e.g. for obtaining fractional frequency division for fractional frequency division using a phase accumulator for controlling the counter or frequency divider

Definitions

  • This invention generally relates to data processing systems, and more specifically, to frequency clocking in processor cores. Even more specifically, in the preferred embodiment, the invention relates to analog multi-frequency clocking in multi-chip/multi-core processors.
  • EMI reduction with multiple oscillators makes "synchronous spreading" very difficult or impossible.
  • Prior art technology is based on distribution of clocking signals across a wiring network known as a clock-tree. With the growth in the number of cores in multi-core microprocessors, clock-trees also grow into enormous complexity, creating serious chip layout design difficulties and translating into detractors to final product yield and related increase in manufacturing cost.
  • a method of and system for frequency clocking in a processor core are provided. At least one processor core is provided, and that at least one processor core has a clocking subsystem for generating an analog output clock signal at a variable frequency. Digital frequency control data and an analog signal are both transmitted to that at least one processor core; and that processor core uses the received analog signal and digital frequency control data to set the frequency of the output clock signal of the clocking subsystem.
  • multiple cores are asynchronously clocked and the core frequencies are independently set.
  • a plurality of processor cores are provided, and each of the processor cores has a respective clocking subsystem for generating an analog output clock signal at a variable frequency.
  • an analog signal and individual digital frequency control data are transmitted to each processor core; and each processor core receives the analog signal and digital frequency control data transmitted to the core, and uses the received analog signal and digital control data to set locally (on the core) the frequency of the output clock signal of the clocking subsystem of the processor core.
  • the preferred embodiment of the invention provides a computing system (Server) clocking subsystem solution with a single system reference oscillator, which may be spread (for spread-spectrum) to satisfy EMI requirements.
  • the invention achieves clock distribution to each core via a classical multi-cascade analog tree distribution network and a digital data distribution network to each core.
  • Each core takes both inputs to generate a precise frequency clock for the core, which may be unique to that core.
  • the local core clock synthesizer frequency is determined by the digital control data which is used in conjunction with the analog core clock input to set the precise core frequency of operation using digital signal processing or other digital means.
  • the frequency can be established based upon a policy set by the server manufacturer or customer. For example, the frequency can be set to the maximum capability of each core based upon a particular voltage of operation for all cores.
  • the frequency control information is sent to each core as moderate-speed (10-100 Mb/s) digital data words, thereby avoiding the problems with high-speed analog signal transmission.
  • the frequency control information has high noise immunity and low signal distortion since it is in the form of digital data.
  • the frequency control information is sent as individual control data words (v data) to each core.
  • the data is latched into the core "clock synthesizer memory" from the server SEEPROM, which contains the vital chip data (VCD) for each core in the server.
  • the single system reference oscillator is set at a moderate frequency (10-100 MHz), which is distributed to each core via analog transmission line techniques, phase-locked loops (PLLs), and re-drive circuits.
  • the analog clock signal frequencies are kept moderate prior to the individual core clock synthesizers to avoid high-speed distortion effects.
  • the system reference clock, chip clock, and generic core clock signals are continuously required to maintain a stable core clock.
  • the fundamental core operating frequency changes infrequently (except for certain spread spectrum techniques) such that speed v data changes are infrequent and only periodic v data updates are sufficient to generate a clock for each core.
  • Each core runs asynchronously with respect to each of the other cores and to local cache. It will be appreciated that, once the different regions of a chip are asynchronous, some handshaking/buffering will be required to transfer data between regions, so there will be some added latency. Techniques are known to minimize this latency. Nevertheless, the net performance gain of operating each core at its maximum frequency will be substantial (10-20%).
  • the present invention can be applied to any processing platform that uses multi-microprocessor core silicon chips.
  • examples include client uP platforms, storage controllers, data communication switches, etc.
  • Figure 1 shows analog multi-frequency clocking of a processor subsystem.
  • FIG. 2 illustrates analog multi-frequency clocking of processor chips.
  • FIG. 3 shows a local core clock synthesizer embodying the present invention.
  • Figure 4 shows an alternate processor configuration in which multi-core groups share an L2 cache.
  • Figure 5 illustrates a further alternate processor configuration in which multi-core groups share an L2 cache and a common local clock generator.
  • FIG. 1 illustrates a typical computing Server 100 that is composed of multiple microprocessor (uP) chips (N) 102, each of which has internal clocking functions (e.g. digital signal processor, DSP, core clock generator, etc.) that utilize the server reference oscillator (vR) as the basic system clock.
  • uP microprocessor
  • N microprocessor
  • internal clocking functions e.g. digital signal processor, DSP, core clock generator, etc.
  • VR server reference oscillator
  • a Master PLL and distribution ASIC Application Specific Integrated Circuit
  • the output of the Master PLL & Distribution ASIC is a chip clock signal (v ch) that is distributed throughout the processor chip.
  • the reference oscillator 104 clock frequency (v R) is relatively low (typically 10-100 MHz), so that it can be easily routed throughout the PC board without significant signal degradation, yet fast enough to enable feasible up-conversion rates to ensure that the uP high-speed clock (typically 5-10 GHz) is stable and remains within the platform deviation requirement (typically 10-100 ppm, parts per million).
  • the distribution network is generally point-to-point (illustrated in Figure 1) for best reference clock integrity with signal re-drive at the up-conversion points.
  • the first up-conversion and re-drive point is the Master PLL 106 which is used to generate the chip frequency (v ch) clock for each microprocessor chip in the server.
  • the Master PLL not only re-drives the signal but also multiplies the reference oscillator frequency, typically by 2-10x.
  • the uP chip clock signal is, in turn, distributed within a chip by a second level distribution ASIC for use by each core clock synthesizer to generate the fundamental core clock, described below.
  • Figure 1 also shows the interconnection from the uP chips to the I/O Subsystem, System
  • the Clustering fabric is used to interconnect multiple MCMs together to construct a larger multi-processor Server where the MCMs are connected in a symmetric multi-processing (SMP) configuration.
  • SMP symmetric multi-processing
  • the memory is coherent to all the processors within the SMP. In this case, all the MCMs are synchronized to a single Reference Oscillator 104 (illustrated in Figure 1 outside the MCM).
  • the preferred method of this invention can also be used on a configuration of uP chips contained on multiple Single Chip Modules (SCM) mounted on a common glass epoxy printed circuit (PC) board.
  • SCM Single Chip Modules
  • PC common glass epoxy printed circuit
  • This alternate packaging configuration may be used for smaller systems.
  • the Distribution ASIC is also mounted in an SCM on the system board and interconnection to each processor chip is done via system PC board wiring.
  • the MCM and/or PC board contains vital core frequency data (VCD) for each core in the server. This information is typically maintained in a Serial Electrically Erasable Programmable Read Only Memory (SEEPROM). This SEEPROM contains the vital core frequency data (v data) for each connected processor (core).
  • V data is the digital representation of the optimum processor (core) frequency along with identification (Id) of the appropriate chip and core. The Id information is used to ensure the correct VCD is transmitted and stored in the VCD Interface function on each chip, for all cores on the chip.
  • the VCD is derived from the frequency characterization data, voltage characterization data, power characterization, etc. gathered by the Service Element (SE).
  • SE Service Element
  • the SE analyzes and reformats the data and loads the data into the system SEEPROM via an appropriate digital interface (e.g. I2C).
  • the totality of data gathered and analyzed by the SE is used to set the optimum frequency, voltage, etc. for each core to achieve the highest performance possible or other policy established by the customer.
  • a novel aspect of this invention is the use of data to generate the optimum processor frequency locally (within core) in conjunction with the up-converted reference clock versus today's approach of transmitting the same analog clock signal to all cores.
  • the data for each core/chip can be obtained during the chip test/verification stage in the manufacturing process or as part of a training paradigm during power-on sequence of the server. The latter approach would be part of the initialization and set-up process of the server.
  • a representative server processor chip (one of several for a typical server) configuration with multi-cores (4) and shared L2 cache is illustrated at 200 in Figure 2.
  • the four core clock synthesizers 202 within the processor chip receive the generic core clock (v gc) from the second level PLL and distribution ASIC 204 by means of the second level distribution network, which is contained on the chip.
  • the generic core clock signal (v gc) is transmitted to each core using a multi-drop bus (illustrated) or a point-to-point star interconnection.
  • the second level distribution ASIC 204 provides the necessary frequency up-conversion to generate the generic core clock (typically 10-20x), re-drive circuits, and a clock (v ch) for the VCD Interface function.
  • the VCD Interface function contains the VCD interface to the SEEPROM (See Figure 1) to receive and store the appropriate data for setting the precise frequency of each of the cores within the chip along with the appropriate Ids.
  • the VCD Interface function interrogates the SEEPROM and obtains the appropriate data (typically through an I2C interface) for its cores. It may contain some SRAM and state machines or a small controller in addition to the I2C interface to perform this function.
  • the VCD Interface function also performs the distribution function by transmitting the V Data to the appropriate core synthesizer only.
  • a unique chip and core Id is included which is related to the chip and module serial number.
  • This core Id is used by the VCD Interface function to route the V data to the appropriate port.
  • V Data intended for core "0" is routed to port "D0" (Figure 2).
  • the V data is stored in the clock synthesizer and is used as the processor clock frequency data until it is updated by the VCD function on chip. If no changes are forthcoming, no data is sent from the VCD Interface function or the SEEPROM.
  • the V data is not sent continuously, but only when it is updated. This is in contrast to the state-of-the-art analog technique where the signal must be sent continuously. However, the analog clock is sent continuously to ensure a stable core clock.
  • Each core 206 is comprised of the microprocessor, dedicated cache 210, and the core clock synthesizer 202.
  • the core frequency is set by the core clock synthesizer and the digital V data in the VCD for each core.
  • Each core is likely to have different frequency settings.
  • the number of cores within the processor chip is determined by the technology and manufacturing process capability. Four are shown in Figure 2 for illustrative purposes. The technical approach described herein easily scales with the number of cores, which will likely increase in the future.
  • the chip 200 also contains the appropriate interfaces 210, 212, 214 to the I/O, Memory, and Fabric controllers.
  • the design of the core clock synthesizer is illustrated at 300 in Figure 3. It is comprised of a voltage controlled high speed oscillator (VCO) 302, a low pass filter (LPF) 304, a digitally controlled integer-N divider 306, and a Delta-Sigma modulator 310 in conjunction with a digital signal processor (DSP) 312.
  • VCO voltage controlled high speed oscillator
  • LPF low pass filter
  • DSP digital signal processor
  • VCO design and technology The VCO is tuned to a precise fractional frequency by changing the analog control voltage up or down in precise increments to achieve the desired frequency.
  • a portion of the core clock output of the VCO is sent to the integer-N divider, which divides the incoming core clock frequency by an integer N value from the Delta-Sigma modulator.
  • the Delta-Sigma modulator provides an output bit stream of time discrete integer values such that the average of the division ratio is equal to the input desired fractional division ratio.
  • the desired fractional division ratio is generated by the DSP.
  • the DSP 312 converts the desired V data digital frequency value to the appropriate fractional division ratio to yield the desired optimum core frequency.
  • the reference frequency may be set at the factory based on the desired generic core frequency, which is the basis for determining the desired fractional division ratio.
  • the divided output signal of the Integer-N divider 306 is phase compared to the generic core frequency "v gc" in the analog phase detector 314. If the two signals are matched, no frequency correction signal is generated and the clock synthesizer core output is equal to the desired core frequency, which is defined by the core V data input to the DSP. If there is a mismatch, a correction signal voltage is generated, which is passed through a low pass filter (LPF) 304 to remove high frequency noise prior to being applied to the voltage-controlled oscillator (VCO) 302. The error signal directs the VCO to alter its output frequency in the direction that drives the correction signal to zero and achieves a frequency match at the phase detector.
  • Since each core is likely to be at a different frequency, any issues associated with electromagnetic interference (EMI) are likely to be mitigated and the need for spread spectrum techniques minimized. Nevertheless, this approach offers a novel spread spectrum technique, not available with today's technology, to reduce EMI even further.
  • the DSP could systematically add and subtract a predefined amount from the V data value in the Data Control Register 316. This is done in a way such that the mean value always remains the same as the base V data value.
  • Each core clock frequency (VCO output) will oscillate about the mean frequency value based upon a spread spectrum oscillating frequency, which is independently chosen for each core. This approach allows the spread spectrum approach to be asynchronous for each core, thereby lowering the total EMI.
  • An alternative is to have the spread spectrum oscillating frequency the same for each core.
  • Inherent to the Delta-Sigma modulator is a harmonic dither driver, thereby eliminating the need to add an external dither modulator to effect the spread-spectrum EMI mitigation.
  • Figure 4 illustrates at 400 an alternate processor chip configuration (versus Figure 2) where multi-core groups 402, 404 share an L2 cache 406, 410.
  • the chip 400 also contains the appropriate interfaces to the I/O, Memory, and Fabric controllers (not shown).
  • the generic core clock signal (v gc) is star connected to each core clock synthesizer 412.
  • v ch is shown as direct connected to the VCD Interface function 414 from the Master PLL & Distribution ASIC but may include a re-drive circuit at the junction point.
  • the digital clocking attributes and functions discussed for Figure 2 also apply to this configuration.
  • the configuration in Figure 4 could have a common L2 cache clocking frequency or separate frequencies, depending on regional variability in cache. This arrangement is optimal for wiring resources: local processor/L1 cache clock grids and Vdd (power supply voltage) grids.
  • the output signal from the VCO to each core, or to any grouping or subset of cores on the multicore processor chip, provides a natural interconnected organization which enables a locally addressable switch or 'gate control' to selectively shut off any pathway to said core or grouping of cores.
  • the switching off of the local core clock(s) enables fine-grained power management without inducing power-fluctuations in the power-grid supply voltage, since the present invention teaches a method of clock frequency control not based on the use of varying the power supply or power grid voltages, nor, of varying Vdd.
  • workload monitors via autonomic sensor circuits can turn off idle cores, or, redistribute workloads to optimize performance at a minimum physically possible power point.
  • the present invention recognizes and specifically points out the significant distinguishable advantages of eliminating noise effects associated with voltage (or power) grid variations or voltage-island designs used in prior art approaches for clock frequency variation.
  • Figure 5 illustrates at 500 another alternate processor chip configuration where multi-core groups 502, 504 share an L2 cache 506, 510 and a common local clock generator 512, 514.
  • each core group of four contains one clock generator.
  • Figure 5 shows the core clock is multi-dropped to two cores but other interconnection topologies (e.g. star) can be used.
  • the chip also contains the appropriate interfaces to the I/O, Memory, and Fabric controllers (not shown). The digital clocking attributes and functions discussed for FIG. 2 also apply to this configuration.
  • This configuration has a common local frequency for a region of cores and the local shared cache.
  • the granularity of clocking by core or core groups depends on the nature of technology variability, size of cores, etc.
  • the present invention enables a level of scalability and flexibility that is not readily available with today's state of the art.
  • the optimum core operation frequency can be determined by varying the local frequency and Vdd (power supply voltage), and the invention enables in-field calibration of optimal operating conditions (if processor circuits degrade with time or environmental operating conditions).
  • the instant invention also enables redundant clocks - that is, each local clock generator could have a "Bypass" mode to allow a generic system clock or another core's clock to be used in the event that the local clock generator circuit fails (or shows low yield in early mfg.).
  • clock information is in digital format (data) at relatively low speed.
  • the invention may be used with a core cache (L1) synchronous with the core, but with a separate Vdd from the core.
  • L1 core cache
  • the invention may also be used with a cache that is asynchronously shared among a set of processors; shown herein as running at a system frequency (ns), but the cache could also have a local, independent clock generator.
  • different cores/regions/cache can have different Vdd and different frequencies
  • local clock grid(s) can be driven by, for example, a local clock source or a global chip clock grid driven by a global chip clock.
  • the present invention allows global spread-spectrum from the system reference oscillator; each local clock generator may track the system reference oscillator spreading to avoid the "out-of-phase spreading" problem.
  • digital spread spectrum techniques via the DSP may also be used.
  • Computer program, software program, program, or software in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
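The fractional division described in the core clock synthesizer entries above (an integer-N divider whose divide value is driven by a Delta-Sigma modulator so that the average division ratio equals the desired fractional ratio) can be sketched as follows. This is only an illustration under stated assumptions: the patent does not give the modulator order or structure, so the sketch uses a simple first-order accumulator, and the function name `delta_sigma_divider_stream` is hypothetical.

```python
from fractions import Fraction


def delta_sigma_divider_stream(ratio: Fraction, steps: int) -> list:
    """Emit integer divide values whose average equals `ratio`.

    Assumption: a first-order (single accumulator) modulator; the source
    does not specify the modulator order.
    """
    base = ratio.numerator // ratio.denominator  # integer part N
    frac = ratio - base                          # fractional part in [0, 1)
    acc = Fraction(0)
    out = []
    for _ in range(steps):
        acc += frac
        if acc >= 1:
            # Accumulator overflow: divide by N + 1 for this cycle.
            acc -= 1
            out.append(base + 1)
        else:
            out.append(base)
    return out


# Hypothetical example: a desired fractional division ratio of 5/4.
stream = delta_sigma_divider_stream(Fraction(5, 4), 8)
print(stream)
print(Fraction(sum(stream), len(stream)))
```

Exact rational arithmetic is used here so that the average can be checked without floating-point rounding; a hardware modulator would use a fixed-width binary accumulator instead.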

Abstract

A method of and system for frequency clocking in a processor core are disclosed. In this system, at least one processor core is provided, and that at least one processor core has a clocking subsystem for generating an analog output clock signal at a variable frequency. Digital frequency control data and an analog signal are both transmitted to that at least one processor core; and that processor core uses the received analog signal and digital frequency control data to set the frequency of the output clock signal of the clocking subsystem. In a preferred implementation, multiple cores are asynchronously clocked and the core frequencies are independently set.

Description

METHOD AND SYSTEM FOR ANALOG FREQUENCY CLOCKING IN PROCESSOR CORES
BACKGROUND OF THE INVENTION
Field of the Invention
This invention generally relates to data processing systems, and more specifically, to frequency clocking in processor cores. Even more specifically, in the preferred embodiment, the invention relates to analog multi-frequency clocking in multi-chip/multi-core processors.
Background Art
Servers are beginning to exploit a multiplicity of multi-core processor chips in order to continue to increase performance as processor frequency scaling can no longer meet the industry growth in performance. Also, the increasing difficulty and hardware cost, as well as signal integrity concerns, associated with the transmission of high frequency clocking throughout a multi-chip and multi-core processor server make this an untenable long-term strategy for future server systems. The state of the art for clock distribution is based on high-speed analog signals using transmission lines. This technique is limited in scalability due to skin effect, media and connector loss, crosstalk, termination mismatches, etc. Today's large servers contain, for example, greater than 10 processor chips typically containing two cores. It is expected that both chips and cores per chip will increase in the future. Transmission of high frequency clocks (>5-10 GHz) for multiple chips comprised of multiple cores is not feasible with known board technology and connectors. The need to operate this configuration in a tightly coupled mode, such as a Symmetric Multi-processor (SMP), will require a new clocking paradigm.
As microprocessor chips become larger with more cores, regional process and parameter variability across the chip means that each core will have an optimal power/performance metric at a different chip voltage and clock frequency setting. Obtaining optimum performance for each core within a multi-core system is not feasible today. Separate core voltage domains are known and state-of-the-art but they can only serve to optimize the power at the chip level and not obtain optimum core performance. A server system with separate frequency domains per core is very complicated and is not practiced in the industry. For example, multiple off-chip and on-chip oscillators are required. Spread spectrum clocking used for EMI reduction with multiple oscillators makes "synchronous spreading" very difficult or impossible. Prior art technology is based on distribution of clocking signals across a wiring network known as a clock-tree. With the growth in the number of cores in multi-core microprocessors, clock-trees also grow into enormous complexity, creating serious chip layout design difficulties and translating into detractors to final product yield and related increase in manufacturing cost.
SUMMARY OF THE INVENTION
A method of and system for frequency clocking in a processor core are provided. At least one processor core is provided, and that at least one processor core has a clocking subsystem for generating an analog output clock signal at a variable frequency. Digital frequency control data and an analog signal are both transmitted to that at least one processor core; and that processor core uses the received analog signal and digital frequency control data to set the frequency of the output clock signal of the clocking subsystem. In a preferred implementation, multiple cores are asynchronously clocked and the core frequencies are independently set.
Also, in a preferred embodiment, a plurality of processor cores are provided, and each of the processor cores has a respective clocking subsystem for generating an analog output clock signal at a variable frequency. In this preferred embodiment, an analog signal and individual digital frequency control data are transmitted to each processor core; and each processor core receives the analog signal and digital frequency control data transmitted to the core, and uses the received analog signal and digital control data to set locally (on the core) the frequency of the output clock signal of the clocking subsystem of the processor core. The preferred embodiment of the invention provides a computing system (Server) clocking subsystem solution with a single system reference oscillator, which may be spread (for spread-spectrum) to satisfy EMI requirements. The invention achieves clock distribution to each core via a classical multi-cascade analog tree distribution network and a digital data distribution network to each core. Each core takes both inputs to generate a precise frequency clock for the core, which may be unique to that core. The local core clock synthesizer frequency is determined by the digital control data which is used in conjunction with the analog core clock input to set the precise core frequency of operation using digital signal processing or other digital means. The frequency can be established based upon a policy set by the server manufacturer or customer. For example, the frequency can be set to the maximum capability of each core based upon a particular voltage of operation for all cores.
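As a rough illustration of the digital data distribution network, the sketch below routes per-core frequency control words to their cores by chip and core Id, in the way the Definitions list describes for the VCD Interface function. The `VcdRecord` type, the `route_v_data` helper, the port names of the form `D<core Id>` (modeled on the port label mentioned for Figure 2), and the choice to express the control word as a target frequency in hertz are all assumptions made for illustration; the patent does not specify a data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VcdRecord:
    """Hypothetical layout of one frequency control record."""
    chip_id: int
    core_id: int
    v_data: int  # assumption: target core frequency in Hz


def route_v_data(records, chip_id):
    """Return {port name: v_data} for the records addressed to `chip_id`.

    Records for other chips are ignored, so each core synthesizer only
    receives the control word intended for it.
    """
    ports = {}
    for rec in records:
        if rec.chip_id != chip_id:
            continue
        ports[f"D{rec.core_id}"] = rec.v_data
    return ports


# Hypothetical store contents: two cores on chip 0, one core on chip 1.
store = [
    VcdRecord(chip_id=0, core_id=0, v_data=5_250_000_000),
    VcdRecord(chip_id=0, core_id=1, v_data=4_900_000_000),
    VcdRecord(chip_id=1, core_id=0, v_data=5_100_000_000),
]
print(route_v_data(store, chip_id=0))
```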
The frequency control information is sent to each core as moderate-speed (10-100 Mb/s) digital data words, thereby avoiding the problems with high-speed analog signal transmission. The frequency control information has high noise immunity and low signal distortion since it is in the form of digital data. The frequency control information is sent as individual control data words (v data) to each core. The data is latched into the core "clock synthesizer memory" from the server SEEPROM, which contains the vital chip data (VCD) for each core in the server. The single system reference oscillator is set at a moderate frequency (10-100 MHz), which is distributed to each core via analog transmission line techniques, phase-locked loops (PLL), and re-drive circuits. The analog clock signal frequencies are kept moderate prior to the individual core clock synthesizers to avoid high-speed distortion effects.
The system reference clock, chip clock, and generic core clock signals are continuously required to maintain a stable core clock. However, the fundamental core operating frequency changes infrequently (except for certain spread spectrum techniques), such that v data changes are infrequent and only periodic v data updates are sufficient to generate a clock for each core. Each core runs asynchronously with respect to each of the other cores and with respect to local cache. It will be appreciated that, once the different regions of a chip are asynchronous, some handshaking/buffering will be required to transfer data between regions, so there will be some added latency. Techniques are known to minimize this latency. Nevertheless, the net performance gain of operating each core at its maximum frequency will be substantial (10-20%).
The present invention can be applied to any processing platform that uses multi-microprocessor core silicon chips, for example, client uP platforms, storage controllers, data communication switches, etc.
Further benefits and advantages of this invention will become apparent from a consideration of the following detailed description, given with reference to the accompanying drawings, which specify and show preferred embodiments of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows an analog multi-frequency clocking of a processor subsystem.
Figure 2 illustrates an analog multi-frequency clocking of processor chips.
Figure 3 shows a local core clock synthesizer embodying the present invention.
Figure 4 shows an alternate processor configuration in which multi-core groups share an L2 cache.
Figure 5 illustrates a further alternate processor configuration in which multi-core groups share an L2 cache and a common local clock generator.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 illustrates a typical computing Server 100 that is composed of multiple microprocessor (uP) chips (N) 102, which have internal clocking functions (e.g. digital signal processor, DSP, core clock generator, etc.) that utilize the server reference oscillator (v R) as the basic system clock. A Master PLL and distribution ASIC (Application Specific Integrated Circuit) on the MCM or system board multiplies, re-drives, and distributes the reference clock signal to each uP chip in the Multi-chip Module (MCM) or on the system board. The output of the Master PLL & Distribution ASIC is a chip clock signal (v ch) that is distributed throughout the processor chip.
The reference oscillator 104 clock frequency (v R) is a relatively low frequency (typically 10-100 MHz) such that it can be easily routed throughout the PC board without significant signal degradation, yet fast enough to enable feasible up-conversion rates to ensure the uP high speed clock (typically 5-10 GHz) is stable and remains within the platform deviation requirement (typically 10-100 ppm, parts per million). The distribution network is generally point-to-point (illustrated in Figure 1) for best reference clock integrity, with signal re-drive at the up-conversion points. The first up-conversion and re-drive point is the Master PLL 106, which is used to generate the chip frequency (v ch) clock for each microprocessor chip in the server. The Master PLL not only re-drives the signal but also multiplies the reference oscillator frequency, typically by 2-10x. The uP chip clock signal is, in turn, distributed within a chip by a second level distribution ASIC for use by each core clock synthesizer to generate the fundamental core clock, described below.
Figure 1 also shows the interconnection from the uP chips to the I/O Subsystem, System Memory, and external System Clustering fabric via the appropriate controller interface 110, 112 and 114. The Clustering fabric is used to interconnect multiple MCMs together to construct a larger multi-processor Server where the MCMs are connected in a symmetric multi-processing (SMP) configuration. In an SMP configuration, the memory is coherent to all the processors within the SMP. In this case, all the MCMs are synchronized to a single Reference Oscillator 104 (illustrated in Figure 1 outside the MCM). The preferred method of this invention can also be used on a configuration of uP chips contained on multiple Single Chip Modules (SCM) mounted on a common glass epoxy printed circuit (PC) board. This alternate packaging configuration may be used for smaller systems. In this case, the Distribution ASIC is also mounted in an SCM on the system board and interconnection to each processor chip is done via system PC board wiring.
The MCM and/or PC board contains vital core frequency data (VCD) for each core in the server. This information is typically maintained in a Serial Electrically Erasable Programmable Read Only Memory (SEEPROM). This SEEPROM contains the vital core frequency data (v data) for each connected processor (core). The "V data" is the digital representation of the optimum processor (core) frequency along with identification (Id) of the appropriate chip and core. The Id information is used to ensure the correct VCD is transmitted and stored in the VCD Interface function on each chip, for all cores on the chip. The VCD is derived from the frequency characterization data, voltage characterization data, power characterization data, etc. gathered by the Service Element (SE).
The SE analyzes and reformats the data and loads the data into the system SEEPROM via an appropriate digital interface (e.g. I2C). The totality of data gathered and analyzed by the SE is used to set the optimum frequency, voltage, etc. for each core to achieve the highest performance possible or other policy established by the customer. A novel aspect of this invention is the use of data to generate the optimum processor frequency locally (within core) in conjunction with the up-converted reference clock versus today's approach of transmitting the same analog clock signal to all cores.
The data for each core/chip can be obtained during the chip test/verification stage in the manufacturing process or as part of a training paradigm during power-on sequence of the server. The latter approach would be part of the initialization and set-up process of the server.
A representative server processor chip (one of several for a typical server) configuration with multiple cores (4) and shared L2 cache is illustrated at 200 in Figure 2. The four core clock synthesizers 202 within the processor chip receive the generic core clock (v gc) from the second level PLL and distribution ASIC 204 by means of the second level distribution network, which is contained on the chip. The generic core clock signal (v gc) is transmitted to each core using a multi-drop bus (illustrated) or a point-to-point star interconnection. The second level distribution ASIC 204 provides the necessary frequency up-conversion to generate the generic core clock (typically 10-20x), re-drive circuits, and a clock (v ch) for the VCD Interface function.
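By way of illustration only, the two-level up-conversion described above can be sketched in Python. The function names and the particular values (a 50 MHz reference, a 5x Master PLL, a 20x second level multiplier, a 100 ppm limit) are assumptions picked from within the typical ranges quoted in the text; they are not values taken from the disclosure.

```python
def clock_chain(ref_hz, master_mult, second_mult):
    """Return (chip clock, generic core clock) in Hz for a two-level chain."""
    chip_hz = ref_hz * master_mult            # v ch, output of the Master PLL
    generic_core_hz = chip_hz * second_mult   # v gc, output of the second level ASIC
    return chip_hz, generic_core_hz


def max_deviation_hz(freq_hz, ppm):
    """Largest allowed frequency offset for a deviation limit given in ppm."""
    return freq_hz * ppm / 1_000_000


# Assumed example values: 50 MHz reference, 5x then 20x multiplication.
chip_hz, generic_core_hz = clock_chain(50_000_000, 5, 20)
print(chip_hz, generic_core_hz)
print(max_deviation_hz(generic_core_hz, 100))
```

With these assumed values the chain yields a 250 MHz chip clock and a 5 GHz generic core clock, so only moderate frequencies travel over the board-level wiring.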
The VCD Interface function contains the VCD interface to the SEEPROM (see Figure 1) to receive and store the appropriate data for setting the precise frequency of each of the cores within the chip, along with the appropriate Ids. The VCD Interface function interrogates the SEEPROM and obtains the appropriate data (typically through an I2C interface) for its cores. It may contain some SRAM and state machines or a small controller in addition to the I2C interface to perform this function. The VCD Interface function also performs the distribution function by transmitting the V Data to the appropriate core synthesizer only.
As part of the V data content, a unique chip and core Id is included which is related to the chip and module serial number. This core Id is used by the VCD Interface function to route the V data to the appropriate port. For example, V Data intended for core "0" is routed to port "D0" (Figure 2). The V data is stored in the clock synthesizer and is used as the processor clock frequency data until it is updated by the VCD function on chip. If no changes are forthcoming, no data is sent from the VCD Interface function or the SEEPROM. The V data is not sent continuously, but only when it is updated. This is in contrast to the state-of-the-art analog technique where the signal must be sent continuously. However, the analog clock is sent continuously to ensure a stable core clock.
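The routing and update-only behavior described above can be sketched as follows. This is a minimal model under stated assumptions: the class name VcdInterface, the (chip_id, core_id, v_data) record layout, and the transfer counter are all hypothetical and are introduced here only for illustration.

```python
class VcdInterface:
    """Hypothetical model of the on-chip VCD Interface function."""

    def __init__(self, chip_id, num_cores):
        self.chip_id = chip_id
        # One port per core; None means no V data has been delivered yet.
        self.ports = {core: None for core in range(num_cores)}
        self.transfers = 0

    def deliver(self, record):
        """Route a (chip_id, core_id, v_data) record; return True if it was sent."""
        chip_id, core_id, v_data = record
        if chip_id != self.chip_id or core_id not in self.ports:
            return False  # Id mismatch: the record is not for a core on this chip
        if self.ports[core_id] == v_data:
            return False  # unchanged value: nothing is transmitted
        self.ports[core_id] = v_data  # latched by that core's clock synthesizer
        self.transfers += 1
        return True


vcd = VcdInterface(chip_id=7, num_cores=4)
print(vcd.deliver((7, 0, 5_100_000_000)))  # first value for core 0 is sent
print(vcd.deliver((7, 0, 5_100_000_000)))  # same value again is not re-sent
print(vcd.deliver((3, 0, 4_900_000_000)))  # record for another chip is ignored
```

In this sketch a core keeps using its latched value until a different value arrives, which mirrors the statement that V data is sent only when it is updated.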
Each core 206 is comprised of the microprocessor, dedicated cache 210, and the core clock synthesizer 202. The core frequency is set by the core clock synthesizer and the digital V data in the VCD for each core. Each core is likely to have different frequency settings. The number of cores within the processor chip is determined by the technology and manufacturing process capability. Four are shown in Figure 2 for illustrative purposes. The technical approach described herein easily scales with the number of cores, which will likely increase in the future. The chip 200 also contains the appropriate interfaces 210, 212, 214 to the I/O, Memory, and Fabric controllers.
The design of the core clock synthesizer is illustrated at 300 in Figure 3. It is comprised of a voltage controlled high speed oscillator (VCO) 302, a low pass filter (LPF) 304, a digitally controlled integer-N divider 306, and a Delta-Sigma modulator 310 in conjunction with a digital signal processor (DSP) 312. This arrangement is a variation of the known Delta-Sigma fractional-N synthesizer, which is used to tune each core clock to operate above and below the generic core clock operating frequency of the server. The VCO operating range, center frequency, and voltage to frequency conversion characteristic are a function of the VCO design and technology. The VCO is tuned to a precise fractional frequency by changing the analog control voltage up or down in precise increments to achieve the desired frequency.
A portion of the core clock output of the VCO is sent to the integer-N divider, which divides the incoming core clock frequency by an integer N value from the Delta-Sigma modulator. The Delta-Sigma modulator provides an output bit stream of time discrete integer values such that the average of the division ratio is equal to the input desired fractional division ratio. The desired fractional division ratio is generated by the DSP. The DSP 312 converts the desired V data digital frequency value to the appropriate fractional division ratio to yield the desired optimum core frequency. The reference frequency may be set at the factory based on the desired generic core frequency, which is the basis for determining the desired fractional division ratio.
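The averaging behavior of the modulator can be illustrated with a first-order, accumulator-based Delta-Sigma sketch. The source does not specify the modulator order or architecture, so the first-order form, the function name, and the example ratio of 41/4 are assumptions made only for illustration.

```python
from fractions import Fraction


def delta_sigma_divider_sequence(ratio, steps):
    """First-order accumulator: emit integer divide values that average to ratio."""
    base = ratio.numerator // ratio.denominator  # integer part N
    frac = ratio - base                          # fractional part in [0, 1)
    acc = Fraction(0)
    out = []
    for _ in range(steps):
        acc += frac
        if acc >= 1:        # accumulator overflow: divide by N + 1 for one cycle
            acc -= 1
            out.append(base + 1)
        else:               # otherwise divide by N
            out.append(base)
    return out


# Assumed example: a desired fractional division ratio of 10.25.
seq = delta_sigma_divider_sequence(Fraction(41, 4), 8)
print(seq)
print(Fraction(sum(seq), len(seq)))
```

Each emitted value is an integer that the integer-N divider can apply directly, while the long-run average equals the fractional ratio requested by the DSP.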
The divided output signal of the integer-N divider 306 is phase compared to the generic core frequency "v gc" in the analog phase detector 314. If the two signals are matched, no frequency correction signal is generated and the clock synthesizer core output is equal to the desired core frequency, which is defined by the core V data input to the DSP. If there is a mismatch, a correction signal voltage is generated, which is passed through a low pass filter (LPF) 304 to remove high frequency noise prior to being applied to the voltage-controlled oscillator (VCO) 302. The error signal directs the VCO to alter its output frequency in the direction that drives the correction signal to zero and achieves a frequency match at the phase detector.
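The feedback action described above can be illustrated with a deliberately simplified, frequency-only model. It ignores phase, the low pass filter dynamics, and the VCO's voltage-to-frequency characteristic; the function name, the loop gain, the scaling of the correction by the division ratio, and all numeric values are assumptions for illustration only. The one relationship taken from the text is that lock is reached when the divided VCO output equals v gc.

```python
def lock_vco(division_ratio, generic_core_hz, start_hz, gain=0.5, steps=60):
    """Iterate a simplified loop until the divided VCO output matches v gc."""
    vco_hz = float(start_hz)
    for _ in range(steps):
        divided_hz = vco_hz / division_ratio        # output of the divider
        error_hz = generic_core_hz - divided_hz     # mismatch seen by the detector
        vco_hz += gain * error_hz * division_ratio  # correction applied to the VCO
    return vco_hz


# Assumed example: a core tuned 2% above a 5 GHz generic core clock.
target_hz = 5_100_000_000
generic_core_hz = 5_000_000_000
ratio = target_hz / generic_core_hz
final_hz = lock_vco(ratio, generic_core_hz, start_hz=4_000_000_000)
print(abs(final_hz - target_hz) < 1.0)
```

In this model the remaining frequency error halves on every iteration, so the output settles at the division ratio times v gc, whether the starting frequency is above or below the target.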
Since each core is likely to be at a different frequency, any issues associated with electromagnetic interference (EMI) are likely to be mitigated and the need for spread spectrum techniques minimized. Nevertheless, this approach offers a novel spread spectrum technique, which is not available with today's technology to reduce EMI even further. For example, the DSP could systematically add and subtract a predefined amount from the V data value in the Data Control Register 316. This is done in a way such that the mean value always remains the same as the base V data value. Each core clock frequency (VCO output) will oscillate about the mean frequency value based upon a spread spectrum oscillating frequency, which is independently chosen for each core. This approach allows the spread spectrum approach to be asynchronous for each core, thereby lowering the total EMI. An alternative is to have the spread spectrum oscillating frequency the same for each core. Inherent to the Delta-Sigma modulator is a harmonic dither driver, thereby eliminating the need to add an external dither modulator to effect the spread-spectrum EMI mitigation.
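The add-and-subtract scheme described above can be sketched as follows. The function name, the square-wave pattern, and the numeric values are assumptions; the source only requires that the mean remain equal to the base V data value.

```python
def spread_v_data(base_value, delta, periods):
    """Alternately add and subtract delta so the mean stays at base_value."""
    values = []
    for _ in range(periods):
        values.append(base_value + delta)
        values.append(base_value - delta)
    return values


# Assumed example: a 5 GHz base value spread by +/- 10 MHz.
values = spread_v_data(5_000_000_000, 10_000_000, 3)
print(values)
print(sum(values) // len(values))
```

Because every positive excursion is paired with an equal negative one, the average over any whole number of periods equals the base value. Giving each core its own delta or period would make the spreading asynchronous between cores.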
Another approach is to vary the reference oscillator about its mean. This variation will change the frequency base for comparison in the phase detector, causing the VCO core frequency to change.
Figure 4 illustrates at 400 an alternate processor chip configuration (versus Figure 2) where multi-core groups 402, 404 share an L2 cache 406, 410. The chip 400 also contains the appropriate interfaces to the I/O, Memory, and Fabric controllers (not shown). The generic core clock signal (v gc) is star connected to each core clock synthesizer 412. The chip clock (v ch) is shown as directly connected to the VCD Interface function 414 from the Master PLL & Distribution ASIC but may include a re-drive circuit at the junction point. The digital clocking attributes and functions discussed for Figure 2 also apply to this configuration. The configuration in Figure 4 could have a common L2 cache clocking frequency or separate frequencies, depending on regional variability in cache. This arrangement is optimal for wiring resources: local processor/L1 cache clock grids, and Vdd (power supply voltage) grids. As shown in Figure 3, the output signal from the VCO to each core, or to any grouping or subset of cores on the multicore processor chip, provides a natural interconnected organization which enables a locally addressable switch or 'gate control' to selectively shut off any pathway to said core or grouping of cores. In effect, the switching off of the local core clock(s) enables fine-grained power management without inducing power fluctuations in the power-grid supply voltage, since the present invention teaches a method of clock frequency control not based on varying the power supply or power grid voltages, nor on varying Vdd. In this manner, workload monitors, via autonomic sensor circuits, can turn off idle cores or redistribute workloads to optimize performance at a minimum physically possible power point. The present invention recognizes and specifically points out the significant distinguishable advantages of eliminating noise effects associated with voltage (or power) grid variations or voltage-island designs used in prior art approaches for clock frequency variation.
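The gate control decision can be illustrated with a short sketch. The utilization figures, the idle threshold, and the function name are hypothetical; the source does not say how the workload monitors decide that a core is idle.

```python
def gate_idle_cores(utilization, idle_threshold=0.05):
    """Return per-core clock enable flags; idle cores have their clock gated off."""
    return {core: load > idle_threshold for core, load in utilization.items()}


# Assumed readings from workload monitors, one value per core (0.0 to 1.0).
enables = gate_idle_cores({0: 0.90, 1: 0.00, 2: 0.50, 3: 0.01})
print(enables)
```

Only the clock enables change in this sketch; no supply voltage is altered, which is consistent with the point above that the clock control does not rely on varying Vdd.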
Figure 5 illustrates at 500 another alternate processor chip configuration where multi-core groups 502, 504 share an L2 cache 506, 510 and a common local clock generator 512, 514. In this configuration, each core group of four contains one clock generator. Figure 5 shows the core clock is multi-dropped to two cores but other interconnection topologies (e.g. star) can be used. The chip also contains the appropriate interfaces to the I/O, Memory, and Fabric controllers (not shown). The digital clocking attributes and functions discussed for Figure 2 also apply to this configuration. This configuration has a common local frequency for a region of cores and the local shared cache. The granularity of clocking by core or core groups depends on the nature of technology variability, size of cores, etc.
The present invention enables a level of scalability and flexibility that is not readily available with today's state of the art. For example, with the present invention, the optimum core operation frequency can be determined by varying the local frequency and Vdd (power supply voltage), and the invention enables in-field calibration of optimal operating conditions (if processor circuits degrade with time or environmental operating conditions). The instant invention also enables redundant clocks - that is, each local clock generator could have a "Bypass" mode to allow a generic system clock or another core's clock to be used in the event that the local clock generator circuit fails (or shows low yield in early manufacturing). With this invention, clock information is in digital format (data) at relatively low speed.
Different types of caches may be used in this invention. For instance, the invention may be used with a core cache (L1) synchronous with the core, but with a separate Vdd from the core. The invention may also be used with a cache that is asynchronously shared among a set of processors; shown herein as running at a system frequency (ns), but the cache could also have a local, independent clock generator.
With this invention, different cores/regions/cache can have different Vdd and different frequencies, and local clock grid(s) can be driven by, for example, a local clock source or a global chip clock grid driven by a global chip clock. The present invention allows global spread-spectrum from the system reference oscillator; each local clock generator may track the system reference oscillator spreading to avoid the "out-of-phase spreading" problem. In addition, with this invention, digital spread spectrum techniques via the DSP may also be used.
Aspects of the present invention can be embodied in a computer program product, which comprises all the respective features enabling the implementation of the methods described herein, and which - when loaded in a computer system - is able to carry out these methods. Computer program, software program, program, or software, in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
While it is apparent that the invention herein disclosed is well calculated to fulfill the objects stated above, it will be appreciated that numerous modifications and embodiments may be devised by those skilled in the art and it is intended that the appended claims cover all such modifications and embodiments as fall within the true spirit and scope of the present invention.
For the avoidance of doubt, the term "comprising", as used herein throughout the description and claims, is not to be construed as meaning "consisting only of".

Claims

1. A method of frequency clocking in a processor core, comprising the steps of: providing at least one processor core, said at least one processor core having a clocking subsystem for generating an analog output clock signal at a variable frequency; transmitting to said at least one processor core: i) an analog signal at a given frequency, and ii) digital frequency control data; and said at least one processor core: i) receiving said analog signal and said digital frequency control data, and ii) using said analog signal and said digital frequency control data to set the frequency of the output clock signal of the clocking subsystem.
2. A method according to Claim 1, wherein said processor core is on a processor chip, and said processor chip includes a chip distribution ASIC, and the transmitting step includes the steps of: transmitting an analog chip reference signal having a given frequency to the chip distribution ASIC; and said chip distribution ASIC: i) generating an output core generic analog signal, said core generic analog signal having a frequency greater than the frequency of the chip reference signal, and ii) transmitting said core generic analog signal to the at least one processor core.
3. A method according to Claim 2, wherein said processor chip is on a processor module, and said processor module includes a module distribution ASIC, and the step of transmitting the analog chip reference signal includes the steps of: transmitting an analog primary reference signal having a defined frequency to said module distribution ASIC; and said module distribution ASIC: i) generating said analog chip reference signal, the frequency of the analog chip reference signal being greater than the frequency of the primary reference signal, and ii) transmitting the analog chip reference signal to the chip distribution ASIC.
4. A method according to any preceding claim, wherein: the providing step includes the step of providing a plurality of processor cores, each of the processor cores having a respective clocking subsystem for generating an analog output clock signal at a variable frequency; and the transmitting step includes the steps of: i) transmitting an analog reference signal having a given frequency to a core distribution ASIC, and ii) said core distribution ASIC generating an output core generic signal, said core generic signal having a frequency greater than the frequency of the reference signal, and transmitting said core generic signal to each of the plurality of processor cores.
5. A method according to Claim 4, wherein: the providing step includes the step of providing a further distribution ASIC; and the step of transmitting the analog reference signal to the core distribution ASIC includes the steps of i) transmitting an analog primary reference signal having a defined frequency to the further distribution ASIC, and ii) said further distribution ASIC generating said analog chip reference signal, the frequency of said chip reference signal being greater than the frequency of the analog primary reference signal, and transmitting the analog chip reference signal to the core distribution ASIC.
6. A system for frequency clocking in a processor core, comprising: at least one clocking subsystem on at least one processor core, and for generating an analog output clock signal at a variable frequency; a digital transmission network for transmitting to said at least one processor core digital frequency control data; an analog transmission network for transmitting to said at least one processor core an analog signal at a given frequency; and wherein said at least one clocking subsystem includes: i) a receiver for receiving said analog signal and said digital frequency control data, and ii) a local clock synthesizer for using said received analog signal and said digital frequency control data to set the frequency of the output clock signal of the clocking subsystem of the processor core.
7. A system according to Claim 6, wherein said at least one processor core is on a processor chip, and the analog transmission network includes: a chip distribution ASIC on the processor chip for receiving a chip reference analog signal having a given frequency, and for generating a core generic analog signal having a frequency greater than the frequency of the chip reference signal; and a first connection for transmitting the core generic analog signal from the chip distribution ASIC to the at least one processor core.
8. A system according to Claim 7, wherein said processor chip is on a processor module, and the analog transmission network further includes: a module distribution ASIC on the processor module for receiving an analog module reference signal having a defined frequency, and for generating the chip reference signal, the frequency of the chip reference signal being greater than the frequency of the analog module reference signal; and a second connection for transmitting the chip reference signal from the module distribution ASIC to the chip distribution ASIC.
9. A system according to Claim 6, 7 or 8, for frequency clocking in a plurality of processor cores, and wherein each of the processor cores includes a respective one clocking system for generating an analog output clock signal at a variable frequency, and wherein: the digital transmission network transmits digital frequency control data to said plurality of processor cores; the analog transmission network transmits analog signals to said plurality of processor cores; and each of the processor cores receives digital frequency control data and one of the analog signals and uses the received digital frequency control data and the received analog signal to set the frequency of the clocking system of said each of the processor cores.
10. A system according to Claim 9, wherein: the analog transmission network includes: i) a first level distribution ASIC for receiving a reference analog signal having a given frequency, and for generating a chip analog signal having a frequency greater than the frequency of the reference analog signal, and ii) a second level distribution ASIC for receiving the chip analog signal from the first level distribution ASIC, and for generating a generic core signal having a frequency greater than the frequency of the chip analog signal; and each of the processor cores receives the generic core signal from the second level distribution ASIC.
11. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for frequency clocking in at least one processor core, said at least one processor core including a clocking subsystem for generating an analog output clock signal at a variable frequency, said method steps comprising: transmitting to said at least one processor core: i) an analog signal at a given frequency, and ii) digital frequency control data; and said at least one processor core: i) receiving said analog signal and said digital frequency control data, and ii) using said analog signal and said digital frequency control data to set the frequency of the output clock signal of the clocking subsystem.
12. A program storage device according to Claim 11, wherein said processor core is on a processor chip, and said processor chip includes a chip distribution ASIC, and the transmitting step includes the steps of: transmitting an analog chip reference signal having a given frequency to the chip distribution ASIC; and said chip distribution ASIC: i) generating an output core generic analog signal, said core generic analog signal having a frequency greater than the frequency of the chip reference signal, and ii) transmitting said core generic analog signal to the at least one processor core.
13. A program storage device according to Claim 12, wherein said processor chip is on a processor module, and said processor module includes a module distribution ASIC, and the step of transmitting the analog chip reference signal includes the steps of: transmitting an analog primary reference signal having a defined frequency to said module distribution ASIC; and said module distribution ASIC: i) generating said analog chip reference signal, the frequency of the analog chip reference signal being greater than the frequency of the primary reference signal, and ii) transmitting the analog chip reference signal to the chip distribution ASIC.
14. A program storage device according to Claim 11, 12 or 13, wherein the method steps are for frequency clocking in a plurality of processor cores, each of the processor cores having a respective clocking subsystem for generating an analog output clock signal at a variable frequency, and wherein: the transmitting step includes the steps of: i) transmitting an analog reference signal having a given frequency to a core distribution ASIC, and ii) said core distribution ASIC generating an output core generic signal, said core generic signal having a frequency greater than the frequency of the reference signal, and transmitting said core generic signal to each of the plurality of processor cores.
15. A program storage device according to Claim 14, wherein the step of transmitting the analog reference signal to the core distribution ASIC includes the steps of: transmitting an analog primary reference signal having a defined frequency to a further distribution ASIC, and said further distribution ASIC generating said analog chip reference signal, the frequency of said chip reference signal being greater than the frequency of the analog primary reference signal, and transmitting the analog chip reference signal to the core distribution ASIC.
16. A system for frequency clocking in a multi-core processor chip, each of said cores including a clocking subsystem for generating an analog clock signal at a variable frequency, the system comprising: a digital transmission network for transmitting to each of the cores an associated digital value; an analog transmission network for transmitting to each of the cores an associated analog signal; and wherein each of the cores uses the digital value and the analog signal transmitted to the core to generate on the core an optimum processor clock frequency.
17. A system according to Claim 16, wherein: the analog transmission network includes: i) a first level distribution ASIC for receiving a reference analog signal having a given frequency, and for generating a chip analog signal having a frequency greater than the frequency of the reference analog signal, and ii) a second level distribution ASIC for receiving the chip analog signal from the first level distribution ASIC, and for generating a generic core signal having a frequency greater than the frequency of the chip analog signal; and each of the processor cores receives the generic core signal from the second level distribution ASIC.
18. A system according to Claim 16, further comprising: a memory unit for storing, for each of the processor cores, a respective identification value and an associated optimal frequency value; and wherein: the digital transmission network transmits to each of the processor cores the optimal frequency value associated with said each of the processor cores; each of the processor cores generates the optimum processor clock frequency for said each of the processor cores independently of the optimum processor clock frequencies generated by the others of the processor cores; said optimal frequency values in the memory unit change over time; and whenever the optimal frequency value associated with one of the processor cores changes, from an old value to a new value, the digital transmission network transmits said new value to said one of the processor cores.
19. A method of managing power applied to a processor chip having multiple processor cores, each of the processor cores including a clocking subsystem for generating an analog output clock signal at a variable frequency, the method comprising the steps of: transmitting to each of the processor cores an analog signal and digital frequency control data to set the frequency of the output clock signal of the processor core; and switching off the clocking subsystems of selected ones of the processor cores at selected times to manage power-consumption by the processor chip.
20. A method according to Claim 19, comprising the further step of applying a substantially constant power supply voltage to the processor core during said switching step.
PCT/EP2008/054011 2007-04-12 2008-04-03 Method and system for analog frequency clocking in processor cores WO2008125509A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2010502492A JP5306319B2 (en) 2007-04-12 2008-04-03 Method and system for analog frequency clocking in a processor core
CN2008800115703A CN101652737B (en) 2007-04-12 2008-04-03 Method and system for analog frequency clocking in processor cores

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/734,334 2007-04-12
US11/734,334 US8161314B2 (en) 2007-04-12 2007-04-12 Method and system for analog frequency clocking in processor cores

Publications (2)

Publication Number Publication Date
WO2008125509A2 (en) 2008-10-23
WO2008125509A3 (en) 2009-01-22

Family

ID=39854853

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/054011 WO2008125509A2 (en) 2007-04-12 2008-04-03 Method and system for analog frequency clocking in processor cores

Country Status (6)

Country Link
US (1) US8161314B2 (en)
JP (1) JP5306319B2 (en)
KR (1) KR20100003727A (en)
CN (1) CN101652737B (en)
TW (1) TWI417700B (en)
WO (1) WO2008125509A2 (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7917799B2 (en) * 2007-04-12 2011-03-29 International Business Machines Corporation Method and system for digital frequency clocking in processor cores
US7945804B2 (en) * 2007-10-17 2011-05-17 International Business Machines Corporation Methods and systems for digitally controlled multi-frequency clocking of multi-core processors
US7996743B1 (en) * 2008-04-01 2011-08-09 Altera Corporation Logic circuit testing with reduced overhead
JP2011160369A (en) * 2010-02-04 2011-08-18 Sony Corp Electronic circuit, electronic apparatus, and digital signal processing method
US8484495B2 (en) * 2010-03-25 2013-07-09 International Business Machines Corporation Power management in a multi-processor computer system
US8943334B2 (en) 2010-09-23 2015-01-27 Intel Corporation Providing per core voltage and frequency control
TW201250520A (en) * 2011-06-13 2012-12-16 Waltop Int Corp Digitizer integration chip
CN102445916B (en) * 2011-09-15 2014-04-02 福建星网锐捷网络有限公司 Method and system of programmable controller and clock frequency control
US9471088B2 (en) 2013-06-25 2016-10-18 Intel Corporation Restricting clock signal delivery in a processor
US9377836B2 (en) * 2013-07-26 2016-06-28 Intel Corporation Restricting clock signal delivery based on activity in a processor
US9552034B2 (en) * 2014-04-29 2017-01-24 Qualcomm Incorporated Systems and methods for providing local hardware limit management and enforcement
KR102032330B1 (en) * 2014-06-20 2019-10-16 에스케이하이닉스 주식회사 Semiconductor device and its global synchronous type dynamic voltage frequency scaling method
US20160283333A1 (en) * 2015-03-25 2016-09-29 International Business Machines Corporation Utilizing a processor with a time of day clock error
US20160283334A1 (en) * 2015-03-25 2016-09-29 International Business Machines Corporation Utilizing a processor with a time of day clock error
CN105049002B (en) * 2015-07-02 2018-07-31 深圳市韬略科技有限公司 A kind of method of the spreading device of electromagnetic compatibility and generation spread spectrum clock signal
US10430354B2 (en) * 2017-04-21 2019-10-01 Intel Corporation Source synchronized signaling mechanism
CN108984469A (en) * 2018-06-06 2018-12-11 北京嘉楠捷思信息技术有限公司 Chip frequency modulation method and device of computing equipment, computing force board, computing equipment and storage medium
CN113132272B (en) * 2021-03-31 2023-02-14 中国人民解放军战略支援部队信息工程大学 Network switching frequency dynamic adjustment method and system based on flow perception and network switching chip structure

Citations (4)

Publication number Priority date Publication date Assignee Title
US5594895A (en) * 1992-12-15 1997-01-14 International Business Machines Corporation Method and apparatus for switching between clock generators only when activity on a bus can be stopped
US20040251970A1 (en) * 2003-05-29 2004-12-16 Intel Corporation Method for clock generator lock-time reduction during speedstep transition
WO2006000929A2 (en) * 2004-06-21 2006-01-05 Koninklijke Philips Electronics N.V. Power management
US7187742B1 (en) * 2000-10-06 2007-03-06 Xilinx, Inc. Synchronized multi-output digital clock manager

Family Cites Families (22)

Publication number Priority date Publication date Assignee Title
US5481573A (en) * 1992-06-26 1996-01-02 International Business Machines Corporation Synchronous clock distribution system
US6256745B1 (en) * 1998-06-05 2001-07-03 Intel Corporation Processor having execution core sections operating at different clock rates
JPH1185303A (en) * 1997-09-02 1999-03-30 Nippon Steel Corp Clock generation circuit
JP3702148B2 (en) * 1999-05-28 2005-10-05 三洋電機株式会社 PLL device
JP2001117903A (en) * 1999-10-22 2001-04-27 Seiko Epson Corp Semiconductor integrated circuit device, microprocessor, microcomputer and electronic equipment
US6990598B2 (en) 2001-03-21 2006-01-24 Gallitzin Allegheny Llc Low power reconfigurable systems and methods
US6993669B2 (en) 2001-04-18 2006-01-31 Gallitzin Allegheny Llc Low power clocking systems and methods
US7188261B1 (en) * 2001-05-01 2007-03-06 Advanced Micro Devices, Inc. Processor operational range indicator
US6898721B2 (en) 2001-06-22 2005-05-24 Gallitzin Allegheny Llc Clock generation systems and methods
JP4170218B2 (en) * 2001-08-29 2008-10-22 メディアテック インコーポレーテッド Method and apparatus for improving the throughput of a cache-based embedded processor by switching tasks in response to a cache miss
US6978389B2 (en) 2001-12-20 2005-12-20 Texas Instruments Incorporated Variable clocking in an embedded symmetric multiprocessor system
JP3638271B2 (en) * 2002-07-23 2005-04-13 沖電気工業株式会社 Information processing device
US7124315B2 (en) * 2002-08-12 2006-10-17 Hewlett-Packard Development Company, L.P. Blade system for using multiple frequency synthesizers to control multiple processor clocks operating at different frequencies based upon user input
US7945803B2 (en) * 2003-06-18 2011-05-17 Nethra Imaging, Inc. Clock generation for multiple clock domains
JP2005085164A (en) * 2003-09-10 2005-03-31 Sharp Corp Control method for multiprocessor system, and multiprocessor system
JP2004152290A (en) * 2003-10-24 2004-05-27 Renesas Technology Corp Semiconductor device
US7321979B2 (en) 2004-01-22 2008-01-22 International Business Machines Corporation Method and apparatus to change the operating frequency of system core logic to maximize system memory bandwidth
US7350096B2 (en) * 2004-09-30 2008-03-25 International Business Machines Corporation Circuit to reduce power supply fluctuations in high frequency/ high power circuits
JP2006195602A (en) * 2005-01-12 2006-07-27 Fujitsu Ltd System clock distribution device and system clock distribution method
JP2007026075A (en) * 2005-07-15 2007-02-01 Canon Inc Data processor and method for controlling data processor
US7639764B2 (en) * 2005-08-17 2009-12-29 Atmel Corporation Method and apparatus for synchronizing data between different clock domains in a memory controller
US7478259B2 (en) * 2005-10-31 2009-01-13 International Business Machines Corporation System, method and storage medium for deriving clocks in a memory system

Non-Patent Citations (5)

Title
ANONYMOUS: "BIOS AND KERNEL DEVELOPER'S GUIDE FOR AMD ATHLON™ 64 AND AMD OPTERON™ PROCESSORS" [Online] February 2006 (2006-02), XP002505671, AMD Athlon 64 Processor Tech Docs page on Internet. Retrieved from the Internet: <URL:http://web.archive.org/web/20070315084839/http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF> [retrieved on 2008-11-26] *
ANONYMOUS: "Enhanced Intel® SpeedStep® Technology for the Intel® Pentium M Processor - White Paper" [Online] March 2004 (2004-03), XP002505672. Retrieved from the Internet: <URL:http://developer.intel.com/design/intarch/pentiumm/docs_pentiumm_proc.htm> [retrieved on 2008-11-26] *
JIAN LI ET AL: "Dynamic Power-Performance Adaptation of Parallel Computation on Chip Multiprocessors", HIGH-PERFORMANCE COMPUTER ARCHITECTURE, 2006. THE TWELFTH INTERNATIONAL SYMPOSIUM ON, AUSTIN, TEXAS, FEBRUARY 11-15, 2006, PISCATAWAY, NJ, USA, IEEE, 11 February 2006 (2006-02-11), pages 77-87, XP010896312, ISBN: 978-0-7803-9368-4 *
MURALIMANOHAR N ET AL: "Power efficient resource scaling in partitioned architectures through dynamic heterogeneity", PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, 2006 IEEE INTERNATIONAL SYMPOSIUM ON, AUSTIN, TX, USA, MARCH 19-21, 2006, PISCATAWAY, NJ, USA, IEEE, 19 March 2006 (2006-03-19), pages 100-111, XP010910198, ISBN: 978-1-4244-0186-4 *
WU-CHUN FENG AND CHUNG-HSING HSU: "Green Destiny and its Evolving Parts", 19TH INTERNATIONAL SUPERCOMPUTER CONFERENCE, [Online] June 2004 (2004-06), XP002505673, Heidelberg, Germany. Retrieved from the Internet: <URL:http://public.lanl.gov/radiant/pubs.html> [retrieved on 2008-11-26] *

Also Published As

Publication number Publication date
US8161314B2 (en) 2012-04-17
JP2010524103A (en) 2010-07-15
CN101652737A (en) 2010-02-17
TW200907631A (en) 2009-02-16
TWI417700B (en) 2013-12-01
WO2008125509A3 (en) 2009-01-22
KR20100003727A (en) 2010-01-11
JP5306319B2 (en) 2013-10-02
CN101652737B (en) 2013-04-03
US20080256381A1 (en) 2008-10-16

Similar Documents

Publication Publication Date Title
US8161314B2 (en) Method and system for analog frequency clocking in processor cores
US7917799B2 (en) Method and system for digital frequency clocking in processor cores
US7945804B2 (en) Methods and systems for digitally controlled multi-frequency clocking of multi-core processors
US5481573A (en) Synchronous clock distribution system
US10007293B2 (en) Clock distribution network for multi-frequency multi-processor systems
US5572557A (en) Semiconductor integrated circuit device including PLL circuit
US8604852B1 (en) Noise suppression using an asymmetric frequency-locked loop
US8269544B2 (en) Power-supply noise suppression using a frequency-locked loop
US20080174347A1 (en) Clock synchronization system and semiconductor integrated circuit
CN117075683A (en) Clock gating component, multiplexer component and frequency dividing component
KR100681287B1 (en) System clock distributing apparatus and system clock distributing method
US7501865B1 (en) Methods and systems for a digital frequency locked loop for multi-frequency clocking of a multi-core processor
KR20020045691A (en) Apparatus for controling a frequency of bus clock in portable computer
US11558059B2 (en) Concept for a digital controlled loop and a digital loop filter
CN112671403A (en) Clock frequency division system, method and equipment
JP4156529B2 (en) Selectable clocking architecture
WO1996035177A1 (en) A modular system utilizing interchangeable processor cards
KR100278284B1 (en) Method and apparatus for minimizing clock skew using synchronous bus clock and programmable interface
Włostowski et al. White Rabbit and MTCA. 4 use in the LLRF upgrade for CERN’s SPS
CN214586628U (en) Multi-path clock output circuit, circuit board and CT scanner
CN116015279A (en) Clock configuration method, device, equipment and medium of programmable logic device
CN210129122U (en) FPGA accelerator card online clock configuration device
JP4553428B2 (en) Phase-locked loop (PLL) clock generator with programmable offset and frequency
Xiu Spectrally Pure Clock versus Flexible Clock: Which One Is More Efficient in Driving Future Electronic Systems?
CN117273159A (en) Quantum state driving signal generator, device and quantum computer system

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase (Ref document number: 200880011570.3; Country of ref document: CN)
WWE WIPO information: entry into national phase (Ref document number: 1020097014447; Country of ref document: KR)
ENP Entry into the national phase (Ref document number: 2010502492; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 08749515; Country of ref document: EP; Kind code of ref document: A2)
122 EP: PCT application non-entry in European phase (Ref document number: 08749515; Country of ref document: EP; Kind code of ref document: A2)