Energy Efficient Computing Power Management System on the Nehalem Family of Microprocessors
FIVR – Fully Integrated Voltage Regulators on 4th
Generation Intel® Core™ SoCs
Edward A. Burton, Gerhard Schrom, Fabrice Paillet,
Jonathan Douglas
CCDO
Intel Corporation
Hillsboro, OR, USA
William J. Lambert, Kaladhar Radhakrishnan,
Michael J. Hill
ATTD
Intel Corporation
Chandler, AZ, USA
Abstract—Intel's® 4th generation Core™ microprocessors are powered by Fully Integrated Voltage Regulators (FIVR). These 140 MHz multi-phase buck regulators are integrated into the 22nm processor die, and feature up to 80 MHz unity gain bandwidth, non-magnetic package trace inductors and on-die MIM capacitors. FIVRs are highly configurable, allowing them to power a wide range of products from 3W fanless tablets to 300W servers. FIVR helps enable 50% or more battery life improvements for mobile products and more than doubles the peak power available for burst workloads.
I. INTRODUCTION
Intel's® 4th generation Core™ microprocessors (code name Haswell) are powered by Fully Integrated Voltage Regulators (FIVR), the industry's first large scale deployment of high current switching regulators integrated into a VLSI die and package. An overview of the scheme is given in Fig. 1(a). A first stage VR, which is on the motherboard, converts from the PSU or battery voltage (12-20V) to approximately 1.8V, which is distributed across the microprocessor die. The second conversion stage comprises between 8 and 31 (depending on the product) FIVRs, which are 140MHz synchronous multiphase buck converters with up to 16 phases. A simplified schematic for a two phase FIVR domain is shown in Fig. 1(b). The power FETs, control circuitry and high frequency decoupling are on the die, while the inductors and mid-frequency input decoupling capacitors are placed on the package. Each FIVR is independently programmable to achieve optimal operation given the requirements of the domain it is powering. The settings are optimized by the Power Control Unit (PCU), which specifies the input voltage, output voltage, number of operating phases, and a variety of other settings to minimize the total power consumption of the die.
FIVR is the enabling technology behind key improvements for Intel's® 4th generation Core™ microprocessors including a 50% or more increase in battery life for mobile products, and a 2-3x increase in peak available power (which converts into burst performance). The motherboard voltage regulators eliminated by FIVR free up space that can be used to add platform features or reduce platform dimensions. Details are discussed in Section V.
A. Background
Intel's® 2008 microprocessor microarchitecture introduced the Power Control Unit (PCU) [1], a microcontroller that monitored conditions across the die in real time, and dynamically adjusted a variety of settings to optimally manage power consumption and performance. Among the most important features controlled by the PCU were newly added high current power gates, which provided a significant improvement in CPU energy efficiency by eliminating large leakage losses on idle compute domains. The power gates on high current domains were introduced by adding gate transistors into thin 'cracks' between major functional blocks and represented a very small percent of the total die area. The die bumps required to support the high current domains required a much larger area than the gate transistors themselves. The large "bump area" posed a hard barrier to productization until a scheme was devised to "borrow" bumps from surrounding circuitry using a thick, low loss routing layer. An extension of this scheme makes FIVR affordable.
A limitation of power gates is that all active domains still operate at the highest voltage required by any individual domain. To create separate voltage domains an entirely new regulator must be added to the motherboard, which adds cost, increases area, and requires extra package pins. An improvement suggested by recent research is the integration of high frequency buck regulators directly on the microprocessor package, or in the die itself [2] [3] [4] [5]. This allows a much larger number of independent power domains, each managed dynamically to match the local computational demand. For example, this would allow one CPU core to run at an elevated voltage and frequency to satisfy a heavy computational load, while other cores execute lower priority code at a much lower voltage and frequency to save power.
Each of the previous works cited had at least one issue making it poorly suited for broad, high volume deployment. The multi-chip approach taken by [2] resulted in an effective current density of 1.3A/mm² that would make it expensive to implement because of the silicon area required to support a full power microprocessor product. In [3] another example of the multi-chip approach is demonstrated. This work, which used a 90nm process and inductors integrated onto the die, achieved a higher current density (8A/mm²) but reported a relatively low efficiency of 76% (compared to 85% for 3.3V to 1.0V conversion in [2]). Instead of the multi-chip approach, the authors of [4] integrated the regulator directly into the die on a 45nm process, but still suffered from relatively low current density (1.7A/mm²). The authors also report an efficiency of 83% for 1.5V to 1.0V conversion due in part to the quality of the discrete inductors that were used.
FIVR builds on the VR designs presented in [2], while the implementation strategy that makes FIVR affordable is an extension of the bump "borrowing" scheme developed for the high current power gates [1].
B. Motivation
This paper will show that FIVR addresses the issues in prior work that prevented broad deployment of integrated switching regulators in high volume products. Extending the earlier bump borrowing scheme yields the same current density increase, and corresponding cost decrease, that first made power gates affordable. Improvements in the inductors and transistors yield efficiency in the 90% range for typical high power workloads. The high unity gain frequency (up to 80MHz) allows FIVR to work with only on-die MIM for output capacitance.
While these advancements are necessary, they're insufficient to make a reasonable business case for FIVR. To pay the costs of developing and fabricating FIVR, it's important to quantify the actual customer-visible benefits provided in a real implementation. At the start of the design FIVR's expected benefits fell into half a dozen categories. FIVR delivered material benefits in every category, and some benefits were far larger than expected. The benefit categories were: battery life increase, increased available power (for increased burst performance), decreased power required for a given level of performance (or almost equivalently, increased performance for a given power consumed), decreased platform cost and size, improved product flexibility and scalability. See Section V for the detailed FIVR impact.
II. IMPLEMENTATION, DESIGN, AND SIMULATION
A. Circuitry
A block diagram representing the circuitry for a single FIVR domain is shown in Fig. 2. The buck regulator bridges are formed by replacing the power gates in previous products with NMOS and PMOS cascode power switches. The cascode configuration allows the power switches to be implemented with standard 22nm logic devices while still handling an input voltage of up to 1.8VDC [2]. This avoids the cost of extra processing steps for high voltage devices, while achieving excellent switching characteristics. The bridge drivers are controlled through high-voltage level-shifters and support ZVS (zero-voltage switching) and ZCS (zero-current switching) soft-switching operation. The gates of the cascode devices are connected to the "half-rail", Vccdrvn, regulated to Vin/2. This is also the negative supply of the PMOS bridge driver as well as the positive supply of the NMOS bridge driver.
The area occupied by the power switches and drivers is small, so they are distributed across the die, immediately above the connection to their associated package inductor, which minimizes routing losses. This is illustrated in Fig. 3(a), which shows the location of the package inductors under the die for a four core LGA part. The driver circuitry is interleaved with the power switches in an array which minimizes parasitics to allow for very high switching frequencies. This also allows the size of the bridge to be easily scaled based on the current requirement and optimization points for each power domain.
Figure 1. (a) Representative partitioning of the separate high current power domains on a 4th generation Core™ microprocessor. (b) Simplified schematic of a single FIVR domain, showing the partitioning of the components between the die and the package.
Each FIVR domain is controlled by a FIVR Control Module (FCM). The FCM contains the circuitry for generating the PWM signals using double-edge modulation, as indicated in Fig. 2 by the dashed box. Separate circuitry not shown in Fig. 2 manages phase current balancing, and the resulting digital PWM signals are distributed from the FCM to individual bridges. The PWM frequency, PWM gain, phase activation, and the angle of each phase are all programmable in fine increments to enable optimal efficiency and minimum voltage ripple across a range of different operating points. Spread-spectrum is used for EMI and RFI (Radio Frequency Interference) control.
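As a rough illustration of the programmable phase angles, the sketch below spaces the active phases evenly over one switching period and reports the ideal output ripple frequency. Even spacing is the conventional choice for interleaved buck converters and is an assumption here, since the text only states that the angles are programmable; the function names are hypothetical.

```python
def interleave_angles(active_phases: int) -> list[float]:
    """Evenly spaced phase angles in degrees (assumed policy, not from the paper)."""
    if active_phases < 1:
        raise ValueError("at least one phase must be active")
    return [360.0 * k / active_phases for k in range(active_phases)]


def ideal_ripple_frequency_mhz(active_phases: int, f_sw_mhz: float = 140.0) -> float:
    """With ideal even interleaving, output ripple appears at N times the switching frequency."""
    if active_phases < 1:
        raise ValueError("at least one phase must be active")
    return active_phases * f_sw_mhz


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, interleave_angles(n), ideal_ripple_frequency_mhz(n))
```

When phases are shed, the remaining angles would need to be reprogrammed to keep the spacing even, which is one reason fine-grained angle control is useful.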
The FCM module also contains the feedback control circuitry (compensator). A high-precision 9-bit DAC generates a reference voltage for a programmable, high bandwidth analog fully differential type-3 compensator. Sense lines feed the output voltage back to the compensator. The endpoint of these sense lines is strategically placed to achieve minimum DC error and optimal transient response at an important circuit location in the domain. The compensator is programmed individually for each voltage domain based on its output filter, and can be reprogrammed while the domain is active to maintain optimal transient response as phase shedding occurs.
The key to making FIVR affordable was integrating the power devices directly into the microprocessor die. As with the power gate circuitry discussed in the introduction, the power switching circuitry for FIVR can be placed in small areas between major circuit blocks. The lower current handling of the die bumps makes the die bump area requirements for FIVR much larger than the actual die area required. Since FIVR is integrated into the microprocessor die, routing on the thick metal die layer permits extra bumps to be 'borrowed' from areas over other circuits, which avoids wasting any excess die area due to bump current limits. This makes the effective current density of FIVR 31A/mm², a 24x increase over the bump-limited 1.3A/mm² reported in [2].
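The current density comparison can be checked with a few lines of arithmetic. The sketch below uses only the densities quoted in this paper for FIVR and for [2], [3] and [4]; the 30A load is an illustrative value borrowed from the IA core domain mentioned in Section III, and the helper names are hypothetical.

```python
FIVR_DENSITY_A_PER_MM2 = 31.0
PRIOR_DENSITY_A_PER_MM2 = {"[2]": 1.3, "[3]": 8.0, "[4]": 1.7}


def density_gain(new: float, old: float) -> float:
    """Ratio of two effective current densities."""
    return new / old


def area_for_load_mm2(load_a: float, density_a_per_mm2: float) -> float:
    """Regulator area needed to deliver load_a at a given effective density."""
    return load_a / density_a_per_mm2


if __name__ == "__main__":
    for ref, density in PRIOR_DENSITY_A_PER_MM2.items():
        gain = density_gain(FIVR_DENSITY_A_PER_MM2, density)
        print(f"FIVR vs {ref}: {gain:.1f}x")
    # Illustrative 30A domain: area at FIVR density vs. the density of [2].
    print(area_for_load_mm2(30.0, FIVR_DENSITY_A_PER_MM2))
    print(area_for_load_mm2(30.0, PRIOR_DENSITY_A_PER_MM2["[2]"]))
```

The ratio against [2] comes out just under 24, matching the rounded 24x figure in the text.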
B. Passives
In order to keep the buck output filter small enough to fit on the die and package it is necessary for FIVR to switch at a high frequency (140 MHz in most cases). This allows the buck output filter inductors to be implemented using only the bottom metal layers of a standard flip-chip package. Power routing is constrained to the top layers of the package as a result, but the proximity of the inductors to the load ensures that minimal power is dissipated on these layers. The inductors are non-magnetic, i.e. Air Core Inductors (ACI). A representative ACI from an 8-phase domain of a product with an LGA package is shown in Fig. 3(b), including the connection points to the power switches, the DC current path through the inductor, and the connection of the inductor to the output plane. Package design rules allow the ACIs to be placed in close proximity with one another. On a representative LGA package with four CPU cores, this allowed 59 inductors on 10 different voltage rails to be implemented in a 20mm x 8mm area. The package implementation also allows inductor designs to be customized on a per rail basis to meet efficiency, ripple, and transient response requirements.
Figure 2. Simplified block diagram of the circuitry for a single representative FIVR domain
Decoupling for the input rail is provided by a combination of ceramic package capacitors and on-die MIM capacitors [6]. The on-package ceramic capacitors keep the output impedance of the input rail low from approximately 1 MHz to their self-resonance around 20 MHz. The MIM capacitors are on the die along with the power circuitry and provide high frequency decoupling, including at the switching frequency and its harmonics. Decoupling for the output rails is provided primarily by the MIM capacitors, which are sufficient to provide good transient response if wide bandwidth feedback control is used (see the results section). In some cases the MIM capacitors are supplemented with extra package ceramic capacitors. The comparatively low self-resonant frequency of the ceramic capacitors complicates the control loop design and does little to attenuate voltage ripple, but the ceramic capacitors can provide a net transient response benefit if they are robustly connected to the output power plane.
C. System Control
In order to minimize losses from FIVR, a modified version of the PCU [1] dynamically configures each FCM based on the current activity level of the domain. The PCU turns each rail on or off based on activity, and specifies an output voltage target to support the desired frequency. It also optimizes the settings discussed in Section II.A for the anticipated operating conditions. These settings include the number of active phases (i.e. phase shedding to improve light load performance), the compensator settings (to maintain optimal transient response as the number of phases changes), and the timing of switch drivers (to ensure zero voltage switching at light loads). This allows each FIVR domain to operate at near peak efficiency across a wide range of load conditions from retention to Turbo. An example of the benefit this provides is shown in Section IV.A.
III. CHARACTERIZATION AND PERFORMANCE TESTING
Validating and optimizing a voltage regulator requires the measurement of key parameters such as voltage ripple, efficiency, power supply rejection ratio, transient response, and control loop stability margins. Due to FIVR's high level of integration and fast switching frequency many of these standard measurements are difficult or impossible to perform using off the shelf test equipment. For example, an IA Core voltage domain powered by a FIVR capable of supplying over 30A occupies less than 15mm² on the package, which is completely covered by the microprocessor die. This renders the attachment of an external load for a full current efficiency measurement impossible. This section describes some key Design For Test (DFT) features that are included in FIVR to allow accurate characterization.
A. Control Loop Transfer Function
To enable characterization of the control, a high frequency programmable signal generator is placed in the feedback network on every FIVR domain. The signal generator is activated in a test mode to inject a known, synchronized signal into the on-die feedback loop. By controlling the test feature and monitoring the output voltage on the package with an oscilloscope, the response of the control loop is directly measured. Repeated measurements are used to tune the compensator to achieve fast response and good stability margins.
B. Load Transient Response and Rejection Ratio
Microprocessors require a nearly constant DC voltage in the presence of large load transients. For characterization purposes, however, it is difficult to create a well behaved step load using only the execution of code in normal operation. The authors in [7] instead use a scheme called Integrated Frequency Domain Impedance Meter (IFDIM), which gates the clock network on and off at a fixed frequency, creating a large load transient. The frequency is programmable, the magnitude of the transient can be precisely calibrated using DC measurements, and the load step is known to occur within 1 clock cycle, so the microprocessor itself is effectively turned into an alternating current load. This feature is included on every FIVR domain, allowing the transient response to a known load to be measured. FIVR domains are characterized across a wide frequency range at multiple operating points for both output impedance and output load coupling across domains.
C. Efficiency
As was previously mentioned, the level of integration makes it impossible to connect a high current load directly to the output of a FIVR rail. An additional complication is that the circuitry on the die cannot be disconnected from the FIVR output, so whenever FIVR is powered some extra current draw due to leakage results. This required the development of a new technique for accurately measuring efficiency. A brief summary of the method is given here. First, a procedure using an external low current load and FIVR operating in a test mode is used to calibrate the leakage. The completely ungated clock tree is then operated at varying frequencies to create a large effective adjustable DC output current. An iterative series of measurements is then used to precisely calibrate the total current drawn by the clocks and the leakage, which allows the efficiency to be measured, when combined with conventional measurements of the voltage and the current at the output of the first stage regulator.
Figure 3. (a) The bottom of an Intel® 4th generation Core™ microprocessor LGA package is shown along with a picture of the corresponding die. A group of 8 FIVR inductors is pulled off to the side. (b) An enlarged 3D view of two FIVR inductors is shown with current flow arrows.
IV. RESULTS
Due to the high switching frequency used, the performance
of FIVR is sensitive to the layout of the die and the package
which includes the inductors. Each combination of die and
package is individually optimized and validated. The following
section contains some key validation results from an Intel® 4th
generation Core™ microprocessor with four microprocessor
cores on an LGA package.
A. Measurements
Fig. 4(a) shows the voltage ripple for a low noise domain measured under the die near the connection of the ACI to the power plane. The measurement was averaged 128 times against the PWM clock (with spread spectrum clocking turned off). To achieve an accurate measurement, a controlled impedance differential sense line was routed on the package from the measurement location to a probe connection point with a matched termination. An active differential oscilloscope probe was then connected to the probe point. This ensures an accurate, wide bandwidth measurement is achieved, as opposed to probing the package power planes directly or using a single ended sense line, which can substantially attenuate the measurement over a distance as short as a few millimeters. In two phase operation less than 4mV (less than 1% of the voltage set point) of ripple is achieved with a rail driven by air core inductors well under 2mm² in area.
Fig. 4(b) shows the efficiency as measured using the procedure in Section III.C for 1.70V to 1.05V conversion with the bridges configured for hard switching. The efficiency measurement has been repeated for varying numbers of phases, in each case showing a peak efficiency of approximately 90% at 0.75A/phase. By employing a phase shedding scheme it is possible to keep the efficiency of the domain within a few percent of the peak efficiency of the domain from 1A to 15A. This is managed by the PCU, which can phase shed when the efficiency can be improved, but also has the intelligence to avoid phase shedding when it could be problematic, for example when a large load transient is possible.
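A minimal sketch of a phase shedding policy follows. It assumes the controller simply picks the phase count whose per-phase current is nearest the 0.75A/phase efficiency peak quoted above, and keeps every phase active when a large transient is expected. The actual PCU algorithm is not described in this paper; the function name, the 16-phase limit for this domain, and the `transient_expected` flag are hypothetical.

```python
import math

PEAK_EFFICIENCY_A_PER_PHASE = 0.75  # per-phase current at peak efficiency (from the text)


def phases_for_load(load_a: float, max_phases: int = 16,
                    transient_expected: bool = False) -> int:
    """Pick an active phase count for a given load current (illustrative policy)."""
    if load_a < 0:
        raise ValueError("load current must be non-negative")
    if transient_expected:
        # Avoid shedding when a large load step may arrive.
        return max_phases
    ideal = load_a / PEAK_EFFICIENCY_A_PER_PHASE
    nearest = math.floor(ideal + 0.5)  # round half up
    return min(max(nearest, 1), max_phases)


if __name__ == "__main__":
    for amps in (1.0, 3.0, 6.0, 15.0):
        print(amps, phases_for_load(amps))
```

A production policy would more likely use characterized efficiency curves per phase count, plus hysteresis to avoid toggling phases on small load changes.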
Figure 4. (a) Measured voltage ripple for a low noise domain for single phase and 2 phase operation (128 averages) (b) Measured efficiency for a voltage domain as a function of the number of active phases
Figure 5. (a) Measured voltage droop on a graphics domain in response to an 8.5A step load (b) Comparison of the effective impedance profile for the graphics voltage domain on a 3rd generation Core™ microprocessor versus a 4th generation Core™ microprocessor
The measured output voltage (averaged 128 times) during an 8.5A load step generated by the IFDIM feature on the graphics voltage rail is shown in Fig. 5(a). The measurement was performed with a similar probing configuration to that used for the voltage ripple measurement. The combination of a high bandwidth feedback loop and on-die decoupling keeps the voltage droop under 50mV despite a rise time for the current step of under 1ns (orders of magnitude faster than normal graphics circuit behavior). The main droop event lasts under 30ns, and the DC voltage is restored within 100ns. A nonlinear control feature saturates the duty cycle when a large transient is detected (not active in Fig. 5(a)). The feature was found to provide up to a 25% reduction in voltage droop for step loads, but the benefit is significantly reduced for certain aperiodic load patterns. The effective output impedance profile for the same rail is shown in Fig. 5(b). The peak impedance demonstrates the fast bandwidth of the compensator. Because the inductors are located immediately beneath the actual area of the die that consumes current, the DC and low frequency load line is nearly zero. The figure also shows the impedance profile for an Intel® 3rd generation Core™ microprocessor graphics rail, which is powered by a platform VR. For this rail, a DC load line is required due to the distance between the VR and the die. Several resonant peaks occur from the various stages of decoupling capacitors on the motherboard and package, and the parasitic inductance between them and the actual point of current consumption on the die.
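The droop figures above imply a bound on the effective transient impedance of the rail. The short sketch below divides the quoted droop bound by the quoted load step; treating droop divided by step current as a single effective impedance is a simplification, and the helper name is hypothetical.

```python
def effective_impedance_mohm(droop_mv: float, step_a: float) -> float:
    """Effective transient impedance in milliohms: droop (mV) / load step (A)."""
    if step_a <= 0:
        raise ValueError("load step must be positive")
    return droop_mv / step_a


if __name__ == "__main__":
    # Droop stays under 50mV for an 8.5A step, so this is an upper bound.
    bound = effective_impedance_mohm(50.0, 8.5)
    print(f"effective impedance below about {bound:.2f} milliohm")
```

Since the measured droop is below 50mV, the result is an upper bound of a little under 6 milliohms for this step.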
Fig. 6 shows the open loop transfer function for a FIVR domain, measured using the signal generator DFT. This rail demonstrates a unity gain bandwidth of 78 MHz while still maintaining 40° phase margin. Robust compensator circuit design and very small propagation delays were necessary to achieve this bandwidth, which, in turn, was required to maintain good transient response on rails with limited output capacitance. The high bandwidth also enables fast voltage transitions. A FIVR rail turning on and turning off is shown in Fig. 7. Both transitions are programmed to about half a microsecond for a full range transition, two orders of magnitude faster than a typical platform-based solution. The fast ramp rate translates into power savings for the system, as the voltage rail can be turned on, used, and turned off again almost instantly.
A large number of additional measurements are taken for validation purposes that are not shown here due to space constraints. These include the output impedance of the Vin rail, audio susceptibility measurements, and the coupling noise due to load transients from one rail to another (particularly from very high current domains such as core and graphics to low current system agent domains). EMI/RFI characterization is also performed.
B. Comparison to Previous Work
Table I contains a comparison to previous works discussed in the introduction. FIVR operates at a higher switching frequency than previous works, which is enabled in part by very good gate charge characteristics for the MOSFETs. This allows up to 90% efficiency at a common conversion ratio.
Figure 6. The measured open loop gain and phase of a FIVR showing 78MHz bandwidth with 40° phase margin
Figure 7. A FIVR rail ramping to its voltage set point from fully off, and then turning off again. Voltage transition times are programmable, but typically set for half a microsecond for a 1V transition.
V. FIVR IMPACT TO PRODUCTS
Battery life improvement: Sufficient battery life for a complete work day has long been desired from mobile products. FIVR, combined with power management architecture improvements, has enabled this for Intel® 4th generation Core™ products. Increases of well over 50% have been widely reported (for example, [8] and [9]). FIVR's battery life benefit comes by several means:
- Standby current historically consumes a large fraction of the battery's stored energy. FIVR's fast bandwidth allows low frequency supply noise to be rejected, resulting in up to a 90% reduction in decoupling requirements. This allows both the first and second stages of regulation to be power cycled much faster than on previous products, enabling new deep sleep states with up to 20x lower standby power. With the lowered capacitance, the power expended and time wasted entering and exiting the states are similarly reduced. Reduced sleep-state entry/exit time also saves power by increasing sleep-state usage.
- FIVR's fast control loop and integration into the package result in one tenth the peak impedance of prior solutions (see Fig. 5(b)) in the sub-MHz stimulus range most relevant to the graphics architecture. The resulting low frequency supply noise reduction improves power at a given performance by up to 30%.
- FIVR increases the number of voltage rails, allowing each domain to be set at the minimum possible voltage that supports error-free operation, reducing both leakage and dynamic power.
- Replacing multiple high current voltage regulators on the motherboard with a single first stage regulator reduces the PCB footprint of the power delivery solution. This extra space can be used for a larger battery, with some examples demonstrating up to 10% growth.
- Trimming FIVR together with the microprocessor removes manufacturing guard-bands normally required to ensure that every VR will work with every CPU.
Increased available peak power: An illustrative example shows how FIVR can increase the peak power available to the microprocessor. A typical mobile processor platform using the prior generation power delivery scheme had two 30A, 1.1V VRs providing 33W for cores and 33W for graphics. Using the same power FETs and inductors for the FIVR's 1.8V input VR, the part has a 108W power rail (30A/phase * 2 phases * 1.8V), which can be dynamically allocated to a combination of FIVRs by the PCU. For core-only workloads, nearly the entire 108W can be allocated for the cores, increasing the available power ceiling by 3x. For graphics workloads, 36W can be partitioned to the cores with the remaining 72W going to graphics, more than double the power available from the 33W platform VR. Because power consumption scales as CV²F and frequency scales with voltage, the increase in available power could be used to operate the graphics at up to 26% higher frequency than possible with the platform VR. A similar calculation yields a 44% higher core frequency in the core-only scenario. The duration of these scenarios is limited by the thermal capabilities of the platform, but translates into improved speed in many real scenarios.
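The arithmetic behind these figures can be sketched as follows. If power scales as CV²F and frequency scales linearly with voltage, power scales as F³, so the frequency gain is the cube root of the power ratio. The quoted 26% and 44% correspond to the round power ratios of 2x and 3x used in the text; starting from the 33W platform value instead would give somewhat higher numbers (roughly 30% and 48%). The function names are hypothetical.

```python
def rail_power_w(amps_per_phase: float, phases: int, volts: float) -> float:
    """Power available from a multiphase rail: current per phase * phases * voltage."""
    return amps_per_phase * phases * volts


def frequency_gain(power_ratio: float) -> float:
    """Frequency ratio for a given power ratio, assuming P ~ C*V^2*F and F ~ V (so P ~ F^3)."""
    if power_ratio <= 0:
        raise ValueError("power ratio must be positive")
    return power_ratio ** (1.0 / 3.0)


if __name__ == "__main__":
    print(rail_power_w(30.0, 1, 1.1))   # prior platform VR, one 30A rail at 1.1V
    print(rail_power_w(30.0, 2, 1.8))   # FIVR input rail, two 30A phases at 1.8V
    for ratio in (2.0, 3.0):
        pct = (frequency_gain(ratio) - 1.0) * 100.0
        print(f"{ratio:.0f}x power -> about {pct:.0f}% higher frequency")
```

The cube-root relationship is why a doubling or tripling of available power yields a much smaller, though still significant, frequency increase.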
Decreased power at a given performance: Intel's® Iris™ Pro graphics uses FIVR's higher available power to deliver high end graphics. FIVR's high unity gain bandwidth presents less than a tenth the peak output impedance provided by the prior generation's platform VR in the sub-MHz range important to the graphics load (see Fig. 5(b)). Because FIVR has doubled the graphics power ceiling, few (if any) of our shipping parts would fit within the bounds of the older generation's platform. The higher currents typically imply hundreds of millivolts of droop on the older platforms. The combination of high currents with high impedance peaks yields a hypothetical power tax in the 20-30% range (assuming one could, and actually would, ship these high current levels into the old platforms). FIVR avoids that tax.
Improved product flexibility and scalability: FIVR's ability to add voltage rails onto a common shared input rail without package growth or even platform changes brings significant flexibility and modularity into the design space that was not available before. New voltage rails can be added as needed, without any platform change. This ability allowed us to introduce the Iris™ Pro graphics into standard platforms even though new rails were needed for the EDRAM and its high speed OPIO link.
TABLE I. COMPARISON OF FIVR TO PREVIOUSLY REPORTED INTEGRATED VOLTAGE REGULATORS
G. Schrom et al., 2010 [2]
T. DiBene et al., 2010 [3]
N. Sturcken et al., 2012 [4]
Total Output Imax capability
Limited by first stage and thermals (Up to 400 A)
Limited by first stage and thermals (Up to 700 A)
Integrated into network die
Package trace, & magnetic discrete
Magnetic thin-film on VR die
Discrete wire-wound air core
second array of package trace
a. MCM (Multi Chip Module): the active circuitry is on a separate die assembled on the same package
Platform size and cost reduction: Since four platform VR controllers are eliminated, along with the associated decoupling caps, power FETs and inductors, there's a clear platform size and cost advantage. FIVR's total platform BOM cost reduction is expected to be several billion dollars over the product lifetime. The power inductors feeding power to the CPU often show up in the critical thickness cross-section of small form-factor laptops and tablets, and trading FIVR's total component count reduction for a thickness reduction is straightforward. A platform phase count increase results in lower current per phase, and the lower phase current can be satisfied with a lower profile set of inductors.
In the prior generation platforms, some dual-sided PCBs
have tall components like ICs and inductors on the primary side
and lower profile discrete components on the secondary side.
Often the secondary-side components are mostly high
frequency decoupling, located immediately underneath front-
side ICs, with sparsely populated areas between. In such cases,
FIVR eliminates most of the secondary-side components, and
frees up space on the primary side to accommodate the rest. The
resulting populated PCB thickness is reduced by the height of
the tallest removed backside components.
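The thickness argument above can be sketched as simple arithmetic. The model below is an assumption made for illustration: populated thickness is taken as the bare board thickness plus the tallest component on each side, and any parts relocated to the primary side are assumed to be no taller than what is already there. All dimensions are hypothetical and are not taken from the paper.

```python
def populated_thickness_mm(board_mm, primary_heights_mm, secondary_heights_mm):
    """Board thickness plus the tallest part on each side (0 for an empty side)."""
    top = max(primary_heights_mm, default=0.0)
    bottom = max(secondary_heights_mm, default=0.0)
    return board_mm + top + bottom


if __name__ == "__main__":
    board = 0.8                      # hypothetical bare PCB thickness, mm
    primary = [1.2, 1.0]             # hypothetical ICs and inductors, mm
    secondary = [0.5, 0.3]           # hypothetical decoupling parts, mm

    before = populated_thickness_mm(board, primary, secondary)
    # Remaining low-profile parts move to the primary side; the secondary side is empty.
    after = populated_thickness_mm(board, primary + [0.3], [])
    print(f"before: {before:.1f} mm, after: {after:.1f} mm, saved: {before - after:.1f} mm")
```

In this simplified model the saving equals the height of the tallest component removed from the secondary side, matching the statement in the text.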
In small systems, the platform size tends to limit the feature
set, leading to fewer connectivity options, smaller storage
space, etc. FIVR's platform size reductions can provide more
space to implement these features.
VI. CONCLUSION
Consumers expect every generation of mobile computer
products to have more compute power, thinner and lighter form
factors, and longer battery life than the last. The 4th generation
Intel® Core™ power architecture using FIVR provides
improvements in all three of these areas. To the authors'
knowledge, this is the first consumer product to make use of
integrated switching regulators on this scale. Furthermore,
FIVR's performance is improved versus previously reported
prototypes. The authors therefore feel that FIVR is an important
advancement in the field of power electronics.
ACKNOWLEDGMENT
We would like to acknowledge the following FIVR team
members who were not already listed as authors: the FIVR
silicon design team including George Geannopoulos, Keith
Hodgson, Narayanan Raghuraman, Alex Lyakhov, Michael W
Rogers, Ravi Vunnam, Lan D Vu, Mark S Milshtein, Chiu
Keung Tang, Hong Yun Tan, Seh Leon Goh, Samie Samaan,
Narayanan Natarajan, Rajan Vijayaraghavan, Ashish Khanna
and Munish Chauhan; top-level integration by Pankaj Aswal;
modeling and implementation development by Doug Huard and
Alex Waizman; layout studies, inductor development and
package designs by John Smith, Brad Larson, and Huong Do;
mask design by Neal Tanksley and Galyna Burenkova; test and
manufacturing support from RJ Hayes and Arun
Krishnamoorthy; modeling support by Alex Levin and Anne
Augustine.
REFERENCES
[1] S. Gunther, A. Deval, T. Burton and R. Kumar, "Energy-Efficient
Computing: Power Management System on the Nehalem Family of
Microprocessors," Intel Technology Journal, vol. 14, no. 3, pp. 50-65,
2010.
[2] F. Paillet, G. Schrom and J. Hahn, "A 60MHz 50W Fine-Grain Package-
Integrated VR Powering a CPU from 3.3V," in Advanced Power
Electronics Conference, Palm Springs, CA, 2010.
[3] J. T. Dibene, et al., "A 400 Amp fully integrated silicon voltage
regulator with in-die magnetically coupled embedded inductors," in
Advanced Power Electronics Conference, Palm Springs, CA, 2010.
[4] N. Sturcken, et al., "A switched-inductor integrated voltage
regulator with nonlinear feedback and network-on-chip load in 45nm
SOI," IEEE Journal of Solid-State Circuits, vol. 47, no. 8, August 2012.
[5] G. Schrom, et al., "A 100 MHz Eight-Phase Buck Converter Delivering
12 A in 25 mm^2 Using Air-Core Inductors," in Proc. 22nd Annu. IEEE
Applied Power Electronics Conf., 2007.
[6] C. Auth, et al., "A 22nm High Performance and Low-Power CMOS
Technology Featuring Fully-Depleted Tri-Gate Transistors, Self-
Aligned Contacts, and High Density MIM Capacitors," in 2012
Symposium on VLSI Technology, Honolulu, HI, 2012.
[7] A. Waizman, M. Livshitz and M. Sotman, "Integrated Power Supply
Frequency Domain Impedance Meter (IFDIM)," in 13th Conference on
Electrical Performance of Electronic Packaging, Portland, OR, 2004.
[8] A. L. Shimpi, "Is Haswell Ready for Tablet Duty? Battery Life of
Haswell ULT vs Modern ARM Tablets," 22 July 2013. [Online].
Available: http://www.anandtech.com/show/7117/haswell-ult-investigation.
[Accessed 15 November 2013].
[9] R. Baldwin, "How the Haswell Chip Makes the New MacBook Air Last
12 Hours," 10 June 2013. [Online]. Available:
http://www.wired.com/gadgetlab/2013/06/haswell-mba/. [Accessed 15
November 2013].
Source: https://www.researchgate.net/publication/271416878_FIVR_-_Fully_integrated_voltage_regulators_on_4th_generation_IntelR_Core_SoCs