LOW-POWER PROCESSOR DESIGN
`
`Ricardo E. Gonzalez
`
`Technical Report No. CSL-TR-97-726
`
`June 1997
`
This research has been supported by ARPA contract J-FBI-92-194.
`
`Computer Systems Laboratory
`Departments of Electrical Engineering and Computer Science
`Stanford University
`William Gates Computer Science Building, A-408
`Stanford, Ca 94305-9040
`<pubs@shasta.stanford.edu>
`
`Abstract
`
Power has become an important aspect in the design of general purpose processors.
This thesis explores how design tradeoffs affect the power and performance of the
processor. Scaling the technology is an attractive way to improve the energy efficiency
of the processor. In a scaled technology a processor would dissipate less power for the
same performance, or deliver higher performance for the same power. Some
micro-architectural changes, such as pipelining and caching, can significantly improve
efficiency. Unfortunately many other architectural tradeoffs leave efficiency
unchanged. This is because a large fraction of the energy is dissipated in essential
functions and is unaffected by the internal organization of the processor.
`
Another attractive technique for reducing power dissipation is scaling the supply and
threshold voltages. Unfortunately this makes the processor more sensitive to variations
in process and operating conditions. Design margins must increase to guarantee
operation, which reduces the efficiency of the processor. One way to shrink these
design margins is to use feedback control to regulate the supply and threshold
voltages. Adaptive techniques can also be used to dynamically
trade excess performance for lower power. This results in lower average power and
therefore longer battery life. Improvements are limited, however, by the energy
dissipation of the rest of the system.
`
`Key Words and Phrases: processor design, processor architecture, low-power
`CMOS circuits, supply and threshold scaling.
`
`Copyright © 1997
`
`by
`
`Ricardo E. Gonzalez
`
`Acknowledgments
`
`I would never have completed this document without the support, encouragement and
`guidance of many people. I cannot possibly list everyone that helped make my stay at
`Stanford a fulfilling experience; I will mention just a few.
`
`First and foremost I would like to thank Mark Horowitz, my principal advisor, for his
`support, guidance, and encouragement. Working with him for 7 years was a privilege. His
`ability to understand my ideas before I did and his continuous pursuit of knowledge were
`always a source of inspiration.
`
I would also like to thank Kunle Olukotun and Simon Wong for their support and guidance
`as members of my reading committee. I am also thankful to Bruce Wooley and
`Arogyaswami Paulraj for participating in my oral examination committee. Noe Lozano
`did much to convince me to pursue a Ph.D. and arranged the financial support for me to
`get started.
`
`My family also gave their unwavering support to this enterprise. My parents never
`stopped asking how my research was progressing. And, more important, never accepted
`“It’s going OK” for an answer. This dissertation is dedicated to them.
`
`My friends and colleagues made my life at Stanford very enjoyable. This document now
`before you is due to them. They encouraged me, helped me spend time away from my
`workstation, and were an unending source of interesting conversations.
`
Finally, I would like to thank all the members of the Stanford Cycling Club, especially our
`coach, Art Walker, for making the past few years of my life so painfully memorable.
`
This research was supported in part by the Advanced Research Projects Agency under
contract J-FBI-92-194, by the School of Engineering at Stanford, and by the Intel
Foundation.
`
`To my parents.
`
`Table of Contents
`
Chapter 1 Introduction

Chapter 2 Energy-Delay Product
2.1 Energy Dissipation in CMOS Circuits
2.2 Low-Power Metrics
2.3 Low-Power Design Techniques

Chapter 3 Micro-architectural Tradeoffs
3.1 Lower Bound on Energy and Energy-Delay
3.1.1 Simulation Methodology
3.1.2 Machine Models
3.1.3 Comparison of Energy and Energy-Delay Product
3.2 Processor Energy-Delay Product
3.2.1 Simulation Methodology
3.2.2 Machine Models
3.2.3 Energy Optimizations
3.2.4 Energy and Energy-Delay Results
3.3 Energy Breakdown
3.4 Memory Hierarchy Design
3.4.1 System architecture
3.4.2 Simulation Methodology
3.4.3 Energy and Performance Tradeoffs
3.5 Future Directions
3.6 Summary

Chapter 4 Supply and Threshold Scaling
4.1 Energy and Delay Model
4.2 Sleep Mode
4.3 Process and Operating Point Variations
4.4 Adaptive Techniques
4.5 Adaptive Power Management
4.6 Summary

Chapter 5 Conclusions
5.1 Future Work

Chapter 6 Bibliography

Appendix A Capacitance Estimation

Appendix B Memory Power Model
`
`List of Tables
`
Table 2.1: Current processors.
Table 3.1: Summary of SRAM characteristics.
Table 3.2: Summary of DRAM characteristics.
Table 4.1: Circuit element description.
Table 4.2: Process and circuit parameters for a 0.25μm technology.
Table 4.3: Additional process parameters for 0.25μm technology.
Table 4.4: Operating modes.
Table 4.5: Power breakdown categories.
Table B.1: Cache model parameters.
`
`List of Figures
`
Figure 1.1: Evolution of processor performance.
Figure 1.2: Evolution of processor power.
Figure 1.3: Performance and energy of processors.
Figure 2.1: CMOS inverter.
Figure 2.2: Performance-energy plane.
Figure 2.3: Variation in performance and energy with supply voltage.
Figure 2.4: Variation in performance and energy with transistor sizing.
Figure 2.5: EDP contours versus transistor size and supply voltage.
Figure 2.6: Scalar and super-scalar processors pipeline diagrams.
Figure 3.1: Basic processor operation.
Figure 3.2: Normalized performance and energy of idealized machines.
Figure 3.3: Reduction in energy from simple optimizations.
Figure 3.4: Normalized energy and performance for RISC and TORCH.
Figure 3.5: Energy breakdown for RISC and TORCH processors.
Figure 3.6: Architecture of processor system.
Figure 3.7: Energy breakdown for single level hierarchy.
Figure 3.8: Energy-delay product for single level hierarchy.
Figure 3.9: Energy-delay versus associativity for single level hierarchy.
Figure 3.10: Energy-delay versus line size for single level hierarchy.
Figure 3.11: Energy-delay for two-level on-chip cache hierarchy.
Figure 3.12: Energy breakdown for two-level hierarchy.
Figure 3.13: Comparison of three memory hierarchies.
Figure 4.1: Delay of circuit blocks divided by the delay of standard inverter.
Figure 4.2: EDP contours without velocity saturation.
Figure 4.3: EDP contours with velocity saturation.
Figure 4.4: EDP and performance contours with velocity saturation.
Figure 4.5: Ratio of leakage to total power.
Figure 4.6: Minimum time for threshold adjustment.
Figure 4.7: Variation in energy and delay.
Figure 4.8: EDP contours with uncertainty.
Figure 4.9: Ratio of EDP without and with uncertainty.
Figure 4.10: EDP contours using HSPICE models.
Figure 4.11: Energy and delay variations with operating conditions.
Figure 4.12: Power versus performance for fixed and variable supply.
Figure 4.13: Power breakdown for laptop system.
Figure B.1: Cache power model.
`
`Chapter 1
`
`Introduction
`
`In the past five years there has been an explosive growth in the demand for portable
`computation and communication devices, from portable telephones to sophisticated
`portable multimedia terminals [1]. This interest in portable devices has fueled the
`development of low-power signal processors and algorithms, as well as the development
`of low-power general purpose processors. In the digital signal processing area, the results
`of this attention to power are quite remarkable. Designers have been able to reduce the
`energy requirements of particular functions, such as video compression, by several orders
`of magnitude [2], [3]. This reduction has come as a result of focusing on the power
`dissipation at all levels of the design process, from algorithm design to the detailed
implementation. In the general purpose processor area, however, little work has been
done to understand how to design energy-efficient processors. This thesis is a start at
`bridging this gap and explores power and performance tradeoffs in the design and
`implementation of energy-efficient processors.
`
`Performance of processors has been growing at an exponential rate, doubling every 18 to
`24 months, as is shown in Figure 1.1. The bad news is that the power dissipated by these
processors has also been growing exponentially, as is shown in Figure 1.2. Although
power has perhaps not grown quite as fast as performance, it has still
led to processors that dissipate more than 50 W [4]. Such high power levels make
cooling these processors difficult and expensive. If this trend continues, processors will
soon dissipate hundreds of watts, which is unacceptable in most systems. Thus there is
`great interest in understanding how to continue increasing performance without also
`increasing power dissipation.
`
For portable applications the problem is even more severe, since battery life depends on
the power dissipation. Lithium-ion batteries have an energy density of approximately
100 Wh/kg, the highest available today [5]. Operating a 50 W processor for 4 hours
requires a 2 kg battery, hardly a portable device. To address this problem, processor
manufacturers have introduced a variety of low-power chips. The problem with these
processors is that they tend to have poor performance, as is shown in Figure 1.3. This
figure plots performance on the Y-axis, measured as the average of SPECint92 and
SPECfp92 [6], and energy on the X-axis, measured in watts/SPEC.

Figure 1.1: Evolution of processor performance. (Plot not reproduced: performance in
SPECavg92 versus year.)

Figure 1.2: Evolution of processor power. (Plot not reproduced: power in watts versus
year.)

Figure 1.3: Performance and energy of processors. (Plot not reproduced: performance in
SPECavg92 versus energy in W/SPEC for the 21164, UltraSPARC, P6, R4600, R4200, and
Power 603.)
`
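The battery estimate quoted in this chapter is a one-line calculation, sketched below. The function name `battery_mass_kg` is illustrative and does not come from the report; the inputs are the figures quoted in the text (50 W, 4 hours, roughly 100 Wh/kg).

```python
def battery_mass_kg(power_w: float, hours: float, density_wh_per_kg: float) -> float:
    """Battery mass needed to supply `power_w` watts for `hours` hours."""
    energy_wh = power_w * hours            # total energy drawn from the battery
    return energy_wh / density_wh_per_kg   # mass at the given energy density


# Figures quoted in the text: 50 W processor, 4 hours, about 100 Wh/kg.
mass = battery_mass_kg(power_w=50.0, hours=4.0, density_wh_per_kg=100.0)
print(f"required battery mass: {mass:.1f} kg")
```

The sketch counts only the processor; any other load on the battery would add to the required mass.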
To compare processor designs that differ in both performance and power, one
needs a measure of “goodness”. If two processors have the same performance or the same
power, the choice is trivial: users prefer the higher-performance processor at a given
power level, or the lower-power processor at a given performance level. But
processor designs rarely have the same performance. In particular, when deciding
whether to add a feature, designers need to know whether it will make the
processor more desirable. Chapter 2 introduces the energy-delay product, or EDP for
short, as a measure of “goodness” for low-power designs. That chapter also describes the
most common low-power techniques and explores how they affect the energy-delay
product of CMOS circuits.
`
Chapter 2 will show that exploiting parallelism is an important technique for
reducing the energy-delay product of a circuit. Chapter 3 therefore explores how
micro-architectural choices, which change the amount of parallelism the processor exploits,
affect the efficiency of the processor. Since both the performance and energy dissipation
of modern processors depend heavily on the design of the memory hierarchy, one must
look not only at the processor itself but also at how the design of the memory
hierarchy affects the overall efficiency of the system. Using three idealized processor
models, Chapter 3 shows that micro-architectural changes do not significantly improve the
efficiency of the processor system. The processor’s efficiency is set by a few circuit
elements: memories and clocks.
`
Since memories and clocking circuits are critical components of every digital system,
much work has already been done to reduce their energy requirements. A different approach
to reducing the energy dissipation of clocks and memories is to change the technology by
`scaling the supply voltage and the threshold voltage of transistors. Chapter 4 explores
`tradeoffs in scaling the supply and threshold voltage of CMOS circuits. A simple
`mathematical model of the EDP predicts large gains in efficiency by scaling the supply
`and threshold voltages, especially if transistors are velocity saturated. If there is
`uncertainty in the supply and threshold, however, the gains in EDP are much smaller.
`Furthermore, to achieve these modest gains it may be necessary to give up large (3X)
`factors of performance. These tradeoffs are discussed in more detail in Chapter 4 along
`with a promising method to reduce the effect of variations by using adaptive techniques to
`control the supply and threshold.
`
`Finally, Chapter 5 summarizes the contributions of this dissertation and proposes areas for
`future research.
`
`Chapter 2
`
`Energy-Delay Product
`
`During the past few years many different techniques for reducing the power dissipation of
`CMOS circuits have been proposed, but relatively little work has been done to compare
`the benefits and costs of these different techniques. This chapter provides a review of these
`techniques and compares the effect they have on power and speed of the circuit. The rest
`of this thesis will investigate how the most promising of these techniques affect general
`purpose processors in particular.
`
`This chapter begins by giving a brief description of the sources of energy dissipation in
`CMOS circuits, since it is important to understand this topic before addressing the
`question of how to reduce the energy dissipation. It then describes different metrics that
`can be used to compare designs. An attractive possibility is to represent every design as a
`point in the performance-energy plane. For CMOS, since energy and performance are
`highly correlated, it is often enough to compare the energy-delay product, or EDP for
`short.
`
`2.1 Energy Dissipation in CMOS Circuits
`
There are three sources of energy dissipation in CMOS circuits: dynamic energy, static
energy, and short-circuit energy. A simple CMOS gate consists of two transistors,
each represented as a resistor and a switch, connected to a fixed output load capacitance
and a constant voltage source, as shown in Figure 2.1. Dynamic energy is due to the charging
and discharging of the load capacitance. If the output node is originally at ground and
swings full rail, then an amount of energy equal to $CV^2$ is drawn from the
voltage source on a low-to-high transition. Of this amount, $\frac{1}{2}CV^2$ is dissipated
in the p-transistor while charging the load capacitance and $\frac{1}{2}CV^2$ is stored in
the capacitor itself. The stored energy is dissipated in the n-transistor when the load is
discharged. Thus $\frac{1}{2}CV^2$ is dissipated on each transition. The circuit only
dissipates dynamic energy when it is active or switching. If the output node remains at a
fixed voltage level, then no energy is dissipated. Most nodes in CMOS circuits transition
only infrequently; therefore the energy per cycle is usually written as,
`
`
$$E = \frac{n C V^2}{2} \tag{2.1}$$
`
`where n is the number of transitions during the period of interest. If the circuit is
`synchronous and clocked at a frequency f then the average power can be written as,
`
$$P = a C V^2 f \tag{2.2}$$
`
`where a is the probability of a transition at the output node divided by 2. If a node
`transitions every cycle then a=0.5.
`
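Equations (2.1) and (2.2) can be checked against each other numerically. The sketch below is illustrative only: the function names and the node parameters (100 fF, 3.3 V, 100 MHz) are assumptions chosen for the example, not values from the report.

```python
import math


def dynamic_energy(n_transitions, c_load, v_supply):
    """Equation (2.1): energy dissipated over n transitions, E = n*C*V^2/2."""
    return n_transitions * c_load * v_supply ** 2 / 2.0


def average_power(a, c_load, v_supply, f_clock):
    """Equation (2.2): average power of a clocked node, P = a*C*V^2*f."""
    return a * c_load * v_supply ** 2 * f_clock


# Hypothetical node: 100 fF load, 3.3 V supply, 100 MHz clock.
C_LOAD, V_SUPPLY, F_CLOCK = 100e-15, 3.3, 100e6

energy_per_transition = dynamic_energy(1, C_LOAD, V_SUPPLY)
power_every_cycle = average_power(0.5, C_LOAD, V_SUPPLY, F_CLOCK)

# A node that transitions once per cycle makes f transitions per second,
# so the two equations must give the same power.
assert math.isclose(power_every_cycle, energy_per_transition * F_CLOCK)

print(f"energy per transition: {energy_per_transition:.3e} J")
print(f"power at a = 0.5:      {power_every_cycle:.3e} W")
```

A node that switches less often simply has a smaller a, and its power falls in proportion.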
Figure 2.1: CMOS inverter. (Schematic not reproduced; the load capacitance is labeled CL.)
`
Static energy is due to resistive paths between the supply and ground. The two main
sources of static energy are analog or analog-like circuits, which require constant current
sources, and leakage current. Although there is some leakage current through the
reverse-biased diodes between the source/drain and the bulk, the more important component is
leakage through the channel when the transistor is nominally off [7]. The leakage current
density (current per μm of gate width) is proportional to $e^{-V_{th}/(\gamma V_t)}$, where
$V_{th}$ is the threshold voltage of the transistor, $V_t$ is the thermal voltage, and
$\gamma$ is a constant slightly larger than 1. Static energy is important because it can set
the floor on energy dissipation when the circuit is idle or in standby mode and there is no
dynamic energy dissipation.
`
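The exponential dependence of leakage on threshold voltage is easy to underestimate, so a small numerical sketch may help. It assumes γ = 1.3 and a temperature of 300 K; the text only says that γ is slightly larger than 1, so both values, and all function names, are illustrative assumptions.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin):
    """Thermal voltage Vt = kT/q, in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def leakage_ratio(vth_old, vth_new, gamma=1.3, temp_kelvin=300.0):
    """Factor by which leakage changes when the threshold moves from
    vth_old to vth_new, using leakage proportional to exp(-Vth/(gamma*Vt))."""
    vt = thermal_voltage(temp_kelvin)
    return math.exp((vth_old - vth_new) / (gamma * vt))


def volts_per_decade(gamma=1.3, temp_kelvin=300.0):
    """Threshold reduction that raises leakage by a factor of ten."""
    return gamma * thermal_voltage(temp_kelvin) * math.log(10.0)


# gamma = 1.3 and T = 300 K are assumed values for illustration.
print(f"thermal voltage:      {thermal_voltage(300.0) * 1e3:.1f} mV")
print(f"Vth drop per decade:  {volts_per_decade() * 1e3:.1f} mV")
print(f"leakage gain, Vth 0.9 V -> 0.5 V: {leakage_ratio(0.9, 0.5):.2e}x")
```

Under these assumptions a few hundred millivolts of threshold reduction changes leakage by several orders of magnitude, which is why threshold scaling and leakage have to be considered together.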
`Short-circuit energy is due to both transistors being on simultaneously while the gate
`switches. Troutman [8] and Chatterjee [9] provide good descriptions of short-circuit
`current in CMOS circuits. As these papers show this component is usually small and
`therefore will be ignored for the remainder of this thesis.
`
`For most CMOS circuits in today’s technologies dynamic energy dissipation dominates.
`For example, in a 0.6μm technology with Vth=0.9V leakage current is 4-5 orders of
`magnitude smaller than dynamic current (for one inverter in a 31 element ring oscillator).
`That is, only when the circuit is idle for 99.99% of the time does leakage current become
`an important consideration. As the amount of on-chip transistor width increases or the
`threshold voltage of the transistors decreases, leakage current becomes more significant.
`Chapter 4 explores in more detail how the energy efficiency of CMOS circuits changes as
`the threshold voltage changes.
`
To reduce the energy dissipation it is necessary to reduce one or more of the
transition probability a, the capacitance C, or the supply voltage V. Section 2.3 describes
and compares how some often-proposed low-power design techniques attempt to reduce these
quantities and how they affect the power and performance of CMOS circuits.
`
`2.2 Low-Power Metrics
`
When optimizing a design for low power it is necessary to have a metric that can be used
to compare different alternatives. The most obvious choice is power, measured in watts.
Power is the rate of energy use, or $P = dE/dt$. A more useful definition, however, is
average power: the energy spent to perform a particular operation divided by the time taken
to perform it, $P_{avg} = E_{op}/T_{op}$. How to define the operation of interest is
arbitrary and depends on what is being compared. In the case of a processor, the operation
could be running a benchmark to completion or executing an instruction, as long as all
processors compared execute the same instructions.
`
Power is important for two reasons. The first is that it determines what kind of package
can be used for the chip. For example, a small plastic package, the cheapest form of
packaging, can only dissipate a few watts. A processor that dissipates more than that
must be sold in a more expensive package. The second is that power limits how long the
system battery will last. But power as a metric of
“goodness” for low-power designs has some drawbacks. The most important is
that power is proportional to the operation rate, so one can reduce the power simply by
slowing down the system. In CMOS circuits this is very easy to do: one simply reduces the
clock frequency.
`
`Regardless of what definition of an operation one uses, the basic problem with power
`remains, that power decreases simply by extending the time required to complete an
`operation. Power, therefore, is only a good metric to compare processors that have similar
`performance levels. If two processors can perform computation at the same rate, then
`clearly whichever dissipates less power is more desirable. If the processors run at different
`rates the slower processor will almost always be lower power.
`
An alternative metric is the energy per operation, measured in joules/op, or its inverse,
measured in SPEC/watt or MIPS/watt. This metric does not depend on the time taken to
perform the operation: running the processor at half the frequency halves the power, but
the power must be accumulated for twice as long. The problem with this metric is that, from
Equation (2.1), the energy per operation can be made smaller by lowering the supply
voltage. However, the supply voltage also affects the speed of the basic
CMOS gates, with lower supplies increasing the delay per operation. Thus low-energy
solutions might (and often do) run very slowly.
`
`Another alternative is to use both metrics, energy and speed. Rather than representing
`designs by a single number they are represented as a point in the performance-energy
`plane, as in Figure 1.3 and Figure 2.2. Given some requirements, such as minimum
`performance or maximum energy, one can determine which is the best solution available.
`The problem with this representation is how to compare designs which have different
`performance or energy levels. Without additional requirements, such as area or cost, there
`is no way to decide which solution is better.
`
From an optimization standpoint one possible metric is the product of energy and delay,
measured in joule-seconds, or its inverse, measured in SPEC²/watt. Optimizing the
energy-delay product prevents the designer from trading off a large amount of performance
for a small savings in energy, or vice versa. As will be described later, the energy-delay
product is also an attractive metric for other reasons. In Figure 2.2 the EDP corresponds to
the inverse of the slope of the line that connects a design point to the origin. Thus
finding a solution with a low EDP corresponds to finding a solution that lies on a steeper
line.
`
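The three metrics discussed in this section respond differently to the same change, which a short sketch makes concrete. The `DesignPoint` class, the `slow_clock` helper, and the 1 nJ / 10 ns operating point are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    energy_per_op: float  # joules per operation
    delay_per_op: float   # seconds per operation

    @property
    def average_power(self) -> float:
        """Average power, Pavg = Eop / Top."""
        return self.energy_per_op / self.delay_per_op

    @property
    def edp(self) -> float:
        """Energy-delay product, in joule-seconds."""
        return self.energy_per_op * self.delay_per_op


def slow_clock(point: DesignPoint, factor: float) -> DesignPoint:
    """Same design with the clock slowed by `factor` at an unchanged supply:
    energy per operation stays fixed, delay per operation grows."""
    return DesignPoint(point.energy_per_op, point.delay_per_op * factor)


base = DesignPoint(energy_per_op=1e-9, delay_per_op=10e-9)  # hypothetical values
half_speed = slow_clock(base, 2.0)

for name, p in (("base", base), ("half clock", half_speed)):
    print(f"{name:>10}: power={p.average_power:.3e} W  "
          f"energy/op={p.energy_per_op:.3e} J  EDP={p.edp:.3e} J*s")
```

Halving the clock makes the design look better on power, leaves energy per operation unchanged, and makes it worse on EDP, which is the behavior the text argues a metric of “goodness” should have.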
Figure 2.2: Performance-energy plane. (Plot not reproduced: normalized performance on the
Y-axis versus normalized energy on the X-axis.)
`
`2.3 Low-Power Design Techniques
`
From Equation (2.1), one simple way to reduce the energy per operation is to lower the
power-supply voltage. However, since both capacitance and threshold voltage are
constant, the speed of the basic gates also decreases with this voltage scaling. The
delay of a CMOS gate can be modeled as the time required to discharge the output
capacitance by the transistor current, $T_g = CV/I$. Using the current model presented in
[10], this gives,
`
$$T_g = K \frac{V}{(V - V_{th})^{\alpha}} \tag{2.3}$$
`
where α is the velocity saturation coefficient and K is a technology-specific constant.
When transistors are not velocity saturated, α=2.0 and the equation reduces to the
quadratic model for transistor current. As transistors become more velocity saturated, α
decreases towards one. For typical 0.25μm technologies α is between 1.3 and 1.5.
`
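Equation (2.3) can be evaluated directly to see how quickly delay grows as the supply approaches the threshold. In the sketch below the threshold (0.9 V), the reference supply (3.3 V), and the function names are assumptions for illustration; K is set to 1 because only ratios are compared.

```python
def gate_delay(v_supply, v_threshold, alpha, k=1.0):
    """Equation (2.3): Tg = K * V / (V - Vth)**alpha."""
    if v_supply <= v_threshold:
        raise ValueError("the model only applies for a supply above the threshold")
    return k * v_supply / (v_supply - v_threshold) ** alpha


def slowdown(v_supply, alpha, v_ref=3.3, v_threshold=0.9):
    """Gate delay at v_supply relative to the delay at v_ref (K cancels)."""
    return (gate_delay(v_supply, v_threshold, alpha)
            / gate_delay(v_ref, v_threshold, alpha))


# Hypothetical operating points; Vth = 0.9 V and V_ref = 3.3 V are assumed.
for v in (3.3, 2.5, 1.8, 1.2):
    print(f"V = {v:.1f} V: slowdown {slowdown(v, alpha=2.0):6.2f}x (alpha = 2.0), "
          f"{slowdown(v, alpha=1.3):6.2f}x (alpha = 1.3)")
```

In this model the velocity-saturated case (smaller α) loses less speed for the same supply reduction, because the drive current depends less strongly on the gate overdrive.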
`
`Figure 2.3 plots the normalized speed of operation versus the energy per operation of a
`CMOS gate as the supply voltage is scaled. The speed and energy were found by
`simulating an inverter in a 0.6μm technology using HSPICE. The threshold voltage is held
`constant in this example. At large voltages reducing the supply reduces the energy for a
`modest change in delay. This causes the curve to bend over. At voltages near the device
`threshold, small supply changes cause a large change in delay for a modest change in
`energy. But from V=1.5Vth to V=6Vth changes in energy and delay cancel each other and
`the curve approaches a straight line, which corresponds to a constant energy-delay
`product. Over this region the EDP remains within a factor of 2. In this case scaling the
`supply voltage reduces the power dissipation but at the expense of the speed of the gates.
`Looking at Equation (2.3), we see that one way to gain back the performance lost is to
`scale the threshold voltage. But this increases the leakage current. Chapter 4 explores the
`tradeoff between power and performance from scaling the supply and threshold voltages.
`It will be shown, however, that when leakage power is a very small fraction of the total
`power, as is the case in most technologies today, scaling the supply voltage does not
`significantly affect the EDP.

Figure 2.3: Variation in performance and energy with supply voltage. (Plot not reproduced:
normalized performance versus normalized energy, with reference lines labeled EDP=X and
EDP=2X.)
`
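The flat region of the EDP curve can also be seen in a purely analytic sketch that multiplies the energy of Equation (2.1) by the delay of Equation (2.3). This is not the HSPICE simulation behind Figure 2.3: it assumes α = 2, a fixed threshold, and no leakage, and the function name `relative_edp` is illustrative.

```python
def relative_edp(v_over_vth, alpha=2.0):
    """EDP at a supply of v_over_vth * Vth, normalized to the minimum EDP of
    the analytic model. Constants (n*C/2 and K) cancel in the ratio."""
    if v_over_vth <= 1.0 or alpha >= 3.0:
        raise ValueError("requires V > Vth and alpha < 3")

    def edp(v):
        energy = v ** 2                  # Equation (2.1), constants dropped
        delay = v / (v - 1.0) ** alpha   # Equation (2.3), with Vth = 1
        return energy * delay

    # Setting d(EDP)/dV = 0 for V^3 / (V - Vth)^alpha gives V = 3*Vth / (3 - alpha).
    v_best = 3.0 / (3.0 - alpha)
    return edp(v_over_vth) / edp(v_best)


for multiple in (1.2, 1.5, 2.0, 3.0, 4.0, 6.0, 8.0):
    print(f"V = {multiple:.1f} Vth: EDP = {relative_edp(multiple):.2f} x minimum")
```

With α = 2 the model's minimum falls at V = 3Vth, and the EDP is 2.0 times the minimum at 1.5Vth and 1.28 times the minimum at 6Vth, in line with the factor-of-2 band described in the text. Below 1.5Vth the EDP rises steeply.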
Another technique to reduce the energy per operation is to reduce the size of all transistors
in the gate. This reduces the capacitance that must be switched when one of the inputs
switches. Unfortunately it also decreases the current drive of the gate, making it slower.
This can be partly compensated by making the next gate smaller. At some point, however,
the load of the gate will no longer be dominated by the input capacitance of the following
gates, but rather by the capacitance of the interconnect between gates.
`
`Figure 2.4 graphs the normalized energy per operation versus the speed of operation of a
`CMOS gate as the percentage of loading that is due to gate capacitance varies from 20% to
`80%. The diffusion capacitance also depends on transistor width and therefore the percent
`of loading that depends linearly on the transistor width is larger than shown in the figure.
`The dotted line indicates the point at which 80% of loading is proportional to transistor
width. The load will be mostly wire capacitance for small transistors, and will be mostly
gate capacitance for large devices. Continuing to increase the transistor sizes gives very
`small gains in performance for a large energy cost. This causes the curve to be almost flat.
`But for the points shown in the figure the difference in EDP is a factor of 2.5.

Figure 2.4: Variation in performance and energy with transistor sizing. (Plot not
reproduced: normalized performance versus normalized energy, with reference lines labeled
EDP=X and EDP=2.5X.)
`
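The sizing tradeoff can be illustrated with a deliberately simplified model, which is not the one used to produce Figure 2.4. It assumes a fixed wire capacitance, a gate capacitance and drive current both proportional to transistor width, driver and load gates scaled together, and no diffusion capacitance. All names are hypothetical.

```python
def sizing_point(gate_fraction):
    """Return (energy, delay, edp) in arbitrary units for a gate whose load is
    `gate_fraction` gate capacitance and the rest fixed wire capacitance."""
    if not 0.0 < gate_fraction < 1.0:
        raise ValueError("gate_fraction must lie strictly between 0 and 1")
    wire_cap = 1.0                                             # fixed, normalized
    gate_cap = wire_cap * gate_fraction / (1.0 - gate_fraction)
    total_cap = wire_cap + gate_cap
    energy = total_cap            # energy per transition scales with C
    delay = total_cap / gate_cap  # drive current scales with transistor width
    return energy, delay, energy * delay


for fraction in (0.2, 0.35, 0.5, 0.65, 0.8):
    energy, delay, edp = sizing_point(fraction)
    print(f"gate load {fraction:4.0%}: energy={energy:5.2f}  "
          f"delay={delay:5.2f}  EDP={edp:5.2f}")
```

In this model the EDP is proportional to 1/(g(1-g)) for gate fraction g, so it is smallest when gate and wire capacitance are equal and rises toward either extreme. The spread between 20% and 80% gate loading is about 1.6, smaller than the factor of 2.5 reported for the figure, which also accounts for effects such as diffusion capacitance that the sketch omits.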
Clearly, real circuits are more complex. The gate and wire capacitance is different for
every gate, nodes transition at different frequencies, and not all gates are on the critical
path. While this problem is difficult to solve precisely, the basic tradeoff remains the same.
Sizing the transistors allows the designer to trade off speed for power. At either extreme
(very large or very small transistors) the tradeoffs become poor.
`
`Often one wants to optimize two, or perhaps more, variables simultaneously. For example
`one may want to find the optimal supply voltage and transistor size. One way to visualize
`the data is to plot contours of energy, delay, and perhaps energy-delay, on the supply
`voltage vs transistor size plane. Figure 2.5 is an example of such a plot, and gives contours
`of inverse relative EDP versus transistor size and supply voltage. The Y-axis shows the
`percent of the total load that is due to gate capacitance, and the X-axis shows the supply
`voltage. For the range of supply voltage and gate loading shown in this figure there is a
`local minimum in the EDP at V=1.7V and 40% gate loading. The relative EDP is the EDP
`normalized to the minimum value. The figure plots contours of the inverse of this metric.
Thus the value at the minimum is 1 and decreases as one moves away from the minimum. The
`contour labeled 0.5 has twice the minimum energy-delay. The advantage of representing
`the data this way is that it is much easier to understand how the variables of interest
`(energy, delay, energy-delay) change with the optimization variables (supply voltage,
`transistor sizing). If the data were plotted in the energy vs. perform
