Loper et al.

[19]

[54] SYSTEM AND METHOD FOR REDUCING POWER CONSUMPTION IN AN ELECTRONIC CIRCUIT

[75] Inventors: Albert John Loper, Cedar Park; Soummya Mallick, Austin, both of Tex.

[73] Assignee: International Business Machines Corporation, Armonk, N.Y.

[21] Appl. No.: 726,871

[22] Filed: Oct. 4, 1996
[51] Int. Cl.: G06F 1/32
[52] U.S. Cl.: 395/750.05; 395/750.01; 395/750.03; 711/128
[58] Field of Search: 395/750.01, 750.03, 750.04, 750.05, 750.06, 750.07, 450, 452, 453, 455; 711/123, 125, 126, 128
`
[56] References Cited

U.S. PATENT DOCUMENTS
5,091,851 2/1992 Shelton et al. ... 711/128
5,345,569 9/1994 Tran.
5,420,808 5/1995 Alexander et al.
`SE7 8.8 ki.e. a.
`5.491s29 2f1996 Kau et al. .
5,495,419 2/1996 Rostoker et al.
5,615,140 3/1997 Ishikawa ... 364/749
5,638,334 6/1997 Farmwald et al. ... 365/230.03
5,666,537 9/1997 Debnath et al. ... 395/750.04
5,682,515 10/1997 Lau et al. ... 395/455
FOREIGN PATENT DOCUMENTS
06236272 8/1994 Japan.
2 297 398 7/1996 United Kingdom ... G06F 1/32
2 304 215 3/1997 United Kingdom ... G06F 1/32
WO9516952 6/1995 WIPO.
WO9518998 7/1995 WIPO.
WO9534070 12/1995 WIPO.

US005870616A
[11] Patent Number: 5,870,616
[45] Date of Patent: Feb. 9, 1999
OTHER PUBLICATIONS
IBM Technical Disclosure Bulletin, vol. 37, No. 04A, Apr. 1994, pp. 51-52, "Bus Load Reduction in Disk Array Controller through the Use of Multicast Busing Techniques".
IBM Technical Disclosure Bulletin, vol. 37, No. 04B, Apr. 1994, pp. 73-74, "Externally Triggered Soft Stop for Microprocessors".
IBM Technical Disclosure Bulletin, vol. 37, No. 04B, Apr. 1994, pp. 385-388, "RISC Superscalar Pipeline Out-of-Order Statistics Gathering and Analysis".
IBM Technical Disclosure Bulletin, vol. 37, No. 09, Sep. 1994, pp. 283-284, "Clock-Controlled Power Save Method".
IBM Technical Disclosure Bulletin, vol. 37, No. 10, Oct. 1994, pp. 59-60, "Special Serialization for Load-with-Update Instruction to Reduce the Complexity of Register Renaming Circuitry".
IBM Technical Disclosure Bulletin, vol. 37, No. 10, Oct. 1994, pp. 151-153, "Minimizing Power Consumption in Micro-Processor Based Systems which Utilize Speech Recognition Devices".
IBM Technical Disclosure Bulletin, vol. 38, No. 12, Dec. 1995, pp. 443-444, "Digital Adjustable Linear Regulator".
Primary Examiner: Thomas C. Lee
Assistant Examiner: Ario Etienne
Attorney, Agent, or Firm: Casimer K. Salys; Michael Davis, Jr.

[57] ABSTRACT

While a set-associative cache memory operates in a first power mode, information is stored in up to N number of ways of the cache memory, where N is an integer number and N > 1. While the cache memory operates in a second power mode, the information is stored in up to M number of ways of the cache memory, where M is an integer number and 0 < M < N.

22 Claims, 5 Drawing Sheets
`
[Front-page figure: block diagram of the processor. Recoverable labels: instruction cache, sequencer unit 18, complex fixed point unit 26, SPRs 40, fixed point unit, GPRs 32, load/store unit, FPRs, floating point unit, carry bit register 42, rename buffers 38, bus interface unit 12, transducers 41, memory 39; signal lines SPS, HPS, INT, SMI, Vdd, GND.]
`
`Page 1
`
`AMAZON 1020
`Amazon v. SpeakWare
`IPR2019-00999
`
`
`
U.S. Patent    Feb. 9, 1999    Sheet 1 of 5    5,870,616

[Drawing sheet 1 of 5: drawing text not recoverable.]
U.S. Patent    Feb. 9, 1999    Sheet 2 of 5    5,870,616

[FIG. 2: block diagram of the sequencer unit. Recoverable labels: to instruction cache 14, from instruction cache 14, fetch logic 71, to execution units, from execution units, instruction type, reorder buffer 76, completion logic 80, exception logic 82; signal lines SPS, HPS, INT.]
`
U.S. Patent    Feb. 9, 1999    Sheet 3 of 5    5,870,616

[FIG. 3: block diagram of the instruction buffer. Recoverable labels: from instruction cache 14, 2 x 32 bits, line 55b, to decode logic 72.]
`
U.S. Patent    Feb. 9, 1999    Sheet 4 of 5    5,870,616

[FIG. 4: reorder buffer 76. Recoverable column headings: buffer number, instruction type, execution unit, GPR dest., FPR dest., finished, exception; pointers: completion, allocation.]

[Rename buffers 38. Recoverable column headings: buffer number, register number, information; pointers include allocation.]

[FIG. 7: sense amplification circuitry. Recoverable labels: Vdd, OUT, D, ENABLE.]
`
U.S. Patent    Feb. 9, 1999    Sheet 5 of 5    5,870,616

[Drawing sheet 5 of 5: drawing text not recoverable.]
`
SYSTEM AND METHOD FOR REDUCING POWER CONSUMPTION IN AN ELECTRONIC CIRCUIT
`
CROSS-REFERENCES TO RELATED APPLICATIONS
This patent application is related to copending U.S. patent application Ser. No. 08/726,396, application Ser. No. 08/726,395, and application Ser. No. 08/726,370, each filed concurrently herewith.
`
TECHNICAL FIELD
This patent application relates in general to electronic circuitry and in particular to a method and system for reducing power consumption in an electronic circuit.
`
BACKGROUND
In recent years, portable laptop computers have become increasingly popular. Frequently, such laptop computers are battery powered in order to enhance their portability. Preferably, a battery powered laptop computer operates for an extended period of time under battery power before its battery is either recharged or replaced.
Accordingly, it is important to reduce power consumption within an electronic circuit of the laptop computer, in order to extend the period of time during which the electronic circuit operates before recharging or replacing the battery. For this purpose, some previous techniques disable power or disable clock signals to the electronic circuit in response to a specified time elapsing without sensing a particular type of activity. A shortcoming of such previous "timer" techniques is that the electronic circuit can unnecessarily consume excess power while waiting for the timer to expire, even when the electronic circuit is not performing an operation.
Thus, a need has arisen for a method and system in which an electronic circuit consumes less excess power relative to previous techniques.
`
SUMMARY
While a set-associative cache memory operates in a first power mode, information is stored in up to N number of ways of the cache memory, where N is an integer number and N > 1. While the cache memory operates in a second power mode, the information is stored in up to M number of ways of the cache memory, where M is an integer number and 0 < M < N.
It is a technical advantage that an electronic circuit consumes less excess power relative to previous techniques.
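The way-limiting idea in the summary can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuit described in the patent: the class name, the choice of the lowest-numbered M ways as the ones that stay usable, and the replacement policy are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SetAssociativeCache:
    """Toy set-associative cache whose usable ways shrink in a second power mode."""
    num_sets: int
    n_ways: int               # N: ways usable in the first power mode (N > 1)
    m_ways: int               # M: ways usable in the second power mode (0 < M < N)
    low_power: bool = False   # False = first power mode, True = second power mode
    sets: list = field(default_factory=list)

    def __post_init__(self):
        if not (self.n_ways > 1 and 0 < self.m_ways < self.n_ways):
            raise ValueError("require N > 1 and 0 < M < N")
        # Each set holds one tag (or None) per way.
        self.sets = [[None] * self.n_ways for _ in range(self.num_sets)]

    def active_ways(self):
        """Number of ways that may hold information in the current mode."""
        return self.m_ways if self.low_power else self.n_ways

    def store(self, address):
        """Store the block at `address` and return the way index used."""
        index = address % self.num_sets
        tag = address // self.num_sets
        ways = self.sets[index][: self.active_ways()]
        if tag in ways:
            return ways.index(tag)
        # Assumption: prefer an empty way, otherwise overwrite way 0.
        way = ways.index(None) if None in ways else 0
        self.sets[index][way] = tag
        return way
```

With `n_ways=4` and `m_ways=2`, stores made while `low_power` is true only ever touch ways 0 and 1 of a set, so the remaining ways never need to be written in that mode.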
BRIEF DESCRIPTION OF THE DRAWINGS
An illustrative embodiment and its advantages are better understood by referring to the following descriptions and accompanying drawings, in which:
FIG. 1 is a block diagram of a processor system for processing information according to the illustrative embodiment;
FIG. 2 is a block diagram of a sequencer unit of the processor of FIG. 1;
FIG. 3 is a block diagram of an instruction buffer queue of the sequencer unit of FIG. 2;
FIG. 4 is a conceptual illustration of a reorder buffer of the sequencer unit of FIG. 2;
FIG. 5 is a conceptual illustration of rename buffers of the processor of FIG. 1;
FIG. 6 is a block diagram of an instruction cache of the processor of FIG. 1; and
FIG. 7 is a schematic electrical circuit diagram of sense amplification circuitry of the instruction cache of FIG. 6.
DETAILED DESCRIPTION
An illustrative embodiment and its advantages are better understood by referring to FIGS. 1-7 of the drawings.
FIG. 1 is a block diagram of a processor 10 system for processing information according to the illustrative embodiment. In the illustrative embodiment, processor 10 is a single integrated circuit superscalar microprocessor. Accordingly, as discussed further hereinbelow, processor 10 includes various units, registers, buffers, memories, and other sections, all of which are formed by integrated circuitry. Also, in the illustrative embodiment, processor 10 operates according to reduced instruction set computing ("RISC") techniques. As shown in FIG. 1, a system bus 11 is connected to a bus interface unit ("BIU") 12 of processor 10. BIU 12 controls the transfer of information between processor 10 and system bus 11.
BIU 12 is connected to an instruction cache 14 and to a data cache 16 of processor 10. Instruction cache 14 outputs instructions to a sequencer unit 18. In response to such instructions from instruction cache 14, sequencer unit 18 selectively outputs instructions to other execution circuitry of processor 10.
In addition to sequencer unit 18, in the illustrative embodiment the execution circuitry of processor 10 includes multiple execution units, namely a branch unit 20, a fixed point unit ("FXU") 22, a complex fixed point unit ("CFXU") 26, a load/store unit ("LSU") 28 and a floating point unit ("FPU") 30. FXU 22, CFXU 26 and LSU 28 input their source operand information from general purpose architectural registers ("GPRs") 32 and fixed point rename buffers 34. Moreover, FXU 22 inputs a "carry bit" from a carry bit ("CA") register 42. FXU 22, CFXU 26 and LSU 28 output results (destination operand information) of their operations for storage at selected entries in fixed point rename buffers 34. Also, CFXU 26 inputs and outputs source operand information and destination operand information to and from special purpose registers ("SPRs") 40.
FPU 30 inputs its source operand information from floating point architectural registers ("FPRs") 36 and floating point rename buffers 38. FPU 30 outputs results (destination operand information) of its operation for storage at selected entries in floating point rename buffers 38.
In response to a Load instruction, LSU 28 inputs information from data cache 16 and copies such information to selected ones of rename buffers 34 and 38. If such information is not stored in data cache 16, then data cache 16 inputs (through BIU 12 and system bus 11) such information from a system memory 39 connected to system bus 11. Moreover, data cache 16 is able to output (through BIU 12 and system bus 11) information from data cache 16 to system memory 39 connected to system bus 11. In response to a Store instruction, LSU 28 inputs information from a selected one of GPRs 32 and FPRs 36 and copies such information to data cache 16.
Sequencer unit 18 inputs and outputs information to and from GPRs 32 and FPRs 36. From sequencer unit 18, branch unit 20 inputs instructions and signals indicating a present state of processor 10. In response to such instructions and signals, branch unit 20 outputs (to sequencer unit 18) signals indicating suitable memory addresses storing a sequence of instructions for execution by processor 10. In response to such signals from branch unit 20, sequencer unit 18 inputs the indicated sequence of instructions from instruction cache 14. If one or more of the sequence of instructions is not stored in instruction cache 14, then instruction cache 14 inputs (through BIU 12 and system bus 11) such instructions from system memory 39 connected to system bus 11.
In response to the instructions input from instruction cache 14, sequencer unit 18 selectively dispatches the instructions to selected ones of execution units 20, 22, 26, 28 and 30. Each execution unit executes one or more instructions of a particular class of instructions. For example, FXU 22 executes a first class of fixed point mathematical operations on source operands, such as addition, subtraction, ANDing, ORing and XORing. CFXU 26 executes a second class of fixed point operations on source operands, such as fixed point multiplication and division. FPU 30 executes floating point operations on source operands, such as floating point multiplication and division.
As information is stored at a selected one of rename buffers 34, such information is associated with a storage location (e.g. one of GPRs 32 or CA register 42) as specified by the instruction for which the selected rename buffer is allocated. Information stored at a selected one of rename buffers 34 is copied to its associated one of GPRs 32 (or CA register 42) in response to signals from sequencer unit 18. Sequencer unit 18 directs such copying of information stored at a selected one of rename buffers 34 in response to "completing" the instruction that generated the information. Such copying is called "writeback".
As information is stored at a selected one of rename buffers 38, such information is associated with one of FPRs 36. Information stored at a selected one of rename buffers 38 is copied to its associated one of FPRs 36 in response to signals from sequencer unit 18. Sequencer unit 18 directs such copying of information stored at a selected one of rename buffers 38 in response to "completing" the instruction that generated the information.
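The allocate, store, and writeback sequence described for the rename buffers can be sketched as follows. This is a minimal software illustration, not the hardware organization of rename buffers 34 or 38; the class name, the dictionary layout of an entry, and the first-free allocation policy are assumptions.

```python
class RenameBuffers:
    """Toy model of rename buffers that hold results until writeback."""

    def __init__(self, size):
        # Each entry is None (free) or a dict with a destination register and a value.
        self.entries = [None] * size

    def allocate(self, dest_reg):
        """Reserve a free entry for an instruction's result; return its index."""
        for i, entry in enumerate(self.entries):
            if entry is None:
                self.entries[i] = {"dest": dest_reg, "value": None}
                return i
        return None  # no free entry: the instruction cannot be dispatched yet

    def write_result(self, index, value):
        """An execution unit stores its result in the reserved entry."""
        self.entries[index]["value"] = value

    def writeback(self, index, arch_regs):
        """On completion, copy the result to the architectural register and free the entry."""
        entry = self.entries[index]
        arch_regs[entry["dest"]] = entry["value"]
        self.entries[index] = None
```

The architectural register file is only modified in `writeback`, which is what lets results be produced early without changing architectural state before the instruction completes.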
Processor 10 achieves high performance by processing multiple instructions simultaneously at various ones of execution units 20, 22, 26, 28 and 30. Accordingly, each instruction is processed as a sequence of stages, each being executable in parallel with stages of other instructions. Such a technique is called "pipelining". In the illustrative embodiment, an instruction is normally processed as six stages, namely fetch, decode, dispatch, execute, completion, and writeback.
In the fetch stage, sequencer unit 18 selectively inputs (from instruction cache 14) one or more instructions from one or more memory addresses storing the sequence of instructions discussed further hereinabove in connection with branch unit 20 and sequencer unit 18.
In the decode stage, sequencer unit 18 decodes up to two fetched instructions.
In the dispatch stage, sequencer unit 18 selectively dispatches up to two decoded instructions to selected (in response to the decoding in the decode stage) ones of execution units 20, 22, 26, 28 and 30 after reserving rename buffer entries for the dispatched instructions' results (destination operand information). In the dispatch stage, operand information is supplied to the selected execution units for dispatched instructions. Processor 10 dispatches instructions in order of their programmed sequence.
In the execute stage, execution units execute their dispatched instructions and output results (destination operand information) of their operations for storage at selected entries in rename buffers 34 and rename buffers 38 as discussed further hereinabove. In this manner, processor 10 is able to execute instructions out-of-order relative to their programmed sequence.
In the completion stage, sequencer unit 18 indicates an instruction is "complete". Processor 10 "completes" instructions in order of their programmed sequence.
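Out-of-order execution combined with in-order completion can be sketched with a queue kept in program order. This is an illustrative model only; the function name and the `name`/`finished` fields are hypothetical, and no per-cycle completion limit is modeled.

```python
from collections import deque


def complete_in_order(reorder_buffer):
    """Complete finished instructions from the head of the queue, oldest first.

    `reorder_buffer` is a deque of dicts in program order. An instruction that
    finished executing early still waits until every older instruction has completed.
    """
    completed = []
    while reorder_buffer and reorder_buffer[0]["finished"]:
        completed.append(reorder_buffer.popleft()["name"])
    return completed
```

If the younger of two instructions finishes first, a call returns an empty list; once the older one also finishes, the next call returns both names in program order.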
In the writeback stage, sequencer unit 18 directs the copying of information from rename buffers 34 and 38 to GPRs 32 and FPRs 36, respectively. Sequencer unit 18 directs such copying of information stored at a selected rename buffer. Likewise, in the writeback stage of a particular instruction, processor 10 updates its architectural states in response to the particular instruction. Processor 10 processes the respective "writeback" stages of instructions in order of their programmed sequence. Processor 10 advantageously merges an instruction's completion stage and writeback stage in specified situations.
In the illustrative embodiment, each instruction requires one machine cycle to complete each of the stages of instruction processing. Nevertheless, some instructions (e.g., complex fixed point instructions executed by CFXU 26) may require more than one cycle. Accordingly, a variable delay may occur between a particular instruction's execution and completion stages in response to the variation in time required for completion of preceding instructions.
Processor 10 implements, and operates according to, five power modes. Four of these five power modes are "power saving" modes of operation. The five power modes are selectively enabled and disabled in response to states of control bits in a machine state register ("MSR") and hardware implementation register. These registers are located in SPRs 40. Accordingly, the control bits are set and/or cleared in response to CFXU 26 executing move instructions directed to SPRs 40. The five power modes are Full-power, Doze, Nap, Sleep, and, in a significant aspect of the illustrative embodiment, Special.
1. Full-power mode. The Full-power mode is the default power mode of processor 10. In the Full-power mode, processor 10 is fully powered, and units operate at the processor clock speed of processor 10. Processor 10 further implements a dynamic power management mode which can be selectively enabled and disabled. If the dynamic power management mode is enabled, then idle units within processor 10 automatically enter a low-power state without affecting performance, software execution, or external hardware circuitry.
The aforementioned dynamic power management mode, and the Full-power, Doze, Nap, and Sleep power modes, are more completely described in the publication entitled PowerPC 603e RISC Microprocessor User's Manual, published by IBM Microelectronics Division, Hopewell Junction, New York, Telephone 1-800-PowerPC, which is hereby fully incorporated by reference herein. Moreover, the dynamic power management mode is described in U.S. Pat. No. 5,420,808, which is hereby fully incorporated by reference herein. In the illustrative embodiment, processor 10 is an enhanced version of the PowerPC 603e RISC microprocessor available from IBM Microelectronics Division, Hopewell Junction, New York. Processor 10 is enhanced relative to the PowerPC 603e RISC microprocessor, as processor 10 implements the Special power mode. Accordingly, the Special power mode is a significant aspect of the illustrative embodiment.
2. Doze mode. In the Doze mode, all units of processor 10 are disabled except bus snooping logic of BIU 12, time base/decrementer registers (not shown in FIG. 1) of processor 10, and a phase-locked loop ("PLL") (not shown in FIG. 1) of processor 10. In the Doze mode, the PLL of processor 10 continues in a fully powered state and remains synchronized to an external system clock of system bus 11, so that any return to the Full-power mode occurs within only a few clock cycles of processor 10.
From the Doze mode, processor 10 returns to the Full-power mode in response to an external asynchronous interrupt via assertion of an interrupt line INT, so that INT provides a signal having a logic 1 state to processor 10. Likewise, from the Doze mode, processor 10 returns to the Full-power mode in response to a system management interrupt via assertion of a system management interrupt line SMI, so that SMI provides a signal having a logic 1 state to processor 10. Moreover, from the Doze mode, processor 10 returns to the Full-power mode in response to a decrementer exception, a hard or soft reset, or a machine check input.
A hard reset occurs in response to a voltage supply node Vdd switching from a low voltage (e.g. 0 volts) to a predetermined voltage (e.g. 2.5 volts) relative to a voltage reference node GND. Notably, for clarity, FIGS. 1-6 show less than all connections from INT, SMI, Vdd, and GND to various circuitry throughout processor 10. From any power saving mode, processor 10 returns to the Full-power mode in response to a soft reset, in which control bits are set and/or cleared in response to CFXU 26 executing suitable move instructions directed to SPRs 40, where such move instructions are part of a software reset sequence of instructions.
3. Nap mode. Relative to the Doze mode, the Nap mode further reduces power consumption of processor 10 by disabling bus snooping logic of BIU 12, so that only the PLL and time base/decrementer registers of processor 10 remain in the full-power state. From the Nap mode, processor 10 returns to the Full-power mode in response to an external asynchronous interrupt via assertion of interrupt line INT, a system management interrupt, a decrementer exception, a hard or soft reset, or a machine check input. As with the Doze mode, any return from the Nap mode to the Full-power mode occurs within only a few clock cycles of processor 10.
4. Sleep mode. In the Sleep mode, power consumption is reduced to near minimum by disabling all units of processor 10, after which logic external to processor 10 can disable the PLL and the external system clock. From the Sleep mode, processor 10 returns to the Full-power mode in response to a reenabling of both the PLL and the external system clock, followed by a suitable minimum time elapsing for the PLL to become synchronized to the external system clock, and then followed by assertion of interrupt line INT, a system management interrupt, a decrementer exception, a hard or soft reset, or a machine check input.
5. Special mode. In a significant aspect of the illustrative embodiment, processor 10 enters the Special mode in response to either (1) a hardware event or (2) a software event. In the illustrative embodiment, the hardware event occurs when transducers 41 output a signal having a logic 1 state on a line HPS (Hardware event, Power saving, Special mode). Similarly, the software event occurs when SPRs 40 output a signal having a logic 1 state on a line SPS (Software event, Power saving, Special mode). SPRs 40 output such a signal on SPS in response to CFXU 26 executing a suitable "Move to Special Purpose Register" ("MTSPR") instruction directed to a predetermined bit of a "HID0" register of SPRs 40.
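The Doze, Nap and Sleep descriptions above can be summarized as a small table plus a wake-up rule. The sketch below is a simplification under stated assumptions: the names `PowerMode`, `ENABLED_IN_MODE`, `WAKE_EVENTS` and `next_mode` are hypothetical, the Special mode is left out, and the Sleep-mode requirement that the PLL and external clock be reenabled and synchronized first is reduced to a single `pll_locked` flag.

```python
from enum import Enum


class PowerMode(Enum):
    FULL_POWER = "Full-power"
    DOZE = "Doze"
    NAP = "Nap"
    SLEEP = "Sleep"


# Blocks that stay enabled in each power-saving mode, per the descriptions above.
ENABLED_IN_MODE = {
    PowerMode.DOZE: {"bus_snooping", "time_base_decrementer", "pll"},
    PowerMode.NAP: {"time_base_decrementer", "pll"},
    PowerMode.SLEEP: set(),  # all units disabled; external logic may also stop the PLL
}

# Events that return the processor to the Full-power mode.
WAKE_EVENTS = {"INT", "SMI", "decrementer_exception",
               "hard_reset", "soft_reset", "machine_check"}


def next_mode(mode, event, pll_locked=True):
    """Return the power mode after `event` arrives while in `mode`."""
    if mode is PowerMode.FULL_POWER or event not in WAKE_EVENTS:
        return mode
    if mode is PowerMode.SLEEP and not pll_locked:
        # Assumption: a wake event is only honored once the PLL is synchronized again.
        return mode
    return PowerMode.FULL_POWER
```

Each step from Doze to Nap to Sleep removes entries from the enabled set, which is the trade the text describes: less power in exchange for a longer path back to the Full-power mode.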
`
Transducers 41 include thermal sensors for sensing a relative temperature of integrated circuitry which forms processor 10. The hardware event occurs (i.e. transducers 41 output a signal having a logic 1 state on HPS) in response to the thermal sensors (of transducers 41) sensing a relative temperature that exceeds a threshold temperature. In the illustrative embodiment, the threshold temperature is preselected to be a maximum safe temperature of processor 10 operating in the Full-power mode. Accordingly, if the temperature of processor 10 were to exceed the maximum safe temperature of processor 10 operating in the Full-power mode, then damage to processor 10 would likely result from continued operation of processor 10 in the Full-power mode. Advantageously, such damage is substantially avoided by processor 10 entering the Special "power saving" mode of operation in response to the hardware event.
If processor 10 enters the Special mode in response to the hardware event, processor 10 reduces the maximum number of instructions fetched during a single cycle of processor 10, so that fewer instructions are dispatched per cycle of processor 10 as discussed further hereinbelow in connection with FIGS. 2 and 3. In this manner, execution units are more likely to be idle, and consequently the low-power state of the dynamic power management mode (described in U.S. Pat. No. 5,420,808) advantageously is more readily invoked. Moreover, if processor 10 enters the Special mode in response to the hardware event, processor 10 changes the operation of LSU 28 as discussed further hereinbelow in connection with FIG. 5.
By comparison, if processor 10 enters the Special mode in response to the software event, processor 10 (a) reduces the maximum number of instructions fetched during a single cycle of processor 10 as discussed further hereinbelow in connection with FIGS. 2 and 3, (b) changes the operation of LSU 28 as discussed further hereinbelow in connection with FIG. 5, and (c) reduces power consumption within instruction cache 14 and data cache 16 by reducing their number of "ways" as discussed further hereinbelow in connection with FIG. 6.
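The difference between the two entry events can be written as a small function of the SPS and HPS line states. This is a sketch of the behavior as described in the two preceding paragraphs, not of the actual control logic; the function name and the dictionary keys are hypothetical.

```python
def special_mode_effects(sps, hps):
    """Map the SPS (software event) and HPS (hardware event) states to power-saving actions."""
    special = sps or hps
    return {
        "special_mode": special,
        # Both events reduce the per-cycle fetch maximum and change LSU operation.
        "reduced_fetch": special,
        "lsu_operation_changed": special,
        # Only the software event is described as reducing the number of cache ways.
        "reduced_cache_ways": sps,
    }
```

For a purely thermal entry (`sps=False, hps=True`) the caches keep all of their ways, whereas a software entry also reduces the ways of instruction cache 14 and data cache 16.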
From the Special mode, processor 10 returns to the Full-power mode in response to neither SPS nor HPS having a logic 1 state. Moreover, if processor 10 entered the Special mode in response only to the software event (i.e. SPS has a logic 1 state while HPS has a logic 0 state), then processor 10 further returns to the Full-power mode (from the Special mode) in response to (1) an external asynchronous interrupt via assertion of INT, or (2) a hard or soft reset, or (3) a machine check input. In an alternative embodiment, if processor 10 entered the Special mode in response only to the software event, processor 10 further returns to the Full-power mode (from the Special mode) in response to a system management interrupt via assertion of SMI. In such an alternative embodiment, processor 10 would return to the Full-power mode in response to assertion of SMI, analogously to the manner in which processor 10 returns to the Full-power mode in response to assertion of INT.
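The exit conditions of the illustrative embodiment reduce to a short boolean rule. The sketch below assumes the function name `stays_in_special` and its keyword arguments; it does not model the alternative embodiments (SMI or decrementer exception).

```python
def stays_in_special(sps, hps, interrupt=False, reset=False, machine_check=False):
    """Return True if the processor remains in the Special mode.

    `sps` and `hps` are the logic states of the SPS and HPS lines; `interrupt`
    stands for assertion of INT, `reset` for a hard or soft reset.
    """
    if not (sps or hps):
        return False  # neither line is at logic 1: return to the Full-power mode
    if sps and not hps and (interrupt or reset or machine_check):
        return False  # software-only entry: these events also end the Special mode
    return True
```

An interrupt therefore ends a software-requested Special mode but not one held by the thermal hardware event, which fits the purpose of the hardware event: protecting the processor while HPS remains asserted.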
In yet another alternative embodiment, processor 10 also returns to the Full-power mode in response to a decrementer exception. SPRs 40 include circuitry for decrementing a count in response to a processor clock signal (not shown in FIG. 1 for clarity). A decrementer exception is generated in response to such a count being decremented to a value of zero.
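A count that is decremented once per clock and raises an exception on reaching zero can be modeled as below. This follows the wording of the paragraph above only; the class name is hypothetical, and holding the count at zero afterwards is an assumption, since the text does not say what happens after the exception.

```python
class Decrementer:
    """Toy decrementer: counts down once per clock tick."""

    def __init__(self, count):
        self.count = count

    def tick(self):
        """Advance one clock; return True on the tick that brings the count to zero."""
        if self.count > 0:
            self.count -= 1
            return self.count == 0
        return False  # assumption: the count stays at zero after the exception
```

Software could load the count before entering a power-saving mode so that the resulting exception serves as a timed wake-up.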
FIG. 1 shows the single SPS line connected to each of instruction cache 14, data cache 16, sequencer unit 18 and LSU 28. Likewise, FIG. 1 shows the single HPS line connected to each of instruction cache 14, data cache 16, sequencer unit 18 and LSU 28. Similarly, FIG. 1 shows the single INT line connected to each of instruction cache 14, data cache 16, sequencer unit 18 and LSU 28.
FIG. 2 is a block diagram of sequencer unit 18. As discussed hereinabove, in the fetch stage, if processor 10 (and hence also fetch logic 71) is operating in the Full-power mode, then fetch logic 71 selectively requests up to a maximum number of two instructions (per cycle of processor 10 and hence also of fetch logic 71) from instruction cache 14 and stores such instructions in an instruction buffer 70. Accordingly, during a particular cycle of processor 10, sequencer unit 18 requests a variable number (ranging from 0 to 2) of instructions from instruction cache 14, where the variable number depends upon the number of additional instructions able to be stored in instruction buffer 70 (i.e. depends upon the number of available buffers in instruction buffer 70).
In the decode stage, if processor 10 (and hence also decode logic 72) is operating in the Full-power mode, then decode logic 72 selectively inputs and decodes up to a maximum number of two fetched instructions (per cycle of processor 10 and hence also of decode logic 72) from instruction buffer 70. Accordingly, during a particular cycle of processor 10, decode logic 72 inputs and decodes a variable number (ranging from 0 to 2) of instructions from instruction buffer 70, where the variable number depends upon the number of instructions to be dispatched by dispatch logic 74 during the particular cycle.
In the dispatch stage, if processor 10 (and hence also dispatch logic 74) is operating in the Full-power mode, then dispatch logic 74 selectively dispatches up to a maximum number of two decoded instructions (per cycle of processor 10 and hence also of dispatch logic 74) to selected (in response to the decoding in the decode stage) ones of execution units 20, 22, 26, 28 and 30. Accordingly, during a particular cycle of processor 10, dispatch logic 74 dispatches a variable number (ranging from 0 to 2) of decoded instructions to the execution units, where the variable number depends upon the number of additional instructions able to be stored in the execution units for execution (e.g. depends upon the number of available reservation stations in the execution units).
By comparison, in the illustrative embodiment, if processor 10 is operating in the Special power mode, then fetch logic 71 (in response to logic states of SPS, HPS and INT) requests a maximum of one instruction per cycle of processor 10 (instead of two instructions) from instruction cache 14 and stores the one instruction in instruction buffer 70. In this manner, (a) decode logic 72 inputs and decodes (on average) approximately one fetched instruction from instruction buffer 70 per cycle of processor 10, (b) dispatch logic 74 dispatches (on average) approximately one instruction (per cycle of processor 10) to a selected one of execution units 20, 22, 26, 28 and 30, and (c) completion logic 80 indicates (on average) "completion" (as discussed further hereinbelow) of approximately one instruction per cycle of processor 10. Accordingly, the execution units are more likely to be idle (relative to the Full-power mode), and consequently the low-power state of the dynamic power management mode (described in U.S. Pat. No. 5,420,808) advantageously is more readily invoked.
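The per-cycle fetch limit described above amounts to taking the smaller of the mode-dependent maximum and the free space in instruction buffer 70. The following is a minimal sketch; the function name and parameters are hypothetical, and the way the logic states of SPS, HPS and INT combine into `special_mode` is not modeled.

```python
def instructions_to_fetch(free_slots, special_mode):
    """Number of instructions fetch logic requests in one cycle.

    `free_slots` is the number of available buffers in the instruction buffer;
    the maximum is two per cycle in the Full-power mode and one in the Special mode.
    """
    max_per_cycle = 1 if special_mode else 2
    return max(0, min(max_per_cycle, free_slots))
```

Halving the fetch maximum throttles every later stage on average, since decode, dispatch and completion cannot sustain more instructions per cycle than fetch supplies.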
In an alternative embodiment, if processor 10 is operating in the Special power mode, then dispatch logic 74 (in response to logic states of SPS, HPS and INT) dispatches a maximum of one instruction per cycle of processor 10 (instead of two instructions) to a selected one of execution units 20, 22, 26, 28 and 30; this technique of the alternative embodiment is instead of (but can also be in addition to) the illustrative embodiment's technique of reducing the maximum number of instructions fetched during a single cycle of processor 10. Hence, FIG. 2 shows SPS, HPS and INT connected to both fetch logic 71 and dispatch logic 74.
FIG. 3 is a block diagram of instruction buffer 70. Instruction buffer 70 stores an I0 instruction and an I1 instruction in a buffer I0 and a buffer I1, respectively, of dispatch buffers 56. In the illustrative embodiment, in response to a cycle of processor 10, either the I0 instruction is dispatched by itself to decode logic 72 (FIG. 2), both the I0 and I1 instructions are dispatched together to decode logic 72, or the I1 instruction is dispatched by itself to decode logic 72. The contents of buffers I0 and I1 are output to decode logic 72 through lines 55a-b, respectively.
In the illustrative embodiment, instruction buffer 70 is able to input up to two 32-bit instructions in parallel from instruction cache 14 through 64-bit bus 50 during a single cycle of processor 10. In response to both the I0 and I1 instructions being dispatched together to decode logic 72, instruction buffer 70 transfers any previously stored instructions from instruction buffers 54a-b to buffers I0 and I1, respectively. Also, in such a situation, instruction buffer 70 transfers any previously stored instructions from instruction buffers 52a-b to instruction buffers 54a-b, respectively. Moreover, in such a situation, if processor 10 is operating in the Full-power mode, instruction buffer 70 inputs up to two 32-bit instructions from instruction cache 14 through 64-bit bus 50 and stores such instructions in the first available (i.e.
`e