(12) Buch
(10) Patent No.: US 6,901,522 B2
(45) Date of Patent: May 31, 2005

(54) SYSTEM AND METHOD FOR REDUCING POWER CONSUMPTION IN MULTIPROCESSOR SYSTEM

(75) Inventor: Deep K. Buch, Folsom, CA (US)

(73) Assignee: Intel Corporation, Santa Clara, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 647 days.

(21) Appl. No.: 09/876,609

(22) Filed: Jun. 7, 2001

(65) Prior Publication Data
US 2002/0188877 A1 Dec. 12, 2002

(51) Int. Cl.7: G06F 1/32; G06F 9/46
(52) U.S. Cl.: 713/320; 718/104
(58) Field of Search: 713/320; 718/104

(56) References Cited

U.S. PATENT DOCUMENTS

5,615,370 A      3/1997   Nagai                 718/102
6,131,166 A     10/2000   Wong-Insley
6,141,762 A     10/2000   Nicol et al.
6,269,391 B1 *   7/2001   Gillespie             718/100
6,711,691 B1 *   3/2004   Howard et al.         713/300
6,732,139 B1 *   5/2004   Dillenberger et al.   718/102

OTHER PUBLICATIONS

Michael Kanellos, Transmeta-based servers boast power-saving chips, Jan. 25, 2001, pp 1-3, c/net News.com Tech news first.
Process and Thread Functions, Dec. 5, 2000, pp 1-4, Microsoft Corporation.
Server Design FAQ, Version 1.0, Jul. 2, 1999, pp 1-39, Intel Corporation and Microsoft Corporation.

* cited by examiner

Primary Examiner: Lynne H. Browne
Assistant Examiner: Eric Chang
(74) Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman LLP

(57) ABSTRACT

A method and apparatus for power management is disclosed. The invention reduces power consumption in multiprocessing systems by dynamically adjusting processor power based on system workload. Particularly, the method and apparatus determines the number of required processors based on the number of active threads and sets a processor affinity to run the active threads on the determined number of required processors, thereby allowing the free processors to enter a low-power state.

22 Claims, 3 Drawing Sheets
`
[Front-page drawing, flowchart 300. Block 310: DETERMINE NUMBER OF REQUIRED PROCESSORS BASED ON NUMBER OF ACTIVE THREADS. Block 320: SET PROCESSOR AFFINITY TO RUN ACTIVE THREADS ON THE DETERMINED NUMBER OF REQUIRED PROCESSORS. Block 330: TRANSITION THE FREE PROCESSORS TO ENTER A LOW-POWER STATE.]

Google Exhibit 1018
Google v. Valtrus

U.S. Patent    May 31, 2005    Sheet 1 of 3    US 6,901,522 B2

[FIG. 1: block diagram with labels 110, 120 (STORAGE DEVICE) and 140 (INPUT/OUTPUT DEVICES).]

[FIG. 2: layered diagram of system 200 with HARDWARE 210, OPERATING SYSTEM, and JAVA VIRTUAL MACHINE.]

Sheet 2 of 3

[FIG. 3: flowchart. Block 310: DETERMINE NUMBER OF REQUIRED PROCESSORS BASED ON NUMBER OF ACTIVE THREADS. Block 320: SET PROCESSOR AFFINITY TO RUN ACTIVE THREADS ON THE DETERMINED NUMBER OF REQUIRED PROCESSORS. Block 330: TRANSITION THE FREE PROCESSORS TO ENTER A LOW-POWER STATE.]

[FIG. 4: diagram of THREAD STATE (legend includes BLOCKED/IDLE) in the JVM and CPU STATE (FULL-POWER, LOW-POWER) in the OS, with the step SET CPU AFFINITY: AFFINITY(1..t) = CPU 1..k.]

Sheet 3 of 3

[FIG. 5: diagram of system 500 with labels JAVA API, JAVA APPLICATION SYSTEM, Individual Power Planes, Individual Voltage Regulators 560, and Individual Power Control Signals.]
`
`
`
SYSTEM AND METHOD FOR REDUCING POWER CONSUMPTION IN MULTIPROCESSOR SYSTEM

FIELD

The invention is related to processors and, more particularly, to power management in multi-processor systems.

GENERAL BACKGROUND

In recent years, advances in technology have led to more powerful computing devices. For example, a server used in business transaction processing or e-commerce may require simultaneous execution of a high volume of transactions. Accordingly, server systems are typically configured to process the highest expected volume of transactions or workload. Nevertheless, servers run, during much of the time, at a fraction of the peak capacity. Regardless of the workload, however, these systems generally run at nearly full power, thereby consuming great amounts of electrical power. Particularly, as millions surf the World Wide Web and organizations (including corporations and government) use the Internet to implement more of their business, internet servers form the core of e-business and tend to be massive consumers of power.
In addition, a system running at nearly full power dissipates large amounts of heat, requiring cooling fans which create high-decibel noise. The cooling and power distribution requirements also limit the number of server systems that can be stacked in “racks”. As a result, internet data centers are faced with increasing infrastructure requirements for space, cooling, and electrical power. Furthermore, for computing devices with a finite source of power such as portable computers, power consumption can limit the usage time as well as generate uncomfortable heat for users. Therefore, power management can be critical for any computing device.
Currently, some Operating Systems (OS) have built-in power management. For example, Advanced Configuration and Power Interface (ACPI) compliant hardware can support dynamic power management under the control of an OS, for example MICROSOFT WINDOWS® 2000. Based on the Central Processing Unit (CPU) usage, MICROSOFT WINDOWS® 2000 dynamically controls the power consumed. Under MICROSOFT WINDOWS® 2000, the OS defines “CPU usage” as “time not spent in the OS idle loop”. On ACPI systems, the OS transitions the CPU into a low-power state when idle. This reduces the CPU power consumption.
Nevertheless, in a Java application server environment, the ability of an OS to efficiently manage CPU power is limited. Particularly, as seen at the level of an OS, the Java application server software and the Java Virtual Machine (JVM) can appear to consume a large percentage of the CPU time, even under relatively light user load. As the OS has limited visibility into the actual CPU usage of the server system, the OS cannot efficiently manage power on its own with the existing mechanisms. For example, threads could be in a spin loop or doing housekeeping tasks, which do not require full CPU power. Moreover, when the JVM makes use of a user-level threads library, the OS’s visibility into the actual CPU usage is reduced further since the OS sees only a few active threads, while a large number of user threads are mapped on top of the OS native threads.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in detail with reference to the following drawings in which like reference numerals refer to like elements wherein:
`
FIG. 1 shows an exemplary system in which power management in accordance with the invention can be implemented;
FIG. 2 shows an exemplary Java application system environment;
FIG. 3 is a flowchart showing the power management in accordance with one embodiment of the invention;
FIG. 4 shows an exemplary result of applying the power management in accordance with one embodiment of the invention; and
FIG. 5 shows an exemplary system which allows fine-grained power control in accordance with one embodiment of the invention.
`
DETAILED DESCRIPTION

A method and apparatus for reducing power consumption is disclosed. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention. However, it will be understood by one of ordinary skill in the art that the invention may be practiced without these specific details. For example, although the invention will be described with reference to a Java application server environment, the same techniques and teachings can be applied to other types of systems, environments or platforms.
As used herein, a “thread” is a sequence of computing instructions that make up a program. Within a program, a single thread can be assigned to execute one task or multiple threads can be assigned to execute more than one task at a time. Typical multiprocessing systems can have four to eight (or more) individual processors sharing processing tasks. Therefore, by breaking down a process into multiple threads, different processors in the system can be operating on different portions or tasks of the process at the same time. Also, the term “computer readable medium” includes, but is not limited to, portable or fixed storage devices, optical storage devices, and any other memory devices capable of storing computer instructions and/or data. Here, “computer instructions” are software or firmware including data, codes, and programs that can be read and/or executed to perform certain tasks. In addition, the terms “processor” and “central processing unit” (CPU) refer to circuitry that controls various functions of a computer system and will be used interchangeably.
An example of a multiprocessing system 100 implementing the principles of the invention is shown in FIG. 1, including a plurality of processors 110, a storage device 120 and a bus 130. The processors 110 are coupled to the storage device 120 by the bus 130. A number of input/output devices 140 such as a keyboard, mouse and displays can also be coupled to the bus 130.
The storage device 120 stores computer programs such as an operating system (OS), application software, utility software, Java servlets or applets, and/or other instructions which are executed by the processors 110. An OS manages the basic operations of a system such as determining how and in what order assigned tasks are to be handled, managing the flow of information into and out of the processor(s), receiving inputs from a keyboard and sending outputs to a display. Here, the system 100 and an OS form a platform.
Application software runs on top of the OS and performs a specific task for a user using services offered by the OS. Application software is usually designed to run on a particular OS because various operating systems operate in different ways. However, Java application software is generally platform independent and can be run on different platforms without alteration.
Java is an object-oriented language compiled into a format called bytecode. The Java bytecodes are designed to be executed on a Java Virtual Machine (JVM). The JVM is not an actual hardware platform, but is a low-level software emulator that can be implemented on many different computer processor architectures and under many different operating systems. FIG. 2 shows an exemplary Java application system 200 including a JVM. The system 200 includes computer hardware 210 controlled by OS 220 and a JVM 230 for running Java programs 290. The JVM 230 running on the system 200 relies on services from the underlying OS 220 and the computer hardware 210.
FIG. 3 is a flowchart showing power management of a Java application system environment in accordance with one embodiment of the invention. The JVM periodically monitors the state of Java threads to determine the number of threads in active processing and the number of threads in a “blocked” or “idle” state. For example, a web server may have an allocated pool of 50 worker threads for processing Hypertext Transfer Protocol (HTTP) connection requests. However, at light load, a few of these threads will actually be processing requests while the other remaining threads will be blocked waiting for client connection attempts. In some cases, the states of threads may be seen as blocked by the JVM but not by the OS. For example, threads which are waiting to acquire a required synchronization or mutex object may be considered “blocked” by the JVM. However, the same threads may simply be considered “running” from the OS viewpoint.
Based on the number of active threads, the JVM then determines the number of required processors (block 310) in order to efficiently perform the tasks of the system. Particularly, the JVM determines the total number “n” of processors in the system. Here, the JVM can determine the number of processors through an OS Application Programming Interface (API) call. The JVM then determines a ratio of active threads to processors required for good performance, depending on the type of processing. For example, if the threads are mostly doing Input/Output (I/O) or other high-latency tasks, a higher ratio of threads to processors is used. On the other hand, if threads are mostly doing CPU-intensive processing and less I/O, a lower ratio of threads to processors is used, for example, 1 thread per CPU.
Based on the determined ratio, the JVM determines the number “k” of required processors out of the total number n of processors, wherein k is determined as follows:

k = (number of active threads) / (ratio of active threads to processors)    [Equation 1]

The remaining number, i.e. (n-k), of processors can be transitioned into low-power states. For example, if there are 15 active threads and the optimum ratio of active threads to processors is 3:1, k=15/3 and five processors are required to run the system tasks. Assuming an 8-CPU system, three CPUs can be placed into a low-power state.
Upon determining the number of required processors, the JVM makes a system call to the OS to set a processor affinity of the entire Java thread pool (block 320), including the JVM’s own threads. A processor affinity means forcing threads to run on a specific subset of processors and is set as follows:

Affinity(Thread 1 ... t) = processors 1 ... k, 1 ≤ k < n    [Equation 2]

Namely, equation 2 forces the entire pool of threads 1 to t to run on processors 1 to k out of n processors. For example, in MICROSOFT WINDOWS® 2000, the API call “SetProcessAffinityMask” can be used to set the processor affinity.
Accordingly, the OS assigns the desired processor affinities to the Java threads and causes all of Java to run on k processors, leaving (n-k) processors to run the OS idle loop. Therefore, the (n-k) CPUs enter the low-power state (block 330). Here, the CPUs can enter the low-power states using a typical OS Advanced Configuration and Power Interface (ACPI) mechanism. FIG. 4 shows an exemplary result of the present power management on a 4-CPU system. Prior to power management, CPU 1 to CPU 4 are in full-power state to run active and blocked/idle threads. By setting the appropriate processor affinity, the active and blocked/idle threads are run on CPU 1 and CPU 2 while CPU 3 and CPU 4 are transitioned into low-power states.
The reverse procedure is used when the JVM determines that the system load has increased. This requires another call into the OS to set the processor affinities. As large-scale changes to the system workload tend to occur gradually in servers running enterprise or e-business applications, the performance overhead of the above procedure is expected to be small. In addition, the procedure described above is a minimal implementation of the invention. Because the OS has various services which periodically will run on the (n-k) processors, fine-grained hardware/software support for processor power management can further improve the performance of the system. In such systems, the procedure is extended as follows.
If an OS API is available to set an individual CPU’s power state, the JVM can use this API to specifically request the OS to transition (n-k) CPUs into deep sleep and/or turn off associated cooling devices such as fans. If a Java API allows the JVM to expose the above OS API to Java applications, the Java application software can use the Java API to achieve the same end results.
FIG. 5 shows one embodiment of a fine-grained power control mechanism in a Java application server environment 500. The system 500 includes a JVM 510 on top of an underlying operating system 520 and computer hardware 530. The computer hardware 530 includes a plurality of CPUs 540 coupled to a chipset 550 and individual voltage regulators 560 for each of the CPUs 540. Using the chipset 550 and the voltage regulators 560, separate power control signals can be used to transition particular CPUs into a “deep sleep” state and/or turn off associated cooling devices. Therefore, power states of individual CPUs 540 can be controlled to achieve fine-grained power control.
One particular application of the power management in accordance with the present invention is in server systems, which currently lack support for fine-grained power control of individual CPUs. Server chipsets connect a single “stop-clock” output to all the CPUs’ input pins, thereby making it impossible to selectively throttle a particular CPU. Also, there is currently no OS API that allows a server application to inform the OS that it no longer needs to use a certain number of CPUs so that the OS can transition those CPUs into a deep sleep state. However, the power management in accordance with the invention allows fine-grained power control and can be implemented in a platform that follows the ACPI standard. Therefore, multiprocessor systems can provide performance when needed, for example, performance on demand by dynamically bringing more CPUs on-line to meet increased server workload. On the other hand, CPU power consumption can be scaled back depending on the server workload, thereby saving power at low system utilization.
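The sizing rule of Equation 1 and the processor set of Equation 2 can be illustrated with a minimal Python sketch. The function names, the rounding up of a fractional result, and the clamping of k to the range 1 to n are assumptions made for illustration only; the description itself specifies just the division and the worked example of 15 active threads at a 3:1 ratio on an 8-CPU system.

```python
import math


def required_processors(active_threads, threads_per_processor, total_processors):
    """Equation 1: k = active threads / ratio of active threads to processors.

    Assumption (not stated in the description): a fractional result is
    rounded up, and k is kept between 1 and the total number n.
    """
    if threads_per_processor <= 0:
        raise ValueError("ratio of threads to processors must be positive")
    if total_processors < 1:
        raise ValueError("system must have at least one processor")
    k = math.ceil(active_threads / threads_per_processor)
    return max(1, min(k, total_processors))


def affinity_mask(k):
    """Bitmask with the k lowest bits set, i.e. processors 1..k of Equation 2.

    Bitmasks of this shape are the form taken by affinity calls such as
    SetProcessAffinityMask; the helper itself is hypothetical.
    """
    return (1 << k) - 1


# Worked example from the description: 15 active threads, 3:1 ratio, 8 CPUs.
n = 8
k = required_processors(15, 3, n)
free = n - k  # processors that can be transitioned to a low-power state
print(k, free, bin(affinity_mask(k)))
```

With the figures from the description, the sketch yields k = 5 required processors and 3 processors that are free to enter a low-power state.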
Accordingly, implementation of the invention results in extensive power savings. The low-power states such as deep sleep can save significant CPU power, while the associated cooling systems can be turned off, further reducing power consumption and also reducing the noise level. Furthermore, the invention addresses multiprocessor servers in a Java application server environment.
In addition, the technique above can be implemented in run-time environments other than Java application systems such as MICROSOFT® .NET. The invention can be implemented in any system with a layer of software above the OS that has visibility into the processing needs of the application system. Also, the technique can be applied to different operating systems including MS Windows and Linux. Furthermore, the technique can be modified to cover a broader range of systems and software (i.e., the non-Java case).
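As an illustration of the Linux case, the affinity step could be sketched with the `sched_setaffinity` system call, which Python exposes as `os.sched_setaffinity`. This is only a sketch under assumptions: the helper names `cpus_for` and `restrict_to` are hypothetical, choosing the lowest-numbered CPUs is an arbitrary policy, and the call is skipped on platforms where Python does not provide it.

```python
import os


def cpus_for(k, available):
    """Pick the k lowest-numbered CPUs from the available set (assumed policy)."""
    if k < 1:
        raise ValueError("at least one processor must remain in use")
    return set(sorted(available)[:k])


def restrict_to(k, pid=0):
    """Restrict the given task (0 = the caller) to k CPUs, if supported.

    Returns the CPU set that was applied, or None when the platform does
    not expose sched_setaffinity (for example, outside Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    target = cpus_for(k, os.sched_getaffinity(pid))
    os.sched_setaffinity(pid, target)
    return target


# Selecting 2 of 4 CPUs leaves the other two free to idle.
print(cpus_for(2, {0, 1, 2, 3}))
```

On Linux the affinity is tracked per thread, so a process with threads that already exist may need the same call for each of them. Restricting affinity only frees the remaining CPUs; entering a low-power state is still left to the OS idle mechanism, as described above.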
Namely, a “watchdog” thread can be implemented within an OS, whose function is to monitor the states and the processing nature of the other threads in the system. The watchdog thread would perform a calculation similar to that of the JVM above and make a call to the OS to request that (n-k) CPUs be put into a low-power state. For example, the watchdog thread functionality can be implemented in the Windows native threads library, the Linux native threads library, and a user (green) threads library that may sit on top of the OS native libraries.
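A watchdog of this kind might be sketched as follows. The class name, the `count_active` and `on_change` callbacks, the polling interval, and the rounding and clamping of k are all assumptions for illustration; the description does not prescribe how thread states are sampled or how the OS request is issued.

```python
import math
import threading


class Watchdog:
    """Periodically recomputes k and reports changes (hypothetical sketch)."""

    def __init__(self, count_active, total_processors, threads_per_processor, on_change):
        self._count_active = count_active  # callable returning the active thread count
        self._n = total_processors
        self._ratio = threads_per_processor
        self._on_change = on_change        # called as on_change(k, n - k)
        self._last_k = None
        self._stop = threading.Event()

    def poll_once(self):
        """Sample the load once; notify only when k differs from the last value."""
        active = self._count_active()
        k = max(1, min(self._n, math.ceil(active / self._ratio)))
        if k != self._last_k:
            self._last_k = k
            self._on_change(k, self._n - k)
        return k

    def run(self, interval_s=1.0):
        """Poll until stop() is called; intended to run in its own thread."""
        while not self._stop.wait(interval_s):
            self.poll_once()

    def stop(self):
        self._stop.set()


# Simulated load samples: 15, 15, then 3 active threads on an 8-CPU system.
samples = iter([15, 15, 3])
events = []
dog = Watchdog(lambda: next(samples), 8, 3, lambda k, free: events.append((k, free)))
for _ in range(3):
    dog.poll_once()
print(events)
```

In a running system the loop would be started with something like `threading.Thread(target=dog.run, daemon=True).start()`, and `on_change` would wrap the affinity and power-state requests. Notifying only on a change of k keeps the number of OS calls small, in line with the observation that server workloads tend to shift gradually.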
As discussed above, the present power management in accordance with the invention allows a selected number of processors, based on the amount of workload, to enter low-power states, thereby reducing the overall power consumption. As a result, the system level and the CPU level power consumption would significantly fall at lower workload levels. Therefore, the systems can be efficiently operated at reduced costs, even with power supply constraints.
The foregoing embodiments are merely exemplary and are not to be construed as limiting the present invention. The present teachings can be readily applied to other types of apparatuses. The description of the present invention is intended to be illustrative, and not to limit the scope of the claims. Many alternatives, modifications, and variations will be apparent to those skilled in the art.
What is claimed is:
`
1. A method comprising:
determining a number of required processors in a system based on a number of active threads, comprises
determining a ratio of active threads to the number of required processors to process the active threads, the ratio being based on a type of processing associated with the active threads, and
dividing the number of active threads by the determined ratio;
setting processor affinity to run the active threads on k number of processors, the k number of processors determined as being the number of required processors; and
transitioning processors other than the k number of processors to enter a low-power state.
2. The method of claim 1, wherein determining the ratio based on the type of processing.
3. The method of claim 1, wherein transitioning processors other than the k number of processors into a deep sleep state.
4. The method of claim 1, further comprising turning off unnecessary periodic services running on processors other than the k number of processors.
5. The method of claim 1, wherein the system is a Java application server.
6. The method of claim 1, further comprising assigning a watchdog thread on an operating system of the system to monitor the state of the system threads to determine the number of required processors.
7. A system comprising:
an operating system; and
a virtual machine to determine a number of required processors in the system based on a number of active threads by determining a ratio of active threads to the number of required processors to process the active thread and dividing the number of active thread by the determined ratio, the ratio being based on a type of processing associated with the active threads, the virtual machine to cause the operating system to at least
set processor affinity to run the active threads on k number of processors, the k number of processor determined as the number of required processors, and
transition processors other than the k number of processors to enter a low-power state.
8. The system of claim 7, further comprising voltage regulators corresponding to each processor, the voltage regulator to allow separate power state control of the plurality of processors.
9. The system of claim 8, wherein the virtual machine to further cause the operating system to transition processors other than the k number of processor into deep sleep.
10. The system of claim 8, wherein the virtual machine to further cause the operating system to turn off unnecessary periodic services running on processors other than the k number of processors.
11. The system of claim 7, wherein the virtual machine is a Java virtual machine.
12. A method comprising:
assigning a first thread to monitor the states of other threads in a system, the first thread to determine the number of active threads in the system;
determining a number of required processors in the system based on a number of active threads, comprises:
determining a ratio of active threads to the number of required processors to process the active threads; based on the type of processing, and
dividing the number of active threads by the determined ratio;
setting processor affinity to run the active threads on k number of processors, the k number of processor determined as the number of required processors; and
transitioning processors other than the k number of processors to enter a low-power state.
13. The method of claim 12, wherein transitioning processors other than the k number of processors into deep sleep.
14. A system comprising:
means for determining a number of required processors in a system based on a number of active threads by determining a ratio of active threads to the number of required processors to process the active threads, the ratio being based on a type of processing associated with the active threads, and dividing the number of active threads by the determined ratio to determine the number of required processors;
means for setting processor affinity to run the active threads on k number of processors, the k number of processor determined as the number of required processors; and
means for transitioning processors other than the k number of processors to enter a low-power state.
15. The system of claim 14, further comprising means for transitioning processors other than the k number of processors into deep sleep.
16. The system of claim 14, further comprising means for turning off unnecessary periodic services running on processors other than the k number of processors.
17. A system comprising:
a Java virtual machine to determine a number of required processors in a system based on a number of active threads, the Java virtual machine determines a ratio of active threads to the number of required processors to process the active threads, the ratio being based on a type of processing associated with the active threads, and divides the number of active threads by the determined ratio to determine the number of required processors; and
an operation system caused by the Java virtual machine to at least:
set processor affinity to run the active threads on k number of processors, the k number of processor determined as the number of required processors, and
transition processors other than the k number of processors to enter a low-power state.
18. The system of claim 17, wherein the Java virtual machine further causes the operation system to transition processors other than the k number of processors into a deep sleep state.
19. The system of claim 17, wherein the Java virtual machine further causes the operation system to turn off unnecessary periodic services running on processors other than the k number of processors.
20. A program loaded in a computer readable medium comprising:
a first group of computer instructions to determine a number of required processors in a system based on a number of active threads and a type of processing associated with the active threads;
a second group of computer instructions to set processor affinity to run the active threads on k number of processors, the k number of processor determined as the number of required processors; and
a third group of computer instructions to transition processors other than the k number of processors to enter a low-power state.
21. The program of claim 20, further comprises computer instructions to transition processors other than the k number of processors into deep sleep.
22. The program of claim 20, further comprising computer instructions to turn off unnecessary periodic services running on processors other than the k number of processors.
`
`