DOI: 10.1145/2742854.2742885
research-article

Chrysso: an integrated power manager for constrained many-core processors

Published: 06 May 2015

Abstract

Modern microprocessors are increasingly power-constrained because supply voltage scaling has slowed (the end of Dennard scaling) while transistor density continues to scale (Moore's Law). Existing many-core power management techniques, such as chip-wide or per-core DVFS and core and cache adaptation, are quite effective in isolation at moderate to high power budgets. For future many-core chips, however, these techniques do not scale well to large core counts, small time slices and stringent power budgets. A new solution is needed that combines different adaptation and reconfiguration techniques.
In this paper, we present Chrysso, an integrated, scalable and low-overhead power management framework. Chrysso follows a three-step process: it leverages simple analytical performance and power models, prunes the search space early using local Pareto front generation, and then performs global utility-based power allocation. This enables scalable and effective dynamic adaptation of many-core processors at short time scales along multiple axes, including core, cache and per-core DVFS adaptations. By integrating multiple power management techniques into a common methodology, Chrysso provides significant performance improvements over isolated mechanisms within a given power budget without power-gating cores. On a 64-core system, Chrysso improves system throughput by 1.6× and 1.9× over core-gating at stringent power envelopes for multi-program (SPEC) and multi-threaded (PARSEC) workloads, respectively.
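The abstract does not give the algorithmic details, so the following Python sketch is only an illustration of the general idea behind the last two steps, not the paper's implementation. It assumes that each core already has a small table of candidate configurations with modeled power and performance values (hard-coded here as stand-ins for the analytical models), prunes each table to its local Pareto front, and then uses a simple greedy marginal-utility rule to spend a global power budget. The names `Config`, `pareto_front` and `allocate`, the greedy rule, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str     # hypothetical label for a core/cache/DVFS setting
    power: float  # modeled power (arbitrary units), assumed given
    perf: float   # modeled performance (arbitrary units), assumed given


def pareto_front(configs):
    """Return the non-dominated configs, sorted by increasing power.

    A config is kept only if it offers more performance than every
    config that costs the same or less power.
    """
    front = []
    for c in sorted(configs, key=lambda c: (c.power, -c.perf)):
        if not front or c.perf > front[-1].perf:
            front.append(c)
    return front


def allocate(fronts, budget):
    """Greedy global allocation over per-core Pareto fronts (an assumption).

    Every core starts at its lowest-power point. The core whose next
    front point gives the best performance gain per unit of extra power
    is upgraded repeatedly, as long as the upgrade fits in the budget.
    """
    choice = [0] * len(fronts)
    used = sum(front[0].power for front in fronts)
    if used > budget:
        raise ValueError("budget is below the minimum-power configuration")
    while True:
        best = None  # (utility, core index, extra power)
        for core, front in enumerate(fronts):
            i = choice[core]
            if i + 1 < len(front):
                extra_power = front[i + 1].power - front[i].power
                extra_perf = front[i + 1].perf - front[i].perf
                if used + extra_power <= budget:
                    utility = extra_perf / extra_power
                    if best is None or utility > best[0]:
                        best = (utility, core, extra_power)
        if best is None:
            break
        _, core, extra_power = best
        choice[core] += 1
        used += extra_power
    return [front[i] for front, i in zip(fronts, choice)], used


# Made-up candidate tables for two cores.
core0 = [Config("A", 1.0, 1.0), Config("B", 2.0, 1.8),
         Config("C", 2.5, 1.7), Config("D", 3.0, 2.0)]
core1 = [Config("E", 1.0, 0.5), Config("F", 2.0, 0.9),
         Config("G", 3.0, 1.0)]

front0 = pareto_front(core0)  # "C" is dominated by "B" and is pruned
front1 = pareto_front(core1)
chosen, used = allocate([front0, front1], budget=4.0)
print([c.name for c in front0], [c.name for c in chosen], used)
```

Pruning first keeps the global step cheap: each core exposes only a short, power-sorted list of worthwhile points, so the allocator compares one candidate upgrade per core per iteration instead of searching the full cross product of configurations.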


      Published In

      CF '15: Proceedings of the 12th ACM International Conference on Computing Frontiers
      May 2015
      413 pages
      ISBN:9781450333580
      DOI:10.1145/2742854
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. analytical modeling
      2. many-core processor
      3. microarchitecture reconfiguration
      4. power management

      Qualifiers

      • Research-article

      Funding Sources

      • European Research Council under the European Community's 7th Framework Programme
      • Spanish Ministry of Economy and Competitiveness

      Conference

      CF'15
      CF'15: Computing Frontiers Conference
      May 18 - 21, 2015
      Ischia, Italy

      Acceptance Rates

      CF '15 Paper Acceptance Rate 33 of 96 submissions, 34%;
      Overall Acceptance Rate 273 of 785 submissions, 35%

      Cited By

      • (2021) Intelligent Adaptation of Hardware Knobs for Improving Performance and Power Consumption. IEEE Transactions on Computers 70:1 (1-16). DOI: 10.1109/TC.2020.2980230. Online publication date: 1-Jan-2021
      • (2020) CuttleSys: Data-Driven Resource Management for Interactive Services on Reconfigurable Multicores. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (650-664). DOI: 10.1109/MICRO50266.2020.00060. Online publication date: Oct-2020
      • (2019) Tangram. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (384-398). DOI: 10.1145/3352460.3358285. Online publication date: 12-Oct-2019
      • (2017) Using Application-Level Thread Progress Information to Manage Power and Performance. 2017 IEEE International Conference on Computer Design (ICCD) (501-508). DOI: 10.1109/ICCD.2017.87. Online publication date: Nov-2017
      • (2016) Shared resource aware scheduling on power-constrained tiled many-core processors. Proceedings of the ACM International Conference on Computing Frontiers (365-368). DOI: 10.1145/2903150.2903490. Online publication date: 16-May-2016
      • (2016) FastCap: An efficient and fair algorithm for power capping in many-core systems. 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) (57-68). DOI: 10.1109/ISPASS.2016.7482074. Online publication date: Apr-2016
