Power Tuning HPC Jobs on Power-Constrained Systems

Authors:

Barry Rountree

PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation

Pages 179 - 191

https://doi.org/10.1145/2967938.2967961

Published: 11 September 2016

Abstract

As we approach the exascale era, power has become a primary bottleneck. The US Department of Energy has set a power constraint of 20 MW on each exascale machine. To achieve one exaflop under this constraint, power must be used intelligently to maximize performance.

Most production-level parallel applications that run on a supercomputer are tightly coupled. A naïve approach to enforcing a power constraint on a parallel job would be to divide the job's power budget uniformly across all of its processors. However, previous work has shown that a power-capped job suffers from performance variation across otherwise identical processors, leading to sub-optimal overall performance. We propose a two-level, hierarchical, variation-aware approach to managing power at the machine level. At the macro level, PPartition partitions a machine's power budget across jobs, assigning a power budget to each job running on the system such that the machine never exceeds its power budget. At the micro level, PTune makes job-centric decisions by taking the performance variation into account. For every moldable job, PTune determines the optimal number of processors, the selection of processors, and the distribution of the job's power budget across them, with the goal of maximizing the job's performance under its power budget.
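To make the micro-level idea concrete, the following is a minimal Python sketch of variation-aware power distribution for a single job. It is not the PTune algorithm, which the abstract does not specify. It assumes a hypothetical linear power-to-frequency model per processor, perfect scaling (job rate = number of processors times the slowest frequency), and made-up numbers. The names `Proc`, `equalize`, `tune`, and `POOL` are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    """Hypothetical processor model: frequency (GHz) = slope * watts + offset."""
    name: str
    slope: float          # GHz gained per watt (assumed linear)
    offset: float         # differs per part to mimic manufacturing variation
    p_min: float = 50.0   # lowest accepted power cap in watts (made up)
    p_max: float = 100.0  # highest useful power cap in watts (made up)

    def freq(self, watts):
        return self.slope * watts + self.offset

    def watts_for(self, freq):
        return (freq - self.offset) / self.slope


def uniform_rate(procs, budget):
    """Naive baseline: equal caps; the slowest processor gates every rank.

    Assumes the equal share lies inside every processor's cap range.
    """
    share = budget / len(procs)
    return len(procs) * min(p.freq(share) for p in procs)


def equalize(procs, budget):
    """Highest common frequency the budget can pay for, or None if infeasible."""
    lo = max(p.freq(p.p_min) for p in procs)
    hi = min(p.freq(p.p_max) for p in procs)
    # Solve sum((f - offset_i) / slope_i) == budget for f (closed form).
    f = (budget + sum(p.offset / p.slope for p in procs)) / sum(
        1.0 / p.slope for p in procs
    )
    f = min(f, hi)  # extra budget beyond the caps is left unused
    if f < lo:
        # Some processor would need a cap below p_min; this sketch
        # simply rejects the subset instead of handling that case.
        return None
    return f, {p.name: p.watts_for(f) for p in procs}


def tune(pool, budget, ref_freq=2.0):
    """Try the n cheapest processors for every n; keep the best job rate."""
    ranked = sorted(pool, key=lambda p: p.watts_for(ref_freq))
    best = (0.0, 0, {})
    for n in range(1, len(ranked) + 1):
        result = equalize(ranked[:n], budget)
        if result is None:
            continue
        f, watts = result
        if n * f > best[0]:
            best = (n * f, n, watts)
    return best


# Four nominally identical parts with different (hypothetical) efficiency.
POOL = [
    Proc("A", 0.02, 0.6),
    Proc("B", 0.02, 0.5),
    Proc("C", 0.02, 0.4),
    Proc("D", 0.02, 0.2),
]

if __name__ == "__main__":
    rate, n, watts = tune(POOL, 280.0)
    print(f"tuned: {n} processors, rate {rate:.2f}, caps {watts}")
    print(f"uniform: rate {uniform_rate(POOL, 280.0):.2f}")
```

With these invented numbers and a 280 W job budget, the uniform split is held back by the least efficient part, while the equalizing split gives that part the largest cap and raises the common frequency. Sorting by power needed at a reference frequency is a heuristic chosen for brevity; a real tuner would need measured per-processor power and performance data.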

Experiments show that, at the micro level, PTune achieves a performance improvement of up to 29% compared to a naïve approach. PTune does not lead to any performance degradation, yet frees up almost 40% of the processors for the same performance as that of the naïve approach under a hard power bound. At the macro level, PPartition achieves a throughput improvement of 5-35% compared to uniform power distribution.
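The macro-level comparison against uniform power distribution can be illustrated with a small Python sketch. It uses a greedy marginal-gain heuristic as a stand-in, since the abstract does not describe how PPartition actually decides. The job names, the square-root throughput curves, the power limits, and the 50 W step are all assumptions made for illustration.

```python
import math

# Hypothetical jobs: (min watts, max watts, throughput as a function of watts).
# The concave sqrt curves are invented; real curves would be measured.
JOBS = {
    "cpu_bound": (100.0, 400.0, lambda w: 2.0 * math.sqrt(w)),
    "mem_bound": (100.0, 400.0, lambda w: math.sqrt(w)),
}


def throughput(jobs, alloc):
    """Total machine throughput for a given per-job power allocation."""
    return sum(perf(alloc[name]) for name, (_, _, perf) in jobs.items())


def uniform_partition(jobs, machine_budget):
    """Baseline: every job receives the same share of the machine budget."""
    share = machine_budget / len(jobs)
    return {name: share for name in jobs}


def greedy_partition(jobs, machine_budget, step=50.0):
    """Give each job its minimum, then hand out `step` watts at a time
    to whichever job gains the most throughput from the next increment."""
    alloc = {name: lo for name, (lo, _, _) in jobs.items()}
    remaining = machine_budget - sum(alloc.values())
    if remaining < 0:
        raise ValueError("machine budget cannot cover every job's minimum")
    while remaining >= step:
        best_name, best_gain = None, 0.0
        for name, (_, hi, perf) in jobs.items():
            if alloc[name] + step > hi:
                continue  # job is already at its maximum useful power
            gain = perf(alloc[name] + step) - perf(alloc[name])
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no job can use more power
        alloc[best_name] += step
        remaining -= step
    return alloc


if __name__ == "__main__":
    budget = 500.0
    greedy = greedy_partition(JOBS, budget)
    uniform = uniform_partition(JOBS, budget)
    print("greedy :", greedy, round(throughput(JOBS, greedy), 2))
    print("uniform:", uniform, round(throughput(JOBS, uniform), 2))
```

The allocation never exceeds the machine budget because power is only handed out while some remains. How much a non-uniform split helps depends entirely on how differently the jobs respond to power, which is why the curves here should be read as placeholders.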

Index Terms

Power Tuning HPC Jobs on Power-Constrained Systems
1. Hardware
  1. Power and energy
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Power management

Information & Contributors

Information

Published In

PACT '16: Proceedings of the 2016 International Conference on Parallel Architectures and Compilation

September 2016

474 pages

ISBN:9781450341219

DOI:10.1145/2967938

General Chairs:
Ayal Zaks (Intel, Israel)
Bilha Mendelson (Optitura, Israel)

Program Chairs:
Lawrence Rauchwerger (Texas A&M University, USA)
Wen-mei W. Hwu (University of Illinois at Urbana-Champaign, USA)

Copyright © 2016 ACM.

© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Sponsors

IFIP WG 10.3: IFIP WG 10.3
IEEE TCCA: IEEE Computer Society Technical Committee on Computer Architecture
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE CS TCPP: IEEE Computer Society Technical Committee on Parallel Processing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

PACT '16

PACT '16: International Conference on Parallel Architectures and Compilation

September 11 - 15, 2016

Haifa, Israel

Acceptance Rates

PACT '16 Paper Acceptance Rate 31 of 119 submissions, 26%;

Overall Acceptance Rate 121 of 471 submissions, 26%

Bibliometrics & Citations

Article Metrics

Total Citations: 53
Total Downloads: 743
Downloads (Last 12 months): 98
Downloads (Last 6 weeks): 18

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Solórzano A, Sato K, Yamamoto K, Shoji F, Brandt J, Schwaller B, Walton S, Green J, Tiwari D (2024). Toward Sustainable HPC: In-Production Deployment of Incentive-Based Power Efficiency Mechanism on the Fugaku Supercomputer. SC24: International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-16. Online publication date: 17-Nov-2024.
https://doi.org/10.1109/SC41406.2024.00030

Kodama Y, Kondo M, Sato M (2023). Evaluation of Performance and Power Consumption on Supercomputer Fugaku Using SPEC HPC Benchmarks. IEICE Transactions on Electronics, E106.C:6, pages 303-311. Online publication date: 1-Jun-2023.
https://doi.org/10.1587/transele.2022LHP0001

Zhang A, Bhalachandra S, Deng S, Zhao Z (2023). NPAT - A Power Analysis Tool at NERSC. Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, pages 712-719. Online publication date: 12-Nov-2023.
https://dl.acm.org/doi/10.1145/3624062.3624149

Ding J, Hoffmann H, Mohror K, Arnold D, Badia R (2023). DPS: Adaptive Power Management for Overprovisioned Systems. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-14. Online publication date: 12-Nov-2023.
https://dl.acm.org/doi/10.1145/3581784.3607091

Fan K, D'Antonio M, Carpentieri L, Cosenza B, Ficarelli F, Cesarini D, Mohror K, Arnold D, Badia R (2023). SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-13. Online publication date: 12-Nov-2023.
https://dl.acm.org/doi/10.1145/3581784.3607055

Manumachu R, Khaleghzadeh H, Lastovetsky A (2023). Acceleration of Bi-Objective Optimization of Data-Parallel Applications for Performance and Energy on Heterogeneous Hybrid Platforms. IEEE Access, 11, pages 27226-27245. Online publication date: 2023.
https://doi.org/10.1109/ACCESS.2023.3258684

Costero L, Igual F, Olcoz K (2023). Dynamic power budget redistribution under a power cap on multi-application environments. Sustainable Computing: Informatics and Systems, 38, article 100865. Online publication date: Apr-2023.
https://doi.org/10.1016/j.suscom.2023.100865

Ou Z, Chen J, Sun Y, Xu T, Jiang G, Tan Z, Qi X (2023). AOA: Adaptive Overclocking Algorithm on CPU-GPU Heterogeneous Platforms. Algorithms and Architectures for Parallel Processing, pages 253-272. Online publication date: 11-Jan-2023.
https://doi.org/10.1007/978-3-031-22677-9_14

Srivastava T, Zhang H, Hoffmann H (2022). Penelope: Peer-to-peer Power Management. Proceedings of the 51st International Conference on Parallel Processing, pages 1-11. Online publication date: 29-Aug-2022.
https://dl.acm.org/doi/10.1145/3545008.3545047

Khaleghzadeh H, Manumachu R, Lastovetsky A (2022). A Novel Algorithm for Bi-objective Performance-Energy Optimization of Applications with Continuous Performance and Linear Energy Profiles on Heterogeneous HPC Platforms. Euro-Par 2021: Parallel Processing Workshops, pages 166-178. Online publication date: 9-Jun-2022.
https://doi.org/10.1007/978-3-031-06156-1_14