DOI: 10.5555/2132325.2132464
Research article

Identifying the optimal energy-efficient operating points of parallel workloads

Published: 07 November 2011

Abstract

As the number of cores per processor grows, there is a strong incentive to develop parallel workloads to take advantage of the hardware parallelism. In comparison to single-threaded applications, parallel workloads are more complex to characterize due to thread interactions and resource stalls. This paper presents an accurate and scalable method for determining the optimal system operating points (i.e., number of threads and DVFS settings) at runtime for parallel workloads under a set of objective functions and constraints that optimize for energy efficiency in multi-core processors. Using an extensive training data set gathered for a wide range of parallel workloads on a commercial multi-core system, we construct multinomial logistic regression (MLR) models that estimate the optimal system settings as a function of workload characteristics. We use L1-regularization to automatically determine the relevant workload metrics for energy optimization. At runtime, our technique determines the optimal number of threads and the DVFS setting with negligible overhead. Our experiments demonstrate that our method outperforms prior techniques with up to 51% improved decision accuracy. This translates to up to 10.6% average improvement in energy-performance operation, with a maximum improvement of 30.9%. Our technique also demonstrates superior scalability as the number of potential system operating points increases.
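The modeling step described in the abstract can be illustrated with a small sketch: a multinomial logistic regression that maps workload metrics to one of several operating points, trained with an L1 penalty so that uninformative metrics receive zero weight. Everything concrete here is an assumption made for illustration and does not come from the paper: the three operating points, the metric names, the synthetic training data, the hyperparameters, and the choice of proximal gradient descent (soft-thresholding) as the solver, which the abstract does not specify.

```python
import math
from itertools import product

# Hypothetical operating points: (number of threads, frequency in GHz).
OPERATING_POINTS = [(1, 2.4), (4, 1.6), (4, 2.4)]
# Hypothetical standardized workload metrics; the third carries no signal by construction.
FEATURES = ["memory_intensity", "parallel_scalability", "uninformative_metric"]

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_threshold(w, t):
    # Proximal step for the L1 penalty: shrink w toward zero by t, clamping at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def class_scores(W, b, x):
    return [sum(wj * xj for wj, xj in zip(W[k], x)) + b[k] for k in range(len(W))]

def train_mlr_l1(X, y, n_classes, lam=0.01, lr=0.3, epochs=300):
    """Fit softmax regression with an L1 penalty by proximal gradient descent."""
    n, d = len(X), len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * d for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                err = (probs[k] - (1.0 if k == label else 0.0)) / n
                grad_b[k] += err
                for j in range(d):
                    grad_W[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * grad_b[k]  # the bias term is not penalized
            for j in range(d):
                W[k][j] = soft_threshold(W[k][j] - lr * grad_W[k][j], lr * lam)
    return W, b

def predict(W, b, x):
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=lambda k: scores[k])

# Synthetic training set (made up): one cluster of metric values per operating
# point, with the third metric balanced so it is uncorrelated with the label.
CENTERS = [(-1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
X, y = [], []
for label, (cx, cy) in enumerate(CENTERS):
    for dx, dy, z in product((-0.2, 0.0, 0.2), (-0.2, 0.0, 0.2), (-0.5, 0.5)):
        X.append([cx + dx, cy + dy, z])
        y.append(label)

W, b = train_mlr_l1(X, y, n_classes=len(OPERATING_POINTS))
accuracy = sum(predict(W, b, x) == t for x, t in zip(X, y)) / len(X)
selected = [name for j, name in enumerate(FEATURES)
            if any(W[k][j] != 0.0 for k in range(len(W)))]
threads, freq_ghz = OPERATING_POINTS[predict(W, b, [0.9, 1.1, 0.0])]
print(f"training accuracy: {accuracy:.2f}")
print("metrics with nonzero weights:", selected)
print(f"predicted setting: {threads} threads at {freq_ghz} GHz")
```

At runtime, a predictor of this shape only needs one dot product per candidate operating point, which is consistent with the negligible overhead the abstract reports. A real deployment would train on measured performance-counter data and label each sample with the operating point that optimizes the chosen objective under its constraints.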



Published In

ICCAD '11: Proceedings of the International Conference on Computer-Aided Design
November 2011
844 pages
ISBN: 9781457713989
General Chair: Joel Phillips
Program Chairs: Alan J. Hu, Helmut Graeb

Publisher

IEEE Press

Conference

ICCAD '11

Acceptance Rates

Overall Acceptance Rate: 457 of 1,762 submissions, 26%


Cited By

• (2020) Exploiting Dynamism in HPC Applications to Optimize Energy-Efficiency. Workshop Proceedings of the 49th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3409390.3409399. Online publication date: 17-Aug-2020.
• (2016) Unsupervised power modeling of co-allocated workloads for energy efficiency in data centers. Proceedings of the 2016 Conference on Design, Automation & Test in Europe, pp. 1345-1350. DOI: 10.5555/2971808.2972121. Online publication date: 14-Mar-2016.
• (2016) A Reconfiguration Algorithm for Power-Aware Parallel Applications. ACM Transactions on Architecture and Code Optimization 13:4, pp. 1-25. DOI: 10.1145/3004054. Online publication date: 2-Dec-2016.
• (2016) Minimization of Xeon Phi Core Use with Negligible Execution Time Impact. Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale, pp. 1-8. DOI: 10.1145/2949550.2949581. Online publication date: 17-Jul-2016.
• (2016) Adaptive and Hierarchical Runtime Manager for Energy-Aware Thermal Management of Embedded Systems. ACM Transactions on Embedded Computing Systems 15:2, pp. 1-25. DOI: 10.1145/2834120. Online publication date: 29-Jan-2016.
• (2015) Workload uncertainty characterization and adaptive frequency scaling for energy minimization of embedded systems. Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, pp. 43-48. DOI: 10.5555/2755753.2755764. Online publication date: 9-Mar-2015.
• (2013) Techniques for energy-efficient power budgeting in data centers. Proceedings of the 50th Annual Design Automation Conference, pp. 1-7. DOI: 10.1145/2463209.2488951. Online publication date: 29-May-2013.
• (2012) Optimizing energy efficiency of 3-D multicore systems with stacked DRAM under power and thermal constraints. Proceedings of the 49th Annual Design Automation Conference, pp. 648-655. DOI: 10.1145/2228360.2228477. Online publication date: 3-Jun-2012.
