Abstract
Rising concern about the power consumption of large-scale computer systems has put a research focus on the corresponding measurement methods. Varying workload patterns and energy efficiency optimizations cause highly dynamic power consumption on today’s compute nodes, which is a challenge for every measurement infrastructure. We identify five partly contradictory requirements that characterize such infrastructures: temporal granularity, spatial granularity, well-defined accuracy, scalability, and cost. In two projects we push the boundaries for these criteria: a scalable measurement solution for hundreds of nodes at millisecond granularity that is tightly integrated into the HPC system, and a sophisticated single-node instrumentation that measures the power consumption of application events in the microsecond range. Both measurement solutions are calibrated, and their accuracy is carefully studied. We discuss scalable processing of the measurements for global monitoring in large-scale systems and use this data for energy efficiency analyses in combination with contextual information such as application performance trace data.
Acknowledgements
This work is supported in part by the German Research Foundation (DFG) in the Collaborative Research Center 912 “Highly Adaptive Energy-Efficient Computing”, the Bundesministerium für Bildung und Forschung via the research project Score-E (BMBF 01IH13001), and Bull/Atos in the joint project “High Definition Energy Efficiency Monitoring” (HDEEM). The authors would like to thank Robin Geyer for his contribution to the HDEEM verification and Mario Bielert for improvements to the paper layout.
Cite this article
Ilsche, T., Schöne, R., Schuchart, J. et al. Power measurement techniques for energy-efficient computing: reconciling scalability, resolution, and accuracy. SICS Softw.-Inensiv. Cyber-Phys. Syst. 34, 45–52 (2019). https://doi.org/10.1007/s00450-018-0392-9
DOI: https://doi.org/10.1007/s00450-018-0392-9