research-article
Open access

Sensible Energy Accounting with Abstract Metering for Multicore Systems

Published: 22 December 2015

Abstract

Chip multicore processors (CMPs) are the preferred processing platform across different domains such as data centers, real-time systems, and mobile devices. In all those domains, energy is arguably the most expensive resource in a computing system. Accurately quantifying energy usage in a multicore environment presents a challenge as well as an opportunity for optimization. Standard metering approaches are not capable of delivering consistent results with shared resources, since the same task with the same inputs may have different energy consumption based on the mix of co-running tasks. However, it is reasonable for data-center operators to charge on the basis of estimated energy usage rather than time since energy is more correlated with their actual cost.
This article introduces the concept of Sensible Energy Accounting (SEA). For a task running in a multicore system, SEA accurately estimates the energy the task would have consumed running in isolation with a given fraction of the CMP shared resources. We explain the potential benefits of SEA in different domains and describe two hardware techniques to implement it for a shared last-level cache and on-core resources in SMT processors. Moreover, with SEA, an energy-aware scheduler can find a highly efficient on-chip resource assignment, reducing total processor energy by up to 39% for a 4-core system.
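As an illustration only, the scheduling idea in the abstract can be sketched as a search over shared-cache partitions. The sketch assumes that an SEA-style mechanism has already produced, for each task, an estimate of the energy it would consume in isolation with a given number of last-level cache ways. The function name `best_partition`, the table layout, and the numbers are hypothetical and are not taken from the article; the article's own scheduler may work differently.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def best_partition(
    energy: Sequence[Dict[int, float]], total_ways: int
) -> Tuple[Tuple[int, ...], float]:
    """Exhaustively pick the cache-way split with the lowest total energy.

    energy[t][w] is the (assumed, externally supplied) estimate of the
    energy task t would consume in isolation when given w cache ways.
    """
    best_alloc, best_energy = None, float("inf")
    # Candidate way counts for each task, taken from its estimate table.
    choices = [sorted(table) for table in energy]
    for alloc in product(*choices):
        # Only consider splits that use exactly the available ways.
        if sum(alloc) != total_ways:
            continue
        total = sum(table[w] for table, w in zip(energy, alloc))
        if total < best_energy:
            best_alloc, best_energy = alloc, total
    if best_alloc is None:
        raise ValueError("no allocation uses exactly total_ways ways")
    return best_alloc, best_energy


# Hypothetical estimates (arbitrary energy units) for two tasks.
task_a = {1: 10.0, 2: 6.0, 3: 5.0}  # benefits strongly from more cache
task_b = {1: 4.0, 2: 3.5, 3: 3.4}   # largely insensitive to cache size
print(best_partition([task_a, task_b], total_ways=4))  # -> ((3, 1), 9.0)
```

Exhaustive search is used here only because it is easy to verify on a handful of tasks; it grows exponentially with the number of tasks and would need replacing with a cheaper heuristic on larger systems.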

Supplementary Material

TACO1204-60 (taco1204-60.pdf)
Slide deck associated with this paper

Information & Contributors

Information

Published In

ACM Transactions on Architecture and Code Optimization  Volume 12, Issue 4
January 2016
848 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/2836331
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 December 2015
Accepted: 01 October 2015
Revised: 01 October 2015
Received: 01 June 2015
Published in TACO Volume 12, Issue 4

Author Tags

  1. Power modeling
  2. chip multiprocessors
  3. energy accounting
  4. modeling and estimation
  5. resource allocation
  6. simultaneous multithreaded

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • European Research Council under the European Union's 7th FP
  • IBM and BSCCNS
  • ERC
  • Spanish Ministry of Science and Innovation
  • HiPEAC Network of Excellence
  • Ramon y Cajal postdoctoral fellowship
  • Chinese Scholarship Council
  • Spanish Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship
