DReAM: An Approach to Estimate per-Task DRAM Energy in Multicore Systems

Published: 23 November 2016

Abstract

Accurate per-task energy estimation in multicore systems would enable per-task energy-aware scheduling and energy-aware billing in data centers, among other applications. Per-task energy estimation is complicated by the interaction between tasks in shared resources, which affects each task's energy consumption in uncontrolled ways. Accurate mechanisms have recently been devised to estimate the per-task energy consumed on-chip in multicores, but such mechanisms are lacking for DRAM memories. This article makes the case for accurate per-task DRAM energy metering in multicores, which opens new paths to energy/performance optimizations. In particular, the contributions of this article are (i) an ideal per-task energy metering model for DRAM memories; (ii) DReAM, an accurate yet low-cost implementation of the ideal model (less than 5% estimation error when 16 tasks share memory); and (iii) a comparison with standard methods (even distribution and access-count based) showing that DReAM is much more accurate than these methods.
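The abstract names two baseline methods, even distribution and access-count based, without defining them. The sketch below illustrates one plausible reading, and it is an assumption rather than the authors' formulation: even distribution splits the measured DRAM energy equally among the running tasks, while the access-count-based method splits it in proportion to each task's memory accesses. The function names, the joule units, and the fallback for zero accesses are hypothetical. DReAM itself is not shown, since the abstract does not describe its mechanism.

```python
from typing import Dict, List


def even_distribution(total_energy_j: float, tasks: List[str]) -> Dict[str, float]:
    """Assumed baseline: every task is charged the same share of DRAM energy."""
    if not tasks:
        raise ValueError("at least one task is required")
    share = total_energy_j / len(tasks)
    return {task: share for task in tasks}


def access_count_based(total_energy_j: float, accesses: Dict[str, int]) -> Dict[str, float]:
    """Assumed baseline: each task is charged in proportion to its access count."""
    if not accesses:
        raise ValueError("at least one task is required")
    total_accesses = sum(accesses.values())
    if total_accesses == 0:
        # Hypothetical fallback: with no accesses at all, split evenly.
        return even_distribution(total_energy_j, list(accesses))
    return {
        task: total_energy_j * count / total_accesses
        for task, count in accesses.items()
    }


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the article.
    print(even_distribution(12.0, ["a", "b", "c"]))
    print(access_count_based(10.0, {"a": 3, "b": 1}))
```

Neither baseline, as sketched here, looks at how tasks interfere with one another in the shared memory, which is the effect the abstract identifies as the main obstacle to accurate per-task estimation.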

Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 22, Issue 1
January 2017
463 pages
ISSN: 1084-4309
EISSN: 1557-7309
DOI: 10.1145/2948199
Editor: Naehyuck Chang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 November 2016
Accepted: 01 May 2016
Revised: 01 May 2016
Received: 01 January 2016
Published in TODAES Volume 22, Issue 1

Author Tags

  1. Chip multiprocessors
  2. benchmark characterization
  3. modeling and simulation
  4. performance
  5. power modeling

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Spanish Ministry of Science and Innovation
  • Chinese Scholarship Council
  • HiPEAC Network of Excellence
  • Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral
  • IBM and BSC
  • European Research Council under the European Union's 7th FP ERC

Cited By

  • (2024) A Highly Parallel DRAM Architecture to Mitigate Large Access Latency and Improve Energy Efficiency of Modern DRAM Systems. IEEE Access 12, 182998-183023. DOI: 10.1109/ACCESS.2024.3512176. Online publication date: 2024
  • (2024) The emergence of maritime logistics database centre (MLDC): A bibliometrics analysis. Journal of International Maritime Safety, Environmental Affairs, and Shipping 8:4. DOI: 10.1080/25725084.2024.2425152. Online publication date: 9-Nov-2024
  • (2023) High-Performance and Power-Saving Mechanism for Page Activations Based on Full Independent DRAM Sub-Arrays in Multi-Core Systems. IEEE Access 11, 79801-79822. DOI: 10.1109/ACCESS.2023.3299848. Online publication date: 2023
  • (2017) SEDEA: A Sensible Approach to Account DRAM Energy in Multicore Systems. 2017 29th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 73-80. DOI: 10.1109/SBAC-PAD.2017.17. Online publication date: Oct-2017