DOI: 10.1145/3649476.3658716
Research article (Open Access)

CASH: Criticality-Aware Split Hybrid L1 Data Cache

Published: 12 June 2024

Abstract

Computer architects continue to explore new ways to provide the abstraction of a large but fast memory to the processor. This work proposes a memory system that achieves this abstraction with a hybrid cache, a combination of an SRAM array and a Spin-Transfer Torque Magnetic RAM (STTRAM) array, at the highest level (L1) of the memory hierarchy. We overcome the longer access latency of the STTRAM array by placing cache lines that are likely to be accessed by critical (delay-sensitive) load instructions into the SRAM array. Our characterization of the SPEC CPU2017 benchmarks shows that most load instructions tolerate the access latency of the STTRAM array, so a small but fast SRAM array suffices for the latency-critical loads. The higher density and lower leakage power of the STTRAM array also allow a larger capacity to be provisioned without significant area overhead. Through extensive simulations of SPEC CPU2017 benchmarks, we show that the combination of a small but fast SRAM array and a large STTRAM array yields an average performance gain of up to 6.1% compared to a baseline system that uses only an SRAM-array-based cache of similar area. This performance improvement comes at the cost of a 1.7% increase in the energy consumption of the private caches.
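The placement policy described above can be pictured as a routing decision made when a line is filled into the L1 data cache. The Python sketch below illustrates that idea only; the predictor class, hit latencies, and unbounded per-array dictionaries are hypothetical placeholders chosen for illustration, not the paper's implementation or evaluation setup.

# Illustrative sketch: fills triggered by loads predicted critical go to a small,
# fast SRAM partition; all other fills go to a larger, slower STTRAM partition.
# All names, latencies, and structures here are assumptions for illustration.

SRAM_HIT_LATENCY = 2     # cycles (assumed)
STTRAM_HIT_LATENCY = 5   # cycles (assumed; STTRAM reads are slower)

class SimpleCriticalityPredictor:
    """Hypothetical per-PC predictor: a load PC is 'critical' once marked so."""
    def __init__(self):
        self.critical_pcs = set()

    def mark_critical(self, pc):
        self.critical_pcs.add(pc)

    def is_critical(self, pc):
        return pc in self.critical_pcs

class SplitHybridL1D:
    def __init__(self, predictor):
        self.predictor = predictor
        self.sram = {}      # small, fast array: line_addr -> data
        self.sttram = {}    # large, dense array: line_addr -> data

    def lookup(self, line_addr):
        # Return (data, latency) on a hit in either partition, (None, None) on a miss.
        if line_addr in self.sram:
            return self.sram[line_addr], SRAM_HIT_LATENCY
        if line_addr in self.sttram:
            return self.sttram[line_addr], STTRAM_HIT_LATENCY
        return None, None   # miss: the line would be filled from L2

    def fill(self, load_pc, line_addr, data):
        # Lines fetched by loads predicted to be delay-sensitive go to SRAM;
        # latency-tolerant loads are served from the STTRAM array.
        if self.predictor.is_critical(load_pc):
            self.sram[line_addr] = data
        else:
            self.sttram[line_addr] = data

# Usage: mark one load PC as critical, then fill and look up two lines.
pred = SimpleCriticalityPredictor()
pred.mark_critical(0x400A10)
cache = SplitHybridL1D(pred)
cache.fill(0x400A10, 0x1000, b"critical line")   # placed in the SRAM array
cache.fill(0x400B20, 0x2000, b"tolerant line")   # placed in the STTRAM array
print(cache.lookup(0x1000))   # (b'critical line', 2)
print(cache.lookup(0x2000))   # (b'tolerant line', 5)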


Information

Published In

GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024
June 2024
797 pages
ISBN:9798400706059
DOI:10.1145/3649476
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2024


Author Tags

  1. Cache design
  2. Hybrid caches
  3. Instruction Criticality
  4. Microarchitecture
  5. STTRAM

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

GLSVLSI '24
Sponsor:
GLSVLSI '24: Great Lakes Symposium on VLSI 2024
June 12 - 14, 2024
Clearwater, FL, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%

