DOI: 10.1145/3649476.3658716
Research article (Open Access)

CASH: Criticality-Aware Split Hybrid L1 Data Cache

Published: 12 June 2024

Abstract

Computer architects continue to explore new ways to provide the abstraction of a large but fast memory to the processor. This work proposes a memory system that achieves this abstraction with a hybrid cache, a combination of an SRAM array and a Spin-Transfer Torque Magnetic RAM (STTRAM) array, at the highest level (L1) of the memory hierarchy. We overcome the longer access latency of the STTRAM array by placing cache lines that are likely to be accessed by critical (delay-sensitive) load instructions into the SRAM array. Our characterization of the SPEC CPU2017 benchmarks shows that most load instructions tolerate the access latency of the STTRAM array, so a small but fast SRAM array suffices for the latency-critical loads. The higher density and lower leakage power of the STTRAM array also allow a larger capacity to be provisioned without significant area overhead. Through extensive simulations of SPEC CPU2017 benchmarks, we show that the combination of a small but fast SRAM array and a large STTRAM array yields an average performance gain of up to 6.1% compared to a baseline system that uses only an SRAM-array-based cache of similar area. This performance improvement comes at the cost of a 1.7% increase in the energy consumption of the private caches.
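The placement policy described above can be pictured as a routing decision made when a line is filled into the L1 data cache. The Python sketch below illustrates that idea only; the predictor class, hit latencies, and unbounded per-array dictionaries are hypothetical placeholders chosen for illustration, not the paper's implementation or evaluation setup.

# Illustrative sketch: fills triggered by loads predicted critical go to a small,
# fast SRAM partition; all other fills go to a larger, slower STTRAM partition.
# All names, latencies, and structures here are assumptions for illustration.

SRAM_HIT_LATENCY = 2     # cycles (assumed)
STTRAM_HIT_LATENCY = 5   # cycles (assumed; STTRAM reads are slower)

class SimpleCriticalityPredictor:
    """Hypothetical per-PC predictor: a load PC is 'critical' once marked so."""
    def __init__(self):
        self.critical_pcs = set()

    def mark_critical(self, pc):
        self.critical_pcs.add(pc)

    def is_critical(self, pc):
        return pc in self.critical_pcs

class SplitHybridL1D:
    def __init__(self, predictor):
        self.predictor = predictor
        self.sram = {}      # small, fast array: line_addr -> data
        self.sttram = {}    # large, dense array: line_addr -> data

    def lookup(self, line_addr):
        # Return (data, latency) on a hit in either partition, (None, None) on a miss.
        if line_addr in self.sram:
            return self.sram[line_addr], SRAM_HIT_LATENCY
        if line_addr in self.sttram:
            return self.sttram[line_addr], STTRAM_HIT_LATENCY
        return None, None   # miss: the line would be filled from L2

    def fill(self, load_pc, line_addr, data):
        # Lines fetched by loads predicted to be delay-sensitive go to SRAM;
        # latency-tolerant loads are served from the STTRAM array.
        if self.predictor.is_critical(load_pc):
            self.sram[line_addr] = data
        else:
            self.sttram[line_addr] = data

# Usage: mark one load PC as critical, then fill and look up two lines.
pred = SimpleCriticalityPredictor()
pred.mark_critical(0x400A10)
cache = SplitHybridL1D(pred)
cache.fill(0x400A10, 0x1000, b"critical line")   # placed in the SRAM array
cache.fill(0x400B20, 0x2000, b"tolerant line")   # placed in the STTRAM array
print(cache.lookup(0x1000))   # (b'critical line', 2)
print(cache.lookup(0x2000))   # (b'tolerant line', 5)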


Information

Published In

GLSVLSI '24: Proceedings of the Great Lakes Symposium on VLSI 2024
June 2024
797 pages
ISBN:9798400706059
DOI:10.1145/3649476
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2024


Author Tags

  1. Cache design
  2. Hybrid caches
  3. Instruction Criticality
  4. Microarchitecture
  5. STTRAM

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

GLSVLSI '24
Sponsor:
GLSVLSI '24: Great Lakes Symposium on VLSI 2024
June 12 - 14, 2024
Clearwater, FL, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%

