research-article
Open access

Block Cooperation: Advancing Lifetime of Resistive Memories by Increasing Utilization of Error Correcting Codes

Published: 28 August 2018

Abstract

Block-level cooperation is an endurance management technique that operates on top of error correction mechanisms to extend memory lifetimes. Once an error recovery scheme fails to recover from faults in a data block, the entire physical page associated with that block is disabled and becomes unavailable to the physical address space. To reduce the page waste caused by early block failures, other blocks can be used to support the failed block, working cooperatively to keep it alive and extend the faulty page’s lifetime.
We combine the proposed technique with existing error recovery schemes, such as Error Correction Pointers (ECP) and Aegis, to increase memory lifetimes. With ECP, block cooperation is realized through metadata sharing, where one data block lends its unused metadata to another data block. With Aegis, block cooperation is realized by reorganizing the data layout, so that blocks with few faults come to the aid of failed blocks, bringing them back from the dead.
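As a rough illustration of the metadata-sharing idea, the Python sketch below pools the correction pointers of two blocks. It is a minimal sketch under stated assumptions, not the paper's mechanism: the budget of six pointers per block, the `Block` class, and both function names are hypothetical, and the bookkeeping cost of redirecting a pointer to a partner block is ignored.

```python
from dataclasses import dataclass

# Assumption: each block owns six correction pointers (an ECP-6 style budget).
POINTERS_PER_BLOCK = 6


@dataclass
class Block:
    faults: int = 0  # number of failed cells observed in this block


def alive_without_sharing(block, budget=POINTERS_PER_BLOCK):
    """A block on its own survives while its faults fit in its own pointers."""
    return block.faults <= budget


def pair_alive_with_sharing(a, b, budget=POINTERS_PER_BLOCK):
    """Two cooperating blocks survive while their combined faults fit in
    their combined pointers (idealized: no overhead for sharing)."""
    return a.faults + b.faults <= 2 * budget


worn = Block(faults=8)    # exceeds its own budget of six
fresh = Block(faults=2)   # leaves four pointers unused
print(alive_without_sharing(worn), pair_alive_with_sharing(worn, fresh))
```

In this idealized model the worn block fails on its own but stays usable once the lightly worn partner lends its spare pointers, which is the page-level waste the abstract describes avoiding.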
Our evaluation using Monte Carlo simulation shows that block cooperation at a single level (or at multiple levels) on top of ECP and Aegis boosts memory lifetimes by 28% (37%) and 8% (14%) on average, respectively. Furthermore, trace-driven benchmark evaluation shows that the lifetime boost can reach 68% with metadata sharing and 30% with data layout reorganization.
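To make the Monte Carlo style of evaluation concrete, here is a small Python sketch that compares page lifetime with and without pairwise pointer pooling. Everything in it is an assumption made for illustration: cell lifetimes drawn from a normal distribution (mean 1e8 writes, 25% coefficient of variation), perfectly uniform wear, six pointers per block, fixed pairing of adjacent blocks, and the function names themselves. It is not the paper's simulator and is not expected to reproduce the reported percentages.

```python
import random


def kth_smallest(values, k):
    """Return the k-th smallest value (1-indexed)."""
    return sorted(values)[k - 1]


def page_lifetimes(rng, blocks=64, cells=512, pointers=6, mean=1e8, cov=0.25):
    """Simulate one page; `blocks` must be even.

    Returns (lifetime without sharing, lifetime with pairwise sharing),
    both measured in writes per cell under uniform wear.
    """
    page = [[rng.gauss(mean, cov * mean) for _ in range(cells)]
            for _ in range(blocks)]
    # Without sharing, a block dies at its (pointers + 1)-th cell failure,
    # and the page is disabled when its first block dies.
    solo = min(kth_smallest(block, pointers + 1) for block in page)
    # With sharing, adjacent blocks pool their pointers, so a pair dies at
    # its (2 * pointers + 1)-th cell failure.
    paired = min(kth_smallest(page[i] + page[i + 1], 2 * pointers + 1)
                 for i in range(0, blocks, 2))
    return solo, paired


def mean_boost(trials=20, seed=0, **kwargs):
    """Relative gain in mean page lifetime from pairwise sharing."""
    rng = random.Random(seed)
    solo_sum = paired_sum = 0.0
    for _ in range(trials):
        solo, paired = page_lifetimes(rng, **kwargs)
        solo_sum += solo
        paired_sum += paired
    return paired_sum / solo_sum - 1.0


if __name__ == "__main__":
    print(f"estimated lifetime boost: {mean_boost():.1%}")
```

Both lifetimes are computed from the same sampled cells in each trial, so the comparison isolates the effect of pooling. In this model the paired lifetime can never be lower than the unshared one, because a pair cannot accumulate more than twice the per-block budget of faults before one of its blocks would have failed alone.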


Cited By

  • (2024) Hercules: Enabling Atomic Durability for Persistent Memory with Transient Persistence Domain. ACM Transactions on Embedded Computing Systems 23:6 (1-34). DOI: 10.1145/3607473. Online publication date: 11-Sep-2024
  • (2023) Realizing Extreme Endurance Through Fault-aware Wear Leveling and Improved Tolerance. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (964-976). DOI: 10.1109/HPCA56546.2023.10071093. Online publication date: Feb-2023
  • (2020) WoLFRaM: Enhancing Wear-Leveling and Fault Tolerance in Resistive Memories using Programmable Address Decoders. 2020 IEEE 38th International Conference on Computer Design (ICCD) (187-196). DOI: 10.1109/ICCD50377.2020.00044. Online publication date: Oct-2020

Published In

ACM Transactions on Architecture and Code Optimization  Volume 15, Issue 3
September 2018
322 pages
ISSN:1544-3566
EISSN:1544-3973
DOI:10.1145/3274266
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 August 2018
Accepted: 01 July 2018
Revised: 01 June 2018
Received: 01 March 2018
Published in TACO Volume 15, Issue 3

Author Tags

  1. Resistive memory
  2. endurance
  3. error correction
  4. error recovery
  5. memory reliability
  6. phase change memory
  7. workload characterization

Qualifiers

  • Research-article
  • Research
  • Refereed
