DOI: 10.1145/3061639.3062248
research-article

Fault-Tolerant Training with On-Line Fault Detection for RRAM-Based Neural Computing Systems

Published: 18 June 2017

Abstract

An RRAM-based computing system (RCS) is an attractive hardware platform for implementing neural computing algorithms. On-line training for RCS enables hardware-based learning for a given application and reduces the additional error caused by device parameter variations. However, a high occurrence rate of hard faults, caused by immature fabrication processes and limited write endurance, restricts the applicability of on-line training for RCS. We propose a fault-tolerant on-line training method that alternates between a fault-detection phase and a fault-tolerant training phase. The fault-detection phase uses a quiescent-voltage comparison method. The training phase uses a threshold-training method and a re-mapping scheme. Our results show that, compared to neural computing without fault tolerance, the recognition accuracy for the CIFAR-10 dataset improves from 37% to 83% when using low-endurance RRAM cells, and from 63% to 76% when using RRAM cells with high endurance but a high percentage of initial faults.
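
The alternation between a detection phase and a training phase can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: `ToyCrossbar`, `detect_faults`, `threshold_train_step`, and `train` are hypothetical names; the write-and-read-back probe is a simple stand-in for the quiescent-voltage comparison method, whose details the abstract does not give; "threshold training" is assumed here to mean skipping weight updates whose magnitude falls below a threshold so that fewer cell writes are issued; and the re-mapping scheme is omitted.

```python
class ToyCrossbar:
    """Hypothetical 1-D array of cells; some are stuck at a fixed value (hard faults)."""

    def __init__(self, n_cells, stuck):
        self.stuck = dict(stuck)          # cell index -> stuck value
        self.values = [0.5] * n_cells     # arbitrary initial state
        for i, v in self.stuck.items():
            self.values[i] = v
        self.writes = 0                   # total write operations issued

    def write(self, i, value):
        self.writes += 1
        if i not in self.stuck:           # stuck cells silently ignore writes
            self.values[i] = value

    def read(self, i):
        return self.values[i]


def detect_faults(xbar, probe=0.05):
    """Stand-in detection: nudge each cell and flag cells whose read-back value
    does not change. (Assumption: NOT the quiescent-voltage comparison method.)"""
    faulty = set()
    for i in range(len(xbar.values)):
        before = xbar.read(i)
        xbar.write(i, before + probe)
        if xbar.read(i) == before:
            faulty.add(i)
        else:
            xbar.write(i, before)         # restore the original value
    return faulty


def threshold_train_step(xbar, grads, lr, threshold, faulty):
    """Write only updates with |delta| >= threshold, and never to known-faulty cells.
    Returns the number of cells written in this step."""
    written = 0
    for i, g in enumerate(grads):
        delta = -lr * g
        if i in faulty or abs(delta) < threshold:
            continue
        xbar.write(i, xbar.read(i) + delta)
        written += 1
    return written


def train(xbar, grad_fn, iterations, detect_every, lr=0.1, threshold=0.01):
    """Alternate a fault-detection phase with a run of training steps."""
    faulty = set()
    for it in range(iterations):
        if it % detect_every == 0:        # periodic detection phase
            faulty = detect_faults(xbar)
        weights = [xbar.read(i) for i in range(len(xbar.values))]
        threshold_train_step(xbar, grad_fn(weights), lr, threshold, faulty)
    return faulty


if __name__ == "__main__":
    target = [1.0, 0.0, 1.0, 0.2]
    xbar = ToyCrossbar(4, stuck={2: 0.0})
    # Gradient of a squared-error loss that pulls each weight toward its target.
    faulty = train(xbar, lambda w: [wi - ti for wi, ti in zip(w, target)],
                   iterations=50, detect_every=10)
    print("faulty cells:", sorted(faulty))
    print("weights:", [round(v, 3) for v in xbar.values])
```

In this toy setting the threshold trades precision for fewer writes: healthy cells stop being updated once the pending update drops below `threshold`, which is the property that would matter for cells with limited write endurance.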




    Information & Contributors

    Information

    Published In

    DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
    June 2017
    533 pages
    ISBN: 9781450349277
    DOI: 10.1145/3061639

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Research-article
    • Research
    • Refereed limited


    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%



    Citations

    Cited By

    • (2025) The show must go on: a reliability assessment platform for resistive random access memory crossbars. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 383:2288. DOI: 10.1098/rsta.2023.0387. Online publication date: 16-Jan-2025
    • (2024) Drop-Connect as a Fault-Tolerance Approach for RRAM-based Deep Neural Network Accelerators. 2024 IEEE 42nd VLSI Test Symposium (VTS), pp. 1-7. DOI: 10.1109/VTS60656.2024.10538531. Online publication date: 22-Apr-2024
    • (2024) A Fully Automated Platform for Evaluating ReRAM Crossbars. 2024 IEEE 25th Latin American Test Symposium (LATS), pp. 1-6. DOI: 10.1109/LATS62223.2024.10534593. Online publication date: 9-Apr-2024
    • (2024) Convolution Neural Network for Fault Detection and Classification in Hybrid Overhead-Underground Distribution System. 2024 Third International Conference on Power, Control and Computing Technologies (ICPC2T), pp. 599-604. DOI: 10.1109/ICPC2T60072.2024.10474926. Online publication date: 18-Jan-2024
    • (2024) Fault Tolerant Design for Memristor-based AI Accelerators. 2024 IEEE International Conference on Design, Test and Technology of Integrated Systems (DTTIS), pp. 1-4. DOI: 10.1109/DTTIS62212.2024.10779989. Online publication date: 14-Oct-2024
    • (2023) Dynamic Task Remapping for Reliable CNN Training on ReRAM Crossbars. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE56975.2023.10137238. Online publication date: Apr-2023
    • (2023) CRIMP: Compact & Reliable DNN Inference on In-Memory Processing via Crossbar-Aligned Compression and Non-ideality Adaptation. ACM Transactions on Embedded Computing Systems, 22:5s, pp. 1-25. DOI: 10.1145/3609115. Online publication date: 31-Oct-2023
    • (2023) Reconfigurable Mapping Algorithm based Stuck-At-Fault Mitigation in Neuromorphic Computing Systems. Proceedings of the Great Lakes Symposium on VLSI 2023, pp. 261-266. DOI: 10.1145/3583781.3590208. Online publication date: 5-Jun-2023
    • (2023) ReaLPrune: ReRAM Crossbar-Aware Lottery Ticket Pruning for CNNs. IEEE Transactions on Emerging Topics in Computing, 11:2, pp. 303-317. DOI: 10.1109/TETC.2022.3223630. Online publication date: 1-Apr-2023
    • (2023) Variation Enhanced Attacks Against RRAM-Based Neuromorphic Computing System. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:5, pp. 1588-1596. DOI: 10.1109/TCAD.2022.3207316. Online publication date: May-2023
