DOI: 10.1145/3218603.3218605

Input-Splitting of Large Neural Networks for Power-Efficient Accelerator with Resistive Crossbar Memory Array

Published: 23 July 2018

Abstract

Resistive Crossbar memory Arrays (RCAs) have been gaining interest as a promising platform for implementing Convolutional Neural Networks (CNNs). One of the major challenges in RCA-based design is that the number of rows in an RCA is often smaller than the number of input neurons in a layer. Previous works used high-resolution Analog-to-Digital Converters (ADCs) to compute the partial weighted sum in each array and merged the partial sums from multiple arrays outside the RCAs. However, such an approach suffers from significant power consumption due to the high-resolution ADCs it requires. In this paper, we propose a methodology to construct a large CNN with multiple RCAs more efficiently. By splitting the input feature map and retraining the CNN with proper initialization, we demonstrate that any CNN model can be represented with multiple arrays without using intermediate partial sums. The experimental results show that the ADC power of the proposed design is 32x smaller, and the total chip power 3x smaller, than those of the baseline design.
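To make the contrast concrete, here is a minimal sketch of the two mappings the abstract compares: the conventional scheme, which digitizes each array's raw partial weighted sum so the sums can be merged exactly, and input-splitting, where each sub-array applies the activation locally so no intermediate partial sum leaves the array. This is a NumPy toy model, not the paper's implementation: the uniform `quantize` stand-in for an ADC, the 128-row array size, the 8-bit vs. 3-bit ADC resolutions, and the merging of the split outputs by summation are all illustrative assumptions.

```python
import numpy as np

def quantize(x, bits, x_max=4.0):
    # Uniform quantizer standing in for an ADC of the given resolution
    # (illustrative assumption; real ADC behavior is more involved).
    levels = 2 ** bits - 1
    x = np.clip(x, -x_max, x_max)
    return np.round((x + x_max) / (2 * x_max) * levels) / levels * (2 * x_max) - x_max

def baseline_layer(x, W, rows, adc_bits=8):
    # Conventional mapping: each crossbar outputs a raw partial weighted
    # sum that must survive digitization intact, so a high-resolution ADC
    # is needed before the partial sums are merged and activated.
    idx = np.array_split(np.arange(x.size), -(-x.size // rows))
    partials = [quantize(W[:, s] @ x[s], adc_bits) for s in idx]
    return np.maximum(sum(partials), 0.0)  # ReLU applied after merging

def input_split_layer(x, Ws, rows, adc_bits=3):
    # Input-splitting (sketch): each sub-array applies the activation
    # locally and emits a coarsely quantized output, so no intermediate
    # partial sum leaves the array; retraining the split weights Ws with
    # proper initialization compensates for the approximation.
    idx = np.array_split(np.arange(x.size), -(-x.size // rows))
    outs = [quantize(np.maximum(W @ x[s], 0.0), adc_bits)
            for W, s in zip(Ws, idx)]
    return sum(outs)

rng = np.random.default_rng(0)
x = rng.random(512)                   # 512 input neurons > 128 array rows
W = rng.normal(0.0, 0.05, (64, 512))  # one wide layer, 64 output neurons
Ws = [W[:, s] for s in np.array_split(np.arange(512), 4)]
print(baseline_layer(x, W, rows=128).shape)      # (64,)
print(input_split_layer(x, Ws, rows=128).shape)  # (64,)
```

The sketch highlights where the quantizer sits: the baseline must digitize raw partial sums at high resolution because any quantization error propagates into the exact merge, whereas the split layer quantizes already-activated outputs coarsely, which is consistent with the 32x ADC power reduction the abstract reports. Whether the split outputs are summed, as here, or fed to the next layer as separate feature maps is a design choice; summation is merely the simplest stand-in.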



    Published In

    cover image ACM Conferences
    ISLPED '18: Proceedings of the International Symposium on Low Power Electronics and Design
    July 2018
    327 pages
    ISBN:9781450357043
    DOI:10.1145/3218603


    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

1. neural networks
2. resistive random-access memory
3. vector-matrix multiplication acceleration

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ISLPED '18

    Acceptance Rates

    Overall Acceptance Rate 398 of 1,159 submissions, 34%


    Cited By

    • (2024) Hardware/Software Co-Design With ADC-Less In-Memory Computing Hardware for Spiking Neural Networks. IEEE Transactions on Emerging Topics in Computing 12(1), 35-47. DOI: 10.1109/TETC.2023.3316121. Online publication date: Jan-2024.
    • (2024) A 44.2-TOPS/W CNN Processor With Variation-Tolerant Analog Datapath and Variation Compensating Circuit. IEEE Journal of Solid-State Circuits 59(5), 1603-1611. DOI: 10.1109/JSSC.2023.3321643. Online publication date: May-2024.
    • (2024) Lightening-Transformer: A Dynamically-Operated Optically-Interconnected Photonic Transformer Accelerator. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 686-703. DOI: 10.1109/HPCA57654.2024.00059. Online publication date: 2-Mar-2024.
    • (2023) ARBiS: A Hardware-Efficient SRAM CIM CNN Accelerator With Cyclic-Shift Weight Duplication and Parasitic-Capacitance Charge Sharing for AI Edge Application. IEEE Transactions on Circuits and Systems I: Regular Papers 70(1), 364-377. DOI: 10.1109/TCSI.2022.3215535. Online publication date: Jan-2023.
    • (2023) An Energy-Efficient Inference Engine for a Configurable ReRAM-Based Neural Network Accelerator. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42(3), 740-753. DOI: 10.1109/TCAD.2022.3184464. Online publication date: Mar-2023.
    • (2022) Towards ADC-Less Compute-In-Memory Accelerators for Energy Efficient Deep Learning. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 624-627. DOI: 10.23919/DATE54114.2022.9774573. Online publication date: 14-Mar-2022.
    • (2022) Compute-in-Memory Technologies and Architectures for Deep Learning Workloads. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 30(11), 1615-1630. DOI: 10.1109/TVLSI.2022.3203583. Online publication date: Nov-2022.
    • (2021) Signal Integrity Modeling and Analysis of Large-Scale Memristor Crossbar Array in a High-Speed Neuromorphic System for Deep Neural Network. IEEE Transactions on Components, Packaging and Manufacturing Technology 11(7), 1122-1136. DOI: 10.1109/TCPMT.2021.3092740. Online publication date: Jul-2021.
    • (2021) Leveraging Noise and Aggressive Quantization of In-Memory Computing for Robust DNN Hardware Against Adversarial Input and Weight Attacks. 2021 58th ACM/IEEE Design Automation Conference (DAC), 559-564. DOI: 10.1109/DAC18074.2021.9586233. Online publication date: 5-Dec-2021.
    • (2021) Maximizing Parallel Activation of Word-Lines in MRAM-Based Binary Neural Network Accelerators. IEEE Access 9, 141961-141969. DOI: 10.1109/ACCESS.2021.3121011. Online publication date: 2021.