research-article

Architecture-Accuracy Co-optimization of ReRAM-based Low-cost Neural Network Processor

Authors:

Jong-Moon Choi,

Seung-Kwang Hong,

Kee-Won Kwon

GLSVLSI '20: Proceedings of the 2020 on Great Lakes Symposium on VLSI

Pages 427 - 432

https://doi.org/10.1145/3386263.3406954

Published: 07 September 2020

Abstract

Resistive RAM (ReRAM) is a promising technology with such advantages as small device size and in-memory-computing capability. However, designing optimal AI processors based on ReRAMs is challenging due to the limited precision, and the complex interplay between quality of result and hardware efficiency. In this paper we present a study targeting a low-power low-cost image classification application. We discover that the trade-off between accuracy and hardware efficiency in ReRAM-based hardware is not obvious and even surprising, and our solution developed for a recently fabricated ReRAM device achieves both the state-of-the-art efficiency and empirical assurance on the high quality of result.

Supplementary Material

MP4 File (3386263.3406954.mp4)

Presentation video



Index Terms

Hardware
  Integrated circuits
    Reconfigurable logic and FPGAs
      Hardware accelerators
    Semiconductor memory
      Non-volatile memory



Published In

GLSVLSI '20: Proceedings of the 2020 on Great Lakes Symposium on VLSI

September 2020

597 pages

ISBN:9781450379441

DOI:10.1145/3386263

General Chairs:
Tinoosh Mohsenin (University of Maryland, Baltimore County, USA)
Weisheng Zhao (Beihang University, China)

Program Chairs:
Yiran Chen (Duke University, USA)
Onur Mutlu (ETH Zurich, Switzerland)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

GLSVLSI '20

GLSVLSI '20: Great Lakes Symposium on VLSI 2020

September 7 - 9, 2020

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%
