DOI: 10.1145/3583781.3590306
short-paper

SRAM-Based Processing-In-Memory Design with Kullback-Leibler Divergence-Based Dynamic Precision Quantization

Published: 05 June 2023

Abstract

Deep convolutional neural networks (CNNs) are widely used in Artificial Intelligence of Things (AIoT) systems. Constrained by power and area, conventional edge devices cannot afford the cost of CNN computation. SRAM-based Processing-In-Memory (SRAM-PIM) has been advocated for implementing CNNs on edge devices because of its high area and power efficiency. To further exploit the potential of SRAM-PIM for edge inference, this paper proposes an SRAM-PIM design with Kullback-Leibler (KL) divergence-based dynamic precision quantization. The proposed quantization method decouples the effect of different CNN layers on accuracy and incorporates SRAM-PIM hardware performance into the quantization process, realizing SRAM-PIM-aware layer-wise precision adjustment. The proposed SRAM-PIM design has been applied to image classification tasks on edge devices. Our evaluation shows that the implemented design achieves up to 2.03x higher energy efficiency and 2.54% higher accuracy than an existing dynamic-precision PIM design. Compared with an existing reinforcement-learning-based dynamic quantization method that requires several hours of quantization time, the proposed dynamic precision quantization method takes only 26.28 µs to obtain the optimal quantization results.
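The abstract does not spell out how KL divergence drives the layer-wise precision choice. Below is a minimal, hypothetical sketch of the general idea: compare the KL divergence between a layer's full-precision activation histogram and its quantized counterpart, and pick the smallest bit-width that stays within a divergence budget. This is not the authors' implementation; the candidate bit-widths, histogram size, and `kl_budget` threshold are illustrative assumptions, and the hardware-performance term described in the paper is omitted.

```python
# Hypothetical sketch: KL-divergence-driven per-layer bit-width selection.
# Not the paper's implementation; candidate bit-widths, bin count, and the
# divergence threshold are illustrative assumptions.
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """KL(P || Q) between two (unnormalized) histograms."""
    p = p.astype(np.float64) + eps
    q = q.astype(np.float64) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def quantize(x, n_bits):
    """Symmetric uniform quantization of x to n_bits (fake-quantized floats)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = max(np.abs(x).max(), 1e-12) / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def select_layer_precision(acts, candidate_bits=(4, 6, 8), kl_budget=0.05, bins=2048):
    """Return the smallest candidate bit-width whose quantized activation
    histogram stays within kl_budget of the full-precision histogram."""
    ref_hist, edges = np.histogram(acts, bins=bins)
    for n_bits in sorted(candidate_bits):
        q_hist, _ = np.histogram(quantize(acts, n_bits), bins=edges)
        if kl_divergence(ref_hist, q_hist) <= kl_budget:
            return n_bits
    return max(candidate_bits)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_acts = rng.standard_normal(10_000).astype(np.float32)  # stand-in calibration data
    print("chosen bit-width:", select_layer_precision(layer_acts))
```

In the paper, this accuracy-side decision is additionally coupled with SRAM-PIM hardware efficiency; the sketch above only illustrates the KL-divergence half of that trade-off.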


        Published In

        GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023
        June 2023
        731 pages
ISBN: 9798400701252
DOI: 10.1145/3583781
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

1. dynamic precision quantization
2. Kullback-Leibler divergence
3. SRAM-PIM

        Qualifiers

        • Short-paper

        Funding Sources

        • National Natural Science Foundation of China

        Conference

        GLSVLSI '23
        Sponsor:
        GLSVLSI '23: Great Lakes Symposium on VLSI 2023
        June 5 - 7, 2023
Knoxville, TN, USA

        Acceptance Rates

        Overall Acceptance Rate 312 of 1,156 submissions, 27%
