DOI: 10.1145/3583781.3590306
short-paper

SRAM-Based Processing-In-Memory Design with Kullback-Leibler Divergence-Based Dynamic Precision Quantization

Published: 05 June 2023

Abstract

Deep convolutional neural networks (CNNs) are widely used in Artificial Intelligence of Things (AIoT) systems. Constrained by power and area, conventional edge devices cannot afford the cost of CNN computation. SRAM-based Processing-In-Memory (SRAM-PIM) has been advocated for implementing CNNs on edge devices because of its high area and power efficiency. To further exploit the potential of SRAM-PIM for edge inference, this paper proposes an SRAM-PIM design with Kullback-Leibler (KL) divergence-based dynamic precision quantization. The proposed quantization method decouples the effect of different CNN layers on accuracy and incorporates SRAM-PIM hardware performance into the quantization process, realizing SRAM-PIM-aware layer-wise precision adjustment. The proposed SRAM-PIM design has been applied to image classification tasks on edge devices. Our evaluation shows that the implemented design achieves up to 2.03x higher energy efficiency and 2.54% higher accuracy than an existing dynamic-precision PIM design. Compared with an existing reinforcement-learning-based dynamic quantization method that requires several hours of quantization time, the proposed dynamic precision quantization method takes only 26.28 µs to obtain the optimal quantization results.
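The abstract does not spell out how KL divergence drives the layer-wise precision choice. Below is a minimal, hypothetical sketch of the general idea: compare the KL divergence between a layer's full-precision activation histogram and its quantized counterpart, and pick the smallest bit-width that stays within a divergence budget. This is not the authors' implementation; the candidate bit-widths, histogram size, and `kl_budget` threshold are illustrative assumptions, and the hardware-performance term described in the paper is omitted.

```python
# Hypothetical sketch: KL-divergence-driven per-layer bit-width selection.
# Not the paper's implementation; candidate bit-widths, bin count, and the
# divergence threshold are illustrative assumptions.
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """KL(P || Q) between two (unnormalized) histograms."""
    p = p.astype(np.float64) + eps
    q = q.astype(np.float64) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def quantize(x, n_bits):
    """Symmetric uniform quantization of x to n_bits (fake-quantized floats)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = max(np.abs(x).max(), 1e-12) / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def select_layer_precision(acts, candidate_bits=(4, 6, 8), kl_budget=0.05, bins=2048):
    """Return the smallest candidate bit-width whose quantized activation
    histogram stays within kl_budget of the full-precision histogram."""
    ref_hist, edges = np.histogram(acts, bins=bins)
    for n_bits in sorted(candidate_bits):
        q_hist, _ = np.histogram(quantize(acts, n_bits), bins=edges)
        if kl_divergence(ref_hist, q_hist) <= kl_budget:
            return n_bits
    return max(candidate_bits)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_acts = rng.standard_normal(10_000).astype(np.float32)  # stand-in calibration data
    print("chosen bit-width:", select_layer_precision(layer_acts))
```

In the paper, this accuracy-side decision is additionally coupled with SRAM-PIM hardware efficiency; the sketch above only illustrates the KL-divergence half of that trade-off.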


        Published In

        GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023
        June 2023
        731 pages
ISBN: 9798400701252
DOI: 10.1145/3583781
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

1. dynamic precision quantization
2. Kullback-Leibler divergence
3. SRAM-PIM

        Qualifiers

        • Short-paper

        Funding Sources

        • National Natural Science Foundation of China

        Conference

        GLSVLSI '23
        Sponsor:
        GLSVLSI '23: Great Lakes Symposium on VLSI 2023
        June 5 - 7, 2023
Knoxville, TN, USA

        Acceptance Rates

        Overall Acceptance Rate 312 of 1,156 submissions, 27%
