DOI: 10.1145/3649329.3655948
Research Article

Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity

Published: 07 November 2024

Abstract

Bit-level sparsity in neural network models harbors immense untapped potential. Eliminating redundant calculations on randomly distributed zero bits can significantly boost computational efficiency. Yet traditional digital SRAM-PIM architectures, constrained by their rigid crossbar structure, struggle to exploit this unstructured sparsity effectively. To address this challenge, we propose Dyadic Block PIM (DB-PIM), an algorithm-architecture co-design framework. First, we propose an algorithm coupled with a distinctive sparsity pattern, termed a dyadic block (DB), which preserves the random distribution of non-zero bits to maintain accuracy while restricting the number of these bits in each weight to improve regularity. Architecturally, we develop a custom PIM macro comprising dyadic block multiplication units (DBMUs) and Canonical Signed Digit (CSD)-based adder trees, tailored for Multiply-Accumulate (MAC) operations. An input pre-processing unit (IPU) further improves performance and efficiency by capitalizing on block-wise input sparsity. Results show that our co-design framework achieves a speedup of up to 7.69× and energy savings of 83.43%.
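To make the mechanisms in the abstract concrete, below is a minimal Python sketch, assuming illustrative details: it encodes integer weights as Canonical Signed Digits (CSD), caps the number of non-zero digits per weight (a hypothetical stand-in for the dyadic-block constraint), and runs a shift-and-add MAC that skips both zero digits and zero inputs. The function names and the keep-top-k truncation heuristic are assumptions for illustration, not the paper's DB-PIM implementation.

```python
# Hypothetical sketch of a CSD-based, bit-level-sparse MAC.
# to_csd, truncate_csd, and csd_mac are illustrative names, not from the paper.

def to_csd(n: int) -> list[int]:
    """Encode integer n as Canonical Signed Digits (LSB first, digits in {-1, 0, +1}).

    CSD (non-adjacent form) minimizes the number of non-zero digits,
    which is the property a CSD-based adder tree exploits."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n % 4)  # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits

def truncate_csd(digits: list[int], k: int) -> list[int]:
    """Keep only the k most-significant non-zero digits of a CSD weight.

    A stand-in for the dyadic-block idea: surviving non-zero digits stay
    wherever they originally were (random distribution preserved), but
    their count per weight is capped to improve hardware regularity."""
    nonzero = [i for i, d in enumerate(digits) if d != 0]
    keep = set(sorted(nonzero, reverse=True)[:k])  # high positions carry most magnitude
    return [d if i in keep else 0 for i, d in enumerate(digits)]

def csd_mac(inputs: list[int], csd_weights: list[list[int]]) -> int:
    """Shift-and-add MAC that skips zero digits and zero inputs."""
    acc = 0
    for x, w in zip(inputs, csd_weights):
        if x == 0:
            continue             # block-wise input sparsity: skip zero activations
        for pos, d in enumerate(w):
            if d:                # bit-level sparsity: only non-zero digits cost work
                acc += d * (x << pos)
    return acc

if __name__ == "__main__":
    weights = [23, -7, 0, 12]    # e.g. 23 = 32 - 8 - 1 -> 3 CSD digits
    inputs  = [5, 0, 9, 2]       # x = 0 entries are skipped entirely
    csd_ws  = [truncate_csd(to_csd(w), k=3) for w in weights]
    print(csd_mac(inputs, csd_ws))  # 5*23 + 2*12 = 139 (k=3 is lossless for these weights)
```

In the actual architecture, the skipped terms correspond to work the DBMUs and CSD-based adder trees never perform; the loop above only mimics that saving in software.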




            Published In

DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference
June 2024, 2159 pages
ISBN: 9798400706011
DOI: 10.1145/3649329


            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Author Tags

            1. bit-level sparsity
            2. SRAM
            3. PIM
            4. algorithm/architecture co-design


            Funding Sources

• NSFC

            Conference

DAC '24: 61st ACM/IEEE Design Automation Conference
June 23-27, 2024
San Francisco, CA, USA

            Acceptance Rates

Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%


