Kim et al., 2024 - Google Patents
BS2: Bit-Serial Architecture Exploiting Weight Bit Sparsity for Efficient Deep Learning Acceleration
- Document ID
- 17844911520676885994
- Author
- Kim E
- Lee S
- Kim C
- Lim H
- Nam J
- Sim J
- Publication year
- 2024
- Publication venue
- 2024 21st International SoC Design Conference (ISOCC)
Snippet
Most weights in deep learning models are small, so they show high bit sparsity in their MSBs. Based on this observation, we propose a bit-serial processing architecture (BS2) that exploits such bit sparsity to maximize computing efficiency. In this architecture, a bit feed …
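The snippet describes the core idea: when weights are processed one bit at a time, zero bits contribute nothing, so small weights (many zero MSBs) need fewer effective operations. A minimal Python sketch of that general bit-serial scheme follows; it is an illustration of the concept only, not the paper's actual BS2 hardware design, and the function name and cycle-counting are hypothetical.

```python
def bit_serial_mac(activations, weights, bits=8):
    """Accumulate dot(activations, weights) by streaming weight bits serially.

    Each set weight bit triggers one shifted add of the activation; zero
    bits are skipped entirely. Small weights therefore cost fewer effective
    operations -- the weight-bit-sparsity effect the snippet refers to.
    Returns the dot product and the count of effective (nonzero-bit) ops.
    """
    acc = 0
    effective_ops = 0
    for a, w in zip(activations, weights):
        for b in range(bits - 1, -1, -1):  # scan weight bits MSB-first
            if (w >> b) & 1:               # zero bits cost nothing
                acc += a << b              # add activation shifted by bit position
                effective_ops += 1
    return acc, effective_ops
```

For example, with weights 2 (0b0010) and 5 (0b0101), only three of the eight scanned bit positions are nonzero, so only three shifted adds are performed while the result still equals the ordinary dot product.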
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
- G06F7/505—Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30021—Compare instructions, e.g. Greater-Than, Equal-To, MINMAX
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/12—Computer systems based on biological models using genetic models
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/22—Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
Similar Documents
Publication | Title |
---|---|
US20240112029A1 (en) | Acceleration of model/weight programming in memristor crossbar arrays |
Salamat et al. | Rnsnet: In-memory neural network acceleration using residue number system |
Sim et al. | Scalable stochastic-computing accelerator for convolutional neural networks |
CN112068798B (en) | Method and device for realizing importance ordering of network nodes |
US10824394B2 (en) | Concurrent multi-bit adder |
US20220188604A1 (en) | Method and Apparatus for Performing a Neural Network Operation |
KR102409615B1 (en) | Method for min-max computation in associative memory |
Cai et al. | Training low bitwidth convolutional neural network on RRAM |
Shukla et al. | MC-CIM: Compute-in-memory with Monte-Carlo dropouts for Bayesian edge intelligence |
Stevens et al. | GNNerator: A hardware/software framework for accelerating graph neural networks |
Kang et al. | S-FLASH: A NAND flash-based deep neural network accelerator exploiting bit-level sparsity |
Geng et al. | CQNN: A CGRA-based QNN framework |
Karavay et al. | Qubit fault detection in SoC logic |
Shivanandamurthy et al. | ATRIA: A bit-parallel stochastic arithmetic based accelerator for in-DRAM CNN processing |
Kim et al. | BS2: Bit-Serial Architecture Exploiting Weight Bit Sparsity for Efficient Deep Learning Acceleration |
US20230244901A1 (en) | Compute-in-memory SRAM using memory-immersed data conversion and multiplication-free operators |
JP2022074442A (en) | Arithmetic device and arithmetic method |
Block et al. | A hardware acceleration of a phylogenetic tree reconstruction with maximum parsimony algorithm using FPGA |
Kang et al. | An energy-efficient programmable mixed-signal accelerator for machine learning algorithms |
Klhufek et al. | Exploring Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators |
Zhu et al. | iMAT: Energy-Efficient In-Memory Acceleration for Ternary Neural Networks With Sparse Dot Product |
US20240143541A1 (en) | Compute in-memory architecture for continuous on-chip learning |
CN110989971B (en) | System and method for energy-saving data processing |
CN110765413A (en) | Matrix summation structure and neural network computing platform |
US12032959B2 (en) | Non-volatile memory die with latch-based multiply-accumulate components |