Wang et al., 2022 - Google Patents
FACCU: Enable fast accumulation for high-speed DSP systemsWang et al., 2022
- Document ID
- 15424996358186374470
- Author
- Wang M
- Cheng X
- Zou D
- Wang Z
- Publication year
- Publication venue
- IEEE Transactions on Circuits and Systems II: Express Briefs
External Links
Snippet
A number of fast accumulation (FACCU) designs are proposed for one of the essential components in DSP systems, the accumulator, to largely boost the DSP's maximum attainable processing speed. By leveraging the unique features of the accumulator, the loop …
- 238000009825 accumulation 0 title abstract description 27
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/533—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even
- G06F7/5334—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even by using multiple bit scanning, i.e. by decoding groups of successive multiplier bits in order to select an appropriate precalculated multiple of the multiplicand as a partial product
- G06F7/5336—Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even by using multiple bit scanning, i.e. by decoding groups of successive multiplier bits in order to select an appropriate precalculated multiple of the multiplicand as a partial product overlapped, i.e. with successive bitgroups sharing one or more bits being recoded into signed digit representation, e.g. using the Modified Booth Algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/50—Adding; Subtracting
- G06F7/505—Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
- G06F7/506—Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination with simultaneous carry generation for, or propagation over, two or more stages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
- G06F7/5318—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel with column wise addition of partial products, e.g. using Wallace tree, Dadda counters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/499—Denomination or exception handling, e.g. rounding, overflow
- G06F7/49994—Sign extension
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/57—Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 - G06F7/556 or for performing logical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/3804—Details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mao et al. | A configurable floating-point multiple-precision processing element for HPC and AI converged computing | |
Soniya | A review of different type of multipliers and multiplier-accumulator unit | |
Junjie et al. | Implementation of DFT application on ternary optical computer | |
Tan et al. | Multiple-mode-supporting floating-point FMA unit for deep learning processors | |
Wang et al. | FACCU: Enable fast accumulation for high-speed DSP systems | |
Zahid et al. | Energy-Efficient Approximate Booth Multipliers for Convolutional Neural Networks | |
Daud et al. | Hybrid modified booth encoded algorithm-carry save adder fast multiplier | |
Haghi et al. | O⁴-DNN: A Hybrid DSP-LUT-Based Processing Unit With Operation Packing and Out-of-Order Execution for Efficient Realization of Convolutional Neural Networks on FPGA Devices | |
Pawar et al. | Review on multiply-accumulate unit | |
Luo et al. | A single clock cycle approximate adder with hybrid prediction and error compensation methods | |
Tan et al. | A Low-Cost Floating-Point Dot-Product-Dual-Accumulate Architecture for HPC-Enabled AI | |
Kaur et al. | Implementation of modified booth multiplier using pipeline technique on FPGA | |
Sanjeevaiah et al. | MF-RALU: design of an efficient multi-functional reversible arithmetic and logic unit for processor design on field programmable gate array platform. | |
Kwak et al. | High-speed CORDIC based on an overlapped architecture and a novel σ-prediction method | |
Norollah et al. | An efficient sorting architecture for area and energy constrained edge computing devices | |
Hsiao et al. | Design of a low-cost floating-point programmable vertex processor for mobile graphics applications based on hybrid number system | |
BANDI | RoBA Multiplier-Driven FIR Filter Synthesis: Uniting Efficiency and Speed for Enhanced Digital Signal Processing | |
SalehiTabrizi et al. | Designing Efficient Two-Level Reverse Converters for Moduli Set {2^ 2n+ 1-1, 2^ 2n, 2^ n-1\} 2 2 n+ 1-1, 2 2 n, 2 n-1 | |
Shavit et al. | Programmable All-in-One 4x8-/2x16-/1x32-bits Dual Mode Logic Multiplier in 16 nm FinFET with Semi-Automatic Flow | |
Chopde et al. | Design of a 4-Bit 4-Operand Adder Using Verilog: An Abstraction Analysis | |
Kurzum et al. | Enhancing DNN Training Efficiency Via Dynamic Asymmetric Architecture | |
Shyamsunder et al. | Design and Implementation of an Arithmetic Unit with Reduced Switching Activity | |
Fariddin et al. | Design of High Speed and Area efficient modified Kogge Stone Multiplier Using ZFL | |
Raavi et al. | Implementation of High-Speed Hybrid Carry Select Adder using Binary to Excess-1 Converter | |
Sharma et al. | Design and Implementation of Braun Multiplier using Verilog |