

A survey on convolutional neural network accelerators: GPU, FPGA and ASIC

Hu et al., 2022

Document ID
11314307987561233300

Authors
Hu Y, Liu Y, Liu Z

Publication year
2022

Publication venue
2022 14th International Conference on Computer Research and Development (ICCRD)

External Links

Continue reading at ieeexplore.ieee.org

Snippet

In recent years, artificial intelligence (AI) has been under rapid development and applied in various areas. Among the vast number of neural network (NN) models, the convolutional neural network (CNN) holds mainstream status in applications such as image and sound …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored programme computers
    • G06F15/80 Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007 Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/53 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G06F7/505 Adding; Subtracting in bit-parallel fashion, i.e. having a different digit-handling circuit for each denomination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163 Interprocessor communication
    • G06F15/173 Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • G06F15/17356 Indirect interconnection networks
    • G06F15/17368 Indirect interconnection networks non hierarchical topologies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50 Computer-aided design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored programme computers
    • G06F15/78 Architectures of general purpose stored programme computers comprising a single central processing unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30 Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for programme control, e.g. control unit
    • G06F9/06 Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/46 Multiprogramming arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F2207/00 Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F2207/38 Indexing scheme relating to groups G06F7/38 - G06F7/575
    • G06F2207/3804 Details
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/12 Computer systems based on biological models using genetic models
    • G06N3/126 Genetic algorithms, i.e. information processing using digital simulations of the genetic system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F1/00 Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application

Similar Documents

Hu et al. A survey on convolutional neural network accelerators: GPU, FPGA and ASIC
Jouppi et al. A domain-specific supercomputer for training deep neural networks
Boutros et al. You cannot improve what you do not measure: FPGA vs. ASIC efficiency gaps for convolutional neural network inference
Zhou et al. Transpim: A memory-based acceleration via software-hardware co-design for transformer
Hashemi et al. Understanding the impact of precision quantization on the accuracy and energy of neural networks
Zaman et al. Custom hardware architectures for deep learning on portable devices: a review
Venkataramanaiah et al. Automatic compiler based FPGA accelerator for CNN training
Welser et al. Future computing hardware for AI
Bavikadi et al. A survey on machine learning accelerators and evolutionary hardware platforms
Deng et al. PermCNN: Energy-efficient convolutional neural network hardware architecture with permuted diagonal structure
Noh et al. FlexBlock: A flexible DNN training accelerator with multi-mode block floating point support
Sun et al. A high-performance accelerator for large-scale convolutional neural networks
He et al. A survey to predict the trend of AI-able server evolution in the cloud
Heo et al. T-PIM: An energy-efficient processing-in-memory accelerator for end-to-end on-device training
Morad et al. Efficient dense and sparse matrix multiplication on GP-SIMD
Lou et al. RV-CNN: Flexible and efficient instruction set for CNNs based on RISC-V processors
Que et al. A reconfigurable multithreaded accelerator for recurrent neural networks
Jiang et al. A network-on-chip-based annealing processing architecture for large-scale fully connected ising model
Kim et al. A low-power graph convolutional network processor with sparse grouping for 3d point cloud semantic segmentation in mobile devices
Tiwari et al. Design and implementation of rough set algorithms on FPGA: A survey
Han et al. EGCN: An efficient GCN accelerator for minimizing off-chip memory access
Bavikadi et al. Heterogeneous multi-functional look-up-table-based processing-in-memory architecture for deep learning acceleration
Sharma et al. A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models
Lu et al. A reconfigurable DNN training accelerator on FPGA
Bavikadi et al. Reconfigurable Processing-in-Memory Architecture for Data Intensive Applications