Kwon et al., 2018 - Google Patents
Maeri: Enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnectsKwon et al., 2018
View PDF- Document ID
- 8175607140400717304
- Author
- Kwon H
- Samajdar A
- Krishna T
- Publication year
- Publication venue
- ACM SIGPLAN Notices
External Links
Snippet
Deep neural networks (DNN) have demonstrated highly promising results across computer vision and speech recognition, and are becoming foundational for ubiquitous AI. The computational complexity of these algorithms and a need for high energy-efficiency has led …
- 210000002683 Foot 0 abstract description 27
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G06F15/8023—Two dimensional arrays, e.g. mesh, torus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored programme computers comprising a single central processing unit with reconfigurable architecture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G06F9/3891—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute organised in groups of units sharing resources, e.g. clusters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kwon et al. | Maeri: Enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnects | |
Qin et al. | Sigma: A sparse and irregular gemm accelerator with flexible interconnects for dnn training | |
Nabavinejad et al. | An overview of efficient interconnection networks for deep neural network accelerators | |
Kwon et al. | Rethinking NoCs for spatial neural network accelerators | |
Wang et al. | FPDeep: Scalable acceleration of CNN training on deeply-pipelined FPGA clusters | |
Chen et al. | NoC-based DNN accelerator: A future design paradigm | |
Chen et al. | Zara: A novel zero-free dataflow accelerator for generative adversarial networks in 3d reram | |
Wu et al. | Compute-efficient neural-network acceleration | |
Firuzan et al. | Reconfigurable network-on-chip for 3D neural network accelerators | |
Chen et al. | A NoC-based simulator for design and evaluation of deep neural networks | |
Dazzi et al. | Efficient pipelined execution of CNNs based on in-memory computing and graph homomorphism verification | |
Lee et al. | NP-CGRA: Extending CGRAs for efficient processing of light-weight deep neural networks | |
Darbani et al. | RASHT: A partially reconfigurable architecture for efficient implementation of CNNs | |
Aliagha et al. | Energy efficient design of coarse-grained reconfigurable architectures: Insights, trends and challenges | |
WO2017007318A1 (en) | Scalable computation architecture in a memristor-based array | |
Delaye et al. | Deep learning challenges and solutions with xilinx fpgas | |
Salvador et al. | Evolvable 2D computing matrix model for intrinsic evolution in commercial FPGAs with native reconfiguration support | |
JP2005531843A (en) | Division in array processors | |
Akbari et al. | A high-performance network-on-chip topology for neuromorphic architectures | |
Bhattacharya | From dnns to gans: Review of efficient hardware architectures for deep learning | |
Yu et al. | A RDT-based interconnection network for scalable network-on-chip designs | |
Ascia et al. | Networks-on-chip based deep neural networks accelerators for iot edge devices | |
Dazzi et al. | 5 parallel prism: A topology for pipelined implementations of convolutional neural networks using computational memory | |
Morcel et al. | Fpga-based accelerator for deep convolutional neural networks for the spark environment | |
Jan et al. | Scalable communication architectures for massively parallel hardware multi-processors |