Kwon et al., 2018 - Google Patents

MAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects

Document ID: 8175607140400717304
Author: Kwon H; Samajdar A; Krishna T
Publication year: 2018
Publication venue: ACM SIGPLAN Notices

Snippet

Deep neural networks (DNNs) have demonstrated highly promising results across computer vision and speech recognition, and are becoming foundational for ubiquitous AI. The computational complexity of these algorithms and a need for high energy efficiency have led …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G06F15/8023—Two dimensional arrays, e.g. mesh, torus
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored programme computers comprising a single central processing unit with reconfigurable architecture
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5045—Circuit design
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G06F9/3891—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute organised in groups of units sharing resources, e.g. clusters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/52—Multiplying; Dividing
- G06F7/523—Multiplying only
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations

Similar Documents

Publication	Publication Date	Title
Kwon et al.	2018	MAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects
Qin et al.	2020	SIGMA: A sparse and irregular GEMM accelerator with flexible interconnects for DNN training
Nabavinejad et al.	2020	An overview of efficient interconnection networks for deep neural network accelerators
Kwon et al.	2017	Rethinking NoCs for spatial neural network accelerators
Wang et al.	2020	FPDeep: Scalable acceleration of CNN training on deeply-pipelined FPGA clusters
Chen et al.	2019	NoC-based DNN accelerator: A future design paradigm
Chen et al.	2019	Zara: A novel zero-free dataflow accelerator for generative adversarial networks in 3D ReRAM
Wu et al.	2019	Compute-efficient neural-network acceleration
Firuzan et al.	2018	Reconfigurable network-on-chip for 3D neural network accelerators
Chen et al.	2020	A NoC-based simulator for design and evaluation of deep neural networks
Dazzi et al.	2021	Efficient pipelined execution of CNNs based on in-memory computing and graph homomorphism verification
Lee et al.	2021	NP-CGRA: Extending CGRAs for efficient processing of light-weight deep neural networks
Darbani et al.	2022	RASHT: A partially reconfigurable architecture for efficient implementation of CNNs
Aliagha et al.	2022	Energy efficient design of coarse-grained reconfigurable architectures: Insights, trends and challenges
WO2017007318A1 (en)	2017-01-12	Scalable computation architecture in a memristor-based array
Delaye et al.	2017	Deep learning challenges and solutions with Xilinx FPGAs
Salvador et al.	2011	Evolvable 2D computing matrix model for intrinsic evolution in commercial FPGAs with native reconfiguration support
JP2005531843A (en)	2005-10-20	Division in array processors
Akbari et al.	2017	A high-performance network-on-chip topology for neuromorphic architectures
Bhattacharya	2021	From DNNs to GANs: Review of efficient hardware architectures for deep learning
Yu et al.	2005	A RDT-based interconnection network for scalable network-on-chip designs
Ascia et al.	2019	Networks-on-chip based deep neural networks accelerators for IoT edge devices
Dazzi et al.	2019	5 parallel prism: A topology for pipelined implementations of convolutional neural networks using computational memory
Morcel et al.	2016	FPGA-based accelerator for deep convolutional neural networks for the Spark environment
Jan et al.	2012	Scalable communication architectures for massively parallel hardware multi-processors