[go: up one dir, main page]
More Web Proxy on the site http://driver.im/

Kwon et al., 2018 - Google Patents

Maeri: Enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnects

Kwon et al., 2018

View PDF
Document ID
8175607140400717304
Author
Kwon H
Samajdar A
Krishna T
Publication year
Publication venue
ACM SIGPLAN Notices

External Links

Snippet

Deep neural networks (DNN) have demonstrated highly promising results across computer vision and speech recognition, and are becoming foundational for ubiquitous AI. The computational complexity of these algorithms and a need for high energy-efficiency has led …
Continue reading at sites.gatech.edu (PDF) (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/80Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • G06F15/8023Two dimensional arrays, e.g. mesh, torus
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored programme computers
    • G06F15/78Architectures of general purpose stored programme computers comprising a single central processing unit
    • G06F15/7867Architectures of general purpose stored programme computers comprising a single central processing unit with reconfigurable architecture
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
    • G06F15/163Interprocessor communication
    • G06F15/173Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/50Computer-aided design
    • G06F17/5045Circuit design
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3889Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
    • G06F9/3891Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute organised in groups of units sharing resources, e.g. clusters
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for programme control, e.g. control unit
    • G06F9/06Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
    • G06F9/30Arrangements for executing machine-instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3893Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
    • G06F9/3895Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
    • G06F9/3897Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52Multiplying; Dividing
    • G06F7/523Multiplying only
    • G06F7/53Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computer systems based on biological models
    • G06N3/02Computer systems based on biological models using neural network models
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations

Similar Documents

Publication Publication Date Title
Kwon et al. Maeri: Enabling flexible dataflow mapping over dnn accelerators via reconfigurable interconnects
Qin et al. Sigma: A sparse and irregular gemm accelerator with flexible interconnects for dnn training
Nabavinejad et al. An overview of efficient interconnection networks for deep neural network accelerators
Kwon et al. Rethinking NoCs for spatial neural network accelerators
Wang et al. FPDeep: Scalable acceleration of CNN training on deeply-pipelined FPGA clusters
Chen et al. NoC-based DNN accelerator: A future design paradigm
Chen et al. Zara: A novel zero-free dataflow accelerator for generative adversarial networks in 3d reram
Wu et al. Compute-efficient neural-network acceleration
Firuzan et al. Reconfigurable network-on-chip for 3D neural network accelerators
Chen et al. A NoC-based simulator for design and evaluation of deep neural networks
Dazzi et al. Efficient pipelined execution of CNNs based on in-memory computing and graph homomorphism verification
Lee et al. NP-CGRA: Extending CGRAs for efficient processing of light-weight deep neural networks
Darbani et al. RASHT: A partially reconfigurable architecture for efficient implementation of CNNs
Aliagha et al. Energy efficient design of coarse-grained reconfigurable architectures: Insights, trends and challenges
WO2017007318A1 (en) Scalable computation architecture in a memristor-based array
Delaye et al. Deep learning challenges and solutions with xilinx fpgas
Salvador et al. Evolvable 2D computing matrix model for intrinsic evolution in commercial FPGAs with native reconfiguration support
JP2005531843A (en) Division in array processors
Akbari et al. A high-performance network-on-chip topology for neuromorphic architectures
Bhattacharya From dnns to gans: Review of efficient hardware architectures for deep learning
Yu et al. A RDT-based interconnection network for scalable network-on-chip designs
Ascia et al. Networks-on-chip based deep neural networks accelerators for iot edge devices
Dazzi et al. 5 parallel prism: A topology for pipelined implementations of convolutional neural networks using computational memory
Morcel et al. Fpga-based accelerator for deep convolutional neural networks for the spark environment
Jan et al. Scalable communication architectures for massively parallel hardware multi-processors