Breaking barriers: Maximizing array utilization for compute in-memory fabrics
Crafton et al., 2020 - Google Patents
- Document ID: 1375611465552224606
- Authors: Crafton B, Spetalnick S, Murali G, Krishna T, Lim S, Raychowdhury A
- Publication year: 2020
- Publication venue: 2020 IFIP/IEEE 28th International Conference on Very Large Scale Integration (VLSI-SOC)
Snippet
Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data-intensive applications. This has found widespread adoption in accelerating neural networks for machine learning …
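The snippet gives only the motivation, so as a rough illustration of the underlying idea, the Python sketch below maps a weight matrix onto fixed-size in-memory arrays, computes a matrix-vector product tile by tile, and reports how much of the allocated array area a naive mapping actually uses. This is not the paper's method: the 128×128 array size, the row-major tiling, and all function names are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch (not from the paper): a compute in-memory (CIM)
# fabric keeps weights resident in fixed-size memory arrays and computes
# matrix-vector products in place, so only inputs and outputs move.
# The 128x128 array size and naive tiling below are assumptions.
ARRAY_ROWS, ARRAY_COLS = 128, 128  # assumed physical array dimensions

def tile_weights(W):
    """Partition W into array-sized tiles, zero-padding the edges."""
    tiles = []
    for r in range(0, W.shape[0], ARRAY_ROWS):
        for c in range(0, W.shape[1], ARRAY_COLS):
            tile = np.zeros((ARRAY_ROWS, ARRAY_COLS))
            block = W[r:r + ARRAY_ROWS, c:c + ARRAY_COLS]
            tile[:block.shape[0], :block.shape[1]] = block
            tiles.append((r, c, tile))
    return tiles

def cim_matvec(tiles, x, out_dim):
    """Each array multiplies its resident tile by its input slice;
    partial sums from tiles sharing output columns are accumulated."""
    y = np.zeros(out_dim)
    for r, c, tile in tiles:
        xin = np.zeros(ARRAY_ROWS)
        seg = x[r:r + ARRAY_ROWS]
        xin[:seg.shape[0]] = seg
        partial = xin @ tile                  # in-array multiply-accumulate
        n = min(ARRAY_COLS, out_dim - c)
        y[c:c + n] += partial[:n]
    return y

def utilization(W):
    """Fraction of allocated cells holding real weights; the padding
    wasted by naive tiling is what array-utilization work targets."""
    n_tiles = -(-W.shape[0] // ARRAY_ROWS) * -(-W.shape[1] // ARRAY_COLS)
    return W.size / (n_tiles * ARRAY_ROWS * ARRAY_COLS)

W = np.random.randn(200, 300)                 # hypothetical layer weights
x = np.random.randn(200)
y = cim_matvec(tile_weights(W), x, out_dim=300)
assert np.allclose(y, x @ W)                  # matches the dense product
print(f"array utilization under naive tiling: {utilization(W):.1%}")
```

Under these assumptions, a 200×300 layer fills only about 61% of the six 128×128 arrays it occupies; that kind of under-utilization is what the paper's title refers to.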
Classifications
- G06F15/8007—Single instruction multiple data [SIMD] multiprocessors
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G06F9/5061—Allocation of resources: partitioning or combining of resources
- G06F17/50—Computer-aided design
- G06F7/53—Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
- G06N3/02—Computer systems based on biological models using neural network models
- G06F11/07—Error detection; error correction; monitoring responding to the occurrence of a fault, e.g. fault tolerance
- G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
Similar Documents
| Publication | Title |
|---|---|
| Hu et al. | A survey on convolutional neural network accelerators: GPU, FPGA and ASIC |
| Wang et al. | FPDeep: Scalable acceleration of CNN training on deeply-pipelined FPGA clusters |
| Joardar et al. | AccuReD: High accuracy training of CNNs on ReRAM/GPU heterogeneous 3-D architecture |
| CN107203807B | On-chip cache bandwidth balancing method, system and device of neural network accelerator |
| Yang et al. | PIMGCN: A ReRAM-based PIM design for graph convolutional network acceleration |
| Acer et al. | Improving performance of sparse matrix dense matrix multiplication on large-scale parallel systems |
| Arka et al. | ReGraphX: NoC-enabled 3D heterogeneous ReRAM architecture for training graph neural networks |
| CN112119459A | Memory arrangement for tensor data |
| Feng et al. | Cosparse: A software and hardware reconfigurable SpMV framework for graph analytics |
| Catthoor et al. | Very large-scale neuromorphic systems for biological signal processing |
| Arka et al. | DARe: DropLayer-aware manycore ReRAM architecture for training graph neural networks |
| Crafton et al. | Breaking barriers: Maximizing array utilization for compute in-memory fabrics |
| Sun et al. | Multi-node acceleration for large-scale GCNs |
| Joardar et al. | Heterogeneous manycore architectures enabled by processing-in-memory for deep learning: From CNNs to GNNs (ICCAD special session paper) |
| Akbari et al. | A high-performance network-on-chip topology for neuromorphic architectures |
| Zhang et al. | Simeuro: A hybrid CPU-GPU parallel simulator for neuromorphic computing chips |
| Wang et al. | SPCIM: Sparsity-Balanced Practical CIM Accelerator With Optimized Spatial-Temporal Multi-Macro Utilization |
| Ravichandiran et al. | A review of 3D-dynamic random-access memory based near-memory computation |
| Ascia et al. | Networks-on-chip based deep neural networks accelerators for IoT edge devices |
| Venkateswaran et al. | Memory in processor: A novel design paradigm for supercomputing architectures |
| Zhou et al. | PIM-DL: Boosting DNN inference on digital processing in-memory architectures via data layout optimizations |
| Wang et al. | Benchmarking DNN Mapping Methods for the In-Memory Computing Accelerators |
| Joshi et al. | Neuromorphic event-driven multi-scale synaptic connectivity and plasticity |
| Crafton et al. | Statistical Array Allocation and Partitioning for Compute In-Memory Fabrics |
| Liu et al. | Regularizing sparse and imbalanced communications for voxel-based brain simulations on supercomputers |