Lu et al., 2024 - Google Patents

A large-scale heterogeneous computing framework for non-uniform sampling two-dimensional convolution applications

Document ID: 12183717959662052544
Author: Lu Y; Yu C; Xiao J; Wang H; Fu H; Kang B; Zheng G
Publication year: 2024
Publication venue: CCF Transactions on High Performance Computing

Snippet

Non-uniform sampling two-dimensional convolution (NUSC) maps irregularly distributed spatial sampling data onto a regular grid by convolution. As data volumes and growth rates continue to increase, accelerating NUSC with the heterogeneous computing platform is a …
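
The gridding operation described in the snippet can be illustrated with a minimal pure-Python sketch. It is not the framework from the paper: the function name `grid_samples`, the Gaussian kernel, the `radius` and `sigma` parameters, and the weight normalisation are all assumptions chosen for illustration.

```python
import math


def grid_samples(samples, nx, ny, radius=1.5, sigma=0.7):
    """Convolve irregular (x, y, value) samples onto an nx-by-ny regular grid.

    Grid cell (i, j) is centred at coordinates (i, j). The Gaussian kernel and
    the normalisation by summed weights are illustrative assumptions.
    """
    acc = [[0.0] * nx for _ in range(ny)]   # weighted sums of sample values
    wsum = [[0.0] * nx for _ in range(ny)]  # sums of kernel weights
    for x, y, value in samples:
        # Visit only the cells that lie inside the kernel support of this sample.
        i_lo = max(0, math.ceil(x - radius))
        i_hi = min(nx - 1, math.floor(x + radius))
        j_lo = max(0, math.ceil(y - radius))
        j_hi = min(ny - 1, math.floor(y + radius))
        for j in range(j_lo, j_hi + 1):
            for i in range(i_lo, i_hi + 1):
                d2 = (i - x) ** 2 + (j - y) ** 2
                if d2 <= radius * radius:
                    w = math.exp(-d2 / (2.0 * sigma * sigma))
                    acc[j][i] += w * value
                    wsum[j][i] += w
    # Normalise each cell; cells that no sample reaches stay at 0.0.
    return [[a / w if w > 0.0 else 0.0 for a, w in zip(row_a, row_w)]
            for row_a, row_w in zip(acc, wsum)]


if __name__ == "__main__":
    # Three irregularly placed samples gridded onto a 4x4 grid.
    pts = [(0.3, 0.4, 1.0), (2.6, 1.2, 3.0), (1.1, 2.9, 2.0)]
    for row in grid_samples(pts, nx=4, ny=4):
        print(" ".join(f"{v:5.2f}" for v in row))
```

Each sample only touches the few cells within its kernel support, so the work per sample is bounded by the kernel size rather than the grid size. In this sketch the loop runs over samples and scatters into cells; a parallel version would have to handle concurrent updates to the same cell.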

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5083—Techniques for rebalancing the load in a distributed system
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/80—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
- G06F15/8007—Architectures of general purpose stored programme computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass

Similar Documents

Publication	Publication Date	Title
Shankar et al.	2018	Numpywren: Serverless linear algebra
Gómez-Luna et al.	2017	Chai: Collaborative heterogeneous applications for integrated-architectures
Dryden et al.	2019	Improving strong-scaling of CNN training by exploiting finer-grained parallelism
Teodoro et al.	2013	High-throughput analysis of large microscopy image datasets on CPU-GPU cluster platforms
US8225074B2 (en)	2012-07-17	Methods and systems for managing computations on a hybrid computing platform including a parallel accelerator
Chen et al.	2018	FlinkCL: An OpenCL-based in-memory computing architecture on heterogeneous CPU-GPU clusters for big data
Agullo et al.	2016	Task‐based FMM for heterogeneous architectures
US20200372337A1 (en)	2020-11-26	Parallelization strategies for training a neural network
Aji et al.	2016	MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL
US20210319298A1 (en)	2021-10-14	Compute-based subgraph partitioning of deep learning models for framework integration
Teodoro et al.	2014	Region templates: Data representation and management for high-throughput image analysis
Lin et al.	2021	Accelerating large sparse neural network inference using GPU task graph parallelism
Behrens et al.	2018	Efficient SIMD Vectorization for Hashing in OpenCL
Lu et al.	2024	A large-scale heterogeneous computing framework for non-uniform sampling two-dimensional convolution applications
CN111445503B (en)	2023-04-25	Pyramid mutual information image registration method based on parallel programming model on GPU cluster
Aldegheri et al.	2018	Enhancing performance of computer vision applications on low-power embedded systems through heterogeneous parallel programming
Ciznicki et al.	2017	Energy aware scheduling model and online heuristics for stencil codes on heterogeneous computing architectures
Li et al.	2024	OneGraph: a cross-architecture framework for large-scale graph computing on GPUs based on oneAPI
Liu et al.	2021	Establishing high performance AI ecosystem on Sunway platform
Membarth et al.	2019	Efficient mapping of streaming applications for image processing on graphics cards
Geng et al.	2016	The importance of efficient fine-grain synchronization for many-core systems
Saidi et al.	2013	Optimizing two-dimensional DMA transfers for scratchpad based MPSoCs platforms
Zhang et al.	2020	A two-level storage strategy for map-reduce enabled computation of local map algebra
Xu et al.	2022	Accelerating cryo-EM Reconstruction of RELION on the New Sunway Supercomputer
Lu et al.	2022	EasyNUSC: An Efficient Heterogeneous Computing Framework for Non-uniform Sampling Two-Dimensional Convolution Applications