Ino et al., 2014 - Google Patents

A parallel scheme for accelerating parameter sweep applications on a GPU

Document ID: 9229135250520716485
Authors: Ino F; Shigeoka K; Okuyama T; Motokubota M; Hagihara K
Publication year: 2014
Publication venue: Concurrency and Computation: Practice and Experience

Snippet

This paper proposes a parallel scheme for accelerating parameter sweep applications on a graphics processing unit. By using hundreds of cores on the graphics processing unit, we found that our scheme simultaneously processes multiple parameters rather than a single …
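The idea in the snippet, covering several sweep parameters in one pass instead of one parameter per pass, can be illustrated with a minimal CPU-side sketch. Everything here is an assumption made for illustration: the function names, the toy `model`, and the flat-index mapping are hypothetical and are not taken from the paper, whose actual scheme is not described in this record.

```python
# Hypothetical sketch: emulate on the CPU how a single GPU launch could
# cover several sweep parameters at once. Not the paper's implementation.

def model(x, p):
    """Toy per-element computation standing in for the real application."""
    return p * x * x

def sweep_one_at_a_time(data, params):
    # One "launch" per parameter: each launch has only len(data) work items.
    return [[model(x, p) for x in data] for p in params]

def sweep_batched(data, params):
    # One "launch" for all parameters: the flat index t plays the role of a
    # global thread ID and is split into (parameter index, element index).
    n = len(data)
    out = [[0.0] * n for _ in params]
    for t in range(len(params) * n):
        pi, xi = divmod(t, n)
        out[pi][xi] = model(data[xi], params[pi])
    return out

data = [1.0, 2.0, 3.0]
params = [0.5, 2.0]
# Both strategies compute the same results; only the work layout differs.
assert sweep_batched(data, params) == sweep_one_at_a_time(data, params)
```

On a real device the loop over `t` would be replaced by parallel threads, so the batched layout exposes `len(params) * len(data)` independent work items per launch rather than `len(data)`.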

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F9/00—Arrangements for programme control, e.g. control unit
        - G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
          - G06F9/46—Multiprogramming arrangements
            - G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
              - G06F9/5061—Partitioning or combining of resources
              - G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
          - G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
            - G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
        - G06F17/10—Complex mathematical operations
          - G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
            - G06F17/141—Discrete Fourier transforms
      - G06F8/00—Arrangements for software engineering
        - G06F8/40—Transformations of program code
          - G06F8/41—Compilation
      - G06F12/00—Accessing, addressing or allocating within memory systems or architectures
        - G06F12/02—Addressing or allocation; Relocation
      - G06F15/00—Digital computers in general; Data processing equipment in general
        - G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
        - G06F15/76—Architectures of general purpose stored programme computers
      - G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
      - G06F11/00—Error detection; Error correction; Monitoring
      - G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
    - G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL

Similar Documents

Publication	Publication Date	Title
Mittal et al.	2019	A survey of techniques for optimizing deep learning on GPUs
Silberstein et al.	2008	Efficient computation of sum-products on GPUs through software-managed cache
Hacene et al.	2012	Accelerating VASP electronic structure calculations using graphic processing units
Goddeke et al.	2008	Using GPUs to improve multigrid solver performance on a cluster
Agullo et al.	2011	LU factorization for accelerator-based systems
US7937567B1 (en)	2011-05-03	Methods for scalably exploiting parallelism in a parallel processing system
US10691597B1 (en)	2020-06-23	Method and system for processing big data
EP3757754B1 (en)	2023-01-04	Sorting for data-parallel computing devices
US10332229B2 (en)	2019-06-25	System and method for high performance k-means clustering on GPU with smart kernels
US20150324707A1 (en)	2015-11-12	System and method for selecting useful smart kernels for general-purpose gpu computing
Igual et al.	2012	The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations
JP2013500543A (en)	2013-01-07	Mapping across multiple processors of processing logic with data parallel threads
Gu et al.	2017	Improving execution concurrency of large-scale matrix multiplication on distributed data-parallel platforms
Wang et al.	2023	MGG: Accelerating graph neural networks with Fine-Grained Intra-Kernel Communication-Computation pipelining on Multi-GPU platforms
Docan et al.	2015	Activespaces: Exploring dynamic code deployment for extreme scale data processing
Rubin et al.	2014	Maps: Optimizing massively parallel applications using device-level memory abstraction
Kelefouras et al.	2014	A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD
Huang et al.	2019	GPU computing performance analysis on matrix multiplication
Awatramani et al.	2013	Increasing gpu throughput using kernel interleaved thread block scheduling
Wilkinson et al.	2013	Porting ONETEP to graphical processing unit‐based coprocessors. 1. FFT box operations
Ibrahim et al.	2013	Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms
Wan et al.	2015	GPU implementation of a parallel two‐list algorithm for the subset‐sum problem
Zhou et al.	2018	FASTCF: FPGA-based accelerator for stochastic-gradient-descent-based collaborative filtering
Fang et al.	2015	Evaluating vector data type usage in OpenCL kernels