Ino et al., 2014 - Google Patents
A parallel scheme for accelerating parameter sweep applications on a GPUIno et al., 2014
View PDF- Document ID
- 9229135250520716485
- Author
- Ino F
- Shigeoka K
- Okuyama T
- Motokubota M
- Hagihara K
- Publication year
- Publication venue
- Concurrency and Computation: Practice and Experience
External Links
Snippet
This paper proposes a parallel scheme for accelerating parameter sweep applications on a graphics processing unit. By using hundreds of cores on the graphics processing unit, we found that our scheme simultaneously processes multiple parameters rather than a single …
- 238000000034 method 0 abstract description 36
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mittal et al. | A survey of techniques for optimizing deep learning on GPUs | |
Silberstein et al. | Efficient computation of sum-products on GPUs through software-managed cache | |
Hacene et al. | Accelerating VASP electronic structure calculations using graphic processing units | |
Goddeke et al. | Using GPUs to improve multigrid solver performance on a cluster | |
Agullo et al. | LU factorization for accelerator-based systems | |
US7937567B1 (en) | Methods for scalably exploiting parallelism in a parallel processing system | |
US10691597B1 (en) | Method and system for processing big data | |
EP3757754B1 (en) | Sorting for data-parallel computing devices | |
US10332229B2 (en) | System and method for high performance k-means clustering on GPU with smart kernels | |
US20150324707A1 (en) | System and method for selecting useful smart kernels for general-purpose gpu computing | |
Igual et al. | The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator implementations | |
JP2013500543A (en) | Mapping across multiple processors of processing logic with data parallel threads | |
Gu et al. | Improving execution concurrency of large-scale matrix multiplication on distributed data-parallel platforms | |
Wang et al. | {MGG}: Accelerating graph neural networks with {Fine-Grained}{Intra-Kernel}{Communication-Computation} pipelining on {Multi-GPU} platforms | |
Docan et al. | Activespaces: Exploring dynamic code deployment for extreme scale data processing | |
Rubin et al. | Maps: Optimizing massively parallel applications using device-level memory abstraction | |
Kelefouras et al. | A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD | |
Huang et al. | GPU computing performance analysis on matrix multiplication | |
Awatramani et al. | Increasing gpu throughput using kernel interleaved thread block scheduling | |
Wilkinson et al. | Porting ONETEP to graphical processing unit‐based coprocessors. 1. FFT box operations | |
Ibrahim et al. | Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms | |
Wan et al. | GPU implementation of a parallel two‐list algorithm for the subset‐sum problem | |
Zhou et al. | FASTCF: FPGA-based accelerator for stochastic-gradient-descent-based collaborative filtering | |
Fang et al. | Evaluating vector data type usage in OpenCL kernels | |
Ino et al. | A parallel scheme for accelerating parameter sweep applications on a GPU |