Korch et al., 2018 - Google Patents
Accelerating explicit ODE methods on GPUs by kernel fusionKorch et al., 2018
- Document ID
- 16048403560086785731
- Author
- Korch M
- Werner T
- Publication year
- Publication venue
- Concurrency and Computation: Practice and Experience
External Links
Snippet
Graphics processing units (GPUs) have a promising architecture for implementing highly parallel solution methods for systems of ordinary differential equations (ODEs). However, their high performance comes at the price of caveats such as small caches or wide SIMD …
- 230000004927 fusion 0 title abstract description 74
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/4421—Execution paradigms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/24—Editing, e.g. insert/delete
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Abdelfattah et al. | Performance, design, and autotuning of batched GEMM for GPUs | |
Breß et al. | Efficient co-processor utilization in database query processing | |
Wozniak et al. | Turbine: A distributed-memory dataflow engine for extreme-scale many-task applications | |
Donfack et al. | A survey of recent developments in parallel implementations of Gaussian elimination | |
Lani et al. | A GPU-enabled finite volume solver for global magnetospheric simulations on unstructured grids | |
Hogg et al. | A sparse symmetric indefinite direct solver for GPU architectures | |
Korch et al. | Accelerating explicit ODE methods on GPUs by kernel fusion | |
Jung et al. | DeepCuts: a deep learning optimization framework for versatile GPU workloads | |
Osama et al. | Parallel SAT simplification on GPU architectures | |
Al-Mouhamed et al. | A review of CUDA optimization techniques and tools for structured grid computing | |
He et al. | An efficient sparse approximate inverse preconditioning algorithm on GPU | |
Dinh et al. | Extending the nested parallel model to the nested dataflow model with provably efficient schedulers | |
Marszałek et al. | Fully flexible parallel merge sort for multicore architectures | |
Oyarzun et al. | Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers | |
Ashraf et al. | Empirical investigation: performance and power‐consumption based dual‐level model for exascale computing systems | |
Mele et al. | A PETSc parallel‐in‐time solver based on MGRIT algorithm | |
Schwartz et al. | Pebbling game and alternative basis for high performance matrix multiplication | |
Ait Aba et al. | Efficient algorithm for scheduling parallel applications on hybrid multicore machines with communications delays and energy constraint | |
Abdelfattah et al. | Performance optimization of Sparse Matrix‐Vector Multiplication for multi‐component PDE‐based applications using GPUs | |
Myllykoski et al. | Task‐based, GPU‐accelerated and robust library for solving dense nonsymmetric eigenvalue problems | |
Loffeld et al. | On the arithmetic intensity of high-order finite-volume discretizations for hyperbolic systems of conservation laws | |
Chen et al. | Communication bounds for convolutional neural networks | |
Iwasawa et al. | Implementation and performance of Barnes-hut n-body algorithm on extreme-scale heterogeneous many-core architectures | |
Dufrechou et al. | Using analysis information in the synchronization‐free GPU solution of sparse triangular systems | |
Rouberte et al. | DF‐DTM: Dynamic Task Memoization and reuse in dataflow |