Enabling energy-efficient DNN training on hybrid GPU-FPGA accelerators
He et al., 2021
- Document ID
- 18422944688050821359
- Author
- He X
- Liu J
- Xie Z
- Chen H
- Chen G
- Zhang W
- Li D
- Publication year
- 2021
- Publication venue
- Proceedings of the 35th ACM International Conference on Supercomputing
Snippet
DNN training consumes orders of magnitude more energy than inference and requires innovative use of accelerators to improve energy-efficiency. However, despite having complementary features, GPUs and FPGAs have been mostly used independently for the …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
- G06F17/5009—Computer-aided design using simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2217/00—Indexing scheme relating to computer aided design [CAD]
- G06F2217/78—Power analysis and optimization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Zhang et al. | A full-stack search technique for domain optimized deep learning accelerators | |
Lorenzon et al. | Parallel computing hits the power wall: principles, challenges, and a survey of solutions | |
Butko et al. | Full-system simulation of big.LITTLE multicore architecture for performance and energy exploration | |
Prakash et al. | Energy-efficient execution of data-parallel applications on heterogeneous mobile platforms | |
He et al. | Enabling energy-efficient DNN training on hybrid GPU-FPGA accelerators | |
Guerreiro et al. | GPU static modeling using PTX and deep structured learning | |
D’Agostino et al. | Hardware and Software Solutions for Energy‐Efficient Computing in Scientific Programming | |
Fahad et al. | Accurate energy modelling of hybrid parallel applications on modern heterogeneous computing platforms using system-level measurements | |
Langer et al. | Energy-efficient computing for HPC workloads on heterogeneous manycore chips | |
Wang et al. | Software support for heterogeneous computing | |
Lai et al. | Break down GPU execution time with an analytical method | |
Du et al. | Improving computation and memory efficiency for real-world transformer inference on GPUs | |
Marques et al. | Optimizing the EDP of OpenMP applications via concurrency throttling and frequency boosting | |
de Lima et al. | A neural network framework for optimizing parallel computing in cloud servers | |
Halbiniak et al. | Exploration of OpenCL heterogeneous programming for porting solidification modeling to CPU‐GPU platforms | |
Davis et al. | Paradigmatic shifts for exascale supercomputing | |
Choi et al. | Analyzing the energy efficiency of the fast multipole method using a DVFS-aware energy model | |
Anzt et al. | Improving the energy efficiency of sparse linear system solvers on multicore and manycore systems | |
Soudris et al. | EXA2PRO programming environment: Architecture and Applications | |
Mudalige et al. | Predictive modeling and analysis of OP2 on distributed memory GPU clusters | |
Benedict | Prediction assisted runtime based energy tuning mechanism for HPC applications | |
Karamati et al. | An energy-efficient single-source shortest path algorithm | |
Maas et al. | An ANN-Guided Multi-Objective Framework for Power-Performance Balancing in HPC Systems | |
Baruah | Energy efficient execution of heterogeneous applications | |
Vysocky et al. | Energy-Efficient Implementation of the Lattice Boltzmann Method |