Bao et al., 2022 - Google Patents

Deep learning-based job placement in distributed machine learning clusters with heterogeneous workloads

Document ID: 13764706070418384828
Author: Bao Y; Peng Y; Wu C
Publication year: 2022
Publication venue: IEEE/ACM Transactions on Networking

Snippet

Nowadays, most leading IT companies host a variety of distributed machine learning (ML) workloads in ML clusters to support AI-driven services, such as speech recognition, machine translation, and image processing. While multiple jobs are executed concurrently in a …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce

Similar Documents

Publication	Publication Date	Title
Bao et al.	2019	Deep learning-based job placement in distributed machine learning clusters
Peng et al.	2021	DL2: A deep learning-driven scheduler for deep learning clusters
Bao et al.	2022	Deep learning-based job placement in distributed machine learning clusters with heterogeneous workloads
Yu et al.	2021	Faasrank: Learning to schedule functions in serverless platforms
CN104123189B (en)	2017-12-01	A dynamic resource adjustment method for multi-tier Web applications based on IaaS-layer application awareness
Wu et al.	2022	HiTDL: High-throughput deep learning inference at the hybrid mobile edge
Mondal et al.	2021	Scheduling of time-varying workloads using reinforcement learning
Meyer et al.	2021	ML-driven classification scheme for dynamic interference-aware resource scheduling in cloud infrastructures
Yu et al.	2022	Workflow performance prediction based on graph structure aware deep attention neural network
Bian et al.	2021	Online evolutionary batch size orchestration for scheduling deep learning workloads in GPU clusters
Nigade et al.	2022	Jellyfish: Timely inference serving for dynamic edge networks
Cheng et al.	2023	Proscale: Proactive autoscaling for microservice with time-varying workload at the edge
Zhao et al.	2021	Large-scale machine learning cluster scheduling via multi-agent graph reinforcement learning
Tang et al.	2019	Nanily: A qos-aware scheduling for dnn inference workload in clouds
Gudur et al.	2020	Resource-constrained federated learning with heterogeneous labels and models
CN114895773A (en)	2022-08-12	Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium
Zheng et al.	2023	Shockwave: Fair and efficient cluster scheduling for dynamic adaptation in machine learning
Li et al.	2023	Tapfinger: Task placement and fine-grained resource allocation for edge machine learning
Fang et al.	2019	Multi-tenant mobile offloading systems for real-time computer vision applications
Chouliaras et al.	2023	An adaptive auto-scaling framework for cloud resource provisioning
Zhou et al.	2024	Training and Serving System of Foundation Models: A Comprehensive Survey
Denninnart et al.	2020	Efficient task pruning mechanism to improve robustness of heterogeneous computing systems
Bhattacharjee et al.	2020	Deep-edge: An efficient framework for deep learning model update on heterogeneous edge
Yeung et al.	2020	Horus: An interference-aware resource manager for deep learning systems
Qiu et al.	2024	FLASH: Fast model adaptation in ML-centric cloud platforms