Deep learning-based job placement in distributed machine learning clusters with heterogeneous workloads
Bao et al., 2022
- Document ID
- 13764706070418384828
- Author
- Bao Y
- Peng Y
- Wu C
- Publication year
- 2022
- Publication venue
- IEEE/ACM Transactions on Networking
Snippet
Nowadays, most leading IT companies host a variety of distributed machine learning (ML) workloads in ML clusters to support AI-driven services, such as speech recognition, machine translation, and image processing. While multiple jobs are executed concurrently in a …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Bao et al. | Deep learning-based job placement in distributed machine learning clusters | |
Peng et al. | DL2: A deep learning-driven scheduler for deep learning clusters | |
Bao et al. | Deep learning-based job placement in distributed machine learning clusters with heterogeneous workloads | |
Yu et al. | FaaSRank: Learning to schedule functions in serverless platforms | |
CN104123189B (en) | A kind of Web multilayer application dynamic resource methods of adjustment perceived based on the application of IaaS layers | |
Wu et al. | HiTDL: High-throughput deep learning inference at the hybrid mobile edge | |
Mondal et al. | Scheduling of time-varying workloads using reinforcement learning | |
Meyer et al. | ML-driven classification scheme for dynamic interference-aware resource scheduling in cloud infrastructures | |
Yu et al. | Workflow performance prediction based on graph structure aware deep attention neural network | |
Bian et al. | Online evolutionary batch size orchestration for scheduling deep learning workloads in GPU clusters | |
Nigade et al. | Jellyfish: Timely inference serving for dynamic edge networks | |
Cheng et al. | Proscale: Proactive autoscaling for microservice with time-varying workload at the edge | |
Zhao et al. | Large-scale machine learning cluster scheduling via multi-agent graph reinforcement learning | |
Tang et al. | Nanily: A QoS-aware scheduling for DNN inference workload in clouds | |
Gudur et al. | Resource-constrained federated learning with heterogeneous labels and models | |
CN114895773A (en) | Energy consumption optimization method, system and device of heterogeneous multi-core processor and storage medium | |
Zheng et al. | Shockwave: Fair and efficient cluster scheduling for dynamic adaptation in machine learning | |
Li et al. | Tapfinger: Task placement and fine-grained resource allocation for edge machine learning | |
Fang et al. | Multi-tenant mobile offloading systems for real-time computer vision applications | |
Chouliaras et al. | An adaptive auto-scaling framework for cloud resource provisioning | |
Zhou et al. | Training and Serving System of Foundation Models: A Comprehensive Survey | |
Denninnart et al. | Efficient task pruning mechanism to improve robustness of heterogeneous computing systems | |
Bhattacharjee et al. | Deep-edge: An efficient framework for deep learning model update on heterogeneous edge | |
Yeung et al. | Horus: An interference-aware resource manager for deep learning systems | |
Qiu et al. | FLASH: Fast model adaptation in ML-centric cloud platforms |