More Web Proxy on the site http://driver.im/

research-article

Scheduling Cofiows of Multi-stage Jobs to Minimize the Total Weighted Job Completion Time

Authors:

Bingchuan Tian,

Bingquan WangAuthors Info & Claims

IEEE INFOCOM 2018 - IEEE Conference on Computer Communications

Pages 864 - 872

https://doi.org/10.1109/INFOCOM.2018.8486340

Published: 16 April 2018 Publication History

Abstract

Datacenter networks are critical to Cloud computing. The coflow abstraction is a major leap forward of application-aware network scheduling. In the context of multistage jobs, there are dependencies among coflows. As a result, there is a large divergence between coflow-completion-time (CCT) and job-completion-time (JCT). To our best knowledge, this is the first work that systematically studies: how to schedule dependent coflows of multi-stage jobs, so that the total weighted job completion time can be minimized. We present a formal mathematical formulation. We also prove that this problem is strongly NP-hard. Inspired by the optimal solution of the relaxed linear programming, we design an algorithm that runs in polynomial time to solve this problem with an approximation ratio of (2M + 1), where <tex>$M$</tex> is the number of machines. Evaluation results demonstrate that, the largest gap between our algorithm and the lower bound is only 9.14%. We reduce the average JCT by up to 33.48 % compared with Aalo, a heuristic multi-stage coflow scheduler. We reduce the total weighted JCT by up to 83.31 % compared with LP-OV-LS, the state-of-the-art approximation algorithm of coflow scheduling.

References

[1]

J. Dean and S. Ghemawat, “Mapreduce: simplified data processing on large clusters”, Communications of the ACM, vol. 51, no. 1, pp. 107–113, 2008.

Digital Library

[2]

“Apache hadoop”. http://hadoop.apache.org

[3]

M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica, “Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing”, in Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association, 2012, pp. 2–2.

Digital Library

[4]

M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica, “Managing data transfers in computer clusters with orchestra”, in ACM SIGCOMM, vol. 41, no. 4. ACM, 2011, pp. 98–109.

[5]

M. Chowdhury and I. Stoica, “Coflow: a networking abstraction for cluster applications”, in ACM Hotnets. ACM, 2012, pp. 31–36.

[6]

M. Chowdhury, Y. Zhong, and I. Stoica, “Efficient coflow scheduling with varys”, in ACM SIGCOMM. ACM, 2014, pp. 443–454.

[7]

F. R. Dogar, T. Karagiannis, H. Ballani, and A. Rowstron, “Decentralized task-aware scheduling for data center networks”, in ACM SIGCOMM Computer Communication Review, vol. 44, no. 4. ACM, 2014, pp. 431–442.

[8]

M. Chowdhury and I. Stoica, “Efficient coflow scheduling without prior knowledge”, in ACM SIGCOMM Computer Communication Review, vol. 45, no. 4. ACM, 2015, pp. 393–406.

[9]

“Tpc-ds”. http://www.tpc.org/tpcds

[10]

H. Susanto, H. Jin, and K. Chen, “Stream: Decentralized opportunistic inter-coflow scheduling for datacenter networks”, in Network Protocols (ICNP), 2016 IEEE 24th International Conference on. IEEE, 2016, pp. 1–10.

[11]

Z. Qiu, C. Stein, and Y. Zhong, “Minimizing the total weighted completion time of coflows in datacenter networks”, in Proceedings of the 27th ACM symposium on Parallelism in Algorithms and Architectures. ACM, 2015, pp. 294–303.

[12]

S. Khuller and M. Purohit, “Brief announcement: Improved approximation algorithms for scheduling co-flows”, in Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 2016, pp. 239–240.

[13]

S. Luo, H. Yu, Y. Zhao, S. Wang, S. Yu, and L. Li, “Towards practical and near-optimal coflow scheduling for data center networks”, IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 11, pp. 3366–3380, 2016.

Digital Library

[14]

M. Shafiee and J. Ghaderi, “An improved bound for minimizing the total weighted completion time of coflows in datacenters”, arXiv preprint arXiv:, 2017.

[15]

A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta, “Brief announcement: A new improved bound for coflow scheduling”, in Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures. ACM, 2017.

[16]

A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta, “V12: a scalable and flexible data center network”, in ACM SIGCOMM 2009.

[17]

A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon, S. Boving, G. Desai, B. Felderman, P. Germano et al., “Jupiter rising: A decade of clos topologies and centralized control in google's datacenter network”, in Proc. ACM SIGDC 2015. ACM, 2015, pp. 183–197.

[18]

T. A. Roemer, “A note on the complexity of the concurrent open shop problem”, Journal of scheduling, vol. 9, no. 4, pp. 389–396, 2006.

Digital Library

[19]

R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. R. Kan, “Optimization and approximation in deterministic sequencing and scheduling: a survey”, Annals of discrete mathematics, vol. 5, pp. 287–326, 1979.

[20]

J. Y.-T. Leung, H. Li, and M. Pinedo, “Order scheduling in an environment with dedicated resources in parallel”, Journal of Scheduling, vol. 8, no. 5, pp. 355–386, 2005.

Digital Library

[21]

A. Agnetis, H. Kellerer, G. Nicosia, and A. Pacifici, “Parallel dedicated machines scheduling with chain precedence constraints”, European Journal of Operational Research, vol. 221, no. 2, pp. 296–305, 2012.

[22]

M. Marcus and R. Ree, “Diagonals of doubly stochastic matrices”, The Quarterly Journal of Mathematics, vol. 10, no. 1, pp. 296–302, 1959.

[23]

F. Dufossé and B. Uçar, “Notes on birkhoff-von neumann decomposition of doubly stochastic matrices”, Linear Algebra and its Applications, vol. 497, pp. 108–115, 2016.

[24]

L. G. Khachiyan, “A polynomial algorithm in linear programming”, Ussr Computational Mathematics & Mathematical Physics, vol. 20, no. 80, pp. 1–3, 1979.

[25]

N. Karmarkar, “A new polynomial-time algorithm for linear programming”, in Proceedings of the sixteenth annual ACM symposium on Theory of computing ACM, 1984, pp. 302–311.

[26]

Y. Li, S. H.-C. Jiang, H. Tan, C. Zhang, G. Chen, J. Zhou, and F. Lau, “Efficient online coflow routing and scheduling”, in Proceedings of the 17th ACM International Symposium on Mobile Ad Hoc Networking and Computing. ACM, 2016, pp. 161–170.

Cited By

Pan RLei YLi JXie ZYuan BXia Y(2022)Efficient flow scheduling in distributed deep learning training with echelon formationProceedings of the 21st ACM Workshop on Hot Topics in Networks10.1145/3563766.3564096(93-100)Online publication date: 14-Nov-2022
https://dl.acm.org/doi/10.1145/3563766.3564096
Shi YWen MZhang C(2020)Towards High-Efficiency Data Centers via Job-Aware Network SchedulingProceedings of the 49th International Conference on Parallel Processing10.1145/3404397.3404474(1-10)Online publication date: 17-Aug-2020
https://dl.acm.org/doi/10.1145/3404397.3404474

Index Terms

Scheduling Cofiows of Multi-stage Jobs to Minimize the Total Weighted Job Completion Time
1. Theory of computation
  1. Design and analysis of algorithms
    1. Approximation algorithms analysis
      1. Scheduling algorithms
    2. Online algorithms
      1. Online learning algorithms
  2. Theory and algorithms for application domains
    1. Machine learning theory
      1. Reinforcement learning
        Sequential decision making

Index terms have been assigned to the content through auto-classification.

Recommendations

Job scheduling to minimize the weighted waiting time variance of jobs

This study considers the job scheduling problem of minimizing the weighted waiting time variance (WWTV) of jobs. It is an extension of WTV minimization problems in which we schedule a batch of n jobs, for servicing on a single resource, in such a way ...
Online scheduling with linear deteriorating jobs to minimize the total weighted completion time

In this paper, we study the online scheduling of linear deteriorating jobs on a single machine to minimize the total weighted completion time. In the problem, a set of n independent linear deteriorating jobs arriving online over time has to be scheduled ...
Primary-secondary bicriteria scheduling on identical machines to minimize the total completion time of all jobs and the maximum T-time of all machines

In this paper, we study a new primary-secondary bicriteria scheduling problem on identical machines. The primary objective is to minimize the total completion time of all jobs and the secondary objective is to minimize the maximum T-time of all machines,...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings

IEEE INFOCOM 2018 - IEEE Conference on Computer Communications

Apr 2018

2776 pages

Copyright © 2018.

Publisher

IEEE Press

Publication History

Published: 16 April 2018

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
0
Total Downloads

Downloads (Last 12 months)0
Downloads (Last 6 weeks)0

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

Cited By

Pan RLei YLi JXie ZYuan BXia Y(2022)Efficient flow scheduling in distributed deep learning training with echelon formationProceedings of the 21st ACM Workshop on Hot Topics in Networks10.1145/3563766.3564096(93-100)Online publication date: 14-Nov-2022
https://dl.acm.org/doi/10.1145/3563766.3564096
Shi YWen MZhang C(2020)Towards High-Efficiency Data Centers via Job-Aware Network SchedulingProceedings of the 49th International Conference on Parallel Processing10.1145/3404397.3404474(1-10)Online publication date: 17-Aug-2020
https://dl.acm.org/doi/10.1145/3404397.3404474

View Options

View options

Figures

Tables

Media

View Table of Conten