[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

Heavy-Traffic Optimal Size- and State-Aware Dispatching

Published: 21 February 2024 Publication History

Abstract

Dispatching systems, where arriving jobs are immediately assigned to one of multiple queues, are ubiquitous in computer systems and service systems. A natural and practically relevant model is one in which each queue serves jobs in FCFS (First-Come First-Served) order. We consider the case where the dispatcher is size-aware, meaning it learns the size (i.e. service time) of each job as it arrives; and state-aware, meaning it always knows the amount of work (i.e. total remaining service time) at each queue. While size- and state-aware dispatching to FCFS queues has been extensively studied, little is known about optimal dispatching for the objective of minimizing mean delay. A major obstacle is that no nontrivial lower bound on mean delay is known, even in heavy traffic (i.e. the limit as load approaches capacity). This makes it difficult to prove that any given policy is optimal, or even heavy-traffic optimal. In this work, we propose the first size- and state-aware dispatching policy that provably minimizes mean delay in heavy traffic. Our policy, called CARD (Controlled Asymmetry Reduces Delay), keeps all but one of the queues short, then routes as few jobs as possible to the one long queue. We prove an upper bound on CARD's mean delay, and we prove the first nontrivial lower bound on the mean delay of any size- and state-aware dispatching policy. Both results apply to any number of servers. Our bounds match in heavy traffic, implying CARD's heavy-traffic optimality. In particular, CARD's heavy-traffic performance improves upon that of LWL (Least Work Left), SITA (Size Interval Task Assignment), and other policies from the literature whose heavy-traffic performance is known.

References

[1]
Osman T Akgun, Rhonda Righter, and Ronald Wolff. 2013. Partial flexibility in routeing and scheduling. Advances in Applied Probability 45, 3 (2013), 673--691.
[2]
Jonatha Anselmi. 2019. Combining size-based load balancing with round-robin for scalable low latency. IEEE Transactions on Parallel and Distributed Systems 31, 4 (2019), 886--896.
[3]
Francois Baccelli and Pierre Brémaud. 2002. Elements of queueing theory: Palm Martingale calculus and stochastic recurrences. Vol. 26. Springer Science & Business Media.
[4]
Yan Chen and Jing Dong. 2021. Scheduling with service-time information: The power of two priority classes. arXiv preprint arXiv:2105.10499 (2021).
[5]
DJ Daley. 1987. Certain optimality properties of the first-come first-served discipline for G/G/s queues. Stochastic Processes and their Applications 25 (1987), 301--308.
[6]
Douglas G Down and Rong Wu. 2006. Multi-layered round robin routing for parallel servers. Queueing Systems 53 (2006), 177--188.
[7]
Anthony Ephremides, Pravin Varaiya, and Jean Walrand. 1980. A simple dynamic routing problem. IEEE transactions on Automatic Control 25, 4 (1980), 690--693.
[8]
Atilla Eryilmaz and Rayadurgam Srikant. 2012. Asymptotically tight steady-state queue length bounds implied by drift conditions. Queueing Systems 72 (2012), 311--359.
[9]
Hanhua Feng, Vishal Misra, and Dan Rubenstein. 2005. Optimal state-free, size-aware dispatching for heterogeneous M/G/-type systems. Performance evaluation 62, 1--4 (2005), 475--492.
[10]
Sergey Foss, Seva Shneer, and Andrey Tyurlikov. 2012. Stability of a Markov-modulated Markov chain, with application to a wireless network governed by two protocols. Stochastic Systems 2, 1 (2012), 208--231.
[11]
Sergei Georgievich Foss. 1980. Approximation of multichannel queueing systems. Siberian Mathematical Journal 21, 6 (1980), 851--857.
[12]
Xinzhe Fu and Eytan Modiano. 2022. Joint Learning and Control in Stochastic Queueing Networks with Unknown Utilities. Proceedings of the ACM on Measurement and Analysis of Computing Systems 6, 3 (2022), 1--32.
[13]
Steve W Fuhrmann and Robert B Cooper. 1985. Stochastic decompositions in the M/G/1 queue with generalized vacations. Operations research 33, 5 (1985), 1117--1129.
[14]
Isaac Grosof, Ziv Scully, and Mor Harchol-Balter. 2019. Load balancing guardrails: keeping your heavy traffic on the road to low response times. Proceedings of the ACM on Measurement and Analysis of Computing Systems 3, 2 (2019), 1--31.
[15]
Isaac Grosof, Kunhe Yang, Ziv Scully, and Mor Harchol-Balter. 2021. Nudge: Stochastically improving upon FCFS. Proceedings of the ACM on Measurement and Analysis of Computing Systems 5, 2 (2021), 1--29.
[16]
Mor Harchol-Balter. 2013. Performance modeling and design of computer systems: queueing theory in action. Cambridge University Press.
[17]
Mor Harchol-Balter, Mark E Crovella, and Cristina D Murta. 1999. On choosing a task assignment policy for a distributed server system. J. Parallel and Distrib. Comput. 59, 2 (1999), 204--228.
[18]
Mor Harchol-Balter, Alan Scheller-Wolf, and Andrew R Young. 2009. Surprising results on task assignment in server farms with high-variability workloads. In Proceedings of the eleventh international joint conference on Measurement and modeling of computer systems. 287--298.
[19]
Mor Harchol-Balter and Rein Vesilo. 2010. To balance or unbalance load in size-interval task allocation. Probability in the Engineering and Informational Sciences 24, 2 (2010), 219--244.
[20]
Daniela Hurtado-Lange and Siva Theja Maguluri. 2022. A load balancing system in the many-server heavy-traffic asymptotics. Queueing Systems 101, 3--4 (2022), 353--391.
[21]
Daniela Hurtado-Lange, Sushil Mahavir Varma, and Siva Theja Maguluri. 2022. Logarithmic heavy traffic error bounds in generalized switch and load balancing systems. Journal of Applied Probability 59, 3 (2022), 652--669.
[22]
Esa Hyytiä. 2013. Lookahead actions in dispatching to parallel queues. Performance Evaluation 70, 10 (2013), 859--872.
[23]
Esa Hyytiä, Peter Jacko, and Rhonda Righter. 2022. Routing with too much information? Queueing Systems 100, 3--4 (2022), 441--443.
[24]
Esa Hyytiä, Aleksi Penttinen, and Samuli Aalto. 2012. Size-and state-aware dispatching problem with queue-specific job sizes. European Journal of Operational Research 217, 2 (2012), 357--370.
[25]
Esa Hyytiä and Rhonda Righter. 2020. STAR and RATS: Multi-level dispatching policies. In 2020 32nd International Teletraffic Congress (ITC 32). IEEE, 81--89.
[26]
Esa Hyytiä and Rhonda Righter. 2022. On Sequential Dispatching Policies. In 2022 32nd International Telecommunication Networks and Applications Conference (ITNAC). IEEE, 1--6.
[27]
Esa Hyytiä and Rhonda Righter. 2023. On Dynamic Size-Aware Dispatching and Computation of the Optimal Actions. SSRN 4395052 (2023).
[28]
Prakirt Raj Jhunjhunwala and Siva Theja Maguluri. 2021. Low-complexity switch scheduling algorithms: delay optimality in heavy traffic. IEEE/ACM Transactions on Networking 30, 1 (2021), 464--473.
[29]
Prakirt Raj Jhunjhunwala and Siva Theja Maguluri. 2023. Heavy Traffic Queue Length Distribution without Resource Pooling in an Input-Queued Switch. ACM SIGMETRICS Performance Evaluation Review 50, 4 (2023), 26--28.
[30]
Frank P Kelly and CN Laws. 1993. Dynamic routing in open queueing networks: Brownian models, cut constraints and resource pooling. Queueing systems 13 (1993), 47--86.
[31]
Gerhardus Martinus Koole. 1992. On the optimality of FCFS for networks of multi-server queues. Centre for Mathematics and Computer Science.
[32]
Daniela Hurtado Lange and Siva Theja Maguluri. 2019. Heavy-traffic analysis of the generalized switch under multidimensional state space collapse. ACM SIGMETRICS Performance Evaluation Review 47, 2 (2019), 36--38.
[33]
Xin Liu, Kang Gong, and Lei Ying. 2022. Steady-state analysis of load balancing with Coxian-2 distributed service times. Naval Research Logistics (NRL) 69, 1 (2022), 57--75.
[34]
Zhen Liu and Rhonda Righter. 1998. Optimal load balancing on distributed homogeneous unreliable processors. Operations Research 46, 4 (1998), 563--573.
[35]
Zhen Liu and Don Towsley. 1994. Optimality of the round-robin routing policy. Journal of applied probability 31, 2 (1994), 466--475.
[36]
Siva Theja Maguluri, Sai Kiran Burle, and Rayadurgam Srikant. 2018. Optimal heavy-traffic queue length scaling in an incompletely saturated switch. Queueing Systems 88 (2018), 279--309.
[37]
Siva Theja Maguluri and R Srikant. 2016. Heavy traffic queue length behavior in a switch under the MaxWeight algorithm. Stochastic Systems 6, 1 (2016), 211--250.
[38]
Aniket Mahanti, Niklas Carlsson, Anirban Mahanti, Martin Arlitt, and Carey Williamson. 2013. A tale of the tails: Power-laws in internet measurements. IEEE Network 27, 1 (2013), 59--64.
[39]
Sean P Meyn and Douglas Down. 1994. Stability of generalized Jackson networks. The Annals of Applied Probability (1994), 124--148.
[40]
Sean P Meyn and Richard L Tweedie. 1993. Stability of Markovian processes II: Continuous-time processes and sampled chains. Advances in Applied Probability 25, 3 (1993), 487--517.
[41]
Sean P Meyn and Richard L Tweedie. 1993. Stability of Markovian processes III: Foster--Lyapunov criteria for continuous-time processes. Advances in Applied Probability 25, 3 (1993), 518--548.
[42]
Masakiyo Miyazawa. 1994. Rate conservation laws: a survey. Queueing Systems 15 (1994), 1--58.
[43]
Sheldon M Ross. 1995. Stochastic processes. John Wiley & Sons.
[44]
Sigurður Gauti Samúelsson and Esa Hyytiä. 2018. Applying reinforcement learning to basic routing problem. In Queueing Theory and Network Applications: 13th International Conference, QTNA 2018, Tsukuba, Japan, July 25--27, 2018, Proceedings 13. Springer, 238--249.
[45]
Linus Schrage. 1968. A proof of the optimality of the shortest remaining processing time discipline. Operations Research 16, 3 (1968), 687--690.
[46]
Ziv Scully. 2022. A New Toolbox for Scheduling Theory. Ph.D. Dissertation. Carnegie Mellon University, Pittsburgh, PA. https://ziv.codes/pdf/scully-thesis.pdf
[47]
Ziv Scully, Isaac Grosof, and Mor Harchol-Balter. 2020. The Gittins policy is nearly optimal in the M/G/k under extremely general conditions. Proceedings of the ACM on Measurement and Analysis of Computing Systems 4, 3 (2020), 1--29.
[48]
Yih-Choung Teh and Amy R Ward. 2002. Critical thresholds for dynamic routing in queueing networks. Queueing Systems 42 (2002), 297--316.
[49]
Chang-Heng Wang, Siva Theja Maguluri, and Tara Javidi. 2017. Heavy traffic queue length behavior in switches with reconfiguration delay. In IEEE INFOCOM 2017-IEEE Conference on Computer Communications. IEEE, 1--9.
[50]
Yinghui Wang and Douglas Down. 2014. On resource pooling in SITA-like parallel server systems. In 2014 26th International Teletraffic Congress (ITC). IEEE, 1--9.
[51]
Richard R Weber. 1978. On the optimal assignment of customers to parallel servers. Journal of Applied Probability 15, 2 (1978), 406--413.
[52]
Wentao Weng, Xingyu Zhou, and R Srikant. 2020. Optimal load balancing with locality constraints. Proceedings of the ACM on Measurement and Analysis of Computing Systems 4, 3 (2020), 1--37.
[53]
Ward Whitt. 1993. Approximations for the GI/G/m queue. Production and Operations Management 2, 2 (1993), 114--161.
[54]
Adam Wierman and Bert Zwart. 2012. Is tail-optimal scheduling possible? Operations research 60, 5 (2012), 1249--1257.
[55]
Wayne Winston. 1977. Optimality of the shortest line discipline. Journal of applied probability 14, 1 (1977), 181--189.
[56]
Ronald W Wolff. 1982. Poisson arrivals see time averages. Operations research 30, 2 (1982), 223--231.
[57]
Runhan Xie and Ziv Scully. 2023. Reducing heavy-traffic response time with asymmetric dispatching. ACM SIGMETRICS Performance Evaluation Review 51, 2 (2023), 36--38.
[58]
Xingyu Zhou, Jian Tan, and Ness Shroff. 2018. Flexible load balancing with multi-dimensional state-space collapse: Throughput and heavy-traffic delay optimality. Performance Evaluation 127--128 (2018), 176--193.
[59]
Xingyu Zhou, Jian Tan, and Ness Shroff. 2018. Heavy-traffic delay optimality in pull-based load balancing systems: Necessary and sufficient conditions. Proceedings of the ACM on Measurement and Analysis of Computing Systems 2, 3 (2018), 1--33.
[60]
Xingyu Zhou, Fei Wu, Jian Tan, Kannan Srinivasan, and Ness Shroff. 2018. Degree of queue imbalance: Overcoming the limitation of heavy-traffic delay optimality in load balancing systems. Proceedings of the ACM on Measurement and Analysis of Computing Systems 2, 1 (2018), 1--41.
[61]
Xingyu Zhou, Fei Wu, Jian Tan, Yin Sun, and Ness Shroff. 2017. Designing low-complexity heavy-traffic delay-optimal load balancing schemes: Theory to algorithms. Proceedings of the ACM on Measurement and Analysis of Computing Systems 1, 2 (2017), 1--30.

Cited By

View all
  • (2024)Heavy-Traffic Optimal Size- and State-Aware DispatchingACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365505952:1(7-8)Online publication date: 13-Jun-2024
  • (2024)Heavy-Traffic Optimal Size- and State-Aware DispatchingAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655059(7-8)Online publication date: 10-Jun-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Measurement and Analysis of Computing Systems
Proceedings of the ACM on Measurement and Analysis of Computing Systems  Volume 8, Issue 1
POMACS
March 2024
494 pages
EISSN:2476-1249
DOI:10.1145/3649331
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 February 2024
Published in POMACS Volume 8, Issue 1

Check for updates

Author Tags

  1. asymptotic optimality
  2. dispatching
  3. fcfs
  4. heavy traffic
  5. latency
  6. response time
  7. sojourn time

Qualifiers

  • Research-article

Funding Sources

  • Tennenbaum Postdoctoral Fellowship at the Georgia Institute of Technology School of Industrial and Systems Engineering
  • US National Science Foundation

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)548
  • Downloads (Last 6 weeks)332
Reflects downloads up to 31 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Heavy-Traffic Optimal Size- and State-Aware DispatchingACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365505952:1(7-8)Online publication date: 13-Jun-2024
  • (2024)Heavy-Traffic Optimal Size- and State-Aware DispatchingAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655059(7-8)Online publication date: 10-Jun-2024

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media