[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Approximations to Study the Impact of the Service Discipline in Systems with Redundancy

Published: 21 February 2024 Publication History

Abstract

As job redundancy has been recognized as an effective means to improve performance of large-scale computer systems, queueing systems with redundancy have been studied by various authors. Existing results include methods to compute the queue length distribution and response time but only when the service discipline is First-Come-First-Served (FCFS). For other service disciplines, such as Processor Sharing (PS), or Last-Come-First-Served (LCFS), only the stability conditions are known. In this paper we develop the first methods to approximate the queue length distribution in a queueing system with redundancy under various service disciplines. We focus on a system with exponential job sizes, i.i.d. copies, and a large number of servers. We first derive a mean field approximation that is independent of the scheduling policy. In order to study the impact of service discipline, we then derive refinements of this approximation to specific scheduling policies. In the case of Processor Sharing, we provide a pair and a triplet approximation. The pair approximation can be regarded as a refinement of the classic mean field approximation and takes the service discipline into account, while the triplet approximation further refines the pair approximation. We also develop a pair approximation for three other service disciplines: First-Come-First-Served, Limited Processor Sharing and Last-Come-First-Served. We present numerical evidence that shows that all the approximations presented in the paper are highly accurate, but that none of them are asymptotically exact (as the number of servers goes to infinity). This makes these approximations suitable to study the impact of the service discipline on the queue length distribution. Our results show that FCFS yields the shortest queue length, and that the differences are more substantial at higher loads.

References

[1]
Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, and Ion Stoica. 2013. Effective Straggler Mitigation: Attack of the Clones. In 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13). USENIX Association, Lombard, IL, 185--198. https://www.usenix.org/conference/nsdi13/technical-sessions/presentation/ananthanarayanan
[2]
Elene Anton, Urtzi Ayesta, Matthieu Jonckheere, and Ina Maria Verloop. 2021. On the stability of redundancy models. Operations Research 69, 5 (2021), 1540--1565.
[3]
Elene Anton, Urtzi Ayesta, Matthieu Jonckheere, and Ina Maria Verloop. 2021. A Survey of Stability Results for Redundancy Systems. In Modern Trends in Controlled Stochastic Processes:, Alexey Piunovskiy and Yi Zhang (Eds.). Springer International Publishing, Cham, 266--283.
[4]
E. Anton and K. Gardner. 2023. The Stationary Distribution of the Redundancy-d Model with Random Order of Service. SIGMETRICS Perform. Eval. Rev. 51, 2 (oct 2023), 9--11. https://doi.org/10.1145/3626570.3626575
[5]
U. Ayesta, T. Bodas, and I.M. Verloop. 2018. On a unifying product form framework for redundancy models. Performance Evaluation 127--128 (2018), 93--119. https://doi.org/10.1016/j.peva.2018.09.008
[6]
Thomas Bonald and Céline Comte. 2017. Balanced fair resource sharing in computer clusters. Performance Evaluation 116 (2017), 70--83. https://doi.org/10.1016/j.peva.2017.08.006
[7]
Jeffrey Dean and Luiz André Barroso. 2013. The Tail at Scale. Commun. ACM 56, 2 (feb 2013), 74--80. https: //doi.org/10.1145/2408776.2408794
[8]
Kristen Gardner, Mor Harchol-Balter, Alan Scheller-Wolf, and Benny Van Houdt. 2017. A better model for job redundancy: Decoupling server slowdown and job size. IEEE/ACM transactions on networking 25, 6 (2017), 3353--3367.
[9]
Kristen Gardner, Mor Harchol-Balter, Alan Scheller-Wolf, Mark Velednitsky, and Samuel Zbarsky. 2017. Redundancy-d: The power of d choices for redundancy. Operations Research 65, 4 (2017), 1078--1094.
[10]
Kristen Gardner, Samuel Zbarsky, Sherwin Doroudi, Mor Harchol-Balter, and Esa Hyytia. 2015. Reducing latency via redundant requests: Exact analysis. ACM SIGMETRICS Performance Evaluation Review 43, 1 (2015), 347--360.
[11]
J. Harte. 2021. Token Redundancy in Distributed JIQ. Bachelor's Thesis. Eindhoven University of Technology.
[12]
Tim Hellemans, Tejas Bodas, and Benny Van Houdt. 2019. Performance Analysis of Workload Dependent Load Balancing Policies. Proc. ACM Meas. Anal. Comput. Syst. 3, 2, Article 35 (jun 2019), 35 pages. https://doi.org/10.1145/ 3341617.3326150
[13]
T. Hellemans and B. Van Houdt. 2018. On the Power-of-d-choices with Least Loaded Server Selection. Proc. ACM Meas. Anal. Comput. Syst. (June 2018).
[14]
T. Hellemans and B. Van Houdt. 2019. Analysis of Redundancy(d) with Identical Replicas. SIGMETRICS Perform. Eval. Rev. 46, 3 (jan 2019), 74--79. https://doi.org/10.1145/3308897.3308932
[15]
A. E. Krzesinski. 2011. Order Independent Queues. Springer US, Boston, MA, 85--120. https://doi.org/10.1007/978--1- 4419--6472--4_2
[16]
Y. Lu, Q. Xie, G. Kliot, A. Geller, J. R. Larus, and A. Greenberg. 2011. Join-Idle-Queue: A novel load balancing algorithm for dynamically scalable web services. Perform. Eval. 68 (2011), 1056--1071. Issue 11.
[17]
Michael Mitzenmacher. 2016. Analyzing distributed join-idle-queue: A fluid limit approach. In 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 312--318.
[18]
Youri Raaijmakers and Sem Borst. 2020. Achievable Stability in Redundancy Systems. Proc. ACM Meas. Anal. Comput. Syst. 4, 3, Article 46 (nov 2020), 21 pages. https://doi.org/10.1145/3428331
[19]
Youri Raaijmakers, Sem Borst, and Onno Boxma. 2020. Stability of Redundancy Systems with Processor Sharing. In Proceedings of the 13th EAI International Conference on Performance Evaluation Methodologies and Tools (Tsukuba, Japan) (VALUETOOLS '20). Association for Computing Machinery, New York, NY, USA, 120--127. https://doi.org/10. 1145/3388831.3388837
[20]
S. Shneer and A. L. Stolyar. 2021. Large-scale parallel server system with multi-component jobs. Queueing Systems: Theory and Applications 98, 1 (June 2021), 21--48. https://doi.org/10.1007/s11134-021-09686-
[21]
A.L. Stolyar. 2015. Pull-based load distribution in large-scale heterogeneous service systems. Queueing Systems 80, 4 (2015), 341--361. https://doi.org/10.1007/s11134-015--9448--8
[22]
Benny Van Houdt. 2023. On the Performance Evaluation of Distributed Join-Idle-Queue Load Balancing. In Proceedings of MASCOTS 2023. IEEE, New York, NY, USA.
[23]
Jeremy Visschers, Ivo Adan, and Gideon Weiss. 2012. A product form solution to a system with multi-type jobs and multi-type servers. Queueing Systems 70, 3 (2012), 269--298.
[24]
Ashish Vulimiri, Philip Brighten Godfrey, Radhika Mittal, Justine Sherry, Sylvia Ratnasamy, and Scott Shenker. 2013. Low latency via redundancy. In Proceedings of the ninth ACM conference on Emerging networking experiments and technologies. 283--294.
[25]
C Wang, C Feng, and J Cheng. 2018. Distributed Join-the-Idle-Queue for Low Latency Cloud Services. IEEE/ACM Transactions on Networking 26, 5 (2018), 2309--2319. https://doi.org/10.1109/TNET.2018.2869092

Cited By

View all
  • (2024)Approximations to Study the Impact of the Service Discipline in Systems with RedundancyACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365504552:1(1-2)Online publication date: 13-Jun-2024
  • (2024)Approximations to Study the Impact of the Service Discipline in Systems with RedundancyAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655045(1-2)Online publication date: 10-Jun-2024
  • (2024)New directions in pass-and-swap queuesQueueing Systems: Theory and Applications10.1007/s11134-024-09914-1107:3-4(205-256)Online publication date: 1-Sep-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Measurement and Analysis of Computing Systems
Proceedings of the ACM on Measurement and Analysis of Computing Systems  Volume 8, Issue 1
POMACS
March 2024
494 pages
EISSN:2476-1249
DOI:10.1145/3649331
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 February 2024
Published in POMACS Volume 8, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. load balancing
  2. mean field approximation
  3. pair approximation
  4. queueing theory
  5. redundancy
  6. scheduling

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)103
  • Downloads (Last 6 weeks)4
Reflects downloads up to 31 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Approximations to Study the Impact of the Service Discipline in Systems with RedundancyACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365504552:1(1-2)Online publication date: 13-Jun-2024
  • (2024)Approximations to Study the Impact of the Service Discipline in Systems with RedundancyAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655045(1-2)Online publication date: 10-Jun-2024
  • (2024)New directions in pass-and-swap queuesQueueing Systems: Theory and Applications10.1007/s11134-024-09914-1107:3-4(205-256)Online publication date: 1-Sep-2024

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media