DOI: 10.1145/3359989.3365432

Reducing tail latency using duplication: a multi-layered approach

Published: 03 December 2019

Abstract

Duplication can be a powerful strategy for overcoming stragglers in cloud services, but it is often used conservatively because of the risk of overloading the system. We call for making duplication a first-class concept in cloud systems and make two contributions in this regard. First, we present duplicate-aware scheduling (DAS), an aggressive duplication policy that duplicates every job but keeps the system safe by providing suitable support (prioritization and purging) at multiple layers of the cloud system. Second, we present the D-Stage abstraction, which supports DAS and other duplication policies across diverse layers of a cloud system (e.g., network and storage). The D-Stage abstraction decouples the duplication policy from the mechanism and facilitates working with legacy layers of a system. Using this abstraction, we evaluate the benefits of DAS for two data-parallel applications (HDFS and an in-memory workload generator) and a network function (a Snort-based IDS cluster). Our experiments on the public cloud and Emulab show that DAS is safe to use and that the tail latency improvement holds across a wide range of workloads.
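The abstract names the two supports that DAS relies on, prioritization and purging, without giving their interface. As a rough illustration only, the following toy simulation shows how the two could interact at a single layer: duplicates wait behind primaries, and the losing copy is dropped once either copy finishes. The names `Stage`, `Copy`, and `run`, the tick-based service model, and the lazy-deletion purge are all assumptions made for this sketch; they are not the paper's D-Stage API.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

PRIMARY, DUPLICATE = 0, 1  # lower value is served first (strict priority)


@dataclass(order=True)
class Copy:
    priority: int
    seq: int                              # FIFO tie-breaker within a priority
    job_id: str = field(compare=False)
    work: int = field(compare=False)      # ticks of service still needed


class Stage:
    """Hypothetical single layer with a strict-priority queue and purging."""

    def __init__(self):
        self._heap = []
        self._seq = count()
        self._purged = set()

    def enqueue(self, job_id, priority, work):
        heapq.heappush(self._heap, Copy(priority, next(self._seq), job_id, work))

    def purge(self, job_id):
        # Lazy deletion: mark the job and skip its copy when it reaches the head.
        self._purged.add(job_id)

    def tick(self):
        """Serve the head copy for one tick; return its job_id if it finished."""
        while self._heap and self._heap[0].job_id in self._purged:
            heapq.heappop(self._heap)
        if not self._heap:
            return None
        head = self._heap[0]
        head.work -= 1
        if head.work == 0:
            heapq.heappop(self._heap)
            return head.job_id
        return None


def run(jobs, stages):
    """jobs: (job_id, primary_stage, primary_work, dup_stage, dup_work) tuples."""
    for job_id, p_stage, p_work, d_stage, d_work in jobs:
        stages[p_stage].enqueue(job_id, PRIMARY, p_work)
        stages[d_stage].enqueue(job_id, DUPLICATE, d_work)
    finish, t = {}, 0
    while len(finish) < len(jobs):
        t += 1
        for stage in stages.values():
            done = stage.tick()
            if done is not None and done not in finish:
                finish[done] = t
                for other in stages.values():
                    other.purge(done)  # drop the sibling copy everywhere
    return finish


if __name__ == "__main__":
    stages = {"s1": Stage(), "s2": Stage()}
    # Job A's primary is a straggler (10 ticks on s1); its duplicate needs 2 ticks on s2.
    jobs = [("A", "s1", 10, "s2", 2), ("B", "s2", 3, "s1", 3)]
    print(run(jobs, stages))  # {'B': 3, 'A': 5}
```

In this made-up workload, job B's primary is never delayed by job A's duplicate, so B still completes at tick 3, while A completes at tick 5 through its duplicate instead of tick 10 through its straggling primary. Purging B's duplicate on `s1` also means the lower-priority copy consumes no service there.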

Published In

CoNEXT '19: Proceedings of the 15th International Conference on Emerging Networking Experiments And Technologies
December 2019
395 pages
ISBN: 9781450369985
DOI: 10.1145/3359989

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. abstraction
2. cloning
3. duplicate-aware scheduling
4. duplication
5. straggler mitigation
6. tail-latency

Qualifiers

• Research-article

Acceptance Rates

Overall acceptance rate: 198 of 789 submissions (25%)

Cited By

• (2024) HGR: A Hybrid Global Graph-Based Recovery Approach for Cloud Storage Systems with Failure and Straggler Nodes. 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS), pp. 750-761. DOI: 10.1109/ICDCS60910.2024.00075. Online publication date: 23-Jul-2024
• (2022) Enabling emerging edge applications through a 5G control plane intervention. Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies, pp. 386-400. DOI: 10.1145/3555050.3569130. Online publication date: 30-Nov-2022
• (2022) Characterizing the Availability and Latency in AWS Network From the Perspective of Tenants. IEEE/ACM Transactions on Networking 30:4, pp. 1554-1568. DOI: 10.1109/TNET.2022.3148701. Online publication date: Aug-2022
• (2020) Judicious QoS using cloud overlays. Proceedings of the 16th International Conference on emerging Networking EXperiments and Technologies, pp. 371-385. DOI: 10.1145/3386367.3431318. Online publication date: 23-Nov-2020
• (2020) Improving NAND flash performance with read heat separation. 2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 1-8. DOI: 10.1109/MASCOTS50786.2020.9285970. Online publication date: 17-Nov-2020
• (2020) Age-aware Fairness in Blockchain Transaction Ordering. 2020 IEEE/ACM 28th International Symposium on Quality of Service (IWQoS), pp. 1-9. DOI: 10.1109/IWQoS49365.2020.9212952. Online publication date: Jun-2020
