research-article

Public Access

Kraken: Adaptive Container Provisioning for Deploying Dynamic DAGs in Serverless Platforms

Authors:

Vivek M. Bhasi,

Jashwant Raj Gunasekaran,

Prashanth Thinakaran,

Cyan Subhra Mishra,

Mahmut Taylan Kandemir,

Chita Das

SoCC '21: Proceedings of the ACM Symposium on Cloud Computing

Pages 153 - 167

https://doi.org/10.1145/3472883.3486992

Published: 01 November 2021

Abstract

The growing popularity of microservices has led to the proliferation of online cloud service-based applications, which are typically modelled as Directed Acyclic Graphs (DAGs) comprising tens to hundreds of microservices. The vast majority of these applications are user-facing and hence have stringent SLO requirements. Serverless functions, with their short resource provisioning times and instant scalability, are suitable candidates for developing such latency-critical applications. However, existing serverless providers are unaware of the workflow characteristics of application DAGs, leading to container over-provisioning in many cases. This is further exacerbated in the case of dynamic DAGs, where the function chain for an application is not known a priori. Motivated by these observations, we propose Kraken, a workflow-aware resource management framework that minimizes the number of containers provisioned for an application DAG while ensuring SLO compliance. We design and implement Kraken on OpenFaaS and evaluate it on a multi-node Kubernetes-managed cluster. Our extensive experimental evaluation using the DeathStarBench workload suite and real-world traces demonstrates that Kraken spawns up to 76% fewer containers, thereby improving container utilization and saving cluster-wide energy by up to 4x and 48%, respectively, when compared to state-of-the-art schedulers employed in serverless platforms.

Index Terms

Kraken: Adaptive Container Provisioning for Deploying Dynamic DAGs in Serverless Platforms
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Scheduling

Information & Contributors

Information

Published In

SoCC '21: Proceedings of the ACM Symposium on Cloud Computing

November 2021

685 pages

ISBN:9781450386388

DOI:10.1145/3472883

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

NSF (National Science Foundation)

Conference

SoCC '21

SoCC '21: ACM Symposium on Cloud Computing

November 1 - 4, 2021

Seattle, WA, USA

Acceptance Rates

Overall Acceptance Rate 169 of 722 submissions, 23%

Bibliometrics

Total Citations: 42
Total Downloads: 2,269
Downloads (Last 12 months): 612
Downloads (Last 6 weeks): 90

Reflects downloads up to 05 Mar 2025
