DOI: 10.1145/3489525.3511680
research-article

Why Is It Not Solved Yet?: Challenges for Production-Ready Autoscaling

Published: 09 April 2022

Abstract

Autoscaling is a task of major importance in cloud computing because it directly affects both operating costs and customer experience. Although the area has been actively researched for over ten years, a significant gap remains between the methods proposed in the literature and the autoscalers deployed in practice. As a result, many research autoscalers never find their way into production deployments. This paper describes six core challenges that arise in production systems and that most research autoscalers still do not solve. We illustrate these problems through experiments in a realistic cloud environment with a real-world multi-service business application and show that commonly used autoscalers have various shortcomings. In addition, we analyze the behavior of overloaded services and show that such services can be problematic for existing autoscalers. Overall, we find that these challenges are insufficiently addressed in the literature and conclude that future scaling approaches should focus on the needs of production systems.

Information & Contributors

Information

Published In

ICPE '22: Proceedings of the 2022 ACM/SPEC on International Conference on Performance Engineering
April 2022
242 pages
ISBN: 9781450391436
DOI: 10.1145/3489525
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Industry Paper

Author Tags

  1. autoscaling
  2. cloud computing
  3. microservices

Qualifiers

  • Research-article

Conference

ICPE '22

Acceptance Rates

ICPE '22 Paper Acceptance Rate 14 of 58 submissions, 24%;
Overall Acceptance Rate 252 of 851 submissions, 30%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 80
  • Downloads (last 6 weeks): 5
Reflects downloads up to 14 Dec 2024

Cited By

  • (2024) REAL-TIME MANAGEMENT OF CRITICAL IT-SYSTEMS BASED ON NEURAL NETWORK TECHNOLOGIES. HERALD OF POLOTSK STATE UNIVERSITY. Series С FUNDAMENTAL SCIENCES, pp. 18-25. DOI: 10.52928/2070-1624-2024-42-1-18-25. Online publication date: 24-Apr-2024
  • (2024) OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud. Proceedings of the VLDB Endowment 17:12, pp. 4090-4103. DOI: 10.14778/3685800.3685829. Online publication date: 8-Nov-2024
  • (2024) SimuScale: Optimizing Parameters for Autoscaling of Serverless Edge Functions Through Co-Simulation. 2024 IEEE 17th International Conference on Cloud Computing (CLOUD), pp. 305-315. DOI: 10.1109/CLOUD62652.2024.00042. Online publication date: 7-Jul-2024
  • (2024) Enhancing Machine Learning-Based Autoscaling for Cloud Resource Orchestration. Journal of Grid Computing 22:4. DOI: 10.1007/s10723-024-09783-1. Online publication date: 1-Dec-2024
  • (2024) Extending parallel programming patterns with adaptability features. Cluster Computing 27:9, pp. 12547-12568. DOI: 10.1007/s10586-024-04622-0. Online publication date: 1-Dec-2024
  • (2024) Kubernetes-in-the-Loop: Enriching Microservice Simulation Through Authentic Container Orchestration. Performance Evaluation Methodologies and Tools, pp. 82-98. DOI: 10.1007/978-3-031-48885-6_6. Online publication date: 3-Jan-2024
  • (2023) Power to the Applications: The Vision of Continuous Decentralized Autoscaling. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing Workshops (CCGridW), pp. 281-283. DOI: 10.1109/CCGridW59191.2023.00058. Online publication date: May-2023
