Abstract
Efficient resource management sustains performance and cost-effectiveness in cloud computing. Current autoscaling approaches, when trying to balance resource consumption against QoS requirements, often fall short, becoming inefficient and causing service disruptions. The existing literature has primarily focused on static metrics and/or proactive scaling approaches that do not align with dynamically changing tasks, jobs, or service calls. The key concept of our approach is the use of statistical analysis to select the most relevant metrics for the specific application being scaled. We demonstrated that different applications require different metrics to accurately estimate the necessary resources, highlighting that what is critical for one application may not be for another. This study describes the proper selection of metrics for the control mechanism that regulates the resources required by an application. The introduced selection mechanism improves previously designed autoscalers by allowing them to react more quickly to sudden load changes, use fewer resources, and maintain more stable service QoS thanks to more accurate machine learning models. We compared our method with previous approaches through a carefully designed series of experiments, and the results showed significant improvements, such as reducing QoS violations by up to 80% and reducing VM usage by 3% to 50%. Testing and measurements were conducted on the Hungarian Research Network (HUN-REN) Cloud, which supports the operation of over 300 scientific projects.
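To illustrate the core idea of statistical, per-application metric selection (a minimal sketch, not the exact procedure used in the paper), the example below ranks candidate monitoring metrics by the strength of their Pearson correlation with a scaling target such as the required VM count. The metric names, the synthetic data, and the top-k cut-off are hypothetical.

```python
# Minimal sketch of correlation-based metric selection (illustrative only):
# rank monitored metrics by |Pearson correlation| with a scaling target and
# keep the strongest ones as inputs for the scaling model.
import numpy as np

def select_metrics(samples: dict, target: np.ndarray, top_k: int = 3) -> list:
    """Return the names of the top_k metrics most correlated with the target."""
    scores = {}
    for name, series in samples.items():
        if np.std(series) == 0:          # constant metrics carry no signal
            continue
        corr = np.corrcoef(series, target)[0, 1]
        scores[name] = abs(corr)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n = 500
    load = rng.uniform(0, 100, n)                       # hypothetical request rate
    metrics = {
        "cpu_usage": 0.8 * load + rng.normal(0, 5, n),  # tracks load closely
        "mem_usage": 40 + rng.normal(0, 2, n),          # mostly flat
        "net_in":    0.5 * load + rng.normal(0, 20, n), # noisier load signal
    }
    target = load / 20                                  # e.g. required VM count
    print(select_metrics(metrics, target, top_k=2))     # -> ['cpu_usage', 'net_in']
```

Running the script on a different application's traces would typically surface a different metric ranking, which is the motivation for performing the selection per application rather than relying on a fixed metric set.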
Data Availability
The datasets generated and/or analyzed during the current study are stored in a private GitHub repository and are available from the corresponding author on reasonable request.
Code Availability
The code is stored in a private GitHub repository; access is granted upon request.
References
Alipour, H., Liu, Y., Hamou-Lhadj, A.: Analyzing auto-scaling issues in cloud environments. (2014)
Straesser, M., Grohmann, J., Kistowski, J.V., Eismann, S., Bauer, A., Kounev, S.: Why is it not solved yet?: Challenges for production-ready autoscaling. In: Proceedings of the 2022 ACM/SPEC on International Conference on Performance Engineering, (2022). https://doi.org/10.1145/3489525.3511680
Zhong, Z., Xu, M., Rodríguez, M., Xu, C.-Z., Buyya, R.: Machine learning-based orchestration of containers: A taxonomy and future directions. ACM Comput. Surv. 54 (2022). https://doi.org/10.1145/3510415
Ullah, A., Kiss, T., Kovács, J., Tusa, F., Deslauriers, J., Dagdeviren, H., Arjun, R., Hamzeh, H.: Orchestration in the cloud-to-things compute continuum: taxonomy, survey and future directions. J. Cloud Comput.-Adv. Syst. Appl. 12 (2023). https://doi.org/10.1186/s13677-023-00516-5
Biswas, A., Majumdar, S., Nandy, B., El-Haraki, A.: Predictive auto-scaling techniques for clouds subjected to requests with service level agreements. In: 2015 IEEE World Congress on Services, pp. 311–318. (2015). https://doi.org/10.1109/SERVICES.2015.54
Wang, Z., Zhu, S., Li, J., Jiang, W., Ramakrishnan, K.K., Zheng, Y., Yan, M., Zhang, X., Liu, A.X.: Deepscaling: microservices autoscaling for stable cpu utilization in large scale cloud systems. In: Proceedings of the 13th Symposium on Cloud Computing. SoCC ’22, pp. 16–30. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3542929.3563469
Qiu, H., Banerjee, S.S., Jha, S., Kalbarczyk, Z.T., Iyer, R.K.: Firm: an intelligent fine-grained resource management framework for slo-oriented microservices. In: Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation. OSDI’20. USENIX Association, USA (2020)
Xu, M., Song, C., Wu, H., Gill, S.S., Ye, K., Xu, C.: esdnn: Deep neural network based multivariate workload prediction in cloud computing environments. ACM Trans. Internet Technol. 22(3) (2022). https://doi.org/10.1145/3524114
Imdoukh, M., Ahmad, I., Alfailakawi, M.: Machine learning-based auto-scaling for containerized applications. Neural Comput. Appl. 32, 9745–9760 (2019). https://doi.org/10.1007/s00521-019-04507-z
Toka, L., Dobreff, G., Fodor, B., Sonkoly, B.: Adaptive ai-based auto-scaling for kubernetes. In: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pp. 599–608. (2020). https://doi.org/10.1109/CCGrid49817.2020.00-33
Bansal, S., Kumar, M.: Deep learning-based workload prediction in cloud computing to enhance the performance. In: 2023 Third International Conference on Secure Cyber Computing and Communication (ICSCCC), pp. 635–640. (2023). https://doi.org/10.1109/ICSCCC58608.2023.10176790
Yazdanian, P., Sharifian, S.: E2lg: a multiscale ensemble of lstm/gan deep learning architecture for multistep-ahead cloud workload prediction. J. Supercomput. 77, 11052–11082 (2021). https://doi.org/10.1007/s11227-021-03723-6
Patel, Y.S., Bedi, J.: Mag-d: A multivariate attention network based approach for cloud workload forecasting. Future Gener. Comput. Syst. 142, 376–392 (2023). https://doi.org/10.1016/j.future.2023.01.002
Baresi, L., Guinea, S., Leva, A., Quattrocchi, G.: A discrete-time feedback controller for containerized cloud applications. In: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. FSE 2016, pp. 217–228. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2950290.2950328
Sabuhi, M., Mahmoudi, N., Khazaei, H.: Optimizing the performance of containerized cloud software systems using adaptive pid controllers. ACM Trans. Auton. Adapt. Syst. 15(3) (2021). https://doi.org/10.1145/3465630
Baarzi, A.F., Kesidis, G.: Showar: Right-sizing and efficient scheduling of microservices. In: Proceedings of the ACM Symposium on Cloud Computing. SoCC ’21, pp. 427–441. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3472883.3486999
Goli, A., Mahmoudi, N., Khazaei, H., Ardakanian, O.: A holistic machine learning-based autoscaling approach for microservice applications. (2021). https://doi.org/10.5220/0010407701900198
Abrantes, A., Netto, M.: Using application data for sla-aware auto-scaling in cloud environments. (2015). https://doi.org/10.1109/MASCOTS.2015.15
Podolskiy, V., Mayo, M., Koay, A., Gerndt, M., Patros, P.: Maintaining slos of cloud-native applications via self-adaptive resource sharing. In: 2019 IEEE 13th International Conference on Self-Adaptive and Self-Organizing Systems (SASO), pp. 72–81. (2019). https://doi.org/10.1109/SASO.2019.00018
Rossi, F., Cardellini, V., Presti, F.L., Nardelli, M.: Dynamic multi-metric thresholds for scaling applications using reinforcement learning. IEEE Trans. Cloud Comput. 11, 1807–1821 (2023). https://doi.org/10.1109/TCC.2022.3163357
Zhang, Y., Hua, W., Zhou, Z., Suh, G.E., Delimitrou, C.: Sinan: Ml-based and qos-aware resource management for cloud microservices. In: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. ASPLOS ’21, pp. 167–181. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3445814.3446693
Dang-Quang, N.-M., Yoo, M.: An efficient multivariate autoscaling framework using bi-lstm for cloud computing. Appl. Sci. (2022)
Jayakumar, V., Arbat, S., Kim, I., Wang, W.: Cloudbruno: A low-overhead online workload prediction framework for cloud computing. In: 2022 IEEE International Conference on Cloud Engineering (IC2E), pp. 188–198. (2022). https://doi.org/10.1109/IC2E55432.2022.00027
Pfeifer, A., Brand, H., Lohweg, V.: A comparison of statistical and machine learning approaches for time series forecasting in a demand management scenario. In: 2023 IEEE 21st International Conference on Industrial Informatics (INDIN), pp. 1–6. (2023). https://doi.org/10.1109/INDIN51400.2023.10218206
Li, Y., Lin, Y., Wang, Y., Ye, K., Xu, C.: Serverless computing: State-of-the-art, challenges and opportunities. IEEE Trans. Serv. Comput. 16(2), 1522–1539 (2023). https://doi.org/10.1109/TSC.2022.3166553
Hossen, M.R., Islam, M.A., Ahmed, K.: Practical efficient microservice autoscaling with qos assurance. In: Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing. HPDC ’22, pp. 240–252. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3502181.3531460
Wajahat, M., Karve, A., Kochut, A., Gandhi, A.: Mlscale: A machine learning based application-agnostic autoscaler. Sustain. Comput.: Inf. Syst. 22 (2017). https://doi.org/10.1016/j.suscom.2017.10.003
Wajahat, M., Gandhi, A., Karve, A., Kochut, A.: Using machine learning for black-box autoscaling, pp. 1–8 (2016). https://doi.org/10.1109/IGCC.2016.7892598
Toka, L., Dobreff, G., Fodor, B., Sonkoly, B.: Machine learning-based scaling management for kubernetes edge clusters. IEEE Trans. Netw. Serv. Manag. 18, 958–972 (2021). https://doi.org/10.1109/TNSM.2021.3052837
Chen, X., Zhu, F., Chen, Z., Min, G., Zheng, X., Rong, C.: Resource allocation for cloud-based software services using prediction-enabled feedback control with reinforcement learning. IEEE Trans. Cloud Comput. 10(2), 1117–1129 (2022). https://doi.org/10.1109/TCC.2020.2992537
Dhal, P., Azad, C.: A comprehensive survey on feature selection in the various fields of machine learning. Appl. Intell. 52 (2021). https://doi.org/10.1007/s10489-021-02550-9
Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014). https://doi.org/10.5555/2627435.2670313
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. CoRR. abs/1412.6980 (2014)
Barrow, D., Crone, S.: Cross-validation aggregation for combining autoregressive neural network forecasts. Int. J. Forecast. 32, 1120–1137 (2016). https://doi.org/10.1016/j.ijforecast.2015.12.011
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012). https://doi.org/10.5555/2503308.2188395
Prechelt, L.: Automatic early stopping using cross validation: quantifying the criteria. Neural Netw.: The Official J. Int. Neural Netw. Soc. 11(4), 761–767 (1998). https://doi.org/10.1016/S0893-6080(98)00010-0
Podolskiy, V., Jindal, A., Gerndt, M.: Multilayered autoscaling performance evaluation: Can virtual machines and containers co-scale? Int. J. Appl. Math. Comput. Sci. 29, 227–244 (2019). https://doi.org/10.2478/amcs-2019-0017
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining, Fourth Edition: Practical Machine Learning Tools and Techniques, 4th edn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2016)
Héder, M., Rigó, E., Medgyesi, D., Lovas, R., Tenczer, S., Török, F., Farkas, A., Emődi, M., Kadlecsik, J., Mező, G., Pintér, Á., Kacsuk, P.: The past, present and future of the ELKH cloud. Inf. Társadalom. 22(2), 128 (2022). https://doi.org/10.22503/inftars.xxii.2022.2.8
Podolskiy, V., Patrou, M., Patros, P., Gerndt, M., Kent, K.B.: The weakest link: Revealing and modeling the architectural patterns of microservice applications. In: Conference of the Centre for Advanced Studies on Collaborative Research (2020). https://doi.org/10.5555/3432601.3432616
Luo, S., Huanle, X., Lu, C., Ye, K., Xu, G., Zhang, L., Ding, Y., He, J., Xu, C.-Z.: Characterizing microservice dependency and performance: Alibaba trace analysis. In: Proceedings of the ACM Symposium on Cloud Computing, pp. 412–426. (2021). https://doi.org/10.1145/3472883.3487003
Acknowledgements
On behalf of the "Autonóm felhőalapú skálázás" ("Autonomous cloud-based scaling") project, we are grateful for the opportunity to use the HUN-REN Cloud (see Héder et al. 2022; https://science-cloud.hu), which helped us achieve the results published in this paper.
Funding
Open access funding provided by HUN-REN Institute for Computer Science and Control. This work was partially funded by the European Commission’s Swarmchestrate Horizon Europe project (GA No. 101135012): https://www.swarmchestrate.eu, by the Ministry of Innovation and Technology NRDI Office within the framework of the Artificial Intelligence National Laboratory Program (GA No. F-2.3.1-21-2022-00004) and by the National Research, Development and Innovation Office (NKFIH) under OTKA (GA No. K132838).
Author information
Authors and Affiliations
Contributions
I. Pintye: original idea; design and implementation of algorithms and measurements; writing the draft paper; reviewing relevant research. J. Kovacs: designing, editing and writing the paper; support for designing and implementing the algorithms and measurements; revising the work; proofreading. R. Lovas: conceptualization, validation, proofreading.
Corresponding author
Ethics declarations
Competing Interests
The authors declare no competing interests.
Ethics Approval and Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Materials Availability
Materials are not publicly available.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pintye, I., Kovács, J. & Lovas, R. Enhancing Machine Learning-Based Autoscaling for Cloud Resource Orchestration. J Grid Computing 22, 68 (2024). https://doi.org/10.1007/s10723-024-09783-1