Abstract
Efficient resource management sustains performance and cost-effectiveness in cloud computing. Current autoscaling approaches, when trying to balance resource consumption against QoS requirements, often fall short, becoming inefficient and causing service disruptions. The existing literature has primarily focused on static metrics and/or proactive scaling approaches that do not align with dynamically changing tasks, jobs, or service calls. The key concept of our approach is the use of statistical analysis to select the most relevant metrics for the specific application being scaled. We demonstrated that different applications require different metrics to accurately estimate the necessary resources, highlighting that what is critical for one application may not be for another. This study describes the proper selection of metrics for the control mechanism that regulates the resources required by an application. The introduced selection mechanism improves previously designed autoscalers by allowing them to react more quickly to sudden load changes, use fewer resources, and maintain more stable service QoS thanks to more accurate machine learning models. We compared our method with previous approaches through a carefully designed series of experiments, and the results showed significant improvements, such as reducing QoS violations by up to 80% and reducing VM usage by 3% to 50%. Testing and measurements were conducted on the Hungarian Research Network (HUN-REN) Cloud, which supports the operation of over 300 scientific projects.
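To illustrate the core idea of statistical, per-application metric selection (a minimal sketch, not the exact procedure used in the paper), the example below ranks candidate monitoring metrics by the strength of their Pearson correlation with a scaling target such as the required VM count. The metric names, the synthetic data, and the top-k cut-off are hypothetical.

```python
# Minimal sketch of correlation-based metric selection (illustrative only):
# rank monitored metrics by |Pearson correlation| with a scaling target and
# keep the strongest ones as inputs for the scaling model.
import numpy as np

def select_metrics(samples: dict, target: np.ndarray, top_k: int = 3) -> list:
    """Return the names of the top_k metrics most correlated with the target."""
    scores = {}
    for name, series in samples.items():
        if np.std(series) == 0:          # constant metrics carry no signal
            continue
        corr = np.corrcoef(series, target)[0, 1]
        scores[name] = abs(corr)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n = 500
    load = rng.uniform(0, 100, n)                       # hypothetical request rate
    metrics = {
        "cpu_usage": 0.8 * load + rng.normal(0, 5, n),  # tracks load closely
        "mem_usage": 40 + rng.normal(0, 2, n),          # mostly flat
        "net_in":    0.5 * load + rng.normal(0, 20, n), # noisier load signal
    }
    target = load / 20                                  # e.g. required VM count
    print(select_metrics(metrics, target, top_k=2))     # -> ['cpu_usage', 'net_in']
```

Running the script on a different application's traces would typically surface a different metric ranking, which is the motivation for performing the selection per application rather than relying on a fixed metric set.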
Data Availability
The datasets generated and/or analyzed during the current study are stored in a private GitHub repository and are available from the corresponding author on reasonable request.
Code Availability
The code is stored in a private GitHub repository; access is granted upon request.
References
Alipour, H., Liu, Y., Hamou-Lhadj, A.: Analyzing auto-scaling issues in cloud environments. (2014)
Straesser, M., Grohmann, J., Kistowski, J.V., Eismann, S., Bauer, A., Kounev, S.: Why is it not solved yet?: Challenges for production-ready autoscaling. In: Proceedings of the 2022 ACM/SPEC on International Conference on Performance Engineering, (2022). https://doi.org/10.1145/3489525.3511680
Zhong, Z., Xu, M., Rodríguez, M., Xu, C.-Z., Buyya, R.: Machine learning-based orchestration of containers: A taxonomy and future directions. ACM Comput. Surv. 54 (2022). https://doi.org/10.1145/3510415
Ullah, A., Kiss, T., Kovács, J., Tusa, F., Deslauriers, J., Dagdeviren, H., Arjun, R., Hamzeh, H.: Orchestration in the cloud-to-things compute continuum: taxonomy, survey and future directions. J. Cloud Comput.-Adv. Syst. Appl. 12 (2023). https://doi.org/10.1186/s13677-023-00516-5
Biswas, A., Majumdar, S., Nandy, B., El-Haraki, A.: Predictive auto-scaling techniques for clouds subjected to requests with service level agreements. In: 2015 IEEE World Congress on Services, pp. 311–318. (2015). https://doi.org/10.1109/SERVICES.2015.54
Wang, Z., Zhu, S., Li, J., Jiang, W., Ramakrishnan, K.K., Zheng, Y., Yan, M., Zhang, X., Liu, A.X.: Deepscaling: microservices autoscaling for stable cpu utilization in large scale cloud systems. In: Proceedings of the 13th Symposium on Cloud Computing. SoCC ’22, pp. 16–30. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3542929.3563469
Qiu, H., Banerjee, S.S., Jha, S., Kalbarczyk, Z.T., Iyer, R.K.: Firm: an intelligent fine-grained resource management framework for slo-oriented microservices. In: Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation. OSDI’20. USENIX Association, USA (2020)
Xu, M., Song, C., Wu, H., Gill, S.S., Ye, K., Xu, C.: esdnn: Deep neural network based multivariate workload prediction in cloud computing environments. ACM Trans. Internet Technol. 22(3) (2022). https://doi.org/10.1145/3524114
Imdoukh, M., Ahmad, I., Alfailakawi, M.: Machine learning-based auto-scaling for containerized applications. Neural Comput. Appl. 32, 9745–9760 (2019). https://doi.org/10.1007/s00521-019-04507-z
Toka, L., Dobreff, G., Fodor, B., Sonkoly, B.: Adaptive ai-based auto-scaling for kubernetes. In: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pp. 599–608. (2020). https://doi.org/10.1109/CCGrid49817.2020.00-33
Bansal, S., Kumar, M.: Deep learning-based workload prediction in cloud computing to enhance the performance. In: 2023 Third International Conference on Secure Cyber Computing and Communication (ICSCCC), pp. 635–640. (2023). https://doi.org/10.1109/ICSCCC58608.2023.10176790
Yazdanian, P., Sharifian, S.: E2lg: a multiscale ensemble of lstm/gan deep learning architecture for multistep-ahead cloud workload prediction. J. Supercomput. 77, 11052–11082 (2021). https://doi.org/10.1007/s11227-021-03723-6
Patel, Y.S., Bedi, J.: Mag-d: A multivariate attention network based approach for cloud workload forecasting. Future Gener. Comput. Syst. 142, 376–392 (2023). https://doi.org/10.1016/j.future.2023.01.002
Baresi, L., Guinea, S., Leva, A., Quattrocchi, G.: A discrete-time feedback controller for containerized cloud applications. In: Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. FSE 2016, pp. 217–228. Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2950290.2950328
Sabuhi, M., Mahmoudi, N., Khazaei, H.: Optimizing the performance of containerized cloud software systems using adaptive pid controllers. ACM Trans. Auton. Adapt. Syst. 15(3) (2021). https://doi.org/10.1145/3465630
Baarzi, A.F., Kesidis, G.: Showar: Right-sizing and efficient scheduling of microservices. In: Proceedings of the ACM Symposium on Cloud Computing. SoCC ’21, pp. 427–441. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3472883.3486999
Goli, A., Mahmoudi, N., Khazaei, H., Ardakanian, O.: A holistic machine learning-based autoscaling approach for microservice applications. (2021). https://doi.org/10.5220/0010407701900198
Abrantes, A., Netto, M.: Using application data for sla-aware auto-scaling in cloud environments. (2015). https://doi.org/10.1109/MASCOTS.2015.15
Podolskiy, V., Mayo, M., Koay, A., Gerndt, M., Patros, P.: Maintaining slos of cloud-native applications via self-adaptive resource sharing. In: 2019 IEEE 13th International Conference on Self-Adaptive and Self-Organizing Systems (SASO), pp. 72–81. (2019). https://doi.org/10.1109/SASO.2019.00018
Rossi, F., Cardellini, V., Presti, F.L., Nardelli, M.: Dynamic multi-metric thresholds for scaling applications using reinforcement learning. IEEE Trans. Cloud Comput. 11, 1807–1821 (2023). https://doi.org/10.1109/TCC.2022.3163357
Zhang, Y., Hua, W., Zhou, Z., Suh, G.E., Delimitrou, C.: Sinan: Ml-based and qos-aware resource management for cloud microservices. In: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. ASPLOS ’21, pp. 167–181. Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3445814.3446693
Dang-Quang, N.-M., Yoo, M.: An efficient multivariate autoscaling framework using bi-lstm for cloud computing. Appl. Sci. (2022)
Jayakumar, V., Arbat, S., Kim, I., Wang, W.: Cloudbruno: A low-overhead online workload prediction framework for cloud computing. In: 2022 IEEE International Conference on Cloud Engineering (IC2E), pp. 188–198. (2022). https://doi.org/10.1109/IC2E55432.2022.00027
Pfeifer, A., Brand, H., Lohweg, V.: A comparison of statistical and machine learning approaches for time series forecasting in a demand management scenario. In: 2023 IEEE 21st International Conference on Industrial Informatics (INDIN), pp. 1–6. (2023). https://doi.org/10.1109/INDIN51400.2023.10218206
Li, Y., Lin, Y., Wang, Y., Ye, K., Xu, C.: Serverless computing: State-of-the-art, challenges and opportunities. IEEE Trans. Serv. Comput. 16(2), 1522–1539 (2023). https://doi.org/10.1109/TSC.2022.3166553
Hossen, M.R., Islam, M.A., Ahmed, K.: Practical efficient microservice autoscaling with qos assurance. In: Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing. HPDC ’22, pp. 240–252. Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3502181.3531460
Wajahat, M., Karve, A., Kochut, A., Gandhi, A.: Mlscale: A machine learning based application-agnostic autoscaler. Sustain. Comput.: Inf. Syst. 22 (2017). https://doi.org/10.1016/j.suscom.2017.10.003
Wajahat, M., Gandhi, A., Karve, A., Kochut, A.: Using machine learning for black-box autoscaling, pp. 1–8 (2016). https://doi.org/10.1109/IGCC.2016.7892598
Toka, L., Dobreff, G., Fodor, B., Sonkoly, B.: Machine learning-based scaling management for kubernetes edge clusters. IEEE Trans. Netw. Serv. Manag. 18, 958–972 (2021). https://doi.org/10.1109/TNSM.2021.3052837
Chen, X., Zhu, F., Chen, Z., Min, G., Zheng, X., Rong, C.: Resource allocation for cloud-based software services using prediction-enabled feedback control with reinforcement learning. IEEE Trans. Cloud Comput. 10(2), 1117–1129 (2022). https://doi.org/10.1109/TCC.2020.2992537
Dhal, P., Azad, C.: A comprehensive survey on feature selection in the various fields of machine learning. Appl. Intell. 52 (2021). https://doi.org/10.1007/s10489-021-02550-9
Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014). https://doi.org/10.5555/2627435.2670313
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. CoRR. abs/1412.6980 (2014)
Barrow, D., Crone, S.: Cross-validation aggregation for combining autoregressive neural network forecasts. Int. J. Forecast. 32, 1120–1137 (2016). https://doi.org/10.1016/j.ijforecast.2015.12.011
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012). https://doi.org/10.5555/2503308.2188395
Prechelt, L.: Automatic early stopping using cross validation: quantifying the criteria. Neural Netw.: The Official J. Int. Neural Netw. Soc. 11(4), 761–767 (1998). https://doi.org/10.1016/S0893-6080(98)00010-0
Podolskiy, V., Jindal, A., Gerndt, M.: Multilayered autoscaling performance evaluation: Can virtual machines and containers co-scale? Int. J. Appl. Math. Comput. Sci. 29, 227–244 (2019). https://doi.org/10.2478/amcs-2019-0017
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining, Fourth Edition: Practical Machine Learning Tools and Techniques, 4th edn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2016)
Héder, M., Rigó, E., Medgyesi, D., Lovas, R., Tenczer, S., Török, F., Farkas, A., Emődi, M., Kadlecsik, J., Mező, G., Pintér, Á., Kacsuk, P.: The past, present and future of the ELKH cloud. Inf. Társadalom. 22(2), 128 (2022). https://doi.org/10.22503/inftars.xxii.2022.2.8
Podolskiy, V., Patrou, M., Patros, P., Gerndt, M., Kent, K.B.: The weakest link: Revealing and modeling the architectural patterns of microservice applications. In: Conference of the Centre for Advanced Studies on Collaborative Research (2020). https://doi.org/10.5555/3432601.3432616
Luo, S., Huanle, X., Lu, C., Ye, K., Xu, G., Zhang, L., Ding, Y., He, J., Xu, C.-Z.: Characterizing microservice dependency and performance: Alibaba trace analysis. In: Proceedings of the ACM Symposium on Cloud Computing, pp. 412–426. (2021). https://doi.org/10.1145/3472883.3487003
Acknowledgements
On behalf of the "Autonóm felhőalapú skálázás" ("Autonomous cloud-based scaling") project, we are grateful for the opportunity to use the HUN-REN Cloud (see Héder et al. 2022; https://science-cloud.hu), which helped us achieve the results published in this paper.
Funding
Open access funding provided by HUN-REN Institute for Computer Science and Control. This work was partially funded by the European Commission’s Swarmchestrate Horizon Europe project (GA No. 101135012): https://www.swarmchestrate.eu, by the Ministry of Innovation and Technology NRDI Office within the framework of the Artificial Intelligence National Laboratory Program (GA No. F-2.3.1-21-2022-00004) and by the National Research, Development and Innovation Office (NKFIH) under OTKA (GA No. K132838).
Author information
Authors and Affiliations
Contributions
I. Pintye: original idea; design and implementation of algorithms and measurements; writing the draft paper; reviewing relevant research. J. Kovacs: designing, editing and writing the paper; support for designing and implementing the algorithms and measurements; revising the work; proofreading. R. Lovas: conceptualization, validation, proofreading.
Corresponding author
Ethics declarations
Competing Interests
The authors declare no competing interests.
Ethics Approval and Consent to Participate
Not applicable.
Consent for Publication
Not applicable.
Materials Availability
Materials are not publicly available.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Pintye, I., Kovács, J. & Lovas, R. Enhancing Machine Learning-Based Autoscaling for Cloud Resource Orchestration. J Grid Computing 22, 68 (2024). https://doi.org/10.1007/s10723-024-09783-1