
A Q-learning approach for the autoscaling of scientific workflows in the Cloud

Authors: Yisel Garí, David A. Monge, Cristian Mateos

Volume 127, Issue C

Pages 168 - 180

https://doi.org/10.1016/j.future.2021.09.007

Published: 01 February 2022

Abstract

Autoscaling strategies aim to exploit the elasticity, resource heterogeneity and varied pricing options of a Cloud infrastructure to improve efficiency in the execution of resource-hungry applications such as scientific workflows. Scientific workflows represent a special type of Cloud application with task dependencies, high-performance computational requirements and fluctuating workloads. Hence, the amount and type of resources needed during workflow execution change dynamically over time. The well-known autoscaling problem comprises (i) scaling decisions, which adjust the computing capacity of a virtualized infrastructure to meet the current demand of the application, and (ii) task scheduling decisions, which assign tasks to specific acquired Cloud resources for execution. Both are highly complex sub-problems, all the more so because of the uncertainty inherent to the Cloud. Reinforcement Learning (RL) provides a solid framework for decision-making problems in stochastic environments. Therefore, RL offers a promising perspective for designing Cloud autoscaling strategies based on an online learning process. In this work, we propose a novel formulation of the problem of infrastructure scaling in the Cloud as a Markov Decision Process, and we use the Q-learning algorithm to learn scaling policies. We also demonstrate that considering the specific characteristics of workflow applications when taking autoscaling decisions can lead to more efficient workflow executions. Thus, our RL-based scaling strategy exploits the information available about workflow dependency structures. Simulations performed on four well-known workflows demonstrate significant gains (25%–55%) of our proposal in comparison with a similar state-of-the-art proposal.
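To make the learning step concrete, the following is a minimal sketch of tabular Q-learning applied to a toy scaling problem. It is not the MDP formulation proposed in the paper, which the abstract does not detail. The state `(pending_tasks, vms)`, the action set `ACTIONS`, the reward in `toy_step`, and the values of `alpha`, `gamma` and `epsilon` are all assumptions chosen for illustration. Only the update rule in `q_update` is the standard Q-learning rule.

```python
import random
from collections import defaultdict

# Hypothetical action set: change in the number of VMs (not taken from the paper).
ACTIONS = (-1, 0, +1)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the hypothetical scaling actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def toy_step(state, action):
    """Toy environment (an assumption): state = (pending_tasks, vms)."""
    pending, vms = state
    vms = min(4, max(1, vms + action))   # keep between 1 and 4 VMs
    pending = max(0, pending - vms)      # each VM finishes one task per step
    reward = -(pending + 0.5 * vms)      # penalise waiting tasks and VM cost
    return (pending, vms), reward


def train(episodes=200, seed=0):
    """Run episodes until no tasks remain, learning Q-values online."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = (10, 1)
        while state[0] > 0:
            action = choose_action(q, state, epsilon=0.1, rng=rng)
            next_state, reward = toy_step(state, action)
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q
```

Calling `train()` returns a table mapping `(state, action)` pairs to learned values; a greedy scaling policy would pick, in each state, the action with the highest value. A workflow-aware formulation such as the one the abstract describes would replace the toy state with features derived from the workflow's dependency structure.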

Highlights

• A new MDP formulation for the problem of Cloud infrastructure scaling for workflows.

• Learning scaling policies using Q-learning to reduce makespan and execution cost.

• An in-depth evaluation of the proposed scaling strategy using 4 well-known workflows.

• The inclusion of a state-of-the-art method as baseline for comparisons.

• Simulation experiments demonstrate significant gains (25%–55%) of our proposal.

