DOI: 10.1145/3472456.3472459

CERES: Container-Based Elastic Resource Management System for Mixed Workloads

Published: 05 October 2021

Abstract

It is common to deploy multiple workloads in one cluster to achieve high resource utilization, which tends to introduce more resource contention and performance interference. If the allocable resources cannot satisfy the resource requirements of a task, the task has to wait for resources, which significantly increases its scheduling latency. Inappropriate resource requirements may turn a running task into a swollen task or a straggler task: a swollen task leaves many of its allocated resources underutilized, while a straggler task is processed slowly. Therefore, guaranteeing the QoS of various services in a cluster with mixed workload deployment is challenging. Existing solutions preempt resources from batch jobs to satisfy the resource requirements of latency-sensitive tasks without taking the underutilized resources of swollen tasks into account, which inevitably compromises the performance of batch jobs. We therefore try to meet the resource requirements of newly arriving latency-sensitive tasks and straggler tasks with these underutilized resources instead of directly preempting resources from batch jobs.
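The notions of swollen and straggler tasks can be illustrated with a small classification sketch. The metrics and thresholds below (a CPU-utilization ratio and a progress lag) are assumptions made purely for illustration; the abstract does not describe how CERES actually identifies such tasks.

# Illustrative sketch only: the metrics (CPU utilization, progress lag)
# and the thresholds below are assumptions; the abstract does not
# specify how CERES identifies swollen or straggler tasks.
from dataclasses import dataclass

@dataclass
class TaskStats:
    allocated_cpu: float      # cores reserved for the task
    used_cpu: float           # cores actually consumed on average
    progress: float           # fraction of work completed (0.0 - 1.0)
    expected_progress: float  # progress expected at this point in time

def classify(task: TaskStats,
             util_threshold: float = 0.5,
             lag_threshold: float = 0.2) -> str:
    """Label a task as 'swollen', 'straggler', or 'normal'."""
    utilization = task.used_cpu / task.allocated_cpu
    lag = task.expected_progress - task.progress
    if utilization < util_threshold:
        return "swollen"    # holds far more resources than it uses
    if lag > lag_threshold:
        return "straggler"  # falls well behind its expected progress
    return "normal"

# A task using 1 of its 4 allocated cores is flagged as swollen.
print(classify(TaskStats(allocated_cpu=4, used_cpu=1,
                         progress=0.6, expected_progress=0.65)))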
This paper presents CERES, which aims to ensure the QoS of latency-sensitive services while reducing the performance impact on batch jobs. First, CERES periodically screens out swollen tasks from batch jobs and straggler tasks from latency-sensitive services. Second, when the idle resources in the cluster cannot meet the resource requirements of newly arriving latency-sensitive tasks and the straggler tasks, CERES reclaims resources from the swollen tasks and, if necessary, preempts resources from common batch tasks. If there are sufficient allocable resources in the cluster, CERES instead expands the resources of the straggler tasks. We have implemented CERES on Hadoop YARN and conducted comprehensive experiments. The results show that, compared with the state-of-the-art approach, CERES decreases the task completion time of latency-sensitive services by 20.77%, reduces performance losses to batch jobs by 15.46%, and improves cluster resource utilization by 27.06%.
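Read as a control loop, the behaviour described above (screen tasks, reclaim from swollen tasks, preempt common batch tasks only as a last resort, or expand stragglers when capacity allows) can be sketched as follows. The one-dimensional CPU model, the class and function names, and the example numbers are illustrative assumptions, not CERES's actual YARN-based implementation.

# Self-contained sketch of the rebalancing behaviour described above.
# The one-dimensional CPU model and all names are assumptions made for
# illustration; CERES itself manages YARN containers.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    allocated: int   # cores currently held
    needed: int      # cores the task actually uses
    kind: str        # "batch" or "latency-sensitive"

@dataclass
class Cluster:
    capacity: int
    running: List[Task] = field(default_factory=list)

    def idle(self) -> int:
        return self.capacity - sum(t.allocated for t in self.running)

    def reclaim_swollen(self, task: Task) -> None:
        task.allocated = task.needed   # shrink to what is really used

    def preempt(self, task: Task) -> None:
        self.running.remove(task)      # free all of the task's cores

def rebalance(cluster: Cluster, new_ls_demand: int, straggler_demand: int) -> None:
    demand = new_ls_demand + straggler_demand
    if cluster.idle() >= demand:
        return  # enough spare capacity: stragglers can simply be expanded
    # Step 1: reclaim underutilized resources from swollen batch tasks.
    for t in cluster.running:
        if t.kind == "batch" and t.allocated > t.needed:
            cluster.reclaim_swollen(t)
            if cluster.idle() >= demand:
                return
    # Step 2: only if still short, preempt common batch tasks.
    for t in list(cluster.running):
        if t.kind == "batch":
            cluster.preempt(t)
            if cluster.idle() >= demand:
                return

# Example: a 16-core cluster with one swollen batch task (holds 8, needs 3).
c = Cluster(capacity=16, running=[Task("batch-1", 8, 3, "batch"),
                                  Task("ls-1", 6, 6, "latency-sensitive")])
rebalance(c, new_ls_demand=5, straggler_demand=0)
print(c.idle())  # 7: shrinking the swollen task freed 5 cores, no preemption needed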



Information & Contributors

Information

Published In

ICPP '21: Proceedings of the 50th International Conference on Parallel Processing
August 2021
927 pages
ISBN:9781450390682
DOI:10.1145/3472456
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 October 2021


Author Tags

  1. containerized task
  2. elastic resource management
  3. mixed workload deployment
  4. task filter

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2021

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%


Cited By

  • (2024) A Survey on Spatio-Temporal Big Data Analytics Ecosystem: Resource Management, Processing Platform, and Applications. IEEE Transactions on Big Data 10(2), 174–193. https://doi.org/10.1109/TBDATA.2023.3342619. Online publication date: Apr-2024
  • (2023) Tango: Harmonious Management and Scheduling for Mixed Services Co-located among Distributed Edge-Clouds. Proceedings of the 52nd International Conference on Parallel Processing, 595–604. https://doi.org/10.1145/3605573.3605589. Online publication date: 13-Sep-2023
  • (2023) Topology-Aware Self-Adaptive Resource Provisioning for Microservices. 2023 IEEE International Conference on Web Services (ICWS), 28–35. https://doi.org/10.1109/ICWS60048.2023.00016. Online publication date: Jul-2023
  • (2022) TERMS. Journal of Parallel and Distributed Computing 170(C), 74–85. https://doi.org/10.1016/j.jpdc.2022.08.005. Online publication date: 1-Dec-2022
