DOI: 10.1145/3404397.3404451
research-article

URSA: Precise Capacity Planning and Fair Scheduling based on Low-level Statistics for Public Clouds

Published: 17 August 2020

Abstract

Database platform-as-a-service (dbPaaS) is developing rapidly, and a large number of databases have been migrated to the Cloud for its low cost and flexibility. Emerging Clouds rely on tenants to provide the resource specification for their database workloads. However, tenants tend to over-estimate the resource requirements of their databases, resulting in unnecessarily high cost and low Cloud utilization. A methodology that automatically suggests the "just-enough" resource specification that fulfills the performance requirement of every database workload would therefore benefit both tenants and Cloud providers.
To this end, we propose URSA, a capacity planning and fair scheduling system that comprises an online capacity planner, a performance interference estimator, and a contention-aware scheduling engine. The capacity planner identifies, online, the most cost-efficient resource specification with which a database workload achieves its required performance. The interference estimator quantifies, for each workload, the pressure it puts on shared resources and its tolerance to contention on those resources. The scheduling engine places the workloads across Cloud nodes so as to eliminate unfair performance interference between co-located workloads. Experimental results show that, compared to prior work, URSA reduces CPU usage by up to 25.9% and memory usage by up to 53.4%, and reduces the performance unfairness between co-located workloads by 47.6%, without hurting workload performance.
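The abstract does not give URSA's algorithms, so the following is only a minimal sketch of what contention-aware placement driven by per-workload pressure and tolerance scores could look like. The names `Workload`, `Node`, `worst_overload`, and `place`, the integer score scale, and the greedy strategy are all assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    name: str
    pressure: float   # hypothetical score: contention this workload puts on a shared resource
    tolerance: float  # hypothetical score: contention it can absorb before missing its target


@dataclass
class Node:
    name: str
    workloads: list = field(default_factory=list)


def worst_overload(workloads):
    """Largest amount by which co-runner pressure exceeds any workload's tolerance."""
    total = sum(w.pressure for w in workloads)
    # A workload only suffers from the pressure generated by the *other* workloads.
    return max((total - w.pressure - w.tolerance for w in workloads), default=0.0)


def place(workloads, nodes):
    """Greedy placement (an assumed heuristic, not URSA's): put each workload on the
    node where the resulting worst overload is smallest."""
    # Place the most aggressive workloads first, while the nodes are still empty.
    for w in sorted(workloads, key=lambda w: w.pressure, reverse=True):
        best = min(nodes, key=lambda n: worst_overload(n.workloads + [w]))
        best.workloads.append(w)
    return nodes


if __name__ == "__main__":
    demo = [Workload("A", 80, 10), Workload("B", 70, 20),
            Workload("C", 10, 90), Workload("D", 20, 80)]
    for node in place(demo, [Node("n1"), Node("n2")]):
        print(node.name, [w.name for w in node.workloads])
```

In this toy input the two high-pressure, low-tolerance workloads (A and B) end up on different nodes, each paired with a low-pressure, high-tolerance workload, which is the kind of outcome a contention-aware scheduler aims for.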

Information & Contributors

Information

Published In

ACM Other conferences
ICPP '20: Proceedings of the 49th International Conference on Parallel Processing
August 2020
844 pages
ISBN: 9781450388160
DOI: 10.1145/3404397
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 August 2020

Author Tags

  1. Capacity Planning
  2. DBPaaS
  3. Public Clouds

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP '20

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 26
  • Downloads (last 6 weeks): 1
Reflects downloads up to 13 Dec 2024

Citations

Cited By

  • (2024) StreamBed: Capacity Planning for Stream Processing. Proceedings of the 18th ACM International Conference on Distributed and Event-based Systems (90-102). DOI: 10.1145/3629104.3666034. Online publication date: 24-Jun-2024
  • (2023) Alioth: A Machine Learning Based Interference-Aware Performance Monitor for Multi-Tenancy Applications in Public Cloud. 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (908-917). DOI: 10.1109/IPDPS54959.2023.00095. Online publication date: May-2023
  • (2022) Adaptive Resource Efficient Microservice Deployment in Cloud-Edge Continuum. IEEE Transactions on Parallel and Distributed Systems 33:8 (1825-1840). DOI: 10.1109/TPDS.2021.3128037. Online publication date: 1-Aug-2022
  • (2022) Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs. IEEE Transactions on Computers 71:4 (866-879). DOI: 10.1109/TC.2021.3064352. Online publication date: 1-Apr-2022
  • (2022) Task Partitioning and Orchestration on Heterogeneous Edge Platforms: The Case of Vision Applications. IEEE Internet of Things Journal 9:10 (7418-7432). DOI: 10.1109/JIOT.2022.3153970. Online publication date: 15-May-2022
  • (2022) Accelerating DAG-Style Job Execution via Optimizing Resource Pipeline Scheduling. Journal of Computer Science and Technology 37:4 (852-868). DOI: 10.1007/s11390-021-1488-4. Online publication date: 1-Jul-2022
  • (2021) Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-15). DOI: 10.1145/3458817.3476143. Online publication date: 14-Nov-2021
  • (2021) Adaptive Preference-Aware Co-Location for Improving Resource Utilization of Power Constrained Datacenters. IEEE Transactions on Parallel and Distributed Systems 32:2 (441-456). DOI: 10.1109/TPDS.2020.3023997. Online publication date: 1-Feb-2021
  • (2021) QoS-Aware and Resource Efficient Microservice Deployment in Cloud-Edge Continuum. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (932-941). DOI: 10.1109/IPDPS49936.2021.00102. Online publication date: May-2021
  • (2021) CHARM: Collaborative Host and Accelerator Resource Management for GPU Datacenters. 2021 IEEE 39th International Conference on Computer Design (ICCD) (307-315). DOI: 10.1109/ICCD53106.2021.00056. Online publication date: Oct-2021
