DOI: 10.1145/1989323.1989357

Workload-aware database monitoring and consolidation

Published: 12 June 2011

Abstract

In most enterprises, databases are deployed on dedicated database servers. Often, these servers are underutilized much of the time. For example, in traces from almost 200 production servers from different organizations, we see an average CPU utilization of less than 4%. This unused capacity can be potentially harnessed to consolidate multiple databases on fewer machines, reducing hardware and operational costs. Virtual machine (VM) technology is one popular way to approach this problem. However, as we demonstrate in this paper, VMs fail to adequately support database consolidation, because databases place a unique and challenging set of demands on hardware resources, which are not well-suited to the assumptions made by VM-based consolidation.
Instead, our system for database consolidation, named Kairos, uses novel techniques to measure the hardware requirements of database workloads, as well as models to predict the combined resource utilization of those workloads. We formalize the consolidation problem as a non-linear optimization program, aiming to minimize the number of servers and balance load, while achieving near-zero performance degradation. We compare Kairos against virtual machines, showing up to a factor of 12× higher throughput on a TPC-C-like benchmark. We also tested the effectiveness of our approach on real-world data collected from production servers at Wikia.com, Wikipedia, Second Life, and MIT CSAIL, showing absolute consolidation ratios ranging between 5.5:1 and 17:1.
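
To make the idea of a consolidation ratio concrete, the following is a minimal sketch of packing workloads onto fewer servers. It is not the Kairos method: the abstract describes a non-linear optimization program with models of combined resource use, whereas this sketch swaps in a simple first-fit-decreasing heuristic and considers CPU only. The function names, the sample traces, and the 80% capacity threshold are all hypothetical and chosen for illustration.

```python
from typing import Dict, List


def first_fit_consolidate(
    traces: Dict[str, List[int]],
    capacity: int = 80,
) -> List[List[str]]:
    """Greedily assign workloads to servers (first-fit decreasing by peak).

    `traces` maps a workload name to its CPU utilization over time, in
    percent of one target server. A workload fits on a server only if the
    summed utilization stays at or below `capacity` at every time step.
    This is a stand-in heuristic, not the paper's optimization program.
    """
    servers: List[List[str]] = []  # workload names placed on each server
    loads: List[List[int]] = []    # summed utilization trace per server

    # Place the workloads with the highest peak utilization first.
    for name in sorted(traces, key=lambda n: max(traces[n]), reverse=True):
        trace = traces[name]
        for i, load in enumerate(loads):
            combined = [a + b for a, b in zip(load, trace)]
            if max(combined) <= capacity:
                servers[i].append(name)
                loads[i] = combined
                break
        else:
            # No existing server has room at every time step: open a new one.
            servers.append([name])
            loads.append(list(trace))
    return servers


def consolidation_ratio(traces: Dict[str, List[int]],
                        placement: List[List[str]]) -> float:
    """Original server count (one per workload) over consolidated count."""
    return len(traces) / len(placement)


# Hypothetical traces: a, b, and c peak at different times; d is steady.
traces = {
    "a": [50, 10, 10],
    "b": [10, 50, 10],
    "c": [10, 10, 50],
    "d": [60, 60, 60],
}
placement = first_fit_consolidate(traces)
print(placement)
print(consolidation_ratio(traces, placement))
```

In this example the three workloads whose peaks do not coincide end up sharing one server, which is why packing on time-varying utilization rather than on peak values alone can raise the ratio. A real consolidation tool would also have to account for memory and disk I/O, which the abstract identifies as the demands that make databases hard to consolidate.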

Information & Contributors

Information

Published In

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011
1364 pages
ISBN: 9781450306614
DOI: 10.1145/1989323
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. consolidation
  2. multi-tenant databases

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '11

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Article Metrics

  • Downloads (Last 12 months): 82
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 26 Dec 2024

Cited By

  • (2024) Locality-Preserving Graph Traversal With Split Live Migration. IEEE Transactions on Parallel and Distributed Systems 35:10 (1810-1825). DOI: 10.1109/TPDS.2024.3436828. Online publication date: Oct-2024
  • (2024) Query execution time estimation in graph databases based on graph neural networks. Journal of King Saud University - Computer and Information Sciences 36:4 (102018). DOI: 10.1016/j.jksuci.2024.102018. Online publication date: Apr-2024
  • (2024) Polyglotte Persistenz im Datenmanagement. Schnelles und skalierbares Cloud-Datenmanagement (161-188). DOI: 10.1007/978-3-031-54388-3_7. Online publication date: 3-May-2024
  • (2023) Flexible Resource Allocation for Relational Database-as-a-Service. Proceedings of the VLDB Endowment 16:13 (4202-4215). DOI: 10.14778/3625054.3625058. Online publication date: 1-Sep-2023
  • (2023) Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads. Proceedings of the ACM on Management of Data 1:3 (1-25). DOI: 10.1145/3617333. Online publication date: 13-Nov-2023
  • (2023) Priority-Driven Differentiated Performance for NoSQL Database-as-a-Service. IEEE Transactions on Cloud Computing 11:4 (3469-3482). DOI: 10.1109/TCC.2023.3292031. Online publication date: Oct-2023
  • (2023) LBFF: Load-Balancing First Fit Algorithm for Tenant Placement Problem. ICC 2023 - IEEE International Conference on Communications (6261-6267). DOI: 10.1109/ICC45041.2023.10279638. Online publication date: 28-May-2023
  • (2023) Load Balancing Traffic Among Kubernetes Replicas by Utilizing Workload Estimation. 2023 IEEE Conference on Standards for Communications and Networking (CSCN) (353-356). DOI: 10.1109/CSCN60443.2023.10453145. Online publication date: 6-Nov-2023
  • (2022) Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities. Proceedings of the 2022 International Conference on Management of Data (2465-2473). DOI: 10.1145/3514221.3522566. Online publication date: 10-Jun-2022
  • (2022) Database Meets Artificial Intelligence: A Survey. IEEE Transactions on Knowledge and Data Engineering 34:3 (1096-1116). DOI: 10.1109/TKDE.2020.2994641. Online publication date: 1-Mar-2022
