DOI: 10.1145/1989323.1989359
Research article

Performance prediction for concurrent database workloads

Published: 12 June 2011

Abstract

Current trends in data management systems, such as cloud and multi-tenant databases, are leading to data processing environments that concurrently execute heterogeneous query workloads. At the same time, these systems need to satisfy diverse performance expectations. In these newly emerging settings, avoiding potential Quality-of-Service (QoS) violations heavily relies on performance predictability, i.e., the ability to estimate the impact of concurrent query execution on the performance of individual queries in a continuously evolving workload.
This paper presents a modeling approach to estimate the impact of concurrency on query performance for analytical workloads. Our solution relies on the analysis of query behavior in isolation, pairwise query interactions and sampling techniques to predict resource contention under various query mixes and concurrency levels. We introduce a simple yet powerful metric that accurately captures the joint effects of disk and memory contention on query performance in a single value. We also discuss predicting the execution behavior of a time-varying query workload through query-interaction timelines, i.e., a fine-grained estimation of the time segments during which discrete mixes will be executed concurrently. Our experimental evaluation on top of PostgreSQL/TPC-H demonstrates that our models can provide query latency predictions within approximately 20% of the actual values in the average case.
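To make the idea of a query-interaction timeline concrete, the following is a minimal Python sketch, not the paper's actual model. It splits a time-varying workload into segments during which a fixed mix of queries runs concurrently, and it advances each query's progress at a rate scaled by a contention factor. The names `Query`, `slowdown`, and `interaction_timeline` are hypothetical, and the linear slowdown function is a placeholder assumption standing in for whatever contention metric a real predictor would supply.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    start: float             # arrival time, in seconds
    isolated_latency: float  # latency when run alone, in seconds


def slowdown(mix_size: int) -> float:
    """Placeholder contention model (an assumption, not the paper's metric):
    every additional concurrent query stretches latency by 50%."""
    return 1.0 + 0.5 * (mix_size - 1)


def interaction_timeline(queries, slowdown_fn=slowdown):
    """Return (segments, finish_times).

    segments is a list of (t_start, t_end, mix), where mix is a tuple of the
    query names running concurrently during that time segment.
    """
    pending = sorted(queries, key=lambda q: q.start)
    iso = {q.name: q.isolated_latency for q in queries}
    remaining = {}   # query name -> fraction of its work still to do
    finish = {}
    segments = []
    now = pending[0].start if pending else 0.0
    i = 0
    while i < len(pending) or remaining:
        # Admit every query that has arrived by the current time.
        while i < len(pending) and pending[i].start <= now:
            remaining[pending[i].name] = 1.0
            i += 1
        if not remaining:
            # System is idle: jump ahead to the next arrival.
            now = pending[i].start
            continue
        factor = slowdown_fn(len(remaining))
        # The segment ends at the next completion or the next arrival.
        next_finish = min(r * iso[n] * factor for n, r in remaining.items())
        next_arrival = pending[i].start - now if i < len(pending) else float("inf")
        dt = min(next_finish, next_arrival)
        segments.append((now, now + dt, tuple(sorted(remaining))))
        for n in list(remaining):
            remaining[n] -= dt / (iso[n] * factor)
            if remaining[n] <= 1e-9:
                finish[n] = now + dt
                del remaining[n]
        now += dt
    return segments, finish


if __name__ == "__main__":
    workload = [Query("A", 0.0, 10.0), Query("B", 4.0, 4.0)]
    segs, done = interaction_timeline(workload)
    for seg in segs:
        print(seg)
    print(done)
```

With the two-query workload in the demo, the timeline has three segments: A alone, then A and B together, then A alone again. Swapping `slowdown_fn` for a learned or measured contention model changes the predicted latencies without changing the segment logic.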




    Published In

    SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
    June 2011
    1364 pages
    ISBN: 9781450306614
    DOI: 10.1145/1989323

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. concurrency
    2. query performance prediction

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '11

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Cited By

    • (2025) Hybrid Cost Modeling for Reducing Query Performance Regression in Index Tuning. IEEE Transactions on Knowledge and Data Engineering, 37:1 (379-391). DOI: 10.1109/TKDE.2024.3484954. Online publication date: Jan-2025
    • (2024) A systematic review of deep learning applications in database query execution. Journal of Big Data, 11:1. DOI: 10.1186/s40537-024-01025-1. Online publication date: 18-Dec-2024
    • (2023) Prediction of Cloud API Performance Using Uncertainty-Based Fusion of Predictive and Analytical Modeling. 2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys) (515-522). DOI: 10.1109/HPCC-DSS-SmartCity-DependSys60770.2023.00076. Online publication date: 17-Dec-2023
    • (2022) Execution Time Prediction for Cypher Queries in the Neo4j Database Using a Learning Approach. Symmetry, 14:1 (55). DOI: 10.3390/sym14010055. Online publication date: 1-Jan-2022
    • (2022) Multi-Tenant Cloud Data Services: State-of-the-Art, Challenges and Opportunities. Proceedings of the 2022 International Conference on Management of Data (2465-2473). DOI: 10.1145/3514221.3522566. Online publication date: 10-Jun-2022
    • (2022) Efficient Learning with Pseudo Labels for Query Cost Estimation. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (1309-1318). DOI: 10.1145/3511808.3557305. Online publication date: 17-Oct-2022
    • (2022) Database Meets Artificial Intelligence: A Survey. IEEE Transactions on Knowledge and Data Engineering, 34:3 (1096-1116). DOI: 10.1109/TKDE.2020.2994641. Online publication date: 1-Mar-2022
    • (2022) Hihooi: A Database Replication Middleware for Scaling Transactional Databases Consistently. IEEE Transactions on Knowledge and Data Engineering, 34:2 (691-707). DOI: 10.1109/TKDE.2020.2987560. Online publication date: 1-Feb-2022
    • (2021) CPRQ: Cost Prediction for Range Queries in Moving Object Databases. ISPRS International Journal of Geo-Information, 10:7 (468). DOI: 10.3390/ijgi10070468. Online publication date: 8-Jul-2021
    • (2021) openGauss. Proceedings of the VLDB Endowment, 14:12 (3028-3042). DOI: 10.14778/3476311.3476380. Online publication date: 28-Oct-2021
