[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/3459637.3481947acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Scalable Learning to Troubleshoot Query Performance Problems

Published: 30 October 2021 Publication History

Abstract

Query optimization has long been fundamental for database systems. There are cracks in the edifice, however, as the complexity of modern query workloads outpace what database systems can manage well. Automatic tools are needed for database vendors, such as IBM with Db2, to help customers troubleshoot their performance problems, as manual troubleshooting is painstaking. To manage complex and large workloads, we develop a distributed system called dGALO that learns recurring problem patterns in query plans over workloads. dGALO employs these problem patterns to build a RDF-based, SPARQL-queried knowledge-base of plan-rewrite remedies. We illustrate a distributed implementation of dGALO on Apache Spark with efficient partitioning strategies for load balancing. The system employs additional pruning strategies via clustering, which yields a fine-grained trade off between runtime and accuracy. dGALO uses its knowledge-base to re-optimize queries, often to dramatic effect, and is a valuable tool for the development team to refine the optimizer with new techniques. We demonstrate by an experimental study over the TPC-DS benchmark the efficiency and effectiveness of our techniques.

Supplementary Material

MP4 File (CIKM-2021-video-afp0562.mp4)
How do we alleviate the manual efforts of database query experts in query problem determination? Previous work has focused on automating this process but was limited in its ability to learn and scale reasonably. In this work dGALO aims to improve the learning time by distributing workloads across multiple workers. In addition, it intelligently reduces redundancy, only executing the necessary queries, while still maximizing the templates discovered. These advances enable the learning to scale to larger workloads while maintaining its effectiveness in repairing problem queries.

References

[1]
S. Agrawal, S. Chaudhuri, L. Kollar, A. Marathe, V. Narasayya, and M. Syamala. 2005. Database tuning advisor for microsoft sql server 2005. In SIGMOD. 930--932.
[2]
R. Barber, G. Lohman, I. Pandis, V. Raman, R. Sidle, G. Attaluri, N. Chainani, S. Lightstone, and D. Sharpe. 2014. Memory-efficient Hash Joins. PVLDB, Vol. 8, 4 (2014), 353--364.
[3]
N. Bruno, S. Chaudhuri, and R. Ramamurthy. 2009a. Interactive Plan Hints for Query Optimization. In SIGMOD. 1043--1046.
[4]
N. Bruno, S. Chaudhuri, and R. Ramamurthy. 2009b. Power Hints for Query Optimization. In ICDE. 469--480.
[5]
IBM Knowledge Center. 2021. IBM InfoSphere Optim Query Workload Tuner. https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.idm.tools.doc/doc/c0057033.html
[6]
S. Chaudhuri and K. Shim. 1999. Optimization of Queries with User-defined Predicates. TODS, Vol. 24, 2 (1999), 177--228.
[7]
B. Dageville, D. Das, K. Dias, K. Yagoub, M. Zait, and M. Ziauddin. 2004. Automatic SQL Tuning in Oracle 10G. In VLDB. 1098--1109.
[8]
G. Damasio, S. Bryson, V. Corvinelli, P. Godfrey, P. Mierzejewski, A. Mihaylov, J. Szlichta, and C. Zuzarte. 2019 a. GALO: Guided Automated Learning for re-Optimization. PVLDB, Vol. 12, 12 (2019), 1778--1781.
[9]
G. Damasio, V. Corvinelli, P. Godfrey, P. Mierzejewski, A. Mihaylov, J. Szlichta, and C. Zuzarte. 2019 b. Guided Automated Learning for query workload re-Optimization. PVLDB, Vol. 12, 12 (2019), 2010--2021.
[10]
G. Damasio, P. Mierzejewski, J. Szlichta, and C. Zuzarte. 2016a. OptImatch: Semantic web system for query problem determination. In ICDE. 1334--1337.
[11]
G. Damasio, P. Mierzejewski, J. Szlichta, and C. Zuzarte. 2016b. Query Performance Problem Determination with Knowledge Base in Semantic Web System OptImatch. In EDBT. 515--526.
[12]
J. Dean and S. Ghemawat. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM, Vol. 51, 1 (2008), 107--113.
[13]
A. El-Helw, I. F. Ilyas, and C. Zuzarte. 2009. Statadvisor: Recommending statistical views. PVLDB, Vol. 2, 2 (2009), 1306--1317.
[14]
M. Ester, H. Kriegel, J. Sander, and X. Xu. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD. 226--231.
[15]
J. Gryz, Q. Wang, X. Qian, and C. Zuzarte. 2008. SQL Queries with CASE Expressions. In ISMIS. 351--360.
[16]
Z. He, B. S. Lee, and R. Snapp. 2005. Self-tuning Cost Modeling of User-defined Functions in an Object-relational DBMS. TODS, Vol. 30, 3 (2005), 812--853.
[17]
Viktor Leis, Bernhard Radke, Andrey Gubichev, Alfons Kemper, and Thomas Neumann. 2017. Cardinality Estimation Done Right: Index-Based Join Sampling. In CIDR.
[18]
Henry Liu, Mingbin Xu, Ziting Yu, Vincent Corvinelli, and Calisto Zuzarte. 2015. Cardinality estimation using neural networks. In Proceedings of the 25th Annual International Conference on Computer Science and Software Engineering. IBM Corp., 53--59.
[19]
R. Marcus and O. Papaemmanouil. 2018. Deep Reinforcement Learning for Join Order Enumeration. In aiDM. Article 3, 4 pages.
[20]
V. Muntés-Mulero, J. Aguilar-Saborit, C. Zuzarte, and J. Larriba-Pey. 2006. CGO: A Sound Genetic Optimizer for Cyclic Query Graphs. In ICCS. 156--163.
[21]
J. Ortiz, M. Balazinska, J. Gehrke, and S. Keerthi. 2018. Learning State Representations for Query Optimization with Deep Reinforcement Learning. In DEEM. 1--4.
[22]
A. Pavlo, G. Angulo, J. Arulraj, H. Lin, J. Lin, L. Ma, P. Menon, T. C. Mowry, M. Perron, and I. Quah. 2017. Self-Driving Database Management Systems. In CIDR.
[23]
P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, and T. Price. 1979. Access path selection in a relational database management system. In SIGMOD. 23--34.
[24]
J. Szlichta, P. Godfrey, L. Golab, M. Kargar, and D. Srivastava. 2017. Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization. PVLDB, Vol. 10, 7 (2017), 721--732.
[25]
M. Ziauddin, D. Das, H. Su, Y. Zhu, and K. Yagoub. 2008. Optimizer Plan Change Management: Improved Stability and Performance in Oracle 11G. PVLDB, Vol. 1, 2 (2008), 1346--1355.
[26]
D. C. Zilio, J. Rao, S. Lightstone, G. Lohman, A. Storm, C. Garcia-Arellano, and S. Fadden. 2004 a. DB2 Design Advisor: Integrated Automatic Physical Database Design. In VLDB. 1087--1097.
[27]
D. C. Zilio, C. Zuzarte, S. Lightstone, W. Ma, G. M. Lohman, R. J. Cochrane, H. Pirahesh, L. Colby, J. Gryz, E. Alton, and G. Valentin. 2004 b. Recommending materialized views and indexes with the IBM DB2 Design Advisor. In ICAC. 180--187.

Cited By

View all
  • (2024)Db2une: Tuning Under Pressure via Deep LearningProceedings of the VLDB Endowment10.14778/3685800.368581117:12(3855-3868)Online publication date: 1-Aug-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 October 2021

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. distributed computing
  2. query performance problems

Qualifiers

  • Research-article

Conference

CIKM '21
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Upcoming Conference

CIKM '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)18
  • Downloads (Last 6 weeks)1
Reflects downloads up to 17 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Db2une: Tuning Under Pressure via Deep LearningProceedings of the VLDB Endowment10.14778/3685800.368581117:12(3855-3868)Online publication date: 1-Aug-2024

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media