More Web Proxy on the site http://driver.im/

research-article

How good are query optimizers, really?

Editors: Surajit Chaudhuri, Jayant Haritsa Authors:

Andrey Gubichev,

Atanas Mirchev,

Thomas NeumannAuthors Info & Claims

Proceedings of the VLDB Endowment, Volume 9, Issue 3

Pages 204 - 215

https://doi.org/10.14778/2850583.2850594

Published: 01 November 2015 Publication History

Abstract

Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the quality of industrial-strength cardinality estimators and find that all estimators routinely produce large errors. We further show that while estimates are essential for finding a good join order, query performance is unsatisfactory if the query engine relies too heavily on these estimates. Using another set of experiments that measure the impact of the cost model, we find that it has much less influence on query performance than the cardinality estimates. Finally, we investigate plan enumeration techniques comparing exhaustive dynamic programming with heuristic algorithms and find that exhaustive enumeration improves performance despite the sub-optimal cardinality estimates.

References

[1]

R. Ahmed, R. Sen, M. Poess, and S. Chakkappen. Of snowstorms and bushy trees. PVLDB, 7(13):1452--1461, 2014.

Digital Library

[2]

B. Babcock and S. Chaudhuri. Towards a robust query optimizer: A principled and practical approach. In SIGMOD, pages 119--130, 2005.

Digital Library

[3]

S. Bellamkonda, H.-G. Li, U. Jagtap, Y. Zhu, V. Liang, and T. Cruanes. Adaptive and big data scale parallel execution in Oracle. PVLDB, 6(11):1102--1113, 2013.

Digital Library

[4]

R. Borovica-Gajic, S. Idreos, A. Ailamaki, M. Zukowski, and C. Fraser. Smooth scan: Statistics-oblivious access paths. In ICDE, pages 315--326, 2015.

[5]

N. Bruno, C. A. Galindo-Legaria, and M. Joshi. Polynomial heuristics for query optimization. In ICDE, pages 589--600, 2010.

[6]

S. Chaudhuri. Query optimizers: time to rethink the contract? In SIGMOD, pages 961--968, 2009.

Digital Library

[7]

S. Chaudhuri, V. R. Narasayya, and R. Ramamurthy. Exact cardinality query optimization for optimizer testing. PVLDB, 2(1):994--1005, 2009.

Digital Library

[8]

M. Colgan. Oracle adaptive joins. https://blogs.oracle.com/optimizer/entry/what_s_new_in_12c, 2013.

[9]

A. Dutt and J. R. Haritsa. Plan bouquets: query processing without selectivity estimation. In SIGMOD, pages 1039--1050, 2014.

Digital Library

[10]

C. Estan and J. F. Naughton. End-biased samples for join cardinality estimation. In ICDE, page 20, 2006.

Digital Library

[11]

L. Fegaras. A new heuristic for optimizing large queries. In DEXA, pages 726--735, 1998.

Digital Library

[12]

P. Fender and G. Moerkotte. Counter strike: Generic top-down join enumeration for hypergraphs. PVLDB, 6(14):1822--1833, 2013.

Digital Library

[13]

P. Fender, G. Moerkotte, T. Neumann, and V. Leis. Effective and robust pruning for top-down join enumeration algorithms. In ICDE, pages 414--425, 2012.

Digital Library

[14]

C. Fraser, L. Giakoumakis, V. Hamine, and K. F. Moore-Smith. Testing cardinality estimation models in SQL Server. In DBtest, 2012.

Digital Library

[15]

G. Graefe. A generalized join algorithm. In BTW, pages 267--286, 2011.

[16]

Z. Gu, M. A. Soliman, and F. M. Waas. Testing the accuracy of query optimizers. In DBTest, 2012.

Digital Library

[17]

P. J. Haas, J. F. Naughton, S. Seshadri, and A. N. Swami. Selectivity and cost estimation for joins based on random sampling. J Computer System Science, 52(3):550--569, 1996.

Digital Library

[18]

J. R. Haritsa. The Picasso database query optimizer visualizer. PVLDB, 3(2):1517--1520, 2010.

Digital Library

[19]

I. F. Ilyas, V. Markl, P. J. Haas, P. Brown, and A. Aboulnaga. CORDS: automatic discovery of correlations and soft functional dependencies. In SIGMOD, pages 647--658, 2004.

Digital Library

[20]

Y. E. Ioannidis. The history of histograms (abridged). In VLDB, pages 19--30, 2003.

Digital Library

[21]

Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the size of join results. In SIGMOD, 1991.

Digital Library

[22]

R. A. Kader, P. A. Boncz, S. Manegold, and M. van Keulen. ROX: run-time optimization of XQueries. In SIGMOD, pages 615--626, 2009.

Digital Library

[23]

R. Kaushik, C. Ré, and D. Suciu. General database statistics using entropy maximization. In DBPL, pages 84--99, 2009.

Digital Library

[24]

Q. Li, M. Shao, V. Markl, K. S. Beyer, L. S. Colby, and G. M. Lohman. Adaptively reordering joins during query execution. In ICDE, pages 26--35, 2007.

[25]

F. Liu and S. Blanas. Forecasting the cost of processing multi-join queries via hashing for main-memory databases. In SoCC, pages 153--166, 2015.

Digital Library

[26]

G. Lohman. Is query optimization a solved problem? http://wp.sigmod.org/?p=1075, 2014.

[27]

L. F. Mackert and G. M. Lohman. R* optimizer validation and performance evaluation for local queries. In SIGMOD, pages 84--95, 1986.

Digital Library

[28]

V. Markl, N. Megiddo, M. Kutsch, T. M. Tran, P. J. Haas, and U. Srivastava. Consistently estimating the selectivity of conjuncts of predicates. In VLDB, pages 373--384, 2005.

Digital Library

[29]

G. Moerkotte and T. Neumann. Dynamic programming strikes back. In SIGMOD, pages 539--552, 2008.

Digital Library

[30]

G. Moerkotte, T. Neumann, and G. Steidl. Preventing bad plans by bounding the impact of cardinality estimation errors. PVLDB, 2(1):982--993, 2009.

Digital Library

[31]

I. Müller, P. Sanders, A. Lacurie, W. Lehner, and F. Färber. Cache-efficient aggregation: Hashing is sorting. In SIGMOD, pages 1123--1136, 2015.

Digital Library

[32]

T. Neumann. Query simplification: graceful degradation for join-order optimization. In SIGMOD, pages 403--414, 2009.

Digital Library

[33]

T. Neumann and C. A. Galindo-Legaria. Taking the edge off cardinality estimation errors using incremental execution. In BTW, pages 73--92, 2013.

[34]

V. Poosala and Y. E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In VLDB, pages 486--495, 1997.

Digital Library

[35]

F. Rusu and A. Dobra. Sketches for size of join estimation. TODS, 33(3), 2008.

Digital Library

[36]

P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In SIGMOD, pages 23--34, 1979.

Digital Library

[37]

M. Steinbrunn, G. Moerkotte, and A. Kemper. Heuristic and randomized optimization for the join ordering problem. VLDB J., 6(3):191--208, 1997.

Digital Library

[38]

M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's learning optimizer. In VLDB, pages 19--28, 2001.

Digital Library

[39]

K. Tzoumas, A. Deshpande, and C. S. Jensen. Lightweight graphical models for selectivity estimation without independence assumptions. PVLDB, 4(11):852--863, 2011.

Digital Library

[40]

F. Waas and A. Pellenkoft. Join order selection - good enough is easy. In BNCOD, pages 51--67, 2000.

Digital Library

[41]

F. M. Waas, L. Giakoumakis, and S. Zhang. Plan space analysis: an early warning system to detect plan regressions in cost-based optimizers. In DBTest, 2011.

Digital Library

[42]

W. Wu, Y. Chi, S. Zhu, J. Tatemura, H. Hacigümüs, and J. F. Naughton. Predicting query execution time: Are optimizer cost models really unusable? In ICDE, pages 1081--1092, 2013.

Digital Library

[43]

F. Yu, W. Hou, C. Luo, D. Che, and M. Zhu. CS2: a new database synopsis for query estimation. In SIGMOD, pages 469--480, 2013.

Digital Library

Cited By

Giannakouris VTrummer I(2024)DBG-PT: A Large Language Model Assisted Query Performance Regression DebuggerProceedings of the VLDB Endowment10.14778/3685800.368586917:12(4337-4340)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.14778/3685800.3685869
Shankhdhar PLiu FNarale JSun JSchlussel RAntova L(2024)Presto's History-Based Query OptimizerProceedings of the VLDB Endowment10.14778/3685800.368582817:12(4077-4089)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.14778/3685800.3685828
Bianchi AChai ACorvinelli VGodfrey PSzlichta JZuzarte C(2024)Db2une: Tuning Under Pressure via Deep LearningProceedings of the VLDB Endowment10.14778/3685800.368581117:12(3855-3868)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.14778/3685800.3685811
Show More Cited By

How good are query optimizers, really?
1. Information systems
  1. Data management systems
    1. Database management system engines
    2. Query languages
  2. Information retrieval
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory

Recommendations

Building query optimizers with combinators
View-based query containment
PODS '03: Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Query containment is the problem of checking whether for all databases the answer to a query is a subset of the answer to a second query. In several data management tasks, such as data integration, mobile computing, etc., the data of interest are only ...
Query Folding
ICDE '96: Proceedings of the Twelfth International Conference on Data Engineering

Query folding refers to the activity of determining if and how a query can be answered using a given set of resources, which might be materialized views, cached results of previous queries, or queries answerable by other databases. We investigate query ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment

Proceedings of the VLDB Endowment Volume 9, Issue 3

November 2015

144 pages

ISSN:2150-8097

Editors:
Surajit Chaudhuri
Microsoft Research
,
Jayant Haritsa
I.I.Sc. Bangalore

Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 November 2015

Published in PVLDB Volume 9, Issue 3

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

284
Total Citations
View Citations
1,902
Total Downloads

Downloads (Last 12 months)409
Downloads (Last 6 weeks)58

Reflects downloads up to 17 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Giannakouris VTrummer I(2024)DBG-PT: A Large Language Model Assisted Query Performance Regression DebuggerProceedings of the VLDB Endowment10.14778/3685800.368586917:12(4337-4340)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.14778/3685800.3685869
Shankhdhar PLiu FNarale JSun JSchlussel RAntova L(2024)Presto's History-Based Query OptimizerProceedings of the VLDB Endowment10.14778/3685800.368582817:12(4077-4089)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.14778/3685800.3685828
Bianchi AChai ACorvinelli VGodfrey PSzlichta JZuzarte C(2024)Db2une: Tuning Under Pressure via Deep LearningProceedings of the VLDB Endowment10.14778/3685800.368581117:12(3855-3868)Online publication date: 1-Aug-2024
https://dl.acm.org/doi/10.14778/3685800.3685811
Yu GWu ZKossmann FLi TMarkakis MNgom AMadden SKraska T(2024)Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRADProceedings of the VLDB Endowment10.14778/3681954.368202617:11(3629-3643)Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.14778/3681954.3682026
Srivastava TFernandez R(2024)Saving Money for Analytical Workloads in the CloudProceedings of the VLDB Endowment10.14778/3681954.368201817:11(3524-3537)Online publication date: 1-Jul-2024
https://dl.acm.org/doi/10.14778/3681954.3682018
Zhang WLim WButrovich MPavlo A(2024)The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-ActionsProceedings of the VLDB Endowment10.14778/3681954.368200717:11(3373-3387)Online publication date: 30-Aug-2024
https://doi.org/10.14778/3681954.3682007
Wang ZLiu HLin CBao ZLi GWang T(2024)Leveraging Dynamic and Heterogeneous Workload Knowledge to Boost the Performance of Index AdvisorsProceedings of the VLDB Endowment10.14778/3654621.365463117:7(1642-1654)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654631
Lehmann CSulimov PStockinger K(2024)Is Your Learned Query Optimizer Behaving As You Expect? A Machine Learning PerspectiveProceedings of the VLDB Endowment10.14778/3654621.365462517:7(1565-1577)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654625
Yu TZou ZSun WYan Y(2024)Refactoring Index Tuning Process with Benefit EstimationProceedings of the VLDB Endowment10.14778/3654621.365462217:7(1528-1541)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654622
Justen DRitter DFraser CLamb ALee ABodner THaddad MZeuch SMarkl VBoehm M(2024)POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least ResistanceProceedings of the VLDB Endowment10.14778/3648160.364817517:6(1350-1363)Online publication date: 1-Feb-2024
https://dl.acm.org/doi/10.14778/3648160.3648175
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Issue’s Table of Contents