[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1066157.1066172acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Towards a robust query optimizer: a principled and practical approach

Published: 14 June 2005 Publication History

Abstract

Research on query optimization has focused almost exclusively on reducing query execution time, while important qualities such as consistency and predictability have largely been ignored, even though most database users consider these qualities to be at least as important as raw performance. In this paper, we explore how the query optimization process can be made more robust, focusing on the important subproblem of cardinality estimation. The robust cardinality estimation technique that we propose allows for a user- or application-specified trade-off between performance and predictability, and it captures multi-dimensional correlations while remaining space- and time-efficient.

References

[1]
S. Acharya, P. B. Gibbons, V. Poosala, and S. Ramaswamy. Join synopses for approximate query answering. In Proc. 1999 SIGMOD Conf., pages 275--286, June 1999.
[2]
G. Antoshenkov. Query processing in DEC Rdb: Major issues and future challenges. Data Engineering Bulletin, 16(4):41--50, 1993.
[3]
J. M. Bernardo and A. F. M. Smith. Bayesian Theory. John Wiley, 1994.
[4]
S. Chaudhuri and V. Narasayya. Automating statistics management for query optimizers. In Proc. 2000 Intl. Conf. on Data Engineering, pages 339--348, 2000.
[5]
C.-M. Chen and N. Roussopoulos. Adaptive selectivity estimation using query feedback. In Proc. 1994 SIGMOD.
[6]
F. Chu, J. Y. Halpern, and J. Gehrke. Least expected cost query optimization: What can we expect? In Proc. 21st ACM PODS, pages 293--302, 2002.
[7]
F. Chu, J. Y. Halpern, and P. Seshadri. Least expected cost query optimization: An exercise in utility. In Proc. 18th ACM PODS, pages 138--147, 1999.
[8]
R. L. Cole. A decision theoretic cost model for dynamic plans. Data Engineering Bulletin, 23(2):34--41, 2000.
[9]
A. Deshpande, M. N. Garofalakis, and R. Rastogi. Independence is good: Dependency-based histogram synopses for high-dimensional data. In Proc. 2001 SIGMOD.
[10]
D. Donjerkovic and R. Ramakrishnan. Probabilistic optimization of Top N queries. In Proc. 1999 VLDB Conf., pages 411--422, 1999.
[11]
C. Faloutsos and I. Kamel. Relaxing the uniformity and independence assumptions using the concept of fractal dimensions. Journal of Computer and System Sciences, 55(2):229--240, 1997.
[12]
L. Getoor, B. Taskar, and D. Koller. Selectivity estimation using probabilistic models. In Proc. 2001 SIGMOD Conf.
[13]
P. Haas, J. Naughton, P. Seshadri, and L. Stokes. Sampling-based estimation of the number of distinct values of an attribute. In Proc. 1995 VLDB Conf., pages 311--322.
[14]
P. J. Haas, J. F. Naughton, S. Seshadri, and A. N. Swami. Selectivity and cost estimation for joins based on random sampling. Journal of Computer and System Sciences, 52(3):550--569, 1996.
[15]
P. J. Haas and A. N. Swami. Sequential sampling procedures for query size estimation. In Proc. 1992 SIGMOD Conf., pages 341--350, 1992.
[16]
Y. E. Ioannidis. The history of histograms (abridged). In Proc. 2003 VLDB Conf., pages 19--30, Sept. 2003.
[17]
Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the size of join results. In Proc. 1991 SIGMOD Conf., pages 268--277, May 1991.
[18]
H. Jagadish, N. Koudas, S. Muthukrishnan, V. Poosala, K. Sevcik, and T. Suel. Optimal histograms with quality guarantees. In Proc. 1998 VLDB Conf., pages 275--286.
[19]
H. Jeffreys. Theory of Probability. Clarendon Press, 1961.
[20]
J.-H. Lee, D.-H. Kim, and C.-W. Chung. Multi-dimensional selectivity estimation using compressed histogram information. In Proc. 1999 SIGMOD Conf., pages 205--214.
[21]
P. M. Lee. Bayesian Statistics: An Introduction. Oxford University Press, 1989.
[22]
R. J. Lipton, J. F. Naughton, and D. A. Schneider. Practical selectivity estimation through adaptive sampling. In Proc. 1990 SIGMOD Conf., pages 1--11, 1990.
[23]
M. V. Mannino, P. Chu, and T. Sager. Statistical profile estimation in database systems. ACM Computing Surveys, 20(3):192--221, 1988.
[24]
Y. Matias, J. S. Vitter, and M. Wang. Wavelet-based histograms for selectivity estimation. In Proc. 1998 SIGMOD Conf., pages 448--459, June 1998.
[25]
M. Muralikrishna and D. J. DeWitt. Equi-depth histograms for estimating selectivity factors for multi-dimensional queries. In Proc. 1988 SIGMOD Conf., pages 28--36, 1988.
[26]
F. Olken. Random Sampling from Databases. PhD thesis, University of California at Berkeley, 1993.
[27]
V. Poosala and Y. E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In Proc. 1997 VLDB Conf., pages 486--495, 1997.
[28]
V. Poosala, Y. E. Ioannidis, P. J. Haas, and E. J. Shekita. Improved histograms for selectivity estimation of range predicates. In Proc. 1996 SIGMOD Conf., pages 294--305.
[29]
K. Sarac, O. Egecioglu, and A. E. Abbadi. Iterated DFT based techniques for join size estimation. In Proc. 1988 ACM Intl. Conf. on Information and Knowledge Management.
[30]
P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, and T. Price. Access path selection in a relational database management system. In Proc. 1979 SIGMOD Conf., pages 23--34, June 1979.
[31]
K. D. Seppi, J. W. Barnes, and C. N. Morris. A Bayesian approach to database query optimization. ORSA Journal on Computing, 5(4):410--419, 1993.
[32]
Transaction Processing Performance Council. TPC-H benchmark 2.0.0, July 2002. http://www.tpc.org.

Cited By

View all
  • (2024)ROME: Robust Query Optimization via Parallel Multi-Plan ExecutionProceedings of the ACM on Management of Data10.1145/36549732:3(1-25)Online publication date: 30-May-2024
  • (2024)Identifying the Root Causes of DBMS SuboptimalityACM Transactions on Database Systems10.1145/363642549:1(1-40)Online publication date: 28-Feb-2024
  • (2024)TESSM: Tree-based Selective State Space Models for Efficient Join Order Selection LearningProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679742(374-383)Online publication date: 21-Oct-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '05: Proceedings of the 2005 ACM SIGMOD international conference on Management of data
June 2005
990 pages
ISBN:1595930604
DOI:10.1145/1066157
  • Conference Chair:
  • Fatma Ozcan
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 June 2005

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SIGMOD/PODS05
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)69
  • Downloads (Last 6 weeks)11
Reflects downloads up to 13 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)ROME: Robust Query Optimization via Parallel Multi-Plan ExecutionProceedings of the ACM on Management of Data10.1145/36549732:3(1-25)Online publication date: 30-May-2024
  • (2024)Identifying the Root Causes of DBMS SuboptimalityACM Transactions on Database Systems10.1145/363642549:1(1-40)Online publication date: 28-Feb-2024
  • (2024)TESSM: Tree-based Selective State Space Models for Efficient Join Order Selection LearningProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679742(374-383)Online publication date: 21-Oct-2024
  • (2024)Optimism and Pessimism in Database Query Optimisation2024 IEEE 18th International Conference on Semantic Computing (ICSC)10.1109/ICSC59802.2024.00049(269-276)Online publication date: 5-Feb-2024
  • (2024)GLO: Towards Generalized Learned Query Optimization2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00368(4843-4855)Online publication date: 13-May-2024
  • (2024)AutoQuo: An Adaptive plan optimizer with reinforcement learning for query plan selectionKnowledge-Based Systems10.1016/j.knosys.2024.112664(112664)Online publication date: Nov-2024
  • (2024)First Past the Post: Evaluating Query Optimization in MongoDBDatabases Theory and Applications10.1007/978-981-96-1242-0_8(99-113)Online publication date: 13-Dec-2024
  • (2023)Cost-Based or Learning-Based?Proceedings of the VLDB Endowment10.14778/3565838.356584615:13(3924-3936)Online publication date: 20-Jan-2023
  • (2023)Regularized Pairwise Relationship based Analytics for Structured DataProceedings of the ACM on Management of Data10.1145/35889361:1(1-27)Online publication date: 30-May-2023
  • (2023)JoinSketch: A Sketch Algorithm for Accurate and Unbiased Inner-Product EstimationProceedings of the ACM on Management of Data10.1145/35889351:1(1-26)Online publication date: 30-May-2023
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media