[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/1083592.1083687dlproceedingsArticle/Chapter ViewAbstractPublication PagesvldbConference Proceedingsconference-collections
Article

Inspector joins

Published: 30 August 2005 Publication History

Abstract

The key idea behind Inspector Joins is that during the I/O partitioning phase of a hash-based join, we have the opportunity to look at the actual data itself and then use this knowledge in two ways: (1) to create specialized indexes, specific to the given query on the given data, for optimizing the CPU cache performance of the subsequent join phase of the algorithm, and (2) to decide which join phase algorithm best suits this specific query. We show how inspector joins, employing novel statistics and specialized indexes, match or exceed the performance of state-of-the-art cache-friendly hash join algorithms. For example, when run on eight or more processors, our experiments show that inspector joins offer 1.1-1.4X speedups over these previous algorithms, with the speedup increasing as the number of processors increases.

References

[1]
J. A. Blakeley, P. Larson, and F. W. Tompa. Efficiently Updating Materialized Views. In Proceedings of the 1986 SIGMOD Conference, pages 61--71, May 1986.
[2]
B. H. Bloom. Space/Time Trade-offs in Hash Coding with Allowable Errors. Communications of ACM, 13(7):422--426, 1970.
[3]
P. A. Boncz, S. Manegold, and M. Kersten. Database Architecture Optimized for the New Bottleneck: Memory Access. In Proceedings of the 25th VLDB, pages 54--65, Sept. 1999.
[4]
S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry. Improving Hash Join Performance through Prefetching. In Proceedings of the 20th ICDE, pages 116--127, March 2004.
[5]
G. Graefe. Query Evaluation Techniques for Large Databases. ACM Computing Surveys, 25(2):73--170, 1993.
[6]
Intel Corp. IA-32 Intel Architecture Software Developer's Manual, Volumn 2B:Instruction Set Reference N-Z. Order Number: 253667.
[7]
Intel Corp. Intel Itanium 2 Processor Reference Manual For Software Development and Optimization. Order Number: 251110--003.
[8]
Intel Corp. Intel Itanium Architecture Software Developer's Manual. Order Number: 245317-004.
[9]
N. Kabra and D. J. DeWitt. Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans. In Proceedings of the 1998 SIGMOD Conference, pages 106--117, June 1998.
[10]
M. Kitsuregawa, H. Tanaka, and T. Moto-Oka. Application of Hash to Data Base Machine and its Architecture. New Generation Computing, 1(1):63--74, 1983.
[11]
B. Lindsay. Hash Joins in DB2 UDB: the Inside Story. Carnegie Mellon DB Seminar, March 2002.
[12]
S. Manegold, P. A. Boncz, and M. L. Kersten. What Happens During a Join? Dissecting CPU and Memory Optimization Effects. In Proceedings of the 26th VLDB, pages 339--350, Sept. 2000.
[13]
V. Markl, V. Raman, D. E. Simmen, G. M. Lohman, and H. Pirahesh. Robust Query Processing through Progressive Optimization. In Proceedings of the 2004 SIGMOD Conference, pages 659--670, June 2004.
[14]
S. McFarling. Combining Branch Predictors. Technical Report WRL Technical Note TN-36, Digital Equipment Corporation, June 1993.
[15]
T. K. Sellis. Multiple-Query Optimization. ACM TODS, 13(1):23--52, 1988.
[16]
L. D. Shapiro. Join Processing in Database Systems with Large Main Memories. ACM TODS, 11(3):239--264, 1986.
[17]
A. Shatdal, C. Kant, and J. F. Naughton. Cache Conscious Algorithms for Relational Query Processing. In Proceedings of the 20th VLDB, pages 510--521, Sept. 1994.
[18]
M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's LEarning Optimizer. In Proceedings of the 27th VLDB, pages 19--28, Sept. 2001.
[19]
P. Valduriez. Join Indices. ACM TODS, 12(2):218--246, 1987.

Cited By

View all
  • (2018)Smooth ScanThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-018-0507-827:4(521-545)Online publication date: 1-Aug-2018
  • (2013)Memory footprint mattersProceedings of the 4th annual Symposium on Cloud Computing10.1145/2523616.2523626(1-16)Online publication date: 1-Oct-2013
  • (2010)Optimization of joins using random record generation methodProceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India10.1145/1858378.1858406(1-6)Online publication date: 16-Sep-2010
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image DL Hosted proceedings
VLDB '05: Proceedings of the 31st international conference on Very large data bases
August 2005
1392 pages
ISBN:1595931546

Publisher

VLDB Endowment

Publication History

Published: 30 August 2005

Qualifiers

  • Article

Conference

ICMI05

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2018)Smooth ScanThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-018-0507-827:4(521-545)Online publication date: 1-Aug-2018
  • (2013)Memory footprint mattersProceedings of the 4th annual Symposium on Cloud Computing10.1145/2523616.2523626(1-16)Online publication date: 1-Oct-2013
  • (2010)Optimization of joins using random record generation methodProceedings of the 1st Amrita ACM-W Celebration on Women in Computing in India10.1145/1858378.1858406(1-6)Online publication date: 16-Sep-2010
  • (2010)Suffix tree construction algorithms on modern hardwareProceedings of the 13th International Conference on Extending Database Technology10.1145/1739041.1739075(263-274)Online publication date: 22-Mar-2010
  • (2009)Improving the performance of list intersectionProceedings of the VLDB Endowment10.14778/1687627.16877222:1(838-849)Online publication date: 1-Aug-2009
  • (2009)The design of a query monitoring systemACM Transactions on Database Systems10.1145/1508857.150885834:1(1-51)Online publication date: 23-Apr-2009
  • (2008)Breaking the memory wall in MonetDBCommunications of the ACM10.1145/1409360.140938051:12(77-85)Online publication date: 1-Dec-2008
  • (2007)Adaptive aggregation on chip multiprocessorsProceedings of the 33rd international conference on Very large data bases10.5555/1325851.1325893(339-350)Online publication date: 23-Sep-2007
  • (2007)A general framework for improving query processing performance on multi-level memory hierarchiesProceedings of the 3rd international workshop on Data management on new hardware10.1145/1363189.1363193(1-9)Online publication date: 15-Jun-2007
  • (2007)Improving hash join performance through prefetchingACM Transactions on Database Systems10.1145/1272743.127274732:3(17-es)Online publication date: 1-Aug-2007
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media