DOI: 10.1145/3357384.3358015
research-article

Deploying Hash Tables on Die-Stacked High Bandwidth Memory

Published: 03 November 2019

Abstract

Die-stacked High Bandwidth Memory (HBM) is an emerging memory architecture that offers much higher bandwidth than conventional main memory at similar or lower access latency, but with smaller capacity. Memory-intensive database algorithms can potentially benefit from these features. Because die-stacked HBM is small, a hybrid memory architecture comprising both main memory and HBM is promising for main-memory databases. As a starting point, we study a key data structure, the hash table, in such a hybrid memory architecture. When a large hash table is distributed among multiple NUMA (non-uniform memory access) nodes and accessed by multiple CPU sockets, data placement and memory access scheduling for workload balance are challenging because the random memory accesses involved are difficult to predict. In this work, we propose a deployment algorithm that first estimates the memory access cost and then places data so as to exploit the hybrid memory architecture in a balanced manner. Evaluation results show that the proposed deployment achieves up to a threefold performance improvement over state-of-the-art NUMA-aware scheduling algorithms for hash joins in relational databases, on both present and simulated future hybrid memory architectures.
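
To make the "estimate the cost, then place data in a balanced manner" idea concrete, here is a minimal sketch of one possible cost-aware placement. It is not the algorithm proposed in the paper, which the abstract does not specify. It is a simple greedy heuristic under stated assumptions: each hash table partition has a known size and an estimated access count, each memory node has a fixed capacity and a relative bandwidth, and the cost of serving a partition from a node is approximated as accesses divided by bandwidth. The names `MemoryNode` and `place_partitions`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    """A memory node (e.g. an HBM or DRAM NUMA node); all values are hypothetical."""
    name: str
    capacity: int       # how many size units the node can hold
    bandwidth: float    # relative bandwidth (higher means faster)
    used: int = 0       # size units already placed on this node
    load: float = 0.0   # accumulated estimated access cost
    partitions: list = field(default_factory=list)


def place_partitions(partitions, nodes):
    """Greedily place (pid, size, est_accesses) tuples on memory nodes.

    Assumption: cost of a partition on a node = est_accesses / bandwidth.
    Partitions are handled from hottest to coldest, and each one goes to
    the node with free capacity whose load would be lowest afterwards.
    """
    for pid, size, accesses in sorted(partitions, key=lambda p: p[2], reverse=True):
        candidates = [n for n in nodes if n.used + size <= n.capacity]
        if not candidates:
            raise ValueError(f"no memory node has room for partition {pid!r}")
        best = min(candidates, key=lambda n: n.load + accesses / n.bandwidth)
        best.used += size
        best.load += accesses / best.bandwidth
        best.partitions.append(pid)
    return {n.name: list(n.partitions) for n in nodes}


if __name__ == "__main__":
    # Small, fast HBM node next to a large, slower DRAM node (made-up numbers).
    nodes = [MemoryNode("HBM", capacity=2, bandwidth=4.0),
             MemoryNode("DRAM", capacity=10, bandwidth=1.0)]
    parts = [("p0", 1, 100), ("p1", 1, 80), ("p2", 1, 60), ("p3", 1, 10)]
    print(place_partitions(parts, nodes))
    # prints {'HBM': ['p0', 'p1'], 'DRAM': ['p2', 'p3']}
```

In this toy run the two hottest partitions fill the HBM node and the colder ones spill to DRAM. A real deployment would also need to model latency, contention between sockets, and the accuracy of the access estimates, none of which this sketch attempts.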

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN: 9781450369763
DOI: 10.1145/3357384

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. die-stacked high bandwidth memory
  2. hash joins

Qualifiers

  • Research-article

Conference

CIKM '19

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Downloads (last 12 months): 30
  • Downloads (last 6 weeks): 5

Reflects downloads up to 11 Dec 2024.

Cited By

  • (2024) A High-Performance Non-Indexed Text Search System. Electronics 13:11 (2125). DOI: 10.3390/electronics13112125. Online publication date: 29-May-2024
  • (2022) Exploiting HBM on FPGAs for Data Processing. ACM Transactions on Reconfigurable Technology and Systems 15:4 (1-27). DOI: 10.1145/3491238. Online publication date: 9-Dec-2022
  • (2022) Shuhai: A Tool for Benchmarking High Bandwidth Memory on FPGAs. IEEE Transactions on Computers 71:5 (1133-1144). DOI: 10.1109/TC.2021.3075765. Online publication date: 1-May-2022
  • (2021) GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs. 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (1-9). DOI: 10.1109/ICCAD51958.2021.9643582. Online publication date: 1-Nov-2021
  • (2020) High Bandwidth Memory on FPGAs: A Data Analytics Perspective. 2020 30th International Conference on Field-Programmable Logic and Applications (FPL) (1-8). DOI: 10.1109/FPL50879.2020.00013. Online publication date: Aug-2020
  • (2020) Shuhai: Benchmarking High Bandwidth Memory On FPGAS. 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (111-119). DOI: 10.1109/FCCM48280.2020.00024. Online publication date: May-2020
