research-article

Deploying Hash Tables on Die-Stacked High Bandwidth Memory

Authors:

Xinyu Chen

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Pages 239 - 248

https://doi.org/10.1145/3357384.3358015

Published: 03 November 2019

Abstract

Die-stacked High Bandwidth Memory (HBM) is an emerging memory architecture that offers much higher memory bandwidth than conventional main memory, with similar or lower access latency but smaller capacity. Memory-intensive database algorithms may benefit from these features. Because die-stacked HBM has limited capacity, a hybrid memory architecture comprising both main memory and HBM is promising for main-memory databases. As a starting point, we study a key data structure, the hash table, in such a hybrid memory architecture. When a large hash table is distributed among multiple NUMA (non-uniform memory access) nodes and accessed by multiple CPU sockets, data placement and memory access scheduling for workload balance are challenging because the random memory accesses involved are difficult to predict. In this work, we propose a deployment algorithm that first estimates the memory access cost and then places data in a way that exploits the hybrid memory architecture in a balanced manner. Evaluation results show that the proposed deployment achieves up to a threefold performance improvement over state-of-the-art NUMA-aware scheduling algorithms for hash joins in relational databases on present and simulated future hybrid memory architectures.
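
The abstract does not spell out the deployment algorithm, so the following is only an illustrative sketch of the general "estimate cost, then place in a balanced way" idea, not the authors' method. It assumes a simple greedy heuristic: each hash table partition carries an estimated access count, each memory node has a capacity and a relative bandwidth, and the hottest partitions are placed first on whichever node with free space would finish its accesses soonest. The names `MemoryNode`, `place_partitions`, and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    """A memory node (for example HBM or DDR); all fields are illustrative."""
    name: str
    capacity: int        # space available for hash table partitions
    bandwidth: float     # relative bandwidth; higher serves accesses faster
    used: int = 0
    load: float = 0.0    # sum of estimated accesses placed on this node
    partitions: list = field(default_factory=list)

    def cost_with(self, accesses: float) -> float:
        # Estimated service time if `accesses` more accesses land on this node.
        return (self.load + accesses) / self.bandwidth


def place_partitions(partitions, nodes):
    """Greedy placement (an assumption, not the paper's algorithm).

    partitions: iterable of (partition_id, size, estimated_accesses).
    Returns a dict mapping node name to the partition ids placed on it.
    """
    # Place the most frequently accessed partitions first.
    for pid, size, accesses in sorted(partitions, key=lambda p: -p[2]):
        candidates = [n for n in nodes if n.used + size <= n.capacity]
        if not candidates:
            raise ValueError(f"no node has room for partition {pid}")
        # Pick the node whose estimated service time stays lowest.
        best = min(candidates, key=lambda n: n.cost_with(accesses))
        best.used += size
        best.load += accesses
        best.partitions.append(pid)
    return {n.name: n.partitions for n in nodes}


if __name__ == "__main__":
    nodes = [
        MemoryNode("hbm", capacity=2, bandwidth=4.0),  # small but fast
        MemoryNode("ddr", capacity=8, bandwidth=1.0),  # large but slower
    ]
    partitions = [("p0", 1, 100), ("p1", 1, 80), ("p2", 1, 10), ("p3", 1, 5)]
    print(place_partitions(partitions, nodes))
```

In this toy setup the two hottest partitions fill the small, fast node and the remainder fall back to the larger node. A real deployment would also need to account for which CPU socket issues each access, which this sketch ignores.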

Index Terms

Information systems
  Data management systems
    Database management system engines
      Database query processing
        Join algorithms

Information & Contributors

Information

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

November 2019

3373 pages

ISBN: 9781450369763

DOI: 10.1145/3357384

General Chairs:
Wenwu Zhu (Tsinghua University, China)
Dacheng Tao (University of Massachusetts, USA)
Xueqi Cheng (Institute of Computing Technology, CAS, China)

Program Chairs:
Peng Cui (Tsinghua University, China)
Elke Rundensteiner (Worcester Polytechnic Institute, USA)
David Carmel (Amazon Research, USA)
Qi He (LinkedIn, USA)
Jeffrey Xu Yu (Chinese University of Hong Kong, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Research Foundation Singapore
Research Grants Council, University Grants Committee
Ministry of Education - Singapore
Hong Kong General Research Fund
Innovation and Technology Fund

Conference

CIKM '19

CIKM '19: The 28th ACM International Conference on Information and Knowledge Management

November 3 - 7, 2019

Beijing, China

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Article Metrics

Total Citations: 6
Total Downloads: 271
Downloads (Last 12 months): 30
Downloads (Last 6 weeks): 5

Reflects downloads up to 11 Dec 2024

Citations

Cited By

Kieu-Do-Nguyen B, Dang T, The Binh N, Pham-Quoc C, Phuc Nghi H, Tran N, Inoue K, Pham C, Hoang T. (2024). A High-Performance Non-Indexed Text Search System. Electronics 13:11 (2125). Online publication date: 29-May-2024.
https://doi.org/10.3390/electronics13112125
Shi R, Kara K, Hagleitner C, Diamantopoulos D, Syrivelis D, Alonso G. (2022). Exploiting HBM on FPGAs for Data Processing. ACM Transactions on Reconfigurable Technology and Systems 15:4 (1-27). Online publication date: 9-Dec-2022.
https://dl.acm.org/doi/10.1145/3491238
Huang H, Wang Z, Zhang J, He Z, Wu C, Xiao J, Alonso G. (2022). Shuhai: A Tool for Benchmarking High Bandwidth Memory on FPGAs. IEEE Transactions on Computers 71:5 (1133-1144). Online publication date: 1-May-2022.
https://doi.org/10.1109/TC.2021.3075765
Hu Y, Du Y, Ustun E, Zhang Z. (2021). GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs. 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (1-9). Online publication date: 1-Nov-2021.
https://doi.org/10.1109/ICCAD51958.2021.9643582
Kara K, Hagleitner C, Diamantopoulos D, Syrivelis D, Alonso G. (2020). High Bandwidth Memory on FPGAs: A Data Analytics Perspective. 2020 30th International Conference on Field-Programmable Logic and Applications (FPL) (1-8). Online publication date: Aug-2020.
https://doi.org/10.1109/FPL50879.2020.00013
Wang Z, Huang H, Zhang J, Alonso G. (2020). Shuhai: Benchmarking High Bandwidth Memory On FPGAS. 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (111-119). Online publication date: May-2020.
https://doi.org/10.1109/FCCM48280.2020.00024
