[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

A Hash Table for Line-Rate Data Processing

Published: 24 March 2015 Publication History

Abstract

FPGA-based data processing is becoming increasingly relevant in data centers, as the transformation of existing applications into dataflow architectures can bring significant throughput and power benefits. Furthermore, a tighter integration of computing and network is appealing, as it overcomes traditional bottlenecks between CPUs and network interfaces, and dramatically reduces latency.
In this article, we present the design of a novel hash table, a fundamental building block used in many applications, to enable data processing on FPGAs close to the network. We present a fully pipelined design capable of sustaining consistent 10Gbps line-rate processing by deploying a concurrent mechanism to handle hash collisions. We address additional design challenges such as support for a broad range of key sizes without stalling the pipeline through careful matching of lookup time with packet reception time. Finally, the design is based on a scalable architecture that can be easily parameterized to work with different memory types operating at different access speeds and latencies.
We have tested the proposed hash table in an FPGA-based memcached appliance implementing a main-memory key-value store in hardware. The hash table is used to index 2 million entries in 24GB of external DDR3 DRAM while sustaining 13 million requests per second, the maximum packet rate that can be achieved with UDP packets on a 10Gbps link for this application.

References

[1]
Arvind Arasu, Spyros Blanas, Ken Eguro, Raghav Kaushik, Donald Kossmann, Ravi Ramamurthy, and Ramaratnam Venkatesan. 2013. Orthogonal security with Cipherbase. In Proceedings of the 6th Conference on Innovative Data Systems Research (CIDR).
[2]
Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. 2012. Workload analysis of a large-scale key-value store. In Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems. ACM, New York, NY, 53--64.
[3]
Masanori Bando, N. Sertac Artan, and H. Jonathan Chao. 2009. Flashlook: 100-Gbps hash-tuned route lookup architecture. In Proceedings of the International Conference on High Performance Switching and Routing. IEEE, Los Alamitos, CA, 1--8.
[4]
Michaela Blott, Kimon Karras, Ling Liu, Zsolt Istvan, Jeremia Baer, and Kees Vissers. 2013. Achieving 10Gbps line-rate key-value stores with FPGAs. In Proceedings of HotCloud’13: The 5th USENIX Workshop on Hot Topics in Cloud Computing.
[5]
Andrei Broder and Michael Mitzenmacher. 2001. Using multiple hash functions to improve IP lookups. In Proceedings of INFOCOM 2001: The 20th Annual Joint Conference of the IEEE Computer and Communications Societies. IEEE, Los Alamitos, CA, 1454--1463.
[6]
Andrei Z. Broder and Anna R. Karlin. 1990. Multilevel adaptive hashing. In Proceedings of the 1st Annual ACM-SIAM Symposium on Discrete Algorithms. 43--53.
[7]
Sai Rahul Chalamalasetti, Kevin Lim, Mitch Wright, Alvin AuYoung, Parthasarathy Ranganathan, and Martin Margala. 2013. An FPGA memcached appliance. In Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays. ACM, New York, NY, 245--254.
[8]
Convey. 2013. Ramping Up Web Server Memcached Capabilities with Hybrid-Core Computing. White Paper. Retrieved March 2, 2015, from http://www.conveycomputer.com/files/6113/7998/5068/CONV-13-047_MCD_whit epaper.pdf.
[9]
Christopher Dennl, Daniel Ziener, and Jürgen Teich. 2013. Acceleration of SQL Restrictions and Aggregations through FPGA-Based Dynamic Partial Reconfiguration. In Proceedings of the IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, Los Alamitos, CA, 25--28.
[10]
Brad Fitzpatrick. 2004. Distributed caching with memcached. Linux Journal 2004, 124, 72--74.
[11]
Phil Francisco. 2011. The Netezza data appliance architecture: A platform for high performance data warehousing and analytics. IBM Redbook.
[12]
Zsolt Istvan, Gustavo Alonso, Michaela Blott, and Kees Vissers. 2013. A flexible hash table design for 10Gbps key-value stores on FPGAs. In Proceedings of the 23rd International Conference on Field Programmable Logic and Applications (FPL). IEEE, Los Alamitos, CA, 1--8.
[13]
Bob Jenkins. 2006. Function for Producing 32bit Hashes for Hash Table Lookup. Retrieved March 2, 2015, from http://burtleburtle.net/bob/c/lookup3.c.
[14]
Brian W. Kernighan, Dennis M. Ritchie, and Per Ejeklint. 1988. The C Programming Language, Vol. 2. Prentice Hall, Englewood Cliffs, NJ.
[15]
Adam Kirsch, Michael Mitzenmacher, and Udi Wieder. 2009. More robust hashing: Cuckoo hashing with a stash. SIAM Journal on Computing 39, 4, 1543--1561.
[16]
Maysam Lavasani, Hari Angepat, and Derek Chiou. 2013. An FPGA-based in-line accelerator for memcached. IEEE Computer Architecture Letters 2, 1.
[17]
Memcached. 2013. Free and Open Source, High-Performance, Distributed Memory Object Caching System. Available at http://www.memcached.org/.
[18]
Rene Mueller, Jens Teubner, and Gustavo Alonso. 2009. Streams on wires: A query compiler for FPGAs. Proceedings of the VLDB Endowment 2, 1, 229--240.
[19]
Rasmus Pagh and Flemming Friche Rodler. 2004. Cuckoo hashing. Journal of Algorithms 51, 2, 122--144.
[20]
Viktor Puš and Jan Korenek. 2009. Fast and scalable packet classification using perfect hash functions. In Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays. ACM, New York, NY, 229--236.
[21]
Ioannis Sourdis, Dionisios Pnevmatikatos, Stephan Wong, and Stamatis Vassiliadis. 2005. A reconfigurable perfect-hashing scheme for packet inspection. In Proceedings of the International Conference on Field Programmable Logic and Applications. IEEE, Los Alamitos, CA, 644--647.
[22]
Nicholas Weaver, Vern Paxson, and Jose M. Gonzalez. 2007. The Shunt: An FPGA-based accelerator for network intrusion prevention. In Proceedings of the 2007 ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays. ACM, New York, NY, 199--206.
[23]
Alex Wiggins and Jimmy Langston. 2012. Enhancing the Scalability of Memcached. Retrieved March 2, 2015, from http://software.intel.com/en-us/articles/enhancing-the-scalability-of-memcached.
[24]
Louis Woods, Zsolt Istvan, and Gustavo Alonso. 2014. Ibex: An intelligent storage engine with support for advanced SQL off-loading. Proceedings of the VLDB Endowment 7, 11, 963--974.
[25]
Louis Woods, Jens Teubner, and Gustavo Alonso. 2010. Complex event detection at wire speed with FPGAs. Proceedings of the VLDB Endowment 3, 1 2, 660--669.

Cited By

View all
  • (2024)Work-in-Progress: Northcape: Embedded Real-Time Capability-Based Addressing2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)10.1109/EuroSPW61312.2024.00083(683-690)Online publication date: 8-Jul-2024
  • (2023)CostCounter: A Better Method for Collision Mitigation in Cuckoo HashingACM Transactions on Storage10.1145/359691019:3(1-24)Online publication date: 19-Jun-2023
  • (2021)AurochsProceedings of the 48th Annual International Symposium on Computer Architecture10.1109/ISCA52012.2021.00039(402-415)Online publication date: 14-Jun-2021
  • Show More Cited By

Index Terms

  1. A Hash Table for Line-Rate Data Processing

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Transactions on Reconfigurable Technology and Systems
    ACM Transactions on Reconfigurable Technology and Systems  Volume 8, Issue 2
    Special Section on FPL 2013
    April 2015
    129 pages
    ISSN:1936-7406
    EISSN:1936-7414
    DOI:10.1145/2746532
    • Editor:
    • Steve Wilton
    Issue’s Table of Contents
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 24 March 2015
    Accepted: 01 April 2014
    Revised: 01 April 2014
    Received: 01 December 2013
    Published in TRETS Volume 8, Issue 2

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. FPGAs
    2. data structures
    3. hash functions
    4. parallel architecture

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Enterprise Computing Center (ECC: http://www.ecc.ethz.ch)

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)49
    • Downloads (Last 6 weeks)2
    Reflects downloads up to 31 Dec 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Work-in-Progress: Northcape: Embedded Real-Time Capability-Based Addressing2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)10.1109/EuroSPW61312.2024.00083(683-690)Online publication date: 8-Jul-2024
    • (2023)CostCounter: A Better Method for Collision Mitigation in Cuckoo HashingACM Transactions on Storage10.1145/359691019:3(1-24)Online publication date: 19-Jun-2023
    • (2021)AurochsProceedings of the 48th Annual International Symposium on Computer Architecture10.1109/ISCA52012.2021.00039(402-415)Online publication date: 14-Jun-2021
    • (2020)Key-Value Store using High Level Synthesis Flow for Securities Trading System2020 International Conference on Computing, Electronics & Communications Engineering (iCCECE)10.1109/iCCECE49321.2020.9231158(237-242)Online publication date: 17-Aug-2020
    • (2020)FULL-KV: Flexible and Ultra-Low-Latency In-Memory Key-Value Store System Design on CPU-FPGAIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2020.297396531:8(1828-1444)Online publication date: 1-Aug-2020
    • (2019)Time-SWAD: A Dataflow Engine for Time-Based Single Window Stream Aggregation2019 International Conference on Field-Programmable Technology (ICFPT)10.1109/ICFPT47387.2019.00017(72-80)Online publication date: Dec-2019
    • (2018)Ultra-Low Latency and High Throughput Key-Value Store Systems Over Ethernet2018 IEEE International Symposium on Circuits and Systems (ISCAS)10.1109/ISCAS.2018.8351742(1-5)Online publication date: May-2018
    • (2018)Ultra-Low-Latency and Flexible In-memory Key-Value Store System Design on CPU-FPGA2018 International Conference on Field-Programmable Technology (FPT)10.1109/FPT.2018.00030(142-149)Online publication date: Dec-2018
    • (2018)Conclusions and OutlookDesign Automation Techniques for Approximation Circuits10.1007/978-3-319-98965-5_8(119-121)Online publication date: 10-Oct-2018
    • (2018)ProACt: Hardware Architecture for Cross-Layer Approximate ComputingDesign Automation Techniques for Approximation Circuits10.1007/978-3-319-98965-5_7(103-118)Online publication date: 10-Oct-2018
    • Show More Cited By

    View Options

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media