More Web Proxy on the site http://driver.im/

research-article

GPU-based NFA implementation for memory efficient high speed regular expression matching

Authors:

Qunfeng DongAuthors Info & Claims

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

Pages 129 - 140

https://doi.org/10.1145/2145816.2145833

Published: 25 February 2012 Publication History

Abstract

Regular expression pattern matching is the foundation and core engine of many network functions, such as network intrusion detection, worm detection, traffic analysis, web applications and so on. DFA-based solutions suffer exponentially exploding state space and cannot be remedied without sacrificing matching speed. Given this scalability problem of DFA-based methods, there has been increasing interest in NFA-based methods for memory efficient regular expression matching. To achieve high matching speed using NFA, it requires potentially massive parallel processing, and hence represents an ideal programming task on Graphic Processor Unit (GPU). Based on in-depth understanding of NFA properties as well as GPU architecture, we propose effective methods for fitting NFAs into GPU architecture through proper data structure and parallel programming design, so that GPU's parallel processing power can be better utilized to achieve high speed regular expression matching. Experiment results demonstrate that, compared with the existing GPU-based NFA implementation method [9], our proposed methods can boost matching speed by 29~46 times, consistently yielding above 10Gbps matching speed on NVIDIA GTX-460 GPU. Meanwhile, our design only needs a small amount of memory space, growing exponentially more slowly than DFA size. These results make our design an effective solution for memory efficient high speed regular expression matching, and clearly demonstrate the power and potential of GPU as a platform for memory efficient high speed regular expression matching.

References

[1]

PCRE - Perl Compatible Regular Expressions. http://www.pcre.org/.

[2]

Snort intrusion detection system. http://www.snort.org/.

[3]

S. S. Baghsorkhi, M. Delahaye, S. J. Patel, W. D. Gropp, and W. mei W. Hwu. An adaptive performance modeling tool for GPU architectures. In Proceedings of ACM PPoPP, 2010.

Digital Library

[4]

M. Becchi and P. Crowley. A hybrid finite automaton for practical deep packet inspection. In Proceedings of CoNext, 2007.

Digital Library

[5]

M. Becchi and P. Crowley. An improved algorithm to accelerate regular expression evaluation. In Proceedings of ANCS, 2007.

Digital Library

[6]

M. Becchi and P. Crowley. Efficient regular expression evaluation: Theory to practice. In Proceedings of ANCS, 2008.

Digital Library

[7]

M. Becchi and P. Crowley. Extending finite automata to efficient match perl-compatible regular expressions. In Proceedings of CoNext, 2008.

Digital Library

[8]

M. Becchi, M. Franklin, and P. Crowley. A workload for evaluating deep packet inspection architectures. In Proceedings of IISWC, 2008.

[9]

N. Cascarano, P. Rolando, F. Risso, and R. Sisto. iNFAnt: NFA pattern matching on GPGPU devices. SIGCOMM CCR, 40 (5): 21--26, 2010.

Digital Library

[10]

M. Chen. TCAM-based high speed regular expression matching. Bachelor thesis, Institute of Networked Systems (IONS), University of Science and Technology of China, June 2010.

[11]

M. Chen, Q. Dong, and K. Peng. TCAM-based DFA implementation: A novel approach to efficient regular expression matching. Technical report, Institute of Networked Systems (IONS), University of Science and Technology of China.

[12]

J. W. Choi, A. Singh, and R. W. Vuduc. Model-driven autotuning of sparse matrix-vector multiply on GPUs. In Proceedings of ACM PPoPP, 2010.

Digital Library

[13]

Y. Dotsenko, S. S. Baghsorkhi, B. Lloyd, and N. K. Govindaraju. Auto-tuning of fast fourier transform on graphics processors. In Proceedings of PPOPP, 2011.

Digital Library

[14]

S. Hong, S. K. Kim, T. Oguntebi, and K. Olukotun. Accelerating CUDA graph algorithms at maximum warp. In Proceedings of ACM PPoPP, 2011.

Digital Library

[15]

J. Kim, H. Kim, J. H. Lee, and J. Lee. Achieving a single compute device image in OpenCL for multiple GPUs. In Proceedings of ACM PPoPP, 2011.

Digital Library

[16]

S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese. Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia. In Proceedings of ANCS, 2007.

Digital Library

[17]

S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. Turner. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In Proceedings of ACM SIGCOMM, 2006.

Digital Library

[18]

S. Lee, S.-J. Min, and R. Eigenmann. OpenMP to GPGPU: a compiler framework for automatic translation and optimization. In Proceedings of ACM PPoPP, 2009.

Digital Library

[19]

C. R. Meiners, J. Patel, E. Norige, E. Torng, and A. X. Liu. Fast regular expression matching using small TCAMs for network intrusion detection and prevention systems. In Proceedings of USENIX Security, August 2010.

Digital Library

[20]

Peng, Dong, and Chen}peng2011iwqosK. Peng, Q. Dong, and M. Chen. TCAM-based DFA deflation: A novel approach to fast and scalable regular expression matching. In Proceedings of ACM/IEEE IWQoS, 2011.

Digital Library

[21]

K. Peng, S. Tang, Q. Dong, and M. Chen. Chain-based DFA deflation for fast and scalable regular expression matching using TCAM. In Proceedings of ANCS, 2011.

Digital Library

[22]

E. F. O. Sandes and A. C. M. de Melo. CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences. In Proceedings of ACM PPoPP, 2010.

Digital Library

[23]

R. Smith, C. Estan, and S. Jha. XFA: Faster signature matching with extended automata. In Proceedings of IEEE Symposium on Security and Privacy, 2008.

Digital Library

[24]

R. Smith, C. Estan, S. Jha, and S. Kong. Deflating the big bang: fast and scalable deep packet inspection with extended finite automata. In Proceedings of ACM SIGCOMM, 2008.

Digital Library

[25]

R. Smith, N. Goyal, J. Ormont, K. Sankaralingam, and C. Estan. Evaluating GPUs for network packet signature matching. In Proceedings of ISPASS, 2009.

[26]

G. Vasiliadis, S. Antonatos, M. Polychronakis, E. P. Markatos, and S. Ioannidis. Gnort: High performance network intrusion detection using graphics processors. In Proceedings of RAID, 2008.

Digital Library

[27]

G. Vasiliadis, M. Polychronakis, S. Antonatos, E. P. Markatos, and S. Ioannidis. Regular expression matching on graphics hardware for intrusion detection. In Proceedings of RAID, 2009.

Digital Library

[28]

F. Yu, Z. Chen, Y. Diao, T. V. Lakshman, and R. H. Katz. Fast and memory-efficient regular expression matching for deep packet inspection. In Proceedings of ANCS, 2006.

Digital Library

[29]

Y. Zhang, J. Cohen, and J. D. Owens. Fast tridiagonal solvers on the GPU. In Proceedings of ACM PPoPP, 2010.

Digital Library

[30]

M. Zheng, V. T. Ravi, F. Qin, and G. Agrawal. GRace: a low-overhead mechanism for detecting data races in GPU programs. In Proceedings of ACM PPoPP, 2011.

Digital Library

Cited By

Le Glaunec AKong LMamouras K(2024)HybridSA: GPU Acceleration of Multi-pattern Regex Matching using Bit ParallelismProceedings of the ACM on Programming Languages10.1145/36897718:OOPSLA2(1699-1728)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689771
Nguyen TBecchi M(2024)A Transducers-based Programming Framework for Efficient Data TransformationProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676891(66-77)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3656019.3676891
Ge TZhang TLiu HTsafrir DMUSUVATHI MGupta RAbu-Ghazaleh N(2024)ngAP: Non-blocking Large-scale Automata Processing on GPUsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 110.1145/3617232.3624848(268-285)Online publication date: 27-Apr-2024
https://dl.acm.org/doi/10.1145/3617232.3624848
Show More Cited By

Index Terms

GPU-based NFA implementation for memory efficient high speed regular expression matching

Recommendations

GPU acceleration of regular expression matching for large datasets: exploring the implementation space
CF '13: Proceedings of the ACM International Conference on Computing Frontiers

Regular expression matching is a central task in several networking (and search) applications and has been accelerated on a variety of parallel architectures, including general purpose multi-core processors, network processors, field programmable gate ...
GPU-based NFA implementation for memory efficient high speed regular expression matching
PPOPP '12

Regular expression pattern matching is the foundation and core engine of many network functions, such as network intrusion detection, worm detection, traffic analysis, web applications and so on. DFA-based solutions suffer exponentially exploding state ...
Exploring different automata representations for efficient regular expression matching on GPUs
PPoPP '13: Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming

Regular expression matching is a central task in several networking (and search) applications and has been accelerated on a variety of parallel architectures. All solutions are based on finite automata (either in deterministic or non-deterministic form),...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

February 2012

352 pages

ISBN:9781450311601

DOI:10.1145/2145816

General Chair:
J. Ramanujam
Louisiana State University, USA
,
Program Chair:
P. Sadayappan
The Ohio State University, USA

ACM SIGPLAN Notices Volume 47, Issue 8
PPOPP '12
August 2012
334 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2370036
Issue’s Table of Contents

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 February 2012

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

PPoPP '12

Sponsor:

SIGPLAN

PPoPP '12: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 25 - 29, 2012

Louisiana, New Orleans, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

92
Total Citations
View Citations
1,120
Total Downloads

Downloads (Last 12 months)45
Downloads (Last 6 weeks)6

Reflects downloads up to 03 Jan 2025

Other Metrics

View Author Metrics

Citations

Cited By

Le Glaunec AKong LMamouras K(2024)HybridSA: GPU Acceleration of Multi-pattern Regex Matching using Bit ParallelismProceedings of the ACM on Programming Languages10.1145/36897718:OOPSLA2(1699-1728)Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689771
Nguyen TBecchi M(2024)A Transducers-based Programming Framework for Efficient Data TransformationProceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques10.1145/3656019.3676891(66-77)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3656019.3676891
Ge TZhang TLiu HTsafrir DMUSUVATHI MGupta RAbu-Ghazaleh N(2024)ngAP: Non-blocking Large-scale Automata Processing on GPUsProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 110.1145/3617232.3624848(268-285)Online publication date: 27-Apr-2024
https://dl.acm.org/doi/10.1145/3617232.3624848
Cicolini LCarloni FSantambrogio MConficconi DGrosser TDubach CSteuwer MXue JOttoni GQuintão Pereira F(2024)One Automaton to Rule Them All: Beyond Multiple Regular Expressions ExecutionProceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization10.1109/CGO57630.2024.10444810(193-206)Online publication date: 2-Mar-2024
https://dl.acm.org/doi/10.1109/CGO57630.2024.10444810
Liu HPai SJog A(2023)Asynchronous Automata Processing on GPUsProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/35794537:1(1-27)Online publication date: 2-Mar-2023
https://dl.acm.org/doi/10.1145/3579453
Kong LYu QChattopadhyay ALe Glaunec AHuang YMamouras KYang KJhala RDillig I(2022)Software-hardware codesign for efficient in-memory regular pattern matchingProceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3519939.3523456(733-748)Online publication date: 9-Jun-2022
https://dl.acm.org/doi/10.1145/3519939.3523456
Wang YWatling RQiu JWang Z(2022)GSpecPal: Speculation-Centric Finite State Machine Parallelization on GPUs2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS53621.2022.00053(481-491)Online publication date: May-2022
https://doi.org/10.1109/IPDPS53621.2022.00053
Nguyen TBecchi M(2022)A GPU-accelerated Data Transformation Framework Rooted in Pushdown Transducers2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC)10.1109/HiPC56025.2022.00038(215-225)Online publication date: Dec-2022
https://doi.org/10.1109/HiPC56025.2022.00038
Qiu JSun XSabet AZhao ZSherwood TBerger EKozyrakis C(2021)Scalable FSM parallelization via path fusion and higher-order speculationProceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3445814.3446705(887-901)Online publication date: 19-Apr-2021
https://dl.acm.org/doi/10.1145/3445814.3446705
Liu HPai SJog ALarus JCeze LStrauss K(2020)Why GPUs are Slow at Executing NFAs and How to Make them FasterProceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3373376.3378471(251-265)Online publication date: 9-Mar-2020
https://dl.acm.org/doi/10.1145/3373376.3378471
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents