research-article

Efficient software packet processing on heterogeneous and asymmetric hardware architectures

Authors:

Lazaros Koromilas,

Giorgos Vasiliadis,

Ioannis Manousakis,

Sotiris Ioannidis

ANCS '14: Proceedings of the tenth ACM/IEEE symposium on Architectures for networking and communications systems

Pages 207 - 218

https://doi.org/10.1145/2658260.2658265

Published: 20 October 2014

Abstract

Heterogeneous and asymmetric computing systems are composed of a set of different processing units, each with its own unique performance and energy characteristics. Still, the majority of current network packet processing frameworks target only a single device (the CPU or some accelerator), leaving other processing resources idle. In this paper, we propose an adaptive scheduling approach that supports heterogeneous and asymmetric hardware, tailored for network packet processing applications. Our scheduler is able to respond quickly to dynamic performance fluctuations that occur in real time, such as traffic bursts, application overloads, and system changes. The experimental results show that our system is able to match the peak throughput of a diverse set of packet processing workloads, while consuming up to 3.5x less energy.
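As a rough illustration of the idea in the abstract, the sketch below shows one way an energy-aware scheduler could choose among heterogeneous devices. It is not the authors' algorithm: the `Device` and `AdaptiveScheduler` names, the exponential smoothing of throughput measurements, and the "lowest power that still sustains the offered load" policy are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical per-device profile; the fields are illustrative, not from the paper."""
    name: str
    throughput_gbps: float  # smoothed estimate of sustained throughput
    power_watts: float      # assumed average power draw under load


class AdaptiveScheduler:
    """Pick the least power-hungry device that can still sustain the offered load."""

    def __init__(self, devices, alpha=0.5):
        self.devices = {d.name: d for d in devices}
        self.alpha = alpha  # weight given to the newest throughput measurement

    def observe(self, name, measured_gbps):
        # Exponentially weighted moving average, so the estimate follows
        # fluctuations such as traffic bursts or an overloaded device.
        d = self.devices[name]
        d.throughput_gbps = (
            self.alpha * measured_gbps + (1 - self.alpha) * d.throughput_gbps
        )

    def choose(self, offered_load_gbps):
        candidates = list(self.devices.values())
        feasible = [d for d in candidates if d.throughput_gbps >= offered_load_gbps]
        if feasible:
            # Among devices that keep up with the load, prefer the lowest power.
            return min(feasible, key=lambda d: d.power_watts).name
        # No device keeps up: fall back to the fastest one available.
        return max(candidates, key=lambda d: d.throughput_gbps).name


# Made-up numbers for three hypothetical devices.
scheduler = AdaptiveScheduler([
    Device("cpu", throughput_gbps=10.0, power_watts=80.0),
    Device("igpu", throughput_gbps=6.0, power_watts=20.0),
    Device("dgpu", throughput_gbps=20.0, power_watts=150.0),
])
light_load_choice = scheduler.choose(4.0)
scheduler.observe("igpu", 2.0)  # the integrated GPU slows down
after_slowdown_choice = scheduler.choose(5.0)
```

With these made-up profiles, a light load is assigned to the low-power device, and the choice moves to another device once the smoothed estimate for the first one falls below the offered load. A real system would also have to account for batching, data-transfer costs, and splitting work across several devices at once, none of which this sketch models.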

Index Terms

Efficient software packet processing on heterogeneous and asymmetric hardware architectures
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Heterogeneous (hybrid) systems
2. Networks
  1. Network services

Information & Contributors

Information

Published In

ANCS '14: Proceedings of the tenth ACM/IEEE symposium on Architectures for networking and communications systems

October 2014

274 pages

ISBN:9781450328395

DOI:10.1145/2658260

General Chair: Viktor K. Prasanna (University of Southern California, USA)

Program Chairs: Gordon Brebner (Xilinx, USA), Isaac Keslassy (Technion, Israel)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

ANCS '14

ANCS '14: Symposium on Architectures for Networking and Communications Systems

October 20 - 21, 2014

Los Angeles, California, USA

Acceptance Rates

ANCS '14 Paper Acceptance Rate 19 of 57 submissions, 33%;

Overall Acceptance Rate 88 of 314 submissions, 28%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 15
Total Downloads: 231
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 1

Reflects downloads up to 05 Mar 2025
Cited By

Vasiliadis G, Tsirbas R, Ioannidis S (2022). The Best of Many Worlds: Scheduling Machine Learning Inference on CPU-GPU Integrated Architectures. 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 55-64. Online publication date: May-2022.
https://doi.org/10.1109/IPDPSW55747.2022.00017
Giakoumakis G, Papadogiannaki E, Vasiliadis G, Ioannidis S (2022). Scheduling of multiple network packet processing applications using Pythia. Computer Networks, 212 (109006). Online publication date: Jul-2022.
https://doi.org/10.1016/j.comnet.2022.109006
Yi X, Wang J, Duan J, Bai W, Wu C, Xiong Y, Han D (2019). FlowShader: a Generalized Framework for GPU-accelerated VNF Flow Processing. 2019 IEEE 27th International Conference on Network Protocols (ICNP), pp. 1-12. Online publication date: Oct-2019.
https://doi.org/10.1109/ICNP.2019.8888129
Dayarathna M, Perera S (2018). Recent Advancements in Event Processing. ACM Computing Surveys, 51:2, pp. 1-36. Online publication date: 13-Feb-2018.
https://dl.acm.org/doi/10.1145/3170432
Hu Y, Li T (2018). Enabling Efficient Network Service Function Chain Deployment on Heterogeneous Server Platform. 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 27-39. Online publication date: Feb-2018.
https://doi.org/10.1109/HPCA.2018.00013
Papadogiannaki E, Koromilas L, Vasiliadis G, Ioannidis S (2017). Efficient Software Packet Processing on Heterogeneous and Asymmetric Hardware Architectures. IEEE/ACM Transactions on Networking, 25:3, pp. 1593-1606. Online publication date: 1-Jun-2017.
https://dl.acm.org/doi/10.1109/TNET.2016.2642338
Vasiliadis G, Koromilas L, Polychronakis M, Ioannidis S (2017). Design and Implementation of a Stateful Network Packet Processing Framework for GPUs. IEEE/ACM Transactions on Networking, 25:1, pp. 610-623. Online publication date: 1-Feb-2017.
https://dl.acm.org/doi/10.1109/TNET.2016.2597163
Khazankin G, Komarov S, Kovalev D, Barsegyan A, Likhachev A (2017). System architecture for deep packet inspection in high-speed networks. 2017 Siberian Symposium on Data Science and Engineering (SSDSE), pp. 27-32. Online publication date: Apr-2017.
https://doi.org/10.1109/SSDSE.2017.8071958
Trevisan M, Finamore A, Mellia M, Munafo M, Rossi D (2017). Traffic Analysis with Off-the-Shelf Hardware. IEEE Communications Magazine, 55:3, pp. 163-169. Online publication date: 1-Mar-2017.
https://dl.acm.org/doi/10.1109/MCOM.2017.1600756CM
Papadopoulos P, Vasiliadis G, Christou G, Markatos E, Ioannidis S (2017). No Sugar but All the Taste! Memory Encryption Without Architectural Support. Computer Security – ESORICS 2017, pp. 362-380. Online publication date: 12-Aug-2017.
https://doi.org/10.1007/978-3-319-66399-9_20
