
Offloading Machine Learning to Programmable Data Planes: A Systematic Survey

Published: 26 August 2023

Abstract

The demand for machine learning (ML) has grown significantly in recent decades, enabling applications such as speech recognition, computer vision, and recommendation engines. As applications become more sophisticated, trained models grow more complex and require ever larger amounts of training data. Several domain-specific techniques help scale machine learning to large datasets and more complex models. Of particular interest among them is offloading machine learning functionality to the network infrastructure, enabled by emerging programmable data plane hardware such as SmartNICs and programmable switches. Consequently, offloading machine learning to programmable network hardware has attracted considerable attention from the research community in recent years. This survey studies programmable data planes applied to machine learning and highlights how in-network computing helps accelerate machine learning applications. We introduce the relevant concepts and propose a taxonomy to classify existing research. We then systematically review the literature that offloads machine learning functionality to programmable data plane devices, classifying it according to the proposed taxonomy. Finally, we discuss open challenges in the field and suggest directions for future research.
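
As a concrete illustration of the offloading idea, several lines of work surveyed here compile a trained model, most commonly a decision tree, into match-action table entries so that the data plane classifies each packet with plain table lookups instead of floating-point arithmetic. The Python sketch below mimics that mapping; it is a minimal illustration under stated assumptions, not code from any surveyed system, and the features (packet length and inter-arrival time), thresholds, and labels are all hypothetical.

    # Minimal sketch (hypothetical feature names and thresholds): a trained
    # decision tree is compiled offline into range-match table entries, so a
    # programmable data plane can classify packets with lookups alone.
    from dataclasses import dataclass

    @dataclass
    class Entry:
        pkt_len: range   # range match on packet length (bytes)
        iat_us: range    # range match on inter-arrival time (microseconds)
        label: str       # classification decision carried as packet metadata

    # Offline "compilation" of a toy two-feature decision tree:
    #   if pkt_len < 128:    -> "control"
    #   elif iat_us < 1000:  -> "bursty"
    #   else:                -> "bulk"
    TABLE = [
        Entry(range(0, 128), range(0, 1 << 32), "control"),
        Entry(range(128, 1 << 16), range(0, 1000), "bursty"),
        Entry(range(128, 1 << 16), range(1000, 1 << 32), "bulk"),
    ]

    def classify(pkt_len: int, iat_us: int) -> str:
        """Emulate the data-plane lookup: the first matching entry wins,
        mirroring entry priority in a hardware match-action table."""
        for entry in TABLE:
            if pkt_len in entry.pkt_len and iat_us in entry.iat_us:
                return entry.label
        return "default"  # table miss falls back to a default action

    assert classify(64, 50) == "control"
    assert classify(1500, 200) == "bursty"
    assert classify(1500, 5000) == "bulk"

In a real deployment, the control plane would install the equivalent range-match entries through the switch's runtime interface, and entry priority would enforce the first-match semantics emulated above.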

      Information

      Published In

      ACM Computing Surveys, Volume 56, Issue 1
      January 2024
      918 pages
      EISSN: 1557-7341
      DOI: 10.1145/3613490

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 26 August 2023
      Online AM: 18 June 2023
      Accepted: 31 May 2023
      Revised: 12 May 2023
      Received: 08 September 2022
      Published in CSUR Volume 56, Issue 1


      Author Tags

      1. In-network computing
      2. Programmable NICs
      3. Programmable switches
      4. Offloading
      5. ML training
      6. ML inference

      Qualifiers

      • Survey

      Funding Sources

      • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES)
      • CNPq
      • FAPERGS
      • FAPESP

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 1,412
      • Downloads (Last 6 weeks): 148
      Reflects downloads up to 11 Dec 2024

      Cited By
      • (2024) A Machine Learning-Based Toolbox for P4 Programmable Data-Planes. IEEE Transactions on Network and Service Management 21:4, 4450-4465. DOI: 10.1109/TNSM.2024.3402074. Online publication date: Aug-2024.
      • (2024) Towards Data-Driven Management of Mobile Networks through User Plane Inference. NOMS 2024 - 2024 IEEE Network Operations and Management Symposium, 1-4. DOI: 10.1109/NOMS59830.2024.10575655. Online publication date: 6-May-2024.
      • (2024) Spinner: Enabling In-network Flow Clustering Entirely in a Programmable Data Plane. NOMS 2024 - 2024 IEEE Network Operations and Management Symposium, 1-9. DOI: 10.1109/NOMS59830.2024.10575413. Online publication date: 6-May-2024.
      • (2024) Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, 1271-1280. DOI: 10.1109/INFOCOM52122.2024.10621376. Online publication date: 20-May-2024.
      • (2024) Jewel: Resource-Efficient Joint Packet and Flow Level Inference in Programmable Switches. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, 1631-1640. DOI: 10.1109/INFOCOM52122.2024.10621365. Online publication date: 20-May-2024.
      • (2024) In-network Reinforcement Learning for Attack Mitigation using Programmable Data Plane in SDN. 2024 33rd International Conference on Computer Communications and Networks (ICCCN), 1-9. DOI: 10.1109/ICCCN61486.2024.10637621. Online publication date: 29-Jul-2024.
      • (2024) A Comprehensive Survey on SmartNICs: Architectures, Development Models, Applications, and Research Directions. IEEE Access 12, 107297-107336. DOI: 10.1109/ACCESS.2024.3437203. Online publication date: 2024.
      • (2024) Practicality of in-kernel/user-space packet processing empowered by lightweight neural network and decision tree. Computer Networks 240:C. DOI: 10.1016/j.comnet.2024.110188. Online publication date: 16-May-2024.
      • (2024) Real-time diabetic foot ulcer classification based on deep learning & parallel hardware computational tools. Multimedia Tools and Applications 83:27, 70369-70394. DOI: 10.1007/s11042-024-18304-x. Online publication date: 3-Feb-2024.
      • (2024) No Worker Left (Too Far) Behind: Dynamic Hybrid Synchronization for In-Network ML Aggregation. International Journal of Network Management 35:1. DOI: 10.1002/nem.2290. Online publication date: 24-Jul-2024.
