
Offloading Machine Learning to Programmable Data Planes: A Systematic Survey

Published: 26 August 2023

Abstract

The demand for machine learning (ML) has grown significantly in recent decades, enabling applications such as speech recognition, computer vision, and recommendation engines. As applications become more sophisticated, trained models grow more complex and require ever larger amounts of training data. Several domain-specific techniques help scale machine learning to large datasets and more complex models. Of particular interest among them is offloading machine learning functionality to the network infrastructure, enabled by emerging programmable data plane hardware such as SmartNICs and programmable switches. Consequently, offloading machine learning to programmable network hardware has attracted considerable attention from the research community in recent years. This survey studies programmable data planes applied to machine learning and highlights how in-network computing helps accelerate machine learning applications. We introduce the relevant concepts and propose a taxonomy to classify existing research. We then systematically review the literature that offloads machine learning functionality to programmable data plane devices, classifying it according to the proposed taxonomy. Finally, we discuss open challenges in the field and suggest directions for future research.
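
As a concrete illustration of the offloading idea, several lines of work surveyed here compile a trained model, most commonly a decision tree, into match-action table entries so that the data plane classifies each packet with plain table lookups instead of floating-point arithmetic. The Python sketch below mimics that mapping; it is a minimal illustration under stated assumptions, not code from any surveyed system, and the features (packet length and inter-arrival time), thresholds, and labels are all hypothetical.

    # Minimal sketch (hypothetical feature names and thresholds): a trained
    # decision tree is compiled offline into range-match table entries, so a
    # programmable data plane can classify packets with lookups alone.
    from dataclasses import dataclass

    @dataclass
    class Entry:
        pkt_len: range   # range match on packet length (bytes)
        iat_us: range    # range match on inter-arrival time (microseconds)
        label: str       # classification decision carried as packet metadata

    # Offline "compilation" of a toy two-feature decision tree:
    #   if pkt_len < 128:    -> "control"
    #   elif iat_us < 1000:  -> "bursty"
    #   else:                -> "bulk"
    TABLE = [
        Entry(range(0, 128), range(0, 1 << 32), "control"),
        Entry(range(128, 1 << 16), range(0, 1000), "bursty"),
        Entry(range(128, 1 << 16), range(1000, 1 << 32), "bulk"),
    ]

    def classify(pkt_len: int, iat_us: int) -> str:
        """Emulate the data-plane lookup: the first matching entry wins,
        mirroring entry priority in a hardware match-action table."""
        for entry in TABLE:
            if pkt_len in entry.pkt_len and iat_us in entry.iat_us:
                return entry.label
        return "default"  # table miss falls back to a default action

    assert classify(64, 50) == "control"
    assert classify(1500, 200) == "bursty"
    assert classify(1500, 5000) == "bulk"

In a real deployment, the control plane would install the equivalent range-match entries through the switch's runtime interface, and entry priority would enforce the first-match semantics emulated above.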

      Information

      Published In

      ACM Computing Surveys, Volume 56, Issue 1
      January 2024
      918 pages
      EISSN: 1557-7341
      DOI: 10.1145/3613490

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 26 August 2023
      Online AM: 18 June 2023
      Accepted: 31 May 2023
      Revised: 12 May 2023
      Received: 08 September 2022
      Published in CSUR Volume 56, Issue 1


      Author Tags

      1. In-network computing
      2. Programmable NICs
      3. Programmable switches
      4. Offloading
      5. ML training
      6. ML inference

      Qualifiers

      • Survey

      Funding Sources

      • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES)
      • CNPq
      • FAPERGS
      • FAPESP

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months): 1,412
      • Downloads (Last 6 weeks): 148
      Reflects downloads up to 11 Dec 2024

      Cited By
      • (2024) A Machine Learning-Based Toolbox for P4 Programmable Data-Planes. IEEE Transactions on Network and Service Management 21:4, 4450-4465. DOI: 10.1109/TNSM.2024.3402074. Online publication date: Aug-2024.
      • (2024) Towards Data-Driven Management of Mobile Networks through User Plane Inference. NOMS 2024 - 2024 IEEE Network Operations and Management Symposium, 1-4. DOI: 10.1109/NOMS59830.2024.10575655. Online publication date: 6-May-2024.
      • (2024) Spinner: Enabling In-network Flow Clustering Entirely in a Programmable Data Plane. NOMS 2024 - 2024 IEEE Network Operations and Management Symposium, 1-9. DOI: 10.1109/NOMS59830.2024.10575413. Online publication date: 6-May-2024.
      • (2024) Expediting In-Network Federated Learning by Voting-Based Consensus Model Compression. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, 1271-1280. DOI: 10.1109/INFOCOM52122.2024.10621376. Online publication date: 20-May-2024.
      • (2024) Jewel: Resource-Efficient Joint Packet and Flow Level Inference in Programmable Switches. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, 1631-1640. DOI: 10.1109/INFOCOM52122.2024.10621365. Online publication date: 20-May-2024.
      • (2024) In-network Reinforcement Learning for Attack Mitigation using Programmable Data Plane in SDN. 2024 33rd International Conference on Computer Communications and Networks (ICCCN), 1-9. DOI: 10.1109/ICCCN61486.2024.10637621. Online publication date: 29-Jul-2024.
      • (2024) A Comprehensive Survey on SmartNICs: Architectures, Development Models, Applications, and Research Directions. IEEE Access 12, 107297-107336. DOI: 10.1109/ACCESS.2024.3437203. Online publication date: 2024.
      • (2024) Practicality of in-kernel/user-space packet processing empowered by lightweight neural network and decision tree. Computer Networks 240:C. DOI: 10.1016/j.comnet.2024.110188. Online publication date: 16-May-2024.
      • (2024) Real-time diabetic foot ulcer classification based on deep learning & parallel hardware computational tools. Multimedia Tools and Applications 83:27, 70369-70394. DOI: 10.1007/s11042-024-18304-x. Online publication date: 3-Feb-2024.
      • (2024) No Worker Left (Too Far) Behind: Dynamic Hybrid Synchronization for In-Network ML Aggregation. International Journal of Network Management 35:1. DOI: 10.1002/nem.2290. Online publication date: 24-Jul-2024.
