research-article

Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing

Authors:

Xu Chen

IEEE/ACM Transactions on Networking, Volume 32, Issue 5

Pages 4414 - 4426

https://doi.org/10.1109/TNET.2024.3421356

Published: 16 July 2024

Abstract

Collaborative Edge Computing (CEC) is an emerging paradigm that organizes heterogeneous edge devices into a resource pool to compute DNN inference tasks, such as edge video analytics, in proximity. However, existing works mainly focus on workload routing strategies among edge devices, treating routing as the key knob for improving network utility in CEC and aiming to minimize the routing cost. The joint optimization of workload allocation and routing from a system perspective remains an open question. To this end, this paper presents a holistic, learned optimization for CEC that maximizes the total network utility in an online manner, even though the utility functions of task input rates are unknown a priori. In particular, we characterize the CEC system with a flow model and formulate an online learning problem in the form of a cross-layer optimization. We propose a nested-loop algorithm that solves workload allocation and distributed routing iteratively, using the tools of gradient sampling and online mirror descent. To improve the convergence rate over the nested-loop version, we further devise a single-loop algorithm. Rigorous analysis is provided to show its inherent convexity, efficient convergence, and algorithmic optimality. Finally, extensive numerical simulations demonstrate the superior performance of our solutions.
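
The abstract names gradient sampling and online mirror descent as the tools behind the allocation step but does not show the algorithm itself. The following is only a minimal sketch of that general idea, not the paper's method: it assumes a toy concave utility (hidden_utility, with made-up weights), a single total-rate budget (capacity) in place of the CEC flow model, a central finite-difference estimate standing in for gradient sampling, and the entropic (exponentiated-gradient) form of mirror descent run as ascent. All function and parameter names are hypothetical.

```python
import math


def estimate_gradient(utility, x, delta=1e-3):
    """Estimate the gradient from utility evaluations only (central differences).

    The optimizer never sees a closed form of the utility, mirroring the
    setting where utility functions are unknown a priori.
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += delta
        down[i] -= delta
        grad.append((utility(up) - utility(down)) / (2.0 * delta))
    return grad


def mirror_ascent_step(x, grad, step, capacity):
    """One entropic mirror ascent step on the set {x >= 0, sum(x) = capacity}.

    With the negative-entropy mirror map this is a multiplicative update
    followed by renormalization, so the iterate always stays feasible.
    """
    weights = [xi * math.exp(step * gi) for xi, gi in zip(x, grad)]
    total = sum(weights)
    return [capacity * w / total for w in weights]


def allocate_rates(utility, n, capacity, rounds=500, step=0.5):
    """Repeatedly query the utility, estimate its gradient, and update the rates."""
    x = [capacity / n] * n  # start from the uniform allocation
    for _ in range(rounds):
        grad = estimate_gradient(utility, x)
        x = mirror_ascent_step(x, grad, step, capacity)
    return x


def hidden_utility(x):
    """Toy concave utility (assumption): weighted sum of log(1 + rate)."""
    weights = [1.0, 2.0, 3.0]
    return sum(w * math.log(1.0 + xi) for w, xi in zip(weights, x))


rates = allocate_rates(hidden_utility, n=3, capacity=6.0)
print([round(r, 3) for r in rates])
```

For this toy utility the optimality conditions give rates of 0.5, 2.0, and 3.5, and the iterates settle close to those values. The multiplicative update is used here because it keeps every rate positive and the budget satisfied without a separate projection step.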

Published In

IEEE/ACM Transactions on Networking, Volume 32, Issue 5

Oct. 2024

897 pages

1063-6692 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher

IEEE Press
