research-article

Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing

Published: 16 July 2024

Abstract

Collaborative Edge Computing (CEC) is an emerging paradigm that pools heterogeneous edge devices into a shared resource pool to run DNN inference tasks close to the data source, for example in edge video analytics. Existing work treats workload routing among edge devices as the key knob for improving network utility in CEC and mainly aims to minimize routing cost; the joint optimization of workload allocation and routing from a system perspective remains an open question. To this end, this paper presents a holistic, learning-based optimization for CEC that maximizes the total network utility in an online manner, even though the utility functions of the task input rates are unknown a priori. In particular, we characterize the CEC system with a flow model and formulate an online learning problem in the form of a cross-layer optimization. We propose a nested-loop algorithm that solves workload allocation and distributed routing iteratively, using gradient sampling and online mirror descent. To improve the convergence rate over the nested-loop version, we further devise a single-loop algorithm. Rigorous analysis establishes the inherent convexity of the problem, the efficient convergence of the algorithms, and their optimality. Finally, extensive numerical simulations demonstrate the superior performance of our solutions.
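To make the nested-loop idea concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes two task sources whose utility is available only through function queries, three parallel paths with a quadratic congestion cost, and hand-picked step sizes. All of these, and every function name, are hypothetical choices for illustration. The inner loop updates routing fractions with entropic online mirror descent (exponentiated gradient) on the simplex. The outer loop updates input rates by projected ascent along a two-point sampled gradient, a lower-variance variant of gradient sampling chosen here for stability.

```python
import numpy as np

def sampled_gradient(f, x, delta, rng):
    """Two-point zeroth-order gradient estimate of f at x (f is only queried)."""
    u = rng.normal(size=x.shape)
    u /= np.linalg.norm(u)  # random unit direction
    slope = (f(x + delta * u) - f(x - delta * u)) / (2.0 * delta)
    return x.size * slope * u

def omd_simplex_step(p, grad, eta):
    """One entropic mirror-descent (exponentiated-gradient) step on the simplex."""
    w = p * np.exp(-eta * (grad - grad.min()))  # shift keeps exponents <= 0
    return w / w.sum()

def routing_cost(r, p, c):
    """Assumed congestion cost: path i carries flow r * p[i] at cost c[i] * flow**2."""
    return float(np.sum(c * (r * p) ** 2))

def nested_loop(utility, c, x0, outer=300, inner=20,
                eta_x=0.1, eta_p=0.2, delta=0.05, x_max=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)          # input rates, one per task source
    p = np.full(len(c), 1.0 / len(c))      # routing fractions, start uniform
    for _ in range(outer):
        r = x.sum()
        # Inner loop: route the current total rate r (cost gradient is known).
        for _ in range(inner):
            p = omd_simplex_step(p, 2.0 * c * r**2 * p, eta_p)
        # Outer step: ascend the net utility using only function queries.
        net = lambda z, p=p: utility(z) - routing_cost(z.sum(), p, c)
        g = sampled_gradient(net, x, delta, rng)
        x = np.clip(x + eta_x * g, 0.0, x_max)  # projection onto the rate box
    return x, p

def toy_utility(x):
    """Stand-in for the unknown utility; the solver never sees its formula."""
    return float(np.sum(np.log1p(x)))

c = np.array([1.0, 2.0, 4.0])
x, p = nested_loop(toy_utility, c, x0=[0.1, 0.1])
print("rates:", x, "routing fractions:", p)
```

In this toy model the routing fractions should settle near values inversely proportional to the path cost coefficients, since that equalizes the marginal cost across paths. A single-loop variant would interleave one routing step with each rate step instead of running the inner loop to near convergence.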


Published In

IEEE/ACM Transactions on Networking, Volume 32, Issue 5
Oct. 2024
897 pages

Publisher

IEEE Press
