Abstract
This paper considers distributed “nonsmooth+nonsmooth” composite optimization problems in which \(n\) agents collaboratively minimize the sum of their local objective functions over directed networks. In particular, we focus on scenarios where the sought solutions are desired to possess structural properties, e.g., sparsity. However, to ensure convergence, most existing methods output an ergodic solution produced by an averaging scheme, which destroys the desired structural properties of the output. To address this issue, we develop a new decentralized stochastic proximal gradient method, termed DSPG, whose nonergodic (last) iterate serves as the output. We show that DSPG achieves a nonergodic convergence rate of \(O(\log(T)/\sqrt{T})\) for generally convex objective functions and \(O(\log(T)/T)\) for strongly convex objective functions. When the structure-enhancing regularization is absent and simple or suffix averaging schemes are used, the convergence rates of DSPG reach \(O(1/\sqrt{T})\) for generally convex objective functions and \(O(1/T)\) for strongly convex objective functions, improving on the \(O(\log(T)/\sqrt{T})\) and \(O(\log(T)/T)\) rates of existing methods. Simulation examples further illustrate the effectiveness of the proposed method.
Supported by the National Supercomputing Center in Shenzhen.
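To make the setup concrete, the sketch below shows a decentralized stochastic proximal gradient loop of the kind the abstract describes. It is a minimal illustration under simplifying assumptions, not the paper's exact DSPG method: it assumes a doubly stochastic mixing matrix (the paper treats directed networks, which typically need push-sum-style corrections) and an \(\ell_1\) regularizer, whose proximal map is soft-thresholding; the names `dspg_sketch`, `subgrads`, and `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal map of tau * ||.||_1: zeroes small entries, so the
    # sparsity of the iterate is preserved rather than averaged away.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dspg_sketch(W, subgrads, x0, T, lam):
    """Illustrative decentralized stochastic proximal (sub)gradient loop.

    W        -- (n, n) mixing matrix; assumed doubly stochastic here,
                whereas the paper handles directed networks
    subgrads -- list of n callables; subgrads[i](x) returns a stochastic
                subgradient of agent i's (possibly nonsmooth) local loss
    x0       -- (n, d) array of initial local iterates, one row per agent
    T        -- number of iterations
    lam      -- weight of the illustrative l1 regularizer
    """
    x = x0.copy()
    n = x.shape[0]
    for t in range(1, T + 1):
        step = 1.0 / np.sqrt(t)           # diminishing step size
        mixed = W @ x                     # communication / consensus step
        for i in range(n):
            g = subgrads[i](mixed[i])     # local stochastic subgradient
            x[i] = soft_threshold(mixed[i] - step * g, step * lam)
    return x  # last (nonergodic) local iterates
```

Returning the last iterate rather than a running average reflects the design point of the abstract: an ergodic (averaged) output would mix many dense intermediate iterates and destroy the sparsity that the proximal step enforces.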
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Dong, Y., Cui, Z., Zhang, Y., Feng, S. (2021). On the Non-ergodic Convergence Rate of the Directed Nonsmooth Composite Optimization. In: Zhang, Y., Xu, Y., Tian, H. (eds.) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol. 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_11
DOI: https://doi.org/10.1007/978-3-030-69244-5_11
Print ISBN: 978-3-030-69243-8
Online ISBN: 978-3-030-69244-5