Abstract
This paper considers distributed “nonsmooth+nonsmooth” composite optimization problems in which \(n\) agents collaboratively minimize the sum of their local objective functions over directed networks. In particular, we focus on scenarios where the sought solutions are desired to possess structural properties, e.g., sparsity. However, to ensure convergence, most existing methods output an ergodic solution produced by an averaging scheme, which destroys the desired structural properties of the output. To address this issue, we develop a new decentralized stochastic proximal gradient method, termed DSPG, whose nonergodic (last) iterate serves as the output. We show that DSPG achieves a nonergodic convergence rate of \(O(\log(T)/\sqrt{T})\) for generally convex objective functions and \(O(\log(T)/T)\) for strongly convex objective functions. When the structure-enhancing regularization is absent and simple or suffix averaging schemes are used, the convergence rates of DSPG reach \(O(1/\sqrt{T})\) for generally convex objective functions and \(O(1/T)\) for strongly convex objective functions, improving on the \(O(\log(T)/\sqrt{T})\) and \(O(\log(T)/T)\) rates of existing methods. Simulation examples further illustrate the effectiveness of the proposed method.
Supported by the National Supercomputing Center in Shenzhen.
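To make the setup concrete, the sketch below shows a decentralized stochastic proximal gradient loop of the kind the abstract describes. It is a minimal illustration under simplifying assumptions, not the paper's exact DSPG method: it assumes a doubly stochastic mixing matrix (the paper treats directed networks, which typically need push-sum-style corrections) and an \(\ell_1\) regularizer, whose proximal map is soft-thresholding; the names `dspg_sketch`, `subgrads`, and `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal map of tau * ||.||_1: zeroes small entries, so the
    # sparsity of the iterate is preserved rather than averaged away.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dspg_sketch(W, subgrads, x0, T, lam):
    """Illustrative decentralized stochastic proximal (sub)gradient loop.

    W        -- (n, n) mixing matrix; assumed doubly stochastic here,
                whereas the paper handles directed networks
    subgrads -- list of n callables; subgrads[i](x) returns a stochastic
                subgradient of agent i's (possibly nonsmooth) local loss
    x0       -- (n, d) array of initial local iterates, one row per agent
    T        -- number of iterations
    lam      -- weight of the illustrative l1 regularizer
    """
    x = x0.copy()
    n = x.shape[0]
    for t in range(1, T + 1):
        step = 1.0 / np.sqrt(t)           # diminishing step size
        mixed = W @ x                     # communication / consensus step
        for i in range(n):
            g = subgrads[i](mixed[i])     # local stochastic subgradient
            x[i] = soft_threshold(mixed[i] - step * g, step * lam)
    return x  # last (nonergodic) local iterates
```

Returning the last iterate rather than a running average reflects the design point of the abstract: an ergodic (averaged) output would mix many dense intermediate iterates and destroy the sparsity that the proximal step enforces.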
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Dong, Y., Cui, Z., Zhang, Y., Feng, S. (2021). On the Non-ergodic Convergence Rate of the Directed Nonsmooth Composite Optimization. In: Zhang, Y., Xu, Y., Tian, H. (eds.) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol. 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_11
DOI: https://doi.org/10.1007/978-3-030-69244-5_11
Print ISBN: 978-3-030-69243-8
Online ISBN: 978-3-030-69244-5