
A linearly convergent doubly stochastic Gauss–Seidel algorithm for solving linear equations and a certain class of over-parameterized optimization problems

  • Full Length Paper
  • Series B
  • Published in: Mathematical Programming

Abstract

Consider the classical problem of solving a general linear system of equations \(Ax=b\). It is well known that the (successively over-relaxed) Gauss–Seidel scheme and many of its variants may fail to converge when A is neither diagonally dominant nor symmetric positive definite. Can we have a linearly convergent G–S type algorithm that works for any A? In this paper we answer this question affirmatively by proposing a doubly stochastic G–S algorithm that is provably linearly convergent (in the mean square error sense) for any feasible linear system of equations. The key to the algorithm design is a nonuniform doubly stochastic scheme for picking the equation and the variable in each update step, together with a stepsize rule. These techniques also generalize to certain iterative alternating projection algorithms for solving the linear feasibility problem \(Ax\le b\) with an arbitrary A, as well as to high-dimensional minimization problems for training over-parameterized models in machine learning. Our results demonstrate that a carefully designed randomization scheme can make an otherwise divergent G–S algorithm converge.
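The paper specifies the exact sampling probabilities and stepsize rule; as a rough illustration only (not the authors' precise scheme), the following sketch assumes the (row, column) pair \((i,j)\) is drawn with probability proportional to \(A_{ij}^2\) and uses a constant damping stepsize \(\gamma\) on the classical G–S coordinate update:

```python
import numpy as np

def doubly_stochastic_gs(A, b, num_iters=20000, gamma=0.1, seed=0):
    """Sketch of a doubly stochastic Gauss-Seidel iteration for A x = b.

    Each step draws a (row, column) pair (i, j) with probability
    proportional to A[i, j]**2, then relaxes x[j] toward the value that
    would solve equation i exactly, damped by the stepsize gamma.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Nonuniform doubly stochastic sampling: p_ij proportional to A_ij^2
    # (entries with A_ij = 0 get probability 0 and are never picked).
    probs = (A ** 2).ravel() / (A ** 2).sum()
    x = np.zeros(n)
    for _ in range(num_iters):
        k = rng.choice(m * n, p=probs)
        i, j = divmod(k, n)
        residual_i = b[i] - A[i] @ x
        # Damped G-S update: move x[j] a fraction gamma of the way toward
        # the value that makes equation i hold exactly.
        x[j] += gamma * residual_i / A[i, j]
    return x

# Example: a small system that is neither symmetric positive definite
# nor diagonally dominant, yet has the unique solution x = (1, 1).
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([3.0, 4.0])
x = doubly_stochastic_gs(A, b)
print(np.linalg.norm(A @ x - b))  # residual shrinks toward zero
```

In expectation this update performs gradient descent on \(\tfrac12\|Ax-b\|^2\) with stepsize \(\gamma/\|A\|_F^2\), which is why the \(A_{ij}^2\)-weighted sampling stabilizes an iteration that plain cyclic G–S need not.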



Author information


Correspondence to Meisam Razaviyayn.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research is supported by NSFC Grants 61731018 and 61571384, and by the Peacock project of SRIBD.


About this article


Cite this article

Razaviyayn, M., Hong, M., Reyhanian, N. et al. A linearly convergent doubly stochastic Gauss–Seidel algorithm for solving linear equations and a certain class of over-parameterized optimization problems. Math. Program. 176, 465–496 (2019). https://doi.org/10.1007/s10107-019-01404-0

