
Linearly-Convergent FISTA Variant for Composite Optimization with Duality

Published in: Journal of Scientific Computing (2023)

Abstract

Many large-scale optimization problems can be expressed as composite optimization models. Accelerated first-order methods such as the fast iterative shrinkage–thresholding algorithm (FISTA) have proven effective for numerous large composite models. In this paper, we present a new FISTA variant, called C-FISTA, which attains global linear convergence for a broader class of composite models than many recent FISTA variants. We demonstrate the versatility and effectiveness of C-FISTA through numerical experiments on group Lasso, group logistic regression and geometric programming models. Furthermore, we use Fenchel duality to show that C-FISTA can solve the dual of a finite-sum convex optimization model.
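As context for readers without full-text access, the sketch below shows the classical FISTA iteration of Beck and Teboulle applied to the Lasso problem min_x (1/2)||Ax - b||^2 + lam*||x||_1, a simple instance of the composite model min_x f(x) + g(x) with smooth f and nonsmooth g. This is the standard baseline, not the paper's C-FISTA, whose modified momentum rule is not reproduced here; the function names (fista_lasso, soft_threshold) and parameter choices are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fista_lasso(A, b, lam, n_iter=500):
    """Classical FISTA (Beck & Teboulle, 2009) for
    min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.
    An illustrative baseline, not the paper's C-FISTA."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    y = x.copy()                         # extrapolated point
    t = 1.0                              # momentum parameter
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)                          # gradient of the smooth part at y
        x_new = soft_threshold(y - grad / L, lam / L)     # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # Nesterov momentum update
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)     # extrapolation
        x, t = x_new, t_new
    return x

# Usage: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 200))
x_true = np.zeros(200)
x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = fista_lasso(A, b, lam=0.1)
```

Soft-thresholding is the closed-form proximal operator of the l1 norm; having such a cheap proximal step for the nonsmooth term is what makes FISTA-type methods effective on group Lasso and related composite models.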


Data Availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.


Acknowledgements

The authors extend their sincere thanks to the two anonymous referees, whose comments improved the readability of the manuscript.

Funding

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1839286. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author information


Corresponding author

Correspondence to Casey Garner.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Garner, C., Zhang, S. Linearly-Convergent FISTA Variant for Composite Optimization with Duality. J Sci Comput 94, 65 (2023). https://doi.org/10.1007/s10915-023-02101-z
