Abstract
Many large-scale optimization problems can be expressed as composite optimization models. Accelerated first-order methods such as the fast iterative shrinkage–thresholding algorithm (FISTA) have proven effective for numerous large composite models. In this paper, we present a new variant of FISTA, which we call C-FISTA, that obtains global linear convergence for a broader class of composite models than many recent FISTA variants. We demonstrate the versatility and effectiveness of C-FISTA through numerical experiments on group Lasso, group logistic regression, and geometric programming models. Furthermore, we use Fenchel duality to show that C-FISTA can solve the dual of a finite-sum convex optimization model.
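For readers unfamiliar with the baseline method the abstract builds on, the following is a minimal Python sketch of the standard FISTA iteration of Beck and Teboulle applied to a Lasso problem. It illustrates the classical proximal-gradient step with momentum extrapolation only; it is not the C-FISTA variant introduced in the paper, and all problem data below are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lasso(A, b, lam, num_iters=500):
    # Standard FISTA (Beck & Teboulle, 2009) for the composite model
    #   min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1.
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the smooth gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(num_iters):
        grad = A.T @ (A @ y - b)             # gradient of the smooth part at y
        x_next = soft_threshold(y - grad / L, lam / L)   # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x) # momentum extrapolation
        x, t = x_next, t_next
    return x

# Illustrative use on synthetic data (assumed, not from the paper's experiments).
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))
x_true = np.zeros(50)
x_true[:5] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = fista_lasso(A, b, lam=0.1)
```

The composite structure, a smooth loss plus a nonsmooth regularizer handled through its proximal operator, is the setting in which the paper's C-FISTA variant is developed.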
Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
The authors want to extend their utmost thanks to the two anonymous referees whose comments improved the readability of the manuscript.
Funding
This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 1839286. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Ethics declarations
Competing interests
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Garner, C., Zhang, S. Linearly-Convergent FISTA Variant for Composite Optimization with Duality. J Sci Comput 94, 65 (2023). https://doi.org/10.1007/s10915-023-02101-z