Abstract
Sparse learning models are popular in many application areas. Their objective functions are usually non-smooth, which makes them difficult to solve numerically. We develop a fast and convergent two-step iteration scheme for solving a class of non-differentiable optimization models motivated by sparse learning. To overcome the non-differentiability of the models, we first characterize their solutions as fixed-points of mappings involving the proximity operators of the functions appearing in the objective functions. We then introduce a two-step fixed-point algorithm to compute the solutions. We establish convergence results for the proposed two-step iteration scheme and compare it with the alternating direction method of multipliers (ADMM). In particular, we derive specific two-step iteration algorithms for three models in machine learning: \(\ell ^1\)-SVM classification, \(\ell ^1\)-SVM regression, and SVM classification with the group LASSO regularizer. Numerical experiments on synthetic and benchmark datasets show that the proposed algorithm outperforms ADMM and the linear programming method in computational time and memory usage.
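The abstract's central device is the proximity operator: a non-smooth solution is rewritten as a fixed point of a proximity-operator mapping and then computed by iteration. As a minimal illustrative sketch (not the paper's two-step scheme, whose details are not reproduced here), the snippet below shows the proximity operator of the \(\ell^1\) norm, which is soft-thresholding, and a basic one-step fixed-point proximity iteration applied to a simple smooth-plus-\(\ell^1\) problem; all function names and parameters are hypothetical.

```python
import numpy as np

def prox_l1(v, t):
    """Proximity operator of t*||.||_1: componentwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fixed_point_prox(grad_smooth, x0, step, reg, iters=500):
    """Basic fixed-point proximity iteration (proximal gradient form):
    x_{k+1} = prox_{step*reg*||.||_1}(x_k - step * grad_smooth(x_k)).
    A solution is a fixed point of this mapping."""
    x = x0.copy()
    for _ in range(iters):
        x = prox_l1(x - step * grad_smooth(x), step * reg)
    return x

# Toy problem: min_x 0.5*||A x - b||^2 + reg*||x||_1 with a sparse ground truth.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([1.5, 0.0, 0.0, -2.0, 0.0])
b = A @ x_true
grad = lambda x: A.T @ (A @ x - b)          # gradient of the smooth term
step = 1.0 / np.linalg.norm(A.T @ A, 2)     # step <= 1/L ensures convergence
x_hat = fixed_point_prox(grad, np.zeros(5), step=step, reg=0.1)
```

The iterate recovers the sparsity pattern of `x_true`: the zero coordinates are thresholded to (near) zero while the nonzero coordinates are retained with a small shrinkage bias. The paper's models replace the smooth data term with non-smooth SVM losses, which is why a characterization via proximity operators, rather than gradients, is needed there.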
This research was supported in part by the Ministry of Science and Technology of China under Grant 2016YFB0200602, by the Natural Science Foundation of China under Grants 11471013, 11771464, and by the US National Science Foundation under Grants DMS-1521661, DMS-1522332, DMS-1912958 and DMS-1939203.
Cite this article
Li, Z., Song, G. & Xu, Y. A Two-Step Fixed-Point Proximity Algorithm for a Class of Non-differentiable Optimization Models in Machine Learning. J Sci Comput 81, 923–940 (2019). https://doi.org/10.1007/s10915-019-01045-7