Abstract
In this paper, we propose a stochastic method for solving equality constrained optimization problems that utilizes predictive variance reduction. Specifically, we develop a method based on the sequential quadratic programming paradigm that employs variance reduction in the gradient approximations. Under reasonable assumptions, we prove that a measure of first-order stationarity evaluated at the iterates generated by our proposed algorithm converges to zero in expectation from arbitrary starting points, for both constant and adaptive step size strategies. Finally, we demonstrate the practical performance of our proposed algorithm on constrained binary classification problems that arise in machine learning.
Similar content being viewed by others
Data Availability Statement
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Notes
For lemmas with proofs equivalent to those in [4], we refer interested reader to the appropriate sections.
References
Achiam, J., Held, D., Tamar, A., Abbeel, P.: Constrained policy optimization. In: international conference on machine learning, pp. 22–31 (2017). PMLR
Bai, J., Hager, W.W., Zhang, H.: An inexact accelerated stochastic ADMM for separable convex optimization. Comput. Optim. Appl. 81(2), 479–518 (2022)
Berahas, A.S., Curtis, F.E., O’Neill, M.J., Robinson, D.P.: A stochastic sequential quadratic optimization algorithm for nonlinear equality constrained optimization with rank-deficient Jacobians. arXiv preprint arXiv:2106.13015 (2021a)
Berahas, A.S., Curtis, F.E., Robinson, D., Zhou, B.: Sequential quadratic optimization for nonlinear equality constrained stochastic optimization. SIAM J. Optim. 31(2), 1352–1379 (2021)
Bian, F., Liang, J., Zhang, X.: A stochastic alternating direction method of multipliers for non-smooth and non-convex optimization. Inverse Probl. 37(7), 075009 (2021)
Bottou, L., Curtis, F.E., Nocedal, J.: Optimization methods for large-scale machine learning. Siam Rev. 60(2), 223–311 (2018)
Byrd, R.H., Curtis, F.E., Nocedal, J.: An inexact SQP method for equality constrained optimization. SIAM J. Optim. 19(1), 351–369 (2008)
Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2(3), 1–27 (2011)
Chatterjee, N., Chen, Y.-H., Maas, P., Carroll, R.J.: Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. J. Am. Statistical Assoc. 111(513), 107–117 (2016)
Chen, C., Tung, F., Vedula, N., Mori, G.: Constraint-aware deep neural network compression. In: proceedings of the European conference on computer vision (ECCV), pp. 400–415 (2018)
Curtis, F.E., O’Neill, M.J., Robinson, D.P.: Worst-Case Complexity of an SQP method for nonlinear equality constrained stochastic optimization. arXiv preprint arXiv:2112.14799 (2021a)
Curtis, F.E., Robinson, D.P., Zhou, B.: Inexact sequential quadratic optimization for minimizing a stochastic objective function subject to deterministic nonlinear equality constraints. arXiv preprint arXiv:2107.03512 (2021b)
Defazio, A., Bach, F., Lacoste-Julien, S.: SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. Adv. Neural Inf. Process. Syst., pp. 1646–1654 (2014)
Geyer, C.J.: Constrained maximum likelihood exemplified by isotonic convex logistic regression. J. Am. Statistical Assoc. 86(415), 717–724 (1991)
Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155(1–2), 267–305 (2016)
Johnson, R., Zhang, T.: Accelerating stochastic gradient descent using predictive variance reduction. Adv. Neural Inf. Process. Syst. 26, 315–323 (2013)
Lan, G.: An optimal method for stochastic composite optimization. Math. Program. 133(1), 365–397 (2012)
Lan, G.: First-order and Stochastic Optimization Methods for Machine Learning. Springer, Berlin (2020)
Lioutikov, R., Paraschos, A., Peters, J., Neumann, G.: Sample-based informationl-theoretic stochastic optimal control. In: 2014 IEEE international conference on robotics and automation (ICRA), pp. 3896–3902 (2014). IEEE
Malikopoulos, A.A.: Stochastic optimal control for series hybrid electric vehicles. In: 2013 American control conference, pp. 1189–1194 (2013). IEEE
Márquez-Neila, P., Salzmann, M., Fua, P.: Imposing hard constraints on deep networks: Promises and limitations. arXiv: 1706.02025 (2017)
Na, S., Anitescu, M., Kolar, M.: An adaptive stochastic sequential quadratic programming with differentiable exact augmented Lagrangians. arXiv preprint arXiv:2102.05320 (2021a)
Na, S., Anitescu, M., Kolar, M.: Inequality constrained stochastic nonlinear optimization via active-set sequential quadratic programming. arXiv preprint arXiv:2109.11502 (2021b)
Nandwani, Y., Pathak, A., Singla, P.: A primal-dual formulation for deep learning with constraints. In: proceedings of neural information processing systems (NeurIPS), pp. 12157–12168 (2019)
Négiar, G., Dresdner, G., Tsai, A., El Ghaoui, L., Locatello, F., Freund, R., Pedregosa, F.: Stochastic frank-wolfe for constrained finite-sum minimization. In: international conference on machine learning, pp. 7253–7262 (2020). PMLR
Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
Nguyen, L.M., Liu, J., Scheinberg, K., Takáč, M.: SARAH: A novel method for machine learning problems using stochastic recursive gradient. In: international conference on machine learning, pp. 2613–2621 (2017)
Nocedal, J., Wright, S.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, Springer, New York (2006)
Ouyang, H., He, N., Tran, L., Gray, A.: Stochastic alternating direction method of multipliers. In: international conference on machine learning, pp. 80–88 (2013). PMLR
Ravi, S.N., Dinh, T., Lokhande, V.S., Singh, V.: Explicitly imposing constraints in deep networks via conditional gradients gives improved generalization and faster convergence. In: Proceedings of the AAAI conference on artificial intelligence, vol. 33, pp. 4772–4779 (2019)
Reddi, S.J., Sra, S., Póczos, B., Smola, A.: Stochastic Frank-Wolfe methods for nonconvex optimization. In: 2016 54th annual Allerton conference on communication, control, and computing (Allerton), pp. 1244–1251 (2016a). IEEE
Reddi, S.J., Hefny, A., Sra, S., Poczos, B., Smola, A.: Stochastic variance reduction for nonconvex optimization. In: international conference on machine learning, pp. 314–323 (2016b). PMLR
Robbins, H., Monro, S.: A stochastic approximation method. Ann Math Statistics, 400–407 (1951)
Ross, S.M.: Simulation. Academic Press, Amsterdam (2013)
Roy, S.K., Mhammedi, Z., Harandi, M.: Geometry aware constrained optimization techniques for deep learning. In: proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4460–4469 (2018)
Schmidt, M., Le Roux, N., Bach, F.: Minimizing finite sums with the stochastic average gradient. Math. Program. 162(1–2), 83–112 (2017)
Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss minimization. J. Mach. Learn. Res. 14(2), 567 (2013)
Shapiro, A., Dentcheva, D., Ruszczynski, A.: Lectures on stochastic programming: modeling and theory. SIAM, (2021)
Shi, J., Spall, J.C.: SQP-based Projection SPSA algorithm for stochastic optimization with inequality constraints. In: 2021 American control conference (ACC), pp. 1244–1249 (2021). IEEE
Summers, T., Warrington, J., Morari, M., Lygeros, J.: Stochastic optimal power flow based on conditional value at risk and distributional robustness. Int. J. Electric. Power Energy Syst. 72, 116–125 (2015)
Uryasev, S., Pardalos, P.M.: Stochastic Optimization: Algorithms and Applications, vol. 54. Springer (2013)
Vrakopoulou, M., Mathieu, J.L., Andersson, G.: Stochastic optimal power flow with uncertain reserves from demand response. In: 2014 47th Hawaii international conference on system sciences, pp. 2353–2362 (2014). IEEE
Wood, A.J., Wollenberg, B.F., Sheblé, G.B.: Power Generation, Operation, and Control. Wiley, New Jersey, USA (2013)
Zhong, W., Kwok, J.: Fast stochastic alternating direction method of multipliers. In: international conference on machine learning, pp. 46–54 (2014). PMLR
Zhu, Y., Zabaras, N., Koutsourelakis, P.-S., Perdikaris, P.: Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. J. Comput. Phys. 394, 56–81 (2019)
Ziemba, W.T., Vickson, R.G.: Stochastic Optimization Models in Finance. Academic Press (2014)
Funding
This material is based upon work supported by the Office of Naval Research under award number N00014-21-1-2532.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Berahas, A.S., Shi, J., Yi, Z. et al. Accelerating stochastic sequential quadratic programming for equality constrained optimization using predictive variance reduction. Comput Optim Appl 86, 79–116 (2023). https://doi.org/10.1007/s10589-023-00483-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10589-023-00483-2