
Peaceman–Rachford splitting for a class of nonconvex optimization problems

Computational Optimization and Applications

Abstract

We study the applicability of the Peaceman–Rachford (PR) splitting method for solving nonconvex optimization problems. When applied to minimizing the sum of a strongly convex Lipschitz differentiable function and a proper closed function, we show that if the strongly convex function has a large enough strong convexity modulus and the step-size parameter is chosen below a threshold that is computable, then any cluster point of the sequence generated, if exists, will give a stationary point of the optimization problem. We also give sufficient conditions guaranteeing boundedness of the sequence generated. We then discuss one way to split the objective so that the proposed method can be suitably applied to solving optimization problems with a coercive objective that is the sum of a (not necessarily strongly) convex Lipschitz differentiable function and a proper closed function; this setting covers a large class of nonconvex feasibility problems and constrained least squares problems. Finally, we illustrate the proposed algorithm numerically.

Notes

  1. This slightly improves [25, Theorem 4], which assumed the stronger condition that \(f\) and \(g\) are bounded below and one of them is coercive.

  2. We refer the readers to, for example, [1, 2, 7, 8], for the definition and examples of KL functions. In particular, if f and g are proper closed semi-algebraic functions, then \({\mathfrak {P}}_\gamma \) is a KL function for any \(\gamma > 0\).

  3. One natural choice of \(\beta \) is to set \(\beta = 5\), so that the maximum \(\max _{\beta > 2}\frac{\beta -2}{(\beta + 1)^2L_F} = \frac{1}{12L_F}\) is attained. However, we observe in our numerical experiments that a smaller \(\beta > 2\), coupled with a suitable heuristic for updating \(\gamma \), leads to faster convergence.
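     For completeness, \(\beta = 5\) indeed maximizes \(\frac{\beta -2}{(\beta +1)^2}\) over \(\beta > 2\): differentiating gives

$$\begin{aligned} \frac{d}{d\beta }\,\frac{\beta -2}{(\beta +1)^2} = \frac{(\beta +1) - 2(\beta -2)}{(\beta +1)^3} = \frac{5-\beta }{(\beta +1)^3}, \end{aligned}$$

     which is positive for \(\beta < 5\) and negative for \(\beta > 5\); the maximum value is thus \(\frac{3}{36} = \frac{1}{12}\), giving \(\frac{1}{12L_F}\) after dividing by \(L_F\).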

  4. We choose \(tol = 10^{-8}\), and we report \(\frac{1}{2}\Vert Az^t - b\Vert ^2\) for both methods.

  5. We choose \(tol = 10^{-5}\), and we report \(\frac{1}{2}\Vert Az^t - b\Vert ^2\) for both methods.

  6. For both methods, we report \(\frac{1}{2} d_C^2(z^t)\).

  7. We declare a failure if the function value at termination is above \(10^{-6}\), and a success if the value is below \(10^{-12}\).

References

  1. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems. An approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)

  2. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137, 91–129 (2013)

  3. Bauschke, H.H., Borwein, J.M.: On the convergence of von Neumann’s alternating projection algorithm for two sets. Set-Valued Anal. 1, 185–212 (1993)

  4. Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Rev. 38, 367–426 (1996)

  5. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)

  6. Bogdan, M., van den Berg, E., Su, W., Candès, E.: Statistical estimation and testing via the sorted L1 norm. Preprint (2013). Available at arXiv:1310.1969

  7. Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17, 1205–1223 (2007)

  8. Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18, 556–572 (2007)

  9. Borwein, J.M., Li, G., Yao, L.J.: Analysis of the convergence rate for the cyclic projection algorithm applied to basic semialgebraic convex sets. SIAM J. Optim. 24, 498–527 (2014)

  10. Bruckstein, A.M., Donoho, D.L., Elad, M.: From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Rev. 51, 34–81 (2009)

  11. Candès, E., Tao, T.: The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). Ann. Statist. 35, 2313–2351 (2007)

  12. Combettes, P.L.: Iterative construction of the resolvent of a sum of maximal monotone operators. J. Convex Anal. 16, 727–748 (2009)

  13. Combettes, P.L., Pesquet, J.-C.: A Douglas–Rachford splitting approach to nonsmooth convex variational signal recovery. IEEE J. Sel. Top. Signal Process. 1(4), 564–574 (2007)

  14. Dobra, A.: Variable selection and dependency networks for genomewide data. Biostatistics 10, 621–639 (2009)

  15. Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two or three space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)

  16. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)

  17. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)

  18. Giselsson, P., Boyd, S.: Diagonal scaling in Douglas–Rachford splitting and ADMM. In: Proceedings of the 53rd IEEE Conference on Decision and Control, pp. 5033–5039 (2014)

  19. Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D., Lander, E.S.: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–537 (1999)

  20. Hesse, R., Luke, D.R.: Nonconvex notions of regularity and convergence of fundamental algorithms for feasibility problems. SIAM J. Optim. 23, 2397–2419 (2013)

  21. Hong, M., Luo, Z.-Q., Razaviyayn, M.: Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems. SIAM J. Optim. 26, 337–364 (2016)

  22. Knight, K., Fu, W.: Asymptotics for the lasso-type estimators. Ann. Stat. 28, 1356–1378 (2000)

  23. Kyrillidis, A., Becker, S., Cevher, V., Koch, C.: Sparse projections onto the simplex. JMLR W&CP 28, 235–243 (2013)

  24. Li, G., Pong, T.K.: Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25, 2434–2460 (2015)

  25. Li, G., Pong, T.K.: Douglas–Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems. Math. Program. 159, 371–401 (2016)

  26. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)

  27. Lu, Z., Pong, T.K., Zhang, Y.: An alternating direction method for finding Dantzig selectors. Comput. Stat. Data Anal. 56, 4037–4046 (2012)

  28. Luke, D.R.: Finding best approximation pairs relative to a convex and a prox-regular set in a Hilbert space. SIAM J. Optim. 19, 714–739 (2008)

  29. Patrinos, P., Stella, L., Bemporad, A.: Douglas–Rachford splitting: complexity estimates and accelerated variants. In: Proceedings of the 53rd IEEE Conference on Decision and Control, pp. 4234–4239 (2014)

  30. Peaceman, D.W., Rachford, H.H.: The numerical solution of parabolic and elliptic differential equations. J. Soc. Ind. Appl. Math. 3, 28–41 (1955)

  31. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)

  32. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser B 58, 267–288 (1996)

  33. Wang, H., Banerjee, A.: Bregman alternating direction method of multipliers. NIPS 27, 2816–2824 (2014)

  34. Yeung, K.Y., Bumgarner, R.E., Raftery, A.E.: Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21, 2394–2402 (2005)

Author information

Corresponding author

Correspondence to Ting Kei Pong.

Additional information

G. Li was partially supported by a research grant from the Australian Research Council. T. Liu was partly supported by the AMSS-PolyU Joint Research Institute Postdoctoral Scheme. T. K. Pong was partly supported by Hong Kong Research Grants Council PolyU253008/15p.

Appendix: concrete numerical examples

In this appendix, we provide some simple and concrete examples illustrating the different behaviors of the classical PR splitting method, the classical DR splitting method and our proposed PR splitting method (33).

The first example shows that, even in the convex setting, the classical PR splitting method can be faster than the classical DR splitting method, and that our proposed PR method can outperform the classical DR method for particular choices of the parameter \(\gamma \). The second example, a nonconvex feasibility problem, shows that the classical PR method can fail to converge, while our proposed PR method converges linearly to a solution of the feasibility problem.

Example 1

(Classical DR splitting method vs. classical/proposed PR method) Consider \(f(x)=\Vert x\Vert ^2\) and \(g(x)=0\) for all \(x \in \mathrm{I}\!\mathrm{R}^n\). Then, a direct verification shows that, for any \(\gamma >0\),

$$\begin{aligned} \mathrm{prox}_{\gamma f}(z) = \mathop {\hbox {arg min}}_u\left\{ \gamma \Vert u\Vert ^2 + \frac{1}{2} \Vert u - z\Vert ^2\right\} =\frac{z}{2\gamma +1} \end{aligned}$$

and

$$\begin{aligned} \mathrm{prox}_{\gamma g}(z) = \mathop {\hbox {arg min}}_u\left\{ \frac{1}{2} \Vert u - z\Vert ^2\right\} =z. \end{aligned}$$
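As a quick numerical sanity check (ours, not part of the paper), the first closed-form mapping can be confirmed by brute force; the Python sketch below uses our own variable names and a random test point.

```python
import numpy as np

# Sketch (assumptions: f(x) = ||x||^2, g = 0; names are ours, not the paper's).
def prox_f(z, gamma):
    # Setting the gradient of u -> gamma*||u||^2 + 0.5*||u - z||^2 to zero
    # gives (2*gamma + 1)*u = z, i.e. prox_{gamma f}(z) = z/(2*gamma + 1).
    return z / (2 * gamma + 1)

rng = np.random.default_rng(0)
z, gamma = rng.standard_normal(5), 0.3
u = prox_f(z, gamma)

def prox_objective(v):
    return gamma * v @ v + 0.5 * (v - z) @ (v - z)

# The closed-form point should beat small random perturbations of itself.
assert all(prox_objective(u) <= prox_objective(u + 1e-3 * rng.standard_normal(5))
           for _ in range(100))
```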

Thus, the classical DR method reads

$$\begin{aligned} x^{t+1}&=\frac{I+(2\mathrm{prox}_{\gamma g} - I)\circ (2\mathrm{prox}_{\gamma f} - I)}{2}(x^t)\\&=\frac{1}{2\gamma +1} x^t =\cdots = \left( \frac{1}{2\gamma +1}\right) ^{t+1} x^0, \end{aligned}$$

while the classical PR method reads

$$\begin{aligned} x^{t+1}=(2\mathrm{prox}_{\gamma g} - I)\circ (2\mathrm{prox}_{\gamma f} - I)(x^t)=\frac{1-2\gamma }{2\gamma +1} x^t =\cdots = \left( \frac{1-2\gamma }{2\gamma +1}\right) ^{t+1} x^0. \end{aligned}$$

Thus, for this example, the classical PR method converges faster than the classical DR method when \(\gamma \in (0,1)\).
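To see the two rates side by side, here is a small Python sketch (our illustration; the step-size \(\gamma = 0.5\) is our choice) running both fixed-point iterations:

```python
import numpy as np

# DR vs. classical PR on f(x) = ||x||^2, g = 0 (a sketch; names are ours).
def reflect_f(x, gamma):
    # 2*prox_{gamma f} - I with prox_{gamma f}(x) = x/(2*gamma + 1),
    # i.e. multiplication by (1 - 2*gamma)/(2*gamma + 1).
    return (1 - 2 * gamma) / (2 * gamma + 1) * x

gamma = 0.5
x_dr = x_pr = np.ones(3)
for t in range(20):
    x_dr = 0.5 * (x_dr + reflect_f(x_dr, gamma))  # DR step (prox_{gamma g} = I)
    x_pr = reflect_f(x_pr, gamma)                 # PR step (2*prox_{gamma g} - I = I)

# DR contracts by 1/(2*gamma + 1) = 1/2 per step; the PR factor
# (1 - 2*gamma)/(2*gamma + 1) vanishes at gamma = 0.5, so PR hits x = 0 at once.
print(np.linalg.norm(x_dr), np.linalg.norm(x_pr))
```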

Moreover, let \(\beta =2.5\) and \(\gamma <\frac{\beta -2}{(\beta +1)^2L_F}=\frac{1}{49}\), where \(L_F = 2\) is the Lipschitz constant of \(\nabla f\) for \(f(x)=\Vert x\Vert ^2\). Then, the proposed PR method (33) reads

$$\begin{aligned} \left\{ \begin{array}{l} y^{t+1}=\mathop {\hbox {arg min}}_{y} \left\{ \frac{7}{2}\Vert y\Vert ^2 + \frac{1}{2\gamma }\Vert y - x^t\Vert ^2\right\} =\frac{1}{1+7\gamma } x^t, \\ z^{t+1}=\mathop {\hbox {arg min}}_{z} \left\{ - \frac{5}{2}\Vert z\Vert ^2 + \frac{1}{2\gamma }\Vert 2y^{t+1} - x^t - z\Vert ^2\right\} =\frac{1}{1-5\gamma }(2y^{t+1}-x^t),\\ x^{t+1}=x^t+2(z^{t+1}- y^{t+1})=\left( 1-\frac{4 \gamma }{(1-5 \gamma )(1+7\gamma )}\right) x^t. \end{array}\right. \end{aligned}$$
(50)

Note that, for \(\gamma =0.01<\frac{\beta -2}{(\beta +1)^2L_F}=\frac{1}{49}\), we have

$$\begin{aligned} 0<1-\frac{4 \gamma }{(1-5 \gamma )(1+7\gamma )} \le 0.97 < \frac{1}{2\gamma +1}. \end{aligned}$$

Thus, for \(\gamma =0.01\), our proposed PR method (33) is faster than the classical DR method for this example.
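The contraction factor of (50) is easy to confirm numerically; the following sketch (ours) checks it for \(\gamma = 0.01\):

```python
import numpy as np

# Sketch of iteration (50) with beta = 2.5 and L_F = 2, so gamma < 1/49.
def proposed_pr_step(x, gamma):
    y = x / (1 + 7 * gamma)              # y-update
    z = (2 * y - x) / (1 - 5 * gamma)    # z-update
    return x + 2 * (z - y)               # x-update

gamma, x = 0.01, np.ones(4)
observed = np.linalg.norm(proposed_pr_step(x, gamma)) / np.linalg.norm(x)
predicted = 1 - 4 * gamma / ((1 - 5 * gamma) * (1 + 7 * gamma))
# Both are about 0.9606, below the DR factor 1/(2*gamma + 1) ~ 0.9804.
print(observed, predicted)
```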

Example 2

(Classical PR method vs. the proposed PR method) Let \(C=\{(0,0)\}\) and \(D=\big (\{0\} \times \mathrm{I}\!\mathrm{R}\big ) \cup \big (\mathrm{I}\!\mathrm{R}\times \{0\}\big )\). We consider the feasibility problem of finding a point in the intersection of C and D. We start with the initial point \(x^0=(a,0)\) with \(a\ne 0\). Then, the classical PR splitting method applies (6) to \(f(x)=\delta _C(x)\) and \(g(x)=\delta _D(x)\) for all \(x \in \mathrm{I}\!\mathrm{R}^2\), and reduces to

$$\begin{aligned} x^{t+1}=(2\mathrm{prox}_{\gamma g} - I)\circ (2\mathrm{prox}_{\gamma f} - I)(x^t)=(2P_D - I)\circ (2P_C - I)(x^t)=-x^{t}. \end{aligned}$$

Thus, the classical PR splitting method fails to converge: it cycles between the two points \((a,0)\) and \((-a,0)\). On the other hand, let \(\beta =5\) and \(\gamma \in \left( 0,\frac{1}{12}\right) \), and consider the proposed PR method (49) for feasibility problems. This algorithm reads

$$\begin{aligned} \left\{ \begin{array}{l} y^{t+1} = \frac{\gamma P_C\left( \frac{x^t}{1 + \beta \gamma }\right) + x^t}{(1 + \beta )\gamma + 1}=\frac{x^t}{6\gamma + 1}, \\ z^{t+1}\in P_D\left( \frac{2y^{t+1} - x^t}{1 - \beta \gamma }\right) =\left\{ \frac{2y^{t+1} - x^t}{1 - 5\gamma }\right\} ,\\ x^{t+1}=x^t+2(z^{t+1}- y^{t+1})=\left( 1-\frac{2\gamma }{(1-5\gamma )(6\gamma +1)}\right) x^t, \end{array}\right. \end{aligned}$$
(51)

where the formula for the z-update follows from the fact that \(x^t\) and \(y^{t+1}\) lie in \(\mathrm{I}\!\mathrm{R}\times \{0\}\subset D\) (by induction on \(t\)), and hence so does \(2y^{t+1}-x^t\). Consequently, the proposed PR method (51) converges linearly to \((0,0) \in C \cap D\) in this case.
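The contrast between the two methods on this example can be reproduced in a few lines of Python; the sketch below (ours, with \(\gamma = 0.05\) and hand-coded projections) runs both iterations from \(x^0 = (1,0)\):

```python
import numpy as np

# Example 2 sketch: C = {(0,0)}, D = the union of the two coordinate axes.
def proj_C(x):
    return np.zeros(2)

def proj_D(x):
    # Project onto the nearer axis (ties broken toward the horizontal axis).
    return np.array([x[0], 0.0]) if abs(x[0]) >= abs(x[1]) else np.array([0.0, x[1]])

def classical_pr(x):
    r = 2 * proj_C(x) - x                 # reflect through C
    return 2 * proj_D(r) - r              # reflect through D

def proposed_pr(x, gamma, beta=5.0):      # iteration (51)
    y = (gamma * proj_C(x / (1 + beta * gamma)) + x) / ((1 + beta) * gamma + 1)
    z = proj_D((2 * y - x) / (1 - beta * gamma))
    return x + 2 * (z - y)

gamma = 0.05                              # any gamma in (0, 1/12) works
x_cl = x_pr = np.array([1.0, 0.0])
for t in range(30):
    x_cl, x_pr = classical_pr(x_cl), proposed_pr(x_pr, gamma)

# x_cl keeps cycling between (1, 0) and (-1, 0); x_pr shrinks geometrically
# toward (0, 0) with per-step factor 1 - 2*gamma/((1 - 5*gamma)(6*gamma + 1)).
print(x_cl, np.linalg.norm(x_pr))
```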

Cite this article

Li, G., Liu, T. & Pong, T.K. Peaceman–Rachford splitting for a class of nonconvex optimization problems. Comput Optim Appl 68, 407–436 (2017). https://doi.org/10.1007/s10589-017-9915-8
