Abstract
Approximation is central to many optimization problems and the supporting theory provides insight as well as foundation for algorithms. In this paper, we lay out a broad framework for quantifying approximations by viewing finite- and infinite-dimensional constrained minimization problems as instances of extended real-valued lower semicontinuous functions defined on a general metric space. Since the Attouch-Wets distance between such functions quantifies epi-convergence, we are able to obtain estimates of optimal solutions and optimal values through bounds of that distance. In particular, we show that near-optimal and near-feasible solutions are effectively Lipschitz continuous with modulus one in this distance. Under additional assumptions on the underlying metric space, we construct approximating functions involving only a finite number of parameters that still are close to an arbitrary extended real-valued lower semicontinuous functions.
Similar content being viewed by others
Notes
We use the slight abbreviation \(\iota _{f\le 0}\) for \(\iota _{\{x\in X:f(x) \le 0\}}\).
B is solid if \(B=\mathop \mathrm{cl}(\mathop \mathrm{int}B)\).
References
Attouch, H.: Variational Convergence for Functions and Operators. Applicable Mathematics Sciences, Pitman (1984)
Attouch, H., Wets, R.J-B.: A convergence theory for saddle functions. Trans. Am. Math. Soc. 280(1), 1–41 (1983)
Attouch, H., Wets, R.J-B.: Isometries for the Legendre-Fenchel transform. Trans. Am. Math. Soc. 296, 33–60 (1986)
Attouch, H., Wets, R.J-B.: Quantitative stability of variational systems: I. The epigraphical distance. Trans. Am. Math. Soc. 328(2), 695–729 (1991)
Attouch, H., Wets, R.J-B.: Quantitative stability of variational systems: II. A framework for nonlinear conditioning. SIAM J. Optim. 3, 359–381 (1993)
Attouch, H., Wets, R.J-B.: Quantitative stability of variational systems: III. \(\epsilon \)-approximate solutions. Math. Progr. 61, 197–214 (1993)
Attouch, H., Lucchetti, R., Wets, R.J-B.: The topology of the \({\rho }\)-Hausdorff distance. Annali di Matematica pura ed applicata CLX, 303–320 (1991)
Aubin, J.-P., Ekeland, I.: Applied Nonlinear Analysis. Issue 1237 of Pure and Applied Mathematics. Wiley, New York (1984)
Aubin, J.-P., Frankowska, H.: Set-Valued Analysis. Birkhäuser, Basel (1990)
Aze, D.: A survey on error bounds for lower semicontinuous functions. In: Proceedings of 2003 MODESMAI Conference, ESAIM Proceedings, vol. 13. EDP Sciences, Les Ulis (2003), pp. 1–17. (2003)
Aze, D., Penot, J.-P.: Recent quantitative results about the convergence of convex sets and functions. In: Functional Analysis and Approximations. Proceedings of the International Conference Bagni di Lucca, pp. 90–110. Pitagora Editrice (1990)
Aze, D., Corvellec, J.-N.: Characterizations of error bounds for lower semicontinuous functions on metric spaces. ESAIM Control Optim. Calc. Var. 10(3), 409–425 (2010)
Aze, D., Corvellec, J.-N., Lucchetti, R.E.: Variational pairs and applications to stability in nonsmooth analysis. Nonlinear Anal. 49, 643–670 (2002)
Beer, G.: Topologies on Closed and Closed Convex Sets, Volume 268 of Mathematics and its Applications. Kluwer, Dordrecht (1993)
Beer, G., Lucchetti, R.: Convex optimization and the epi-distance topology. Trans. Am. Math. Soc. 327(2), 795–813 (1991)
Beer, G., Lucchetti, R.: The epi-distance topology: continuity and stability with application to convex optimization. Math. Oper. Res. 17, 715–726 (1992)
Beer, G., Rockafellar, R.T., Wets, R.J-B.: A characterization of epi-convergence in terms of convergence of level sets. Proc. Am. Math. Soc. 116(3), 753–761 (1992)
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, Berlin (2000)
Bousquet, O., Elisseeff, A.: Stability and generalization. J. Mach. Learn. Res. 2, 499–526 (2002)
Drusvyatskiy, D., Lewis, A.S.: Tilt stability, uniform quadratic growth, and strong metric regularity of the subdifferential. SIAM J. Optim. 23, 256–267 (2013)
Eberhard, A.C., Wenczel, R.: A study of tilt-stable optimality and sufficient conditions. Nonlinear Anal. 75, 1260–1281 (2012)
Hager, W.W., Mico-Umutesi, D.: Error estimation in nonlinear optimization. J. Glob. Optim. 59, 327–341 (2014)
Hamel, A.: Remarks to an equivalent formulation of Ekeland’s variational principle. Optimization 3, 233–238 (1994)
Ioffe, A.D., Outrata, J.V.: On metric and calmness qualification conditions in subdifferential calculus. Set Valued Var. Anal. 16(2–3), 199–227 (2008)
Klatte, D., Kruger, A., Kummer, B.: From convergence principles to stability and optimality conditions. J. Convex Anal. 19(4), 1043–1072 (2012)
Laurent, P.J.: Approximation et Optimisation. Hermann (1972)
Lewis, A.S., Zhang, S.: Partial smoothness, tilt stability, and generalized hessians. SIAM J. Optim. 23, 74–94 (2013)
Mordukhovich, B.: Variational analysis and generalized differentiation I: basic theory. In: Grundlehren der Mathematischen Wissenschaften, 2nd edn. Springer, Berlin (2013)
Mordukhovich, B. Variational analysis and generalized differentiation, II: applications. In: Grundlehren der Mathematischen Wissenschaften, 2nd edn. Springer, Berlin (2013)
Mordukhovich, B.S., Rockafellar, R.T., Sarabi, M.E.: Characterizations of full stability in constrained optimization. SIAM J. Optim. 23, 1810–1849 (2013)
Mosco, U.: Convergence of convex sets and of solutions of variational inequalities. Adv. Math. 3, 510–585 (1969)
Ng, K.F., Zheng, X.Y.: Error bounds for lower semicontinuous functions in normed spaces. SIAM J. Optim. 12, 1–17 (2001)
Pang, J.-S.: Error bounds in mathematical programming. Math. Progr. B 79(1–3), 299–332 (1997)
Penot, J.P.: Error bounds, calmness and their applications in nonsmooth analysis. Contemp. Math. 514, 225–247 (2010)
Polak, E.: Optimization. Algorithms and Consistent Approximations, Volume 124 of Applied Mathematical Sciences. Springer, Berlin (1997)
Rockafellar, R.T., Wets, R.J-B.: Variational Analysis, Volume 317 of Grundlehren der Mathematischen Wissenschaft. Springer, 3rd printing-2009 edition (1998)
Römisch, W., Wets, R.J-B.: Stability of \(\varepsilon \)-approximate solutions to convex stochastic programs. SIAM J Optim. 18(3), 961–979 (2007)
Royset, J.O.: Fusion of hard and soft information in nonparametric density estimation. Eur. J. Oper. Res. 247(2), 532–547 (2015)
Royset, J.O., Wets, R.J-B.: From data to assessments and decisions: Epi-spline technology. In: Newman, A. (ed.) INFORMS Tutorials. INFORMS, Catonsville (2014)
Royset, J.O., Wets, R.J-B.: Multivariate epi-splines and evolving function identification problems. Set Valued Var. Anal. 24(4):517–545 (2016). Erratum, pp. 547–549
Royset, J.O., Wets, R.J-B.: Variational theory for optimization under stochastic ambiguity. SIAM J. Optimization, to appear (2017)
Takahashi, W.: Existence theorems generalizing fixed point theorems for multivalued mappings. In: Fixed Point Theory and Applications (Marseille, 1989), pp. 397–406. Pitman Res. Notes Math. Ser., 252, Longman Sci. Tech., Harlow (1991)
Wets, R.J-B.: Convergence of convex functions, variational inequalities, and convex optimization problems. In: Cottle, R., Giannessi, R., Lions, J.-L. (eds.) Varitional Inequalities and Complementary Problems, pp. 375–403. Wiley, New York (1980)
Wijsman, R.A.: Convergence of sequences of convex sets, cones and functions. Bull. Am. Math. Soc. 70, 186–188 (1964)
Wijsman, R.A.: Convergence of sequences of convex sets, cones and functions. II. Trans. Am. Math. Soc. 123, 32–45 (1966)
Wu, Z., Ye, J.J.: On error bounds for lower semicontinuous functions. Math. Progr. 92, 301–314 (2002)
Ye, J.J., Ye, X.Y.: Necessary optimality conditions for optimization problems with variational inequality constraints. Math. Oper. Res. 22(4), 977–997 (1997)
Acknowledgements
This work is supported in parts by DARPA under grants HR00111410060 and HR0011727928.
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
Proof of Proposition 3.1
Item (i) follows immediately from the definitions. For (ii), first observe that for a nonempty set \(C\subset X\times {IR}\) and \(\bar{x},\bar{y} \in X\times {IR}\) we have for any \(\bar{z}\in C\), \(\mathop \mathrm{dist}(\bar{x}, C) \le \bar{d}(\bar{x}, \bar{z}) \le \bar{d}(\bar{x}, \bar{y}) + \bar{d}(\bar{y}, \bar{z})\). Minimizing over \(\bar{z}\in C\), we obtain that \(\mathop \mathrm{dist}(\bar{x}, C) \le \bar{d}(\bar{x}, \bar{y}) + \mathop \mathrm{dist}(\bar{y}, C)\) and thus
Second, in the notation \(h(\bar{x}) = |\mathop \mathrm{dist}(\bar{x}, C) - \mathop \mathrm{dist}(\bar{x}, D)|\) for nonempty \(C,D\subset X\times {IR}\), we have that
Let \(\varepsilon >0\). For every \(\bar{y}\in \mathbb {S}_{\rho '}\), there exists an \(\bar{x}\in \mathbb {S}_{\rho }\) such that \(\bar{d}(\bar{x}, \bar{y}) \le e(\mathbb {S}_{\rho '},\mathbb {S}_{\rho })+\varepsilon \). Combining these last two facts and replacing C and D by epigraphs, we obtain that
which establishes (ii) after realizing that \(\varepsilon >0\) is arbitrary.
Next consider (iii). Suppose that \(C,D\subset X\times {IR}\) are nonempty closed, \(\varepsilon >0\), and \(\rho \ge 0\). We first show that
The claim is trivial if \(C\cap \mathbb {S}_\rho \) is empty. For nonempty \(C\cap \mathbb {S}_\rho \), we have for every \(\bar{x}\in C\cap \mathbb {S}_\rho \) that \(\mathop \mathrm{dist}(\bar{x}, D) \le \varepsilon \) and the implication follows. The translation of this fact to the context of epigraphs establishes the lower bound in (iii). Second, we establish that
For \(\bar{z} \in C\cap \mathbb {S}_{\rho '}\subset D_\varepsilon ^+\) and \(\bar{x} \in X\times {IR}\), \(\mathop \mathrm{dist}(\bar{x}, D) \le \bar{d}(\bar{x}, \bar{z}) + \mathop \mathrm{dist}(\bar{z}, D) \le \bar{d}(\bar{x}, \bar{z})+\varepsilon \). The minimization over \(\bar{z} \in C\cap \mathbb {S}_{\rho '}\) gives that
This holds trivially if \(C\cap \mathbb {S}_{\rho '}=\emptyset \). Suppose that \(\bar{x}\in \mathbb {S}_\rho \) and \(\rho '> 2\rho +\mathop \mathrm{dist}(( x^{\text {cent}} , 0),C)\). For every \(\nu \), there exists \(\bar{y}^\nu \in C\) such that \(\bar{d}(\bar{x}, \bar{y}^\nu ) \le \mathop \mathrm{dist}(\bar{x}, C) + 1/\nu \). Moreover,
Since \(\rho '> 2\rho +\mathop \mathrm{dist}(( x^{\text {cent}} , 0),C)\), there exists a \(\bar{\nu }\) such that \(\bar{y}^\nu \in C\cap \mathbb {S}_{\rho '}\) for all \(\nu \ge \bar{\nu }\). For such \(\nu \),
Letting \(\nu \rightarrow \infty \) in this expression and observing that \(\mathop \mathrm{dist}(\bar{x}, C\cap \mathbb {S}_{\rho '}) \ge \mathop \mathrm{dist}(\bar{x}, C)\) generally, we obtain that \(\mathop \mathrm{dist}(\bar{x}, C\cap \mathbb {S}_{\rho '}) = \mathop \mathrm{dist}(\bar{x}, C)\), which together with (9) establishes (8). The implication in (8) directly confirms the upper bound in (iii). If (X, d) is finitely compact, then in view of Lemma 2.2 we can take \(\bar{y}^\nu \) above to satisfy \(\bar{d}(\bar{x}, \bar{y}^\nu ) = \mathop \mathrm{dist}(\bar{x}, C)\) for all \(\nu \). Thus, the need for the \(1/\nu \) term vanishes and the stronger statement given at the end of the theorem is established.
Item (iv) follows trivially from the definition of \({ d l}_\rho \). For (v), (vi), and (vii), we follow the lines of arguments in the proof of [36, Lemma 4.41] and omit the details. \(\square \)
Proof of Proposition 3.2
We start by establishing the finiteness of \(d\!\hat{l}^0_\rho (f,g)\) for any \(\rho \ge 0\). Since \(f,g\in \mathop {\text {lsc-fcns}}(X)\), there exist \(\bar{x},\bar{y}\in X\) such that \(g(\bar{x}),f(\bar{y})<\infty \). Let \(\bar{\eta }= \rho \) \(+\) \(\max \{d( x^{\text {cent}}, \bar{x}),\) \(d( x^{\text {cent}} , \bar{y}),g(\bar{x}), f(\bar{y})\}\), which is finite. Then, for \(x\in \mathop {\mathop \mathrm{lev}}\nolimits _\rho f\cap {IB}_\rho \), \(d(x,\bar{x}) \le d(x, x^{\text {cent}} ) + d( x^{\text {cent}} , \bar{x}) \le \rho + (\bar{\eta }- \rho )=\bar{\eta }\) and thus \(\mathop \mathrm{inf}\nolimits _{{IB}(x,\bar{\eta })} g\le g(\bar{x}) \le \bar{\eta }-\rho \le \max \{f(x), -\rho \} + \bar{\eta }.\) Likewise, for \(x\in \mathop {\mathop \mathrm{lev}}\nolimits _\rho g\cap {IB}_\rho \), \(\mathop \mathrm{inf}\nolimits _{{IB}(x,\bar{\eta })} f\le \max \{g(x), -\rho \} + \bar{\eta }\). Thus, \(d\!\hat{l}^0_\rho (f,g) \le \bar{\eta }\), which establishes the rightmost inequality.
Obviously, \( d \hat{l }_\rho (f,g) = \inf \{\eta \ge 0~:~ d \hat{l }_\rho (f,g) \le \eta \}\), which in view of the minimization taking place in the definition of \(d\!\hat{l}^{\delta }_\rho (f,g)\) motivates the examination of the relation \( d \hat{l }_\rho (f,g) \le \eta \) for \(\eta \ge 0\). It follows directly from the definition of \( d \hat{l }_\rho \) that
which in turn is equivalent to having
where \(D_\eta ^+(f) := \{\bar{x}\in X\times {IR}~:~\mathop \mathrm{dist}(\bar{x}, \mathop \mathrm{epi}f) \le \eta \}\) and similarly for \(D_\eta ^+(g)\). By virtue of being defined in terms of distance to an epigraph, we have that \((x,y_0)\in D_\eta ^+(g)\) implies \((x,x_0)\in D_\eta ^+(g)\) for all \(x_0 \ge y_0\). Thus,
where the function \(f_\rho :X\rightarrow \overline{{IR}}\) is given by \(f_\rho (x) = \max \{f(x), -\rho \}\) if \(x\in \mathop \mathrm{lev}_\rho f\cap {IB}_\rho \) and \(f_\rho (x) = \infty \) otherwise. By definition, a point \((x,x_0)\in D_\eta ^+(g)\) if and only if \(\mathop \mathrm{dist}((x,x_0),\mathop \mathrm{epi}g) \le \eta \). The latter condition is more explicitly stated as
We are now in a position to establish the lower bound and let \(\delta >0\). Collecting the above facts, we find that if \( d \hat{l }_\rho (f,g) \le \eta \), then
Let \(x\in \mathop \mathrm{dom}f_\rho \). Hence, for every \(\varepsilon \in (0,\delta ]\), there exists \((y_\varepsilon , y_{0\varepsilon })\in X\times {IR}\) such that \(g(y_\varepsilon ) \le y_{0\varepsilon }\), \(d(x,y_\varepsilon ) \le \eta + \varepsilon \), and \(|f_\rho (x)-y_{0\varepsilon }|\le \eta + \varepsilon \). Consequently, \(g(y_\varepsilon ) \le f_\rho (x) + \eta + \varepsilon \text{ and } y_\varepsilon \in {IB}(x,\eta + \varepsilon )\). Moreover, \(\mathop \mathrm{inf}\nolimits _{{IB}(x,\eta +\delta )} g \le \mathop \mathrm{inf}\nolimits _{ {IB}(x,\eta +\varepsilon )} g \le f_\rho (x) + \eta + \varepsilon \). Since this relation holds for all \(\varepsilon \in (0,\delta ]\), \(\mathop \mathrm{inf}\nolimits _{{IB}(x,\eta +\delta )} g \le f_\rho (x) + \eta \) for \(x\in \mathop \mathrm{dom}f_\rho \). A parallel development gives identical results with the roles of f and g reversed, where we let \(g_\rho (x) = \max \{g(x), -\rho \}\) if \(x\in \mathop \mathrm{lev}_\rho g\cap {IB}_\rho \) and \(g_\rho (x) = \infty \) otherwise. Specifically, we have that \(\mathop \mathrm{inf}\nolimits _{{IB}(x,\eta +\delta )} f \le g_\rho (x) + \eta \) for \(x\in \mathop \mathrm{dom}g_\rho \). Since \(x\in \mathop \mathrm{dom}f_\rho \) if and only if \(x\in \mathop {\mathop \mathrm{lev}}\nolimits _\rho f\cap {IB}_\rho \), we also have that
and similarly
The lower bound then follows after observing that these relations hold, in particular, for \(\eta = d \hat{l }_\rho (f,g)\).
Next, we address the upper bound. Suppose that \(\eta \ge 0\) satisfies
As above, this means that \(\mathop \mathrm{inf}\nolimits _{{IB}(x,\eta )} g \le f_\rho (x) + \eta \) for \(x\in \mathop \mathrm{dom}f_\rho \). We now examine this relation for a fixed \(x\in \mathop \mathrm{dom}f_\rho \). For every \(\varepsilon >0\), there exists a \(y_\varepsilon \in X\) such that \(d(x,y_\varepsilon ) \le \eta \) and \(g(y_\varepsilon ) \le f_\rho + \eta + \varepsilon \). Set \(y_{0\varepsilon } = \max \{g(y_\varepsilon ), f_\rho (x) - \eta - \varepsilon \}\). Thus, \(g(y_\varepsilon ) \le y_{0\varepsilon }\) and
Moreover, \(|f_\rho (x) - y_{0\varepsilon }| \le \eta + \varepsilon \). We have therefore established that
Since this holds for all \(\varepsilon >0\),
Equivalently, \((x,f_\rho (x)) \in D_\eta ^+(g)\) for \(x\in \mathop \mathrm{dom}f_\rho \). A parallel development with the roles of f and g reversed, leads to \((x,g_\rho (x)) \in D_\eta ^+(f)\) for \(x\in \mathop \mathrm{dom}g_\rho \). The implications established in the beginning of the proof show that we then must have that \( d \hat{l }_\rho (f,g) \le \eta \). In view of the definition of \(d\!\hat{l}^0_\rho (f,g)\), it is possible to repeat the above arguments with \(\eta \) replaced by \(\eta ^\nu \) and have \(\eta ^\nu \searrow d\!\hat{l}^0_\rho (f,g)\) as well as \( d \hat{l }_\rho (f,g) \le \eta ^\nu \). This established the upper bound of the theorem.
We next consider the last assertion under the additional assumption that the space is finitely compact. Again, suppose that \( d \hat{l }_\rho (f,g) \le \eta \) and, thus,
Fix \(x\in \mathop \mathrm{dom}f_\rho \). By Lemma 2.2 and the fact that \(\mathop \mathrm{epi}g\) is a nonempty closed set, there exists \((y^*,y_0^*) \in X\times {IR}\), with \(g(y^*) \le y^*_0\), such that \(\eta \ge \mathop \mathrm{dist}((x,f_\rho (x)), \mathop \mathrm{epi}g) = \bar{d}((x,f_\rho (x)),(y^*,y_0^*))\). Hence, \(d(x,y^*) \le \eta \) and \(|f_\rho (x)-y^*_{0}|\le \eta \), which leads to \(g(y^*) \le f_\rho (x) + \eta \text{ and } y^* \in {IB}(x,\eta )\). This fact and a parallel development with the roles of g and f reversed give that
Repeating the last lines of reasoning that lead to the lower bound on \( d \hat{l }_\rho (f,g)\), we conclude that under the additional assumption, the lower bound can be improved to \(d\!\hat{l}^0_\rho (f,g)\). \(\square \)
Proof of Proposition 3.3
We observe that \( d \hat{l }_\rho (C,D)<\infty \) since C, D are nonempty. Let \(\rho \in [0,\infty )\), \(\rho '\in (\rho + d \hat{l }_\rho (C,D), \infty )\), and \(\varepsilon \in (0, \rho ' - \rho - d \hat{l }_\rho (C,D)]\). Set
First, we establish that
Suppose that \(x\in \mathop {\mathop \mathrm{lev}}\nolimits _\rho \{f+\iota _C\}\cap {IB}_\rho \), which of course implies that \(x\in C\). There exists a \(y\in D\) such that \(d(x,y) \le \mathop \mathrm{inf}\nolimits \{d(x,y') : y'\in D\} + \varepsilon \). Thus, \( d \hat{l }_\rho (C,D) \ge e(C\cap {IB}_\rho , D) \ge \mathop \mathrm{dist}(x,D) \ge d(x,y) - \varepsilon \) and \(d(x,y) \le \eta _\varepsilon \). Moreover, \(d( x^{\text {cent}} , y) \le d( x^{\text {cent}} , x) + d(x,y) \le \rho + d \hat{l }_\rho (C,D) + \varepsilon \le \rho '\). These facts and the Lipschitz continuity of g on \({IB}_{\rho '}\) imply that
which establishes the first claim. Second, following a parallel argument, we realize that
Consequently, \(d\!\hat{l}^0_\rho (f,g) \le \eta _\varepsilon \). Since this holds for arbitrarily small \(\varepsilon >0\), the main conclusion follows. If (X, d) is finitely compact, then the minimum distance between a point and a nonempty closed set is attained and the above arguments hold with \(\varepsilon = 0\); see Lemma 2.2. This establishes that \(\rho ' = \rho + d \hat{l }_\rho (C,D)\) is permitted in this case.\(\square \)
Rights and permissions
About this article
Cite this article
Royset, J.O. Approximations and solution estimates in optimization. Math. Program. 170, 479–506 (2018). https://doi.org/10.1007/s10107-017-1165-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10107-017-1165-0
Keywords
- Epi-convergence
- Attouch-Wets distance
- Epi-splines
- Solution stability
- Approximation theory
- Near-optimality
- Near-feasibility
- Rate of convergence