Abstract
Nonlinear dependence on a probability measure has recently been encountered with increasing intensity in stochastic optimization. This type of dependence corresponds to many situations in applications; it can appear in static (one-stage) problems, in dynamic problems with a finite (multi-stage) or infinite horizon, and in both single- and multi-objective settings. Moreover, the nonlinear dependence can appear not only in the objective functions but also in the constraint sets. In this paper, we consider static one-objective problems in which the nonlinear dependence appears in the objective function and may also appear in the constraint sets. In detail, we consider “deterministic” constraint sets whose dependence on the probability measure is nonlinear, constraint sets determined by second-order stochastic dominance, and sets given by mean-risk problems. The last mentioned instance means that the constraint set corresponds to solutions guaranteeing acceptable values of both criteria. To obtain relevant assertions, we employ stability results based on the Wasserstein metric and the \( {{\mathcal {L}}}_{1} \) norm. We mainly focus on the case in which a solution has to be obtained on the basis of the data, investigating the relationship between the original problem and its empirical version.
1 Introduction
To introduce a stochastic static one-objective optimization problem with a nonlinear dependence on a probability measure, let \( (\Omega , {{\mathcal {S}}}, P) \) be a probability space; \(\xi \, (:= \xi (\omega ) = (\xi _{1}(\omega ), \ldots , \xi _{s}(\omega )) ) \) an s-dimensional random vector defined on \( (\Omega , {{\mathcal {S}}},P);\) \( F (:= F_{\xi }(z), z \in R^{s}) \) the distribution function of \( \xi ;\) \(P_{F}, \, Z_{F} \) the probability measure and the support corresponding to F; \( X_{F} \subset X \subset R^{n} \) a nonempty set generally depending on F; and \(X \subset R^{n} \) a nonempty “deterministic” set. If \({\bar{g}}_{0} (:= {\bar{g}}_{0}(x, z, y)) \) is a real-valued function defined on \(R^{n} \times R^{s} \times R^{m_{1}}\) and \( h (:= h(x, z)) = (h_{1}(x, z), \ldots , h_{m_{1}}(x, z))\) is an \(m_{1}\)-dimensional vector function defined on \(R^{n} \times R^{s},\) then a stochastic static one-objective optimization problem with a nonlinear dependence on the probability measure can, in a rather general setting, be introduced in the following form:
Find
$$\begin{aligned} {\bar{\varphi }}(F, \, X_{F}) := \min \limits _{x \in X_{F}} \textsf{E}_{F}\, {\bar{g}}_{0}(x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )). \end{aligned}$$(1)
We assume that the objective function generally depends on the probability measure in a nonlinear way, and we consider a few types of constraint sets. In all such instances, however, we consider the situation in which, in applications, the underlying probability measure \(P_{F} \) is replaced by an empirical measure \(P_{F^{N}}\) and the distribution function F by its empirical version \(F^{N},\) determined by a random sample (generally, not necessarily independent) corresponding to the measure \(P_{F}.\) In other words, the following empirical problem is often solved instead of Problem (1):
Find
$$\begin{aligned} {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) := \min \limits _{x \in X_{F^{N}}} \textsf{E}_{F^{N}}\, {\bar{g}}_{0}(x, \, \xi , \, \textsf{E}_{F^{N}} h(x, \, \xi )). \end{aligned}$$(2)
The aim of this paper is to investigate the properties of these empirical estimates of (1) in the cases of both light- and heavy-tailed distributions. We shall evaluate the quality of these approximations by almost-sure convergence and by the rate of this convergence.
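To make the passage from (1) to (2) concrete, the following minimal sketch (in Python, with hypothetical choices of \(g_{0}\), h and \({\bar{g}}_{0}\) that are not taken from this paper) builds the empirical compound objective \( \textsf{E}_{F^{N}} {\bar{g}}_{0}(x, \xi , \textsf{E}_{F^{N}} h(x, \xi )) \) for a mean-variance criterion and minimizes it over a deterministic constraint set:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
xi = rng.normal(loc=1.0, scale=0.5, size=(1000, 2))   # sample xi^1, ..., xi^N

# Hypothetical illustration: g0(x, z) = x . z, h := g0 and
# gbar0(x, z, y) = -g0(x, z) + (g0(x, z) - y)^2, so that
# E gbar0(x, xi, E h(x, xi)) = -E[x . xi] + var(x . xi) depends on the
# probability measure nonlinearly (through the inner expectation).
def empirical_objective(x, sample):
    returns = sample @ x               # g0(x, xi^i), i = 1, ..., N
    y = returns.mean()                 # inner empirical mean E_{F^N} h(x, xi)
    return -y + ((returns - y) ** 2).mean()

# Deterministic constraint set X: the standard simplex in R^2.
res = minimize(empirical_objective, x0=np.array([0.5, 0.5]), args=(xi,),
               bounds=[(0.0, 1.0)] * 2,
               constraints=({"type": "eq", "fun": lambda x: x.sum() - 1.0},))
print(res.x, res.fun)                  # empirical optimal solution and value
```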
Optimization problems with nonlinear dependence on the probability measure correspond to many situations encountered in practice; that is why they have begun to appear in the stochastic programming literature. Problem (1) has appeared in [7], where the constraint set was assumed to depend nonlinearly on a probability measure. More precisely, first, a few “underlying” well-known cases leading to problems with the nonlinear dependence were recalled. Further, sufficient assumptions were introduced for the convergence of the empirical estimates to the theoretical values in mean and almost surely, as well as for the convergence rates to the optimal value and optimal solutions. However, those results were published and proved only for deterministic constraint sets. They are based on the former results of V. Norkin and Yu. Ermoliev (see, e.g., [4,5,6,7, 26, 27]) and the properties of Rademacher averages; the case of heavy tails was omitted there. The statistical estimates of composite functionals with potentially nonlinear dependence on the probability measure were also investigated in [2]. The authors had evidently been motivated by applications in finance with uncertainty and risk. They demonstrated theoretical results on semi-deviations and the Average Value at Risk. Works [19, 20, 22] also deal with Problem (1). The considerations of the nonlinear dependence started in [19] and [20] for constraint sets both deterministic and given by second-order stochastic dominance, with the risk measure considered in the form of the variance and the Average Value at Risk. The paper [22] mentioned other types of constraint sets; however, only a few results were given there, without proofs and mostly very briefly. The aim of this paper is to generalize those results and to present their proofs.
This paper is organized as follows. Exact definitions of the considered constraint sets are given in Sect. 2. Section 3 is devoted to definitions and auxiliary assertions: properties of the stability based on the Wasserstein metric and the \({{\mathcal {L}}}_{1} \) norm are recalled and investigated in Sect. 3.1, applications of the stability results to empirical distribution functions can be found in Sect. 3.2, and some other auxiliary assertions in Sect. 3.3. The main results of the paper are formulated in Sect. 4. Section 4.1 deals with deterministic constraint sets, Sect. 4.2 with constraint sets given by a function depending nonlinearly on the probability measure, Sect. 4.3 with second-order stochastic dominance constraint sets, and Sect. 4.4 with mean-risk problem constraints. Conclusions can be found in Sect. 5.
2 Preliminaries
In this section, we define the types of the constraint set \(X_{F}\) considered below:
It is, of course, assumed that all mathematical expectations in (1) and (3) exist and that they are finite.
Problem (1) with a deterministic constraint set covers a classical problem with linear dependence on the probability measure in the following form:
Find
$$\begin{aligned} \varphi (F, \, X_{F}) := \min \limits _{x \in X_{F}} \textsf{E}_{F}\, g_{0}(x, \, \xi ) \end{aligned}$$(4)
with \( {\bar{g}}_{0} (x, \, z, \, y ) : = g_{0} (x, \, z),\, X_{F} = X; \) \(g_{0} (:= g_{0} (x, z)) \) a real-valued function defined on \(R^{n} \times R^{s}.\)
To introduce the next type of \(X_{F}\), we first have to recall the notion of second-order stochastic dominance. If \( V \,:= \, V(\xi ), \, Y \,:= \, Y(\xi ) \) are random values for which there exist finite \( \textsf{E}_{F} V(\xi ), \, \, \textsf{E}_{F} Y(\xi )\) and if
$$\begin{aligned} F^{2}_{V(\xi )} (u) = \int \limits _{-\infty }^{u} F_{V(\xi )}(z)\, dz, \quad F^{2}_{Y(\xi )} (u) = \int \limits _{-\infty }^{u} F_{Y(\xi )}(z)\, dz, \quad u \in R^{1}, \end{aligned}$$
then \( \, V(\xi )\) dominates \(Y(\xi ) \, \) in the second order (\( V(\xi ) \, \succeq _{2} \, Y(\xi ) \)) if
$$\begin{aligned} F^{2}_{V(\xi )} (u) \le F^{2}_{Y(\xi )} (u) \quad \mathrm{for~every} \quad u \in R^{1}. \end{aligned}$$
To define a second-order stochastic dominance constraint set \(X_{F}, \) let \(g(x, \, z) \) be defined on \(R^{n} \times R^{s} \) and let \(g(x, \, \xi )\), for every \(x \in X \), be a random variable with a distribution function \(F_{g(x, \, \xi ) }.\) Let, moreover, for every \(x \in X\), there exist finite \( \textsf{E}_{F} g (x, \xi ), \, \textsf{E}_{F} Y(\xi ), \) with \(F^{2}_{g(x, \, \xi )}, \, F^{2}_{Y(\xi )}\) defined as above; then a general second-order stochastic dominance constraint set \(X_{F}\) can be defined by
c.
$$\begin{aligned} X_{F} \, = \, \{x \in X: F_{g(x, \, \xi )}^{2} (u) \le F_{Y(\xi )}^{2}(u) \quad \mathrm{for~every } \quad u \in R^{1} \}, \end{aligned}$$(5)
or, equivalently, by
$$\begin{aligned} X_{F} \, = \, \{x \in X: \textsf{E}_{F} (u - g(x, \, \xi ))^{+} \le \textsf{E}_{F} (u - Y(\xi ))^{+} \quad \mathrm{for~every } \quad u \in R^{1} \}. \end{aligned}$$(6)
The mutual equivalence of the constraint sets (5) and (6) can be found in [28]. For definitions of the stochastic dominance of other orders see, e.g., [24].
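For empirical distributions, the membership test behind (6) reduces to finitely many inequalities: the function \(u \mapsto \textsf{E}_{F^{N}}(u - g(x, \xi ))^{+} - \textsf{E}_{F^{N}}(u - Y(\xi ))^{+}\) is piecewise linear with breakpoints at the sample points, so it suffices to test u at those points. A small sketch (Python; the samples are hypothetical):

```python
import numpy as np

def ssd_dominates(v, y, tol=1e-12):
    """Empirical second-order dominance test v >=_2 y, i.e.
    mean((u - v)^+) <= mean((u - y)^+) for every u; since both sides are
    piecewise linear in u, testing u at the pooled sample points suffices."""
    for u in np.concatenate([v, y]):
        if np.maximum(u - v, 0.0).mean() > np.maximum(u - y, 0.0).mean() + tol:
            return False
    return True

rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, 500)        # benchmark sample Y(xi)
print(ssd_dominates(y + 0.3, y))     # True: a shifted copy dominates
print(ssd_dominates(y - 0.3, y))     # False
```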
To introduce the last considered type of the set \(X_{F}\), we start with a classical mean-risk problem:
Find
$$\begin{aligned} \max \limits _{x \in X} \textsf{E}_{F}\, g_{0}(x, \, \xi ) \quad \mathrm{and,~simultaneously,} \quad \min \limits _{x \in X} \rho _{F}(g_{0}(x, \, \xi )), \end{aligned}$$(7)
where \(\rho _{F}\) denotes a risk measure.
Evidently, to simultaneously optimize both of these objectives is mostly impossible. However, it can happen that there exist two real-valued acceptable constants \(\nu _{2}, \, \nu _{1} \) and \(x_{0} \in X \) such that
If, furthermore, the function \(g_{0}\) and the risk measure \(\rho _{F}\) follow this definition:
Definition 1
[9] The mean-risk model (7) is called consistent with the second-order stochastic dominance if, for every \(x \in X\) and \(x' \in X\), it holds that
then we can set
d.
\( X_{F} := X_{F} (x_{0}) \) with
$$\begin{aligned} X_{F}(x_{0}) \, = \{ x \in X: \, \textsf{E}_{F} (u - g_{0} (x, \, \xi ))^{+} \le \textsf{E}_{F} (u - g_{0} (x_{0}, \, \xi ))^{+} \quad \mathrm{for~every} \quad u \in R^{1} \}. \end{aligned}$$(10)
In the case of \( X_{F}(x_{0} ) = \{ x_{ 0} \}\), it holds true that \(x = x_{0}\) is a solution of the following problem:
3 Definitions and auxiliary assertions
First we recall and prove some auxiliary assertions suitable for investigating empirical estimates of problems (1) and (4) with the corresponding constraint sets. To this end, denote by \( {\mathcal {P}}(R^{s})\) the set of all Borel probability measures on \(R^{s}\) and let the system \({{\mathcal {M}}}_{1}^{1}(R^{s}) \) be defined by
$$\begin{aligned} {{\mathcal {M}}}_{1}^{1}(R^{s}) = \left\{ P \in {\mathcal {P}}(R^{s}): \int \limits _{R^{s}} \Vert z \Vert _{1}\, dP(z) < \infty \right\} , \end{aligned}$$
where \(\Vert \cdot \Vert _{1}\) denotes the \({{\mathcal {L}}}_{1}\) norm in \(R^{s}.\)
3.1 Stability
Problems (1) and (4) depend on the distribution function F. Replacing F by another s-dimensional distribution function G, we obtain a modified problem. If the new problem is well defined and if we denote the corresponding optimal values by \({\bar{\varphi }}(G, X_{G}), \, {\bar{\varphi }}(G, X_{F}), \varphi (G, X_{G}), \, \varphi (G, X_{F}), \, \,\) then evidently
If we denote the sets of the optimal solutions corresponding to problems (1) and (4) by \( \bar{{{\mathcal {X}}} } ( F, \, X_{F}), \) \( \bar{{{\mathcal {X}}} } ( G, \, X_{G}), \,\) \( \bar{{{\mathcal {X}}} } ( G, \, X_{F}); \, \) \( \, {{\mathcal {X}}} ( F, \, X_{F}), \, { {\mathcal {X}}} ( G, \, X_{G}) , \, {{\mathcal {X}}} ( G, \, X_{F}), \) and if all these solution sets are singletons, then also
Here \( \Vert \cdot \Vert \, := \Vert \cdot \Vert _{n} \) denotes the Euclidean norm in \(R^{n}.\)
We introduce the following sets of assumptions:

A.1
1. \(g_0(x, z)\) is, for \(x \in X\), a Lipschitz function of \(z \in R^{s}\) with the Lipschitz constant L (corresponding to the \({{\mathcal {L}}}_{1}\) norm) not depending on x;
2. \(g_0(x, z)\) is either a uniformly continuous function on \(X \times R^{s}\), or X is a convex set and there exists \( \varepsilon >0\) such that \(g_{0} (x, \, z ) \) is, for \( z \, \in \, R^{s}, \) a bounded convex function on \(X(\varepsilon );\) \(X(\varepsilon ) \) denotes the \(\varepsilon \)-neighborhood of the set \(\, X\).

B.1
\(P_{F}, \, P_{G} \in {{\mathcal {M}}}_{1}^{1}(R^{s}),\) and there exist \(\varepsilon >0 \) and a neighborhood \( X(\varepsilon ) \) such that
1. \({\bar{g}}_{0}(x, \, z, \, y) \) is, for \(x \in X(\varepsilon ), \, z \in R^{s},\) a Lipschitz function of \(y \in Y (\varepsilon ) \) with the Lipschitz constant \(L_{y}\), where \( Y(\varepsilon ) = \{y \in R^{m_{1}}: y = h(x, \, z) \quad \mathrm{for~some} \quad x \in X(\varepsilon ) , \, z \in R^{s} \}\) and \(\textsf{E}_{F}h(x, \, \xi ), \, \) \( \, \textsf{E}_{G}h(x, \, \xi ) \in Y(\varepsilon );\)
2. for every \( x \in X(\varepsilon ) \) and every \( y \in Y(\varepsilon ) \) there exist finite mathematical expectations \( \textsf{E}_{F} {\bar{g}}_{0} (x, \, \xi , \, y ) , \) \( \textsf{E}_{G} {\bar{g}}_{0} (x, \, \xi , \, y );\)
3. \(h_{j}(x, \, z), \, j = 1, \, \ldots , \, m_{1},\) are, for every \(x \in X(\varepsilon ),\) Lipschitz functions of z with the Lipschitz constants \(L^{j}_{h}\) (corresponding to the \({{\mathcal {L}}}_{1} \) norm);
4. \({\bar{g}}_{0}(x, \, z, \, y) \) is, for every \(x \in X(\varepsilon ), \, y \in Y(\varepsilon ),\) a Lipschitz function of \(z \in R^{s}\) with the Lipschitz constant \(L_{z}\) (corresponding to the \({{\mathcal {L}}}_{1} \) norm).

B.2
\( \textsf{E}_{F} {\bar{g}}_{0}(x, \xi , \textsf{E}_{F}h(x, \xi )), \) \( \textsf{E}_{G} {\bar{g}}_{0}(x, \xi , \textsf{E}_{G}h(x, \xi )) \) are continuous functions on X.
Assumption A.1 2. guarantees continuity of \( \textsf{E}_{F} g_{0} (x, \, \xi )\) on X; of course, it could be replaced by another assumption guaranteeing the same property. Further, under Assumption B.1 there evidently exist, for every \(x \in X\), finite \( \textsf{E}_{F} {\bar{g}}_{0} (x, \, \xi , \, \textsf{E}_{F}h(x, \, \xi )), \) \(\textsf{E}_{F} {\bar{g}}_{0} (x, \, \xi , \, \textsf{E}_{G}h(x, \, \xi )), \) \( \textsf{E}_{G} {\bar{g}}_{0} (x, \, \xi , \, \textsf{E}_{F}h(x, \, \xi )), \quad \) \( \textsf{E}_{G} {\bar{g}}_{0} (x, \, \xi , \, \textsf{E}_{G}h(x, \, \xi )). \, \)
If \(F_{i}(z_{i}), G_{i}(z_{i}), \, i = 1, \, \ldots , \, s,\) are the one-dimensional marginal distribution functions corresponding to \(F, \, G,\) then, under some assumptions, we can recall an approach of [37] on the real line and reduce, from the mathematical point of view, the case of an s-dimensional random vector to s cases of one-dimensional random values. Of course, the stochastic dependence between the components of the random vector \( \xi \) is neglected within this approach. The idea of reducing an s-dimensional (\(s >1\)) case to one-dimensional cases is due to G. Ch. Pflug [29], see also [36]. We can formulate our first auxiliary assertion now.
Proposition 1
(stability results based on the Wasserstein metric and the \({{\mathcal {L}}}_{1} \) norm) Let \( P_F, P_G \in {\mathcal {M}}_1^{1}(R^s)\) and let X be a compact set. If
1. Assumption A.1 1. is satisfied, then it holds, for \(x \in X\), that
$$\begin{aligned} |\textsf{E}_{F} g_{0}(x, \xi ) - \textsf{E}_{G} g_{0}(x, \xi )| \, \le L \sum \limits _{i=1}^s \int \limits _{-\infty }^{+\infty } |F_i(z_i) - G_i (z_i)| d z_{i}. \end{aligned}$$(13)
If, moreover, Assumption A.1 2. is satisfied, then, additionally, the following inequality is true:
$$\begin{aligned} |\varphi (F, \, X) - \varphi (G, \, X)| \le L \sum \limits _{i=1}^s \int \limits _{-\infty }^{+\infty } |F_i(z_i) - G_i (z_i)| d z_{i}. \end{aligned}$$(14)
2. Assumptions B.1 are satisfied, then there exists a constant \({\hat{C}} \) such that, for \(x \in X\), it holds that
$$\begin{aligned} \begin{array}{l} | \textsf{E}_{F} {\bar{g}}_{0}(x, \xi , \textsf{E}_{F}h(x, \xi )) \, - \textsf{E}_{G} {\bar{g}}_{0}(x, \xi , \textsf{E}_{G}h(x, \xi ))| \le {\hat{C}} \sum \limits _{i=1}^{s} \int \limits _{-\infty }^{\infty } | F_{i}(z_{i}) - G_{i}(z_{i})| dz_{i},\\ 0 \le {\hat{C}} \le L_{y} \, \sum \limits _{j=1}^{m_{1}} L_{h}^{j} + L_{z}. \end{array} \end{aligned}$$(15)
If, moreover, Assumption B.2 is satisfied, then also
$$\begin{aligned} | {\bar{\varphi }}(F, \, X) - {\bar{\varphi }}(G, \, X)| \le {\hat{C}} \sum \limits _{i=1}^{s} \int \limits _{-\infty }^{\infty } | F_{i}(z_{i}) - G_{i}(z_{i})| dz_{i}. \end{aligned}$$(16)
Proof
The validity of the first part (Assertion 1) is based on the Vallander results [37] (see also, e.g., [17, 20]). The second part of Proposition 1 (Assertion 2) follows from Assertion 1, Assumptions B.1, B.2 and the following inequalities, which are guaranteed by the just mentioned assumptions:
\(\square \)
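For \(s = 1\), the quantity \(\int |F(z) - G(z)|\, dz\) appearing on the right-hand sides of (13)–(16) is exactly the Wasserstein distance based on the \({{\mathcal {L}}}_{1}\) norm. For two empirical distributions with equal sample sizes it reduces to the mean absolute difference of the order statistics, as in the following sketch (Python; the samples are illustrative):

```python
import numpy as np

def wasserstein_1d(u, v):
    """int |F^N - G^N| dz for two empirical distribution functions on R^1
    with equal sample sizes: mean absolute difference of order statistics."""
    assert len(u) == len(v)                    # simplification for the sketch
    return np.abs(np.sort(u) - np.sort(v)).mean()

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 10_000)
b = rng.normal(0.5, 1.0, 10_000)
print(wasserstein_1d(a, b))                    # close to the true shift 0.5
```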
Evidently, employing the assertion of Proposition 1, we can bound the gaps between the optimal values of problems (1) and (4) with different distributions F and G. However, it is reasonable, to this end, first to define the sets \( X_{F}^{\varepsilon } \) for \( \varepsilon \in R^{1}, \) with \( X_{F}^{\varepsilon } := X_{F}^{b, \, \varepsilon } \) and \( X_{F}^{\varepsilon } := X_{F}^{c, \, \varepsilon }, \) respectively:
For \(X_{F}\) defined in Sect. 2, it holds true that \(X_{F} = X_{F}^{b,0}, \, X_{F} = X_{F}^{c,0}, \) respectively.
Replacing \({\bar{g}}_{0} \) by \( {\bar{g}}_{1} \) in (15), we obtain, if \( {\bar{g}}_{1} \) fulfills Assumption B.1, that
and, simultaneously,
Generally, for \( \delta \in R^{1} \) in the case of constraint b. with \(m=1\) and \( {\bar{g}}_{1} \) fulfilling Assumption B.1, we have that
Considering the constraint set c. and employing Formula (6), we can see that the following assertion is valid.
Lemma 1
[20] Let \(g(x, \, z), \, Y(z) \) be, for every \(x \in X\), Lipschitz functions of \(z \in R^{s}\) with the Lipschitz constant \(L_{g},\) corresponding to the \( {{\mathcal {L}}}_{1} \) norm and not depending on \(x \in X;\) then
are Lipschitz functions of \( z \in \, R^{s} \) with the Lipschitz constants \( L_{g} \) corresponding to the \( {{\mathcal {L}}}_{1} \) norm and not depending on \(x \in X, \, u \in R^{1}. \)
Further, let \(P_{F}, \, P_{G} \in {{\mathcal {M}}}^{1}_{1} (R^{s}). \) If \(X_{F} \) is defined by formula (6), then, under the assumptions of Lemma 1 and Proposition 1, it follows that
Consequently, also
Now, employing an approach similar to that employed in formulae (19), (20), (21), we can, for the case of constraint set c. and under the assumptions of Lemma 1, obtain that
3.2 Stability applications to empirical distribution functions
The aim of this paper is to focus on investigating the empirical estimates of the optimal value and the optimal solutions of Problem (1) in the case of light- and heavy-tailed distributions. To follow this approach, we recall that the distribution function \(F_{\eta } \) of a random value \(\eta \) defined on \((\Omega , \, {{\mathcal {S}}}, \, P)\) has light tails if the corresponding moment-generating function is finite in a neighborhood of zero, and \(\eta \) has a distribution function with heavy tails if a finite moment-generating function does not exist; for the definition of the moment-generating function see, e.g., [8]. Consequently, a light-tailed distribution has all of its moments finite (e.g., the normal and exponential distributions). Conversely, however, the finiteness of all moments does not guarantee a finite moment-generating function (e.g., the log-normal distribution). Other, not exactly equivalent, definitions of light- and heavy-tailed distributions can be found in the literature. However, the stable distributions, with the exception of the normal one, always belong to the class of heavy-tailed distributions.
The stable distributions are characterized by four parameters \( S_{\alpha } (\sigma , \, \beta ,\mu ). \) A stable distribution is Gaussian if \(\alpha = 2\); for more details about the stable distributions see, e.g., [23]. We only mention here that a finite first moment exists for \(\alpha > 1.\)
Evidently, replacing in Sect. 3.1 the probability measure \(P_{G}\) by an empirical measure \(P_{F^{N}}\), we can employ the stability results to investigate the empirical estimates of Problems (1) and (4). To this end, we introduce the following system of assumptions:
A.2
1. \(\{\xi ^i\}_{i= 1}^{\infty } \) is an independent random sequence corresponding to F,
2. \(F^N\) is an empirical distribution function determined by \(\{\xi ^i\}_{i= 1}^N, N = 1, 2,\dotsc \).

A.3
\(P_{F_i}, \, i = 1, \dotsc , s,\) are absolutely continuous measures w.r.t. the Lebesgue measure on \(R^{1}.\)
According to the aim of this paper and to Proposition 1, it is sufficient to recall the properties of empirical distribution functions just for the case \(s = 1\).
Lemma 2
[34] Let \(s=1,\) \(P_F \in {{\mathcal {M}}}_{1}^{1}(R^{1})\) and Assumption A.2 be fulfilled. Then
$$\begin{aligned} P \left\{ \omega : \int \limits _{-\infty }^{\infty } |F(z) - F^{N}(z)|\, dz \xrightarrow [N\rightarrow \infty ]{} 0 \right\} = 1. \end{aligned}$$
Proposition 2
[10] Let \(s=1, \, t>0, \, r > 2, \, \) Assumptions A.2, A.3 be fulfilled. Let, moreover, \(\xi \) be a random variable with \(\textsf{E}_{F} |\xi |^{r} < \infty .\) If the constant \(\gamma \) fulfills the relation \( 0 \,< \, \gamma < 1/2 - 1/r, \) then
$$\begin{aligned} P \left\{ \omega : N^{\gamma } \int \limits _{-\infty }^{\infty } |F(z) - F^{N}(z)|\, dz > t \right\} \xrightarrow [N\rightarrow \infty ]{} 0. \end{aligned}$$
3.3 Other definitions and auxiliary assertions
In this section, we recall some other definitions and, moreover, we prove certain additional auxiliary assertions. First, we recall the definition of the Hausdorff distance.
Definition 2
[30] If \(X^{'}, \, X^{''} \subset R^{n}\) are two non-empty sets, then the Hausdorff distance \( \Delta [X^{'}, \, X^{''} ] := \Delta _{n}[X^{'}, \, X^{''} ]\) between these sets is defined by
$$\begin{aligned} \Delta _{n}[X^{'}, \, X^{''}] = \max \left[ \sup \limits _{x^{'} \in X^{'}} \inf \limits _{x^{''} \in X^{''}} \Vert x^{'} - x^{''} \Vert , \ \sup \limits _{x^{''} \in X^{''}} \inf \limits _{x^{'} \in X^{'}} \Vert x^{'} - x^{''} \Vert \right] . \end{aligned}$$
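For finite point sets, the Hausdorff distance of Definition 2 can be computed directly from the pairwise distances; a small sketch (Python, Euclidean norm):

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between finite point sets A, B in R^n:
    max of sup_{a in A} dist(a, B) and sup_{b in B} dist(b, A)."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 0.1], [1.0, 0.0], [3.0, 0.0]])
print(hausdorff(A, B))   # 2.0, driven by the point (3, 0) far from A
```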
Proposition 3
Let X be a nonempty compact set. If
1. \( {\hat{g}}_{0}( := {\hat{g}}_{0} (x), \, x \in R^{n}) \) is a Lipschitz function with the Lipschitz constant L (corresponding to the \(\Vert \cdot \Vert _{n} \) norm),
2. \(X_{F}, \, X_{G} \subset X \) are nonempty compact sets,
then
$$\begin{aligned} | \min \limits _{x \in X_{F}} {\hat{g}}_{0}(x) - \min \limits _{x \in X_{G}} {\hat{g}}_{0}(x) | \le L \, \Delta _{n}[X_{F}, \, X_{G}]. \end{aligned}$$
Proof
The assertion of Proposition 3 is just a moderately modified assertion of Proposition 1 in [16]. \(\square \)
Corollary 1
Let \( X \subset R^{n}.\, \) Let, moreover, \( P_{F} \in {{\mathcal {M}}}_{1}^{1}(R^{s}), \, g_{0}(x, \, z) \) be a function defined on \(R^{n} \times R^{s}.\) If
1. for every \(x \in X \) there exists a finite \( \textsf{E}_{F} g_{0} (x, \, \xi ), \)
2. \(g_{0}(x, \, z ) \) is a Lipschitz function on X with the Lipschitz constant L (corresponding to the \(\Vert \cdot \Vert _{n}\) norm) not depending on \(z \in Z_{F},\)
3. Assumption A.1 2. is satisfied,
4. \(X_{F}, \, X_{G} \subset X \) are nonempty compact sets,
then
$$\begin{aligned} | \min \limits _{x \in X_{F}} \textsf{E}_{F}\, g_{0}(x, \, \xi ) - \min \limits _{x \in X_{G}} \textsf{E}_{F}\, g_{0}(x, \, \xi ) | \le L \, \Delta _{n}[X_{F}, \, X_{G}]. \end{aligned}$$
Further, we recall a notion of a strongly convex function.
Definition 3
Let \( {\hat{h}}(x)\) be a real-valued function defined on a nonempty convex set \({{\mathcal {K}}} \subset R^{n}\). \({\hat{h}}(x)\) is a strongly convex function on \({{\mathcal {K}}}\) with a parameter \( {\bar{\rho }} >0 \) if
$$\begin{aligned} {\hat{h}}(\lambda x^{1} + (1 - \lambda ) x^{2}) \le \lambda {\hat{h}}(x^{1}) + (1 - \lambda ) {\hat{h}}(x^{2}) - \lambda (1 - \lambda ) {\bar{\rho }} \Vert x^{1} - x^{2} \Vert ^{2} \end{aligned}$$
for every \(x^{1}, x^{2} \in {{\mathcal {K}}}\) and every \(\lambda \in \langle 0, \, 1 \rangle .\)
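For instance, \({\hat{h}}(x) = \Vert x \Vert ^{2}\) is strongly convex on every convex \({{\mathcal {K}}} \subset R^{n}\) with the parameter \({\bar{\rho }} = 1\): a direct computation gives the identity
$$\begin{aligned} \Vert \lambda x^{1} + (1 - \lambda ) x^{2} \Vert ^{2} = \lambda \Vert x^{1} \Vert ^{2} + (1 - \lambda ) \Vert x^{2} \Vert ^{2} - \lambda (1 - \lambda ) \Vert x^{1} - x^{2} \Vert ^{2}, \end{aligned}$$
so the defining inequality even holds with equality.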
Proposition 4
[14] Let \({{\mathcal {K}}} \subset R^{n} \) be a nonempty, compact, convex set. Let, moreover, \({\hat{h}}(x) \) be a continuous, strongly convex with a parameter \( {\bar{\rho }} >0, \) real-valued function defined on \({{\mathcal {K}}}.\) If \(x_{0} \) is defined by the formula \( x_{0} = \arg \min \limits _{x \in {{\mathcal {K}}}} {\hat{h}} (x), \) then
$$\begin{aligned} {\bar{\rho }} \Vert x - x_{0} \Vert ^{2} \le {\hat{h}}(x) - {\hat{h}}(x_{0}) \quad \mathrm{for~every} \quad x \in {{\mathcal {K}}}. \end{aligned}$$
Lemma 3
(generalization of Proposition 4) Let \({\mathcal {K}} \subset R^{n} \) be a nonempty convex compact set. If
1. \({\hat{h}}(x) \) is a strongly convex continuous function on \( {{\mathcal {K}}} \) with a parameter \( {\bar{\rho }} > 0,\) \( x_{0} = \arg \min \limits _{x \in {{\mathcal {K}}}} {\hat{h}} (x), \, \)
2. \( {{\mathcal {K}}}^{\varepsilon } \, = \, \{ x \in {{\mathcal {K}}}: {\hat{h}}(x) \le \varepsilon \},\) \(\varepsilon \ge {\hat{h}} ( x_{0} ),\)
3. for \( \varepsilon _{1}, \, \varepsilon _{2} > {\hat{h}} (x_{0}), \, \, \varepsilon _{1} < \varepsilon _{2},\) it holds that
- there exists \(x_{2} \in {{\mathcal {K}}}^{\varepsilon _{2}} \) such that \( {\hat{h}}(x_{2} ) = \varepsilon _{2},\)
- for every \( x_{2} \in { {\mathcal {K}} }^{\varepsilon _{2}}\) with \( {\hat{h}}(x_{2}) = \varepsilon _{2} \) there exists a projection \( x_{1} := x_{1}(x_{2}) \) on \( {{\mathcal {K}}}^{\varepsilon _{1}}\) such that \( {\hat{h}}(x_{1} ) = \varepsilon _{1}, \)
- \( {\hat{h}}(x(\lambda )) \) is an increasing function of \( \lambda \in \langle 0, \, 1 \rangle , \quad x(\lambda ) = \lambda x_{2} + (1 - \lambda ) x_{1}, \)
then
Proof
Since, according to the assumptions, it holds that \( {{\mathcal {K}}}^{\varepsilon _{1}} \subset {\mathcal {K}}^{\varepsilon _{2}}\), we can restrict our consideration to \(x_{2} \in {{\mathcal {K}}}^{\varepsilon _{2}} \, \) such that \( {\hat{h}}(x_{2} ) = \varepsilon _{2}. \) It follows from the assumptions that, for \( x_{1} = x_{1}(x_{2})\), it holds successively
As, according to Assumption 3, it holds that
we can see that
Consequently also
\(\square \)
4 Results
Now we are ready to introduce our results concerning Problem (1). In detail, we try to establish a relationship between the original Problem (1) and the modified Problem (2) and to evaluate the quality of this approximation by consistency properties and by investigating the convergence rates. Research in this direction began in [3, 13]; the results were later extended to more general tails, see, e.g., [20]. We now present our results according to the types of the constraint sets.
4.1 Deterministic constraint sets
Considering Problem (1) with a deterministic constraint set X, we can recall the following assertions.
Theorem 1
[20] Let \(P_{F} \in {\mathcal {M}}_{1}^{1} (R^{s}), \, X\) be a nonempty compact set. If Assumptions B.1, B.2, A.2 are fulfilled, then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X) \} = 1. \end{aligned}$$
Proof
Replacing G by \(F^{N} \) in Assertion 2 of Proposition 1, we obtain
$$\begin{aligned} | {\bar{\varphi }}(F, \, X) - {\bar{\varphi }}(F^{N}, \, X)| \le {\hat{C}} \sum \limits _{i=1}^{s} \int \limits _{-\infty }^{\infty } | F_{i}(z_{i}) - F_{i}^{N}(z_{i})| dz_{i}. \end{aligned}$$
Consequently, employing Lemma 2, we get the assertion of Theorem 1.
\(\square \)
It follows from Theorem 1 that \({\bar{\varphi }}(F^N, \, X) \) is a strongly consistent estimate of \({\bar{\varphi }}(F, \, X) \) whenever the first moment of the vector \( \xi \) exists and the other assumptions of Theorem 1 are fulfilled. Consequently, this assertion is valid for a large class of underlying probability measures. However, the result dealing with the convergence rate depends on the existence of finite absolute moments. In more detail, the following has been proved:
Theorem 2
[10, 19] Let \(P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \) \( t >0, \, X\) be a nonempty compact set. If for some \( \, r > 2 \, \) it holds that \( \textsf{E}_{F_{i}} |\xi _{i}|^{r} < + \infty , \quad i = 1, \, \ldots , \,s; \) \(\gamma \) satisfies the relation \( 0 \,< \, \gamma < 1/2 - 1/r\) and if, moreover, Assumptions A.2, A.3, B.1 and B.2 are fulfilled, then
$$\begin{aligned} P \{ \omega : N^{\gamma } | {\bar{\varphi }}(F^{N}, \, X) - {\bar{\varphi }}(F, \, X)| > t \} \xrightarrow [N\rightarrow \infty ]{} 0. \end{aligned}$$
Generally, the value \(\gamma := \gamma (r) \) satisfying Theorem 2 evidently depends on the existence of a finite absolute moment of order r. Moreover, setting
we get
Consequently, according to Theorem 2, any \(\gamma \in (0, \, 1/2) \, \) is admissible not only for distributions with exponential tails but also whenever all absolute moments are finite, as is the case, e.g., for the Weibull and log-normal distributions. If finite moments exist only for some \(r > 2\), then the convergence rate is worse; see, e.g., the case of the Pareto distribution [10].
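The tail effect can be observed numerically. The following sketch (Python; the grid, the distributions and the truncation of the integration domain are illustrative choices, not taken from the paper) estimates \(\int |F(z) - F^{N}(z)|\, dz\) for a light-tailed exponential distribution and a heavy-tailed Pareto distribution with \(\textsf{E} |\xi |^{r} < \infty \) only for \(r < 2.5\):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
grid = np.linspace(0.0, 200.0, 200_001)
dz = grid[1] - grid[0]

def l1_dist(sample, cdf_on_grid):
    """Riemann-sum approximation of int |F(z) - F^N(z)| dz on the grid."""
    FN = np.searchsorted(np.sort(sample), grid, side="right") / len(sample)
    return np.abs(cdf_on_grid - FN).sum() * dz

F_exp = stats.expon.cdf(grid)
F_par = stats.pareto.cdf(grid, b=2.5)
for N in (100, 1_000, 10_000):
    light = np.mean([l1_dist(rng.exponential(1.0, N), F_exp)
                     for _ in range(20)])
    heavy = np.mean([l1_dist(stats.pareto.rvs(b=2.5, size=N, random_state=rng),
                             F_par) for _ in range(20)])
    # the light-tailed error decays roughly like N^{-1/2};
    # the heavy-tailed one decays more slowly
    print(N, light, heavy)
```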
4.2 Constraint sets depending nonlinearly on a probability measure
First, we prove an auxiliary assertion after introducing the following system of assumptions:
C.1
- \( {\bar{g}}_{0} (x, \, z, \, y) \, \) is, for every \(z \in R^{s} \) and \(y \in Y(\varepsilon ), \, \) a Lipschitz function of \(x \in X\) with the Lipschitz constant \(L_{C} \) not depending on \(z \in R^{s}, \, y \in Y(\varepsilon );\) \( \varepsilon >0, \, Y(\varepsilon ) \) determined in B.1,
- \(h_{j} (x, \, z), \, j = 1, \, \ldots , \, m_{1}, \) are, for every \(z \in R^{s}\), Lipschitz functions on X with the Lipschitz constant \(L_{C}^{h}\) not depending on \( z \in Z_{F}.\)
Lemma 4
(auxiliary assertion) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1}(R^{s}), \, \) X be a nonempty compact set and Assumptions B.1, B.2, C.1 be fulfilled. Then
$$\begin{aligned} \textsf{E}_{F}\, {\bar{g}}_{0}(x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) \end{aligned}$$
is a Lipschitz function of \(x \in X \) with the Lipschitz constant not greater than \( 2 \max [ L_{C} , \, m_{1} L_{y} L_{C}^{h}] .\)
Proof
If \( x, \, x^{'} \in X, \) are arbitrary, then
Consequently, the assertion of Lemma 4 is valid. \(\square \)
Now, we can state the following result.
Theorem 3
(general assertion concerning optimal value estimates) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \) \( X \subset R^{n} \) be a nonempty compact set, \(X_{F} \) be given by the constraint set b. with \(m = 1.\) If
1. \( X_{F}, \, X_{F^{N}}, \, N =1, \, 2, \, \ldots \ \) are nonempty compact sets,
2. Assumptions B.1, B.2, C.1, A.2 are fulfilled,
3. \( P_{F} \{ \omega : \, \Delta [ X_{F^{N}}, \, X_{F} ] \xrightarrow [N\rightarrow \infty ]{} 0 \} = 1,\)
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}) \} = 1. \end{aligned}$$
Proof
First, according to the triangle inequality, we obtain
$$\begin{aligned} | {\bar{\varphi }}(F, \, X_{F}) - {\bar{\varphi }}(F^{N}, \, X_{F^{N}})| \le | {\bar{\varphi }}(F, \, X_{F}) - {\bar{\varphi }}(F^{N}, \, X_{F})| + | {\bar{\varphi }}(F^{N}, \, X_{F}) - {\bar{\varphi }}(F^{N}, \, X_{F^{N}})|. \end{aligned}$$(24)
It follows from Theorem 1 (applied with the compact set \(X := X_{F}\)) that
$$\begin{aligned} P \{ \omega : | {\bar{\varphi }}(F, \, X_{F}) - {\bar{\varphi }}(F^{N}, \, X_{F})| \xrightarrow [N\rightarrow \infty ]{} 0 \} = 1. \end{aligned}$$(25)
Since it is easy to see that Lemma 4 and Corollary 1 are valid if we replace F by \(F^{N}, \) we can successively derive that
$$\begin{aligned} \textsf{E}_{F^{N}}\, {\bar{g}}_{0}(x, \, \xi , \, \textsf{E}_{F^{N}} h(x, \, \xi )) \end{aligned}$$
is a Lipschitz function of \(x \in X\) and that, for \( N = 1, \, 2, \, \ldots \), it holds that
The assertion of Theorem 3 now follows from (24), (25), the last formula and Assumption 3. \(\square \)
To introduce the next two assertions, we slightly redefine the set \(X_{F} \) in them.
Theorem 4
(consistency of optimal value empirical estimates under strong convex assumption) Let \( P_{F} \in {\mathcal {M}}_{1}^{1} (R^{s}), \) X be a nonempty compact convex set, \( X_{F} \) be given by the constraint set b. with \( m =1.\) If
1. \( \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi ))\) is a strongly convex, with a parameter \( {\bar{\rho }} > 0,\) function on X fulfilling the assumptions of Lemma 3 (setting \( \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) := {\hat{h}}(x), \quad X:= {{\mathcal {K}}}), \)
2. \( x_{0} \, = \, \arg \min \limits _{x \in X} \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) , \, \varepsilon _{1} > \textsf{E}_{F} {\bar{g}}_{1} (x_{0}, \, \xi , \, \textsf{E}_{F} h(x_{0}, \, \xi )), \)
3. \( X_{F} = X_{F}^{0}: = X_{F}^{\varepsilon _{1}} = \{ x \in X: \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) \le \varepsilon _{1} \},\)
4. \( X_{F}, \, X_{F^{N}}, \, N = 1, \, 2, \, \ldots \) are nonempty compact sets,
5.
- Assumption A.2 is fulfilled,
- \({\bar{g}}_{1} \) fulfills Assumptions B.1, B.2 (setting \( {\bar{g}}_{1} := {{\bar{g}}}_{0}), \)
- \( {\bar{g}}_{0} \) fulfills Assumptions B.1, B.2, C.1,
6. there exists \( {\bar{\varepsilon }}_{0} > 0 \) such that \( X_{F}^{- \varepsilon _{0}} \) is, for \( 0< \varepsilon _{0} < {\bar{\varepsilon }}_{0}, \) a nonempty compact set,
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}) \} = 1. \end{aligned}$$
Proof
First, recalling the triangle inequality (formula (24)), we can see that
Replacing the distribution function G by \(F^{N}\) in (21), we obtain
Since the Hausdorff distance is a metric in the space of nonempty compact subsets of \(R^{n}\) (see, e.g., [31]) and since \(X^{\varepsilon }_{F} \subset X^{2 \varepsilon }_{F} \), we can employ the triangle inequality and, for \(N \, = 1, \, 2, \, \ldots \), obtain that
According to formula (18), Assumption 3 and formula (27), we can see that
Consequently
Further, with the aid of Lemma 3 and Proposition 1 (Assertion 2), we get
and, replacing F by \(F^{N} \) in Lemma 4, we can see that \( \textsf{E}_{F^{N}} {\bar{g}}_{0} (x , \, \xi , \, \textsf{E}_{F^{N}} h(x, \, \xi )) \) is a Lipschitz function of \(x \in X\) with the Lipschitz constant no greater than \( 2 \max [L_{C}, \, m_{1}L_{y}L_{C}^{h}] . \) Hence, employing Corollary 1 for N such that \(\varepsilon (N) < {\bar{\varepsilon }}_{0}\), we get
If it holds that \( {\hat{C}} \sum \limits _{i=1}^{s} \int \limits _{- \infty }^{\infty } | F_{i}(z_{i}) \, - \, F_{i}^{N}(z_{i})| dz_{i} \le {\bar{\varepsilon }}_{0}\) for N, then, according to (26) and Theorem 1, we have
If, moreover, it holds that \(\sum \limits _{i=1}^{s} \int \limits _{- \infty }^{\infty } | F_{i}(z_{i}) \, - \, F_{i}^{N}(z_{i})| dz_{i} \le 1\) for the given N, then also
Now, with the aid of the properties of the probability measure, formula (29), and Lemma 2, we can see that the assertion of Theorem 4 is valid. \(\square \)
The following assertion deals with the convergence rate.
Theorem 5
(rate of convergence of empirical estimates under strong convex assumption) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \, t> 0, \) X be a nonempty compact convex set, \( X_{F} \) be given by the constraint set b. with \( m =1.\) If
1. \( \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi ))\) is a strongly convex function on X with a parameter \( {\bar{\rho }} > 0, \) fulfilling the assumptions of Lemma 3 (setting \( \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) := {\hat{h}}(x), \quad X:= {{\mathcal {K}}}), \)
2. \( x_{0} \, = \, \arg \min \limits _{x \in X} \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) , \, \varepsilon _{1} > \textsf{E}_{F} {\bar{g}}_{1} (x_{0}, \, \xi , \, \textsf{E}_{F} h(x_{0}, \, \xi )) ,\)
3. \( X_{F} = X_{F}^{0} : = X_{F}^{\varepsilon _{1}} = \{ x \in X: \textsf{E}_{F} {\bar{g}}_{1} (x, \, \xi , \, \textsf{E}_{F} h(x, \, \xi )) \le \varepsilon _{1} \},\)
4. \( X_{F}, \, X_{F^{N}}, \, N = 1, \, 2, \, \ldots \), are nonempty compact sets,
5.
- Assumptions A.2, A.3 are fulfilled,
- \({\bar{g}}_{1} \) fulfills Assumptions B.1, B.2 (setting \({\bar{g}}_{1} : = {\bar{g}}_{0} \)),
- \( {\bar{g}}_{0} \) fulfills Assumptions B.1, B.2, C.1,
6. there exists \({\bar{\varepsilon }}_{0} > 0 \) such that \( X_{F}^{ - \varepsilon _{0}} \) is, for \( 0< \varepsilon _{0} < {\bar{\varepsilon }}_{0}, \) a nonempty compact set,
7. there exists \( \, r > 2 \, \) such that \( \textsf{E}_{F_{i}} |\xi _{i}|^{r} < + \infty , \, i = 1, \, 2, \ldots , \, s, \)
8. \(\gamma \) fulfills the relation \( 0 \,< \, \gamma < 1/4 - 1/2r,\)
then
$$\begin{aligned} P \{ \omega : N^{\gamma } | {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) - {\bar{\varphi }}(F, \, X_{F})| > t \} \xrightarrow [N\rightarrow \infty ]{} 0. \end{aligned}$$
Proof
The main idea of the proof of Theorem 5 is similar to that of Theorem 4; however, specific differences do exist. Consequently, we sometimes employ partial results achieved in the proof of Theorem 4, but we simultaneously try to give the proof of Theorem 5 in an understandable form. To this end, we first employ the triangle inequality and see that formula (26) holds; it means that
and, simultaneously, from Theorem 2 it follows that
Since the Hausdorff distance is a metric in the space of compact nonempty subsets of \( R^{n}\) (see, e.g., [31]), we obtain that
Employing the approach used to obtain (28), we can also see that
Replacing F by \(F^{N} \) in Lemma 4, we can also see that \( \textsf{E}_{F^{N}} {\bar{g}}_{0} (x , \, \xi , \, \textsf{E}_{F^{N}} h(x, \, \xi )) \) is a Lipschitz function of \(x \in X\) with the Lipschitz constant no greater than \(2 \max [L_{C}, \, m_{1} L_{y}L_{C}^{h}] . \) So, for N such that \( \varepsilon (N) < {\bar{\varepsilon }}_{0}\), Corollary 1 implies
Consequently, it follows successively from the last formula and from the properties of the probability measure that
Further, formulae (31), (32), Corollary 1, and the properties of the probability measure successively imply
Now it already follows from the last formula, formulae (26), (30), (32), Lemma 2 and Proposition 2 that the assertion of Theorem 5 holds true. \(\square \)
Example 1
Considering the classical mean-risk problem (7) with a nonempty convex compact set X and
we can see that the function
fulfills (under general assumptions, see, e.g., [21]) the conditions defining the constraint set b. with a strongly convex function on the constraint set X. Moreover, if the support \(Z_{F}\) of \(P_{F} \) is a compact set, then
is a Lipschitz function of \(x \in X \) with the Lipschitz constant not depending on \(z \in Z_{F} \) (for more details see, e.g., [19, 21]).
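A numerical sketch of this construction (Python; the data, the choice \(g_{0}(x, z) = \langle x, z\rangle \) and the risk level are hypothetical) minimizes the negative empirical mean return over the empirical counterpart of the constraint set b., here \( \{x \in X: \textsf{E}_{F^{N}} {\bar{g}}_{1}(x, \xi , \textsf{E}_{F^{N}} h(x, \xi )) \le \varepsilon _{1}\} \) with \({\bar{g}}_{1}\) giving the variance of the portfolio return:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
xi = rng.multivariate_normal([0.05, 0.08],
                             [[0.04, 0.01], [0.01, 0.09]], size=2000)

def emp_var(x, sample):          # E_{F^N} gbar1(x, xi, E_{F^N} h(x, xi)):
    return (sample @ x).var()    # variance of x . xi -- strongly convex in x
                                 # when the sample covariance matrix is p.d.

eps1 = 0.05                      # acceptable risk level defining X_{F^N}
res = minimize(lambda x: -(xi @ x).mean(), x0=np.array([0.5, 0.5]),
               bounds=[(0.0, 1.0)] * 2,
               constraints=({"type": "ineq",
                             "fun": lambda x: eps1 - emp_var(x, xi)},
                            {"type": "eq",
                             "fun": lambda x: x.sum() - 1.0}))
print(res.x, -res.fun)           # empirical solution of (2) with the set b.
```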
In the next theorems we omit the assumption of strong convexity of \(\textsf{E}_{F} {\bar{g}}_{1}(x, \, \xi , h(x, \, \xi )) \) on X and return to the original definition of the sets \(X_{F}^{\varepsilon } \) given by formula (18).
Theorem 6
(consistency of optimal value under general conditions) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \) \( X \subset R^{n} \) be a nonempty compact set, \(X_{F} \) correspond to the constraint set b. with \( m = 1 .\) If
1. \(X_{F} , \, X_{F^{N}}, \, N = 1, \, 2, \, \ldots \) are nonempty compact sets,
2.
- Assumption A.2 is fulfilled,
- \( {\bar{g}}_{1}\) fulfills Assumptions B.1, B.2 (setting \({\bar{g}}_{1} : = {\bar{g}}_{0})\),
- \( {\bar{g}}_{0} \) fulfills Assumptions B.1, B.2, C.1,
3. there exists \({\bar{\varepsilon }}_{0}>0 \) such that \(X_{F}^{\varepsilon } \) are nonempty compact sets for every \( \varepsilon \in \langle {-{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle } \) and, moreover, there exists a constant \({\bar{C}} > 0\) such that
$$\begin{aligned} \Delta _{n} [X_{F}^{\varepsilon }, \, \, X_{F}^{\varepsilon ^{'}}] \, \, \le {\bar{C}} |\varepsilon - \varepsilon ^{'}| \quad \textrm{for} \quad \varepsilon , \, \varepsilon ^{'} \in \langle - {\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle , \end{aligned}$$
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}) \} = 1. \end{aligned}$$
Proof
As the main idea of the proof of Theorem 6 is very similar to that of the proof of Theorem 9 (which can be found in the next part), we omit the details here. We only recall the essential inequalities and former results needed to prove the assertion of Theorem 6. In more detail, employing formula (26), the assertions of Theorem 1, Corollary 1 and Proposition 1, we can get, for \( N \in {{\mathcal {N}}}_{0} \) fulfilling the condition \(10 {\hat{C}} \sum \nolimits _{i=1}^{s} \int \nolimits _{- \infty }^{\infty } | F_{i}(z_{i}) \, - \, F_{i}^{N}(z_{i})| dz_{i} \le {\bar{\varepsilon }}_{0}\), that

\({{\mathcal {N}}}_{0}\) denotes the set of all natural numbers.
Consequently, according to Lemma 2, the assertion of Theorem 6 is valid. \(\square \)
Remark 1
Evidently, if Assumption 3 holds for every \( {\bar{\varepsilon }}_{0} \in R^{1}, \) then
is valid also for all \( N \in {{\mathcal {N}}}_{0}.\)
The next assertion deals with the rate of convergence.
Theorem 7
(rate of optimal value empirical estimates under general assumptions) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \, t > 0, \) X be a nonempty compact set. Let, moreover, \( X_{F} \) be given by the constraint set b. with \(m =1.\) If
1. \(X_{F}, \, X_{F^{N}}, \, \, N = 1, \, 2, \ \ldots \) are nonempty compact sets,
2.
- Assumptions A.2, A.3 are fulfilled,
- \( {\bar{g}}_{1}\) fulfills Assumptions B.1, B.2 (setting \({\bar{g}}_{1} := {\bar{g}}_{0}\)),
- \( {\bar{g}}_{0} \) fulfills Assumptions B.1, B.2, C.1,
3. there exists \( \, r > 2 \, \) such that \( \textsf{E}_{F_{i}} |\xi _{i}|^{r} < + \infty , \, i = 1, \, 2, \ldots , \, s, \)
4. \(\gamma \) fulfills the relation \( 0 \,< \, \gamma < 1/2 - 1/r,\)
5. there exists \( {\bar{\varepsilon }}_{0} > 0\) such that \(X_{F}^{\varepsilon } \) are nonempty compact sets for every \(\varepsilon \in \langle -{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle \) and, moreover, there exists a constant \( {\bar{C}} > 0 \) such that
$$\begin{aligned} \Delta [X_{F}^{\varepsilon }, \, X_{F}^{\varepsilon ^{'}}] \, \le \, {\bar{C}} |\varepsilon - \, \varepsilon ^{'} | \quad \mathrm{for~every } \quad \varepsilon , \, \varepsilon ^{'} \, \in \langle -{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle , \end{aligned}$$
then
$$\begin{aligned} P \{ \omega : N^{\gamma } | {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) - {\bar{\varphi }}(F, \, X_{F})| > t \} \xrightarrow [N\rightarrow \infty ]{} 0. \end{aligned}$$
Proof
As the main idea of the proof of Theorem 7 is very similar to that of Theorem 10 (which can be found in the next part) we omit it here. \(\square \)
4.3 Second-order stochastic dominance constraint sets
First, we can evidently also introduce, in this case, a result similar to the assertion of Theorem 3.
Theorem 8
(general assertion) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \) \( X \subset R^{n} \) be a nonempty compact set, \(X_{F} \) correspond to the constraint set c. If
1. \( X_{F}, \, X_{F^{N}}, \, N = 1, \, 2, \, \ldots \) are nonempty compact sets,
2. Assumptions B.1, B.2, C.1, A.2 are fulfilled,
3. \( P_{F} \{ \omega : \, \Delta [ X_{F^{N}}, \, X_{F} ] \xrightarrow [N\rightarrow \infty ]{} 0 \} \, = 1,\)
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}) \} = 1. \end{aligned}$$
Proof
As the proof of Theorem 8 is very similar to that of Theorem 3, it can be omitted. \(\square \)
Theorem 9
(consistency of empirical value estimates) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \) \( X \subset R^{n} \) be a nonempty compact set, \(X_{F} \) correspond to the constraint set c. If
1. \(Y(z), \, g(x, \, z) \) are, for every \(x \in X\), Lipschitz functions of \(z \in R^{s} \) with the Lipschitz constant \(L_{g} \) (corresponding to the \({{\mathcal {L}}}_{1} \) norm) not depending on \( x \in X,\)
2. \( X_{F}, \, X_{F^{N}}, \, N = 1, \, 2, \, \ldots \) are nonempty compact sets,
3. Assumptions B.1, B.2, C.1, A.2 are fulfilled,
4. there exists \({\bar{\varepsilon }}_{0}>0 \) such that \(X_{F}^{\varepsilon }, \) defined by formula (18), are nonempty compact sets for every \( \varepsilon \in \langle {-{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle } \) and, moreover, there exists a constant \({\bar{C}} > 0\) such that
$$\begin{aligned} \Delta _{n} [X_{F}^{\varepsilon }, \, \, X_{F}^{\varepsilon ^{'}}] \, \, \le {\bar{C}} |\varepsilon - \varepsilon ^{'}| \quad \textrm{for} \quad \varepsilon , \varepsilon ^{'} \in \langle - {\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle , \end{aligned}$$
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}) \} = 1. \end{aligned}$$
Proof
Evidently, to prove the assertion of Theorem 9, it is sufficient to verify the validity of Assumption 3 of Theorem 8. To this end, first, replacing G by \( F^{N}\) in (22), we can successively obtain:
where, according to (18),
Since the Hausdorff distance is a metric in the space of compact nonempty subsets of \( R^{n},\) see, e.g., [31], we can employ formulae (35), (36) to obtain that
Now, according to formula (37), Assumption 4 and Lemma 2, we can see that Assumption 3 of Theorem 8 is valid. Consequently, the assertion of Theorem 9 is also valid. \(\square \)
Since here it also holds true, according to (26) and (36), that
we can formulate the following Remark.
Remark 2
Evidently, if Assumption 4 of Theorem 9 holds for every \( {\bar{\varepsilon }}_{0} \in R^{1}, \) then formula
is valid for all \( N \in {{\mathcal {N}}}_{0}.\)
The next assertion deals with the convergence rate.
Theorem 10
(rate of convergence of optimal value empirical estimates) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \, t > 0, \) X be a nonempty, compact set. Let, moreover, \( X_{F} \) be given by constraint set c. If
1. \(Y(z), \, g(x, \, z) \) are, for every \(x \in X\), Lipschitz functions of \(z \in R^{s} \) with the Lipschitz constant \(L_{g} \) not depending on \( x \in X,\)
2. Assumptions B.1, B.2, C.1, A.2, A.3 are fulfilled,
3. there exists \( \, r > 2 \, \) such that \( \textsf{E}_{F_{i}} |\xi _{i}|^{r} < + \infty , \, i = 1, \, 2, \ldots , \, s, \)
4. \(\gamma \) fulfills the relation \( 0 \,< \, \gamma < 1/2 - 1/r,\)
5. \(X_{F}, \, X_{F^{N}}, \, N= 1, \, 2, \, \ldots \) are nonempty compact sets,
6. there exists \( {\bar{\varepsilon }}_{0} > 0\) such that \(X_{F}^{\varepsilon } \) are nonempty compact sets for every \(\varepsilon \in \langle -{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle \) and, moreover, there exists a constant \( {\bar{C}} > 0 \) such that
$$\begin{aligned} \Delta [X_{F}^{\varepsilon }, \, X_{F}^{\varepsilon ^{'}}] \, \le \, {\bar{C}} |\varepsilon - \, \varepsilon ^{'} | \quad \mathrm{for~every } \quad \varepsilon , \, \varepsilon ^{'} \, \in \langle - {\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle , \end{aligned}$$
then
$$\begin{aligned} P \{ \omega : N^{\gamma } | {\bar{\varphi }}(F^{N}, \, X_{F^{N}}) - {\bar{\varphi }}(F, \, X_{F})| > t \} \xrightarrow [N\rightarrow \infty ]{} 0. \end{aligned}$$
Proof
As the main idea of the proof of Theorem 10 is very similar to that of Theorem 4.7 in [20], we only briefly present an outline of the proof of Theorem 10. First, it follows from the triangle inequality, see also formula (26) that
Further, it follows from Theorem 2 that
for \(\gamma \) fulfilling the assumptions of Theorem 10.
Since the Hausdorff distance is a metric in the space of compact nonempty subsets of \( R^{n},\) see, e.g., [31], we can replace G by \(F^{N} \) in formula (22) to obtain that
and, simultaneously,
Replacing F by \(F^{N} \) in Lemma 4, we can see that \( \textsf{E}_{F^{N}} {\bar{g}}_{0} (x , \, \xi , \, \textsf{E}_{F^{N}} h(x, \, \xi )) \, \) is a Lipschitz function of \(x \in X\) with the Lipschitz constant no greater than \(2 \max [L_{C}, \, m_{1}L_{y}L_{C}^{h}].\) Consequently we can obtain
It successively follows from the last formula and from the properties of the probability measure that
Now, we can already see that the assertion of Theorem 10 follows from formulae (38), (39), (40), (41), Lemma 2, Proposition 2, Corollary 1 and the last formula. We have
\(\square \)
4.4 Constraint sets determined by mean-risk problem
It remains to deal with an analysis of Problem (1) with the constraint set \(X_{F}\) corresponding to the case d. To this end, let the constants \( \nu _{1}, \, \nu _{2} \) and \(x_{0} \in X \) satisfy formula (8). Further, for an \(N \in {{\mathcal {N}}}_{0}, \) let \(x_{0}^{N} \in X \) and \( X_{F^{N}} (x_{0}^{N}) \) satisfy the formulae
We introduce the next auxiliary assertion.
Proposition 5
(auxiliary assertion) Let the mean-risk model (7) be consistent with the second-order stochastic dominance, \( P_{F}, \, P_{G} \in {{\mathcal {M}}}_{1}^{1} (R^{s}) \), X be a nonempty compact set, Assumption A.1 be fulfilled. If
1.
- \( x_{0}^{F} \in X \) satisfies the formula \( \textsf{E}_{F} g_{0}(x_{0}^{F}, \, \xi ) \, = \nu _{2},\)
- \( x_{0}^{G} \in X \) satisfies the formula \( \textsf{E}_{G} g_{0}(x_{0}^{G}, \, \xi ) \, = \nu _{2},\)
then
$$\begin{aligned} | \textsf{E}_{F}(u - g_{0} (x_{0}^{F} , \, \xi ))^{+} - \textsf{E}_{G}(u - g_{0} (x_{0}^{G} , \, \xi ))^{+} | \, \le \, L \sum \limits _{i=1}^{s} \int \limits _{ - \infty }^{\infty } |F_{i}(z_{i}) - G_{i} (z_{i})| dz_{i} + | \textsf{E}_{F}(u - g_{0} (x_{0}^{F} , \, \xi ))^{+} - \textsf{E}_{F}(u - g_{0} (x_{0}^{G} , \, \xi ))^{+} | \quad \mathrm{for~every } \quad u \in R^{1}, \end{aligned}$$
2. formulae (43) are satisfied for an \(N \in {{\mathcal {N}}}_{0} \) and every \(u \in R^{1},\) then for every \( u \in R^{1}\) and this N it holds that
$$\begin{aligned} | \textsf{E}_{F}(u - g_{0} (x_{0}^{F} , \, \xi ))^{+} - \textsf{E}_{F^{N}} (u - g_{0} (x_{0}^{N} , \, \xi ))^{+} | \, \le \, L \sum \limits _{i=1}^{s} \int \limits _{ - \infty }^{\infty } |F_{i}(z_{i}) - F_{i}^{N} (z_{i})| dz_{i} + |\textsf{E}_{F}(u - g_{0} (x_{0}^{F} , \, \xi ))^{+} - \textsf{E}_{F}(u - g_{0} (x_{0}^{N} , \, \xi ))^{+} |. \end{aligned}$$
Proof
We prove Assertion 1 of Proposition 5. To this end we employ the assertion of Lemma 1, Proposition 1 and the triangle inequality to obtain
Now, we can already see that Assertion 1 of Proposition 5 is valid.
As the proof of Assertion 2 is very similar to that of Assertion 1, we can omit it. \(\square \)
If X is a nonempty compact set, Assumption A.1 is fulfilled and \(P_{F} \in {{\mathcal {M}}}_{1}^{1} (R^{s}), \) then, replacing G by \(F^{N}\) in formula (13), we obtain, for \( N= 1, \, 2, \, \ldots \), that
If, moreover, \( \rho _{F} (g_{0}(x, \, \xi )) \) is a uniformly continuous function on the nonempty compact set X, and if there exists a constant \(C^{1} \, > \, 0\) such that
then, according to formula (14), Lemma 2 and an approach similar to that of formula (14) applied to the risk measure \(\rho _{F}\), it obviously holds that
Consequently, we have introduced assumptions under which we have proved the convergence of \( \textsf{E}_{F^{N}} g_{0} (x_{0}^{N}, \, \xi ) \) to \( \textsf{E}_{F} g_{0} (x_{0}^{F}, \, \xi )\) and of \( \rho _{F^{N}}( g_{0} (x_{0}^{N}, \, \xi )) \) to \( \rho _{F} (g_{0} (x_{0}^{F}, \, \xi ))\) (a.s.). This result is important for us. However, our aim is to find the assumptions under which
To formulate a result similar to the assertion of Theorem 3, we define, first for \( \varepsilon \in R^{1}\), the sets:
Evidently, \( X_{F}^{0} (x_{0}^{F} ) = \, X_{F}(x_{0}^{F}) \) and \( X_{F^{N}}^{0} (x_{0}^{N} ) = X_{F^{N}}(x_{0}^{N}).\) Moreover, \(X_{F}^{\varepsilon } (x_{0}^{F}) \) corresponds to \( X_{F}^{c, \, \varepsilon } \) with \( Y(\xi ) := g_{0} ( x_{0}^{F}, \xi ), \, \varepsilon \in R^{1}.\)
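The empirical membership test for the set \(X_{F^{N}}(x_{0}^{N})\) is analogous to the dominance test for the set c.: the inequality in (10), with F replaced by \(F^{N}\), is piecewise linear in u, so it suffices to check it at the sample points. A sketch (Python; \(g_{0}(x, z) = \langle x, z\rangle \), the data and \(x_{0}^{N}\) are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)
xi = rng.normal([0.05, 0.08], [0.2, 0.3], size=(1000, 2))

def in_XFN(x, x0, sample, tol=1e-12):
    """Does E_{F^N}(u - g0(x, xi))^+ <= E_{F^N}(u - g0(x0, xi))^+ hold for
    all u (set (10) with F^N)? Checking u at the pooled sample suffices."""
    rx, r0 = sample @ x, sample @ x0
    return all(np.maximum(u - rx, 0.0).mean()
               <= np.maximum(u - r0, 0.0).mean() + tol
               for u in np.concatenate([rx, r0]))

x0N = np.array([0.6, 0.4])            # hypothetical x0^N
print(in_XFN(np.array([0.5, 0.5]), x0N, xi))
```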
Theorem 11
(general result) Let \(P_{F} \in {{\mathcal {M}}}_{1}^{1}(R^{s}) \) and let relations (43) be satisfied for \(N \in {{\mathcal {N}}}_{0}. \) If
1. \(X, \, X_{F}(x_{0}^{F}), \, X_{F^{N}}(x_{0}^{N}), \, N = 1, \, 2, \, \ldots \) are nonempty compact sets,
2. \(x_{0}^{F} \) is defined in Proposition 5,
3. \( x_{0}^{N} \, \) and \( X_{F^{N}} (x_{0}^{N}) \) satisfy (43),
4. Assumptions B.1, B.2, C.1, A.1, A.2 are fulfilled,
5. \( P\{ \omega : \Delta [ X_{F}(x_{0}^{F}), \, X_{F^{N}} (x_{0}^{N})] \xrightarrow [N\rightarrow \infty ]{} 0 \} = 1,\)
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}(x_{0}^{N})) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}(x_{0}^{F})) \} = 1. \end{aligned}$$
Proof
As the proof of Theorem 11 is very similar to that of Theorem 3, we can omit its presentation here. \(\square \)
Employing the assertions of Proposition 5, we obtain the following result.
Theorem 12
(consistency of optimal value empirical estimates) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1}(R^{s}), \, \) \( \, X \) be a nonempty compact set, and formula (43) be satisfied. If
1. \(x_{0}^{F} \) is defined in Proposition 5,
2. \( x_{0}^{N} \, \) satisfies formula (43),
3. \( \, X_{F}(x_{0}^{F}), \, X_{F^{N}}(x_{0}^{N}) \) are nonempty compact sets,
4. Assumptions B.1, B.2, C.1, A.1, A.2 are fulfilled,
5. \( P\{\omega : |\textsf{E}_{F} (u - g_{0} (x_{0}^{F}, \, \xi ))^{+} - \textsf{E}_{F} (u - g_{0} (x_{0}^{N}, \, \xi ))^{+}| \xrightarrow [N\rightarrow \infty ]{} 0 \quad \mathrm{for ~ every } \quad u \in R^{1} \} = 1,\)
6. there exists \( {\bar{\varepsilon }}_{0} > 0\) such that \(X_{F}^{\varepsilon } \) are nonempty compact sets for every \(\varepsilon \in \langle -{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle \) and, moreover, there exists a constant \( {\bar{C}} > 0 \) such that
$$\begin{aligned} \Delta [X_{F}^{\varepsilon }, \, X_{F}^{\varepsilon ^{'}}] \, \le \, {\bar{C}} |\varepsilon - \, \varepsilon ^{'} | \quad \mathrm{for~every } \quad \varepsilon , \, \varepsilon ^{'} \, \in \langle - {\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle , \end{aligned}$$
then
$$\begin{aligned} P \{ \omega : {\bar{\varphi }}(F^{N}, \, X_{F^{N}}(x_{0}^{N})) \xrightarrow [N\rightarrow \infty ]{} {\bar{\varphi }}(F, \, X_{F}(x_{0}^{F})) \} = 1. \end{aligned}$$
Proof
To prove the assertion of Theorem 12, it is evidently sufficient to prove the validity of Assumption 5 of Theorem 11. To this end, first, according to the assumptions, it follows from Lemma 1 and Proposition 1 that
The second assertion of Proposition 5 now, for \( u \in R^{1} \) and \( N \in {{\mathcal {N}}}_{0} \) fulfilling assumptions of Proposition 5, implies that
Consequently
with
and, moreover, if Assumption 5 of Theorem 12 is satisfied, then, according to formula (50), Assumption 5 of Theorem 11 is also satisfied and the assertion of Theorem 12 holds. \(\square \)
The next assertion deals with the convergence rate.
Theorem 13
(rate of convergence of optimal value empirical estimates) Let \( P_{F} \in {{\mathcal {M}}}_{1}^{1}(R^{s}), \, \) \( \, X \) be a nonempty compact set, relations (43) be fulfilled, \( t > 0\). If
1. \(x_{0}^{F} \) is defined in Proposition 5,
2. \( x_{0}^{N} \, \) fulfills formula (43),
3. \( \, X_{F}(x_{0}^{F}), \, X_{F^{N}}(x_{0}^{N}), \, N \, = \, 1, \, 2, \, \ldots \) are nonempty compact sets,
4. Assumptions B.1, B.2, C.1, A.1, A.2 are fulfilled,
5. there exists \( r > 2 \) such that \(\textsf{E}_{F_{i}} | \xi _{i}|^{r} < + \infty , \, i =1, \, \ldots , \, s,\)
6. \(\gamma \) fulfills the relation \( 0 \,< \gamma < 1/2 - 1/r, \)
7. \( P\{\omega : |\textsf{E}_{F} (u - g_{0} (x_{0}^{F}, \, \xi ))^{+} - \textsf{E}_{F} (u - g_{0} (x_{0}^{N}, \, \xi ))^{+}| \xrightarrow [N\rightarrow \infty ]{} 0 \quad \mathrm{for ~ every } \quad u \in R^{1} \} = 1,\)
8. \( P\{\omega : N^{\gamma } |\textsf{E}_{F} (u - g_{0}( x_{0}^{F}, \, \xi ))^{+} - \textsf{E}_{F} (u - g_{0}( x_{0}^{N}, \, \xi ))^{+}| > t \} \xrightarrow [N\rightarrow \infty ]{} 0 \quad \mathrm{for ~ every } \quad u \in R^{1},\)
9. there exists \( {\bar{\varepsilon }}_{0} > 0\) such that \(X_{F}^{\varepsilon } \) are nonempty compact sets for every \(\varepsilon \in \langle -{\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle \) and, moreover, there exists a constant \( {\bar{C}} > 0 \) such that
$$\begin{aligned} \Delta [X_{F}^{\varepsilon }, \, X_{F}^{\varepsilon ^{'}}] \, \le \, {\bar{C}} |\varepsilon - \, \varepsilon ^{'} | \quad \mathrm{for~every } \quad \varepsilon , \, \varepsilon ^{'} \, \in \langle - {\bar{\varepsilon }}_{0}, \, {\bar{\varepsilon }}_{0} \rangle , \end{aligned}$$
then
$$\begin{aligned} P \{ \omega : N^{\gamma } | {\bar{\varphi }}(F^{N}, \, X_{F^{N}}(x_{0}^{N})) - {\bar{\varphi }}(F, \, X_{F}(x_{0}^{F}))| > t \} \xrightarrow [N\rightarrow \infty ]{} 0. \end{aligned}$$
Proof
To prove the assertion of Theorem 13 it is possible to employ the idea of the proof of Theorem 10 and Assumption 6 of the last Theorem. Consequently, we can omit the detailed proof. \(\square \)
5 Conclusions
This paper deals with a class of stochastic optimization problems in which the dependence on the “underlying” probability measure can be nonlinear. The nonlinear dependence can appear both in the objective function and in the constraint sets. In detail, the following types of constraint sets are considered: deterministic sets, constraint sets with a nonlinear dependence on the probability measure, constraint sets given by second-order stochastic dominance, and constraint sets given by the mean-risk problem. The properties of strongly convex functions are employed for the second type of constraints in Theorems 4 and 5. In the other theorems, this assumption is omitted. The convergence rate based on strong convexity is worse than that achieved without this assumption. However, the assumption on the Hausdorff distance between the constraint sets has to be utilized in the general case. Problem (1) with constraint set c. generally leads to a semi-infinite optimization problem for which Slater's condition is not necessarily fulfilled. In the literature, a few approaches to this case have appeared, see, e.g., [1, 11, 35]. A detailed discussion of this topic goes beyond the scope of this paper.
The paper focuses on the situations in which the original problem has to be solved on the basis of the data. The achieved results are evaluated by almost-sure convergence and by the rate of this convergence. The presented results cover the case in which the underlying probability measure can have a general support. If a finite support can be assumed, much stronger results can be obtained with the aid of [32]. Evidently, in this case it is possible, for a large sample, to obtain, with a high degree of probability, the solution of the original “theoretical” problem (1). However, a detailed analysis for the corresponding probability measure and the considered problems goes beyond the scope of the paper.
References
Dentcheva, D., Ruszczynski, A.: Optimization with stochastic dominance constraints. SIAM J. Optim. 14, 548–566 (2003)
Dentcheva, D., Penev, S., Ruszczynski, A.: Statistical Estimation of Composite Risk Functionals and Risk Optimization Problems. Cornell University Library, New York (2014)
Dupačová, J., Wets, R.J.-B.: Asymptotic behavior of statistical estimators and of optimal solutions of stochastic optimization problems. Ann. Stat. 16, 1517–1549 (1988)
Ermoliev, Yu.: Models and Methods of Stochastic Programming. Nauka, Moscow (1976). ((in Russian))
Ermoliev, Yu., Norkin, V.: Normalized convergence of random variables and its applications. Cybernetics 26, 912–918 (1990)
Ermoliev, Yu., Norkin, V.: Normalized convergence in stochastic optimization. Ann. Oper. Res. 30(1), 912–918 (1991)
Ermoliev, Yu., Norkin, V.: Sample average approximation method for compound stochastic optimization problems. SIAM J. Optim. 23(4), 2231–2263 (2014)
Gut, A.: Probability: A Graduate Course. Springer, New York (2006)
Gutjahr, W.J., Pichler, A.: Stochastic multi-objective optimization: a survey on non-scalarizing methods. Ann. Oper. Res. 236, 475–499 (2016)
Houda, M., Kaňková, V.: Empirical estimates in economic and financial optimization problems. Bull. Czech Econom. Soc. 29(19), 50–69 (2012)
Hu, J., Homem-de-Mello, T., Mehrotra, S.: Sample average approximation of stochastic dominance constrained programs. Math. Program. Ser. A 133, 171–201 (2012)
Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 1. Wiley, New York (1994)
Kaňková, V.: An approximative solution of a stochastic optimization problem. In: Transactions of the Prague Conference 1978, pp. 327–332. Academia, Prague (1978)
Kaňková, V.: Stability in stochastic programming—the case of unknown location parameter. Kybernetika 29(1), 97–112 (1993)
Kaňková, V.: A note on estimates in stochastic programming. J. Comput. Appl. Math. 56, 97–112 (1994)
Kaňková, V.: On the stability in stochastic programming—the case of individual probability constraints. Kybernetika 33(5), 525–544 (1997)
Kaňková, V., Houda, M.: Empirical processes in stochastic programming. In: Hušková, M., Janžura, M. (eds.) Proceedings of Prague Stochastics 2006, pp. 426–436. MATFYZPRESS, Prague (2006)
Kaňková, V.: Empirical estimates in stochastic optimization via distribution tails. Kybernetika 46(3), 459–471 (2010)
Kaňková, V.: Risk measures in optimization problems via empirical estimates. Czech Econ. Rev. 7(3), 162–177 (2013)
Kaňková, V., Houda, M.: Thin and heavy tails in stochastic programming. Kybernetika 51(3), 433–456 (2015)
Kaňková, V.: A remark on multi-objective stochastic optimization with strongly convex functions. CEJOR 24, 309–333 (2016)
Kaňková, V.: A note on stochastic optimization problems with nonlinear dependence on a probability measure. In: Kapounek, S., Vránová, H. (eds.) Proceedings of the 38th International Conference Mathematical Methods in Economics 2020, pp. 247–252. Mendel University in Brno, Faculty of Business and Economics, Brno (2020)
Klebanov, L.B.: Heavy Tailed Distributions. MATFYZPRESS, Prague (2003)
Levy, H.: Stochastic Dominance (Investment, Decision Making under Uncertainty), 2nd edn. Springer (Printed in the United States) (2006)
Markowitz, H.: Portfolio selection. J. Finance 7, 77–91 (1952)
Norkin, V.I.: Convergence of the empirical mean method in statistics and stochastic programming. Cybernet. Syst. Anal. 28, 253–264 (1992)
Norkin, V.I.: Normalized convergence of random variables. Cybernet. Syst. Anal. 28, 396–402 (1992)
Ogryczak, W., Ruszczynski, A.: From stochastic dominance to mean-risk models: semideviations as risk measures. Eur. J. Oper. Res. 116, 33–50 (1999)
Pflug, G.C.: Scenario tree generation for multiperiod financial optimization by optimal discretization. Math. Program. Ser. B. 89(2), 251–271 (2001)
Rockafellar, R., Wets, R.J.B.: Variational Analysis. Springer, Berlin (1998)
Salinetti, G., Wets, R.J.B.: On the convergence of closed-valued measurable multifunctions. Trans. Am. Math. Soc. 266(1), 275–289 (1981)
Shapiro, A., Homem-de-Mello, T.: On the rate of convergence of optimal solutions of Monte Carlo approximations of stochastic programs. SIAM J. Optim. 11, 70–86 (2000)
Shiryaev, A.N.: Essentials of Stochastic Finance (Facts, Models, Theory). World Scientific, New Jersey (2008)
Shorack, G.R., Wellner, J.A.: Empirical Processes and Applications to Statistics. Wiley, New York (1986)
Sun, H., Xu, R., Meskarian, R., Wang, Y.: Exact penalization, level function method and modified cutting-plane method for stochastic programs with second-order stochastic dominance constraints (To appear)
Šmíd, M.: The expected loss in the discretization of multistage stochastic programming problems—estimation and convergence rate. Ann. Oper. Res. 165(1), 29–45 (2009)
Vallander, S.S.: Calculation of the Wasserstein distance between probability distributions on the line. Theor. Probab. Appl. 18, 784–786 (1973)
Acknowledgements
The paper is an extended version of the author's contribution “A note on Stochastic Optimization Problems with Nonlinear Dependence on a Probability Measure” presented at the 35th International Conference Mathematical Methods in Economics. The author wishes to thank the anonymous referee for the idea of generalizing the original work to a full paper.
Funding
Open access publishing supported by the National Technical Library in Prague. This work was supported by the Czech Science Foundation under Grant No. 18-02739S.
Keywords
- Stochastic optimization problem
- Nonlinear dependence
- Static problems
- Second-order stochastic dominance
- Mean-risk model
- Wasserstein metric
- Stability
- Empirical estimates