Distributionally Robust Two-Stage Stochastic Programming
Distributionally robust optimization is a popular modeling paradigm in which the underlying distribution of the random parameters in a stochastic optimization model is unknown. Therefore, hedging against a range of distributions, properly characterized ...
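For context, a generic two-stage distributionally robust model (written in our notation; the paper's ambiguity set $\mathcal{P}$ is only sketched in the excerpt above) takes the form
$$\min_{x \in X} \; c^\top x + \sup_{P \in \mathcal{P}} \mathbb{E}_P\big[Q(x,\xi)\big], \qquad Q(x,\xi) = \min_{y \ge 0} \big\{ q(\xi)^\top y : W y \ge h(\xi) - T(\xi)\,x \big\},$$
where the inner supremum hedges the second-stage cost against every distribution $P$ admitted by the ambiguity set $\mathcal{P}$.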
Bregman Proximal Point Algorithm Revisited: A New Inexact Version and Its Inertial Variant
We study a general convex optimization problem, which covers various classic problems in different areas and particularly includes many optimal transport related problems arising in recent years. To solve this problem, we revisit the classic Bregman ...
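For reference, the classic Bregman proximal point iteration for $\min_x f(x)$ (the generic scheme, not the paper's inexact or inertial variant) reads
$$x^{k+1} = \operatorname*{arg\,min}_{x} \Big\{ f(x) + \tfrac{1}{\lambda_k} D_h(x, x^k) \Big\}, \qquad D_h(x,y) = h(x) - h(y) - \langle \nabla h(y),\, x - y\rangle,$$
where $h$ is the chosen Bregman kernel; taking $h = \tfrac12\|\cdot\|^2$ recovers the standard proximal point method.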
Golden Ratio Primal-Dual Algorithm with Linesearch
The golden ratio primal-dual algorithm (GRPDA) is a new variant of the classical Arrow--Hurwicz method for solving structured convex optimization problems, in which the objective function consists of the sum of two closed proper convex functions, one of ...
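The structured problem behind Arrow--Hurwicz-type primal-dual methods (a standard template; we assume the second term is composed with a linear operator, which the truncated abstract does not state explicitly) is
$$\min_{x} \; f(x) + g(Ax) \;\Longleftrightarrow\; \min_{x}\max_{y} \; f(x) + \langle Ax, y\rangle - g^*(y),$$
with $f,g$ closed proper convex and $A$ a linear operator; Arrow--Hurwicz-type methods alternate primal and dual updates on this saddle-point reformulation.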
Difference-of-Convex Algorithms for a Class of Sparse Group $\ell_0$ Regularized Optimization Problems
In this paper, we consider a class of sparse group $\ell_0$ regularized optimization problems. First, we give a continuous relaxation model of the considered problem and define a class of stationary points of the relaxation problem. Then, we establish the ...
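As an illustration only (the paper's exact model is not shown in the excerpt), a sparse group $\ell_0$ regularized problem typically combines element-wise and group-wise cardinality terms, e.g.,
$$\min_{x} \; f(x) + \lambda_1 \|x\|_0 + \lambda_2 \sum_{g \in \mathcal{G}} \mathbb{1}\big(\|x_g\| \neq 0\big),$$
where $\mathcal{G}$ partitions the coordinates into groups and $\mathbb{1}(\cdot)$ counts the groups with at least one nonzero entry.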
Sublinear Convergence of a Tamed Stochastic Gradient Descent Method in Hilbert Space
In this paper, we introduce the tamed stochastic gradient descent method (TSGD) for optimization problems. Inspired by the tamed Euler scheme, which is a commonly used method within the context of stochastic differential equations, TSGD is an explicit ...
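The taming idea can be sketched as follows (a plausible analogue of the tamed Euler scheme; the paper's exact TSGD update is not shown in the excerpt): the tamed Euler scheme replaces the drift $b$ by $b/(1 + h\|b\|)$, and the corresponding gradient step would be
$$x_{k+1} = x_k - \gamma_k\, \frac{g_k}{1 + \gamma_k \|g_k\|}, \qquad g_k \approx \nabla f(x_k),$$
so that a single update can never move farther than a fixed bound, however large the stochastic gradient $g_k$ happens to be.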
Effective Scenarios in Multistage Distributionally Robust Optimization with a Focus on Total Variation Distance
We study multistage distributionally robust optimization (DRO) to hedge against ambiguity in quantifying the underlying uncertainty of a problem. Recognizing that not all the realizations and scenario paths might have an “effect” on the optimal value, we ...
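For a discrete nominal distribution $\widehat{P}$, a total variation ambiguity set of the kind referred to here has the standard form (the radius $\varepsilon$ is ours for illustration)
$$\mathcal{P} = \big\{ P : d_{TV}(P, \widehat{P}) \le \varepsilon \big\}, \qquad d_{TV}(P, \widehat{P}) = \tfrac12 \sum_{i} |p_i - \widehat{p}_i|,$$
and, in the spirit of the abstract, scenarios whose removal leaves the worst-case optimal value unchanged are the ones without an "effect."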
On the Benefit of Width for Neural Networks: Disappearance of Basins
Wide networks are often believed to have a nice optimization landscape, but what rigorous results can we prove? To understand the benefit of width, it is important to identify the difference between wide and narrow networks. In this work, we prove that ...
First-Order Methods for Problems with $O(1)$ Functional Constraints Can Have Almost the Same Convergence Rate as for Unconstrained Problems
First-order methods (FOMs) have recently been applied and analyzed for solving problems with complicated functional constraints. Existing works show that FOMs for functional constrained problems have lower-order convergence rates than those for ...
Approximating Min-Mean-Cycle for Low-Diameter Graphs in Near-Optimal Time and Memory
We revisit Min-Mean-Cycle, the classical problem of finding a cycle in a weighted directed graph with minimum mean weight. Despite an extensive algorithmic literature, previous work failed to achieve a near-linear runtime in the number of edges $m$. We ...
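As a point of reference (not the near-linear method developed in the paper), Karp's classical $O(mn)$ dynamic program computes the minimum mean cycle exactly; a minimal sketch, assuming the graph is given as an adjacency list of (head, weight) pairs:

import math

def min_mean_cycle(n, adj):
    """Karp's algorithm: adj[u] = list of (v, w) edges; returns the minimum
    mean edge weight over all directed cycles, or math.inf if the graph is acyclic."""
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v
    d = [[math.inf] * n for _ in range(n + 1)]
    for v in range(n):
        d[0][v] = 0.0                      # walks may start at any vertex
    for k in range(1, n + 1):
        for u in range(n):
            if d[k - 1][u] < math.inf:
                for v, w in adj[u]:
                    d[k][v] = min(d[k][v], d[k - 1][u] + w)
    best = math.inf
    for v in range(n):
        if d[n][v] < math.inf:
            worst = max((d[n][v] - d[k][v]) / (n - k)
                        for k in range(n) if d[k][v] < math.inf)
            best = min(best, worst)
    return best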
Convergence Rates of the Heavy Ball Method for Quasi-strongly Convex Optimization
In this paper, we study the behavior of solutions of the ODE associated to the heavy ball method. Since the pioneering work of B. T. Polyak in 1964, it has been well known that such a scheme is very efficient for $C^2$ strongly convex functions with ...
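For reference, Polyak's heavy ball iteration and the continuous-time ODE associated with it (standard forms, not specific to this paper's analysis) are
$$x_{k+1} = x_k - \alpha \nabla f(x_k) + \beta\,(x_k - x_{k-1}), \qquad \ddot{x}(t) + \gamma\, \dot{x}(t) + \nabla f(x(t)) = 0,$$
where the momentum term $\beta(x_k - x_{k-1})$ plays the role of the inertia $\ddot{x}$ and the damping coefficient $\gamma$ that of the friction induced by taking $\beta < 1$.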
Escaping Unknown Discontinuous Regions in Blackbox Optimization
The design of key nonlinear systems often requires the use of expensive blackbox simulations that exhibit inherent discontinuities whose positions in the variable space cannot be predicted analytically. Without further precautions, the solution of related ...
The Stochastic Auxiliary Problem Principle in Banach Spaces: Measurability and Convergence
The stochastic auxiliary problem principle (APP) algorithm is a general stochastic approximation (SA) scheme that turns the solution of an original convex optimization problem into the iterative solution of a sequence of auxiliary problems. This ...
Stochastic Decomposition Method for Two-Stage Distributionally Robust Linear Optimization
In this paper, we present a sequential sampling-based algorithm for the two-stage distributionally robust linear programming (2-DRLP) models. The 2-DRLP models are defined over a general class of ambiguity sets with discrete or continuous probability ...
Linearly Constrained Nonsmooth Optimization for Training Autoencoders
A regularized minimization model with $l_1$-norm penalty (RP) is introduced for training the autoencoders that belong to a class of two-layer neural networks. We show that the RP can act as an exact penalty model which shares the same global minimizers, ...
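A minimal sketch of the kind of model described (a generic two-layer autoencoder with an $l_1$-norm penalty on the weights; the squared loss, ReLU activation, and weighting $\lambda$ are illustrative choices, not necessarily the exact ingredients of the RP model):

import numpy as np

def rp_objective(W1, W2, X, lam):
    """Two-layer autoencoder reconstruction loss plus an l1-norm penalty
    on the weights (illustrative regularized-penalty objective)."""
    H = np.maximum(W1 @ X, 0.0)          # encoder: ReLU hidden layer
    R = W2 @ H                           # decoder: linear reconstruction
    loss = 0.5 * np.sum((R - X) ** 2)    # squared reconstruction error
    penalty = lam * (np.abs(W1).sum() + np.abs(W2).sum())
    return loss + penalty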
Escaping Strict Saddle Points of the Moreau Envelope in Nonsmooth Optimization
Recent work has shown that stochastically perturbed gradient methods can efficiently escape strict saddle points of smooth functions. We extend this body of work to nonsmooth optimization, by analyzing an inexact analogue of a stochastically perturbed ...
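Recall that the Moreau envelope and proximal map of $f$ (standard definitions) are
$$f_\lambda(x) = \min_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \Big\}, \qquad \operatorname{prox}_{\lambda f}(x) = \operatorname*{arg\,min}_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \Big\},$$
so the envelope is a smoothed surrogate of $f$ whose strict saddle points the perturbed method is designed to escape.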
Anisotropic Diffusion in Consensus-Based Optimization on the Sphere
In this paper, we are concerned with the global minimization of a possibly nonsmooth and nonconvex objective function constrained on the unit hypersphere by means of a multi-agent derivative-free method. The proposed algorithm falls into the class of the ...
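In broad strokes (the generic consensus-based optimization dynamics; the paper's sphere-constrained, anisotropic version modifies both the drift and the diffusion), each agent $X^i$ is attracted toward a weighted consensus point
$$v_\alpha = \frac{\sum_{i} X^i\, e^{-\alpha f(X^i)}}{\sum_{i} e^{-\alpha f(X^i)}},$$
and is additionally perturbed by noise; as $\alpha \to \infty$ the consensus point concentrates near the agent with the smallest objective value.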
Error Bound Characterizations of the Conical Constraint Qualification in Convex Programming
This paper deals with error bound characterizations of the conical constraint qualification (CCQ) for convex inequality systems in a Banach space $X$. We establish necessary and sufficient conditions for a closed convex set $S$ defined by a convex ...
Simple and Optimal Methods for Stochastic Variational Inequalities, I: Operator Extrapolation
In this paper we first present a novel operator extrapolation (OE) method for solving deterministic variational inequality (VI) problems. Similar to the gradient (operator) projection method, OE updates one single search sequence by solving a single ...
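A sketch of the type of update meant by operator extrapolation (our reading of the excerpt; the step size $\gamma_k$ and extrapolation weight $\lambda_k$ are illustrative): for a VI with operator $F$ over a closed convex set $X$,
$$x_{k+1} = \Pi_X\Big( x_k - \gamma_k \big[ F(x_k) + \lambda_k \big( F(x_k) - F(x_{k-1}) \big) \big] \Big),$$
i.e., a single projection per iteration, with the extrapolation applied to the operator values rather than to the iterates themselves.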
From the Ravine Method to the Nesterov Method and Vice Versa: A Dynamical System Perspective
We revisit the Ravine method of Gelfand and Tsetlin from a dynamical system perspective, study its convergence properties, and highlight its similarities and differences with the Nesterov accelerated gradient method. The two methods are closely related. ...
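For reference, the Nesterov accelerated gradient iteration that the Ravine method is compared against (standard form for convex $f$ with step size $s$) is
$$y_k = x_k + \frac{k-1}{k+2}\,(x_k - x_{k-1}), \qquad x_{k+1} = y_k - s\,\nabla f(y_k).$$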
A Hierarchy of Standard Polynomial Programming Formulations for the Maximum Clique Problem
The maximum clique problem is a classical optimization problem which finds many important applications. Some of the most theoretically interesting and computationally successful approaches to this problem are based on the Motzkin--Straus formulation, ...
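The Motzkin--Straus formulation mentioned here expresses the clique number $\omega(G)$ through a quadratic program over the standard simplex:
$$\max_{x \in \Delta_n} \; x^\top A_G\, x = 1 - \frac{1}{\omega(G)}, \qquad \Delta_n = \Big\{ x \ge 0 : \textstyle\sum_i x_i = 1 \Big\},$$
where $A_G$ is the adjacency matrix of $G$; the hierarchy studied in the paper builds on polynomial programming formulations of this type.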
Stochastic Approximation Methods for the Two-Stage Stochastic Linear Complementarity Problem
The two-stage stochastic linear complementarity problem (TSLCP), which can be regarded as a special and important reformulation of two-stage stochastic linear programming, has arisen in various fields, such as stochastic programming, game theory, traffic ...
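For orientation, a (one-stage) linear complementarity problem asks for $x$ satisfying
$$0 \le x \;\perp\; Mx + q \ge 0,$$
i.e., $x \ge 0$, $Mx + q \ge 0$, and $x^\top(Mx + q) = 0$; in the two-stage stochastic version, the data of a second complementarity system depend on a random variable and on the first-stage decision (the precise coupling is not shown in the excerpt).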
Pivot Rules for Circuit-Augmentation Algorithms in Linear Optimization
Circuit-augmentation algorithms are generalizations of the simplex method, where in each step one is allowed to move along a fixed set of directions, called circuits, that is a superset of the edges of a polytope. We show that in the circuit-augmentation ...
Orthogonal Trace-Sum Maximization: Tightness of the Semidefinite Relaxation and Guarantee of Locally Optimal Solutions
This paper studies an optimization problem on the sum of traces of matrix quadratic forms in $m$ semiorthogonal matrices, which can be considered as a generalization of the synchronization of rotations. While the problem is nonconvex, this paper shows ...
A Global Dual Error Bound and Its Application to the Analysis of Linearly Constrained Nonconvex Optimization
Error bound analysis, which estimates the distance of a point to the solution set of an optimization problem using the optimality residual, is a powerful tool for the analysis of first-order optimization algorithms. In this paper, we use global error ...
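In symbols, an error bound of the kind described states that (generic form, with $\tau > 0$ a constant and $r(\cdot)$ the optimality residual)
$$\operatorname{dist}\big(x, X^*\big) \le \tau\, r(x) \qquad \text{for all } x \text{ in some prescribed region},$$
so that a small residual certifies proximity to the solution set $X^*$.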
Amenable Cones Are Particularly Nice
Amenability is a geometric property of convex cones that is stronger than facial exposedness and assists in the study of error bounds for conic feasibility problems. In this paper we establish numerous properties of amenable cones and investigate the ...
Degenerate Preconditioned Proximal Point Algorithms
In this paper we describe a systematic procedure to analyze the convergence of degenerate preconditioned proximal point algorithms. We establish weak convergence results under mild assumptions that can be easily employed in the context of splitting methods ...
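Recall that a preconditioned proximal point iteration for a maximally monotone operator $T$ replaces the identity in the resolvent by a preconditioner $M$, written in its standard implicit form as
$$0 \in T(x_{k+1}) + M\,(x_{k+1} - x_k),$$
where "degenerate" typically means that $M$ is only positive semidefinite rather than positive definite.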
High-Order Optimization Methods for Fully Composite Problems
In this paper, we study a fully composite formulation of convex optimization problems, which includes, as a particular case, the problems with functional constraints, max-type minimization problems, and problems with simple nondifferentiable components. We treat ...
Scaled, Inexact, and Adaptive Generalized FISTA for Strongly Convex Optimization
We consider a variable metric and inexact version of the fast iterative soft-thresholding algorithm (FISTA)-type algorithm considered in [L. Calatroni and A. Chambolle, SIAM J. Optim., 29 (2019), pp. 1772--1798; A. Chambolle and T. Pock, Acta Numer., 25 (...
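For reference, the basic FISTA iteration underlying these variants (constant metric and exact proximal steps; the paper allows a variable metric and inexact computations) is, for the composite problem $\min_x f(x) + g(x)$ with $f$ smooth and $g$ proximable,
$$x_k = \operatorname{prox}_{\tau g}\big(y_k - \tau \nabla f(y_k)\big), \qquad t_{k+1} = \frac{1 + \sqrt{1 + 4t_k^2}}{2}, \qquad y_{k+1} = x_k + \frac{t_k - 1}{t_{k+1}}\,(x_k - x_{k-1}).$$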