Abstract
This paper proposes a multi-scale method for designing a continuous-time distributed algorithm that solves constrained convex optimization problems using multiple agents with Markov-switched network dynamics and noisy inter-agent communications. Unlike most previous work, which emphasizes fixed network topologies, this paper tackles the challenging problem of characterizing the joint effects of stochastic networks and inter-agent communication noise on the distributed optimization dynamics, a question that has not been systematically studied in the literature. Also, in sharp contrast to previous work on constrained optimization, we depart from the projected gradient flow, which is non-smooth and hard to analyze; instead, we design smooth optimization dynamics that admit a simpler convergence analysis and more efficient numerical simulation. Moreover, the multi-scale method presented in this paper generalizes previously known distributed convex optimization algorithms from fixed network topologies to the switching case, and the stochastic averaging obtained in this paper generalizes the existing deterministic averaging.
Acknowledgements
The first author would like to thank Professor Zhong-Ping Jiang for his discussions on this paper during a visit to New York University. This work is supported by the National Natural Science Foundation of China (NSFC) under Grants 61663026, 61963028, 62066026 and 61866023, and by the Jiangxi Natural Science Foundation under Grant 20192BAB207025.
Communicated by Xiaoqi Yang.
Appendices
Appendix A: Derivation of the SDE (6)
To characterize the noisy term \(\omega _i:=\sum \nolimits _{j\in {\mathscr {N}}_i(t)}\sigma _{ji}\xi _{ji}(x_j-x_i)\) in Eqs. (5a)–(5b), define \(\xi _i=[\xi _{1i}, \xi _{2i}, \cdots , \xi _{Ni}]^T\), \(i=1, \cdots , N\). Also, for each switching mode \(s\in {\mathbb {S}}\), define \({\mathcal {M}}^i_{s}=[a^{i1}_{s}\sigma _{1i}(x_1-x_i), \cdots , a^{iN}_{s}\sigma _{Ni}(x_N-x_i)]\). Then the noisy term \(\omega _i\) above can be written as \(\omega _i={\mathcal {M}}^i_{s} \xi _i\). Therefore, Eq. (5a) becomes
where \(\xi _i dt= d w_i\) with \(w_i\) an N-dimensional standard Brownian motion on the probability space \((\varOmega , {\mathcal {F}}, {\mathbb {P}}, \{{\mathcal {F}}_t\}_{t\ge 0})\). Let \({\varvec{x}}=\mathrm{col}\{x_1, \cdots , x_N\}\). Define the stacked functions \(G({\varvec{x}})=\mathrm{col}\{\mathrm{col}\{g_{ij}(x_i)\}_{j=1}^{r_i}\}_{i=1}^N \in \mathbb {R}^r\), \(H({\varvec{x}})=\mathrm{col}\{\mathrm{col}\{h_{ij}(x_i)\}_{j=1}^{s_i}\}_{i=1}^N \in \mathbb {R}^s\) and the stacked gradients \(\nabla F({\varvec{x}})=\mathrm{col}\{\nabla f_{i}(x_i)\}_{i=1}^{N}\in \mathbb {R}^{nN}\), \(\nabla G({\varvec{x}})=\mathrm{col}\big \{\mathrm{col}\{\nabla g_{ij}(x_i)\}_{j=1}^{r_i}\big \}_{i=1}^{N} \in \mathbb {R}^{rn}\), \(\nabla H({\varvec{x}})=\mathrm{col}\big \{\mathrm{col}\{\nabla h_{ij}(x_i)\}_{j=1}^{s_i}\big \}_{i=1}^{N}\in \mathbb {R}^{sn}\). For each switching mode \(s\in {\mathbb {S}}\), set \({\mathcal {M}}_s=\mathrm{diag}\{{\mathcal {M}}_s^i\}_{i=1}^{N}\) \(\in \mathbb {R}^{nN\times N^2}\). In addition, let \({\varvec{w}}=\mathrm{col} \{w_i\}_{i=1}^{N} \in \mathbb {R}^{N^2}\) and define \(r=\sum _{i=1}^{N}r_i\), \(s=\sum _{i=1}^{N}s_i\). Set
with
, and
with
. Define \({\varvec{\theta }}=\mathrm{col} \{\theta _1, \cdots , \theta _N\}\in {\mathbb {R}}^{nN}\).
With these, Eq. (5) can be written in a compact form as in (6).
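The compact SDE (6) can be simulated with the Euler–Maruyama scheme. The following sketch integrates only the noisy consensus part of the dynamics, i.e., the drift \(-\hbar \sum_j a_{ij}(x_i-x_j)\) together with the state-dependent noise term \(\sum_j a_{ij}\sigma_{ji}(x_j-x_i)\,dw_{ji}\) from Appendix A, for a toy three-agent complete graph with scalar states. The graph, the uniform noise intensity `sigma`, the gain `hbar`, and the step size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def euler_maruyama_consensus(A, x0, hbar=1.0, sigma=0.1, dt=1e-3, T=5.0, seed=0):
    """Euler-Maruyama integration of the toy noisy consensus dynamics
        dx_i = -hbar * sum_j a_ij (x_i - x_j) dt
               + sigma * sum_j a_ij (x_j - x_i) dw_ji,
    a scalar-state sketch of the noise structure in Appendix A."""
    rng = np.random.default_rng(seed)
    N = len(x0)
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        drift = np.zeros(N)
        diffusion = np.zeros(N)
        # independent channel noises: xi_ji dt = dw_ji, one per ordered pair (j, i)
        dW = rng.normal(0.0, np.sqrt(dt), size=(N, N))
        for i in range(N):
            for j in range(N):
                if A[i, j]:
                    drift[i] += -hbar * (x[i] - x[j])
                    diffusion[i] += sigma * (x[j] - x[i]) * dW[j, i]
        x = x + drift * dt + diffusion
    return x

# toy 3-agent complete graph (hypothetical example network)
A = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
x_final = euler_maruyama_consensus(A, [0.0, 1.0, 2.0])
print(x_final)  # states should cluster near the initial average
```

Because the diffusion coefficients are proportional to the disagreements \(x_j-x_i\), the noise vanishes at consensus, so the states contract toward a common value rather than diffusing without bound.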
Appendix B: Proof of the Inequality (10)
First, for \(V_1\), define \({\varvec{\chi }}= \nabla F({\varvec{x}})+\varvec{\lambda } \odot \nabla G({\varvec{x}})+\varvec{\nu } \odot \nabla H({\varvec{x}})\) and \({\varvec{\chi }}^*= \nabla F({\varvec{x}}^*)+\varvec{\lambda }^* \odot \nabla G({\varvec{x}}^*)+\varvec{\nu }^* \odot \nabla H({\varvec{x}}^*)\), and note that \({\varvec{\theta }}^*=-{\varvec{\chi }}^*\) in view of (7). The action of the infinitesimal operator \(\mathcal A\) on \(V_1\) can then be calculated as
The trace \(\mathrm{tr}({\mathcal {M}}^T{\mathcal {M}})\) complicates the convergence analysis. To circumvent this difficulty, we estimate \(\mathrm{tr}({\mathcal {M}}^T{\mathcal {M}})\) as follows:
As for Part A in (19), noting that \({\varvec{x}}^T \varvec{{\mathcal {L}}} {\varvec{x}}= ({\varvec{x}}-{\varvec{x}}^*)^T \varvec{{\mathcal {L}}} ({\varvec{x}}-{\varvec{x}}^*)\), Part A can be estimated as \(A\le ({\varvec{x}}-{\varvec{x}}^*)^T[-\hbar \varvec{{\mathcal {L}}}{\varvec{x}}-{\varvec{\theta }}-\nabla F({\varvec{x}})]-\hbar ({\varvec{x}}-{\varvec{x}}^*)^T \varvec{{\mathcal {L}}}({\varvec{x}}-{\varvec{x}}^*)\). Noting that the square bracket is exactly the negative gradient of \(\varPsi ({\varvec{x}}, {\varvec{\theta }})\) (cf. Eq. (9)) with respect to \({\varvec{x}}\), and denoting by \(\lambda ^{{\mathcal {G}}}_2\) the smallest nonzero eigenvalue of the Laplacian \({\mathcal {L}}\), one obtains \(A \le ({\varvec{x}}^*-{\varvec{x}})^T \nabla _{{\varvec{x}}} \varPsi ({\varvec{x}}, {\varvec{\theta }})-\hbar \lambda ^{{\mathcal {G}}}_2({\varvec{x}}-{\varvec{x}}^*)^T({\varvec{x}}-{\varvec{x}}^*)\le \varPsi ({\varvec{x}}^*, {\varvec{\theta }})-\varPsi ({\varvec{x}},{\varvec{\theta }})-\hbar \lambda ^{{\mathcal {G}}}_2({\varvec{x}}-{\varvec{x}}^*)^T({\varvec{x}}-{\varvec{x}}^*)\), where the last inequality uses the convexity of \(\varPsi ({\varvec{x}}, {\varvec{\theta }})\) in its first argument. As for Part B in (19), the inequality \(B\le ({\varvec{x}}-{\varvec{x}}^*)^T({\varvec{\theta }}-{\varvec{\theta }}^*)+({\varvec{x}}-{\varvec{x}}^*)^T({\varvec{x}}-{\varvec{x}}^*)\) can be proved by rewriting \(B\) as a quadratic form in \({\varvec{x}}-{\varvec{x}}^*\), \({\varvec{\theta }}-{\varvec{\theta }}^*\), \({\varvec{\chi }}-{\varvec{\chi }}^*\) and showing the corresponding matrix to be positive semidefinite.
In view of \(({\varvec{x}}-{\varvec{x}}^*)^T({\varvec{\theta }}-{\varvec{\theta }}^*)=\varPsi ({\varvec{x}}, {\varvec{\theta }})-\varPsi ({\varvec{x}}, {\varvec{\theta }}^*)\), one obtains \(A+B\le \left[ \varPsi ({\varvec{x}}^*, {\varvec{\theta }})-\varPsi ({\varvec{x}}, {\varvec{\theta }}^*)\right] -(\hbar \lambda ^{{\mathcal {G}}}_2-1)({\varvec{x}}-{\varvec{x}}^*)^T({\varvec{x}}-{\varvec{x}}^*)\). Now,
where we use the convexity of the functions \(g_{ij}\) and \(h_{ij}\).
Secondly, we calculate the action of the infinitesimal operator \(\mathcal A\) on \(V_2+V_3\). First note that, by the definition of the Bregman divergence, the function \(V_3\) can be written as \(V_3=\sum \nolimits _{i=1}^{N} \sum \nolimits _{j=1}^{r_i}(\lambda _{ij}-\lambda _{ij}^*)-\sum _{(i,j)\in \varOmega }\lambda _{ij}^*(\ln \lambda _{ij}-\ln \lambda _{ij}^*)\). Therefore,
Furthermore, the action of the infinitesimal operator \(\mathcal A\) on \(V_4\) is easily calculated as \({\mathcal {A}}V_4=\sum \nolimits _{i=1}^{N}\sum \nolimits _{j=1}^{s_i} (v_{ij}-v_{ij}^*) h_{ij}(x_i)\). Collecting the above results for \(\mathcal {A}V_i, i=1,2,3,4\), and recalling the definition of \( \varPhi \) in (8), one obtains the inequality (10).
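The closed form used for \(V_3\) above is the Bregman divergence generated by the negative-entropy function \(\varphi(u)=u\ln u\), namely \(D_\varphi(\lambda^*,\lambda)=(\lambda-\lambda^*)-\lambda^*(\ln\lambda-\ln\lambda^*)\) summed over the multipliers. The following minimal sketch checks this identity and the nonnegativity of the divergence numerically; the multiplier values are hypothetical, chosen only for the check.

```python
import numpy as np

def phi(u):
    """Negative-entropy generator phi(u) = u * ln(u)."""
    return u * np.log(u)

def bregman(p, q):
    """Bregman divergence D_phi(p, q) = phi(p) - phi(q) - phi'(q) (p - q),
    with phi'(q) = ln(q) + 1, summed over components."""
    grad_q = np.log(q) + 1.0
    return np.sum(phi(p) - phi(q) - grad_q * (p - q))

# hypothetical positive multipliers (illustrative values only)
lam_star = np.array([0.5, 1.0, 2.0])
lam = np.array([1.5, 0.3, 2.5])

# closed form from Appendix B: sum (lam - lam*) - sum lam* (ln lam - ln lam*)
v3 = np.sum(lam - lam_star) - np.sum(lam_star * (np.log(lam) - np.log(lam_star)))

assert np.isclose(bregman(lam_star, lam), v3)  # matches D_phi(lam*, lam)
assert v3 >= 0.0                               # Bregman divergences are nonnegative
print(v3)
```

The nonnegativity of \(V_3\) is what qualifies it as a Lyapunov-function component for the multiplier dynamics.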
Appendix C: Proof of the Saddle Point Conditions (11)
Due to convexity and affineness, the following four relations hold: \(\varSigma _{i=1}^N f_i(x_i)+\hbar {\varvec{x}}^T{\varvec{\mathcal {L}}}{\varvec{x}}\ge \varSigma _{i=1}^N f_i(x^*)+\varSigma _{i=1}^N \nabla f_i(x^*)(x_i-x^*)\), \(g_{ij}(x_i)\ge g_{ij}(x^*)+\nabla g_{ij}(x^*)(x_i-x^*)\), \(h_{ij}(x_i) = h_{ij}(x^*)+ \nabla h_{ij}(x^*)(x_i-x^*)\), and the identity \({\varvec{x}}-{\varvec{x}}^*={\varvec{x}}-{\varvec{x}}^*\). Multiplying them by \(1\), \(\lambda _{ij}^*\), \(\nu _{ij}^*\) and \({\varvec{\theta }}^*\), respectively (for the fourth identity, the multiplication is \(({\varvec{x}}-{\varvec{x}}^*)^T {\varvec{\theta }}^*\)), and adding gives \(\varPhi ({\varvec{x}}, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\ge \varPhi ({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)+[{\varvec{\theta }}^*+ \nabla F({\varvec{x}}^*) +\varvec{\lambda }^* \odot \nabla G({\varvec{x}}^*)+\varvec{\nu }^* \odot \nabla H({\varvec{x}}^*)]^T({\varvec{x}}-{\varvec{x}}^*)\). In view of Eq. (7), the terms in the square bracket sum to zero. Therefore, \(\varPhi ({\varvec{x}}, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*) \ge \varPhi ({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\).
On the other hand, noting that \(\lambda _{ij}g_{ij}(x_i^*)\le 0=\lambda _{ij}^*g_{ij}(x_i^*)\) and \(h_{ij}(x^*)=0\), it can be directly checked that \(\varPhi ({\varvec{x}}^*, {\varvec{\theta }}, {\varvec{\lambda }}, {\varvec{\nu }})\le \varPhi ({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\). This inequality, together with the inequality derived in the last paragraph, gives rise to the saddle point condition (11).
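The two inequalities above are the standard saddle point conditions, and they can be checked numerically on any small convex program. The sketch below uses a hypothetical scalar problem (not the paper's multi-agent setup): minimize \(f(x)=x^2\) subject to \(g(x)=1-x\le 0\), whose KKT point is \(x^*=1\) with multiplier \(\lambda^*=2\).

```python
import numpy as np

# Toy convex program (illustrative only, not the paper's Lagrangian Phi):
#   minimize f(x) = x^2  subject to  g(x) = 1 - x <= 0
# Stationarity 2x - lambda = 0 at the active constraint x = 1 gives lambda* = 2.
def lagrangian(x, lam):
    return x**2 + lam * (1.0 - x)

x_star, lam_star = 1.0, 2.0
L_star = lagrangian(x_star, lam_star)

# Saddle point conditions: L(x*, lam) <= L(x*, lam*) <= L(x, lam*)
for lam in np.linspace(0.0, 10.0, 50):          # dual feasibility: lam >= 0
    assert lagrangian(x_star, lam) <= L_star + 1e-12
for x in np.linspace(-3.0, 3.0, 50):            # primal side holds for all x
    assert L_star <= lagrangian(x, lam_star) + 1e-12

print("saddle point verified at (x*, lambda*) =", (x_star, lam_star))
```

Here the left inequality holds with equality because the constraint is active (\(g(x^*)=0\)), mirroring the complementary slackness argument used in the proof.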
Appendix D: Proof of “\({\mathcal {A}}V=0 \Rightarrow ({\varvec{x}}, {\varvec{\theta }}, {\varvec{\lambda }}, {\varvec{\nu }})=({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\)”
Letting \({\mathcal {A}}V=0\) gives \(({\varvec{x}}-{\varvec{x}}^*)^T \,({\varvec{x}}-{\varvec{x}}^*)=0\) and \(\varPhi ({\varvec{x}}^*, {\varvec{\theta }}, {\varvec{\lambda }}, {\varvec{\nu }})= \varPhi ({\varvec{x}}, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\); the former implies \({\varvec{x}}={\varvec{x}}^*\), and the latter, together with the fact that \(({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\) is a saddle point of the Lagrangian \( \varPhi \), implies \( \varPhi ({\varvec{x}}^*, {\varvec{\theta }}, {\varvec{\lambda }}, {\varvec{\nu }})= \varPhi ({\varvec{x}}, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)=\varPhi ({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\). Recall that \({\varvec{x}}^*={\mathbf {1}}_N \otimes x^*\), where \(x^*\) is the optimal solution, which satisfies \(g_{ij}(x^*)\le 0\) in the KKT condition (2a). Inserting \(x_i=x^*\) into (5c) yields \(\dot{\lambda }_{ij}=\frac{\lambda _{ij}g_{ij}(x^*)}{1+\eta _{ij}\lambda _{ij}}\). If \(g_{ij}(x^*)=0\), then \(\dot{\lambda }_{ij}=0\), so \(\lambda _{ij}(t)\) stays positive for all \(t\ge 0\) because its initial value is chosen positive. If \(g_{ij}(x^*)=-a<0\), then \(\dot{\lambda }_{ij}=\frac{-a\lambda _{ij}}{1+\eta _{ij}\lambda _{ij}}\), and elementary analysis shows that its trajectory satisfies \(\lambda _{ij}(t)\ge 0\) for all \(t\ge 0\) whenever the initial value is positive. In short, in both cases, for Eq. (5c) with \(x_i=x^*\), one has \(\lambda _{ij}(t)\ge 0\) for all \(t\ge 0\) provided the initial value is positive, and consequently \(\dot{\lambda }_{ij}(t)\le 0\) since \(g_{ij}(x^*)\le 0\). Therefore, \(\lambda _{ij}\ge \lambda _{ij}^*\).
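The positivity claim for the multiplier dynamics at \(x_i=x^*\) can be illustrated numerically. The sketch below integrates \(\dot{\lambda}=\frac{-a\lambda}{1+\eta\lambda}\) by forward Euler; the constants `a`, `eta`, the step size, and the initial value are illustrative assumptions.

```python
import numpy as np

def integrate_multiplier(lam0, a=1.0, eta=0.5, dt=1e-3, T=10.0):
    """Forward-Euler integration of  dlam/dt = -a * lam / (1 + eta * lam),
    the multiplier dynamics (5c) at x_i = x* when g_ij(x*) = -a < 0."""
    lam = lam0
    traj = [lam]
    for _ in range(int(T / dt)):
        lam += dt * (-a * lam) / (1.0 + eta * lam)
        traj.append(lam)
    return np.array(traj)

traj = integrate_multiplier(lam0=2.0)  # positive initial multiplier
print(traj.min(), traj[-1])

# the trajectory decays monotonically toward zero but never crosses it,
# consistent with the positivity argument in Appendix D
assert traj.min() >= 0.0
assert np.all(np.diff(traj) <= 0.0)
```

Since the right-hand side vanishes at \(\lambda=0\), the origin is invariant and a trajectory started from a positive value cannot cross it, which is exactly the elementary fact the proof relies on.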
On the other hand, the fact \( \varPhi ({\varvec{x}}^*, {\varvec{\theta }}, {\varvec{\lambda }}, {\varvec{\nu }})= \varPhi ({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\) yields \(\sum _{i=1}^{N}\sum _{j=1}^{r_i}(\lambda _{ij}-\lambda _{ij}^*)g_{ij}(x^*)=0\). Since \(g_{ij}(x^*)\le 0\) and \(\lambda _{ij}\ge \lambda _{ij}^*\), each term in this sum is non-positive, and therefore \((\lambda _{ij}-\lambda _{ij}^*)g_{ij}(x^*)=0\), i.e., \(\lambda _{ij}g_{ij}(x^*)=\lambda _{ij}^*g_{ij}(x^*)=0\). This fact implies that the right-hand side of Eq. (5c) with \(x_i=x^*\) is zero, and thus \( \lambda _{ij}=\lambda _{ij}^{\dag }\) for some constant \(\lambda _{ij}^{\dag }\ge 0\); hence \(\lambda _{ij}^{\dag }g_{ij}(x^*)=0\). Inserting \({\varvec{x}}^*={\mathbf {1}}_N\otimes x^*\) into Eqs. (5b) and (5d) gives \(\theta _i=\theta _i^{\dag }\) and \(\nu _{ij}=\nu _{ij}^{\dag }\) for some constants \(\theta _i^{\dag }\) and \(\nu _{ij}^{\dag }\). Inserting \(({\varvec{x}}^*, \theta _i^{\dag }, \lambda _{ij}^{\dag }, \nu _{ij}^{\dag })\) into (5a) gives \(\sum _{i=1}^N\nabla f_i(x^*)+\sum _{i=1}^N\sum _{j=1}^{r_i}\lambda _{ij}^{\dag }\nabla g_{ij}(x^*)+\sum _{i=1}^N\sum _{j=1}^{s_i}\nu _{ij}^{\dag }\nabla h_{ij}(x^*)=0\). In conclusion, the KKT conditions (2) are satisfied at the point \(({\varvec{x}}^*, {\varvec{\lambda }}^\dag , {\varvec{\nu }}^\dag )\). Then, by the uniqueness of the multipliers in (4), \(\lambda _{ij}^{\dag }=\lambda _{ij}^*\) and \(\nu _{ij}^{\dag }=\nu _{ij}^*\).
Also, by differentiating both sides of \( \varPhi ({\varvec{x}}, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)= \varPhi ({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\) with respect to \({\varvec{x}}\) and then by enforcing \({\varvec{x}}={\varvec{x}}^*\), one obtains \({\varvec{\theta }}+ \nabla F({\varvec{x}}^*) +\varvec{\lambda }^* \odot \nabla G({\varvec{x}}^*)+\varvec{\nu }^* \odot \nabla H({\varvec{x}}^*)=0\). This, combined with Eq. (7), gives \({\varvec{\theta }}={\varvec{\theta }}^*\). In conclusion, letting \({\mathcal {A}}V=0\) gives rise to \(({\varvec{x}}, {\varvec{\theta }}, {\varvec{\lambda }}, {\varvec{\nu }})=({\varvec{x}}^*, {\varvec{\theta }}^*, {\varvec{\lambda }}^*, {\varvec{\nu }}^*)\).
Ni, W., Wang, X. A Multi-Scale Method for Distributed Convex Optimization with Constraints. J Optim Theory Appl 192, 379–400 (2022). https://doi.org/10.1007/s10957-021-01982-0