Abstract
In this work, we develop a new complexity metric for an important class of low-rank matrix optimization problems in both symmetric and asymmetric cases, where the metric aims to quantify the complexity of the nonconvex optimization landscape of each problem and the success of local search methods in solving the problem. The existing literature has focused on two recovery guarantees: the RIP constant is commonly used to characterize the complexity of matrix sensing problems, while the incoherence and the sampling rate are used when analyzing matrix completion problems. The proposed complexity metric has the potential to generalize these two notions and also applies to a much larger class of problems. To mathematically study the properties of this metric, we focus on the rank-1 generalized matrix completion problem and illustrate the usefulness of the new complexity metric on three types of instances, namely, instances satisfying the RIP condition, instances obeying the Bernoulli sampling model, and a synthetic example. We show that the complexity metric exhibits a consistent behavior in the three cases, even when other existing conditions fail to provide theoretical guarantees. These observations strongly suggest that the new complexity metric has the potential to unify various conditions of optimization complexity proposed for different applications. Furthermore, we establish theoretical results that provide sufficient conditions and necessary conditions for the existence of spurious solutions in terms of the proposed complexity metric. This contrasts with the RIP and incoherence conditions, which fail to provide any necessary condition.
Notes
A point \(U^0\) is called a spurious local minimum if it is a local minimum of problem (1.2) and \(U^0\left( U^0\right) ^T \ne M^*\).
We note that this definition is different from the common definition of a maximum independent set, which only requires that the set not be a proper subset of another independent set.
A point \(u\in \mathbb {R}^n\) is called a spurious second-order critical point if it satisfies the first-order and the second-order necessary optimality conditions and \(uu^T \ne u^*(u^*)^T\).
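The definitions in these notes can be made concrete with a short numerical sketch. The objective below, \(f(u)=\sum _{i,j}C_{ij}(u_iu_j-M^*_{ij})^2\), is an assumed form of the rank-1 problem (this excerpt does not display problem (1.2) itself), and the checks below are illustrative rather than the paper's own code:

```python
import numpy as np

# Assumed weighted rank-1 completion loss: f(u) = sum_{i,j} C_ij (u_i u_j - M*_ij)^2,
# with M* = u* (u*)^T and C, M* symmetric.
def f(u, C, M):
    return np.sum(C * (np.outer(u, u) - M) ** 2)

def grad(u, C, M):
    # For symmetric C and M: grad f(u) = 4 * (C ∘ (u u^T - M)) u
    return 4.0 * (C * (np.outer(u, u) - M)) @ u

def hess(u, C, M):
    # Entrywise: H_kl = 4 [ delta_kl sum_j C_kj u_j^2 + C_kl u_k u_l + C_kl (u_k u_l - M_kl) ]
    R = C * (np.outer(u, u) - M)
    return 4.0 * (np.diag(C @ u**2) + C * np.outer(u, u) + R)

def is_second_order_critical(u, C, M, tol=1e-8):
    # first- and second-order necessary optimality conditions
    g_ok = np.linalg.norm(grad(u, C, M)) <= tol
    H_ok = np.linalg.eigvalsh(hess(u, C, M))[0] >= -tol
    return g_ok and H_ok

def is_spurious(u, ustar, tol=1e-8):
    # spurious means critical yet u u^T != u* (u*)^T
    return np.linalg.norm(np.outer(u, u) - np.outer(ustar, ustar)) > tol
```

Note that \(-u^*\) is also a second-order critical point but is not spurious, since \((-u^*)(-u^*)^T = u^*(u^*)^T = M^*\).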
References
Agarwal, A., Anandkumar, A., Jain, P., Netrapalli, P.: Learning sparsely used overcomplete dictionaries via alternating minimization. SIAM J. Optim. 26(4), 2775–2799 (2016)
Ahn, K., Suarez, F.: Riemannian perspective on matrix factorization. arXiv preprint arXiv:2102.00937 (2021)
Aigner, M.: Turán’s graph theorem. Am. Math. Mon. 102(9), 808–816 (1995)
Ajayi, T., Mildebrath, D., Kyrillidis, A., Ubaru, S., Kollias, G., Bouchard, K.: Provably convergent acceleration in factored gradient descent with applications in matrix sensing. arXiv preprint arXiv:1806.00534 (2018)
Allen-Zhu, Z., Li, Y.: Neon2: Finding local minima via first-order oracles. Adv. Neural Inform. Process. Syst. 31 (2018)
Bi, Y., Lavaei, J.: On the absence of spurious local minima in nonlinear low-rank matrix recovery problems. In: International Conference on Artificial Intelligence and Statistics, pp. 379–387. PMLR (2021)
Bi, Y., Zhang, H., Lavaei, J.: Local and global linear convergence of general low-rank matrix recovery problems. In: Proceedings of 36th AAAI Conference on Artificial Intelligence (AAAI), pp. 1–9. Vancouver, Canada (2022)
Burer, S., Monteiro, R.D.: A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Math. Program. 95(2), 329–357 (2003)
Candès, E.J., Li, X., Ma, Y., Wright, J.: Robust principal component analysis? J. ACM (JACM) 58(3), 1–37 (2011)
Candes, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inform. Theory 61(4), 1985–2007 (2015)
Candes, E.J., Plan, Y.: Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Trans. Inform. Theory 57(4), 2342–2359 (2011)
Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717–772 (2009)
Candès, E.J., Tao, T.: The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inform. Theory 56(5), 2053–2080 (2010)
Cartis, C., Gould, N.I., Toint, P.L.: Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results. Math. Program. 127(2), 245–295 (2011)
Charisopoulos, V., Chen, Y., Davis, D., Díaz, M., Ding, L., Drusvyatskiy, D.: Low-rank matrix recovery with composite optimization: good conditioning and rapid convergence. Found. Comput. Math. 21(6), 1505–1593 (2021)
Chen, J., Li, X.: Model-free nonconvex matrix completion: local minima analysis and applications in memory-efficient kernel PCA. J. Mach. Learn. Res. 20(142), 1–39 (2019)
Chen, J., Liu, D., Li, X.: Nonconvex rectangular matrix completion via gradient descent without \(\ell _{2,\infty }\) regularization. IEEE Trans. Inform. Theory 66(9), 5806–5841 (2020)
Chen, Y., Chi, Y.: Harnessing structures in big data via guaranteed low-rank matrix estimation: recent theory and fast algorithms via convex and nonconvex optimization. IEEE Signal Process. Mag. 35(4), 14–31 (2018)
Chen, Y., Chi, Y., Fan, J., Ma, C.: Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. Math. Program. 176(1), 5–37 (2019)
Chen, Y., Chi, Y., Fan, J., Ma, C., Yan, Y.: Noisy matrix completion: understanding statistical guarantees for convex relaxation via nonconvex optimization. SIAM J. Optim. 30(4), 3098–3121 (2020)
Chen, Y., Fan, J., Ma, C., Yan, Y.: Bridging convex and nonconvex optimization in robust PCA: noise, outliers and missing data. Ann. Stat. 49(5), 2948–2971 (2021)
Chi, Y., Lu, Y.M., Chen, Y.: Nonconvex optimization meets low-rank matrix factorization: an overview. IEEE Trans. Signal Process. 67(20), 5239–5269 (2019)
Chou, H.H., Gieshoff, C., Maly, J., Rauhut, H.: Gradient descent for deep matrix factorization: dynamics and implicit bias towards low rank. arXiv preprint arXiv:2011.13772 (2020)
Fattahi, S., Sojoudi, S.: Exact guarantees on the absence of spurious local minima for non-negative rank-1 robust principal component analysis. J. Mach. Learn. Res. (2020)
Ge, R., Jin, C., Zheng, Y.: No spurious local minima in nonconvex low rank problems: a unified geometric analysis. In: International Conference on Machine Learning, pp. 1233–1242. PMLR (2017)
Ge, R., Lee, J.D., Ma, T.: Matrix completion has no spurious local minimum. Adv. Neural Inform. Process. Syst. 2981–2989 (2016)
Hardt, M.: Understanding alternating minimization for matrix completion. In: 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pp. 651–660. IEEE (2014)
Hardt, M., Wootters, M.: Fast matrix completion without the condition number. In: Conference on Learning Theory, pp. 638–678. PMLR (2014)
Hou, T.Y., Li, Z., Zhang, Z.: Fast global convergence for low-rank matrix recovery via Riemannian gradient descent with random initialization. arXiv preprint arXiv:2012.15467 (2020)
Jain, P., Meka, R., Dhillon, I.: Guaranteed rank minimization via singular value projection. Adv. Neural Inform. Process. Syst. 23 (2010)
Jain, P., Netrapalli, P., Sanghavi, S.: Low-rank matrix completion using alternating minimization. In: Proceedings of the Forty-fifth Annual ACM Symposium on Theory of Computing, pp. 665–674. (2013)
Jin, C., Ge, R., Netrapalli, P., Kakade, S.M., Jordan, M.I.: How to escape saddle points efficiently. In: International Conference on Machine Learning, pp. 1724–1732. PMLR (2017)
Jin, C., Netrapalli, P., Jordan, M.I.: Accelerated gradient descent escapes saddle points faster than gradient descent. In: Conference On Learning Theory, pp. 1042–1085. PMLR (2018)
Lee, J.D., Simchowitz, M., Jordan, M.I., Recht, B.: Gradient descent only converges to minimizers. In: Conference on Learning Theory, pp. 1246–1257. PMLR (2016)
Levin, E., Kileel, J., Boumal, N.: The effect of smooth parametrizations on nonconvex optimization landscapes. arXiv preprint arXiv:2207.03512 (2022)
Li, X., Zhu, Z., Man-Cho So, A., Vidal, R.: Nonconvex robust low-rank matrix recovery. SIAM J. Optim. 30(1), 660–686 (2020)
Li, Y., Ma, T., Zhang, H.: Algorithmic regularization in over-parameterized matrix sensing and neural networks with quadratic activations. In: Conference On Learning Theory, pp. 2–47. PMLR (2018)
Luo, Y., Li, X., Zhang, A.R.: Nonconvex factorization and manifold formulations are almost equivalent in low-rank matrix optimization. arXiv preprint arXiv:2108.01772 (2021)
Ma, J., Fattahi, S.: Sign-RIP: a robust restricted isometry property for low-rank matrix recovery. arXiv preprint arXiv:2102.02969 (2021)
Ma, Z., Bi, Y., Lavaei, J., Sojoudi, S.: Sharp restricted isometry property bounds for low-rank matrix recovery problems with corrupted measurements. arXiv preprint arXiv:2105.08232 (2021)
Netrapalli, P., Jain, P., Sanghavi, S.: Phase retrieval using alternating minimization. Adv. Neural Inform. Process. Syst. 26 (2013)
Netrapalli, P., Niranjan, U.N., Sanghavi, S., Anandkumar, A., Jain, P.: Non-convex robust PCA. Adv. Neural Inform. Process. Syst. 27 (2014)
Recht, B., Fazel, M., Parrilo, P.A.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
Renegar, J.: Linear programming, complexity theory and elementary functional analysis. Math. Program. 70(1), 279–351 (1995)
Renegar, J.: Condition numbers, the barrier method, and the conjugate-gradient method. SIAM J. Optim. 6(4), 879–912 (1996)
Stöger, D., Soltanolkotabi, M.: Small random initialization is akin to spectral learning: optimization and generalization guarantees for overparameterized low-rank matrix reconstruction. Adv. Neural Inform. Process. Syst. 34 (2021)
Sun, J., Qu, Q., Wright, J.: Complete dictionary recovery over the sphere I: overview and the geometric picture. IEEE Trans. Inform. Theory 63(2), 853–884 (2016)
Sun, J., Qu, Q., Wright, J.: A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
Sun, R., Luo, Z.Q.: Guaranteed matrix completion via non-convex factorization. IEEE Trans. Inform. Theory 62(11), 6535–6579 (2016)
Tong, T., Ma, C., Chi, Y.: Accelerating ill-conditioned low-rank matrix estimation via scaled gradient descent. J. Mach. Learn. Res. 22(150), 1–63 (2021)
Tong, T., Ma, C., Chi, Y.: Low-rank matrix recovery with scaled subgradient methods: fast and robust convergence without the condition number. IEEE Trans. Signal Process. 69, 2396–2409 (2021)
Tong, T., Ma, C., Prater-Bennette, A., Tripp, E., Chi, Y.: Scaling and scalability: provable nonconvex low-rank tensor estimation from incomplete measurements. arXiv preprint arXiv:2104.14526 (2021)
Tu, S., Boczar, R., Simchowitz, M., Soltanolkotabi, M., Recht, B.: Low-rank solutions of linear matrix equations via procrustes flow. In: International Conference on Machine Learning, pp. 964–973. PMLR (2016)
Wei, K., Cai, J.F., Chan, T.F., Leung, S.: Guarantees of Riemannian optimization for low rank matrix recovery. SIAM J. Matrix Anal. Appl. 37(3), 1198–1222 (2016)
Wei, K., Cai, J.F., Chan, T.F., Leung, S.: Guarantees of Riemannian optimization for low rank matrix completion. Inverse Probl. Imaging 14(2) (2020)
Yalcin, B., Zhang, H., Lavaei, J., Sojoudi, S.: Factorization approach for low-complexity matrix completion problems: exponential number of spurious solutions and failure of gradient methods. In: International Conference on Artificial Intelligence and Statistics, pp. 1–9. PMLR (2022)
Yi, X., Park, D., Chen, Y., Caramanis, C.: Fast algorithms for robust PCA via gradient descent. Adv. Neural Inform. Process. Syst. 29 (2016)
Yurtsever, A., Tropp, J.A., Fercoq, O., Udell, M., Cevher, V.: Scalable semidefinite programming. SIAM J. Math. Data Sci. 3(1), 171–200 (2021)
Zhang, H., Bi, Y., Lavaei, J.: General low-rank matrix optimization: geometric analysis and sharper bounds. Adv. Neural Inform. Process. Syst. 34 (2021)
Zhang, H., Yalcin, B., Lavaei, J., Sojoudi, S.: A new complexity metric for nonconvex rank-one generalized matrix completion. arXiv preprint arXiv:2204.02364 (2022)
Zhang, J., Fattahi, S., Zhang, R.: Preconditioned gradient descent for over-parameterized nonconvex matrix factorization. Adv. Neural Inform. Process. Syst. 34 (2021)
Zhang, R.Y., Sojoudi, S., Lavaei, J.: Sharp restricted isometry bounds for the inexistence of spurious local minima in nonconvex matrix recovery. J. Mach. Learn. Res. 20(114), 1–34 (2019)
Zheng, Q., Lafferty, J.: A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements. In: Proceedings of the 28th International Conference on Neural Information Processing Systems-Volume 1, pp. 109–117 (2015)
Zhu, Z., Li, Q., Tang, G., Wakin, M.B.: Global optimality in low-rank matrix optimization. IEEE Trans. Signal Process. 66(13), 3614–3628 (2018)
Acknowledgements
This work was supported by grants from ARO, AFOSR, ONR and NSF.
We note that a similar complexity metric based on a special case of instances in Sect. 3.3 was proposed in our conference paper [56]. However, the complexity metric in this work has a different form and is proved to work on a broader set of applications. In addition, we prove several theoretical properties of the metric in this work, which are not included in [56].
Appendices
A Proofs in Sect. 2
A.1 Proof of Theorem 2.4
The proof of Theorem 2.4 relies on the following two lemmas, which transform the computation of \(\mathbb {D}_\alpha ^{min}\) into a one-dimensional optimization problem. The first lemma upper-bounds the maximum possible distance.
Lemma A.1
Suppose that \(n\ge 2\). It holds that
where the function \(g(\alpha ,c)\) is defined by
We let \(g_i(\alpha ,c)\) denote the i-th term in the above minimization for all \(i\in \{1,\dots ,6\}\). The next lemma proves the other direction.
Lemma A.2
Suppose that \(n\ge 2\). It holds that
where the function \(g(\alpha ,c)\) is defined in Lemma A.1.
The proof of Lemmas A.1 and A.2 can be found in the online version [60].
Proof of Theorem 2.4
By the results of Lemmas A.1 and A.2, we only need to compute \(\max _{c\in \left[ 0,\frac{1}{n(n-1)}\right] } g(\alpha ,c)\). Let \(\kappa := (1-\alpha )/\alpha \in [0,+\infty ]\). We study three cases below.
Case 1. We first consider the case when \(\kappa \ge 2(n-3) / [(n-4)(n-1)]\). We prove that \(g(\alpha ,c) = g_2(\alpha ,c)\). Since \(g_2(\alpha ,c)\) has a larger derivative in c than \(g_1(\alpha ,c)\) and the function \(g_i(\alpha ,c)\) is decreasing in c for \(i=3,4,5,6\), we only need to show that
The above inequality with \(i=1\) is equivalent to \(\kappa \ge {2}/(n-1)\), which is guaranteed by the assumption that \(\kappa \ge 2(n-3) / [(n-4)(n-1)]\). For \(i\in \{3,4,5,6\}\), the inequality (A.1) is equivalent to
Therefore, it holds that
whose maximum is attained at \(c=[n(n-1)]^{-1}\) and
The other two cases can be proved in the same way and the proof can be found in the online version [60]. \(\square \)
B Proofs in Sect. 3
B.1 Proof of Theorem 3.2
Before estimating the complexity metric, we establish two properties of \(\mu \)-incoherent vectors.
Lemma B.1
Given any constant \(\mu \in [1,n]\), suppose that \(u^*\) has incoherence \(\mu \) and \(\Vert u^*\Vert _1 = 1\). Then, the following properties hold:
1. \(u^*\) has at least \(n / \mu \) nonzero components;
2. \(|u_i^*| \le \mu / n\) for all \(i\in [n]\).
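Both properties can be checked numerically. The sketch below assumes the standard incoherence measure \(\mu (u) = n\max _i u_i^2/\Vert u\Vert _2^2\) (the paper's formal definition is not displayed in this excerpt) together with the normalization \(\Vert u^*\Vert _1=1\) from the lemma:

```python
import numpy as np

def incoherence(u):
    # mu(u) = n * max_i u_i^2 / ||u||_2^2  (common definition; an assumption here)
    n = u.size
    return n * np.max(u**2) / np.sum(u**2)

def check_lemma_b1(u):
    u = u / np.linalg.norm(u, 1)          # enforce ||u||_1 = 1 as in the lemma
    n, mu = u.size, incoherence(u)
    prop1 = np.count_nonzero(u) >= n / mu  # at least n/mu nonzero components
    prop2 = np.max(np.abs(u)) <= mu / n + 1e-12  # |u_i| <= mu/n entrywise
    return prop1, prop2
```

For instance, a 2-sparse vector in \(\mathbb {R}^8\) has incoherence \(\mu = 4\), exactly \(2 = n/\mu \) nonzero entries, and \(\max _i|u_i| = 1/2 = \mu /n\), so both bounds are tight.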
The following lemma lower-bounds the perturbation of the weight matrix C.
Lemma B.2
Suppose that the instance \(\mathcal{M}\mathcal{C}\left( C,u^*\right) \) satisfies the \(\delta \)-RIP\(_{2,2}\) condition with \(\delta \in [0,1)\) and that the weight matrix \({\tilde{C}}\in \mathbb {S}^{n^2-1}_{+,1}\) has zero entries on the index set \(\mathcal {N}\subset [n] \times [n]\). Then, it holds that
where \(\mathcal {N}\) is the set of indices of the zero entries of \({\tilde{C}}\).
The proofs of Lemmas B.1 and B.2 are direct calculations and can be found in the online version [60]. Now, we prove the main theorem.
Proof of Theorem 3.2
Suppose that \(\mathcal{M}\mathcal{C}({\tilde{C}},{\tilde{u}}^*)\in {\overline{\mathcal {D}}}\) is the instance such that
In the following, we split the proof into two steps.
Step I. We first fix \({\tilde{u}}^*\) and consider the closest matrix \({\tilde{C}}\) to C such that \(({\tilde{C}}, {\tilde{u}}^*)\in {\overline{\mathcal {D}}}\). Let \(k:=|\mathcal {I}_1({\tilde{C}}, {\tilde{u}}^*)|\). Without loss of generality, we assume that
We first consider the case when \(k\ge 2\). If \(\mathbb {G}_1({\tilde{C}}, {\tilde{u}}^*)\) is disconnected, at least \(2(k-1)\) entries of \({\tilde{C}}\) are 0. If \(\mathbb {G}_1({\tilde{C}}, {\tilde{u}}^*)\) is bipartite, at least \(k^2/2 \ge 2(k-1)\) entries of \({\tilde{C}}\) are 0. If \(\mathcal {I}_{00}({\tilde{C}}, {\tilde{u}}^*)\) is non-empty, at least 2k entries of \({\tilde{C}}\) are 0. If instead \(k = 1\), at least one entry of \({\tilde{C}}\) must be 0 to make \(\mathbb {G}_1({\tilde{C}}, {\tilde{u}}^*)\) bipartite. In summary, at least N(k) entries of \({\tilde{C}}\) are 0, where
Using the results in Lemma B.2, the distance between C and \({\tilde{C}}\) is at least
We note that the distance is monotonically increasing as a function of k.
Step II. Now, we consider the optimal choice of \({\tilde{u}}^*\) based on the lower bound in (B.1). Let
Since the distance between C and \({\tilde{C}}\) is a monotonically increasing function of k, the minimum distance between \(\left( C,u^*\right) \) and \(({\tilde{C}},{\tilde{u}}^*)\) cannot be attained by \(k>\ell \). Therefore, we focus on the case when \(k\le \ell \). Without loss of generality, we assume that
Then, the distance between \(u^*\) and \({\tilde{u}}^*\) satisfies
Denote the distance between \(\left( C,u^*\right) \) and \(({\tilde{C}},{\tilde{u}}^*)\) by
Step II-1. We first consider the case when \(\mu \le 2n/3\). Combining inequalities (B.1) and (B.2), we obtain a lower bound on \(d_\alpha \):
For every \(k\in [\ell ]\), the term inside the above minimization can be lower-bounded by
The minimum of the right-hand side over \(k\in [\ell ]\) can be solved in closed form and is equal to
Using the second property in Lemma B.1, we have
Taking the summation over \(k\in \{2,\dots ,\ell \}\), we can conclude that
Using the second property in Lemma B.1 and \(\Vert u^*\Vert _1= 1\), it follows that
Substituting back into inequality (B.3), we have
Step II-2. Next, we consider the case when \(\mu \ge 2n/3\). By Theorem 3.1, the distance is at least
where the second inequality is due to the assumption that \(\mu \ge 2n/3\).
By combining Steps II-1 and II-2, the distance is lower-bounded by
The proof is completed by using the relation between \(d_\alpha \) and \(\mathbb {T}_\alpha \left( C,u^*\right) \). \(\square \)
B.2 Reduction of problem (3.7)
Before discussing the properties of problem instances in Sect. 3.3, we prove that the SSCPs of the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) are closely related to those of the m-dimensional problem
Lemma B.3
If problem (B.4) has no SSCPs, then the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has no SSCPs. In addition, given a number \(N\in \mathbb {N}\), suppose that problem (B.4) has N SSCPs with nonzero components at which the objective function has a positive definite Hessian matrix. Then, the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has at least N spurious local minima.
The proof of Lemma B.3 is a direct calculation and can be found in the online version [60].
B.3 Proof of Theorem 3.5
To simplify the notation in the following proofs, we denote the gradient and the Hessian matrix of the objective function of problem (B.4) by
The following theorem guarantees that the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) does not have spurious local minima when \(\epsilon \) is at least of order \(m^{-1}\).
Theorem B.1
If \(\epsilon > 18 / m\), the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) does not have SSCPs, namely, all second-order critical points are global minima associated with the ground truth solution \(M^*\).
The proof of Theorem B.1 can be found in the online version [60]. Then, we consider the regime of \(\epsilon \) where the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has spurious solutions. The following theorem studies the case when m is an even number.
Theorem B.2
Suppose that m is an even number. If \(\epsilon < 1 / (m+1)\), then the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has at least \(2^{m/2}\) spurious local minima.
Proof
By Lemma B.3, we only need to show that problem (B.4) has at least \(\binom{m}{m/2}\) SSCPs whose associated Hessian matrices are positive definite and whose components are nonzero. We consider a point \(x^0\in \mathbb {R}^m\) such that
The above equations have a solution since m is an even number. By a direct calculation, we can verify that the gradient \(g(x^0;\epsilon )\) is equal to 0. We only need to show that the Hessian matrix \(H(x^0;\epsilon )\) is positive definite, namely
The above condition is equivalent to
Under the normalization constraint \(\Vert c\Vert _2 = 1\), the Cauchy–Schwarz inequality implies that the minimum of the left-hand side is attained by
Therefore, the Hessian is positive definite if and only if
By substituting \(\left( x_1^0\right) ^2=(1-\epsilon )/[1+(m-1)\epsilon ]\), the above condition is equivalent to
Using the condition that \((m+1)\epsilon < 1\), we obtain that
where the first inequality is from the fact that \(m\ge 2\), which follows from the assumption that \(m>0\) is an even number.
To estimate the number of SSCPs, we observe that m/2 components of \(x^0\) have a positive sign and the other m/2 components have a negative sign. Hence, there are at least
SSCPs. The estimate on the combinatorial number follows from the inequality \(\binom{n}{k}\ge (n/k)^k\). \(\square \)
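The counting inequalities used above, namely \(\binom{n}{k}\ge (n/k)^k\) and the resulting lower bounds on the number of SSCPs for even m (Theorem B.2) and odd m (Theorem B.3), can be checked numerically:

```python
from math import comb

# Check C(n, k) >= (n/k)^k for all 1 <= k <= n (small slack for float rounding).
for n in range(2, 30):
    for k in range(1, n + 1):
        assert comb(n, k) >= (n / k) ** k - 1e-9

# Even m (Theorem B.2): C(m, m/2) >= (m / (m/2))^(m/2) = 2^(m/2).
for m in range(2, 31, 2):
    assert comb(m, m // 2) >= 2 ** (m // 2)

# Odd m (Theorem B.3): C(m, (m+1)/2) >= [2m/(m+1)]^((m+1)/2).
for m in range(3, 31, 2):
    assert comb(m, (m + 1) // 2) >= (2 * m / (m + 1)) ** ((m + 1) // 2)
```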
The analysis of the case where m is odd is similar, and we present the result in the following theorem.
Theorem B.3
Suppose that m is an odd number. If \(\epsilon < 1 / [13(m+1)]\), then the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has at least \([{2m}/(m+1)]^{(m+1)/2}\) spurious local minima.
The proof of Theorem B.3 is similar to that of Theorem B.2 and can be found in the online version [60]. By combining Theorems B.1–B.3, we complete the proof of Theorem 3.5.
B.4 Proof of Theorem 3.6
The proof of Theorem 3.6 relies on the following lemma, which calculates the complexity metric of the instance \(\mathcal{M}\mathcal{C}(C^\epsilon ,u^*)\). The proof of Lemma B.4 is similar to that of Theorem 2.4.
Lemma B.4
Suppose that \(n\ge m \ge 5\), \(\alpha \in [0,1]\) and \(\epsilon \in [0,1]\). The complexity metric \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\) has the closed form
Moreover, \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\) is strictly decreasing in \(\epsilon \) on [0, 1/2].
The proof of Lemma B.4 can be found in the online version [60]. Combining Theorem 3.5 and Lemma B.4, we are able to estimate the range of the complexity metric.
Proof of Theorem 3.6
By defining constants \(\delta :=1/26\) and \(\Delta :=18\), Theorem 3.5 implies that
1. If \(\epsilon < \delta / m\), the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has spurious local minima;
2. If \(\epsilon > \Delta / m\), the instance \(\mathcal{M}\mathcal{C}(C^\epsilon , u^*)\) has no spurious local minima.
Then, we study two different cases.
Case I. We first consider the case when \(m \epsilon \) is large. Since \(\epsilon < \Delta /m \le 1/2\), the threshold is located in the regime where \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\) is strictly decreasing. Hence, it suffices to show that
is a lower bound on \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\) when \(\epsilon = \Delta / m\). By Lemma B.4, it holds that
Since the graph \(\mathbb {G}\) does not contain any independent set with \(m+1\) nodes, Turán's theorem [3] implies that the graph \(\mathbb {G}\) has at least \(n^2 / (2m)\) edges, namely,
We note that the above bound is asymptotically tight and is attained by the Turán graph. Hence, we obtain that
By substituting into the estimate of \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\), it follows that
Case II. Next, we consider the case when \(\epsilon m\) is small. Similar to Case I, it suffices to show that
is an upper bound for \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\) when \(\epsilon = \delta / m\). Since \(\delta < 1/2\), we have
By Lemma B.4, it holds that
where the last inequality is from \(m \ge 36\). Since \(\epsilon \le 1\), the definition of \(Z_\epsilon \) implies that \(Z_\epsilon \le n^2\). By substituting into the estimate of \(\mathbb {D}_\alpha (C^\epsilon ,u^*)\), it follows that
By combining Cases I and II, we complete the proof. \(\square \)
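The Turán step in Case I can be verified exhaustively on a small example. In complement form, a graph on n vertices with no independent set of size \(m+1\) has at least \(n^2/(2m) - n/2\) edges (the \(n^2/(2m)\) rate quoted above, up to a lower-order term), with the minimum attained by a disjoint union of m balanced cliques (the complement of the Turán graph). A brute-force search over all graphs on \(n = 5\) vertices:

```python
from itertools import combinations

n = 5
edges = list(combinations(range(n), 2))  # all 10 candidate edges

def independence_number(adj):
    # brute force over all vertex subsets; adj[v] is a bitmask of neighbors
    best = 0
    for S in range(1 << n):
        verts = [v for v in range(n) if (S >> v) & 1]
        if all(not ((adj[u] >> v) & 1) for u, v in combinations(verts, 2)):
            best = max(best, len(verts))
    return best

def min_edges_with_alpha_at_most(m):
    # minimum edge count over all graphs with independence number <= m
    best = len(edges) + 1
    for code in range(1 << len(edges)):
        adj = [0] * n
        e = 0
        for idx, (u, v) in enumerate(edges):
            if (code >> idx) & 1:
                adj[u] |= 1 << v
                adj[v] |= 1 << u
                e += 1
        if e < best and independence_number(adj) <= m:
            best = e
    return best

for m in (2, 3):
    e_min = min_edges_with_alpha_at_most(m)
    assert e_min >= n * n / (2 * m) - n / 2  # Turán bound in complement form
```

For \(m = 2\) the minimum is 4 edges (two disjoint cliques of sizes 3 and 2), matching \(n^2/(2m) - n/2 = 3.75\) rounded up to the extremal value.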
C Proofs in Sect. 4
C.1 Proof of Theorem 4.3
The proof of Theorem 4.3 directly follows from the next two lemmas.
Lemma C.1
Suppose that \(\left( C,u^*\right) \in \mathcal{S}\mathcal{D}\) and that \(u^0\) is a global solution to \(\mathcal{M}\mathcal{C}\left( C,u^*\right) \). Then, for all \(k\in [n_1]\), it holds that \(u_i^0u_j^0 = u_i^*u_j^*\) for all \(i,j\in \mathcal {I}_{1k}\). In addition, \(u^0_i = 0\) for all \(i\in \mathcal {I}_0\left( C,u^*\right) \).
Proof
Denote \(M^*:= u^*(u^*)^T\). We first consider nodes in \(\mathcal {G}_{1k}\) for some \(k\in [n_1]\). Since the subgraph is not bipartite, there exists a cycle with an odd length \(2\ell +1\), which we denote as
Then, we have
which implies that the conclusion holds for \(i=j=i_1\). Using the connectivity of \(\mathbb {G}_{1k}\left( C,u^*\right) \), we know
Then, we consider nodes in \(\mathcal {I}_0\left( C,u^*\right) \). Since \(\mathcal {I}_{00}\left( C,u^*\right) \) is empty, for every node \(i\in \mathcal {I}_0\left( C,u^*\right) \), there exists another node \(j\in \mathcal {I}_1\left( C,u^*\right) \) such that \(C_{ij}>0\). Hence, we have
This completes the proof. \(\square \)
The following lemma provides a necessary and sufficient condition for instances with a positive definite Hessian matrix at global solutions, which is stronger than what Theorem 4.3 requires.
Lemma C.2
Suppose that \(u^0\in \mathbb {R}^n\) is a global minimizer of the instance \(\mathcal{M}\mathcal{C}\left( C,u^*\right) \) such that the conditions in Lemma C.1 hold. Then, the Hessian matrix is positive definite at \(u^0\) if and only if
1. \(\mathbb {G}_{1i}\left( C,u^*\right) \) is not bipartite for all \(i\in [n_1]\);
2. \(\mathcal {I}_{00}\left( C,u^*\right) = \emptyset \).
Proof
We prove the sufficiency part here, since it is the direction used in this manuscript; the necessity part is proved in the online version [60].
Sufficiency. We show that the Hessian matrix is positive definite under the two conditions stated in the lemma. For the sake of contradiction, suppose that there exists a nonzero vector \(q\in \mathbb {R}^n\) such that
Then, after straightforward calculations, we arrive at
The two conditions can be written compactly as
Consider the index set \(\mathcal {I}_{1k}\left( C,u^*\right) \) for some \(k\in [n_1]\). The equality (C.1) implies that
Since the graph \(\mathbb {G}_{1k}\left( C,u^*\right) \) is not bipartite, there exists a cycle with an odd length \(2\ell +1\), which we denote as
Denoting \(i_{2\ell +2}:= i_1\), we can calculate that
which leads to \(q_{i_1} = 0\). Using the connectivity of \(\mathcal {G}_{1k}\) and the relation (C.2), it follows that
Moreover, the same conclusion holds for all \(k\in [n_1]\) and, thus, we conclude that
Since \(\mathcal {I}_{00}\left( C,u^*\right) = \emptyset \), for every node \(i\in \mathcal {I}_0\left( C,u^*\right) \), there exists another node \(j\in \mathcal {I}_1\left( C,u^*\right) \) such that \(C_{ij} > 0\). Considering the relation (C.2), we obtain that
In summary, we have proved that \(q_i = 0\) for all \(i\in [n]\), which contradicts the assumption that \(q\ne 0\). Hence, the Hessian matrix at \(u^0\) is positive definite. \(\square \)
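The dichotomy in Lemma C.2 can be observed numerically by comparing an odd-cycle support graph (connected and non-bipartite) with an even-cycle support graph (bipartite). This sketch assumes the weighted objective \(f(u)=\sum _{i,j}C_{ij}(u_iu_j-u_i^*u_j^*)^2\), under which the Hessian at the global solution \(u^*\) reduces to \(4\,[\mathrm {diag}(C(u^*)^{\circ 2}) + C\circ u^*(u^*)^T]\); the cycle construction is illustrative and not from the paper:

```python
import numpy as np

def hessian_at_opt(C, ustar):
    # Hessian of f(u) = sum_{ij} C_ij (u_i u_j - u*_i u*_j)^2 at u = u*
    # (the residual term vanishes at the global solution)
    return 4.0 * (np.diag(C @ ustar**2) + C * np.outer(ustar, ustar))

def cycle_weights(n):
    # weight matrix supported on the n-cycle (off-diagonal entries only)
    C = np.zeros((n, n))
    for i in range(n):
        C[i, (i + 1) % n] = C[(i + 1) % n, i] = 1.0
    return C
```

An odd cycle contains an odd cycle trivially, so the Hessian is positive definite, while for an even cycle the alternating-sign direction \(q_i = (-1)^i u_i^*\) lies in the null space: for each edge \((i,j)\) across the bipartition, the contributions \(q_iu_j^*+q_ju_i^*\) cancel.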
C.2 Application of the implicit function theorem
Using the positive-definiteness of the Hessian matrix, we are able to apply the implicit function theorem to certify the existence of spurious local minima.
Lemma C.3
Suppose that \(\alpha \in [0,1]\) and consider a pair \(\left( C,u^*\right) \in \mathcal{S}\mathcal{D}\). Then, there exists a small constant \(\delta (C,u^*) > 0\) such that for every instance \(\mathcal{M}\mathcal{C}({\tilde{C}},{\tilde{u}}^*)\) satisfying
the instance \(\mathcal{M}\mathcal{C}({\tilde{C}},{\tilde{u}}^*)\) has spurious local minima.
Proof
By Theorem 4.3, there exists a global solution \(u^0\) to the instance \(\mathcal{M}\mathcal{C}\left( C,u^*\right) \) such that
Consider the system of equations:
Since the Jacobian matrix of \(\nabla g(u;C,u^*)\) with respect to u is the Hessian matrix \(\nabla ^2\,g(u;C,u^*)\) and \((u^0, C, u^*)\) is a solution, the implicit function theorem guarantees that there exists a small constant \(\delta \left( C,u^*\right) >0\) such that in the neighborhood
there exists a function \(u(\cdot ,\cdot ):\mathcal {N}\rightarrow \mathbb {R}^n\) such that
1. \(u\left( C,u^*\right) = u^0\);
2. \(u(\cdot ,\cdot )\) is a continuous function in \(\mathcal {N}\);
3. \(\nabla g[u({\tilde{C}}, {\tilde{u}}^*); {\tilde{C}}, {\tilde{u}}^*] = 0\).
Using the continuity of the Hessian matrix and \(u(\cdot ,\cdot )\), we can choose \(\delta \left( C,u^*\right) \) to be small enough such that
Therefore, the point \(u({\tilde{C}}, {\tilde{u}}^*)\) is a spurious local minimum of the instance \(\mathcal{M}\mathcal{C}({\tilde{C}}, {\tilde{u}}^*)\). \(\square \)
Cite this article
Zhang, H., Yalcin, B., Lavaei, J. et al. A new complexity metric for nonconvex rank-one generalized matrix completion. Math. Program. 207, 227–268 (2024). https://doi.org/10.1007/s10107-023-02008-5