Abstract
A novel orthogonalization-free method, together with two specific algorithms, is proposed to address extreme eigenvalue problems. Building on gradient-based algorithms, the proposed algorithms modify the multicolumn gradient so that earlier columns are decoupled from later ones. Locally, both algorithms converge linearly, with convergence rates depending on eigengaps. Momentum acceleration, exact line search, and column locking are incorporated to accelerate the algorithms and reduce their computational costs. We demonstrate the efficiency of both algorithms on random matrices with different spectral distributions and on matrices from computational chemistry.
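To make the decoupling idea concrete, the following is a minimal numerical sketch, not the paper's exact \(g_1\) or \(g_2\): in a plain gradient-style iteration, the symmetric coupling term \(X^\top A X\) is replaced by its upper-triangular part, so that the update of column j depends only on columns 1 through j. The shift, step-size rule, and test matrix below are illustrative assumptions.

```python
import numpy as np

def triangularized_direction(A, X):
    # Column j is coupled only to columns 1..j through the upper-triangular
    # part of X^T A X (a sketch of the decoupling idea, not the paper's g_1/g_2).
    AX = A @ X
    return AX - X @ np.triu(X.T @ AX)

def triofm_sketch(A, p, iters=5000, seed=0):
    # Gradient-style iteration for the p lowest eigenvectors of symmetric A;
    # A is shifted to be negative definite, and a conservative fixed step is used.
    n = A.shape[0]
    evals = np.linalg.eigvalsh(A)
    sigma = evals.max() + 1.0
    As = A - sigma * np.eye(n)
    step = 0.5 / np.abs(evals - sigma).max()
    X = 0.1 * np.random.default_rng(seed).standard_normal((n, p))
    for _ in range(iters):
        X = X - step * triangularized_direction(As, X)
    return X  # columns approximate the eigenvectors of the p smallest eigenvalues

if __name__ == "__main__":
    n, p = 100, 5
    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A = Q @ np.diag(-np.arange(1.0, n + 1.0)) @ Q.T  # negative definite, unit eigengaps
    X = triofm_sketch(A, p)
    print(np.linalg.norm(X.T @ X - np.eye(p)))   # near 0: columns become orthonormal
    print(np.round(np.diag(X.T @ A @ X), 3))     # near the p smallest eigenvalues -100, ..., -96
```

No explicit orthogonalization is performed; near convergence the columns become orthonormal because each column is implicitly deflated against the earlier ones.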
Data Availability
Enquiries about data availability should be directed to the authors.
Acknowledgements
The authors thank Jianfeng Lu and Zhe Wang for helpful discussions.
Funding
W. Gao was partially supported by the National Key R&D Program of China under Grant Nos. 2020YFA0711900 and 2020YFA0711902 and by the National Natural Science Foundation of China under Grant Nos. 71991471 and U1811461. Y. Li and B. Lu were partially supported by the National Natural Science Foundation of China under Grant No. 12271109.
Ethics declarations
Conflict of interest
The authors have not disclosed any competing interest.
A Proof of Theorem 4
Proof of Theorem 4
All fixed points of (16) satisfy \(g_2(X) = 0\). We first analyze the fixed points in the single-column case and then complete the proof by induction. The notation used in this proof is the same as in the proof of Theorem 3.
We denote the single-column X by x. Obviously, when \(x = 0\), we have \(g_2(x) = 0\). Now consider the nontrivial case \(x \ne 0\). The equality \(g_2(x) = 0\) can be expanded as,
According to (45), for nonzero x, the matrix \(B = \left(2 - x^\top x\right) A - \left(x^\top A x\right) I\) must have a zero eigenvalue and x must lie in the corresponding eigenspace. When \(x^\top x = 2\), we have \(B = -\left(x^\top A x\right) I\), which has no zero eigenvalue due to the negative definiteness assumption on A; hence \(x^\top x \ne 2\). Therefore x is parallel to one of A's eigenvectors, i.e., \(Ax = \lambda x\). Substituting this into (45), we obtain,
Since \(\lambda < 0\) and \(x \ne 0\), we have \(x^\top x = 1\). Hence we conclude that any x with \(g_2(x) = 0\) is either the zero vector or a unit eigenvector of A.
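As a quick sanity check of this characterization, the snippet below assumes that (45) has the form \((2 - x^\top x) A x - (x^\top A x)\, x = 0\), which is consistent with the definition of B above, and verifies numerically that a unit eigenvector of a negative definite A annihilates it while rescaled or generic vectors do not.

```python
import numpy as np

def g2_single(A, x):
    # Assumed form of (45): (2 - x^T x) A x - (x^T A x) x
    return (2.0 - x @ x) * (A @ x) - (x @ A @ x) * x

rng = np.random.default_rng(0)
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(-np.arange(1.0, n + 1.0)) @ Q.T       # negative definite test matrix

_, V = np.linalg.eigh(A)
print(np.linalg.norm(g2_single(A, V[:, 0])))          # ~0: unit eigenvector is a fixed point
print(np.linalg.norm(g2_single(A, 2.0 * V[:, 0])))    # nonzero: rescaling breaks stationarity
print(np.linalg.norm(g2_single(A, rng.standard_normal(n))))  # nonzero: generic vector
```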
Now we consider the multicolumn case. The first column of \(g_2(X) = 0\) is the same as (45). Hence \(X_1 = U P_1 S_1\).
Assume the first i columns of X obey \(X_i = U P_i S_i\). Then the \((i+1)\)-th column of \(g_2(X) = 0\) is
Obviously, if \(x_{i+1} = 0\), then (47) holds. When \(x_{i+1} \ne 0\), we left-multiply (47) by \(X_i^\top\), use the commuting property of diagonal matrices, and obtain,
where the second equality uses the fact that \(P_i^\top \Lambda P_i P_i^\top = P_i^\top \Lambda\). Due to the negative definiteness of A, the matrix \(x_{i+1}^\top x_{i+1} \Lambda + x_{i+1}^\top A x_{i+1} I\) is diagonal with strictly negative diagonal entries. Hence the equality (48) is equivalent to
As long as (49) holds, we have \(X_i^\top x_{i+1} = 0\) and \(X_i^\top A x_{i+1} = 0\). Therefore, solving (47) reduces to solving
Hence \(x_{i+1}\) satisfies (49). Combining the solution of the single-column case (45) with the constraint (49), we conclude that \(X_{i+1}\) is of the form \(U P_{i+1} S_{i+1}\).
The stability of the fixed points is analyzed through the spectral properties of their Jacobian matrices. The Jacobian matrix \(\mathrm{D}g_2(X)\), again, can be written as a p-by-p block matrix. Using the same argument as in the proof of Theorem 3, \(\mathrm{D}g_2(X) = \mathrm{D}G\) is a block upper triangular matrix whose spectrum is determined by the spectra of its diagonal blocks. Through multivariable calculus, we obtain the expression for \(J_{ii}\) as,
We first show the stability of the fixed points of the form \(X = U_p D\). Substituting these points into (51), we have,
Since \(\lambda_i\) is smaller than all eigenvalues of \(A - U_i \Lambda_i U_i^\top\), the matrix \(A - U_i \Lambda_i U_i^\top - \lambda_i I\) is strictly positive definite. The remaining part of (51) is obviously positive definite. Hence \(J_{ii}\) is strictly positive definite for all \(i = 1, 2, \dots, p\), and fixed points of the form \(X = U_p D\) are stable.
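This positive definiteness claim is easy to verify numerically: for a negative definite A with distinct eigenvalues, \(A - U_i \Lambda_i U_i^\top - \lambda_i I\) has eigenvalues \(-\lambda_i\) (with multiplicity i) and \(\lambda_j - \lambda_i\) for \(j > i\), all strictly positive. A minimal check on an arbitrary test matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 40, 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(-np.arange(1.0, n + 1.0)) @ Q.T   # negative definite, distinct eigenvalues

lam, U = np.linalg.eigh(A)                        # ascending: lam[0] = lambda_1
for i in range(1, p + 1):
    Ui, Li = U[:, :i], np.diag(lam[:i])
    M = A - Ui @ Li @ Ui.T - lam[i - 1] * np.eye(n)
    # Smallest eigenvalue equals min(-lambda_i, lambda_{i+1} - lambda_i) > 0.
    print(i, np.linalg.eigvalsh(M).min())
```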
Next, we show that the remaining fixed points are not stable. For such a fixed point X, we denote by s the first index such that \(x_s^\top u_s = 0\). Then we estimate \(u_s^\top J_{ss} u_s\) as,
since \(x_s^\top x_s \le 1\) and A is negative definite. Therefore, the remaining fixed points are not stable. \(\square\)