Abstract
We propose two fast covariance smoothing methods, and associated software, that scale linearly with the number of observations per function. Most available methods and software cannot smooth covariance matrices of dimension \(J>500\); the recently introduced sandwich smoother is an exception, but it is not adapted to covariance matrices of very large dimension, such as \(J=10{,}000\). We introduce two new methods that circumvent these problems: (1) a fast implementation of the sandwich smoother for covariance smoothing; and (2) a two-step procedure that first obtains the singular value decomposition of the data matrix and then smooths the eigenvectors. The new approaches are at least an order of magnitude faster in high dimensions and drastically reduce memory requirements: smoothing is essentially instantaneous (a few seconds) for matrices of dimension \(J=10{,}000\) and very fast (\(<\)10 min) for \(J=100{,}000\). R functions, simulations, and data analyses provide ready-to-use, reproducible, and scalable tools for the practical analysis of noisy high-dimensional functional data.
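The methods are implemented in the R package refund (Crainiceanu et al. 2013). The sketch below shows a typical call, assuming the fpca.face function and its arguments (Y, knots, pve) as in refund version 0.1-9; argument names may differ in later releases.

```r
# Minimal sketch: FACE smoothing of noisy high-dimensional functional data.
# Assumes refund's fpca.face (version 0.1-9); arguments may differ elsewhere.
library(refund)

set.seed(1)
I <- 50; J <- 3000                                  # I curves on a grid of J points
t <- seq(0, 1, length.out = J)
X <- outer(rnorm(I), sin(2 * pi * t)) +             # smooth rank-2 signal
     outer(rnorm(I, sd = 0.5), cos(2 * pi * t))
Y <- X + matrix(rnorm(I * J, sd = 0.2), I, J)       # add white noise

fit <- fpca.face(Y, knots = 100, pve = 0.99)        # fast covariance smoothing + FPCA
dim(fit$efunctions)                                 # smoothed eigenvectors, J x N
fit$evalues                                         # estimated eigenvalues
```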
References
Besse, P., Cardot, H., Ferraty, F.: Simultaneous nonparametric regressions of unbalanced longitudinal data. Comput. Stat. Data Anal. 24, 255–270 (1997)
Besse, P., Ramsay, J.O.: Principal components analysis of sampled functions. Psychometrika 51, 285–311 (1986)
Bunea, F., Xiao, L.: On the sample covariance matrix estimator of reduced effective rank population matrices, with applications to fPCA. To appear in Bernoulli. http://arxiv.org/abs/1212.5321 (2013)
Capra, W., Müller, H.: An accelerated-time model for response curves. J. Am. Stat. Assoc. 92, 72–83 (1997)
Cardot, H.: Nonparametric estimation of smoothed principal components analysis of sampled noisy functions. J. Nonparametr. Stat. 12, 503–538 (2000)
Crainiceanu, C., Reiss, P., Goldsmith, J., Huang, L., Huo, L., Scheipl, F., Swihart, B., Greven, S., Harezlak, J., Kundu, M., Zhao, Y., McLean, M., Xiao, L.: R package refund: methodology for regression with functional data (version 0.1-9). http://cran.r-project.org/web/packages/refund/index.html (2013)
Crainiceanu, C., Staicu, A., Di, C.: Generalized multilevel functional regression. J. Am. Stat. Assoc. 104, 1550–1561 (2009)
Crainiceanu, C., Staicu, A., Ray, S., Punjabi, N.: Bootstrap-based inference on the difference in the means of two correlated functional processes. Stat. Med. 31, 3223–3240 (2012)
Craven, P., Wahba, G.: Smoothing noisy data with spline functions. Numer. Math. 31, 377–403 (1979)
Cummins, D., Filloon, T., Nychka, D.: Confidence intervals for nonparametric curve estimates: toward more uniform pointwise coverage. J. Am. Stat. Assoc. 96, 233–246 (2001)
Dauxois, J., Pousse, A., Romain, Y.: Asymptotic theory for the principal component analysis of a vector random function: some applications to statistical inference. J. Multivar. Anal. 12, 136–154 (1982)
Di, C., Crainiceanu, C.M., Caffo, B.S., Punjabi, N.: Multilevel functional principal component analysis. Ann. Appl. Stat. 3, 458–488 (2009)
Eilers, P., Marx, B.: Flexible smoothing with B-splines and penalties (with Discussion). Stat. Sci. 11, 89–121 (1996)
Eilers, P., Marx, B.: Multivariate calibration with temperature interaction using two-dimensional penalized signal regression. Chemometr. Intell. Lab. Syst. 66, 159–174 (2003)
Goldsmith, J., Bobb, J., Crainiceanu, C., Caffo, B., Reich, D.: Penalized functional regression. J. Comput. Graph. Stat. 20, 830–851 (2011)
Greven, S., Crainiceanu, C., Caffo, B., Reich, D.: Longitudinal functional principal component analysis. Electron. J. Stat. 4, 1022–1054 (2010)
Karhunen, K.: Über lineare Methoden in der Wahrscheinlichkeitsrechnung. Annales Academiae Scientiarum Fennicae 37, 1–79 (1947)
Kim, Y.J., Gu, C.: Smoothing spline Gaussian regression: more scalable computation via efficient approximation. J. R. Stat. Soc. B 66, 337–356 (2004)
Kneip, A.: Nonparametric estimation of common regressors for similar curve data. Ann. Stat. 22, 1386–1427 (1994)
Marx, B., Eilers, P.: Multidimensional penalized signal regression. Technometrics 47, 13–22 (2005)
Ramsay, J., Dalzell, C.J.: Some tools for functional data analysis (with Discussion). J. R. Stat. Soc. B 53, 539–572 (1991)
Ramsay, J., Silverman, B.: Functional Data Analysis, 2nd edn. Springer, New York (2005)
Ramsay, J., Silverman, B.W.: Applied Functional Data Analysis: Methods and Case Studies. Springer, New York (2002)
Rice, J., Silverman, B.: Estimating the mean and covariance structure nonparametrically when the data are curves. J. R. Stat. Soc. B 53, 233–243 (1991)
Ruppert, D.: Selecting the number of knots for penalized splines. J. Comput. Graph. Stat. 11, 735–757 (2002)
Ruppert, D., Wand, M., Carroll, R.: Semiparametric Regression. Cambridge University Press, Cambridge (2003)
Seber, G.: A Matrix Handbook for Statisticians. Wiley-Interscience, New Jersey (2007)
Shinohara, R., Crainiceanu, C., Caffo, B., Reich, D.: Longitudinal analysis of spatio-temporal processes: a case study of dynamic contrast-enhanced magnetic resonance imaging in multiple sclerosis. http://biostats.bepress.com/jhubiostat/paper231/ (2014)
Shou, H., Zipunnikov, V., Crainiceanu, C., Greven, S.: Structured functional principal component analysis. http://arxiv.org/pdf/1304.6783.pdf (2013)
Staniswalis, J., Lee, J.: Nonparametric regression analysis of longitudinal data. J. Am. Stat. Assoc. 93, 1403–1418 (1998)
Swihart, B., Caffo, B., Crainiceanu, C., Punjabi, N.: Mixed effect Poisson log-linear models for clinical and epidemiological sleep hypnogram data. Stat. Med. 31, 855–870 (2012)
Wang, X., Shen, J., Ruppert, D.: On the asymptotics of penalized spline smoothing. Electron. J. Stat. 5, 1–17 (2011)
Wood, S.: Thin plate regression splines. J. R. Stat. Soc. B 65, 95–114 (2003)
Wood, S.: R package mgcv: mixed GAM computation vehicle with GCV/AIC/REML smoothness estimation (version 1.7-24). http://cran.r-project.org/web/packages/mgcv/index.html (2013)
Xiao, L., Li, Y., Apanasovich, T., Ruppert, D.: Local asymptotics of P-splines. http://arxiv.org/abs/1201.0708v3 (2012)
Xiao, L., Li, Y., Ruppert, D.: Fast bivariate P-splines: the sandwich smoother. J. R. Stat. Soc. B 75, 577–599 (2013)
Yao, F., Müller, H., Clifford, A., Dueker, S., Follett, J., Lin, Y., Buchholz, B., Vogel, J.: Shrinkage estimation for functional principal component scores with application to the population kinetics of plasma folate. Biometrics 59, 676–685 (2003)
Yao, F., Müller, H., Wang, J.: Functional data analysis for sparse longitudinal data. J. Am. Stat. Assoc. 100, 577–590 (2005)
Zhang, J., Chen, J.: Statistical inferences for functional data. Ann. Stat. 35, 1052–1079 (2007)
Zipunnikov, V., Caffo, B.S., Crainiceanu, C.M., Yousem, D., Davatzikos, C., Schwartz, B.: Multilevel functional principal component analysis for high-dimensional data. J. Comput. Graph. Stat. 20, 852–873 (2011)
Zipunnikov, V., Greven, S., Shou, H., Caffo, B.S., Reich, D.S., Crainiceanu, C.: Longitudinal high-dimensional principal components analysis with application to diffusion tensor imaging of multiple sclerosis. To appear in Ann. Appl. Stat. http://biostats.bepress.com/jhubiostat/paper234/ (2012)
Acknowledgments
This work was supported by Grant Number R01EB012547 from the National Institute of Biomedical Imaging and Bioengineering and Grant Number R01NS060910 from the National Institute of Neurological Disorders and Stroke. This work represents the opinions of the researchers and not necessarily those of the granting organizations.
Appendices
Appendix 1: Proofs
Proof of Proposition 1
The design matrix \(\mathbf{B}\) is of full rank (Xiao et al. 2012). Hence \(\mathbf{B}^T\mathbf{B}\) is invertible and \(\mathbf{A}_S\) is of rank \(c\). \({\varvec{\Sigma}}_S\) is a diagonal matrix with all diagonal elements greater than 0, and \({\widetilde{\mathbf{Y}}}\) is of rank at most \(\min(c,I)\). Hence \({\widetilde{\mathbf{K}}}= \mathbf{A}_S\left(I^{-1}{\varvec{\Sigma}}_S{\widetilde{\mathbf{Y}}}{\widetilde{\mathbf{Y}}}^T{\varvec{\Sigma}}_S\right)\mathbf{A}_S^T\) has rank at most \(\min(c,I)\), and the proposition follows. \(\square\)
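A quick numerical check of the rank bound; this sketch uses a generic matrix \(\mathbf{A}_S\) with orthonormal columns in place of the spline-based one.

```r
# Check rank(K_tilde) <= min(c, I) for
# K_tilde = A_S (I^{-1} Sigma_S Ytil Ytil^T Sigma_S) A_S^T.
set.seed(6)
J <- 100; nc <- 20; I <- 8                       # here I < c, so the bound is I
A_S   <- qr.Q(qr(matrix(rnorm(J * nc), J, nc)))  # generic J x c, orthonormal columns
Sigma <- diag(runif(nc))                         # c x c diagonal, positive entries
Ytil  <- matrix(rnorm(nc * I), nc, I)
K     <- A_S %*% (Sigma %*% Ytil %*% t(Ytil) %*% Sigma / I) %*% t(A_S)
qr(K)$rank                                       # 8 = min(c, I)
```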
Proof of Proposition 2
First, \(\text{tr}(\mathbf{S}) = \text{tr}({\varvec{\Sigma}}_S)\), which is easy to calculate. We now compute \(\sum_{i=1}^I \Vert \mathbf{Y}_i - \mathbf{S}\mathbf{Y}_i\Vert^2\). Because \(\Vert \mathbf{Y}_i - \mathbf{S}\mathbf{Y}_i\Vert^2 = \mathbf{Y}_i^T(\mathbf{S}-\mathbf{I}_J)^2\mathbf{Y}_i = \text{tr}\{(\mathbf{S}-\mathbf{I}_J)^2\mathbf{Y}_i\mathbf{Y}_i^T\}\),
\[ \sum_{i=1}^I \Vert \mathbf{Y}_i - \mathbf{S}\mathbf{Y}_i\Vert^2 = \text{tr}\left\{(\mathbf{S}-\mathbf{I}_J)^2\mathbf{Y}\mathbf{Y}^T\right\} = \text{tr}(\mathbf{S}^2\mathbf{Y}\mathbf{Y}^T) - 2\,\text{tr}(\mathbf{S}\mathbf{Y}\mathbf{Y}^T) + \text{tr}(\mathbf{Y}\mathbf{Y}^T). \]
It can be shown that \(\mathbf{S}^2 = \mathbf{A}_S {\varvec{\Sigma}}_S^2\mathbf{A}_S^T\). Hence \(\text{tr}(\mathbf{S}^2\mathbf{Y}\mathbf{Y}^T) = \text{tr}(\mathbf{Y}^T\mathbf{S}^2\mathbf{Y}) = \text{tr}({\widetilde{\mathbf{Y}}}^T{\varvec{\Sigma}}_S^2{\widetilde{\mathbf{Y}}})=\text{tr}({\varvec{\Sigma}}_S^2{\widetilde{\mathbf{Y}}}{\widetilde{\mathbf{Y}}}^T)\). Similarly, \(\text{tr}(\mathbf{S}\mathbf{Y}\mathbf{Y}^T) = \text{tr}({\varvec{\Sigma}}_S{\widetilde{\mathbf{Y}}}{\widetilde{\mathbf{Y}}}^T)\), and \(\text{tr}(\mathbf{Y}\mathbf{Y}^T) = \Vert \mathbf{Y}\Vert_F^2\). It follows that
\[ \sum_{i=1}^I \Vert \mathbf{Y}_i - \mathbf{S}\mathbf{Y}_i\Vert^2 = \text{tr}({\varvec{\Sigma}}_S^2{\widetilde{\mathbf{Y}}}{\widetilde{\mathbf{Y}}}^T) - 2\,\text{tr}({\varvec{\Sigma}}_S{\widetilde{\mathbf{Y}}}{\widetilde{\mathbf{Y}}}^T) + \Vert \mathbf{Y}\Vert_F^2. \]
\(\square \)
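These identities are what make GCV evaluation cheap: every trace involves only the \(c\times c\) matrices \({\varvec{\Sigma}}_S\) and \({\widetilde{\mathbf{Y}}}{\widetilde{\mathbf{Y}}}^T\) rather than the \(J\times J\) smoother \(\mathbf{S}\). A numerical check of the key identity, again with a generic orthonormal \(\mathbf{A}_S\):

```r
# Check tr(S^2 Y Y^T) = tr(Sigma_S^2 Ytil Ytil^T) for S = A_S Sigma_S A_S^T
# with A_S^T A_S = I_c; A_S is a generic stand-in for the spline-based matrix.
set.seed(2)
J <- 200; nc <- 10; I <- 15
A_S   <- qr.Q(qr(matrix(rnorm(J * nc), J, nc)))
Sigma <- diag(runif(nc))
Y     <- matrix(rnorm(J * I), J, I)
S     <- A_S %*% Sigma %*% t(A_S)                       # J x J smoother
Ytil  <- crossprod(A_S, Y)                              # c x I, as in the proof

lhs <- sum(diag(S %*% S %*% Y %*% t(Y)))                # expensive: J x J products
rhs <- sum(diag(Sigma %*% Sigma %*% Ytil %*% t(Ytil)))  # cheap: c x c products
all.equal(lhs, rhs)                                     # TRUE up to rounding
```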
Proposition 3
The computation time of FACE is \(O(IJc+Jc^2+c^3+ck_0)\), where \(k_0\) is the number of iterations needed to select the smoothing parameter (see Sect. 3.2), and the total memory required is \(O(JI+I^2+Jc+c^2+k_0)\) units.
Proof of Proposition 3
We need to compute or store the following quantities: \(\mathbf{X}\), \(\mathbf{B}\), \(\mathbf{B}^T\mathbf{B}\), \((\mathbf{B}^T\mathbf{B})^{-1/2}\), \(\mathbf{P}\), \((\mathbf{B}^T\mathbf{B})^{-1/2}\mathbf{P}(\mathbf{B}^T\mathbf{B})^{-1/2}\), \(\mathbf{A}_S\), \({\widetilde{\mathbf{Y}}}\), \(\mathbf{A}\), \(\mathbf{U}\), and \(\mathbf{A}_S\mathbf{A}\). As for computational complexity, \(\mathbf{B}^T\mathbf{B}\), \(\mathbf{A}_S = \mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1/2}\mathbf{U}\), and \(\mathbf{A}_S\mathbf{A}\) require \(O(Jc^2)\) computations; \((\mathbf{B}^T\mathbf{B})^{-1/2}\), \(\mathbf{P}\), \((\mathbf{B}^T\mathbf{B})^{-1/2}\mathbf{P}(\mathbf{B}^T\mathbf{B})^{-1/2}\), \(\mathbf{A}\), and \(\mathbf{U}\) require \(O(c^3)\) computations; and \({\widetilde{\mathbf{Y}}}= \mathbf{A}_S^T\mathbf{Y}\) requires \(O(JIc)\) computations. In total, \(O(JIc+Jc^2+c^3)\) computations are required. As for memory, loading \(\mathbf{Y}\) requires \(O(JI)\) memory units, computing \(\mathbf{B}\) and \(\mathbf{A}_S\mathbf{A}\) requires \(O(Jc)\) memory units, and the remaining objects require \(O(c^2)\) memory units. \(\square\)
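To make the accounting concrete, here is a sketch of the pipeline with the per-step costs from the proof as comments. The matrix names follow the proof, but \(\mathbf{B}\) and \(\mathbf{P}\) are generic stand-ins (a random design and a second-order difference penalty), not the paper's B-spline versions.

```r
# FACE pipeline with per-step costs (c = nc basis functions).
set.seed(3)
J <- 5000; I <- 100; nc <- 50
Y <- matrix(rnorm(J * I), J, I)                 # loading Y: O(JI) memory
B <- matrix(rnorm(J * nc), J, nc)               # stand-in design: O(Jc) memory
P <- crossprod(diff(diag(nc), differences = 2)) # stand-in penalty: O(c^2) memory

BtB       <- crossprod(B)                       # B^T B: O(Jc^2) flops
e         <- eigen(BtB, symmetric = TRUE)       # for (B^T B)^{-1/2}: O(c^3)
BtB_isqrt <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
M         <- BtB_isqrt %*% P %*% BtB_isqrt      # O(c^3)
U         <- eigen(M, symmetric = TRUE)$vectors # O(c^3)
A_S       <- B %*% BtB_isqrt %*% U              # O(Jc^2)
Ytil      <- crossprod(A_S, Y)                  # A_S^T Y: O(JIc), the dominant term
```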
Proof of Theorem 1
We have \({\widehat{{\varvec{\xi}}}}_i = J^{-1/2}(\mathbf{A}_S{\widehat{\mathbf{A}}}_N)^T\mathbf{Y}_i = J^{-1/2}{\widehat{\mathbf{A}}}_N^T(\mathbf{A}_S^T\mathbf{Y}_i) = J^{-1/2}{\widehat{\mathbf{A}}}_N^T{\widetilde{\mathbf{Y}}}_i\). \(\square\)
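The practical point is that, once \({\widetilde{\mathbf{Y}}}\) is available, computing the scores never requires a \(J\)-dimensional product. A standalone sketch, in which A_hat_N is a hypothetical \(c\times N\) matrix standing in for the estimated coefficient matrix:

```r
# Scores via the low-dimensional representation: O(cN) per curve, not O(JN).
set.seed(5)
J <- 5000; nc <- 50; I <- 100; N <- 4
Ytil    <- matrix(rnorm(nc * I), nc, I)            # stand-in for A_S^T Y (c x I)
A_hat_N <- qr.Q(qr(matrix(rnorm(nc * N), nc, N)))  # hypothetical c x N estimate
xi_hat  <- crossprod(A_hat_N, Ytil) / sqrt(J)      # J^{-1/2} A_hat_N^T Ytil, N x I
```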
Proof of Theorem 2
Let \({\widetilde{\mathbf{A}}}_N\) denote the matrix of the first \(N\) columns of \(\mathbf{A}_S\mathbf{A}\); then \({\widetilde{\mathbf{A}}}_N = \mathbf{A}_S{\widehat{\mathbf{A}}}_N\). The estimated BLUP of \({\varvec{\xi}}_i\) (Ruppert et al. 2003) is
The inverse matrix in the above equality can be replaced by the following [Seber (2007), page 309, equality b(i)],
It follows that
\(\square \)
Appendix 2: Empirical covariance operators for \(K_X\) and \(K_U\)
Let \(I\) denote the number of pairs of cases and controls. For simplicity, we assume that estimates of \(\mu_A(t)\) and \(\mu_C(t)\) have been subtracted from \(Y_{iA}\) and \(Y_{iC}\), respectively. Let \(\mathbf{Y}_{iA} = (Y_{iA}(t_1),\ldots, Y_{iA}(t_J))^T\) and \(\mathbf{Y}_{iC} = (Y_{iC}(t_1),\ldots, Y_{iC}(t_J))^T\). By Zipunnikov et al. (2011), we have estimates of the covariance operators,
and
Let \(\mathbf{Y}_A = [\mathbf{Y}_{1A},\ldots, \mathbf{Y}_{IA}]\), \(\mathbf{Y}_C = [\mathbf{Y}_{1C},\ldots, \mathbf{Y}_{IC}]\), and \(\mathbf{Y}= [\mathbf{Y}_A,\mathbf{Y}_C]\). Then \(\mathbf{Y}\) is of dimension \(J\times 2I\). It can be shown that \({\widehat{\mathbf{K}}}_X = \mathbf{Y}\mathbf{H}_X\mathbf{Y}^T\) and \({\widehat{\mathbf{K}}}_U = \mathbf{Y}\mathbf{H}_U\mathbf{Y}^T\), where
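The quadratic-form representation \(\mathbf{Y}\mathbf{H}\mathbf{Y}^T\) is what allows both estimators to be smoothed without ever forming a \(J\times J\) matrix explicitly. The sketch below illustrates the representation using assumed method-of-moments forms for \({\widehat{\mathbf{K}}}_X\) and \({\widehat{\mathbf{K}}}_U\) in a paired design; the paper's exact \(\mathbf{H}_X\) and \(\mathbf{H}_U\) are given in the omitted displays and may differ.

```r
# Sketch: paired-design covariance estimators written as Y %*% H %*% t(Y).
# The moment-estimator forms below are assumptions, not necessarily the
# paper's exact H_X and H_U.
set.seed(4)
J <- 300; I <- 40
Y_A <- matrix(rnorm(J * I), J, I)
Y_C <- matrix(rnorm(J * I), J, I)
Y   <- cbind(Y_A, Y_C)                               # J x 2I

# Assumed estimators:
#   K_X_hat = (2I)^{-1} sum_i (Y_iA Y_iC^T + Y_iC Y_iA^T)
#   K_U_hat = (2I)^{-1} sum_i (Y_iA - Y_iC)(Y_iA - Y_iC)^T
K_X <- (Y_A %*% t(Y_C) + Y_C %*% t(Y_A)) / (2 * I)
K_U <- (Y_A - Y_C) %*% t(Y_A - Y_C) / (2 * I)

# The same estimators as Y H Y^T, with 2I x 2I weight matrices:
Z   <- matrix(0, I, I)
H_X <- rbind(cbind(Z, diag(I)), cbind(diag(I), Z)) / (2 * I)
H_U <- rbind(cbind(diag(I), -diag(I)), cbind(-diag(I), diag(I))) / (2 * I)
all.equal(K_X, Y %*% H_X %*% t(Y))                   # TRUE
all.equal(K_U, Y %*% H_U %*% t(Y))                   # TRUE
```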