Abstract
Model averaging has received much attention in recent years. This paper considers semiparametric model averaging for high-dimensional longitudinal data. To minimize the prediction error, the authors estimate the model weights using a leave-subject-out cross-validation procedure. The proposed method is proved to be asymptotically optimal, in the sense that leave-subject-out cross-validation asymptotically achieves the lowest possible prediction loss. Simulation studies show that the proposed model averaging method performs much better than some commonly used model selection and model averaging methods.
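As a rough illustration of the weighting idea described in the abstract, the following Python sketch chooses a model averaging weight by leave-subject-out cross-validation. It is a simplified toy under stated assumptions, not the authors' estimator: the two candidate models are plain one-covariate linear regressions rather than semiparametric fits, the data are simulated, and all function and variable names (`fit_line`, `lso_predictions`, `best_weight`, and so on) are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def lso_predictions(subject_ids, xs, ys):
    """Predict each observation from a model fitted without its whole subject."""
    preds = [0.0] * len(ys)
    for s in set(subject_ids):
        # Drop every observation of subject s, not just a single row.
        train = [i for i, sid in enumerate(subject_ids) if sid != s]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i, sid in enumerate(subject_ids):
            if sid == s:
                preds[i] = a + b * xs[i]
    return preds


def cv_loss(ys, p1, p2, w):
    """Squared cross-validation error of the weighted average w*p1 + (1-w)*p2."""
    return sum((y - w * a - (1 - w) * b) ** 2 for y, a, b in zip(ys, p1, p2))


def best_weight(ys, p1, p2):
    """Weight on model 1 minimizing cv_loss over [0, 1] (two-model case only)."""
    num = sum((y - b) * (a - b) for y, a, b in zip(ys, p1, p2))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if den == 0:
        return 0.5  # identical predictions: any weight gives the same loss
    # The loss is a convex quadratic in w, so clipping the unconstrained
    # minimizer to [0, 1] gives the constrained minimizer.
    return min(1.0, max(0.0, num / den))


# Simulated longitudinal data (assumption): 30 subjects, 4 visits each,
# with a shared random effect that correlates visits within a subject.
random.seed(0)
subject_ids, x1, x2, ys = [], [], [], []
for s in range(30):
    effect = random.gauss(0, 1)
    for _ in range(4):
        u, v = random.gauss(0, 1), random.gauss(0, 1)
        subject_ids.append(s)
        x1.append(u)
        x2.append(v)
        ys.append(1.0 * u + 0.5 * v + effect + random.gauss(0, 0.5))

p1 = lso_predictions(subject_ids, x1, ys)  # candidate model using x1 only
p2 = lso_predictions(subject_ids, x2, ys)  # candidate model using x2 only
w = best_weight(ys, p1, p2)
print("weight on model 1:", w)
print("CV loss at chosen weight:", cv_loss(ys, p1, p2, w))
```

Leaving out a whole subject at a time, rather than a single observation, keeps correlated repeated measurements from the same subject out of the training fold. With more than two candidate models, the closed-form weight above would have to be replaced by a quadratic program over the weight simplex.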
Additional information
This research was supported by the Ministry of Science and Technology of China under Grant No. 2016YFB0502301, Academy for Multidisciplinary Studies of Capital Normal University, and the National Natural Science Foundation of China under Grant Nos. 11971323 and 11529101.
This paper was recommended for publication by Editor LI Qizhai.
About this article
Cite this article
Zhao, Z., Zou, G. Average Estimation of Semiparametric Models for High-Dimensional Longitudinal Data. J Syst Sci Complex 33, 2013–2047 (2020). https://doi.org/10.1007/s11424-020-9343-1
DOI: https://doi.org/10.1007/s11424-020-9343-1