Abstract
Identifying relevant variables among numerous potential predictors is a central task in modern regression analysis. Although stochastic search algorithms have become a dominant tool for Bayesian variable selection, their practicality is challenged by high computational cost and slow convergence when the number of potential predictors is large. In this paper, we propose a new Bayesian variable selection scheme based on a hybrid deterministic–deterministic variable selection (HD-DVS) algorithm that asymptotically guarantees rapid convergence to the global mode of the posterior model distribution. A key feature of HD-DVS is that it circumvents the iterative computation of matrix inverses, a common computational bottleneck in Bayesian variable selection. A simulation study demonstrates that the proposed method outperforms existing Bayesian and frequentist methods, and an analysis of the Bardet–Biedl syndrome gene expression data illustrates the applicability of HD-DVS to real data.
References
Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88(422):669–679
Barbieri MM, Berger JO (2004) Optimal predictive model selection. Ann Stat 32(3):870–897
Bhattacharya A, Chakraborty A, Mallick BK (2016) Fast sampling with Gaussian scale mixture priors in high-dimensional regression. Biometrika 103:985–991
Carvalho CM, Polson NG, Scott JG (2009) Handling sparsity via the horseshoe. In: Artificial intelligence and statistics. PMLR, pp 73–80
Carvalho CM, Polson NG, Scott JG (2010) The horseshoe estimator for sparse signals. Biometrika 97(2):465–480
Casella G, Moreno E (2006) Objective Bayesian variable selection. J Am Stat Assoc 101(473):157–167
Chen J, Chen Z (2008) Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95(3):759–771
Cibis H, Biyanee A, Dörner W, Mootz HD, Klempnauer KH (2020) Characterization of the zinc finger proteins ZMYM2 and ZMYM4 as novel B-MYB binding proteins. Sci Rep 10(1):8390
Deng HX, Shi Y, Yang Y, Ahmeti KB, Miller N, Huang C, Cheng L, Zhai H, Deng S, Nuytemans K et al (2016) Identification of TMEM230 mutations in familial Parkinson’s disease. Nat Genet 48(7):733–739
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
George EI, McCulloch RE (1993) Variable selection via Gibbs sampling. J Am Stat Assoc 88(423):881–889
Hans C, Dobra A, West M (2007) Shotgun stochastic search for large p regression. J Am Stat Assoc 102(478):507–516
Hindmarch C, Fry M, Yao ST, Smith PM, Murphy D, Ferguson AV (2008) Microarray analysis of the transcriptome of the subfornical organ in the rat: regulation by fluid and food deprivation. Am J Physiol Regul Integr Comp Physiol 295(6):R1914–R1920
Jin S, Goh G (2021) Bayesian selection of best subsets via hybrid search. Comput Stat 36(3):1991–2007
Johndrow J, Orenstein P, Bhattacharya A (2020) Scalable approximate MCMC algorithms for the horseshoe prior. J Mach Learn Res 21(73):1–61
Kass RE, Wasserman L (1995) A reference Bayesian test for nested hypotheses and its relationship to the Schwarz criterion. J Am Stat Assoc 90(431):928–934
Koslovsky M, Swartz MD, Leon-Novelo L, Chan W, Wilkinson AV (2018) Using the EM algorithm for Bayesian variable selection in logistic regression models with related covariates. J Stat Comput Simul 88(3):575–596
Lu TT, Shiou SH (2002) Inverses of \(2\times 2\) block matrices. Comput Math Appl 43(1–2):119–129
Moreno E, Girón J, Casella G (2015) Posterior model consistency in variable selection as the model dimension grows. Stat Sci 30(2):228–241
Narisetty NN, Shen J, He X (2018) Skinny Gibbs: a consistent and scalable Gibbs sampler for model selection. J Am Stat Assoc 114(527):1205–1217
Park T, Casella G (2008) The Bayesian lasso. J Am Stat Assoc 103(482):681–686
Raftery AE, Madigan D, Hoeting JA (1997) Bayesian model averaging for linear regression models. J Am Stat Assoc 92(437):179–191
Ročková V, George EI (2014) EMVS: the EM approach to Bayesian variable selection. J Am Stat Assoc 109(506):828–846
Ročková V, George EI (2018) The spike-and-slab lasso. J Am Stat Assoc 113(521):431–444
Ročková V, Moran G (2021) EMVS vignette
Scheetz TE, Kim KYA, Swiderski RE, Philp AR, Braun TA, Knudtson KL, Dorrance AM, DiBona GF, Huang J, Casavant TL, Sheffield VC, Stone EM (2006) Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proc Natl Acad Sci 103(39):14429–14434
Scott JG, Berger JO (2010) Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. Ann Stat 38:2587–2619
Tadesse MG, Vannucci M (2021) Handbook of Bayesian variable selection. CRC Press, Boca Raton
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J Roy Stat Soc Ser B (Methodol) 58(1):267–288
Wang H (2009) Forward regression for ultra-high dimensional variable screening. J Am Stat Assoc 104(488):1512–1524
Yang Y, Wainwright MJ, Jordan MI (2016) On the computational complexity of high-dimensional Bayesian variable selection. Ann Stat 44(6):2497–2532
Zellner A (1986) On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In: Goel PK, Zellner A (eds) Bayesian inference and decision techniques. Elsevier, New York, pp 233–243
Zhang Z (2014) The matrix ridge approximation: algorithms and applications. Mach Learn 97(3):227–258
Zhao K, Lian H (2016) The expectation–maximization approach for Bayesian quantile regression. Comput Stat Data Anal 96:1–11
Appendices
Appendix A: Proof of Theorem 3.1
Let \(\Sigma _{X_{\gamma },y}\), \(\Sigma _{X_\gamma ,X_{\gamma }}\) and \(\Sigma _{y,y}\) denote the probability limits of \(n^{-1} {\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{y}}\), \(n^{-1} {\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{X}}_\gamma\) and \(n^{-1}{\textbf{y}}^{\textrm{T}}{\textbf{y}}\), respectively. Note that
which converges to \(\Sigma _{X_{\gamma },y}^{{\textrm{T}}} \Sigma _{X_\gamma ,X_{\gamma }}^{-1} \Sigma _{X_{\gamma },y}\) in probability as \(n\rightarrow \infty\). This implies that
in probability as \(n\rightarrow \infty\), where \({\textbf{P}}_\gamma ^\perp = {\textbf{I}}_n-{\textbf{X}}_\gamma ({\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{X}}_\gamma )^{-1} {\textbf{X}}_\gamma ^{\textrm{T}}\). Hence, we can write
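For concreteness, the two limits being used here can be written out explicitly; this is a restatement under the notation defined at the start of the appendix and standard moment conditions, and the exact displays in the published proof may differ in presentation:
\[
n^{-1}{\textbf{y}}^{\textrm{T}}{\textbf{X}}_\gamma ({\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{X}}_\gamma )^{-1}{\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{y}} = \big(n^{-1}{\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{y}}\big)^{\textrm{T}}\big(n^{-1}{\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{X}}_\gamma \big)^{-1}\big(n^{-1}{\textbf{X}}_\gamma ^{\textrm{T}}{\textbf{y}}\big) \;\overset{p}{\longrightarrow }\; \Sigma _{X_{\gamma },y}^{{\textrm{T}}}\Sigma _{X_\gamma ,X_{\gamma }}^{-1}\Sigma _{X_{\gamma },y},
\]
and hence
\[
n^{-1}{\textbf{y}}^{\textrm{T}}{\textbf{P}}_\gamma ^\perp {\textbf{y}} \;\overset{p}{\longrightarrow }\; \Sigma _{y,y}-\Sigma _{X_{\gamma },y}^{{\textrm{T}}}\Sigma _{X_\gamma ,X_{\gamma }}^{-1}\Sigma _{X_{\gamma },y}.
\]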
It follows that
Also, note that
Using (A1) and (A2), we can write \(D(\gamma )\) as
where \(\text {EBIC}(\cdot )\) denotes the extended Bayesian information criterion (Chen and Chen 2008). Note that Theorem 1 of Chen and Chen (2008) implies that
in probability as \(n\rightarrow \infty\) for any \(\gamma _1\) and \(\gamma _2\) such that (a) \(\gamma _{\text {HPM}}\subsetneq \gamma _1 \subsetneq \gamma _2\) or (b) \(\gamma _{\text {HPM}}\subset \gamma _1\) and \(\gamma _{\text {HPM}} \not \subset \gamma _2\). By the asymptotic equivalence in (A3), we therefore obtain the results of Theorem 3.1.
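For reference, in the Gaussian linear model a commonly used form of the extended BIC of Chen and Chen (2008), with tuning parameter \(\zeta \in [0,1]\), is
\[
\text {EBIC}(\gamma ) = n\log \big(\text {RSS}_\gamma /n\big) + |\gamma |\log n + 2\zeta |\gamma |\log p,
\]
where \(\text {RSS}_\gamma = {\textbf{y}}^{\textrm{T}}{\textbf{P}}_\gamma ^\perp {\textbf{y}}\) is the residual sum of squares of model \(\gamma\). Loosely speaking, Theorem 1 of Chen and Chen (2008) gives the model-comparison consistency of this criterion that is invoked in the display above.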
Appendix B: Proof of Theorem 3.2
Suppose that, in Step 1, the HD-DVS algorithm (Algorithm 1) visits \(\tilde{\gamma }\) such that \(\tilde{\gamma }\supset \gamma _{\text {HPM}}\). Then, by Theorem 3.1(a), the probability that the HD-DVS algorithm converges to \(\gamma _{\text {HPM}}\) goes to one as \(n\rightarrow \infty\).
Suppose instead that, in Step 1, the HD-DVS algorithm never visits a \(\tilde{\gamma }\) such that \(\tilde{\gamma }\supset \gamma _{\text {HPM}}\). In this case, by Theorem 2 of Wang (2009) and (A3), the probability that Step 2 of the HD-DVS algorithm converges to \(\tilde{\gamma }_+\supset \gamma _\text {HPM}\) goes to one as \(n\rightarrow \infty\). The algorithm then returns to Step 1 with the initial value \(\hat{\gamma }=\tilde{\gamma }_+\) and, by the first case, converges to \(\gamma _\text {HPM}\) in probability. This completes the proof.
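To make the kind of deterministic, criterion-guided search discussed in this proof concrete, the following is a minimal sketch of a single-flip hill climb on EBIC. It is an illustration only, not the authors' HD-DVS implementation: the helper names (ebic, greedy_search), the single-flip neighborhood, and the default value of the EBIC parameter zeta are assumptions of this sketch.

import numpy as np

def ebic(y, X, gamma, zeta=0.5):
    # Extended BIC for a Gaussian linear model (hypothetical helper, not from the paper).
    n, p = X.shape
    k = int(gamma.sum())
    if k == 0:
        rss = float(y @ y)
    else:
        Xg = X[:, gamma]
        beta_hat, *_ = np.linalg.lstsq(Xg, y, rcond=None)
        rss = float(np.sum((y - Xg @ beta_hat) ** 2))
    return n * np.log(rss / n) + k * np.log(n) + 2.0 * zeta * k * np.log(p)

def greedy_search(y, X, gamma0=None, zeta=0.5, max_sweeps=100):
    # Deterministic single-flip hill climbing: at each sweep, flip the inclusion
    # indicator that most decreases EBIC; stop at a local minimum.
    n, p = X.shape
    gamma = np.zeros(p, dtype=bool) if gamma0 is None else gamma0.copy()
    score = ebic(y, X, gamma, zeta)
    for _ in range(max_sweeps):
        best_j, best_score = None, score
        for j in range(p):
            cand = gamma.copy()
            cand[j] = not cand[j]
            s = ebic(y, X, cand, zeta)
            if s < best_score:
                best_j, best_score = j, s
        if best_j is None:
            break
        gamma[best_j] = not gamma[best_j]
        score = best_score
    return gamma, score

# Toy usage: three truly active predictors out of twenty.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + rng.standard_normal(n)
gamma_hat, score_hat = greedy_search(y, X)
print(np.flatnonzero(gamma_hat), round(score_hat, 2))

Restarting such a deterministic search from a superset of the target model, as in the two-step argument above, is what allows the second pass to terminate at the highest-posterior-probability model.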