Abstract
Bayesian likelihood-free methods implement Bayesian inference using simulation of data from the model to substitute for intractable likelihood evaluations. Most likelihood-free inference methods replace the full data set with a summary statistic before performing Bayesian inference, and the choice of this statistic is often difficult. The summary statistic should be low-dimensional for computational reasons, while retaining as much information as possible about the parameter. Using a recent idea from the interpretable machine learning literature, we develop some regression-based diagnostic methods which are useful for detecting when different parts of a summary statistic vector contain conflicting information about the model parameters. Conflicts of this kind complicate summary statistic choice, and detecting them can provide insight into model deficiencies and guide model improvement. The diagnostic methods developed are based on regression approaches to likelihood-free inference, in which the regression model estimates the posterior density using summary statistics as features. Deletion and imputation of part of the summary statistic vector within the regression model can remove conflicts and approximate posterior distributions for summary statistic subsets. A larger than expected change in the estimated posterior density following deletion and imputation can indicate a conflict in which inferences of interest are affected. The usefulness of the new methods is demonstrated in a number of real examples.
References
Anderson, C.W., Coles, S.G.: The largest inclusions in a piece of steel. Extremes 5(3), 237–252 (2002)
Bayarri, M.J., Castellanos, M.E.: Bayesian checking of the second levels of hierarchical models. Stat. Sci. 22, 322–343 (2007)
Beaumont, M.A., Zhang, W., Balding, D.J.: Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035 (2002)
Bissiri, P.G., Holmes, C.C., Walker, S.G.: A general framework for updating belief distributions. J. R. Stat. Soc.: Ser. B (Statistical Methodology) 78(5), 1103–1130 (2016)
Blum, M.G.B., François, O.: Non-linear regression models for approximate Bayesian computation. Stat. Comput. 20, 63–75 (2010)
Blum, M.G.B., Nunes, M.A., Prangle, D., Sisson, S.A.: A comparative review of dimension reduction methods in approximate Bayesian computation. Stat. Sci. 28(2), 189–208 (2013)
Bortot, P., Coles, S., Sisson, S.: Inference for stereological extremes. J. Am. Stat. Assoc. 102(477), 84–92 (2007)
Box, G.E.P.: Sampling and Bayes inference in scientific modelling and robustness (with discussion). J. R. Stat. Soc. Ser. A 143, 383–430 (1980)
Csilléry, K., François, O., Blum, M.G.B.: abc: an R package for approximate Bayesian computation (ABC). Methods Ecol. Evol. 3(3), 475–479 (2012)
Dinev, T., Gutmann, M.: Dynamic likelihood-free inference via ratio estimation (DIRE). arXiv:1810.09899 (2018)
Drovandi, C.C., Pettitt, A.N., Lee, A.: Bayesian indirect inference using a parametric auxiliary model. Stat. Sci. 30(1), 72–95 (2015)
Erhardt, R., Sisson, S.A.: Modelling extremes using approximate Bayesian computation. In: Dey, D., Yan, J. (eds.) Extreme Value Modelling and Risk Analysis, pp. 281–306. Chapman and Hall/CRC (2016)
Evans, M.: Measuring Statistical Evidence Using Relative Belief. Taylor & Francis (2015)
Evans, M., Moshonov, H.: Checking for prior-data conflict. Bayesian Anal. 1, 893–914 (2006)
Fan, J., Ma, C., Zhong, Y.: A selective overview of deep learning. Stat. Sci. 36(2), 264–290 (2021)
Fan, Y., Nott, D.J., Sisson, S.A.: Approximate Bayesian computation via regression density estimation. Stat 2(1), 34–48 (2013). https://doi.org/10.1002/sta4.15
Fasiolo, M., Pya, N., Wood, S.N.: A comparison of inferential methods for highly nonlinear state space models in ecology and epidemiology. Stat. Sci. 31, 96–118 (2016)
Frazier, D.T., Drovandi, C.: Robust approximate Bayesian inference with synthetic likelihood. J. Comput. Gr. Stat. (2021). (to appear)
Frazier, D.T., Drovandi, C., Loaiza-Maya, R.: Robust approximate Bayesian computation: an adjustment approach. arXiv preprint arXiv:2008.04099 (2020a)
Frazier, D.T., Robert, C.P., Rousseau, J.: Model misspecification in approximate Bayesian computation: consequences and diagnostics. J. R. Stat. Soc.: Ser. B (Statistical Methodology) 82(2), 421–444 (2020b)
Gelman, A., Meng, X.L., Stern, H.: Posterior predictive assessment of model fitness via realized discrepancies. Stat. Sin. 6, 733–807 (1996)
Gelman, A., Vehtari, A., Simpson, D., Margossian, C.C., Carpenter, B., Yao, Y., Kennedy, L., Gabry, J., Bürkner, P.C., Modrák, M.: Bayesian workflow. arXiv:2011.01808 (2020)
Gurney, W., Blythe, S., Nisbet, R.: Nicholson's blowflies revisited. Nature 287, 17–21 (1980)
Izbicki, R., Lee, A.B.: Converting high-dimensional regression to high-dimensional conditional density estimation. Electron. J. Stat. 11, 2800–2831 (2017)
Izbicki, R., Lee, A.B., Pospisil, T.: ABC–CDE: toward approximate Bayesian computation with complex high-dimensional data and limited simulations. J. Comput. Gr. Stat. (2019). (to appear)
Joyce, P., Marjoram, P.: Approximately sufficient statistics and Bayesian computation. Stat. Appl. Genet. Mol. Biol. 7, 26 (2008). https://doi.org/10.2202/1544-6115.1389
Klein, N., Nott, D.J., Smith, M.S.: Marginally calibrated deep distributional regression. J. Comput. Gr. Stat. 30(2), 467–483 (2021)
Li, J., Nott, D.J., Fan, Y., Sisson, S.A.: Extending approximate Bayesian computation methods to high dimensions via Gaussian copula. Comput. Stat. Data Anal. 106, 77–89 (2017)
Li, W., Fearnhead, P.: Convergence of regression-adjusted approximate Bayesian computation. Biometrika 105(2), 301–318 (2018)
Marin, J.M., Pudlo, P., Robert, C.P., Ryder, R.J.: Approximate Bayesian computational methods. Stat. Comput. 22(6), 1167–1180 (2012)
Mayer, M.: missRanger: Fast Imputation of Missing Values. R package version 2.1.0. https://CRAN.R-project.org/package=missRanger (2019)
Meinshausen, N.: Quantile regression forests. J. Mach. Learn. Res. 7, 983–999 (2006)
Moritz, S., Bartz-Beielstein, T.: imputeTS: Time Series Missing Value Imputation in R. R J. 9(1), 207–218 (2017). https://doi.org/10.32614/RJ-2017-009
Nicholson, A.: An outline of the dynamics of animal populations. Aust. J. Zool. 2(1), 9–65 (1954)
Nott, D.J., Wang, X., Evans, M., Englert, B.G.: Checking for prior-data conflict using prior-to-posterior divergences. Stat. Sci. 35(2), 234–253 (2020)
Papamakarios, G., Murray, I.: Fast \(\epsilon \)-free inference of simulation models with Bayesian conditional density estimation. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29, pp. 1028–1036. Curran Associates Inc (2016)
Papamakarios, G., Sterratt, D., Murray, I.: Sequential neural likelihood: fast likelihood-free inference with autoregressive flows. In: Chaudhuri, K., Sugiyama, M. (eds.), Proceedings of Machine Learning Research, vol. 89, pp. 837–848 (2019)
Polson, N.G., Sokolov, V.: Deep learning: a Bayesian perspective. Bayesian Anal. 12(4), 1275–1304 (2017)
Presanis, A.M., Ohlssen, D., Spiegelhalter, D.J., Angelis, D.D.: Conflict diagnostics in directed acyclic graphs, with applications in Bayesian evidence synthesis. Stat. Sci. 28, 376–397 (2013)
Price, L.F., Drovandi, C.C., Lee, A.C., Nott, D.J.: Bayesian synthetic likelihood. J. Comput. Gr. Stat. 27(1), 1–11 (2018)
Probst, P., Wright, M.N., Boulesteix, A.L.: Hyperparameters and tuning strategies for random forest. WIREs Data Min. Knowl. Discov. 9(3), e1301 (2019)
Ratmann, O., Andrieu, C., Wiuf, C., Richardson, S.: Model criticism based on likelihood-free inference, with an application to protein network evolution. Proc. Natl. Acad. Sci. 106(26), 10576–10581 (2009)
Ratmann, O., Pudlo, P., Richardson, S., Robert, C.: Monte Carlo algorithms for model assessment via conflicting summaries. arXiv preprint arXiv:1106.5919 (2011)
Raynal, L., Marin, J.M., Pudlo, P., Ribatet, M., Robert, C.P., Estoup, A.: ABC random forests for Bayesian parameter inference. Bioinformatics 35(10), 1720–1728 (2018)
Rényi, A.: On measures of entropy and information. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1: Contributions to the Theory of Statistics, University of California Press, Berkeley, Calif., pp 547–561 (1961)
Ricker, W.: Stock and recruitment. J. Fish. Res. Board Canada 11(5), 559–623 (1954)
Ridgway, J.: Probably approximate Bayesian computation: nonasymptotic convergence of ABC under misspecification. arXiv preprint arXiv:1707.05987 (2017)
Robnik-Šikonja, M., Kononenko, I.: Explaining classifications for individual instances. IEEE Trans. Knowl. Data Eng. 20(5), 589–600 (2008)
Rubin, D.: Multiple Imputation for Nonresponse in Surveys. Wiley, New York (1987)
Sisson, S., Fan, Y., Beaumont, M. (eds.): Handbook of Approximate Bayesian Computation. Chapman & Hall/CRC Handbooks of Modern Statistical Methods, CRC Press, Taylor & Francis Group, Boca Raton, Florida (2018a)
Sisson, S., Fan, Y., Beaumont, M.: Overview of Approximate Bayesian Computation. In: Sisson, S., Fan, Y., Beaumont, M. (eds.) Handbook of Approximate Bayesian Computation, Chapman & Hall/CRC Handbooks of Modern Statistical Methods, CRC Press. Taylor & Francis Group, Boca Raton, Florida (2018b)
Thomas, O., Pesonen, H., Sá-Leão, R., de Lencastre, H., Kaski, S., Corander, J.: Split-BOLFI for misspecification-robust likelihood-free inference in high dimensions. arXiv preprint arXiv:2002.09377 (2020)
Tong, H.: Threshold models in time series analysis—30 years on. Stat. Interface 4(2), 107–118 (2011)
van Buuren, S., Groothuis-Oudshoorn, K.: mice: multivariate imputation by chained equations in R. J. Stat. Softw. 45(3), 1–67 (2011)
Wilkinson, R.D.: Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. Stat. Appl. Genet. Mol. Biol. 12(2), 129–141 (2013)
Wood, S.N.: Statistical inference for noisy nonlinear ecological dynamic systems. Nature 466, 1102–1107 (2010)
Zhang, H., Nieto, F.H.: TAR: Bayesian Modeling of Autoregressive Threshold Time Series Models. R package version 1.0. https://CRAN.R-project.org/package=TAR (2017)
Zintgraf, L.M., Cohen, T.S., Adel, T., Welling, M.: Visualizing deep neural network decisions: Prediction difference analysis. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track Proceedings, OpenReview.net, (2017). https://openreview.net/forum?id=BJ5UeU9xx
Acknowledgements
We thank the Editor, Associate Editor, and two referees for their comments which greatly improved the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Michael Evans was supported by a grant from the Natural Sciences and Engineering Research Council of Canada.
Appendix
1.1 Details of the SETAR auxiliary model for Sect. 5.3
For a time series \(X_t\), \(t=0,1,\dots,T\), the SETAR model used to obtain summary statistics takes the form

$$X_t = \begin{cases} a_0 + a_1 X_{t-1} + a_2 X_{t-2} + \epsilon_t, & X_{t-1} < c, \\ b_0 + b_1 X_{t-1} + b_2 X_{t-2} + \epsilon_t, & X_{t-1} \ge c, \end{cases}$$

where \(\epsilon_t\sim N(0,\rho^2)\) if \(X_{t-1}<c\) and \(\epsilon_t\sim N(0,\zeta^2)\) if \(X_{t-1}\ge c\), with the noise sequence \(\epsilon_t\) assumed independent across times. The parameter c is a threshold parameter, and the dynamics of the process switch between two autoregressive components of order 2 depending on whether the threshold is exceeded. To obtain our summary statistics, we fit this model to the observed data and fix c based on the observed data fit. With this fixed c, the SETAR model is then fitted to any simulated data series d to obtain maximum likelihood estimates of the SETAR model parameters, which are the summary statistics denoted by \(S=S(d)\). We write \(S=(S_L^\top ,S_U^\top )^\top \), where \(S_L=(\widehat{a}_0,\widehat{a}_1,\widehat{a}_2,\widehat{\rho })^\top \) and \(S_U=(\widehat{b}_0,\widehat{b}_1,\widehat{b}_2,\widehat{\zeta })^\top \) are the maximum likelihood estimates of the autoregressive component parameters for the low and high regimes, respectively. The SETAR models are fitted using the TAR package (Zhang and Nieto 2017) in R. When simulating data from the model, there were some cases in which no values of the series exceeded the threshold c; since the number of such cases was small, we simply discarded these simulations.
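To make the construction concrete, the following is a minimal R sketch (not the authors' code, and not the TAR package interface) of how such summary statistics could be computed with a fixed threshold c. It uses the fact that within each regime the Gaussian conditional MLE of the AR(2) coefficients reduces to least squares; the function name setar_summaries is our own.

# Sketch: SETAR summary statistics S = (S_L, S_U) for a series x,
# with the threshold c fixed in advance. Within each regime, Gaussian
# conditional MLE of the AR(2) coefficients reduces to least squares;
# the noise scale is estimated by the RMS of the regime residuals.
setar_summaries <- function(x, c) {
  n <- length(x)
  y    <- x[3:n]         # response X_t
  lag1 <- x[2:(n - 1)]   # X_{t-1}, which also determines the regime
  lag2 <- x[1:(n - 2)]   # X_{t-2}
  low  <- lag1 < c
  if (all(low) || all(!low)) return(NULL)  # discard: one regime is empty
  fit_regime <- function(idx) {
    fit <- lm(y[idx] ~ lag1[idx] + lag2[idx])
    unname(c(coef(fit), sqrt(mean(residuals(fit)^2))))
  }
  list(S_L = fit_regime(low),    # (a0.hat, a1.hat, a2.hat, rho.hat)
       S_U = fit_regime(!low))   # (b0.hat, b1.hat, b2.hat, zeta.hat)
}

The returned NULL mirrors the handling above: simulated series with an empty regime are simply discarded.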
1.2 Details of diagnostic for the example of Sect. 5.4
We describe how we implement our diagnostic for the example of Sect. 5.4. Roughly speaking, all windows B of width k containing a time t are considered, and \(R_\infty (S_B|S_A)\) is then averaged over B, where \(S_B\) consists of the series values in B and \(S_A\) consists of the remaining values.
To make the method precise we need some further notation. Let \(d=\{d_i:1\le i\le T\}\) denote a time series of length T. For a subset C of the times, \(C\subseteq \{1,\dots, T\}\), we write \(d(C)=\{d_i:i\in C\}\) and \(d(-C)=\{d_i:i\notin C\}\). Let \(t\in \{1,\dots, T\}\) be a fixed time, and let \(W_t^k=\{C_1^{t,k},\dots, C_{n_{t,k}}^{t,k}\}\) denote the set of all windows of width k containing t, each of the form \(\{l,\dots, l+k-1\}\) for some l. For each \(j=1,\dots, n_{t,k}\) suppose that \(d^{t,k}_j\) is some time series of length T, and write \(d_.^{t,k}=(d_1^{t,k},\dots, d_{n_{t,k}}^{t,k})\). Write \(d_{\mathrm{obs}}^{t,k}\) for the value of \(d_.^{t,k}\) in which \(d_j^{t,k}=d_{\mathrm{obs}}\) for all \(j=1,\dots, n_{t,k}\), where \(d_{\mathrm{obs}}\) is the observed series. Let \(d_.^{t,k,*}\) denote the value of \(d_.^{t,k}\) in which \(d_j^{t,k,*}(-C_j^{t,k})=d_{\mathrm{obs}}(-C_j^{t,k})\) and \(d_j^{t,k,*}(C_j^{t,k})\sim p(d(C_j^{t,k})|d_{\mathrm{obs}}(-C_j^{t,k}))\); that is, the observation for \(d_j^{t,k,*}\) in \(C_j^{t,k}\) is generated from the conditional prior predictive given the observed value for the remainder of the series. The draws \(d_j^{t,k,*}(C_j^{t,k})\) are independent for different j. Let

$$R^{t,k}(d_.^{t,k})=\frac{1}{n_{t,k}}\sum_{j=1}^{n_{t,k}} R_\infty \left(d_j^{t,k}(C_j^{t,k})\,\middle|\,d_j^{t,k}(-C_j^{t,k})\right),$$

the average of the conflict statistics over the windows containing t.
Fig. 10 Imputation and conditional sampling of features for Example 5.4. The blue-shaded area is the window of length k which we impute in the series. The red-shaded area is the larger window to which a multivariate normal model is fitted based on the training set observations. Normal noise is added to a conditional mean imputation using the covariance matrix for the observations in the blue window, conditional on the remaining observations in the red patch. (Color figure online)

Fig. 11 Left-hand side: estimated posterior density of \(\eta \) using ABC random forests with different sets of summary statistics in Example 5.1, with imputation using missRanger. Right-hand side: observed maximum log relative belief statistic (vertical lines) and histogram of reference distribution values for imputations when (a) \(s^2\) is imputed from \(\bar{y}\) and (b) \(\bar{y}\) is imputed from \(s^2\) in Example 5.1, using missRanger imputation
We base our diagnostic on \(R^{t,k}(d_{\mathrm{obs}}^{t,k})\), calibrated by the tail probability

$$p^{t,k}=P\left(R^{t,k}(d_.^{t,k,*})\ge R^{t,k}(d_{\mathrm{obs}}^{t,k})\right), \qquad (13)$$

and estimate (13) by

$$\widehat{p}^{t,k}=\frac{1}{M^*}\sum_{i=1}^{M^*} I\left(R^{t,k}(d_.^{t,k,i})\ge R^{t,k}(d_{\mathrm{obs}}^{t,k})\right),$$

where \(d_.^{t,k,i}\), \(i=1,\dots, M^*\), are approximations of draws of \(d_.^{t,k,*}\) based on imputation; that is, we have imputed \(d_j^{t,k,*}(C_j^{t,k})\) from \(d_{\mathrm{obs}}(-C_j^{t,k})\) independently for each j and i.
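As an illustration of the structure of this calculation, here is a hedged R sketch under the notation above. The functions R_inf and impute_window are hypothetical placeholders: R_inf(d, C) stands for the maximum log relative belief statistic \(R_\infty \) computed from the fitted regression model with the values of d in window C as \(S_B\) and the rest as \(S_A\), and impute_window stands for the window imputation described in Sect. 1.3 below.

# Sketch of the window-averaged diagnostic R^{t,k} and its Monte Carlo
# calibration. R_inf(d, C) and impute_window(d, C) are placeholders
# supplied by the user; impute_window(d, C) is assumed to return a copy
# of d with the values in window C re-imputed.
windows_containing <- function(t, k, n) {
  # all windows {l, ..., l + k - 1} of width k that contain time t
  starts <- max(1, t - k + 1):min(t, n - k + 1)
  lapply(starts, function(l) l:(l + k - 1))
}

R_tk <- function(series_list, windows, R_inf) {
  # average the conflict statistic over the windows containing t
  mean(mapply(R_inf, series_list, windows))
}

calibrate_tk <- function(d_obs, t, k, M_star, R_inf, impute_window) {
  W <- windows_containing(t, k, length(d_obs))
  R_obs <- R_tk(rep(list(d_obs), length(W)), W, R_inf)
  R_ref <- replicate(M_star, {
    # impute each window independently, then recompute the average
    R_tk(lapply(W, function(C) impute_window(d_obs, C)), W, R_inf)
  })
  mean(R_ref >= R_obs)  # Monte Carlo estimate of the tail probability (13)
}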
1.3 Details of the imputation method for the example of Sect. 5.4
Figure 10 illustrates the idea behind the window-based imputation used in Sect. 5.4. We impute values of the series for a window of size k which has been deleted (blue region in the figure), using a larger window around it (red patch in the figure). A conditional mean imputation is obtained using the na_interpolation function in the imputeTS R package (Moritz and Bartz-Beielstein 2017) with spline interpolation and default settings for the tuning parameters. For multiple imputation, we add noise to this conditional mean: a stationary Gaussian autoregressive model of order one is fitted to the observed series, and we add zero-mean Gaussian noise whose covariance matrix is the conditional covariance matrix of the autoregressive process in the blue region given the remaining observations in the red patch. Although the series values are counts, these counts are generally large and we treat them as continuous quantities in the imputation procedure.
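A hedged sketch of this procedure is given below. The function name impute_window_ar1 and the index vectors blue (the deleted window) and red (the larger patch containing it) are our own labels; the only package call taken from the text is imputeTS::na_interpolation with spline interpolation.

# Sketch of the window imputation: spline-interpolated conditional mean
# plus Gaussian noise with the AR(1) conditional covariance of the blue
# window given the remaining observations in the red patch (Fig. 10).
library(imputeTS)
library(MASS)

impute_window_ar1 <- function(x, blue, red) {
  # conditional mean: delete the window, spline-interpolate across it
  x_del <- x
  x_del[blue] <- NA
  mu <- na_interpolation(x_del, option = "spline")[blue]

  # stationary Gaussian AR(1) fitted to the observed series
  fit <- ar(x, order.max = 1, aic = FALSE)
  phi <- fit$ar
  v0  <- fit$var.pred / (1 - phi^2)   # stationary variance

  # AR(1) covariance over the red patch; condition blue on red \ blue
  Sigma <- v0 * phi^abs(outer(red, red, "-"))
  b <- match(blue, red)
  a <- setdiff(seq_along(red), b)
  Sig_cond <- Sigma[b, b] - Sigma[b, a] %*% solve(Sigma[a, a], Sigma[a, b])

  x[blue] <- mu + MASS::mvrnorm(1, mu = rep(0, length(blue)), Sigma = Sig_cond)
  x
}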
1.4 Results for examples using different imputation methods
Fig. 12 Top row: estimated marginal posterior densities using quantile regression forests with different summary statistics and imputations of summary statistic subsets in Example 5.2. Panels (a), (b) and (c) are for the parameters \(\log \lambda \), \(\log \sigma \) and \(\xi \), respectively. Middle and bottom rows: observed maximum log relative belief statistic (vertical lines) and histograms of reference distribution values for imputations when \(S_U\) is imputed from \((N,S_L)\) (middle row) and \(S_L\) is imputed from \((N,S_U)\) (bottom row). All imputation is done using the local linear MICE approach

Fig. 13 Top row: estimated marginal posterior densities using quantile regression forests with different summary statistics and imputations of summary statistic subsets in Example 5.3. Panels (a), (b) and (c) are for the parameters \(\log r\), \(\log \sigma \) and \(\log \phi \), respectively. Middle and bottom rows: observed maximum log relative belief statistic (vertical lines) and histograms of reference distribution values for imputations when \(S_U\) is imputed from \(S_L\) (middle row) and \(S_L\) is imputed from \(S_U\) (bottom row). All imputation is done using the local linear MICE approach
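For reference, a minimal sketch of the missRanger-based imputation used for the comparison in Example 5.1 is shown below. The data frame sims and the column names ybar and s2 are hypothetical stand-ins for the training set of simulated summary statistics; only the missRanger call itself (Mayer 2019) is a documented interface.

# Hedged sketch: impute s2 from ybar with missRanger. 'sims' is assumed
# to be a data frame of simulated (ybar, s2) pairs with the entries to be
# imputed set to NA; the formula's left side names the variables to
# impute and the right side the predictors.
library(missRanger)
sims_imputed <- missRanger(sims, formula = s2 ~ ybar, num.trees = 100)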
Cite this article
Mao, Y., Wang, X., Nott, D.J. et al. Detecting conflicting summary statistics in likelihood-free inference. Stat Comput 31, 78 (2021). https://doi.org/10.1007/s11222-021-10053-3