Abstract
Many biologists, especially in ecology and evolution, analyze their data by estimating fits to a set of candidate models and selecting the best model according to the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC). When the candidate models represent alternative hypotheses, biologists may want to limit the chance of a false positive to a specified level. Existing model selection methodology, however, allows only indirect control over error rates, by setting a threshold for the difference in AIC scores. We present a novel theoretical framework for parametric Neyman-Pearson (NP) model selection using information criteria that does not require a pre-data null and applies to three or more non-nested models simultaneously. We apply the theoretical framework to the Error Control for Information Criteria (ECIC) procedure introduced by Cullan et al. (J Appl Stat 47: 2565–2581, 2020), and we show that it shares many of the desirable properties of AIC-type methods, including false positive and false negative rates that converge to zero asymptotically. We discuss implications for the compatibility of evidentialist and severity-based approaches to evidence in philosophy of science.
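To make the error-control idea concrete, here is a minimal sketch of how a parametric bootstrap can calibrate a ΔAIC threshold so that the probability of falsely preferring a more complex model is held near a target level α. This is an illustrative two-model toy in the spirit of ECIC, not the procedure of Cullan et al. (2020) itself, which applies to three or more non-nested models and requires no pre-data null; the normal-model example and all function names below are our own assumptions.

```python
# Hedged sketch: bootstrap-calibrated Delta-AIC selection for two nested
# normal models, M0: N(0, s^2) vs. M1: N(mu, s^2). Illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def aic(loglik, k):
    """Akaike Information Criterion: 2k - 2 * maximized log-likelihood."""
    return 2 * k - 2 * loglik

def delta_aic(x):
    """AIC(M0) - AIC(M1); positive values favor the richer model M1."""
    # M0: mean fixed at 0, only the scale is estimated (1 free parameter).
    s0 = np.sqrt(np.mean(x ** 2))                        # MLE of scale under M0
    ll0 = stats.norm.logpdf(x, 0.0, s0).sum()
    # M1: mean and scale both estimated (2 free parameters).
    ll1 = stats.norm.logpdf(x, x.mean(), x.std()).sum()  # MLEs under M1
    return aic(ll0, 1) - aic(ll1, 2)

def bootstrap_critical_value(x, alpha=0.05, B=2000):
    """Simulate datasets from the fitted simpler model M0 and return the
    (1 - alpha) quantile of Delta AIC*, so that selecting M1 only when the
    observed Delta AIC exceeds it keeps false positives near alpha."""
    s0 = np.sqrt(np.mean(x ** 2))
    boot = [delta_aic(rng.normal(0.0, s0, x.size)) for _ in range(B)]
    return np.quantile(boot, 1 - alpha)

x = rng.normal(0.3, 1.0, 50)     # toy "observed" data
d_obs, crit = delta_aic(x), bootstrap_critical_value(x)
print(f"Delta AIC = {d_obs:.2f}, bootstrap critical value = {crit:.2f}")
print("Select M1" if d_obs > crit else "Retain M0")
```

The point of the sketch is that the critical value is estimated from a fitted reference model rather than fixed in advance, which is what separates calibrated error control from a conventional fixed ΔAIC cutoff.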
References
Aho, K., Derryberry, D. W., & Peterson, T. (2014). Model selection for ecologists: The worldviews of AIC and BIC. Ecology, 95(3), 631–636. https://doi.org/10.1890/13-1452.1
Anderson, D. R. (2008). Model based inference in the life sciences: A primer on evidence. London: Springer.
Bandyopadhyay, P. S., & Boik, R. J. (1999). The curve fitting problem: A Bayesian rejoinder. Philosophy of Science, 66(S3), S390–S402.
Bandyopadhyay, P. S., & Brittan, G. G. (2006). Acceptibility, evidence, and severity. Synthese, 148(2), 259–293. https://doi.org/10.1007/s11229-004-6222-6
Bandyopadhyay, P. S., Brittan, G. G., & Taper, M. L. (2016). Error-statistics, evidence, and severity. In P. S. Bandyopadhyay, G. G. Brittan, & M. L. Taper (Eds.), Belief, evidence, and uncertainty: Problems of epistemic inference (pp. 73–91). Cham: Springer International Publishing.
Berger, J. O. (1985). Statistical decision theory and Bayesian analysis. New York: Springer.
Brewer, M. J., Butler, A., & Cooksley, S. L. (2016). The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity. Methods in Ecology and Evolution, 7(6), 679–692.
Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach. New York: Springer.
Casella, G., & Berger, R. L. (2002). Statistical inference. Pacific Grove, CA: Duxbury.
Chambaz, A. (2006). Testing the order of a model. The Annals of Statistics, 34(3), 1166–1203.
Cullan, M., Lidgard, S., & Sterner, B. (2020). Controlling the error probabilities of model selection information criteria using bootstrapping. Journal of Applied Statistics, 47(13–15), 2565–2581.
Dennis, B., et al. (2019). Errors in statistical inference under model misspecification: Evidence, hypothesis testing, and AIC. Frontiers in Ecology and Evolution, 7, 372.
Ding, J., Tarokh, V., & Yang, Y. (2018). Model selection techniques: An overview. IEEE Signal Processing Magazine, 35(6), 16–34. https://doi.org/10.1109/MSP.2018.2867638
Dziak, J. J., et al. (2020). Sensitivity and specificity of information criteria. Briefings in Bioinformatics, 21(2), 553–565. https://doi.org/10.1093/bib/bbz016
Eguchi, S., & Copas, J. (2006). Interpreting Kullback–Leibler divergence with the Neyman–Pearson lemma. Journal of Multivariate Analysis, 97(9), 2034–2040. https://doi.org/10.1016/j.jmva.2006.03.007
Forster, M., & Sober, E. (1994). How to tell when simpler, more unified, or less ad hoc theories will provide more accurate predictions. The British Journal for the Philosophy of Science, 45(1), 1–35.
Glatting, G., et al. (2007). Choosing the optimal fit function: Comparison of the Akaike information criterion and the F-test. Medical Physics, 34(11), 4285–4292.
Hegyi, G., & Laczi, M. (2015). Using full models, stepwise regression and model selection in ecological data sets: Monte Carlo simulations. Annales Zoologici Fennici, 52(5), 257–279. https://doi.org/10.5735/086.052.0502
Hunt, G. (2006). Fitting and comparing models of phyletic evolution: Random walks and beyond. Paleobiology, 32(4), 578–601.
Kuha, J. (2004). AIC and BIC: Comparisons of assumptions and performance. Sociological Methods and Research, 33(2), 188–229. https://doi.org/10.1177/0049124103262065
Lele, S. R. (2004). Evidence functions and the optimality of the law of likelihood. In M. L. Taper & S. R. Lele (Eds.), The nature of scientific evidence: Statistical, philosophical, and empirical considerations (pp. 191–216). Chicago: The University of Chicago Press.
Leong, A. S., Dey, S., & Evans, J. S. (2007). Error exponents for Neyman-Pearson detection of Markov chains in noise. IEEE Transactions on Signal Processing, 55(10), 5097–5103. https://doi.org/10.1109/TSP.2007.897863
Markatou, M., Karlis, D., & Ding, Y. (2021). Distance-based statistical inference. Annual Review of Statistics and Its Application, 8(1), 301–327. https://doi.org/10.1146/annurev-statistics-031219-041228
Markon, K. E., & Krueger, R. F. (2004). An empirical comparison of information-theoretic selection criteria for multivariate behavior genetic models. Behavior Genetics, 34, 593–610. https://doi.org/10.1007/s10519-004-5587-0
Matthewson, J., & Weisberg, M. (2008). The structure of tradeoffs in model building. Synthese, 170, 169–190.
Matthewson, J., & Weisberg, M. (2009). Learning from error, severe testing, and the growth of theoretical knowledge. In D. G. Mayo & A. Spanos (Eds.), Error and inference: Recent exchanges on experimental reasoning, reliability, and the objectivity and rationality of science (pp. 28–57). Cambridge: Cambridge University Press.
Mayo, D. G. (1996). Error and the growth of experimental knowledge. Chicago: University of Chicago Press.
Mayo, D. G. (2000). Experimental practice and an error statistical account of evidence. Philosophy of Science, 67(S3), S193–S207.
Mayo, D. G., & Spanos, A. (2006). Severe testing as a basic concept in a Neyman-Pearson philosophy of induction. The British Journal for the Philosophy of Science, 57(2), 323–357.
Nishii, R. (1988). Maximum likelihood principle and model selection when the true model is unspecified. Journal of Multivariate Analysis, 27(2), 392–403.
Pesaran, M. H. (1990). Non-nested hypotheses. In Econometrics (pp. 167–173). London: Palgrave Macmillan.
Ponciano, J. M., & Taper, M. L. (2019). Model projections in model space: A geometric interpretation of the AIC allows estimating the distance between truth and approximating models. Frontiers in Ecology and Evolution, 7, 413.
Ripplinger, J., & Sullivan, J. (2008). Does choice in model selection affect maximum likelihood analysis? Systematic Biology, 57(1), 76–85. https://doi.org/10.1080/10635150801898920
Royall, R. (1997). Statistical evidence: A likelihood paradigm. London, UK: Chapman & Hall.
Royall, R. (2000). On the probability of observing misleading statistical evidence. Journal of the American Statistical Association, 95(451), 760–768.
Sayyareh, A., Obeidi, R., & Bar-Hen, A. (2010). Empirical comparison between some model selection criteria. Communications in Statistics-Simulation and Computation, 40(1), 72–86. https://doi.org/10.1080/03610918.2010.530367
Shao, J., & Rao, J. S. (2000). The GIC for model selection: A hypothesis testing approach. Journal of Statistical Planning and Inference, 88(2), 215–231. https://doi.org/10.1016/S0378-3758(00)00080-X
Spanos, A. (2010). Akaike-type criteria and the reliability of inference: Model selection versus statistical model specification. Journal of Econometrics, 158(2), 204–220. https://doi.org/10.1016/j.jeconom.2010.01.011
Spanos, A., & Mayo, D. G. (2015). Error statistical modeling and inference: Where methodology meets ontology. Synthese, 192, 3533–3555. https://doi.org/10.1007/s11229-015-0744-y
Sterner, B., & Lidgard, S. (2024). Objectivity and underdetermination in statistical model selection. The British Journal for the Philosophy of Science, 75(3), 717–739. https://doi.org/10.1086/716243.
Sullivan, J., & Joyce, P. (2005). Model selection in phylogenetics. Annual Review of Ecology, Evolution, and Systematics, 36(1), 445–466. https://doi.org/10.1146/annurev.ecolsys.36.102003.152633
Taper, M. L., Lele, S. R., Ponciano, J. M., Dennis, B., & Jerde, C. L. (2021). Assessing the global and local uncertainty of scientific evidence in the presence of model misspecification. Frontiers in Ecology and Evolution, 9, 679155. https://doi.org/10.3389/fevo.2021.679155
Taper, M. L., & Ponciano, J. M. (2016). Evidential statistics as a statistical modern synthesis to support 21st century science. Population Ecology, 58, 9–29. https://doi.org/10.1007/s10144-015-0533-y
Tong, X., Feng, Y., & Li, J. J. (2018). Neyman-Pearson classification algorithms and NP receiver operating characteristics. Science Advances, 4(2), eaao1659. https://doi.org/10.1126/sciadv.aao1659
Tong, X., Feng, Y., & Zhao, A. (2016). A survey on Neyman-Pearson classification and suggestions for future research. Wiley Interdisciplinary Reviews: Computational Statistics, 8(2), 64–81. https://doi.org/10.1002/wics.1376
Vrieze, S. I. (2012). Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychological Methods, 17(2), 228–243.
Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica: Journal of the Econometric Society, 57(2), 307–333. https://doi.org/10.2307/1912557
Wagenmakers, E. J., et al. (2004). Assessing model mimicry using the parametric bootstrap. Journal of Mathematical Psychology, 48(1), 28–50.
Funding
Funding was provided by John Templeton Foundation (Grant No. 62220).
Ethics declarations
Conflict of interest
The authors have no conflict of interest to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cheng, H., Sterner, B. Error Statistics Using the Akaike and Bayesian Information Criteria. Erkenn (2024). https://doi.org/10.1007/s10670-024-00897-2