
Variable Selection in Binary Logistic Regression for Modelling Bankruptcy Risk

  • Conference paper
Statistical Modelling and Risk Analysis (ICRA 2022)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 430)


Abstract

Forecasting credit risk and predicting corporate insolvency is one of the most active areas of study in current economics and finance. A major challenge in constructing failure prediction models is variable selection, for which standard methods exist alongside newer approaches. In addition, the large volume of available data often imposes processing-time limitations, and new high-performance procedures provide tools that can take advantage of parallel processing. In this paper, different variable selection techniques are explored in the context of binary logistic regression applied to a balanced data set containing only firms that are either active or in bankruptcy. Models derived from stepwise selection, the Least Absolute Shrinkage and Selection Operator (LASSO), and an unsupervised method based on the maximum data variance explained are compared. A non-parametric approach is then considered, and the variables selected by a single decision tree and by a forest of trees are compared and discussed.
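To make the LASSO step concrete, the following is a minimal sketch of L1-penalised logistic regression used as a variable selector on a balanced two-class sample. It is not the paper's implementation: the data are synthetic, the feature names (`leverage`, `liquidity`, `noise_a`, `noise_b`) are hypothetical, and the fitting routine is a plain proximal-gradient loop written for illustration, with the penalty `lam`, step size and iteration count chosen arbitrarily.

```python
import math
import random

# Hypothetical feature names; the first two carry signal, the last two are noise.
FEATURES = ["leverage", "liquidity", "noise_a", "noise_b"]


def make_balanced_data(n_per_class=200, seed=0):
    """Synthetic balanced sample: label 1 = bankrupt, label 0 = active (assumed coding)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0),    # shifted up for bankrupt firms
                      rng.gauss(-shift, 1.0),   # shifted down for bankrupt firms
                      rng.gauss(0.0, 1.0),      # unrelated to the outcome
                      rng.gauss(0.0, 1.0)])     # unrelated to the outcome
            y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrinks v towards zero, exactly zero if |v| <= t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, n_iter=300):
    """Minimise mean logistic loss + lam * sum(|beta_j|); the intercept is unpenalised."""
    n, p = len(X), len(X[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals p_i - y_i under the current coefficients.
        resid = [sigmoid(beta0 + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        grad0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        beta0 -= step * grad0
        # Gradient step followed by soft-thresholding (proximal step).
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta0, beta


def select_variables(beta, names=FEATURES):
    """Variables whose penalised coefficient is not exactly zero."""
    return [name for name, b in zip(names, beta) if b != 0.0]


if __name__ == "__main__":
    X, y = make_balanced_data()
    intercept, beta = fit_l1_logistic(X, y, lam=0.1)
    print("selected:", select_variables(beta))
```

Raising `lam` removes more variables and lowering it keeps more; in practice the penalty would be tuned, for example by cross-validation, rather than fixed as it is here.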




Corresponding author

Correspondence to Francesca Pierri.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Pierri, F. (2023). Variable Selection in Binary Logistic Regression for Modelling Bankruptcy Risk. In: Kitsos, C.P., Oliveira, T.A., Pierri, F., Restaino, M. (eds) Statistical Modelling and Risk Analysis. ICRA 2022. Springer Proceedings in Mathematics & Statistics, vol 430. Springer, Cham. https://doi.org/10.1007/978-3-031-39864-3_12
