research-article

Imbalanced enterprise credit evaluation with DTE-SBD

Published: 01 January 2018

Abstract

Enterprise credit evaluation models are an important tool for bank and enterprise risk management, but how to construct an effective decision tree (DT) ensemble model for imbalanced enterprise credit evaluation has seldom been studied. This paper proposes a new DT ensemble model for imbalanced enterprise credit evaluation based on the synthetic minority over-sampling technique (SMOTE) and the Bagging ensemble learning algorithm with differentiated sampling rates (DSR), named DTE-SBD (Decision Tree Ensemble based on SMOTE, Bagging and DSR). In each iteration of base DT classifier training, SMOTE with DSR generates a different number of new positive (high-risk) samples, and Bagging with DSR draws a different number of negative (low-risk) samples with replacement. Within a single iteration at a given sampling rate, however, the positive training samples (original plus synthetic) equal in number the drawn negative training samples, and the two sets are combined to train one DT base classifier. DTE-SBD can therefore not only handle the class imbalance problem of enterprise credit evaluation but also increase the diversity of the base classifiers in the DT ensemble. An empirical experiment is repeated 100 times on the financial data of 552 Chinese listed companies, and the performance of imbalanced enterprise credit evaluation is compared among six models: pure DT, over-sampling DT, over-under-sampling DT, SMOTE DT, Bagging DT, and DTE-SBD. The experimental results indicate that DTE-SBD significantly outperforms the other five models and is effective for imbalanced enterprise credit evaluation.
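The sampling step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `smote` and `balanced_training_sets`, the neighbour count `k`, and the reading of a "sampling rate" as the ratio of synthetic to original positive samples are all assumptions made here for illustration. The sketch only builds one balanced training set per iteration; fitting a decision tree on each set and combining the trees is left out.

```python
import random


def smote(positives, n_new, k=3, rng=random):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen positive sample and one of its k nearest positive neighbours."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(positives))
        base = positives[i]
        # Rank the other positives by squared Euclidean distance to base.
        others = [p for j, p in enumerate(positives) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


def balanced_training_sets(positives, negatives, rates, seed=0):
    """Build one (positive, negative) training set per sampling rate.

    Assumption: a rate r adds round(r * len(positives)) synthetic positives,
    and as many negatives are then drawn with replacement as there are
    positives in total, so each set is class-balanced.
    """
    rng = random.Random(seed)
    sets = []
    for rate in rates:
        n_new = round(rate * len(positives))
        pos = list(positives) + smote(positives, n_new, rng=rng)
        neg = [rng.choice(negatives) for _ in range(len(pos))]  # bootstrap draw
        sets.append((pos, neg))
    return sets


# Toy data: 4 high-risk (positive) and 10 low-risk (negative) samples.
positives = [(0.9, 0.1), (0.8, 0.3), (1.0, 0.2), (0.7, 0.2)]
negatives = [(0.1 * i, 0.5 + 0.05 * i) for i in range(10)]

for pos, neg in balanced_training_sets(positives, negatives, rates=[0.5, 1.0, 2.0]):
    print(len(pos), len(neg))
```

Because the rate differs between iterations, each base classifier would see a training set of a different size and composition, which is the source of the ensemble diversity the abstract refers to.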



Information & Contributors

Information

Published In

Information Sciences: an International Journal, Volume 425, Issue C
January 2018
164 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Bagging
  2. Decision tree ensemble
  3. Differentiated sampling rates
  4. Enterprise credit evaluation
  5. Imbalanced classification
  6. SMOTE

Qualifiers

  • Research-article
