
LRP-GUS: A Visual Based Data Reduction Algorithm for Neural Networks

Published: 26 September 2023

Abstract

Deriving general rules to estimate a neural network's sample complexity is a difficult problem. In practice, datasets are therefore kept large to ensure that every class is sufficiently represented, at the cost of high power consumption and long training times. This paper introduces LRP-GUS, a novel data reduction method for Deep Learning classifiers that focuses on visual features. The idea behind LRP-GUS is to shrink the training dataset by exploiting visual features and their relevance. The proposed technique is tested on the MNIST and Fashion-MNIST datasets and evaluated using compression rates, accuracy, and per-class F1 scores. It achieves compression rates of 96.10% on MNIST and 75.94% on Fashion-MNIST, at the cost of a 3% drop in test accuracy on both datasets.
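The abstract describes the mechanism only at a high level, so a minimal sketch of the general recipe may help: score each training image by how much relevance a trained classifier assigns to its pixels, then keep only the most informative samples of each class. The sketch below uses gradient times input as a simple stand-in for full layer-wise relevance propagation, and the per-class top-k rule with its keep_frac parameter is a hypothetical selection criterion for illustration, not the paper's exact method.

```python
# Illustrative sketch only: relevance-guided undersampling of a
# labeled image dataset. Gradient x input stands in for full LRP,
# and keep_frac / top-k selection are assumptions, not the paper's.
import torch

def relevance_scores(model, images, labels):
    """Per-sample relevance: sum over pixels of |gradient x input|,
    back-propagating only the logit of each sample's true class."""
    x = images.clone().requires_grad_(True)
    logits = model(x)                                  # (N, num_classes)
    logits.gather(1, labels.unsqueeze(1)).sum().backward()
    return (x.grad * x).abs().flatten(1).sum(dim=1)    # (N,)

def select_subset(model, images, labels, keep_frac=0.25):
    """Keep the keep_frac most relevant samples of every class;
    returns the indices of the reduced training set."""
    model.eval()
    scores = relevance_scores(model, images, labels)
    kept = []
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        k = max(1, int(keep_frac * idx.numel()))
        kept.append(idx[scores[idx].topk(k).indices])
    return torch.cat(kept)
```

Selecting per class rather than globally keeps every class represented in the reduced set, which matters for the per-class F1 evaluation mentioned above; with keep_frac near 0.04, such a filter would roughly match the 96.10% compression rate reported for MNIST, though the paper's actual LRP-based criterion may differ.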



Published In

Artificial Neural Networks and Machine Learning – ICANN 2023: 32nd International Conference on Artificial Neural Networks, Heraklion, Crete, Greece, September 26–29, 2023, Proceedings, Part V
September 2023, 618 pages
ISBN: 978-3-031-44191-2
DOI: 10.1007/978-3-031-44192-9
Editors: Lazaros Iliadis, Antonios Papaleonidas, Plamen Angelov, Chrisina Jayne

Publisher

Springer-Verlag, Berlin, Heidelberg


Author Tags

  1. Data reduction
  2. Machine Learning
  3. Visual features
  4. XAI

Qualifiers

  • Article
