An Empirical Analysis of Data Reduction Techniques for k-NN Classification

  • Conference paper
Artificial Intelligence Applications and Innovations (AIAI 2024)

Abstract

This study explores Data Reduction Techniques (DRTs) for lazy classification algorithms such as k-NN, focusing on Prototype Selection (PS) and Prototype Generation (PG) methods. The research examines these methodologies in depth, dividing DRTs into the two primary categories of PS and PG and further into three sub-categories: condensation methods, edition methods, and hybrid methods. An experimental study compares 20 new and state-of-the-art DRTs across 20 datasets. The objective is to draw performance conclusions within both the primary categories and the sub-categories, offering valuable insights into how these techniques enhance the effectiveness and robustness of the k-NN classifier. The paper provides a comprehensive overview of DRTs, clarifying their strategies and relative performance.
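
To make the PS taxonomy concrete, the sketch below is a minimal, illustrative Python implementation of Hart's Condensed Nearest Neighbor rule, a classic condensation-type Prototype Selection method: it retains only the training points that the current prototype store misclassifies under 1-NN, which tend to lie near class boundaries. This is not the paper's code; the Euclidean metric, the synthetic two-blob dataset, and all function names are illustrative assumptions.

    # A minimal sketch (not the paper's code) of Hart's Condensed Nearest Neighbor (CNN)
    # rule, a condensation-type Prototype Selection method. The Euclidean 1-NN metric,
    # the synthetic data, and the function names below are illustrative assumptions.
    import numpy as np

    def one_nn_predict(proto_X, proto_y, x):
        """Return the label of the nearest prototype under Euclidean distance."""
        dists = np.linalg.norm(proto_X - x, axis=1)
        return proto_y[np.argmin(dists)]

    def cnn_condense(X, y, seed=0):
        """Return indices of a condensed set that 1-NN-classifies all of X correctly."""
        order = np.random.default_rng(seed).permutation(len(X))
        store = [order[0]]                      # start with one arbitrary prototype
        changed = True
        while changed:                          # repeat passes until a pass adds nothing
            changed = False
            for i in order:
                if i in store:
                    continue
                if one_nn_predict(X[store], y[store], X[i]) != y[i]:
                    store.append(i)             # keep points the current store gets wrong
                    changed = True
        return np.array(store)

    # Tiny synthetic example: two Gaussian blobs in 2-D.
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(4.0, 1.0, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    kept = cnn_condense(X, y)
    print(f"reduced {len(X)} training points to {len(kept)} prototypes")

Edition methods such as Wilson's Edited Nearest Neighbor work in the opposite direction, removing points whose neighbors disagree with their label, while hybrid methods combine both ideas; PG methods instead synthesize new representative prototypes rather than selecting existing points.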

Acknowledgements

This paper is a result of research conducted within the “MSc in Artificial Intelligence and Data Analytics” of the Department of Applied Informatics of the University of Macedonia. The publication/presentation of the paper is funded by the University of Macedonia Research Committee.

Author information

Corresponding author

Correspondence to Stylianos Eleftheriadis.

Copyright information

© 2024 IFIP International Federation for Information Processing

About this paper

Cite this paper

Eleftheriadis, S., Evangelidis, G., Ougiaroglou, S. (2024). An Empirical Analysis of Data Reduction Techniques for k-NN Classification. In: Maglogiannis, I., Iliadis, L., Macintyre, J., Avlonitis, M., Papaleonidas, A. (eds) Artificial Intelligence Applications and Innovations. AIAI 2024. IFIP Advances in Information and Communication Technology, vol 714. Springer, Cham. https://doi.org/10.1007/978-3-031-63223-5_7

  • DOI: https://doi.org/10.1007/978-3-031-63223-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-63222-8

  • Online ISBN: 978-3-031-63223-5

  • eBook Packages: Computer Science, Computer Science (R0)
