
A new fast prototype selection method based on clustering

Published: 01 May 2010

Abstract

In supervised classification, a training set T is given to a classifier so that it can classify new prototypes. In practice, not all of the information in T is useful to the classifier, so it is convenient to discard irrelevant prototypes from T. This process, known as prototype selection, is an important task because it can reduce the time needed for training or classification. In this work, we propose a new fast prototype selection method for large datasets. The method is based on clustering and selects border prototypes together with some interior prototypes. We report experimental results that show the performance of our method and compare its accuracy and runtime against other prototype selection methods.
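The abstract does not spell out the selection rule, so the following is only a minimal sketch of one plausible reading of "clustering, then border prototypes plus some interior prototypes", not the authors' algorithm. It assumes plain k-means as the clustering step, Euclidean distance, one interior representative per single-class cluster (the member closest to the cluster mean), and, in mixed-class clusters, border prototypes defined as each member's nearest neighbour of a different class. The function names `kmeans` and `select_prototypes` and the parameters `k`, `iters`, and `seed` are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns a cluster index per row of X."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(X, k)]
    assign = [0] * len(X)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: dist(x, centers[c])) for x in X]
        # Update step: recompute each center from its current members.
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = mean(members)
    return assign


def select_prototypes(X, y, k, seed=0):
    """Return the sorted indices of the prototypes kept from the training set (X, y)."""
    assign = kmeans(X, k, seed=seed)
    keep = set()
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        if len({y[i] for i in idx}) == 1:
            # Single-class cluster: keep one interior representative,
            # the member closest to the cluster mean.
            m = mean([X[i] for i in idx])
            keep.add(min(idx, key=lambda i: dist(X[i], m)))
        else:
            # Mixed cluster: for each member, keep its nearest neighbour
            # of a different class; those points lie near the class border.
            for i in idx:
                others = [j for j in idx if y[j] != y[i]]
                keep.add(min(others, key=lambda j: dist(X[i], X[j])))
    return sorted(keep)
```

Calling `select_prototypes(X, y, k)` with `X` as a list of feature vectors and `y` as the matching class labels returns the indices of the retained prototypes. Under these assumptions, well-separated classes collapse to one representative per cluster, while overlapping regions keep the points closest to the opposite class.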



Published In

Pattern Analysis & Applications, Volume 13, Issue 2
May 2010
114 pages
ISSN: 1433-7541
EISSN: 1433-755X

Publisher

Springer-Verlag

Berlin, Heidelberg


Author Tags

  1. Border prototypes
  2. Clustering
  3. Data reduction
  4. Instance-based classifiers
  5. Prototype selection
  6. Supervised classification

Qualifiers

  • Article


