A new fast prototype selection method based on clustering

Authors:

J. Arturo Olvera-López,

J. Ariel Carrasco-Ochoa,

J. Francisco Martínez-Trinidad

Pattern Analysis & Applications, Volume 13, Issue 2

Pages 131 - 141

https://doi.org/10.1007/s10044-008-0142-x

Published: 01 May 2010

Abstract

In supervised classification, a training set T is given to a classifier for classifying new prototypes. In practice, not all information in T is useful for classifiers; therefore, it is convenient to discard irrelevant prototypes from T. This process, known as prototype selection, is an important task because it can reduce the time needed for training or classification. In this work, we propose a new fast prototype selection method for large datasets, based on clustering, which selects border prototypes and some interior prototypes. We report experimental results showing the performance of our method and comparing its accuracy and runtimes against those of other prototype selection methods.
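
The abstract names the ingredients (clustering, border prototypes, some interior prototypes) but not the selection rule. The following is a minimal sketch of how such a method might look, not the authors' implementation. It assumes k-means clustering, keeps the member nearest to the center of each single-class cluster as an interior prototype, and keeps nearest opposite-class pairs in mixed clusters as border prototypes. The names `kmeans` and `select_prototypes` and all parameters are hypothetical.

```python
import numpy as np


def kmeans(X, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(X), size=n_clusters, replace=False)
    centers = X[start].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(n_clusters):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1), centers


def select_prototypes(X, y, n_clusters, seed=0):
    """Return sorted indices of the selected prototypes (illustrative rule)."""
    labels, centers = kmeans(X, n_clusters, seed=seed)
    selected = set()
    for j in range(n_clusters):
        idx = np.flatnonzero(labels == j)
        if idx.size == 0:
            continue
        classes, counts = np.unique(y[idx], return_counts=True)
        if classes.size == 1:
            # Single-class cluster: keep one interior prototype,
            # the member closest to the cluster center.
            d = np.linalg.norm(X[idx] - centers[j], axis=1)
            selected.add(int(idx[d.argmin()]))
        else:
            # Mixed cluster: keep border prototypes on both sides.
            major = classes[counts.argmax()]
            maj_idx = idx[y[idx] == major]
            min_idx = idx[y[idx] != major]
            d = np.linalg.norm(
                X[min_idx][:, None, :] - X[maj_idx][None, :, :], axis=2
            )
            # Majority-class points nearest to each minority-class point.
            border_cols = np.unique(d.argmin(axis=1))
            # Minority-class point nearest to each of those majority points.
            near_minor = d[:, border_cols].argmin(axis=0)
            selected.update(int(i) for i in maj_idx[border_cols])
            selected.update(int(i) for i in min_idx[near_minor])
    return np.array(sorted(selected))
```

Calling `select_prototypes(X, y, n_clusters)` with a float array `X` of shape `(n, d)` and a label array `y` of length `n` returns indices into `X`; the reduced training set is then `X[indices], y[indices]`. The number of clusters is a free parameter in this sketch.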


Published In

Pattern Analysis & Applications Volume 13, Issue 2

May 2010

114 pages

ISSN: 1433-7541

EISSN: 1433-755X

Copyright © 2010 Springer-Verlag London Limited.

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

Hayel R, El Hindi K, Hosny M, Alharbi R (2024) A selective LVQ algorithm for improving instance reduction techniques and its application for text classification. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 46(5-6):11353-11366. Online publication date: 24-Oct-2024
https://dl.acm.org/doi/10.3233/JIFS-235290
Tang L, Tian Y, Wang X, Pardalos P (2023) A simple and reliable instance selection for fast training support vector machine. Neural Networks 166(C):379-395. Online publication date: 1-Sep-2023
https://dl.acm.org/doi/10.1016/j.neunet.2023.07.018
Guibert A, Hurter C, Couellan N (2023) LRP-GUS: A Visual Based Data Reduction Algorithm for Neural Networks. Artificial Neural Networks and Machine Learning – ICANN 2023, pp 337-349. Online publication date: 26-Sep-2023
https://dl.acm.org/doi/10.1007/978-3-031-44192-9_27
Sinanaj L, Haeri H, Maddipatla S, Gao L, Pakala R, Kathiriya N, Beal C, Brennan S, Chen C, Jerath K (2022) Granulation of Large Temporal Databases: An Allan Variance Approach. SN Computer Science 4(1). Online publication date: 15-Oct-2022
https://dl.acm.org/doi/10.1007/s42979-022-01397-2
Zhang Z, Peng R, Ruan Y, Wu J, Luo X (2022) ESMOTE: an overproduce-and-choose synthetic examples generation strategy based on evolutionary computation. Neural Computing and Applications 35(9):6891-6977. Online publication date: 3-Dec-2022
https://dl.acm.org/doi/10.1007/s00521-022-08004-8
Ougiaroglou S, Evangelidis G (2022) WebDR: A Web Workbench for Data Reduction. Machine Learning and Knowledge Discovery in Databases, pp 464-467. Online publication date: 10-Mar-2022
https://dl.acm.org/doi/10.1007/978-3-662-44845-8_36
Jankowski N (2022) A Fast and Efficient Algorithm for Filtering the Training Dataset. Neural Information Processing, pp 504-512. Online publication date: 22-Nov-2022
https://dl.acm.org/doi/10.1007/978-3-031-30105-6_42
Zhao F, Xin Y, Zhang K, Niu X (2021) Representativeness-Based Instance Selection for Intrusion Detection. Security and Communication Networks 2021. Online publication date: 1-Jan-2021
https://dl.acm.org/doi/10.1155/2021/6638134
Akinyelu A, Ezugwu A, Adewumi A (2019) Ant colony optimization edge selection for support vector machine speed optimization. Neural Computing and Applications 32(15):11385-11417. Online publication date: 4-Dec-2019
https://dl.acm.org/doi/10.1007/s00521-019-04633-8
Tang T, Chen S, Zhao M, Huang W, Luo J (2019) Very large-scale data classification based on K-means clustering and multi-kernel SVM. Soft Computing - A Fusion of Foundations, Methodologies and Applications 23(11):3793-3801. Online publication date: 1-Jun-2019
https://dl.acm.org/doi/10.1007/s00500-018-3041-0
