
The Nearest Subclass Classifier: A Compromise between the Nearest Mean and Nearest Neighbor Classifier

Published: 01 September 2005

Abstract

We present the Nearest Subclass Classifier (NSC), a classification algorithm that unifies the flexibility of the nearest neighbor classifier with the robustness of the nearest mean classifier. The algorithm is based on the Maximum Variance Cluster algorithm and, as such, belongs to the class of prototype-based classifiers. The variance constraint parameter of the cluster algorithm serves to regularize the classifier, that is, to prevent overfitting. With a low variance constraint value the classifier turns into the nearest neighbor classifier, and with a high variance constraint value it becomes the nearest mean classifier, with the respective properties. In other words, the number of prototypes ranges from the whole training set to only one per class. In the experiments, we compared the NSC to several other prototype-based methods with respect to classification performance and data set compression ratio. On several data sets, the NSC performed similarly to the k-nearest neighbor classifier, which is a well-established classifier in many domains. The NSC also has favorable storage requirements and classification speed, so it offers a good compromise between classification performance and efficiency.
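For intuition, the sketch below illustrates the idea in Python. It is not the authors' implementation: k-means stands in for the paper's Maximum Variance Cluster algorithm, and the class and parameter names (NSCSketch, sigma2_max) are illustrative assumptions. Each class is partitioned into subclasses until every subclass meets the variance constraint; the subclass means then serve as prototypes for a nearest-prototype decision rule.

```python
# Minimal sketch of the Nearest Subclass Classifier idea, assuming k-means
# as a stand-in for the Maximum Variance Cluster algorithm of the paper.
import numpy as np
from sklearn.cluster import KMeans

class NSCSketch:
    """Nearest-prototype rule over per-class subclass means found under a
    variance constraint (illustrative sketch, not the published algorithm)."""

    def __init__(self, sigma2_max):
        # Maximum allowed mean squared distance of a subclass's members to
        # its center. Low values -> many prototypes (towards 1-NN);
        # high values -> one prototype per class (nearest mean).
        self.sigma2_max = sigma2_max

    def fit(self, X, y):
        prototypes, labels = [], []
        for c in np.unique(y):
            Xc = X[y == c]
            # Grow the number of subclasses until every subclass of this
            # class meets the variance constraint (k = len(Xc) always does,
            # since each point is then its own zero-variance subclass).
            for k in range(1, len(Xc) + 1):
                km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xc)
                d2 = ((Xc - km.cluster_centers_[km.labels_]) ** 2).sum(axis=1)
                if all(d2[km.labels_ == j].mean() <= self.sigma2_max
                       for j in range(k)):
                    break
            prototypes.append(km.cluster_centers_)
            labels.extend([c] * k)
        self.prototypes_ = np.vstack(prototypes)
        self.labels_ = np.array(labels)
        return self

    def predict(self, X):
        # Nearest-prototype rule: assign the class of the closest subclass mean.
        d2 = ((X[:, None, :] - self.prototypes_[None]) ** 2).sum(axis=2)
        return self.labels_[d2.argmin(axis=1)]
```

Under this sketch, sigma2_max near zero makes every training sample a prototype (1-NN behavior), while a very large sigma2_max collapses each class to its mean (nearest mean behavior), matching the regularization trade-off described above.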




Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 27, Issue 9
September 2005
161 pages

Publisher

IEEE Computer Society

United States


Author Tags

  1. Classification
  2. cross-validation
  3. prototype selection
  4. regularization

Qualifiers

  • Research-article


Cited By

  • (2023) "A Complex Weighted Discounting Multisource Information Fusion With its Application in Pattern Classification," IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 8, pp. 7609-7623, DOI: 10.1109/TKDE.2022.3206871. Online publication date: 1-Aug-2023.
  • (2023) "Generalized Divergence-Based Decision Making Method With an Application to Pattern Classification," IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 7, pp. 6941-6956, DOI: 10.1109/TKDE.2022.3177896. Online publication date: 1-Jul-2023.
  • (2022) "Learning refined features for open-world text classification with class description and commonsense knowledge," World Wide Web, vol. 26, no. 2, pp. 637-660, DOI: 10.1007/s11280-022-01102-6. Online publication date: 24-Sep-2022.
  • (2022) "Kernel smoothing classification of multiattribute data in the belief function framework: Application to multichannel image segmentation," Multimedia Tools and Applications, vol. 81, no. 20, pp. 29587-29608, DOI: 10.1007/s11042-022-12086-w. Online publication date: 1-Aug-2022.
  • (2022) "Interpretable systems based on evidential prospect theory for decision-making," Applied Intelligence, vol. 53, no. 2, pp. 1640-1665, DOI: 10.1007/s10489-022-03276-y. Online publication date: 30-Apr-2022.
  • (2021) "Learning Refined Features for Open-World Text Classification," Web and Big Data, pp. 367-381, DOI: 10.1007/978-3-030-85896-4_29. Online publication date: 23-Aug-2021.
  • (2020) "Basic quantum circuits for classification and approximation tasks," International Journal of Applied Mathematics and Computer Science, vol. 30, no. 4, pp. 733-744, DOI: 10.34768/amcs-2020-0054. Online publication date: 31-Dec-2020.
  • (2019) "An efficient hybrid DWT-fuzzy filter in DCT domain based illumination normalization for face recognition," Multimedia Tools and Applications, vol. 78, no. 11, pp. 15213-15233, DOI: 10.1007/s11042-018-6837-0. Online publication date: 1-Jun-2019.
  • (2019) "Ant colony optimization edge selection for support vector machine speed optimization," Neural Computing and Applications, vol. 32, no. 15, pp. 11385-11417, DOI: 10.1007/s00521-019-04633-8. Online publication date: 4-Dec-2019.
  • (2018) "Weighted Fuzzy Dempster–Shafer Framework for Multimodal Information Integration," IEEE Transactions on Fuzzy Systems, vol. 26, no. 1, pp. 338-352, DOI: 10.5555/3196157.3196197. Online publication date: 1-Feb-2018.
