Abstract
To classify an unseen (query) vector q with the k-Nearest Neighbors (k-NN) method, one computes a similarity function between q and the training vectors in a database. In the basic variant of the k-NN algorithm, the predicted class of q is the majority class among its k nearest neighbors. Different similarity functions may be applied, leading to different classification results. In this paper a heterogeneous similarity function is constructed from different single-component metrics by minimizing the number of classification errors the system makes on a training set. On the five datasets tested, the HSFL-NN system introduced in this paper has given better results on unseen samples than the plain k-NN method with an optimally selected k parameter and an optimal homogeneous similarity function.
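The abstract describes the approach only at a high level. As a minimal illustrative sketch (not the paper's actual HSFL-NN implementation), the following Python snippet shows majority-vote k-NN in which the similarity between two vectors is a weighted combination of per-feature, single-component metrics. The names heterogeneous_similarity and knn_predict, the example metrics, and the fixed weights are hypothetical; in the paper the combination of component metrics would instead be tuned so as to minimize classification errors on the training set.

```python
import numpy as np
from collections import Counter

def heterogeneous_similarity(x, q, metrics, weights):
    """Combine per-feature (single-component) metrics into one similarity score.

    `metrics` holds one callable per feature; `weights` are the coefficients
    that a learner would tune to minimize training-set classification errors
    (here they are fixed placeholders).
    """
    return sum(w * m(xi, qi) for w, m, xi, qi in zip(weights, metrics, x, q))

def knn_predict(q, X_train, y_train, k, metrics, weights):
    """Plain majority-vote k-NN using the combined similarity function."""
    sims = [heterogeneous_similarity(x, q, metrics, weights) for x in X_train]
    nearest = np.argsort(sims)[-k:]  # indices of the k most similar training vectors
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Example single-component metrics: negated absolute difference for the two
# numeric features, exact value matching for the symbolic feature.
metrics = [lambda a, b: -abs(a - b),
           lambda a, b: -abs(a - b),
           lambda a, b: 1.0 if a == b else 0.0]
weights = [1.0, 0.5, 2.0]  # placeholders for coefficients learned on the training set

X_train = [(1.0, 2.0, 'red'), (0.9, 1.8, 'red'), (5.0, 7.0, 'blue')]
y_train = ['A', 'A', 'B']
print(knn_predict((1.1, 2.1, 'red'), X_train, y_train, k=3,
                  metrics=metrics, weights=weights))  # -> 'A'
```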
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grudziński, K. (2008). Towards Heterogeneous Similarity Function Learning for the k-Nearest Neighbors Classification. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing – ICAISC 2008. Lecture Notes in Computer Science, vol 5097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69731-2_56
DOI: https://doi.org/10.1007/978-3-540-69731-2_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69572-1
Online ISBN: 978-3-540-69731-2