Abstract
Rare category detection is an open challenge in data mining. Existing approaches often suffer from inappropriate investigation scopes, high time complexity, or limited applicability, which degrade their performance and reduce their usability. In this paper, we present FRANC, an effective and efficient solution for rare category detection. FRANC adopts an investigation scope based on k-nearest centroid neighbors with an automatically selected k, which helps the algorithm capture the actual changes in local density and data distribution caused by the presence of rare categories. Identifying the k-nearest centroid neighbors is the most computationally expensive step in FRANC; our proposed pruning method makes this step much faster for each data example. Extensive experiments on real data sets demonstrate the effectiveness and efficiency of FRANC.
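To make the notion of k-nearest centroid neighbors concrete, the following is a minimal Python sketch of the commonly used greedy definition: the first neighbor is the point closest to the query, and each subsequent neighbor is the point that, together with those already chosen, yields the centroid closest to the query. This is only an illustration under stated assumptions and not FRANC itself: k is fixed by the caller rather than selected automatically, the pruning method is not included, and the function name `k_nearest_centroid_neighbors` is hypothetical.

```python
import math


def k_nearest_centroid_neighbors(query, points, k):
    """Return the indices of the k nearest centroid neighbors of `query`.

    Brute-force illustration: every remaining point is tried at every step.
    It does not include any pruning and takes k as a fixed input.
    """
    if k > len(points):
        raise ValueError("k must not exceed the number of points")

    dim = len(query)
    chosen = []                      # indices selected so far, in order
    running_sum = [0.0] * dim        # coordinate-wise sum of the chosen points
    remaining = set(range(len(points)))

    for step in range(1, k + 1):
        best_idx, best_dist = None, math.inf
        # Iterate in index order so that ties go to the lowest index.
        for idx in sorted(remaining):
            # Centroid of the already chosen points plus this candidate.
            centroid = [(running_sum[d] + points[idx][d]) / step
                        for d in range(dim)]
            dist = math.dist(centroid, query)
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        chosen.append(best_idx)
        remaining.remove(best_idx)
        running_sum = [running_sum[d] + points[best_idx][d]
                       for d in range(dim)]

    return chosen


if __name__ == "__main__":
    query = (0.0, 0.0)
    points = [(1.0, 0.0), (1.2, 0.0), (-2.0, 0.0)]
    print(k_nearest_centroid_neighbors(query, points, 2))  # prints [0, 2]
```

In this toy example the two plain nearest neighbors of the query are points 0 and 1, which both lie on the same side. The centroid-based selection instead picks points 0 and 2, which surround the query, and this is why such a neighborhood can reflect the local data distribution as well as distance.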
Acknowledgements
This work was supported in part by NSFC Grants (61502347, 61522208, 61572376, 61303025, 61379033, and 61232002), the Fundamental Research Funds for the Central Universities (2015XZZX005-07, 2015XZZX004-18, and 2042015kf0038), the Research Funds for Introduced Talents of Wuhan University, and the International Academic Cooperation Training Program of Wuhan University.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, S., Huang, H., Gao, Y., Qian, T., Hong, L., Peng, Z. (2016). Fast Rare Category Detection Using Nearest Centroid Neighborhood. In: Li, F., Shim, K., Zheng, K., Liu, G. (eds) Web Technologies and Applications. APWeb 2016. Lecture Notes in Computer Science, vol 9931. Springer, Cham. https://doi.org/10.1007/978-3-319-45814-4_31
DOI: https://doi.org/10.1007/978-3-319-45814-4_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45813-7
Online ISBN: 978-3-319-45814-4
eBook Packages: Computer Science; Computer Science (R0)