Abstract
Most existing rough set-based feature selection algorithms rely on computationally intensive discernibility functions or positive regions to find attribute reducts. In this paper, we develop a new computation model based on relative attribute dependency, defined as the ratio of the cardinality of the projection of the decision table onto a subset of condition attributes to the cardinality of its projection onto the union of that subset and the set of decision attributes. To find an optimal reduct, we use the information entropy conveyed by the attributes as the heuristic. A novel algorithm that finds optimal reducts of condition attributes based on relative attribute dependency is implemented in Java and evaluated on 10 data sets from the UCI Machine Learning Repository. We compare C4.5 classification on the original data sets with classification on their reducts. The experimental results demonstrate the usefulness of our algorithm.
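The definition above can be illustrated with a small sketch. This is not the authors' Java implementation: the function names, the toy decision table, and in particular the backward-elimination search that tries low-entropy attributes first are assumptions made for illustration, since the abstract does not say how the entropy heuristic is applied. Only the ratio of projection sizes follows directly from the definition.

```python
from collections import Counter
from math import log2


def projection_size(rows, attrs):
    """Number of distinct tuples when the table is projected onto attrs."""
    return len({tuple(row[a] for a in attrs) for row in rows})


def relative_dependency(rows, subset, decision):
    """Size of the projection on subset divided by the size of the
    projection on subset plus the decision attributes."""
    return projection_size(rows, subset) / projection_size(
        rows, list(subset) + list(decision)
    )


def entropy(rows, attr):
    """Shannon entropy (in bits) of the value distribution of one attribute."""
    counts = Counter(row[attr] for row in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def backward_elimination(rows, condition, decision):
    """ASSUMED search strategy (not taken from the paper): try to drop
    attributes in order of increasing entropy, and keep a drop only if
    the relative dependency of the remaining attributes is still 1."""
    kept = list(condition)
    for attr in sorted(condition, key=lambda a: entropy(rows, a)):
        candidate = [a for a in kept if a != attr]
        if candidate and relative_dependency(rows, candidate, decision) == 1.0:
            kept = candidate
    return kept


# Hypothetical toy decision table: condition attributes a, b, c; decision d.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]

print(relative_dependency(rows, ["b"], ["d"]))            # 0.5
print(backward_elimination(rows, ["a", "b", "c"], ["d"]))  # ['c']
```

In this toy table, attribute b alone yields 2 distinct tuples but 4 once the decision is added, so its relative dependency is 0.5 and it does not determine d. Attribute c alone yields 2 distinct tuples with or without d, giving a ratio of 1, which is why the sketch retains it as the reduced attribute set.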
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Han, J., Sanchez, R., Hu, X. (2005). Feature Selection Based on Relative Attribute Dependency: An Experimental Study. In: Ślęzak, D., Wang, G., Szczuka, M., Düntsch, I., Yao, Y. (eds) Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. RSFDGrC 2005. Lecture Notes in Computer Science, vol 3641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11548669_23
DOI: https://doi.org/10.1007/11548669_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28653-0
Online ISBN: 978-3-540-31825-5
eBook Packages: Computer Science, Computer Science (R0)