ICAAI Conference Proceedings · Research Article · DOI: 10.1145/3369114.3369119

Filter-Based Information-Theoretic Feature Selection

Published: 21 January 2020

Abstract

Feature subset selection methods aim to identify the smallest subset of features that maximizes generalization performance while preserving the true nature of the joint data distribution. In classification tasks, this is tantamount to finding an optimal subset of features relevant to the target class. A distinctive family of feature selection methods uses a distance metric to identify relevant features, even under high feature interaction, by looking at the local class distribution. In this study we present EBFS: a new algorithm that is inspired by ReliefF and uses an entropy-based metric to discover relevant features. Results on UCI datasets show the effectiveness of our approach when compared to other filter-based feature selection methods.
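The abstract situates EBFS in the ReliefF family: score each feature by how well it separates an instance from its nearest neighbors of other classes while keeping it close to neighbors of the same class. The paper's entropy-based metric is not reproduced here; as a rough illustration of the classical ReliefF weighting scheme this family builds on, consider the sketch below (the function name `relieff_weights` and its parameters are ours, not from the paper, and several refinements of the published algorithm, such as prior-weighted misses, are omitted):

```python
import numpy as np

def relieff_weights(X, y, n_neighbors=3, seed=0):
    """Simplified ReliefF-style feature weighting.

    For each instance, find its nearest hits (same class) and nearest
    misses (other classes). Features that agree on hits but differ on
    misses accumulate higher weights.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Scale each feature to [0, 1] so per-feature distances are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    w = np.zeros(d)
    for i in rng.permutation(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance to all points
        dist[i] = np.inf                        # exclude the instance itself
        same = (y == y[i])
        hits = np.argsort(np.where(same, dist, np.inf))[:n_neighbors]
        misses = np.argsort(np.where(~same, dist, np.inf))[:n_neighbors]
        # Penalize disagreement with hits, reward disagreement with misses.
        w -= np.abs(Xs[hits] - Xs[i]).mean(axis=0)
        w += np.abs(Xs[misses] - Xs[i]).mean(axis=0)
    return w / n
```

On a toy dataset where one feature tracks the class label and another is pure noise, the informative feature receives a clearly higher weight, which is the local-class-distribution behavior the abstract refers to.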


Cited By

  • (2021) "A Novel Forward Filter Feature Selection Algorithm Based on Maximum Dual Interaction and Maximum Feature Relevance (MDIMFR) for Machine Learning". 2021 International Conference on Advances in Computing and Communications (ICACC), pp. 1-7. DOI: 10.1109/ICACC-202152719.2021.9708300. Published: 21 October 2021.
  • (2021) "Unsupervised feature selection via transformed auto-encoder". Knowledge-Based Systems, 215, 106748. DOI: 10.1016/j.knosys.2021.106748. Published: March 2021.


    Published In

ICAAI '19: Proceedings of the 3rd International Conference on Advances in Artificial Intelligence
October 2019, 253 pages
ISBN: 9781450372534
DOI: 10.1145/3369114

    In-Cooperation

    • Northumbria University: University of Northumbria at Newcastle

    Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. Feature selection
    2. feature subset selection
    3. supervised learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICAAI 2019

