Ensemble Feature Ranking

Kees Jong, Jérémie Mary, Antoine Cornuéjols, Elena Marchiori & Michèle Sebag

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3202)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

A crucial issue for Machine Learning and Data Mining is Feature Selection: selecting the relevant features in order to focus the learning search. A relaxed setting for Feature Selection is Feature Ranking, in which the features are ranked by relevance.

This paper proposes an ensemble approach to Feature Ranking that aggregates the feature rankings extracted from independent runs of an evolutionary learning algorithm named ROGER. The convergence of ensemble feature ranking is studied from a theoretical perspective, and a statistical model is devised for the empirical validation, inspired by the complexity framework proposed in the Constraint Satisfaction domain. Comparative experiments demonstrate the robustness of the approach for learning (a limited kind of) non-linear concepts, specifically when the features significantly outnumber the examples.
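
To make the aggregation idea concrete, here is a minimal sketch of how several feature rankings might be combined into one. The abstract does not state the aggregation rule, so pairwise majority voting is an assumption made for illustration, and ROGER itself is not implemented: the input rankings stand in for the output of its independent runs. The function names `rank_from_scores` and `aggregate_rankings` are hypothetical.

```python
def rank_from_scores(scores):
    """Order feature indices from highest to lowest score.

    `scores` is a hypothetical per-feature relevance score from one run
    (for instance, the absolute weight a learned hypothesis gives a feature).
    """
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def aggregate_rankings(rankings):
    """Combine rankings by pairwise majority vote (an assumed rule).

    rankings: list of permutations of feature indices, best feature first.
    Feature i "beats" feature j when more than half of the rankings place
    i before j. Features are then ordered by how many others they beat,
    with ties broken by feature index.
    """
    n = len(rankings[0])

    # positions[r][f] is the position of feature f in ranking r.
    positions = []
    for ranking in rankings:
        pos = [0] * n
        for place, feature in enumerate(ranking):
            pos[feature] = place
        positions.append(pos)

    wins = [0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            votes = sum(1 for pos in positions if pos[i] < pos[j])
            if 2 * votes > len(rankings):
                wins[i] += 1

    return sorted(range(n), key=lambda f: (-wins[f], f))


# Three rankings of four features that disagree on the middle positions.
runs = [[0, 1, 2, 3], [1, 0, 2, 3], [0, 2, 1, 3]]
ensemble = aggregate_rankings(runs)
print(ensemble)  # [0, 1, 2, 3]
```

In this toy input no single ranking is trusted on its own: feature 0 is placed ahead of feature 1 in two of the three runs, so the vote settles the disagreement. Other aggregation rules, such as averaging rank positions, would fit the same interface.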

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Vrije Universiteit Amsterdam, The Netherlands
Kees Jong & Elena Marchiori
Laboratoire de Recherche en Informatique, CNRS-INRIA, Université Paris-Sud Orsay, France
Jérémie Mary, Antoine Cornuéjols & Michèle Sebag

Editor information

Editors and Affiliations

INSA-Lyon, LIRIS CNRS UMR5205, F-69621, Villeurbanne, France
Jean-François Boulicaut
Dipartimento di Informatica, Università degli Studi di Bari,
Floriana Esposito
Pisa KDD Laboratory, ISTI - CNR, Area della Ricerca di Pisa, Via Giuseppe Moruzzi 1, Pisa, Italy
Fosca Giannotti
Dipartimento di Informatica, Via F. Buonarroti 2, 56127, Pisa, Italy
Dino Pedreschi

About this paper

Cite this paper

Jong, K., Mary, J., Cornuéjols, A., Marchiori, E., Sebag, M. (2004). Ensemble Feature Ranking. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Knowledge Discovery in Databases: PKDD 2004. Lecture Notes in Computer Science, vol 3202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30116-5_26

DOI: https://doi.org/10.1007/978-3-540-30116-5_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23108-0
Online ISBN: 978-3-540-30116-5
eBook Packages: Springer Book Archive