short-paper

Fairness-Aware Unsupervised Feature Selection

Authors:

Jundong Li

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Pages 3548 - 3552

https://doi.org/10.1145/3459637.3482106

Published: 30 October 2021

Abstract

Feature selection is a prevalent data preprocessing paradigm for various learning tasks. Because supervision information is expensive to acquire, unsupervised feature selection has recently attracted great interest. However, existing unsupervised feature selection algorithms do not take fairness into account and run a high risk of amplifying discrimination by selecting features that are overly associated with protected attributes such as gender, race, and ethnicity. In this paper, we make an initial investigation of the fairness-aware unsupervised feature selection problem and develop a principled framework that leverages kernel alignment to find a subset of high-quality features that best preserves the information in the original feature space while being minimally correlated with protected attributes. Specifically, unlike the mainstream in-processing debiasing methods, our proposed framework can be regarded as a model-agnostic debiasing strategy that eliminates biases and discrimination before downstream learning algorithms are involved. Experimental results on real-world datasets demonstrate that our framework achieves a good trade-off between feature utility and feature fairness.
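
To make the kernel-alignment idea in the abstract concrete, the following is a minimal illustrative sketch, not the framework proposed in the paper. It assumes linear kernels, the standard centered kernel alignment score, and a simple greedy forward search; the function names (`alignment`, `greedy_fair_selection`) and the trade-off weight `lam` are hypothetical choices made only for this example. Each candidate subset is scored by its alignment with the kernel of the full feature space minus `lam` times its alignment with the kernel of the protected attributes.

```python
import numpy as np


def center(K):
    """Center a kernel matrix: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def alignment(K1, K2):
    """Centered kernel alignment: <K1c, K2c>_F / (||K1c||_F ||K2c||_F)."""
    K1c, K2c = center(K1), center(K2)
    denom = np.linalg.norm(K1c) * np.linalg.norm(K2c)  # Frobenius norms
    return float(np.sum(K1c * K2c) / denom) if denom > 0 else 0.0


def linear_kernel(Z):
    """Linear kernel on the rows of Z (an assumption; any PSD kernel would do)."""
    return Z @ Z.T


def greedy_fair_selection(X, P, k, lam=1.0):
    """Greedily pick k feature indices (hypothetical helper for illustration).

    X: (n, d) feature matrix, P: (n, p) protected attributes.
    Score = alignment with the full-feature kernel
            - lam * alignment with the protected-attribute kernel.
    """
    K_full = linear_kernel(X)
    K_prot = linear_kernel(P)
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        best_j, best_score = None, -np.inf
        for j in remaining:
            K_sub = linear_kernel(X[:, selected + [j]])
            score = alignment(K_sub, K_full) - lam * alignment(K_sub, K_prot)
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


# Synthetic data: feature 0 is a noisy copy of a binary protected attribute,
# features 1-3 are independent of it.
rng = np.random.default_rng(0)
n = 200
p = rng.choice([-1.0, 1.0], size=(n, 1))
X = np.hstack([p + 0.1 * rng.standard_normal((n, 1)),
               rng.standard_normal((n, 3))])
print(greedy_fair_selection(X, p, k=3, lam=1.0))
```

On this synthetic data the penalty term steers the search away from feature 0, which is nearly a copy of the protected attribute. Setting `lam=0` reduces the score to pure information preservation, so the choice of `lam` controls the utility versus fairness trade-off that the abstract describes.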

Index Terms

Fairness-Aware Unsupervised Feature Selection
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
    2. Machine learning algorithms
      1. Feature selection

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management

October 2021

4966 pages

ISBN:9781450384469

DOI:10.1145/3459637

General Chairs: Gianluca Demartini (The University of Queensland, Australia), Guido Zuccon (The University of Queensland, Australia)

Program Chairs: J. Shane Culpepper (RMIT University, Australia), Zi Huang (The University of Queensland, Australia), Hanghang Tong (University of Illinois at Urbana-Champaign, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '21: The 30th ACM International Conference on Information and Knowledge Management

November 1 - 5, 2021

Queensland, Virtual Event, Australia

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
