Local rough set-based feature selection for label distribution learning with incomplete labels

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Label distribution learning, a new learning paradigm within the machine learning framework, is widely applied to address label ambiguity. However, most existing label distribution learning methods require complete supervised information, which is costly and laborious to obtain by annotating the data. In reality, the annotation information may be incomplete, and traditional methods cannot directly deal with such incomplete data. Hence, a new theoretical framework, called the local rough set, is proposed to handle the limited labeled data. In addition, label distribution learning also suffers from the “curse of dimensionality”, so pre-processing methods such as feature selection are essential for reducing data dimensionality. Nevertheless, few feature selection algorithms have been designed for label distribution data. Motivated by this, a model based on the local rough set and neighborhood granularity, which can work effectively and efficiently with incompletely labeled data, is introduced in this paper. Furthermore, a local rough set-based incomplete label distribution feature selection algorithm is proposed to reduce the data dimensionality. Experimental results on 12 real-world label distribution datasets indicate that the proposed method outperforms the global rough set in computational efficiency and achieves better classification performance than the other five comparison methods.
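The article itself includes no code. Purely as an illustration of the neighborhood rough set machinery the abstract alludes to, the sketch below shows generic dependency-degree-based greedy forward feature selection in Python; every detail here (the delta neighborhood radius, the collapse of label distributions to hard labels via argmax, the function names) is an assumption made for demonstration, not the authors' local rough set algorithm for incomplete label distributions.

# A minimal, illustrative sketch (assumptions only): generic neighborhood rough set
# dependency-degree feature selection with a greedy forward search. The delta radius,
# the argmax collapse of label distributions to hard labels, and all names below are
# hypothetical; this is NOT the paper's local rough set algorithm for incomplete labels.
import numpy as np


def neighborhood(X, feats, delta=0.2):
    """Boolean matrix N[i, j] = True iff sample j lies within the delta-neighborhood
    of sample i under the candidate feature subset `feats`."""
    diff = X[:, feats][:, None, :] - X[:, feats][None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist <= delta


def dependency(X, y, feats, delta=0.2):
    """Fraction of samples whose entire neighborhood shares their label, i.e. the
    size of the neighborhood rough set positive region relative to the universe."""
    N = neighborhood(X, feats, delta)
    pos = sum(np.all(y[N[i]] == y[i]) for i in range(len(y)))
    return pos / len(y)


def forward_select(X, y, delta=0.2, eps=1e-3):
    """Greedily add the feature with the largest dependency gain until the gain
    falls below eps."""
    selected, remaining, best = [], list(range(X.shape[1])), 0.0
    while remaining:
        gain, f = max((dependency(X, y, selected + [f], delta), f) for f in remaining)
        if gain - best < eps:
            break
        selected.append(f)
        remaining.remove(f)
        best = gain
    return selected


if __name__ == "__main__":
    # Toy usage: synthetic features and label distributions, collapsed to hard
    # labels purely for this illustration.
    rng = np.random.default_rng(0)
    X = rng.random((60, 8))
    label_dist = rng.dirichlet(np.ones(3), size=60)
    y = label_dist.argmax(axis=1)
    print(forward_select(X, y))

For context, a local rough set computes approximations only with respect to the target concept rather than over the entire universe, which is where the efficiency advantage over the global rough set reported in the abstract comes from; the sketch above omits both this localization and the handling of incomplete label distributions.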



Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2020YFD1100605), the National Natural Science Foundation of China (No. 61966016), the Natural Science Foundation of Jiangxi Province, China (No. 20192BAB207018), and the Scientific Research Project of the Education Department of Jiangxi Province, China (No. GJJ180200).

Author information


Corresponding author

Correspondence to Wenbin Qian.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Qian, W., Dong, P., Wang, Y. et al. Local rough set-based feature selection for label distribution learning with incomplete labels. Int. J. Mach. Learn. & Cyber. 13, 2345–2364 (2022). https://doi.org/10.1007/s13042-022-01528-4


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-022-01528-4
