Local rough set-based feature selection for label distribution learning with incomplete labels

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics

Abstract

Label distribution learning, a new learning paradigm within the machine learning framework, is widely applied to address label ambiguity. However, most existing label distribution learning methods require complete supervised information, which is costly and laborious to obtain by annotating the data. In reality, the annotation information may be incomplete, and traditional methods cannot directly deal with such incomplete data. Hence, a new theoretical framework, called the local rough set, is proposed to handle the limited labeled data. In addition, label distribution learning also suffers from the “curse of dimensionality”, so pre-processing methods such as feature selection are essential for reducing data dimensionality. Nevertheless, few feature selection algorithms have been designed for label distribution data. Motivated by this, a model based on the local rough set and neighborhood granularity, which can work effectively and efficiently with incompletely labeled data, is introduced in this paper. Furthermore, a local rough set-based incomplete label distribution feature selection algorithm is proposed to reduce the data dimensionality. Experimental results on 12 real-world label distribution datasets indicate that the proposed method outperforms the global rough set in computational efficiency and achieves better classification performance than the other five comparison methods.
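The article itself includes no code. Purely as an illustration of the neighborhood rough set machinery the abstract alludes to, the sketch below shows generic dependency-degree-based greedy forward feature selection in Python; every detail here (the delta neighborhood radius, the collapse of label distributions to hard labels via argmax, the function names) is an assumption made for demonstration, not the authors' local rough set algorithm for incomplete label distributions.

# A minimal, illustrative sketch (assumptions only): generic neighborhood rough set
# dependency-degree feature selection with a greedy forward search. The delta radius,
# the argmax collapse of label distributions to hard labels, and all names below are
# hypothetical; this is NOT the paper's local rough set algorithm for incomplete labels.
import numpy as np


def neighborhood(X, feats, delta=0.2):
    """Boolean matrix N[i, j] = True iff sample j lies within the delta-neighborhood
    of sample i under the candidate feature subset `feats`."""
    diff = X[:, feats][:, None, :] - X[:, feats][None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist <= delta


def dependency(X, y, feats, delta=0.2):
    """Fraction of samples whose entire neighborhood shares their label, i.e. the
    size of the neighborhood rough set positive region relative to the universe."""
    N = neighborhood(X, feats, delta)
    pos = sum(np.all(y[N[i]] == y[i]) for i in range(len(y)))
    return pos / len(y)


def forward_select(X, y, delta=0.2, eps=1e-3):
    """Greedily add the feature with the largest dependency gain until the gain
    falls below eps."""
    selected, remaining, best = [], list(range(X.shape[1])), 0.0
    while remaining:
        gain, f = max((dependency(X, y, selected + [f], delta), f) for f in remaining)
        if gain - best < eps:
            break
        selected.append(f)
        remaining.remove(f)
        best = gain
    return selected


if __name__ == "__main__":
    # Toy usage: synthetic features and label distributions, collapsed to hard
    # labels purely for this illustration.
    rng = np.random.default_rng(0)
    X = rng.random((60, 8))
    label_dist = rng.dirichlet(np.ones(3), size=60)
    y = label_dist.argmax(axis=1)
    print(forward_select(X, y))

For context, a local rough set computes approximations only with respect to the target concept rather than over the entire universe, which is where the efficiency advantage over the global rough set reported in the abstract comes from; the sketch above omits both this localization and the handling of incomplete label distributions.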



Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2020YFD1100605), the National Natural Science Foundation of China (No. 61966016), the Natural Science Foundation of Jiangxi Province, China (No. 20192BAB207018), and the Scientific Research Project of the Education Department of Jiangxi Province, China (No. GJJ180200).

Author information


Corresponding author

Correspondence to Wenbin Qian.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Qian, W., Dong, P., Wang, Y. et al. Local rough set-based feature selection for label distribution learning with incomplete labels. Int. J. Mach. Learn. & Cyber. 13, 2345–2364 (2022). https://doi.org/10.1007/s13042-022-01528-4


  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-022-01528-4
