Abstract
Dimensionality reduction is an important preprocessing step in multi-label learning and an effective way to address the challenges posed by high-dimensional data. Most existing dimensionality reduction methods target single-label or multi-label data and assume that all labels related to an instance are equally important. In many real applications, however, the related labels of an instance differ in relative importance. In this paper, a granular ball-based label enhancement algorithm is proposed to convert logical labels into label distributions and thereby obtain more supervision information. A granular ball can be regarded as a local coarse-grained unit for exploring sample similarity from a neighborhood viewpoint. Between-granular-ball and within-granular-ball scatter measures are then presented and used to construct a label distribution feature extraction algorithm. In addition, a two-stage mutual iterative learning framework is developed in which label enhancement and dimensionality reduction interact with each other. Finally, experiments compare the proposed method with six state-of-the-art methods on eleven multi-label datasets in terms of multiple representative evaluation measures. The experimental results show that the proposed method significantly outperforms the comparison methods, by an average of 36.8% over six widely used evaluation metrics.
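To make the ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes that granular balls are built by recursive two-seed bisection, that a label distribution is obtained by weighting each sample's logical labels by their frequency inside its ball, and that scatter is summarised by the trace of LDA-style within-ball and between-ball scatter. The function names and the `min_size` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def split_into_balls(X, min_size=4, seed=0):
    """Recursively bisect sample indices into coarse 'balls' (assumed scheme)."""
    rng = random.Random(seed)
    pending, balls = [list(range(len(X)))], []
    while pending:
        idx = pending.pop()
        if len(idx) <= min_size:
            balls.append(idx)
            continue
        # Pick two samples as seeds; each sample joins the nearer seed.
        a, b = rng.sample(idx, 2)
        left, right = [], []
        for i in idx:
            nearer_a = math.dist(X[i], X[a]) <= math.dist(X[i], X[b])
            (left if nearer_a else right).append(i)
        if not right:  # identical seeds: the group cannot be split further
            balls.append(idx)
        else:
            pending += [left, right]
    return balls


def enhance_labels(Y, balls):
    """Map logical labels Y (rows of 0/1) to label distributions."""
    D = [[0.0] * len(row) for row in Y]
    for idx in balls:
        q = len(Y[idx[0]])
        # Label frequency inside the ball serves as neighbourhood evidence.
        freq = [sum(Y[i][j] for i in idx) / len(idx) for j in range(q)]
        for i in idx:
            weights = [Y[i][j] * freq[j] for j in range(q)]
            total = sum(weights)
            if total > 0:  # samples without any label keep an all-zero row
                D[i] = [w / total for w in weights]
    return D


def mean(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def scatter_traces(X, balls):
    """Traces of the within-ball and between-ball scatter (LDA-style)."""
    mu = mean(X)
    within = between = 0.0
    for idx in balls:
        center = mean([X[i] for i in idx])
        within += sum(math.dist(X[i], center) ** 2 for i in idx)
        between += len(idx) * math.dist(center, mu) ** 2
    return within, between


if __name__ == "__main__":
    rng = random.Random(1)
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(30)]
    Y = [[rng.randint(0, 1) for _ in range(4)] for _ in range(30)]
    balls = split_into_balls(X)
    D = enhance_labels(Y, balls)
    print(len(balls), D[0], scatter_traces(X, balls))
```

In this sketch, a label that is common among a sample's ball neighbours receives a larger share of that sample's distribution, which is one simple way to express differing label importance. Because the balls partition the data, the within-ball and between-ball traces sum to the total scatter, so a projection that raises the between-ball share necessarily lowers the within-ball share.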
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61966016), the National Key Research and Development Program of China (No. 2020YFD1100605), and the Natural Science Foundation of Jiangxi Province, China (No. 20224BAB202020).
Ethics declarations
Conflicts of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Wenbin Qian and Wenyong Ruan contributed equally to this work.
Appendix A: Abbreviations nomenclature
About this article
Cite this article
Qian, W., Ruan, W., Li, Y. et al. Granular ball-based label enhancement for dimensionality reduction in multi-label data. Appl Intell 53, 24008–24033 (2023). https://doi.org/10.1007/s10489-023-04771-6
DOI: https://doi.org/10.1007/s10489-023-04771-6