
Robust Learning under Hybrid Noise

Published: 15 February 2025

Abstract

Feature noise and label noise are ubiquitous in practical scenarios and pose great challenges for training robust machine learning models. Most previous approaches deal with only a single problem, either feature noise or label noise. In real-world applications, however, hybrid noise, which contains both feature noise and label noise, is very common due to unreliable data collection and annotation processes. Although a few representation-learning-based attempts have achieved some results, this issue is still far from being addressed with both promising performance and guaranteed theoretical analysis. To address the challenge, we propose a novel unified learning framework called Feature and Label Recovery (FLR) that combats hybrid noise from the perspective of data recovery, concurrently reconstructing both the feature matrix and the label matrix of the input data. Specifically, the clean feature matrix is discovered by low-rank approximation, and the ground-truth label matrix is embedded based on the recovered features with a nuclear-norm regularization. Meanwhile, the feature noise and label noise are characterized by their respective adaptive matrix norms so as to satisfy the corresponding maximum likelihoods. As this framework leads to a non-convex optimization problem, we develop a non-convex Alternating Direction Method of Multipliers (ADMM) with a convergence guarantee to solve our learning objective. We also provide a theoretical analysis showing that the generalization error of FLR is upper-bounded in the presence of hybrid noise. Experimental results on several typical benchmark datasets clearly demonstrate the superiority of the proposed method over state-of-the-art robust learning approaches under various noises.
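The FLR objective itself is not reproduced on this page, but its core building block, recovering a clean low-rank matrix from noise-corrupted observations via nuclear-norm minimization solved with ADMM, can be sketched with the classic Robust PCA decomposition (low-rank plus sparse). The function names, parameter defaults, and stopping rule below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def svd_threshold(M, tau):
    # Singular value thresholding: proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft_threshold(M, tau):
    # Elementwise soft thresholding: proximal operator of the l1 norm.
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def robust_pca_admm(X, lam=None, mu=None, n_iter=500, tol=1e-7):
    """Decompose X = L + S via ADMM (inexact augmented Lagrangian),
    minimizing ||L||_* + lam * ||S||_1 subject to X = L + S.
    L is the recovered low-rank matrix, S the sparse noise.
    Default lam/mu follow common Robust PCA heuristics (assumed here)."""
    m, n = X.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = 0.25 * m * n / (np.abs(X).sum() + 1e-12)
    L = np.zeros_like(X)
    S = np.zeros_like(X)
    Y = np.zeros_like(X)  # dual variable for the constraint X = L + S
    for _ in range(n_iter):
        L = svd_threshold(X - S + Y / mu, 1.0 / mu)   # low-rank update
        S = soft_threshold(X - L + Y / mu, lam / mu)  # sparse-noise update
        resid = X - L - S
        Y += mu * resid                                # dual ascent step
        if np.linalg.norm(resid) <= tol * np.linalg.norm(X):
            break
    return L, S
```

FLR extends this idea by recovering the label matrix jointly with the features and by replacing the fixed l1 penalty with adaptive matrix norms matched to the noise distributions; the alternating proximal structure of the solver is analogous.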



Published In

ACM Transactions on Intelligent Systems and Technology, Volume 16, Issue 2 (April 2025), 435 pages
EISSN: 2157-6912
DOI: 10.1145/3703036
Editor: Huan Liu

Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

Received: 04 July 2024
Revised: 17 October 2024
Accepted: 08 November 2024
Online AM: 23 December 2024
Published: 15 February 2025, in TIST Volume 16, Issue 2


    Author Tags

    1. Hybrid noise
    2. Matrix recovery
    3. Generalization bound

    Qualifiers

    • Research-article

    Funding Sources

    • NSF of China
    • NSF for Distinguished Young Scholar of Jiangsu
