DOI: 10.1145/3637528.3671798
Research Article · Open Access

Divide and Denoise: Empowering Simple Models for Robust Semi-Supervised Node Classification against Label Noise

Published: 24 August 2024

Abstract

Graph neural networks (GNNs) based on message passing have achieved remarkable performance in graph machine learning, and combining message passing with pseudo labeling can further improve semi-supervised node classification. However, most existing works assume that the training node labels are completely noise-free, an assumption that rarely holds in practice. GNNs overfit noisy training labels, and the adverse effects of mislabeled nodes are amplified as they propagate to the remaining nodes through the graph structure, exacerbating the model failure. Worse still, without special treatment, noisy pseudo labels can further undermine the model's reliability. In this paper, we revisit the roles of (1) message passing and (2) pseudo labels in this problem, and address two denoising subproblems from the model-architecture and algorithm perspectives, respectively. Specifically, we first develop a label-noise-robust GNN that discards the coupled message-passing scheme. Despite its simple architecture, this learning backbone prevents overfitting to noisy labels and inherently avoids the noise-propagation issue. Moreover, we propose a novel reliable graph pseudo-labeling algorithm that effectively leverages the knowledge of unlabeled nodes while mitigating the adverse effects of noisy pseudo labels. Together, these designs attain exceptional effectiveness and efficiency on the studied problem. Extensive experiments on benchmark datasets for semi-supervised node classification with different levels of label noise show new state-of-the-art performance. The code is available at https://github.com/DND-NET/DND-NET.
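The abstract does not reproduce the method's details (see the linked repository for the authors' implementation), but its two ingredients can be illustrated with minimal sketches. The first sketch, assuming PyTorch, a normalized sparse adjacency matrix `adj_norm`, and APPNP-style "predict then propagate" as the decoupled design alluded to, shows how a plain MLP trained on node features alone, followed by parameter-free propagation, keeps gradients from (possibly noisy) labels from ever flowing through the graph. The class name and all hyperparameters below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledGNN(nn.Module):
    """Hypothetical decoupled backbone: feature transformation (an MLP)
    is kept separate from graph propagation. The training loss can be
    attached to the MLP's raw logits, so noisy labels never backpropagate
    through the graph; propagation is a parameter-free smoothing step."""

    def __init__(self, in_dim, hid_dim, n_classes, k=10, alpha=0.1):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(hid_dim, n_classes),
        )
        self.k = k          # number of propagation steps (assumed value)
        self.alpha = alpha  # teleport probability (assumed value)

    def forward(self, x, adj_norm):
        # Graph-agnostic predictions from node features only.
        h0 = self.mlp(x)
        # Personalized-PageRank-style smoothing over the normalized
        # sparse adjacency matrix; no learnable parameters here.
        h = h0
        for _ in range(self.k):
            h = (1 - self.alpha) * torch.sparse.mm(adj_norm, h) + self.alpha * h0
        return h
```

The second sketch is a plain confidence-thresholded pseudo-labeling step, a common baseline for selecting "reliable" pseudo labels. The abstract does not state the authors' selection criterion, so the threshold rule here is an assumption standing in for it.

```python
@torch.no_grad()
def select_pseudo_labels(logits, unlabeled_idx, threshold=0.9):
    """Keep only unlabeled nodes whose top predicted class probability
    exceeds `threshold` (a hypothetical reliability criterion).
    Uses the torch / F imports from the sketch above."""
    probs = F.softmax(logits[unlabeled_idx], dim=1)
    conf, pred = probs.max(dim=1)
    keep = conf >= threshold
    return unlabeled_idx[keep], pred[keep]
```

Selected nodes and their predicted labels would then be folded into the supervised loss for the next training round; raising `threshold` trades pseudo-label coverage for reliability.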


Cited By

• (2024) Data-efficient graph learning: Problems, progress, and prospects. AI Magazine. DOI: 10.1002/aaai.12200. Online publication date: 18 October 2024.


Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024, 6901 pages
ISBN: 9798400704901
DOI: 10.1145/3637528
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. graph neural networks
        2. noisy labels
        3. semi-supervised learning


Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

Article Metrics

• Downloads (last 12 months): 508
• Downloads (last 6 weeks): 194

Reflects downloads up to 12 December 2024.