research-article

GEML: a graph-enhanced pre-trained language model framework for text classification via mutual learning

Authors:

Adriano Tavares,

Hao Xu

Applied Intelligence, Volume 54, Issue 23

Pages 12215 - 12229

https://doi.org/10.1007/s10489-024-05831-1

Published: 11 September 2024

Abstract

Large-scale Pre-trained Language Models (PLMs) have become the backbone of text classification due to their exceptional performance. However, they treat input documents as independent and identically distributed, thereby disregarding potential relationships among documents. This limitation can lead to classification errors. To address this issue, recent work explores the integration of Graph Neural Networks (GNNs) with PLMs, as GNNs can effectively model document relationships. Yet combining graph-based methods with PLMs is challenging due to the structural incompatibility between graphs and sequences. To tackle this challenge, we propose a graph-enhanced text mutual learning framework that integrates graph-based models with PLMs to boost classification performance. Our approach separates graph-based methods and language models into two independent channels and allows them to approximate each other through mutual learning of probability distributions. This probability-distribution-guided approach simplifies the adaptation of graph-based models to PLMs and enables seamless end-to-end training of the entire architecture. Moreover, we introduce Asymmetrical Learning, a strategy that enhances the learning process, and incorporate an Uncertainty Weighting loss to achieve smoother probability distribution learning. These enhancements significantly improve the performance of mutual learning. The practical value of our research lies in its potential applications in areas such as social network analysis, information retrieval, and recommendation systems, where understanding and leveraging document relationships is crucial. Importantly, our method can be easily combined with different PLMs and consistently achieves state-of-the-art results on multiple public datasets.
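The abstract describes two channels that approximate each other's probability distributions, with an uncertainty-weighted loss. It gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard deep-mutual-learning recipe (each channel adds a KL term toward the other channel's prediction) and a commonly used simplified form of uncertainty weighting with learnable log-variances. All function names and logit values are hypothetical, and Asymmetrical Learning is not modeled because the abstract does not define it.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Supervised loss for a single example with an integer class label."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must have full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mutual_learning_losses(plm_logits, graph_logits, label):
    """Return the four loss terms of a two-channel mutual learning setup.

    Assumption: each channel is trained on the label and additionally
    pulled toward the other channel's predicted distribution.
    """
    p_plm = softmax(plm_logits)
    p_graph = softmax(graph_logits)
    return {
        "ce_plm": cross_entropy(p_plm, label),
        "ce_graph": cross_entropy(p_graph, label),
        # The PLM channel mimics the graph channel, and vice versa.
        "mimic_plm": kl_divergence(p_graph, p_plm),
        "mimic_graph": kl_divergence(p_plm, p_graph),
    }


def uncertainty_weighted(losses, log_vars):
    """Combine losses with one log-variance s per term: exp(-s) * L + s.

    Assumption: this is a simplified form often used in practice; in real
    training each s would be a learnable parameter.
    """
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


if __name__ == "__main__":
    terms = mutual_learning_losses([2.0, 0.5, -1.0], [1.0, 1.2, -0.5], label=0)
    total = uncertainty_weighted(list(terms.values()), [0.0, 0.0, 0.0, 0.0])
    print(terms, total)
```

With all log-variances at zero, the combined loss reduces to a plain sum of the four terms. Raising a term's log-variance down-weights it at the cost of the additive penalty, which is what lets the weights be learned rather than hand-tuned.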



Information

Published In

Applied Intelligence Volume 54, Issue 23

Dec 2024

576 pages

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 11 September 2024

Accepted: 29 August 2024

Funding Sources

National Natural Science Foundation of China
Education Department of Jilin Province
Department of Science and Technology of Jilin Province
