
Text classification using capsules

Published: 01 February 2020

Abstract

This paper presents an empirical exploration of capsule networks for text classification. While capsule networks have been shown to be effective for image classification, research into their validity in the text domain has only recently begun. We show that capsule networks indeed have potential for text classification and that they offer several advantages over convolutional neural networks. We also compare our proposed model with the initial studies on capsule network-based text classification. We further propose a simple routing method that effectively reduces the computational complexity of dynamic routing. Experiments on seven benchmark datasets demonstrate that capsule networks, combined with the proposed routing method, provide comparable results.
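The dynamic routing the abstract refers to is the iterative routing-by-agreement procedure of Sabour et al. (NIPS 2017), whose per-iteration cost the paper aims to reduce; the paper's own simplified routing is not specified here. A minimal NumPy sketch of the baseline algorithm, with illustrative (assumed) capsule counts and dimensions, looks like this:

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    # Squashing nonlinearity: keeps the vector's direction,
    # maps its norm into [0, 1) so length can act as a probability.
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: prediction ("vote") vectors, shape (num_in, num_out, dim_out):
    # lower-level capsule i's prediction for higher-level capsule j.
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits
    for _ in range(num_iters):
        # Coupling coefficients: softmax of logits over output capsules.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of votes, then squash to get output capsules.
        s = np.einsum('ij,ijd->jd', c, u_hat)
        v = squash(s)
        # Agreement update: votes aligned with the output get routed more.
        b += np.einsum('ijd,jd->ij', u_hat, v)
    return v

# Toy usage: 6 input capsules voting for 2 output capsules of dimension 4.
rng = np.random.default_rng(0)
u_hat = rng.standard_normal((6, 2, 4))
v = dynamic_routing(u_hat)
print(v.shape)  # (2, 4)
```

Note the cost driver: each iteration touches every (input, output) capsule pair, so routing scales with `num_in * num_out * dim_out * num_iters` — this is the complexity a simplified routing scheme would target.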



Published In

Neurocomputing, Volume 376, Issue C
February 2020
257 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands


Author Tags

  1. Deep learning
  2. Text classification
  3. Capsule network
  4. Machine learning
  5. Text mining

Qualifiers

  • Research-article
