
Text classification using capsules

Published: 01 February 2020

Abstract

This paper presents an empirical exploration of capsule networks for text classification. While capsule networks have been shown to be effective for image classification, research into their validity in the text domain has only recently begun. We show that capsule networks indeed have potential for text classification and that they offer several advantages over convolutional neural networks. We also compare our proposed model with the initial studies on capsule network-based text classification. We further propose a simple routing method that effectively reduces the computational complexity of dynamic routing. Experiments on seven benchmark datasets demonstrate that capsule networks, combined with the proposed routing method, provide comparable results.
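The dynamic routing the abstract refers to is the iterative routing-by-agreement procedure of Sabour et al. (NIPS 2017), whose per-iteration cost the paper aims to reduce; the paper's own simplified routing is not specified here. A minimal NumPy sketch of the baseline algorithm, with illustrative (assumed) capsule counts and dimensions, looks like this:

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    # Squashing nonlinearity: keeps the vector's direction,
    # maps its norm into [0, 1) so length can act as a probability.
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: prediction ("vote") vectors, shape (num_in, num_out, dim_out):
    # lower-level capsule i's prediction for higher-level capsule j.
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits
    for _ in range(num_iters):
        # Coupling coefficients: softmax of logits over output capsules.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of votes, then squash to get output capsules.
        s = np.einsum('ij,ijd->jd', c, u_hat)
        v = squash(s)
        # Agreement update: votes aligned with the output get routed more.
        b += np.einsum('ijd,jd->ij', u_hat, v)
    return v

# Toy usage: 6 input capsules voting for 2 output capsules of dimension 4.
rng = np.random.default_rng(0)
u_hat = rng.standard_normal((6, 2, 4))
v = dynamic_routing(u_hat)
print(v.shape)  # (2, 4)
```

Note the cost driver: each iteration touches every (input, output) capsule pair, so routing scales with `num_in * num_out * dim_out * num_iters` — this is the complexity a simplified routing scheme would target.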



Published In

Neurocomputing, Volume 376, Issue C
February 2020
257 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands


Author Tags

  1. Deep learning
  2. Text classification
  3. Capsule network
  4. Machine learning
  5. Text mining

Qualifiers

  • Research-article
