Abstract
Document-level sentiment classification in social networks aims to predict the sentiment a user expresses in a document. Traditional methods based on deep neural networks rely on unsupervised word vectors, which cannot fully capture the contextual relationships among words. Moreover, the Recurrent Neural Networks (RNNs) typically used for sentiment classification have complex structures and numerous model parameters, making them hard to train. To address these issues, we propose a Transfer Learning based Hierarchical Attention Neural Network (TLHANN). First, we train an encoder on a machine translation task so that it learns to represent words in context. Second, we transfer the encoder to the sentiment classification task by concatenating the hidden vectors it generates with the corresponding unsupervised word vectors. Finally, for sentiment classification we apply a two-level hierarchical network, with a simplified recurrent unit, the Minimal Gated Unit (MGU), and an attention mechanism at each level. Experimental results on several datasets show that TLHANN achieves excellent performance.
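The two building blocks named in the abstract, the MGU recurrence and additive attention pooling, are standard components and can be sketched independently of the paper. Below is a minimal NumPy sketch, assuming the MGU recurrence of Zhou et al. (2016) and Yang et al. (2016)-style additive attention; all function names, parameter names, and shapes are illustrative, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mgu_step(x_t, h_prev, Wf, Uf, bf, Wh, Uh, bh):
    """One step of a Minimal Gated Unit (sketch per Zhou et al., 2016).

    The MGU keeps a single forget gate f_t that also plays the role
    of the GRU's update gate, halving the gate parameters:
        f_t  = sigmoid(Wf x_t + Uf h_{t-1} + bf)
        h~_t = tanh(Wh x_t + Uh (f_t * h_{t-1}) + bh)
        h_t  = (1 - f_t) * h_{t-1} + f_t * h~_t
    """
    f_t = sigmoid(Wf @ x_t + Uf @ h_prev + bf)
    h_tilde = np.tanh(Wh @ x_t + Uh @ (f_t * h_prev) + bh)
    return (1.0 - f_t) * h_prev + f_t * h_tilde

def attention_pool(H, W, b, u):
    """Additive attention over hidden states H (T x d), as used at each
    level of hierarchical attention networks: score each state against a
    learned context vector u, softmax, then take the weighted sum."""
    scores = np.tanh(H @ W + b) @ u        # (T,)
    alpha = np.exp(scores - scores.max())  # numerically stable softmax
    alpha /= alpha.sum()
    return alpha @ H                       # (d,) sentence/document vector

# Tiny shape check with random weights (illustrative only).
rng = np.random.default_rng(0)
d_in, d_h, d_att, T = 4, 3, 5, 6
h = np.zeros(d_h)
for _ in range(T):
    h = mgu_step(rng.standard_normal(d_in), h,
                 rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h),
                 rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h))
doc_vec = attention_pool(rng.standard_normal((T, d_h)),
                         rng.standard_normal((d_h, d_att)), np.zeros(d_att),
                         rng.standard_normal(d_att))
```

In the hierarchical setting described in the abstract, one such MGU-plus-attention layer would run over the words of each sentence, and a second layer over the resulting sentence vectors to produce the document representation.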
Acknowledgements
This work was supported by a National Natural Science Foundation of China (NSFC) grant funded by the Chinese government, Ministry of Science and Technology (No. 61672108).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Qu, Z., Wang, Y., Wang, X., Zheng, S. (2018). A Transfer Learning Based Hierarchical Attention Neural Network for Sentiment Classification. In: Tan, Y., Shi, Y., Tang, Q. (eds.) Data Mining and Big Data. DMBD 2018. Lecture Notes in Computer Science, vol. 10943. Springer, Cham. https://doi.org/10.1007/978-3-319-93803-5_36
DOI: https://doi.org/10.1007/978-3-319-93803-5_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93802-8
Online ISBN: 978-3-319-93803-5