Abstract
Bi-directional LSTM (BLSTM) often uses an Attention Mechanism (AM) to improve its ability to model sentences, but the additional parameters within the AM can complicate model selection and BLSTM training. To solve this problem, this paper redefines the AM from the novel perspective of quantum cognition and proposes a parameter-free Quantum Attention Mechanism (QAM). Furthermore, we give a quantum interpretation of BLSTM using the Two-State Vector Formalism (TSVF) and find a similarity between sentence understanding and quantum Weak Measurement (WM) under TSVF. The weak value derived from WM is employed to represent the attention paid to each word in a sentence. Experiments show that QAM-based BLSTM outperforms the common AM (CAM) [1] based BLSTM on most of the classification tasks discussed in this paper.
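To make the idea concrete, here is a minimal numerical sketch of weak-value attention. It assumes, purely for illustration, that the normalized forward BLSTM state plays the role of the pre-selected state, the normalized backward state the post-selected state, and the projector onto each (normalized) word vector the observable; the function names and this particular wiring are ours, not necessarily the paper's exact construction.

```python
import numpy as np

def weak_value(pre, post, A):
    # Weak value A_w = <post|A|pre> / <post|pre> (real-valued vectors here).
    return (post @ A @ pre) / (post @ pre)

def quantum_attention(forward_h, backward_h, word_vecs):
    # Parameter-free attention weights from weak values.
    # forward_h, backward_h: (d,) forward/backward BLSTM summary states.
    # word_vecs: (T, d), one vector per word in the sentence.
    pre = forward_h / np.linalg.norm(forward_h)     # pre-selected state (assumption)
    post = backward_h / np.linalg.norm(backward_h)  # post-selected state (assumption)
    scores = []
    for w in word_vecs:
        w = w / np.linalg.norm(w)
        P = np.outer(w, w)                          # observable: projector |w><w|
        scores.append(weak_value(pre, post, P))
    scores = np.abs(np.array(scores))               # weak values may lie outside [0, 1]
    return scores / scores.sum()                    # normalize into attention weights

# Toy usage with random vectors.
rng = np.random.default_rng(0)
d, T = 8, 5
attn = quantum_attention(rng.normal(size=d), rng.normal(size=d),
                         rng.normal(size=(T, d)))
print(attn, attn.sum())  # weights sum to 1
```

Because the weights come from a fixed formula over states the BLSTM already produces, no extra attention parameters need to be trained, which is the point of calling the mechanism parameter-free.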
Notes
- 2. In QT, the quantum probability space is built on a Hilbert space \(\mathbb {H}^n\), an abstract vector space possessing the structure of an inner product. A finite-dimensional space is sufficient for the work reported in this paper, so we restrict our treatment to a finite real space \(\mathbb {R}^n\). In Dirac's notation, a quantum state is written as a column vector \(| \varPsi \rangle \) (a ket), whose conjugate transpose is a row vector \(\langle \varPsi |\) (a bra); see the sketch after these notes.
- 4. http://nlp.stanford.edu/sentiment/ The data is provided at the phrase level, so both phrases and sentences are used to train the model, but only sentences are scored at test time [2, 4, 16]. The training set is therefore an order of magnitude larger than listed in Table 1.
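As note 2 states, over a finite real space a ket is simply a column vector and the corresponding bra is its transpose. A minimal numpy illustration (the state below is arbitrary):

```python
import numpy as np

psi = np.array([[1.0], [2.0], [2.0]]) / 3.0  # ket |psi>: a normalized column vector in R^3
bra = psi.T                                  # bra <psi|: conjugate transpose (plain transpose over the reals)

print(bra @ psi)  # inner product <psi|psi> -> [[1.]]
print(psi @ bra)  # outer product |psi><psi|, the projector onto |psi>
```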
References
Yang, Z., Yang, D., Dyer, C., He, X., Smola, A.J., Hovy, E.H.: Hierarchical attention networks for document classification. In: HLT-NAACL, pp. 1480–1489 (2016)
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C.: Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631–1642 (2013)
Liu, P., Qiu, X., Huang, X.: Recurrent neural network for text classification with multi-task learning. arXiv preprint arXiv:1605.05101 (2016)
Kalchbrenner, N., Grefenstette, E., Blunsom, P.: A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188 (2014)
Wang, Z., Busemeyer, J.R., Atmanspacher, H., Pothos, E.M.: The potential of using quantum theory to build models of cognition. Top. Cogn. Sci. 5(4), 672–688 (2013)
Bruza, P.D., Wang, Z., Busemeyer, J.R.: Quantum cognition: a new theoretical approach to psychology. Trends Cogn. Sci. 19(7), 383–393 (2015)
Aharonov, Y., Vaidman, L.: Complete description of a quantum system at a given time. J. Phys. A: Math. Gen. 24(10), 2315 (1991)
Ravon, T., Vaidman, L.: The three-box paradox revisited. J. Phys. A: Math. Theor. 40(11), 2873 (2007)
Gibran, B.: Causal realism in the philosophy of mind. Essays Philos. 15(2), 5 (2014)
Aharonov, Y., Vaidman, L.: The two-state vector formalism: an updated review. In: Muga, J., Mayato, R.S., Egusquiza, Í. (eds.) Time in Quantum Mechanics. Lecture Notes in Physics, vol. 734, pp. 399–447. Springer, Heidelberg (2008). doi:10.1007/978-3-540-73473-4_13
Aharonov, Y., Bergmann, P.G., Lebowitz, J.L.: Time symmetry in the quantum process of measurement. Phys. Rev. 134(6B), B1410 (1964)
Latta, R.L.: The Basic Humor Process: A Cognitive-shift Theory and the Case Against Incongruity, vol. 5. Walter de Gruyter (1999)
Tamir, B., Cohen, E.: Introduction to weak measurements and weak values. Quanta 2(1), 7–17 (2013)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Pang, B., Lee, L.: Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 115–124. Association for Computational Linguistics (2005)
Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: ICML 2014, pp. 1188–1196 (2014)
Pang, B., Lee, L.: A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, p. 271. Association for Computational Linguistics (2004)
Li, X., Roth, D.: Learning question classifiers. In: Proceedings of the 19th International Conference on Computational Linguistics, vol. 1, pp. 1–7. Association for Computational Linguistics (2002)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Socher, R., Pennington, J., Huang, E.H., Ng, A.Y., Manning, C.D.: Semi-supervised recursive autoencoders for predicting sentiment distributions. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 151–161. Association for Computational Linguistics (2011)
Socher, R., Huval, B., Manning, C.D., Ng, A.Y.: Semantic compositionality through recursive matrix-vector spaces. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1201–1211. Association for Computational Linguistics (2012)
Dong, L., Wei, F., Liu, S., Zhou, M., Xu, K.: A statistical parsing framework for sentiment classification. Comput. Linguist. (2015)
Nakagawa, T., Inui, K., Kurohashi, S.: Dependency tree-based sentiment classification using CRFs with hidden variables. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 786–794. Association for Computational Linguistics (2010)
Acknowledgments
This work is funded in part by the Chinese 863 Program (grant No. 2015AA015403), the Key Project of Tianjin Natural Science Foundation (grant No. 15JCZDJC31100), the Tianjin Younger Natural Science Foundation (grant No. 14JCQNJC00400), the Major Project of the Chinese National Social Science Fund (grant No. 14ZDB153), and the MSCA-ITN-ETN European Training Networks project QUARTZ (grant No. 721321).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Niu, X., Hou, Y., Wang, P. (2017). Bi-Directional LSTM with Quantum Attention Mechanism for Sentence Modeling. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S. (eds.) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol. 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_19
DOI: https://doi.org/10.1007/978-3-319-70096-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70095-3
Online ISBN: 978-3-319-70096-0
eBook Packages: Computer Science, Computer Science (R0)