Abstract
Neural attention-based sequence-to-sequence (seq2seq) models have achieved remarkable performance on NLP tasks such as image caption generation, paraphrase generation, and machine translation. The underlying framework for these models is usually a deep neural architecture comprising multi-layer encoder and decoder sub-networks. The performance of the decoding sub-network depends strongly on how well it extracts the relevant source-side contextual information. Conventional approaches consider only the outputs of the final encoding layer when computing the source contexts via a neural attention mechanism. However, given how information flows across the time steps within each encoder layer as well as from layer to layer, there is no guarantee that the information needed to build the source context is stored in the final encoding layer. Such approaches also fail to fully capture the structural composition of natural language. To address these limitations, this paper presents several new strategies for generating the contextual feature vector jointly across all encoding layers. The proposed strategies consistently outperform conventional approaches to neural attention computation on the task of paraphrase generation.
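To make the contrast concrete, below is a minimal PyTorch sketch of conventional last-layer attention next to one possible joint strategy that attends over every encoder layer and averages the per-layer contexts. The function names, the dot-product scoring, and the averaging combination are illustrative assumptions for this sketch, not the paper's exact formulations.

import torch
import torch.nn.functional as F

def last_layer_attention(decoder_state, encoder_layers):
    """Conventional attention: score the decoder state against the
    outputs of the final encoder layer only (dot-product scoring)."""
    top = encoder_layers[-1]                           # (batch, src_len, dim)
    scores = torch.bmm(top, decoder_state.unsqueeze(2)).squeeze(2)
    weights = F.softmax(scores, dim=-1)                # (batch, src_len)
    return torch.bmm(weights.unsqueeze(1), top).squeeze(1)  # (batch, dim)

def joint_layer_attention(decoder_state, encoder_layers):
    """One hypothetical joint strategy: attend over each encoder
    layer separately, then average the per-layer context vectors."""
    contexts = [last_layer_attention(decoder_state, [layer])
                for layer in encoder_layers]
    return torch.stack(contexts, dim=0).mean(dim=0)    # (batch, dim)

# Toy usage: a 3-layer encoder over a source sentence of length 7.
batch, src_len, dim, n_layers = 2, 7, 16, 3
layers = [torch.randn(batch, src_len, dim) for _ in range(n_layers)]
dec_state = torch.randn(batch, dim)
ctx_last = last_layer_attention(dec_state, layers)     # uses the top layer only
ctx_joint = joint_layer_attention(dec_state, layers)   # draws on all layers

The joint variant lets the decoder draw on lower-layer representations directly, rather than relying on that information surviving propagation to the top encoder layer, which is the limitation the abstract describes.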
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ampomah, I.K.E., McClean, S., Lin, Z., Hawe, G. (2019). JASs: Joint Attention Strategies for Paraphrase Generation. In: Métais, E., Meziane, F., Vadera, S., Sugumaran, V., Saraee, M. (eds) Natural Language Processing and Information Systems. NLDB 2019. Lecture Notes in Computer Science, vol 11608. Springer, Cham. https://doi.org/10.1007/978-3-030-23281-8_8
DOI: https://doi.org/10.1007/978-3-030-23281-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23280-1
Online ISBN: 978-3-030-23281-8
eBook Packages: Computer Science (R0)