Research article
DOI: 10.1145/3341069.3342970

Multi-attending Memory Network for Modeling Multi-turn Dialogue

Published: 22 June 2019

Abstract

Modeling and reasoning over the dialogue history is a central challenge in building a good multi-turn conversational agent. End-to-end memory networks with recurrent or gated architectures have shown promise for conversation modeling. However, they still suffer from relatively low computational efficiency due to their complex architectures, and they depend on costly strong supervision or fixed prior knowledge. This paper proposes a multi-head attention based end-to-end approach, the multi-attending memory network, which requires no additional information or knowledge and can effectively model and reason about multi-turn dialogue history. Specifically, a parallel multi-head attention mechanism models the conversational context by attending to different important sections of the full dialogue. A stacked architecture with shortcut connections then reasons over the memory (the result of context modeling). Experiments on the bAbI-dialog datasets demonstrate the effectiveness of the proposed approach.
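
To make the abstract's two mechanisms concrete, here is a minimal PyTorch sketch of the general idea: parallel multi-head attention attends to different parts of the embedded dialogue history, and stacked hops with shortcut connections reason over that memory. All class names, the bag-of-words utterance encoder, and the output layer are illustrative assumptions; the paper's actual architecture may differ.

import torch
import torch.nn as nn

class MultiAttendingHop(nn.Module):
    # One reasoning hop: parallel multi-head attention over the dialogue
    # memory, wrapped in a shortcut (residual) connection.
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, state, memory):
        # state:  (batch, 1, d_model)        current dialogue state
        # memory: (batch, n_turns, d_model)  embedded history utterances
        attended, _ = self.attn(state, memory, memory)
        return state + attended  # shortcut connection around the hop

class MultiAttendingMemoryNetwork(nn.Module):
    def __init__(self, vocab_size, d_model=128, n_heads=4, n_hops=3):
        super().__init__()
        # Bag-of-words utterance encoder (an assumption made for brevity).
        self.embed = nn.EmbeddingBag(vocab_size, d_model)
        self.hops = nn.ModuleList(
            MultiAttendingHop(d_model, n_heads) for _ in range(n_hops))
        # Scores responses over a candidate vocabulary (also an assumption).
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, history, query):
        # history: (batch, n_turns, n_words) token ids of past utterances
        # query:   (batch, n_words) token ids of the current utterance
        b, t, w = history.shape
        memory = self.embed(history.view(b * t, w)).view(b, t, -1)
        state = self.embed(query).unsqueeze(1)
        for hop in self.hops:            # stacked hops reason over memory
            state = hop(state, memory)
        return self.out(state.squeeze(1))

# Example: 2 dialogues, 5 past turns of 8 tokens each, vocabulary of 1000.
net = MultiAttendingMemoryNetwork(vocab_size=1000)
scores = net(torch.randint(0, 1000, (2, 5, 8)), torch.randint(0, 1000, (2, 8)))

The shortcut connection around each hop follows the residual/highway intuition the abstract alludes to: later hops refine, rather than overwrite, the dialogue state, which also eases gradient flow through the stack.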




    Published In

    HPCCT '19: Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference
    June 2019
    293 pages
    ISBN:9781450371858
    DOI:10.1145/3341069
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. multi-attending memory network
    2. multi-head attention
    3. multi-turn dialogue
    4. shortcut connections

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • the Strategy Priority Research Program of Chinese Academy of Sciences
    • the National Key Research and Development Program of China
    • the National Natural Science Foundation of China
    • the National Natural Science Foundation of China together with the National Research Foundation of Singapore

    Conference

    HPCCT 2019

