Article

A hierarchical latent variable encoder-decoder model for generating dialogues

Authors:

Iulian Vlad Serban,

Alessandro Sordoni,

Laurent Charlin,

Aaron Courville,

Yoshua Bengio

AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence

Pages 3295 - 3301

Published: 04 February 2017

Abstract

Sequential data often possesses hierarchical structure with complex dependencies between sub-sequences, such as those found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural network architectures. We evaluate model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both the generation of meaningful, long, and diverse responses and the maintenance of dialogue state.
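As a rough illustration of a latent variable that "spans a variable number of time steps", the sketch below draws one latent vector per utterance and shares it across every token of that utterance. It is not the authors' implementation: the function names, the latent dimension, and the fixed standard-normal prior are all assumptions made for illustration. The abstract does not specify the distribution or how it is conditioned on the dialogue history.

```python
import random


def sample_latent(mean, std, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1), independently per dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]


def attach_latents(dialogue, latent_dim=2, seed=0):
    """Sample one latent vector per utterance and pair it with each of its tokens.

    Assumption: a fixed standard-normal prior. A trained model would presumably
    condition the prior on the preceding utterances instead.
    """
    rng = random.Random(seed)
    steps = []
    for utterance in dialogue:
        z = sample_latent([0.0] * latent_dim, [1.0] * latent_dim, rng)
        # The same z is reused for every token, however long the utterance is.
        steps.extend((token, z) for token in utterance)
    return steps


# Hypothetical two-utterance dialogue with utterances of different lengths.
dialogue = [["hi", "there"], ["hello", ",", "how", "are", "you", "?"]]
steps = attach_latents(dialogue)
```

Here the first latent vector covers two time steps and the second covers six, so the number of steps each variable spans is set by the utterance boundaries rather than fixed in advance.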

Information

Published In

AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence

February 2017

5106 pages

Program Chairs:
Satinder Singh, University of Michigan
Shaul Markovitch, Technion-Israel Institute of Technology

Sponsors

Association for the Advancement of Artificial Intelligence
Amazon
Infosys
Facebook
IBM

Publisher

AAAI Press

Qualifiers

Article
