DOI: 10.5555/3298023.3298047
Article

A hierarchical latent variable encoder-decoder model for generating dialogues

Published: 04 February 2017

Abstract

Sequential data often possess hierarchical structure with complex dependencies between sub-sequences, such as those found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural network architectures. We evaluate model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both the generation of meaningful, long, and diverse responses and the maintenance of dialogue state.
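
The phrase "latent variables that span a variable number of time steps" can be illustrated with a small sketch. This is not the authors' implementation: it assumes one Gaussian latent vector per utterance, drawn with the reparameterization trick, and replaces the neural networks with toy arithmetic. The names `sample_latent` and `latents_by_time_step` and the context update rule are hypothetical and chosen only for illustration.

```python
import math
import random

def sample_latent(mean, log_var, rng):
    """Draw z ~ N(mean, diag(exp(log_var))) via the reparameterization trick."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

def latents_by_time_step(dialogue, latent_dim=4, seed=0):
    """Return one list per utterance holding the latent seen at each token step."""
    rng = random.Random(seed)
    # Toy prior parameters. A trained model would compute these from the
    # dialogue context with a neural network (assumption, not shown here).
    mean = [0.0] * latent_dim
    log_var = [0.0] * latent_dim
    spans = []
    for utterance in dialogue:
        z = sample_latent(mean, log_var, rng)   # one draw per utterance
        spans.append([z for _ in utterance])    # same z at every token step
        # Toy context update: move the next prior mean halfway towards z.
        mean = [0.5 * m + 0.5 * zi for m, zi in zip(mean, z)]
    return spans

dialogue = [["hi", "there"], ["how", "are", "you", "?"], ["fine"]]
spans = latents_by_time_step(dialogue)
print([len(span) for span in spans])  # [2, 4, 1]
```

Each latent vector is sampled once and then held fixed for as many token steps as its utterance contains, so the number of steps it covers varies with utterance length.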

Published In

AAAI'17: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence
February 2017
5106 pages

Sponsors

  • Association for the Advancement of Artificial Intelligence
  • Amazon
  • Infosys
  • Facebook
  • IBM

Publisher

AAAI Press

Qualifiers

  • Article

Cited By

  • (2023) Heterogeneous-branch collaborative learning for dialogue generation. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, pp. 13148-13156. DOI: 10.1609/aaai.v37i11.26544. Online publication date: 7-Feb-2023
  • (2023) EmpMFF: A Multi-factor Sequence Fusion Framework for Empathetic Response Generation. Proceedings of the ACM Web Conference 2023, pp. 1754-1764. DOI: 10.1145/3543507.3583438. Online publication date: 30-Apr-2023
  • (2022) Emotion Recognition with Conversational Generation Transfer. ACM Transactions on Asian and Low-Resource Language Information Processing 21:4, pp. 1-17. DOI: 10.1145/3494532. Online publication date: 19-Jan-2022
  • (2021) Dual-View Conditional Variational Auto-Encoder for Emotional Dialogue Generation. ACM Transactions on Asian and Low-Resource Language Information Processing 21:3, pp. 1-18. DOI: 10.1145/3481890. Online publication date: 13-Dec-2021
  • (2021) Text is NOT Enough. Proceedings of the 29th ACM International Conference on Multimedia, pp. 4287-4296. DOI: 10.1145/3474085.3475568. Online publication date: 17-Oct-2021
  • (2021) A Review on Question Generation from Natural Language Text. ACM Transactions on Information Systems 40:1, pp. 1-43. DOI: 10.1145/3468889. Online publication date: 8-Sep-2021
  • (2021) POSSCORE. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 1119-1129. DOI: 10.1145/3459637.3482463. Online publication date: 26-Oct-2021
  • (2021) Identifying Untrustworthy Samples. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 1598-1608. DOI: 10.1145/3459637.3482352. Online publication date: 26-Oct-2021
  • (2021) Target-guided Emotion-aware Chat Machine. ACM Transactions on Information Systems 39:4, pp. 1-24. DOI: 10.1145/3456414. Online publication date: 17-Aug-2021
  • (2021) Dialogue History Matters! Personalized Response Selection in Multi-Turn Retrieval-Based Chatbots. ACM Transactions on Information Systems 39:4, pp. 1-25. DOI: 10.1145/3453183. Online publication date: 17-Aug-2021