DOI: 10.1609/aaai.v33i01.33017402

Data augmentation for spoken language understanding via joint variational generation

Published: 27 January 2019

Abstract

Data scarcity is one of the main obstacles to domain adaptation in spoken language understanding (SLU), owing to the high cost of creating manually tagged SLU datasets. Recent work on neural text generation, particularly with latent variable models such as the variational autoencoder (VAE), has shown promising results in generating plausible and natural sentences. In this paper, we propose a novel generative architecture that leverages the generative power of latent variable models to jointly synthesize fully annotated utterances. Our experiments show that existing SLU models trained on the additional synthetic examples achieve performance gains. Our approach not only alleviates the data scarcity issue for many SLU datasets but also consistently improves language understanding performance across a variety of SLU models, as supported by extensive experiments and rigorous statistical testing.
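
To make the abstract's central idea concrete, the sketch below shows one way a joint variational autoencoder can synthesize fully annotated utterances: a single latent variable z is decoded into the utterance tokens, a slot tag for every token, and an intent label for the whole utterance. This is an illustrative reading, not the paper's exact architecture; the single-layer GRU encoder and decoder, the vocabulary and label sizes, and the choice to predict the intent directly from z are all assumptions made for the example.

```python
# Illustrative sketch only -- NOT the paper's exact model. A joint VAE whose
# single latent variable z generates three aligned outputs: the utterance
# tokens, a slot tag per token, and one intent label for the whole utterance.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointVAE(nn.Module):
    def __init__(self, vocab=1000, n_slots=20, n_intents=10,
                 emb=64, hid=128, zdim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.to_mu = nn.Linear(hid, zdim)
        self.to_logvar = nn.Linear(hid, zdim)
        self.z_to_h = nn.Linear(zdim, hid)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.word_head = nn.Linear(hid, vocab)         # reconstruct tokens
        self.slot_head = nn.Linear(hid, n_slots)       # slot tag per time step
        self.intent_head = nn.Linear(zdim, n_intents)  # intent from z itself

    def forward(self, tokens):
        x = self.embed(tokens)                         # (B, T, emb)
        _, h = self.encoder(x)                         # h: (1, B, hid)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        h0 = torch.tanh(self.z_to_h(z)).unsqueeze(0)
        dec, _ = self.decoder(x, h0)                   # teacher-forced decoding
        return (self.word_head(dec), self.slot_head(dec),
                self.intent_head(z), mu, logvar)

def vae_loss(word_logits, slot_logits, intent_logits,
             tokens, slot_tags, intent, mu, logvar):
    # Joint reconstruction: words + slots + intent, plus the usual KL term.
    rec = (F.cross_entropy(word_logits.flatten(0, 1), tokens.flatten())
           + F.cross_entropy(slot_logits.flatten(0, 1), slot_tags.flatten())
           + F.cross_entropy(intent_logits, intent))
    kl = -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))
    return rec + kl

# Toy smoke test on random "annotated utterances" (batch of 4, length 7).
model = JointVAE()
tokens = torch.randint(0, 1000, (4, 7))
slot_tags = torch.randint(0, 20, (4, 7))
intent = torch.randint(0, 10, (4,))
outputs = model(tokens)
print(vae_loss(*outputs[:3], tokens, slot_tags, intent, *outputs[3:]))
```

Once trained on a tagged SLU corpus, sampling z from the standard normal prior and decoding would yield new (utterance, slot sequence, intent) triples that can be appended to the training set, which is the augmentation strategy the abstract describes.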


Cited By

  • (2021) Pseudo Siamese Network for Few-shot Intent Generation. Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2005-2009. https://doi.org/10.1145/3404835.3462995. Online publication date: 11-Jul-2021.
  • (2020) Natural language understanding approaches based on joint task of intent detection and slot filling for IoT voice interaction. Neural Computing and Applications 32(20), pp. 16149-16166. https://doi.org/10.1007/s00521-020-04805-x. Online publication date: 13-Mar-2020.




          Published In

          AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence
          January 2019
          10088 pages
          ISBN: 978-1-57735-809-1

          Sponsors

          • Association for the Advancement of Artificial Intelligence

          Publisher

          AAAI Press


          Qualifiers

          • Research-article
          • Research
          • Refereed limited


          Article Metrics

          • Downloads (last 12 months): 23
          • Downloads (last 6 weeks): 2
          Reflects downloads up to 19 Dec 2024.

