
Robust Spoken Language Understanding with RL-Based Value Error Recovery

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12430)

Abstract

Spoken Language Understanding (SLU) aims to extract structured semantic representations (e.g., slot-value pairs) from speech-recognized text, and it therefore suffers from Automatic Speech Recognition (ASR) errors. To alleviate this problem, previous works either apply input adaptations to the speech-recognized text or correct ASR errors in predicted values by searching for the candidates most similar in pronunciation. However, these two methods are applied separately and independently. In this work, we propose a new robust SLU framework that guides SLU input adaptation with a rule-based value error recovery module. The framework consists of a slot tagging model and a rule-based value error recovery module. We seek an adapted slot tagging model that can extract potential slot-value pairs mentioned in ASR hypotheses and is suited to the existing value error recovery module. After value error recovery, we obtain a supervision signal (reward) by comparing the refined slot-value pairs with the annotations. Since the operations of value error recovery are non-differentiable, we use policy-gradient-based Reinforcement Learning (RL) to optimize the SLU model. Extensive experiments on the public CATSLU dataset show the effectiveness of the proposed approach, which improves the robustness of SLU and outperforms the baselines by significant margins.
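The training signal described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the exact form of the reward, so the set-level F1 used here is an assumption, and the function names (`f1_reward`, `sample_tags`, `reinforce_loss`) are hypothetical. The sketch samples a tag sequence from per-token scores, scores the resulting slot-value pairs against the annotations, and forms a REINFORCE-style loss, which needs only the reward value and so does not require the recovery step to be differentiable.

```python
import math
import random


def f1_reward(predicted, gold):
    """Assumed reward: F1 between refined and annotated (slot, value) pairs."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0  # nothing annotated and nothing predicted
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tags(logits_per_token, rng):
    """Sample one tag per token; return the tags and their total log-probability."""
    tags, log_prob = [], 0.0
    for logits in logits_per_token:
        probs = softmax(logits)
        tag = rng.choices(range(len(probs)), weights=probs)[0]
        tags.append(tag)
        log_prob += math.log(probs[tag])
    return tags, log_prob


def reinforce_loss(log_prob, reward, baseline=0.0):
    """Policy-gradient surrogate loss: minimizing it raises the
    probability of sampled sequences whose reward exceeds the baseline."""
    return -(reward - baseline) * log_prob


# Toy usage with made-up scores and a made-up annotation.
rng = random.Random(0)
tags, log_prob = sample_tags([[0.2, 1.5], [1.0, 0.1]], rng)
reward = f1_reward({("address", "zhongshan road")}, {("address", "zhongshan road")})
loss = reinforce_loss(log_prob, reward, baseline=0.5)
```

In a real model the log-probability would come from the slot tagger and the loss would be back-propagated through it; the baseline is a common variance-reduction choice, included here as an assumption.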

C. Liu and S. Zhu—Contributed equally to this work.

Notes

  1. All possible value candidates of each slot are provided in the domain ontology.

  2. E.g., the value candidate set for slot address can be all available addresses saved in the database of a dialogue system.

  3. https://sites.google.com/view/CATSLU.

  4. https://dumps.wikimedia.org/zhwiki/latest.
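The first two notes describe value candidates drawn from a domain ontology. A minimal sketch of how a predicted value might be mapped to its closest candidate follows. It is an illustration under stated assumptions, not the paper's recovery rules: the paper searches by pronunciation similarity, whereas this sketch substitutes character-level similarity from Python's standard `difflib`, and the `ontology` contents, the `recover_value` name, and the `threshold` parameter are all hypothetical.

```python
from difflib import SequenceMatcher


def recover_value(slot, value, ontology, threshold=0.5):
    """Return the ontology candidate for `slot` most similar to `value`,
    or None if the slot is unknown or no candidate reaches `threshold`.

    Similarity here is character-based (a stand-in assumption); a
    pronunciation-based system would compare phonetic transcriptions.
    """
    best, best_score = None, 0.0
    for candidate in ontology.get(slot, []):
        score = SequenceMatcher(None, value, candidate).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


# Hypothetical ontology: the candidate set for slot "address".
ontology = {"address": ["zhongshan road", "nanjing road"]}
refined = recover_value("address", "zhongsan road", ontology)
```

Returning `None` for poor matches lets the caller discard a slot-value pair rather than force it onto an unrelated candidate.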

Acknowledgements

We thank the anonymous reviewers for their thoughtful comments. This work has been supported by the National Key Research and Development Program of China (Grant No. 2017YFB1002102) and Shanghai Jiao Tong University Scientific and Technological Innovation Funds (YG2020YQ01).

Author information

Corresponding authors

Correspondence to Lu Chen or Kai Yu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, C., Zhu, S., Chen, L., Yu, K. (2020). Robust Spoken Language Understanding with RL-Based Value Error Recovery. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_7

  • DOI: https://doi.org/10.1007/978-3-030-60450-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60449-3

  • Online ISBN: 978-3-030-60450-9

  • eBook Packages: Computer Science, Computer Science (R0)
