Zorrilla et al., 2021

Audio embeddings help to learn better dialogue policies

Document ID: 9682646436731070562
Author: Zorrilla A; Torres M; Cuayáhuitl H
Publication year: 2021
Publication venue: 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

Snippet

Neural transformer architectures have gained a lot of interest for text-based dialogue management in the last few years. They have shown high learning capabilities for open domain dialogue with huge amounts of data and also for domain adaptation in task-oriented …

Full text: repository.lincoln.ac.uk (PDF)

Classifications

- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
          - G10L15/18—Speech classification or search using natural language modelling
            - G10L15/1822—Parsing for meaning understanding
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
          - G10L15/065—Adaptation
            - G10L15/07—Adaptation to the speaker
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/20—Handling natural language data
          - G06F17/27—Automatic analysis, e.g. parsing
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/02—Knowledge representation
          - G06N5/022—Knowledge engineering, knowledge acquisition
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N5/00—Computer systems utilising knowledge based models
        - G06N5/04—Inference methods or devices
- G—PHYSICS
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computer systems based on biological models
        - G06N3/02—Computer systems based on biological models using neural network models

Similar Documents

Publication	Publication Date	Title
Baevski et al.	2019	vq-wav2vec: Self-supervised learning of discrete speech representations
Liu et al.	2017	An end-to-end trainable neural network model with belief tracking for task-oriented dialog
Serban et al.	2017	A deep reinforcement learning chatbot
US10446148B2 (en)	2019-10-15	Dialogue system, a dialogue method and a method of adapting a dialogue system
US11669699B2 (en)	2023-06-06	Systems and methods for composed variational natural language generation
CN110782870A (en)	2020-02-11	Speech synthesis method, speech synthesis device, electronic equipment and storage medium
Shi et al.	2019	Unsupervised dialog structure learning
US11475220B2 (en)	2022-10-18	Predicting joint intent-slot structure
Zheng et al.	2016	Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach
Zorrilla et al.	2021	Audio embeddings help to learn better dialogue policies
Zorrilla et al.	2022	Audio Embedding-Aware Dialogue Policy Learning
Neelakantan et al.	2019	Neural assistant: Joint action prediction, response generation, and latent knowledge reasoning
CN110032736A (en)	2019-07-19	A kind of text analyzing method, apparatus and storage medium
CN117033582A (en)	2023-11-10	Training method and device for dialogue model, electronic equipment and storage medium
CN114386426B (en)	2023-01-13	Gold medal speaking skill recommendation method and device based on multivariate semantic fusion
Sheikhan et al.	2015	Structure and weights optimisation of a modified Elman network emotion classifier using hybrid computational intelligence algorithms: a comparative study
Yerukola et al.	2021	Data augmentation for voice-assistant NLU using BERT-based interchangeable rephrase
Dadas et al.	2019	A deep learning model with data enrichment for intent detection and slot filling
Kreyssig	2018	Deep learning for user simulation in a dialogue system
Lee et al.	2013	An integrated neural network model for domain action determination in goal-oriented dialogues
Liu et al.	2023	Prompt pool based class-incremental continual learning for dialog state tracking
Yamazaki et al.	2020	Filler prediction based on bidirectional LSTM for generation of natural response of spoken dialog
Orozko et al.	2015	Online learning of stochastic bi-automaton to model dialogues
Kawano et al.	2021	Controlled Neural Response Generation by Given Dialogue Acts Based on Label-aware Adversarial Learning
Forsati et al.	2010	An efficient meta heuristic algorithm for POS-tagging