Audio embeddings help to learn better dialogue policies
Zorrilla et al., 2021
- Document ID
- 9682646436731070562
- Author
- Zorrilla A
- Torres M
- Cuayáhuitl H
- Publication year
- 2021
- Publication venue
- 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Snippet
Neural transformer architectures have gained a lot of interest for text-based dialogue management in the last few years. They have shown high learning capabilities for open domain dialogue with huge amounts of data and also for domain adaptation in task-oriented …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Baevski et al. | vq-wav2vec: Self-supervised learning of discrete speech representations | |
Liu et al. | An end-to-end trainable neural network model with belief tracking for task-oriented dialog | |
Serban et al. | A deep reinforcement learning chatbot | |
US10446148B2 (en) | Dialogue system, a dialogue method and a method of adapting a dialogue system | |
US11669699B2 (en) | Systems and methods for composed variational natural language generation | |
CN110782870A (en) | Speech synthesis method, speech synthesis device, electronic equipment and storage medium | |
Shi et al. | Unsupervised dialog structure learning | |
US11475220B2 (en) | Predicting joint intent-slot structure | |
Zheng et al. | Improving Prosodic Boundaries Prediction for Mandarin Speech Synthesis by Using Enhanced Embedding Feature and Model Fusion Approach. | |
Zorrilla et al. | Audio embeddings help to learn better dialogue policies | |
Zorrilla et al. | Audio Embedding-Aware Dialogue Policy Learning | |
Neelakantan et al. | Neural assistant: Joint action prediction, response generation, and latent knowledge reasoning | |
CN110032736A (en) | A kind of text analyzing method, apparatus and storage medium | |
CN117033582A (en) | Training method and device for dialogue model, electronic equipment and storage medium | |
CN114386426B (en) | Gold medal speaking skill recommendation method and device based on multivariate semantic fusion | |
Sheikhan et al. | Structure and weights optimisation of a modified Elman network emotion classifier using hybrid computational intelligence algorithms: a comparative study | |
Yerukola et al. | Data augmentation for voice-assistant NLU using bert-based interchangeable rephrase | |
Dadas et al. | A deep learning model with data enrichment for intent detection and slot filling | |
Kreyssig | Deep learning for user simulation in a dialogue system | |
Lee et al. | An integrated neural network model for domain action determination in goal-oriented dialogues | |
Liu et al. | Prompt pool based class-incremental continual learning for dialog state tracking | |
Yamazaki et al. | Filler prediction based on bidirectional lstm for generation of natural response of spoken dialog | |
Orozko et al. | Online learning of stochastic bi-automaton to model dialogues | |
Kawano et al. | Controlled Neural Response Generation by Given Dialogue Acts Based on Label-aware Adversarial Learning | |
Forsati et al. | An efficient meta heuristic algorithm for pos-tagging |