Bevacqua et al., 2010 - Google Patents
Multimodal backchannels for embodied conversational agents (Bevacqua et al., 2010)
- Document ID
- 6682594803593466542
- Author
- Bevacqua E
- Pammi S
- Hyniewska S
- Schröder M
- Pelachaud C
- Publication year
- 2010
- Publication venue
- Intelligent Virtual Agents: 10th International Conference, IVA 2010, Philadelphia, PA, USA, September 20-22, 2010. Proceedings 10
Snippet
One of the most desirable characteristics of an Embodied Conversational Agent (ECA) is the capability of interacting with users in a human-like manner. While listening to a user, an ECA should be able to provide backchannel signals through visual and acoustic modalities. In …
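The snippet describes the core behaviour: while the user speaks, the agent decides when to emit a backchannel and realises it through visual and acoustic modalities. As a minimal illustrative sketch (not the model from the paper), the Python fragment below fires a multimodal backchannel when the user pauses long enough and enough time has passed since the last one; the class names, cue lists, and thresholds are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Illustrative, rule-based trigger for multimodal backchannels.
# Hypothetical sketch only; not the model described in Bevacqua et al. (2010).

@dataclass
class Backchannel:
    visual: str    # e.g. "nod", "smile", "raise_eyebrows"
    acoustic: str  # e.g. "mm-hm", "yeah", or "" for a visual-only signal

VISUAL_CUES = ["nod", "smile", "raise_eyebrows"]
ACOUSTIC_CUES = ["mm-hm", "yeah", "ok", ""]

PAUSE_THRESHOLD_S = 0.7  # assumed: pause length that invites listener feedback
MIN_GAP_S = 2.0          # assumed: minimum time between two backchannels

class BackchannelTrigger:
    """Decides when the listening agent should produce a backchannel."""

    def __init__(self) -> None:
        self._last_bc_time = float("-inf")

    def on_pause(self, pause_length_s: float, now_s: float) -> Optional[Backchannel]:
        """Return a backchannel to perform, or None if the agent should stay quiet."""
        if pause_length_s < PAUSE_THRESHOLD_S:
            return None  # pause too short to react to
        if now_s - self._last_bc_time < MIN_GAP_S:
            return None  # avoid backchanneling too often
        self._last_bc_time = now_s
        return Backchannel(visual=random.choice(VISUAL_CUES),
                           acoustic=random.choice(ACOUSTIC_CUES))

if __name__ == "__main__":
    trigger = BackchannelTrigger()
    # Simulated (pause length, timestamp) pairs from a speech/pause detector.
    for pause, t in [(0.3, 1.0), (0.9, 3.5), (1.1, 4.2), (0.8, 7.0)]:
        bc = trigger.on_pause(pause, t)
        if bc is not None:
            print(f"t={t:.1f}s -> visual: {bc.visual!r}, acoustic: {bc.acoustic!r}")
```

Running the script prints a decision for each simulated pause; a real agent would replace the pause rule with whatever trigger model drives its listening behaviour.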
Classifications
All classifications fall under G—PHYSICS > G10—MUSICAL INSTRUMENTS; ACOUSTICS > G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING:
- G10L13/00—Speech synthesis; Text to speech systems
  - G10L13/02—Methods for producing synthetic speech; Speech synthesisers
    - G10L13/033—Voice editing, e.g. manipulating the voice of the synthesiser
- G10L15/00—Speech recognition
  - G10L15/08—Speech classification or search
    - G10L15/18—Speech classification or search using natural language modelling
      - G10L15/1822—Parsing for meaning understanding
    - G10L2015/088—Word spotting
  - G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
  - G10L15/24—Speech recognition using non-acoustical features
  - G10L15/26—Speech to text systems
  - G10L15/28—Constructional details of speech recognition systems
- G10L17/00—Speaker identification or verification
  - G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
  - G10L21/003—Changing voice quality, e.g. pitch or formants
    - G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
      - G10L21/013—Adapting to target pitch
  - G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    - G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
Similar Documents
Publication | Title |
---|---|
Bevacqua et al. | Multimodal backchannels for embodied conversational agents |
Bavelas et al. | Conversational hand gestures and facial displays in face-to-face dialogue |
US20200279553A1 (en) | Linguistic style matching agent |
Bevacqua et al. | A listening agent exhibiting variable behaviour |
WO2017200074A1 (en) | Dialog method, dialog system, dialog device, and program |
Poppe et al. | Backchannel strategies for artificial listeners |
Rojc et al. | The TTS-driven affective embodied conversational agent EVA, based on a novel conversational-behavior generation algorithm |
Sakai et al. | Online speech-driven head motion generating system and evaluation on a tele-operated robot |
WO2018163646A1 (en) | Dialogue method, dialogue system, dialogue device, and program |
Nagy et al. | A framework for integrating gesture generation models into interactive conversational agents |
Prasad et al. | Robots that can hear, understand and talk |
Mihoub et al. | Social behavior modeling based on incremental discrete hidden Markov models |
Bertrand et al. | French face-to-face interaction: repetition as a multimodal resource |
Gasparini et al. | Sentiment recognition of Italian elderly through domain adaptation on cross-corpus speech dataset |
Lee et al. | Learning a model of speaker head nods using gesture corpora |
Gibbon | Gesture theory is linguistics: On modelling multimodality as prosody |
WO2017200077A1 (en) | Dialog method, dialog system, dialog device, and program |
Bevacqua | Computational model of listener behavior for embodied conversational agents |
Urbain et al. | Laugh machine |
Granström et al. | Inside out–acoustic and visual aspects of verbal and non-verbal communication |
Liu et al. | Speech-gesture GAN: gesture generation for robots and embodied agents |
Kim et al. | Introduction to the special issue on auditory-visual expressive speech and gesture in humans and machines |
Lagha et al. | Understanding prosodic pauses in sign language from motion-capture and video-data |
Al Moubayed et al. | Multimodal feedback from robots and agents in a storytelling experiment |
Pammi | Synthesis of listener vocalizations: towards interactive speech synthesis |