Metallinou et al., 2012 - Google Patents

Context-sensitive learning for enhanced audiovisual emotion classification

Document ID: 15072255530110625130
Author: Metallinou A; Wollmer M; Katsamanis A; Eyben F; Schuller B; Narayanan S
Publication year: 2012
Publication venue: IEEE Transactions on Affective Computing

Snippet

Human emotional expression tends to evolve in a structured manner, in the sense that certain emotional evolution patterns, i.e., anger to anger, are more probable than others, e.g., anger to happiness. Furthermore, the perception of an emotional display can be affected by …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Taking into account non-speech characteristics
- G10L2015/228—Taking into account non-speech characteristics of application context
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/06—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
- G10L21/10—Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines

Similar Documents

Publication	Publication Date	Title
Metallinou et al.	2012	Context-sensitive learning for enhanced audiovisual emotion classification
Poria et al.	2018	Multimodal sentiment analysis: Addressing key issues and setting up the baselines
US11282297B2 (en)	2022-03-22	System and method for visual analysis of emotional coherence in videos
Poria et al.	2017	A review of affective computing: From unimodal analysis to multimodal fusion
Perez-Gaspar et al.	2016	Multimodal emotion recognition with evolutionary computation for human-robot interaction
Mower et al.	2009	Interpreting ambiguous emotional expressions
Hema et al.	2023	Emotional speech recognition using CNN and deep learning techniques
Wu et al.	2014	Survey on audiovisual emotion recognition: databases, features, and data fusion strategies
Triantafyllopoulos et al.	2023	An overview of affective speech synthesis and conversion in the deep learning era
JP2017527926A (en)	2017-09-21	Generation of computer response to social conversation input
Moore et al.	2014	Word-level emotion recognition using high-level features
Fu et al.	2021	CONSK-GCN: conversational semantic-and knowledge-oriented graph convolutional network for multimodal emotion recognition
Liang et al.	2018	Computational modeling of human multimodal language: The MOSEI dataset and interpretable dynamic fusion
Wang et al.	2019	Comic-guided speech synthesis
Zhang et al.	2020	Multimodal Deception Detection Using Automatically Extracted Acoustic, Visual, and Lexical Features
Wei et al.	2014	Exploiting psychological factors for interaction style recognition in spoken conversation
Hoque et al.	2006	Robust recognition of emotion from speech
Qadri et al.	2019	A critical insight into multi-languages speech emotion databases
Kadali et al.	2020	Studies on paralinguistic speech sounds
Cambria et al.	2019	Speaker-independent multimodal sentiment analysis for big data
Al-Saadawi et al.	2024	A systematic review of trimodal affective computing approaches: Text, audio, and visual integration in emotion recognition and sentiment analysis
Kim et al.	2018	Automatic temporal ranking of children’s engagement levels using multi-modal cues
Novais	2022	A framework for emotion and sentiment predicting supported in ensembles
Getahun et al.	2016	Emotion identification from spontaneous communication
CN114627898A (en)	2022-06-14	Voice conversion method, apparatus, computer device, storage medium and program product