Kumar et al., 2024 - Google Patents

Interpretable multimodal emotion recognition using hybrid fusion of speech and image data

Document ID: 1135947120697567032
Authors: Kumar P, Malik S, Raman B
Publication year: 2024
Publication venue: Multimedia Tools and Applications

Snippet

This paper proposes a multimodal emotion recognition system based on hybrid fusion that classifies the emotions depicted by speech utterances and corresponding images into discrete classes. A new interpretability technique has been developed to identify the …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6267 Classification techniques
    • G06K9/6268 Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G06N99/005 Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6267 Classification techniques
    • G06K9/6279 Classification techniques relating to the number of classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/02 Knowledge representation
    • G06N5/022 Knowledge engineering, knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computer systems utilising knowledge based models
    • G06N5/04 Inference methods or devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/18 Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computer systems based on specific mathematical models
    • G06N7/005 Probabilistic networks

Similar Documents

Publication, Title
Gheisari et al. Deep learning: Applications, architectures, models, tools, and frameworks: A comprehensive survey
Zhu et al. Multimodal sentiment analysis based on fusion methods: A survey
Alam et al. Survey on deep neural networks in speech and vision systems
Zhang et al. A quantum-like multimodal network framework for modeling interaction dynamics in multiparty conversational sentiment analysis
Niu et al. A review on the attention mechanism of deep learning
Wadawadagi et al. Sentiment analysis with deep neural networks: comparative study and performance assessment
Geetha et al. Multimodal Emotion Recognition with deep learning: advancements, challenges, and future directions
Kumar et al. Interpretable multimodal emotion recognition using hybrid fusion of speech and image data
Zulqarnain et al. An efficient two-state GRU based on feature attention mechanism for sentiment analysis
Halvardsson et al. Interpretation of swedish sign language using convolutional neural networks and transfer learning
Hofmann et al. Innovating with artificial intelligence: capturing the constructive functional capabilities of deep generative learning
Kommineni et al. Attention-based Bayesian inferential imagery captioning maker
Wankhade et al. MAPA BiLSTM-BERT: multi-aspects position aware attention for aspect level sentiment analysis
Sharma et al. Multilevel attention and relation network based image captioning model
Lei et al. A multi-level mesh mutual attention model for visual question answering
Sreevidya et al. Elder emotion classification through multimodal fusion of intermediate layers and cross-modal transfer learning
Paul et al. A context-sensitive multi-tier deep learning framework for multimodal sentiment analysis
Wu et al. Sentimental visual captioning using multimodal transformer
Wieser et al. Understanding auditory representations of emotional expressions with neural networks
Chatterjee et al. Class-biased sarcasm detection using BiLSTM variational autoencoder-based synthetic oversampling
Yuan [Retracted] A Classroom Emotion Recognition Model Based on a Convolutional Neural Network Speech Emotion Algorithm
Ghorbanali et al. Capsule network-based deep ensemble transfer learning for multimodal sentiment analysis
Dixit et al. Deep CNN with late fusion for real time multimodal emotion recognition
Yang et al. SMFNM: Semi-supervised multimodal fusion network with main-modal for real-time emotion recognition in conversations
Jia et al. Multimodal emotion distribution learning