CN112800998A - Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA - Google Patents
- Publication number
- CN112800998A (application number CN202110159085.8A)
- Authority
- CN
- China
- Prior art keywords
- emotion
- expression
- electroencephalogram
- vector
- feature vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N3/045—Combinations of networks
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06F2218/08—Feature extraction
- G06F2218/12—Classification; Matching
Abstract
The invention discloses a multi-modal emotion recognition method and system that fuse an attention mechanism with discriminative multiset canonical correlation analysis (DMCCA). The method comprises the following steps: extracting electroencephalogram signal features, peripheral physiological signal features and expression features from preprocessed electroencephalogram signals, peripheral physiological signals and facial expression videos, respectively; extracting discriminative electroencephalogram, peripheral physiological and expression emotion features with attention mechanism modules; obtaining an electroencephalogram-peripheral physiological-expression multi-modal emotion feature by applying the DMCCA method to the electroencephalogram, peripheral physiological and expression emotion features; and classifying the multi-modal emotion feature with a classifier. The method uses the attention mechanism to focus selectively on the emotion-discriminative features within each modality and, combined with DMCCA, makes full use of the correlation and complementarity among the emotion features of different modalities, so that the accuracy and robustness of emotion recognition can be effectively improved.
Description
Technical Field
The invention relates to the technical field of emotion recognition and artificial intelligence, and in particular to a multi-modal emotion recognition method and system that fuse an attention mechanism with discriminative multiset canonical correlation analysis (DMCCA).
Background
Human emotion is a psychological and physiological state that accompanies the process of human consciousness and plays an important role in interpersonal communication. With the continuous progress of technologies such as artificial intelligence, people pay increasing attention to more intelligent and humanized human-computer interaction (HCI). Expectations of machine intelligence keep rising: machines are expected to perceive, understand and even express emotion, to realize humanized human-computer interaction and to serve human beings better. Emotion recognition is a branch of affective computing and a fundamental, core technology for realizing human-computer emotional interaction; it has become a research hotspot in computer science, cognitive science, artificial intelligence and related fields, and has attracted wide attention from both academia and industry. For example, in clinical care, if the emotional state of a patient, especially a patient with an expression disorder, can be known, different care measures can be taken to improve the quality of care. In addition, there is growing interest in applications such as psychological and behavioral monitoring of patients with mental disorders and friendly human-machine interaction with emotional robots.
In the past, much emotion recognition research focused on recognizing human emotional states from a single modality, such as speech-based emotion recognition and facial-expression-based emotion recognition. The emotion information carried by speech or expression alone is incomplete and easily disturbed by external factors: facial expression recognition is sensitive to occlusion and illumination changes, while speech-based emotion recognition is affected by environmental noise and by voice differences between subjects. Moreover, people sometimes put on a smile or keep silent to hide their real emotions, so facial expressions and body postures can be deceptive, and speech-based methods fail entirely when a person is not speaking. Single-modality emotion recognition therefore has clear limitations. For these reasons, more and more researchers are turning to emotion recognition based on multi-modal information fusion, in the expectation that the complementarity between different modalities can be exploited to build a robust emotion recognition model and achieve higher recognition accuracy.
Currently, the more common information fusion strategies in multi-modal emotion recognition research are decision-level fusion and feature-level fusion. Decision-level fusion usually starts from the recognition result of each individual modality and then makes a decision according to predefined rules, such as the mean (Mean) rule, the sum (Sum) rule, the maximum (Max) rule or majority voting, to obtain the final recognition result. Decision-level fusion accounts for the different contributions of different modalities to emotion recognition, but it ignores the correlation between them; its performance depends not only on the recognition rate of each single modality but also on the decision-level fusion algorithm. Feature-level fusion combines the emotion features of several modalities into one fused feature vector and thereby exploits the complementarity of different modalities; however, how to weight the emotion features of different modalities so as to reflect their different roles in emotion classification is the key to multi-modal feature fusion and remains an open and challenging problem.
Disclosure of Invention
The purpose of the invention is as follows: aiming at the low accuracy and poor robustness of single-modality emotion recognition and at the shortcomings of existing multi-modal emotion feature fusion methods, the invention provides a multi-modal emotion recognition method and system that fuse an attention mechanism with discriminative multiset canonical correlation analysis (DMCCA).
The technical scheme is as follows: to achieve the above purpose, the invention adopts the following technical scheme:
a multi-modal emotion recognition method fusing an attention mechanism and DMCCA comprises the following steps:
(1) extracting electroencephalogram signal feature vectors and expression feature vectors from the preprocessed electroencephalogram signals and facial expression videos by using respective trained neural network models, and extracting peripheral physiological signal feature vectors from the preprocessed peripheral physiological signals by extracting signal waveform descriptors and statistical features thereof;
(2) mapping the electroencephalogram signal feature vector, the peripheral physiological signal feature vector and the expression feature vector into a plurality of groups of feature vectors through linear transformation matrixes respectively, determining importance weights of different feature vector groups by using an attention mechanism module respectively, and forming an electroencephalogram emotion feature vector, a peripheral physiological emotion feature vector and an expression emotion feature vector which have the same dimension and are discriminating through weighting fusion;
(3) determining a projection matrix for each emotion feature vector by applying discriminative multiset canonical correlation analysis (DMCCA) to the electroencephalogram, peripheral physiological and expression emotion feature vectors, maximizing the correlation among the emotion features of different modalities for samples of the same class; projecting each emotion feature vector into a common subspace and adding the projections to obtain the fused electroencephalogram-peripheral physiological-expression multi-modal emotion feature vector;
(4) classifying the multi-modal emotion feature vector with a classifier to obtain the emotion category.
Further, the specific steps of extracting discriminating electroencephalogram emotional characteristics, peripheral physiological emotional characteristics and expression emotional characteristics by using an attention mechanism module in the step (2) comprise:
(2.1) representing the electroencephalogram signal features extracted in step (1) in matrix form as F^{(1)}, and mapping them through a linear transformation matrix W^{(1)} into M_1 groups of feature vectors, 4 ≤ M_1 ≤ 16, where each group feature vector has dimension N, 16 ≤ N ≤ 64; let E^{(1)} = [e_1^{(1)}, e_2^{(1)}, ..., e_{M_1}^{(1)}]^T, then the linear transformation is expressed as:

E^{(1)} = (F^{(1)})^T W^{(1)}

wherein the superscript (1) denotes the electroencephalogram modality and T denotes transposition;

determining the importance weights of the different feature vector groups with a first attention mechanism module, and forming the discriminative electroencephalogram emotion feature vector by weighted fusion, wherein the weight α_r^{(1)} of the r-th group of electroencephalogram signal feature vectors and the electroencephalogram emotion feature vector x^{(1)} are expressed as:

α_r^{(1)} = exp((u^{(1)})^T e_r^{(1)}) / Σ_{k=1}^{M_1} exp((u^{(1)})^T e_k^{(1)}),    x^{(1)} = Σ_{r=1}^{M_1} α_r^{(1)} e_r^{(1)}

wherein r = 1, 2, ..., M_1, e_r^{(1)} denotes the r-th group of electroencephalogram signal feature vectors, u^{(1)} is a trainable linear transformation parameter vector, and exp(·) denotes the exponential function with the natural constant e as its base;
(2.2) representing the peripheral physiological signal features extracted in step (1) in matrix form as F^{(2)}, and mapping them through a linear transformation matrix W^{(2)} into M_2 groups of feature vectors, 4 ≤ M_2 ≤ 16; let E^{(2)} = [e_1^{(2)}, e_2^{(2)}, ..., e_{M_2}^{(2)}]^T, then the linear transformation is expressed as:

E^{(2)} = (F^{(2)})^T W^{(2)}

wherein the superscript (2) denotes the peripheral physiological modality;

determining the importance weights of the different feature vector groups with a second attention mechanism module, and forming the discriminative peripheral physiological emotion feature vector by weighted fusion, wherein the weight α_s^{(2)} of the s-th group of peripheral physiological signal feature vectors and the peripheral physiological emotion feature vector x^{(2)} are expressed as:

α_s^{(2)} = exp((u^{(2)})^T e_s^{(2)}) / Σ_{k=1}^{M_2} exp((u^{(2)})^T e_k^{(2)}),    x^{(2)} = Σ_{s=1}^{M_2} α_s^{(2)} e_s^{(2)}

wherein s = 1, 2, ..., M_2, e_s^{(2)} denotes the s-th group of peripheral physiological signal feature vectors, and u^{(2)} is a trainable linear transformation parameter vector;
(2.3) representing the expression features extracted in step (1) in matrix form as F^{(3)}, and mapping them through a linear transformation matrix W^{(3)} into M_3 groups of feature vectors, 4 ≤ M_3 ≤ 16; let E^{(3)} = [e_1^{(3)}, e_2^{(3)}, ..., e_{M_3}^{(3)}]^T, then the linear transformation is expressed as:

E^{(3)} = (F^{(3)})^T W^{(3)}

wherein the superscript (3) denotes the expression modality;

determining the importance weights of the different feature vector groups with a third attention mechanism module, and forming the discriminative expression emotion feature vector by weighted fusion, wherein the weight α_t^{(3)} of the t-th group of expression feature vectors and the expression emotion feature vector x^{(3)} are expressed as:

α_t^{(3)} = exp((u^{(3)})^T e_t^{(3)}) / Σ_{k=1}^{M_3} exp((u^{(3)})^T e_k^{(3)}),    x^{(3)} = Σ_{t=1}^{M_3} α_t^{(3)} e_t^{(3)}

wherein t = 1, 2, ..., M_3, e_t^{(3)} denotes the t-th group of expression feature vectors, and u^{(3)} is a trainable linear transformation parameter vector.
Further, the step (3) specifically comprises the following sub-steps:
(3.1) acquiring the DMCCA projection matrices Ω, Φ and Ψ obtained through training, corresponding respectively to the electroencephalogram, peripheral physiological and expression emotion features, each of size N × d with 32 ≤ d ≤ 128;
(3.2) using the projection matrices Ω, Φ and Ψ respectively to project the electroencephalogram emotion feature vector x^{(1)}, the peripheral physiological emotion feature vector x^{(2)} and the expression emotion feature vector x^{(3)} extracted in step (2) into a d-dimensional common subspace, where the projection of the electroencephalogram emotion feature vector x^{(1)} into the d-dimensional common subspace is Ω^T x^{(1)}, the projection of the peripheral physiological emotion feature vector x^{(2)} is Φ^T x^{(2)}, and the projection of the expression emotion feature vector x^{(3)} is Ψ^T x^{(3)};

(3.3) fusing Ω^T x^{(1)}, Φ^T x^{(2)} and Ψ^T x^{(3)} by addition to obtain the electroencephalogram-peripheral physiological-expression multi-modal emotion feature vector Ω^T x^{(1)} + Φ^T x^{(2)} + Ψ^T x^{(3)}.
Further, the projection matrices Ω, Φ, and Ψ in step (3.1) are obtained by training in the following steps:
(3.1.1) extracting training samples of every emotion category from the training sample set and generating 3 groups of emotion feature vectors X^{(i)} = [x_1^{(i)}, x_2^{(i)}, ..., x_M^{(i)}], where x_m^{(i)} is an N-dimensional feature vector, M is the number of training samples, i = 1, 2, 3 and m = 1, 2, ..., M; let i = 1 denote the electroencephalogram modality, i = 2 the peripheral physiological modality and i = 3 the expression modality, so that x_m^{(1)} denotes an electroencephalogram emotion feature vector, x_m^{(2)} a peripheral physiological emotion feature vector and x_m^{(3)} an expression emotion feature vector;
(3.1.2) calculating the mean of the column vectors of X^{(i)} and performing a centering operation on X^{(i)};
(3.1.3) based on the idea of discriminative multiset canonical correlation analysis (DMCCA), solving for a group of projection matrices Ω, Φ and Ψ such that the linear correlation of same-class samples in the common projection space is maximized while the inter-class dispersion of the data within each modality is maximized and the intra-class dispersion of the data within each modality is minimized; letting w^{(i)} denote a projection vector of X^{(i)}, i = 1, 2, 3, the objective function of DMCCA is:

ρ = [ Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)} ] / [ Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} ]

wherein S_w^{(i)} denotes the intra-class dispersion matrix of X^{(i)}, S_b^{(i)} denotes the inter-class dispersion matrix of X^{(i)}, C^{(ij)} = cov(X^{(i)}, X^{(j)}) denotes the between-set covariance of the same-class samples of X^{(i)} and X^{(j)}, cov(·,·) denotes the covariance, and i, j ∈ {1, 2, 3};

constructing and solving the following optimization model to obtain the projection matrices Ω, Φ and Ψ:

max_{w^{(1)}, w^{(2)}, w^{(3)}}  Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)}
s.t.  Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} = 1
Further, solving the above optimization model of the DMCCA objective function with the Lagrange multiplier method gives the following Lagrange function:

L(w^{(1)}, w^{(2)}, w^{(3)}) = Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)} − λ ( Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} − 1 )

wherein λ is the Lagrange multiplier; the partial derivatives of L(w^{(1)}, w^{(2)}, w^{(3)}) with respect to w^{(1)}, w^{(2)} and w^{(3)} are then computed and set to zero, i.e.

∂L/∂w^{(i)} = Σ_{j=1, j≠i}^{3} C^{(ij)} w^{(j)} + S_b^{(i)} w^{(i)} − λ S_w^{(i)} w^{(i)} = 0,  i = 1, 2, 3,

which, after further simplification, gives the following generalized eigenvalue problem:

[ S_b^{(1)}   C^{(12)}   C^{(13)} ] [ w^{(1)} ]       [ S_w^{(1)}     0           0      ] [ w^{(1)} ]
[ C^{(21)}   S_b^{(2)}   C^{(23)} ] [ w^{(2)} ]  = λ  [    0       S_w^{(2)}      0      ] [ w^{(2)} ]
[ C^{(31)}   C^{(32)}   S_b^{(3)} ] [ w^{(3)} ]       [    0          0       S_w^{(3)}  ] [ w^{(3)} ]

By solving this generalized eigenvalue problem and selecting the eigenvectors corresponding to the first d largest eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_d, the projection matrices Ω = [w_1^{(1)}, w_2^{(1)}, ..., w_d^{(1)}], Φ = [w_1^{(2)}, w_2^{(2)}, ..., w_d^{(2)}] and Ψ = [w_1^{(3)}, w_2^{(3)}, ..., w_d^{(3)}] are obtained.
based on the same inventive concept, the multi-modal emotion recognition system integrating the attention mechanism and the DMCCA, provided by the invention, comprises:
the characteristic primary extraction module is used for respectively extracting electroencephalogram signal characteristic vectors and expression characteristic vectors from the preprocessed electroencephalogram signals and facial expression videos by using respective trained neural network models, and extracting peripheral physiological signal characteristic vectors from the preprocessed peripheral physiological signals by extracting signal waveform descriptors and statistical characteristics thereof;
the characteristic identification enhancement module is used for mapping the electroencephalogram signal characteristic vector, the peripheral physiological signal characteristic vector and the expression characteristic vector into a plurality of groups of characteristic vectors through linear transformation matrixes respectively, determining importance weights of different characteristic vector groups respectively by using the attention mechanism module, and forming an electroencephalogram emotion characteristic vector, a peripheral physiological emotion characteristic vector and an expression emotion characteristic vector which have the same dimension and have identification power through weighting fusion;
the projection matrix determining module is used for determining the projection matrix of each emotion feature vector with the discriminative multiset canonical correlation analysis (DMCCA) method, by maximizing the correlation among the emotion features of different modalities for samples of the same class;
the feature fusion module is used for projecting the electroencephalogram emotion feature vector, the peripheral physiological emotion feature vector and the expression emotion feature vector to a public subspace through respective corresponding projection matrixes, and obtaining an electroencephalogram-peripheral physiological-expression multi-mode emotion feature vector after addition and fusion;
and the classification and identification module is used for classifying and identifying the multi-mode emotion feature vectors by using the classifier to obtain the emotion types.
Based on the same inventive concept, the multi-modal emotion recognition system fusing the attention mechanism and the DMCCA provided by the invention comprises at least one computing device, wherein the computing device comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, and the computer program realizes the multi-modal emotion recognition method fusing the attention mechanism and the DMCCA when being loaded to the processor.
Advantageous effects: compared with the prior art, the invention has the following technical effects:
(1) according to the invention, an attention mechanism is adopted to selectively focus on the significant characteristics playing a key role in emotion recognition in each mode, the characteristics with emotion identification capability are adaptively learned, and the accuracy and robustness of multi-mode emotion recognition can be effectively improved.
(2) The invention adopts discriminative multiset canonical correlation analysis and introduces the class information of the samples. By maximizing the correlation among the emotion features of different modalities for same-class samples while maximizing the inter-class dispersion and minimizing the intra-class dispersion of the emotion features within each modality, it can mine the correlation among different modalities, makes full use of the correlation and complementarity among the electroencephalogram, peripheral physiological and expression emotion features, eliminates some ineffective redundant features, and can effectively improve the discriminative power and robustness of the feature representation.
(3) Compared with a single-mode emotion recognition method, the method comprehensively utilizes various modal information in the emotion expression process, can combine the characteristics of different modes and fully utilize the complementarity of the characteristics to mine multi-mode emotion characteristics, and can effectively improve the accuracy and robustness of emotion recognition.
Drawings
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
fig. 2 is a block diagram of an embodiment of the present invention.
Detailed Description
For a more detailed understanding of the present invention, reference is now made to the following descriptions taken in conjunction with the accompanying drawings and specific examples.
As shown in fig. 1 and fig. 2, a multi-modal emotion recognition method combining an attention mechanism and a DMCCA provided by an embodiment of the present invention mainly includes the following steps:
(1) extracting electroencephalogram signal feature vectors and expression feature vectors from the preprocessed electroencephalogram signals and facial expression videos by using the trained neural network models respectively, and extracting peripheral physiological signal feature vectors from the preprocessed peripheral physiological signals by extracting signal waveform descriptors and statistical features thereof.
In this embodiment, the DEAP (Database for Emotion Analysis using Physiological Signals) emotion database is used; in practice, other emotion databases containing electroencephalogram signals, peripheral physiological signals and facial expression videos may also be used. The DEAP database is a published multimodal emotion database collected by Koelstra et al. at Queen Mary University of London. It contains the physiological signals of 32 subjects recorded while they watched 40 one-minute music video clips of different types used as evoked stimuli; for the first 22 subjects, facial expression videos were also recorded while they watched the music video clips. Each subject completed 40 trials and filled in a Self-Assessment Manikin (SAM) questionnaire immediately after each trial, giving 40 self-assessments in total. The SAM questionnaire rates the subject's Arousal, Valence, Dominance and Liking for the video. Arousal describes the degree of excitation of the subject's state, ranging gradually from a calm state to an excited state and rated on a scale of 1 to 9; Valence, also called pleasantness, describes how pleasant the mood is, ranging gradually from a Negative state to a Positive state and likewise rated from 1 to 9; Dominance ranges from submissive ("without control") to dominant ("in control"); Liking indicates the subject's personal preference for the video. The score selected by each subject after each trial represents the emotional state and is used for the subsequent classification and analysis of emotion categories.
In the DEAP database, the physiological signals were recorded at 512 Hz and downsampled to 128 Hz (the preprocessed, downsampled data are provided officially). The physiological signal matrix of each subject is 40 × 40 × 8064 (40 music video trials, 40 physiological signal channels, 8064 sampling points). Of the 40 physiological signal channels, the first 32 channels are electroencephalogram signals and the last 8 channels are peripheral physiological signals. The 8064 sampling points correspond to 63 s at the 128 Hz sampling rate, and each trial segment includes a 3 s baseline recorded before the stimulus.
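As an illustration of this data layout, the following sketch splits one subject's 40 × 40 × 8064 array into electroencephalogram and peripheral channels and removes the 3 s pre-trial baseline; the file name and NumPy array format are assumptions made for the example, not part of the DEAP release or of the invention.

```python
import numpy as np

# Hypothetical file: one subject's preprocessed DEAP recording stored as a
# (trials, channels, samples) = (40, 40, 8064) array.
data = np.load("s01_data.npy")

FS = 128                      # sampling rate after downsampling
BASELINE = 3 * FS             # 3 s pre-trial baseline = 384 samples

eeg = data[:, :32, BASELINE:]         # first 32 channels: electroencephalogram signals
peripheral = data[:, 32:, BASELINE:]  # last 8 channels: peripheral physiological signals

print(eeg.shape, peripheral.shape)    # (40, 32, 7680) (40, 8, 7680), i.e. 60 s per trial
```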
In this embodiment of the invention, 880 samples containing electroencephalogram signals, peripheral physiological signals and facial expressions are used as training samples, and classification is performed separately on the four dimensions of arousal, valence, dominance and liking.
The neural network model for extracting the electroencephalogram signal features can be a Long Short-Term Memory (LSTM) network or a Convolutional Neural Network (CNN), and the neural network model for extracting the expression features can be a 3D convolutional neural network, a CNN-LSTM, or the like. In this embodiment, a trained Convolutional Neural Network (CNN) model performs feature extraction on the preprocessed electroencephalogram signals to obtain a 256-dimensional electroencephalogram signal feature vector; a 128-dimensional peripheral physiological signal feature vector is extracted from the preprocessed peripheral physiological signals, such as electrocardiogram, respiration, electrooculogram and electromyogram signals, by computing Low-Level Descriptors (LLD) of the signal waveforms and their statistics (including mean, standard deviation, power spectrum, median, maximum and minimum); and a 256-dimensional expression feature vector is extracted from the preprocessed facial expression video with a trained CNN-LSTM model.
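A minimal sketch, under stated assumptions, of the kind of waveform-descriptor statistics mentioned above for the peripheral physiological signals; the exact descriptor set and the resulting 128-dimensional layout are not detailed here, so the statistics below are illustrative rather than the invention's precise feature definition.

```python
import numpy as np

def peripheral_features(trial: np.ndarray) -> np.ndarray:
    """trial: (channels, samples) array of peripheral physiological signals.
    Returns per-channel waveform-descriptor statistics as one flat vector."""
    feats = []
    for ch in trial:
        diff = np.diff(ch)                    # first difference as a simple low-level descriptor
        feats += [
            ch.mean(), ch.std(), np.median(ch), ch.max(), ch.min(),
            np.mean(ch ** 2),                 # crude average power
            diff.mean(), diff.std(),          # statistics of the descriptor itself
        ]
    return np.asarray(feats, dtype=np.float32)

x = peripheral_features(np.random.randn(8, 7680))   # 8 channels, 60 s at 128 Hz
print(x.shape)   # (64,); the embodiment's 128-dim vector would use a richer descriptor set
```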
(2) Extracting the discriminative electroencephalogram emotion feature vector, peripheral physiological emotion feature vector and expression emotion feature vector from the electroencephalogram signal feature vector, the peripheral physiological signal feature vector and the expression feature vector, respectively, with attention mechanism modules.
(3) Obtaining the electroencephalogram-peripheral physiological-expression multi-modal emotion feature vector from the electroencephalogram, peripheral physiological and expression emotion feature vectors with the discriminative multiset canonical correlation analysis (DMCCA) method.
(4) Classifying the multi-modal emotion feature vector with a classifier to obtain the emotion category.
Further, the specific steps of extracting discriminating electroencephalogram emotional characteristics, peripheral physiological emotional characteristics and expression emotional characteristics by using an attention mechanism module in the step (2) comprise:
(2.1) representing the electroencephalogram signal features extracted in step (1) in matrix form as F^{(1)}, and mapping them through a linear transformation matrix W^{(1)} into M_1 groups of feature vectors, 4 ≤ M_1 ≤ 16, where each group feature vector has dimension N, 16 ≤ N ≤ 64; let E^{(1)} = [e_1^{(1)}, e_2^{(1)}, ..., e_{M_1}^{(1)}]^T, then the linear transformation is expressed as:

E^{(1)} = (F^{(1)})^T W^{(1)}

wherein the superscript (1) denotes the electroencephalogram modality and T denotes transposition.

The importance weights of the different feature vector groups are determined with a first attention mechanism module, and the discriminative electroencephalogram emotion feature vector is formed by weighted fusion, wherein the weight α_r^{(1)} of the r-th group of electroencephalogram signal feature vectors and the electroencephalogram emotion feature vector x^{(1)} are expressed as:

α_r^{(1)} = exp((u^{(1)})^T e_r^{(1)}) / Σ_{k=1}^{M_1} exp((u^{(1)})^T e_k^{(1)}),    x^{(1)} = Σ_{r=1}^{M_1} α_r^{(1)} e_r^{(1)}

wherein r = 1, 2, ..., M_1, e_r^{(1)} denotes the r-th group of electroencephalogram signal feature vectors, u^{(1)} is a trainable linear transformation parameter vector, and exp(·) denotes the exponential function with the natural constant e as its base. In this embodiment, M_1 = 8 and N = 32.
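With the embodiment's parameters (a 256-dimensional electroencephalogram feature vector, M_1 = 8 groups of dimension N = 32), the attention pooling of step (2.1) can be sketched as below; reshaping the flat feature vector into the matrix F^{(1)} and the shapes assumed for W^{(1)} and u^{(1)} are illustrative choices consistent with 8 × 32 = 256, not dimensions fixed by the text.

```python
import numpy as np

M1, N = 8, 32   # number of feature-vector groups and group dimension in this embodiment

def eeg_attention_pool(f, W, u):
    """f: 256-dim EEG feature vector; W: assumed (32, 32) linear transformation matrix;
    u: assumed 32-dim trainable attention parameter vector."""
    F = f.reshape(-1, M1)            # matrix form F^(1); the reshape layout is an assumption
    E = F.T @ W                      # E^(1) = (F^(1))^T W^(1), one N-dim vector per group
    scores = E @ u                   # one relevance score per group
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()             # softmax importance weights alpha_r^(1)
    return alpha @ E                 # x^(1): weighted fusion of the group vectors

x1 = eeg_attention_pool(np.random.randn(256), np.random.randn(N, N), np.random.randn(N))
print(x1.shape)                      # (32,)
```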
To train the parameters of the linear transformation matrix W^{(1)}, a softmax classifier is connected after the first attention mechanism module: the electroencephalogram emotion feature vector x^{(1)} output by the first attention mechanism module is connected to the C output nodes of the softmax classifier, which, after the softmax function, output a probability distribution vector ŷ^{(1)} over the emotion classes c ∈ [1, C], where C is the number of emotion categories.
Further, the parameters of the linear transformation matrix W^{(1)} are trained with the cross-entropy loss function shown below:

Loss^{(1)} = − (1/M) Σ_{m=1}^{M} Σ_{c=1}^{C} y_{m,c}^{(1)} log ŷ_{m,c}^{(1)}

wherein x^{(1)} is the 32-dimensional electroencephalogram emotion feature vector; ŷ^{(1)} is the probability distribution vector of the emotion classes predicted by the softmax classification model; y_{m,c}^{(1)} denotes the true emotion category label of the m-th electroencephalogram sample: with one-hot coding, y_{m,c}^{(1)} = 1 if the true label of the m-th electroencephalogram sample is c, and y_{m,c}^{(1)} = 0 otherwise; ŷ_{m,c}^{(1)} denotes the probability with which the softmax classification model predicts the m-th electroencephalogram sample as class c; Loss^{(1)} denotes the loss function used to train the linear transformation matrix W^{(1)}; in this embodiment, C = 2 and M = 880.
Iterative training is carried out continuously with the error back-propagation algorithm until the model parameters reach their optimal values; the electroencephalogram emotion feature vector x^{(1)} can then be extracted from the electroencephalogram signal of a newly input test sample.
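The training just described (attention module followed by a softmax classifier, cross-entropy loss, error back-propagation) might be prototyped as in the following PyTorch sketch; the module structure, optimizer and random stand-in data are illustrative assumptions, with layer sizes following the embodiment (M_1 = 8, N = 32, C = 2, M = 880).

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Linear transformation into M groups of N-dim vectors plus attention pooling."""
    def __init__(self, in_dim=256, m=8, n=32, num_classes=2):
        super().__init__()
        self.m, self.n = m, n
        self.W = nn.Linear(in_dim // m, n, bias=False)   # linear transformation matrix W
        self.u = nn.Parameter(torch.randn(n))            # trainable attention vector u
        self.classifier = nn.Linear(n, num_classes)      # softmax classifier head

    def forward(self, f):                                # f: (batch, 256)
        F = f.view(f.size(0), self.m, -1)                # group the flat feature vector
        E = self.W(F)                                    # (batch, M, N) group feature vectors
        alpha = torch.softmax(E @ self.u, dim=1)         # (batch, M) importance weights
        x = (alpha.unsqueeze(-1) * E).sum(dim=1)         # (batch, N) emotion feature vector
        return x, self.classifier(x)                     # feature and class logits

model = AttentionPool()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()                          # cross-entropy as in Loss^(1)

feats = torch.randn(880, 256)                            # stand-in for the 880 EEG samples
labels = torch.randint(0, 2, (880,))                     # stand-in binary emotion labels
for _ in range(10):                                      # iterative training by back-propagation
    _, logits = model(feats)
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```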
(2.2) representing the peripheral physiological signal features extracted in step (1) in matrix form as F^{(2)}, and mapping them through a linear transformation matrix W^{(2)} into M_2 groups of feature vectors, 4 ≤ M_2 ≤ 16; let E^{(2)} = [e_1^{(2)}, e_2^{(2)}, ..., e_{M_2}^{(2)}]^T, then the linear transformation is expressed as:

E^{(2)} = (F^{(2)})^T W^{(2)}

wherein the superscript (2) denotes the peripheral physiological modality.

The importance weights of the different feature vector groups are determined with a second attention mechanism module, and the discriminative peripheral physiological emotion feature vector is formed by weighted fusion, wherein the weight α_s^{(2)} of the s-th group of peripheral physiological signal feature vectors and the peripheral physiological emotion feature vector x^{(2)} are expressed as:

α_s^{(2)} = exp((u^{(2)})^T e_s^{(2)}) / Σ_{k=1}^{M_2} exp((u^{(2)})^T e_k^{(2)}),    x^{(2)} = Σ_{s=1}^{M_2} α_s^{(2)} e_s^{(2)}

wherein s = 1, 2, ..., M_2, e_s^{(2)} denotes the s-th group of peripheral physiological signal feature vectors, and u^{(2)} is a trainable linear transformation parameter vector. In this embodiment, M_2 = 4.
To train the parameters of the linear transformation matrix W^{(2)}, a softmax classifier is connected after the second attention mechanism module: the peripheral physiological emotion feature vector x^{(2)} output by the second attention mechanism module is connected to the C output nodes of the softmax classifier, which output a probability distribution vector ŷ^{(2)} after the softmax function.

Further, the parameters of the linear transformation matrix W^{(2)} are trained with the cross-entropy loss function shown below:

Loss^{(2)} = − (1/M) Σ_{m=1}^{M} Σ_{c=1}^{C} y_{m,c}^{(2)} log ŷ_{m,c}^{(2)}

wherein x^{(2)} is the 32-dimensional peripheral physiological emotion feature vector; ŷ^{(2)} is the probability distribution vector of the emotion classes predicted by the softmax classification model; y_{m,c}^{(2)} denotes the true emotion category label of the m-th peripheral physiological signal sample: with one-hot coding, y_{m,c}^{(2)} = 1 if the true label of the m-th peripheral physiological signal sample is c, and y_{m,c}^{(2)} = 0 otherwise; ŷ_{m,c}^{(2)} denotes the probability with which the softmax classification model predicts the m-th peripheral physiological signal sample as class c; Loss^{(2)} denotes the loss function used to train the linear transformation matrix W^{(2)}; in this embodiment, C = 2 and M = 880.
Iterative training is carried out continuously with the error back-propagation algorithm until the model parameters reach their optimal values; the peripheral physiological emotion feature vector x^{(2)} can then be extracted from the peripheral physiological signals of a newly input test sample.
(2.3) representing the expression features extracted in step (1) in matrix form as F^{(3)}, and mapping them through a linear transformation matrix W^{(3)} into M_3 groups of feature vectors, 4 ≤ M_3 ≤ 16; let E^{(3)} = [e_1^{(3)}, e_2^{(3)}, ..., e_{M_3}^{(3)}]^T, then the linear transformation is expressed as:

E^{(3)} = (F^{(3)})^T W^{(3)}

wherein the superscript (3) denotes the expression modality.

The importance weights of the different feature vector groups are determined with a third attention mechanism module, and the discriminative expression emotion feature vector is formed by weighted fusion, wherein the weight α_t^{(3)} of the t-th group of expression feature vectors and the expression emotion feature vector x^{(3)} are expressed as:

α_t^{(3)} = exp((u^{(3)})^T e_t^{(3)}) / Σ_{k=1}^{M_3} exp((u^{(3)})^T e_k^{(3)}),    x^{(3)} = Σ_{t=1}^{M_3} α_t^{(3)} e_t^{(3)}

wherein t = 1, 2, ..., M_3, e_t^{(3)} denotes the t-th group of expression feature vectors, and u^{(3)} is a trainable linear transformation parameter vector. In this embodiment, M_3 = 8.
To train the parameters of the linear transformation matrix W^{(3)}, a softmax classifier is connected after the third attention mechanism module: the expression emotion feature vector x^{(3)} output by the third attention mechanism module is connected to the C output nodes of the softmax classifier, which output a probability distribution vector ŷ^{(3)} after the softmax function.

Further, the parameters of the linear transformation matrix W^{(3)} are trained with the cross-entropy loss function shown below:

Loss^{(3)} = − (1/M) Σ_{m=1}^{M} Σ_{c=1}^{C} y_{m,c}^{(3)} log ŷ_{m,c}^{(3)}

wherein x^{(3)} is the 32-dimensional expression emotion feature vector; ŷ^{(3)} is the probability distribution vector of the emotion classes predicted by the softmax classification model; y_{m,c}^{(3)} denotes the true emotion category label of the m-th expression video sample: with one-hot coding, y_{m,c}^{(3)} = 1 if the true label of the m-th expression video sample is c, and y_{m,c}^{(3)} = 0 otherwise; ŷ_{m,c}^{(3)} denotes the probability with which the softmax classification model predicts the m-th expression video sample as class c; Loss^{(3)} denotes the loss function used to train the linear transformation matrix W^{(3)}; in this embodiment, C = 2 and M = 880.
Iterative training is carried out continuously with the error back-propagation algorithm until the model parameters reach their optimal values; the expression emotion feature vector x^{(3)} can then be extracted from the expression video of a newly input test sample.
Further, the step (3) specifically comprises the following sub-steps:
(3.1) acquiring the DMCCA projection matrices Ω, Φ and Ψ obtained through training, corresponding respectively to the electroencephalogram, peripheral physiological and expression emotion features, each of size N × d with 32 ≤ d ≤ 128. In the present embodiment, d = 40.
(3.2) using the projection matrices Ω, Φ and Ψ respectively to project the electroencephalogram emotion feature vector x^{(1)}, the peripheral physiological emotion feature vector x^{(2)} and the expression emotion feature vector x^{(3)} extracted in step (2) into a d-dimensional common subspace, where the projection of the electroencephalogram emotion feature vector x^{(1)} into the d-dimensional common subspace is Ω^T x^{(1)}, the projection of the peripheral physiological emotion feature vector x^{(2)} is Φ^T x^{(2)}, and the projection of the expression emotion feature vector x^{(3)} is Ψ^T x^{(3)}.

(3.3) fusing Ω^T x^{(1)}, Φ^T x^{(2)} and Ψ^T x^{(3)} by addition to obtain the electroencephalogram-peripheral physiological-expression multi-modal emotion feature vector Ω^T x^{(1)} + Φ^T x^{(2)} + Ψ^T x^{(3)}.
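Given trained projection matrices, steps (3.2)-(3.3) applied to a whole sample set, followed by the classifier of step (4), might look like the sketch below; the SVM is only one possible classifier choice, since the text does not fix it, and the random matrices are stand-ins for trained quantities.

```python
import numpy as np
from sklearn.svm import SVC

d, N, M = 40, 32, 880                           # embodiment values: subspace dim, feature dim, samples

# stand-ins for trained projection matrices and per-modality emotion features
Omega, Phi, Psi = (np.random.randn(N, d) for _ in range(3))
X1, X2, X3 = (np.random.randn(N, M) for _ in range(3))   # columns are samples
y = np.random.randint(0, 2, M)                            # stand-in binary emotion labels

# steps (3.2)-(3.3): project every sample into the d-dim common subspace and fuse additively
Z = Omega.T @ X1 + Phi.T @ X2 + Psi.T @ X3                # (d, M) fused multi-modal features

# step (4): train/apply an off-the-shelf classifier on the fused features
clf = SVC(kernel="rbf").fit(Z.T, y)
print(clf.predict(Z[:, :5].T))
```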
Further, the projection matrices Ω, Φ, and Ψ in step (3.1) are obtained by training in the following steps:
(3.1.1) generating 3 groups of emotion feature vectors X^{(i)} = [x_1^{(i)}, x_2^{(i)}, ..., x_M^{(i)}] from the samples of the C emotion categories in the training sample set, where x_m^{(i)} is an N-dimensional feature vector and M is the number of training samples (in this embodiment the data volume of the sample set is not large, so all samples participate in the calculation; for a large sample set, samples of each emotion category can be drawn at random), i = 1, 2, 3, m = 1, 2, ..., M; let i = 1 denote the electroencephalogram modality, i = 2 the peripheral physiological modality and i = 3 the expression modality, so that x_m^{(1)} denotes an electroencephalogram emotion feature vector, x_m^{(2)} a peripheral physiological emotion feature vector and x_m^{(3)} an expression emotion feature vector; in this embodiment, C = 2, M = 880 and N = 32.
(3.1.2) calculating the mean vector of the columns of X^{(i)} and centering X^{(i)} by subtracting this mean from every column; for convenience of description, the centered matrix is still denoted X^{(i)}, i.e. all X^{(i)} are assumed below to have been centered.
(3.1.3) The idea of discriminative multiset canonical correlation analysis (DMCCA) is to find a group of projection matrices Ω, Φ and Ψ that maximize the linear correlation of same-class samples in the common projection space while also maximizing the inter-class dispersion of the data within each modality and minimizing the intra-class dispersion of the data within each modality. Let w^{(i)} denote a projection vector of X^{(i)}, i = 1, 2, 3; the objective function of DMCCA is:

ρ = [ Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)} ] / [ Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} ]

wherein S_w^{(i)} denotes the intra-class dispersion matrix of X^{(i)}, S_b^{(i)} denotes the inter-class dispersion matrix of X^{(i)}, C^{(ij)} = cov(X^{(i)}, X^{(j)}) denotes the between-set covariance of the same-class samples of X^{(i)} and X^{(j)}, cov(·,·) denotes the covariance, and i, j ∈ {1, 2, 3}.

The solution of the DMCCA objective function may be represented as the following optimization model:

max_{w^{(1)}, w^{(2)}, w^{(3)}}  Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)}
s.t.  Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} = 1
(3.1.4) solving the optimization model of the DMCCA objective function with the Lagrange multiplier method yields the following Lagrange function:

L(w^{(1)}, w^{(2)}, w^{(3)}) = Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)} − λ ( Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} − 1 )

wherein λ is the Lagrange multiplier; the partial derivatives of L(w^{(1)}, w^{(2)}, w^{(3)}) with respect to w^{(1)}, w^{(2)} and w^{(3)} are then computed and set to zero, i.e.

∂L/∂w^{(i)} = Σ_{j=1, j≠i}^{3} C^{(ij)} w^{(j)} + S_b^{(i)} w^{(i)} − λ S_w^{(i)} w^{(i)} = 0,  i = 1, 2, 3,

which, after further simplification, gives the following generalized eigenvalue problem:

[ S_b^{(1)}   C^{(12)}   C^{(13)} ] [ w^{(1)} ]       [ S_w^{(1)}     0           0      ] [ w^{(1)} ]
[ C^{(21)}   S_b^{(2)}   C^{(23)} ] [ w^{(2)} ]  = λ  [    0       S_w^{(2)}      0      ] [ w^{(2)} ]
[ C^{(31)}   C^{(32)}   S_b^{(3)} ] [ w^{(3)} ]       [    0          0       S_w^{(3)}  ] [ w^{(3)} ]

By solving this generalized eigenvalue problem and selecting the eigenvectors corresponding to the first d largest eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_d, the projection matrices Ω = [w_1^{(1)}, w_2^{(1)}, ..., w_d^{(1)}], Φ = [w_1^{(2)}, w_2^{(2)}, ..., w_d^{(2)}] and Ψ = [w_1^{(3)}, w_2^{(3)}, ..., w_d^{(3)}] are obtained. In the present embodiment, d = 40.
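A prototype of the DMCCA training in steps (3.1.1)-(3.1.4) can assemble the block matrices of the generalized eigenvalue problem above and solve it with SciPy; the scatter and covariance definitions below follow the reconstruction given above and simplify the same-class cross-covariance to the cross-covariance of correspondingly ordered, centered samples, so the sketch is illustrative rather than the patent's exact computation.

```python
import numpy as np
from scipy.linalg import eig

def scatter_matrices(X, y):
    """Within-class (Sw) and between-class (Sb) scatter of a column-sample matrix X."""
    mu = X.mean(axis=1, keepdims=True)
    Sw = np.zeros((X.shape[0], X.shape[0]))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = X[:, y == c]
        mc = Xc.mean(axis=1, keepdims=True)
        Sw += (Xc - mc) @ (Xc - mc).T
        Sb += Xc.shape[1] * (mc - mu) @ (mc - mu).T
    return Sw, Sb

def dmcca(X_list, y, d=40, reg=1e-6):
    """X_list: three centered (N, M) matrices of EEG, peripheral and expression emotion features."""
    N = X_list[0].shape[0]
    A = np.zeros((3 * N, 3 * N))            # left block matrix: S_b on diagonal, C^(ij) off-diagonal
    B = np.zeros_like(A)                    # right block-diagonal matrix: S_w blocks
    for i, Xi in enumerate(X_list):
        Sw, Sb = scatter_matrices(Xi, y)
        A[i*N:(i+1)*N, i*N:(i+1)*N] = Sb
        B[i*N:(i+1)*N, i*N:(i+1)*N] = Sw + reg * np.eye(N)   # small ridge for numerical stability
        for j, Xj in enumerate(X_list):
            if i != j:                      # simplified same-class between-set covariance C^(ij)
                A[i*N:(i+1)*N, j*N:(j+1)*N] = Xi @ Xj.T
    vals, vecs = eig(A, B)                  # generalized eigenvalue problem A w = lambda B w
    top = np.argsort(-vals.real)[:d]        # eigenvectors of the d largest eigenvalues
    W = vecs[:, top].real
    return W[:N], W[N:2*N], W[2*N:]         # Omega, Phi, Psi, each of shape (N, d)

# toy usage with the embodiment sizes N = 32, M = 880, d = 40
y = np.random.randint(0, 2, 880)
Xs = [np.random.randn(32, 880) for _ in range(3)]
Xs = [X - X.mean(axis=1, keepdims=True) for X in Xs]   # centering, step (3.1.2)
Omega, Phi, Psi = dmcca(Xs, y, d=40)
print(Omega.shape)                                     # (32, 40)
```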
Based on the same inventive concept, the multi-modal emotion recognition system integrating the attention mechanism and the DMCCA provided by the embodiment of the invention comprises:
the characteristic primary extraction module is used for respectively extracting electroencephalogram signal characteristic vectors and expression characteristic vectors from the preprocessed electroencephalogram signals and facial expression videos by using respective trained neural network models, and extracting peripheral physiological signal characteristic vectors from the preprocessed peripheral physiological signals by extracting signal waveform descriptors and statistical characteristics thereof;
the characteristic identification enhancement module is used for mapping the electroencephalogram signal characteristic vector, the peripheral physiological signal characteristic vector and the expression characteristic vector into a plurality of groups of characteristic vectors through linear transformation matrixes respectively, determining importance weights of different characteristic vector groups respectively by using the attention mechanism module, and forming an electroencephalogram emotion characteristic vector, a peripheral physiological emotion characteristic vector and an expression emotion characteristic vector which have the same dimension and have identification power through weighting fusion;
the projection matrix determining module is used for determining a projection matrix of each emotion characteristic vector by maximizing the correlation among different modal emotion characteristics of the same type of samples by using a DMCCA method;
the feature fusion module is used for projecting the electroencephalogram emotion feature vector, the peripheral physiological emotion feature vector and the expression emotion feature vector to a public subspace through respective corresponding projection matrixes, and obtaining the electroencephalogram-peripheral physiological-expression multi-mode emotion feature vector after addition and fusion;
and the classification and identification module is used for classifying and identifying the multi-mode emotion feature vectors by using the classifier to obtain the emotion types.
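The five modules listed above can be wired together by a small orchestrating class, as in the following sketch; all interfaces and names here are hypothetical illustrations of the decomposition, not structures prescribed by the invention.

```python
class MultimodalEmotionRecognizer:
    """Hypothetical wiring of the five modules described above."""

    def __init__(self, extractors, attention_modules, projections, classifier):
        self.extractors = extractors                  # feature primary extraction module
        self.attention = attention_modules            # feature discrimination enhancement module
        self.Omega, self.Phi, self.Psi = projections  # output of the projection matrix determining module
        self.classifier = classifier                  # classification and identification module

    def fuse(self, eeg, peripheral, expression):
        """Feature fusion module: project each modality and add in the common subspace."""
        x1 = self.attention["eeg"](self.extractors["eeg"](eeg))
        x2 = self.attention["peripheral"](self.extractors["peripheral"](peripheral))
        x3 = self.attention["expression"](self.extractors["expression"](expression))
        return self.Omega.T @ x1 + self.Phi.T @ x2 + self.Psi.T @ x3

    def predict(self, eeg, peripheral, expression):
        fused = self.fuse(eeg, peripheral, expression)
        return self.classifier.predict(fused[None, :])[0]
```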
For specific implementation of each module, reference is made to the above method embodiment, and details are not repeated. Those skilled in the art will appreciate that the modules in the embodiments may be adaptively changed and arranged in one or more systems different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components.
Based on the same inventive concept, the multi-modal emotion recognition system combining the attention mechanism and the DMCCA provided by the embodiment of the invention comprises at least one computing device, wherein the computing device comprises a memory, a processor and a computer program which is stored in the memory and can run on the processor, and the computer program realizes the multi-modal emotion recognition method combining the attention mechanism and the DMCCA when being loaded into the processor.
The technical scheme disclosed by the invention includes not only the technical methods described in the above embodiments but also the technical schemes formed by arbitrarily combining those methods. Those skilled in the art can make certain improvements and modifications without departing from the principles of the present invention, and such improvements and modifications are to be considered within the scope of the present invention.
Claims (7)
1. The multimode emotion recognition method integrating the attention mechanism and the DMCCA is characterized by comprising the following steps of:
(1) extracting electroencephalogram signal feature vectors and expression feature vectors from the preprocessed electroencephalogram signals and facial expression videos by using respective trained neural network models, and extracting peripheral physiological signal feature vectors from the preprocessed peripheral physiological signals by extracting signal waveform descriptors and statistical features thereof;
(2) mapping the electroencephalogram signal feature vector, the peripheral physiological signal feature vector and the expression feature vector into a plurality of groups of feature vectors through linear transformation matrixes respectively, determining importance weights of different feature vector groups by using an attention mechanism module respectively, and forming an electroencephalogram emotion feature vector, a peripheral physiological emotion feature vector and an expression emotion feature vector which have the same dimension and are discriminating through weighting fusion;
(3) determining a projection matrix for each emotion feature vector by applying a discriminative multiset canonical correlation analysis (DMCCA) method to the electroencephalogram, peripheral physiological and expression emotion feature vectors and maximizing the correlation among the emotion features of different modalities for samples of the same class, projecting each emotion feature vector into a common subspace, and adding the projections to obtain the electroencephalogram-peripheral physiological-expression multi-modal emotion feature vector;
(4) classifying the multi-modal emotion feature vector with a classifier to obtain the emotion category.
2. The multi-modal emotion recognition method combining attention mechanism and DMCCA as recited in claim 1, wherein step (2) comprises the sub-steps of:
(2.1) representing the electroencephalogram signal features extracted in step (1) in matrix form as F^{(1)}, and mapping them through a linear transformation matrix W^{(1)} into M_1 groups of feature vectors, 4 ≤ M_1 ≤ 16, where each group feature vector has dimension N, 16 ≤ N ≤ 64; let E^{(1)} = [e_1^{(1)}, e_2^{(1)}, ..., e_{M_1}^{(1)}]^T, then the linear transformation is expressed as:

E^{(1)} = (F^{(1)})^T W^{(1)}

wherein the superscript (1) denotes the electroencephalogram modality and T denotes transposition;

determining the importance weights of the different feature vector groups with a first attention mechanism module, and forming the discriminative electroencephalogram emotion feature vector by weighted fusion, wherein the weight α_r^{(1)} of the r-th group of electroencephalogram signal feature vectors and the electroencephalogram emotion feature vector x^{(1)} are expressed as:

α_r^{(1)} = exp((u^{(1)})^T e_r^{(1)}) / Σ_{k=1}^{M_1} exp((u^{(1)})^T e_k^{(1)}),    x^{(1)} = Σ_{r=1}^{M_1} α_r^{(1)} e_r^{(1)}

wherein r = 1, 2, ..., M_1, e_r^{(1)} denotes the r-th group of electroencephalogram signal feature vectors, u^{(1)} is a trainable linear transformation parameter vector, and exp(·) denotes the exponential function with the natural constant e as its base;
(2.2) representing the peripheral physiological signal features extracted in step (1) in matrix form as F^{(2)}, and mapping them through a linear transformation matrix W^{(2)} into M_2 groups of feature vectors, 4 ≤ M_2 ≤ 16; let E^{(2)} = [e_1^{(2)}, e_2^{(2)}, ..., e_{M_2}^{(2)}]^T, then the linear transformation is expressed as:

E^{(2)} = (F^{(2)})^T W^{(2)}

wherein the superscript (2) denotes the peripheral physiological modality;

determining the importance weights of the different feature vector groups with a second attention mechanism module, and forming the discriminative peripheral physiological emotion feature vector by weighted fusion, wherein the weight α_s^{(2)} of the s-th group of peripheral physiological signal feature vectors and the peripheral physiological emotion feature vector x^{(2)} are expressed as:

α_s^{(2)} = exp((u^{(2)})^T e_s^{(2)}) / Σ_{k=1}^{M_2} exp((u^{(2)})^T e_k^{(2)}),    x^{(2)} = Σ_{s=1}^{M_2} α_s^{(2)} e_s^{(2)}

wherein s = 1, 2, ..., M_2, e_s^{(2)} denotes the s-th group of peripheral physiological signal feature vectors, and u^{(2)} is a trainable linear transformation parameter vector;
(2.3) representing the expression features extracted in step (1) in matrix form as F^{(3)}, and mapping them through a linear transformation matrix W^{(3)} into M_3 groups of feature vectors, 4 ≤ M_3 ≤ 16; let E^{(3)} = [e_1^{(3)}, e_2^{(3)}, ..., e_{M_3}^{(3)}]^T, then the linear transformation is expressed as:

E^{(3)} = (F^{(3)})^T W^{(3)}

wherein the superscript (3) denotes the expression modality;

determining the importance weights of the different feature vector groups with a third attention mechanism module, and forming the discriminative expression emotion feature vector by weighted fusion, wherein the weight α_t^{(3)} of the t-th group of expression feature vectors and the expression emotion feature vector x^{(3)} are expressed as:

α_t^{(3)} = exp((u^{(3)})^T e_t^{(3)}) / Σ_{k=1}^{M_3} exp((u^{(3)})^T e_k^{(3)}),    x^{(3)} = Σ_{t=1}^{M_3} α_t^{(3)} e_t^{(3)}

wherein t = 1, 2, ..., M_3, e_t^{(3)} denotes the t-th group of expression feature vectors, and u^{(3)} is a trainable linear transformation parameter vector.
3. The multi-modal emotion recognition method combining attention mechanism and DMCCA as recited in claim 2, wherein step (3) comprises the sub-steps of:
(3.1) acquiring the DMCCA projection matrices Ω, Φ and Ψ obtained through training, corresponding respectively to the electroencephalogram, peripheral physiological and expression emotion features, each of size N × d with 32 ≤ d ≤ 128;
(3.2) using the projection matrices Ω, Φ and Ψ respectively to project the electroencephalogram emotion feature vector x^{(1)}, the peripheral physiological emotion feature vector x^{(2)} and the expression emotion feature vector x^{(3)} extracted in step (2) into a d-dimensional common subspace, where the projection of the electroencephalogram emotion feature vector x^{(1)} into the d-dimensional common subspace is Ω^T x^{(1)}, the projection of the peripheral physiological emotion feature vector x^{(2)} is Φ^T x^{(2)}, and the projection of the expression emotion feature vector x^{(3)} is Ψ^T x^{(3)};

(3.3) fusing Ω^T x^{(1)}, Φ^T x^{(2)} and Ψ^T x^{(3)} by addition to obtain the electroencephalogram-peripheral physiological-expression multi-modal emotion feature vector Ω^T x^{(1)} + Φ^T x^{(2)} + Ψ^T x^{(3)}.
4. The multi-modal emotion recognition method integrating an attention mechanism and DMCCA according to claim 3, wherein the projection matrices Ω, Φ and Ψ in step (3.1) are obtained by training:
(3.1.1) respectively extracting training samples of each emotion category from the training sample set to generate 3 groups of emotion feature vectors X^{(i)} = [x_1^{(i)}, x_2^{(i)}, ..., x_M^{(i)}], where x_m^{(i)} is an N-dimensional feature vector, M is the number of training samples, i = 1, 2, 3 and m = 1, 2, ..., M; let i = 1 denote the electroencephalogram modality, i = 2 the peripheral physiological modality and i = 3 the expression modality, so that x_m^{(1)} denotes an electroencephalogram emotion feature vector, x_m^{(2)} a peripheral physiological emotion feature vector and x_m^{(3)} an expression emotion feature vector;
(3.1.2) calculating the mean of the column vectors of X^{(i)} and performing a centering operation on X^{(i)};
(3.1.3) based on the idea of discriminative multiset canonical correlation analysis (DMCCA), solving for a group of projection matrices Ω, Φ and Ψ such that the linear correlation of same-class samples in the common projection space is maximized while the inter-class dispersion of the data within each modality is maximized and the intra-class dispersion of the data within each modality is minimized; letting w^{(i)} denote a projection vector of X^{(i)}, i = 1, 2, 3, the objective function of DMCCA is:

ρ = [ Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)} ] / [ Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} ]

wherein S_w^{(i)} denotes the intra-class dispersion matrix of X^{(i)}, S_b^{(i)} denotes the inter-class dispersion matrix of X^{(i)}, C^{(ij)} = cov(X^{(i)}, X^{(j)}) denotes the between-set covariance of the same-class samples of X^{(i)} and X^{(j)}, cov(·,·) denotes the covariance, and i, j ∈ {1, 2, 3}; constructing and solving the following optimization model to obtain the projection matrices Ω, Φ and Ψ:

max_{w^{(1)}, w^{(2)}, w^{(3)}}  Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)}
s.t.  Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} = 1.
5. The multi-modal emotion recognition method integrating the attention mechanism and the DMCCA according to claim 4, wherein the constructed optimization model of the DMCCA objective function is solved with the Lagrange multiplier method, specifically: the optimization model is expressed as the following Lagrange function:

L(w^{(1)}, w^{(2)}, w^{(3)}) = Σ_{i=1}^{3} Σ_{j=1, j≠i}^{3} (w^{(i)})^T C^{(ij)} w^{(j)} + Σ_{i=1}^{3} (w^{(i)})^T S_b^{(i)} w^{(i)} − λ ( Σ_{i=1}^{3} (w^{(i)})^T S_w^{(i)} w^{(i)} − 1 )

wherein λ is the Lagrange multiplier; the partial derivatives of L(w^{(1)}, w^{(2)}, w^{(3)}) with respect to w^{(1)}, w^{(2)} and w^{(3)} are then computed and set to zero, i.e.

∂L/∂w^{(i)} = Σ_{j=1, j≠i}^{3} C^{(ij)} w^{(j)} + S_b^{(i)} w^{(i)} − λ S_w^{(i)} w^{(i)} = 0,  i = 1, 2, 3,

which, after further simplification, gives the following generalized eigenvalue problem:

[ S_b^{(1)}   C^{(12)}   C^{(13)} ] [ w^{(1)} ]       [ S_w^{(1)}     0           0      ] [ w^{(1)} ]
[ C^{(21)}   S_b^{(2)}   C^{(23)} ] [ w^{(2)} ]  = λ  [    0       S_w^{(2)}      0      ] [ w^{(2)} ]
[ C^{(31)}   C^{(32)}   S_b^{(3)} ] [ w^{(3)} ]       [    0          0       S_w^{(3)}  ] [ w^{(3)} ]

and the projection matrices Ω, Φ and Ψ are obtained by solving this generalized eigenvalue problem and selecting the eigenvectors corresponding to the first d largest eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_d.
6. A multi-modal emotion recognition system integrating an attention mechanism and DMCCA, characterized by comprising:
a preliminary feature extraction module, configured to extract electroencephalogram feature vectors and expression feature vectors from the preprocessed electroencephalogram signals and facial expression videos using respective trained neural network models, and to extract peripheral physiological signal feature vectors from the preprocessed peripheral physiological signals by extracting signal waveform descriptors and their statistical features;
a feature discrimination enhancement module, configured to map the electroencephalogram signal feature vector, the peripheral physiological signal feature vector and the expression feature vector into multiple groups of feature vectors through respective linear transformation matrices, to determine the importance weights of the different feature vector groups with an attention mechanism module, and to form, by weighted fusion, an electroencephalogram emotion feature vector, a peripheral physiological emotion feature vector and an expression emotion feature vector that have the same dimension and are discriminative (an illustrative sketch of this weighting follows this claim);
a projection matrix determination module, configured to determine a projection matrix for each emotion feature vector by maximizing the correlation among the emotion features of different modalities for same-class samples, using the discriminative multi-set canonical correlation analysis (DMCCA) method;
a feature fusion module, configured to project the electroencephalogram emotion feature vector, the peripheral physiological emotion feature vector and the expression emotion feature vector into a common subspace through their corresponding projection matrices, and to obtain an electroencephalogram-peripheral-physiology-expression multi-modal emotion feature vector after additive fusion;
and a classification and recognition module, configured to classify the multi-modal emotion feature vector with a classifier to obtain the emotion category.
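As a loose illustration of the feature discrimination enhancement module above, the sketch below maps one modality's raw feature vector into several groups through linear transformation matrices, scores each group with a simple dot-product attention, and fuses the groups by their softmax weights. The group count, dimensions, scoring vector and function names (attention_weighted_fusion, score_w) are assumptions for illustration, not the patent's trained configuration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attention_weighted_fusion(x, transforms, score_w):
    """Map feature vector x into K groups via linear transformation matrices,
    compute an importance weight per group with a dot-product attention score,
    and return the weighted sum: a d-dimensional emotion feature vector.

    transforms: list of K matrices of shape (p, d); score_w: (d,) scoring vector.
    """
    groups = [T.T @ x for T in transforms]             # K groups of d-dim vectors
    scores = np.array([score_w @ g for g in groups])   # one scalar score per group
    weights = softmax(scores)                          # importance weights
    return sum(w * g for w, g in zip(weights, groups)) # weighted fusion

# Illustrative usage with random parameters
rng = np.random.default_rng(1)
p, d, K = 128, 32, 4
x = rng.normal(size=p)
transforms = [rng.normal(size=(p, d)) for _ in range(K)]
score_w = rng.normal(size=d)
emotion_vec = attention_weighted_fusion(x, transforms, score_w)
print(emotion_vec.shape)  # (32,)
```

In the described system this weighting would be applied separately to the electroencephalogram, peripheral physiological and expression feature vectors, so that the three resulting emotion feature vectors share the common dimension d before DMCCA projection.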
7. A multi-modal emotion recognition system integrating an attention mechanism and DMCCA, comprising at least one computing device that comprises a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program, when loaded into the processor, implements the multi-modal emotion recognition method integrating an attention mechanism and DMCCA according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110159085.8A CN112800998B (en) | 2021-02-05 | 2021-02-05 | Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110159085.8A CN112800998B (en) | 2021-02-05 | 2021-02-05 | Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112800998A true CN112800998A (en) | 2021-05-14 |
CN112800998B CN112800998B (en) | 2022-07-29 |
Family
ID=75814276
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110159085.8A Active CN112800998B (en) | 2021-02-05 | 2021-02-05 | Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112800998B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113269173A (en) * | 2021-07-20 | 2021-08-17 | 佛山市墨纳森智能科技有限公司 | Method and device for establishing emotion recognition model and recognizing human emotion |
CN113297981A (en) * | 2021-05-27 | 2021-08-24 | 西北工业大学 | End-to-end electroencephalogram emotion recognition method based on attention mechanism |
CN113326781A (en) * | 2021-05-31 | 2021-08-31 | 合肥工业大学 | Non-contact anxiety recognition method and device based on face video |
CN113616209A (en) * | 2021-08-25 | 2021-11-09 | 西南石油大学 | Schizophrenia patient discrimination method based on space-time attention mechanism |
CN113729710A (en) * | 2021-09-26 | 2021-12-03 | 华南师范大学 | Real-time attention assessment method and system integrating multiple physiological modes |
CN113749656A (en) * | 2021-08-20 | 2021-12-07 | 杭州回车电子科技有限公司 | Emotion identification method and device based on multi-dimensional physiological signals |
CN114091599A (en) * | 2021-11-16 | 2022-02-25 | 上海交通大学 | Method for recognizing emotion of intensive interaction deep neural network among modalities |
CN114298189A (en) * | 2021-12-20 | 2022-04-08 | 深圳市海清视讯科技有限公司 | Fatigue driving detection method, device, equipment and storage medium |
CN114947852A (en) * | 2022-06-14 | 2022-08-30 | 华南师范大学 | Multi-mode emotion recognition method, device, equipment and storage medium |
CN117935339A (en) * | 2024-03-19 | 2024-04-26 | 北京长河数智科技有限责任公司 | Micro-expression recognition method based on multi-modal fusion |
CN118332505A (en) * | 2024-06-12 | 2024-07-12 | 临沂大学 | Physiological signal data processing method, system and device based on multi-mode fusion |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108510456A (en) * | 2018-03-27 | 2018-09-07 | 华南理工大学 | The sketch of depth convolutional neural networks based on perception loss simplifies method |
CN109145983A (en) * | 2018-08-21 | 2019-01-04 | 电子科技大学 | A kind of real-time scene image, semantic dividing method based on lightweight network |
CN109543502A (en) * | 2018-09-27 | 2019-03-29 | 天津大学 | A kind of semantic segmentation method based on the multiple dimensioned neural network of depth |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108510456A (en) * | 2018-03-27 | 2018-09-07 | 华南理工大学 | The sketch of depth convolutional neural networks based on perception loss simplifies method |
CN109145983A (en) * | 2018-08-21 | 2019-01-04 | 电子科技大学 | A kind of real-time scene image, semantic dividing method based on lightweight network |
CN109543502A (en) * | 2018-09-27 | 2019-03-29 | 天津大学 | A kind of semantic segmentation method based on the multiple dimensioned neural network of depth |
Non-Patent Citations (1)
Title |
---|
YUAN QIUZHUANG et al.: "Research on SAR on-satellite target recognition system based on deep learning neural network", Aerospace Shanghai (上海航天) *
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113297981A (en) * | 2021-05-27 | 2021-08-24 | 西北工业大学 | End-to-end electroencephalogram emotion recognition method based on attention mechanism |
CN113297981B (en) * | 2021-05-27 | 2023-04-07 | 西北工业大学 | End-to-end electroencephalogram emotion recognition method based on attention mechanism |
CN113326781B (en) * | 2021-05-31 | 2022-09-02 | 合肥工业大学 | Non-contact anxiety recognition method and device based on face video |
CN113326781A (en) * | 2021-05-31 | 2021-08-31 | 合肥工业大学 | Non-contact anxiety recognition method and device based on face video |
CN113269173A (en) * | 2021-07-20 | 2021-08-17 | 佛山市墨纳森智能科技有限公司 | Method and device for establishing emotion recognition model and recognizing human emotion |
CN113749656A (en) * | 2021-08-20 | 2021-12-07 | 杭州回车电子科技有限公司 | Emotion identification method and device based on multi-dimensional physiological signals |
CN113749656B (en) * | 2021-08-20 | 2023-12-26 | 杭州回车电子科技有限公司 | Emotion recognition method and device based on multidimensional physiological signals |
CN113616209A (en) * | 2021-08-25 | 2021-11-09 | 西南石油大学 | Schizophrenia patient discrimination method based on space-time attention mechanism |
CN113616209B (en) * | 2021-08-25 | 2023-08-04 | 西南石油大学 | Method for screening schizophrenic patients based on space-time attention mechanism |
CN113729710A (en) * | 2021-09-26 | 2021-12-03 | 华南师范大学 | Real-time attention assessment method and system integrating multiple physiological modes |
CN114091599A (en) * | 2021-11-16 | 2022-02-25 | 上海交通大学 | Method for recognizing emotion of intensive interaction deep neural network among modalities |
CN114298189A (en) * | 2021-12-20 | 2022-04-08 | 深圳市海清视讯科技有限公司 | Fatigue driving detection method, device, equipment and storage medium |
CN114947852B (en) * | 2022-06-14 | 2023-01-10 | 华南师范大学 | Multi-mode emotion recognition method, device, equipment and storage medium |
CN114947852A (en) * | 2022-06-14 | 2022-08-30 | 华南师范大学 | Multi-mode emotion recognition method, device, equipment and storage medium |
CN117935339A (en) * | 2024-03-19 | 2024-04-26 | 北京长河数智科技有限责任公司 | Micro-expression recognition method based on multi-modal fusion |
CN118332505A (en) * | 2024-06-12 | 2024-07-12 | 临沂大学 | Physiological signal data processing method, system and device based on multi-mode fusion |
CN118332505B (en) * | 2024-06-12 | 2024-08-20 | 临沂大学 | Physiological signal data processing method, system and device based on multi-mode fusion |
Also Published As
Publication number | Publication date |
---|---|
CN112800998B (en) | 2022-07-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112800998B (en) | Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA | |
Abdullah et al. | Multimodal emotion recognition using deep learning | |
CN108805087B (en) | Time sequence semantic fusion association judgment subsystem based on multi-modal emotion recognition system | |
CN108877801B (en) | Multi-turn dialogue semantic understanding subsystem based on multi-modal emotion recognition system | |
CN108899050B (en) | Voice signal analysis subsystem based on multi-modal emotion recognition system | |
CN108805088B (en) | Physiological signal analysis subsystem based on multi-modal emotion recognition system | |
CN106886792B (en) | Electroencephalogram emotion recognition method for constructing multi-classifier fusion model based on layering mechanism | |
CN112784798A (en) | Multi-modal emotion recognition method based on feature-time attention mechanism | |
CN111134666A (en) | Emotion recognition method of multi-channel electroencephalogram data and electronic device | |
CN107766898A (en) | The three classification mood probabilistic determination methods based on SVM | |
Jinliang et al. | EEG emotion recognition based on granger causality and capsnet neural network | |
Schels et al. | Multi-modal classifier-fusion for the recognition of emotions | |
CN117198468B (en) | Intervention scheme intelligent management system based on behavior recognition and data analysis | |
Rayatdoost et al. | Subject-invariant EEG representation learning for emotion recognition | |
Chen et al. | Patient emotion recognition in human computer interaction system based on machine learning method and interactive design theory | |
Lu et al. | Speech depression recognition based on attentional residual network | |
CN117935339A (en) | Micro-expression recognition method based on multi-modal fusion | |
Peng | Research on Emotion Recognition Based on Deep Learning for Mental Health | |
Zhao et al. | Multiscale Global Prompt Transformer for EEG-Based Driver Fatigue Recognition | |
CN117609863A (en) | Long-time electroencephalogram emotion recognition method based on electroencephalogram micro state | |
Nakisa | Emotion classification using advanced machine learning techniques applied to wearable physiological signals data | |
Zhang et al. | Evolutionary Ensemble Learning for EEG-based Cross-Subject Emotion Recognition | |
CN111709314B (en) | Emotion distribution identification method based on facial surface myoelectricity | |
Akalya devi et al. | Multimodal emotion recognition framework using a decision-level fusion and feature-level fusion approach | |
Udurume et al. | Real-time Multimodal Emotion Recognition Based on Multithreaded Weighted Average Fusion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |