
CN107506722A - Face emotion recognition method based on a deep sparse convolutional neural network - Google Patents

Face emotion recognition method based on a deep sparse convolutional neural network Download PDF

Info

Publication number
CN107506722A
CN107506722A (application CN201710714001.6A)
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710714001.6A
Other languages
Chinese (zh)
Inventor
吴敏
苏婉娟
陈略峰
周梦甜
刘振焘
曹卫华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences filed Critical China University of Geosciences
Priority to CN201710714001.6A priority Critical patent/CN107506722A/en
Publication of CN107506722A publication Critical patent/CN107506722A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • G06V40/175Static expression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2136Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on sparsity criteria, e.g. with an overcomplete basis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/061Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using biological neurons, e.g. biological neurons connected to an integrated circuit

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Neurology (AREA)
  • Computational Linguistics (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)

Abstract

The invention provides a face emotion recognition method based on a deep sparse convolutional neural network: the emotion image is first preprocessed, emotional features are then extracted, and finally the emotional features are recognized and classified. The method uses the Nesterov accelerated gradient descent (NAGD) algorithm to optimize the weights of the deep sparse convolutional neural network so that the network structure is optimal, improving the generalization of the face emotion recognition algorithm. Because NAGD has a look-ahead capability, it predictively prevents the algorithm from advancing too fast or too slow, while strengthening the responsiveness of the algorithm and obtaining a better local optimum.

Description

Face emotion recognition method based on a deep sparse convolutional neural network
Technical Field
The invention relates to a face emotion recognition method based on a deep sparse convolution neural network, and belongs to the field of pattern recognition.
Background
In recent years, with the development of various technologies, society has become increasingly intelligent, and people are increasingly eager to experience natural and harmonious human-computer interaction. However, emotion has always been a gap between humans and machines that is difficult to bridge. Therefore, breaking through the bottleneck of current affective computing is the key to the development of the artificial emotion field. Facial expression is one of the important channels of human emotional expression, and face emotion recognition has application value in fields such as human-computer interaction, fatigue driving detection, remote nursing and pain assessment, with very broad application prospects. Therefore, more accurate expression recognition can promote the intelligent development of society.
Face emotion recognition can be divided mainly into emotional feature extraction and emotional feature recognition and classification. Face emotion recognition is still at the laboratory stage and cannot yet recognize the other party's expression as naturally and smoothly as a human does during human-computer interaction. Existing face emotion recognition algorithms have difficulty extracting emotional features accurately, their complexity is high, their recognition time is long, and they cannot meet the real-time requirements of human-computer interaction. Therefore, extracting features that differ markedly between expressions, classifying differently expressed emotions more accurately, and improving algorithmic efficiency are the keys to realizing face emotion recognition.
Deep learning is a new field in machine learning research whose motivation is to establish and simulate a neural network that analyses and learns like the human brain, imitating the mechanism by which the human brain interprets data. The deep sparse convolutional neural network is a neural network composed of a convolutional neural network, Dropout and Softmax regression, and is one of the deep learning models. It introduces randomized sparsity into the deep convolutional network through the Dropout layer, which improves the training efficiency of the network; and because the network structure optimized in each training pass is different, the optimization of the weights does not depend on the joint action of neurons with fixed relations, weakening the co-adaptation among neurons, similarly to sexual reproduction in natural selection, while improving the generalization ability of the network. In deep learning, the choice of optimization algorithm is very important; some previous studies usually consider only the setting of the network structure, and the traditional gradient descent algorithm easily falls into a poor local optimum, so that the generalization performance of the neural network is poor.
Disclosure of Invention
In order to solve the defects of the prior art, the invention provides a face emotion recognition method based on a deep sparse convolutional neural network, wherein a Nesterov Accelerated Gradient Descent (NAGD) algorithm is selected to optimize the weight of the deep sparse convolutional neural network, so that the network structure is optimal, and the generalization of the face emotion recognition algorithm is improved. The NAGD has a predictive capability to predictably prevent the algorithm from advancing too fast or too slow, while at the same time enhancing the response capability of the algorithm and obtaining a better local optimum.
The technical scheme adopted by the invention for solving the technical problem is as follows: the method for recognizing the human face emotion based on the deep sparse convolution neural network comprises the following steps of:
(1) preprocessing the emotion image: firstly, carrying out rotation correction and face cutting processing on an emotion image sample to be recognized, extracting an emotion characteristic key area, normalizing the image to be of a uniform size, and then carrying out histogram equalization on the emotion image to obtain a preprocessed emotion image;
(2) extracting emotional characteristics: firstly, extracting main component emotional characteristics of a preprocessed emotional image based on a PCA method to obtain characteristic data of different emotions; then, whitening the extracted feature data to obtain a PCA feature map of the emotion image to be recognized;
(3) and (3) emotion feature identification and classification: the method comprises the steps of constructing a deep sparse convolution neural network consisting of a convolution layer, a sub-sampling layer, a Dropout layer and a Softmax regression layer, firstly inputting a PCA characteristic diagram of a training set and a label value corresponding to emotion into the deep sparse convolution neural network, optimizing the deep sparse convolution neural network by adopting a Nesterov accelerated gradient descent algorithm, then inputting the PCA characteristic diagram of an emotion image to be recognized into the deep sparse convolution neural network, and outputting a recognition result, namely the label value corresponding to the emotion type.
The emotion image preprocessing in the step (1) specifically comprises the following processes:
(1-1) calibrating three feature points of two eyes and a nose tip in the emotion image to obtain coordinate values of the three feature points;
(1-2) rotating the emotion image according to the coordinate values of the left eye and the right eye, enabling the two eyes to be on the same horizontal line, setting the interval between the two eyes to be d, and setting the middle point of the interval to be O;
(1-3) cutting the face according to the facial feature points: with O as the reference, taking a width d to both the left and the right in the horizontal direction, and 0.5d upward and 1.5d downward in the vertical direction, to cut out a rectangular region; this rectangular region is the face emotion sub-region;
(1-4) transforming the scale of the face emotion subarea into a uniform size of 128 x 128 pixels;
and (1-5) carrying out histogram equalization on the emotion subarea of the human face to obtain a preprocessed emotion image.
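For illustration only (not part of the claimed method), the following is a minimal sketch of the preprocessing pipeline of steps (1-1)-(1-5), assuming OpenCV and NumPy and assuming the two eye coordinates have already been calibrated; function and variable names are hypothetical.

```python
import cv2
import numpy as np

def preprocess_emotion_image(img_gray, left_eye, right_eye):
    """img_gray: 2-D uint8 array; left_eye/right_eye: (x, y) pixel coordinates."""
    # Rotate so that both eyes lie on the same horizontal line.
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))
    center = ((left_eye[0] + right_eye[0]) / 2.0, (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated = cv2.warpAffine(img_gray, M, (img_gray.shape[1], img_gray.shape[0]))

    # Crop around the eye midpoint O: width d to each side, 0.5d above, 1.5d below.
    d = np.hypot(dx, dy)
    ox, oy = int(round(center[0])), int(round(center[1]))
    x0, x1 = int(ox - d), int(ox + d)
    y0, y1 = int(oy - 0.5 * d), int(oy + 1.5 * d)
    face = rotated[max(y0, 0):y1, max(x0, 0):x1]

    # Normalize to 128 x 128 pixels and equalize the histogram.
    face = cv2.resize(face, (128, 128))
    return cv2.equalizeHist(face)
```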
The extraction of the emotional features in the step (2) specifically comprises the following steps:
(2-1) Mean normalization: subtract the average brightness value μ of the emotion image so that the feature means of the data in the emotion image are around 0. Specifically:
the preprocessed emotion image data of size 128 × 128 are stored in a 128 × 128 matrix, i.e. {x′^{(1)}, x′^{(2)}, …, x′^{(n)}}, x′^{(i)} ∈ R^n with n = 128, and each preprocessed emotion image is zero-averaged using equations (1) and (2):
μ = (1/n) Σ_{i=1}^{n} x′^{(i)}    (1)
x′^{(i)} = x′^{(i)} − μ    (2)
(2-2) Compute the eigenvector matrix U of the covariance matrix Σ of the zero-averaged emotion image, where Σ is computed as:
Σ = (1/n) Σ_{i=1}^{n} (x′^{(i)}) (x′^{(i)})^T    (3)
the emotion image pixel values x′ are then expressed in the basis {u_1, u_2, …, u_n} given by the columns of U:
x′_rot = U^T x′ = [u_1^T x′, u_2^T x′, …, u_n^T x′]^T    (4)
(2-3) Select the first k principal components of x′_rot so as to retain 99% of the variance, i.e., choose the minimum k satisfying equation (5):
(Σ_{j=1}^{k} λ_j) / (Σ_{j=1}^{n} λ_j) ≥ 0.99    (5)
where λ_j denotes the jth eigenvalue corresponding to the eigenvector matrix U;
(2-4) In x′_rot, keep the k principal components to be preserved and set the remaining components to zero; the result x̃′ is an approximate representation of x′_rot, expressed as:
x̃′ = [x′_{rot,1}, …, x′_{rot,k}, 0, …, 0]^T ≈ [x′_{rot,1}, …, x′_{rot,k}, x′_{rot,k+1}, …, x′_{rot,n}]^T = x′_rot    (6)
(2-5) Scale x̃′ to remove the correlation between individual features so that all features have unit variance (PCA whitening):
x′_{PCAwhite,i} = x̃′_i / √(λ_i + ε)    (7)
(2-6) Apply ZCA whitening with the eigenvector matrix U so that the covariance matrix of the emotion image becomes the identity matrix I:
x′_ZCAwhite = U x′_PCAwhite    (8)
x′_ZCAwhite is the PCA feature map of the emotion image to be recognized.
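As an illustration of steps (2-1)-(2-6), the following minimal sketch (not part of the claimed method) applies PCA/ZCA whitening to one preprocessed 128 × 128 emotion image using NumPy; the epsilon value and function names are assumptions.

```python
import numpy as np

def zca_whiten(x, var_to_keep=0.99, eps=1e-5):
    """x: 2-D array (e.g. 128 x 128); returns the ZCA-whitened feature map."""
    x = x.astype(np.float64)
    x = x - x.mean()                              # subtract average brightness, cf. eqs. (1)-(2)
    sigma = x @ x.T / x.shape[1]                  # covariance estimate, eq. (3)
    lam, U = np.linalg.eigh(sigma)                # eigenvalues / eigenvectors of sigma
    order = np.argsort(lam)[::-1]                 # sort by decreasing eigenvalue
    lam, U = lam[order], U[:, order]
    x_rot = U.T @ x                               # project onto the eigenbasis, eq. (4)

    # Keep the first k components that retain 99% of the variance, eq. (5).
    ratio = np.cumsum(lam) / np.sum(lam)
    k = int(np.searchsorted(ratio, var_to_keep) + 1)
    x_rot[k:, :] = 0.0                            # zero the remaining components, eq. (6)

    x_pca_white = x_rot / np.sqrt(lam[:, None] + eps)   # unit variance, eq. (7)
    return U @ x_pca_white                        # ZCA whitening, eq. (8)
```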
The label values 1-7 in step (3) correspond one-to-one to the 7 emotion classes: anger, disgust, fear, happiness, neutral, sadness and surprise.
The emotion feature identification and classification in the step (3) specifically comprises the following processes:
(3-1) creating a deep sparse convolutional neural network sequentially consisting of a convolutional layer, a sub-sampling layer, a Dropout layer and a Softmax regression layer, and inputting the training set data into the deep sparse convolutional neural network, wherein the training set data comprise the PCA feature maps of the training set and the label values of the corresponding emotions, i.e. {(x_1, y_1), …, (x_m, y_m)} with y_i ∈ {1, 2, …, k}, where x_i is a PCA feature map of the training set and y_i is the emotion label value corresponding to x_i, i ∈ {1, 2, …, m}; the deep sparse convolutional neural network is iteratively trained with the NAGD algorithm, which comprises the following processes:
(3-1-1) randomly shuffling the training set data, grouping the data in the training set, wherein the number of the data in each group is consistent, and sequentially inputting each group into a deep sparse convolution neural network;
(3-1-2) each group of training set data first passes through the convolutional layer, which has 100 convolution kernels of size 29 × 29 with a stride of 1; the deep sparse convolutional neural network mines local correlation information in the PCA feature maps of the training set through these convolution kernels, and the convolutional layer is implemented as:
a_{i,k} = f(x_i * rot90(W_k, 2) + b_k)    (9)
where a_{i,k} is the convolution feature map obtained by convolving ('valid' convolution) the ith PCA feature map x_i of the training set with the kth convolution kernel of the convolutional layer, W_k is the weight of the kth convolution kernel, b_k is the bias corresponding to the kth convolution kernel, and f(·) is a Sigmoid activation function:
f(x) = 1 / (1 + e^{−x})    (10)
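A minimal sketch (not part of the claimed method) of the convolution of equation (9) with the Sigmoid activation of equation (10), assuming NumPy and SciPy; the kernel W_k and bias b_k shown are hypothetical randomly initialized values.

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_layer_single_kernel(x_i, W_k, b_k):
    """x_i: input feature map; W_k: 29 x 29 kernel; b_k: scalar bias."""
    # x_i * rot90(W_k, 2) as a 'valid' convolution; rotating the kernel by 180 degrees
    # and convolving is equivalent to cross-correlating x_i with W_k.
    z = convolve2d(x_i, np.rot90(W_k, 2), mode="valid") + b_k
    return sigmoid(z)                     # a_{i,k} = f(x_i * rot90(W_k, 2) + b_k)

# Hypothetical usage: one kernel applied to a 128 x 128 PCA feature map.
rng = np.random.default_rng(0)
x_i = rng.standard_normal((128, 128))
W_k, b_k = rng.standard_normal((29, 29)) * 0.01, 0.0
a_ik = conv_layer_single_kernel(x_i, W_k, b_k)    # shape (100, 100)
```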
(3-1-3) inputting the convolution feature maps produced by the convolutional layer into the sub-sampling layer, which uses average pooling with a pooling size of 4 and a stride of 4, so that the side length of each pooled feature map obtained after the sub-sampling layer becomes one quarter of the original while the number of feature maps is unchanged; the average pooling uses the following formula:
c_j = f(a_{i,k} * (1/p²))    (11)
where c_j is the jth pooled feature map produced by the sub-sampling layer and p is the average pooling size;
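A minimal sketch (not part of the claimed method) of the 4 × 4 average pooling with stride 4 followed by the Sigmoid of equation (11), using NumPy; the block-mean reshape is an implementation choice equivalent to convolving with a p × p kernel of value 1/p² at stride p.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def average_pool(a_ik, p=4):
    """a_ik: convolution feature map whose side length is divisible by p."""
    h, w = a_ik.shape
    # Group the map into non-overlapping p x p blocks and average each block.
    blocks = a_ik.reshape(h // p, p, w // p, p)
    pooled = blocks.mean(axis=(1, 3))
    return sigmoid(pooled)                 # c_j = f(a_{i,k} * 1/p^2), eq. (11)

# Hypothetical usage: a 100 x 100 convolution feature map becomes 25 x 25.
c_j = average_pool(np.ones((100, 100)))
```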
(3-1-4) using the Dropout layer to reduce network over-fitting by randomly deactivating (zeroing) part of the data passing through the Dropout layer, i.e., the pooled feature maps generated in step (3-1-3), and retaining the data that are not deactivated; the computation is:
DropoutTrain(x) = RandomZero(p) × x    (12)
where DropoutTrain(x) denotes the data matrix obtained after passing through the Dropout layer in the training stage, and RandomZero(p) denotes setting values of the input data matrix x of this layer to 0 with a set probability p;
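A minimal sketch (not part of the claimed method) of the Dropout behaviour described here and in step (3-2-2): a Bernoulli mask models RandomZero(p) at training time (equation (12)), and the test-time scaling of equation (23) is included for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_train(x, p):
    """Training stage: zero each entry of x with probability p (eq. 12)."""
    mask = (rng.random(x.shape) >= p).astype(x.dtype)   # RandomZero(p)
    return mask * x

def dropout_test(c, p):
    """Test stage: scale by (1 - p) instead of masking (eq. 23)."""
    return (1.0 - p) * c

x = np.ones((25, 25))
assert np.allclose(dropout_train(x, 0.0), x)       # p = 0 disables Dropout
assert np.allclose(dropout_test(x, 0.5), 0.5 * x)  # average behaviour at test time
```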
(3-1-5) classifying and identifying the data matrix obtained after the Dropout layer using the Softmax regression layer:
(3-1-5-1) using the hypothesis function h_θ(x) to compute the probability p(y = j | x) that the data matrix obtained after the Dropout layer belongs to each expression category j, where h_θ(x) is a k-dimensional vector whose element values correspond to the probabilities of the k categories and sum to 1; h_θ(x) has the form:
h_θ(x^{(i)}) = [p(y^{(i)}=1 | x^{(i)}; θ), p(y^{(i)}=2 | x^{(i)}; θ), …, p(y^{(i)}=k | x^{(i)}; θ)]^T = (1 / Σ_{j=1}^{k} e^{θ_j^T x^{(i)}}) · [e^{θ_1^T x^{(i)}}, e^{θ_2^T x^{(i)}}, …, e^{θ_k^T x^{(i)}}]^T    (13)
where θ_1, θ_2, …, θ_k ∈ R^{n+1} are the model parameters, assigned randomly at the start of training, and x^{(i)} denotes the ith pooled feature map data in the data matrix obtained after the Dropout layer;
(3-1-5-2) the Softmax regression layer evaluates the classification effect using a cost function J (θ):
where 1{y^{(i)} = j} is an indicator function whose rule is 1{expression that is true} = 1 and 1{expression that is false} = 0, for example 1{1+1 = 3} = 0 and 1{1+1 = 2} = 1, and y^{(i)} denotes the emotion label value;
the above formula is derived to obtain a gradient formula:
where λ denotes the factor of the weight decay term in equation (15) and is a preset value;
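Since equations (14) and (15) are not reproduced above, the following sketch (not part of the claimed method) assumes the standard Softmax-regression cost with an indicator function and a weight-decay term weighted by λ, consistent with the hypothesis function of equation (13); it is an illustration only.

```python
import numpy as np

def softmax_hypothesis(theta, x):
    """theta: (k, n+1) parameters; x: (n+1,) feature vector; returns h_theta(x)."""
    scores = theta @ x
    scores -= scores.max()                      # numerical stability
    e = np.exp(scores)
    return e / e.sum()                          # eq. (13): probabilities summing to 1

def softmax_cost_grad(theta, X, y, lam):
    """X: (m, n+1) data; y: (m,) labels in {0,...,k-1}; lam: weight-decay factor."""
    m, k = X.shape[0], theta.shape[0]
    P = np.array([softmax_hypothesis(theta, x) for x in X])      # (m, k) probabilities
    Y = np.eye(k)[y]                                             # one-hot indicator 1{y = j}
    cost = -np.mean(np.sum(Y * np.log(P + 1e-12), axis=1)) + 0.5 * lam * np.sum(theta ** 2)
    grad = -(Y - P).T @ X / m + lam * theta                      # gradient w.r.t. theta, (k, n+1)
    return cost, grad
```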
(3-1-6) calculating residual errors of all layers and gradients of network parameters theta in a cost function J (W, b; x, y) in Softmax regression by using a back propagation algorithm, and specifically comprising the following steps of:
(3-1-6-1) if the l-th layer is fully connected to the l + 1-th layer, the residual calculation of the l-layer uses the following formula:
δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · f′(z^{(l)})    (16)
the gradient calculation formula of the parameter W is:
the gradient calculation formula of the parameter b is as follows:
where δ^{(l+1)} is the residual of layer l+1 in the network, J(W, b; x, y) is the cost function, (W, b) are the weight and bias parameters, and (x, y) are the training data and the label, respectively;
(3-1-6-2) if the l-th layer is a convolutional layer and the l + 1-th layer is a sub-sampling layer, the residual is propagated by the following equation:
where k is the index of the convolution kernel, and f′(·) denotes the derivative of the Sigmoid activation function evaluated at x_i * rot90(W_k, 2) + b_k, of the form f′(x) = f(x)(1 − f(x));
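A minimal sketch (not part of the claimed method) of propagating the residual of equation (16) through a fully connected layer with the Sigmoid derivative f′(x) = f(x)(1 − f(x)); the layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_residual(W_l, delta_next, z_l):
    """delta^(l) = ((W^(l))^T delta^(l+1)) * f'(z^(l)), elementwise product."""
    f = sigmoid(z_l)
    return (W_l.T @ delta_next) * f * (1.0 - f)

# Hypothetical usage with a 10 -> 5 fully connected layer.
rng = np.random.default_rng(0)
W_l, delta_next, z_l = rng.standard_normal((5, 10)), rng.standard_normal(5), rng.standard_normal(10)
delta_l = backprop_residual(W_l, delta_next, z_l)   # shape (10,)
```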
(3-1-7) based on the computed gradient of θ, NAGD uses a momentum term γ·v_{t-1} to update the parameter θ; computing θ − γ·v_{t-1} gives an approximation of the future position of the parameter θ, and the NAGD update formulas are:
v_t = γ·v_{t-1} + α·∇_θ J(θ − γ·v_{t-1})    (21)
θ = θ − v_t    (22)
where ∇_θ J(·) is the gradient with respect to the parameter θ computed from the training set (x^{(i)}, y^{(i)}), α is the learning rate, v_t is the current velocity vector, and v_{t-1} is the velocity vector of the previous iteration; α is initially set to 0.1, v_t is initially set to 0 with the same dimension as the parameter vector θ, γ ∈ (0, 1], γ is set to 0.5 at the start of training and increased to 0.95 after the first training iteration;
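A minimal sketch (not part of the claimed method) of the NAGD update of equations (21)-(22) with the schedule described above (α = 0.1, γ = 0.5 initially, then 0.95); grad_fn and the toy objective are hypothetical.

```python
import numpy as np

def nagd_step(theta, v, grad_fn, alpha=0.1, gamma=0.5):
    """One NAGD update: look ahead to theta - gamma*v, then take the momentum step."""
    lookahead = theta - gamma * v                 # approximate future position of theta
    v_new = gamma * v + alpha * grad_fn(lookahead)   # eq. (21)
    return theta - v_new, v_new                   # eq. (22): theta = theta - v_t

# Hypothetical usage: minimize J(theta) = 0.5 * ||theta||^2, whose gradient is theta.
theta, v = np.array([2.0, -3.0]), np.zeros(2)
gamma = 0.5
for it in range(50):
    theta, v = nagd_step(theta, v, lambda t: t, alpha=0.1, gamma=gamma)
    gamma = 0.95                                  # increased after the first iteration
```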
(3-1-8) returning to the step (3-1-1) until the set iteration times are reached, and finishing the training optimization of the deep sparse convolution neural network;
(3-2) inputting the PCA feature map of the emotion image to be recognized into a deep sparse convolution neural network, and recognizing and classifying the feature map:
(3-2-1) the PCA feature map of the emotion image to be recognized first passes through the convolutional layer and the sub-sampling layer: x′_ZCAwhite is substituted for the input x_i in equation (9) to obtain the convolution feature map a′_{i,k} produced by convolving the input PCA feature map of the emotion image to be recognized with the kth convolution kernel of the convolutional layer; a′_{i,k} is then substituted for a_{i,k} in equation (11) to obtain the pooled feature map c′ of the emotion image to be recognized, i.e., the high-level emotional features;
(3-2-2) when the pooled feature map c′ of the emotion image to be recognized then passes through the Dropout layer, c′ is averaged (scaled):
DropoutTest(c′) = (1 − p) × c′    (23)
where DropoutTest(c′) denotes the data matrix obtained after the pooled feature map c′ of the emotion image to be recognized passes through the Dropout layer;
(3-2-3) using the hypothesis function h_θ(x) of the Softmax regression layer to compute the probability that c′ belongs to each expression category j, and outputting the category j with the maximum probability, i.e., outputting the classification result.
The invention has the beneficial effects based on the technical scheme that:
the invention introduces the randomized sparsity in the deep sparse convolution neural network through the Dropout layer, and the network structures optimized by each training are different, so that the optimization of the weight value does not depend on the combined action of neurons with fixed relation, the joint adaptability among the neurons is weakened, and the generalization capability and the training efficiency of the network are improved. Optimizing the weight of the deep convolutional neural network by adopting NAGD to optimize the network structure; compared with the traditional gradient descent algorithm, the NAGD has the capability of predicting, predictably preventing the algorithm from advancing too fast or too slow, simultaneously enhancing the response capability of the algorithm and obtaining a better local optimal value.
Drawings
FIG. 1 is a general flow diagram of the present invention.
FIG. 2 is a schematic diagram of emotion image preprocessing.
FIG. 3 is a face emotion feature image after feature extraction based on PCA.
FIG. 4 is a schematic diagram of a deep sparse convolutional neural network.
FIG. 5 shows partial image samples from the JAFFE and CK+ databases.
FIG. 6 is a line graph of the influence of p in the Dropout layer on the recognition effect and training time.
FIG. 7 is a comparison of images before and after symmetric transformation.
FIG. 8 shows the experimental confusion matrices.
FIG. 9 is a topological structure diagram of the human-computer interaction system based on face emotion recognition.
FIG. 10 is the GUI system debugging interface.
Detailed Description
The invention is further illustrated by the following figures and examples.
The invention provides a human face emotion recognition method based on a deep sparse convolution neural network, and a general flow diagram of the method is shown in figure 1. Firstly, carrying out image preprocessing on an emotion image sample, namely correcting and cutting the direction of a human face, and carrying out histogram equalization on the emotion image sample; then extracting bottom layer emotional characteristics based on PCA; and finally, mining and learning high-level emotional features by using the constructed deep sparse convolutional neural network, identifying and classifying the high-level emotional features, and training and optimizing the network weight by using the NAGD so as to optimize the whole network structure and improve the human face emotion identification performance.
The method for recognizing the human face emotion based on the deep sparse convolution neural network can be mainly divided into three parts, namely emotion image preprocessing, emotion feature extraction and emotion feature recognition classification, and the realization process comprises the following steps:
(1) preprocessing the emotion image: as shown in fig. 2, firstly, performing rotation correction and face clipping on an emotion image sample to be recognized, extracting an emotion feature key region, normalizing the image to a uniform size, and then performing histogram equalization on the emotion image to obtain a preprocessed emotion image; the method specifically comprises the following steps:
(1-1) manually calibrating three feature points of the two eyes and the nose tip in the emotion image by using the function [x, y] = ginput(3), and obtaining coordinate values of the three feature points;
(1-2) rotating the emotion image according to the coordinate values of the left eye and the right eye, enabling the two eyes to be on the same horizontal line, setting the interval between the two eyes to be d, and setting the middle point of the interval to be O;
(1-3) cutting the face according to the facial feature points: with O as the reference, taking a width d to both the left and the right in the horizontal direction, and 0.5d upward and 1.5d downward in the vertical direction, to cut out a rectangular region; this rectangular region is the face emotion sub-region;
(1-4) transforming the scale of the face emotion subarea into a uniform size of 128 x 128 pixels;
and (1-5) carrying out histogram equalization on the emotion subarea of the human face to obtain a preprocessed emotion image.
(2) Extracting emotional characteristics: firstly, extracting main component emotional characteristics of a preprocessed emotional image based on a PCA method to obtain characteristic data which are different among different emotions and easy to process; and whitening the extracted feature data to obtain a PCA feature map of the emotion image to be recognized. The obtained face emotion image after extracting emotion features based on PCA is shown in FIG. 3, and specifically comprises the following steps:
(2-1) Mean normalization: subtract the average brightness value μ of the emotion image so that the feature means of the data in the emotion image are around 0. Specifically:
the preprocessed emotion image data of size 128 × 128 are stored in a 128 × 128 matrix, i.e. {x′^{(1)}, x′^{(2)}, …, x′^{(n)}}, x′^{(i)} ∈ R^n with n = 128, and each preprocessed emotion image is zero-averaged using equations (1) and (2):
μ = (1/n) Σ_{i=1}^{n} x′^{(i)}    (1)
x′^{(i)} = x′^{(i)} − μ    (2)
(2-2) Compute the eigenvector matrix U of the covariance matrix Σ of the zero-averaged emotion image, where Σ is computed as:
Σ = (1/n) Σ_{i=1}^{n} (x′^{(i)}) (x′^{(i)})^T    (3)
the emotion image pixel values x′ are then expressed in the basis {u_1, u_2, …, u_n} given by the columns of U:
x′_rot = U^T x′ = [u_1^T x′, u_2^T x′, …, u_n^T x′]^T    (4)
(2-3) Select the first k principal components of x′_rot so as to retain 99% of the variance, i.e., choose the minimum k satisfying equation (5):
(Σ_{j=1}^{k} λ_j) / (Σ_{j=1}^{n} λ_j) ≥ 0.99    (5)
where λ_j denotes the jth eigenvalue corresponding to the eigenvector matrix U;
(2-4) In x′_rot, keep the k principal components to be preserved and set the remaining components to zero; the result x̃′ is an approximate representation of x′_rot, expressed as:
x̃′ = [x′_{rot,1}, …, x′_{rot,k}, 0, …, 0]^T ≈ [x′_{rot,1}, …, x′_{rot,k}, x′_{rot,k+1}, …, x′_{rot,n}]^T = x′_rot    (6)
(2-5) Scale x̃′ to remove the correlation between individual features so that all features have unit variance (PCA whitening):
x′_{PCAwhite,i} = x̃′_i / √(λ_i + ε)    (7)
(2-6) Apply ZCA whitening with the eigenvector matrix U so that the covariance matrix of the emotion image becomes the identity matrix I:
x′_ZCAwhite = U x′_PCAwhite    (8)
x′_ZCAwhite is the PCA feature map of the emotion image to be recognized.
(3) Emotion feature recognition and classification: construct the deep sparse convolutional neural network shown in FIG. 4, consisting of a convolutional layer, a sub-sampling layer (pooling layer), a Dropout layer and a Softmax regression layer; the convolutional layer, pooling layer and Dropout layer mine and learn high-level emotional features, and the Softmax regression layer recognizes and classifies the learned emotional features and outputs the classification result, namely the label value corresponding to the emotion category. The label values 1-7 correspond one-to-one to the 7 emotion classes: anger, disgust, fear, happiness, neutral, sadness and surprise.
Firstly, inputting PCA characteristic graphs of a training set and label values corresponding to emotions into a deep sparse convolutional neural network, optimizing the deep sparse convolutional neural network by adopting a Nesterov accelerated gradient descent algorithm to optimize a network structure so as to improve the generalization of a face emotion recognition algorithm, and after network training is finished, storing an optimal weight of the network to obtain the optimized deep sparse convolutional neural network. And then, in a testing stage, inputting a testing set, namely a PCA characteristic diagram of the emotion image to be recognized, into the deep sparse convolution neural network, and outputting a recognition result, namely a label value corresponding to the emotion category. The method specifically comprises the following steps:
(3-1) creating a deep sparse convolutional neural network sequentially consisting of a convolutional layer, a sub-sampling layer, a Dropout layer and a Softmax regression layer, and inputting the training set data into the deep sparse convolutional neural network, wherein the training set data comprise the PCA feature maps of the training set and the label values of the corresponding emotions, i.e. {(x_1, y_1), …, (x_m, y_m)} with y_i ∈ {1, 2, …, k}, where x_i is a PCA feature map of the training set and y_i is the emotion label value corresponding to x_i, i ∈ {1, 2, …, m}; the deep sparse convolutional neural network is iteratively trained with the NAGD algorithm, which comprises the following processes:
(3-1-1) randomly shuffling the training set data, grouping the data in the training set, wherein the number of the data in each group is consistent, and sequentially inputting each group into a deep sparse convolution neural network;
(3-1-2) each group of training set data first passes through the convolutional layer, which has 100 convolution kernels of size 29 × 29 with a stride of 1; the deep sparse convolutional neural network mines local correlation information in the PCA feature maps of the training set through these convolution kernels, and the convolutional layer is implemented as:
a_{i,k} = f(x_i * rot90(W_k, 2) + b_k)    (9)
where a_{i,k} is the convolution feature map obtained by convolving ('valid' convolution) the ith PCA feature map x_i of the training set with the kth convolution kernel of the convolutional layer, W_k is the weight of the kth convolution kernel, b_k is the bias corresponding to the kth convolution kernel, and f(·) is a Sigmoid activation function:
f(x) = 1 / (1 + e^{−x})    (10)
(3-1-3) inputting the convolution feature maps produced by the convolutional layer into the sub-sampling layer, which uses average pooling with a pooling size of 4 and a stride of 4, so that the side length of each pooled feature map obtained after the sub-sampling layer becomes one quarter of the original while the number of feature maps is unchanged; the average pooling uses the following formula:
c_j = f(a_{i,k} * (1/p²))    (11)
where c_j is the jth pooled feature map produced by the sub-sampling layer and p is the average pooling size;
(3-1-4) using the Dropout layer to reduce network over-fitting by randomly deactivating (zeroing) part of the data passing through the Dropout layer, i.e., the pooled feature maps generated in step (3-1-3), and retaining the data that are not deactivated; the computation is:
DropoutTrain(x) = RandomZero(p) × x    (12)
where DropoutTrain(x) denotes the data matrix obtained after passing through the Dropout layer in the training stage, and RandomZero(p) denotes setting values of the input data matrix x of this layer to 0 with a set probability p;
(3-1-5) classifying and identifying the input data by utilizing a Softmax regression layer:
(3-1-5-1) using the hypothesis function h_θ(x) to compute the probability p(y = j | x) that the data matrix obtained after the Dropout layer belongs to each expression category j, where h_θ(x) is a k-dimensional vector whose element values correspond to the probabilities of the k categories and sum to 1; h_θ(x) has the form:
h_θ(x^{(i)}) = [p(y^{(i)}=1 | x^{(i)}; θ), p(y^{(i)}=2 | x^{(i)}; θ), …, p(y^{(i)}=k | x^{(i)}; θ)]^T = (1 / Σ_{j=1}^{k} e^{θ_j^T x^{(i)}}) · [e^{θ_1^T x^{(i)}}, e^{θ_2^T x^{(i)}}, …, e^{θ_k^T x^{(i)}}]^T    (13)
where θ_1, θ_2, …, θ_k ∈ R^{n+1} are the model parameters, assigned randomly at the start of training, and x^{(i)} denotes the ith pooled feature map data in the data matrix obtained after the Dropout layer;
(3-1-5-2) the Softmax regression layer evaluates the classification effect using a cost function J (θ):
where 1{y^{(i)} = j} is an indicator function whose rule is 1{expression that is true} = 1 and 1{expression that is false} = 0, for example 1{1+1 = 3} = 0 and 1{1+1 = 2} = 1, and y^{(i)} denotes the emotion label value;
the above formula is derived to obtain a gradient formula:
where λ denotes the factor of the weight decay term in equation (15) and is a preset value;
(3-1-6) calculating the residual of each layer and the gradient of each network parameter θ with respect to the cost function J(W, b; x, y) in the Softmax regression using the back-propagation algorithm, specifically comprising the following steps:
(3-1-6-1) if the l-th layer is fully connected to the l + 1-th layer, the residual calculation of the l-layer uses the following formula:
δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · f′(z^{(l)})    (16)
the gradient calculation formula of the parameter W is:
the gradient calculation formula of the parameter b is as follows:
where δ^{(l+1)} is the residual of layer l+1 in the network, J(W, b; x, y) is the cost function, (W, b) are the weight and bias parameters, and (x, y) are the training data and the label, respectively;
(3-1-6-2) if the l-th layer is a convolutional layer and the l + 1-th layer is a sub-sampling layer, the residual is propagated by the following equation:
where k is the index of the convolution kernel, and f′(·) denotes the derivative of the Sigmoid activation function evaluated at x_i * rot90(W_k, 2) + b_k, of the form f′(x) = f(x)(1 − f(x));
(3-1-7) based on the computed gradient of θ, NAGD uses a momentum term γ·v_{t-1} to update the parameter θ; computing θ − γ·v_{t-1} gives an approximation of the future position of the parameter θ, and the NAGD update formulas are:
v_t = γ·v_{t-1} + α·∇_θ J(θ − γ·v_{t-1})    (21)
θ = θ − v_t    (22)
where ∇_θ J(·) is the gradient with respect to the parameter θ computed from the training set (x^{(i)}, y^{(i)}), α is the learning rate, v_t is the current velocity vector, and v_{t-1} is the velocity vector of the previous iteration; α is initially set to 0.1, v_t is initially set to 0 with the same dimension as the parameter vector θ, γ ∈ (0, 1], γ is set to 0.5 at the start of training and increased to 0.95 after the first training iteration;
(3-1-8) returning to step (3-1-1) until the set number of iterations is reached, completing the training optimization of the deep sparse convolutional neural network;
(3-2) inputting the PCA feature map of the emotion image to be recognized into a deep sparse convolution neural network, and recognizing and classifying the feature map:
(3-2-1) the PCA feature map of the emotion image to be recognized first passes through the convolutional layer and the sub-sampling layer: x′_ZCAwhite is substituted for the input x_i in equation (9) to obtain the convolution feature map a′_{i,k} produced by convolving the input PCA feature map of the emotion image to be recognized with the kth convolution kernel of the convolutional layer; a′_{i,k} is then substituted for a_{i,k} in equation (11) to obtain the pooled feature map c′ of the emotion image to be recognized, i.e., the high-level emotional features;
(3-2-2) when the pooled feature map c′ of the emotion image to be recognized then passes through the Dropout layer, c′ is averaged (scaled):
DropoutTest(c′) = (1 − p) × c′    (23)
where DropoutTest(c′) denotes the data matrix obtained after the pooled feature map c′ of the emotion image to be recognized passes through the Dropout layer;
(3-2-3) using the hypothesis function h_θ(x) of the Softmax regression layer to compute the probability that c′ belongs to each expression category j, and outputting the category j with the maximum probability, i.e., outputting the classification result.
The face emotion databases used in the experiments with this method are the JAFFE and CK+ databases; partial sample images from these databases are shown in FIG. 5, in which the first row shows JAFFE samples and the second row shows CK+ samples. The JAFFE database contains 213 grayscale images of the 7 basic expressions of 10 women; each image is 256 × 256 and each person has 2 to 4 images per expression. The CK+ database was collected from 210 adults of different ethnicities and genders aged 18 to 50 and contains 326 labeled expression image sequences; each image is 640 × 490 and covers 7 expressions, namely anger, disgust, fear, happiness, sadness, surprise and contempt. The expression in the calm state is taken as the neutral expression and combined with the peak image frames of the six expressions other than contempt to form 399 images of the seven basic expressions.
80% of the JAFFE facial expression database was used as a training sample and 20% as a test sample. Fig. 6 shows a graph obtained by changing the size of the p value in the Dropout layer, and it can be seen that the training time gradually shortens and the recognition rate tends to increase as the p value increases. This indicates that when training a deep sparse convolutional neural network, selecting an appropriate p-value in the Dropout layer is beneficial to improve the generalization performance of the network and shorten the required training time. The influence of p on the training time and the recognition rate is comprehensively considered, and p is 0.5 as an optimal value, so that the time required by network training can be effectively reduced, the training efficiency is greatly improved, the network performance is also improved, and a good recognition effect can be obtained.
One problem common to deep learning algorithms is that they require a large amount of data during the training phase. However, the amount of data available in some existing public databases is insufficient for deep learning algorithms. Therefore, to increase the number of training samples without introducing duplicate samples, all original samples are symmetrically transformed, doubling the number of database samples (a sketch of this transformation follows below); an image comparison before and after symmetric transformation is shown in FIG. 7. To verify the effectiveness of the added samples, a controlled-variable experiment was set up as follows: 80% of the JAFFE facial expression database was used as training samples and 20% as test samples. Keeping the parameters of the algorithm unchanged, the proposed deep sparse convolutional neural network was trained with two training sets, consisting of the original images and of the images with the symmetric transformations added, respectively; the test sets used in the two experiments were the same. Since the Dropout layer has a significant effect on the recognition result, in order to highlight the effect of the added samples, p was set to 0 in this experiment (disabling the Dropout layer) and NAGD was used to optimize the network. The experimental results are shown in Table 1.
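The symmetric transformation mentioned above can be sketched as follows (not part of the claimed method), assuming the samples are stored as NumPy arrays; array shapes and names are hypothetical.

```python
import numpy as np

def augment_with_mirror(images, labels):
    """images: (m, 128, 128) array; labels: (m,) array; returns the doubled set."""
    mirrored = images[:, :, ::-1]                     # left-right symmetric transform
    return np.concatenate([images, mirrored]), np.concatenate([labels, labels])

imgs = np.zeros((10, 128, 128))
lbls = np.arange(10) % 7
aug_imgs, aug_lbls = augment_with_mirror(imgs, lbls)  # 20 samples after augmentation
```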
TABLE 1 comparison of the results
Table 2 shows the emotion recognition results obtained by training the deep sparse convolutional neural network with the conventional Momentum-based Stochastic Gradient Descent (MSGD) algorithm and with the NAGD algorithm. The experiments use the JAFFE database, with symmetrically transformed images added to the training samples, 1 is taken, and p is taken as 0.5. The results show that training the network with NAGD gives more stable results and a better recognition effect than training it with MSGD.
TABLE 2 NAGD and MSGD test results
In order to verify the effectiveness of the algorithm provided by the invention, experiments are respectively carried out in JAFFE and CK + databases. 80% of the JAFFE facial expression database was used as a training sample and 20% as a test sample. The CK + database has wider ranges of ages, sexes and races than the JAFFE database, and in order to better learn various emotional characteristics of various people, the CK + database uses a training set with a larger proportion than JAFFE, namely 90% of images in the database are selected as the training set, and the rest 10% of images are selected as the test set. The experimental results obtained by adding a symmetrically transformed image to each picture in the training set, taking 1 and p as 0.5 are shown in table 3.
TABLE 3 identification results obtained on JAFFE and CK + databases
As can be seen from Table 3, the proposed algorithm achieved good recognition on both the JAFFE and CK+ databases, with a recognition rate of 97.62% on JAFFE and 95.12% on CK+. Dividing the training and recognition times reported in the table by the number of images in the training and test sets, the training time per image averaged 0.6757 seconds and the recognition time per image averaged 0.1258 seconds. The recognition rate and misclassifications of each type of expression for the two experiments are shown in the confusion matrices of FIG. 8, in which AN., DI., FE., HA., NE., SA. and SU. correspond to the seven basic expressions anger, disgust, fear, happiness, neutral, sadness and surprise, respectively.
The invention builds a set of human-computer interaction system based on a human face emotion recognition algorithm, the human-computer interaction system mainly comprises a wheeled robot, an emotion calculation workstation, a router, data transmission equipment and the like, and a topological structure diagram is shown in figure 9. The system firstly acquires human face emotion image frame data through a Kinect configured on the wheeled robot, then transmits the data to an emotion calculation workstation, the workstation inputs the data to a trained human face emotion recognition system for recognition, and finally the workstation feeds back a recognition result to the wheeled robot, so that the wheeled robot can realize natural and harmonious interaction with a human.
A GUI interface is built on a system debugging interface of the human-computer interaction system for debugging through MATLAB 2016a, and a schematic diagram of the GUI interface is shown in FIG. 10. In a GUI system debugging interface, clicking an image preview button, calling a Kinect color camera by the system, and displaying a captured image on an image window on the left side of the GUI interface in real time; clicking an emotion recognition button, acquiring a currently captured image and displaying the currently captured image on an image window on the right side of a GUI interface, then manually acquiring coordinates of two eyes and a nose tip so as to correct and cut a face, and inputting the cut face image into a trained deep convolutional neural network for face emotion recognition; and feeding back the final recognition result to the GUI interface and displaying the final recognition result.
Two groups of image frames of the 7 basic expressions of 3 individuals were collected as a training set and input into the deep convolutional neural network for training; the image frames captured by the Kinect were then input into the trained network for recognition. Table 4 shows the online recognition results of the 7 basic expressions for the 3 persons.
table 4 application test results
As can be seen from the table, the average recognition rate of the three groups of experiments is 76.190%, which shows the prospect of the invention in practical application.

Claims (5)

1. A face emotion recognition method based on a deep sparse convolution neural network is characterized by comprising the following steps:
(1) preprocessing the emotion image: firstly, carrying out rotation correction and face cutting processing on an emotion image sample to be recognized, extracting an emotion characteristic key area, normalizing the image to be of a uniform size, and then carrying out histogram equalization on the emotion image to obtain a preprocessed emotion image;
(2) extracting emotional characteristics: firstly, extracting main component emotional characteristics of a preprocessed emotional image based on a PCA method to obtain characteristic data of different emotions; then, whitening the extracted feature data to obtain a PCA feature map of the emotion image to be recognized;
(3) and (3) emotion feature identification and classification: the method comprises the steps of constructing a deep sparse convolution neural network consisting of a convolution layer, a sub-sampling layer, a Dropout layer and a Softmax regression layer, firstly inputting a PCA characteristic diagram of a training set and a label value corresponding to emotion into the deep sparse convolution neural network, optimizing the deep sparse convolution neural network by adopting a Nesterov accelerated gradient descent algorithm, then inputting the PCA characteristic diagram of an emotion image to be recognized into the deep sparse convolution neural network, and outputting a recognition result, namely the label value corresponding to the emotion type.
2. The facial emotion recognition method based on the deep sparse convolutional neural network of claim 1, wherein: the emotion image preprocessing in the step (1) specifically comprises the following processes:
(1-1) calibrating three feature points of two eyes and a nose tip in the emotion image to obtain coordinate values of the three feature points;
(1-2) rotating the emotion image according to the coordinate values of the left eye and the right eye, enabling the two eyes to be on the same horizontal line, setting the interval between the two eyes to be d, and setting the middle point of the interval to be O;
(1-3) cutting the face according to the facial feature points: with O as the reference, taking a width d to both the left and the right in the horizontal direction, and 0.5d upward and 1.5d downward in the vertical direction, to cut out a rectangular region; this rectangular region is the face emotion sub-region;
(1-4) transforming the scale of the face emotion subarea into a uniform size of 128 x 128 pixels;
and (1-5) carrying out histogram equalization on the emotion subarea of the human face to obtain a preprocessed emotion image.
3. The facial emotion recognition method based on the deep sparse convolutional neural network of claim 1, wherein: the extraction of the emotional features in the step (2) specifically comprises the following steps:
(2-1) performing mean normalization by subtracting the average brightness value mu of the emotion image to make the characteristic mean values of data in the emotion image be around 0, specifically comprising the following steps:
the preprocessed emotion image data of size 128 × 128 are stored in a 128 × 128 matrix, i.e. {x′^{(1)}, x′^{(2)}, …, x′^{(n)}}, x′^{(i)} ∈ R^n with n = 128, and each preprocessed emotion image is zero-averaged using equations (1) and (2):
μ = (1/n) Σ_{i=1}^{n} x′^{(i)}    (1)
x′^{(i)} = x′^{(i)} − μ    (2)
(2-2) computing the eigenvector matrix U of the covariance matrix Σ of the zero-averaged emotion image, where Σ is computed as:
Σ = (1/n) Σ_{i=1}^{n} (x′^{(i)}) (x′^{(i)})^T    (3)
the emotion image pixel values x′ are then expressed in the basis {u_1, u_2, …, u_n} given by the columns of U:
x′_rot = U^T x′ = [u_1^T x′, u_2^T x′, …, u_n^T x′]^T    (4)
(2-3) selecting the first k principal components of x′_rot so as to retain 99% of the variance, i.e., choosing the minimum k satisfying equation (5):
(Σ_{j=1}^{k} λ_j) / (Σ_{j=1}^{n} λ_j) ≥ 0.99    (5)
where λ_j denotes the jth eigenvalue corresponding to the eigenvector matrix U;
(2-4) in x′_rot, keeping the k principal components to be preserved and setting the remaining components to zero; the result x̃′ is an approximate representation of x′_rot, expressed as:
x̃′ = [x′_{rot,1}, …, x′_{rot,k}, 0, …, 0]^T ≈ [x′_{rot,1}, …, x′_{rot,k}, x′_{rot,k+1}, …, x′_{rot,n}]^T = x′_rot    (6)
(2-5) scaling x̃′ to remove the correlation between individual features so that all features have unit variance (PCA whitening):
x′_{PCAwhite,i} = x̃′_i / √(λ_i + ε)    (7)
(2-6) applying ZCA whitening with the eigenvector matrix U so that the covariance matrix of the emotion image becomes the identity matrix I:
x′_ZCAwhite = U x′_PCAwhite    (8)
x′_ZCAwhite is the PCA feature map of the emotion image to be recognized.
4. The facial emotion recognition method based on the deep sparse convolutional neural network of claim 1, wherein: the label values 1-7 in step (3) correspond one-to-one to the 7 emotion classes: anger, disgust, fear, happiness, neutral, sadness and surprise.
5. The facial emotion recognition method based on the deep sparse convolutional neural network of claim 3, wherein: the emotion feature identification and classification in the step (3) specifically comprises the following processes:
(3-1) creating a deep sparse convolutional neural network sequentially consisting of a convolutional layer, a sub-sampling layer, a Dropout layer and a Softmax regression layer, and inputting the training set data into the deep sparse convolutional neural network, wherein the training set data comprise the PCA feature maps of the training set and the label values of the corresponding emotions, i.e. {(x_1, y_1), …, (x_m, y_m)} with y_i ∈ {1, 2, …, k}, where x_i is a PCA feature map of the training set and y_i is the emotion label value corresponding to x_i, i ∈ {1, 2, …, m}; the deep sparse convolutional neural network is iteratively trained with the NAGD algorithm, which comprises the following processes:
(3-1-1) randomly shuffling the training set data, grouping the data in the training set, wherein the number of the data in each group is consistent, and sequentially inputting each group into a deep sparse convolution neural network;
(3-1-2) each group of training set data first passes through the convolutional layer, which has 100 convolution kernels of size 29 × 29 with a stride of 1; the deep sparse convolutional neural network mines local correlation information in the PCA feature maps of the training set through these convolution kernels, and the convolutional layer is implemented as:
a_{i,k} = f(x_i * rot90(W_k, 2) + b_k)    (9)
where a_{i,k} is the convolution feature map obtained by convolving ('valid' convolution) the ith PCA feature map x_i of the training set with the kth convolution kernel of the convolutional layer, W_k is the weight of the kth convolution kernel, b_k is the bias corresponding to the kth convolution kernel, and f(·) is a Sigmoid activation function:
f(x) = 1 / (1 + e^{-x})    (10)
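A minimal sketch (assuming NumPy/SciPy and a single-channel input) of the 'valid' convolution with a 180°-rotated kernel and the Sigmoid activation of equations (9)-(10):

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                      # eq. (10)

def conv_layer(x, W, b):
    """x: input map (H x W); W: kernels (K x 29 x 29); b: biases (K,).
    Returns the K convolution feature maps a_{i,k} of eq. (9)."""
    maps = []
    for k in range(W.shape[0]):
        # convolving with rot90(W_k, 2) realizes x_i * rot90(W_k, 2)
        a = convolve2d(x, np.rot90(W[k], 2), mode='valid') + b[k]
        maps.append(sigmoid(a))
    return np.stack(maps)
```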
(3-1-3) inputting the convolution feature maps generated by the convolutional layer into the sub-sampling layer, which uses average pooling with pooling dimension 4 and stride 4; after the sub-sampling layer the pooled feature map becomes one quarter of the original size and the number of feature maps is unchanged; average pooling uses the following formula:
c_j = f(a_{i,k} * \frac{1}{p^2})    (11)
where c_j is the j-th pooling feature map generated by the sub-sampling layer, and p is the average pooling dimension;
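A sketch of the 4 × 4 average pooling with stride 4; following equation (11) as written, the activation f(·) is applied to the pooled values, and divisibility of the map size by p is an assumption:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def average_pool(a, p=4):
    """p x p average pooling with stride p, then f(.), per eq. (11)."""
    H, W = a.shape
    trimmed = a[:H - H % p, :W - W % p]
    pooled = trimmed.reshape(H // p, p, W // p, p).mean(axis=(1, 3))
    return sigmoid(pooled)
```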
(3-1-4) using the Dropout layer to reduce overfitting of the network: part of the data passing through the Dropout layer, i.e. the pooling feature maps generated in step (3-1-3), is randomly deactivated (set to zero), and the remaining data is kept; the calculation is as follows:
DropoutTrain(x) = RandomZero(p) \times x    (12)
where DropoutTrain(x) is the data matrix obtained after the Dropout layer in the training stage, and RandomZero(p) sets values of the data matrix x input to this layer to 0 with the set probability p;
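A sketch of the training-stage Dropout of equation (12); p = 0.5 is only an assumed value for the set probability:

```python
import numpy as np

def dropout_train(x, p=0.5, rng=None):
    """RandomZero(p) x: zero each entry of x with probability p, eq. (12)."""
    rng = np.random.default_rng() if rng is None else rng
    mask = (rng.random(x.shape) >= p).astype(x.dtype)   # 0 with probability p
    return mask * x
```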
(3-1-5) classifying and identifying the rows of the data matrix obtained after the Dropout layer using the Softmax regression layer:
(3-1-5-1) using the hypothesis function h_θ(x) to calculate the probability p(y = j | x) that the data matrix obtained after the Dropout layer belongs to each expression category j; h_θ(x) is a k-dimensional vector, each element of which is the probability of one of the k categories, and the elements sum to 1; h_θ(x) has the form:
h_\theta(x^{(i)}) = [p(y^{(i)}=1 | x^{(i)}; \theta), p(y^{(i)}=2 | x^{(i)}; \theta), \ldots, p(y^{(i)}=k | x^{(i)}; \theta)]^T = \frac{1}{\sum_{j=1}^{k} e^{\theta_j^T x^{(i)}}} [e^{\theta_1^T x^{(i)}}, e^{\theta_2^T x^{(i)}}, \ldots, e^{\theta_k^T x^{(i)}}]^T    (13)
where θ_1, θ_2, ..., θ_k ∈ R^{n+1} are the parameters of the model, assigned randomly at the start of training; x^{(i)} denotes the i-th pooling feature map in the data matrix obtained after the Dropout layer;
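A sketch of the hypothesis function of equation (13); subtracting the maximum score is only a numerical-stability device and does not change the probabilities:

```python
import numpy as np

def softmax_hypothesis(theta, x):
    """h_theta(x): theta is (k, n+1), x is (n+1,); returns the k class
    probabilities of eq. (13), which sum to 1."""
    scores = theta @ x
    scores -= scores.max()        # stabilisation only
    e = np.exp(scores)
    return e / e.sum()
```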
(3-1-5-2) the Softmax regression layer evaluates the classification effect using a cost function J (θ):
J(\theta) = -\frac{1}{m} [\sum_{i=1}^{m} \sum_{j=1}^{k} 1\{y^{(i)} = j\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}}] + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^2    (14)
where 1{y^{(i)} = j} is the indicator function, which takes the value 1 when the expression in braces is true and 0 otherwise, e.g. 1{1+1=3} = 0 and 1{1+1=2} = 1; y^{(i)} is the emotion label value;
differentiating the above formula gives the gradient formula:
\nabla_{\theta_j} J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} [x^{(i)} (1\{y^{(i)} = j\} - p(y^{(i)} = j | x^{(i)}; \theta))] + \lambda \theta_j    (15)
where λ is the factor of the weight decay term in equation (15) and is a preset value;
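A sketch of the cost function (14) and its gradient (15); shifting the labels to 0..k-1 for indexing is an assumption of the sketch:

```python
import numpy as np

def softmax_cost_grad(theta, X, y, lam):
    """theta: (k, n+1); X: (m, n+1); y: (m,) with labels 0..k-1.
    Returns J(theta) of eq. (14) and the gradient of eq. (15)."""
    m, k = X.shape[0], theta.shape[0]
    scores = X @ theta.T
    scores -= scores.max(axis=1, keepdims=True)
    p = np.exp(scores)
    p /= p.sum(axis=1, keepdims=True)              # p(y = j | x; theta)
    ind = np.zeros((m, k))
    ind[np.arange(m), y] = 1.0                     # indicator 1{y(i) = j}
    cost = -np.sum(ind * np.log(p)) / m + 0.5 * lam * np.sum(theta ** 2)
    grad = -(ind - p).T @ X / m + lam * theta      # row j is the gradient w.r.t. theta_j
    return cost, grad
```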
(3-1-6) calculating the residuals of each layer and the gradients of the network parameters θ for the cost function J(W, b; x, y) of the Softmax regression using the back-propagation algorithm, specifically:
(3-1-6-1) if the l-th layer is fully connected to the l + 1-th layer, the residual calculation of the l-layer uses the following formula:
\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot f'(z^{(l)})    (16)
the gradient calculation formula of the parameter W is:
\nabla_{W^{(l)}} J(W, b; x, y) = \delta^{(l+1)} (a^{(l)})^T    (17)
the gradient calculation formula of the parameter b is as follows:
\nabla_{b^{(l)}} J(W, b; x, y) = \delta^{(l+1)}    (18)
where δ^{(l+1)} is the residual of layer l+1 in the network, J(W, b; x, y) is the cost function, (W, b) are the weight and bias (threshold) parameters, and (x, y) are the training data and label, respectively;
(3-1-6-2) if the l-th layer is a convolutional layer and the l + 1-th layer is a sub-sampling layer, the residual is propagated by the following equation:
\delta_k^{(l)} = upsample((W_k^{(l)})^T \delta_k^{(l+1)}) \cdot f'(z_k^{(l)})    (19)
where k is the index of the convolution kernel, z_k^{(l)} denotes x_i * rot90(W_k, 2) + b_k, and f'(·) is the derivative of the Sigmoid activation function, of the form:
f'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = f(x)(1 - f(x))    (20)
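A sketch of propagating the residual from the sub-sampling layer back to the convolutional layer in the spirit of equation (19); expressing f'(z) through the stored activation a = f(z) uses equation (20), and spreading each pooled residual evenly over its p × p window with a 1/p² factor is an assumption about upsample(·) (the weight transpose of eq. 19 is omitted here):

```python
import numpy as np

def sigmoid_prime_from_activation(a):
    """f'(z) written via a = f(z): f(z)(1 - f(z)), eq. (20)."""
    return a * (1.0 - a)

def upsample(delta, p=4):
    """Spread each pooled residual over its p x p window."""
    return np.kron(delta, np.ones((p, p))) / (p * p)

def conv_layer_residual(delta_pool, a_conv, p=4):
    """Residual of the convolutional layer given the sub-sampling residual (eq. 19)."""
    return upsample(delta_pool, p) * sigmoid_prime_from_activation(a_conv)
```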
(3-1-7) according to the calculated gradient of θ, NAGD uses a momentum term γv_{t-1} to update the parameter θ; computing θ - γv_{t-1} gives an approximation of the future position of the parameter θ; the update formulas of NAGD are:
v_t = \gamma v_{t-1} + \alpha \nabla_\theta J(\theta - \gamma v_{t-1}; x^{(i)}, y^{(i)})    (21)
\theta = \theta - v_t    (22)
where ∇_θ J(θ; x^{(i)}, y^{(i)}) is the gradient of the parameter θ computed from the training sample (x^{(i)}, y^{(i)}), α is the learning rate, v_t is the current velocity vector and v_{t-1} is the velocity vector of the previous iteration; α is initially set to 0.1, v_t is initialized to 0 with the same dimension as the parameter vector θ, and γ ∈ (0, 1]; γ is set to 0.5 at the start of training and increased to 0.95 after the first training iteration;
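A sketch of one NAGD update, equations (21)-(22); grad_fn stands for the gradient computation (e.g. equation (15)) and is an assumed interface:

```python
import numpy as np

def nagd_step(theta, v, grad_fn, x_i, y_i, alpha=0.1, gamma=0.5):
    """One Nesterov accelerated gradient descent step.
    The gradient is evaluated at the look-ahead point theta - gamma * v."""
    lookahead = theta - gamma * v
    v_new = gamma * v + alpha * grad_fn(lookahead, x_i, y_i)   # eq. (21)
    theta_new = theta - v_new                                  # eq. (22)
    return theta_new, v_new
```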
(3-1-8) returning to step (3-1-1) until the set number of iterations is reached, completing the training and optimization of the deep sparse convolutional neural network;
(3-2) inputting the PCA feature map of the emotion image to be recognized into a deep sparse convolution neural network, and recognizing and classifying the feature map:
(3-2-1) the PCA feature map of the emotion image to be identified first passes through the convolutional layer and the sub-sampling layer: x'_{ZCAwhite} is substituted for the input x_i in equation (9) to obtain the convolution feature map a'_{i,k} produced by the k-th convolution kernel of the convolutional layer from the PCA feature map of the emotion image to be identified;
then a'_{i,k} is substituted for a_{i,k} in equation (11) to obtain the pooling feature map c' of the emotion image to be identified, i.e. the high-level emotion features;
(3-2-2) when the pooling feature map c' of the emotion image to be recognized passes through the Dropout layer, c' is averaged as follows:
DropoutTest(c') = (1 - p) \times c'    (23)
where DropoutTest(c') is the data matrix obtained after the pooling feature map c' of the emotion image to be identified passes through the Dropout layer;
(3-2-3) using the hypothesis function h_θ(x) of the Softmax regression layer to calculate the probability of c' belonging to each expression category j, and outputting the category j with the largest probability, i.e. outputting the classification result.
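A sketch of the test-stage Dropout scaling of equation (23) followed by the Softmax classification; flattening the pooled features into a single vector and omitting the bias term are simplifying assumptions:

```python
import numpy as np

def dropout_test(c, p=0.5):
    """Test-time averaging: scale by (1 - p), eq. (23)."""
    return (1.0 - p) * c

def classify(theta, c, p=0.5):
    """Return the class probabilities and the most probable label in 1..7."""
    x = dropout_test(c, p).ravel()
    scores = theta @ x
    scores -= scores.max()
    probs = np.exp(scores)
    probs /= probs.sum()
    return probs, int(np.argmax(probs)) + 1
```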
