
CN114582002B - Facial expression recognition method combining attention module and second-order pooling mechanism - Google Patents


Info

Publication number
CN114582002B
CN114582002B
Authority
CN
China
Prior art keywords
face
image
facial expression
coordinates
layer
Prior art date
Legal status
Active
Application number
CN202210403298.5A
Other languages
Chinese (zh)
Other versions
CN114582002A (en)
Inventor
周婷 (Zhou Ting)
陈劲全 (Chen Jinquan)
余卫宇 (Yu Weiyu)
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology (SCUT)
Priority to CN202210403298.5A
Publication of CN114582002A
Application granted
Publication of CN114582002B
Status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a facial expression recognition method combining an attention module and a second-order pooling mechanism, and relates to the field of deep learning. The method comprises the following steps: acquiring a face image; preprocessing the face image, wherein the preprocessing comprises face detection and alignment, data enhancement and image normalization; and extracting features from the preprocessed face image to complete expression classification. The invention effectively converts face pictures disturbed by illumination, head pose, occlusion and other factors in natural environments into front-facing pictures with proper contrast and no occlusion, thereby alleviating the interference that expression-independent factor variables in real-world environments cause to expression recognition.

Description

Facial expression recognition method combining attention module and second-order pooling mechanism
Technical Field
The invention relates to the field of deep learning, in particular to a facial expression recognition method combining an attention module and a second-order pooling mechanism.
Background
Facial expression is an important channel for conveying information in human communication. Advances in facial expression recognition can effectively promote related fields such as pattern recognition and image processing, and the technology has high research value; its application scenarios include security monitoring, fatigue-driving monitoring, criminal investigation and human-computer interaction. With the rapid growth of large-scale image data and computer hardware (especially GPUs), deep learning has achieved breakthrough results in image understanding: deep neural networks have strong feature expression capability, can learn discriminative features, and are gradually being applied to automatic facial expression recognition tasks. According to the type of data processed, deep facial expression recognition methods can be roughly divided into two main categories: static-image-based and video-based deep facial expression recognition networks.
Current advanced static-image-based deep facial expression recognition methods mainly include diversified network inputs, cascade networks, multi-task networks, multi-network fusion and generative adversarial networks, while video-based deep facial expression recognition mainly uses basic temporal networks such as LSTM and C3D to analyze the temporal information carried in video sequences, or uses facial key point trajectories to capture the dynamic changes of face components in consecutive frames, fusing spatial and temporal networks in parallel. In addition, by combining other expression models such as facial action units and other multimedia modalities such as audio and human physiological signals, expression recognition can be extended to scenarios of greater practical value.
Since 2013, expression recognition competitions such as FER2013 and EmotiW have collected relatively abundant training samples from challenging real-world scenes, facilitating the transition of facial expression recognition from laboratory-controlled environments to natural environments. In terms of study objects, the field is rapidly developing from posed laboratory expressions to spontaneous real-world expressions, from long-lasting exaggerated expressions to transient micro-expressions, and from basic expression classification to complex expression analysis.
As facial expression recognition tasks gradually shift from laboratory-controlled environments to challenging real-world environments, current deep facial expression recognition systems must address several issues:
1) overfitting caused by a lack of sufficient training data;
2) interference caused by expression-independent factor variables (such as illumination, head pose and identity) in real-world environments;
3) how to improve the recognition accuracy of facial expression recognition systems in real environments.
Disclosure of Invention
In view of the above, the present invention provides a facial expression recognition method combining an attention module and a second order pooling mechanism to solve the above-mentioned problems in the background art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a facial expression recognition method combining an attention module and a second-order pooling mechanism comprises the following steps:
Acquiring a face image;
Preprocessing a face image, wherein the face image preprocessing comprises face detection and alignment, data enhancement and image normalization;
And extracting the characteristics of the preprocessed face image, and finishing expression classification.
Optionally, the face detection and alignment includes face detection, key point positioning and face alignment, specifically:
the face detection module takes the facial expression picture as input and outputs the detected face region;
the face key point coordinates are located within the detected face region: a face key point detection interface in the dlib library loads a five-point key point detection model to obtain the coordinates of the five facial key points;
and face alignment is carried out using the five key point coordinates.
Optionally, the face alignment calculation process is as follows:
First, the center coordinates of the left and right eyes are calculated from the four eye-corner coordinates (x1, y1), (x2, y2), (x3, y3), (x4, y4):
(x_le, y_le) = ((x1 + x2)/2, (y1 + y2)/2), (x_re, y_re) = ((x3 + x4)/2, (y3 + y4)/2)
After the two eye centers are obtained, they are connected and the angle θ between this line and the horizontal is calculated; the rotation center (x_center, y_center) is then obtained by averaging the three points given by the left eye center, the right eye center and the under-nose point (x5, y5):
x_center = (x_le + x_re + x5)/3, y_center = (y_le + y_re + y5)/3
Combining the rotation center (x_center, y_center) and the angle θ, the affine transformation matrix is obtained from OpenCV's affine-matrix interface, and OpenCV's warp function is then called to apply the affine transformation to the image, yielding a face-aligned photo.
Optionally, the data enhancement dynamically applies random geometric or color transformations to the input image at the data reading stage through the transforms.Compose() interface of the deep learning framework PyTorch, and the transformed images are then fed into the network for training, thereby realizing data expansion.
Optionally, the image normalization divides the pixel values of the image by 255, so that every pixel value of the normalized image lies in [0, 1].
Optionally, facial expression feature extraction is realized with an 18-layer ResNet, and a softmax layer added at the end of the network normalizes the network output into probability values over the 7 expression classes; the class with the maximum probability is the classification result.
Optionally, feature extraction and expression classification are implemented with an end-to-end deep neural network structured as follows: the first layer is a convolution layer with a 7×7 kernel and 64 channels; the second layer is a pooling layer with a 3×3 pooling kernel and 64 channels; eight residual structures fused with the convolutional attention module follow, outputting a 512-channel feature map; a second-order pooling layer is then connected to realize feature aggregation; and finally a fully connected layer and a softmax layer produce the classification result.
Compared with the prior art, the invention discloses a facial expression recognition method combining an attention module and a second-order pooling mechanism, which has the following beneficial effects:
(1) Through face detection and alignment and image normalization, face pictures disturbed by illumination, head pose, occlusion and other factors in natural environments are effectively converted into front-facing pictures with proper contrast and no occlusion, addressing the interference that expression-independent factor variables in real-world environments cause to expression recognition.
(2) Through data enhancement, operations such as random cropping, rotation, flipping, noise addition and color changes are applied dynamically to the input images at the data reading stage during network training, expanding the data to many times its original size and obtaining better data diversity. This effectively alleviates the overfitting caused by a lack of sufficient training data.
(3) The ResNet structure is improved to make it better suited to expression feature extraction: adding the convolutional block attention module (CBAM) makes the network focus more on the features of the object to be recognized, and adding the second-order pooling mechanism extracts second-order expression features that better capture how facial muscles deform, thereby boosting the network's ability to extract facial expression features and effectively improving the recognition accuracy of the facial expression recognition system in real environments.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic overall flow chart of the present invention;
FIG. 2 is a face detection result diagram of the present invention;
FIGS. 3a-3b are face key point detection result diagrams of the present invention;
FIGS. 4a-4b are before-and-after diagrams of the face alignment of the present invention;
FIG. 5 is a structure diagram of the improved ResNet of the present invention;
FIG. 6 is a diagram of the ResBlock + CBAM structure of the present invention;
FIG. 7 is a block diagram of a channel attention module of the present invention;
Fig. 8 is a block diagram of a spatial attention module according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention discloses a facial expression recognition method combining an attention module and a second-order pooling mechanism, which comprises two stages: data preprocessing, and feature extraction with expression classification. The input of the algorithm is a facial expression picture and the output is the classification result for that picture; the classification covers seven categories: anger, disgust, fear, happiness, sadness, surprise and neutral.
Data preprocessing: the data preprocessing module adopted by this scheme comprises three steps: face detection and alignment, data enhancement and image normalization. Face detection and alignment itself comprises face detection, key point positioning and face alignment. Face detection is first achieved using the HOG-plus-linear-classifier face detector interface (dlib.get_frontal_face_detector) and the CNN-based face detector interface (dlib.cnn_face_detection_model_v1) in the dlib library. The face detection module takes the facial expression picture as input and outputs the detected face region; the detection result is shown in fig. 2.
Next, according to the result of the previous step, the face key point coordinates are located: the face key point detection interface (dlib.shape_predictor) in the dlib library loads a five-point key point detection model to obtain the five key point coordinates of the face; the positions of the five points are shown in figs. 3a-3b.
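A minimal sketch of how these two dlib calls fit together (the image path and the model file name shape_predictor_5_face_landmarks.dat are assumptions, the latter being dlib's publicly distributed five-point model):

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()   # HOG + linear classifier detector
predictor = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")

img = cv2.imread("face.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):                # upsample once to find smaller faces
    shape = predictor(gray, rect)             # 5 points: 4 eye corners + under-nose point
    points = [(shape.part(i).x, shape.part(i).y) for i in range(5)]
```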
After the coordinates are obtained, the key point coordinates are used to align the face, implemented as follows: first, the center coordinates of the left and right eyes are calculated from the four eye-corner coordinates (x1, y1), (x2, y2), (x3, y3), (x4, y4):
(x_le, y_le) = ((x1 + x2)/2, (y1 + y2)/2), (x_re, y_re) = ((x3 + x4)/2, (y3 + y4)/2)
After the two eye centers are obtained, they are connected and the angle θ between this line and the horizontal is calculated; the rotation center (x_center, y_center) is then obtained by averaging the three points given by the left eye center, the right eye center and the under-nose point (x5, y5):
x_center = (x_le + x_re + x5)/3, y_center = (y_le + y_re + y5)/3
Given the rotation center coordinates and the angle θ, the affine transformation matrix can be obtained from OpenCV's affine-matrix interface, OpenCV's warp function is then called to apply the affine transformation, and the image is resized to 224×224 pixels; the result of aligning the face of FIG. 4a is shown in FIG. 4b.
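A minimal sketch of this alignment step, assuming the five points are ordered as two corners of one eye, two corners of the other, then the under-nose point (cv2.getRotationMatrix2D and cv2.warpAffine are the OpenCV interfaces referred to above):

```python
import cv2
import numpy as np

def align_face(img, points, size=224):
    p = np.asarray(points, dtype=np.float32)
    left_eye = p[0:2].mean(axis=0)                    # (x_le, y_le)
    right_eye = p[2:4].mean(axis=0)                   # (x_re, y_re)
    nose = p[4]                                       # (x5, y5)
    # angle between the eye-to-eye line and the horizontal
    theta = np.degrees(np.arctan2(right_eye[1] - left_eye[1],
                                  right_eye[0] - left_eye[0]))
    # rotation center = mean of the two eye centers and the under-nose point
    center = tuple(((left_eye + right_eye + nose) / 3.0).tolist())
    M = cv2.getRotationMatrix2D(center, theta, 1.0)   # affine transformation matrix
    rotated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
    return cv2.resize(rotated, (size, size))          # adjust to 224×224
```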
The data enhancement dynamically applies random geometric or color transformations to the input images at the data reading stage through the transforms.Compose() interface of the deep learning framework PyTorch, and the transformed images are then fed into the network for training, realizing data expansion. Image normalization divides the pixel values of the image by 255; every pixel value of the normalized image lies between 0 and 1.
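One possible composition of this pipeline (the particular transforms and their parameters are illustrative choices from the standard torchvision API, matching the operations named above):

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # random cropping
    transforms.RandomRotation(10),                        # random rotation
    transforms.RandomHorizontalFlip(),                    # random flipping
    transforms.ColorJitter(0.4, 0.4, 0.4),                # color transformation
    transforms.ToTensor(),  # also divides pixel values by 255, giving [0, 1]
])
```

Note that transforms.ToTensor() performs exactly the ÷255 normalization described above, so no separate normalization step is needed.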
Feature extraction and expression classification: this module uses an improved 18-layer ResNet, shown in fig. 5, to extract facial expression features, and a softmax layer added at the end of the network normalizes the network output into probability values over the 7 expression classes; the class with the maximum probability is the classification result, realizing expression classification.
The whole module uses an end-to-end deep neural network to realize feature extraction and expression classification; its structure is shown in fig. 5. The first layer is a convolution layer with a 7×7 kernel and 64 channels; the second layer is a pooling layer with a 3×3 pooling kernel and 64 channels; eight residual structures fused with the convolutional attention module follow, outputting a 512-channel feature map; a second-order pooling layer is connected to realize feature aggregation; and finally a fully connected layer and a softmax layer produce the classification result. The network's input is the facial expression picture obtained after the data preprocessing of step 1; the multi-layer convolution and pooling layers extract facial expression features from low level to high level, the fully connected layer then maps the features to a 1×7 vector, and the softmax layer normalizes the seven class scores into the final classification probabilities.
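A minimal sketch of the classification head described above (assuming the backbone and pooling already yield one feature vector per image; the 512-dim input here is a stand-in for whatever the pooling layer outputs):

```python
import torch
import torch.nn as nn

LABELS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

head = nn.Linear(512, 7)              # fully connected layer: features -> 7 class scores
features = torch.randn(1, 512)        # stand-in for pooled backbone features
probs = torch.softmax(head(features), dim=1)  # softmax: scores -> probabilities
print(LABELS[probs.argmax(dim=1).item()])     # the maximum value is the classification result
```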
(1) Residual structure incorporating convolution attention module (ResBlock + CBAM module in FIG. 6)
The structure of the ResBlock + CBAM module is shown in fig. 6, where F is the feature extracted by the preceding convolution layer. F is input to the channel attention module, which computes the channel attention map M_C; M_C is multiplied with the input F to obtain the channel attention module's output F1. F1 is then input to the spatial attention module, which computes the spatial attention map M_S; M_S is multiplied with F1 to obtain the final output F2 of the convolutional attention module, and F2 is fed into the subsequent convolution layers to continue feature learning. By adding the convolutional attention module to the basic residual block of ResNet, the network can be made to pay more attention to the object to be recognized.
The convolutional block attention module (CBAM) consists of a channel attention module and a spatial attention module: given an intermediate feature map, the CBAM module sequentially infers attention maps along the channel and spatial dimensions, and multiplies each attention map with the input feature map for adaptive feature refinement.
The channel attention module structure is shown in fig. 7: the feature map extracted by the previous layer undergoes global average pooling and global max pooling simultaneously to compress the spatial dimensions; the two resulting one-dimensional vectors are each fed through a shared two-layer fully connected network, the two branches are summed element by element, and a sigmoid activation finally produces the channel attention map M_C.
The structure of the spatial attention module is shown in fig. 8, with the channel attention module's output as its input feature map. Mean pooling and max pooling are applied along the channel dimension, and the two extracted feature maps (each with 1 channel) are concatenated into a 2-channel feature map. A convolution layer with a 7×7 kernel then reduces this to 1 channel, and a sigmoid activation produces the spatial attention map M_S.
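A minimal PyTorch sketch of the two attention modules and their use inside a residual block (the reduction ratio 16 and the 3×3 residual convolutions are common assumptions; the 7×7 spatial convolution follows the text):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(                      # shared two-layer network
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))             # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))              # global max pooling branch
        m_c = torch.sigmoid(avg + mx).view(b, c, 1, 1) # channel attention map M_C
        return x * m_c                                 # F1 = M_C * F

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)  # 7×7 conv down to 1 channel

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)              # channel-wise mean (1 channel)
        mx = x.amax(dim=1, keepdim=True)               # channel-wise max (1 channel)
        m_s = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # M_S
        return x * m_s                                 # F2 = M_S * F1

class ResBlockCBAM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.cbam = nn.Sequential(ChannelAttention(channels), SpatialAttention())

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.cbam(self.bn2(self.conv2(out)))     # F -> F1 -> F2
        return self.relu(out + x)                      # residual connection
```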
(2) Second order pooling mechanism
Global second-order pooling computes the covariance matrix (second-order information) of the feature maps to obtain values representing their data distribution. Suppose the preceding convolutions yield a set of feature maps F_i (i = 1, 2, …, C) of size H×W, where C is the number of channels. The idea of global covariance pooling is to view each feature map as a random variable, every spatial element being one sample of that variable. Each feature map F_i is straightened into a vector f_i of shape (H×W, 1), and the covariance matrix of the set of feature maps is calculated as
Σ(i, j) = (1/(HW − 1)) · Σ_k (f_i(k) − μ_i)(f_j(k) − μ_j), for i, j = 1, …, C,
where μ_i is the mean of f_i and k runs over the H×W spatial positions.
The physical meaning of the covariance matrix is clear: its i-th row represents the statistical correlation of channel i with all channels.
Second-order covariance pooling effectively exploits the inter-channel correlations learned by the deep neural network and contains richer feature information, so replacing the global average pooling layer of ResNet with a global second-order pooling layer can improve the network's feature expression capability. The implementation details are as follows: first, a 1×1 convolution kernel reduces the 512-channel feature map output by the convolution layer before the second-order pooling layer to 256 channels; the covariance matrix of this set of features is then computed and matrix square-root normalization is applied; keeping the upper triangle of the symmetric 256×256 matrix yields a 32896-dimensional feature (256×257/2 entries), realizing the second-order pooling operation.
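A minimal sketch of this pooling layer (the eigendecomposition-based matrix square root is one common implementation choice, not confirmed by the text; the 512→256 reduction and the 32896-dim upper-triangle output follow the description above):

```python
import torch
import torch.nn as nn

class SecondOrderPooling(nn.Module):
    def __init__(self, in_channels=512, reduced=256):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, reduced, kernel_size=1)  # 1×1 conv

    def forward(self, x):                               # x: B × 512 × H × W
        x = self.reduce(x)                              # B × 256 × H × W
        b, c, h, w = x.shape
        f = x.reshape(b, c, h * w)                      # straighten each channel
        f = f - f.mean(dim=2, keepdim=True)             # center per channel
        cov = f @ f.transpose(1, 2) / (h * w - 1)       # B × 256 × 256 covariance
        # matrix square-root normalization via eigendecomposition
        eigval, eigvec = torch.linalg.eigh(cov)
        sqrt_cov = eigvec @ torch.diag_embed(eigval.clamp(min=0).sqrt()) \
                   @ eigvec.transpose(1, 2)
        iu = torch.triu_indices(c, c)                   # upper triangle: 256*257/2 = 32896
        return sqrt_cov[:, iu[0], iu[1]]                # B × 32896
```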
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. The facial expression recognition method combining the attention module and the second-order pooling mechanism is characterized by comprising the following steps of:
Acquiring a face image;
Preprocessing a face image, wherein the face image preprocessing comprises face detection and alignment, data enhancement and image normalization;
extracting features of the preprocessed face image, and finishing expression classification;
The face detection and alignment comprises face detection, key point positioning and face alignment, and specifically comprises the following steps:
the face detection module takes the facial expression picture as input and outputs the detected face region;
Positioning the coordinates of key points of the human face according to the human face detection area, and importing a five-point key point detection model by using a human face key point detection interface in a dlib library to obtain the coordinates of the five-point key points of the human face;
Carrying out face alignment by utilizing coordinates of the five key points;
The face alignment calculation process is as follows:
First, the center coordinates of the left and right eyes are calculated from the four eye-corner coordinates (x1, y1), (x2, y2), (x3, y3), (x4, y4):
(x_le, y_le) = ((x1 + x2)/2, (y1 + y2)/2), (x_re, y_re) = ((x3 + x4)/2, (y3 + y4)/2)
After the two eye centers are obtained, they are connected and the angle θ between this line and the horizontal is calculated; the rotation center (x_center, y_center) is then obtained by averaging the three points given by the left eye center, the right eye center and the under-nose point (x5, y5):
x_center = (x_le + x_re + x5)/3, y_center = (y_le + y_re + y5)/3
Combining the rotation center (x_center, y_center) and the angle θ, the affine transformation matrix is obtained from OpenCV's affine-matrix interface, and OpenCV's warp function is called to apply the affine transformation to the image, yielding a face-aligned photo;
Feature extraction and expression classification are realized with an end-to-end deep neural network structured as follows: the first layer is a convolution layer with a 7×7 kernel and 64 channels; the second layer is a pooling layer with a 3×3 pooling kernel and 64 channels; eight residual structures fused with the convolutional attention module follow, outputting a 512-channel feature map; a second-order pooling layer is connected to realize feature aggregation; and finally a fully connected layer and a softmax layer produce the classification result.
2. The facial expression recognition method combining an attention module and a second-order pooling mechanism according to claim 1, wherein the data enhancement dynamically applies random geometric or color transformations to the input image at the data reading stage through the transforms.Compose() interface of the deep learning framework PyTorch, and then feeds the transformed images into the network for training, realizing data expansion.
3. The facial expression recognition method combining an attention module and a second-order pooling mechanism according to claim 1, wherein the image normalization divides the pixel values of the image by 255, so that every pixel value of the normalized image lies in [0, 1].
4. The facial expression recognition method combining an attention module and a second-order pooling mechanism according to claim 1, wherein extraction of facial expression features is realized with an 18-layer ResNet, and a softmax layer added at the end of the network normalizes the network output into probability values over the 7 expression classes; the class with the maximum probability is the classification result.
CN202210403298.5A 2022-04-18 2022-04-18 Facial expression recognition method combining attention module and second-order pooling mechanism Active CN114582002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210403298.5A CN114582002B (en) 2022-04-18 2022-04-18 Facial expression recognition method combining attention module and second-order pooling mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210403298.5A CN114582002B (en) 2022-04-18 2022-04-18 Facial expression recognition method combining attention module and second-order pooling mechanism

Publications (2)

Publication Number Publication Date
CN114582002A CN114582002A (en) 2022-06-03
CN114582002B (en) 2024-07-09

Family

ID=81784744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210403298.5A Active CN114582002B (en) 2022-04-18 2022-04-18 Facial expression recognition method combining attention module and second-order pooling mechanism

Country Status (1)

Country Link
CN (1) CN114582002B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115565159B (en) * 2022-09-28 2023-03-28 华中科技大学 Construction method and application of fatigue driving detection model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874861A (en) * 2017-01-22 2017-06-20 北京飞搜科技有限公司 A kind of face antidote and system
CN108805040A (en) * 2018-05-24 2018-11-13 复旦大学 It is a kind of that face recognition algorithms are blocked based on piecemeal

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9053372B2 (en) * 2012-06-28 2015-06-09 Honda Motor Co., Ltd. Road marking detection and recognition
CN109344693B (en) * 2018-08-13 2021-10-26 华南理工大学 Deep learning-based face multi-region fusion expression recognition method
CN111243050B (en) * 2020-01-08 2024-02-27 杭州未名信科科技有限公司 Portrait simple drawing figure generation method and system and painting robot
CN111783622A (en) * 2020-06-29 2020-10-16 北京百度网讯科技有限公司 Method, device and equipment for recognizing facial expressions and computer-readable storage medium
CN112541422B (en) * 2020-12-08 2024-03-12 北京科技大学 Expression recognition method, device and storage medium with robust illumination and head posture
CN112766158B (en) * 2021-01-20 2022-06-03 重庆邮电大学 Multi-task cascading type face shielding expression recognition method
CN113076916B (en) * 2021-04-19 2023-05-12 山东大学 Dynamic facial expression recognition method and system based on geometric feature weighted fusion
CN113869229B (en) * 2021-09-29 2023-05-09 电子科技大学 Deep learning expression recognition method based on priori attention mechanism guidance
CN114299578A (en) * 2021-12-28 2022-04-08 杭州电子科技大学 Dynamic human face generation method based on facial emotion analysis


Also Published As

Publication number Publication date
CN114582002A (en) 2022-06-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant