CN114519897B - Human face living body detection method based on color space fusion and cyclic neural network - Google Patents
Human face living body detection method based on color space fusion and cyclic neural network
- Publication number
- CN114519897B (application CN202111663546.1A)
- Authority
- CN
- China
- Prior art keywords
- face
- color
- living body
- video
- human face
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
Abstract
The invention discloses a face liveness detection method based on color space fusion and a recurrent neural network, and relates to the technical field of liveness detection. The invention includes fusing a new color space; constructing an LSTM network for face liveness detection; inputting the color features of fake-face attack videos from public datasets into the constructed LSTM for training; and using the newly fused color space and the trained network model for face liveness detection. The face liveness detection algorithm provided by the invention can operate directly on the content captured by a camera, achieves accurate detection under both two-dimensional fake-face attacks and finely crafted three-dimensional fake-face attacks, and solves the problem of low detection stability under multi-dimensional and cross-dataset fake-face attacks.
Description
Technical Field
The invention belongs to the field of anti-spoofing for biometric authentication, and particularly relates to a face liveness detection method based on color space fusion and a recurrent neural network.
Background
With the advent of the artificial intelligence era, people have begun to use their own biological characteristics as identity marks, making identity recognition more convenient and safer. Among biometric recognition methods, face recognition accounts for a large share owing to its low cost and freedom from equipment management; with the development of face detection and recognition technology, it has been widely applied in daily life, such as face payment, access control systems and attendance systems.
A fake-face attack is an attack against a face recognition system: by presenting a forged version of a legitimate user's face to the camera, the attacker attempts to make the system authenticate an illegitimate user as a legitimate one, thereby gaining the system's trust.
In general, fake-face attacks fall into two categories: two-dimensional and three-dimensional. Two-dimensional attacks comprise photo attacks and video attacks: in a photo attack, an illegitimate user gains the trust of the face recognition system by printing a photo of a legitimate user or displaying it on an electronic device; in a video attack, the attacker uses a video containing the legitimate user's facial information. A three-dimensional fake-face attack mainly means that an illegitimate user manufactures a 3D mask of a legitimate user's face from various materials (such as silicone or latex) and gains the system's trust by wearing it.
In recent years, despite extensive research on face liveness detection, existing algorithms still leave room for improvement under cross-dataset, multi-dimensional fake-face attacks. Existing approaches include methods based on hand-crafted features, such as LBP, LBP-TOP and Markov-based methods, which mainly use designed texture features to distinguish real faces from fake ones; methods based on deep learning, such as CNN, CNN+LSTM and Deep Tree Network, which use neural networks to learn the differences between real and fake faces in order to judge fake-face attacks; and methods based on vital signs (e.g., rPPG methods), which use life signals specific to a real face to distinguish it from fake-face attacks. These methods achieve high accuracy when detecting two-dimensional or three-dimensional fake-face attacks separately, but lower accuracy in multi-dimensional and cross-dataset tests.
Face liveness detection covers both two-dimensional and three-dimensional fake-face attacks. Detection methods using mixed features (first detecting two-dimensional attacks, then three-dimensional ones) can achieve high accuracy on multi-dimensional attacks, but they take longer, which hinders practical use. Deep learning methods require large amounts of training data, and the generalization ability of the model tends to degrade during learning. Therefore, multi-dimensional, cross-dataset fake-face attack detection is a current hot spot in the field of face liveness detection.
Application publication CN105354554A discloses a face liveness detection method based on color and singular-value features, mainly addressing the complex computation and low recognition rate of existing face-authenticity recognition techniques. Its implementation steps are: 1) label positive and negative samples of a face database and divide them into a training set and a test set; 2) partition the training-set face images into blocks and extract color features and singular-value features of the small blocks in batches; 3) normalize the extracted feature vectors and feed them to a support vector machine classifier for training to obtain a training model; 4) extract features from the test-set data and predict on them with the training model to obtain classification results. That invention improves classification efficiency, achieves good classification performance, and can be used to detect face authenticity in social networks or in real life. However, because its extracted color features are trained and predicted with a support vector machine, it can hardly detect finely crafted mask attacks, whose feature information is very similar to that of a real face. The present invention instead performs face liveness detection using the pulse characteristics specific to a real face (rPPG signals), giving higher detection accuracy against three-dimensional fake-face attacks.
CN111881815A discloses a face liveness detection method based on multi-model feature migration: heterogeneous datasets are constructed and fused, and liveness training is carried out by multi-model feature migration under several color spaces, improving the accuracy and generalization ability of the liveness detection model. In the training stage, visible-light images from open-source or private datasets are fused; after face detection, alignment and cropping, an RGB model and a YUV model are trained separately until convergence. In the prediction stage, the captured visible-light images are fed into the trained RGB and YUV models respectively, the two model outputs are combined by a score-fusion strategy into a final score, and the liveness result is judged from that score. The method generalizes well, is accurate, and is suitable for industrial deployment. Its multi-model feature migration across several color spaces can largely avoid the influence of illumination on the detection result, but it still struggles with finely crafted mask attacks and cross-dataset fake-face attacks. The present invention avoids the influence of external illumination through color space fusion and combines the pulse characteristics specific to a real face (rPPG signals) for liveness detection, achieving higher detection accuracy under multi-dimensional and cross-dataset fake-face attacks.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. The original video is used directly for face liveness detection, without preprocessing the images in the video, which matches practical application scenarios. Meanwhile, the proposed face liveness detection method based on color space fusion and a recurrent neural network maintains stable, high detection accuracy under cross-dataset and multi-dimensional fake-face attacks. The technical scheme of the invention is as follows:
A face liveness detection method based on color space fusion and a recurrent neural network comprises the following steps:
performing face detection on each frame of the original video, and segmenting the face region and background region in the image;
constructing a new color space using the correlation between the rPPG signals of the face region and the background region:
face detection is performed on each input frame, and the face region and background region in the image are segmented; the segmented images are converted from the RGB color space into the HSV and YCbCr color spaces, 9 color channels are separated, and a Fourier transform is applied to each color channel to obtain the rPPG signals of the face region and background region for that channel;
exploiting the principle that the correlation between the face-region and background-region rPPG signals is small for a real face and large for a fake face, three color channels are selected to construct the new color space.
Constructing an LSTM network for face liveness detection;
inputting the color features of videos from public datasets into the constructed LSTM network for training;
and performing face liveness detection using the new color space and the liveness detection model trained on the LSTM network.
Further, segmenting the face-region image and the background-region image comprises the following steps:
detecting each frame of the original video using frontal_face_detector from the dlib library as the face detector;
locating and labeling the position of the face in the image using the shape_predictor_68_face_landmarks.dat feature extractor;
and segmenting the face region and the background region according to the labeled position information.
Further, the building of the new color space comprises the steps of:
Capturing each frame of image in an original video, carrying out face detection on the image, and dividing the image into a face part and a background part;
Performing color space conversion on the face-region and background-region images, converting them into the HSV and YCbCr color spaces;
dividing RGB, HSV, YCbCr three color spaces to obtain 9 color channels;
Acquiring color characteristics of each color channel of a segmentation area of each frame of a current video to form 9 face color characteristic lists and 9 background color characteristic lists;
performing Fourier transformation on the face color feature list and the background area color feature list to obtain an rPPG signal;
calculating and recording a correlation coefficient of a human face region and a background region rPPG signal;
From the average correlation-coefficient values of all channels over all videos, the color channels are sorted in ascending order of correlation coefficient for real-face videos and in descending order for fake-face attack videos, and the three channels with the highest coincidence between the two orderings are selected as the new color space.
Further, calculating the correlation coefficients of the rPPG signals of the face region and the background region specifically comprises:
Ci = (1/n) Σj=1..n mj
{R1,R2,R3} = fmin({C1,C2,...,C9})
{F1,F2,F3} = fmax({C1,C2,...,C9})
wherein mj is the correlation coefficient between the rPPG signals generated by the face region and the background region for the same channel of the same video, Ci is the average correlation coefficient of channel i over n videos, fmax takes the 3 channels with the largest values among the 9 color channels, and fmin takes the 3 with the smallest (see the sketch below).
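As an illustration of this selection rule, the following sketch (Python with NumPy) assumes the per-video coefficients mj have already been collected into two arrays of shape (number of videos, 9), one for real-face videos and one for fake-face attack videos; the function name select_channels and the set-intersection reading of "highest coincidence" are assumptions, not language from the method itself.

```python
import numpy as np

CHANNELS = ["R", "G", "B", "H", "S", "V", "Y", "Cb", "Cr"]

def select_channels(real_corr, fake_corr, k=3):
    """Pick the k channels whose averaged face/background rPPG correlation
    is smallest on real-face videos (f_min) and largest on fake-face attack
    videos (f_max), keeping the channels both orderings agree on."""
    c_real = real_corr.mean(axis=0)            # C_i over real videos
    c_fake = fake_corr.mean(axis=0)            # C_i over attack videos
    r_set = set(np.argsort(c_real)[:k])        # ascending: k smallest
    f_set = set(np.argsort(c_fake)[::-1][:k])  # descending: k largest
    chosen = sorted(r_set & f_set)             # highest coincidence
    return [CHANNELS[i] for i in chosen]
```

Note that the intersection can contain fewer than k channels; in that case the next-best ranked channels would have to be added, a detail the method text leaves open.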
Further, the construction of the LSTM network for face liveness detection comprises the following steps: extracting the color features of the face region and the background region of each frame of the original video with the new color space to form a Feature Map; feeding the Feature Map into the LSTM network, which uses an LSTM layer with 100 hidden neurons, a fully connected layer and an FFT layer; the LSTM is used to estimate, from an input sequence of Nf frames {I1, I2, ..., INf} (where Ij denotes the color features of frame j), the rPPG signal, and the FFT layer converts the response of the fully connected layer into the Fourier domain to obtain that signal;
the Fourier transform layer attached after the fully connected layer of the LSTM network applies a Fourier transform to the difference sequence output by the network, thereby obtaining frequency-domain information;
and the output is produced by combining the LSTM prediction with the correlation between the frequency information of each face region and the background region.
Further, inputting the color features of videos from the public dataset into the constructed LSTM network for training comprises the following steps:
for a real-face video in the public dataset, a conventional rPPG method is used to obtain its rPPG signal, which serves as the ground truth during network training; for a fake-face attack video, the rPPG signal is set to 0;
color features are extracted from the videos in the public dataset using the constructed new color space, and the color-feature sequences together with the set ground truth are input into the LSTM network for training;
and a Fourier transform layer is inserted after the fully connected layer of the LSTM network to convert the color-feature change sequence into an rPPG signal, after which classification and prediction are performed.
The advantages and beneficial effects of the invention are as follows:
The proposed method based on color space fusion and a recurrent neural network achieves stable, high detection accuracy on the face liveness detection task under multi-dimensional fake-face attacks. The color space fusion method exploits the principle that the rPPG signals extracted from the regions of a real face have low correlation with the pulse signal extracted from the background region, while those extracted from the regions of a forged face have high correlation with the background rPPG signal: each frame is divided into a face region and a background region, the correlation between each color channel of the face region and of the background region is calculated, the three color channels that best represent the change of facial color features are identified statistically, and these are combined into a new color space, thereby reducing the influence of environmental noise on the liveness detection process and improving detection accuracy. Compared with existing methods, this approach combines the rPPG technique with a recurrent neural network and, by constructing a new color space, captures the frame-to-frame color-feature changes of the video to the greatest extent, enhancing robustness to illumination and improving both accuracy and practicality.
Drawings
Fig. 1 is a flowchart of the face liveness detection method based on color space fusion and a recurrent neural network according to a preferred embodiment of the present invention.
Fig. 2 is a flow chart of face region and background region segmentation in an embodiment of the present invention.
Fig. 3 is a flowchart of color space fusion in an embodiment of the invention.
Fig. 4 is a diagram of a part of a recurrent neural network according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and specifically below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
To realize face liveness detection, the method based on color space fusion and a recurrent neural network provided by the embodiment of the invention comprises three stages, namely face region and background region segmentation, color space fusion, and prediction of the rPPG signal with a recurrent neural network, and includes the following steps:
Face region and background region segmentation stage
S1: performing face detection on each frame of the original video and segmenting the face region and background region in the image;
Color space fusion stage
S2: constructing a new color space using the correlation between the rPPG signals of the face region and the background region;
rPPG signal prediction stage using the recurrent neural network
S3: constructing an LSTM network for face liveness detection;
S4: inputting the color features of videos from public datasets into the constructed recurrent neural network for training;
S5: performing face liveness detection using the new color space and the liveness detection model trained on the LSTM network.
In step S1, segmenting the face-region image and the background-region image comprises the following steps (as shown in fig. 2):
S11: detecting each frame of the original video using frontal_face_detector from the dlib library as the face detector;
S12: locating and labeling the position of the face in the image using the shape_predictor_68_face_landmarks.dat feature extractor;
S13: segmenting the face region and the background region according to the labeled position information, which mainly comprises the following steps:
A1: finding the midpoints of the face position in the vertical direction;
Further, according to the identifiers of the 68 feature points, point 28 (the midpoint between the eyes) is found and stored under the variable name x1, point 34 (the midpoint of the nose) under x2, and point 9 (the midpoint of the chin) under x3;
Further, from the position information of x1, x2 and x3, the value (x1+x3)/2 is stored in the variable x_mid_up, representing the upper-middle position of the face, and the value (x2+x3)/2 is stored in the variable x_mid_down, representing the lower-middle position of the face.
A2: finding the midpoints of the face position in the horizontal direction;
Further, according to the identifiers of the 68 feature points, point 3 (the middle contour point on the left side of the face) is stored under the variable name y1, point 30 (near the vertical midline of the face) under y2, and point 13 (the middle contour point on the right side of the face) under y3;
Further, from the position information of y1, y2 and y3, the value (y1+y2)/2 is stored in the variable y_mid_left, representing the part of the face to the left of the nose, and the value (y2+y3)/2 is stored in the variable y_mid_right, representing the part of the face to the right of the nose.
A3: segmenting the face region according to the values of x1, x2, x3, y1, y2 and y3;
A4: finding the position of the background regions around the face;
Further, from the position information of y_mid_left and y1, a variable distance is created to store the value y_mid_left - y1; to keep the face out of the background crops, let y0 = y1 - distance×3 and y4 = y3 + distance×3, and, to widen the margins, the values of y1 and y3 are reset as y1 = y1 - distance×2 and y3 = y3 + distance×2;
A5: segmenting the background regions;
Further, the value of y0 is checked: when y0 < 0, the background region on the left of the face is cropped as img[x1:x3, 0:y1], and when y0 > 0, as img[x1:x3, y0:y1];
Further, the value of y4 is checked: when y4 > 640 (the frame width), the background region on the right of the face is cropped as img[x1:x3, y3:640], and when y4 < 640, as img[x1:x3, y3:y4]. A sketch of steps S11-S13 and A1-A5 follows.
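As an illustration of the steps above, the sketch below uses dlib and OpenCV to cut one frame into a face crop and two background strips. It is a minimal reading of the rules, not the patented implementation: dlib landmark indices are 0-based (point 28 in the text is part(27) here), the function name is ours, and the frame width is read from the image instead of being hard-coded to 640.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def split_face_background(frame):
    """Split one BGR video frame into (face, left_bg, right_bg) crops."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = detector(gray)
    if not rects:
        return None
    p = predictor(gray, rects[0])
    # vertical anchors (image rows): between-eyes and chin landmarks
    x1, x3 = p.part(27).y, p.part(8).y
    # horizontal anchors (image columns): left contour, nose, right contour
    y1, y2, y3 = p.part(2).x, p.part(29).x, p.part(12).x
    face = frame[x1:x3, y1:y3]
    # widen outwards so the background strips avoid the face (step A4)
    distance = (y1 + y2) // 2 - y1           # y_mid_left - y1
    y0, y4 = y1 - 3 * distance, y3 + 3 * distance
    y1b, y3b = y1 - 2 * distance, y3 + 2 * distance
    w = frame.shape[1]                       # 640 in the description
    left_bg = frame[x1:x3, max(y0, 0):y1b]   # step A5, left strip
    right_bg = frame[x1:x3, y3b:min(y4, w)]  # step A5, right strip
    return face, left_bg, right_bg
```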
In S2, the color space fusion comprises the following steps (as shown in fig. 3):
S21: according to the principle that the correlation between the rPPG signals extracted from each region of the face and the rPPG signals extracted from the background region is low, and the correlation between the rPPG signals extracted from each partial region of the forged face and the rPPG signals of the background region is high, performing color space conversion on the segmented face region and the background region, and converting RGB color space into HSV and YCbCr color space;
S22: splitting the RGB, HSV and YCbCr color spaces of the face-region and background-region images to obtain the R, G, B, H, S, V, Y, Cb and Cr color channels of the face region and background region;
S23: acquiring the color features of each color channel of the segmented regions of each frame of the current video to form 9 face color-feature lists and 9 background color-feature lists;
S24: performing a Fourier transform on the face color-feature lists and the background color-feature lists to obtain the rPPG signals;
S25: calculating and recording the correlation coefficients of the rPPG signals of the face region and the background region, and using the formula Ci = (1/n) Σj=1..n mj to obtain the average correlation-coefficient value of each channel over all videos, wherein mj is the correlation coefficient of the rPPG signals generated by the face and background of the same channel in the same video, and Ci is the average correlation coefficient over n videos;
S26: obtaining the 3 color channels to record via {R1,R2,R3} = fmin({C1,C2,...,C9}) and {F1,F2,F3} = fmax({C1,C2,...,C9}), wherein fmax sorts the color channels in descending order of correlation coefficient when the original video is a fake-face attack video and records the 3 channels with the largest face/background correlation, and fmin sorts the channels in ascending order when the original video is a real-face video and records the 3 channels with the smallest correlation; the three channels with the highest coincidence between the two lists are selected as the new color space (see the sketch below);
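A compact sketch of S22-S25 follows, assuming OpenCV BGR input frames; the helper names are illustrative, and the rPPG signal is taken here as the magnitude spectrum of each channel's per-frame mean sequence, one plausible reading of "performing a Fourier transform on the color-feature lists".

```python
import cv2
import numpy as np

def nine_channel_means(region):
    """Mean of each of the 9 channels (R,G,B,H,S,V,Y,Cb,Cr) of one region."""
    rgb = cv2.cvtColor(region, cv2.COLOR_BGR2RGB)
    hsv = cv2.cvtColor(region, cv2.COLOR_BGR2HSV)
    ycc = cv2.cvtColor(region, cv2.COLOR_BGR2YCrCb)[:, :, [0, 2, 1]]  # -> Y,Cb,Cr
    stacked = np.concatenate([rgb, hsv, ycc], axis=2).astype(np.float64)
    return stacked.reshape(-1, 9).mean(axis=0)

def channel_correlations(face_regions, bg_regions):
    """Per-channel correlation m between the face and background rPPG
    spectra of one video (steps S23-S25)."""
    face_feats = np.array([nine_channel_means(r) for r in face_regions])  # (T, 9)
    bg_feats = np.array([nine_channel_means(r) for r in bg_regions])
    m = np.empty(9)
    for c in range(9):
        f_sig = np.abs(np.fft.rfft(face_feats[:, c]))  # rPPG via Fourier transform
        b_sig = np.abs(np.fft.rfft(bg_feats[:, c]))
        m[c] = np.corrcoef(f_sig, b_sig)[0, 1]
    return m
```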
In S3, the construction of the recurrent neural network for face liveness detection comprises the following steps (as shown in fig. 4):
S31: extracting the color features of the face region and the background region in each frame of the original video using the new color space fused in S2 to form a Feature Map;
S32: the Feature Map is fed into the LSTM network, using an LSTM layer with 100 hidden neurons; the purpose of the LSTM is to estimate, from an input sequence of Nf frames {I1, I2, ..., INf}, the rPPG signal f;
S33: a fully connected layer is attached after the LSTM layer, and a Fourier transform layer after the fully connected layer; this layer applies a Fourier transform to the difference sequence output by the network, thereby obtaining frequency-domain information;
S34: the output is produced by combining the LSTM prediction with the correlation between the frequency information of the face regions and the background region. A sketch of this network follows.
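The S31-S34 network can be sketched in PyTorch as below. This is a sketch under stated assumptions, not the patented implementation: the class name, the 3-channel feature dimension (the fused color space), and the use of the magnitude spectrum as the predicted rPPG signal are ours; only the 100 hidden neurons, the fully connected layer and the FFT layer are taken from the description.

```python
import torch
import torch.nn as nn

class RppgLSTM(nn.Module):
    """LSTM (100 hidden neurons) -> fully connected layer -> FFT layer."""
    def __init__(self, feat_dim=3):          # 3 channels of the fused color space
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, 100, batch_first=True)
        self.fc = nn.Linear(100, 1)

    def forward(self, x):                    # x: (batch, N_f, feat_dim)
        h, _ = self.lstm(x)                  # per-frame hidden states
        seq = self.fc(h).squeeze(-1)         # per-frame scalar sequence
        spec = torch.fft.rfft(seq, dim=-1)   # FFT layer: into the Fourier domain
        return torch.abs(spec)               # predicted rPPG spectrum f
```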
In step S4, inputting the color features of videos from the public dataset into the constructed recurrent neural network for training comprises the following steps:
S41: for a real-face video in the public dataset, a conventional rPPG method is used to obtain its rPPG signal, which serves as the ground truth during network training; for a fake-face attack video, the rPPG signal is set to 0;
S42: color features are extracted from the videos in the public dataset using the color space constructed in S2, and the color-feature sequences together with the set ground truth are input into the recurrent neural network for training; the objective function is set as J(θR) = (1/Ns) Σi=1..Ns ||RNN({Fj}; θR) - fi||², wherein θR denotes the RNN parameters, Fj is the face-region feature map, Ns is the number of frame sequences, and fi denotes the ground truth of the i-th frame;
S43: a Fourier transform layer is inserted after the fully connected layer of the LSTM network to convert the color-feature change sequence into an rPPG signal, after which classification and prediction are performed. A sketch of one training step follows.
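Putting S41-S43 together, one hedged training step could look like the following, reusing the RppgLSTM sketch above; the mean-squared-error form mirrors the objective J(θR) of S42, and the all-zero target spectrum for attack videos follows S41.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, feats, gt_rppg):
    """One training step.

    feats:   (batch, N_f, 3) color-feature sequences in the fused color space
    gt_rppg: (batch, N_f//2 + 1) target spectra -- the rPPG signal of a
             real-face video obtained with a conventional rPPG method, or
             zeros for a fake-face attack video
    """
    optimizer.zero_grad()
    pred = model(feats)               # predicted rPPG spectrum
    loss = F.mse_loss(pred, gt_rppg)  # ~ (1/N_s) sum ||RNN(F; theta_R) - f_i||^2
    loss.backward()
    optimizer.step()
    return loss.item()
```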
Compared with existing methods, the detection results obtained by the method based on color space fusion and the recurrent neural network are more stable and more accurate.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The above examples should be understood as illustrative only and not limiting the scope of the invention. Various changes and modifications to the present invention may be made by one skilled in the art after reading the teachings herein, and such equivalent changes and modifications are intended to fall within the scope of the invention as defined in the appended claims.
Claims (1)
1. A face liveness detection method based on color space fusion and a recurrent neural network, characterized by comprising the following steps:
performing face detection on each frame of the original video, and segmenting the face region and background region in the image;
constructing a new color space using the correlation between the rPPG signals of the face region and the background region: face detection is performed on each input frame, and the face region and background region in the image are segmented;
the segmented images are converted from the RGB color space into the HSV and YCbCr color spaces, 9 color channels are separated, and a Fourier transform is applied to each color channel to obtain the rPPG signals of the face region and background region for that channel;
exploiting the principle that the correlation between the face-region and background-region rPPG signals is small for a real face and large for a fake face, three color channels are selected to construct the new color space;
constructing an LSTM network for face liveness detection;
inputting the color features of videos from public datasets into the constructed LSTM network for training;
performing face liveness detection using the new color space and the liveness detection model trained on the LSTM network;
wherein segmenting the face-region image and the background-region image comprises the following steps:
detecting each frame of the original video using frontal_face_detector from the dlib library as the face detector;
locating and labeling the position of the face in the image using the shape_predictor_68_face_landmarks.dat feature extractor;
segmenting the face region and the background region according to the labeled position information;
the building of the new color space comprises the steps of:
Capturing each frame of image in an original video, carrying out face detection on the image, and dividing the image into a face part and a background part;
performing color space conversion on the face-region and background-region images, converting them into the HSV and YCbCr color spaces;
dividing RGB, HSV, YCbCr three color spaces to obtain 9 color channels;
Acquiring color characteristics of each color channel of a segmentation area of each frame of a current video to form 9 face color characteristic lists and 9 background color characteristic lists;
performing Fourier transformation on the face color feature list and the background area color feature list to obtain an rPPG signal;
calculating and recording a correlation coefficient of a human face region and a background region rPPG signal;
from the average correlation-coefficient values of all channels over all videos, the color channels are sorted in ascending order of correlation coefficient for a real-face video and in descending order for a fake-face attack video, and the three channels with the highest coincidence between the two orderings are selected as the new color space;
wherein calculating the correlation coefficients of the rPPG signals of the face region and the background region specifically comprises:
Ci = (1/n) Σj=1..n mj
{R1,R2,R3} = fmin({C1,C2,...,C9})
{F1,F2,F3} = fmax({C1,C2,...,C9})
wherein mj is the correlation coefficient of the rPPG signals generated by the face and background of the same channel in the same video, Ci is the average correlation coefficient of channel i over n videos, fmax takes the 3 channels with the largest values among the 9 color channels, and fmin takes the 3 with the smallest;
wherein the construction of the LSTM network for face liveness detection comprises the following steps: extracting the color features of the face region and the background region of each frame of the original video with the new color space to form a Feature Map; feeding the Feature Map into the LSTM network, which uses an LSTM layer with 100 hidden neurons, a fully connected layer and an FFT layer; the LSTM is used to estimate, from an input sequence of Nf frames {I1, I2, ..., INf}, the rPPG signal, and the FFT layer converts the response of the fully connected layer into the Fourier domain to obtain that signal;
the Fourier transform layer attached after the fully connected layer of the LSTM network applies a Fourier transform to the difference sequence output by the network, thereby obtaining frequency-domain information;
the output is produced by combining the LSTM prediction with the correlation between the frequency information of each face region and the background region;
wherein inputting the color features of videos from the public dataset into the constructed LSTM network for training comprises the following steps:
for a real-face video in the public dataset, a conventional rPPG method is used to obtain its rPPG signal, which serves as the ground truth during network training; for a fake-face attack video, the rPPG signal is set to 0;
color features are extracted from the videos in the public dataset using the constructed new color space, and the color-feature sequences together with the set ground truth are input into the LSTM network for training;
and a Fourier transform layer is inserted after the fully connected layer of the LSTM network to convert the color-feature change sequence into an rPPG signal, after which classification and prediction are performed.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111663546.1A | 2021-12-31 | 2021-12-31 | Human face living body detection method based on color space fusion and cyclic neural network |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111663546.1A | 2021-12-31 | 2021-12-31 | Human face living body detection method based on color space fusion and cyclic neural network |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN114519897A | 2022-05-20 |
| CN114519897B | 2024-09-24 |
Family

ID=81597109

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202111663546.1A | Human face living body detection method based on color space fusion and cyclic neural network | 2021-12-31 | 2021-12-31 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN114519897B (en) |
Families Citing this family (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116563957B * | 2023-07-10 | 2023-09-29 | 齐鲁工业大学(山东省科学院) | Face fake video detection method based on Fourier domain adaptation |
Citations (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107862299A * | 2017-11-28 | 2018-03-30 | 电子科技大学 | A living-face detection method based on near-infrared and visible-light binocular cameras |
| CN111368666A * | 2020-02-25 | 2020-07-03 | 上海蠡图信息科技有限公司 | Living body detection method based on novel pooling and attention mechanism double-current network |

Family Cites Families (2)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108596141B * | 2018-05-08 | 2022-05-17 | 深圳大学 | Detection method and system for generating face image by deep network |
| TWI684433B * | 2019-04-19 | 2020-02-11 | 鉅怡智慧股份有限公司 | Biological image processing method and biological information sensor |
Application timeline: 2021-12-31, application CN202111663546.1A filed in China; granted as CN114519897B; status Active.
Also Published As

| Publication number | Publication date |
|---|---|
| CN114519897A | 2022-05-20 |
Legal Events

- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant