
CN112926497B - Face recognition living body detection method and device based on multichannel data feature fusion - Google Patents


Info

Publication number
CN112926497B
Authority
CN
China
Prior art keywords
value
data
image
detected
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110299117.4A
Other languages
Chinese (zh)
Other versions
CN112926497A (en)
Inventor
杜永生
Current Assignee
Hangzhou Zhicun Intelligent Technology Co ltd
Original Assignee
Hangzhou Zhicun Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Zhicun Intelligent Technology Co ltd
Priority to CN202110299117.4A
Publication of CN112926497A
Application granted
Publication of CN112926497B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a face recognition living body detection method and device based on multichannel data feature fusion. The method comprises the following steps: acquiring RGB data, IR gray data and Depth data of an image to be detected, wherein the RGB data comprises the R, G and B values of each pixel, the IR gray data comprises the gray value of each pixel, and the Depth data comprises the depth value of each pixel; fusing the R value, G value, B value, gray value and depth value of each pixel of the image to be detected into 4 feature values to obtain the feature data of the image to be detected; and inputting the feature data as a detection sample into a pre-trained deep learning neural network model, taking the model's output as the detection result of whether the image to be detected contains a living body. Because a single network processes the fused data, the multiplexing degree of the neural network is improved, the total computation and total time consumption are reduced, and the judgment accuracy is improved.

Description

Face recognition living body detection method and device based on multichannel data feature fusion
Technical Field
The invention relates to the technical field of machine vision, in particular to a face recognition living body detection method and device based on multichannel data feature fusion.
Background
At present, face recognition technology is mature and its commercial applications are widespread. However, a face is very easy to copy by means of photos, videos and the like. To prevent malicious actors from forging or stealing another person's face photo or video for identity authentication, face recognition living body detection technology needs to be researched and developed, i.e., technology that judges whether the submitted biometric features come from a living individual.
Demand for face recognition living body detection is growing steadily in commercial scenarios. Current 3D-camera-based living body detection algorithms either design multiple networks, one for each type of data acquired by the 3D camera, and combine the networks' results in series to judge whether a living body is present, or judge liveness from only part of the data, mainly the depth data, as a pure software scheme.
These existing schemes suffer from a large total computation load, long total processing time and insufficient judgment accuracy.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a face recognition living body detection method and device based on multi-channel data feature fusion, an electronic device and a computer-readable storage medium, which can at least partially solve the problems in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, a face recognition living body detection method based on multi-channel data feature fusion is provided, including:
Acquiring RGB data, IR Gray data and Depth data of an image to be detected, wherein the RGB data comprises R values, G values and B values of all pixels, the IR Gray data comprises Gray values of all pixels, and the Depth data comprises Depth values of all pixels;
Fusing the R value, the G value, the B value, the Gray value and the Depth value of each pixel of the image to be detected into 4 characteristic values to obtain characteristic data of the image to be detected;
And inputting the characteristic data as a detection sample into a pre-trained deep learning neural network model, and taking the output of the deep learning neural network model as a detection result of whether the image to be detected contains a living body.
Further, the fusing the R value, the G value, the B value, the Gray value, and the Depth value of each pixel of the image to be detected into 4 feature values includes:
calculating an average value of an R value and a Gray value of a pixel as a first characteristic value;
calculating an average value of the G value and the Gray value of the pixel as a second characteristic value;
calculating an average value of the B value and the Gray value of the pixel as a third characteristic value, and taking the Depth value of the pixel as a fourth characteristic value;
Wherein the 4 characteristic values corresponding to each pixel of the image to be detected form the characteristic data of the image to be detected.
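As an illustrative sketch (not code from the patent itself), the per-pixel fusion described above can be written in a few lines of NumPy; the function name `fuse_five_to_four` and the array shapes are assumptions made for this example:

```python
import numpy as np

def fuse_five_to_four(rgb, gray, depth):
    """Fuse the five per-pixel channels (R, G, B, Gray, Depth) into four
    feature channels: [(R+Gray)/2, (G+Gray)/2, (B+Gray)/2, Depth].

    rgb:   (H, W, 3) array of R, G, B values
    gray:  (H, W) array of IR gray values
    depth: (H, W) array of depth values
    """
    rgb = rgb.astype(np.float32)
    gray = gray.astype(np.float32)
    depth = depth.astype(np.float32)
    f1 = (rgb[..., 0] + gray) / 2.0  # first feature value: mean of R and Gray
    f2 = (rgb[..., 1] + gray) / 2.0  # second feature value: mean of G and Gray
    f3 = (rgb[..., 2] + gray) / 2.0  # third feature value: mean of B and Gray
    return np.stack([f1, f2, f3, depth], axis=-1)  # fourth channel: Depth
```

The resulting (H, W, 4) array is the feature data that would be fed to the network as a detection sample.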
Further, before fusing the R value, the G value, the B value, the Gray value, and the Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected, the method further includes:
Detecting a human face in the image to be detected according to the RGB data of the image to be detected;
And if the image to be detected does not contain a human face, ending the detection.
Further, before fusing the R value, the G value, the B value, the Gray value, and the Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected, the method further includes:
Locating the IR face position and the Depth face position respectively according to the RGB face position detected from the RGB data, and cropping the corresponding face frames according to these positions;
And scaling the cropped RGB face frame, IR face frame and Depth face frame to a preset size respectively, and taking the RGB data, IR gray data and Depth data within the scaled frames as the data basis for obtaining feature data through feature fusion.
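The crop-and-scale step might be sketched as follows. This is a minimal stand-in: the box format `(x, y, w, h)` is an assumption, and a real implementation would typically use a library resize such as `cv2.resize`; the index-based nearest-neighbour resampling here just keeps the sketch dependency-free.

```python
import numpy as np

def crop_and_scale(img, box, size):
    """Crop a detected face box (x, y, w, h) out of an image array and scale
    it to a preset square size with nearest-neighbour sampling."""
    x, y, w, h = box
    patch = img[y:y + h, x:x + w]
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return patch[rows][:, cols]
```

The face position found in the RGB frame is reused to cut the IR and Depth frames, so the three crops stay pixel-for-pixel comparable.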
Further, before fusing the R value, the G value, the B value, the Gray value, and the Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected, the method further includes:
And aligning the pixels of the RGB data, IR gray data and Depth data of the image to be detected according to the factory calibration information of the data acquisition equipment.
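To make the alignment step concrete, here is a deliberately simplified sketch that shifts an IR or Depth map by integer pixel offsets. This is an assumption for illustration only: factory calibration normally supplies full intrinsic/extrinsic camera parameters, not just a translation, and the function name and offsets are invented for the example.

```python
import numpy as np

def align_to_rgb(channel, dx, dy):
    """Shift an IR or Depth map by calibrated integer offsets (dx, dy) so its
    pixels line up with the RGB frame; vacated pixels are zero-filled."""
    aligned = np.zeros_like(channel)
    h, w = channel.shape[:2]
    # region of the source that remains in view after the shift
    src = channel[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    aligned[max(0, dy):max(0, dy) + src.shape[0],
            max(0, dx):max(0, dx) + src.shape[1]] = src
    return aligned
```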
Further, the deep learning neural network model is a CNN model.
Further, the CNN model is a lightweight network model MobileNet V2.
In a second aspect, a face recognition living body detection device based on multi-channel data feature fusion is provided, including:
The data acquisition module is used for acquiring RGB data, IR Gray data and Depth data of an image to be detected, wherein the RGB data comprises R values, G values and B values of all pixels, the IR Gray data comprises Gray values of all pixels, and the Depth data comprises Depth values of all pixels;
The feature extraction module is used for fusing the R value, the G value, the B value, the Gray value and the Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected;
And the living body detection module inputs the characteristic data as a detection sample into a pre-trained deep learning neural network model, and takes the output of the deep learning neural network model as a detection result of whether the image to be detected contains a living body.
In a third aspect, a face recognition living body detection method based on multi-channel feature fusion is provided, including:
acquiring RGB data and Depth data of an image to be detected, wherein the RGB data comprises R values, G values and B values of pixels, and the Depth data comprises the Depth values of the pixels;
fusing the R value, the G value, the B value and the Depth value of each pixel of the image to be detected into four-channel characteristic values to obtain characteristic data of the image to be detected;
And inputting the characteristic data as a detection sample into a pre-trained deep learning neural network model, and taking the output of the deep learning neural network model as a detection result of whether the image to be detected contains a living body.
Further, the fusing the R value, the G value, the B value, and the Depth value of each pixel of the image to be detected into four-channel feature values includes:
taking the R value of a pixel as a first characteristic value;
Taking the G value of the pixel as a second characteristic value;
Taking the B value of the pixel as a third characteristic value;
Taking the Depth value of the pixel as a fourth characteristic value;
And 4 characteristic values corresponding to each pixel of the image to be detected form characteristic data of the image to be detected.
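For the RGBD variant the fusion is a plain channel concatenation, which could be sketched as follows (again an illustrative example, with the function name `fuse_rgbd` assumed):

```python
import numpy as np

def fuse_rgbd(rgb, depth):
    """Form the four-channel feature data of the RGBD variant: channels 1-3
    are the unchanged R, G, B values, channel 4 is the depth value."""
    return np.concatenate(
        [rgb.astype(np.float32), depth.astype(np.float32)[..., None]],
        axis=-1)
```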
Further, before fusing the R value, the G value, the B value, and the Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected, the method further includes:
detecting a human face in the image to be detected according to the RGB data of the image to be detected;
And ending the detection if the image to be detected does not contain a human face.
Further, before fusing the R value, the G value, the B value, and the Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected, the method further includes:
Locating the Depth face position according to the RGB face position detected from the RGB data, and cropping the corresponding face frame according to the face position;
And scaling the cropped RGB face frame and Depth face frame to a preset size respectively, and taking the RGB data and Depth data within the scaled frames as the data basis for obtaining feature data through feature fusion.
Further, before fusing the R value, the G value, the B value, and the Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected, the method further includes:
And carrying out pixel alignment on the RGB data and Depth data of the image to be detected according to factory calibration information of the data acquisition equipment.
Further, the deep learning neural network model is a CNN model.
Further, the CNN model is a lightweight network model MobileNet V2.
In a fourth aspect, a face recognition living body detection device based on multi-channel feature fusion is provided, including:
The data acquisition module acquires RGB data and Depth data of an image to be detected, wherein the RGB data comprises R values, G values and B values of all pixels, and the Depth data comprises the Depth values of all pixels;
the feature extraction module is used for fusing the R value, the G value, the B value and the Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected;
and the living body detection module inputs the characteristic data as a detection sample into a pre-trained deep learning neural network model, and takes the output of the deep learning neural network model as a detection result of whether the image to be detected contains a living body.
In a fifth aspect, an electronic device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the program to implement the steps of the face recognition living detection method based on multi-channel data feature fusion.
In a sixth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the above-described face recognition living detection method based on multi-channel data feature fusion.
The invention provides a face recognition living body detection method and device based on multichannel data feature fusion. The method comprises: acquiring RGB data, IR gray data and Depth data of an image to be detected, wherein the RGB data comprises the R, G and B values of each pixel, the IR gray data comprises the gray value of each pixel, and the Depth data comprises the depth value of each pixel; fusing the R value, G value, B value, gray value and depth value of each pixel into 4 feature values to obtain the feature data of the image to be detected; and inputting the feature data as a detection sample into a pre-trained deep learning neural network model, taking the model's output as the detection result of whether the image contains a living body. The 5 channels of data of each pixel are fused into 4 feature values, and the fused data serve as the input of a single deep learning neural network model that realizes the face recognition living body detection. This improves the multiplexing degree of the neural network, reduces the total computation and total time consumption, and improves the judgment accuracy.
In addition, the invention provides another face recognition living body detection method and device, based on multi-channel feature fusion. This method comprises: acquiring RGB data and Depth data of an image to be detected, wherein the RGB data comprises the R, G and B values of each pixel and the Depth data comprises the depth value of each pixel; fusing the R value, G value, B value and depth value of each pixel into four-channel feature values to obtain the feature data of the image to be detected; and inputting the feature data as a detection sample into a pre-trained deep learning neural network model, taking the model's output as the detection result of whether the image contains a living body. Here again the fused data serve as the input of a single deep learning neural network model that realizes the face recognition living body detection, which improves the multiplexing degree of the neural network, reduces the total computation and total time consumption, and improves the judgment accuracy.
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments, as illustrated in the accompanying drawings.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required by the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present application; a person skilled in the art could obtain other drawings from them without inventive effort. In the drawings:
Fig. 1 is a schematic diagram of an architecture between a server S1 and a client device B1 according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an architecture among a server S1, a client device B1 and a database server S2 according to an embodiment of the present invention;
Fig. 3 is the first schematic flow chart of a face recognition living body detection method based on five-channel data feature fusion according to an embodiment of the invention;
Fig. 4 shows the specific steps of step S200 in an embodiment of the invention;
Fig. 5 is the second schematic flow chart of a face recognition living body detection method based on five-channel data feature fusion according to an embodiment of the invention;
Fig. 6 is the third schematic flow chart of a face recognition living body detection method based on five-channel data feature fusion according to an embodiment of the invention;
Fig. 7 is the fourth schematic flow chart of a face recognition living body detection method based on five-channel data feature fusion according to an embodiment of the invention;
Fig. 8 is a block diagram of a face recognition living body detection device based on five-channel data feature fusion according to an embodiment of the invention;
Fig. 9 is the first schematic flow chart of a face recognition living body detection method based on RGBD four-channel feature fusion according to an embodiment of the present invention;
Fig. 10 shows the specific steps of step S200 in an embodiment of the invention;
Fig. 11 is the second schematic flow chart of a face recognition living body detection method based on RGBD four-channel feature fusion according to an embodiment of the present invention;
Fig. 12 is the third schematic flow chart of a face recognition living body detection method based on RGBD four-channel feature fusion according to an embodiment of the present invention;
Fig. 13 is the fourth schematic flow chart of a face recognition living body detection method based on RGBD four-channel feature fusion according to an embodiment of the present invention;
Fig. 14 is a block diagram of a face recognition living body detection device based on RGBD four-channel feature fusion according to an embodiment of the present invention;
Fig. 15 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It is noted that the terms "comprises" and "comprising", and any variations thereof, in the description and claims of the present application and in the foregoing figures, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
Demand for face recognition living body detection is growing steadily in commercial scenarios. Current 3D-camera-based living body detection algorithms either design multiple networks, one for each type of data acquired by the 3D camera, and combine the networks' results in series to judge whether a living body is present, or judge liveness from only part of the data, mainly the depth data, as a pure software scheme. These existing schemes suffer from a large total computation load, long total processing time and insufficient judgment accuracy.
In order to at least partially solve the technical problems in the prior art, the embodiment of the invention provides a face recognition living body detection method based on multi-channel data feature fusion, which improves the multiplexing degree of a neural network, reduces the total calculated amount and the total consumption and improves the judgment precision.
In view of this, the present application provides a face recognition living body detection apparatus based on multi-channel data feature fusion, which may be a server S1. Referring to Fig. 1, the server S1 may be communicatively connected to at least one client device B1; the client device B1 may send the data of an image to be detected to the server S1, and the server S1 may receive the image to be detected online. One way is as follows: the server S1 may preprocess the acquired image to be detected, online or offline, to obtain its RGB data, IR gray data and Depth data, where the RGB data comprises the R, G and B values of each pixel, the IR gray data comprises the gray value of each pixel, and the Depth data comprises the depth value of each pixel; fuse the R value, G value, B value, gray value and depth value of each pixel into 4 feature values to obtain the feature data of the image; and input the feature data as a detection sample into a pre-trained deep learning neural network model, taking the model's output as the detection result of whether the image contains a living body. The server S1 may then send this detection result to the client device B1 online, and the client device B1 may receive it online.
Another way is as follows: the server S1 may preprocess the acquired image to be detected, online or offline, to obtain its RGB data and Depth data, where the RGB data comprises the R, G and B values of each pixel and the Depth data comprises the depth value of each pixel; fuse the R value, G value, B value and depth value of each pixel into four-channel feature values to obtain the feature data of the image; and input the feature data as a detection sample into a pre-trained deep learning neural network model, taking the model's output as the detection result of whether the image contains a living body. The server S1 may then send this detection result to the client device B1 online, and the client device B1 may receive it online.
In addition, referring to fig. 2, the server S1 may also be communicatively connected to at least one database server S2, where the database server S2 is configured to store training sample sets. The database server S2 sends the training sample set to the server S1 online, and the server S1 may receive the training sample set online and then apply the training sample set to perform model training on the model.
Based on the above, the database server S2 may also be used to store test sample data. The database server S2 sends the test sample data to the server S1 online; the server S1 receives it online and applies the test samples to test the model, taking the model's output as the test result. Whether the current model meets the preset requirements is judged from the test result and the corresponding known judgment result. If so, the current model is taken as the target model for face recognition living body detection based on five-channel data feature fusion; if the current model does not meet the preset requirements, it is optimized and/or re-trained with an updated training sample set.
Based on the above, the client device B1 may have a display interface, so that a user can view on the interface the detection result, sent by the server S1, of whether the image to be detected contains a living body.
It is understood that the client device B1 may include a smart phone, a tablet, a network set-top box, a portable computer, a desktop computer, a personal digital assistant (PDA), a vehicle-mounted device, a smart wearable device, etc. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, etc.
In practical applications, the face recognition living body detection based on five-channel data feature fusion may be performed on the server S1 side as described above (the architecture shown in Fig. 1), or all operations may be completed in the client device B1, with the client device B1 directly communicatively connected to the database server S2. The choice may be made according to the processing capability of the client device B1 and the restrictions of the user's usage scenario; the application is not limited in this regard. If all operations are completed in the client device B1, the client device B1 may further include a processor for the specific processing of the face recognition living body detection based on five-channel data feature fusion.
Any suitable network protocol may be used for communication between the server and the client device, including protocols not yet developed at the filing date of the present application. The network protocols may include, for example, the TCP/IP, UDP/IP, HTTP and HTTPS protocols. Of course, they may also include protocols used on top of these, for example RPC (Remote Procedure Call) or REST (Representational State Transfer).
In one or more embodiments of the present application, the test sample data is not included in the samples used for model training, and the known judgment result of each test sample is acquired.
Fig. 3 is a schematic flow chart of a face recognition living body detection method based on five-channel data feature fusion in an embodiment of the invention. As shown in Fig. 3, the method may include the following steps:
Step S100: acquiring RGB data, IR Gray data and Depth data of an image to be detected, wherein the RGB data comprises R values, G values and B values of all pixels, the IR Gray data comprises Gray values of all pixels, and the Depth data comprises Depth values of all pixels;
A 3D camera is used to collect the RGB data, IR gray data and Depth data of the image to be detected. It is worth explaining that current 3D camera products mainly use either a binocular structured light (RGB + IR) scheme or a TOF (single IR camera) scheme.
Specifically, a frame synchronization signal is added to the color (RGB) and infrared (IR) channels so that the RGB data and the IR gray data are collected synchronously, and the Depth data is generated by processing the frame data with a 3D depth generation algorithm. Specifically, the Depth data can be generated based on the RGB data and the IR gray data, or according to the IR gray data alone; the TOF scheme requires only IR gray data to generate the Depth data.
Step S200: fusing the R value, the G value, the B value, the Gray value and the Depth value of each pixel of the image to be detected into 4 characteristic values to obtain characteristic data of the image to be detected;
It should be noted that "Gray" is named from color perception, i.e., the image is a grayscale map; in terms of the underlying data format, it is the Y component of the YUV format, which belongs to professional image-format terminology. The gray value used in the embodiment of the present invention may therefore also be written as Y/Gray or Y (Gray).
Step S300: and inputting the characteristic data as a detection sample into a pre-trained deep learning neural network model, and taking the output of the deep learning neural network model as a detection result of whether the image to be detected contains a living body.
Specifically, five-channel fusion is performed on the data of each pixel in the whole image, so that each pixel point is represented by 4 fused feature values. The fused data is input as a detection sample into a pre-trained deep learning neural network model, which serves as a classifier. The classification result may be 0 or 1, where 0 may represent that the image to be detected contains no living body and 1 may represent that it contains a living body, or vice versa; the embodiment of the invention is not limited thereto.
According to the above technical scheme, the 5-channel data of each pixel of the image to be detected (R value, G value, B value, Gray value and Depth value) are fused into 4 feature values, and the fused data are used as the input of a deep learning neural network model. Face recognition living body detection is thus realized with a single deep learning neural network model, which improves the multiplexing degree of the neural network, reduces the total calculation amount and total time consumption, and improves the judgment precision.
In an alternative embodiment, referring to fig. 4, this step S200 may include the following:
Step S210: calculating an average value of an R value and a Gray value of a pixel as a first characteristic value;
Step S220: calculating an average value of the G value and the Gray value of the pixel to be used as a second characteristic value;
Step S230: calculating an average value of the B value and the Gray value of the pixel to be used as a third characteristic value, and taking the Depth value of the pixel as a fourth characteristic value;
And 4 characteristic values corresponding to each pixel of the image to be detected form characteristic data of the image to be detected.
Specifically, the five-channel data are fused into four-channel data (C1, C2, C3, C4), where C1 = (R + Gray)/2, C2 = (G + Gray)/2, C3 = (B + Gray)/2, and C4 = Depth.
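As a minimal illustrative sketch (not part of the claimed scheme; it assumes the three channel maps are already pixel-aligned NumPy arrays of equal height and width, and the function name is hypothetical), this fusion may be written as:

```python
import numpy as np

def fuse_five_to_four(rgb, gray, depth):
    """Fuse per-pixel R, G, B, Gray and Depth values into 4 feature channels.

    rgb: (H, W, 3), gray: (H, W), depth: (H, W); all pixel-aligned.
    Returns an (H, W, 4) float array with C1=(R+Gray)/2, C2=(G+Gray)/2,
    C3=(B+Gray)/2, C4=Depth.
    """
    rgb = rgb.astype(np.float32)
    gray = gray.astype(np.float32)
    c123 = (rgb + gray[..., None]) / 2.0      # (R+Gray)/2, (G+Gray)/2, (B+Gray)/2
    c4 = depth.astype(np.float32)[..., None]  # C4 = Depth, carried through unchanged
    return np.concatenate([c123, c4], axis=-1)
```

The averaging keeps the channel count at four while still mixing the IR gray information into every color channel.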
By adopting the above technical scheme, the five channels of data drawn from the three data sources (RGB data, IR gray data and Depth data) are fused, and the fused data are used as the input of the neural network. This improves the multiplexing degree of the neural network, effectively reduces the total computation and total time consumption of the living body detection function, and at the same time retains the complementary advantages of the data by fusing all channels of data information.
In an alternative embodiment, referring to fig. 5, the face recognition living body detection method based on the five-channel data feature fusion may further include:
Step S400: detecting a human face in the image to be detected according to the RGB data of the image to be detected;
If the image to be detected does not contain a human face, the detection is ended; otherwise, step S200 is performed.
A face detection algorithm is adopted to carry out the face detection.
By adopting the above technical scheme, whether the image to be detected contains a face can be judged before the subsequent detection and calculation. If there is no face, living body detection is unnecessary, so the process can be ended early, avoiding the waste of computing resources when the image to be detected contains no face, further improving efficiency and reducing the amount of calculation.
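This early-exit flow can be sketched as follows (an illustrative sketch only: the detector and liveness routine are injected as parameters because the embodiment does not fix a specific face detection algorithm, and all names here are hypothetical):

```python
def liveness_gate(rgb_image, detect_faces, run_liveness):
    """Run living body detection only when a face is present in the RGB data.

    detect_faces(rgb_image) -> list of face rects (x, y, w, h); any algorithm.
    run_liveness(rgb_image, face_rect) -> True/False liveness result.
    Returns None when no face is found, ending the detection early.
    """
    faces = detect_faces(rgb_image)
    if not faces:
        return None  # no face: skip feature fusion and the neural network entirely
    return run_liveness(rgb_image, faces[0])
```

The gate runs only on the RGB data, so the more expensive fusion and CNN inference are never reached for face-free frames.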
In an alternative embodiment, referring to fig. 6, the face recognition living body detection method based on the five-channel data feature fusion may further include:
Step S500: respectively locating the IR face position and the Depth face position according to the RGB face position detected from the RGB data, and extracting a corresponding face frame according to each face position;
Step S600: and respectively scaling the RGB face frame, the IR face frame and the Depth face frame which are extracted to a preset size, and respectively obtaining corresponding RGB data, IR gray data and Depth data in the scaled RGB face frame, IR face frame and Depth face frame as a data base for obtaining feature data through feature fusion.
Specifically, after the face rect (rectangular area) is located and extracted, the face extracted from each data channel is scaled to the same preset size, such as 64×64 or 128×128, and channel fusion is then performed on the scaled data, which further reduces the calculation amount and improves the detection precision.
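A minimal sketch of this rect extraction and scaling step (nearest-neighbor scaling in plain NumPy for illustration; a production system would more likely use a library resize routine, and the function name is an assumption):

```python
import numpy as np

def crop_and_scale(channel, rect, size=64):
    """Crop a face rect (x, y, w, h) from one channel map and scale it to
    size x size with nearest-neighbor sampling. Works for (H, W) and (H, W, C)."""
    x, y, w, h = rect
    face = channel[y:y + h, x:x + w]
    rows = np.arange(size) * h // size   # nearest source row for each output row
    cols = np.arange(size) * w // size   # nearest source column for each output column
    return face[np.ix_(rows, cols)]
```

Applying the same rect and size to the RGB, IR and Depth maps keeps the scaled face frames pixel-for-pixel comparable before fusion.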
In an alternative embodiment, referring to fig. 7, the face recognition living body detection method based on the five-channel data feature fusion may further include:
Step S700: carrying out pixel alignment on the RGB data, the IR gray data and the Depth data of the image to be detected according to factory calibration information of the data acquisition equipment.
Through pixel alignment of the RGB, IR and Depth data, the pixels at each pixel position belong to the same target, which helps accurately locate the face position in the IR and Depth data according to the face position in the RGB data, further improving the accuracy and stability of living body detection.
By adopting the above technical scheme, the RGB, Depth and Gray data collected by the 3D camera are fused after pixel alignment. The fusion superimposes the three types of aligned data at the pixel level as RGB + D + G. Taking the RGB data as an example, each pixel carries three types of information (R + G + B); on this basis, after the Depth and Gray data are fused, each pixel carries R + G + B + Depth + Gray five-channel data. The fused data are then input into a customized and optimized deep learning CNN network, whose output result is living or non-living.
In an alternative embodiment, the deep learning neural network model is a CNN model.
A CNN convolutional neural network is suitable for classification and regression on data with an image structure. For example, the embodiment of the invention may select the mobile-end lightweight neural network structure MobileNet V2, and may further perform cutting and optimization of MobileNet V2 deeply matched to the training task, so that the final floating point calculation amount is 1.5 MFLOPS.
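As an illustrative stand-in (assuming PyTorch; the embodiment uses a trimmed MobileNet V2 whose exact cutting is not specified here, so this tiny network and its name are assumptions, not the claimed model), a CNN binary classifier over 4-channel fused input might look like:

```python
import torch
import torch.nn as nn

class LivenessNet(nn.Module):
    """Tiny 4-channel-input CNN classifier: logit index 0 = non-living, 1 = living."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.BatchNorm2d(16), nn.ReLU6(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU6(),
            nn.AdaptiveAvgPool2d(1),   # global average pool to (N, 32, 1, 1)
        )
        self.classifier = nn.Linear(32, 2)

    def forward(self, x):              # x: (N, 4, H, W) fused feature data
        return self.classifier(self.features(x).flatten(1))
```

The only structural point the text fixes is the 4-channel input; everything downstream of the first convolution is an ordinary lightweight classifier.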
In an optional embodiment, the face recognition living body detection method based on five-channel data feature fusion may further include: model training step.
In the training process of the model, multiple groups of training data can be collected for the usage scenario, for example about 10,000 groups of training data and about 10,000 groups of test data (with a positive-to-negative sample ratio of 6:4, i.e., about 6,000 groups of positive sample data and about 4,000 groups of negative sample data). During training, brightness, contrast, noise and the like are randomly added to the RGB data in each group for data expansion, and brightness and noise are added to the Y/Gray data; the Depth data undergoes no data expansion. The test precision of the trained model on the test set reaches a very high level.
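A sketch of this per-channel expansion policy (the parameter ranges are illustrative assumptions, not values given in the embodiment):

```python
import numpy as np

def augment_sample(rgb, gray, depth, rng):
    """Apply the expansion policy: brightness/contrast/noise on RGB,
    brightness/noise on Y/Gray, and no expansion at all on Depth."""
    rgb = rgb.astype(np.float32) * rng.uniform(0.8, 1.2)       # random contrast
    rgb += rng.uniform(-20, 20) + rng.normal(0, 5, rgb.shape)  # brightness + noise
    gray = gray.astype(np.float32) + rng.uniform(-20, 20)      # brightness only
    gray += rng.normal(0, 5, gray.shape)                       # plus noise
    return np.clip(rgb, 0, 255), np.clip(gray, 0, 255), depth  # Depth untouched
```

Leaving the Depth channel unperturbed matches the text: photometric jitter has no physical meaning for depth measurements.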
Specifically, using the above technical scheme, the training data undergoes face detection, pixel alignment, face region extraction, data scaling, and data feature extraction and fusion, and is then input into a pre-established network model. The output of the network model is compared with the label of the corresponding training sample, and the parameters of the network model are then adjusted by back-propagation to realize model training.
It is worth noting that the face recognition living body detection scheme based on five-channel data feature fusion provided by the embodiment of the invention makes full use of the fused information of the existing data under the premise of a single algorithm model, and can reach a training precision meeting commercial requirements while reducing the number of training samples to the greatest extent.
In an optional embodiment, the face recognition living body detection method based on five-channel data feature fusion may further include: and (3) model testing.
Specifically, a plurality of test samples with preset labels (a label indicates whether the test sample contains a living human face) are obtained and input into the pre-trained neural network model. The output of the model is compared with the corresponding labels, and on this basis the precision of the pre-trained neural network model is judged. The precision of the model is then compared with a preset requirement: if the requirement is met, the model is applied for detection; if not, the model is adjusted or the training samples are reorganized for retraining, until the model precision meets the test requirement.
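The test loop described above can be sketched as follows (the function names and the accuracy threshold are illustrative assumptions; the embodiment only says the precision is compared with a preset requirement):

```python
def evaluate_model(predict, labeled_samples, required_accuracy=0.99):
    """Compare model outputs with known labels; return the measured accuracy
    and whether it meets the preset requirement."""
    correct = sum(1 for sample, label in labeled_samples if predict(sample) == label)
    accuracy = correct / len(labeled_samples)
    return accuracy, accuracy >= required_accuracy
```

If the second return value is False, the caller would adjust the model or reorganize the training samples and retrain, as the text describes.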
It is worth noting that the face recognition living body detection method based on five-channel data feature fusion provided by the embodiment of the invention can also be integrated on an integrated image acquisition and processing device, so as to collect image data and process the images in real time.
Based on the same inventive concept, the embodiment of the application also provides a face recognition living body detection device based on five-channel data feature fusion, which can be used for realizing the method described in the embodiment, as described in the following embodiment. Because the principle of the face recognition living body detection device based on the five-channel data feature fusion for solving the problem is similar to that of the method, the implementation of the face recognition living body detection device based on the five-channel data feature fusion can be referred to the implementation of the method, and the repetition is omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 8 is a block diagram of a face recognition living body detection device based on five-channel data feature fusion in an embodiment of the present invention. As shown in fig. 8, the face recognition living body detection device based on five-channel data feature fusion specifically includes: a data acquisition module 10, a feature extraction module 20, and a living body detection module 30.
The data acquisition module 10 acquires RGB data, IR gray data and Depth data of an image to be detected, wherein the RGB data comprises the R value, G value and B value of each pixel, the IR gray data comprises the Gray value of each pixel, and the Depth data comprises the Depth value of each pixel;
the feature extraction module 20 fuses the R value, the G value, the B value, the Gray value and the Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected;
the living body detection module 30 inputs the feature data as a detection sample into a pre-trained deep learning neural network model, and takes the output of the deep learning neural network model as the detection result of whether the image to be detected contains a living body.
According to the above technical scheme, the 5-channel data of each pixel of the image to be detected (R value, G value, B value, Gray value and Depth value) are fused into 4 feature values, and the fused data are used as the input of a deep learning neural network model. Face recognition living body detection is thus realized with a single deep learning neural network model, which improves the multiplexing degree of the neural network, reduces the total calculation amount and total time consumption, and improves the judgment precision.
Fig. 9 is a schematic flow diagram of a face recognition living body detection method based on RGBD four-channel feature fusion in the embodiment of the present invention; as shown in fig. 9, the face recognition living body detection method based on RGBD four-channel feature fusion may include the following:
Step S100': acquiring RGB data and Depth data of an image to be detected, wherein the RGB data comprises R values, G values and B values of pixels, and the Depth data comprises the Depth values of the pixels;
A 3D camera is used to collect the RGB data and Depth data of the image to be detected. It is worth explaining that current 3D camera products mainly use either a binocular structured light (RGB + IR) scheme or a TOF (single IR camera) scheme.
Specifically, a frame synchronization signal is added to the color (RGB) and infrared (IR) channels so that the RGB data and the IR gray data are collected synchronously, and the Depth data is generated by processing the frame data with a 3D depth generation algorithm. Specifically, the Depth data can be generated based on the RGB data and the IR gray data, or according to the IR gray data alone; the TOF scheme requires only IR gray data to generate the Depth data.
Step S200': fusing the R value, the G value, the B value and the Depth value of each pixel of the image to be detected into four-channel characteristic values to obtain characteristic data of the image to be detected;
Step S300': and inputting the characteristic data as a detection sample into a pre-trained deep learning neural network model, and taking the output of the deep learning neural network model as a detection result of whether the image to be detected contains a living body.
Specifically, four-channel fusion is performed on the data of each pixel in the whole image, so that each pixel point is represented by 4 fused feature values. The fused data is input as a detection sample into a pre-trained deep learning neural network model, which serves as a classifier. The classification result may be 0 or 1, where 0 may represent that the image to be detected contains no living body and 1 may represent that it contains a living body, or vice versa; the embodiment of the invention is not limited thereto.
According to the above technical scheme, the R value, G value, B value and Depth value of each pixel of the image to be detected are fused into four-channel feature values, and the fused data are used as the input of a deep learning neural network model. Face recognition living body detection is realized with a single deep learning neural network model, which improves the multiplexing degree of the neural network, reduces the total calculation amount and total time consumption, and improves the judgment precision.
In an alternative embodiment, referring to fig. 10, this step S200' may include the following:
Step S210': taking the R value of a pixel as a first characteristic value;
Step S220': taking the G value of the pixel as a second characteristic value;
Step S230': taking the B value of the pixel as a third characteristic value;
step S240': and taking the Depth value of the pixel as a fourth characteristic value.
And 4 characteristic values corresponding to each pixel of the image to be detected form characteristic data of the image to be detected.
Specifically, the RGB data and Depth data are fused into four-channel data (C1, C2, C3, C4), where C1 = R, C2 = G, C3 = B, and C4 = Depth.
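Since no averaging is involved in this variant, the fusion reduces to a per-pixel channel stack. A minimal NumPy sketch (assuming pixel-aligned arrays; the function name is an assumption):

```python
import numpy as np

def fuse_rgbd(rgb, depth):
    """Stack pixel-aligned RGB (H, W, 3) and Depth (H, W) into four-channel
    data (H, W, 4) with C1=R, C2=G, C3=B, C4=Depth."""
    return np.concatenate(
        [rgb.astype(np.float32), depth.astype(np.float32)[..., None]], axis=-1)
```

The result can be fed to the same kind of 4-channel-input classifier as in the five-channel scheme.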
By adopting the above technical scheme, the RGB data and Depth data are fused, and the fused data are used as the input of the neural network. This improves the multiplexing degree of the neural network, effectively reduces the total computation and total time consumption of the living body detection function, and at the same time retains the complementary advantages of the data by fusing all channels of data information.
In an alternative embodiment, referring to fig. 11, the face recognition living body detection method based on RGBD four-channel feature fusion may further include:
Step S400': detecting a human face in the image to be detected according to the RGB data of the image to be detected;
If the image to be detected does not contain a human face, the detection is ended; otherwise, step S200' is performed.
A face detection algorithm is adopted to carry out the face detection.
By adopting the above technical scheme, whether the image to be detected contains a face can be judged before the subsequent detection and calculation. If there is no face, living body detection is unnecessary, so the process can be ended early, avoiding the waste of computing resources when the image to be detected contains no face, further improving efficiency and reducing the amount of calculation.
In an alternative embodiment, referring to fig. 12, the face recognition living body detection method based on RGBD four-channel feature fusion may further include:
Step S500': locating the Depth face position according to the RGB face position detected from the RGB data, and extracting the corresponding face frame according to each face position;
Step S600': scaling the RGB face frames and Depth face frames to preset sizes respectively, and obtaining corresponding RGB data and Depth data in the scaled RGB face frames and Depth face frames respectively as a data basis for feature fusion to obtain feature data.
Specifically, after the face rect (rectangular area) is located and extracted, the face extracted from each data channel is scaled to the same preset size, such as 64×64 or 128×128, and channel fusion is then performed on the scaled data, which further reduces the calculation amount and improves the detection precision.
In an alternative embodiment, referring to fig. 13, the face recognition living body detection method based on RGBD four-channel feature fusion may further include:
Step S700': carrying out pixel alignment on the RGB data and Depth data of the image to be detected according to factory calibration information of the data acquisition equipment.
By pixel-aligning the RGB and Depth data, the pixels at each pixel position belong to the same target, which helps accurately locate the face position in the Depth data according to the face position in the RGB data, further improving the accuracy and stability of living body detection.
By adopting the above technical scheme, the two types of aligned data are superimposed at the pixel level as RGB + D. Taking the RGB data as an example, it forms three channels of data, and each pixel carries three types of information (R + G + B); after the Depth data is fused on this basis, each pixel carries R + G + B + Depth four-channel data. The fused data are then input into a customized and optimized deep learning CNN network, whose output result is living or non-living.
In an alternative embodiment, the deep learning neural network model is a CNN model.
A CNN convolutional neural network is suitable for classification and regression on data with an image structure. For example, the embodiment of the invention may select the mobile-end lightweight neural network structure MobileNet V2, and may further perform cutting and optimization of MobileNet V2 deeply matched to the training task, so that the final floating point calculation amount is 1.5 MFLOPS.
In an alternative embodiment, the face recognition living body detection method based on RGBD four-channel feature fusion may further include: model training step.
In the model training process, multiple groups of training data can be collected for the usage scenario, for example about 10,000 groups of training data and about 10,000 groups of test data (with a positive-to-negative sample ratio of 6:4, i.e., about 6,000 groups of positive sample data and about 4,000 groups of negative sample data). During training, data expansion is carried out by randomly adding brightness, contrast, noise and the like to the RGB data in each group; the Depth data undergoes no data expansion. The test precision of the trained model on the test set reaches a very high level.
Specifically, using the above technical scheme, the training data undergoes face detection, pixel alignment, face region extraction, data scaling, and data feature extraction and fusion, and is then input into a pre-established network model. The output of the network model is compared with the label of the corresponding training sample, and the parameters of the network model are then adjusted by back-propagation to realize model training.
It is worth noting that the face recognition living body detection scheme based on RGBD four-channel feature fusion provided by the embodiment of the invention makes full use of the fused information of the existing data under the premise of a single algorithm model, and can reach a training precision meeting commercial requirements while reducing the number of training samples to the greatest extent.
In an alternative embodiment, the face recognition living body detection method based on RGBD four-channel feature fusion may further include: and (3) model testing.
Specifically, a plurality of test samples with preset labels (a label indicates whether the test sample contains a living human face) are obtained and input into the pre-trained neural network model. The output of the model is compared with the corresponding labels, and on this basis the precision of the pre-trained neural network model is judged. The precision of the model is then compared with a preset requirement: if the requirement is met, the model is applied for detection; if not, the model is adjusted or the training samples are reorganized for retraining, until the model precision meets the test requirement.
It is worth noting that the face recognition living body detection method based on RGBD four-channel feature fusion provided by the embodiment of the invention can also be integrated on an integrated image acquisition and processing device, so as to collect image data and process the images in real time.
Based on the same inventive concept, the embodiment of the application also provides a human face identification living body detection device based on RGBD four-channel feature fusion, which can be used for realizing the method described in the embodiment, as described in the following embodiment. Because the principle of solving the problem of the human face recognition living body detection device based on RGBD four-channel feature fusion is similar to that of the method, the implementation of the human face recognition living body detection device based on RGBD four-channel feature fusion can be referred to the implementation of the method, and the repetition is omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements the intended function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 14 is a block diagram of a face recognition living body detection apparatus based on RGBD four-channel feature fusion in the embodiment of the present invention. As shown in fig. 14, the human face recognition living body detection device based on RGBD four-channel feature fusion specifically includes: a data acquisition module 10', a feature extraction module 20', and a living body detection module 30'.
The data acquisition module 10' acquires RGB data and Depth data of an image to be detected, the RGB data including the R value, G value and B value of each pixel, and the Depth data including the Depth value of each pixel;
The feature extraction module 20' fuses the R value, the G value, the B value and the Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected;
The living body detection module 30' inputs the feature data as a detection sample into a pre-trained deep learning neural network model, and takes the output of the deep learning neural network model as the detection result of whether the image to be detected contains a living body.
According to the above technical scheme, the R value, G value, B value and Depth value of each pixel of the image to be detected are fused into four-channel feature values, and the fused data are used as the input of a deep learning neural network model. Face recognition living body detection is realized with a single deep learning neural network model, which improves the multiplexing degree of the neural network, reduces the total calculation amount and total time consumption, and improves the judgment precision.
The apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is an electronic device, which may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In a typical example, the electronic device specifically includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the program to implement the steps of the face recognition living detection method based on five-channel data feature fusion described above.
Referring now to fig. 15, a schematic diagram of an electronic device 600 suitable for use in implementing embodiments of the present application is shown.
As shown in fig. 15, the electronic apparatus 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate operations and processing according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602 and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present invention includes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the face recognition living body detection method based on five-channel data feature fusion described above.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611.
Computer readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape/magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in the same piece or pieces of software and/or hardware when implementing the present application.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, the embodiments are described in a progressive manner; identical or similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding description of the method embodiments.
The foregoing is merely exemplary of the present application and is not intended to limit it. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall be included in the scope of the claims of the present application.

Claims (12)

1. A face recognition living body detection method based on multi-channel data feature fusion, characterized by comprising the following steps:
acquiring RGB data, IR Gray data and Depth data of an image to be detected, wherein the RGB data comprises the R value, G value and B value of each pixel, the IR Gray data comprises the Gray value of each pixel, and the Depth data comprises the Depth value of each pixel;
fusing the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected;
inputting the feature data as a detection sample into a pre-trained deep learning neural network model, and taking the output of the deep learning neural network model as the detection result of whether the image to be detected contains a living body;
wherein fusing the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values comprises:
calculating the average of the R value and the Gray value of a pixel as a first feature value;
calculating the average of the G value and the Gray value of the pixel as a second feature value;
calculating the average of the B value and the Gray value of the pixel as a third feature value, and taking the Depth value of the pixel as a fourth feature value;
wherein the 4 feature values corresponding to each pixel of the image to be detected constitute the feature data of the image to be detected;
and wherein a frame synchronization signal is added to the color RGB and infrared IR channels so that the RGB data and the IR gray data are acquired synchronously, and the Depth data is generated by processing the frame data.
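As an illustrative sketch of the per-pixel fusion step in claim 1 — the function name, array shapes, and dtypes are assumptions of this example, not part of the patent — the 5-channel-to-4-channel fusion can be written in NumPy as:

```python
import numpy as np

def fuse_five_to_four(rgb, gray, depth):
    """Fuse per-pixel R/G/B, IR gray, and depth values into 4 feature
    channels as in claim 1: channels 1-3 average each color channel
    with the IR gray value; channel 4 is the depth value unchanged.

    rgb:   (H, W, 3) array; gray: (H, W) array; depth: (H, W) array.
    Returns a (H, W, 4) float32 feature array.
    """
    rgb = rgb.astype(np.float32)
    gray = gray.astype(np.float32)
    feat = np.empty(rgb.shape[:2] + (4,), dtype=np.float32)
    feat[..., 0] = (rgb[..., 0] + gray) / 2.0  # mean of R and Gray
    feat[..., 1] = (rgb[..., 1] + gray) / 2.0  # mean of G and Gray
    feat[..., 2] = (rgb[..., 2] + gray) / 2.0  # mean of B and Gray
    feat[..., 3] = depth.astype(np.float32)    # depth passes through
    return feat
```

The resulting (H, W, 4) array is what the claims call the feature data, i.e. the detection sample fed to the deep learning neural network model.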
2. The face recognition living body detection method based on multi-channel data feature fusion according to claim 1, wherein before fusing the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values to obtain the feature data of the image to be detected, the method further comprises:
detecting a human face in the image to be detected according to the RGB data of the image to be detected;
and ending the detection if the image to be detected does not contain a human face.
3. The face recognition living body detection method based on multi-channel data feature fusion according to claim 2, wherein before fusing the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values to obtain the feature data of the image to be detected, the method further comprises:
locating the IR face position and the Depth face position respectively according to the RGB face position detected from the RGB data, and cropping a corresponding face frame at each face position;
and scaling the cropped RGB face frame, IR face frame and Depth face frame to a preset size respectively, and obtaining the corresponding RGB data, IR gray data and Depth data from the scaled face frames as the data basis from which the feature data is obtained by feature fusion.
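A minimal sketch of the crop-and-scale step in claim 3, assuming the modalities are already pixel-aligned so the same face box applies to each image. The function name, the (x, y, w, h) box format, and the nearest-neighbor resampling are assumptions of this example:

```python
import numpy as np

def crop_and_resize(img, box, size):
    """Crop a face box (x, y, w, h) from a single-channel image and
    rescale the crop to a preset square size with nearest-neighbor
    sampling. Applied identically to the RGB, IR, and Depth images
    once the face position has been located on the RGB image.
    """
    x, y, w, h = box
    crop = img[y:y + h, x:x + w]
    ys = np.arange(size) * h // size  # source row per output row
    xs = np.arange(size) * w // size  # source col per output col
    return crop[ys][:, xs]
```

In practice a library resize with interpolation would be used; the point is only that all three cropped frames are scaled to one preset network input size.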
4. The face recognition living body detection method based on multi-channel data feature fusion according to claim 1, wherein before fusing the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values to obtain the feature data of the image to be detected, the method further comprises:
performing pixel alignment on the RGB data, the IR gray data and the Depth data of the image to be detected according to factory calibration information of the data acquisition device.
5. A face recognition living body detection method based on multi-channel feature fusion, characterized by comprising the following steps:
acquiring RGB data and Depth data of an image to be detected, wherein the RGB data comprises the R value, G value and B value of each pixel, and the Depth data comprises the Depth value of each pixel;
fusing the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected;
inputting the feature data as a detection sample into a pre-trained deep learning neural network model, and taking the output of the deep learning neural network model as the detection result of whether the image to be detected contains a living body;
wherein fusing the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values comprises:
taking the R value of a pixel as a first feature value;
taking the G value of the pixel as a second feature value;
taking the B value of the pixel as a third feature value;
taking the Depth value of the pixel as a fourth feature value;
wherein the 4 feature values corresponding to each pixel of the image to be detected constitute the feature data of the image to be detected;
and wherein a frame synchronization signal is added to the color RGB and infrared IR channels so that the RGB data and the IR gray data are acquired synchronously, and the Depth data is generated by processing the frame data.
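The four-channel fusion of claim 5 amounts to stacking the depth plane behind the three color planes. A minimal NumPy sketch (array names and shapes are assumptions of the example):

```python
import numpy as np

def fuse_rgbd(rgb, depth):
    """Stack the R, G, B and Depth planes into one 4-channel feature
    map, as in claim 5: the color channels are taken as-is and the
    depth value becomes the fourth channel.

    rgb: (H, W, 3) array; depth: (H, W) array.
    Returns a (H, W, 4) float32 array.
    """
    return np.dstack([rgb.astype(np.float32),
                      depth.astype(np.float32)])
```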
6. The face recognition living body detection method based on multi-channel feature fusion according to claim 5, wherein before fusing the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values to obtain the feature data of the image to be detected, the method further comprises:
detecting a human face in the image to be detected according to the RGB data of the image to be detected;
and ending the detection if the image to be detected does not contain a human face.
7. The face recognition living body detection method based on multi-channel feature fusion according to claim 6, wherein before fusing the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values to obtain the feature data of the image to be detected, the method further comprises:
locating the Depth face position according to the RGB face position detected from the RGB data, and cropping a corresponding face frame at each face position;
and scaling the RGB face frame and the Depth face frame to a preset size respectively, and obtaining the corresponding RGB data and Depth data from the scaled face frames as the data basis from which the feature data is obtained by feature fusion.
8. The face recognition living body detection method based on multi-channel feature fusion according to claim 5, wherein before fusing the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values to obtain the feature data of the image to be detected, the method further comprises:
performing pixel alignment on the RGB data and Depth data of the image to be detected according to factory calibration information of the data acquisition device.
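The pixel-alignment step of claim 8 is not specified in detail in the claims. As a simplified, assumed sketch: if the factory calibration were reduced to a 2x3 affine transform between the two image planes, the resampling could look like the following inverse warp. Real RGB-D registration uses both cameras' intrinsics and extrinsics plus the per-pixel depth; the function and transform here are illustrative only.

```python
import numpy as np

def align_depth_to_rgb(depth, affine):
    """Warp a depth map onto the RGB camera's pixel grid using a 2x3
    affine transform (assumed to come from factory calibration).
    Nearest-neighbor inverse warp: for each RGB pixel, look up the
    corresponding depth-image coordinate.
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # inverse-map each RGB pixel back into depth-image coordinates
    src_x = affine[0, 0] * xs + affine[0, 1] * ys + affine[0, 2]
    src_y = affine[1, 0] * xs + affine[1, 1] * ys + affine[1, 2]
    sx = np.clip(np.round(src_x).astype(int), 0, w - 1)
    sy = np.clip(np.round(src_y).astype(int), 0, h - 1)
    return depth[sy, sx]
```

After this step each (row, col) position refers to the same scene point in the RGB, IR, and Depth arrays, which is what the per-pixel fusion in the independent claims relies on.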
9. A face recognition living body detection device based on multi-channel data feature fusion, characterized by comprising:
a first data acquisition module, configured to acquire RGB data, IR Gray data and Depth data of an image to be detected, wherein the RGB data comprises the R value, G value and B value of each pixel, the IR Gray data comprises the Gray value of each pixel, and the Depth data comprises the Depth value of each pixel;
a first feature extraction module, configured to fuse the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values to obtain feature data of the image to be detected;
a first living body detection module, configured to input the feature data as a detection sample into a pre-trained deep learning neural network model, and take the output of the deep learning neural network model as the detection result of whether the image to be detected contains a living body;
wherein fusing the R value, G value, B value, Gray value and Depth value of each pixel of the image to be detected into 4 feature values comprises:
calculating the average of the R value and the Gray value of a pixel as a first feature value;
calculating the average of the G value and the Gray value of the pixel as a second feature value;
calculating the average of the B value and the Gray value of the pixel as a third feature value, and taking the Depth value of the pixel as a fourth feature value;
wherein the 4 feature values corresponding to each pixel of the image to be detected constitute the feature data of the image to be detected;
and wherein a frame synchronization signal is added to the color RGB and infrared IR channels so that the RGB data and the IR gray data are acquired synchronously, and the Depth data is generated by processing the frame data.
10. A face recognition living body detection device based on multi-channel feature fusion, characterized by comprising:
a second data acquisition module, configured to acquire RGB data and Depth data of an image to be detected, wherein the RGB data comprises the R value, G value and B value of each pixel, and the Depth data comprises the Depth value of each pixel;
a second feature extraction module, configured to fuse the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values to obtain feature data of the image to be detected;
a second living body detection module, configured to input the feature data as a detection sample into a pre-trained deep learning neural network model, and take the output of the deep learning neural network model as the detection result of whether the image to be detected contains a living body;
wherein fusing the R value, G value, B value and Depth value of each pixel of the image to be detected into four-channel feature values comprises:
taking the R value of a pixel as a first feature value;
taking the G value of the pixel as a second feature value;
taking the B value of the pixel as a third feature value;
taking the Depth value of the pixel as a fourth feature value;
wherein the 4 feature values corresponding to each pixel of the image to be detected constitute the feature data of the image to be detected; and wherein a frame synchronization signal is added to the color RGB and infrared IR channels so that the RGB data and the IR gray data are acquired synchronously, and the Depth data is generated by processing the frame data.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the face recognition living body detection method based on multi-channel data feature fusion according to any one of claims 1 to 8.
12. A computer readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the steps of the face recognition living body detection method based on multi-channel data feature fusion according to any one of claims 1 to 8.
CN202110299117.4A 2021-03-20 2021-03-20 Face recognition living body detection method and device based on multichannel data feature fusion Active CN112926497B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110299117.4A CN112926497B (en) 2021-03-20 2021-03-20 Face recognition living body detection method and device based on multichannel data feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110299117.4A CN112926497B (en) 2021-03-20 2021-03-20 Face recognition living body detection method and device based on multichannel data feature fusion

Publications (2)

Publication Number Publication Date
CN112926497A CN112926497A (en) 2021-06-08
CN112926497B true CN112926497B (en) 2024-07-05

Family

ID=76175251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110299117.4A Active CN112926497B (en) 2021-03-20 2021-03-20 Face recognition living body detection method and device based on multichannel data feature fusion

Country Status (1)

Country Link
CN (1) CN112926497B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114529506A (en) * 2021-12-31 2022-05-24 厦门阳光恩耐照明有限公司 Lamplight monitoring method and system based on machine learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767879A (en) * 2020-07-03 2020-10-13 北京视甄智能科技有限公司 Living body detection method
CN112052832A (en) * 2020-09-25 2020-12-08 北京百度网讯科技有限公司 Face detection method, device and computer storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038486A (en) * 2017-12-05 2018-05-15 河海大学 A kind of character detecting method
CN109034102B (en) * 2018-08-14 2023-06-16 腾讯科技(深圳)有限公司 Face living body detection method, device, equipment and storage medium
CN111046703B (en) * 2018-10-12 2023-04-18 杭州海康威视数字技术股份有限公司 Face anti-counterfeiting detection method and device and multi-view camera
CN111382592B (en) * 2018-12-27 2023-09-29 杭州海康威视数字技术股份有限公司 Living body detection method and apparatus
CN110135259A (en) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Silent formula living body image identification method, device, computer equipment and storage medium
CN110135370B (en) * 2019-05-20 2022-09-09 北京百度网讯科技有限公司 Method and device for detecting living human face, electronic equipment and computer readable medium
CN111881706B (en) * 2019-11-27 2021-09-03 马上消费金融股份有限公司 Living body detection, image classification and model training method, device, equipment and medium
CN111666884B (en) * 2020-06-08 2023-08-25 睿云联(厦门)网络通讯技术有限公司 Living body detection method, living body detection device, computer readable medium and electronic equipment
CN111738995B (en) * 2020-06-10 2023-04-14 苏宁云计算有限公司 RGBD image-based target detection method and device and computer equipment
CN112232109B (en) * 2020-08-31 2024-06-04 奥比中光科技集团股份有限公司 Living body face detection method and system
CN112052830B (en) * 2020-09-25 2022-12-20 北京百度网讯科技有限公司 Method, device and computer storage medium for face detection

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767879A (en) * 2020-07-03 2020-10-13 北京视甄智能科技有限公司 Living body detection method
CN112052832A (en) * 2020-09-25 2020-12-08 北京百度网讯科技有限公司 Face detection method, device and computer storage medium

Also Published As

Publication number Publication date
CN112926497A (en) 2021-06-08

Similar Documents

Publication Publication Date Title
CN108197618B (en) Method and device for generating human face detection model
CN111369427B (en) Image processing method, image processing device, readable medium and electronic equipment
CN114279433B (en) Automatic map data production method, related device and computer program product
CN115699082A (en) Defect detection method and device, storage medium and electronic equipment
CN113591562B (en) Image processing method, device, electronic equipment and computer readable storage medium
CN109242834A (en) It is a kind of based on convolutional neural networks without reference stereo image quality evaluation method
CN111325107B (en) Detection model training method, device, electronic equipment and readable storage medium
CN112668638A (en) Image aesthetic quality evaluation and semantic recognition combined classification method and system
CN111402151A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN114677422A (en) Depth information generation method, image blurring method and video blurring method
CN112926497B (en) Face recognition living body detection method and device based on multichannel data feature fusion
CN111382654B (en) Image processing method and device and storage medium
JP2023131117A (en) Joint perception model training, joint perception method, device, and medium
CN115984930A (en) Micro expression recognition method and device and micro expression recognition model training method
CN113920023B (en) Image processing method and device, computer readable medium and electronic equipment
CN115620054A (en) Defect classification method and device, electronic equipment and storage medium
CN111046223A (en) Voice assisting method, terminal, server and system for visually impaired
CN111126159A (en) Method, apparatus, electronic device, and medium for tracking pedestrian in real time
CN115049675A (en) Generation area determination and light spot generation method, apparatus, medium, and program product
CN114332993A (en) Face recognition method and device, electronic equipment and computer readable storage medium
WO2024198475A1 (en) Face anti-spoofing recognition method and apparatus, and electronic device and storage medium
CN110135517B (en) Method and device for obtaining vehicle similarity
CN112926498B (en) Living body detection method and device based on multichannel fusion and depth information local dynamic generation
CN111310595A (en) Method and apparatus for generating information
CN113642353B (en) Training method of face detection model, storage medium and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant