
CN105138973A - Face authentication method and device - Google Patents

Face authentication method and device

Info

Publication number
CN105138973A
Authority
CN
China
Prior art keywords
face image
vector
mapping
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510490244.7A
Other languages
Chinese (zh)
Other versions
CN105138973B (en)
Inventor
郇淑雯
毛秀萍
张伟琳
朱和贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Eyes Intelligent Technology Co ltd
Beijing Eyecool Technology Co Ltd
Original Assignee
Beijing Techshino Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Techshino Technology Co Ltd filed Critical Beijing Techshino Technology Co Ltd
Priority to CN201510490244.7A priority Critical patent/CN105138973B/en
Publication of CN105138973A publication Critical patent/CN105138973A/en
Application granted granted Critical
Publication of CN105138973B publication Critical patent/CN105138973B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Collating Specific Patterns (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face authentication method and device, and belongs to the field of biometric recognition. The method comprises the following steps: sequentially extracting multiple levels of feature vectors from a face image to be authenticated and a face image template with a multi-level deep convolutional network that has been jointly trained in advance with multi-level classification networks; sequentially mapping the multiple levels of feature vectors into unified-dimension feature vectors through unified-dimension linear mapping matrices; concatenating the unified-dimension feature vectors into a joint feature vector; performing dimension-reducing mapping on the joint feature vector through a linear dimension-reducing mapping matrix; and comparing and authenticating the resulting comprehensive feature vector of the face image to be authenticated against the comprehensive feature vector of the face image template through linear discriminant analysis using the absolute-value normalized cosine value. Compared with the prior art, the face authentication method disclosed by the invention has strong anti-interference capability, good expandability and a high authentication accuracy rate.

Description

Face authentication method and device
Technical Field
The invention relates to the field of biometric recognition, and in particular to a method and a device for face authentication.
Background
Face authentication is a form of biometric recognition: the features of two face images are obtained by representing the faces effectively, and a classification algorithm is used to judge whether the two images show the same person. Generally, a face image is stored in the face recognition device in advance and serves as the face image template; during authentication, another face image is captured as the face image to be authenticated, the features of the two images are extracted, and a classification algorithm judges whether the two images show the same person.
One way to extract the features is to design a feature vector manually and compute the specified feature vector with various algorithms, such as face authentication methods based on geometric features, face authentication methods based on subspaces, and face authentication methods based on signal processing.
Face recognition and authentication technology based on deep networks can learn and extract features automatically, but an ordinary deep network suffers from the problem of gradient diffusion, processes and exploits the features of each level insufficiently, and cannot describe the image adequately using high-level features alone.
Disclosure of Invention
The invention provides a method and a device for face authentication.
In order to solve the technical problems, the invention provides the following technical scheme:
a method of face authentication, comprising:
sequentially extracting a plurality of levels of feature vectors from the face image to be authenticated and the face image template by using a multi-level deep convolutional network which is subjected to multi-level classification network joint training in advance;
mapping the feature vectors of multiple levels into uniform dimension feature vectors sequentially through a uniform dimension linear mapping matrix;
the unified dimension feature vectors are connected in series to form a combined feature vector;
carrying out dimensionality reduction mapping on the combined eigenvector through a linear dimensionality reduction mapping matrix to obtain a comprehensive eigenvector;
and comparing and authenticating the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template by using the absolute value normalized cosine value through linear discriminant analysis.
An apparatus for face authentication, comprising:
the first extraction module is used for sequentially extracting a plurality of levels of feature vectors by using a multi-level depth convolution network which is subjected to multi-level classification network joint training in advance for the face image to be authenticated and the face image template;
the first mapping module is used for sequentially mapping the characteristic vectors of a plurality of levels into uniform dimension characteristic vectors through a uniform dimension linear mapping matrix;
the first serial module is used for serially connecting the uniform dimension feature vectors into a combined feature vector;
the second mapping module is used for carrying out dimension reduction mapping on the combined feature vector through a linear dimension reduction mapping matrix to obtain a comprehensive feature vector;
and the first comparison module is used for comparing and authenticating the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template by utilizing the absolute value normalized cosine value through linear discriminant analysis.
The invention has the following beneficial effects:
The face authentication method first extracts multiple levels of feature vectors from the face image to be authenticated and the face image template with a multi-level deep convolutional network that has been jointly trained in advance with multi-level classification networks, then sequentially maps the multiple levels of feature vectors into unified-dimension feature vectors through unified-dimension linear mapping matrices, concatenates the unified-dimension feature vectors into a joint feature vector, and performs dimension-reduction mapping on the joint feature vector through a linear dimension-reduction mapping matrix to obtain a comprehensive feature vector. Finally, the comprehensive feature vector of the face image to be authenticated and the comprehensive feature vector of the face image template are compared and authenticated through linear discriminant analysis using the absolute-value normalized cosine value.
Compared with the prior art, the features are learned and extracted automatically by the multi-level deep convolutional network; compared with the manually designed feature vectors of the prior art, the method has strong anti-interference capability, good expandability and high authentication accuracy.
The multi-level deep convolutional network is obtained through joint training with the multi-level classification networks, which avoids the problem of gradient dispersion, so the authentication accuracy is high.
Moreover, the feature vectors of multiple levels are fused, which increases the feature richness of the image and overcomes the defects of an ordinary deep network, namely insufficient processing of the features of each level and an inadequate description of the image from high-level features alone; the authentication accuracy is further improved.
The inventors also find that traditional comparison authentication methods, in particular the cosine similarity method, ignore the difference in the module lengths (norms) of the vectors, so the description of the difference is incomplete and the comparison accuracy drops; the invention uses linear discriminant analysis to compare several difference features including the absolute-value normalized cosine value, which further improves the authentication accuracy.
Therefore, the face authentication method has strong anti-interference capability, good expandability and high authentication accuracy, avoids the problem of gradient dispersion, and makes up for the inability of high-level features alone to describe the image fully.
Drawings
FIG. 1 is a flow chart of a method of face authentication of the present invention;
FIG. 2 is a schematic diagram of a face authentication apparatus according to the present invention;
FIG. 3 is a schematic diagram of image preprocessing in the present invention;
FIG. 4 is a schematic diagram of training a multi-level deep convolutional network and a classification network according to the present invention;
FIG. 5 is a schematic diagram of the basic convolutional network of the present invention;
FIG. 6 is a schematic diagram of a multi-level deep convolutional network in accordance with the present invention;
FIG. 7 is a schematic diagram of a classification network according to the present invention;
FIG. 8 is a schematic diagram of the downsampling operation in the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
In one aspect, the present invention provides a method for face authentication, as shown in fig. 1, including:
step S101: sequentially extracting a plurality of levels of feature vectors from the face image to be authenticated and the face image template by using a multi-level deep convolutional network which is subjected to multi-level classification network joint training in advance;
the multi-level deep convolutional network comprises more than 2 convolutional networks, each convolutional network comprises convolution, activation and downsampling operations, the sequence and the number of the operations are not fixed and are determined according to actual conditions; each convolution network of the invention extracts a feature vector which can be recorded as fea1,fea2,fea3… (only 1 group of feature vectors of multiple levels, that is, feature vectors of multiple levels of a face image to be authenticated or a face image template are listed here, and the following formula also only writes an image formula), the input of the first convolution network is the face image to be authenticated or the face image template, and the input of the following convolution network is the feature image after the previous convolution network is operated;
the general deep network has the problem of gradient dispersion, and the multi-level deep convolutional network is obtained by performing joint training through a multi-level classification network, so that the problem is avoided.
Step S102: mapping the feature vectors of multiple levels into unified-dimension feature vectors sequentially through unified-dimension linear mapping matrices; the unified-dimension linear mapping matrices are obtained by pre-training and may be denoted W1, W2, W3, ..., and the unified-dimension feature vectors may be denoted f1, f2, f3, ....
Step S103: connecting the unified-dimension feature vectors in series to form a joint feature vector, which may be denoted feature_merge.
Step S104: performing dimension-reduction mapping on the joint feature vector through a linear dimension-reduction mapping matrix to obtain a comprehensive feature vector; the linear dimension-reduction mapping matrix is obtained by pre-training and may be denoted W_T, and the comprehensive feature vector may be denoted f_T.
Step S105: and comparing and authenticating the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template by using the absolute value normalized cosine value through linear discriminant analysis.
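For illustration only, the following Python (NumPy) sketch shows how steps S101 to S105 compose at authentication time; the level extractors, the matrices and the comparison helper are hypothetical placeholders standing in for the trained parameters and for a comparison routine such as the one sketched later for steps S1051 to S1056, and are not part of the patent text.

```python
import numpy as np

def authenticate(face_to_verify, face_template, level_extractors,
                 W_list, W_T, compare_score, threshold):
    """Sketch of steps S101-S105 at authentication time.

    level_extractors: one callable per convolutional level, each mapping the
        current feature map to (next feature map, level feature vector fea_i)
    W_list / W_T:     pre-trained unified-dimension and dimension-reduction matrices
    compare_score:    callable implementing the LDA-based comparison of step S105
    """
    comprehensive = []
    for image in (face_to_verify, face_template):
        x, feas = image, []
        for extract in level_extractors:                    # S101: multi-level features
            x, fea = extract(x)
            feas.append(fea)
        fs = [W @ fea for W, fea in zip(W_list, feas)]       # S102: unified dimension
        feature_merge = np.concatenate(fs)                   # S103: series connection
        comprehensive.append(W_T @ feature_merge)            # S104: comprehensive feature
    return compare_score(*comprehensive) > threshold         # S105: comparison authentication
```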
The face authentication method first extracts multiple levels of feature vectors from the face image to be authenticated and the face image template with a multi-level deep convolutional network that has been jointly trained in advance with multi-level classification networks, then sequentially maps the multiple levels of feature vectors into unified-dimension feature vectors through unified-dimension linear mapping matrices, concatenates the unified-dimension feature vectors into a joint feature vector, and performs dimension-reduction mapping on the joint feature vector through a linear dimension-reduction mapping matrix to obtain a comprehensive feature vector. Finally, the comprehensive feature vector of the face image to be authenticated and the comprehensive feature vector of the face image template are compared and authenticated through linear discriminant analysis using the absolute-value normalized cosine value.
Compared with the prior art, the features are learned and extracted automatically by the multi-level deep convolutional network; compared with the manually designed feature vectors of the prior art, the method has strong anti-interference capability, good expandability and high authentication accuracy.
The multi-level deep convolutional network is obtained through joint training with the multi-level classification networks, which avoids the problem of gradient dispersion, so the authentication accuracy is high.
Moreover, the feature vectors of multiple levels are fused, which increases the feature richness of the image and overcomes the defects of an ordinary deep network, namely insufficient processing of the features of each level and an inadequate description of the image from high-level features alone; the authentication accuracy is further improved.
The inventors also find that traditional comparison authentication methods, in particular the cosine similarity method, ignore the difference in the module lengths (norms) of the vectors, so the description of the difference is incomplete and the comparison accuracy drops; the invention uses linear discriminant analysis to compare several difference features including the absolute-value normalized cosine value, which further improves the authentication accuracy.
Therefore, the face authentication method has strong anti-interference capability, good expandability and high authentication accuracy, avoids the problem of gradient dispersion, and makes up for the inability of high-level features alone to describe the image fully.
As an improvement of the method for authenticating a human face of the present invention, before the step S101, the method further includes:
step S100: and preprocessing the face image to be authenticated and the face image template, wherein the preprocessing comprises characteristic point positioning, image correction and normalization processing. In fact, the face image template may have been pre-processed in advance, and this step may not be performed.
The invention adopts a face detection algorithm based on cascade Adaboost to detect the face in an image, then uses an SDM-based facial feature point localization algorithm to locate the feature points of the detected face, and corrects, normalizes and aligns the face through image scaling, rotation and translation, as shown in FIG. 3.
The invention adopts simple gray normalization preprocessing; its main purpose is to let the network process small continuous data instead of large discrete gray values, thereby avoiding abnormal conditions.
The invention can be used for preprocessing the face image to facilitate the subsequent authentication process and avoid the influence of abnormal pixel points on the authentication result.
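As a minimal sketch of the gray normalization mentioned above: the exact normalization formula is not fixed in this section, so the scaling of 8-bit gray values into [0, 1] below is only an assumed example.

```python
import numpy as np

def gray_normalize(image_u8):
    """Assumed simple gray normalization: scale 8-bit gray values into [0, 1]
    so the network handles small continuous values rather than large discrete
    gray values (the exact formula is not fixed by this section)."""
    return image_u8.astype(np.float32) / 255.0
```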
As another improvement of the method for face authentication of the present invention, each convolution network includes a convolution operation, an activation operation, and a downsampling operation, and the feature vector of each level is calculated by the following steps:
step S1011: carrying out convolution operation on the face image to be authenticated and the face image template by using a convolution kernel to obtain a convolution characteristic graph, wherein the convolution operation is same convolution operation;
the invention adopts the convolution operation in the same form, and zero filling is carried out on the input image during the operation. The feature map obtained by the convolution operation in the same form as the original size of the input image.
Step S1012: and performing activation operation on the convolution characteristic diagram by using an activation function to obtain the activation characteristic diagram, wherein the activation function is a ReLU activation function.
Step S1013: performing down-sampling operation on the activation characteristic graph by using a sampling function to obtain a sampling characteristic graph, wherein the down-sampling operation is maximum value sampling;
the method adopts maximum value sampling, the maximum value sampling takes the maximum value of element values in a sampling block as the characteristic of the sampling block, and in the image processing, the maximum value sampling can extract the texture information of an image and maintain certain invariance of the image to a certain extent, such as rotation, translation, scaling and the like; in addition, according to statistical experiments, compared with average sampling, maximum value sampling is insensitive to data distribution change, and feature extraction is relatively stable.
Step S1014: repeating the steps on the obtained sampling characteristic diagram to obtain a new sampling characteristic diagram, and repeating the steps for a plurality of times;
step S1015: vectorizing all the obtained sampling feature maps to obtain a feature vector of each level, and combining all the sampling feature maps obtained in each step into one vector.
The invention can extract the characteristic vector with rich and stable characteristics, can fully describe the face image and increases the authentication accuracy.
As a further improvement of the face authentication method of the present invention, the multi-level deep convolutional network is obtained by softmax classification network joint training, and includes:
during training, a face image sample library is provided first, and then multiple levels of feature vectors are sequentially extracted from each face image sample with an initialized multi-level deep convolutional network; this is the same as step S101 above, except that step S101 belongs to the authentication process whereas this is the training process, and every parameter of the multi-level deep convolutional network still takes its initial value here;
mapping the feature vectors of multiple levels into uniform dimension feature vectors with the same dimension sequentially through a uniform dimension linear mapping matrix;
respectively mapping the uniform dimension characteristic vectors by using a linear mapping matrix in the softmax classification network to obtain mapping vectors; the linear mapping matrix at the moment takes an initial value;
activating the mapping vector by using a softmax function to obtain a network output value vector;
calculating a network error through a cross entropy loss function by taking the network output value vector and the label data of the face image sample as input quantities;
all the uniform dimension feature vectors are connected in series to form a combined feature vector;
carrying out dimensionality reduction mapping on the combined eigenvector through a linear dimensionality reduction mapping matrix to obtain a comprehensive eigenvector;
distributing weight to the network error, and calculating the update gradients of a linear mapping matrix, a uniform dimension linear mapping matrix, a linear dimension reduction mapping matrix and a convolution kernel;
performing iterative updating on the linear mapping matrix, the unified dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel by using the updating gradients of the linear mapping matrix, the unified dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel;
and judging whether the network error and the number of iterations meet the requirements; if so, ending; otherwise, returning to the step of sequentially extracting multiple levels of feature vectors from the face image samples with the initialized multi-level deep convolutional network.
The network error meets the requirement when it is minimal (or small enough); at that point all parameters (the linear mapping matrices, the unified-dimension linear mapping matrices, the linear dimension-reduction mapping matrix and the convolution kernels) of the multi-level deep convolutional network and the softmax classification networks constitute the trained networks. The number of iterations meets the requirement when it reaches a set value.
The invention carries out combined training through the softmax classification network, further avoids the problem of gradient dispersion, and can further increase the flexibility of network learning by weighting the classification network errors.
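The following structural sketch summarizes the joint training loop described above; the `net` object and its helper methods are hypothetical conveniences for grouping the convolution kernels, the matrices W_i, W_id and W_T together, and are not an API defined by the patent.

```python
import numpy as np

def train_jointly(sample_images, sample_labels, net, error_weights,
                  lr=0.01, max_iters=10000, tol=1e-3):
    """Structural sketch of the joint training loop; `net` is a hypothetical
    object bundling the convolution kernels, the matrices W_i, W_id and W_T,
    with the helper methods named below. error_weights holds one weight per
    level error plus one for the comprehensive-feature error."""
    for _ in range(max_iters):
        total_error = 0.0
        for image, label in zip(sample_images, sample_labels):
            feas = net.extract_levels(image)                      # multi-level feature vectors
            fs = [W @ fea for W, fea in zip(net.W_list, feas)]    # unified-dimension mapping
            errors = [net.softmax_error(f, label) for f in fs]    # per-level cross-entropy errors
            f_T = net.W_T @ np.concatenate(fs)                    # joint feature -> comprehensive
            errors.append(net.softmax_error(f_T, label))
            total_error += sum(w * e for w, e in zip(error_weights, errors))
            net.backprop_and_update(errors, error_weights, lr)    # update W_id, W_i, W_T, kernels
        if total_error / len(sample_images) < tol:                # error requirement met
            break
```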
As still another improvement of the method of face authentication of the present invention, step S105 includes:
step S1051: performing cosine similarity operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain cosine similarity;
step S1052: carrying out absolute value normalization cosine operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain an absolute value normalization cosine value;
step S1053: performing modular operation on the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template to obtain a first modular length and a second modular length;
step S1054: forming a four-dimensional difference vector by the cosine similarity, the absolute value normalized cosine value, the first modular length and the second modular length;
step S1055: mapping the difference vector by using a difference vector mapping matrix to obtain a one-dimensional vector as a comparison score;
step S1056: and comparing the comparison score with a comparison threshold, and if the comparison score is greater than the comparison threshold, the face authentication is passed.
The inventor finds that the traditional comparison authentication method, particularly the cosine similarity method ignores the modular length difference of vectors, so that the difference description is incomplete, and the accuracy of comparison authentication is reduced; the absolute value normalized cosine value is sensitive to the difference of the module lengths of the comparison vectors, and the problem of incomplete description of the difference caused by neglecting the difference of the module lengths of the vectors in cosine similarity can be solved.
Therefore, the invention combines the cosine similarity, the absolute value normalized cosine value and the two characteristic modular lengths of the comparison characteristic into a four-dimensional difference vector to perform linear discriminant analysis, thereby further improving the authentication accuracy.
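A small sketch of steps S1051 to S1056 follows. Since this section does not give the exact formula of the absolute-value normalized cosine, the form used below is only an assumed stand-in chosen because it stays sensitive to the module-length difference; W_diff stands for the difference-vector mapping matrix.

```python
import numpy as np

def compare_score(f1, f2, W_diff):
    """Sketch of steps S1051-S1055; W_diff is the difference-vector mapping
    matrix, taken here as a length-4 array. The 'absolute value normalized
    cosine' below (inner product divided by the larger squared norm) is only
    an assumed stand-in; the exact formula is not given in this section."""
    n1, n2 = np.linalg.norm(f1), np.linalg.norm(f2)        # S1053: first and second module lengths
    cos_sim = float(f1 @ f2) / (n1 * n2)                   # S1051: cosine similarity
    abs_norm_cos = float(f1 @ f2) / max(n1, n2) ** 2       # S1052: assumed absolute-value normalized cosine
    diff = np.array([cos_sim, abs_norm_cos, n1, n2])       # S1054: four-dimensional difference vector
    return float(W_diff @ diff)                            # S1055: one-dimensional comparison score

def authenticate_pair(f1, f2, W_diff, threshold):
    """S1056: the authentication passes if the comparison score exceeds the threshold."""
    return compare_score(f1, f2, W_diff) > threshold
```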
On the other hand, the present invention provides a face authentication device, as shown in FIG. 2, including:
the first extraction module 11 is configured to sequentially extract feature vectors of multiple levels from the face image to be authenticated and the face image template by using a multi-level deep convolutional network that is pre-subjected to multi-level classification network joint training;
the first mapping module 12 is configured to sequentially map the feature vectors of multiple hierarchies into a uniform dimension feature vector through a uniform dimension linear mapping matrix;
the first series module 13 is used for serially connecting the uniform dimension feature vectors into a joint feature vector;
the second mapping module 14 is configured to perform dimension reduction mapping on the joint feature vector through a linear dimension reduction mapping matrix to obtain a comprehensive feature vector;
and the first comparison module 15 is configured to compare and authenticate the obtained comprehensive feature vector of the face image to be authenticated and the comprehensive feature vector of the face image template by using the absolute value normalized cosine value through linear discriminant analysis.
The human face authentication device has the advantages of strong anti-interference capability, good expandability and high authentication accuracy, avoids the problem of gradient dispersion, and overcomes the defect that the image cannot be fully described by using high-level characteristics.
As an improvement of the apparatus for face authentication of the present invention, the first extraction module further includes:
and the preprocessing module is used for preprocessing the face image to be authenticated and the face image template, and the preprocessing comprises characteristic point positioning, image correction and normalization processing.
The invention can be used for preprocessing the face image to facilitate the subsequent authentication process and avoid the influence of abnormal pixel points on the authentication result.
As another improvement of the apparatus for face authentication of the present invention, the feature vector of each hierarchy is calculated by:
the convolution unit is used for carrying out convolution operation on the face image to be authenticated and the face image template by using convolution kernel to obtain a convolution characteristic graph, and the convolution operation is same convolution operation;
the activation unit is used for performing activation operation on the convolution characteristic diagram by using an activation function to obtain an activation characteristic diagram, and the activation function is a ReLU activation function;
the sampling unit is used for carrying out downsampling operation on the activation characteristic graph by using a sampling function to obtain a sampling characteristic graph, and the downsampling operation is maximum value sampling;
the circulation unit is used for repeating the steps on the obtained sampling characteristic diagram to obtain a new sampling characteristic diagram, and repeating the steps for a plurality of times;
and the first vectorization unit is used for vectorizing all the obtained sampling feature maps to obtain the feature vector of each level.
The invention can extract the characteristic vector with rich and stable characteristics, can fully describe the face image and increases the authentication accuracy.
As still another improvement of the apparatus for face authentication of the present invention, the multi-level deep convolutional network is obtained by joint training of softmax classification networks, and includes:
the second extraction module is used for sequentially extracting a plurality of levels of feature vectors from the face image sample by using the initialized multi-level depth convolution network;
the third mapping module is used for sequentially mapping the characteristic vectors of a plurality of levels into uniform dimension characteristic vectors with the same dimension through a uniform dimension linear mapping matrix;
the fourth mapping module is used for respectively mapping the uniform dimension characteristic vectors by using linear mapping matrixes in the softmax classification network to obtain mapping vectors;
the activation module is used for activating the mapping vector by using a softmax function to obtain a network output value vector;
the first calculation module is used for calculating a network error through a cross entropy loss function by taking the network output value vector and the label data of the face image sample as input quantities;
the second serial module is used for serially connecting all the uniform dimension feature vectors into a combined feature vector;
the fifth mapping module is used for carrying out dimension reduction mapping on the combined feature vector through a linear dimension reduction mapping matrix to obtain a comprehensive feature vector;
the second calculation module is used for distributing weight to the network error and calculating a linear mapping matrix, a unified dimension linear mapping matrix, a linear dimensionality reduction mapping matrix and an update gradient of a convolution kernel;
the updating module is used for carrying out iterative updating on the linear mapping matrix, the unified dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel by utilizing the updating gradients of the linear mapping matrix, the unified dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel;
and the judging module is used for judging whether the network error and the iteration times meet the requirements, if so, ending, and otherwise, turning to the second extracting module.
The invention carries out combined training through the softmax classification network, further avoids the problem of gradient dispersion, and can further increase the flexibility of network learning by weighting the classification network errors.
As still another improvement of the apparatus for face authentication of the present invention, the first comparison module includes:
the first calculation unit is used for carrying out cosine similarity operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain cosine similarity;
the second calculation unit is used for carrying out absolute value normalization cosine operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain an absolute value normalization cosine value;
the third calculation unit is used for performing modular operation on the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template to obtain a first modular length and a second modular length;
the second vectorization unit is used for forming a four-dimensional difference vector from the cosine similarity, the absolute-value normalized cosine value, the first module length and the second module length;
the mapping unit is used for mapping the difference vector by using the difference vector mapping matrix to obtain a one-dimensional vector as a comparison score;
and the comparison unit is used for comparing the comparison score with the comparison threshold, and if the comparison score is greater than the comparison threshold, the face authentication is passed.
The invention combines the cosine similarity, the absolute value normalized cosine value and the two characteristic modular lengths of the comparison characteristics into a four-dimensional difference vector to perform linear discriminant analysis, thereby further improving the authentication accuracy.
The invention is described below in a specific embodiment:
the invention needs to be trained before authentication, the specific flow is shown in fig. 4, and the training process is as follows:
the method firstly provides a new convolution network to extract image feature vectors, and a multi-level feature fusion accumulated weighted depth convolution network (multi-level depth convolution network), and then performs feature learning on the image by utilizing a softmax network and a learning process shown in figure 3.
The network learning process mainly comprises forward calculation of the network and backward propagation of network errors.
(A) Convolutional network forward computation
A basic convolutional network (as shown in FIG. 5; note that FIG. 5 is only an example of a convolutional network and not necessarily the one used in the present invention, whose convolutional network consists of convolution, activation, downsampling, ...) includes convolution, activation and downsampling operations; for the convenience of subsequent calculations, a vectorization operation is usually also required. In FIG. 6, each layer's convolutional network stands for such a basic convolutional network, and the order and number of the operations it contains can be set according to the specific problem.
The convolution operation has different modes; the invention adopts the 'same'-form convolution, in which zero padding is applied to the input image during the operation. The feature map obtained by the 'same'-form convolution has the same size as the original input image.
According to the convolution calculation formula, when the input data is a two-dimensional image, the calculation formula of the convolution feature map elements is obtained as formula (2):
$$M^{C_k}(m,n) = I\bigl(\mathrm{neighborhood}(m,n,s_c)\bigr) *_{\mathrm{same}} c^{k} = \sum_{i=-\frac{s_c+1}{2}+1}^{\frac{s_c+1}{2}-1}\;\sum_{j=-\frac{s_c+1}{2}+1}^{\frac{s_c+1}{2}-1} I(m+i,\,n+j)\cdot c^{k}\Bigl(\tfrac{s_c+1}{2}-i,\ \tfrac{s_c+1}{2}-j\Bigr) \qquad (2)$$
wherein $c^{k}$ denotes the k-th convolution kernel of the convolution operation, $c^{k}(i,j)$ denotes the element in row i, column j of $c^{k}$, $s_c$ denotes the side length of the convolution kernel, $M^{C_k}$ denotes the convolution feature map obtained by convolving the input image $I$ with $c^{k}$, $M^{C_k}(m,n)$ denotes the element in row m, column n of $M^{C_k}$, $\mathrm{neighborhood}(m,n,s_c)$ denotes the neighborhood centered at $(m,n)$ with side length $s_c$, and $*_{\mathrm{same}}$ denotes the 'same'-form convolution operator.
When the input data are feature maps obtained through previous operations, the calculation formula of the convolution feature map elements is as shown in formula (3):
$$M^{C_k}(i,j) = \sum_{p=1}^{c_m}\;\sum_{x=-\frac{s_c+1}{2}+1}^{\frac{s_c+1}{2}-1}\;\sum_{y=-\frac{s_c+1}{2}+1}^{\frac{s_c+1}{2}-1} M^{p}(i+x,\,j+y)\cdot c^{k}_{p}\Bigl(\tfrac{s_c+1}{2}-x,\ \tfrac{s_c+1}{2}-y\Bigr) \qquad (3)$$
An activation operation is performed on the convolution feature map $M^{C_k}$ obtained by the convolution operation, i.e. $M^{C_k}$ is fed into the activation function f for mapping, as in equation (4):
$$M^{A_k}(m,n) = f\bigl(M^{C_k}(m,n)\bigr) \qquad (4)$$
wherein $M^{A_k}$ denotes the activation feature map obtained from $M^{C_k}$, and f denotes the activation function.
The present invention employs a ReLU activation function.
$$f(x) = \mathrm{ReLU}(x) = \max(0, x) \qquad (5)$$
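For illustration, a direct (unoptimized) NumPy sketch of the 'same'-form convolution of equation (2) and the activation of equations (4)-(5), for a single-channel image and one odd-sided kernel:

```python
import numpy as np

def same_convolve(image, kernel):
    """'same'-form convolution of equation (2): zero-pad the input so the
    convolution feature map keeps the original image size (single channel,
    one odd-sided kernel of side s_c)."""
    s_c = kernel.shape[0]
    r = (s_c - 1) // 2
    padded = np.pad(image, r, mode="constant")      # zero filling of the input image
    flipped = kernel[::-1, ::-1]                    # kernel flip, i.e. c^k((s_c+1)/2 - i, (s_c+1)/2 - j)
    out = np.zeros_like(image, dtype=float)
    for m in range(image.shape[0]):
        for n in range(image.shape[1]):
            patch = padded[m:m + s_c, n:n + s_c]    # neighborhood(m, n, s_c)
            out[m, n] = np.sum(patch * flipped)
    return out

def relu(x):
    """Equations (4)-(5): element-wise ReLU activation of the convolution map."""
    return np.maximum(0.0, x)
```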
A downsampling operation is performed on the activation feature map $M^{A_k}$ obtained by the activation operation; it mainly reduces the dimensionality of the features by sampling, further compressing and abstracting the image features.
The downsampling operation first divides the input data into non-overlapping $s_s \times s_s$ blocks, where $s_s$ denotes the side length of the sampling kernel, and then feeds the data of each block into a sampling function for mapping; the mapped output is the sampling value corresponding to that block, as in formula (6):
$$M^{S_k}(m,n) = s\bigl(M^{A_k}(s_s\cdot(m-1)+1 : s_s\cdot m,\ s_s\cdot(n-1)+1 : s_s\cdot n)\bigr) \qquad (6)$$
wherein $M^{S_k}$ denotes the sampling feature map obtained from $M^{A_k}$ through the sampling function, $M^{S_k}(m,n)$ denotes the element in row m, column n of $M^{S_k}$, and s denotes the sampling function. FIG. 8 illustrates the downsampling of input data of size 4 × 4 with $s_s = 2$.
The invention employs maximum value sampling.
Maximum value sampling takes the maximum value of the element values in the sampling block as the characteristic of the sampling block, as shown in formula (7):
$$s(I) = \max(I) \qquad (7)$$
In image processing, maximum-value sampling can extract the texture information of an image and, to a certain extent, maintain invariance to rotation, translation, scaling and the like; in addition, statistical experiments show that, compared with average sampling, maximum-value sampling is insensitive to changes of the data distribution, so the feature extraction is relatively stable.
After feature extraction is finished, a vectorization operation is performed on the obtained feature maps to obtain the feature vector fea, so that the features can be input into the classification network and the network parameters can be learned.
The vectorization operation is as in formula (8):
$$\mathrm{fea} = \mathop{\mathrm{concat}}\limits_{k=1\ldots K}\bigl(v(M^{S_k})\bigr) \qquad (8)$$
where v denotes stretching the non-vector data into a column vector, concat denotes concatenating the indicated vectors into a high-dimensional vector, and K denotes the total number of feature maps.
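A sketch of the maximum-value downsampling of equations (6)-(7) and the vectorization of equation (8), assuming the map size is divisible by the sampling side length $s_s$:

```python
import numpy as np

def max_downsample(M_A, s_s=2):
    """Equations (6)-(7): split the activation map into non-overlapping
    s_s x s_s blocks and keep the maximum element of each block."""
    H, W = M_A.shape
    assert H % s_s == 0 and W % s_s == 0, "map size must be divisible by s_s"
    blocks = M_A.reshape(H // s_s, s_s, W // s_s, s_s)
    return blocks.max(axis=(1, 3))

def vectorize(feature_maps):
    """Equation (8): stretch every sampled feature map into a column vector
    and concatenate them into the level feature vector fea."""
    return np.concatenate([M.reshape(-1) for M in feature_maps])
```

For the 4 × 4 input of FIG. 8 with $s_s = 2$, max_downsample returns the 2 × 2 map of block maxima.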
(B) Unified dimensional linear mapping
The image passes through several convolution, activation and downsampling operations of the convolutional networks, producing a series of feature maps. Each level's feature vector $\mathrm{fea}_i$ is mapped into the unified dimension by the matrix $W_i$, as shown in formula (9), where $n_f$ denotes the dimension of the unified-dimension feature vector and $n_i$ denotes the dimension of $\mathrm{fea}_i$:
$$f_i = W_i\,\mathrm{fea}_i,\qquad W_i \in \mathbb{R}^{n_f \times n_i} \qquad (9)$$
(C) softmax sortation network
FIG. 7 shows the basic structure of a softmax network, in which $f_i$ denotes the i-th component of the input feature vector f, $N_C$ denotes the number of categories, and $W^{id}$ denotes a linear mapping matrix.
It should be noted that when the linear mapping is implemented in network form, a linear mapping with a bias term is generally used. Since adding the bias vector can be made equivalent to multiplying a rewritten mapping matrix with a rewritten mapping vector, for ease of writing all linear mapping expressions below adopt this rewritten form, and the rewritten mapping matrix and vector keep their original variable names without the bias appearing explicitly. Here o denotes the output after the linear mapping, and $o_i$ in the figure denotes the i-th component of o.
$$o = W^{id}\cdot f \qquad (10)$$
$h_i$ denotes the i-th component of the network output value h obtained after o is activated by the softmax function:
$$h = \mathrm{softmax}(o) \qquad (11)$$
the softmax function is a nonlinear activation function adopted by the softmax network, and the expression of the softmax function is as follows:
$$\mathrm{softmax}(x) = \frac{e^{x}}{\sum_i e^{x_i}} \qquad (12)$$
As can be derived from equation (12), the softmax function is non-negative and sums to one; therefore its output value can be taken as the probability that the input data belongs to the corresponding class, that is,
$$h_i = P(\mathrm{label}_i = 1) = P(\mathrm{input} \in \mathrm{CLASS}_i) \qquad (13)$$
wherein label is the binary (one-hot) vector encoding of the original data label, indicating which person in the data set the sample belongs to, as in equation (14); $\mathrm{CLASS}_i$ denotes the class-i data set, i.e. all images of the i-th person in face recognition:
class is a classification decision given by the network according to the network output h:
$$\mathrm{class} = \arg\max_{c}(h_c) \qquad (15)$$
Identifying the face identity in an image is an image classification problem; the classification algorithm adopted by the invention is the softmax classification network, and the adopted loss function is the cross-entropy loss function, as shown in formula (16):
$$\mathrm{loss}(h, \mathrm{label}) = -\sum_{i=1}^{N_C} \bigl(\mathrm{label}_i \log(h_i)\bigr) \qquad (16)$$
wherein h is the network output value vector after the softmax function in the classification network, and label is the binary vector encoding of the original data label.
Since the network has many parameters and is prone to overfitting, the network parameters are constrained by regularization, which relieves overfitting to a certain extent. From the above, the error of the network can be expressed as equation (17):
$$J(\theta) = \mathrm{loss}(h, \mathrm{label}) + \lambda \sum \|\theta\|^{2} \qquad (17)$$
wherein J(θ) denotes the network error, λ is the regularization coefficient, and θ is the set of all learnable parameters in the feature learning network, including the convolution kernels of the convolutional network and the linear mapping matrix of the classification network, as in equation (18):
$$\theta = \{\theta_c, \theta_{id}\},\qquad \theta_c = \{c^{1}, c^{2}, \ldots, c^{K}\},\qquad \theta_{id} = W^{id} \qquad (18)$$
the learning objective of the network is to solve a set of parameters theta that minimizes the network error (17)optAs shown in (19):
$$\theta_{opt} = \arg\min_{\theta} J(\theta) \qquad (19)$$
In FIG. 6, $J(\Theta_i)$ denotes the network error computed for the i-th layer convolutional network, where $\Theta_i$ comprises the network parameters of layers 1 to i of the convolutional network and the unified-dimension linear mapping matrix $W_i$ of the current layer, as in formula (20):
$$\Theta_i = \bigcup_{k=1\ldots i}\theta_k \,\cup\, W_i \qquad (20)$$
wherein $\theta_i$ denotes the set of learnable parameters of the i-th layer convolutional network, including all learnable parameters involved in its convolution, activation and downsampling operations.
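For illustration, a NumPy sketch of the classification branch of equations (10)-(12) and (16)-(17); regularizing only $W^{id}$ here is a simplification of the full parameter set θ of equation (18).

```python
import numpy as np

def softmax(o):
    """Equation (12); subtracting the maximum is only a numerical-stability detail."""
    e = np.exp(o - np.max(o))
    return e / np.sum(e)

def classification_error(f, label_onehot, W_id, lam):
    """Equations (10)-(11) and (16)-(17): linear mapping, softmax activation,
    cross-entropy loss and an L2 regularization term (restricted to W_id)."""
    o = W_id @ f                                          # equation (10)
    h = softmax(o)                                        # equation (11)
    loss = -np.sum(label_onehot * np.log(h + 1e-12))      # equation (16); epsilon guards log(0)
    return loss + lam * np.sum(W_id ** 2)                 # equation (17), restricted to W_id
```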
(D) Multi-level feature fusion and dimensionality reduction
As shown in FIG. 6, $\mathrm{feature}_{\mathrm{merge}}$ denotes the joint feature vector formed by concatenating the unified-dimension feature vectors $f_i$ of the levels, i.e.
$$\mathrm{feature}_{\mathrm{merge}} = \mathop{\mathrm{concat}}\limits_{i=1,\ldots,4}(f_i) \qquad (21)$$
$W_T$ denotes the mapping matrix that performs the linear dimension-reduction mapping of the joint feature vector $\mathrm{feature}_{\mathrm{merge}}$, and $f_T$ denotes the comprehensive feature vector obtained from $\mathrm{feature}_{\mathrm{merge}}$ by the linear dimension-reduction mapping; it contains the feature vector information of every level of the network, as shown in formula (22), where $n_T$ denotes the dimension of $f_T$:
$$f_T = W_T\,\mathrm{feature}_{\mathrm{merge}},\qquad W_T \in \mathbb{R}^{n_T \times (4 n_f)} \qquad (22)$$
$J(\Theta_T)$ denotes the network error of the classification network assigned to the comprehensive feature vector $f_T$, where $\Theta_T$ denotes the set of all convolutional network parameters, all unified-dimension linear mapping matrices and the linear dimension-reduction mapping matrix, as in equation (23):
$$\Theta_T = \bigcup_{k=1\ldots 4}\theta_k \;\cup \bigcup_{q=1\ldots 4} W_q \;\cup\; W_T \qquad (23)$$
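A compact sketch of the fusion of equations (9), (21) and (22), with W_list holding the matrices $W_1 \ldots W_4$ and W_T the dimension-reduction matrix:

```python
import numpy as np

def fuse_levels(feas, W_list, W_T):
    """Equations (9), (21), (22): map each level feature fea_i into the unified
    dimension with W_i, concatenate the unified vectors into feature_merge, and
    reduce it to the comprehensive feature vector f_T with W_T."""
    fs = [W_i @ fea_i for W_i, fea_i in zip(W_list, feas)]   # f_i = W_i fea_i          (9)
    feature_merge = np.concatenate(fs)                        # series connection        (21)
    return W_T @ feature_merge                                # f_T = W_T feature_merge  (22)
```

Here each W_i is assumed to be an $n_f \times n_i$ array and W_T an $n_T \times (4 n_f)$ array, matching the dimensions of formulas (9) and (22).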
(E) back propagation of network errors
The invention updates the network parameters by using the BP algorithm.
According to the chain rule, the network error is propagated from back to front.
Linear mapping derivation of classification network:
The learnable parameter in the classification network of the i-th layer (i = 1...4, T) is $W^{i,id}$; according to the definition of $J(\Theta_i)$ and the chain rule of derivation:
$$\nabla W^{i,id}_{cj} = -\mathrm{label}_c \cdot (1 - h_c)\cdot f_j + 2\lambda \cdot W^{i,id}_{cj} \qquad (24)$$
At the same time, the derivative of $J(\Theta_i)$ with respect to f can be obtained:
$$\frac{\partial J(\Theta_i)}{\partial f_j} = \sum_{c=1}^{N_C}\Bigl[-\mathrm{label}_c\,(1 - h_c)\cdot W^{i,id}_{cj}\Bigr] \qquad (25)$$
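As a sketch of equations (24)-(25) for one classification branch (vectors as NumPy arrays; label_onehot is the binary label vector and h the softmax output):

```python
import numpy as np

def classifier_grads(f, h, label_onehot, W_id, lam):
    """Equations (24)-(25): update gradient of the classification mapping matrix
    W^{i,id} and the derivative of J(Theta_i) with respect to the input feature f."""
    coeff = -label_onehot * (1.0 - h)                      # -label_c * (1 - h_c)
    grad_W_id = np.outer(coeff, f) + 2.0 * lam * W_id      # equation (24)
    dJ_df = W_id.T @ coeff                                 # equation (25)
    return grad_W_id, dJ_df
```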
uniform dimension linear mapping derivation:
Each unified-dimension linear mapping matrix $W_i$ contributes to two network errors, $J(\Theta_i)$ and $J(\Theta_T)$; therefore, when the BP algorithm updates $W_i$, the update gradient of $W_i$ is formed jointly by the derivative of $J(\Theta_i)$ with respect to $W_i$ and the derivative of $J(\Theta_T)$ with respect to $W_i$. Meanwhile, during training each network error is assigned a weight. In summary, the update gradient of $W_i$ is as shown in equation (26):
$$\nabla W_i = w_i \cdot \frac{\partial J(\Theta_i)}{\partial W_i} + w_T \cdot \frac{\partial J(\Theta_T)}{\partial W_i} \qquad (26)$$
According to the chain rule of derivation:
$$\frac{\partial J(\Theta_i)}{\partial W_i} = \frac{\partial J(\Theta_i)}{\partial f_i}\cdot\frac{\partial f_i}{\partial W_i} = \frac{\partial J(\Theta_i)}{\partial f_i}\times \mathrm{fea}_i^{\mathrm{T}} \qquad (27)$$
$$\frac{\partial J(\Theta_T)}{\partial W_i} = \frac{\partial J(\Theta_T)}{\partial f_T}\cdot\frac{\partial f_T}{\partial \mathrm{feature}_{\mathrm{merge}}}\cdot\frac{\partial \mathrm{feature}_{\mathrm{merge}}}{\partial f_i}\cdot\frac{\partial f_i}{\partial W_i} = \Bigl[W_T^{\mathrm{T}} \times \frac{\partial J(\Theta_T)}{\partial f_T}\Bigr]_{n_f(i-1)+1\,:\,n_f\cdot i} \times \mathrm{fea}_i^{\mathrm{T}} \qquad (28)$$
Combining the two terms gives:
$$\nabla W_i = w_i\cdot\frac{\partial J(\Theta_i)}{\partial f_i}\cdot fea_i^{T}+w_T\cdot\left[W_T^{T}\,\frac{\partial J(\Theta_T)}{\partial f_T}\right]_{n_f(i-1)+1:\,n_f\cdot i}\cdot fea_i^{T}\qquad(29)$$
Derivation for the linear dimension-reduction mapping of the comprehensive feature layer:
The linear dimension-reduction mapping matrix W_T affects only J(Θ_T), so the chain rule directly gives:
$$\nabla W_T = w_T\cdot\frac{\partial J(\Theta_T)}{\partial W_T}=w_T\cdot\frac{\partial J(\Theta_T)}{\partial f_T}\cdot\frac{\partial f_T}{\partial W_T}=w_T\left(\frac{\partial J(\Theta_T)}{\partial f_T}\times feature_{merge}^{T}\right)\qquad(30)$$
Meanwhile, the derivative with respect to the input feature vector of each level's uniform-dimension linear mapping is calculated as follows:
$$\frac{\partial J(\Theta_i)}{\partial fea_i}=\frac{\partial J(\Theta_i)}{\partial f_i}\cdot\frac{\partial f_i}{\partial fea_i}\quad(i=1,2,3,4,T)\qquad(31)$$
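For illustration only, the following NumPy fragment sketches how the weighted gradient accumulation of equations (26)-(29) could be computed for the uniform-dimension mapping matrices; the variable names, shapes and calling convention are assumptions made here and are not part of the invention.

```python
import numpy as np

def grad_uniform_mappings(fea, dJ_dfi, dJ_dfT, W_T, w, w_T, n_f):
    """Weighted gradients of the uniform-dimension mapping matrices W_i.

    fea    : list of level feature vectors fea_i, each of shape (d_i, 1)
    dJ_dfi : list of per-level branch gradients dJ(Theta_i)/df_i, each of shape (n_f, 1)
    dJ_dfT : gradient dJ(Theta_T)/df_T of the comprehensive branch, shape (n_T, 1)
    W_T    : linear dimension-reduction mapping matrix, shape (n_T, 4*n_f)
    """
    dJ_dmerge = W_T.T @ dJ_dfT                                     # back-propagated to the joint feature vector
    grads = []
    for i, (fea_i, g_i) in enumerate(zip(fea, dJ_dfi)):
        branch_term = g_i @ fea_i.T                                # equation (27)
        merge_term = dJ_dmerge[i * n_f:(i + 1) * n_f] @ fea_i.T    # level-i slice, equation (28)
        grads.append(w[i] * branch_term + w_T * merge_term)        # equation (29)
    return grads
```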
Derivation for the convolutional network parameters:
The only learnable parameters in the convolutional network are the convolution kernels used in the convolution operations; therefore, the update gradient of J(Θ_i) with respect to each level's convolution kernel c must be calculated. According to the chain rule:
$$\nabla c^{k}(i,j)=\frac{\partial J(\theta)}{\partial c^{k}(i,j)}=\frac{\partial J(\theta)}{\partial M^{C}}\cdot\frac{\partial M^{C}}{\partial c^{k}(i,j)}=\frac{\partial J(\theta)}{\partial M^{Ck}}\cdot\frac{\partial M^{Ck}}{\partial c^{k}(i,j)}=\mathrm{sum}\!\left(\frac{\partial J(\theta)}{\partial M^{Ck}}\cdot I(s_c-i+1:M_I+i-s_c,\ s_c-j+1:N_I+j-s_c)\right)\qquad(32)$$
wherein,
$$\frac{\partial J(\theta)}{\partial M^{C}}=\frac{\partial J(\theta)}{\partial M^{A}}\cdot\frac{\partial M^{A}}{\partial M^{C}}=\frac{\partial J(\theta)}{\partial M^{A}}\cdot\frac{\partial \mathrm{ReLU}(M^{C})}{\partial M^{C}}=\frac{\partial J(\theta)}{\partial M^{A}}\cdot(M^{C}>0)\qquad(33)$$
$$\frac{\partial J(\theta)}{\partial M^{A}}=\mathrm{upsample}\!\left(\frac{\partial J(\theta)}{\partial M^{S}}\right)=\mathrm{kron}\!\left(\frac{\partial J(\theta)}{\partial M^{S}},\,E(s_s,s_s)\right)\cdot \mathrm{location}(M^{S})\qquad(34)$$
$$E(m,n)=\begin{bmatrix}1&\cdots&1\\ \vdots&\ddots&\vdots\\ 1&\cdots&1\end{bmatrix}_{m\times n}\qquad(35)$$
$$\mathrm{kron}(A,B)=\begin{bmatrix}a_{11}B&a_{12}B&\cdots&a_{1N}B\\ a_{21}B&a_{22}B&\cdots&a_{2N}B\\ \vdots&\vdots&\ddots&\vdots\\ a_{M1}B&a_{M2}B&\cdots&a_{MN}B\end{bmatrix},\quad A\in\mathbb{R}^{M\times N}\qquad(36)$$
where location(M^S) denotes the binary matrix that marks, within M^A, the positions from which the values of M^S were taken (the max-sampling location mask). The gradient with respect to a sampling feature map is obtained by reshaping the corresponding part of the feature-vector gradient:
$$\frac{\partial J(\theta)}{\partial M^{S}}=\mathrm{reshape}\!\left(\frac{\partial J(\theta)}{\partial fea},\,\mathrm{size}(M^{S})\right)\qquad(38)$$
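As an illustration of equations (32)-(38), the sketch below shows how the pooled-layer gradient could be spread back through the maximum-value sampling and the ReLU, and how a convolution-kernel gradient could be accumulated. It is a simplified sketch under explicit assumptions: the kernel gradient is written as a plain valid cross-correlation (for the 'same' convolution used by the invention the input map is assumed to be zero-padded beforehand), and all names are illustrative.

```python
import numpy as np

def upsample_pool_grad(dJ_dMS, loc_mask, ss):
    """Equation (34): kron(., E(ss, ss)) repeats each pooled gradient over its ss x ss
    window; the binary location mask keeps only the positions that produced the maxima."""
    return np.kron(dJ_dMS, np.ones((ss, ss))) * loc_mask

def relu_backward(dJ_dMA, MC):
    """Equation (33): pass the gradient only where the convolution output was positive."""
    return dJ_dMA * (MC > 0)

def kernel_grad(dJ_dMC, I_in, kh, kw):
    """Equation (32), sketched as a valid cross-correlation of the input map with the
    error map; I_in is assumed to be already padded if 'same' convolution was used."""
    g = np.zeros((kh, kw))
    h, w = dJ_dMC.shape
    for i in range(kh):
        for j in range(kw):
            g[i, j] = np.sum(dJ_dMC * I_in[i:i + h, j:j + w])
    return g
```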
To illustrate the principle of feature learning with the multi-level feature-fusion cumulative weighted deep convolutional network, the specific algorithm is given in Table 1.
Table 1 shows the feature-learning procedure of the cumulative weighted deep convolutional network with multi-level feature fusion.
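Table 1 itself is not reproduced here. As a self-contained illustration only, the following NumPy sketch shows the forward pass and the weighted joint error that the training procedure minimizes; all dimensions, weights and the random stand-in level features are assumptions made for the example, not the algorithm of Table 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and stand-in level features (not the patent's actual sizes)
n_levels, n_f, n_T, n_cls = 4, 64, 128, 10
fea = [rng.standard_normal((d, 1)) for d in (400, 300, 200, 100)]
label = np.zeros((n_cls, 1)); label[3] = 1.0                         # one-hot label of the sample

W   = [0.01 * rng.standard_normal((n_f, f.shape[0])) for f in fea]   # uniform-dimension mappings
W_T = 0.01 * rng.standard_normal((n_T, n_levels * n_f))              # linear dimension-reduction mapping
Wc  = [0.01 * rng.standard_normal((n_cls, n_f)) for _ in range(n_levels)]  # per-level softmax branches
WcT = 0.01 * rng.standard_normal((n_cls, n_T))                       # comprehensive-feature branch

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Forward pass: per-level classification branches plus the fused comprehensive branch
f = [W[i] @ fea[i] for i in range(n_levels)]                         # uniform-dimension feature vectors
feature_merge = np.vstack(f)                                         # joint feature vector
f_T = W_T @ feature_merge                                            # comprehensive feature vector
outs = [softmax(Wc[i] @ f[i]) for i in range(n_levels)] + [softmax(WcT @ f_T)]

# Cross-entropy errors of the five classification networks, combined with error weights
weights = [0.2, 0.2, 0.2, 0.2, 1.0]                                  # illustrative w_1..w_4 and w_T
J = sum(w * float(-(label * np.log(o + 1e-12)).sum()) for w, o in zip(weights, outs))
print("weighted joint error:", J)
```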
The authentication process of the present invention can be performed as follows:
(I) image preprocessing
The invention adopts a cascade-Adaboost-based face detection algorithm to detect the face in an image, and then uses an SDM-based facial feature point localization algorithm to locate the feature points of the detected face. The face is corrected, normalized and aligned by image scaling, rotation and translation, finally yielding a face image of size 100 × 100 in which the image coordinates of the left eye are (30, 30) and those of the right eye are (30, 70), as shown in Figure 3.
The invention adopts a simple gray-level normalization preprocessing, shown in formula (1) below, where I(i, j) denotes the gray value of the image at pixel (i, j). The main purpose of gray-level normalization is to let the network process continuous data and avoid handling large discrete gray values, which could otherwise cause abnormal behaviour.
$$I(i,j)=\frac{I(i,j)}{256}\qquad(1)$$
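A minimal sketch of formula (1), assuming the aligned 100 × 100 8-bit face image described above:

```python
import numpy as np

def normalize_gray(face_img):
    """Formula (1): scale 8-bit gray values into [0, 1) so the network handles
    continuous data instead of large discrete gray values."""
    return face_img.astype(np.float32) / 256.0

# e.g. normalized = normalize_gray(aligned_face)   # aligned_face: 100 x 100 uint8 array
```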
(II) feature extraction
Image feature extraction using trained networks
After training of the cumulative weighted deep convolutional network with multi-level feature fusion is completed, the trained network can be used to extract features of the input image, as shown in Table 2:
(III) feature comparison
(I) Absolute value normalized cosine value
The invention proposes an absolute-value normalized cosine value (cos_AN), defined in formula (39):
$$\cos_{AN}(x,y)=\cos(\hat{x},\hat{y})\qquad(39)$$
wherein,
$$\hat{x}_i=\frac{x_i}{|x_i|+|y_i|},\qquad \hat{y}_i=\frac{y_i}{|x_i|+|y_i|}\qquad(40)$$
Experiments show that the absolute-value normalized cosine value is sensitive to differences in the norms of the compared vectors, so it remedies the incomplete description of dissimilarity that arises when cosine similarity ignores the difference in vector norms.
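A minimal NumPy sketch of formulas (39)-(40) follows; the small epsilon guarding against division by zero is an addition made here and is not part of the formulas.

```python
import numpy as np

def cos_sim(x, y):
    """Ordinary cosine similarity between two 1-D feature vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def cos_an(x, y):
    """Absolute-value normalized cosine of formulas (39)-(40): each coordinate is first
    divided by |x_i| + |y_i|, which makes the measure sensitive to norm differences."""
    s = np.abs(x) + np.abs(y) + 1e-12    # epsilon added as a safeguard, not in the original formula
    return cos_sim(x / s, y / s)
```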
(II) LDA-based multi-difference fusion comparison algorithm
The invention combines the cosine similarity, the absolute-value normalized cosine value and the norms of the two compared feature vectors into a four-dimensional difference vector, namely
$$f_{diff}(f_{T1},f_{T2})=\big[\cos(f_{T1},f_{T2}),\ \cos_{AN}(f_{T1},f_{T2}),\ |f_{T1}|,\ |f_{T2}|\big]^{T}\qquad(41)$$
The four-dimensional difference vector is then fused into a one-dimensional similarity quantity by LDA (linear discriminant analysis); that is, the difference-vector mapping matrix W_LDA maps the four-dimensional difference vector to a one-dimensional value.
$$sim(f_{T1},f_{T2})=W_{LDA}\,f_{diff}\qquad(42)$$
where W_LDA denotes the mapping vector obtained with LDA.
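A minimal sketch of formulas (41)-(42) and of the threshold decision described in claim 5, reusing cos_sim and cos_an from the sketch above; w_lda and the comparison threshold are assumed to have been obtained beforehand from LDA training.

```python
import numpy as np

def similarity_score(f1, f2, w_lda):
    """Four-dimensional difference vector of formula (41), fused into a single
    comparison score by the LDA mapping vector of formula (42)."""
    d = np.array([cos_sim(f1, f2), cos_an(f1, f2),
                  np.linalg.norm(f1), np.linalg.norm(f2)])
    return float(w_lda @ d)

# Authentication decision: accept when the score exceeds the comparison threshold.
# accepted = similarity_score(feat_probe, feat_template, w_lda) > threshold
```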
The technical scheme of the embodiment of the invention has the following beneficial effects:
In this embodiment, the cumulative weighted deep convolutional network with multi-level feature fusion is used for feature learning and feature extraction, and the features of two face images are then compared with the LDA-based multi-difference fusion comparison algorithm. This offers five advantages. First, the invention uses the convolutional network to learn and extract features automatically, avoiding the drawbacks of hand-crafted features. Second, the joint training with multiple classification networks avoids the gradient dispersion problem. Third, multi-level feature fusion increases the feature richness of the image, overcoming the shortcoming of ordinary deep networks, which cannot fully exploit the features of every level and describe the image with high-level features only. Fourth, weighting the errors of the multiple classification networks increases the flexibility of network learning. Fifth, the LDA-based multi-difference fusion comparison algorithm solves the problem that cosine similarity alone describes the difference between feature vectors incompletely. When tested on the FERET database, the four sub-sets Fb, Fc, DupI and DupII achieve verification rates of 99.9%, 100%, 98.8% and 99.6%, respectively (at a false positive rate of 0.1%).
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A method for face authentication, comprising:
sequentially extracting a plurality of levels of feature vectors from the face image to be authenticated and the face image template by using a multi-level deep convolutional network which is subjected to multi-level classification network joint training in advance;
mapping the feature vectors of multiple levels into uniform dimension feature vectors sequentially through a uniform dimension linear mapping matrix;
the unified dimension feature vectors are connected in series to form a combined feature vector;
carrying out dimensionality reduction mapping on the combined eigenvector through a linear dimensionality reduction mapping matrix to obtain a comprehensive eigenvector;
and comparing and authenticating the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template by using the absolute value normalized cosine value through linear discriminant analysis.
2. The method of claim 1, wherein before the extracting feature vectors of multiple levels in sequence from the face image to be authenticated and the face image template by using a multi-level deep convolutional network that is jointly trained by a multi-level classification network in advance, the method further comprises:
and preprocessing the face image to be authenticated and the face image template, wherein the preprocessing comprises characteristic point positioning, image correction and normalization processing.
3. The method for authenticating the human face according to claim 1, wherein the feature vector of each level is calculated by the following steps:
carrying out convolution operation on the face image to be authenticated and the face image template by using a convolution kernel to obtain a convolution characteristic graph, wherein the convolution operation is same convolution operation;
activating the convolution characteristic diagram by using an activation function to obtain an activation characteristic diagram, wherein the activation function is a ReLU activation function;
performing downsampling operation on the activation characteristic graph by using a sampling function to obtain a sampling characteristic graph, wherein the downsampling operation is maximum value sampling;
repeating the steps on the obtained sampling characteristic diagram to obtain a new sampling characteristic diagram, and repeating the steps for a plurality of times;
vectorizing all the obtained sampling feature maps to obtain a feature vector of each level.
4. The method for authenticating the human face according to any one of claims 1 to 3, wherein the multi-level deep convolutional network is obtained by joint training of a softmax classification network, and the training step comprises the following steps:
sequentially extracting a plurality of levels of feature vectors from the face image sample by using an initialized multi-level depth convolution network;
mapping the feature vectors of multiple levels into uniform dimension feature vectors with the same dimension sequentially through a uniform dimension linear mapping matrix;
respectively mapping the uniform dimension characteristic vectors by using a linear mapping matrix in the softmax classification network to obtain mapping vectors;
activating the mapping vector by using a softmax function to obtain a network output value vector;
calculating a network error through a cross entropy loss function by taking the network output value vector and the label data of the face image sample as input quantities;
all the uniform dimension feature vectors are connected in series to form a combined feature vector;
carrying out dimensionality reduction mapping on the combined eigenvector through a linear dimensionality reduction mapping matrix to obtain a comprehensive eigenvector;
distributing weight to the network error, and calculating the update gradients of the linear mapping matrix, the uniform dimension linear mapping matrix, the linear dimensionality reduction mapping matrix and the convolution kernel;
iteratively updating the linear mapping matrix, the uniform dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel by using the updating gradients of the linear mapping matrix, the uniform dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel;
and judging whether the network error and the iteration frequency meet the requirements, if so, ending, otherwise, turning to the step of using the initialized multi-level depth convolution network for the face image sample to sequentially extract the feature vectors of a plurality of levels.
5. The method for authenticating the human face according to any one of claims 1 to 3, wherein the comparing and authenticating the obtained comprehensive feature vector of the human face image to be authenticated and the comprehensive feature vector of the human face image template by using the absolute value normalized cosine value through the linear discriminant analysis comprises:
performing cosine similarity operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain cosine similarity;
carrying out absolute value normalization cosine operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain an absolute value normalization cosine value;
performing modular operation on the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template to obtain a first modular length and a second modular length;
forming a four-dimensional difference vector by the cosine similarity, the absolute value normalized cosine value, the first modular length and the second modular length;
mapping the difference vector by using a difference vector mapping matrix to obtain a one-dimensional vector as a comparison score;
and comparing the comparison score with a comparison threshold, and if the comparison score is greater than the comparison threshold, the face authentication is passed.
6. An apparatus for face authentication, comprising:
the first extraction module is used for sequentially extracting a plurality of levels of feature vectors by using a multi-level depth convolution network which is subjected to multi-level classification network joint training in advance for the face image to be authenticated and the face image template;
the first mapping module is used for sequentially mapping the characteristic vectors of a plurality of levels into uniform dimension characteristic vectors through a uniform dimension linear mapping matrix;
the first serial module is used for serially connecting the uniform dimension feature vectors into a combined feature vector;
the second mapping module is used for carrying out dimension reduction mapping on the combined feature vector through a linear dimension reduction mapping matrix to obtain a comprehensive feature vector;
and the first comparison module is used for comparing and authenticating the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template by utilizing the absolute value normalized cosine value through linear discriminant analysis.
7. The apparatus for face authentication according to claim 6, wherein the first extraction module further comprises:
and the preprocessing module is used for preprocessing the face image to be authenticated and the face image template, and the preprocessing comprises characteristic point positioning, image correction and normalization processing.
8. The apparatus for authenticating a human face according to claim 6, wherein the feature vector of each hierarchy is calculated by:
the convolution unit is used for carrying out convolution operation on the face image to be authenticated and the face image template by using convolution kernel to obtain a convolution feature map, wherein the convolution operation is same convolution operation;
the activation unit is used for performing activation operation on the convolution characteristic diagram by using an activation function to obtain an activation characteristic diagram, wherein the activation function is a ReLU activation function;
the sampling unit is used for carrying out downsampling operation on the activation characteristic diagram by using a sampling function to obtain a sampling characteristic diagram, and the downsampling operation is maximum value sampling;
the circulation unit is used for repeating the steps on the obtained sampling characteristic diagram to obtain a new sampling characteristic diagram, and repeating the steps for a plurality of times;
and the first vectorization unit is used for vectorizing all the obtained sampling feature maps to obtain the feature vector of each level.
9. The apparatus for authenticating human face according to any one of claims 6 to 8, wherein the multi-level deep convolutional network is obtained by joint training of softmax classification networks, and comprises:
the second extraction module is used for sequentially extracting a plurality of levels of feature vectors from the face image sample by using the initialized multi-level depth convolution network;
the third mapping module is used for sequentially mapping the characteristic vectors of a plurality of levels into uniform dimension characteristic vectors with the same dimension through a uniform dimension linear mapping matrix;
the fourth mapping module is used for respectively mapping the uniform dimension characteristic vectors by using linear mapping matrixes in the softmax classification network to obtain mapping vectors;
the activation module is used for activating the mapping vector by using a softmax function to obtain a network output value vector;
the first calculation module is used for calculating a network error through a cross entropy loss function by taking the network output value vector and the label data of the face image sample as input quantities;
the second serial module is used for serially connecting all the uniform dimension feature vectors into a combined feature vector;
the fifth mapping module is used for carrying out dimensionality reduction mapping on the combined eigenvector through a linear dimensionality reduction mapping matrix to obtain a comprehensive eigenvector;
the second calculation module is used for distributing weight to the network error and calculating the linear mapping matrix, the uniform dimension linear mapping matrix, the linear dimensionality reduction mapping matrix and the update gradient of the convolution kernel;
the updating module is used for carrying out iterative updating on the linear mapping matrix, the uniform dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel by utilizing the updating gradients of the linear mapping matrix, the uniform dimension linear mapping matrix, the linear dimension reduction mapping matrix and the convolution kernel;
and the judging module is used for judging whether the network error and the iteration frequency meet the requirements or not, if so, ending, and otherwise, turning to the second extracting module.
10. The apparatus for authenticating a human face according to any one of claims 6 to 8, wherein the first comparison module comprises:
the first calculation unit is used for carrying out cosine similarity operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain cosine similarity;
the second calculation unit is used for carrying out absolute value normalization cosine operation by taking the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template as input quantities to obtain an absolute value normalization cosine value;
the third calculation unit is used for performing modular operation on the obtained comprehensive characteristic vector of the face image to be authenticated and the comprehensive characteristic vector of the face image template to obtain a first modular length and a second modular length;
the second direction quantization unit is used for forming a four-dimensional difference vector by the cosine similarity, the absolute value normalized cosine value, the first modular length and the second modular length;
the mapping unit is used for mapping the difference vector by using a difference vector mapping matrix to obtain a one-dimensional vector as a comparison score;
and the comparison unit is used for comparing the comparison score with the comparison threshold, and if the comparison score is greater than the comparison threshold, the face authentication is passed.
CN201510490244.7A 2015-08-11 2015-08-11 The method and apparatus of face authentication Active CN105138973B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510490244.7A CN105138973B (en) 2015-08-11 2015-08-11 The method and apparatus of face authentication

Publications (2)

Publication Number Publication Date
CN105138973A true CN105138973A (en) 2015-12-09
CN105138973B CN105138973B (en) 2018-11-09

Family

ID=54724317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510490244.7A Active CN105138973B (en) 2015-08-11 2015-08-11 The method and apparatus of face authentication

Country Status (1)

Country Link
CN (1) CN105138973B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080144941A1 (en) * 2006-12-18 2008-06-19 Sony Corporation Face recognition apparatus, face recognition method, gabor filter application apparatus, and computer program
CN101079103A (en) * 2007-06-14 2007-11-28 上海交通大学 Human face posture identification method based on sparse Bayesian regression
WO2015101080A1 (en) * 2013-12-31 2015-07-09 北京天诚盛业科技有限公司 Face authentication method and device
CN104200224A (en) * 2014-08-28 2014-12-10 西北工业大学 Valueless image removing method based on deep convolutional neural networks
CN104268524A (en) * 2014-09-24 2015-01-07 朱毅 Convolutional neural network image recognition method based on dynamic adjustment of training targets
CN104616032A (en) * 2015-01-30 2015-05-13 浙江工商大学 Multi-camera system target matching method based on deep-convolution neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
许可: "卷积神经网络在图像识别上的应用的研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740808A (en) * 2016-01-28 2016-07-06 北京旷视科技有限公司 Human face identification method and device
CN105740808B (en) * 2016-01-28 2019-08-09 北京旷视科技有限公司 Face identification method and device
US11244191B2 (en) 2016-02-17 2022-02-08 Intel Corporation Region proposal for image regions that include objects of interest using feature maps from multiple layers of a convolutional neural network model
CN108475331B (en) * 2016-02-17 2022-04-05 英特尔公司 Method, apparatus, system and computer readable medium for object detection
CN108475331A (en) * 2016-02-17 2018-08-31 英特尔公司 Use the candidate region for the image-region for including interested object of multiple layers of the characteristic spectrum from convolutional neural networks model
CN106022215A (en) * 2016-05-05 2016-10-12 北京海鑫科金高科技股份有限公司 Face feature point positioning method and device
CN106022215B (en) * 2016-05-05 2019-05-03 北京海鑫科金高科技股份有限公司 Man face characteristic point positioning method and device
CN106067096A (en) * 2016-06-24 2016-11-02 北京邮电大学 A kind of data processing method, Apparatus and system
CN106067096B (en) * 2016-06-24 2019-09-17 北京邮电大学 A kind of data processing method, apparatus and system
CN106407982B (en) * 2016-09-23 2019-05-14 厦门中控智慧信息技术有限公司 A kind of data processing method and equipment
CN107871101A (en) * 2016-09-23 2018-04-03 北京眼神科技有限公司 A kind of method for detecting human face and device
CN106407982A (en) * 2016-09-23 2017-02-15 厦门中控生物识别信息技术有限公司 Data processing method and equipment
CN106503669B (en) * 2016-11-02 2019-12-10 重庆中科云丛科技有限公司 Training and recognition method and system based on multitask deep learning network
CN106503669A (en) * 2016-11-02 2017-03-15 重庆中科云丛科技有限公司 A kind of based on the training of multitask deep learning network, recognition methods and system
CN107066934A (en) * 2017-01-23 2017-08-18 华东交通大学 Tumor stomach cell image recognition decision maker, method and tumor stomach section identification decision equipment
CN106960185A (en) * 2017-03-10 2017-07-18 陕西师范大学 The Pose-varied face recognition method of linear discriminant depth belief network
CN106960185B (en) * 2017-03-10 2019-10-25 陕西师范大学 The Pose-varied face recognition method of linear discriminant deepness belief network
CN106934373A (en) * 2017-03-14 2017-07-07 重庆文理学院 A kind of library book damages assessment method and system
CN108628868A (en) * 2017-03-16 2018-10-09 北京京东尚科信息技术有限公司 File classification method and device
CN107133220A (en) * 2017-06-07 2017-09-05 东南大学 Name entity recognition method in a kind of Geography field
CN107133220B (en) * 2017-06-07 2020-11-24 东南大学 Geographic science field named entity identification method
CN107622282A (en) * 2017-09-21 2018-01-23 百度在线网络技术(北京)有限公司 Image verification method and apparatus
CN108764207B (en) * 2018-06-07 2021-10-19 厦门大学 Face expression recognition method based on multitask convolutional neural network
CN108764207A (en) * 2018-06-07 2018-11-06 厦门大学 A kind of facial expression recognizing method based on multitask convolutional neural networks
TWI689285B (en) * 2018-11-15 2020-04-01 國立雲林科技大學 Facial symmetry detection method and system thereof
US10846518B2 (en) 2018-11-28 2020-11-24 National Yunlin University Of Science And Technology Facial stroking detection method and system thereof
CN109886335A (en) * 2019-02-21 2019-06-14 厦门美图之家科技有限公司 Disaggregated model training method and device
CN109885578A (en) * 2019-03-12 2019-06-14 西北工业大学 Data processing method, device, equipment and storage medium
CN109885578B (en) * 2019-03-12 2021-08-13 西北工业大学 Data processing method, device, equipment and storage medium
TWI727548B (en) * 2019-03-22 2021-05-11 大陸商北京市商湯科技開發有限公司 Method for face recognition and device thereof
CN110793525A (en) * 2019-11-12 2020-02-14 深圳创维数字技术有限公司 Vehicle positioning method, apparatus and computer-readable storage medium
CN111626889A (en) * 2020-06-02 2020-09-04 小红书科技有限公司 Method and device for predicting categories corresponding to social content
CN113158908A (en) * 2021-04-25 2021-07-23 北京华捷艾米科技有限公司 Face recognition method and device, storage medium and electronic equipment
WO2022263452A1 (en) 2021-06-15 2022-12-22 Trinamix Gmbh Method for authenticating a user of a mobile device
CN114359034A (en) * 2021-12-24 2022-04-15 北京航空航天大学 Method and system for generating face picture based on hand drawing
CN114359034B (en) * 2021-12-24 2023-08-08 北京航空航天大学 Face picture generation method and system based on hand drawing

Also Published As

Publication number Publication date
CN105138973B (en) 2018-11-09

Similar Documents

Publication Publication Date Title
CN105138973B (en) The method and apparatus of face authentication
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN111695467B (en) Spatial spectrum full convolution hyperspectral image classification method based on super-pixel sample expansion
CN109063724B (en) Enhanced generation type countermeasure network and target sample identification method
CN105678284B (en) A kind of fixed bit human body behavior analysis method
CN103605972B (en) Non-restricted environment face verification method based on block depth neural network
CN109993236B (en) One-shot Simese convolutional neural network-based small-sample Manchu matching method
CN107480261A (en) One kind is based on deep learning fine granularity facial image method for quickly retrieving
EP3029606A2 (en) Method and apparatus for image classification with joint feature adaptation and classifier learning
CN108427921A (en) A kind of face identification method based on convolutional neural networks
CN109063649B (en) Pedestrian re-identification method based on twin pedestrian alignment residual error network
CN110188827A (en) A kind of scene recognition method based on convolutional neural networks and recurrence autocoder model
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN106339753A (en) Method for effectively enhancing robustness of convolutional neural network
CN115410059B (en) Remote sensing image part supervision change detection method and device based on contrast loss
CN114913379B (en) Remote sensing image small sample scene classification method based on multitasking dynamic contrast learning
CN113887661A (en) Image set classification method and system based on representation learning reconstruction residual analysis
CN115311502A (en) Remote sensing image small sample scene classification method based on multi-scale double-flow architecture
CN113920472A (en) Unsupervised target re-identification method and system based on attention mechanism
CN113269224A (en) Scene image classification method, system and storage medium
CN117690178B (en) Face image recognition method and system based on computer vision
CN117315381B (en) Hyperspectral image classification method based on second-order biased random walk
CN114639000A (en) Small sample learning method and device based on cross-sample attention aggregation
CN112364974B (en) YOLOv3 algorithm based on activation function improvement
CN117671666A (en) Target identification method based on self-adaptive graph convolution neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 100085, 1 floor 8, 1 Street, ten Street, Haidian District, Beijing.

Patentee after: Beijing Eyes Intelligent Technology Co.,Ltd.

Address before: 100085, 1 floor 8, 1 Street, ten Street, Haidian District, Beijing.

Patentee before: BEIJING TECHSHINO TECHNOLOGY Co.,Ltd.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220401

Address after: 071800 Beijing Tianjin talent home (Xincheng community), West District, Xiongxian Economic Development Zone, Baoding City, Hebei Province

Patentee after: BEIJING EYECOOL TECHNOLOGY Co.,Ltd.

Patentee after: Beijing Eyes Intelligent Technology Co.,Ltd.

Address before: 100085, 1 floor 8, 1 Street, ten Street, Haidian District, Beijing.

Patentee before: Beijing Eyes Intelligent Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method and device for face authentication

Effective date of registration: 20220614

Granted publication date: 20181109

Pledgee: China Construction Bank Corporation Xiongxian sub branch

Pledgor: BEIJING EYECOOL TECHNOLOGY Co.,Ltd.

Registration number: Y2022990000332