A cross-age face verification method based on feature learning
Technical field
The present invention relates to face verification methods, and in particular to a face verification method that works across ages, especially a cross-age face verification method based on feature learning.
Background art
The face, as the most distinctive region of a person for identification, is widely used for identity recognition in various settings. In general, face recognition comprises four steps: face image acquisition and detection, face image preprocessing, face feature extraction, and face matching and verification. Conventional methods usually represent the face data with manually designed feature descriptors, such as LBP, SIFT and Gabor, and measure the similarity of a pair of images with the cosine distance in order to make the verification decision.
However, as people age, their faces inevitably change. In some situations only photos of a person from a different age period are available, for example photos taken more than ten years earlier, which must be compared against the portraits of candidate persons and against existing clues; to achieve this, cross-age face verification is required. Cross-age face verification means that, given facial images from different age periods, one determines whether these facial images belong to the same person. If a face verification method can cope with the changes a face undergoes with age, it will have broad application prospects in fields such as archive management systems, security authentication systems, criminal identification in public security systems, and monitoring in banks and customs.
To achieve cross-age verification, most traditional methods model the age explicitly and perform cross-age face verification by designing a face aging model. However, such methods usually rely on prior knowledge, such as the actual age of each individual, and not all data sets can provide age information.
Deep learning methods simulate the hierarchical processing structure of the human brain and capture the rich internal information of data with a concise representation. A deep network is a highly nonlinear model with very strong fitting and learning capacity; its expressive power is stronger, so it can better characterize the rich internal information of the data. A deep network can learn features from data in an unsupervised manner, a mode that is consistent with the mechanisms by which humans perceive the world, and the features learned by deep learning methods often carry a certain amount of semantic meaning. Chinese invention patent application CN104573679A discloses a face recognition system based on deep learning in a surveillance scene, comprising a video acquisition unit, a face detection unit, a matching display unit and a storage unit, wherein the detection unit is provided with a face discrimination module that performs face discrimination with a neural network model built by a deep learning module; the model has 5 neural network layers. That method attempts to achieve fast face recognition with deep learning, but a 5-layer neural network can hardly reach this goal, and accordingly the disclosure fails to provide any evaluation data on recognition accuracy.
For feature extraction, the most critical step in face verification, there are currently two main problems:
1. The monotony of facial images. In the large face data sets currently available, the facial images are often rather monotonous, and most existing methods work on a single scale, so the extracted features are often not rich enough to characterize the face.
2. Another problem that deserves attention is how the features are obtained. Traditional face verification uses hand-designed features; such features are highly task-specific, but they are typically low-level, often contain no semantic information, and generalize poorly. With the arrival of the big data era and the ever-growing volume of data, how to obtain features automatically has become a topic worth studying.
Summary of the invention
The object of the invention is to provide a cross-age face verification method based on feature learning, so as to improve the accuracy of cross-age face verification.
To achieve the above object of the invention, the technical solution adopted by the present invention is a cross-age face verification method based on feature learning, which includes the following steps:
(1) obtaining two facial images to be compared;
(2) aligning the two facial images using a facial landmark localization method;
(3) performing feature extraction on each image respectively, the feature extraction being:
1. automatically extracting high-level semantic features with a deep convolutional neural network;
2. computing the LBP histogram feature of the image;
3. fusing the features obtained in 1. and 2. to obtain the feature of the image, expressed as a feature vector; the fusion method used here is to concatenate the two feature vectors into a single feature vector;
(4) computing the distance between the feature vectors of the two images obtained in step (3) with the cosine similarity method, and judging accordingly whether the two images come from the same person.
In a preferred technical solution, in step (2) the alignment is performed using the Flandmark method.
In the above technical solution, the deep convolutional neural network consists of the following layers in order: an input layer, a first convolutional layer, a first max pooling layer, a second convolutional layer, a second max pooling layer, a third convolutional layer, a third max pooling layer, a fourth convolutional layer, a fully connected layer, and an output layer.
The first convolutional layer has 20 filters with a kernel size of 4 × 4; the first max pooling layer has a pooling stride of 2; the second convolutional layer has 40 filters with a kernel size of 3 × 3; the second max pooling layer has a pooling stride of 2; the third convolutional layer has 60 filters with a kernel size of 4 × 4; the third max pooling layer has a pooling stride of 2; and the fourth convolutional layer has 60 filters with a kernel size of 3 × 3.
In the above technical solution, the output layer is provided with a K-class softmax classifier, where K is the number of classes.
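To make the layer specification above concrete, the following minimal sketch shows one possible realization of this nine-layer network. It is written in PyTorch purely for illustration (the embodiment below runs in Matlab); the input size of 55 × 47 is taken from the embodiment, while the single input channel, the ReLU activations, and the 160-dimensional fully connected layer are assumptions not fixed by the text.

```python
import torch
import torch.nn as nn

class CrossAgeDCNN(nn.Module):
    """Sketch of the nine-layer DCNN described above (assumptions: PyTorch,
    single-channel 55x47 input, 'valid' convolutions, ReLU, 160-dim embedding)."""
    def __init__(self, num_classes: int, embed_dim: int = 160):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 20, kernel_size=4), nn.ReLU(),   # first conv layer: 20 kernels, 4x4
            nn.MaxPool2d(kernel_size=2, stride=2),        # first max pooling, stride 2
            nn.Conv2d(20, 40, kernel_size=3), nn.ReLU(),  # second conv layer: 40 kernels, 3x3
            nn.MaxPool2d(kernel_size=2, stride=2),        # second max pooling, stride 2
            nn.Conv2d(40, 60, kernel_size=4), nn.ReLU(),  # third conv layer: 60 kernels, 4x4
            nn.MaxPool2d(kernel_size=2, stride=2),        # third max pooling, stride 2
            nn.Conv2d(60, 60, kernel_size=3), nn.ReLU(),  # fourth conv layer: 60 kernels, 3x3
        )
        # infer the flattened feature size from a dummy 55x47 input
        with torch.no_grad():
            n_flat = self.features(torch.zeros(1, 1, 55, 47)).numel()
        self.fc = nn.Linear(n_flat, embed_dim)                # penultimate layer: coarse feature f
        self.classifier = nn.Linear(embed_dim, num_classes)   # K-class output layer

    def forward(self, x):
        h = self.features(x).flatten(1)
        f = self.fc(h)               # feature vector used later for verification
        logits = self.classifier(f)  # softmax is applied in the training loss
        return f, logits

# usage sketch: model = CrossAgeDCNN(num_classes=K); feat, logits = model(images)
```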
The LBP histogram feature is computed as follows. Given an image I, its pyramid representation is:

$$G_1(x, y) = I(x, y), \qquad G_k(x, y) = (g * G_{k-1})(x, y), \quad k = 2, \dots, K$$

where $g$ is the Gaussian kernel, $k$ indexes the pyramid levels, $K$ is the number of levels, $G_k$ denotes the image pyramid, and $(x, y)$ denotes the position of a pixel.
Each pyramid level is divided into 8 × 8 blocks, the LBP histogram of each block is computed, and the LBP histograms of all blocks are concatenated into one vector; the LBP pyramid representation of the image is:

$$L(I) = [H_1, H_2, \dots, H_K]$$

where $H_k$ is the concatenation of the block LBP histograms of level $k$ and $L(I)$ is the mapped representation of the image.
The cosine similarity is computed as follows.
The feature vector of image i is $F_i = (f_{i1}, f_{i2}, \dots, f_{in})$ and the feature vector of image j is $F_j = (f_{j1}, f_{j2}, \dots, f_{jn})$, where n is the number of features, $f_{in}$ is the n-th feature of image i, and $f_{jn}$ is the n-th feature of image j.
The cosine similarity between the feature vectors of image i and image j is:

$$\cos(F_i, F_j) = \frac{\sum_{k=1}^{n} f_{ik} f_{jk}}{\sqrt{\sum_{k=1}^{n} f_{ik}^{2}} \; \sqrt{\sum_{k=1}^{n} f_{jk}^{2}}}$$

where k is the index of a feature within the feature vector.
Owing to the above technical solution, the present invention has the following advantages over the prior art:
1. The present invention automatically obtains the high-level semantic features of facial images with a nine-layer deep convolutional neural network; its weight sharing reduces the complexity of the network model, and this is also the first application of a deep network to cross-age face verification.
2. The present invention creatively fuses the hand-designed LBP histogram feature with the feature learned autonomously by the deep network, so that the high-level semantic features and the low-level features complement each other; the experimental results show that this yields better accuracy.
Description of the drawings
Fig. 1 is a block diagram of the method in an embodiment of the present invention;
Fig. 2 is an architecture diagram of the deep convolutional neural network used in the embodiment;
Fig. 3 shows examples from the CACD data set used in the embodiment;
Fig. 4 is the ROC curve of the results in the embodiment.
Specific embodiment
The invention will be further described with reference to the accompanying drawings and embodiments:
Embodiment one:
A cross-age face verification method based on feature learning, shown in Figure 1, includes the following steps:
(1) obtaining two facial images to be compared;
(2) aligning the two facial images using a facial landmark localization method;
Under unconstrained conditions, facial images are inevitably affected by facial expression, illumination or occlusion, and this influence is amplified when the face regions of the two images are not sufficiently aligned.
The purpose of facial landmark localization is to further determine, on the basis of face detection, the positions of the facial feature points (eyes, eyebrows, nose, mouth, and the outer contour of the face). The basic idea of the localization algorithm is to combine the texture features of the face with the positional constraints between the feature points. Early facial landmark localization concentrated mainly on a few key points, such as the centers of the eyes and the mouth. Later, researchers found that adding more feature point constraints effectively improves accuracy and increases stability. In this embodiment, face alignment is carried out with the Flandmark method, an open-source library for detecting facial key points that can perform the alignment in real time.
(3) performing feature extraction on each image respectively, the feature extraction being:
1. automatically extracting high-level semantic features with a deep convolutional neural network (DCNN);
In this embodiment, convolution and pooling operations are used to extract hierarchical visual features from the image, from local low-level features to global high-level features. The network contains 4 convolutional layers, the first three of which are each followed by a max pooling layer. Within each convolutional layer the weights of the neurons are shared, while no weights are shared across layers. The network ends with two fully connected layers: the output of the penultimate layer gives the coarse feature, and the last layer is the output layer, whose output for every facial image is the ID number of the facial image with the highest probability, i.e., the class corresponding to each image.
Specifically, referring to Figure 2, in the overall structure of the DCNN an aligned facial image of size 55 × 47 is input to the first convolutional layer of the convolutional neural network, which has 20 filters with a kernel size of 4 × 4. The resulting 20 feature maps are input to a max pooling layer with a pooling stride of 2. The output of this pooling layer then serves as the input of the next convolutional layer, which has 40 filters with a kernel size of 3 × 3. These first layers mainly extract low-level image features such as simple edge features. When the image varies slightly in a local region, max pooling makes the results of the convolutional layers more robust, and the 2D face alignment mentioned above gives the whole network better tolerance to small adjustments of the face. The third convolutional layer has 60 filters with a kernel size of 4 × 4 and is followed by the third max pooling layer with a pooling stride of 2; the fourth and last convolutional layer has 60 filters with a kernel size of 3 × 3. Through the layer-by-layer extraction of the DCNN, a series of facial features is obtained. The feature extraction process is defined as f = Conv(x, q), where Conv(·) denotes the feature extraction function of the convolutional neural network, x is the input image, f is the resulting feature vector, and q denotes the parameters of the DCNN to be learned.
The output of the last fully connected layer is fed into a K-class softmax classifier, where K denotes the number of classes to be distinguished; this classifier produces a distribution over class labels. For a given input, let the k-th output of the network be $o_k$; the activation function of the softmax classifier is then:

$$p_k = \frac{\exp(o_k)}{\sum_{h=1}^{K} \exp(o_h)}$$

The purpose of training is to maximize the probability that every facial image is assigned to its correct face_id, and this problem can be converted into minimizing the cross-entropy loss function of each training sample. If k is the correct label value of the training sample, the loss function can be defined as:

$$\ell = -\log p_k$$

This loss function can be minimized with the stochastic gradient descent method, the gradient being computed with the classic back-propagation (BP) algorithm. In a convolutional layer, the feature maps of the previous layer are convolved with a learnable convolution kernel and then passed through an activation function to produce the output feature maps; each output map may combine the values of several input maps.
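As a concrete illustration of this training objective, the following minimal sketch minimizes the softmax cross-entropy with stochastic gradient descent. It assumes the PyTorch model sketched earlier and a hypothetical `loader` that yields batches of (image, face_id) pairs; the learning rate, momentum and number of epochs are arbitrary choices, not values given in the text.

```python
import torch
import torch.nn as nn

def train_identification(model, loader, epochs=10, lr=0.01):
    """Minimize the per-sample cross-entropy -log p_k with SGD / back-propagation."""
    criterion = nn.CrossEntropyLoss()   # combines softmax and cross-entropy
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for images, face_ids in loader:         # face_ids: correct labels k
            optimizer.zero_grad()
            _, logits = model(images)           # logits of the K-class output layer
            loss = criterion(logits, face_ids)  # = -log p_k averaged over the batch
            loss.backward()                     # gradients via back-propagation (BP)
            optimizer.step()                    # stochastic gradient descent update
    return model
```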
2. computing the LBP histogram feature of the image;
The LBP histogram feature is computed as follows. Given an image I, its pyramid representation is:

$$G_1(x, y) = I(x, y), \qquad G_k(x, y) = (g * G_{k-1})(x, y), \quad k = 2, \dots, K$$

where $g$ is the Gaussian kernel and $K$ is the number of pyramid levels.
Each pyramid level is divided into 8 × 8 blocks, the LBP histogram of each block is computed, and the LBP histograms of all blocks are concatenated into one vector; the LBP pyramid representation of the image is:

$$L(I) = [H_1, H_2, \dots, H_K]$$

where $H_k$ is the concatenation of the block LBP histograms of level $k$.
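A minimal sketch of this LBP histogram pyramid follows, using scikit-image. The uniform LBP operator with P = 8 neighbors and radius R = 1, the number of pyramid levels, and the use of normalized histograms are assumptions; the text only fixes the 8 × 8 block grid per level.

```python
import numpy as np
from skimage import img_as_ubyte
from skimage.feature import local_binary_pattern
from skimage.transform import pyramid_gaussian

def lbp_pyramid_feature(image, levels=3, grid=(8, 8), P=8, R=1):
    """L(I): concatenated block LBP histograms over a Gaussian pyramid.

    image: 2D grayscale array; levels, P, R and the histogram normalization
    are illustrative assumptions.
    """
    n_bins = P + 2                       # number of 'uniform' LBP codes
    feats = []
    # Gaussian pyramid: level 1 is the image itself, later levels are smoothed/downsampled
    for level in pyramid_gaussian(image, max_layer=levels - 1, downscale=2):
        lbp = local_binary_pattern(img_as_ubyte(level), P, R, method="uniform")
        h, w = lbp.shape
        bh, bw = h // grid[0], w // grid[1]
        for by in range(grid[0]):
            for bx in range(grid[1]):
                block = lbp[by * bh:(by + 1) * bh, bx * bw:(bx + 1) * bw]
                hist, _ = np.histogram(block, bins=n_bins,
                                       range=(0, n_bins), density=True)
                feats.append(hist)       # one histogram per block
    return np.concatenate(feats)         # concatenation of all block histograms
```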
3. fusing the features obtained in 1. and 2. to obtain the feature of the image, expressed as a feature vector;
(4) computing the distance between the feature vectors of the two images obtained in step (3) with the cosine similarity method, and judging accordingly whether the two images come from the same person.
The cosine similarity is computed as follows.
The feature vector of image i is $F_i = (f_{i1}, f_{i2}, \dots, f_{in})$ and the feature vector of image j is $F_j = (f_{j1}, f_{j2}, \dots, f_{jn})$, where n is the number of features, $f_{in}$ is the n-th feature of image i, and $f_{jn}$ is the n-th feature of image j.
The cosine similarity between the feature vectors of image i and image j is:

$$\cos(F_i, F_j) = \frac{\sum_{k=1}^{n} f_{ik} f_{jk}}{\sqrt{\sum_{k=1}^{n} f_{ik}^{2}} \; \sqrt{\sum_{k=1}^{n} f_{jk}^{2}}}$$

where k is the index of a feature within the feature vector.
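Steps 3. and (4) can be written out directly. The sketch below, in Python/NumPy, concatenates the two feature vectors and evaluates the cosine similarity formula above; the decision threshold is a free parameter not specified in the text.

```python
import numpy as np

def fuse_features(dcnn_feat, lbp_feat):
    """Step 3: concatenate the DCNN feature and the LBP histogram feature."""
    return np.concatenate([np.ravel(dcnn_feat), np.ravel(lbp_feat)])

def cosine_similarity(f_i, f_j):
    """cos(F_i, F_j) = sum_k f_ik * f_jk / (||F_i|| * ||F_j||)."""
    return float(np.dot(f_i, f_j) / (np.linalg.norm(f_i) * np.linalg.norm(f_j)))

def same_person(f_i, f_j, threshold=0.5):
    """Step (4): decide 'same person' when the similarity exceeds a chosen threshold."""
    return cosine_similarity(f_i, f_j) >= threshold
```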
By training on a labeled data set and fine-tuning the parameters of the classifier, the classifier of this embodiment achieves better performance.
The effect of the invention is further illustrated below through concrete comparisons.
The experiments are carried out on the CACD (Cross-Age Celebrity Dataset) age data set and on the LFW data set. The CACD age data set contains more than 160,000 facial images of 2,000 different people; each person has images from several age periods, and the ages range from 16 to 62 years. Since most people in the LFW data set have only one picture, training on that data set is hardly feasible; therefore, the model is trained on the CelebFaces+ data set, which contains 202,599 pictures of 10,177 different people. The people covered by the LFW and CelebFaces data sets are essentially the same, so the model can be trained on the latter and then tested on the LFW data set. Experimental hardware environment: Windows 7, Core i7 processor, 3.4 GHz main frequency, 8 GB memory. Code running environment: Matlab 2013a.
In the experiments, the results are measured with the True Positive Rate (TPR) and the False Positive Rate (FPR), defined as:

$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}$$

where TP, FN, FP and TN denote the numbers of true positives, false negatives, false positives and true negatives, respectively.
1. Experiments on the CACD data set
Examples from the data set are shown in Figure 3.
The results for different similarity measures are shown in Table 1.
Table 1 Comparison of different similarity measures
Measure | Accuracy rate (%)
Distance | 86.3
Euclidean distance | 81.4
Bhattacharyya distance | 82.1
Cosine similarity | 89.5
The algorithm of the invention is also evaluated with ROC curves and compared with other methods. The methods used in the comparison are: the Gradient Orientation Pyramid (GOP), the L2 norm, and Bayesian + PointFive Face (PFF). In this experiment the test set contains 2,000 positive pairs and 2,000 negative pairs. The ROC curves of the experimental results are shown in Figure 4.
As can be seen from Figure 4, the method of the invention (DCNN+LBPH) achieves a certain improvement over the other methods.
On the subset of the CACD data set, the verification accuracy is also evaluated; the comparison with other methods is shown in Table 2.
Table 2:
Method | Verification accuracy (%)
High-Dimensional LBP | 81.6
Hidden Factor Analysis | 84.4
Cross-Age Reference Coding | 87.6
DCNN+LBPH (this embodiment) | 89.5
Human, average | 85.7
Human, voting | 94.2
2. Experiments on the LFW data set
The LFW data set is a standard data set in the field of face recognition. On this large data set, with added difficulties such as age variation, strong illumination or changes of expression, any slight advance over current algorithms is far from easy.
After training, the verification work of this embodiment is carried out on the LFW view2 subset, which lists all matched face image pairs and mismatched face image pairs. The same procedure as in the experiments on the CACD data set is adopted: the first step is to obtain an accurate face representation, and then the similarity of the pair is computed to verify whether the two images belong to the same person. The method of the invention introduces a deep learning framework; the experimental results compared with other algorithms are shown in Table 3. It can be seen that the method of the invention outperforms the previous methods in accuracy.
Table 3
Method | Accuracy rate (%)
PLDA | 90.07
Joint Bayesian | 90.90
Linear rectified units | 80.73
GSML | 84.18
OSS, TSS, full | 86.83
DCNN+LBPH (this embodiment) | 91.40