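WO2021012494A1 - Deep learning-based face recognition method and device, and computer-readable storage medium - Google Patents
Deep learning-based face recognition method and device, and computer-readable storage medium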
- Publication number
- WO2021012494A1 (PCT/CN2019/116934)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- face
- neural network
- convolutional neural
- training
- picture
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
Definitions
- This application relates to the field of artificial intelligence technology, and in particular to a face recognition method, device and computer-readable storage medium based on Gabor filters and convolutional neural networks.
- Face recognition is a kind of biometric recognition technology based on human facial feature information.
- face recognition technology mainly uses cameras and other camera equipment to collect images or video streams containing human faces, and automatically detect faces in the images, and then perform a series of related operations on the detected faces.
- the process of face recognition is the process of extracting and recognizing features from standard face images. Therefore, the quality of the extracted facial image features directly affects the final recognition accuracy, and the recognition model also plays a vital role in the accuracy of face recognition.
- most of the current feature extraction is mainly based on manual feature extraction. This method is restricted by many factors, and the current recognition models are based on traditional machine learning algorithms. Therefore, in general, the face recognition effect is not ideal and the recognition accuracy is not high.
- This application provides a face recognition method, device and computer-readable storage medium based on deep learning, the main purpose of which is to accurately recognize the face in a face picture or video when a user inputs that picture or video.
- a face recognition method based on deep learning includes:
- a picture of a user's face is received, and the picture of the user's face is input to the convolutional neural network for face recognition, and the recognition result is output.
- the web page includes a web page of an ORL face database, a Yale face database, an AR face database, and/or a FERET face database.
- the extracting the face features of the original face image set according to the Gabor filter to obtain the face feature set includes:
- a Gabor filter bank composed of several Gabor filters receives the original face image set
- the Gabor filter bank sequentially performs a first convolution operation with pictures in the original face image set to obtain Gabor features
- the Gabor features obtained by each first convolution operation are combined into a set to obtain the face feature set.
- the first convolution operation is: $O_{y,u,v}(x_1, x_2) = M(x_1, x_2) * \phi_{y,u,v}(z)$
- $O_{y,u,v}(x_1, x_2)$ is the Gabor feature
- $M(x_1, x_2)$ is the pixel value coordinates of the picture in the original face image set
- $\phi_{y,u,v}(z)$ is the convolution function
- $z$ is the convolution operator
- $y$, $u$, and $v$ represent the three components of the picture
- $y$ is the brightness of the picture
- $u$, $v$ are the chromaticity of the picture.
- the convolutional neural network includes sixteen convolutional layers, sixteen pooling layers, and one fully connected layer; and inputting the face feature vector set into the pre-built convolutional neural network model for training, until the loss function value in the convolutional neural network is less than the preset threshold and training exits, includes:
- after receiving the face feature vector set, the convolutional neural network inputs it to the sixteen convolutional layers and sixteen pooling layers to perform a second convolution operation and a max pooling operation, and then passes the result to the fully connected layer;
- the fully connected layer, combined with the activation function, calculates the training value; the training value is input into the loss function of the model training layer; the loss function calculates the loss value, which is compared with a preset threshold; once the loss value is less than the preset threshold, the convolutional neural network exits training.
- this application also provides a face recognition device based on deep learning.
- the device includes a memory and a processor.
- the memory stores a face recognition program based on deep learning that can run on the processor; when the face recognition program based on deep learning is executed by the processor, the following steps are implemented:
- a picture of a user's face is received, and the picture of the user's face is input to the convolutional neural network for face recognition, and the recognition result is output.
- the web page includes a web page of an ORL face database, a Yale face database, an AR face database, and/or a FERET face database.
- the extracting the face features of the original face image set according to the Gabor filter to obtain the face feature set includes:
- a Gabor filter bank composed of several Gabor filters receives the original face image set
- the Gabor filter bank sequentially performs a first convolution operation with pictures in the original face image set to obtain Gabor features
- the Gabor features obtained by each first convolution operation are combined into a set to obtain the face feature set.
- the present application also provides a computer-readable storage medium that stores a face recognition program based on deep learning, and the face recognition program based on deep learning can be executed by one or more processors to implement the steps of the face recognition method based on deep learning as described above.
- the face recognition method, device and computer-readable storage medium based on deep learning proposed in this application use crawler technology to collect a large number of high-quality face data sets from the Internet, which lays the groundwork for subsequent face feature analysis and recognition. Because most faces do not occupy the entire picture or video, the features of the face region are extracted from the entire picture or video according to the shape of the Gabor filter. This not only reduces the tedium of manually extracting features, but also prepares for the subsequent analysis of the face features by the convolutional neural network, which can effectively analyze the face features and produce accurate face recognition. Therefore, this application can achieve efficient and accurate face recognition.
- FIG. 1 is a schematic flowchart of a face recognition method based on deep learning provided by an embodiment of this application;
- FIG. 2 is a Gabor feature generation diagram of a face recognition method based on deep learning provided by an embodiment of this application;
- FIG. 3 is a schematic diagram of the internal structure of a face recognition device based on deep learning provided by an embodiment of the application;
- FIG. 4 is a schematic diagram of modules of a face recognition program based on deep learning in a face recognition device based on deep learning provided by an embodiment of the application.
- This application provides a face recognition method based on deep learning.
- referring to FIG. 1, a schematic flowchart of a face recognition method based on deep learning provided by an embodiment of this application is shown.
- the method can be executed by a device, and the device can be implemented by software and/or hardware.
- the face recognition method based on deep learning includes:
- the several face image databases include ORL face database, Yale face database, AR face database, and/or FERET face database, etc.
- the Yale face database contains 15 people with 11 photos each, and the photos vary in lighting conditions, facial expressions, etc.
- the FERET face database comes from a face database collection effort for Face Recognition Technology (FERET), launched by the Counterdrug Technology Transfer Program (CTTP) of the U.S. Department of Defense to promote further optimization of face recognition technology
- the FERET face database includes a general face database and a general test standard.
- the same face picture includes pictures of different expressions, lighting, postures and age groups.
- this application uses the Urllib module of python to read web page data, for example reading the web page of the FERET face database, capturing the face image data in that web page, and composing these data into the original face image set.
- similarly, the Urllib module reads the web pages of the Yale face database, the AR face database, etc., captures the face image data, and places it in the original face image set.
- this application composes several Gabor filters into a Gabor filter bank; after the Gabor filter bank receives the original face image set, it performs the first convolution operation with each picture in the original face image set in turn to obtain Gabor features.
- the Gabor features obtained from each first convolution operation are combined into a set to obtain the face feature set.
- $O_{y,u,v}(x_1, x_2)$ is the Gabor feature
- $M(x_1, x_2)$ is the pixel value coordinates of the picture in the original face image set
- $\phi_{y,u,v}(z)$ is the convolution function
- $z$ is the convolution operator
- $y$, $u$, and $v$ represent the three components of the picture
- $y$ is the brightness of the picture
- $u$, $v$ are the chromaticity of the picture.
- the preferred embodiment of this application selects 40 Gabor filters to form a Gabor filter bank.
- for example, the Gabor filter bank formed by the 40 Gabor filters reads one image of the original face image set and performs the first convolution operation between that image and the Gabor filter bank to obtain Gabor features.
- the feature dimension of each Gabor feature is 40, and the Gabor features obtained in this way for every image form the face feature set.
- the change from the original face image to the Gabor feature is shown in Figure 2.
- the downsampling technology dimensionality reduction processing includes the first feature dimensionality reduction and the second feature dimensionality reduction.
- the first feature dimensionality reduction sequentially extracts Gabor features from the face feature set and, using a sliding window with a matrix dimension of 2*2, performs mean value sampling with a step length of 2 on the extracted Gabor feature from left to right and from top to bottom, whereby the feature dimension of the extracted Gabor feature is reduced to 1/4 of the original dimension and becomes 10, which completes the first feature dimensionality reduction.
- the feature dimension of the Gabor feature is reduced to 1/4 of the original dimension, and then an RBM model is connected to perform the second feature reduction.
- the RBM is an energy-based model (EBM) that evolved from the energy model in physics; the RBM model receives input data, solves the probability distribution of the input data according to an energy function, and obtains output data after optimization based on the probability distribution.
- the second feature reduction uses the face feature set after the first feature reduction as the input data of the RBM model.
- the feature dimension of the output feature of the RBM model is 5.
- the dimensionality reduction processing reduces the feature dimension of Gabor features from 40 to 5, and so on to process each Gabor feature and finally compose the output dimensionality reduction feature into a face feature vector set.
- the pre-built convolutional neural network includes sixteen convolutional layers, sixteen pooling layers, and one fully connected layer.
- after receiving the face feature vector set, the convolutional neural network inputs it to the sixteen convolutional layers and the sixteen pooling layers to perform a second convolution operation and a max pooling operation, and then passes the result to the fully connected layer;
- the fully connected layer is combined with the activation function to calculate the training value, and the training value is input into the loss function of the model training layer.
- the loss function calculates the loss value, which is compared with the preset threshold; once the loss value is less than the preset threshold, the convolutional neural network exits training.
- $\omega'$ is the output data
- $\omega$ is the input data
- $k$ is the size of the convolution kernel
- $s$ is the stride of the convolution operation
- $p$ is the zero-padding matrix of the data
- the max pooling operation selects the largest value in a matrix's data to replace the entire matrix
- the activation function is:
- n is the size of the original picture set
- $y_t$ is the training value
- $\mu_t$ is the original picture set
- the preset threshold is generally set at 0.01.
- the invention also provides a face recognition device based on deep learning.
- referring to FIG. 3, a schematic diagram of the internal structure of a face recognition device based on deep learning provided by an embodiment of this application is shown.
- the face recognition apparatus 1 based on deep learning may be a PC (Personal Computer, personal computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer, or a server.
- the face recognition device 1 based on deep learning at least includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
- the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc.
- the memory 11 may be an internal storage unit of the face recognition device 1 based on deep learning, for example, a hard disk of the face recognition device 1 based on deep learning.
- the memory 11 may also be an external storage device of the face recognition device 1 based on deep learning, such as a plug-in hard disk equipped on the face recognition device 1 based on deep learning, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, etc.
- the memory 11 may also include both an internal storage unit of the face recognition apparatus 1 based on deep learning and an external storage device.
- the memory 11 can be used not only to store application software and various data installed in the face recognition device 1 based on deep learning, such as the code of the face recognition program 01 based on deep learning, but also to temporarily store data that has been output or is to be output.
- the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run the program code or process the data stored in the memory 11, for example to execute the face recognition program 01 based on deep learning.
- the communication bus 13 is used to realize the connection and communication between these components.
- the network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the device 1 and other electronic devices.
- the device 1 may also include a user interface.
- the user interface may include a display (Display) and an input unit such as a keyboard (Keyboard).
- the optional user interface may also include a standard wired interface and a wireless interface.
- the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, etc.
- the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the face recognition device 1 based on deep learning and to display a visualized user interface.
- FIG. 3 only shows the deep learning-based face recognition device 1 with components 11-14 and the deep learning-based face recognition program 01. Those skilled in the art can understand that the structure shown in FIG. 3 does not limit the face recognition device 1 based on deep learning, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
- the memory 11 stores a face recognition program 01 based on deep learning; the processor 12 implements the following steps when executing the face recognition program 01 based on deep learning stored in the memory 11:
- the several face image databases include ORL face database, Yale face database, AR face database, and/or FERET face database, etc.
- the Yale face database contains 15 people with 11 photos each, and the photos vary in lighting conditions, facial expressions, etc.
- the FERET face database comes from a face database collection effort for Face Recognition Technology (FERET), launched by the Counterdrug Technology Transfer Program (CTTP) of the U.S. Department of Defense to promote further optimization of face recognition technology
- the FERET face database includes a general face database and a general test standard.
- the same face picture includes pictures of different expressions, lighting, postures and age groups.
- this application uses the Urllib module of python to read web page data, for example reading the web page of the FERET face database, capturing the face image data in that web page, and composing these data into the original face image set.
- similarly, the Urllib module reads the web pages of the Yale face database, the AR face database, etc., captures the face image data, and places it in the original face image set.
- this application composes several Gabor filters into a Gabor filter bank; after the Gabor filter bank receives the original face image set, it performs the first convolution operation with each picture in the original face image set in turn to obtain Gabor features.
- the Gabor features obtained from each first convolution operation are combined into a set to obtain the face feature set.
- $O_{y,u,v}(x_1, x_2)$ is the Gabor feature
- $M(x_1, x_2)$ is the pixel value coordinates of the picture in the original face image set
- $\phi_{y,u,v}(z)$ is the convolution function
- $z$ is the convolution operator
- $y$, $u$, and $v$ represent the three components of the picture
- $y$ is the brightness of the picture
- $u$, $v$ are the chromaticity of the picture.
- the preferred embodiment of this application selects 40 Gabor filters to form a Gabor filter bank.
- for example, the Gabor filter bank formed by the 40 Gabor filters reads one image of the original face image set and performs the first convolution operation between that image and the Gabor filter bank to obtain Gabor features.
- the feature dimension of each Gabor feature is 40, and the Gabor features obtained in this way for every image form the face feature set.
- the change from the original face image to the Gabor feature is shown in Figure 2.
- the downsampling technology dimensionality reduction processing includes the first feature dimensionality reduction and the second feature dimensionality reduction.
- the first feature dimensionality reduction sequentially extracts Gabor features from the face feature set and, using a sliding window with a matrix dimension of 2*2, performs mean value sampling with a step length of 2 on the extracted Gabor feature from left to right and from top to bottom, whereby the feature dimension of the extracted Gabor feature is reduced to 1/4 of the original dimension and becomes 10, which completes the first feature dimensionality reduction.
- the feature dimension of the Gabor feature is reduced to 1/4 of the original dimension, and then an RBM model is connected to perform the second feature reduction.
- the RBM is an energy-based model (EBM) that evolved from the energy model in physics; the RBM model receives input data, solves the probability distribution of the input data according to an energy function, and obtains output data after optimization based on the probability distribution.
- the second feature reduction uses the face feature set after the first feature reduction as the input data of the RBM model.
- the feature dimension of the output feature of the RBM model is 5.
- the dimensionality reduction processing reduces the feature dimension of Gabor features from 40 to 5, and so on to process each Gabor feature and finally compose the output dimensionality reduction feature into a face feature vector set.
- the pre-built convolutional neural network includes sixteen convolutional layers, sixteen pooling layers, and one fully connected layer.
- after receiving the face feature vector set, the convolutional neural network inputs it to the sixteen convolutional layers and the sixteen pooling layers to perform a second convolution operation and a max pooling operation, and then passes the result to the fully connected layer;
- the fully connected layer is combined with the activation function to calculate the training value, and the training value is input into the loss function of the model training layer.
- the loss function calculates the loss value, which is compared with the preset threshold; once the loss value is less than the preset threshold, the convolutional neural network exits training.
- $\omega'$ is the output data
- $\omega$ is the input data
- $k$ is the size of the convolution kernel
- $s$ is the stride of the convolution operation
- $p$ is the zero-padding matrix of the data
- the max pooling operation selects the largest value in a matrix's data to replace the entire matrix
- the activation function is:
- n is the size of the original picture set
- $y_t$ is the training value
- $\mu_t$ is the original picture set
- the preset threshold is generally set at 0.01.
- the deep learning-based face recognition program can also be divided into one or more modules, and the one or more modules are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete this application.
- the module referred to in this application is a series of computer program instruction segments that can complete specific functions, and is used to describe the execution process of the face recognition program based on deep learning in the face recognition device based on deep learning.
- referring to FIG. 4, a schematic diagram of the program modules of the face recognition program based on deep learning in an embodiment of the face recognition device based on deep learning of this application is shown.
- in this embodiment, the face recognition program based on deep learning can be divided into a source data receiving module 10, a feature extraction module 20, a model training module 30, and a face recognition result output module 40.
- the source data receiving module 10 is used to obtain face image data from web pages based on crawler technology to form an original face image set.
- the feature extraction module 20 is configured to: extract the face features of the original face image set according to the Gabor filter to obtain a face feature set, and perform dimensionality reduction processing on the face feature set according to a downsampling technique to form a face feature vector set.
- the model training module 30 is configured to: input the face feature vector set into a pre-built convolutional neural network model for training, and exit training when the loss function value in the convolutional neural network is less than a preset threshold.
- the face recognition result output module 40 is configured to receive a face picture of the user, and input the face picture of the user into the convolutional neural network for face recognition, and output the recognition result.
- the above-mentioned source data receiving module 10, feature extraction module 20, model training module 30, face recognition result output module 40, and other program modules that implement functions or operation steps when executed are substantially the same as those in the foregoing embodiment, and will not be repeated here.
- an embodiment of the present application also proposes a computer-readable storage medium that stores a face recognition program based on deep learning, and the face recognition program based on deep learning can be executed by one or more processors to achieve the following operations:
- a picture of a user's face is received, and the picture of the user's face is input to the convolutional neural network for face recognition, and the recognition result is output.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
- Collating Specific Patterns (AREA)
Abstract
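A deep learning-based face recognition method, device, and computer-readable storage medium, relating to artificial intelligence technology. The method includes: obtaining face image data from web pages based on crawler technology to form an original face image set (S1); extracting the face features of the original face image set according to a Gabor filter to obtain a face feature set, and performing dimensionality reduction on the face feature set according to a downsampling technique to form a face feature vector set (S2); inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold (S3); and receiving a picture of a user's face, inputting the picture into the convolutional neural network for face recognition, and outputting the recognition result (S4). The method can achieve efficient and accurate face recognition.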
Description
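This application claims priority to the Chinese patent application filed with the China Patent Office on July 19, 2019, with application number 201910658687.0 and the invention title "Face recognition method, device, and computer-readable storage medium based on deep learning", the entire contents of which are incorporated into this application by reference.
This application relates to the field of artificial intelligence technology, and in particular to a face recognition method, device, and computer-readable storage medium based on Gabor filters and convolutional neural networks.
Face recognition is a biometric technology that identifies a person based on facial feature information. Current face recognition technology mainly uses cameras and similar equipment to capture images or video streams containing faces, automatically detects the faces in the images, and then performs a series of recognition operations on the detected faces. The face recognition process consists of extracting features from standard face images and recognizing those features. The quality of the extracted face image features therefore directly affects the final recognition accuracy, and the recognition model also has a crucial influence on accuracy. However, most current feature extraction relies on manual feature extraction, which is constrained by many factors, and current recognition models are based on traditional machine learning algorithms. Overall, the face recognition effect is not ideal and the recognition accuracy is not high.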
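Summary of the Invention
This application provides a face recognition method, device, and computer-readable storage medium based on deep learning, the main purpose of which is to accurately recognize the face in a face picture or video when a user inputs that picture or video.
To achieve the above purpose, this application provides a face recognition method based on deep learning, including:
obtaining face image data from web pages based on crawler technology to form an original face image set;
extracting the face features of the original face image set according to a Gabor filter to obtain a face feature set, and performing dimensionality reduction on the face feature set according to a downsampling technique to form a face feature vector set;
inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold;
receiving a picture of a user's face, inputting the picture into the convolutional neural network for face recognition, and outputting the recognition result.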
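Optionally, the web pages include the web pages of the ORL face database, the Yale face database, the AR face database, and/or the FERET face database.
Optionally, extracting the face features of the original face image set according to the Gabor filter to obtain the face feature set includes:
a Gabor filter bank composed of several Gabor filters receives the original face image set;
the Gabor filter bank performs a first convolution operation with each picture in the original face image set in turn to obtain Gabor features;
the Gabor features obtained from each first convolution operation are combined into a set to obtain the face feature set.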
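Optionally, the first convolution operation is:
$O_{y,u,v}(x_1, x_2) = M(x_1, x_2) * \phi_{y,u,v}(z)$
where $O_{y,u,v}(x_1, x_2)$ is the Gabor feature, $M(x_1, x_2)$ is the pixel value coordinates of the picture in the original face image set, $\phi_{y,u,v}(z)$ is the convolution function, $z$ is the convolution operator, and $y$, $u$, $v$ represent the three components of the picture, where $y$ is the brightness of the picture and $u$, $v$ are its chromaticity.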
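Optionally, the convolutional neural network includes sixteen convolutional layers, sixteen pooling layers, and one fully connected layer; and inputting the face feature vector set into the pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than the preset threshold, includes:
after receiving the face feature vector set, the convolutional neural network inputs it to the sixteen convolutional layers and sixteen pooling layers to perform a second convolution operation and a max pooling operation, and then passes the result to the fully connected layer;
the fully connected layer, combined with the activation function, calculates the training value; the training value is input into the loss function of the model training layer; the loss function calculates the loss value, which is compared with the preset threshold; once the loss value is less than the preset threshold, the convolutional neural network exits training.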
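In addition, to achieve the above purpose, this application also provides a face recognition device based on deep learning. The device includes a memory and a processor, the memory stores a face recognition program based on deep learning that can run on the processor, and when the face recognition program based on deep learning is executed by the processor, the following steps are implemented:
obtaining face image data from web pages based on crawler technology to form an original face image set;
extracting the face features of the original face image set according to a Gabor filter to obtain a face feature set, and performing dimensionality reduction on the face feature set according to a downsampling technique to form a face feature vector set;
inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold;
receiving a picture of a user's face, inputting the picture into the convolutional neural network for face recognition, and outputting the recognition result.
Optionally, the web pages include the web pages of the ORL face database, the Yale face database, the AR face database, and/or the FERET face database.
Optionally, extracting the face features of the original face image set according to the Gabor filter to obtain the face feature set includes:
a Gabor filter bank composed of several Gabor filters receives the original face image set;
the Gabor filter bank performs a first convolution operation with each picture in the original face image set in turn to obtain Gabor features;
the Gabor features obtained from each first convolution operation are combined into a set to obtain the face feature set.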
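In addition, to achieve the above purpose, this application also provides a computer-readable storage medium that stores a face recognition program based on deep learning, and the face recognition program based on deep learning can be executed by one or more processors to implement the steps of the face recognition method based on deep learning as described above.
The face recognition method, device, and computer-readable storage medium based on deep learning proposed in this application use crawler technology to collect a large number of high-quality face data sets from the Internet, which lays the groundwork for subsequent face feature analysis and recognition. Because most faces do not occupy the entire picture or video, the features of the face region are extracted from the entire picture or video according to the shape of the Gabor filter. This not only reduces the tedium of manually extracting features, but also prepares for the subsequent analysis of the face features by the convolutional neural network, which can effectively analyze the face features and produce accurate face recognition. Therefore, this application can achieve efficient and accurate face recognition.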
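FIG. 1 is a schematic flowchart of a face recognition method based on deep learning provided by an embodiment of this application;
FIG. 2 is a Gabor feature generation diagram of a face recognition method based on deep learning provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the internal structure of a face recognition device based on deep learning provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the modules of the face recognition program based on deep learning in a face recognition device based on deep learning provided by an embodiment of this application.
The realization of the purpose, the functional characteristics, and the advantages of this application are further described with reference to the embodiments and the accompanying drawings.
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.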
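This application provides a face recognition method based on deep learning. Referring to FIG. 1, a schematic flowchart of a face recognition method based on deep learning provided by an embodiment of this application is shown. The method can be executed by a device, and the device can be implemented by software and/or hardware.
In this embodiment, the face recognition method based on deep learning includes:
S1. Obtain face image data from web pages, such as the web pages of several face image databases, based on crawler technology to form an original face image set.
The several face image databases include the ORL face database, the Yale face database, the AR face database, and/or the FERET face database, etc. The Yale face database contains 15 people with 11 photos each, and the photos vary in lighting conditions, facial expressions, etc. The FERET face database comes from a face database collection effort for Face Recognition Technology (FERET), launched by the Counterdrug Technology Transfer Program (CTTP) of the U.S. Department of Defense to promote further optimization of face recognition technology; the FERET face database includes a general face database and a general test standard. Pictures of the same face include pictures with different expressions, lighting, postures, and age groups.
Preferably, this application uses the Urllib module of python to read web page data, for example reading the web page of the FERET face database, capturing the face image data in that web page, and composing these data into the original face image set. Similarly, the Urllib module reads the web pages of the Yale face database, the AR face database, etc., captures the face image data, and places it in the original face image set.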
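Step S1 can be illustrated with a minimal sketch built on Python's standard `urllib` and `html.parser` modules. The text only states that Urllib reads the web pages and that image data is captured; the class `ImageLinkParser`, the helpers `extract_image_urls` and `fetch_face_images`, and the idea of collecting every `<img>` tag are assumptions made for illustration, not the method of this application.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImageLinkParser(HTMLParser):
    """Collects the src attribute of every <img> tag on a page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return the absolute image URLs found in an HTML string."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.image_urls


def fetch_face_images(page_url, timeout=10):
    """Download a page and return the raw bytes of each image it links to.

    Performs network I/O; page_url is supplied by the caller.
    """
    with urlopen(page_url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    images = []
    for image_url in extract_image_urls(html, page_url):
        with urlopen(image_url, timeout=timeout) as image_response:
            images.append(image_response.read())
    return images
```

Calling `fetch_face_images` once per database page and concatenating the results would give a raw image collection playing the role of the original face image set; a real crawler would also need to respect each database's access terms and filter out non-face images.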
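S2. Extract the face features of the original face image set according to the Gabor filter to obtain a face feature set, and perform dimensionality reduction on the face feature set according to a downsampling technique to form a face feature vector set.
Preferably, this application composes several Gabor filters into a Gabor filter bank. After the Gabor filter bank receives the original face image set, it performs a first convolution operation with each picture in the original face image set in turn to obtain Gabor features, and the Gabor features obtained from each first convolution operation are combined into a set to obtain the face feature set.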
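Further, the first convolution operation is:
$O_{y,u,v}(x_1, x_2) = M(x_1, x_2) * \phi_{y,u,v}(z)$
where $O_{y,u,v}(x_1, x_2)$ is the Gabor feature, $M(x_1, x_2)$ is the pixel value coordinates of the picture in the original face image set, $\phi_{y,u,v}(z)$ is the convolution function, $z$ is the convolution operator, and $y$, $u$, $v$ represent the three components of the picture, where $y$ is the brightness of the picture and $u$, $v$ are its chromaticity.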
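In a preferred embodiment of the present application, 40 Gabor filters form the Gabor filter bank. The bank reads one image from the original face image set and performs the first convolution operation on it to obtain a Gabor feature whose feature dimension is 40. Repeating this for every image yields the Gabor features that make up the face feature set. The transformation from an original face image to Gabor features is shown in FIG. 2.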
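A 40-filter bank can be sketched as follows. This is an illustration under stated assumptions: the text gives only the number of filters, so the split into 5 scales and 8 orientations, the kernel size, the wavelength spacing, and the `sigma` and `gamma` values are all assumed, and the function names are hypothetical. The sketch reads "feature dimension 40" as one filter response per filter at a given pixel.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2-D Gabor kernel as a size x size list of lists (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def build_filter_bank(size=7, n_scales=5, n_orientations=8):
    """Assumed layout: 5 scales x 8 orientations = 40 filters."""
    bank = []
    for s in range(n_scales):
        wavelength = 4.0 * (2 ** (s / 2))  # assumed half-octave scale spacing
        for o in range(n_orientations):
            theta = o * math.pi / n_orientations
            bank.append(gabor_kernel(size, wavelength, theta, sigma=0.56 * wavelength))
    return bank


def filter_response(image, kernel, row, col):
    """Response of one kernel centred on (row, col), with zero padding outside the image.

    The real Gabor kernel is symmetric under a 180-degree rotation, so this
    sliding sum equals a true convolution at that pixel.
    """
    half = len(kernel) // 2
    height, width = len(image), len(image[0])
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row + dy, col + dx
            if 0 <= r < height and 0 <= c < width:
                total += image[r][c] * kernel[dy + half][dx + half]
    return total


def gabor_feature(image, bank, row, col):
    """One response per filter, giving a feature vector of length len(bank)."""
    return [filter_response(image, kernel, row, col) for kernel in bank]
```

For a grayscale image given as a list of rows, `gabor_feature(image, build_filter_bank(), r, c)` returns a 40-element vector for the pixel at `(r, c)`.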
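Preferably, the dimensionality reduction by downsampling includes a first feature reduction and a second feature reduction. The first feature reduction takes the Gabor features from the face feature set one at a time and slides a 2*2 window over each extracted Gabor feature from left to right and top to bottom, with a stride of 2, taking the average of each window. This reduces the feature dimension of the extracted Gabor feature to 1/4 of its original value, that is, from 40 to 10, which completes the first feature reduction.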
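The first reduction is ordinary 2*2 average pooling with stride 2. The sketch below is illustrative only: the function name is hypothetical, and laying the 40 values out as a 4 x 10 grid is an assumption made so that the pooled result has 2 x 5 = 10 values, matching the stated reduction to 1/4.

```python
def average_pool_2x2(feature_map):
    """Average each non-overlapping 2x2 window (stride 2).

    A trailing odd row or column, if any, is dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window_sum = (feature_map[r][c] + feature_map[r][c + 1]
                          + feature_map[r + 1][c] + feature_map[r + 1][c + 1])
            pooled_row.append(window_sum / 4.0)
        pooled.append(pooled_row)
    return pooled


# Assumed layout: 40 feature values arranged as 4 rows x 10 columns.
grid = [[float(r * 10 + c) for c in range(10)] for r in range(4)]
reduced = average_pool_2x2(grid)  # 2 rows x 5 columns = 10 values
```

Each output value replaces four inputs, which is where the factor of 1/4 comes from.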
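Optionally, after the feature dimension of the Gabor feature has been reduced to 1/4 of its original value, an RBM model performs the second feature reduction. The RBM is an energy-based model (EBM) derived from energy models in physics. After receiving input data, the RBM model solves for the probability distribution of the input data according to an energy function, and obtains the output data by optimizing over that distribution. Specifically, the second feature reduction takes the face feature set produced by the first feature reduction as the input data of the RBM model. Preferably, the output features of the RBM model have a feature dimension of 5. Overall, the dimensionality reduction lowers the feature dimension of a Gabor feature from 40 to 5. Each Gabor feature is processed in this way, and the resulting reduced features form the face feature vector set.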
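The second reduction can be illustrated with a small Bernoulli RBM that has 10 visible units and 5 hidden units, trained by one-step contrastive divergence (CD-1). The text does not specify the RBM variant or its training procedure, so the choice of CD-1, the class name `TinyRBM`, the learning rate, and the assumption that inputs are scaled to [0, 1] are all illustrative. The energy function of this variant is $E(v,h) = -a^\top v - b^\top h - v^\top W h$.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Bernoulli RBM mapping n_visible inputs to n_hidden outputs (hypothetical helper)."""

    def __init__(self, n_visible=10, n_hidden=5, seed=0):
        self.rng = random.Random(seed)
        self.w = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_visible = [0.0] * n_visible
        self.b_hidden = [0.0] * n_hidden

    def hidden_probs(self, v):
        """P(h_j = 1 | v); used here as the reduced feature vector."""
        return [sigmoid(self.b_hidden[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hidden))]

    def visible_probs(self, h):
        """P(v_i = 1 | h); used for the reconstruction step of CD-1."""
        return [sigmoid(self.b_visible[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_visible))]

    def cd1_step(self, v0, lr=0.1):
        """One contrastive-divergence update for a single sample with values in [0, 1]."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_visible[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hidden[j] += lr * (h0[j] - h1[j])
```

After training on the 10-dimensional pooled features, `hidden_probs` returns a 5-dimensional vector for each input, which corresponds to the 10 to 5 reduction described above.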
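S3. Input the face feature vector set into a pre-built convolutional neural network model for training, and exit training when the loss function value in the convolutional neural network is less than a preset threshold.
Preferably, the pre-built convolutional neural network includes sixteen convolutional layers, sixteen pooling layers, and one fully connected layer. After receiving the face feature vector set, the convolutional neural network passes it through the sixteen convolutional layers and sixteen pooling layers, which perform a second convolution operation and a max pooling operation, and then into the fully connected layer.
Further, the fully connected layer computes a training value in combination with an activation function. The training value is input into the loss function of the model training layer, which computes a loss value. The loss value is compared with the preset threshold, and the convolutional neural network exits training once the loss value is less than the preset threshold.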
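In a preferred embodiment of the present application, the second convolution operation is:
[formula not reproduced in the source text]
where ω′ is the output data, ω is the input data, k is the size of the convolution kernel, s is the stride of the convolution operation, and p is the zero-padding matrix of the data. The max pooling operation selects the largest value within a matrix window to represent the whole window.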
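The variables listed above (input size, kernel size, stride, zero padding) are the ones that appear in the standard convolution output-size relation, $\omega' = \lfloor(\omega - k + 2p)/s\rfloor + 1$. The source formula is not available, so treating it as this relation is an assumption, as is reading p as the number of padded cells on each side. The sketch below shows that relation together with a max pooling helper; both function names are hypothetical.

```python
def conv_output_size(w, k, s, p):
    """Assumed relation: w' = floor((w - k + 2p) / s) + 1, sizes in cells per side."""
    return (w - k + 2 * p) // s + 1


def max_pool(matrix, size=2, stride=2):
    """Replace each size x size window by its largest value."""
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled.append([
            max(matrix[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, stride)
        ])
    return pooled
```

For example, a 3-wide kernel with stride 1 and padding 1 keeps a 32-wide input at width 32, while each 2*2 max pooling step with stride 2 halves the width.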
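The activation function is:
[formula not reproduced in the source text]
where y is the training value and e is the base of the natural logarithm, an irrational (non-terminating, non-repeating) constant.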
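In a preferred embodiment of the present application, the loss value T is:
[formula not reproduced in the source text]
where $n$ is the size of the original picture set, $y_t$ is the training value, and $\mu_t$ is the original picture set. The preset threshold is generally set to 0.01.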
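The stopping rule, training until the loss falls below the preset threshold of 0.01, can be sketched independently of the network architecture. Because the source formulas for the activation and the loss are not available, the sketch assumes a sigmoid activation and a mean squared error $T = \frac{1}{n}\sum_{t}(y_t - \mu_t)^2$. It also replaces the sixteen-layer network with a single neuron so that it stays short. All names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean_squared_error(predictions, targets):
    """Assumed loss: average of (y_t - mu_t)^2 over the n samples."""
    return sum((y - mu) ** 2 for y, mu in zip(predictions, targets)) / len(targets)


def train_until_threshold(samples, targets, threshold=0.01, lr=1.0, max_epochs=10000):
    """Batch gradient descent on one sigmoid neuron; exits once loss < threshold."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    loss = float("inf")
    n = len(samples)
    for epoch in range(max_epochs):
        predictions = [sigmoid(sum(w * x for w, x in zip(weights, xs)) + bias)
                       for xs in samples]
        loss = mean_squared_error(predictions, targets)
        if loss < threshold:
            return weights, bias, loss, epoch  # exit training
        for xs, y, mu in zip(samples, predictions, targets):
            # Derivative of the squared error through the sigmoid.
            grad = 2.0 * (y - mu) * y * (1.0 - y) / n
            for i, x in enumerate(xs):
                weights[i] -= lr * grad * x
            bias -= lr * grad
    return weights, bias, loss, max_epochs
```

The `max_epochs` guard is an addition of this sketch: a pure "loop until the loss is below the threshold" rule never terminates if the model cannot reach the threshold.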
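S4. Receive a user face picture, input the user face picture into the convolutional neural network for face recognition, and output the recognition result.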
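The present application further provides a face recognition device based on deep learning. FIG. 3 is a schematic diagram of the internal structure of the device according to an embodiment of the present application.
In this embodiment, the deep-learning-based face recognition device 1 may be a PC (personal computer), a terminal device such as a smartphone, tablet computer, or portable computer, or a server. The device 1 includes at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.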
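The memory 11 includes at least one type of readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, or an optical disc. In some embodiments the memory 11 may be an internal storage unit of the device 1, for example its hard disk. In other embodiments the memory 11 may be an external storage device of the device 1, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the device 1. The memory 11 may also include both an internal storage unit and an external storage device of the device 1. The memory 11 can be used not only to store application software installed on the device 1 and various kinds of data, such as the code of the deep-learning-based face recognition program 01, but also to temporarily store data that has been output or is about to be output.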
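In some embodiments the processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. It runs the program code stored in the memory 11 or processes data, for example by executing the deep-learning-based face recognition program 01.
The communication bus 13 provides the connections and communication between these components.
The network interface 14 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface), and is typically used to establish a communication connection between the device 1 and other electronic devices.
Optionally, the device 1 may also include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface or wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (organic light-emitting diode) touch device, or the like. The display may also be called a display screen or display unit, and is used to show the information processed in the device 1 and to present a visual user interface.
FIG. 3 shows only the device 1 with components 11 to 14 and the deep-learning-based face recognition program 01. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.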
In the embodiment of the device 1 shown in FIG. 3, the memory 11 stores the deep-learning-based face recognition program 01. When the processor 12 executes the program 01 stored in the memory 11, it implements steps S1 to S4 exactly as described above for the method embodiment:
S1. Acquire face image data from web pages, for example the web pages of several face image databases, based on crawler technology, to form an original face image set.
S2. Extract face features of the original face image set with Gabor filters to obtain a face feature set, and reduce the dimensionality of the face feature set by a downsampling technique to form a face feature vector set.
S3. Input the face feature vector set into a pre-built convolutional neural network model for training, and exit training when the loss function value in the convolutional neural network is less than a preset threshold.
S4. Receive a user face picture, input the user face picture into the convolutional neural network for face recognition, and output the recognition result.
The details of each step, including the face databases and the use of Python's urllib module, the 40-filter Gabor bank and the first convolution operation, the two-stage dimensionality reduction from 40 to 5, the structure of the convolutional neural network, and the preset threshold of 0.01, are identical to those of the method embodiment and are not repeated here.
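Optionally, in other embodiments, the deep-learning-based face recognition program may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present application. A module in the sense of the present application is a series of computer program instruction segments that performs a specific function, and is used to describe how the deep-learning-based face recognition program executes in the deep-learning-based face recognition device.
For example, FIG. 4 is a schematic diagram of the program modules of the deep-learning-based face recognition program in an embodiment of the device. In this embodiment, the program may be divided into a source data receiving module 10, a feature extraction module 20, a model training module 30, and a face recognition result output module 40. By way of example:
The source data receiving module 10 is used to acquire face image data from web pages based on crawler technology to form an original face image set.
The feature extraction module 20 is used to extract face features of the original face image set with Gabor filters to obtain a face feature set, and to reduce the dimensionality of the face feature set by a downsampling technique to form a face feature vector set.
The model training module 30 is used to input the face feature vector set into a pre-built convolutional neural network model for training, and to exit training when the loss function value in the convolutional neural network is less than a preset threshold.
The face recognition result output module 40 is used to receive a user face picture, input the user face picture into the convolutional neural network for face recognition, and output the recognition result.
The functions or operation steps implemented when the source data receiving module 10, the feature extraction module 20, the model training module 30, the face recognition result output module 40, and other program modules are executed are substantially the same as those of the above embodiments and are not repeated here.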
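The four-module split can be pictured as a small composition of callables. This is a structural sketch only: the class `FaceRecognitionProgram` and its field names are hypothetical, each stage is injected rather than implemented, and running the user picture through the feature extractor before recognition is an assumption, since the text only says that the picture is input to the trained network.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class FaceRecognitionProgram:
    """Wires modules 10 to 40 together; every stage is supplied by the caller."""
    source_data_receiver: Callable[[Sequence[str]], List[Any]]      # module 10
    feature_extractor: Callable[[List[Any]], List[List[float]]]     # module 20
    model_trainer: Callable[[List[List[float]]], Callable[[List[float]], str]]  # module 30

    def train(self, page_urls):
        """Run modules 10, 20 and 30 in order and return the trained model."""
        raw_images = self.source_data_receiver(page_urls)
        feature_vectors = self.feature_extractor(raw_images)
        return self.model_trainer(feature_vectors)

    def recognize(self, model, user_picture):
        """Module 40: produce the recognition result for one user picture.

        Assumption: the picture is reduced to a feature vector in the same way
        as the training data before it reaches the model.
        """
        features = self.feature_extractor([user_picture])[0]
        return model(features)
```

Keeping the stages as injected callables mirrors the statement that a module is just a series of instruction segments performing one function, so any stage can be replaced without touching the others.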
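In addition, an embodiment of the present application further provides a computer-readable storage medium. The storage medium stores a deep-learning-based face recognition program, which can be executed by one or more processors to implement the following operations:
acquiring face image data from web pages based on crawler technology to form an original face image set;
extracting face features of the original face image set with Gabor filters to obtain a face feature set, and reducing the dimensionality of the face feature set by a downsampling technique to form a face feature vector set;
inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold;
receiving a user face picture, inputting the user face picture into the convolutional neural network for face recognition, and outputting a recognition result.
The specific implementations of the computer-readable storage medium of the present application are substantially the same as the embodiments of the deep-learning-based face recognition device and method described above and are not repeated here.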
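It should be noted that the serial numbers of the above embodiments of the present application are for description only and do not indicate that any embodiment is better or worse than another. The terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, device, article, or method. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article, or method that includes it.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. On this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.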
Claims (20)
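- A face recognition method based on deep learning, characterized in that the method comprises: acquiring face image data from web pages based on crawler technology to form an original face image set; extracting face features of the original face image set with Gabor filters to obtain a face feature set, and reducing the dimensionality of the face feature set by a downsampling technique to form a face feature vector set; inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold; and receiving a user face picture, inputting the user face picture into the convolutional neural network for face recognition, and outputting a recognition result.
- The face recognition method based on deep learning according to claim 1, characterized in that the web pages include web pages of the ORL face database, the Yale face database, the AR face database, and/or the FERET face database.
- The face recognition method based on deep learning according to claim 1 or 2, characterized in that extracting face features of the original face image set with Gabor filters to obtain a face feature set comprises: receiving the original face image set by a Gabor filter bank composed of several Gabor filters; performing, by the Gabor filter bank, a first convolution operation on the pictures in the original face image set in turn to obtain Gabor features; and collecting the Gabor features obtained from each first convolution operation into a set to obtain the face feature set.
- The face recognition method based on deep learning according to claim 3, characterized in that the first convolution operation is: $O_{y,u,v}(x_1,x_2)=M(x_1,x_2)*\varphi_{y,u,v}(z)$, where $O_{y,u,v}(x_1,x_2)$ is the Gabor feature, $M(x_1,x_2)$ is the pixel value at coordinates $(x_1,x_2)$ of a picture in the original face image set, $\varphi_{y,u,v}(z)$ is the convolution function, $z$ is the convolution operator, and $y,u,v$ denote the three components of the picture, $y$ being the luminance and $u,v$ the chrominance.
- The face recognition method based on deep learning according to claim 4, characterized in that the convolutional neural network comprises sixteen convolutional layers, sixteen pooling layers, and one fully connected layer; and inputting the face feature vector set into the pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than the preset threshold, comprises: after receiving the face feature vector set, the convolutional neural network passes the face feature vector set through the sixteen convolutional layers and sixteen pooling layers for a second convolution operation and a max pooling operation, and then into the fully connected layer; the fully connected layer computes a training value in combination with an activation function, the training value is input into the loss function of the model training layer, the loss function computes a loss value, and the loss value is compared with the preset threshold until the loss value is less than the preset threshold, whereupon the convolutional neural network exits training.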
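- A face recognition device based on deep learning, characterized in that the device comprises a memory and a processor, the memory stores a deep-learning-based face recognition program that can run on the processor, and the program, when executed by the processor, implements the following steps: acquiring face image data from web pages based on crawler technology to form an original face image set; extracting face features of the original face image set with Gabor filters to obtain a face feature set, and reducing the dimensionality of the face feature set by a downsampling technique to form a face feature vector set; inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold; and receiving a user face picture, inputting the user face picture into the convolutional neural network for face recognition, and outputting a recognition result.
- The face recognition device based on deep learning according to claim 6, characterized in that the web pages include web pages of the ORL face database, the Yale face database, the AR face database, and/or the FERET face database.
- The face recognition device based on deep learning according to claim 6 or 7, characterized in that extracting face features of the original face image set with Gabor filters to obtain a face feature set comprises: receiving the original face image set by a Gabor filter bank composed of several Gabor filters; performing, by the Gabor filter bank, a first convolution operation on the pictures in the original face image set in turn to obtain Gabor features; and collecting the Gabor features obtained from each first convolution operation into a set to obtain the face feature set.
- The face recognition device based on deep learning according to claim 8, characterized in that the first convolution operation is: $O_{y,u,v}(x_1,x_2)=M(x_1,x_2)*\varphi_{y,u,v}(z)$, where $O_{y,u,v}(x_1,x_2)$ is the Gabor feature, $M(x_1,x_2)$ is the pixel value at coordinates $(x_1,x_2)$ of a picture in the original face image set, $\varphi_{y,u,v}(z)$ is the convolution function, $z$ is the convolution operator, and $y,u,v$ denote the three components of the picture, $y$ being the luminance and $u,v$ the chrominance.
- The face recognition device based on deep learning according to claim 9, characterized in that the convolutional neural network comprises sixteen convolutional layers, sixteen pooling layers, and one fully connected layer; and inputting the face feature vector set into the pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than the preset threshold, comprises: after receiving the face feature vector set, the convolutional neural network passes the face feature vector set through the sixteen convolutional layers and sixteen pooling layers for a second convolution operation and a max pooling operation, and then into the fully connected layer; the fully connected layer computes a training value in combination with an activation function, the training value is input into the loss function of the model training layer, the loss function computes a loss value, and the loss value is compared with the preset threshold until the loss value is less than the preset threshold, whereupon the convolutional neural network exits training.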
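- A computer-readable storage medium, characterized in that the computer-readable storage medium stores a deep-learning-based face recognition program, and the program can be executed by one or more processors to implement the following steps: acquiring face image data from web pages based on crawler technology to form an original face image set; extracting face features of the original face image set with Gabor filters to obtain a face feature set, and reducing the dimensionality of the face feature set by a downsampling technique to form a face feature vector set; inputting the face feature vector set into a pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than a preset threshold; and receiving a user face picture, inputting the user face picture into the convolutional neural network for face recognition, and outputting a recognition result.
- The computer-readable storage medium according to claim 11, characterized in that the web pages include web pages of the ORL face database, the Yale face database, the AR face database, and/or the FERET face database.
- The computer-readable storage medium according to claim 11 or 12, characterized in that extracting face features of the original face image set with Gabor filters to obtain a face feature set comprises: receiving the original face image set by a Gabor filter bank composed of several Gabor filters; performing, by the Gabor filter bank, a first convolution operation on the pictures in the original face image set in turn to obtain Gabor features; and collecting the Gabor features obtained from each first convolution operation into a set to obtain the face feature set.
- The computer-readable storage medium according to claim 13, characterized in that the first convolution operation is: $O_{y,u,v}(x_1,x_2)=M(x_1,x_2)*\varphi_{y,u,v}(z)$, where $O_{y,u,v}(x_1,x_2)$ is the Gabor feature, $M(x_1,x_2)$ is the pixel value at coordinates $(x_1,x_2)$ of a picture in the original face image set, $\varphi_{y,u,v}(z)$ is the convolution function, $z$ is the convolution operator, and $y,u,v$ denote the three components of the picture, $y$ being the luminance and $u,v$ the chrominance.
- The computer-readable storage medium according to claim 14, characterized in that the convolutional neural network comprises sixteen convolutional layers, sixteen pooling layers, and one fully connected layer; and inputting the face feature vector set into the pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than the preset threshold, comprises: after receiving the face feature vector set, the convolutional neural network passes the face feature vector set through the sixteen convolutional layers and sixteen pooling layers for a second convolution operation and a max pooling operation, and then into the fully connected layer; the fully connected layer computes a training value in combination with an activation function, the training value is input into the loss function of the model training layer, the loss function computes a loss value, and the loss value is compared with the preset threshold until the loss value is less than the preset threshold, whereupon the convolutional neural network exits training.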
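- A face recognition system based on deep learning, characterized in that the system comprises: a source data receiving module, used to acquire face image data from web pages based on crawler technology to form an original face image set; a feature extraction module, used to extract face features of the original face image set with Gabor filters to obtain a face feature set, and to reduce the dimensionality of the face feature set by a downsampling technique to form a face feature vector set; a model training module, used to input the face feature vector set into a pre-built convolutional neural network model for training, and to exit training when the loss function value in the convolutional neural network is less than a preset threshold; and a face recognition result output module, used to receive a user face picture, input the user face picture into the convolutional neural network for face recognition, and output a recognition result.
- The face recognition system based on deep learning according to claim 16, characterized in that the web pages include web pages of the ORL face database, the Yale face database, the AR face database, and/or the FERET face database.
- The face recognition system based on deep learning according to claim 16 or 17, characterized in that extracting face features of the original face image set with Gabor filters to obtain a face feature set comprises: receiving the original face image set by a Gabor filter bank composed of several Gabor filters; performing, by the Gabor filter bank, a first convolution operation on the pictures in the original face image set in turn to obtain Gabor features; and collecting the Gabor features obtained from each first convolution operation into a set to obtain the face feature set.
- The face recognition system based on deep learning according to claim 18, characterized in that the first convolution operation is: $O_{y,u,v}(x_1,x_2)=M(x_1,x_2)*\varphi_{y,u,v}(z)$, where $O_{y,u,v}(x_1,x_2)$ is the Gabor feature, $M(x_1,x_2)$ is the pixel value at coordinates $(x_1,x_2)$ of a picture in the original face image set, $\varphi_{y,u,v}(z)$ is the convolution function, $z$ is the convolution operator, and $y,u,v$ denote the three components of the picture, $y$ being the luminance and $u,v$ the chrominance.
- The face recognition system based on deep learning according to claim 19, characterized in that the convolutional neural network comprises sixteen convolutional layers, sixteen pooling layers, and one fully connected layer; and inputting the face feature vector set into the pre-built convolutional neural network model for training, and exiting training when the loss function value in the convolutional neural network is less than the preset threshold, comprises: after receiving the face feature vector set, the convolutional neural network passes the face feature vector set through the sixteen convolutional layers and sixteen pooling layers for a second convolution operation and a max pooling operation, and then into the fully connected layer; the fully connected layer computes a training value in combination with an activation function, the training value is input into the loss function of the model training layer, the loss function computes a loss value, and the loss value is compared with the preset threshold until the loss value is less than the preset threshold, whereupon the convolutional neural network exits training.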
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910658687.0 | 2019-07-19 | ||
CN201910658687.0A CN110516544B (zh) | 2019-07-19 | 2019-07-19 | Face recognition method and device based on deep learning, and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021012494A1 (zh) | 2021-01-28 |
Family ID=68623300
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/116934 WO2021012494A1 (zh) | 2019-07-19 | 2019-11-10 | Face recognition method and device based on deep learning, and computer-readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110516544B (zh) |
WO (1) | WO2021012494A1 (zh) |
- 2019
  - 2019-07-19 CN CN201910658687.0A patent/CN110516544B/zh active Active
  - 2019-11-10 WO PCT/CN2019/116934 patent/WO2021012494A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110516544B (zh) | 2024-04-09 |
CN110516544A (zh) | 2019-11-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19938534; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 19938534; Country of ref document: EP; Kind code of ref document: A1 |