CN107480600A - A kind of gesture identification method based on depth convolutional neural networks - Google Patents
A kind of gesture identification method based on depth convolutional neural networks
- Publication number
- CN107480600A (application CN201710597440.3A)
- Authority
- CN
- China
- Prior art keywords
- convolutional neural
- neural networks
- training
- hand
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/107—Static hand or arm
- G06V40/113—Recognition of static hand signs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
Description
Technical Field
The invention relates to the field of biometric recognition, and in particular to a gesture recognition method based on a deep convolutional neural network.
Background Art
Biometric recognition is one of the key technologies in fields such as video surveillance and security authentication. Biometric traits can be divided into physiological traits and behavioral traits. Physiological traits mainly include the face, fingerprints, and the iris, while behavioral traits include gait, gestures, and so on. Typical recognition methods based on physiological traits are fingerprint recognition, palm shape and contour recognition, face recognition, and iris recognition. Fingerprint recognition is currently one of the most widely used biometric identification methods; the technology is mature and inexpensive, but it is contact-based and therefore intrusive, it raises hygiene concerns, and fingerprints are prone to wear. Face recognition has been a very active research field in recent years; it is intuitive, convenient, friendly, and readily accepted, and, being non-contact and passive, it requires no active cooperation from the subject. Its drawback is that it is easily affected by illumination, viewing angle, occlusion, environment, and expression, which makes recognition difficult. Iris recognition offers very high security and accuracy, but acquiring the iris features is difficult. Among identification technologies based on behavioral traits, gait recognition and gesture recognition are the most common. The input to gait recognition is a sequence of walking video frames; the large data volume leads to high computational complexity and makes processing difficult. Gesture recognition, by contrast, is an important component of contactless human-computer interaction. Most researchers currently concentrate on the final recognition step: the background is usually simplified, the gesture is segmented from that single background by the algorithm under study, and the meaning expressed by the gesture is then determined by the system using common recognition methods.
Summary of the Invention
The object of the present invention is to provide a gesture recognition method based on a deep convolutional neural network that addresses the deficiencies of the prior art.
This object is achieved through the following technical solution: a gesture recognition method based on a deep convolutional neural network, comprising the following steps:
(1) Edge detection and dataset partitioning for the training samples: first perform hand detection and edge detection on the training gesture images and resize the extracted hand images to a uniform size; then divide the preprocessed data into a training sample set and a validation sample set.
(2) Construct the deep convolutional neural network: let I and O be the input and output layers of the network, and let H1, H2, …, Hn be the hidden layers between I and O. The input layer takes the hand images obtained in step (1), the output layer is a gesture feature vector of length N, and the hidden layers use multiple down-sampling in which the down-sampled blocks are allowed to overlap.
(3) Determine the activation function and loss function: select the nonlinear hyperbolic tangent function shown in formula (1) as the neuron activation function and the loss function shown in formula (2), where n is the number of samples in the training set, x is a hand image, y is the gesture feature vector corresponding to x, and θ is the parameter vector (a hedged reconstruction of both formulas is given after this list).
(4) Train the deep neural network: select m training samples from the training sample set and compute the gradient with steepest gradient descent; then validate on the validation sample set, and end training once the accuracy exceeds the preset threshold of 99.5%, yielding a deep neural network with fixed weights w and bias terms b.
(5) Perform gesture recognition with the trained deep convolutional neural network: a) extract the hand image from the gesture data to be recognized; b) apply edge detection and size normalization to the hand image; c) feed the hand image into the deep convolutional neural network and determine the class of the current gesture from the output-layer values.
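Formulas (1) and (2) are cited above but their mathematical markup did not survive extraction. As a hedged reconstruction consistent with the surrounding description — a hyperbolic-tangent activation and a squared-error loss over the n training pairs — the standard forms would be:

```latex
% (1) hyperbolic-tangent activation of a neuron input z
f(z) = \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}

% (2) squared-error loss over the n training pairs (x^{(i)}, y^{(i)}),
%     where h_{\theta}(x) is the network output for parameter vector \theta
J(\theta) = \frac{1}{2n} \sum_{i=1}^{n} \big\lVert h_{\theta}\big(x^{(i)}\big) - y^{(i)} \big\rVert^{2}
```

The normalization constant (1/2n versus 1/n) is an assumption; the patent text does not preserve it.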
Beneficial effects: the gesture recognition method of the present invention constructs the deep convolutional neural network with multiple (overlapping) down-sampling and trains it with the hyperbolic tangent activation function, which improves both the efficiency and the accuracy of gesture recognition.
Brief Description of the Drawings
Figure 1 is the implementation flow of the method;
Figure 2 is a schematic diagram of hand image extraction;
Figure 3 is a schematic diagram of multiple (overlapping) down-sampling;
Figure 4 is the hyperbolic tangent function curve;
Figure 5 gives the error-rate data of the recognition-rate comparison.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings.
As shown in Figure 1, the gesture recognition method based on a deep convolutional neural network is divided into a training stage and a recognition stage.
In the training stage, a deep convolutional neural network is constructed and the values of its weight and bias parameters are determined from the training set. This comprises the following sub-steps:
1. Extract hand images from the training data: for a frame of the gesture interaction, as shown in Figure 2(a), a rough hand-region image is first extracted according to skin-color features, as shown in Figure 2(b); the hand-region image is then corrected by filtering followed by fast morphological dilation and erosion, giving a refined hand image, as shown in Figure 2(c).
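A minimal sketch of this sub-step, assuming OpenCV, a YCrCb skin-color threshold, and a 7×7 elliptical structuring element (the threshold values, filter size, and kernel size are illustrative assumptions, not values stated in the patent):

```python
import cv2
import numpy as np

def extract_hand_region(frame_bgr):
    """Roughly segment the hand by skin color, then clean the mask with
    median filtering and morphological dilation/erosion (closing, then opening)."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    # Commonly used skin-color range in the Cr/Cb channels (assumed values).
    lower = np.array([0, 133, 77], dtype=np.uint8)
    upper = np.array([255, 173, 127], dtype=np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)

    mask = cv2.medianBlur(mask, 5)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)   # dilation then erosion
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)    # erosion then dilation

    return cv2.bitwise_and(frame_bgr, frame_bgr, mask=mask), mask
```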
2. Edge detection and size normalization of the hand image: apply edge detection and binarization to the hand image obtained in sub-step 1 to obtain a preliminary hand-contour image, as shown in Figure 2(d); then trim and refine the hand-contour curve and resize it to the uniform size to obtain the corresponding hand image.
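A hedged sketch of this sub-step, assuming Canny edge detection and a 64×64 target size (the patent names neither the edge detector nor the uniform size):

```python
import cv2

def normalize_hand_contour(hand_bgr, size=(64, 64)):
    """Edge detection and binarization, then resizing to a uniform input size."""
    gray = cv2.cvtColor(hand_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)   # Canny already yields a 0/255 binary edge map
    # Nearest-neighbor resizing keeps the contour image binary.
    return cv2.resize(edges, size, interpolation=cv2.INTER_NEAREST)
```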
3. Construct the deep convolutional neural network: for the hand images obtained in sub-step 2, a deep convolutional neural network is built for gesture recognition. Let I and O be the input and output layers of the network, and let H1, H2, …, Hn be the hidden layers between I and O. The input layer takes the hand image obtained in sub-step 2, the output layer is a gesture feature vector of length N, and the hidden layers use multiple down-sampling in which the down-sampled blocks are allowed to overlap, as shown in Figure 3.
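The PyTorch sketch below illustrates one network of this kind: convolutional hidden layers with hyperbolic-tangent activations and overlapping max-pooling (pooling window larger than the stride, so adjacent pooled blocks overlap), ending in a length-N gesture feature vector. The channel counts, kernel sizes, and N = 10 are assumptions for illustration; the patent does not specify them.

```python
import torch.nn as nn

class GestureNet(nn.Module):
    """Illustrative deep CNN with overlapping down-sampling (window 3, stride 2)."""
    def __init__(self, num_classes: int = 10):   # N = 10 is an assumed class count
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.Tanh(),
            nn.MaxPool2d(kernel_size=3, stride=2),    # overlapping pooling: window 3 > stride 2
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.Tanh(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.Tanh(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(128), nn.Tanh(),        # lazy layer avoids hard-coding the flattened size
            nn.Linear(128, num_classes),          # length-N gesture feature vector
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```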
4. Determine the activation function and loss function: as shown in Figure 4, the nonlinear hyperbolic tangent function is chosen as the neuron activation function, and the squared-error function shown in formula (2) is chosen as the loss function.
5. Train the deep convolutional neural network:
Select m training samples from the training sample set, compute the gradient with steepest gradient descent, and iteratively optimize the parameters of each hidden layer according to the loss function; then validate on the validation sample set, and end training once the accuracy exceeds the preset threshold of 99.5%, yielding a deep neural network with fixed weights w and bias terms b.
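A hedged sketch of this training loop under the assumptions above: mini-batches of m samples, plain SGD as the steepest-gradient-descent step, a squared-error loss against one-hot gesture vectors, and early stopping once validation accuracy exceeds 99.5%. The learning rate, epoch limit, and one-hot target encoding are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train(model, train_loader, val_loader, num_classes=10,
          lr=0.01, max_epochs=100, target_acc=0.995):
    opt = torch.optim.SGD(model.parameters(), lr=lr)   # steepest gradient descent
    loss_fn = nn.MSELoss()                             # squared-error loss, formula (2)
    for _ in range(max_epochs):
        model.train()
        for x, y in train_loader:                      # x: hand images, y: class indices
            opt.zero_grad()
            target = F.one_hot(y, num_classes).float() # gesture feature vector of length N
            loss = loss_fn(model(x), target)
            loss.backward()
            opt.step()
        # Validation: stop once accuracy passes the preset 99.5% threshold.
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for x, y in val_loader:
                correct += (model(x).argmax(dim=1) == y).sum().item()
                total += y.numel()
        if correct / total > target_acc:
            break
    return model
```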
In the recognition stage, the hand image is first extracted from the data to be recognized and is then edge-detected and size-normalized; it is then fed into the trained deep convolutional neural network to determine the class of the current gesture. Finally, the method was compared with a data-glove method and with the sequence similarity detection (SSDA) method; Figure 5 gives the error-rate data from these experiments.
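Putting the stages together, a minimal recognition-stage sketch reusing the hypothetical helpers defined above (extract_hand_region, normalize_hand_contour, and a trained GestureNet):

```python
import torch

def recognize_gesture(frame_bgr, model):
    """Classify one frame: hand extraction, preprocessing, forward pass, argmax."""
    hand, _ = extract_hand_region(frame_bgr)
    contour = normalize_hand_contour(hand)                        # 64x64 binary image
    x = torch.from_numpy(contour).float().div(255).view(1, 1, 64, 64)
    model.eval()
    with torch.no_grad():
        scores = model(x)
    return scores.argmax(dim=1).item()                            # index of the gesture class
```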
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710597440.3A CN107480600A (en) | 2017-07-20 | 2017-07-20 | A kind of gesture identification method based on depth convolutional neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710597440.3A CN107480600A (en) | 2017-07-20 | 2017-07-20 | A kind of gesture identification method based on depth convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107480600A true CN107480600A (en) | 2017-12-15 |
Family
ID=60595146
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710597440.3A Pending CN107480600A (en) | 2017-07-20 | 2017-07-20 | A kind of gesture identification method based on depth convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107480600A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334814A (en) * | 2018-01-11 | 2018-07-27 | 浙江工业大学 | A kind of AR system gesture identification methods based on convolutional neural networks combination user's habituation behavioural analysis |
CN109117742A (en) * | 2018-07-20 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | Gestures detection model treatment method, apparatus, equipment and storage medium |
CN109890573A (en) * | 2019-01-04 | 2019-06-14 | 珊口(上海)智能科技有限公司 | Control method, device, mobile robot and the storage medium of mobile robot |
CN111338470A (en) * | 2020-02-10 | 2020-06-26 | 烟台持久钟表有限公司 | Method for controlling big clock through gestures |
CN112889075A (en) * | 2018-10-29 | 2021-06-01 | Sk电信有限公司 | Improving prediction performance using asymmetric hyperbolic tangent activation function |
CN113703581A (en) * | 2021-09-03 | 2021-11-26 | 广州朗国电子科技股份有限公司 | Window adjusting method based on gesture switching, electronic whiteboard and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106529470A (en) * | 2016-11-09 | 2017-03-22 | 济南大学 | Gesture recognition method based on multistage depth convolution neural network |
US20170161607A1 (en) * | 2015-12-04 | 2017-06-08 | Pilot Ai Labs, Inc. | System and method for improved gesture recognition using neural networks |
- 2017-07-20 CN CN201710597440.3A patent/CN107480600A/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170161607A1 (en) * | 2015-12-04 | 2017-06-08 | Pilot Ai Labs, Inc. | System and method for improved gesture recognition using neural networks |
CN106529470A (en) * | 2016-11-09 | 2017-03-22 | 济南大学 | Gesture recognition method based on multistage depth convolution neural network |
Non-Patent Citations (2)
Title |
---|
LEE的白板报: "卷积神经网络初探" (A First Look at Convolutional Neural Networks), Stanford deep learning notes, https://my.oschina.net/findbill/blog/550565 *
蔡娟 (Cai Juan): "基于卷积神经网络的手势识别" (Gesture Recognition Based on Convolutional Neural Networks), China Master's Theses Full-text Database, Information Science and Technology series *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334814A (en) * | 2018-01-11 | 2018-07-27 | 浙江工业大学 | A kind of AR system gesture identification methods based on convolutional neural networks combination user's habituation behavioural analysis |
CN108334814B (en) * | 2018-01-11 | 2020-10-30 | 浙江工业大学 | A gesture recognition method for AR system |
CN109117742A (en) * | 2018-07-20 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | Gestures detection model treatment method, apparatus, equipment and storage medium |
CN109117742B (en) * | 2018-07-20 | 2022-12-27 | 百度在线网络技术(北京)有限公司 | Gesture detection model processing method, device, equipment and storage medium |
CN112889075A (en) * | 2018-10-29 | 2021-06-01 | Sk电信有限公司 | Improving prediction performance using asymmetric hyperbolic tangent activation function |
CN112889075B (en) * | 2018-10-29 | 2024-01-26 | Sk电信有限公司 | Improved predictive performance using asymmetric hyperbolic tangent activation function |
CN109890573A (en) * | 2019-01-04 | 2019-06-14 | 珊口(上海)智能科技有限公司 | Control method, device, mobile robot and the storage medium of mobile robot |
CN109890573B (en) * | 2019-01-04 | 2022-05-03 | 上海阿科伯特机器人有限公司 | Control method and device for mobile robot, mobile robot and storage medium |
CN111338470A (en) * | 2020-02-10 | 2020-06-26 | 烟台持久钟表有限公司 | Method for controlling big clock through gestures |
CN111338470B (en) * | 2020-02-10 | 2022-10-21 | 烟台持久钟表有限公司 | Method for controlling big clock through gestures |
CN113703581A (en) * | 2021-09-03 | 2021-11-26 | 广州朗国电子科技股份有限公司 | Window adjusting method based on gesture switching, electronic whiteboard and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107480600A (en) | A kind of gesture identification method based on depth convolutional neural networks | |
CN107609497B (en) | Real-time video face recognition method and system based on visual tracking technology | |
CN104463100B (en) | Intelligent wheel chair man-machine interactive system and method based on human facial expression recognition pattern | |
CN105825176B (en) | Identification method based on multimodal contactless identity features | |
CN101777131B (en) | Method and device for identifying human face through double models | |
CN100458832C (en) | Palm grain identification method based on direction character | |
CN102915436B (en) | Sparse representation face recognition method based on intra-class variation dictionary and training image | |
CN100395770C (en) | A Hand Feature Fusion Authentication Method Based on Feature Relationship Measurement | |
CN108427921A (en) | A kind of face identification method based on convolutional neural networks | |
CN100576230C (en) | Similar fingerprint recognition system and method for twins based on local structure | |
Fei et al. | Jointly heterogeneous palmprint discriminant feature learning | |
CN106355138A (en) | Face recognition method based on deep learning and key features extraction | |
CN102332084B (en) | Identity identification method based on palm print and human face feature extraction | |
CN111126307B (en) | Small sample face recognition method combining sparse representation neural network | |
CN102831390A (en) | Human ear authenticating system and method | |
CN103034847B (en) | A kind of face identification method based on hidden Markov model | |
CN110555380A (en) | Finger vein identification method based on Center Loss function | |
CN106446867A (en) | Double-factor palmprint identification method based on random projection encryption | |
CN103440480B (en) | Non-contact palmprint recognition method based on palmprint image registration | |
CN101571924A (en) | Gait recognition method and system with multi-region feature integration | |
CN110490107A (en) | A kind of fingerprint identification technology based on capsule neural network | |
CN105138974A (en) | Gabor coding based finger multimodal feature fusion method | |
CN103984922A (en) | Face identification method based on sparse representation and shape restriction | |
CN108596269A (en) | A kind of recognizer of the plantar pressure image based on SVM+CNN | |
CN107315995B (en) | Face recognition method based on Laplace logarithmic face and convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171215 |
RJ01 | Rejection of invention patent application after publication | |