
CN114821786A - Gait recognition method based on human body contour and key point feature fusion - Google Patents

Gait recognition method based on human body contour and key point feature fusion Download PDF

Info

Publication number
CN114821786A
CN114821786A
Authority
CN
China
Prior art keywords
gait
human body
key point
contour
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210452885.3A
Other languages
Chinese (zh)
Inventor
陈志�
周晨
岳文静
艾虎
王悦
何丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202210452885.3A priority Critical patent/CN114821786A/en
Publication of CN114821786A publication Critical patent/CN114821786A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G06V40/25 Recognition of walking or running movements, e.g. gait recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a gait recognition method based on the fusion of human body contour and key point features, which comprises the following steps: inputting a gait video of a single person walking and obtaining the pedestrian contour sequence in the video; substituting the gait video into an OpenPose algorithm module to obtain a normalized human body key point information sequence, and substituting the pedestrian contour sequence into a GaitSet algorithm module to obtain the features of the gait contour sequence; passing the key point information sequence through a human body key point feature extraction module composed of an LSTM and a CNN; obtaining a gait contour feature vector and a human body key point feature vector respectively; concatenating the gait contour feature vector with the human body key point feature vector and inputting the result into a feature fusion module; and importing the gait fusion features into a fusion network for feature learning to identify the person in the video. The invention extracts features from the contour sequence and the key point sequence with their respective feature extraction modules and then performs feature-layer fusion to obtain gait fusion features, improving the accuracy and robustness of gait recognition.

Description

Gait recognition method based on human body contour and key point feature fusion
Technical Field
The invention belongs to the interdisciplinary field of computer vision, identity recognition and feature fusion, and particularly relates to a gait recognition method based on the fusion of human body contour and key point features.
Background
Gait is one of the human biological and behavioral characteristics; it describes the pattern of motion of the joints of the upper and lower limbs as a person walks, and medicine considers each person's gait to be unique. Human gait features are globally unique, stable over the long term, hard to revoke, easy to collect, hard to mold or disguise, and can be captured without contact, making them among the biometric features currently best suited for wide deployment across multiple fields. Gait recognition extracts physical and behavioral features from an individual's walking pattern for identification. In the high-security field, because gait features are difficult to disguise and imitate, they can complement iris features in hybrid biometric recognition to improve security, bringing stronger guarantees to fields such as finance and the military.
Although each individual's gait is unique, factors such as clothing, carried items and viewing angle pose significant challenges to gait recognition, and various approaches have been proposed to address them. Early work distinguished people by differences in the motion of individual body parts during walking, but this meant modeling numerous body structures, involving a large number of variables and complex computation. Recognition directly from raw RGB images is another research direction, but it faces the challenge of eliminating gait-irrelevant information. With the rapid development of deep learning, two gait recognition approaches, one based on the human body contour and one based on human skeleton key points, have gradually become mainstream.
Contour-based methods can largely avoid interference from irrelevant pixel points in the video sequence, extract features from the contour sequence quickly and effectively, and remain applicable at low resolution. Despite these advantages, they retain only the outer contour of the human body, so the information inside the silhouette, such as the trunk, cannot be exploited during walking. Methods based on human skeleton key points estimate the poses of the people in the video sequence, extract their skeleton key points, and perform recognition on the key point sequence.
The two kinds of features are complementary, and their combination promises a more comprehensive representation of gait; however, past research has not fully exploited the complementary advantages of the contour and the skeleton key points.
Disclosure of Invention
Purpose of the invention: in order to overcome the defects in the prior art, the invention provides a gait recognition method based on the fusion of human body contour and key point features.
The technical scheme is as follows: the invention provides a gait recognition method based on human body contour and key point feature fusion, which comprises the following steps:
acquiring a pedestrian contour sequence in the video based on the input single walking gait video, and substituting the gait video into an OpenPose algorithm module to obtain a normalized human body key point information sequence;
substituting the pedestrian contour sequence into a GaitSet algorithm module to obtain the characteristics of the gait contour sequence; importing the human body key point information sequence into a human body key point feature extraction module to obtain the features of the human body key points;
respectively obtaining a gait contour feature vector and a human body key point feature vector based on the features of the gait contour sequence and the features of the human body key points;
connecting the gait contour feature vector with the human body key point feature vector, and inputting the concatenated vector into a feature fusion module to obtain gait fusion features;
and importing the gait fusion characteristics into a fusion network for characteristic learning, and identifying the identity of the person in the video.
In a further embodiment, the method for inputting the gait video of the single person walking and acquiring the pedestrian contour sequence in the video comprises the following steps:
extracting the human body contour of each frame of the gait video by using a KNN algorithm;
calculating the number of non-zero pixel points in each frame's contour image based on the extracted human body contour, and deciding whether to output the frame according to an image pixel-count threshold;
for each output image, acquiring the index interval between the highest and lowest rows whose pixel sums are non-zero, and cropping the regions above and below the image accordingly to obtain a cropped image;
searching for the median along the x axis of the cropped image, and taking the found median as the x-axis center point of the person in the image;
slicing from the center point to both sides to obtain a 64 × 64 image array;
and converting the image array type to obtain the pedestrian contour sequence.
In a further embodiment, the method for obtaining the normalized human body key point information sequence by substituting the gait video into the OpenPose algorithm module comprises the following steps:
acquiring position coordinates of each key point of a human body in a video based on the gait video;
selecting the position of a neck key point from position coordinates of each key point of a human body in a video as an origin, and normalizing other key points by taking the distance between the neck and the hip as a reference to obtain a normalized human body key point frame sequence;
wherein the normalization formula is:

P'_i = (P_i - P) / D

in the formula, P_i is the position of key point i, P'_i is the normalized position of key point i, P is the position of the neck key point, and D is the distance between the neck and hip key points.
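As a minimal sketch of this normalization in Python, assuming one frame of key points arrives as an array of (x, y) coordinates in OpenPose's BODY_25 ordering (the neck index 1 and mid-hip index 8 are assumptions from that layout, not fixed by the patent):

```python
import numpy as np

NECK, MID_HIP = 1, 8  # assumed OpenPose BODY_25 indices, not fixed by the patent

def normalize_keypoints(kps: np.ndarray) -> np.ndarray:
    """One frame of normalization: P'_i = (P_i - P) / D.

    kps: (J, 2) array of (x, y) key point positions. The neck key point
    becomes the origin and the neck-to-hip distance D the unit length.
    """
    origin = kps[NECK]
    d = np.linalg.norm(kps[MID_HIP] - origin)
    if d == 0:                        # degenerate frame (missing detections)
        return np.zeros_like(kps)
    return (kps - origin) / d
```

Applied frame by frame, this yields the normalized human body key point frame sequence that is fed to the feature extraction module.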
In a further embodiment, the human body key point feature extraction module includes an LSTM module and a CNN module; the key point information sequence is imported into the feature extraction module by passing the human body key point frame sequence through the LSTM module and the CNN module respectively, obtaining the per-frame features of the human body key points.
In a further embodiment, the method for obtaining the human body key point feature vector from the features of the human body key points comprises:
obtaining a feature vector for each frame from the per-frame features of the human body key points, and concatenating the per-frame feature vectors;
and inputting the concatenated feature vectors into the compression module to obtain a 62 × 128-dimensional human body key point feature vector.
In a further embodiment, the LSTM module consists of a fully connected layer and an LSTM layer, with the LSTM feature dimension set to 256;
the CNN module has ten 3 × 3 convolutional layers; the first convolutional layer has 32 filters; one pooling layer is placed between the second and third convolutional layers and another between the fifth and sixth; the second through fourth convolutional layers have 64 filters and the remaining layers 128; the first pooling layer is residually connected to the fourth convolutional layer and the second pooling layer to the seventh; the dimension of the fully connected layer is set to 256;
the feature extraction module further comprises a compression module, which consists of a BN layer, a ReLU layer, a Dropout layer and a 128-dimensional fully connected layer.
In a further embodiment, the gait contour feature vector and the human body key point feature vector are concatenated and then input into the feature fusion module, and the gait fusion features are obtained as follows:
connecting each dimension of the gait contour feature vector with the corresponding dimension of the human body key point feature vector to obtain a connection vector for each dimension;
importing the connection vector of each dimension into the fully connected layer of the feature fusion module to obtain the human gait fusion feature vector;
the feature fusion module introduces a ternary (triplet) loss function for training, expressed as:

L_BA = Σ_{i=1}^{P} Σ_{a=1}^{K} Σ_{p=1, p≠a}^{K} Σ_{j=1, j≠i}^{P} Σ_{n=1}^{K} max(ξ + d_{i,j,a,p,n}, 0)

d_{i,j,a,p,n} = D(f_i^a, f_i^p) - D(f_i^a, f_j^n)

where L_BA denotes the sum of the loss values over the positive and negative samples; d_{i,j,a,p,n} is the difference between the anchor-positive distance and the anchor-negative distance; f_i^a denotes the a-th (anchor) sample of identity i, f_i^p the p-th (positive) sample of identity i, and f_j^n the n-th (negative) sample of identity j; D denotes the distance between two samples; ξ is a threshold parameter set according to actual needs, used to control the gap between the anchor-positive distance and the anchor-negative distance; a, p and n index the anchor, positive and negative samples respectively; i and j index the identities of the positive and negative samples; a batch contains P ids with K samples per id.
In a further embodiment, the method for importing the gait fusion features into the fusion network for feature learning and identifying the person in the video comprises:
calculating the Euclidean distance between the gait fusion feature F_Q and each feature F_G in the fusion network feature library to obtain a distance result for each pair;
and selecting the closest distance result and determining the recognition result from the feature associated with it, thereby completing the identification of the person in the video.
In a second aspect, the invention provides a processing device comprising a memory and a processor; the memory stores a computer program that, when executed by the processor, implements the gait recognition method based on the fusion of human body contour and key point features.
In a third aspect, the invention provides a readable storage medium storing a computer program that, when executed by a processor, performs the steps of the method described above.
Beneficial effects: compared with the prior art, the invention has the following advantages:
(1) The method extracts the person contour sequence and the skeleton key point sequence from the gait video using a morphological method and the OpenPose algorithm respectively, and uses them as the pedestrian's initial walking feature representation.
(2) The invention introduces the GaitSet algorithm to extract features from the contour sequence, and extracts temporal features from the skeleton key point sequence through the combination of an LSTM network and a CNN network.
(3) The invention obtains the fused gait features with a feature-layer fusion method and, through the learning of a fusion network, obtains a comprehensive feature expression for each person, effectively improving the accuracy and reliability of gait recognition.
Drawings
Fig. 1 is a general method flow diagram.
Fig. 2 is an architecture diagram of a gait feature extraction network based on feature fusion.
Detailed Description
To give a fuller understanding of the technical content of the present invention, the technical solution is further described and illustrated below with reference to specific embodiments, without being limited thereto.
As shown in fig. 1 and 2, a gait recognition method based on human body contour and key point feature fusion includes the following steps:
step 1) inputting a gait video of single walking to acquire a pedestrian contour sequence in the video
Wherein in the process of acquiring the pedestrian contour sequence in the video, the pedestrian contour sequence is cut to remove invalid pixel points, and the specific steps are as follows:
step 11) extracting the human body contour of each frame of image by using a KNN algorithm;
step 12) calculating the number of non-zero pixel points of each frame's contour image; if the count is less than 10000, no image information is returned for that frame;
step 13) obtaining the highest and lowest indexes of rows whose pixel sums are non-zero, and cropping the regions above and below the image;
step 14) obtaining the median along the x axis and regarding it as the person's x center; if the median of the image cannot be found, no image information is returned;
step 15) slicing from the center point to both sides to obtain a 64 × 64 image array; if the slice exceeds the image range, translating it by padding both sides with all-zero arrays;
and step 16) converting the image array type and returning it to obtain the cropped pedestrian contour sequence, as sketched below.
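A minimal sketch of steps 12) to 16) in Python, under two assumptions not stated in the patent: the per-frame mask comes from OpenCV's KNN background subtractor (step 11), and the vertical crop is resized to 64 rows before the horizontal slice, since the patent leaves the height normalization implicit:

```python
import cv2
import numpy as np

# step 11) would produce `frame` with, for example:
#   subtractor = cv2.createBackgroundSubtractorKNN()
#   frame = subtractor.apply(bgr_frame)

def preprocess_silhouette(frame: np.ndarray, min_pixels: int = 10000,
                          out: int = 64):
    """Steps 12)-16): pixel-count check, vertical crop, x-median
    centering, and a 64x64 slice padded with all-zero columns."""
    if np.count_nonzero(frame) < min_pixels:         # step 12): too sparse
        return None

    ys = np.where(frame.sum(axis=1) != 0)[0]         # step 13): non-zero rows
    frame = frame[ys.min():ys.max() + 1]

    # Resizing the vertical crop to 64 rows is an assumption; the patent
    # does not state how the cropped height becomes 64.
    scale = out / frame.shape[0]
    frame = cv2.resize(frame, (max(1, int(frame.shape[1] * scale)), out))

    xs = np.where(frame.sum(axis=0) != 0)[0]         # step 14): x-axis median
    if xs.size == 0:
        return None                                  # no median: drop frame
    cx = int(np.median(xs))

    half = out // 2                                  # step 15): slice to both
    frame = np.pad(frame, ((0, 0), (half, half)))    # sides, all-zero padding
    frame = frame[:, cx:cx + out]                    # when the slice overruns

    return frame.astype(np.float32) / 255.0         # step 16): type conversion
```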
Step 2) substituting the gait video into the OpenPose algorithm module to obtain the normalized human body key point information sequence, and substituting the pedestrian contour sequence into the GaitSet algorithm module to obtain the features of the gait contour sequence;
the steps of obtaining the normalized human body key point information sequence are as follows:
step 21) obtaining position coordinates of each key point of a human body in the video;
step 22) among the detected human body key points, the neck and the hip are relatively stable key points; the position of the neck key point is selected as the origin;
step 23) normalizing the other key points with the neck-to-hip distance as the reference, the normalization formula being:

P'_i = (P_i - P) / D

where P_i is the position of key point i, P'_i is the normalized position of key point i, P is the position of the neck key point, and D is the distance between the neck and hip key points.
Secondly, the pedestrian contour feature extraction comprises the following steps:
step 24) inputting the pedestrian contour sequence obtained in step 1) into the GaitSet network to obtain the pedestrian contour features;
step 25) calculating with the GaitSet network to obtain a 62 × 128-dimensional feature vector.
Step 3) extracting the features of the human body key points from the key point information sequence with the human body key point feature extraction module composed of an LSTM and a CNN;
step 4, based on the characteristics of the human key points, the method for obtaining the characteristic vectors of the human key points comprises the following steps:
step 41) respectively transmitting the obtained human body key point frame sequence into an LSTM module and a CNN module to obtain the characteristics of each frame of the human body key point and connecting the obtained characteristic vectors of each frame;
and 42) inputting the connected feature vectors into the compressed block to obtain 62 gamma 128-dimensional human key point feature vectors.
The human body key point feature extraction module consists of an LSTM network for extracting temporal information, a CNN network for extracting spatial information, and a compression module.
The LSTM module consists of a fully connected layer and an LSTM layer, with the LSTM feature dimension set to 256.
The CNN module has ten 3 × 3 convolutional layers. The first convolutional layer has 32 filters; one pooling layer is placed between the second and third convolutional layers and another between the fifth and sixth; the second through fourth convolutional layers have 64 filters and the remaining layers 128. Borrowing the residual-connection idea of ResNet, the first pooling layer is residually connected to the fourth convolutional layer, and the second pooling layer to the seventh. The dimension of the fully connected layer is set to 256.
To prevent overfitting, a compression module is provided, consisting of a BN layer, a ReLU layer, a Dropout layer and a 128-dimensional fully connected layer. A sketch of the whole key point branch follows.
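The PyTorch sketch below follows the stated layer counts. Several details are assumptions: how the key point frame sequence is arranged into the CNN's 2-D input (a single-channel T × 2J map here), reading the temporal feature from the LSTM's last step, and the output shape (the 62 × 128-dimensional result, which matches GaitSet's 62 horizontal strips, is simplified to a single 128-dimensional vector):

```python
import torch
import torch.nn as nn

class KeypointFeatureExtractor(nn.Module):
    """LSTM + CNN branch for the human body key point sequence.

    Layer counts follow the patent: a fully connected layer + LSTM with
    feature dimension 256; ten 3x3 conv layers with 32/64/128 filters,
    pooling after layers 2 and 5, residual connections pool1->conv4 and
    pool2->conv7; a compression module of BN, ReLU, Dropout and a
    128-dim fully connected layer. Input packing and the final single
    128-dim vector (instead of 62 x 128) are simplifying assumptions.
    """

    def __init__(self, num_joints: int = 25, hidden: int = 256,
                 out_dim: int = 128, p_drop: float = 0.5):
        super().__init__()
        in_feats = num_joints * 2                      # (x, y) per joint

        # LSTM module: fully connected layer followed by an LSTM layer.
        self.fc_in = nn.Linear(in_feats, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)

        def conv(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                                 nn.ReLU())

        # CNN module: ten 3x3 conv layers.
        self.c1, self.c2 = conv(1, 32), conv(32, 64)
        self.pool1 = nn.MaxPool2d(2)
        self.c3, self.c4 = conv(64, 64), conv(64, 64)
        self.c5 = conv(64, 128)
        self.pool2 = nn.MaxPool2d(2)
        self.c6, self.c7 = conv(128, 128), conv(128, 128)
        self.c8, self.c9, self.c10 = (conv(128, 128), conv(128, 128),
                                      conv(128, 128))
        self.cnn_fc = nn.Linear(128, hidden)

        # Compression module: BN + ReLU + Dropout + 128-dim FC.
        self.compress = nn.Sequential(
            nn.BatchNorm1d(2 * hidden), nn.ReLU(),
            nn.Dropout(p_drop), nn.Linear(2 * hidden, out_dim))

    def forward(self, kps: torch.Tensor) -> torch.Tensor:
        b, t, j, _ = kps.shape                         # kps: (B, T, J, 2)
        seq = kps.reshape(b, t, j * 2)

        lstm_out, _ = self.lstm(torch.relu(self.fc_in(seq)))
        f_time = lstm_out[:, -1]                       # (B, 256) temporal feature

        x = seq.unsqueeze(1)                           # (B, 1, T, 2J) spatial map
        x = self.pool1(self.c2(self.c1(x)))
        x = x + self.c4(self.c3(x))                    # residual: pool1 -> conv4
        x = self.pool2(self.c5(x))
        x = x + self.c7(self.c6(x))                    # residual: pool2 -> conv7
        x = self.c10(self.c9(self.c8(x)))
        f_space = self.cnn_fc(x.mean(dim=(2, 3)))      # (B, 256) spatial feature

        return self.compress(torch.cat([f_time, f_space], dim=1))
```

For example, a batch of 30-frame sequences of 25 key points, torch.rand(4, 30, 25, 2), maps to a (4, 128) feature.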
Step 5) concatenating the gait contour feature vector with the human body key point feature vector and inputting the result into the feature fusion module; the gait fusion features are obtained as follows:
step 51) connecting each dimension of the 62 × 128-dimensional gait contour feature vector with the corresponding dimension of the 62 × 128-dimensional human body key point feature vector to obtain a connection vector for each dimension;
step 52) importing the connection vector of each dimension into the fully connected layer of the feature fusion module to obtain the human gait fusion feature vector, as sketched below.
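A minimal sketch of steps 51) and 52) in PyTorch, assuming both feature vectors arrive as (batch, 62, 128) tensors; the output width of the fully connected layer is an assumption, since the patent names the layer but not its size:

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Per-dimension fusion: each of the 62 strip dimensions concatenates
    its 128-dim contour and key point vectors (step 51) and passes the
    256-dim connection vector through a fully connected layer (step 52).
    The 128-dim output width is an assumption."""

    def __init__(self, feat: int = 128):
        super().__init__()
        self.fc = nn.Linear(2 * feat, feat)

    def forward(self, f_contour: torch.Tensor, f_keypoint: torch.Tensor):
        # f_contour, f_keypoint: (B, 62, 128) each
        connected = torch.cat([f_contour, f_keypoint], dim=-1)  # (B, 62, 256)
        return self.fc(connected)                               # (B, 62, 128)
```

Because torch.nn.Linear acts on the last dimension, the same fully connected layer is shared across the 62 strip dimensions.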
the feature fusion module introduces a ternary (triplet) loss function for training, expressed as:

L_BA = Σ_{i=1}^{P} Σ_{a=1}^{K} Σ_{p=1, p≠a}^{K} Σ_{j=1, j≠i}^{P} Σ_{n=1}^{K} max(ξ + d_{i,j,a,p,n}, 0)

d_{i,j,a,p,n} = D(f_i^a, f_i^p) - D(f_i^a, f_j^n)

where L_BA denotes the sum of the loss values over the positive and negative samples; d_{i,j,a,p,n} is the difference between the anchor-positive distance and the anchor-negative distance; f_i^a denotes the a-th (anchor) sample of identity i, f_i^p the p-th (positive) sample of identity i, and f_j^n the n-th (negative) sample of identity j; D denotes the distance between two samples; ξ is a threshold parameter set according to actual needs, used to control the gap between the anchor-positive distance and the anchor-negative distance; a, p and n index the anchor, positive and negative samples respectively; i and j index the identities of the positive and negative samples; a batch contains P ids with K samples per id.
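Assuming the Batch-All form above, a compact PyTorch sketch of the ternary loss is given below; the margin value 0.2 stands in for the threshold parameter ξ and is an assumption, not a value stated in the patent:

```python
import torch

def batch_all_triplet_loss(feats: torch.Tensor, labels: torch.Tensor,
                           margin: float = 0.2) -> torch.Tensor:
    """Batch-All ternary (triplet) loss over a batch of P ids x K samples.

    feats: (N, D) fused gait features; labels: (N,) identity ids.
    `margin` plays the role of the threshold parameter xi; its value
    here is an assumption. The loss is averaged over active triplets.
    """
    dist = torch.cdist(feats, feats)                    # D(., .), shape (N, N)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # same-identity mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=feats.device)

    pos_mask = (same & ~eye).unsqueeze(2)               # valid (anchor, positive)
    neg_mask = (~same).unsqueeze(1)                     # valid (anchor, negative)

    # loss[a, p, n] = max(margin + D(a, p) - D(a, n), 0)
    loss = (margin + dist.unsqueeze(2) - dist.unsqueeze(1)).clamp(min=0)
    loss = loss * (pos_mask & neg_mask)                 # zero out invalid triplets
    active = (loss > 0).sum().clamp(min=1)              # count non-zero triplets
    return loss.sum() / active
```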
Step 6) importing the gait fusion features into the fusion network for feature learning; the identity of the person in the video is recognized as follows:
step 61) calculating the Euclidean distance between the gait fusion feature F_Q and each feature F_G in the fusion network feature library to obtain a distance result for each pair;
step 62) selecting the closest distance result and determining the recognition result from the feature associated with it, thereby completing the identification of the person in the video, as sketched below.
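A minimal retrieval sketch in Python, assuming the fusion network feature library is a plain in-memory mapping from person id to a stored fusion feature flattened to a 1-D vector (the patent does not specify the library's form):

```python
import numpy as np

def identify(f_q: np.ndarray, gallery: dict) -> str:
    """Steps 61)-62): Euclidean distance between the query fusion feature
    F_Q and every feature F_G in the (hypothetical, in-memory) feature
    library, then the identity of the nearest feature is returned.
    """
    ids = list(gallery)                                 # gallery: {id: F_G}
    dists = [np.linalg.norm(f_q - gallery[pid]) for pid in ids]
    return ids[int(np.argmin(dists))]                   # closest match wins
```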
In summary, the method obtains a gait video sequence, extracts a human body contour sequence and a human body key point sequence from it, extracts features from each with the human body contour feature extraction module and the human body key point feature extraction module respectively, performs feature-layer fusion to obtain the gait fusion features, learns the features through a fusion network, and introduces a ternary loss function for training, thereby accurately identifying the person in the video.
Embodiment 2 provides a processing apparatus comprising a memory and a processor; the memory stores a computer program that, when executed by the processor, implements the following gait recognition method based on the fusion of human body contour and key point features:
acquiring a pedestrian contour sequence in the video based on the input single walking gait video, and substituting the gait video into an OpenPose algorithm module to obtain a normalized human body key point information sequence;
substituting the pedestrian contour sequence into a GaitSet algorithm module to obtain the characteristics of the gait contour sequence; importing the human body key point information sequence into a human body key point feature extraction module to obtain the features of the human body key points;
respectively obtaining a gait contour feature vector and a human body key point feature vector based on the features of the gait contour sequence and the features of the human body key points;
connecting the gait contour feature vector with the human body key point feature vector, and inputting the concatenated vector into a feature fusion module to obtain gait fusion features;
and importing the gait fusion characteristics into a fusion network for characteristic learning, and identifying the identity of the person in the video.
Embodiment 3 provides a readable storage medium on which a computer program is stored which, when executed by a processor, implements the steps of the above-described method.
In conclusion, the person contour sequence and the skeleton key point sequence in the gait video are extracted with a morphological method and the OpenPose algorithm respectively and used as the pedestrian's initial walking feature representation; the GaitSet algorithm is introduced to extract features from the contour sequence, and temporal features are extracted from the skeleton key point sequence through the combination of an LSTM network and a CNN network; feature-layer fusion yields the fused gait features, and learning through the fusion network produces a comprehensive feature expression for each person, effectively improving the accuracy and reliability of gait recognition.
Embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (10)

1. A gait recognition method based on human body contour and key point feature fusion is characterized by comprising the following steps:
acquiring a pedestrian contour sequence in the video based on the input single walking gait video, and substituting the gait video into an OpenPose algorithm module to obtain a normalized human body key point information sequence;
substituting the pedestrian contour sequence into a GaitSet algorithm module to obtain the characteristics of the gait contour sequence; importing the human body key point information sequence into a human body key point feature extraction module to obtain the features of the human body key points;
respectively obtaining a gait contour feature vector and a human body key point feature vector based on the features of the gait contour sequence and the features of the human body key points;
connecting the gait contour feature vector with the human body key point feature vector, and inputting the concatenated vector into a feature fusion module to obtain gait fusion features;
and importing the gait fusion characteristics into a fusion network for characteristic learning, and identifying the identity of the person in the video.
2. The gait recognition method based on the fusion of the human body contour and the key point features as claimed in claim 1, wherein the process of acquiring the pedestrian contour sequence in the video based on the input single-person walking gait video further comprises the steps of cutting the pedestrian contour sequence, and the method of acquiring the cut pedestrian contour sequence comprises the steps of:
extracting the human body contour of each frame of the gait video by using a KNN algorithm;
calculating the number of non-zero pixel points in each frame's contour image based on the extracted human body contour, and deciding whether to output the frame according to an image pixel-count threshold;
for each output image, acquiring the index interval between the highest and lowest rows whose pixel sums are non-zero, and cropping the regions above and below the image accordingly to obtain a cropped image;
searching for the median along the x axis of the cropped image, and taking the found median as the x-axis center point of the person in the image;
slicing from the center point to both sides to obtain a 64 × 64 image array;
and converting the image array type to obtain the pedestrian contour sequence.
3. The gait recognition method based on human body contour and key point feature fusion as claimed in claim 1, wherein the method for obtaining the normalized human body key point information sequence by substituting the gait video into the OpenPose algorithm module comprises:
based on the gait video, acquiring position coordinates of each key point of a human body in the video;
selecting the position of a neck key point from position coordinates of each key point of a human body in a video as an origin, and normalizing other key points by taking the distance between the neck and the hip as a reference to obtain a normalized human body key point frame sequence;
wherein the normalization formula is:

P'_i = (P_i - P) / D

in the formula, P_i is the position of key point i, P'_i is the normalized position of key point i, P is the position of the neck key point, and D is the distance between the neck and hip key points.
4. The gait recognition method based on human body contour and key point feature fusion of claim 1, wherein the human body key point feature extraction module comprises an LSTM module and a CNN module; the key point information sequence is imported into the feature extraction module by passing the human body key point frame sequence through the LSTM module and the CNN module respectively, obtaining the per-frame features of the human body key points.
5. The gait recognition method based on the fusion of the human body contour and the key point features as claimed in claim 1, wherein the method for obtaining the human body key point feature vector based on the features of the human body key point comprises:
obtaining a feature vector for each frame from the per-frame features of the human body key points, and concatenating the per-frame feature vectors;
and inputting the concatenated feature vectors into the compression module to obtain the 62 × 128-dimensional human body key point feature vector.
6. The gait recognition method based on the fusion of the human body contour and the key point features according to claim 4, characterized in that the LSTM module consists of a fully connected layer and an LSTM layer, and the LSTM feature dimension is set to 256;
the CNN module has ten 3 × 3 convolutional layers; the first convolutional layer has 32 filters; one pooling layer is placed between the second and third convolutional layers and another between the fifth and sixth; the second through fourth convolutional layers have 64 filters and the remaining layers 128; the first pooling layer is residually connected to the fourth convolutional layer and the second pooling layer to the seventh; the dimension of the fully connected layer is set to 256;
the feature extraction module further comprises a compression module, which consists of a BN layer, a ReLU layer, a Dropout layer and a 128-dimensional fully connected layer.
7. The gait recognition method based on human body contour and key point feature fusion of claim 1, characterized in that the gait contour feature vector and the human body key point feature vector are concatenated and then input into the feature fusion module, and the gait fusion features are obtained as follows:
connecting each dimension of the gait contour feature vector with the corresponding dimension of the human body key point feature vector to obtain a connection vector for each dimension;
importing the connection vector of each dimension into the fully connected layer of the feature fusion module to obtain the human gait fusion feature vector;
the feature fusion module introduces a ternary (triplet) loss function for training, expressed as:

L_BA = Σ_{i=1}^{P} Σ_{a=1}^{K} Σ_{p=1, p≠a}^{K} Σ_{j=1, j≠i}^{P} Σ_{n=1}^{K} max(ξ + d_{i,j,a,p,n}, 0)

d_{i,j,a,p,n} = D(f_i^a, f_i^p) - D(f_i^a, f_j^n)

where L_BA denotes the sum of the loss values over the positive and negative samples; d_{i,j,a,p,n} is the difference between the anchor-positive distance and the anchor-negative distance; f_i^a denotes the a-th (anchor) sample of identity i, f_i^p the p-th (positive) sample of identity i, and f_j^n the n-th (negative) sample of identity j; D denotes the distance between two samples; ξ is a threshold parameter set according to actual needs, used to control the gap between the anchor-positive distance and the anchor-negative distance; a, p and n index the anchor, positive and negative samples respectively; i and j index the identities of the positive and negative samples; a batch contains P ids with K samples per id.
8. The gait recognition method based on the human body contour and the key point feature fusion of claim 1, characterized in that the gait fusion features are imported into a fusion network for feature learning, and the method for recognizing the identity of the person in the video comprises:
calculating the Euclidean distance between the gait fusion feature F_Q and each feature F_G in the fusion network feature library to obtain a distance result for each pair;
and selecting the closest distance result and determining the recognition result from the feature associated with it, thereby completing the identification of the person in the video.
9. A processing apparatus comprising a memory and a processor, wherein the memory stores a computer program which is executed by the processor to implement the gait recognition method based on the fusion of the body contour and the key point feature according to any one of claims 1 to 8.
10. A readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN202210452885.3A 2022-04-27 2022-04-27 Gait recognition method based on human body contour and key point feature fusion Pending CN114821786A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210452885.3A CN114821786A (en) 2022-04-27 2022-04-27 Gait recognition method based on human body contour and key point feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210452885.3A CN114821786A (en) 2022-04-27 2022-04-27 Gait recognition method based on human body contour and key point feature fusion

Publications (1)

Publication Number Publication Date
CN114821786A true CN114821786A (en) 2022-07-29

Family

ID=82508944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210452885.3A Pending CN114821786A (en) 2022-04-27 2022-04-27 Gait recognition method based on human body contour and key point feature fusion

Country Status (1)

Country Link
CN (1) CN114821786A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170243058A1 (en) * 2014-10-28 2017-08-24 Watrix Technology Gait recognition method based on deep learning
CN111428658A (en) * 2020-03-27 2020-07-17 大连海事大学 Gait recognition method based on modal fusion
CN111950418A (en) * 2020-08-03 2020-11-17 启航汽车有限公司 Gait recognition method, device and system based on leg features and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张秋红; 苏锦; 杨新锋: "Simulation Research on Gait Recognition Based on Feature Fusion and Neural Network" (基于特征融合和神经网络对步态识别仿真研究), 计算机仿真 (Computer Simulation), no. 08, 15 August 2012 (2012-08-15) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476198A (en) * 2020-04-24 2020-07-31 广西安良科技有限公司 Gait recognition method, device and system based on artificial intelligence, storage medium and server
CN111476198B (en) * 2020-04-24 2023-09-26 广西安良科技有限公司 Gait recognition method, device, system, storage medium and server based on artificial intelligence
CN116665309A (en) * 2023-07-26 2023-08-29 山东睿芯半导体科技有限公司 Method, device, chip and terminal for identifying walking gesture features
CN116665309B (en) * 2023-07-26 2023-11-14 山东睿芯半导体科技有限公司 Method, device, chip and terminal for identifying walking gesture features

Similar Documents

Publication Publication Date Title
CN106815566B (en) Face retrieval method based on multitask convolutional neural network
CN110659589B (en) Pedestrian re-identification method, system and device based on attitude and attention mechanism
CN109934195A (en) A kind of anti-spoofing three-dimensional face identification method based on information fusion
CN111783748A (en) Face recognition method and device, electronic equipment and storage medium
CN113449704B (en) Face recognition model training method and device, electronic equipment and storage medium
CN112801054B (en) Face recognition model processing method, face recognition method and device
CN108537181A (en) A kind of gait recognition method based on the study of big spacing depth measure
CN111310668A (en) Gait recognition method based on skeleton information
CN104834905A (en) Facial image identification simulation system and method
CN111985332B (en) Gait recognition method of improved loss function based on deep learning
CN114821786A (en) Gait recognition method based on human body contour and key point feature fusion
CN111444488A (en) Identity authentication method based on dynamic gesture
Badave et al. Evaluation of person recognition accuracy based on openpose parameters
CN112541421B (en) Pedestrian reloading and reloading recognition method for open space
CN117333908A (en) Cross-modal pedestrian re-recognition method based on attitude feature alignment
CN110909678B (en) Face recognition method and system based on width learning network feature extraction
CN116311377A (en) Method and system for re-identifying clothing changing pedestrians based on relationship between images
CN115100684A (en) Clothes-changing pedestrian re-identification method based on attitude and style normalization
CN114429646A (en) Gait recognition method based on deep self-attention transformation network
CN114445691A (en) Model training method and device, electronic equipment and storage medium
CN117854155A (en) Human skeleton action recognition method and system
CN111444374B (en) Human body retrieval system and method
Pundir et al. A Review of Deep Learning Approaches for Human Gait Recognition
CN114863555B (en) 3D bone point action recognition method based on space-time multi-residual image convolution
CN114639116B (en) Pedestrian re-recognition method and device, storage medium and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination