CN113392740B - Pedestrian re-identification system based on dual attention mechanism - Google Patents
Pedestrian re-identification system based on dual attention mechanism
- Publication number
- CN113392740B (application number CN202110618743.5A)
- Authority
- CN
- China
- Prior art keywords
- layer
- convolutional
- attention mechanism
- convolution
- size
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention belongs to the technical field of image processing and particularly relates to a pedestrian re-identification system based on a dual attention mechanism. An attention mechanism comprising a channel attention mechanism and a spatial attention mechanism is introduced into the strongbaseline network. The channel attention mechanism compresses the feature map along the spatial dimension so that the model focuses on key channels, while the spatial attention mechanism highlights semantically important pixels by aggregating similar features across all channels. The essence of the attention mechanism is to assign weight coefficients to the image feature information, emphasizing positions useful for the learning target and suppressing irrelevant information. Inserting the attention mechanism into the person re-identification model alleviates problems such as camera angle, body posture change, body misalignment and image diversity, improves the feature extraction capability of the network model without significantly increasing the amount of computation or the number of parameters, and thus improves network performance.
Description
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a pedestrian re-identification system based on a dual attention mechanism.
Background
In recent years, researchers have conducted extensive research into person re-identification, which aims to verify the identity of a pedestrian across image sequences captured by non-overlapping cameras; it has many applications in public-safety video surveillance and is of great practical significance for security and criminal investigation. With the development of deep learning, convolutional neural networks have been applied successfully to person re-identification. These methods achieve good results when the background is relatively simple and the setting is relatively fixed. In many real-life scenarios, however, conditions are more complex, and person re-identification remains a challenging task because of scene and viewpoint changes such as spatial misalignment, background interference and changes in pedestrian pose. A conventional convolutional neural network cannot adaptively focus on the useful channels and regions of the feature map, which limits the accuracy of pedestrian re-identification.
Disclosure of Invention
Aiming at the defects of the prior art and in order to obtain higher accuracy, the invention provides a pedestrian re-identification system based on a dual attention mechanism. With its combined channel and spatial attention, the system focuses on important features while suppressing unnecessary ones, and improves the feature extraction capability of the network model without significantly increasing the amount of computation or the number of parameters.
The invention adopts the following technical scheme:
a pedestrian re-identification system based on a double attention mechanism introduces an attention mechanism in a strongbaseline network, and comprises a channel attention mechanism and a space attention mechanism, wherein the channel attention mechanism can promote a model to concentrate on a key channel by compressing in a space dimension; the spatial attention mechanism may highlight semantic pixels by aggregating similar features of all channels; the essence of the attention mechanism is to emphasize important positions useful for learning the target and suppress irrelevant information by assigning a weight coefficient to image feature information.
A pedestrian re-identification system based on a double attention mechanism is characterized in that a double attention mechanism module is inserted on the basis of a strongbaseline network; the structure is as follows:
the first layer is a convolution layer, the second layer is a normalization layer, the third layer is an activation function layer, and the fourth layer is a pooling layer, followed by a Stage structure formed by Stage1, Stage2, Stage3 and Stage4; wherein:
inserting a dual attention module behind the third layer of the first branch in the Conv Block of Stage1, and inserting a dual attention module behind the third convolutional layer in each Identity Block of Stage 1;
Inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage2, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 2;
inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage3, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 3;
inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage4, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 4;
and finally a pooling layer, a normalization layer, a full connection layer and a SoftMax classifier are arranged in sequence.
The method for constructing the channel attention mechanism in the dual attention mechanism module comprises the following specific steps:
Step one: respectively carry out average pooling and maximum pooling on the feature map F produced by the block at the insertion position of the dual attention mechanism module, obtaining two C-dimensional pooled feature maps F_avg^c and F_max^c;
step two: will be provided withAndsending the data into a multilayer sensor MLP comprising a hidden layer to obtain two channel attention diagrams with the size of 1 × C; wherein, in order to reduce the number of parameters, the hidden layer of MLP The number of the neurons is C/r, and r is a compression ratio;
step three: and adding corresponding elements of the two channel attention diagrams obtained through the multilayer perceptron MLP, then performing an activation function, wherein the activation function adopts a Sigmoid activation function to obtain a final channel attention mechanism Mc (F), and applying Mc (F) to the feature diagram F to obtain a final channel attention diagram F'.
The spatial attention mechanism in the dual attention mechanism module is constructed by the following specific steps:
Step one: for the channel-refined feature map F', first carry out maximum pooling and average pooling along the channel direction to obtain two two-dimensional feature maps F_avg^s and F_max^s, both of size 1 × H × W, and splice (concat) the two feature maps along the channel dimension to obtain the spliced feature map;
step two: and generating a spatial attention mechanism Ms (F ') by using the spliced feature map through a convolution layer with a convolution kernel size of 7 x 7, and applying Ms (F') to the feature map F 'to obtain a final spatial attention map F'.
The pedestrian re-identification system based on the dual attention mechanism has the specific structure that:
the first layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 7 x 7, the second layer is a normalization layer, the third layer is an activation function layer, the activation function adopts a Relu activation function, the fourth layer is a pooling layer, the maximum pooling is adopted, and the pooling size is 3 x 3;
Next, Stage structure comprising Stage1, Stage2, Stage3, Stage 4; wherein:
stage1 consists of a Conv Block and 2 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 64, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 64, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 256, and each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the obtained characteristic graphs to obtain a new input characteristic graph; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 1 x 1, the second layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 x 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage2 consists of a Conv Block and 3 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 128, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 128, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 512, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 512, and each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage3 consists of a Conv Block and 5 Identity blocks, where the Conv Block comprises two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 1024, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a layer of convolutional layers, the number of convolutional cores is 1024, each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 1024, the size of each convolution kernel is 1 × 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage4 consists of a Conv Block and 2 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is convolutional layer, the number of convolutional cores is 512, each convolutional core size is 1 × 1, the second layer is convolutional layer, the number of convolutional cores is 512, each convolutional core size is 3 × 3, the third layer is convolutional layer, the number of convolutional cores is 2048, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is convolutional layer, the number of convolutional cores is 2048, each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is convolution layers, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, the second layer is convolution layers, the number of convolution kernels is 512, the size of each convolution kernel is 3 × 3, the third layer is convolution layers, the number of convolution kernels is 2048, the size of each convolution kernel is 1 × 1, and BN layers are added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
And sequentially passing the obtained feature graph through a pooling layer, a normalization layer, a full connection layer and a SoftMax classifier, and classifying by the SoftMax classifier according to the features to obtain the category of the image.
And the pooling layer adopts global average pooling, and the pooling size is 3 x 3.
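To illustrate where the dual attention module sits in the blocks just described, the sketch below assembles one Identity Block (1 × 1, 3 × 3 and 1 × 1 convolutions, each followed by a BN layer) with the attention module behind the third convolutional layer and a fusion with the incoming features. It reuses the ChannelAttention and SpatialAttention sketches given above, treats the fusion as an element-wise residual addition, and is an illustration rather than the patent's own code.

```python
import torch
import torch.nn as nn

class IdentityBlockDA(nn.Module):
    """Identity Block of a Stage: three conv+BN layers, a dual attention module after
    the third convolution, and fusion with the previous block's feature map."""
    def __init__(self, channels: int, mid_channels: int):
        super().__init__()
        def conv_bn(cin, cout, k):
            return nn.Sequential(nn.Conv2d(cin, cout, k, padding=k // 2, bias=False),
                                 nn.BatchNorm2d(cout))
        self.conv1 = conv_bn(channels, mid_channels, 1)      # e.g. 64 kernels of size 1 x 1 in Stage1
        self.conv2 = conv_bn(mid_channels, mid_channels, 3)  # e.g. 64 kernels of size 3 x 3
        self.conv3 = conv_bn(mid_channels, channels, 1)      # e.g. 256 kernels of size 1 x 1
        # Dual attention module inserted behind the third convolutional layer.
        self.attention = nn.Sequential(ChannelAttention(channels), SpatialAttention())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.conv1(x))
        out = self.relu(self.conv2(out))
        out = self.attention(self.conv3(out))
        return self.relu(out + x)  # fuse with the previous block's features

# For Stage1 this would be instantiated as IdentityBlockDA(channels=256, mid_channels=64).
```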
The training process of the pedestrian re-identification system based on the double attention mechanism is as follows:
step one, acquiring a public pedestrian re-identification data set, and carrying out normalization operation on the sizes of pictures in the data set, so that the pixel size of each picture is 256 × 128;
secondly, initializing parameters of a strongbaseline network in the pedestrian re-identification system based on the double attention mechanism by adopting ImageNet pre-training network parameters, and randomly initializing the parameters by an introduced double attention mechanism module;
and step three, inputting the data set processed in the step one as a training set into a pedestrian re-identification system based on a double attention mechanism, enabling the system to learn the characteristics of each pedestrian in the training set by adopting a back propagation algorithm and a random gradient descent method, finally evaluating the effectiveness of the system in pedestrian re-identification through two indexes of mAP and Rank1, and obtaining a well-trained system when the mAP and Rank1 reach optimal values simultaneously.
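As a rough illustration of steps one and two, the snippet below resizes every picture to 256 × 128 and copies ImageNet pre-trained weights into the backbone while the inserted attention modules keep their random initialization. The normalization statistics, the torchvision weights API (torchvision 0.13 or newer) and the build_dual_attention_model constructor are assumptions for the sake of the example, not details given by the patent.

```python
import torchvision.transforms as T
from torchvision import models

# Step one: normalize every picture of the data set to 256 x 128 pixels.
transform = T.Compose([
    T.Resize((256, 128)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),  # usual ImageNet statistics (assumed)
])

# Step two: the strongbaseline backbone starts from ImageNet pre-trained parameters; the dual
# attention modules are absent from the checkpoint, so they stay randomly initialized.
model = build_dual_attention_model(num_classes=751)   # hypothetical constructor; 751 identities is illustrative
pretrained = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1).state_dict()
missing, unexpected = model.load_state_dict(pretrained, strict=False)  # attention parameters remain random
```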
The invention has the beneficial effects that:
The invention combines the pedestrian re-identification model with an attention mechanism. Inserting the attention mechanism into the person re-identification model reduces problems such as camera angle, body posture change, body misalignment and image diversity, improves the feature extraction capability of the network model without significantly increasing the amount of computation or the number of parameters, improves network performance, identifies pedestrians of the same identity more accurately, and thus better supports fields such as security and criminal investigation.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention;
FIG. 2 is a schematic diagram of a dual attention mechanism module according to the present invention;
FIG. 3 is a schematic view of the channel attention mechanism of the present invention;
FIG. 4 is a schematic view of the spatial attention mechanism of the present invention.
Detailed Description
The invention relates to a pedestrian re-identification system based on a dual attention mechanism, in which an attention mechanism module is inserted into the strongbaseline network. The attention mechanism module comprises a channel attention mechanism and a spatial attention mechanism; the attention map is multiplied with the input feature map to perform adaptive feature refinement, wherein:
the channel attention mechanism uses the inter-channel relationships of the features to generate a channel attention map, i.e. a set of weights: each channel of the feature map obtained by convolution is multiplied by a different weight that expresses how strongly the features represented by that channel are associated with the key information and how important they are to it. The larger the weight, the more important the information represented by that channel and the stronger its association with the key information; the smaller the weight, the less important that information is. Once the weight of each channel has been obtained, the weights are multiplied onto the values of the corresponding channels to obtain the new features.
The spatial attention mechanism uses the spatial relationships among the features to generate a spatial attention map. Through this attention mechanism more attention is paid to informative positions: the spatial information of the original picture is transformed into another space by a spatial transformation module while the key information is retained.
A pedestrian re-identification system based on a double attention mechanism is characterized in that a double attention mechanism module is inserted on the basis of a strongbaseline network; the structure is as follows:
the first layer is a convolutional layer, the second layer is a normalization layer, the third layer is an activation function layer, the fourth layer is a pooling layer, and the Stage structure comprises Stage1, Stage2, Stage3 and Stage 4; wherein:
inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage1, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 1;
inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage2, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 2;
inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage3, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 3;
Inserting a dual attention mechanism module behind the third layer of the first branch in Conv Block of Stage4, and inserting a dual attention mechanism module behind the third convolutional layer in each Identity Block of Stage 4;
and finally a pooling layer, a normalization layer, a full connection layer and a SoftMax classifier are arranged in sequence.
And sequentially passing the obtained characteristic diagram through a pooling layer, a normalization layer, a full connection layer and a SoftMax classifier, wherein the SoftMax classifier classifies the classes of the pedestrians according to the characteristics.
The method for constructing the channel attention mechanism in the dual attention mechanism module comprises the following specific steps:
Step one: respectively carry out average pooling and maximum pooling on the feature map F produced by the block at the insertion position of the dual attention mechanism module, aggregating the spatial information and obtaining two C-dimensional pooled feature maps F_avg^c and F_max^c;
step two: will be provided withAndsending the data into a multilayer sensor MLP comprising a hidden layer to obtain two channel attention diagrams with the size of 1 × C; wherein, in order to reduce the parameter number, the number of hidden layer neurons of the MLP is C/r, and r is a compression ratio;
step three: and adding corresponding elements of the two channel attention diagrams obtained through the multilayer perceptron MLP, then performing an activation function, wherein the activation function adopts a Sigmoid activation function to obtain a final channel attention mechanism Mc (F), and applying Mc (F) to the feature diagram F to obtain a final channel attention diagram F'.
The spatial attention mechanism in the dual attention mechanism module is constructed by the following specific steps:
Step one: for the channel-refined feature map F', first carry out maximum pooling and average pooling along the channel direction to obtain two two-dimensional feature maps F_avg^s and F_max^s, both of size 1 × H × W, and splice (concat) the two feature maps along the channel dimension to obtain the spliced feature map;
step two: and generating a spatial attention mechanism Ms (F ') through the convolution layer with the convolution kernel size of 7 × 7 for the spliced feature map, and applying Ms (F ') to the feature map F ' to obtain a final spatial attention map F ″.
The feature map before attention is applied is denoted F; F' is obtained after the channel attention mechanism acts on F, and F'' is obtained after the spatial attention mechanism acts on F'.
The pedestrian re-identification system based on the double attention mechanism comprises 2 basic blocks, one is an Identity Block, and the input and output dimensions are the same, so that a plurality of the basic blocks can be connected in series; another basic Block is Conv Block, the input and output dimensions are different, and they cannot be connected in series, and its specific structure is:
the first layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 7 x 7, the second layer is a normalization layer, the third layer is an activation function layer, the activation function adopts a Relu activation function, the fourth layer is a pooling layer, the maximum pooling is adopted, and the pooling size is 3 x 3;
Next, Stage structure comprising Stage1, Stage2, Stage3, Stage 4; wherein:
Stage1 consists of a Conv Block and 2 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 64, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 64, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 256, and each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the obtained characteristic graphs of the two branches to obtain a new input characteristic graph; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 × 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
The first Identity Block is fused with the previous Conv Block feature, and the second Identity Block is fused with the previous Identity Block feature;
stage2 consists of a Conv Block and 3 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 128, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 128, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 512, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 512, and each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 128, each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 128, each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 512, each convolution kernel is 1 × 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage3 consists of a Conv Block and 5 Identity blocks, where the Conv Block comprises two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 1024, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a layer of convolutional layers, the number of convolutional cores is 1024, each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 1024, the size of each convolution kernel is 1 × 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage4 consists of a Conv Block and 2 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is convolutional layer, the number of convolutional cores is 512, each convolutional core size is 1 × 1, the second layer is convolutional layer, the number of convolutional cores is 512, each convolutional core size is 3 × 3, the third layer is convolutional layer, the number of convolutional cores is 2048, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is convolutional layer, the number of convolutional cores is 2048, each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is convolution layers, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, the second layer is convolution layers, the number of convolution kernels is 512, the size of each convolution kernel is 3 × 3, the third layer is convolution layers, the number of convolution kernels is 2048, the size of each convolution kernel is 1 × 1, and BN layers are added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
And sequentially passing the obtained feature graph through a pooling layer, a normalization layer, a full connection layer and a SoftMax classifier, and classifying by the SoftMax classifier according to the features to obtain the category of the image.
And the pooling layer adopts global average pooling, and the pooling size is 3 x 3.
The training process of the pedestrian re-identification system based on the double attention mechanism is as follows:
step one, acquiring a public pedestrian re-identification data set, and carrying out normalization operation on the sizes of pictures in the data set, so that the pixel size of each picture is 256 × 128;
different pedestrian photos are arranged in the pedestrian re-identification data set, different pedestrian categories are represented by different numbers, and each pedestrian has a plurality of different photos;
secondly, initializing parameters of the strongbaseline network in the pedestrian re-identification system based on the dual attention mechanism with ImageNet pre-training network parameters (a publicly available weight file in .pth format that is used directly after download), while the parameters of the introduced dual attention mechanism modules are randomly initialized;
and step three, inputting the data set processed in the step one as a training set into a pedestrian re-identification system based on a double attention mechanism, enabling the system to learn the characteristics of each pedestrian in the training set by adopting a back propagation algorithm and a random gradient descent method, finally evaluating the effectiveness of the system in pedestrian re-identification through two indexes of mAP and Rank1, and obtaining a well-trained system when the mAP and Rank1 reach optimal values simultaneously.
The effectiveness of the model in the pedestrian re-identification task is evaluated with the mAP and Rank1 indexes. Training is set to 1000 epochs; after 660 epochs, mAP and Rank1 reach their optimal values and the well-trained model is obtained. The loss combines triplet loss, center loss and ID loss.
The whole procedure is a model optimization process whose aim is to obtain a model that performs well. It relies on the back-propagation algorithm and gradient descent: a loss value is computed during training, back-propagation iterations update the weights of each layer according to the magnitude of the forward-pass loss, and this continual optimization drives the model toward good parameters.
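A minimal sketch of this optimization loop, combining the losses named above (ID / cross-entropy, triplet and center loss) with back-propagation and a stochastic-gradient-descent update, may look as follows; model, train_loader, make_triplets, center_loss, the learning rate, margin and loss weights are placeholders rather than values fixed by the patent.

```python
import torch
import torch.nn as nn

criterion_id = nn.CrossEntropyLoss()               # ID loss over the pedestrian identities
criterion_tri = nn.TripletMarginLoss(margin=0.3)   # triplet loss (margin assumed)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)

for epoch in range(1000):                          # 1000 training epochs are set
    for images, pids in train_loader:
        features, logits = model(images)           # assumed to return (embedding, class scores)
        a, p, n = make_triplets(features, pids)    # hypothetical hard-triplet mining helper
        loss = (criterion_id(logits, pids)
                + criterion_tri(a, p, n)
                + 0.0005 * center_loss(features, pids))  # hypothetical center-loss module and weight
        optimizer.zero_grad()
        loss.backward()                            # back-propagation of the forward-pass loss
        optimizer.step()                           # weight update by stochastic gradient descent
```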
Example 2
As shown in fig. 1, the pedestrian re-identification system with the dual attention mechanism inserts an attention mechanism module on the basis of strongbaseline. The dual attention pedestrian re-identification model has 2 basic blocks: one is the Identity Block, whose input and output dimensions are the same, so that several Identity Blocks can be connected in series; the other basic block is the Conv Block, whose input and output dimensions are different, so that Conv Blocks cannot be connected in series directly. Its specific structure is:
The first layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 7 x 7, the second layer is a normalization layer, the third layer is an activation function layer, wherein the activation function adopts a Relu activation function, the fourth layer is a pooling layer, the maximum pooling is adopted, and the pooling size is 3 x 3;
next, Stage structure including Stage1, Stage2, Stage3, Stage 4.
Stage1 is composed of a Conv Block and 2 Identity blocks, wherein the Conv Block comprises two branches, the first layer of the first branch is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 × 1, a double attention mechanism module is inserted behind the layer, the second branch is a layer of convolution layers, the number of convolution kernels is 256, the size of each convolution kernel is 1 × 1, a BN layer is added behind each convolution layer of each branch, and the obtained feature maps of the two branches are fused to obtain a new input feature map. The first layer of the Identity Block is convolution layers, the number of convolution kernels is 64, the size of each convolution kernel is 1 x 1, the second layer is convolution layers, the number of convolution kernels is 64, the size of each convolution kernel is 3 x 3, the third layer is convolution layers, the number of convolution kernels is 256, the size of each convolution kernel is 1 x 1, and BN layers are added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the branch with the previous Block feature to obtain a new input feature graph;
Stage2 is composed of Conv Block and 3 Identity Block, wherein Conv Block includes two branches, the first layer of the first branch is convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 × 1, the second layer is convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 3 × 3, the third layer is convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, a double attention mechanism module is inserted behind the layer, the second branch is one convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, BN layer is added behind each convolution layer of each branch, and feature maps of the two branches are fused to obtain a new input feature map. The first layer of the Identity Block is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 x 1, the second layer is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 x 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the branch with the previous Block feature to obtain a new input feature graph;
Stage3 is composed of a Conv Block and 5 Identity blocks, wherein the Conv Block comprises two branches, the first layer of the first branch is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 1024, the size of each convolution kernel is 1 × 1, a double attention mechanism module is inserted behind the layer, the second branch is a layer of convolution layers, the number of convolution kernels is 1024, the size of each convolution kernel is 1 × 1, a BN layer is added behind each convolution layer of each branch, and the feature maps of the two branches are fused to obtain a new input feature map. The first layer of the Identity Block is convolution layers, the number of convolution kernels is 256, the size of each convolution kernel is 1 x 1, the second layer is convolution layers, the number of convolution kernels is 256, the size of each convolution kernel is 3 x 3, the third layer is convolution layers, the number of convolution kernels is 1024, the size of each convolution kernel is 1 x 1, and BN layers are added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the branch with the previous Block feature to obtain a new input feature graph;
Stage4 is composed of Conv Block and 2 Identity Block, wherein Conv Block includes two branches, the first layer of the first branch is convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, the second layer is convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 3 × 3, the third layer is convolution layer, the number of convolution kernels is 2048, the size of each convolution kernel is 1 × 1, a double attention mechanism module is inserted behind the layer, the second branch is one convolution layer, the number of convolution kernels is 2048, the size of each convolution kernel is 1 × 1, a BN layer is added behind each convolution layer of each branch, and feature maps of the two branches are fused to obtain a new input feature map. The first layer of the Identity Block is a convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 x 1, the second layer is a convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 2048, the size of each convolution kernel is 1 x 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the branch with the previous Block feature to obtain a new input feature graph;
The obtained feature map is passed in sequence through a pooling layer, where global average pooling with a pooling size of 3 × 3 is performed, and a normalization layer; finally the full connection layer of the network extracts the image features to give the feature vector, and a SoftMax classifier classifies according to these features to obtain the image category.
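A sketch of this head (global average pooling, normalization layer, full connection layer and SoftMax classifier) is given below; the 2048-channel input matches the Stage4 output described above, while the number of identities (751) is only an illustrative assumption.

```python
import torch
import torch.nn as nn

class ReIDHead(nn.Module):
    """Pooling layer + normalization layer + full connection layer + SoftMax classifier."""
    def __init__(self, in_channels: int = 2048, num_classes: int = 751):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)             # global average pooling
        self.bn = nn.BatchNorm1d(in_channels)           # normalization layer
        self.fc = nn.Linear(in_channels, num_classes)   # full connection layer

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        x = self.pool(feature_map).flatten(1)           # (B, 2048) feature vector
        logits = self.fc(self.bn(x))
        return torch.softmax(logits, dim=1)             # class probabilities from the SoftMax classifier
```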
The training process of the pedestrian re-identification algorithm with the double attention mechanism is as follows:
step one, acquiring a public pedestrian re-identification data set, and carrying out normalization operation on the picture size to enable the pixel size of each picture to be 256 × 128;
secondly, initializing pedestrian re-recognition model parameters of the double attention mechanism by adopting ImageNet pre-training network parameters, and randomly initializing parameters by an introduced attention mechanism module;
and step three, inputting the data set into the pedestrian re-identification model with the dual attention mechanism for training, so that the model learns the characteristics of each pedestrian in the training set. Training uses the back-propagation algorithm and stochastic gradient descent; back-propagation iterations update the weights of each layer according to the magnitude of the forward-pass loss value. The effectiveness of the model in the pedestrian re-identification task is evaluated through mAP and Rank1; training is set to 1000 epochs, and after 660 epochs mAP and Rank1 reach their optimal values and the trained model is obtained, wherein the loss adopts triplet loss, center loss and ID loss.
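For the evaluation step, the sketch below shows one simple way Rank1 could be computed from pre-extracted query and gallery features; ranking by cosine similarity and omitting the camera-ID filtering used by the standard benchmark protocols are simplifying assumptions of this illustration.

```python
import torch
import torch.nn.functional as F

def rank1(query_feats, query_pids, gallery_feats, gallery_pids):
    """Fraction of queries whose nearest gallery feature carries the same pedestrian ID."""
    q = F.normalize(query_feats, dim=1)
    g = F.normalize(gallery_feats, dim=1)
    best = (q @ g.t()).argmax(dim=1)                 # index of the top-1 gallery match per query
    return (gallery_pids[best] == query_pids).float().mean().item()
```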
As shown in fig. 2, in the dual attention mechanism module the feature F extracted from each block of the strongbaseline network is first compressed along the spatial dimension using global maximum pooling and global average pooling to obtain two one-dimensional vectors, from which the channel attention Mc is computed; F and Mc are fused into the feature F'. F' is then compressed along the channel dimension, again with global maximum pooling and global average pooling, to obtain two single-channel maps, from which the spatial attention Ms is computed; F' and Ms are fused into the feature F''. Combining F'' with F gives the final feature. Global average pooling feeds gradient back to every pixel of the feature map, whereas global maximum pooling passes gradient only to the location of the maximum response during back-propagation, so it serves as a complement to global average pooling.
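The flow of FIG. 2 can be summarized in code as below, reusing the ChannelAttention and SpatialAttention sketches from earlier; reading the final combination of F'' with F as an element-wise residual addition is an assumption of this sketch, not something the patent spells out.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """F -> Mc -> F' -> Ms -> F'', then F'' is combined with the input F."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel = ChannelAttention(channels, reduction)
        self.spatial = SpatialAttention()

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        f1 = self.channel(f)    # F'  = Mc(F) applied to F
        f2 = self.spatial(f1)   # F'' = Ms(F') applied to F'
        return f2 + f           # final feature: F'' combined with F
```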
As shown in fig. 3, which is a structural diagram of the channel attention mechanism, the channel attention mechanism is constructed by the following specific steps:
Step one: perform average pooling and maximum pooling operations, respectively, on the feature map F obtained from each block, aggregating the spatial information to obtain two C-dimensional pooled feature maps F_avg^c and F_max^c.
step two: will be provided withAndsending the signal into a multilayer sensor MLP comprising a hidden layer to obtain two channel attention diagrams of 1 × C. Among them, in order to reduce the number of parameters, the number of hidden layer neurons is C/r, and r is called the compression ratio.
Step three: adding corresponding elements of the two channel attention diagrams obtained through MLP, obtaining a final channel attention mechanism Mc (F) through an activation function by adopting a Sigmoid activation function, and obtaining a final channel attention diagram F' by acting Mc (F) on a feature diagram F, wherein the formula is as follows:
wherein the final channel attention mechanism mc (f) is expressed as follows:
wherein W0And W1Respectively represents a hidden layer weight and an output layer weight, AvgPool (F) and MaxPool (F) are respectivelyAnd
as shown in fig. 4, a structure diagram of the spatial attention mechanism is shown, and the spatial attention mechanism is constructed by the following specific steps:
Step one: for F', first carry out maximum pooling and average pooling along the channel direction to obtain two two-dimensional feature maps F_avg^s and F_max^s, both of size 1 × H × W; the two feature maps are spliced (concat) along the channel dimension to obtain the spliced feature map.
Step two: for the spliced feature map, a spatial attention mechanism Ms (F ') is generated through the convolution layer of 7 × 7, and the final spatial attention mechanism F ″ is obtained by applying Ms (F ') to the feature map F '.
The spatial attention map Ms(F') is expressed as follows:

Ms(F') = σ(f^{7×7}([AvgPool(F'); MaxPool(F')])) = σ(f^{7×7}([F_avg^s; F_max^s]))

where σ denotes the Sigmoid function, f^{7×7} represents the 7 × 7 convolution operation, and AvgPool(F') and MaxPool(F') are F_avg^s and F_max^s respectively.
The system can effectively match images of the same pedestrian, improves the feature extraction capability of the network model without significantly increasing the amount of computation or the number of parameters, and offers strong generalization capability and reliable transferability.
Claims (4)
1. A pedestrian re-identification system based on a double attention mechanism is characterized in that a double attention mechanism module is inserted on the basis of a strongbaseline network; the structure is as follows:
the first layer is a convolution layer, the second layer is a normalization layer, the third layer is an activation function layer, the fourth layer is a pooling layer, and a Stage structure is formed by the Stage1, Stage2, Stage3 and Stage 4; wherein:
inserting a dual attention module behind the third layer of the first branch in the Conv Block of Stage1, and inserting a dual attention module behind the third convolutional layer in each Identity Block of Stage 1;
Inserting a dual attention module behind the third layer of the first branch in the Conv Block of Stage2, and inserting a dual attention module behind the third convolutional layer in each Identity Block of Stage 2;
inserting a dual attention module behind the third layer of the first branch in the Conv Block of Stage3, and inserting a dual attention module behind the third convolutional layer in each Identity Block of Stage 3;
inserting a dual attention module behind the third layer of the first branch in the Conv Block of Stage4, and inserting a dual attention module behind the third convolutional layer in each Identity Block of Stage 4;
finally, a pooling layer, a normalization layer, a full connection layer and a SoftMax classifier are sequentially arranged;
the pedestrian re-identification system based on the dual attention mechanism has the specific structure that:
the first layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 7 x 7, the second layer is a normalization layer, the third layer is an activation function layer, the activation function adopts a Relu activation function, the fourth layer is a pooling layer, the maximum pooling is adopted, and the pooling size is 3 x 3;
next, Stage structures including Stage1, Stage2, Stage3, Stage 4; wherein:
Stage1 consists of a Conv Block and 2 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 64, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 64, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 256, and each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the obtained characteristic graphs to obtain a new input characteristic graph; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 1 x 1, the second layer is a convolution layer, the number of convolution kernels is 64, the size of each convolution kernel is 3 x 3, the third layer is a convolution layer, the number of convolution kernels is 256, the size of each convolution kernel is 1 x 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module into the back of the third layer of each Identity Block, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage2 consists of a Conv Block and 3 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 128, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 128, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 512, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 512, and each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 1 × 1, the second layer is a convolution layer, the number of convolution kernels is 128, the size of each convolution kernel is 3 × 3, the third layer is a convolution layer, the number of convolution kernels is 512, the size of each convolution kernel is 1 × 1, and a BN layer is added after each convolution layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage3 consists of a Conv Block and 5 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 256, each convolutional core size is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 1024, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is a convolutional layer, the number of convolutional cores is 1024, each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolutional layer, the number of convolutional cores is 256, the size of each convolutional core is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 256, the size of each convolutional core is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 1024, the size of each convolutional core is 1 × 1, and a BN layer is added behind each convolutional layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Stage4 consists of a Conv Block and 2 Identity blocks, where the Conv Block contains two branches, the first layer of the first branch is convolutional layer, the number of convolutional cores is 512, each convolutional core size is 1 × 1, the second layer is convolutional layer, the number of convolutional cores is 512, each convolutional core size is 3 × 3, the third layer is convolutional layer, the number of convolutional cores is 2048, each convolutional core size is 1 × 1, a dual attention mechanism module is inserted behind the layer, the second branch is convolutional layer, the number of convolutional cores is 2048, each convolutional core size is 1 × 1; adding a BN layer after each convolution layer of each branch, and fusing the characteristic diagrams of the two branches to obtain a new input characteristic diagram; the first layer of the Identity Block is a convolutional layer, the number of convolutional cores is 512, the size of each convolutional core is 1 × 1, the second layer is a convolutional layer, the number of convolutional cores is 512, the size of each convolutional core is 3 × 3, the third layer is a convolutional layer, the number of convolutional cores is 2048, the size of each convolutional core is 1 × 1, and a BN layer is added after each convolutional layer; inserting a double attention mechanism module behind the third layer of each Identity Block layer, and fusing the feature graph of the Identity Block with the previous Block feature to obtain a new input feature graph;
Sequentially passing the obtained feature map through a pooling layer, a normalization layer, a full-link layer and a SoftMax classifier, and classifying the pedestrian category by the SoftMax classifier according to the features to obtain the category to which the image belongs;
the training process of the pedestrian re-identification system based on the double attention mechanism is as follows:
step one, acquiring a public pedestrian re-identification data set, and carrying out normalization operation on the sizes of pictures in the data set, so that the pixel size of each picture is 256 × 128;
secondly, initializing parameters of a strongbaseline network in the pedestrian re-identification system based on the double attention mechanism by adopting ImageNet pre-training network parameters, and randomly initializing the parameters by an introduced double attention mechanism module;
and step three, inputting the data set processed in the step one as a training set into a pedestrian re-identification system based on a double attention mechanism, enabling the system to learn the characteristics of each pedestrian in the training set by adopting a back propagation algorithm and a random gradient descent method, finally evaluating the effectiveness of the system in pedestrian re-identification through two indexes of mAP and Rank1, and obtaining a well-trained system when the mAP and Rank1 reach optimal values simultaneously.
2. The pedestrian re-identification system based on the dual attention mechanism is characterized in that the construction of the channel attention mechanism in the dual attention mechanism module comprises the following specific steps:
Step one: respectively carry out average pooling and maximum pooling on the feature map F produced by the block at the insertion position of the dual attention mechanism module to obtain two C-dimensional pooled feature maps F_avg^c and F_max^c;
step two: will be provided withAndsending the data into a multilayer sensor MLP comprising a hidden layer to obtain two channel attention diagrams with the size of 1 × C; wherein, in order to reduce the parameter number, the number of hidden layer neurons of the MLP is C/r, and r is a compression ratio;
step three: and adding corresponding elements of the two channel attention diagrams obtained through the multilayer perceptron MLP, then performing an activation function, wherein the activation function adopts a Sigmoid activation function to obtain a final channel attention mechanism Mc (F), and applying Mc (F) to the feature diagram F to obtain a final channel attention diagram F'.
3. The pedestrian re-identification system based on the dual attention mechanism is characterized in that the spatial attention mechanism in the dual attention mechanism module is constructed by the following specific steps:
step one: for the final channel attention map F', first performing maximum pooling and average pooling along the channel direction to obtain two two-dimensional feature maps F_max^s and F_avg^s, each of size 1 × H × W, and then performing concat splicing on the two obtained two-dimensional feature maps along the channel dimension to obtain a spliced feature map;
step two: generating the spatial attention mechanism Ms(F') by passing the spliced feature map through a convolutional layer with a convolution kernel size of 7 × 7, and applying Ms(F') to the feature map F' to obtain the final spatial attention map F'' (see the sketch below).
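A matching sketch of the spatial attention mechanism of steps one and two, under the same illustrative naming; the final output corresponds to F''.

```python
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, f1):                             # f1 = F': (B, C, H, W)
        f_max = f1.amax(dim=1, keepdim=True)           # step one: max pooling along the channel direction
        f_avg = f1.mean(dim=1, keepdim=True)           # step one: average pooling along the channel direction
        ms = torch.sigmoid(self.conv(torch.cat([f_max, f_avg], dim=1)))   # step two: Ms(F') via 7x7 conv
        return f1 * ms                                 # F'' = Ms(F') applied element-wise to F'
```

Under these assumptions, cascading the two modules, e.g. `nn.Sequential(ChannelAttention(2048), SpatialAttention())`, could serve as the `DualAttention` placeholder used in the block sketch above.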
4. The dual attention mechanism-based pedestrian re-identification system of claim 3 wherein the pooling layer employs global average pooling of 3 x 3.