Research on Person Re-Identification through Local and Global Attention Mechanisms and Combination Poolings
Figure 1. The overall architecture of the MGN. GMP: Global Max Pooling; LMP: Local Max Pooling; conv1*1 reduce: the features are reduced to 256 dimensions by a 1 × 1 convolution, and then subjected to batch normalization and ReLU activation function.
Figure 2. The schematic diagram of a channel attention mechanism. conv1*1_relu reduce: the feature is reduced to 128 dimensions by a 1 × 1 convolution, and then subjected to ReLU activation function; conv1*1_sigmoid increase: the feature is increased to 2048 dimensions by a 1 × 1 convolution, and then subjected to Sigmoid activation function.
Figure 3. The overall architecture of MGNACP. GCP: Global Combination Pooling; LCP: Local Combination Pooling; Attention: a channel attention mechanism; conv1*1 reduce: the features are reduced to 256 dimensions by a 1 × 1 convolution and then subjected to batch normalization and a ReLU activation function; Combination Poolings: the combination pooling part of each branch of MGNACP, where GCP is used to obtain global features and LCP is used to obtain local features; Channel Attentions: the channel attention mechanism part of each branch of MGNACP, where each global and local branch adds a corresponding channel attention mechanism (the attention mechanism of each global branch helps learn the most important information of the global feature, and the attention mechanism of each local branch helps learn the most important information of the local feature).
Figure 4. The schematic diagrams of GCP and LCP. (a) The schematic diagram of GCP. (b) The schematic diagram of LCP.
Figure 5. Histogram of results of mAP experiments on the Market-1501 dataset (including methods from recent years and MGNACP methods).
Figure 6. Histogram of results of top-1 experiments on the Market-1501 dataset (including methods from recent years and MGNACP methods).
Figure 7. Comparison curves of experimental results between the MGN and MGNACP in the Market-1501 dataset without and with re-ranking. (a) Comparison curves of experimental results without re-ranking between the MGN and MGNACP. (b) Comparison curves of experimental results with re-ranking between the MGN and MGNACP.
Figure 8. Comparison curves of experimental results between the MGN and MGNACP in the CUHK03 dataset without and with re-ranking. (a) Comparison curves of experimental results without re-ranking between the MGN and MGNACP. (b) Comparison curves of experimental results with re-ranking between the MGN and MGNACP.
Figure 9. Comparison curves of experimental results between the MGN and MGNA in the Market-1501 dataset without and with re-ranking. (a) Comparison curves of experimental results without re-ranking between the MGN and MGNA. (b) Comparison curves of experimental results with re-ranking between the MGN and MGNA.
Figure 10. Comparison curves of experimental results between the MGN and MGNA in the CUHK03 dataset without and with re-ranking. (a) Comparison curves of experimental results without re-ranking between the MGN and MGNA. (b) Comparison curves of experimental results with re-ranking between the MGN and MGNA.
Figure 11. Histogram of the experimental results of top-1 and mAP for MGNACP on the Market-1501 dataset. (a) Histogram of the experimental results of mAP. (b) Histogram of the experimental results of top-1.
Figure 12. Histogram of the experimental results of top-1 and mAP for MGNACP on the CUHK03 dataset. (a) Histogram of the experimental results of mAP. (b) Histogram of the experimental results of top-1.
Abstract
1. Introduction
2. Related Work
2.1. Feature Representation
2.2. Attention Mechanism
2.3. Pooling
3. Methods
3.1. MGN Network
3.2. Attention Mechanism Method of MGNA
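The caption of the channel attention schematic describes a 1 × 1 convolution that reduces the feature to 128 dimensions with ReLU, followed by a 1 × 1 convolution that increases it back to 2048 dimensions with Sigmoid. The sketch below illustrates that flow in plain Python. It is not the authors' implementation. It assumes the attention acts on a pooled C × 1 × 1 feature, so each 1 × 1 convolution becomes a matrix-vector product, and it assumes the Sigmoid output rescales the input channel by channel, as in squeeze-and-excitation networks. The function names, the random weights, and the small sizes (16 and 4 standing in for 2048 and 128) are hypothetical.

```python
import math
import random


def linear(weights, bias, x):
    """Apply a 1x1 convolution to a C x 1 x 1 feature (a matrix-vector product)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def channel_attention(x, w_reduce, b_reduce, w_increase, b_increase):
    """Return the gated feature and the per-channel gates."""
    # Reduce the channel dimension, then apply ReLU.
    hidden = [max(0.0, h) for h in linear(w_reduce, b_reduce, x)]
    # Restore the channel dimension, then apply Sigmoid to get gates in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-g)) for g in linear(w_increase, b_increase, hidden)]
    # Assumption: the gates rescale the input feature channel by channel.
    return [g * v for g, v in zip(gates, x)], gates


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights for illustration only."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
channels, reduced = 16, 4  # stand-ins for 2048 and 128
x = [rng.uniform(0.0, 1.0) for _ in range(channels)]
w_reduce = random_matrix(reduced, channels, rng)
w_increase = random_matrix(channels, reduced, rng)
out, gates = channel_attention(x, w_reduce, [0.0] * reduced, w_increase, [0.0] * channels)
```

Each gate lies strictly between 0 and 1, so under this assumption the attention can only attenuate a channel relative to its input value.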
3.3. Combination Pooling Method of MGNACP
- (1) Max pooling
- (2) Average pooling
- (3) Combination pooling
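The three pooling variants listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's code. It assumes that combination pooling is a weighted sum of max pooling and average pooling whose weights add up to 1, as suggested by the "Max: 0.2, Avg: 0.8" style of the combination proportion tables, and that local pooling splits the feature map into equal horizontal stripes. All function and parameter names are hypothetical.

```python
def max_pool(region):
    """Max pooling over a flat list of pixel values."""
    return max(region)


def avg_pool(region):
    """Average pooling over a flat list of pixel values."""
    return sum(region) / len(region)


def combination_pool(region, max_weight):
    """Assumed form: max_weight * max + (1 - max_weight) * average."""
    return max_weight * max_pool(region) + (1.0 - max_weight) * avg_pool(region)


def global_combination_pool(feature_map, max_weight):
    """Pool over the entire H x W map (one channel)."""
    region = [value for row in feature_map for value in row]
    return combination_pool(region, max_weight)


def local_combination_pool(feature_map, num_stripes, max_weight):
    """Pool each horizontal stripe separately (assumes H divisible by num_stripes)."""
    height = len(feature_map)
    if height % num_stripes != 0:
        raise ValueError("height must be divisible by num_stripes")
    rows = height // num_stripes
    return [
        global_combination_pool(feature_map[s * rows:(s + 1) * rows], max_weight)
        for s in range(num_stripes)
    ]


feature_map = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
gcp = global_combination_pool(feature_map, max_weight=0.2)
lcp = local_combination_pool(feature_map, num_stripes=2, max_weight=0.2)
```

For this toy map the global output is 0.2 · 8 + 0.8 · 4.5 = 5.2, and the two stripe outputs are 0.2 · 4 + 0.8 · 2.5 = 2.8 and 0.2 · 8 + 0.8 · 6.5 = 6.8. Setting `max_weight` to 1 or 0 recovers plain max pooling or plain average pooling.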
4. Experimentation
4.1. Dataset and Evaluation Protocol
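The result tables report Top-1 accuracy and mAP. As a reminder of how these two retrieval metrics are commonly computed, here is a small sketch in plain Python. It uses the standard definitions, not code from the paper. It assumes each query already has a ranked gallery list expressed as match flags, with any protocol-specific gallery filtering done beforehand. The function names and toy data are hypothetical.

```python
def average_precision(matches):
    """Average precision of one ranked list of booleans (True = same identity)."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this correct hit
    return precision_sum / hits if hits else 0.0


def top1_and_map(ranked_matches):
    """Top-1 accuracy and mean average precision over all queries."""
    num_queries = len(ranked_matches)
    top1 = sum(1 for m in ranked_matches if m and m[0]) / num_queries
    mean_ap = sum(average_precision(m) for m in ranked_matches) / num_queries
    return top1, mean_ap


# Two toy queries: the first ranks a correct match first, the second does not.
ranked_matches = [[True, False, True], [False, True, False]]
top1, mean_ap = top1_and_map(ranked_matches)
```

Top-1 only looks at the first retrieved item, whereas mAP rewards ranking every correct match early, which is why the two columns in the tables can move in different directions.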
4.2. Implementation Details
4.3. Comparison with State-of-the-Art Methods
4.4. Experimental Discussion
4.4.1. Experimental Results of Attention Mechanisms
4.4.2. Experimental Results of Combination Poolings
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Wu, L.; Shen, C.; Hengel, A. PersonNet: Person re-identification with deep convolutional neural networks. arXiv 2016, arXiv:1601.07255.
2. Li, W.; Zhao, R.; Xiao, T.; Wang, X. DeepReID: Deep filter pairing neural network for person re-identification. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 152–159.
3. Tan, H.; Liu, X.; Yin, B.; Li, X. MHSA-Net: Multihead Self-Attention Network for Occluded Person Re-Identification. IEEE Trans. Neural Netw. Learn. Syst. 2022, 34, 8210–8224.
4. Gao, Z.; Gao, L.; Zhang, H.; Cheng, Z.; Hong, R.; Chen, S. DCR: A Unified Framework for Holistic/Partial Person ReID. IEEE Trans. Multimed. 2021, 23, 3332–3345.
5. Huang, H.; Li, D.; Zhang, Z.; Chen, X.; Huang, K. Adversarially Occluded Samples for Person Re-identification. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 5098–5107.
6. Wang, G.; Yuan, Y.; Chen, X.; Li, J.; Zhou, X. Learning discriminative features with multiple granularities for person re-identification. In Proceedings of the 26th ACM International Conference on Multimedia (MM), Seoul, Republic of Korea, 22–26 October 2018; pp. 274–282.
7. Niu, Z.; Zhong, G.; Yue, G.; Wang, L.; Yu, H.; Ling, X.; Dong, J. Recurrent attention unit: A new gated recurrent unit for long-term memory of important parts in sequential data. Neurocomputing 2023, 517, 1–9.
8. Rodríguez, P.; Velazquez, D.; Cucurull, G.; Gonfaus, J.; Roca, E.; González, J. Pay Attention to the Activations: A Modular Attention Mechanism for Fine-Grained Image Recognition. IEEE Trans. Multimed. 2020, 22, 502–514.
9. Wang, F.; Jiang, M.; Qian, C.; Yang, S.; Li, C.; Zhang, H.; Wang, X.; Tang, X. Residual Attention Network for Image Classification. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 3156–3164.
10. Lin, M.; Chen, Q.; Yan, S. Network In Network. arXiv 2014, arXiv:1312.4400.
11. Luo, H.; Gu, Y.; Liao, X.; Lai, S.; Jiang, W. Bag of Tricks and a Strong Baseline for Deep Person Re-Identification. In Proceedings of the 2019 32nd IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–20 June 2019; pp. 1487–1495.
12. Zhou, K.; Yang, Y.; Cavallaro, A.; Xiang, T. Omni-Scale Feature Learning for Person Re-Identification. In Proceedings of the 2019 17th IEEE International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3701–3711.
13. Zhang, Y.; Xiang, T.; Hospedales, T.; Lu, H. Deep Mutual Learning. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 4320–4328.
14. Sun, Y.; Zheng, L.; Deng, W.; Wang, S. SVDNet for pedestrian retrieval. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 3820–3828.
15. Huang, M.; Niu, H.; Zhang, S.; Ren, G. A Person Re-identification Method Fusing Bottleneck Transformer and Relation-aware Global Attention. In Proceedings of the 2022 11th International Conference on Networks Communication and Computing (ICNCC), Beijing, China, 9–11 December 2022; pp. 31–37.
16. Fawad; Khan, M.; Rahman, M. Person Re-Identification by Discriminative Local Features of Overlapping Stripes. Symmetry 2020, 12, 647.
17. Liu, K.; Zhao, Z.; Cai, A. Datum-Adaptive Local Metric Learning for Person Re-identification. IEEE Signal Process. Lett. 2015, 22, 1457–1461.
18. Ustinova, E.; Ganin, Y.; Lempitsky, V. Multi-region Bilinear Convolutional Neural Networks for Person Re-Identification. In Proceedings of the 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Lecce, Italy, 29 August–1 September 2017.
19. Liu, C.; Bao, T.; Zhu, M. Part-based Feature Extraction for Person Re-identification. In Proceedings of the 2018 10th International Conference on Machine Learning and Computing (ICMLC), Macau, China, 26–28 February 2018; pp. 172–177.
20. Sun, Y.; Zheng, L.; Yang, Y.; Tian, Q.; Wang, S. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In Proceedings of the 2018 European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 480–496.
21. Shen, C.; Jin, Z.; Chu, W.; Jiang, R.; Chen, Y.; Qi, G.; Hua, X. Multi-level Similarity Perception Network for Person Re-identification. ACM Trans. Multimed. Comput. Commun. Appl. 2019, 15, 32.
22. Ma, L.; Guan, Z.; Dai, X.; Gao, H.; Lu, Y. A Cross-Modality Person Re-Identification Method Based on Joint Middle Modality and Representation Learning. Electronics 2023, 12, 2687.
23. Chikontwe, P.; Lee, H. Deep Multi-Task Network for Learning Person Identity and Attributes. IEEE Access 2018, 6, 60801–60811.
24. Wu, Z.; Yu, X.; Zhu, D.; Pang, Q.; Shen, S.; Ma, T.; Zheng, J. SR-DSFF and FENet-ReID: A Two-Stage Approach for Cross Resolution Person Re-Identification. Comput. Intell. Neurosci. 2022, 2022, 4398727.
25. Tian, H.; Hu, J. Self-Regulation Feature Network for Person Reidentification. IEEE Trans. Instrum. Meas. 2023, 72, 5005508.
26. Li, Z.; Lv, J.; Chen, Y.; Yuan, J. Person re-identification with part prediction alignment. Comput. Vis. Image Underst. 2021, 205, 103172.
27. Zhang, J.; Ainam, J.; Song, W.; Zhao, L.; Wang, X.; Li, H. Learning global and local features using graph neural networks for person re-identification. Signal Process. Image Commun. 2022, 107, 116744.
28. Xie, G.; Wen, X.; Yuan, L.; Xu, H.; Liu, Z. Global Correlative Network for Person re-identification. Neurocomputing 2022, 469, 298–309.
29. Fu, Y.; Wei, Y.; Zhou, Y.; Shi, H.; Huang, G.; Wang, X.; Yao, Z.; Huang, T. Horizontal Pyramid Matching for Person Re-Identification. In Proceedings of the 2019 AAAI Conference on Artificial Intelligence (AAAI), Honolulu, HI, USA, 27 January–1 February 2019; pp. 8295–8302.
30. Yang, Z.; Wu, D.; Wu, C.; Lin, Z.; Gu, J.; Wang, W. A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification. In Proceedings of the 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 17343–17353.
31. He, W.; Deng, Y.; Tang, S.; Chen, Q.; Xie, Q.; Wang, Y.; Bai, L.; Zhu, F.; Zhao, R.; Ouyang, W.; et al. Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions. In Proceedings of the 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 17521–17531.
32. Chorowski, J.; Bahdanau, D.; Cho, K.; Bengio, Y. End-to-end continuous speech recognition using attention-based recurrent NN: First results. arXiv 2014, arXiv:1412.1602.
33. Jaderberg, M.; Simonyan, K.; Zisserman, A.; Kavukcuoglu, K. Spatial Transformer Networks. In Proceedings of the 2015 28th International Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada, 7–12 December 2015; pp. 2017–2025.
34. Mnih, V.; Heess, N.; Graves, A.; Kavukcuoglu, K. Recurrent Models of Visual Attention. In Proceedings of the 2014 27th International Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada, 8–13 December 2014; pp. 2204–2212.
35. Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning Deep Features for Discriminative Localization. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929.
36. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable convolutional networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 764–773.
37. Wang, Q.; Wu, T.; Zheng, H.; Guo, G. Hierarchical pyramid diverse attention networks for face recognition. In Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 8326–8335.
38. Yu, Z.; Li, L.; Xie, J.; Wang, C.; Li, W.; Ning, X. Pedestrian 3D Shape Understanding for Person Re-Identification via Multi-View Learning. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 5589–5602.
39. Ning, E.; Zhang, C.; Wang, C.; Ning, X.; Chen, H.; Bai, X. Pedestrian Re-ID based on feature consistency and contrast enhancement. Displays 2023, 79, 102467.
40. Chen, B.H.; Deng, W.H.; Hu, J.N. Mixed High-Order Attention Network for Person Re-Identification. In Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 371–381.
41. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-Excitation Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023.
42. Fu, J.; Liu, J.; Tian, H.; Li, Y.; Bao, Y.; Fang, Z.; Lu, H. Dual Attention Network for Scene Segmentation. In Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3141–3149.
43. Zheng, M.; Karanam, S.; Wu, Z.; Radke, R.J. Re-identification with consistent attentive siamese networks. In Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5728–5737.
44. Jiang, M.; Li, C.; Kong, J.; Teng, Z.; Zhuang, D. Cross-level reinforced attention network for person re-identification. J. Vis. Commun. Image Represent. 2020, 69, 102775.
45. Yang, W.; Huang, H.; Zhang, Z.; Chen, X.; Huang, K.; Zhang, S. Towards Rich Feature Discovery with Class Activation Maps Augmentation for Person Re-Identification. In Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 1389–1398.
46. Li, W.; Zhang, Y.; Shi, W.; Coleman, S. A CAM-Guided Parameter-Free Attention Network for Person Re-Identification. IEEE Signal Process. Lett. 2022, 29, 1559–1563.
47. Jamal, M.; Jiang, Z.; Ming, F. An Improved Deep Mutual-Attention Learning Model for Person Re-Identification. Symmetry 2020, 12, 358.
48. Hou, R.; Ma, B.; Chang, H.; Gu, X.; Shan, S.; Chen, X. Interaction-and-Aggregation Network for Person Re-identification. In Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 9309–9318.
49. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
50. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90.
51. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 2015 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
52. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
53. Huang, G.; Liu, Z.; Maaten, L. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
54. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
55. Huang, W.; Li, Y.; Zhang, K.; Hou, X.; Xu, J.; Su, R.; Xu, H. An Efficient Multi-Scale Focusing Attention Network for Person Re-Identification. Appl. Sci. 2021, 11, 2010.
56. Wang, J.; Yuan, L.; Xu, H.; Xie, G.; Wen, X. Channel-exchanged feature representations for person re-identification. Inf. Sci. 2021, 562, 370–384.
57. Ruan, W.; Liang, C.; Yu, Y.; Wang, Z.; Liu, W.; Chen, J.; Ma, J. Correlation Discrepancy Insight Network for Video Re-identification. ACM Trans. Multimed. Comput. Commun. Appl. 2022, 16, 120.
58. Tang, Q.; Yan, P.; Chen, J.; Shao, H.; Wang, F.; Wang, G. Person re-identification based on multi-scale global feature and weight-driven part feature. AI Commun. 2022, 35, 207–223.
59. Xiong, M.; Gao, Z.; Hu, R.; Chen, J.; He, R.; Cai, H.; Peng, T. A Lightweight Efficient Person Re-Identification Method Based on Multi-Attribute Feature Generation. Appl. Sci. 2022, 12, 4921.
60. Zhao, C.; Chen, K.; Wei, Z.; Chen, Y.; Miao, D.; Wang, W. Multilevel triplet deep learning model for person re-identification. Pattern Recognit. Lett. 2019, 117, 161–168.
61. Wang, C.; Ning, X.; Li, W.; Bai, X.; Gao, X. 3D Person Re-Identification Based on Global Semantic Guidance and Local Feature Aggregation. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 4698–4712.
62. Ghorbel, M.; Ammar, S.; Kessentini, Y.; Jmaiel, M. Masking for better discovery: Weakly supervised complementary body regions mining for person re-identification. Expert Syst. Appl. 2022, 197, 116636.
63. Miao, J.; Wu, Y.; Liu, P.; Ding, Y.; Yang, Y. Pose-guided feature alignment for occluded person re-identification. In Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 542–551.
64. Chang, H.; Qu, D.; Wang, K.; Zhang, H.; Si, N.; Yan, G.; Li, H. Attribute-guided attention and dependency learning for improving person re-identification based on data analysis technology. Enterp. Inf. Syst. 2023, 17, 1941274.
65. Ding, G.; Khan, S.; Tang, Z.; Porikli, F. Feature mask network for person re-identification. Pattern Recognit. Lett. 2020, 137, 91–98.
66. Chen, Y.; Fan, Z.; Chen, Z.; Zhu, Y. CA-Jaccard: Camera-aware Jaccard Distance for Person Re-identification. In Proceedings of the 2024 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 17532–17541.
67. Kalayeh, M.; Basaran, E.; Gökmen, M.; Kamasak, M.; Shah, M. Human semantic parsing for person re-identification. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 1062–1071.
Symbols | Explanations |
---|---|
 | Branches of the MGN, of the MGNA, or of MGNACP |
 | In the MGN and MGNA, the global feature branch using GMP; in MGNACP, the global feature branch using GCP |
 | In the MGN and MGNA, the local feature branch using LMP; in MGNACP, the local feature branch using LCP. The number of stripes depends on the branch (P2 or P3) |
 | The features obtained by using GMP in each of the three branches of the MGN, or the features obtained by using GCP in each of the three branches of MGNACP |
 | The features obtained by using LMPs in the P2 and P3 branches of the MGN, or the features obtained by using LCPs in the P2 and P3 branches of MGNACP |
 | In the MGN and in MGNACP, the global features are processed further to obtain three 256-dimensional global features |
 | In the MGN and in MGNACP, the local features are processed further to obtain five 256-dimensional local features |
 | 1 × 1 convolution |
BN | Batch normalization |
ReLU | ReLU activation function |
Sigmoid | Sigmoid activation function |
 | Global features obtained through the attention mechanisms |
 | Local features obtained through the attention mechanisms |
 | Pooling area: the feature area covered by the pooling window |
 | The index of a pooling area among the areas into which the pooling window divides the feature |
 | The number of pooling areas into which the pooling window divides the feature |
 | The number of pixels in the pooling area covered by the pooling window |
 | The pixel at a given index in the pooling area covered by the pooling window |
 | The value of the pixel at a given index in the pooling area covered by the pooling window |
 | The entire pooling area, that is, the case where the pooling window covers the entire feature |
 | The number of pixels in the entire pooling area |
 | The output value of the pooling calculation for a given pooling area of the feature |
 | The output value of the pooling calculation over the entire pooling area |
 | The proportion of max pooling and average pooling in combination pooling |
GMP | Global Max Pooling |
LMP | Local Max Pooling |
GAP | Global Average Pooling |
LAP | Local Average Pooling |
GCP | Global Combination Pooling |
LCP | Local Combination Pooling |
MGN | Multiple Granularity Network |
MGNA | Multiple Granularity Network with Attentions |
MGNACP | Multiple Granularity Network with Attention Mechanisms and Combination Poolings |
Detail | Market-1501 | CUHK03 |
---|---|---|
ID | 1501 | 1467 |
Annotated box | 32,668 | 14,096 |
Query box | 3368 | 1400 |
Box per ID | 19.9 | 9.7 |
Train box | 12,936 | 7365 |
Test box | 19,732 | 5332 |
Train ID | 751 | 767 |
Test ID | 750 | 700 |
Camera | 6 | 2 |
Group | Method | Top-1 (%) | mAP (%) |
---|---|---|---|
G | DML (2019) [13] | 89.3 | 70.5 |
G | OSNet (2019) [12] | 94.8 | 84.9 |
G | SVDNet (2017) [14] | 82.3 | 62.1 |
G | AOS (2018) [5] | 86.5 | 70.4 |
G | BoT (2019) [11] | 94.5 | 85.9 |
L | PCB + RPP (2018) [20] | 93.8 | 81.6 |
L | PCB (2018) [20] | 92.3 | 77.4 |
L | Multi-region CNN (2017) [18] | 41.2 | 66.4 |
L | DLFOS + XQDA (2020) [16] | 62.7 | - |
L | part-based CNN + XQDA (2018) [19] | 83.1 | 61.7 |
M | MSP-CNN (2019) [21] | 84.2 | 66.3 |
M | SR-DSFF + FENet-ReID (2022) [24] | 90.9 | - |
M | SRFnet (2023) [25] | 94.2 | 85.7 |
M | PPA + TS (2021) [26] | 92.4 | 79.6 |
M | PointReIDNet (2024) [61] | 90.6 | 75.3 |
M | PAGCN (2022) [27] | 94.4 | 87.3 |
M | GCN (2022) [28] | 95.3 | 85.7 |
M | HPM (2020) [29] | 94.2 | 82.7 |
M | PCN + PSP (2018) [23] | 92.8 | 78.8 |
M | MGN (2018) [6] | 95.7 | 86.9 |
M | DCR (2021) [4] | 93.8 | 84.7 |
A | CASN (2018) [43] | 94.4 | 82.8 |
A | CAM-Guided Attention (2022) [46] | 94.7 | 85.1 |
A | Mutual-Attention (2020) [47] | 93.8 | 83.6 |
A | IANet (2019) [48] | 94.4 | 83.1 |
A | MHSA-Net (2022) [3] | 94.6 | 84.0 |
A | CLRA-CNN (2020) [44] | 92.3 | 78.2 |
A | AND (2022) [62] | 92.3 | 87.8 |
A | MHN-6 (2019) [40] | 95.1 | 85.0 |
A | PGFA (2019) [63] | 91.2 | 76.8 |
A | CAMA (2020) [45] | 94.7 | 84.5 |
A | HA-CNN (2018) [39] | 91.2 | 75.7 |
A | AL-APR (2021) [64] | 89.0 | 74.4 |
 | MGNACP (ours) | 95.46 | 88.82 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
FMN (2020) [65] | 42.6 | 39.2 |
PointReIDNet (2024) [61] | 53.43 | 48.76 |
PCN + PSP (2018) [23] | 60.7 | 56.0 |
AND (2022) [62] | 60.6 | 56.5 |
HPM (2020) [29] | 63.9 | 57.5 |
DCR (2021) [4] | 68.4 | 61.4 |
PPA + TS (2021) [26] | 65.5 | 62.4 |
CAMA (2020) [45] | 66.6 | 64.2 |
CASN (2018) [43] | 71.5 | 64.4 |
MHN-6 (2019) [40] | 71.7 | 65.4 |
OSNet (2019) [12] | 72.3 | 67.8 |
SRFnet (2023) [25] | 73.3 | 69.6 |
MHSA-Net (2022) [3] | 73.4 | 70.2 |
PAGCN (2022) [27] | 75.1 | 71.6 |
GCN (2022) [28] | 78.5 | 74.7 |
MGN (2018) [6] (Our Imp.) | 80.07 | 77.31 |
MGNACP (ours) | 81.57 | 78.61 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
SRFnet (2023) [25] | 95.3 | 93.7 |
PAGCN (2022) [27] | 96.1 | 94.1 |
CAM-guided Attention (2022) [46] | 95.1 | 92.7 |
MHSA-Net (2022) [3] | 95.5 | 93.0 |
PCN + PSP (2018) [23] | 94.4 | 90.8 |
MGN (2018) [6] | 96.6 | 94.2 |
BoT (2019) [11] | 95.4 | 94.2 |
SPReID (2018) [67] | 94.6 | 91.0 |
FMN (2020) [65] | 87.9 | 80.6 |
CC* + CAJ (2024) [66] | 93.7 | 90.2 |
MV-3DSReID (2023) [38] | 96.1 | 90.9 |
MGNACP (ours) | 96.32 | 94.55 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
FMN (2020) [65] | 47.5 | 48.5 |
PCN + PSP (2018) [23] | 71.2 | 72.1 |
MHSA-Net (2022) [3] | 80.2 | 80.9 |
SRFnet (2023) [25] | 80.2 | 81.9 |
MGN (2018) [6] (Our Imp.) | 86.07 | 87.02 |
MGNACP (ours) | 86.50 | 87.82 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
MGN | 95.7 | 86.9 |
MGNA | 95.01 | 88.46 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
MGN | 96.6 | 94.2 |
MGNA | 95.93 | 94.33 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
MGN | 80.07 | 77.31 |
MGNA | 80.79 | 77.53 |
Method | Top-1 (%) | mAP (%) |
---|---|---|
MGN | 86.07 | 87.02 |
MGNA | 86.50 | 87.38 |
Combination Proportion | Top-1 (%) | mAP (%) |
---|---|---|
Max: 0.1, Avg: 0.9 | 95.28 | 88.65 |
Max: 0.2, Avg: 0.8 | 95.46 | 88.82 |
Max: 0.3, Avg: 0.7 | 95.35 | 88.79 |
Max: 0.4, Avg: 0.6 | 95.29 | 88.73 |
Max: 0.5, Avg: 0.5 | 95.03 | 88.60 |
Max: 0.6, Avg: 0.4 | 95.16 | 88.35 |
Max: 0.7, Avg: 0.3 | 95.07 | 88.50 |
Max: 0.8, Avg: 0.2 | 95.23 | 88.39 |
Max: 0.9, Avg: 0.1 | 95.10 | 88.46 |
Combination Proportion | Top-1 (%) | mAP (%) |
---|---|---|
Max: 0.1, Avg: 0.9 | 81.21 | 78.41 |
Max: 0.2, Avg: 0.8 | 81.00 | 78.37 |
Max: 0.3, Avg: 0.7 | 81.71 | 78.65 |
Max: 0.4, Avg: 0.6 | 80.29 | 77.55 |
Max: 0.5, Avg: 0.5 | 79.64 | 76.85 |
Max: 0.6, Avg: 0.4 | 80.00 | 77.93 |
Max: 0.7, Avg: 0.3 | 81.00 | 77.58 |
Max: 0.8, Avg: 0.2 | 80.57 | 77.08 |
Max: 0.9, Avg: 0.1 | 79.43 | 77.29 |
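The two combination proportion sweeps above can be read off programmatically. The sketch below copies the table rows and picks the max-pooling weight with the highest Top-1 and mAP. The tables carry no captions here, so the dataset labels are an assumption: the first sweep is taken to be Market-1501 because its "Max: 0.2" row matches the MGNACP row of the Market-1501 comparison, and the second is taken to be CUHK03 from its value range. The variable and function names are hypothetical.

```python
# Rows are (max_weight, top1_percent, map_percent); the average weight is 1 - max_weight.
sweep_market1501 = [  # assumed dataset label
    (0.1, 95.28, 88.65), (0.2, 95.46, 88.82), (0.3, 95.35, 88.79),
    (0.4, 95.29, 88.73), (0.5, 95.03, 88.60), (0.6, 95.16, 88.35),
    (0.7, 95.07, 88.50), (0.8, 95.23, 88.39), (0.9, 95.10, 88.46),
]
sweep_cuhk03 = [  # assumed dataset label
    (0.1, 81.21, 78.41), (0.2, 81.00, 78.37), (0.3, 81.71, 78.65),
    (0.4, 80.29, 77.55), (0.5, 79.64, 76.85), (0.6, 80.00, 77.93),
    (0.7, 81.00, 77.58), (0.8, 80.57, 77.08), (0.9, 79.43, 77.29),
]


def best_by(rows, column):
    """Return the row with the highest value in the given column (1 = Top-1, 2 = mAP)."""
    return max(rows, key=lambda row: row[column])


for name, rows in (("first sweep", sweep_market1501), ("second sweep", sweep_cuhk03)):
    best_top1 = best_by(rows, 1)
    best_map = best_by(rows, 2)
    print(f"{name}: best Top-1 at max weight {best_top1[0]}, best mAP at max weight {best_map[0]}")
```

In both sweeps the best Top-1 and the best mAP fall on the same row, at a max-pooling weight of 0.2 in the first table and 0.3 in the second, so the average-pooling term carries most of the weight in the best-performing settings.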
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhou, J.; Zhao, S.; Li, S.; Cheng, B.; Chen, J. Research on Person Re-Identification through Local and Global Attention Mechanisms and Combination Poolings. Sensors 2024, 24, 5638. https://doi.org/10.3390/s24175638