
Visible–infrared person re-identification via patch-mixed cross-modality learning

Published: 21 November 2024

Abstract

Visible–infrared person re-identification (VI-ReID) aims to retrieve images of the same pedestrian across different modalities, where the main challenge lies in the significant modality discrepancy. To narrow the modality gap, recent methods generate intermediate images via GANs, grayscaling, or mixup strategies. However, these methods can introduce extraneous data distributions, and the semantic correspondence between the two modalities is not well learned. In this paper, we propose a Patch-Mixed Cross-Modality framework (PMCM), in which two images of the same person from the two modalities are split into patches and stitched into a new image for model learning. A part-alignment loss is introduced to regularize representation learning, and a patch-mixed modality learning loss is proposed to align the modalities. In this way, the model learns to recognize a person through patches of different styles, so that the cross-modality semantic correspondence can be inferred. In addition, with this flexible image generation strategy, the ratio of patches from each modality can be freely adjusted, which further alleviates the modality imbalance problem. On two VI-ReID datasets, the proposed method achieves new state-of-the-art performance.
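To make the patch-mixing idea concrete, here is a minimal sketch of how such an image could be generated. The function name, grid size, and default mixing ratio are illustrative assumptions, not the paper's actual design; the sketch only shows patches from a visible and an infrared image of the same identity being stitched into one training image.

```python
# Minimal sketch of patch-mixed image generation, assuming aligned
# same-identity inputs of shape (H, W, C); the grid size and mixing
# ratio below are illustrative, not the paper's hyperparameters.
import numpy as np

def patch_mix(rgb_img, ir_img, grid=4, ir_ratio=0.5, rng=None):
    """Stitch patches from an RGB and an IR image into one image;
    `ir_ratio` is the fraction of patches drawn from the IR image."""
    rng = rng or np.random.default_rng()
    h, w = rgb_img.shape[:2]
    ph, pw = h // grid, w // grid
    mixed = rgb_img.copy()
    # Randomly pick which grid cells are replaced by IR patches.
    n_ir = int(round(ir_ratio * grid * grid))
    for c in rng.choice(grid * grid, size=n_ir, replace=False):
        i, j = divmod(int(c), grid)
        mixed[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = \
            ir_img[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
    return mixed
```

Varying `ir_ratio` between 0 and 1 is one way to realize the adjustable modality-patch ratio mentioned above, since it directly controls how much of each modality contributes to the mixed image.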

Highlights

Our method treats the RGB and IR images in the same way and alleviates the modality imbalance problem in VI-ReID.
The part-alignment loss constrains the consistency between the part-level and global prediction distributions (see the sketch after this list).
The patch-mixed modality learning loss aligns the patch-mixed modality with the visible and infrared modalities.
Our method achieves state-of-the-art performance, and the data imbalance problem is effectively alleviated.
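As a rough illustration of the part-alignment idea in the second highlight, the sketch below penalizes divergence between each part-level prediction distribution and the global one. The tensor shapes, the use of KL divergence, and the function name are assumptions; the paper's exact formulation is not reproduced here.

```python
# Hedged sketch of a part-alignment loss: each part's class
# distribution is pulled toward the global prediction distribution.
import torch
import torch.nn.functional as F

def part_alignment_loss(part_logits, global_logits):
    """part_logits: (B, P, C) logits for P body parts;
    global_logits: (B, C) logits from the global feature."""
    global_prob = F.softmax(global_logits, dim=-1).unsqueeze(1)  # (B, 1, C)
    part_log_prob = F.log_softmax(part_logits, dim=-1)           # (B, P, C)
    # KL(global || part), summed over parts, averaged over the batch.
    return F.kl_div(part_log_prob,
                    global_prob.expand_as(part_log_prob),
                    reduction="batchmean")
```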




Published In

Pattern Recognition, Volume 157, Issue C
Jan 2025, 967 pages

Publisher

Elsevier Science Inc.

United States


Author Tags

  1. Visible–infrared person re-identification
  2. Patch-mix

Qualifiers

  • Research-article


