SiamRhic: Improved Cross-Correlation and Ranking Head-Based Siamese Network for Object Tracking in Remote Sensing Videos
"> Figure 1
<p>The network architecture starts by taking a template image and a search image. It then extracts deep features using an enhanced ResNet50 network with a weighted attention mechanism. The CBAM attention mechanism is incorporated between the third, fourth, and fifth convolutional layers of the feature extraction network. These features are then input into an adaptive head network for cross-correlation and multi-layer feature fusion. Finally, ranking loss is applied to suppress the classification confidence scores of interfering items and reduce the mismatch between classification and regression.</p> "> Figure 2
<p>Attention mechanism. Feature maps from the third, fourth, and fifth convolutional blocks are processed through both channel and spatial attention mechanisms before being sent to the head network. The red box represents the channel attention mechanism, while the blue box represents the spatial attention mechanism.</p> "> Figure 3
<p>Channel attention module (CAM) and spatial attention module (SAM).</p> "> Figure 4
<p>Asymmetric convolution. (<b>a</b>) DW-Xcorr. (<b>b</b>) A naive approach for fusing feature maps of varying sizes. (<b>c</b>) Symmetric convolution.</p> "> Figure 5
<p>Ranking loss. We focus on samples with high classification confidence and increased IoU to achieve higher rankings, leveraging the relationship between the classification and regression branches. The red points represent the center point of the object obtained by classification, and the red boxes represent the bounding box of the object obtained by regression.</p> "> Figure 6
<p>The precision and success rates of our tracker compared to other trackers on the OTB100 dataset. (<b>a</b>) Success plots; (<b>b</b>) Precision plots.</p> "> Figure 7
<p>The success rate of our tracker compared to other trackers across the 11 challenges of the OTB100 dataset. (<b>a</b>) In-plane Rotation; (<b>b</b>) Fast Motion; (<b>c</b>) Out-of-view; (<b>d</b>) Low Resolution; (<b>e</b>) Occlusion; (<b>f</b>) Illumination Variation; (<b>g</b>) Deformation; (<b>h</b>) Motion Blur; (<b>i</b>) Out-of-plane Rotation; (<b>j</b>) Scale Variation; (<b>k</b>) Background Clutter.</p> "> Figure 8
<p>The precision of our tracker in comparison to other trackers across the 11 challenges of the OTB100 dataset. (<b>a</b>) In-plane Rotation; (<b>b</b>) Fast Motion; (<b>c</b>) Out-of-view; (<b>d</b>) Low Resolution; (<b>e</b>) Occlusion; (<b>f</b>) Il-lumination Variation; (<b>g</b>) Deformation; (<b>h</b>) Motion Blur; (<b>i</b>) Out-of-plane Rotation; (<b>j</b>) Background Clutter; (<b>k</b>) Scale Variation.</p> "> Figure 9
<p>The precision and success rates of our tracker, along with those of the comparison trackers, are evaluated on the UAV123 dataset. (<b>a</b>) Success plots; (<b>b</b>) Precision plots.</p> "> Figure 10
<p>The success rates of our tracker, along with those of the comparison trackers, are assessed across the twelve challenges of the UAV123 dataset. (<b>a</b>) Viewpoint Change; (<b>b</b>) Similar Object; (<b>c</b>) Fast Motion; (<b>d</b>) Out-of-view; (<b>e</b>) Full Occlusion; (<b>f</b>) Illumination Variation; (<b>g</b>) Background Clutter; (<b>h</b>) Aspect Ratio Variation; (<b>i</b>) Scale Variation; (<b>j</b>) Partial Occlusion; (<b>k</b>) Low Resolution; (<b>l</b>) Camera Motion.</p> "> Figure 11
<p>The precision of our tracker, as well as that of the comparison trackers, is evaluated across the twelve challenges presented in the UAV123 dataset. (<b>a</b>) Viewpoint Change; (<b>b</b>) Similar Object; (<b>c</b>) Fast Motion; (<b>d</b>) Out-of-view; (<b>e</b>) Full Occlusion; (<b>f</b>) Illumination Variation; (<b>g</b>) Background Clutter; (<b>h</b>) Aspect Ratio Variation; (<b>i</b>) Scale Variation; (<b>j</b>) Partial Occlusion; (<b>k</b>) Low Resolution; (<b>l</b>) Camera Motion.</p> "> Figure 12
<p>The precision, normalized precision, and success rates of both our tracker and the comparison trackers are assessed on the OOTB dataset. (<b>a</b>) Precision plot; (<b>b</b>) Normalized precision plots; (<b>c</b>) Success plots.</p> "> Figure 13
<p>The precision and success rates of our tracker compared to other trackers on the LaSOT dataset. (<b>a</b>) Success plots; (<b>b</b>) Precision plots.</p> "> Figure 14
<p>Visualization of the tracking results for our tracker and the comparative trackers across four video sequences from the OOTB dataset. The tracking results, displayed from left to right and top to bottom, correspond to the videos car_11_1, plane_1_1, ship_12_1, and train_1_1.</p> "> Figure 15
<p>Visualization of the tracking results for our tracker and the comparative trackers across four video sequences from the OTB dataset.</p> "> Figure 16
<p>Visualization of the tracking results for our tracker and the comparative trackers across four video sequences from the UAV123 dataset.</p> ">
Abstract
1. Introduction
1.1. Object Tracking in Traditional Scenarios
1.2. Object Tracking in Remote Sensing Videos
- We use an enhanced version of the ResNet50 architecture as the backbone network and incorporate the CBAM attention mechanism between the third, fourth, and fifth convolutional layers of the feature extraction network. Experiments show that this effectively enhances the representational capacity of the convolutional features.
- We implement asymmetric convolution to replace the original depth-wise cross-correlation algorithm, decomposing the large convolution process into two separate convolutions. This approach eliminates the need for sliding window operations and feature map concatenation during each iteration, improving both speed and performance.
- A ranking loss is introduced to extend the original loss function of the Siamese network. The classification ranking loss ensures that positive samples are prioritized over difficult negative samples, helping to reduce classification confidence scores for distracting objects. The IoU ranking loss is introduced to address the discrepancies between classification and regression.
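To make the first point concrete, here is a minimal NumPy sketch of CBAM-style attention. This is not the authors' implementation: the shared MLP weights `W1`/`W2` and the reduction ratio are illustrative assumptions, and CBAM's 7×7 convolution in the spatial branch is replaced by a fixed average of the mean and max maps purely for brevity.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def channel_attention(feat, W1, W2):
    # feat: (C, H, W); shared MLP over avg- and max-pooled channel descriptors
    avg = feat.mean(axis=(1, 2))              # (C,)
    mx = feat.max(axis=(1, 2))                # (C,)
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)
    att = sigmoid(mlp(avg) + mlp(mx))         # per-channel weights in (0, 1)
    return feat * att[:, None, None]

def spatial_attention(feat):
    # channel-wise mean and max maps; CBAM's 7x7 conv is simplified here
    # to a fixed average, purely for illustration
    avg = feat.mean(axis=0)                   # (H, W)
    mx = feat.max(axis=0)                     # (H, W)
    att = sigmoid(0.5 * (avg + mx))           # per-location weights in (0, 1)
    return feat * att[None, :, :]

def cbam(feat, W1, W2):
    # channel attention first, then spatial attention, as in CBAM
    return spatial_attention(channel_attention(feat, W1, W2))

rng = np.random.default_rng(0)
C, r = 8, 4                                   # channels, reduction ratio
feat = rng.standard_normal((C, 6, 6))
W1 = rng.standard_normal((C // r, C)) * 0.1   # (C/r, C)
W2 = rng.standard_normal((C, C // r)) * 0.1   # (C, C/r)
out = cbam(feat, W1, W2)                      # same shape as the input
```

Because both attention stages rescale features by weights in (0, 1), the refined map keeps the input's shape while damping uninformative channels and locations.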
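For reference on the second point, this sketch shows the baseline depth-wise cross-correlation (DW-Xcorr) that the asymmetric convolution replaces: each channel of the template acts as a single-channel kernel slid over the matching channel of the search features. The two-convolution decomposition itself follows the paper and is not reproduced here; the loops below are exactly the per-window sliding operation the decomposition avoids.

```python
import numpy as np

def dw_xcorr(search, template):
    # search: (C, H, W), template: (C, k, k) -> response: (C, H-k+1, W-k+1)
    C, H, W = search.shape
    Ck, k, _ = template.shape
    assert C == Ck, "search and template must have the same channel count"
    oh, ow = H - k + 1, W - k + 1
    out = np.zeros((C, oh, ow))
    for c in range(C):                 # each channel correlated independently
        for i in range(oh):
            for j in range(ow):
                out[c, i, j] = np.sum(search[c, i:i + k, j:j + k] * template[c])
    return out

x = np.ones((2, 4, 4))                 # toy search features
z = np.ones((2, 3, 3))                 # toy template features
resp = dw_xcorr(x, z)                  # every entry sums 3*3 ones = 9.0
```

A 4×4 search map correlated with a 3×3 template yields a 2×2 response per channel, which is why the response must later be fused across scales when feature maps differ in size.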
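The classification ranking loss in the third point can be illustrated with a generic pairwise logistic ranking loss, a common formulation rather than the authors' exact one: every positive sample is pushed to outscore every negative sample by a margin, which suppresses the confidence of distractors. The margin value here is an assumption.

```python
import numpy as np

def pairwise_rank_loss(scores, labels, margin=0.2):
    # scores: (N,) classification confidences; labels: (N,) with 1 = positive
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # all positive-negative score gaps, shifted by the margin: (P, N) matrix
    diffs = pos[:, None] - neg[None, :] - margin
    # logistic ranking penalty: near zero when positives clear the margin
    return float(np.mean(np.log1p(np.exp(-diffs))))

labels = np.array([1, 1, 0, 0])
well_ranked = pairwise_rank_loss(np.array([0.9, 0.8, 0.3, 0.1]), labels)
mis_ranked = pairwise_rank_loss(np.array([0.3, 0.1, 0.9, 0.8]), labels)
```

When distractors outscore the true object, the loss grows, so minimizing it reorders the confidence ranking in favor of positive samples; the paper's IoU ranking loss applies the same idea with IoU-based ordering to align classification with regression.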
2. Related Works
2.1. Correlation Filter-Based Algorithms
2.2. Deep Learning-Based Algorithms
2.2.1. Anchor-Based Algorithms
2.2.2. Anchor-Free Algorithms
3. Methods
3.1. Overview
3.2. Attention Mechanisms
3.3. Asymmetric Convolution
3.4. Ranking Loss
4. Experiments
4.1. Implementation Details
4.2. Evaluation Index
4.3. Experiments on the OTB Benchmark
4.4. Experiments on the UAV123 Benchmark
4.5. Experiments on the OOTB Benchmark
4.6. Experiments on the LaSOT Benchmark
4.7. Qualitative Evaluation
4.7.1. Qualitative Evaluation on the OOTB Benchmark
4.7.2. Qualitative Evaluation on the OTB Benchmark
4.7.3. Qualitative Evaluation on the UAV123 Benchmark
4.8. Ablation Study
5. Discussion
5.1. Discussion on the Performance of Algorithms in Complex Scenarios
5.2. Discussion of Model Generalization Performance
5.3. Discussion on the Contribution of Algorithms
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Bolme, D.S.; Beveridge, J.R.; Draper, B.A.; Lui, Y.M. Visual object tracking using adaptive correlation filters. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 2544–2550. [Google Scholar]
- Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. Exploiting the circulant structure of tracking-by-detection with kernels. In Proceedings of the Computer Vision-ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, 7–13 October 2012; Springer: Berlin/Heidelberg, Germany, 2012. Part IV 12. pp. 702–715. [Google Scholar]
- Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. High-speed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 583–596. [Google Scholar] [CrossRef] [PubMed]
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; pp. 886–893. [Google Scholar]
- Danelljan, M.; Hager, G.; Khan, F.S.; Felsberg, M. Discriminative scale space tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1561–1575. [Google Scholar] [CrossRef] [PubMed]
- Ma, C.; Yang, X.; Zhang, C.; Yang, M.H. Long-term correlation tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5388–5396. [Google Scholar]
- Bertinetto, L.; Valmadre, J.; Henriques, J.F.; Vedaldi, A.; Torr, P. Fully-Convolutional Siamese Networks for Object Tracking. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 850–865. [Google Scholar]
- Li, B.; Yan, J.; Wu, W.; Zhu, Z.; Hu, X. High Performance Visual Tracking with Siamese Region Proposal Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8971–8980. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef] [PubMed]
- Zhu, Z.; Wang, Q.; Li, B.; Wu, W.; Yan, J.; Hu, W. Distractor-aware Siamese Networks for Visual Object Tracking. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 101–117. [Google Scholar]
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9626–9635. [Google Scholar]
- Xu, Y.; Wang, Z.; Li, Z.; Yuan, Y.; Yu, G. SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 12549–12556. [Google Scholar]
- Zhou, X.; Wang, D.; Krahenbuhl, P. Objects as Points. arXiv 2019, arXiv:1904.07850. [Google Scholar]
- Li, Q.; Qin, Z.; Zhang, W.; Zheng, W. Siamese Keypoint Prediction Network for Visual Object Tracking. arXiv 2020, arXiv:2006.04078. [Google Scholar]
- Hu, Z.; Yang, D.; Zhang, K.; Chen, Z. Object tracking in satellite videos based on convolutional regression network with appearance and motion features. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 783–793. [Google Scholar] [CrossRef]
- Li, Z.; Yuan, L.; Nevatia, R. Global data association for multi-object tracking using network flows. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
- Feng, J.; Hui, B.; Liang, Y.; Yao, Q.; Zhang, X. Improved SiamRPN++ with Clustering-Based Frame Differencing for Object Tracking of Remote Sensing Videos. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 4163–4166. [Google Scholar]
- Li, B.; Wu, W.; Wang, Q.; Zhang, F.; Xing, J.; Yan, J. SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4282–4291. [Google Scholar]
- Yang, J.; Pan, Z.; Liu, Y.; Niu, B.; Lei, B. Single Object Tracking in Satellite Videos Based on Feature Enhancement and Multi-Level Matching Strategy. Remote Sens. 2023, 15, 4351. [Google Scholar] [CrossRef]
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Computer Vision—ECCV 2018, Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018. [Google Scholar] [CrossRef]
- Huang, L.; Zhao, X.; Huang, K. GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 1562–1577. [Google Scholar] [CrossRef] [PubMed]
- Fan, H.; Lin, L.; Yang, F.; Chu, P.; Deng, G.; Yu, S.; Bai, H.; Xu, Y.; Liao, C.; Ling, H. LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5369–5378. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
- Wu, Y.; Lim, J.; Yang, M.-H. Online object tracking: A benchmark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, USA, 23–28 June 2013; pp. 2411–2418. [Google Scholar]
- Zhang, Z.; Peng, H.; Fu, J.; Li, B.; Hu, W. Ocean: Object-aware anchor-free tracking. In Proceedings of the Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020. Part XXI 16. pp. 771–787. [Google Scholar]
- Danelljan, M.; Bhat, G.; Khan, F.S.; Felsberg, M. Atom: Accurate tracking by overlap maximization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4660–4669. [Google Scholar]
- Li, P.; Chen, B.; Ouyang, W.; Wang, D.; Yang, X.; Lu, H. GradNet: Gradient-guided network for visual object tracking. In Proceedings of the International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6162–6171. [Google Scholar]
- Zhang, Z.; Peng, H. Deeper and Wider Siamese Networks for Real-Time Visual Tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4591–4600. [Google Scholar]
- Danelljan, M.; Hager, G.; Shahbaz Khan, F.; Felsberg, M. Learning spatially regularized correlation filters for visual tracking. In Proceedings of the International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 4310–4318. [Google Scholar]
- Valmadre, J.; Bertinetto, L.; Henriques, J.; Vedaldi, A.; Torr, P.H.S. End-to-End Representation Learning for Correlation Filter Based Tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5000–5008. [Google Scholar]
- Mueller, M.; Smith, N.; Ghanem, B. A benchmark and simulator for uav tracking. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 445–461. [Google Scholar]
- Fei, D. Research on Visual Target Tracking Method Based on Attention Mechanism. Master’s Thesis, Harbin Institute of Technology, Harbin, China, 2021. [Google Scholar] [CrossRef]
- Chen, Z.; Zhong, B.; Li, G.; Zhang, S.; Ji, R. Siamese box adaptive network for visual tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 6668–6677. [Google Scholar]
- Guo, D.; Wang, J.; Cui, Y.; Wang, Z.; Chen, S. SiamCAR: Siamese fully convolutional classification and regression for visual tracking. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 6268–6276. [Google Scholar]
- Guo, D.Y.; Shao, Y.Y.; Cui, Y.; Wang, Z.; Shen, C. Graph attention tracking. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 9538–9547. [Google Scholar]
- Cao, Z.; Fu, C.; Ye, J.; Li, B.; Li, Y. SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 3086–3092. [Google Scholar]
- Shen, H.; Lin, D.; Song, T. A real-time siamese tracker deployed on UAVs. J. Real-Time Image Process. 2022, 19, 463–473. [Google Scholar] [CrossRef]
- Chen, Y.; Tang, Y.; Xiao, Y.; Yuan, Q.; Zhang, Y.; Liu, F.; He, J.; Zhang, L. Satellite video single object tracking: A systematic review and an oriented object tracking benchmark. ISPRS J. Photogramm. Remote Sens. 2024, 210, 212–240. [Google Scholar] [CrossRef]
- Zhou, J.; Wang, P.; Sun, H. Discriminative and Robust Online Learning for Siamese Visual Tracking. In Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 13017–13024. [Google Scholar]
- Dong, X.; Shen, J.; Shao, L.; Porikli, F. CLNet: A Compact Latent Network for Fast Adjusting Siamese Trackers. In Proceedings of the Computer Vision—ECCV 2020, 16th European Conference, Glasgow, UK, 23–28 August 2020. [Google Scholar] [CrossRef]
- Zheng, G.; Fu, C.; Ye, J.; Li, B.; Lu, G.; Pan, J. Scale-Aware Siamese Object Tracking for Vision-Based UAM Approaching. IEEE Trans. Ind. Inform. 2023, 19, 9349–9360. [Google Scholar] [CrossRef]
Comparison with other trackers on the OTB100 benchmark.
Algorithm | Success | Precision |
---|---|---|
GradNet | 0.639 | 0.864 |
CFNet | 0.587 | 0.778 |
SiamFC | 0.587 | 0.772 |
SiamRPN | 0.629 | 0.847 |
ATOM | 0.667 | 0.879 |
DaSiamRPN | 0.658 | 0.880 |
SiamDW | 0.627 | 0.828 |
Ocean | 0.676 | 0.897 |
SRDCF | 0.598 | 0.789 |
Ours | 0.670 | 0.892 |
Comparison with other trackers on the UAV123 benchmark.
Algorithm | Success | Precision |
---|---|---|
SiamAPN | 0.573 | 0.763 |
SiamAPN++ | 0.579 | 0.766 |
SiamSlim | 0.609 | 0.805 |
CGACD | 0.620 | 0.815 |
SiamCAR | 0.615 | 0.804 |
SiamRPN++ | 0.611 | 0.804 |
SiamBAN | 0.604 | 0.795 |
SiamRPN | 0.581 | 0.772 |
SiamDW | 0.536 | 0.776 |
Ours | 0.621 | 0.823 |
Comparison with other trackers on the OOTB benchmark.
Algorithm | Success | Normalized Precision | Precision |
---|---|---|---|
SiamBAN | 0.495 | 0.684 | 0.709 |
SiamRPN | 0.490 | 0.744 | 0.747 |
DROL | 0.236 | 0.377 | 0.283 |
CLNet | 0.514 | 0.774 | 0.796 |
SiamAPN++ | 0.476 | 0.752 | 0.767 |
SiamSA | 0.500 | 0.736 | 0.752 |
SiamKPN | 0.518 | 0.733 | 0.735 |
SiamAPN | 0.444 | 0.693 | 0.687 |
Ours | 0.533 | 0.786 | 0.812 |
Algorithm | AM | BC | DEF | FO | IM | IPR | IV | LT | MB | OON | PO | SA |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SiamBAN | 0.719 | 0.688 | 0.502 | 0.502 | 0.816 | 0.721 | 0.670 | 0.518 | 0.716 | 0.505 | 0.699 | 0.622 |
SiamRPN | 0.874 | 0.763 | 0.449 | 0.396 | 0.807 | 0.706 | 0.717 | 0.667 | 0.720 | 0.433 | 0.759 | 0.673 |
DROL | 0.146 | 0.274 | 0.205 | 0.051 | 0.229 | 0.289 | 0.307 | 0.178 | 0.242 | 0.210 | 0.118 | 0.189 |
CLNet | 0.884 | 0.808 | 0.483 | 0.585 | 0.836 | 0.766 | 0.785 | 0.714 | 0.794 | 0.516 | 0.794 | 0.701 |
SiamAPN++ | 0.650 | 0.775 | 0.470 | 0.652 | 0.805 | 0.697 | 0.776 | 0.691 | 0.780 | 0.516 | 0.614 | 0.603 |
SiamSA | 0.735 | 0.758 | 0.466 | 0.571 | 0.777 | 0.714 | 0.764 | 0.682 | 0.776 | 0.477 | 0.653 | 0.631 |
SiamKPN | 0.812 | 0.740 | 0.479 | 0.535 | 0.821 | 0.712 | 0.703 | 0.606 | 0.747 | 0.503 | 0.722 | 0.656 |
SiamAPN | 0.676 | 0.713 | 0.420 | 0.487 | 0.765 | 0.613 | 0.697 | 0.597 | 0.704 | 0.416 | 0.592 | 0.577 |
Ours | 0.847 | 0.826 | 0.517 | 0.576 | 0.843 | 0.784 | 0.791 | 0.715 | 0.816 | 0.533 | 0.774 | 0.714 |
Algorithm | AM | BC | DEF | FO | IM | IPR | IV | LT | MB | OON | PO | SA |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SiamBAN | 0.598 | 0.655 | 0.599 | 0.522 | 0.715 | 0.690 | 0.668 | 0.488 | 0.704 | 0.564 | 0.614 | 0.591 |
SiamRPN | 0.808 | 0.729 | 0.658 | 0.401 | 0.747 | 0.742 | 0.730 | 0.630 | 0.729 | 0.583 | 0.730 | 0.671 |
DROL | 0.165 | 0.352 | 0.377 | 0.059 | 0.215 | 0.382 | 0.425 | 0.221 | 0.357 | 0.336 | 0.137 | 0.268 |
CLNet | 0.784 | 0.763 | 0.697 | 0.541 | 0.756 | 0.781 | 0.758 | 0.668 | 0.749 | 0.655 | 0.752 | 0.687 |
SiamAPN++ | 0.579 | 0.735 | 0.669 | 0.603 | 0.737 | 0.711 | 0.771 | 0.652 | 0.767 | 0.672 | 0.597 | 0.607 |
SiamSA | 0.684 | 0.724 | 0.667 | 0.533 | 0.717 | 0.732 | 0.746 | 0.648 | 0.757 | 0.668 | 0.624 | 0.640 |
SiamKPN | 0.693 | 0.722 | 0.700 | 0.535 | 0.716 | 0.720 | 0.722 | 0.591 | 0.737 | 0.647 | 0.669 | 0.652 |
SiamAPN | 0.601 | 0.693 | 0.551 | 0.450 | 0.696 | 0.635 | 0.724 | 0.560 | 0.728 | 0.532 | 0.537 | 0.576 |
Ours | 0.749 | 0.780 | 0.715 | 0.539 | 0.751 | 0.776 | 0.777 | 0.670 | 0.777 | 0.663 | 0.725 | 0.697 |
Algorithm | AM | BC | DEF | FO | IM | IPR | IV | LT | MB | OON | PO | SA |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SiamBAN | 0.450 | 0.491 | 0.398 | 0.344 | 0.537 | 0.495 | 0.470 | 0.387 | 0.520 | 0.378 | 0.461 | 0.432 |
SiamRPN | 0.555 | 0.493 | 0.386 | 0.250 | 0.514 | 0.466 | 0.478 | 0.441 | 0.494 | 0.360 | 0.511 | 0.458 |
DROL | 0.106 | 0.224 | 0.208 | 0.036 | 0.140 | 0.235 | 0.258 | 0.143 | 0.227 | 0.198 | 0.095 | 0.170 |
CLNet | 0.546 | 0.518 | 0.409 | 0.350 | 0.518 | 0.491 | 0.497 | 0.468 | 0.517 | 0.398 | 0.531 | 0.474 |
SiamAPN++ | 0.359 | 0.476 | 0.381 | 0.392 | 0.499 | 0.425 | 0.487 | 0.433 | 0.481 | 0.400 | 0.398 | 0.388 |
SiamSA | 0.465 | 0.502 | 0.418 | 0.381 | 0.503 | 0.481 | 0.504 | 0.459 | 0.520 | 0.437 | 0.450 | 0.441 |
SiamKPN | 0.527 | 0.531 | 0.405 | 0.373 | 0.527 | 0.500 | 0.498 | 0.442 | 0.522 | 0.401 | 0.495 | 0.463 |
SiamAPN | 0.401 | 0.453 | 0.328 | 0.299 | 0.482 | 0.399 | 0.457 | 0.383 | 0.473 | 0.337 | 0.368 | 0.385 |
Ours | 0.538 | 0.542 | 0.420 | 0.369 | 0.541 | 0.513 | 0.519 | 0.483 | 0.542 | 0.421 | 0.517 | 0.492 |
Comparison with other trackers on the LaSOT benchmark.
Algorithm | Success | Precision |
---|---|---|
SiamFC | 0.336 | 0.339 |
SiamDW | 0.347 | 0.329 |
SiamRPN++ | 0.495 | 0.493 |
SiamBAN | 0.514 | 0.521 |
ATOM | 0.499 | 0.497 |
Ocean | 0.526 | 0.526 |
Ours | 0.532 | 0.539 |
Ablation study of each module on the OOTB benchmark.
CBAM | Asymmetric Convolution | Ranking Loss | Success | Normalized Precision | Precision |
---|---|---|---|---|---|
× | × | × | 0.506 | 0.744 | 0.761 |
√ | × | × | 0.512 | 0.755 | 0.772 |
× | √ | × | 0.509 | 0.749 | 0.767 |
× | × | √ | 0.516 | 0.761 | 0.783 |
√ | √ | × | 0.510 | 0.751 | 0.771 |
√ | × | √ | 0.527 | 0.778 | 0.801 |
× | √ | √ | 0.523 | 0.770 | 0.790 |
√ | √ | √ | 0.533 | 0.786 | 0.812 |
Yang, A.; Yang, Z.; Feng, W. SiamRhic: Improved Cross-Correlation and Ranking Head-Based Siamese Network for Object Tracking in Remote Sensing Videos. Remote Sens. 2024, 16, 4549. https://doi.org/10.3390/rs16234549