Deblurring transformer tracking with conditional cross-attention

Published: 30 December 2022

Abstract

In object tracking, motion blur is a common challenge caused by rapid motion of the target object or a long camera exposure time, and it degrades tracking performance. Traditional solutions usually perform image recovery before tracking the object; however, most image recovery methods have a high computational cost, which reduces tracking speed. To solve these problems, we propose a deblurring Transformer-based tracking method that embeds conditional cross-attention. The proposed method integrates three modules: (1) an image quality assessment (IQA) module that estimates image quality; (2) an image deblurring module, based on a lightweight adversarial network, that improves image quality; and (3) a tracking module, based on a Transformer with conditional cross-attention, that enhances object localization. Experimental results on two UAV object tracking benchmarks show that the proposed trackers achieve competitive results compared with several state-of-the-art trackers.
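The abstract only names the three modules, so the following is a minimal PyTorch sketch of how such a pipeline could compose at inference time. Every class name, layer size, and the quality threshold below is an illustrative assumption, not the authors' implementation: an IQA score gates a lightweight deblurring generator, whose output feeds a Transformer tracking head in which search-region tokens cross-attend to template tokens conditioned on a learned spatial query (in the spirit of Conditional DETR). Gating the deblurring step on the IQA score is what keeps the added cost low on sharp frames.

```python
# Illustrative sketch of the three-module pipeline described in the abstract.
# All module names, layer sizes, and the quality threshold are assumptions.
import torch
import torch.nn as nn

class IQAModule(nn.Module):
    """Predicts a scalar quality score in [0, 1] for an input frame."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, x):
        return self.head(self.features(x))

class DeblurGenerator(nn.Module):
    """Lightweight generator; the residual connection keeps the output
    close to the input when little deblurring is needed."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        return torch.clamp(x + self.body(x), 0.0, 1.0)

class ConditionalCrossAttentionTracker(nn.Module):
    """Transformer head: search-region tokens attend to template tokens,
    conditioned on a learned spatial query added to the content tokens."""
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, 16, stride=16)      # patchify frames
        self.pos_query = nn.Parameter(torch.randn(1, 1, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.box_head = nn.Linear(dim, 4)                  # (cx, cy, w, h)

    def tokens(self, x):
        f = self.embed(x)                                  # B, C, H', W'
        return f.flatten(2).transpose(1, 2)                # B, H'*W', C

    def forward(self, template, search):
        t, s = self.tokens(template), self.tokens(search)
        q = s + self.pos_query                             # conditional query
        out, _ = self.cross_attn(q, t, t)
        return self.box_head(out.mean(dim=1)).sigmoid()    # normalized box

def track_frame(frame, template, iqa, deblur, tracker, threshold=0.5):
    """Run deblurring only when the IQA score says the frame is degraded."""
    if iqa(frame).item() < threshold:
        frame = deblur(frame)
    return tracker(template, frame)

if __name__ == "__main__":
    iqa, deblur = IQAModule().eval(), DeblurGenerator().eval()
    tracker = ConditionalCrossAttentionTracker().eval()
    template = torch.rand(1, 3, 128, 128)   # exemplar crop
    frame = torch.rand(1, 3, 256, 256)      # search region
    with torch.no_grad():
        print(track_frame(frame, template, iqa, deblur, tracker))
```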

Published In

Multimedia Systems, Volume 29, Issue 3, June 2023 (936 pages)

Publisher

Springer-Verlag, Berlin, Heidelberg

Publication History

Received: 20 November 2022
Accepted: 19 December 2022
Published: 30 December 2022

Author Tags

  1. Object tracking
  2. Transformer
  3. Motion blur
  4. Image quality assessment
  5. Image deblurring

Qualifiers

  • Research-article

Cited By

  • (2024) A Novel Long Short-Term Memory Learning Strategy for Object Tracking. International Journal of Intelligent Systems, 2024. DOI: 10.1155/2024/6632242. Online publication date: 1-Jan-2024
  • (2024) SOCF. Expert Systems with Applications: An International Journal, 238:PE. DOI: 10.1016/j.eswa.2023.122131. Online publication date: 27-Feb-2024
  • (2024) SiamS3C: spatial-channel cross-correlation for visual tracking with centerness-guided regression. Multimedia Systems, 30:5. DOI: 10.1007/s00530-024-01450-5. Online publication date: 20-Aug-2024
  • (2024) Siamada: visual tracking based on Siamese adaptive learning network. Neural Computing and Applications, 36:14, pp. 7639-7656. DOI: 10.1007/s00521-024-09481-9. Online publication date: 1-May-2024
