Abstract
Multi-Object Tracking (MOT) is a challenging research area in computer vision with important practical applications. Deep neural networks have driven substantial progress in MOT, and QDTrack has become a widely used algorithm owing to its relatively simple structure and high performance. However, accurate target tracking in complex scenes with mutual occlusion, motion blur and complex backgrounds remains difficult. To address the low tracking accuracy in such scenes, this paper proposes an end-to-end multi-object tracking algorithm based on attention feature fusion. First, a new lightweight attention module is introduced into the backbone network, which enhances the network's ability to capture key information and locate targets without increasing computational complexity. Second, the feature pyramid structure is improved to reduce feature loss during fusion and strengthen the model's feature representation. Finally, the Intersection over Union (IoU) loss of the original model is optimised to improve bounding-box regression, and PolyLoss is used to optimise the cross-entropy loss. Experimental results on the MOT16 and MOT17 benchmarks show that the proposed algorithm improves the accuracy and robustness of multi-object pedestrian tracking compared with other algorithms.
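The abstract names two loss-related ingredients, IoU for bounding-box regression and PolyLoss for classification, without stating which variants the chapter uses. As a minimal sketch only, the following assumes plain IoU on axis-aligned boxes and the Poly-1 form from the PolyLoss paper (cross-entropy plus epsilon * (1 - p_t)). The function names, the (x1, y1, x2, y2) box layout and the default epsilon = 1.0 are illustrative assumptions, not details taken from the chapter.

```python
import math


def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumption: the chapter's optimised IoU variant is not specified in the
    abstract, so this shows only the basic quantity it builds on.
    """
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def poly1_cross_entropy(probs, target, epsilon=1.0):
    """Poly-1 loss for one sample: CE + epsilon * (1 - p_t).

    probs   -- predicted class probabilities (already normalised)
    target  -- index of the true class
    epsilon -- weight of the first polynomial term (hypothetical default)
    """
    p_t = probs[target]
    cross_entropy = -math.log(p_t)
    return cross_entropy + epsilon * (1.0 - p_t)


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(poly1_cross_entropy([0.5, 0.5], target=0))
```

Setting epsilon to 0 reduces the Poly-1 loss to ordinary cross-entropy, which makes it easy to compare the two in an ablation.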
Ethics declarations
Conflicts of interest
The authors state that they have no conflicting financial interests or personal connections that may have influenced the work reported in this paper.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, Y., Du, Z., Wang, D. (2023). Pedestrian Multi-object Tracking Algorithm Based on Attention Feature Fusion. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2023. Lecture Notes in Computer Science, vol 14134. Springer, Cham. https://doi.org/10.1007/978-3-031-43085-5_9
DOI: https://doi.org/10.1007/978-3-031-43085-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43084-8
Online ISBN: 978-3-031-43085-5
eBook Packages: Computer Science, Computer Science (R0)