
Online Pedestrian Multiple-Object Tracking with Prediction Refinement and Track Classification

Neural Processing Letters

Abstract

The performance of pedestrian multiple-object tracking (MOT) based on the tracking-by-detection framework is highly sensitive to detection quality, and it suffers in particular from missed or inaccurate detections caused by occlusion. Existing methods that aim to alleviate this problem still perform poorly in scenarios with frequent heavy occlusions. In this study, a novel online pedestrian MOT method is proposed for targets with severe occlusion. First, a regression network refines the predicted position of each target to obtain a precise bounding box and a visibility score. Based on the visibility score and the overlap between these refined bounding boxes, considered globally, heavily occluded targets are categorised into two types: (1) targets occluded by a non-pedestrian object and (2) targets occluded by other pedestrians. The two types are then handled differently, which reduces the number of false negatives (FNs) and false positives (FPs). Finally, to enhance the precision of the prediction, a motion model that combines the Kalman filter with camera motion compensation is developed. Tracking results on three widely used pedestrian MOT benchmark datasets demonstrate the state-of-the-art performance of the proposed method.
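The track classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how overlap is measured or which thresholds are used, so the use of intersection-over-union (IoU), the names `Track`, `classify_tracks`, `vis_thresh`, and `overlap_thresh`, and the threshold values are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class Track:
    track_id: int
    box: Box            # refined bounding box (assumed to come from the regression network)
    visibility: float   # visibility score in [0, 1] (assumed range)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def classify_tracks(
    tracks: List[Track],
    vis_thresh: float = 0.5,      # assumed value, not from the paper
    overlap_thresh: float = 0.3,  # assumed value, not from the paper
) -> Dict[int, str]:
    """Label each track as visible, occluded by a pedestrian, or occluded by an object."""
    labels: Dict[int, str] = {}
    for t in tracks:
        if t.visibility >= vis_thresh:
            labels[t.track_id] = "visible"
            continue
        # Low visibility: check whether another tracked pedestrian overlaps this box.
        max_overlap = max(
            (iou(t.box, o.box) for o in tracks if o.track_id != t.track_id),
            default=0.0,
        )
        if max_overlap >= overlap_thresh:
            labels[t.track_id] = "occluded_by_pedestrian"
        else:
            labels[t.track_id] = "occluded_by_object"
    return labels


tracks = [
    Track(1, (0.0, 0.0, 10.0, 20.0), 0.9),
    Track(2, (4.0, 0.0, 14.0, 20.0), 0.2),     # overlaps track 1
    Track(3, (100.0, 0.0, 110.0, 20.0), 0.1),  # isolated, low visibility
]
labels = classify_tracks(tracks)
```

In this toy input, track 2 has low visibility and overlaps track 1, so it is labelled as occluded by a pedestrian, while track 3 has low visibility with no overlapping pedestrian and is labelled as occluded by a non-pedestrian object. How each type is then handled is described in the full article and is not reproduced here.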



Acknowledgements

This study was supported by the Graduate Innovation Foundation of Jiangsu Province under Grant No. KYLX16_0781, Natural Science Foundation of Jiangsu Province under Grant No. BK20181340, the 111 Project under Grant No. B12018, and PAPD of Jiangsu Higher Education Institutions. We would like to thank Editage (www.editage.cn) for English language editing.

Author information

Corresponding author

Correspondence to Hongwei Ge.

About this article

Cite this article

Yang, J., Ge, H., Yang, J. et al. Online Pedestrian Multiple-Object Tracking with Prediction Refinement and Track Classification. Neural Process Lett 54, 4893–4919 (2022). https://doi.org/10.1007/s11063-022-10840-7
