Online Pedestrian Multiple-Object Tracking with Prediction Refinement and Track Classification

Jieming Yang^1,2, Hongwei Ge^1,2, Jinlong Yang^1,2, Yubing Tong^3 & Shuzhi Su^1,2,3

Abstract

The performance of pedestrian multiple-object tracking (MOT) based on the tracking-by-detection framework depends heavily on detection quality and suffers in particular from missed or inaccurate detections caused by occlusion. Existing methods that aim to alleviate this problem still perform poorly in scenarios with frequent heavy occlusion. In this study, a novel online pedestrian MOT method is proposed for targets under severe occlusion. First, a regression network is employed to refine the predicted position of each target, yielding a precise bounding box and a visibility score. By considering the visibility scores and the overlaps among all refined bounding boxes globally, heavily occluded targets are categorised into two types: (1) targets occluded by a non-pedestrian object and (2) targets occluded by other pedestrians. These two types of occluded targets are then handled in different ways, which reduces the number of false negatives (FNs) and false positives (FPs). Finally, to enhance the precision of the prediction, a motion model that combines the Kalman filter with camera motion compensation is developed. Tracking results on three widely used pedestrian MOT benchmark datasets demonstrate the state-of-the-art performance of the proposed method.
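
As a rough illustration of the track classification step outlined in the abstract, the sketch below labels each refined track as visible, occluded by a non-pedestrian object, or occluded by another pedestrian. The thresholds `vis_thresh` and `iou_thresh`, the function names, and the decision rule itself (low visibility plus high overlap with another track is taken to mean pedestrian occlusion) are assumptions made for illustration only; the abstract does not state the criteria actually used in the paper.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def classify_tracks(
    boxes: Dict[int, Box],
    visibility: Dict[int, float],
    vis_thresh: float = 0.3,   # assumed value, not from the paper
    iou_thresh: float = 0.4,   # assumed value, not from the paper
) -> Dict[int, str]:
    """Label every track using its visibility score and its overlap with all other tracks."""
    labels: Dict[int, str] = {}
    for tid, box in boxes.items():
        if visibility[tid] >= vis_thresh:
            labels[tid] = "visible"
            continue
        # Heavily occluded: check overlap against every other refined box.
        overlaps_other = any(
            iou(box, other) >= iou_thresh
            for oid, other in boxes.items()
            if oid != tid
        )
        labels[tid] = (
            "occluded_by_pedestrian" if overlaps_other else "occluded_by_object"
        )
    return labels


if __name__ == "__main__":
    boxes = {1: (0, 0, 10, 20), 2: (2, 0, 12, 20), 3: (50, 0, 60, 20)}
    visibility = {1: 0.9, 2: 0.1, 3: 0.2}
    print(classify_tracks(boxes, visibility))
```

In this toy input, track 2 has low visibility and overlaps track 1 heavily, so it is labelled as occluded by a pedestrian, whereas track 3 has low visibility but no overlapping neighbour and is labelled as occluded by an object. A tracker could then treat the two labels differently, for example by keeping one kind of track alive longer than the other, but how the paper handles each type is not described in the abstract.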

Acknowledgements

This study was supported by the Graduate Innovation Foundation of Jiangsu Province under Grant No. KYLX16_0781, Natural Science Foundation of Jiangsu Province under Grant No. BK20181340, the 111 Project under Grant No. B12018, and PAPD of Jiangsu Higher Education Institutions. We would like to thank Editage (www.editage.cn) for English language editing.

Author information

Authors and Affiliations

Key Laboratory of Advanced Process Control for Light Industry (Jiangnan University), Ministry of Education, Wuxi, 214122, China
Jieming Yang, Hongwei Ge, Jinlong Yang & Shuzhi Su
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, Jiangsu, China
Jieming Yang, Hongwei Ge, Jinlong Yang & Shuzhi Su
Medical Image Processing Group, Department of Radiology, University of Pennsylvania, Philadelphia, PA, 19104, USA
Yubing Tong & Shuzhi Su

Corresponding author

Correspondence to Hongwei Ge.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, J., Ge, H., Yang, J. et al. Online Pedestrian Multiple-Object Tracking with Prediction Refinement and Track Classification. Neural Process Lett 54, 4893–4919 (2022). https://doi.org/10.1007/s11063-022-10840-7

Accepted: 09 April 2022
Published: 02 May 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11063-022-10840-7
