Abstract
The paper considers two video analytics problems that can be solved by tracking people in a video stream: people counting and estimation of queue waiting time. Modern video surveillance systems include several hundred thousand cameras, so one of the most important challenges in video analytics is optimizing the use of computing resources. Most currently available tracking algorithms are inefficient because they run computationally expensive CNN-based detectors on densely sampled video frames. In this paper, we propose methods for solving both problems that improve overall efficiency by running detection only on sparse frames. The experimental evaluation of the proposed methods demonstrates their viability in terms of both performance and computing resource usage.
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by Yu. Kornienko
About this article
Cite this article
Mamedov, T.Z., Kuplyakov, D.A. & Konushin, A.S. Video Analytics Using Detection on Sparse Frames. Program Comput Soft 48, 155–163 (2022). https://doi.org/10.1134/S0361768822030070
DOI: https://doi.org/10.1134/S0361768822030070