Unsupervised deep context prediction for background estimation and foreground segmentation

Maryam Sultana¹,
Arif Mahmood²,
Sajid Javed³ &
Soon Ki Jung¹ (ORCID: orcid.org/0000-0003-0239-6785)

Abstract

Background estimation is a fundamental step in many high-level vision applications, such as tracking and surveillance. Existing background estimation techniques degrade in the presence of challenges such as dynamic backgrounds, photometric variations, camera jitter, and shadows. To handle these challenges and estimate the background accurately, we propose a unified method based on a Generative Adversarial Network (GAN) and image inpainting. The proposed method is built on a context prediction network, an unsupervised visual feature learning hybrid GAN model. Context prediction is followed by a semantic inpainting network for texture enhancement. We also propose a solution for arbitrary region inpainting that uses the center region inpainting method and the Poisson blending technique. The proposed algorithm is compared with existing state-of-the-art methods for background estimation and foreground segmentation and outperforms them by a significant margin.
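
The abstract names Poisson blending as the step that merges an inpainted region back into the frame, but it gives no formulation. As a rough illustration only, the sketch below implements the classical Poisson image editing idea (keep the gradients of the inserted patch, match the surrounding image on the boundary) with Gauss-Seidel sweeps on grayscale images stored as plain Python lists. The function name `poisson_blend`, its parameters, and the choice of solver are assumptions made for this example; this is not the authors' implementation, and a practical version would use an array library and a sparse solver.

```python
def poisson_blend(source, target, mask, iters=500):
    """Blend `source` into `target` wherever `mask` is True.

    Hypothetical helper for illustration. All arguments are equally
    sized 2-D lists holding grayscale values (mask holds booleans).
    """
    h, w = len(target), len(target[0])
    result = [[float(v) for v in row] for row in target]

    # Only interior pixels are solved for, so every unknown has four
    # neighbours; the outer image border always keeps the target value.
    inside = [(y, x) for y in range(1, h - 1) for x in range(1, w - 1)
              if mask[y][x]]

    # Initial guess: paste the source values into the masked region.
    for y, x in inside:
        result[y][x] = float(source[y][x])

    for _ in range(iters):
        for y, x in inside:
            # Guidance term: discrete Laplacian of the source at (y, x).
            lap = (4.0 * source[y][x]
                   - source[y - 1][x] - source[y + 1][x]
                   - source[y][x - 1] - source[y][x + 1])
            # Neighbours outside the mask hold fixed target values;
            # neighbours inside it hold the current estimates.
            neighbours = (result[y - 1][x] + result[y + 1][x]
                          + result[y][x - 1] + result[y][x + 1])
            # Gauss-Seidel update for 4*f - sum(neighbours) = lap.
            result[y][x] = (neighbours + lap) / 4.0
    return result
```

Because only the gradients of `source` enter the solve, a patch that differs from the target by a constant brightness offset blends back to the target itself, which is the property that hides seams between an inpainted region and its surroundings.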


Acknowledgements

This research was supported by the Development project of leading technology for future vehicle of the business of Daegu Metropolitan City (No. 20171105).

Author information

Authors and Affiliations

Virtual Reality Laboratory, School of Computer Science and Engineering, Kyungpook National University, Daegu, Republic of Korea
Maryam Sultana & Soon Ki Jung
Department of Computer Science, Information Technology University (ITU), Lahore, Pakistan
Arif Mahmood
Tissue Image Analytics Laboratory, Department of Computer Science, University of Warwick, Warwick, UK
Sajid Javed

Corresponding author

Correspondence to Soon Ki Jung.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sultana, M., Mahmood, A., Javed, S. et al. Unsupervised deep context prediction for background estimation and foreground segmentation. Machine Vision and Applications 30, 375–395 (2019). https://doi.org/10.1007/s00138-018-0993-0

Received: 19 May 2018
Revised: 08 October 2018
Accepted: 14 November 2018
Published: 26 November 2018
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s00138-018-0993-0
