Abstract
This paper presents a novel compressed-domain saliency estimation method based on analyzing block motion vectors and transform residuals extracted from the bitstream of H.264/AVC compressed videos. The orientations of the block motion vectors are modeled using Dual Cross Patterns, a feature descriptor originally developed for face recognition, to obtain the motion saliency map. The transform residuals are analyzed by applying the lifting wavelet transform to the luminance component of the macroblocks to obtain the spatial saliency map. The motion saliency map and the spatial saliency map are then fused via the Dempster–Shafer combination rule to generate the final saliency map. Our experiments show that Dual Cross Patterns and lifting wavelet transform features fused via the Dempster–Shafer rule predict fixations more accurately than existing state-of-the-art saliency models.
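The fusion step can be illustrated with a minimal sketch of Dempster's combination rule over the binary frame {salient, non-salient}. Here each map value is treated directly as the mass assigned to "salient", with the remainder on "non-salient"; this singleton-only mass assignment is an illustrative assumption, not necessarily the paper's exact formulation.

```python
import numpy as np

def dempster_fuse(m_motion, m_spatial, eps=1e-8):
    """Fuse two per-pixel saliency maps (values in [0, 1]) with
    Dempster's combination rule over {salient, non-salient}.

    Assumption (for illustration): each map value is the mass on
    'salient' and (1 - value) is the mass on 'non-salient'.
    """
    s1, s2 = np.asarray(m_motion, float), np.asarray(m_spatial, float)
    # Conflict mass K: one source votes salient while the other votes non-salient.
    K = s1 * (1.0 - s2) + (1.0 - s1) * s2
    # Combined belief in 'salient', renormalized by the non-conflicting mass (1 - K).
    return (s1 * s2) / (1.0 - K + eps)
```

When both sources agree that a pixel is salient (e.g. masses 0.8 and 0.7), the fused belief is reinforced above either input (about 0.90), while disagreement is absorbed into the conflict term K, which is the behavior that makes the rule attractive for combining independent motion and spatial evidence.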
Acknowledgements
This research was supported by SERB, Government of India, under Grant No. ECR/2016/000112. We express our sincere gratitude to the Associate Editor and the anonymous reviewers, whose insightful reviews and suggestions helped us improve the paper.
Cite this article
Sandula, P., Okade, M. A novel video saliency estimation method in the compressed domain. Pattern Anal Applic 25, 867–878 (2022). https://doi.org/10.1007/s10044-022-01081-4