Abstract
In computer vision, video saliency detection is the task of identifying and highlighting the regions of video frames most likely to attract human attention. Despite its importance, the task poses significant challenges because video content is dynamic, with varying spatial and temporal features. To address these challenges, this work designs Dynamic Attentive Integration for Spatial-Temporal Saliency (DAISTS), a novel methodology that significantly enhances the precision of video saliency detection. DAISTS introduces a dual-path spatial-temporal feature hierarchy that merges deep and shallow features to improve saliency-map accuracy. A spatial- and channel-attention-aided integration is also designed, which adaptively refines feature fusion based on scene context and training data. Moreover, DAISTS incorporates a frame-based attention model that selectively prioritizes frames for improved temporal saliency prediction. Training is guided by a comprehensive loss function that combines relative entropy (Kullback-Leibler divergence), the linear correlation coefficient, and normalized scanpath saliency, ensuring a balanced and effective optimization. Together, these components make DAISTS a robust, efficient, and adaptable solution for video saliency detection, positioned to make a significant impact in computer vision.
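The combined training objective described in the abstract can be made concrete with a short sketch. The following is an illustrative PyTorch implementation, not the authors' published code: the function names, the loss weights w_kl, w_cc, and w_nss, and the binary fixation-map input are assumptions introduced here for clarity.

# Illustrative sketch (not the authors' code): a composite saliency loss
# combining KL divergence, the linear correlation coefficient (CC), and
# normalized scanpath saliency (NSS), as described in the abstract.
# The weights w_kl, w_cc, and w_nss are hypothetical hyperparameters.
import torch

def kl_divergence(pred, gt, eps=1e-8):
    # Treat both saliency maps as probability distributions over pixels.
    p = pred / (pred.sum() + eps)
    q = gt / (gt.sum() + eps)
    return (q * torch.log(q / (p + eps) + eps)).sum()

def correlation_coefficient(pred, gt, eps=1e-8):
    # Pearson linear correlation between predicted and ground-truth maps.
    p = pred - pred.mean()
    g = gt - gt.mean()
    return (p * g).sum() / (torch.sqrt((p ** 2).sum() * (g ** 2).sum()) + eps)

def nss(pred, fixations, eps=1e-8):
    # Normalized scanpath saliency: mean z-scored response at fixation points.
    p = (pred - pred.mean()) / (pred.std() + eps)
    return p[fixations > 0].mean()

def saliency_loss(pred, gt_map, fixations, w_kl=1.0, w_cc=0.5, w_nss=0.5):
    # KL divergence is minimized; CC and NSS measure similarity,
    # so they are subtracted to be maximized during training.
    return (w_kl * kl_divergence(pred, gt_map)
            - w_cc * correlation_coefficient(pred, gt_map)
            - w_nss * nss(pred, fixations))

Subtracting the CC and NSS terms turns these similarity metrics into penalties, so a single optimizer step balances distribution-level matching (KL divergence) against fixation-level accuracy (NSS).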
Data Availability
The datasets generated and analyzed during this study are available from the corresponding author upon reasonable request.
Acknowledgements
The authors acknowledge REVA University, Bengaluru, Karnataka, India, for supporting this research work by providing the necessary facilities.
Funding
No funding was involved in this research work.
Author information
Contributions
This research was a collective effort, with all authors contributing collaboratively to its completion.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ravishankar, H., AnithaKumari, R., Sarvamangala, D. et al. Video Compression through Advanced Video Saliency Aware Spatial-Temporal Integration and Attention Mechanisms. SN COMPUT. SCI. 5, 926 (2024). https://doi.org/10.1007/s42979-024-03279-1