Video Compression through Advanced Video Saliency Aware Spatial-Temporal Integration and Attention Mechanisms

  • Original Research
SN Computer Science

Abstract

In computer vision, video saliency detection is the task of identifying the regions of video frames that are most likely to attract human attention. The task is challenging because video content is dynamic, with spatial and temporal features that vary across frames. To address these challenges, this work presents Dynamic Attentive Integration for Spatial-Temporal Saliency (DAISTS), a methodology designed to improve the precision of video saliency detection. DAISTS introduces a dual-path spatial-temporal feature hierarchy that merges deep and shallow learned features to improve saliency map accuracy. A spatial and channel attention aided integration step adaptively refines feature fusion based on scene context and training data. DAISTS also incorporates a frame-based attention model that selectively prioritizes frames for improved saliency prediction. Training is supported by a composite loss function that combines relative entropy (Kullback-Leibler divergence), the linear correlation coefficient, and normalized scanpath saliency, so that no single metric dominates. Together, these components position DAISTS as a robust, efficient, and adaptable approach to video saliency detection.
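The composite loss named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the weighting or the exact formulation, so everything below is an assumption: the function names, the weights `w_kl`, `w_cc`, and `w_nss`, the stabilizer `EPS`, and the sign convention (relative entropy is minimized, while the correlation coefficient and normalized scanpath saliency are maximized and therefore subtracted). It follows the common definitions of these three saliency metrics and is not the authors' implementation.

```python
import numpy as np

EPS = 1e-8  # assumed stabilizer against division by zero and log(0)


def kl_divergence(pred, gt):
    """Relative entropy KL(gt || pred) after normalizing both maps to sum to 1."""
    p = pred / (pred.sum() + EPS)
    q = gt / (gt.sum() + EPS)
    return float(np.sum(q * np.log(q / (p + EPS) + EPS)))


def correlation_coefficient(pred, gt):
    """Pearson linear correlation between predicted and ground-truth maps."""
    p = pred - pred.mean()
    q = gt - gt.mean()
    denom = np.sqrt((p ** 2).sum() * (q ** 2).sum()) + EPS
    return float((p * q).sum() / denom)


def nss(pred, fixations):
    """Normalized scanpath saliency: mean z-scored prediction at fixated pixels.

    `fixations` is a binary map and must contain at least one fixated pixel.
    """
    z = (pred - pred.mean()) / (pred.std() + EPS)
    return float(z[fixations > 0].mean())


def combined_loss(pred, gt_density, fixations, w_kl=1.0, w_cc=1.0, w_nss=1.0):
    """Hypothetical weighted sum: lower KL, higher CC, and higher NSS all reduce it."""
    return (w_kl * kl_divergence(pred, gt_density)
            - w_cc * correlation_coefficient(pred, gt_density)
            - w_nss * nss(pred, fixations))


# Usage on synthetic data: a prediction equal to the ground truth should
# score a lower loss than an unrelated random map.
rng = np.random.default_rng(0)
gt = rng.random((36, 64))                # stand-in ground-truth density map
fix = (gt > 0.95).astype(np.float64)     # stand-in binary fixation map
loss_matching = combined_loss(gt, gt, fix)
loss_unrelated = combined_loss(rng.random((36, 64)), gt, fix)
```

In a training setting the same three terms would be written with a differentiable framework, and the weights would be tuned so that the terms have comparable magnitudes.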

Data Availability

The datasets produced and analyzed in this study are available from the corresponding author upon reasonable request.

Acknowledgements

The authors acknowledge REVA University, Bengaluru, Karnataka, India, for supporting this research by providing facilities.

Funding

No funding was involved in this research.

Author information

Contributions

This research was a collective effort, and all authors contributed to it collaboratively.

Corresponding author

Correspondence to H. Ravishankar.

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ravishankar, H., AnithaKumari, R., Sarvamangala, D. et al. Video Compression through Advanced Video Saliency Aware Spatial-Temporal Integration and Attention Mechanisms. SN COMPUT. SCI. 5, 926 (2024). https://doi.org/10.1007/s42979-024-03279-1

  • DOI: https://doi.org/10.1007/s42979-024-03279-1
