research-article

Temporal Video Segmentation to Scenes Using High-Level Audiovisual Features

Published: 01 August 2011

Abstract

In this paper, a novel approach to the temporal decomposition of video into semantic units, termed scenes, is presented. In contrast to previous temporal segmentation approaches that employ mostly low-level visual or audiovisual features, we introduce a technique that jointly exploits low-level and high-level features automatically extracted from the visual and auditory channels. This technique is built upon the well-known method of the scene transition graph (STG), first by introducing a new STG approximation that features reduced computational cost, and then by extending the unimodal STG-based temporal segmentation technique to a method for multimodal scene segmentation. The latter exploits, among other inputs, the results of a large number of TRECVID-type trained visual concept detectors and audio event detectors, and is based on a probabilistic merging process that combines multiple individual STGs while at the same time diminishing the need for selecting and fine-tuning several STG construction parameters. The proposed approach is evaluated on three test datasets, comprising TRECVID documentary films, movies, and news-related videos, respectively. The experimental results demonstrate the improved performance of the proposed approach in comparison to other unimodal and multimodal techniques from the literature and highlight the contribution of high-level audiovisual features toward improved video segmentation to scenes.
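The probabilistic merging idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes each individual STG run has already produced a set of scene-boundary indices, estimates a per-boundary probability as the fraction of runs that agree, and combines modalities with a weighted sum and a threshold. The function names, the weights, and the threshold are all hypothetical.

```python
from typing import Dict, List, Sequence, Set


def boundary_probabilities(stg_runs: Sequence[Set[int]], num_shots: int) -> List[float]:
    """Fraction of STG runs that mark each shot boundary as a scene boundary.

    Boundary i lies between shot i and shot i + 1, so a video with
    num_shots shots has num_shots - 1 candidate boundaries.
    """
    if not stg_runs:
        raise ValueError("at least one STG run is required")
    counts = [0] * (num_shots - 1)
    for run in stg_runs:
        for boundary in run:
            counts[boundary] += 1
    return [count / len(stg_runs) for count in counts]


def merge_modalities(
    per_modality: Dict[str, List[float]],
    weights: Dict[str, float],
    threshold: float,
) -> List[int]:
    """Weighted combination of per-modality probabilities, then thresholding.

    The weights and the threshold are illustrative assumptions; they are
    normalized here so that the combined score stays in [0, 1].
    """
    total_weight = sum(weights[name] for name in per_modality)
    num_boundaries = len(next(iter(per_modality.values())))
    scenes = []
    for i in range(num_boundaries):
        score = sum(weights[name] * probs[i] for name, probs in per_modality.items())
        if score / total_weight > threshold:
            scenes.append(i)
    return scenes


# Toy input: 5 shots, several STG runs per modality (e.g. built with
# different construction parameters), each run listing its scene boundaries.
visual_runs = [{1, 3}, {1}, {1, 3}, {3}]
audio_runs = [{1}, {1, 2}]

probabilities = {
    "visual": boundary_probabilities(visual_runs, num_shots=5),
    "audio": boundary_probabilities(audio_runs, num_shots=5),
}
scene_boundaries = merge_modalities(
    probabilities, weights={"visual": 0.6, "audio": 0.4}, threshold=0.5
)
print(scene_boundaries)
```

Because each boundary score averages over many runs, no single choice of STG construction parameters decides the outcome, which is the property the abstract attributes to the merging step.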


      Published In

      IEEE Transactions on Circuits and Systems for Video Technology, Volume 21, Issue 8
      August 2011
      165 pages

      Publisher

      IEEE Press

      Qualifiers

      • Research-article

      Cited By

      • (2024) Involving Distinguished Temporal Graph Convolutional Networks for Skeleton-Based Temporal Action Segmentation. IEEE Transactions on Circuits and Systems for Video Technology 34:1 (647-660). DOI: 10.1109/TCSVT.2023.3285416. Online publication date: 1-Jan-2024
      • (2024) Semantic Transition Detection for Self-supervised Video Scene Segmentation. MultiMedia Modeling (14-27). DOI: 10.1007/978-3-031-53311-2_2. Online publication date: 29-Jan-2024
      • (2023) Towards global video scene segmentation with context-aware transformer. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence (3206-3213). DOI: 10.1609/aaai.v37i3.25426. Online publication date: 7-Feb-2023
      • (2023) Characters Link Shots: Character Attention Network for Movie Scene Segmentation. ACM Transactions on Multimedia Computing, Communications, and Applications 20:4 (1-23). DOI: 10.1145/3630257. Online publication date: 26-Oct-2023
      • (2023) A Coarse-to-Fine Framework for Automatic Video Unscreen. IEEE Transactions on Multimedia 25 (2723-2733). DOI: 10.1109/TMM.2022.3150177. Online publication date: 1-Jan-2023
      • (2023) Overview of the NLPCC 2023 Shared Task 10: Learn to Watch TV: Multimodal Dialogue Understanding and Response Generation. Natural Language Processing and Chinese Computing (412-419). DOI: 10.1007/978-3-031-44699-3_37. Online publication date: 12-Oct-2023
      • (2023) Multimodal Dialogue Understanding via Holistic Modeling and Sequence Labeling. Natural Language Processing and Chinese Computing (399-411). DOI: 10.1007/978-3-031-44699-3_36. Online publication date: 12-Oct-2023
      • (2022) Video localized caption generation framework for industrial videos. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology 43:4 (4107-4132). DOI: 10.3233/JIFS-212381. Online publication date: 1-Jan-2022
      • (2022) Interactive Design of Business English Learning Resources Based on EDIPT Multimodal Model. Computational Intelligence and Neuroscience 2022. DOI: 10.1155/2022/1264847. Online publication date: 1-Jan-2022
      • (2022) Automatic Scene Segmentation Algorithm for Image Color Restoration. Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering (746-751). DOI: 10.1145/3573428.3573777. Online publication date: 21-Oct-2022
