Chaisorn et al., 2003 - Google Patents
A Two-Level Multi-Modal Approach for Story Segmentation of Large News Video Corpus.Chaisorn et al., 2003
View PDF- Document ID
- 6887006697235574202
- Author
- Chaisorn L
- Chua T
- Koh C
- Zhao Y
- Xu H
- Feng H
- Tian Q
- Publication year
- Publication venue
- TRECVID
External Links
Snippet
This paper presents an enhanced work from our previous paper [Chaisorn et al. 2002]. The system is enhanced to perform news story segmentation on a large video corpus used in TRECVID 2003 evaluation. We use a combination of features include visual-based features …
- 230000011218 segmentation 0 title abstract description 35
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30837—Query results presentation or summarisation specifically adapted for the retrieval of video data
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
- G11B27/30—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Huang et al. | Automated generation of news content hierarchy by integrating audio, video, and text information | |
Qi et al. | Integrating visual, audio and text analysis for news video | |
Snoek et al. | Multimodal video indexing: A review of the state-of-the-art | |
Purver | Topic segmentation | |
Kang | Affective content detection using HMMs | |
Bertini et al. | Content-based indexing and retrieval of TV news | |
US20040143434A1 (en) | Audio-Assisted segmentation and browsing of news videos | |
Li et al. | Content-based movie analysis and indexing based on audiovisual cues | |
Chaisorn et al. | A Two-Level Multi-Modal Approach for Story Segmentation of Large News Video Corpus. | |
Wang et al. | Speech segmentation without speech recognition | |
Chaisorn et al. | A multi-modal approach to story segmentation for news video | |
US20120281969A1 (en) | Video summarization using audio and visual cues | |
WO1999036863A2 (en) | System and method for selective retrieval of a video sequence | |
Chaisorn et al. | The segmentation of news video into story units | |
Jiang et al. | Video segmentation with the support of audio segmentation and classification | |
US7349477B2 (en) | Audio-assisted video segmentation and summarization | |
Jiang et al. | Video segmentation with the assistance of audio content analysis | |
Chaisorn et al. | The segmentation and classification of story boundaries in news video | |
Amaral et al. | A prototype system for selective dissemination of broadcast news in European Portuguese | |
Chaisorn et al. | Two-level multi-modal framework for news story segmentation of large video corpus | |
Nitta et al. | Generating semantic descriptions of broadcasted sports videos based on structures of sports games and TV programs | |
Li et al. | Movie content analysis, indexing and skimming via multimodal information | |
Chaisorn et al. | A hierarchical approach to story segmentation of large broadcast news video corpus | |
Bigot et al. | Speaker role recognition to help spontaneous conversational speech detection | |
Chaisorn et al. | Story boundary detection in news video using global rule induction technique |