Chaisorn et al., 2003 - Google Patents

A Two-Level Multi-Modal Approach for Story Segmentation of Large News Video Corpus.

Document ID: 6887006697235574202
Author: Chaisorn L; Chua T; Koh C; Zhao Y; Xu H; Feng H; Tian Q
Publication year: 2003
Publication venue: TRECVID

Snippet

This paper extends our previous work [Chaisorn et al. 2002]. The system is enhanced to perform news story segmentation on the large video corpus used in the TRECVID 2003 evaluation. We use a combination of features, including visual-based features …

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
            - G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
              - G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
              - G06F17/30796—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using original textual content or text extracted from visual content or transcript of audio data
            - G06F17/30837—Query results presentation or summarisation specifically adapted for the retrieval of video data
          - G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
        - G06F17/20—Handling natural language data
          - G06F17/27—Automatic analysis, e.g. parsing
            - G06F17/2765—Recognition
    - G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
      - G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
        - G06K9/62—Methods or arrangements for recognition using electronic means
          - G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
            - G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
              - G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
  - G11—INFORMATION STORAGE
    - G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
      - G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
        - G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
          - G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
            - G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
              - G11B27/30—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording
  - G10—MUSICAL INSTRUMENTS; ACOUSTICS
    - G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
      - G10L15/00—Speech recognition
        - G10L15/08—Speech classification or search
      - G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
        - G10L25/78—Detection of presence or absence of voice signals
      - G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
Huang et al.	1999	Automated generation of news content hierarchy by integrating audio, video, and text information
Qi et al.	2000	Integrating visual, audio and text analysis for news video
Snoek et al.	2005	Multimodal video indexing: A review of the state-of-the-art
Purver	2011	Topic segmentation
Kang	2003	Affective content detection using HMMs
Bertini et al.	2001	Content-based indexing and retrieval of TV news
US20040143434A1 (en)	2004-07-22	Audio-Assisted segmentation and browsing of news videos
Li et al.	2004	Content-based movie analysis and indexing based on audiovisual cues
Chaisorn et al.	2003	A Two-Level Multi-Modal Approach for Story Segmentation of Large News Video Corpus.
Wang et al.	2003	Speech segmentation without speech recognition
Chaisorn et al.	2003	A multi-modal approach to story segmentation for news video
US20120281969A1 (en)	2012-11-08	Video summarization using audio and visual cues
WO1999036863A2 (en)	1999-07-22	System and method for selective retrieval of a video sequence
Chaisorn et al.	2002	The segmentation of news video into story units
Jiang et al.	2000	Video segmentation with the support of audio segmentation and classification
US7349477B2 (en)	2008-03-25	Audio-assisted video segmentation and summarization
Jiang et al.	2000	Video segmentation with the assistance of audio content analysis
Chaisorn et al.	2002	The segmentation and classification of story boundaries in news video
Amaral et al.	2007	A prototype system for selective dissemination of broadcast news in European Portuguese
Chaisorn et al.	2003	Two-level multi-modal framework for news story segmentation of large video corpus
Nitta et al.	2005	Generating semantic descriptions of broadcasted sports videos based on structures of sports games and TV programs
Li et al.	2003	Movie content analysis, indexing and skimming via multimodal information
Chaisorn et al.	2004	A hierarchical approach to story segmentation of large broadcast news video corpus
Bigot et al.	2010	Speaker role recognition to help spontaneous conversational speech detection
Chaisorn et al.	2006	Story boundary detection in news video using global rule induction technique