DOI: 10.1145/1459359.1459393
research-article

Exploring knowledge of sub-domain in a multi-resolution bootstrapping framework for concept detection in news video

Published: 26 October 2008

Abstract

In this paper, we present a model based on a multi-resolution, multi-source and multi-modal (M3) bootstrapping framework that exploits knowledge of sub-domains for concept detection in news video. Because the characteristics and distributions of data differ across sub-domains, we model and analyze the video in each sub-domain separately using a transductive framework. Along with this framework, we propose a "pseudo-Vapnik combined error bound" to tackle the imbalanced distribution of training data in certain segments of sub-domains. For effective fusion of multi-modal features, we use multi-resolution inference and constraints so that evidence from different modalities can support each other. Finally, we employ a bootstrapping technique that leverages unlabeled data to boost overall system performance. We test our framework by detecting semantic concepts in the TRECVID 2004 dataset. Experimental results demonstrate that our approach is effective.
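The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of bootstrapping from unlabeled data, run separately for each sub-domain. It is not the paper's M3 framework and does not implement the pseudo-Vapnik combined error bound or multi-resolution inference. The nearest-centroid classifier, the margin-based confidence score, the threshold value, and all function names are assumptions chosen for illustration.

```python
from statistics import mean


def centroid_fit(points, labels):
    """Return one centroid per class for 1-D features (illustrative classifier)."""
    classes = sorted(set(labels))
    return {c: mean(p for p, l in zip(points, labels) if l == c) for c in classes}


def predict_with_confidence(centroids, x):
    """Nearest-centroid label plus a margin-based confidence in [0, 1].

    Assumes at least two classes are present.
    """
    dists = sorted((abs(x - mu), c) for c, mu in centroids.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    conf = 1.0 if d1 + d2 == 0 else (d2 - d1) / (d1 + d2)
    return label, conf


def bootstrap(labeled, unlabeled, threshold=0.6, max_rounds=5):
    """Generic self-training loop (hypothetical, not the paper's method).

    Each round refits the classifier, then moves confidently predicted
    unlabeled samples into the training set with their predicted labels.
    Returns the final centroids and the samples that stayed unlabeled.
    """
    points = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = centroid_fit(points, labels)
        confident, rest = [], []
        for x in pool:
            label, conf = predict_with_confidence(centroids, x)
            (confident if conf >= threshold else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from; stop early
        for x, label in confident:
            points.append(x)
            labels.append(label)
        pool = [x for x, _ in rest]
    return centroid_fit(points, labels), pool


def bootstrap_per_subdomain(data):
    """Run the loop independently per sub-domain, since distributions differ."""
    return {name: bootstrap(d["labeled"], d["unlabeled"]) for name, d in data.items()}


if __name__ == "__main__":
    # Toy, made-up 1-D features for two hypothetical sub-domains.
    data = {
        "sports": {"labeled": [(0.0, 0), (10.0, 1)], "unlabeled": [1.0, 9.0, 5.0]},
        "weather": {"labeled": [(2.0, 0), (4.0, 1)], "unlabeled": [2.1, 3.9]},
    }
    for name, (centroids, leftover) in bootstrap_per_subdomain(data).items():
        print(name, centroids, leftover)
```

Fitting each sub-domain independently mirrors the abstract's point that data distributions differ between sub-domains. The confidence threshold limits how many wrongly labeled samples enter the training set, at the cost of leaving ambiguous samples unused.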


Published In

MM '08: Proceedings of the 16th ACM international conference on Multimedia
October 2008
1206 pages
ISBN:9781605583037
DOI:10.1145/1459359

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bootstrapping
  2. domain knowledge
  3. multi-resolution analysis
  4. text semantics
  5. transductive learning
  6. unlabeled data

Qualifiers

  • Research-article

Conference

MM08: ACM Multimedia Conference 2008
October 26 - 31, 2008
Vancouver, British Columbia, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Cited By

  • (2018) Attention-in-Attention Networks for Surveillance Video Understanding in Internet of Things. IEEE Internet of Things Journal 5:5 (3419-3429). DOI: 10.1109/JIOT.2017.2779865. Online publication date: Oct-2018
  • (2014) Knowledge Adaptation with Partially Shared Features for Event Detection Using Few Exemplars. IEEE Transactions on Pattern Analysis and Machine Intelligence 36:9 (1789-1802). DOI: 10.1109/TPAMI.2014.2306419. Online publication date: Sep-2014
  • (2014) E-LAMP. Machine Vision and Applications 25:1 (5-15). DOI: 10.1007/s00138-013-0529-6. Online publication date: 1-Jan-2014
  • (2013) We are not equally negative. Proceedings of the 21st ACM international conference on Multimedia (293-302). DOI: 10.1145/2502081.2502119. Online publication date: 21-Oct-2013
  • (2012) Knowledge adaptation for ad hoc multimedia event detection with few exemplars. Proceedings of the 20th ACM international conference on Multimedia (469-478). DOI: 10.1145/2393347.2393414. Online publication date: 29-Oct-2012
  • (2012) Robust Video Content Analysis via Transductive Learning. ACM Transactions on Intelligent Systems and Technology 3:3 (1-26). DOI: 10.1145/2168752.2168755. Online publication date: 1-May-2012
  • (2010) Automatic generation of semantic fields for annotating web images. Proceedings of the 23rd International Conference on Computational Linguistics: Posters (1301-1309). DOI: 10.5555/1944566.1944715. Online publication date: 23-Aug-2010
  • (2010) On the sampling of web images for learning visual concept classifiers. Proceedings of the ACM International Conference on Image and Video Retrieval (50-57). DOI: 10.1145/1816041.1816051. Online publication date: 5-Jul-2010
