Article

Adapting appearance models of semantic concepts to particular videos via transductive learning

Authors:

Bernd Freisleben

MIR '07: Proceedings of the international workshop on Workshop on multimedia information retrieval

Pages 187 - 196

https://doi.org/10.1145/1290082.1290110

Published: 24 September 2007

Abstract

The detection of high-level concepts in video data is an essential processing step in a video retrieval system. The meaning and appearance of certain events or concepts are strongly related to contextual information. For example, the appearance of semantic concepts such as entertainment or news anchors is determined by the editing layout used, which is usually typical of a particular broadcasting station. In recent years, supervised machine learning approaches have been used extensively to learn and detect high-level concepts in video shots. Semi-supervised learning methods additionally incorporate unlabeled data in the learning process. Transductive learning is a subclass of semi-supervised learning: in the transductive setting, all training samples are labeled, but the unlabeled test samples are considered in the learning process as well. Up to now, transductive learning has not been applied to video indexing and retrieval. In this paper, we propose transductive learning, realized by transductive support vector machines (TSVM), for the detection of those high-level concepts whose appearance is strongly related to a particular video. For each video and each concept, a transductive model is learned separately and adapted to the appearance of that concept in the particular test video. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed transductive learning approach for several high-level concepts.
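The transductive setting described in the abstract (labeled training samples plus the unlabeled shots of one test video, with one model per video and concept) can be illustrated with a small sketch. The code below is not the TSVM optimization used in the paper; it assumes a much simpler self-labeling loop around a linear hinge-loss classifier, written in plain Python. All function names, parameters (such as `test_weight`), and the toy feature values are hypothetical and only show the control flow.

```python
# Illustrative self-labeling loop; NOT the TSVM optimizer from the paper.
# All names and the toy data below are hypothetical.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(samples, dim, epochs=100, lr=0.1, lam=0.001):
    """Fit (w, b) by subgradient descent on a weighted hinge loss.

    samples: list of (features, label, weight) with label in {-1, +1}.
    """
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y, weight in samples:
            margin = y * (dot(w, x) + b)
            # L2 regularization: shrink w a little on every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: move the boundary away from x.
                w = [wi + lr * weight * y * xi for wi, xi in zip(w, x)]
                b += lr * weight * y
    return w, b

def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0.0 else -1

def adapt_to_video(train, video_shots, rounds=3, test_weight=0.5):
    """Learn one model for one concept and one test video.

    train: labeled (features, label) pairs from the training videos.
    video_shots: unlabeled feature vectors of the shots in the test video.
    Returns the adapted model and the labels assigned to the test shots.
    """
    dim = len(train[0][0])
    labeled = [(x, y, 1.0) for x, y in train]
    model = train_linear_svm(labeled, dim)  # purely supervised start
    labels = [predict(model, x) for x in video_shots]
    for _ in range(rounds):
        # Treat current predictions on the test shots as tentative labels,
        # down-weighted relative to the true training labels.
        pseudo = [(x, y, test_weight) for x, y in zip(video_shots, labels)]
        model = train_linear_svm(labeled + pseudo, dim)
        new_labels = [predict(model, x) for x in video_shots]
        if new_labels == labels:
            break  # labeling of the test video is stable
        labels = new_labels
    return model, labels

# Toy 2-D features for a single concept (made-up numbers).
TRAIN = [([1.0, 1.0], 1), ([2.0, 1.0], 1), ([1.0, 2.0], 1),
         ([-1.0, -1.0], -1), ([-2.0, -1.0], -1), ([-1.0, -2.0], -1)]
VIDEOS = {
    "video_a": [[3.0, 3.0], [2.5, 2.0], [-3.0, -3.0], [-2.0, -2.5]],
}

# One separately adapted model per video (and, in general, per concept).
RESULTS = {name: adapt_to_video(TRAIN, shots) for name, shots in VIDEOS.items()}
```

In this sketch, `RESULTS["video_a"]` holds the adapted model and the labels assigned to the four toy shots. The down-weighting of tentative test labels loosely mirrors the separate cost parameter that a TSVM places on unlabeled samples, but a real TSVM optimizes the test labels jointly with the margin rather than by repeated retraining.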

Index Terms

Adapting appearance models of semantic concepts to particular videos via transductive learning
1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published In

MIR '07: Proceedings of the international workshop on Workshop on multimedia information retrieval

September 2007

343 pages

ISBN:9781595937780

DOI:10.1145/1290082

General Chairs: James Z. Wang (The Pennsylvania State University, USA), Nozha Boujemaa (INRIA Rocquencourt, France)

Program Chairs: Alberto Del Bimbo (University of Florence, Italy), Jia Li (The Pennsylvania State University, USA)

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM07: The 15th ACM International Conference on Multimedia 2007

September 24 - 29, 2007

Augsburg, Bavaria, Germany
