DOI: 10.1145/1290082.1290110

Adapting appearance models of semantic concepts to particular videos via transductive learning

Published: 24 September 2007

Abstract

The detection of high-level concepts in video data is an essential processing step in a video retrieval system. The meaning and the appearance of certain events or concepts are strongly related to contextual information. For example, the appearance of semantic concepts such as entertainment or news anchors is determined by the editing layout used, which is usually typical of a particular broadcasting station. In recent years, supervised machine learning approaches have been used extensively to learn and detect high-level concepts in video shots. Semi-supervised learning methods additionally incorporate unlabeled data in the learning process. Transductive learning is a subclass of semi-supervised learning: in the transductive setting, all training samples are labeled, but the unlabeled test samples are considered in the learning process as well. Up to now, transductive learning has not been applied to video indexing and retrieval. In this paper, we propose transductive learning, realized by transductive support vector machines (TSVM), for the detection of those high-level concepts whose appearance is strongly related to a particular video. For each video and each concept, a transductive model is learned separately and adapted to the appearance of that concept in the particular test video. Experimental results on TRECVID 2005 video data demonstrate the feasibility of the proposed transductive learning approach for several high-level concepts.
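The transductive setting described in the abstract can be illustrated with a small sketch. It is not the TSVM used in the paper: as a stand-in, it uses a nearest-centroid classifier that is re-estimated with the unlabeled shots of one test video (a simple self-training scheme). The function names, the one-dimensional features, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def transductive_labels(train_x: List[Vector], train_y: List[int],
                        test_x: List[Vector], max_iter: int = 20) -> List[int]:
    """Label the shots of ONE test video with labels in {+1, -1}.

    Assumption: the training set contains both classes. The class
    centroids start from the labeled training shots and are then
    re-estimated with the pseudo-labeled test shots until the test
    labels stop changing. With max_iter=1 the result equals the purely
    inductive prediction of the training centroids.
    """
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == -1]
    c_pos, c_neg = centroid(pos), centroid(neg)
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [1 if sq_dist(x, c_pos) <= sq_dist(x, c_neg) else -1
                      for x in test_x]
        if new_labels == labels:
            break  # converged: test labels are stable
        labels = new_labels
        # Adapt the model to this video: test shots join the centroids.
        c_pos = centroid(pos + [x for x, l in zip(test_x, labels) if l == 1])
        c_neg = centroid(neg + [x for x, l in zip(test_x, labels) if l == -1])
    return labels


# Hypothetical 1-D appearance feature; the test video is shifted
# relative to the training data.
train_x = [[2.0], [2.0], [0.0], [0.0]]
train_y = [1, 1, -1, -1]
test_x = [[0.5], [0.6], [0.7], [1.1], [3.0], [3.1], [3.2], [3.3]]

print(transductive_labels(train_x, train_y, test_x, max_iter=1))
# [-1, -1, -1, 1, 1, 1, 1, 1]   (inductive: the shot at 1.1 is labeled +1)
print(transductive_labels(train_x, train_y, test_x))
# [-1, -1, -1, -1, 1, 1, 1, 1]  (adapted: the shot at 1.1 joins the low cluster)
```

In this toy example the shot at 1.1 lies just beyond the boundary learned from the training data alone, but after the centroids are adapted to the test video it is grouped with the nearby low-valued shots. Running one such adaptation per video and per concept mirrors the per-video, per-concept modeling described above, although a real system would use a TSVM and richer features.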

      Published In

      MIR '07: Proceedings of the International Workshop on Multimedia Information Retrieval
      September 2007
      343 pages
      ISBN:9781595937780
      DOI:10.1145/1290082

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. concept detection
      2. context
      3. high-level features
      4. semantic video retrieval
      5. semi-supervised learning
      6. transductive SVM
      7. transductive learning

      Qualifiers

      • Article

      Conference

      MM07: The 15th ACM International Conference on Multimedia 2007
      September 24-29, 2007
      Augsburg, Bavaria, Germany
