
Large Scale Concept Detection in Video Using a Region Thesaurus

  • Conference paper
Advances in Multimedia Modeling (MMM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5371)

Abstract

This paper presents an approach to high-level feature detection in video documents using a region thesaurus. Each video shot is represented by a single keyframe, and MPEG-7 features are extracted locally from coarsely segmented regions. A clustering algorithm is then applied to the extracted regions, and a region thesaurus is constructed to describe each keyframe at a level higher than the low-level descriptors but lower than the high-level concepts. A model vector representation is formed, and several high-level concept detectors are trained using a global keyframe annotation. The proposed approach is evaluated on the TRECVID 2007 development data for the detection of nine high-level concepts, demonstrating sufficient performance on large data sets.
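The pipeline outlined in the abstract (cluster region descriptors into a thesaurus, then describe each keyframe by a model vector) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the clustering algorithm or define how the model vector is computed, so plain k-means and a minimum-distance rule are used here as stand-ins, the 2-D toy descriptors replace real MPEG-7 features, and all function names are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering).

    Returns k centroids, which play the role of 'region types' here.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def model_vector(regions, thesaurus):
    """Assumed rule: one entry per region type, holding the distance
    from that region type to the closest region of the keyframe."""
    return [min(dist(r, t) for r in regions) for t in thesaurus]


# Toy data: each keyframe is a list of region descriptors.
keyframes = [
    [[0.0, 0.1], [5.0, 5.1]],  # contains both kinds of region
    [[0.2, 0.0], [5.1, 4.9]],  # contains both kinds of region
    [[0.1, 0.2], [0.0, 0.0]],  # contains only one kind of region
]
all_regions = [r for kf in keyframes for r in kf]

thesaurus = kmeans(all_regions, k=2)
vectors = [model_vector(kf, thesaurus) for kf in keyframes]
```

In a setup like this, the resulting fixed-length vectors, paired with the global keyframe annotations, would be the input to one classifier per concept. The choice of classifier is left open here because the abstract does not specify it.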

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Spyrou, E., Tolias, G., Avrithis, Y. (2009). Large Scale Concept Detection in Video Using a Region Thesaurus. In: Huet, B., Smeaton, A., Mayer-Patel, K., Avrithis, Y. (eds) Advances in Multimedia Modeling. MMM 2009. Lecture Notes in Computer Science, vol 5371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92892-8_20

  • DOI: https://doi.org/10.1007/978-3-540-92892-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-92891-1

  • Online ISBN: 978-3-540-92892-8

  • eBook Packages: Computer Science, Computer Science (R0)
