Large Scale Concept Detection in Video Using a Region Thesaurus

Evaggelos Spyrou, Giorgos Tolias & Yannis Avrithis

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5371)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This paper presents an approach to high-level feature detection in video documents using a region thesaurus. Each video shot is represented by a single keyframe, and MPEG-7 features are extracted locally from coarsely segmented regions. A clustering algorithm is then applied to the extracted regions, and a region thesaurus is constructed to describe each keyframe at a level higher than the low-level descriptors but lower than the high-level concepts. A model vector representation is formed, and several high-level concept detectors are trained using a global keyframe annotation. The proposed approach is thoroughly evaluated on the TRECVID 2007 development data for the detection of nine high-level concepts, demonstrating sufficient performance on large data sets.
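The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: plain k-means stands in for the unspecified clustering algorithm, short 2-D lists stand in for MPEG-7 region descriptors, and the model vector is taken to be the distance from each region type to the closest region of the keyframe. All function names and data values are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_thesaurus(regions, k, iterations=20):
    """Cluster region descriptors with plain k-means (an assumed stand-in
    for the clustering step); the centroids act as the region types."""
    # Deterministic initialisation: k regions spread evenly over the input.
    step = max(1, len(regions) // k)
    centroids = [list(r) for r in regions[::step][:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for r in regions:
            nearest = min(range(k), key=lambda i: distance(r, centroids[i]))
            clusters[nearest].append(r)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def model_vector(keyframe_regions, thesaurus):
    """One entry per region type: the distance from that type to the closest
    region of the keyframe (a small value suggests the type is present)."""
    return [min(distance(r, t) for r in keyframe_regions) for t in thesaurus]


# Toy data: each keyframe is a list of 2-D region descriptors (made-up values).
keyframes = [
    [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]],  # contains both kinds of region
    [[5.1, 5.0], [4.9, 5.0]],              # only the second kind
    [[0.1, 0.1], [0.0, 0.0]],              # only the first kind
]
all_regions = [r for kf in keyframes for r in kf]
thesaurus = build_thesaurus(all_regions, k=2)
vectors = [model_vector(kf, thesaurus) for kf in keyframes]
```

Each entry of `vectors` is a fixed-length description of one keyframe, independent of how many regions the segmentation produced. In a setting like the one the abstract describes, such vectors, paired with global keyframe annotations, would serve as the training input for one detector per concept; the choice of classifier is left open here.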

Author information

Authors and Affiliations

Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou Str., 157 80, Athens, Greece
Evaggelos Spyrou, Giorgos Tolias & Yannis Avrithis

Editor information

Editors and Affiliations

Eurécom, 2229, route des crêtes, 06904, Sophia-Antipolis, France
Benoit Huet
Dublin City University, Dublin, Ireland
Alan Smeaton
Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA
Ketan Mayer-Patel
Image, Video and Multimedia Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou Str., 157 80, Athens, Greece
Yannis Avrithis

About this paper

Cite this paper

Spyrou, E., Tolias, G., Avrithis, Y. (2009). Large Scale Concept Detection in Video Using a Region Thesaurus. In: Huet, B., Smeaton, A., Mayer-Patel, K., Avrithis, Y. (eds) Advances in Multimedia Modeling. MMM 2009. Lecture Notes in Computer Science, vol 5371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92892-8_20

DOI: https://doi.org/10.1007/978-3-540-92892-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92891-1
Online ISBN: 978-3-540-92892-8
eBook Packages: Computer Science, Computer Science (R0)
