
Evaluation of active learning strategies for video indexing

Published: 01 August 2007

Abstract

In this paper, we compare active learning strategies for indexing concepts in video shots. Active learning is simulated using subsets of a fully annotated data set instead of actually calling for user intervention. Training is done using the collaborative annotation of 39 concepts of the TRECVID 2005 campaign. Performance is measured on the 20 concepts selected for the TRECVID 2006 concept detection task. The simulation allows exploring the effect of several parameters: the strategy, the annotated fraction of the data set, the size of the data set, the number of iterations and the relative difficulty of concepts. Three strategies were compared. The first two, respectively, select the most probable and the most uncertain samples. The third one is a random choice. For easy concepts, the "most probable" strategy is the best one when less than 15% of the data set is annotated and the "most uncertain" strategy is the best one when 15% or more of the data set is annotated. The "most probable" and "most uncertain" strategies are roughly equivalent for moderately difficult and difficult concepts. In all cases, the maximum performance is reached when 12-15% of the whole data set is annotated. This result is, however, dependent upon the step size and the training set size. One-fortieth of the training set size is a good value for the step size. The size of the subset of the training set that has to be annotated in order to reach the maximum achievable performance varies with the square root of the training set size. The "most probable" strategy is more "recall oriented" and the "most uncertain" strategy is more "precision oriented".
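
The three selection strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_batch` and `step_size` are hypothetical, the classifier scores are assumed to lie in [0, 1], and "most uncertain" is assumed here to mean "closest to 0.5", which the abstract does not specify.

```python
import random


def select_batch(scores, labeled, step, strategy, rng=None):
    """Return the indices of the next `step` unlabeled samples to annotate.

    scores   : list of classifier scores in [0, 1], one per sample
               (assumed to approximate the probability of the concept)
    labeled  : set of indices that are already annotated
    strategy : "most_probable", "most_uncertain" or "random"
    rng      : optional random.Random instance for reproducible shuffles
    """
    pool = [i for i in range(len(scores)) if i not in labeled]
    if strategy == "most_probable":
        # Highest predicted score first.
        pool.sort(key=lambda i: -scores[i])
    elif strategy == "most_uncertain":
        # Assumption: uncertainty is the distance to the 0.5 boundary.
        pool.sort(key=lambda i: abs(scores[i] - 0.5))
    elif strategy == "random":
        (rng or random).shuffle(pool)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return pool[:step]


def step_size(training_set_size, divisor=40):
    """Step size as one-fortieth of the training set, as the abstract suggests."""
    return max(1, training_set_size // divisor)


if __name__ == "__main__":
    scores = [0.95, 0.10, 0.52, 0.45, 0.80, 0.30]
    labeled = {0}  # sample 0 is already annotated
    print(select_batch(scores, labeled, 2, "most_probable"))   # [4, 2]
    print(select_batch(scores, labeled, 2, "most_uncertain"))  # [2, 3]
    print(step_size(4000))                                     # 100
```

In a simulated run of the kind the abstract describes, the labels for the selected indices would be read from the fully annotated data set rather than requested from a user, the classifier would be retrained, and the selection would be repeated at each iteration.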




Information

Published In

Image Communication, Volume 22, Issue 7-8
August, 2007
114 pages

Publisher

Elsevier Science Inc.

United States


Author Tags

  1. Active learning
  2. Evaluation
  3. Strategies
  4. Video indexing

Qualifiers

  • Article


Cited By

  • (2021) OCR-aided person annotation and label propagation for speaker modeling in TV shows. 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5570-5574. DOI: 10.1109/ICASSP.2016.7472743. Online publication date: 11-Mar-2021
  • (2013) Literature survey of active learning in multimedia annotation and retrieval. Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service, pp. 237-242. DOI: 10.1145/2499788.2499794. Online publication date: 17-Aug-2013
  • (2013) Tag completion based on belief theory and neighbor voting. Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, pp. 49-56. DOI: 10.1145/2461466.2461476. Online publication date: 16-Apr-2013
  • (2013) Narrative theme navigation for sitcoms supported by fan-generated scripts. Multimedia Tools and Applications 63:2, pp. 387-406. DOI: 10.1007/s11042-011-0877-z. Online publication date: 1-Mar-2013
  • (2012) Exploring label dependency in active learning for phenotype mapping. Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, pp. 146-154. DOI: 10.5555/2391123.2391143. Online publication date: 8-Jun-2012
  • (2012) Active learning with semi-automatic annotation for extractive speech summarization. ACM Transactions on Speech and Language Processing 8:4, pp. 1-25. DOI: 10.1145/2093153.2093155. Online publication date: 20-Feb-2012
  • (2012) Simulating the future of concept-based video retrieval under improved detector performance. Multimedia Tools and Applications 60:1, pp. 203-231. DOI: 10.1007/s11042-011-0818-x. Online publication date: 1-Sep-2012
  • (2012) Active learning with multiple classifiers for multimedia indexing. Multimedia Tools and Applications 60:2, pp. 403-417. DOI: 10.1007/s11042-010-0599-7. Online publication date: 1-Sep-2012
  • (2012) Annotating images with suggestions. Proceedings of the 14th international conference on Advanced Concepts for Intelligent Vision Systems, pp. 155-166. DOI: 10.1007/978-3-642-33140-4_14. Online publication date: 4-Sep-2012
  • (2012) Active cleaning for video corpus annotation. Proceedings of the 18th international conference on Advances in Multimedia Modeling, pp. 518-528. DOI: 10.1007/978-3-642-27355-1_48. Online publication date: 4-Jan-2012
