research-article

Open access

Visual Descriptors in Methods for Video Hyperlinking

Authors:

Petra Galuščáková,

Pavel Pecina

ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval

Pages 294 - 300

https://doi.org/10.1145/3078971.3079026

Published: 06 June 2017

Abstract

In this paper, we survey state-of-the-art visual processing methods and apply them to video hyperlinking. Visual information, computed using Feature Signatures, SIMILE descriptors, and convolutional neural networks (CNNs), is used to measure similarity between video frames and to find similar faces, objects, and settings. Visual concepts in frames are also recognized automatically, and the textual output of the recognition is combined with search based on subtitles and transcripts. All presented experiments were performed on the Search and Hyperlinking 2014 MediaEval task and the Video Hyperlinking 2015 TRECVid task.
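The combination of visual similarity with text-based search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: cosine similarity, the maximum over frame pairs, the linear interpolation, the weight of 0.2, and all function names are assumptions chosen for illustration, and text scores are assumed to be already normalized to the range 0 to 1.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def visual_similarity(frames_a, frames_b):
    """Highest pairwise frame similarity between two segments.

    Taking the maximum is an assumed aggregation, not one stated in the abstract.
    """
    return max((cosine(u, v) for u in frames_a for v in frames_b), default=0.0)


def fused_score(text_score, visual_score, weight=0.2):
    """Linear interpolation of text and visual scores (hypothetical weight)."""
    return (1.0 - weight) * text_score + weight * visual_score


def rank_targets(anchor_frames, candidates, weight=0.2):
    """Rank candidate segments for one anchor segment.

    candidates: list of (segment_id, text_score, frame_descriptors).
    Returns (segment_id, fused_score) pairs, best first.
    """
    scored = [
        (seg_id, fused_score(text, visual_similarity(anchor_frames, frames), weight))
        for seg_id, text, frames in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With equal text scores, a candidate whose frames resemble the anchor's frames ranks first; setting `weight=0.0` reduces the ranking to text-only retrieval.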

Cited By

Petscharnig S, Schöffmann K (2018). Binary convolutional neural network features off-the-shelf for image to video linking in endoscopic multimedia databases. Multimedia Tools and Applications 77:21 (28817-28842). DOI: 10.5555/3287850.3287896. Online publication date: 1-Nov-2018.
https://dl.acm.org/doi/10.5555/3287850.3287896
Budnik M, Demirdelen M, Gravier G (2018). A Study on Multimodal Video Hyperlinking with Visual Aggregation. 2018 IEEE International Conference on Multimedia and Expo (ICME) (1-6). DOI: 10.1109/ICME.2018.8486549. Online publication date: Jul-2018.
https://doi.org/10.1109/ICME.2018.8486549
Petscharnig S, Schöffmann K (2018). Binary convolutional neural network features off-the-shelf for image to video linking in endoscopic multimedia databases. Multimedia Tools and Applications 77:21 (28817-28842). DOI: 10.1007/s11042-018-6016-3. Online publication date: 8-May-2018.
https://doi.org/10.1007/s11042-018-6016-3

Published In

ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval

June 2017

524 pages

ISBN:9781450347013

DOI:10.1145/3078971

General Chairs: Bogdan Ionescu (University Politehnica of Bucharest, Romania), Nicu Sebe (University of Trento, Italy)

Program Chairs: Jiashi Feng (National University of Singapore, Singapore), Martha Larson (Radboud University & Delft University of Technology, The Netherlands), Rainer Lienhart (University of Augsburg, Germany), Cees Snoek (University of Amsterdam & Qualcomm Research Netherlands, The Netherlands)

Copyright © 2017 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Czech Science Foundation

Conference

ICMR '17

Sponsor:

SIGMM

ICMR '17: International Conference on Multimedia Retrieval

June 6 - 9, 2017

Bucharest, Romania

Acceptance Rates

ICMR '17 Paper Acceptance Rate 33 of 95 submissions, 35%;

Overall Acceptance Rate 254 of 830 submissions, 31%
