short-paper

Image Tagging via Cross-Modal Semantic Mapping

Authors:

Yunlun Yang

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

Pages 1143 - 1146

https://doi.org/10.1145/2733373.2806302

Published: 13 October 2015

Abstract

Images without annotations are ubiquitous on the Internet, and recommending tags for them remains a challenging open task in image understanding. A common bottleneck in related work is the semantic gap between image and text representations. In this paper, we bridge the gap by introducing a semantic layer: a space of word embeddings in which image tags are represented as word vectors. Our model first learns the optimal mapping from the visual space to the semantic space using training sources. We then annotate test images by decoding the semantic representations of their visual features. Extensive experiments demonstrate that our model outperforms state-of-the-art approaches in predicting image tags.
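
The abstract does not state the form of the mapping or of the decoder, so the following is only a rough illustration of the two-stage idea (map visual features into a word-embedding space, then decode to tags), not the authors' model. It assumes a linear mapping fit by regularized least squares, training targets equal to the mean embedding of each image's tags, and decoding by cosine similarity. The function names, the toy 2-D "visual" and "word" vectors, and the hyperparameters are all hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def mean_vector(vectors):
    # Component-wise mean of a list of equal-length vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def fit_linear_map(X, S, lr=0.5, epochs=500, l2=1e-3):
    """Fit W (d_sem x d_vis) by batch gradient descent on
    mean squared error between W x and s, plus an L2 penalty on W.
    Assumption: a linear map; the paper's mapping may differ."""
    d_vis, d_sem, n = len(X[0]), len(S[0]), len(X)
    W = [[0.0] * d_vis for _ in range(d_sem)]
    for _ in range(epochs):
        grad = [[l2 * w for w in row] for row in W]
        for x, s in zip(X, S):
            for i, row in enumerate(W):
                err = dot(row, x) - s[i]
                for j in range(d_vis):
                    grad[i][j] += err * x[j] / n
        W = [[w - lr * g for w, g in zip(row, g_row)]
             for row, g_row in zip(W, grad)]
    return W

def predict_tags(W, x, tag_vectors, k=2):
    # Project the visual feature into the semantic space, then rank
    # candidate tags by cosine similarity to the projected point.
    z = [dot(row, x) for row in W]
    ranked = sorted(tag_vectors,
                    key=lambda t: cosine(z, tag_vectors[t]),
                    reverse=True)
    return ranked[:k]

# Hypothetical toy data: 2-D word vectors and 2-D visual features.
tag_vectors = {
    "sky": [1.0, 0.0],
    "cloud": [0.8, 0.2],
    "grass": [0.0, 1.0],
    "tree": [0.2, 0.8],
}
train = [
    ([1.0, 0.0], ["sky", "cloud"]),
    ([0.0, 1.0], ["grass"]),
    ([0.9, 0.1], ["sky"]),
    ([0.1, 0.9], ["grass", "tree"]),
]
X = [x for x, _ in train]
S = [mean_vector([tag_vectors[t] for t in tags]) for _, tags in train]
W = fit_linear_map(X, S)

print(predict_tags(W, [1.0, 0.0], tag_vectors, k=2))
```

In this sketch the candidate tags live in the same embedding space as the projected image, so a tag that never co-occurred with a training image can still be ranked as long as it has a word vector. Whether the paper exploits this property is not stated in the abstract.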

Cited By

Tang J, Shu X, Qi G, Li Z, Wang M, Yan S, Jain R (2017). Tri-Clustered Tensor Completion for Social-Aware Image Tag Refinement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39:8 (1662-1674). DOI: 10.1109/TPAMI.2016.2608882. Online publication date: 29-Jun-2017
https://dl.acm.org/doi/10.1109/TPAMI.2016.2608882
Masoud M, Lee S, Belkasim S (2016). Statistical-Based Image Tagging. 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI) (610-613). DOI: 10.1109/WI.2016.0106. Online publication date: Oct-2016
https://doi.org/10.1109/WI.2016.0106

Index Terms

Image Tagging via Cross-Modal Semantic Mapping
1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Information & Contributors

Information

Published In

MM '15: Proceedings of the 23rd ACM international conference on Multimedia

October 2015

1402 pages

ISBN:9781450334594

DOI:10.1145/2733373

General Chairs:
Xiaofang Zhou, The University of Queensland, Australia
Alan F. Smeaton, Dublin City University, Ireland
Qi Tian, The University of Texas at San Antonio, USA

Program Chairs:
Dick C.A. Bulterman, FXPAL, USA
Heng Tao Shen, The University of Queensland, Australia
Ketan Mayer-Patel, The University of North Carolina, USA
Shuicheng Yan, National University of Singapore, Singapore

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

National High Technology Research and Development Program of China (863 Program)
National Natural Science Foundation of China

Conference

MM '15

Sponsor:

SIGMM

MM '15: ACM Multimedia Conference

October 26 - 30, 2015

Brisbane, Australia

Acceptance Rates

MM '15 Paper Acceptance Rate: 56 of 252 submissions, 22%

Overall Acceptance Rate: 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 2
Total Downloads: 175
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 1

Reflects downloads up to 11 Jan 2025
