DOI: 10.1145/860435.860459

Automatic image annotation and retrieval using cross-media relevance models

Published: 28 July 2003

Abstract

Libraries have traditionally used manual image annotation for indexing and later retrieving their image collections. However, manual image annotation is an expensive and labor-intensive procedure, and hence there has been great interest in automatic ways to retrieve images based on content. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images. We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate images and to retrieve them given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as that of a model based on word-blob co-occurrence and twice as good as that of a state-of-the-art model derived from machine translation. Our approach shows the usefulness of formal information retrieval models for the task of image annotation and retrieval.
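
The estimation step described in the abstract, predicting a word from the blobs in an image, could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `annotate`, the linear smoothing against collection-wide counts, the per-modality normalization, the uniform prior over training images, and the values of `alpha` and `beta` are all hypothetical. Blobs are assumed to be precomputed cluster IDs.

```python
from collections import Counter


def annotate(query_blobs, training_set, alpha=0.1, beta=0.9, top_k=3):
    """Rank annotation words for an image described by `query_blobs`.

    training_set: list of (words, blobs) pairs, one per annotated image.
    alpha, beta: smoothing weights (illustrative values, an assumption).
    Returns up to `top_k` (word, probability) pairs, best first.
    """
    if not training_set:
        return []

    # Collection-wide counts serve as the background distributions.
    word_total, blob_total = Counter(), Counter()
    for words, blobs in training_set:
        word_total.update(words)
        blob_total.update(blobs)
    n_words = sum(word_total.values())
    n_blobs = sum(blob_total.values())
    if n_words == 0 or n_blobs == 0:
        return []

    prior = 1.0 / len(training_set)  # assumed uniform P(J)
    scores = Counter()
    for words, blobs in training_set:
        word_count, blob_count = Counter(words), Counter(blobs)
        n_w, n_b = len(words) or 1, len(blobs) or 1

        # Smoothed probability that image J generates all query blobs.
        p_blobs = 1.0
        for b in query_blobs:
            p_blobs *= ((1 - beta) * blob_count[b] / n_b
                        + beta * blob_total[b] / n_blobs)

        # Accumulate the joint P(w, blobs) summed over training images.
        for w in word_total:
            p_word = ((1 - alpha) * word_count[w] / n_w
                      + alpha * word_total[w] / n_words)
            scores[w] += prior * p_word * p_blobs

    total = sum(scores.values())
    if total == 0:  # e.g. a query blob never seen in training
        return []
    # Normalize the joint scores to obtain P(w | blobs).
    return [(w, s / total) for w, s in scores.most_common(top_k)]
```

Calling `annotate(blob_ids, training_set)` returns the highest-scoring words for annotation. For retrieval by a word query, one option under the same assumptions would be to rank test images by the probability this function assigns to the query word.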

Published In

SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
July 2003
490 pages
ISBN:1581136463
DOI:10.1145/860435
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. image annotation
  2. image retrieval
  3. relevance models

Qualifiers

  • Article

Conference

SIGIR03

Acceptance Rates

SIGIR '03 Paper Acceptance Rate: 46 of 266 submissions, 17%
Overall Acceptance Rate: 792 of 3,983 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 33
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 13 Dec 2024

Cited By

  • (2024) Object Recognition to Content Based Image Retrieval: A Study of the Developments and Applications of Computer Vision. Journal of Computing and Natural Science, pages 41-52. DOI: 10.53759/181X/JCNS202404005. Online publication date: 5-Jan-2024
  • (2024) Object Recognition as Next Token Prediction. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16645-16656. DOI: 10.1109/CVPR52733.2024.01575. Online publication date: 16-Jun-2024
  • (2024) Research on image caption generation method based on multi-modal pre-training model and text mixup optimization. Signal, Image and Video Processing, 18:8-9, pages 5743-5761. DOI: 10.1007/s11760-024-03268-0. Online publication date: 28-May-2024
  • (2024) Improving loss function for deep convolutional neural network applied in automatic image annotation. The Visual Computer: International Journal of Computer Graphics, 40:3, pages 1617-1629. DOI: 10.1007/s00371-023-02873-3. Online publication date: 1-Mar-2024
  • (2024) Deep Learning Models for Image Annotation Application in a 6G Network Environment. Development of 6G Networks and Technology, pages 201-221. DOI: 10.1002/9781394230686.ch8. Online publication date: 15-Nov-2024
  • (2023) Comparison of Deep Learning Models for Automatic Image Descriptors. 2023 IEEE 20th India Council International Conference (INDICON), pages 914-919. DOI: 10.1109/INDICON59947.2023.10440731. Online publication date: 14-Dec-2023
  • (2023) G2L: A High-Dimensional Geometric Approach for Automatic Generation of Highly Accurate Pseudo-labels. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pages 1085-1094. DOI: 10.1109/ICCVW60793.2023.00117. Online publication date: 2-Oct-2023
  • (2023) From methods to datasets: A survey on Image-Caption Generators. Multimedia Tools and Applications, 83:9, pages 28077-28123. DOI: 10.1007/s11042-023-16560-x. Online publication date: 31-Aug-2023
  • (2022) Scalable Image Annotation by Summarizing Training Samples into Labeled Prototypes. Signal and Data Processing, 18:4, pages 49-68. DOI: 10.52547/jsdp.18.4.49. Online publication date: 1-Mar-2022
  • (2022) A survey on social image semantic analysis. Chinese Science Bulletin, 68:25, pages 3368-3384. DOI: 10.1360/TB-2022-0938. Online publication date: 11-Nov-2022
