Abstract
With the explosion of multimedia content online, image search has become a viable way of retrieving relevant images. However, current image search methods rely on textual cues to retrieve images and do not take into account the visual information the images contain. In this paper we aim to crawl and build multimodal image search engines that take into account both the textual and the visual content relevant to the images. We intend to use off-the-shelf text search engines to accomplish this task, which makes the construction of image retrieval systems extremely easy. We build visual models using the bag-of-words paradigm, and we propose and experimentally validate a combined multiple-vocabulary scheme that outperforms normal vocabularies.
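The abstract does not spell out how visual words reach a text search engine or how several vocabularies are combined, so the following is only an illustrative sketch under stated assumptions. It assumes each image is described by local feature vectors, that each vocabulary is a list of cluster centroids, and that vocabularies are combined by emitting one namespaced token per descriptor per vocabulary. The names `nearest_word` and `visual_document` and the token format `v<vocabulary>w<word>` are hypothetical, not taken from the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, vocabulary: Sequence[Vector]) -> int:
    """Return the index of the centroid closest to the descriptor (Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def visual_document(descriptors: Sequence[Vector],
                    vocabularies: Sequence[Sequence[Vector]]) -> str:
    """Turn an image's descriptors into a string of visual-word tokens.

    Assumption: vocabularies are combined by concatenating the tokens each one
    produces, with the vocabulary index as a prefix so words from different
    vocabularies never collide.
    """
    tokens: List[str] = []
    for vocab_id, vocabulary in enumerate(vocabularies):
        for descriptor in descriptors:
            word_id = nearest_word(descriptor, vocabulary)
            tokens.append(f"v{vocab_id}w{word_id}")
    return " ".join(tokens)


# Toy data: two 2-D descriptors and two small vocabularies (all values made up).
vocab_a = [(0.0, 0.0), (1.0, 1.0)]
vocab_b = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
descriptors = [(0.1, 0.2), (0.9, 0.8)]

print(visual_document(descriptors, [vocab_a, vocab_b]))
```

A string like this could be stored in a separate field next to an image's textual cues, so that an ordinary text search engine scores both kinds of terms with the same machinery.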
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Agarwal, A., Saxena, V. (2010). Content Based Multimodal Retrieval for Databases of Indian Monuments. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 94. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14834-7_42
DOI: https://doi.org/10.1007/978-3-642-14834-7_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14833-0
Online ISBN: 978-3-642-14834-7
eBook Packages: Computer Science, Computer Science (R0)