Abstract
In this paper we address the problem of unsupervised landmark mining: automatically discovering frequently appearing landmarks in an unstructured image dataset. Landmark mining often suffers from false matches caused by cluttered backgrounds and foregrounds, inter-class similarities, and similar factors. By analogy with TF-IDF in image retrieval, we propose the Saliency-GD weighting scheme for visual words, which can be easily integrated into state-of-the-art local-feature-based visual instance mining frameworks. In image space, saliency detection weights features from an attention perspective; in feature space, knowledge of geographic density (GD) transferred from a separate training dataset provides a multimodal selection of meaningful visual words. Experiments on public landmark datasets show that the Saliency-GD weighting scheme greatly improves landmark mining performance by increasing the discriminative power of visual features.
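The abstract does not state the weighting formula, so the following is only a minimal sketch of how the TF-IDF analogy could be read, not the authors' implementation. It assumes that each local feature carries a saliency value in [0, 1] sampled at its image location (the TF-like term) and that each visual word has a precomputed GD weight learned from a separate geotagged training set (the IDF-like term). The function name `saliency_gd_histogram` and the multiplicative combination are hypothetical choices made for illustration.

```python
import math


def saliency_gd_histogram(word_ids, saliency, gd_weight):
    """Build an L2-normalised weighted bag-of-visual-words vector for one image.

    word_ids  : visual-word index of each local feature in the image
    saliency  : saliency value of each feature (assumed to lie in [0, 1])
    gd_weight : one weight per visual word (assumed learned offline from
                geographic density; the vocabulary size is len(gd_weight))
    """
    hist = [0.0] * len(gd_weight)
    for word, sal in zip(word_ids, saliency):
        # TF-like term: accumulate saliency instead of a raw count.
        # IDF-like term: scale by the per-word GD weight (assumed form).
        hist[word] += sal * gd_weight[word]
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Three features: two salient ones on word 0, one on word 2 whose GD weight is 0.
example = saliency_gd_histogram([0, 0, 2], [0.5, 0.5, 1.0], [1.0, 1.0, 0.0])
```

In this toy input, word 2 is suppressed entirely because its assumed GD weight is zero, which mirrors how a low IDF removes uninformative terms in text retrieval.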
Acknowledgment
This work was supported by the National Basic Research Program (973 Program) of China (No. 2013CB329403), and the National Natural Science Foundation of China (Nos. 61332007, 91420201 and 61620106010).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Li, W., Li, J., Zhang, B. (2018). Saliency-GD: A TF-IDF Analogy for Landmark Image Mining. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol 10735. Springer, Cham. https://doi.org/10.1007/978-3-319-77380-3_45
DOI: https://doi.org/10.1007/978-3-319-77380-3_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77379-7
Online ISBN: 978-3-319-77380-3
eBook Packages: Computer Science, Computer Science (R0)