Abstract
In this paper, we address the problem of visual instance mining: automatically discovering frequently appearing visual instances in a large collection of images. We propose a scalable mining method that leverages a graph structure with images as vertices. Unlike most existing approaches, which focus on either instance-level similarities or image-level context properties, our method captures both. In the proposed framework, instance-level information is integrated during the construction of a sparse instance graph based on the similarity between augmented local features, while image-level context is explored with a greedy breadth-first search algorithm that discovers clusters of visual instances in the graph. This framework can tackle the challenges posed by small visual instances, diverse intra-class variations, and noise in large-scale image databases. To further improve robustness, we integrate two techniques into the basic framework. First, to better cope with the increasing noise of large databases, weak geometric consistency is adopted to efficiently incorporate the geometric information of local matches into the construction of the instance graph. Second, we propose a layout embedding algorithm, which leverages an algorithm originally designed for graph visualization to fully explore the structure of the image database. The proposed method was evaluated on four annotated data sets with different characteristics, and the experimental results showed that it outperformed state-of-the-art algorithms on all of them. We also applied our framework to a one-million-image Flickr data set and demonstrated its scalability.
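To make the graph-search idea concrete, the following is a minimal sketch of cluster discovery by breadth-first search over a sparse weighted graph. It is not the greedy variant proposed in the paper, whose details the abstract does not give; it is plain BFS over edges that pass a similarity threshold. The function name `bfs_clusters`, the edge dictionary format, and the `min_weight` and `min_size` parameters are all assumptions made for illustration.

```python
from collections import deque


def bfs_clusters(edges, min_weight=0.5, min_size=2):
    """Group vertices of a sparse weighted graph into clusters with BFS.

    edges: dict mapping (u, v) pairs to a similarity weight
           (a hypothetical input format, not taken from the paper).
    """
    # Build an adjacency list, keeping only edges at or above the threshold.
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if w >= min_weight:
            adj[u].add(v)
            adj[v].add(u)

    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        # Expand one cluster outward from the unvisited start vertex.
        seen.add(start)
        queue, members = deque([start]), []
        while queue:
            node = queue.popleft()
            members.append(node)
            for nb in sorted(adj[node]):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        # Discard clusters too small to count as a recurring instance.
        if len(members) >= min_size:
            clusters.append(sorted(members))
    return clusters


# Toy graph: the weak c-d edge falls below the threshold and splits the chain.
edges = {("a", "b"): 0.9, ("b", "c"): 0.8, ("c", "d"): 0.1, ("d", "e"): 0.7}
print(bfs_clusters(edges))
```

On the toy graph, the low-weight edge between `c` and `d` is dropped, so the search yields two groups. A greedy variant would presumably add further rules about which vertices to expand or accept, but that is outside what the abstract states.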
Additional information
Communicated by P. Pala.
The work of W. Li, J. Li, and B. Zhang was supported by the National Basic Research Program (973 Program) of China (No. 2013CB329403), and the National Natural Science Foundation of China (Nos. 61332007, 91420201 and 61620106010).
About this article
Cite this article
Li, W., Li, J., Wang, C. et al. Visual instance mining from the graph perspective. Multimedia Systems 24, 147–162 (2018). https://doi.org/10.1007/s00530-016-0533-6
DOI: https://doi.org/10.1007/s00530-016-0533-6