Abstract
We present an algorithm for interactive co-segmentation of a foreground object from a group of related images. While previous work on co-segmentation has focused on the unsupervised setting, we build on successful ideas from the interactive object-cutout literature. We develop an algorithm that lets users decide what the foreground is and then guide the output of the co-segmentation algorithm towards it via scribbles. Interestingly, keeping a user in the loop leads to simpler and highly parallelizable energy functions, allowing us to work with significantly more images per group. However, unlike in the interactive single-image setting, a user cannot be expected to exhaustively examine all cutouts (from tens of images) returned by the system to make corrections. Hence, we propose iCoseg, an automatic recommendation system that intelligently recommends where the user should scribble next. We introduce and make publicly available the largest co-segmentation dataset yet, the CMU-Cornell iCoseg dataset, with 38 groups, 643 images, and pixelwise hand-annotated ground truth. Through machine experiments and real user studies with our interface, we show that iCoseg can intelligently recommend regions to scribble on, and that users following these recommendations can achieve good-quality cutouts with significantly less time and effort than exhaustively examining all cutouts.
Cite this article
Batra, D., Kowdle, A., Parikh, D. et al. Interactively Co-segmentating Topically Related Images with Intelligent Scribble Guidance. Int J Comput Vis 93, 273–292 (2011). https://doi.org/10.1007/s11263-010-0415-x