Sparse Dictionaries for Semantic Segmentation

Lingling Tao,
Fatih Porikli &
René Vidal

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8693)

Included in the following conference series:

European Conference on Computer Vision

Abstract

A popular trend in semantic segmentation is to use top-down object information to improve bottom-up segmentation. For instance, the classification scores of the Bag of Features (BoF) model for image classification have been used to build a top-down categorization cost in a Conditional Random Field (CRF) model for semantic segmentation. Recent work shows that discriminative sparse dictionary learning (DSDL) can improve upon the unsupervised K-means dictionary learning method used in the BoF model, because DSDL can capture discriminative features from different classes. However, to the best of our knowledge, DSDL has not been used to build a top-down categorization cost for semantic segmentation. In this paper, we propose a CRF model that incorporates a DSDL-based top-down cost for semantic segmentation. We show that the new CRF energy can be minimized using existing efficient discrete optimization techniques. Moreover, we propose a new method for jointly learning the CRF parameters, object classifiers and the visual dictionary. Our experiments demonstrate that by jointly learning these parameters, the feature representation becomes more discriminative and the segmentation performance improves with respect to that of state-of-the-art methods that use unsupervised K-means dictionary learning.
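
The contrast between the two encodings mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the regularization weight `lam`, the random data, and the choice of ISTA as the sparse solver are all assumptions made for illustration. The first function builds a BoF-style histogram by assigning each feature to its nearest dictionary atom (as with a K-means dictionary); the second computes sparse codes of the same features over the same dictionary by approximately solving an l1-regularized least-squares problem.

```python
import numpy as np

def hard_assign_histogram(X, D):
    """BoF-style encoding: each feature (column of X) votes for its nearest atom (column of D)."""
    # Squared Euclidean distances between all features and all atoms, shape (N, K).
    d2 = ((X[:, :, None] - D[:, None, :]) ** 2).sum(axis=0)
    nearest = d2.argmin(axis=1)
    return np.bincount(nearest, minlength=D.shape[1]).astype(float)

def sparse_codes_ista(X, D, lam=0.1, n_iter=200):
    """Approximately solve min_A 0.5*||X - D A||_F^2 + lam*||A||_1 using ISTA."""
    step_inv = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term's gradient
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X)
        Z = A - grad / step_inv
        # Soft-thresholding is the proximal operator of the l1 penalty.
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / step_inv, 0.0)
    return A

def lasso_objective(X, D, A, lam=0.1):
    """Value of the sparse coding objective for codes A."""
    return 0.5 * np.sum((X - D @ A) ** 2) + lam * np.abs(A).sum()

# Hypothetical toy data: 20 features of dimension 8, dictionary of 5 unit-norm atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((8, 5))
D /= np.linalg.norm(D, axis=0)
X = rng.standard_normal((8, 20))

hist = hard_assign_histogram(X, D)   # one vote per feature
codes = sparse_codes_ista(X, D)      # one sparse coefficient vector per feature
pooled = np.abs(codes).sum(axis=1)   # simple sum-pooling of absolute coefficients
```

In the hard-assignment histogram every feature contributes to exactly one atom, whereas a sparse code can spread a feature over a few atoms with real-valued weights. A discriminative method such as the DSDL approach described above would additionally update the dictionary `D` using label information; this sketch keeps `D` fixed.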

Author information

Authors and Affiliations

Center for Imaging Science, Johns Hopkins University, USA
Lingling Tao & René Vidal
Australian National University & NICTA ICT, Australia
Fatih Porikli

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Tao, L., Porikli, F., Vidal, R. (2014). Sparse Dictionaries for Semantic Segmentation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8693. Springer, Cham. https://doi.org/10.1007/978-3-319-10602-1_36

DOI: https://doi.org/10.1007/978-3-319-10602-1_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10601-4
Online ISBN: 978-3-319-10602-1
eBook Packages: Computer Science, Computer Science (R0)
