
β-CapsNet: learning disentangled representation for CapsNet by information bottleneck

Published: 30 August 2022

Abstract

We present a framework for learning disentangled representations in CapsNet via an information bottleneck constraint that distills information into a compact form and encourages the network to learn interpretable capsules. In our β-CapsNet framework, the hyperparameter β trades off disentanglement against the other training objectives, and variational inference converts the information bottleneck constraint into a KL divergence term, which is in turn approximated as a constraint on the mean of the capsule. For supervised learning, a class-independent mask vector is used to capture the types of variation shared across image classes, and we carry out extensive quantitative and qualitative experiments, tuning β to characterize the relationship between disentanglement, reconstruction, and classification performance. We further propose an unsupervised β-CapsNet, together with a corresponding dynamic routing algorithm, for learning disentangled capsules without labels. Extensive empirical evaluations show that β-CapsNet achieves state-of-the-art disentanglement performance compared to CapsNet and various baselines on several complex datasets, in both supervised and unsupervised settings.
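The objective implied by the abstract is an information-bottleneck Lagrangian of the familiar form L = L_task + β · KL(q(z|x) || p(z)), where z is the capsule representation and β controls how tightly information is compressed. As a minimal sketch only, the PyTorch-style code below illustrates such a β-weighted loss; the diagonal-Gaussian posterior, the unit-Gaussian prior, the cross-entropy stand-in for CapsNet's margin loss, and all function names are illustrative assumptions rather than the authors' implementation (the paper itself approximates the KL term as a constraint on the capsule mean).

    import torch
    import torch.nn.functional as F

    def kl_to_standard_normal(mu, logvar):
        # KL( N(mu, sigma^2) || N(0, I) ), summed over capsule dimensions
        # and averaged over the batch; the diagonal-Gaussian posterior is
        # an assumption borrowed from VIB-style objectives.
        return (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=-1).mean()

    def beta_capsnet_loss(class_logits, labels, mu, logvar, recon, target, beta=1e-3):
        # Hypothetical composite objective: classification plus a scaled-down
        # reconstruction term (as in the original CapsNet) plus a beta-weighted
        # KL penalty acting as the information bottleneck constraint.
        task = F.cross_entropy(class_logits, labels)  # stand-in for margin loss
        recon_term = F.mse_loss(recon, target)
        bottleneck = kl_to_standard_normal(mu, logvar)
        return task + 0.0005 * recon_term + beta * bottleneck

    # Example shapes: batch of 32, 10 classes, 16-dimensional class capsules.
    logits = torch.randn(32, 10)
    labels = torch.randint(0, 10, (32,))
    mu, logvar = torch.randn(32, 16), torch.randn(32, 16)
    recon, target = torch.randn(32, 784), torch.randn(32, 784)
    loss = beta_capsnet_loss(logits, labels, mu, logvar, recon, target)

Raising β tightens the bottleneck and, as the abstract reports, trades reconstruction and classification accuracy for greater disentanglement.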




Published In

Neural Computing and Applications, Volume 35, Issue 3 (Jan 2023), 894 pages
ISSN: 0941-0643
EISSN: 1433-3058

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 30 August 2022
Accepted: 12 August 2022
Received: 25 November 2021

Author Tags

  1. Disentanglement
  2. Information bottleneck
  3. CapsNet
  4. Representation learning

Qualifiers

  • Research-article
