Abstract
Great interest has arisen in using Deep Generative Models (DGMs) for generative design. When assessing the quality of generated designs, human designers focus more on structural plausibility, e.g., no missing components, than on visual artifacts, e.g., noise or blurriness. Meanwhile, commonly used metrics such as the Fréchet Inception Distance (FID) may not evaluate such designs accurately, because they are sensitive to visual artifacts but tolerant of semantic errors. As such, FID might not be suitable for assessing the performance of DGMs on generative design tasks. In this work, we propose to encode the images under evaluation with a Denoising Autoencoder (DAE) and to measure the distribution distance in the resulting latent space, yielding a novel metric, the Fréchet Denoised Distance (FDD). We experimentally compare FDD with FID and other state-of-the-art metrics on multiple datasets, e.g., BIKED, Seeing3DChairs, FFHQ, and ImageNet. FDD effectively detects implausible structures and is more consistent with structural inspections by human experts. Our source code is publicly available at https://github.com/jiajie96/FDD_pytorch.
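To make the procedure concrete, below is a minimal sketch of how an FDD-style score can be computed, assuming a pretrained DAE encoder is available. The `dae_encoder` callable and the array shapes are illustrative assumptions; the authors' reference implementation in the repository above may differ in details such as preprocessing and the DAE architecture.

```python
# A minimal sketch of the FDD idea described in the abstract, NOT the authors'
# reference implementation (see https://github.com/jiajie96/FDD_pytorch for that).
# `dae_encoder` is a hypothetical callable mapping an image batch to latent codes.
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-6):
    """Fréchet distance between two Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; add jitter if it is singular.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if not np.isfinite(covmean).all():
        offset = np.eye(sigma1.shape[0]) * eps
        covmean = linalg.sqrtm((sigma1 + offset) @ (sigma2 + offset))
    covmean = covmean.real  # discard tiny imaginary parts from numerical error
    return diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * np.trace(covmean)

def fdd(real_images, fake_images, dae_encoder):
    """Fréchet Denoised Distance: Fréchet distance between Gaussian fits
    of DAE latent codes of real and generated images."""
    z_real = dae_encoder(real_images)  # assumed shape: (n_real, d)
    z_fake = dae_encoder(fake_images)  # assumed shape: (n_fake, d)
    mu_r, sig_r = z_real.mean(axis=0), np.cov(z_real, rowvar=False)
    mu_f, sig_f = z_fake.mean(axis=0), np.cov(z_fake, rowvar=False)
    return frechet_distance(mu_r, sig_r, mu_f, sig_f)
```

The Fréchet-distance computation itself mirrors the standard FID formula; only the feature extractor changes, from an ImageNet-trained Inception network to a DAE encoder, which is what makes the metric sensitive to structural rather than textural differences.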
References
Aubry, M., Maturana, D., Efros, A.A., Russell, B.C., Sivic, J.: Seeing 3d chairs: exemplar part-based 2d-3d alignment using a large dataset of cad models. In: 2014 IEEE CVPR, pp. 3762–3769 (2014). https://doi.org/10.1109/CVPR.2014.487
Baker, N., Lu, H., Erlikhman, G., Kellman, P.J.: Deep convolutional networks do not classify based on global object shape. PLoS Comput. Biol. 14 (2018). https://api.semanticscholar.org/CorpusID:54476941
Barratt, S.T., Sharma, R.: A note on the inception score. ArXiv abs/1801.01973 (2018). https://api.semanticscholar.org/CorpusID:38384342
Betzalel, E., Penso, C., Navon, A., Fetaya, E.: A study on the evaluation of generative models. CoRR abs/2206.10935 (2022). https://doi.org/10.48550/ARXIV.2206.10935
Binkowski, M., Sutherland, D.J., Arbel, M., Gretton, A.: Demystifying MMD GANs. In: 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30–May 3, 2018, Conference Track Proceedings. OpenReview.net (2018). https://openreview.net/forum?id=r1lUOzWCW
Borji, A.: Pros and cons of GAN evaluation measures: new developments. Comput. Vis. Image Underst. 215, 103329 (2022). https://doi.org/10.1016/j.cviu.2021.103329, https://www.sciencedirect.com/science/article/pii/S1077314221001685
Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. ArXiv abs/1809.11096 (2018). https://api.semanticscholar.org/CorpusID:52889459
Buzuti, L.F., Thomaz, C.E.: Fréchet autoencoder distance: a new approach for evaluation of generative adversarial networks. Comput. Vis. Image Underst. 235, 103768 (2023)
Caron, M., Misra, I., Mairal, J., Goyal, P., Bojanowski, P., Joulin, A.: Unsupervised learning of visual features by contrasting cluster assignments. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. NIPS 2020, Curran Associates Inc., Red Hook, NY, USA (2020)
Choi, Y., Uh, Y., Yoo, J., Ha, J.W.: Stargan v2: diverse image synthesis for multiple domains. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8185–8194 (2020). https://doi.org/10.1109/CVPR42600.2020.00821
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009, pp. 248–255. IEEE (2009). https://ieeexplore.ieee.org/abstract/document/5206848/
Dhariwal, P., Nichol, A.Q.: Diffusion models beat GANs on image synthesis. In: Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems (2021). https://openreview.net/forum?id=AAWuCvzaVt
Dosovitskiy, A., et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)
Fan, J., Vuaille, L., Bäck, T., Wang, H.: On the noise scheduling for generating plausible designs with diffusion models (2023)
Fan, J., Vuaille, L., Wang, H., Bäck, T.: Adversarial latent autoencoder with self-attention for structural image synthesis. arXiv preprint arXiv:2307.10166 (2023)
Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F., Brendel, W.: Imagenet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. ArXiv abs/1811.12231 (2018). https://api.semanticscholar.org/CorpusID:54101493
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014). http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.: A kernel two-sample test. J. Mach. Learn. Res. 13(1), 723–773 (2012)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90
Hermann, K.L., Chen, T., Kornblith, S.: The origins and prevalence of texture bias in convolutional neural networks. arXiv: Computer Vision and Pattern Recognition (2019). https://api.semanticscholar.org/CorpusID:220266152
Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Advances in Neural Information Processing Systems, vol. 30 (NIPS 2017) (2017)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. arXiv preprint arXiv:2006.11239 (2020)
Horak, D., Yu, S., Khorshidi, G.S.: Topology distance: a topology-based approach for evaluating generative adversarial networks. In: Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, IAAI 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, EAAI 2021, Virtual Event, February 2-9, 2021, pp. 7721–7728. AAAI Press (2021). https://doi.org/10.1609/AAAI.V35I9.16943
Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. ArXiv abs/1710.10196 (2017). https://api.semanticscholar.org/CorpusID:3568073
Karras, T., Aittala, M., Aila, T., Laine, S.: Elucidating the design space of diffusion-based generative models. Adv. Neural. Inf. Process. Syst. 35, 26565–26577 (2022)
Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410 (2019)
Kucker, S.C., et al.: Reproducibility and a unifying explanation: lessons from the shape bias. Infant Behav. Dev. 54, 156–165 (2019). https://api.semanticscholar.org/CorpusID:53045726
Kynkäänniemi, T., Karras, T., Aittala, M., Aila, T., Lehtinen, J.: The role of imagenet classes in fréchet inception distance. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=4oXTQ6m_ws8
Landau, B., Smith, L.B., Jones, S.S.: The importance of shape in early lexical learning. Cogn. Dev. 3, 299–321 (1988). https://api.semanticscholar.org/CorpusID:205117480
Liu, W., et al.: Towards visually explaining variational autoencoders. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8642–8651 (2020)
Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3730–3738 (2015)
Maiorca, A., Yoon, Y., Dutoit, T.: Evaluating the quality of a synthesized motion with the fréchet motion distance. In: ACM SIGGRAPH 2022 Posters. SIGGRAPH 2022, Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3532719.3543228
Naeem, M.F., Oh, S.J., Uh, Y., Choi, Y., Yoo, J.: Reliable fidelity and diversity metrics for generative models. In: III, H.D., Singh, A. (eds.) Proceedings of the 37th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 119, pp. 7176–7185. PMLR (2020). https://proceedings.mlr.press/v119/naeem20a.html
Nobari, A.H., Rashad, M.F., Ahmed, F.: Creativegan: editing generative adversarial networks for creative design synthesis. CoRR abs/2103.06242 (2021). https://arxiv.org/abs/2103.06242
Oquab, M., et al.: DINOv2: learning robust visual features without supervision. Trans. Mach. Learn. Res. (2024). https://openreview.net/forum?id=a68SUt6zFt
Radford, A., et al.: Learning transferable visual models from natural language supervision. In: International Conference on Machine Learning (2021). https://api.semanticscholar.org/CorpusID:231591445
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings (2016). http://arxiv.org/abs/1511.06434
Regenwetter, L., Curry, B., Ahmed, F.: BIKED: a dataset for computational bicycle design with machine learning benchmarks. J. Mech. Des. 144(3) (2021). https://doi.org/10.1115/1.4052585
Regenwetter, L., Nobari, A.H., Ahmed, F.: Deep generative models in engineering design: a review. J. Mech. Des. 144(7), 071704 (2022)
Salimans, T., Goodfellow, I.J., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. ArXiv abs/1606.03498 (2016). https://api.semanticscholar.org/CorpusID:1687220
Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: visual explanations from deep networks via gradient-based localization. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 618–626 (2017). https://doi.org/10.1109/ICCV.2017.74
Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. arXiv:2010.02502 (2020). https://arxiv.org/abs/2010.02502
Stein, G., et al.: Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models. CoRR abs/2306.04675 (2023). https://doi.org/10.48550/ARXIV.2306.04675
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2818–2826 (2016). https://api.semanticscholar.org/CorpusID:206593880
Van Den Oord, A., Vinyals, O., et al.: Neural discrete representation learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A.: Extracting and composing robust features with denoising autoencoders. In: International Conference on Machine Learning (2008). https://api.semanticscholar.org/CorpusID:207168299
Xiao, H., Rasul, K., Vollgraf, R.: Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. ArXiv abs/1708.07747 (2017). https://api.semanticscholar.org/CorpusID:702279
Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018, pp. 586–595. Computer Vision Foundation / IEEE Computer Society (2018). https://doi.org/10.1109/CVPR.2018.00068, http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_The_Unreasonable_Effectiveness_CVPR_2018_paper.html
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004)
Acknowledgements
We gratefully acknowledge Laure Vuaille for her valuable insights and the support provided by BMW Group.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Fan, J., Trigui, A., Bäck, T., Wang, H. (2025). Enhancing Plausibility Evaluation for Generated Designs with Denoising Autoencoder. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15136. Springer, Cham. https://doi.org/10.1007/978-3-031-73229-4_6
DOI: https://doi.org/10.1007/978-3-031-73229-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73228-7
Online ISBN: 978-3-031-73229-4