
Wavelets in the Deep Learning Era

Journal of Mathematical Imaging and Vision

Abstract

Sparsity-based methods, such as wavelets, were the state of the art for inverse problems for more than 20 years before being overtaken by neural networks. In particular, U-nets have proven extremely effective. Their main ingredients are highly nonlinear processing, massive learning made possible by advances in optimization algorithms and computing hardware (GPUs), and the use of large available datasets for training. It is far from obvious which of these three ingredients has the biggest impact on performance. While the many stages of nonlinearity are intrinsic to deep learning, learning from training data could also be exploited by sparsity-based approaches. The aim of our study is to push sparsity to its limits by using, as U-nets do, massive learning and large datasets, and then to compare the results with U-nets. We present a new network architecture, called learnlets, which preserves the properties of sparsity-based methods, such as exact reconstruction and good generalization, while harnessing the power of neural networks for learning and fast computation. We evaluate the model on image denoising tasks. Our conclusion is that U-nets perform better than learnlets on in-distribution image quality metrics, while learnlets have better generalization properties.
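To make the learnlet idea concrete, the sketch below shows a single-scale analysis/thresholding/synthesis denoiser in TensorFlow, the framework used in this work. It is an illustration only, with assumed filter counts and sizes: the actual learnlet architecture is multi-scale and enforces exact reconstruction, and the reference implementation is the one in the repository given in note 1 below.

    import tensorflow as tf

    class TinyLearnlet(tf.keras.Model):
        # Illustrative single-scale learnlet-style denoiser (hypothetical hyper-parameters).
        def __init__(self, n_filters=16, kernel_size=5):
            super().__init__()
            # Learned analysis filters (no bias, so thresholding acts directly on the coefficients).
            self.analysis = tf.keras.layers.Conv2D(
                n_filters, kernel_size, padding='same', use_bias=False)
            # One learnable soft-threshold level per analysis filter.
            self.thresholds = tf.Variable(0.1 * tf.ones(n_filters))
            # Learned synthesis filters mapping the thresholded coefficients back to an image.
            self.synthesis = tf.keras.layers.Conv2D(
                1, kernel_size, padding='same', use_bias=False)

        def call(self, noisy, noise_std=1.0):
            coeffs = self.analysis(noisy)                    # (batch, H, W, n_filters)
            t = self.thresholds * noise_std                  # thresholds scale with the noise level
            shrunk = tf.sign(coeffs) * tf.nn.relu(tf.abs(coeffs) - t)  # soft-thresholding
            return self.synthesis(shrunk)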



Notes

  1. https://github.com/zaccharieramzi/understanding-unets.

  2. https://www.tensorflow.org/api_docs/python/tf/image/rgb_to_grayscale, the TensorFlow documentation for the RGB-to-grayscale conversion (a short usage sketch is given after these notes).

  3. By a sample, we mean a full image, not just a patch of the image.
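As an illustration of note 2, the grayscale conversion is a single TensorFlow call (a minimal usage sketch; the file name is a placeholder):

    import tensorflow as tf

    # Read an RGB image and convert it to a single luma-weighted channel.
    rgb = tf.io.decode_image(tf.io.read_file('example.png'), channels=3)
    gray = tf.image.rgb_to_grayscale(rgb)  # shape (H, W, 1)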

References

  1. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Springer (2015)

  2. Isola, P., et al.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134 (2017)

  3. Adler, J., Oktem, O.: Learned primal–dual reconstruction. IEEE Trans. Med. Imaging 37(6), 1322–1332 (2018)


  4. Zbontar, J., et al.: fastMRI: an open dataset and benchmarks for accelerated MRI. arXiv preprint arXiv:1811.08839 (2018)

  5. Quan, T.M., Nguyen-Duc, T., Jeong, W.-K.: Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss. IEEE Trans. Med. Imaging 37(6), 1488–1497 (2018)


  6. Ye, J.C., Han, Y., Cha, E.: Deep convolutional framelets: a general deep learning framework for inverse problems. SIAM J. Imaging Sci. 11(2), 991–1048 (2018)


  7. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)

  8. Donoho, D.L.: De-noising by soft-thresholding. IEEE Trans. Inf. Theory 41(3), 613–627 (1995)


  9. Gottschling, N.M., et al.: The troublesome kernel: why deep learning for inverse problems is typically unstable. arXiv preprint arXiv:2001.01258 (2020)

  10. Ramzi, Z., et al.: Wavelets in the deep learning era. In: 2020 28th European Signal Processing Conference (EUSIPCO) (2021). ISSN 2076-1465. https://doi.org/10.23919/Eusipco47968.2020.9287317

  11. Recoskie, D., Mann, R.: Learning filters for the 2D wavelet transform. In: Proceedings of the 2018 15th Conference on Computer and Robot Vision (CRV 2018), pp. 198–205 (2018). https://doi.org/10.1109/CRV.2018.00036

  12. Jawali, D., Kumar, A., Seelamantula, C.A.: A learning approach for wavelet design. In: ICASSP 2019, IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5018–5022 (2019). ISBN 9781538646588

  13. Pfister, L., Bresler, Y.: Learning filter bank sparsifying transforms. IEEE Trans. Signal Process. 67(2), 504–519 (2019). https://doi.org/10.1109/TSP.2018.2883021


  14. Fan, F., et al.: Soft autoencoder and its wavelet adaptation interpretation. arXiv preprint arXiv:1812.11675 (2018)

  15. Abadi, M., et al.: TensorFlow: large-scale machine learning on heterogeneous systems. Software available from tensorflow.org (2015). http://tensorflow.org/

  16. Arbelaez, P., et al.: Contour detection and hierarchical image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 898–916 (2011). https://doi.org/10.1109/TPAMI.2010.161


  17. Martin, D., et al.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Proceedings of 8th International Conference on Computer Vision, Vol. 2, pp. 416–423 (2001)

  18. Zhang, K., et al.: Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017). https://doi.org/10.1109/TIP.2017.2662206


  19. Lefkimmiatis, S.: Universal denoising networks: a novel CNN architecture for image denoising. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3204–3213 (2018)

  20. Mohan, S., et al.: Robust and interpretable blind image denoising via bias-free convolutional neural networks. In: International Conference on Learning Representations (2020)

  21. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  22. Mallat, S.: A Wavelet Tour of Signal Processing. Elsevier, Hoboken (1999)


  23. Farrens, S., et al.: PySAP: python sparse data analysis package for multidisciplinary image processing. arXiv preprint arXiv:1910.08465 (2019)

  24. Candes, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted l1 minimization. J. Fourier Anal. Appl. 14(5–6), 877–905 (2008)


  25. Starck, J.-L., Candes, E.J., Donoho, D.L.: The curvelet transform for image denoising. IEEE Trans. Image Process. 11(6), 670–684 (2002)


  26. Yu, S., Park, B., Jeong, J.: Deep iterative down-up CNN for image denoising. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (2019)


Acknowledgements

This work was granted access to the HPC resources of IDRIS under the allocation 2021-AD011011554 made by GENCI.

Author information


Corresponding author

Correspondence to Zaccharie Ramzi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A U-net Architecture

Figure 10 shows the U-net architecture. It consists of:

  • A contracting path that captures context by applying convolutions, ReLU nonlinearities and max-pooling for downsampling.

  • An expanding path that applies upsampling and convolution operations (a minimal sketch of this structure is given right after this list).
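The following two-scale Keras sketch illustrates the structure described above; the kernel sizes and number of stages are assumed for illustration and do not reproduce the exact configuration of [1] or of our experiments.

    import tensorflow as tf
    from tensorflow.keras import layers

    def tiny_unet(base_filters=64):
        inp = layers.Input(shape=(None, None, 1))
        # Contracting path: convolution + ReLU, then max-pooling for downsampling.
        c1 = layers.Conv2D(base_filters, 3, padding='same', activation='relu')(inp)
        p1 = layers.MaxPooling2D(2)(c1)
        c2 = layers.Conv2D(2 * base_filters, 3, padding='same', activation='relu')(p1)
        # Expanding path: upsampling + convolution, concatenated with the skip connection.
        u1 = layers.UpSampling2D(2)(c2)
        m1 = layers.Concatenate()([u1, c1])
        c3 = layers.Conv2D(base_filters, 3, padding='same', activation='relu')(m1)
        out = layers.Conv2D(1, 1, padding='same')(c3)
        return tf.keras.Model(inp, out)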

Fig. 10 The U-net architecture. The number of channels of each feature map is indicated at the top of the blue rectangles. In this case, the number of base filters is 64. Figure from [1]

B Values of Figures

Tables 2, 3, 4 and 5 list all the PSNR values as a function of \(\sigma \) for each model in Figs. 3, 4, 5 and 6, respectively. Similarly, Table 6 corresponds to the information presented in Fig. 8.

Table 2 PSNR for different standard deviations of the noise added to the test images for every model in Fig. 3 and the original noisy images
Table 3 PSNR for different standard deviations of the noise added to the test images for both models in Fig. 4 and the original noisy images
Table 4 PSNR for different standard deviations of the noise added to the test images for all the models in Fig. 5 and the original noisy images
Table 5 PSNR for different standard deviations of the noise added to the test images for both models in Fig. 6 and the original noisy images
Table 6 PSNR for noise of standard deviation \(\sigma =25\) added to the test images, as a function of the number of samples used during training, for the three models in Fig. 8
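For reference, PSNR is defined in the usual way: writing \(x\) for the clean image, \(\hat{x}\) for the denoised estimate, \(N\) for the number of pixels and \(I_{\max }\) for the maximal pixel intensity of the reference,

\(\mathrm{PSNR}(x, \hat{x}) = 10 \log _{10} \Bigl ( I_{\max }^2 \Big / \tfrac{1}{N} \sum _{i=1}^{N} (x_i - \hat{x}_i)^2 \Bigr )\),

so that larger values indicate better reconstructions.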

C Training on Horizontal and Vertical Bands

In order to check how learnlets adapt their filters to the dataset, we designed a very simple experiment in which the goal is to denoise images containing only vertical and horizontal bands. An example of such an image is shown in Fig. 11. The setting is otherwise similar to that described in Sect. 5, and we consider learnlets with \(J_m = 17\) analysis filters, \(m=5\) scales and filters of size \(k_A = 5\).
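As an illustration, training images of this kind can be generated along the following lines (a sketch only: the band widths, intensities and noise level below are assumed, and the exact procedure used for the experiment is not reproduced here).

    import numpy as np

    def band_image(size=64, n_bands=4, rng=None):
        # Image containing only horizontal and vertical bands of random position, width and intensity.
        rng = np.random.default_rng() if rng is None else rng
        img = np.zeros((size, size), dtype=np.float32)
        for _ in range(n_bands):
            start = rng.integers(0, size - 4)
            width = rng.integers(2, 5)
            value = rng.uniform(0.5, 1.0)
            if rng.random() < 0.5:
                img[start:start + width, :] += value   # horizontal band
            else:
                img[:, start:start + width] += value   # vertical band
        return img

    # A training pair: noisy input and clean target.
    clean = band_image()
    noisy = clean + np.random.normal(0, 0.1, clean.shape).astype(np.float32)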

Fig. 11 An example image containing vertical bands for the simple experiment

The analysis filters obtained after training now feature some horizontal and vertical bands, as can be expected. This is illustrated for the first scale in Fig. 12.

This shows how learnlets tie their filters to the data they are trained on in order to be as efficient as possible.

Fig. 12 Analysis filters of the first scale of learnlets trained to denoise images containing only vertical and horizontal bands. Vertical and horizontal bands are now visible in the filters themselves

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ramzi, Z., Michalewicz, K., Starck, JL. et al. Wavelets in the Deep Learning Era. J Math Imaging Vis 65, 240–251 (2023). https://doi.org/10.1007/s10851-022-01123-w


