Abstract
For more than 20 years, sparsity-based methods such as wavelets were the state of the art for inverse problems, before being overtaken by neural networks. In particular, U-nets have proven to be extremely effective. Their main ingredients are highly nonlinear processing, massive learning made possible by advances in optimization algorithms and GPU computing power, and the use of large available datasets for training. It is far from obvious which of these three ingredients has the biggest impact on performance. While the many stages of nonlinearity are intrinsic to deep learning, learning from training data could also be exploited by sparsity-based approaches. The aim of our study is to push sparsity-based methods to their limits by using, as U-nets do, massive learning and large datasets, and then to compare the results with U-nets. We present a new network architecture, called learnlets, which preserves the properties of sparsity-based methods, such as exact reconstruction and good generalization, while harnessing the power of neural networks for learning and fast computation. We evaluate the model on image denoising tasks. Our conclusion is that U-nets perform better than learnlets on image quality metrics in distribution, while learnlets have better generalization properties.
Notes
TensorFlow documentation for RGB-to-grayscale conversion: https://www.tensorflow.org/api_docs/python/tf/image/rgb_to_grayscale.
By a sample, we mean a full image, not just a patch of the image.
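For reference, `tf.image.rgb_to_grayscale` computes a weighted luma combination of the channels; a minimal NumPy equivalent is sketched below using the ITU-R BT.601 weights (the function name is ours, not from the paper):

```python
import numpy as np

def rgb_to_grayscale(img):
    # ITU-R BT.601 luma weights: 0.2989 R + 0.5870 G + 0.1140 B
    weights = np.array([0.2989, 0.5870, 0.1140])
    return img @ weights

img = np.random.rand(32, 32, 3)   # a random RGB image in [0, 1]
gray = rgb_to_grayscale(img)      # shape (32, 32)
```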
Acknowledgements
This work was granted access to the HPC resources of IDRIS under the allocation 2021-AD011011554 made by GENCI.
Appendices
A U-net Architecture
Figure 10 shows the U-net architecture. It consists of:

- A contracting path that captures context by applying convolutions, nonlinearities (ReLU) and max-pooling for downsampling.
- An expanding path that contains upsampling and convolution operations.
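The two spatial operations that shrink and grow the feature maps along these paths can be sketched in isolation; this is illustrative only (a real U-net also interleaves convolutions, ReLUs and skip connections):

```python
import numpy as np

def max_pool_2x2(x):
    # Downsampling step of the contracting path: 2x2 max-pooling
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_2x2(x):
    # Upsampling step of the expanding path: nearest-neighbour repetition
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

x = np.arange(16, dtype=float).reshape(4, 4)
down = max_pool_2x2(x)    # shape (2, 2): max of each 2x2 block
up = upsample_2x2(down)   # shape (4, 4): back to the input resolution
```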
B Values of Figures
Tables 2, 3, 4 and 5 list all the PSNR values as a function of \(\sigma \) for each model in Figs. 3, 4, 5 and 6, respectively. Similarly, Table 6 corresponds to the information presented in Fig. 8.
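For completeness, the PSNR reported in these tables is the standard peak signal-to-noise ratio; a minimal sketch for images with values in \([0, \text{max\_val}]\) (not the authors' evaluation code):

```python
import numpy as np

def psnr(x, y, max_val=1.0):
    # Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)
    mse = np.mean((x - y) ** 2)
    return 10 * np.log10(max_val ** 2 / mse)

# A reference image and a version offset by 0.1 everywhere: MSE = 0.01
val = psnr(np.zeros((8, 8)), np.full((8, 8), 0.1))  # 20 dB
```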
C Training on Horizontal and Vertical Bands
To check how learnlets adapt their filters to the dataset, we designed a very simple experiment in which the goal is to denoise images containing only vertical and horizontal bands. An example of such an image can be seen in Fig. 11. The setting is otherwise similar to that described in Sect. 5, and we consider learnlets with \(J_m = 17\) analysis filters, \(m=5\) scales and filters of size \(k_A = 5\).
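As an illustration of this setting (not the authors' actual data-generation code; sizes and band widths are assumptions), such a band image with additive Gaussian noise could be generated as follows:

```python
import numpy as np

def make_band_image(size=64, n_bands=4, band_width=4, rng=None):
    # Synthetic binary image containing only horizontal and vertical bands
    rng = np.random.default_rng(rng)
    img = np.zeros((size, size))
    for _ in range(n_bands):
        start = rng.integers(0, size - band_width)
        if rng.random() < 0.5:
            img[start:start + band_width, :] = 1.0  # horizontal band
        else:
            img[:, start:start + band_width] = 1.0  # vertical band
    return img

sigma = 0.1  # noise standard deviation
clean = make_band_image(rng=0)
noisy = clean + sigma * np.random.default_rng(1).standard_normal(clean.shape)
```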
As expected, the analysis filters obtained after training feature horizontal and vertical bands. This is illustrated for the first scale in Fig. 12.

This shows how learnlets truly adapt their filters to the data they are trained on, in order to be as efficient as possible.
Cite this article
Ramzi, Z., Michalewicz, K., Starck, JL. et al. Wavelets in the Deep Learning Era. J Math Imaging Vis 65, 240–251 (2023). https://doi.org/10.1007/s10851-022-01123-w