Abstract
The well-known method of median filtering is used both in a wide range of application frameworks and as a standalone filter. Small-window median filters can highly reduce the power of salt and pepper or additive Gaussian noise and minimize edge blurring. Large-window filters are also used, for example to estimate the background of images. Currently, the window size should not be an issue as a constant time algorithm and several implementations, including GPU-based codes, have been proposed in recent years. Unfortunately, none of these constant time implementations manage to fully exploit the capabilities of modern GPUs, and thus the throughputs of large-window median filters remain far below the peak throughputs allowed by recent GPU models. This paper aims at showing that a separable approximation of a 2D median filtering is often stronger than its full 2D implementation. Statistical and theoretical analysis are conducted to show and explain this fact which had so far remained unobserved. It is confirmed by experimentations on a dataset composed of 10,000 images, corrupted by different levels of salt and pepper noise. Separable and full 2D median filter algorithms are compared with several metrics, notably PSNR and MSSIM. In addition, a GPU implementation of 2D separable median filters is also proposed. This implementation is able to output up to 125 billion pixels per second on a recent Volta V100, which significantly outperforms existing implementations like Nvidia’s NPP library or Green’s code, resulting in the fastest median filtering solution to date.
Similar content being viewed by others
Notes
Compute unified device architecture, Nvidia’s programming model.
Parallel Ccdf-based Median filter (Ccdf: cumulative complementary distribution function).
Parallel register-only median filter.
In that chapter, Tukey addresses the issue of statistical analysis on large datasets and discusses a method in which data are sliced into nine-value sets (or 81 values for larger datasets) and where the mean estimation is taken as the median value of the median values of all those sets, i.e., the ninther. The process can be recursive. In our implementation, data slices are mapped to image rows.
Peak signal-to-noise ratio.
Mean structural SIMilarity.
Titan-X GPU: Maxwell family, compute capability 5.2, 3,072 cuda cores at 1.24 GHz, 12 GByte of RAM.
V100 SXM2 GPU: Volta family, compute capability 7.0, 5,120 cuda cores at 1.45 GHz, 32 GByte of RAM.
Parallel Register-only Separable Median Filter
References
Adams A (2021) Fast median filters using separable sorting networks. In: ACM transactions and graphics, August 2021, Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/3450626.3459773
Batcher KE (1968) Sorting networks and their applications. In: Proceedings of the April 30–May 2, 1968, Spring Joint Computer Conference, ACM, New York, NY, USA, AFIPS ’68 (Spring), pp 307–314. https://doi.org/10.1145/1468075.1468121
Battiato PS (2016) High performance median filtering algorithm based on NVIDIA GPU computing
Chen W, Beister M, Kyriakou Y, Kachelries M (2009) High performance median filtering using commodity graphics hardware. In: Nuclear Science Symposium Conference Record (NSS/MIC), 2009 IEEE, pp 4142–414. https://doi.org/10.1109/NSSMIC.2009.5402323
Cline D, White KB, Egbert PK (2007) Fast 8-bit median filtering based on separability. In: IEEE International Conference on Image Processing, 2007. ICIP 2007. IEEE, vol 5, pp V-281
Cover database used to generate the testing database. http://boss.gipsa-lab.grenoble-inp.fr/Warming/Materials/BOSSRank-covers.tar.bz2
Green O (2018) Efficient scalable median filtering using histogram-based operations. IEEE Trans Image Process 27(5):2217–2228
Huang TS (1981) Two-dimensional digital signal processing II: transforms and median filters. Springer, New York
Kachelriess M (2009) Branchless vectorized median filtering. In: Nuclear Science Symposium Conference Record (NSS/MIC), 2009 IEEE, pp 4099 –4105. https://doi.org/10.1109/NSSMIC.2009.5402362
Narendra PM (1981) A separable median filter for image noise smoothing. IEEE Trans Pattern Anal Mach Intell 1:20–29
Nividia Kepler architecture. https://en.wikipedia.org/wiki/Kepler_(microarchitecture)
Nvidia Maxwell architecture. https://developer.nvidia.com/maxwell-compute-architecture
Nvidia Pascal architecture. https://developer.nvidia.com/pascal
Nvidia Volta architecture. https://devblogs.nvidia.com/inside-volta/
Nvidia NPP library. https://developer.nvidia.com/npp
Paulius Micikevicius (2013) Performance optimization: programming guidelines and GPU architecture reasons behind them. GPU Technology Conference. http://on-demand.gputechconf.com/gtc/2013/presentations/S3466-Programming-Guidelines-GPU-Architecture.pdf
Pei-Eng N, Kai-Kuang M (2006) A switching median filter with boundary discriminative noise detection for extremely corrupted images. IEEE Trans Image Process 15(6):1506–1516. https://doi.org/10.1109/TIP.2005.871129
Perreault S, Hébert P (2007) Median filtering in constant time. IEEE Trans Image Process 16(9):2389–2394
Perrot G, Domas S, Couturier R (2014) Fine-tuned high-speed implementation of a GPU-based median filter. J Signal Process Syst 75(3):185–190
Sanchez R, Rodriguez P (2012) Highly parallelable bidimensional median filter for modern parallel programming models. J Signal Process Syst. https://doi.org/10.1007/s11265-012-0715-1
Tukey JW (1977) Exploratory data analysis. Addison-Wesley, Boston
Tukey JW (1978) The ninther, a technique for low-effort robust (resistant) location in large samples. Contributions to survey sampling and applied statistics in honor of HO Hartley pp 251–258
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) “Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612. https://doi.org/10.1109/TIP.2003.819861
Weiss B (2006) Fast median and bilateral filtering. ACM SIGGRAPH, (2006) Papers, ACM, vol ’06. NY, USA, SIGGRAPH, New York, pp 519–526
Acknowledgements
This work was partially supported by the EIPHI Graduate School (contract ANR-17-EURE-0002). Computations have been performed on the supercomputer facilities of the “Mésocentre de Franche-Comté”.
Funding
The Graphical Processing Units used in the described experiments have been provided by the Mésocentre de calcul de Franche-Comté, the regional computing facility.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interests
Authors declare that they do not have no conflict of interest of any sort.
Availability of data and material
Source codes of the proposed implementation can be generated online and downloaded for testing purposes at: http://info.iut-bm.univ-fcomte.fr/staff/perrot/convomed/ In addition, full definition images and more measurements can be found at: http://info.iut-bm.univ-fcomte.fr/staff/perrot/separable-median/
Code availability
Sample source codes are available via the above URL, but the source code of the kernel generator is beyond the scope of this paper and is right protected.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
GPU: Graphical Processing Unit
NPP: Nvidia Performance Primitives.
Supplementary Information
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Perrot, G., Domas, S. & Couturier, R. How separable median filters can get better results than full 2D versions. J Supercomput 78, 10118–10148 (2022). https://doi.org/10.1007/s11227-021-04233-1
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11227-021-04233-1