Abstract
The 2D Discrete Wavelet Transform (DWT) is an important function in many multimedia applications, such as the JPEG2000 and MPEG-4 standards, digital watermarking, and content-based multimedia information retrieval systems. The 2D DWT is more computationally intensive than the other functions in, for instance, the JPEG2000 standard. Therefore, different architectures have been proposed to process the 2D DWT. The goal of this paper is to review and evaluate the algorithms and the kinds of architectures used to process the 2D DWT: application-specific integrated circuits, field-programmable gate arrays, digital signal processors, graphics processing units, and General-Purpose Processors (GPPs). In addition, we implement the 2D DWT using different algorithms on GPPs enhanced with multimedia extensions. The experimental results show that the largest speedup of the vectorized 2D DWT over the scalar implementation is about 2.8 for the first-level decomposition. Furthermore, the characteristics of the 2D DWT and the disadvantages of existing architectures, such as GPPs enhanced with SIMD instructions, are discussed.
Cite this article
Shahbahrami, A. Algorithms and architectures for 2D discrete wavelet transform. J Supercomput 62, 1045–1064 (2012). https://doi.org/10.1007/s11227-012-0790-x
DOI: https://doi.org/10.1007/s11227-012-0790-x