Performance comparison of SIMD implementations of the discrete wavelet transform
Shahbahrami et al., 2005
- Document ID
- 7357601446004338052
- Author
- Shahbahrami A
- Juurlink B
- Vassiliadis S
- Publication year
- 2005
- Publication venue
- 2005 IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP'05)
Snippet
This paper focuses on SIMD implementations of the 2D discrete wavelet transform (DWT). The transforms considered are Daubechies' real-to-real method of four coefficients (Daub-4) and the integer-to-integer (5, 3) lifting scheme. Daub-4 is implemented using SSE and the …
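For orientation, here is a minimal scalar sketch of one level of a 1D integer-to-integer (5, 3) lifting step, written in the common JPEG 2000 style formulation. It is not the paper's SIMD code: the function names `lift53_forward` and `lift53_inverse`, the even-length restriction, and the mirrored boundary handling are assumptions made for illustration, and the snippet does not show the paper's exact rounding or boundary treatment.

```python
def lift53_forward(x):
    """One level of a 1D integer (5, 3) lifting transform (illustrative sketch).

    Assumes an even-length list of ints and mirrored (symmetric) boundaries.
    Returns (s, d): the low-pass and high-pass coefficient lists.
    """
    n = len(x)
    if n < 2 or n % 2 != 0:
        raise ValueError("expected an even-length input with at least 2 samples")
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample minus the floor-average of its two even neighbours.
    # The missing right neighbour at the end is mirrored onto the last even sample.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update: each even sample plus a rounded quarter of the neighbouring details.
    # The missing left detail at the start is mirrored onto d[0].
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    return s, d


def lift53_inverse(s, d):
    """Undo lift53_forward exactly by reversing the two lifting steps."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2) for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x


if __name__ == "__main__":
    samples = [10, 12, 14, 16, 18, 20, 22, 24]
    low, high = lift53_forward(samples)
    print(low, high)
    print(lift53_inverse(low, high) == samples)
```

Because every step uses only integer additions and shifts, the inverse reconstructs the input exactly. A 2D transform is commonly built by applying such a 1D step along the rows and then along the columns of an image.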
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/147—Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/141—Discrete Fourier transforms
- G06F17/142—Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
- G06F7/5443—Sum of products
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image, e.g. from bit-mapped to bit-mapped creating a different image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4084—Transform-based scaling, e.g. FFT domain scaling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding, e.g. from bit-mapped to non bit-mapped
- G06T9/007—Transform coding, e.g. discrete cosine transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration, e.g. from bit-mapped to bit-mapped creating a similar image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11698773B2 (en) | | Accelerated mathematical engine |
US7129962B1 (en) | | Efficient video processing method and system |
EP3093757B1 (en) | | Multi-dimensional sliding window operation for a vector processor |
JP2004038451A (en) | | Hadamard transformation processing method and device |
Abel et al. | | Applications tuning for streaming SIMD extensions |
Shahbahrami et al. | | Performance comparison of SIMD implementations of the discrete wavelet transform |
WO2020160608A1 (en) | | Highly parallel convolutional neural network |
Adámek et al. | | GPU fast convolution via the overlap-and-save method in shared memory |
Shahbahrami | | Algorithms and architectures for 2D discrete wavelet transform |
US6404934B1 (en) | | High speed image processing apparatus using a cascade of elongated filters programmed in a computer |
CN1268231A (en) | | Variable block size 2-dimensional inverse discrete cosine transform engine |
WO2002035470A1 (en) | | Image processing system with enhanced processing and memory management |
Shahbahrami et al. | | Matrix register file and extended subwords: two techniques for embedded media processors |
US6292814B1 (en) | | Methods and apparatus for implementing a sign function |
US12072799B2 (en) | | Programmable multi-level data access address generator |
Chang et al. | | Fast convolution kernels on Pascal GPU with high memory efficiency |
Barina et al. | | Parallel wavelet schemes for images: How to make the wavelet transform friendly to parallel architectures |
US6504959B1 (en) | | Image processing apparatus using a cascade of poly-point operations |
KR102510924B1 (en) | | Massively parallel, associative multiplier-accumulator |
Klosowski et al. | | Real-time image deconvolution on the GPU |
Barina et al. | | Minimum memory vectorisation of wavelet lifting |
Tao et al. | | Optimizing cache performance of the discrete wavelet transform using a visualization tool |
Shahbahrami et al. | | A comparison of two SIMD implementations of the 2D discrete wavelet transform |
Shahbahrami et al. | | SIMD architectural enhancements to improve the performance of the 2D discrete wavelet transform |
Sekhar | | Precision-Aware and Quantization of Lifting Based DWT Hardware Architecture |