Abstract
With the recent, rapid increase in the computing capability of Graphics Processing Units (GPUs), using the GPU as a co-processor to the CPU has become fundamental to achieving high overall throughput. Nvidia's Compute Unified Device Architecture (CUDA) can greatly benefit computationally expensive programs written in the single-instruction, multiple-thread (SIMT) style. Video encoding is, to an extent, an excellent example of an application that can see impressive performance gains from CUDA optimization. This paper presents a novel, portable, fault-tolerant, parallelized software implementation of the Motion JPEG 2000 (MJPEG 2000) reference encoder using CUDA. Each major structural and computational unit of JPEG 2000 is discussed in the CUDA framework, and results are provided where required. Our experimental results demonstrate that the GPU-based implementation runs 49 times faster than the original implementation on the CPU. For the standard frame resolution of 2048 × 1080, the new encoder can encode up to 11 frames/second.
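To put the quoted figures in perspective, the following minimal sketch converts them into pixel throughput and an implied CPU baseline. It uses only the numbers stated in the abstract. The function and variable names are hypothetical, and the CPU frame rate is derived under the assumption (not stated in the abstract) that the 49-fold speed-up and the 11 frames/second figure were measured on the same 2048 × 1080 workload.

```python
# Figures quoted in the abstract.
WIDTH, HEIGHT = 2048, 1080   # frame resolution in pixels
GPU_FPS = 11                 # "up to 11 frames/second" on the GPU
SPEEDUP = 49                 # "49 times faster" than the CPU implementation


def pixel_throughput(width: int, height: int, fps: float) -> float:
    """Pixels encoded per second at the given resolution and frame rate."""
    return width * height * fps


def implied_cpu_fps(gpu_fps: float, speedup: float) -> float:
    """CPU frame rate implied by the speed-up.

    ASSUMPTION: the speed-up and the GPU frame rate refer to the same
    workload; the abstract does not say so explicitly.
    """
    return gpu_fps / speedup


pixels_per_frame = WIDTH * HEIGHT
gpu_pixels_per_second = pixel_throughput(WIDTH, HEIGHT, GPU_FPS)
cpu_fps = implied_cpu_fps(GPU_FPS, SPEEDUP)

print(f"pixels per frame:        {pixels_per_frame}")
print(f"GPU throughput:          {gpu_pixels_per_second / 1e6:.1f} Mpixel/s")
print(f"GPU time per frame:      {1000 / GPU_FPS:.1f} ms")
print(f"implied CPU frame rate:  {cpu_fps:.3f} frames/s")
print(f"implied CPU time/frame:  {1 / cpu_fps:.2f} s")
```

Each frame holds 2,211,840 pixels, so 11 frames/second corresponds to roughly 24.3 megapixels per second. Under the stated assumption, the original CPU encoder would manage about 0.22 frames/second, or roughly 4.5 seconds per frame.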
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sanketh, D., Niyogi, R. (2010). An Accelerated MJPEG 2000 Encoder Using Compute Unified Device Architecture. In: Ranka, S., et al. Contemporary Computing. IC3 2010. Communications in Computer and Information Science, vol 95. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14825-5_4
DOI: https://doi.org/10.1007/978-3-642-14825-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14824-8
Online ISBN: 978-3-642-14825-5
eBook Packages: Computer Science (R0)