Abstract
This paper proposes a new discrete cosine transform (DCT) processor. The micro-rotation section of the architecture is built on a shared-resource improved coordinate rotation digital computer (CORDIC) unit within an enhanced, scalable DCT engine. To reduce resource usage and area, all micro-rotation operations are implemented as a single unified block in overlapped form. Because it uses a single processing element, the memory-based architecture reduces power consumption. The processor's inputs and outputs are both in order, which is an advantage of the proposed design. The controller is distributed and has low complexity. Furthermore, the shared-resource implementation of the CORDIC-II unit reduces both the number and the size of the add and shift operations, so the processor performs well at short word lengths in comparison with state-of-the-art DCT processors. Compared to existing prominent DCT processors, the proposed processor achieves better results with limited hardware resources.
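For readers unfamiliar with micro-rotations, the following is a minimal sketch of a conventional CORDIC rotation built only from shifts and adds. It is not the CORDIC-II algorithm and not the shared-resource unit proposed in the paper. The function name `cordic_rotate` and the parameters `iterations` and `frac_bits` are illustrative assumptions. The final gain compensation is done in floating point for brevity; hardware designs typically fold it into a scaling stage.

```python
import math

def cordic_rotate(x, y, angle, iterations=16, frac_bits=16):
    """Rotate (x, y) by `angle` radians using conventional CORDIC.

    Illustrative only: classic CORDIC, not CORDIC-II. Converges for
    |angle| up to roughly 1.74 rad.
    """
    scale = 1 << frac_bits
    # Convert inputs to fixed point (hypothetical word format).
    xi = int(round(x * scale))
    yi = int(round(y * scale))
    zi = int(round(angle * scale))

    # Elementary angles atan(2^-i), stored in fixed point.
    atans = [int(round(math.atan(2.0 ** -i) * scale))
             for i in range(iterations)]

    # Accumulated gain of the micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    for i in range(iterations):
        d = 1 if zi >= 0 else -1          # direction of this micro-rotation
        # One micro-rotation: two shifts and two adds/subtracts.
        xi, yi = xi - d * (yi >> i), yi + d * (xi >> i)
        zi -= d * atans[i]                # residual angle still to rotate

    # Undo the fixed-point scaling and the CORDIC gain.
    return xi / (scale * gain), yi / (scale * gain)
```

Calling `cordic_rotate(1.0, 0.0, math.pi / 8)` approximates `(cos(pi/8), sin(pi/8))`, a rotation angle of the kind that appears in fast DCT factorizations. Shortening `frac_bits` shows how accuracy degrades at short word lengths, which is the regime the abstract targets.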
About this article
Cite this article
Khalili Sadaghiani, A., Forouzandeh, B. Low-power hardware-efficient memory-based DCT processor. J Real-Time Image Proc 19, 1105–1121 (2022). https://doi.org/10.1007/s11554-022-01243-x
DOI: https://doi.org/10.1007/s11554-022-01243-x