Abstract
Decoding high-quality video in real time is becoming increasingly difficult as resolutions rise. This paper proposes a novel hardware/software (HW/SW) partitioning that uses powerful SIMD (single instruction, multiple data) instructions for a real-time AVS video decoder. Because most of the computation-intensive key functions are optimized with SIMD instructions rather than dedicated hardware, the workload can be balanced between hardware and software and the performance of the video decoder improves, while generality and programmability are preserved. The proposed method is implemented on a 32-bit dual-issue RISC processor with a 256-bit vector extension. Experimental results on AVS conformance test sequences show that the decoder system supports real-time decoding of AVS 1080p video at 30 fps and improves performance by more than 100 times compared with the original processor without the proposed method. Moreover, the approach could easily be applied to other video decoders, such as H.264 and VC-1.
References
Audio Video Coding Standard Workgroup of China (AVS), Advanced Coding of Audio and Video—Part 2: Video, Dec. 2004
Chen L, Cong M, Huang J, Li L, Liu H, Qian C (2012) A novel HW/SW partitioning with SIMD instructions for AVS video decoder. In: Proceedings of the IEEE 7th International Conference on Networking, Architecture and Storage (NAS), pp 273–277
Cheung N-M, Fan X, Au O, Kung M-C (2010) Video coding on multicore graphics processors. IEEE Signal Process Mag 27(2):79–89
Feng L, Rui G, Shu S, Xu C (2006) HW/SW co-design and implementation of multi-standard video decoding. In: Proceedings of the 2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia, Oct., pp 87–92
Gschwind M, Hofstee H, Flachs B, Hopkin M, Watanabe Y, Yamazaki T (2006) Synergistic processing in cell’s multicore architecture. IEEE Micro 26(2):10–24
Hu W, Chen Y (2010) GS464V: a high-performance low-power XPU with 512-bit vector extension. In: Proceedings of the 22nd IEEE Symposium on High Performance Chips (HOT CHIPS’10)
Hu W, Wang J, Gao X, Chen Y, Liu Q, Li G (2009) Godson-3: a scalable multi-core RISC processor with x86 emulation support. IEEE Micro 29(2):17–29
Iwata K, Irita T, Mochizuki S, Ueda H, Ehama M, Kimura M, Takemura J, Matsumoto K, Yamamoto E, Teranuma T, Takakubo K, Watanabe H, Yoshioka S, Hattori T (2010) A 342 mW mobile application processor with full-HD multi-standard video codec and tile-based address-translation circuits. IEEE J Solid State Circuits 45(1):59–68
Jia H, Zhang P, Xie D, Gao W (2006) An AVS HDTV video decoder architecture employing efficient HW/SW partitioning. IEEE Trans Consum Electron 52(4):1447–1453
Jian GA, Chu JC, Huang TY, Chang TC, Guo JI (2009) A system architecture exploration on the configurable HW/SW co-design for H.264 video decoder. In: IEEE International Symposium on Circuits and Systems (ISCAS), May, pp 2237–2240
Jian GA, Huang TY, Chu JC, Guo JI (2009) Optimization of VC-1/H.264/AVS video decoders on embedded processors. In: Sixth International Conference on Information Technology: New Generations (ITNG), Apr., pp 1313–1318
Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG (2004) Information technology—coding of audio-visual objects, part 10: Advanced Video Coding
Krommydas K, Tsoublekas G, Antonopoulos C, Bellas N (2010) Mapping and optimization of the AVS video decoder on a high performance chip multiprocessor. In: IEEE International Conference on Multimedia and Expo (ICME), July, pp 896–901
Lappalainen V, Hamalainen T, Liuha P (2002) Overview of research efforts on media ISA extensions and their usage in video coding. IEEE Trans Circuits Syst Video Technol 12(8):660–670
Lee J-Y, Lee J-J, Park S (2010) Multi-core platform for an efficient H.264 and VC-1 video decoding based on macroblock row-level parallelism. IET Circuits Devices Syst 4(2):147–158
Lee J, Moon S, Sung W (2004) H.264 decoder optimization exploiting SIMD instructions. In: Proceedings of IEEE Asia-Pacific Conference on Circuits and Systems, vol 2, Dec, pp 1149–1152
Mori T, Ueda Y, Nonogaki N, Terazawa T, Sroka M, Fujita T, Kodaka T, Morita K, Arakida H, Miura T, Okuda Y, Kizu T, Tsuboi Y (2009) A power, performance scalable eight-cores media processor for mobile multimedia applications. IEEE J Solid State Circuits 44(11):2957–2965
SMPTE Standard (2006) VC-1 Compressed Video Bitstream Format and Decoding Process (SMPTE 421M-2006)
Peng C, Huang C, Wang R, Dai J, Zhao Y (2004) Architecture of AVS hardware decoding system. In: Proceedings of International Symposium on Intelligent Multimedia, Video and Speech Processing, Oct., pp 306–309
Sheng B, Gao W, Xie D, Wu D (2006) An efficient VLSI architecture of VLD for AVS HDTV decoder. IEEE Trans Consum Electron 52(2):696–701
Tachikake K, Togawa N, Miyaoka Y, Choi J, Yanagisawa M, Ohtsuki T (2003) A hardware/software partitioning algorithm for SIMD processor cores. In: Proceedings of the 2003 Asia and South Pacific Design Automation Conference (ASP-DAC), pp 135–140
Togawa N, Yanagisawa M, Ohtsuki T (2000) A hardware/software cosynthesis system for digital signal processor cores with two types of register files. IEICE Trans Fundam Electron Commun Comput Sci E83-A(3):442–451
Wei L, Yong-en C (2009) VLD design for AVS video decoder. In: Second International Workshop on Knowledge Discovery and Data Mining (WKDD), pp 648–651
Woh M, Seo S, Mahlke S, Mudge T, Chakrabarti C, Flautner K (2010) AnySP: anytime anywhere anyway signal processing. IEEE Micro 30(1):81–91
Acknowledgments
This work is partially supported by the National Sci&Tech Major Project (Nos. 2009ZX01028-002-003, 2009ZX01029-001-003, 2010ZX01036-001-002), the National Natural Science Foundation of China (Nos. 60921002, 61003064, 61050002, 61070025, 61100163, 61133004, 61173001, 61222204), the National High Technology Development 863 Program of China (Nos. 2012AA010901, 2012AA011002), and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDA06010401-02).
Additional information
The basic idea of this paper appeared as an extended abstract in the Proceedings of the IEEE 7th International Conference on Networking, Architecture and Storage (NAS): L. Chen, M. Cong, J. Huang, L. Li, H. Liu and C. Qian, “A Novel HW/SW Partitioning with SIMD Instructions for AVS Video Decoder” [2]. This version adds a detailed analysis and presents more performance and power results.
Cite this article
Chen, L., Cong, M., Huang, J. et al. A novel hardware/software partitioning for SIMD-based real-time AVS video decoder. Multimed Tools Appl 71, 1651–1671 (2014). https://doi.org/10.1007/s11042-012-1296-5
DOI: https://doi.org/10.1007/s11042-012-1296-5