Efficient Hardware Implementation of Cellular Neural Networks with Incremental Quantization and Early Exit

Published: 01 December 2018

Abstract

Cellular neural networks (CeNNs) have been widely adopted in image processing tasks. Recently, various hardware implementations of CeNNs have emerged in the literature, with the Field Programmable Gate Array (FPGA) being one of the most popular choices due to its high flexibility and low time-to-market. However, CeNNs typically involve extensive recursive computation. For example, simply processing an image of 1,920 × 1,080 pixels requires 4–8 giga floating-point multiplications (for 3 × 3 templates and 50–100 iterations), which must be completed in a timely manner for real-time applications. To address this issue, in this article, we propose a compressed CeNN framework for efficient FPGA implementations. It involves various techniques, such as incremental quantization and early exit, which significantly reduce computation demands while maintaining acceptable performance. In particular, incremental quantization quantizes the numbers in CeNN templates to powers of two, so that complex and expensive multiplications can be converted to simple and cheap shift operations, which require only a minimal number of registers and logical elements (LEs). While a similar concept has been explored in hardware implementations of Convolutional Neural Networks (CNNs), CeNNs have completely different computation patterns, which require different quantization and implementation strategies. Experimental results on FPGAs show that incremental quantization and early exit can achieve speedups of up to 7.8× and 8.3×, respectively, compared with state-of-the-art implementations, with almost no performance loss on four widely adopted applications. We also find that, unlike for CNNs, the optimal quantization strategies of CeNNs depend heavily on the application. We hope that our work can serve as a pioneer in the hardware optimization of CeNNs.
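
The powers-of-two idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only how a template weight might be rounded to a signed power of two and how the resulting multiplication reduces to a bit shift on fixed-point integers. It omits the incremental (stepwise) part of the quantization procedure. The function names `quantize_pow2` and `shift_mul`, the log-domain rounding rule, and the exponent range are all assumptions made for illustration.

```python
import math


def quantize_pow2(w, min_exp=-4, max_exp=3):
    """Round a weight to a signed power of two.

    Returns (sign, exp) so that the quantized weight is sign * 2**exp.
    A sign of 0 means the weight is exactly zero.
    The exponent range is a hypothetical choice, not taken from the paper.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    # Round in the log domain, then clamp to the allowed exponent range.
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return sign, exp


def shift_mul(x, sign, exp):
    """Multiply the integer x by sign * 2**exp using only shifts and negation."""
    if sign == 0:
        return 0
    # Left shift for non-negative exponents, right shift for negative ones.
    y = x << exp if exp >= 0 else x >> -exp
    return y if sign > 0 else -y


if __name__ == "__main__":
    sign, exp = quantize_pow2(0.24)   # nearest power of two is 2**-2 = 0.25
    print(sign, exp)                  # prints: 1 -2
    print(shift_mul(64, sign, exp))   # prints: 16 (the exact product is 15.36)
```

In hardware, the right shift corresponds to rewiring bits rather than instantiating a multiplier, which is the source of the register and LE savings the abstract refers to. The rounding error visible in the example (16 versus 15.36) is the accuracy cost that a quantization strategy has to keep acceptable.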

Published In

ACM Journal on Emerging Technologies in Computing Systems, Volume 14, Issue 4
Special Issue on Neuromorphic Computing
October 2018
164 pages
ISSN: 1550-4832
EISSN: 1550-4840
DOI: 10.1145/3294068
Editor: Yuan Xie
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 December 2018
Accepted: 01 August 2018
Revised: 01 April 2018
Received: 01 December 2017
Published in JETC Volume 14, Issue 4

Author Tags

  1. Cellular neural networks
  2. FPGA
  3. acceleration
  4. quantization

Qualifiers

  • Research-article
  • Research
  • Refereed
