Abstract
Convolutional neural networks (CNNs) have achieved great success in many computer vision tasks. However, it is difficult to deploy CNN models on low-cost devices with limited power budgets, because most existing CNN models are computationally expensive. Therefore, CNN model compression and acceleration have become a hot research topic in the deep learning area. Typical schemes for speeding up the feed-forward process with only a slight accuracy loss include parameter pruning and sharing, low-rank factorization, compact convolutional filters, and knowledge distillation. In this study, we propose a general acceleration scheme that replaces floating-point multiplication with integer addition. The motivation is that every floating-point number can be expressed as the sum of an exponential series (signed powers of two), so the multiplication of two floating-point numbers can be converted into additions of their exponents. In the experiments, we apply the proposed scheme directly to AlexNet, VGG, and ResNet for image classification, and to Faster R-CNN for object detection. The results on ImageNet and PASCAL VOC show that the proposed quantization scheme performs well even when only a single exponential term is kept. Moreover, we analyze the efficiency of our method on mainstream FPGAs. The experimental results show that the proposed quantization scheme achieves acceleration on FPGAs with only a slight accuracy loss.
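To make the idea concrete, below is a minimal Python sketch of the underlying arithmetic (not the authors' implementation; the helper names power_of_two_terms and mul_by_exponent_add are illustrative). A weight is approximated by a short signed sum of powers of two, so multiplying an activation by that weight reduces to adding small integers to the activation's exponent; in hardware these exponent updates can be done with integer adders instead of floating-point multipliers.

```python
import math

def power_of_two_terms(w, num_terms=1):
    """Greedily decompose a weight w into a signed sum of powers of two,
    w ~= sum_i s_i * 2**k_i.  num_terms=1 is the single-exponential-term
    case mentioned in the abstract."""
    terms, residual = [], float(w)
    for _ in range(num_terms):
        if residual == 0.0:
            break
        sign = 1.0 if residual > 0.0 else -1.0
        k = round(math.log2(abs(residual)))      # nearest power-of-two exponent
        terms.append((sign, k))
        residual -= sign * 2.0 ** k
    return terms

def mul_by_exponent_add(x, terms):
    """Approximate w * x: multiplying x by 2**k only shifts x's exponent,
    i.e. an integer addition on the exponent field, not a float multiply."""
    mantissa, e_x = math.frexp(x)                # x = mantissa * 2**e_x
    return sum(s * math.ldexp(mantissa, e_x + k) for s, k in terms)

# Example: approximate 0.30 * 1.7 with one, then two, exponential terms.
w, x = 0.30, 1.7
print(mul_by_exponent_add(x, power_of_two_terms(w, 1)), w * x)  # ~0.425 vs 0.51
print(mul_by_exponent_add(x, power_of_two_terms(w, 2)))         # ~0.531, closer
```

Keeping more terms tightens the approximation at the cost of more additions; the one-term setting above corresponds to the "only one exponential term" configuration evaluated in the abstract.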
Additional information
This work was supported by the National Natural Science Foundation of China (Grant Nos. 41971424, 61701191), the Key Technical Project of Xiamen Ocean Bureau (Grant No. 18CZB033HJ11), the Natural Science Foundation of Fujian Province (Grant Nos. 2019J01712, 2020J01701), the Key Technical Project of Xiamen Science and Technology Bureau (Grant Nos. 3502Z20191018, 3502Z20201007, 3502Z20191022, 3502Z20203057), and the Science and Technology Project of Education Department of Fujian Province (Grant Nos. JAT190321, JAT190318, JAT190315).
About this article
Cite this article
Cai, G., Yang, S., Du, J. et al. Convolution without multiplication: A general speed up strategy for CNNs. Sci. China Technol. Sci. 64, 2627–2639 (2021). https://doi.org/10.1007/s11431-021-1936-2