DOI: 10.1145/3378184.3378224

An Experimental Study of the Impact of Pre-Training on the Pruning of a Convolutional Neural Network

Published: 17 February 2020

Abstract

In recent years, deep neural networks have achieved wide success across application domains. However, they require substantial computational and memory resources, which severely hinders their deployment, notably on mobile devices or in real-time applications. Neural networks typically involve a large number of parameters, corresponding to the weights of the network. These parameters, obtained through a training process, determine the performance of the network, but they are also highly redundant. Pruning methods attempt to reduce the size of the parameter set by identifying and removing irrelevant weights. In this paper, we examine the impact of the training strategy on pruning efficiency. Two training modalities are considered and compared: (1) fine-tuning a pre-trained network and (2) training from scratch. Experimental results obtained on four datasets (CIFAR10, CIFAR100, SVHN and Caltech101) and two CNN architectures (VGG16 and MobileNet) demonstrate that a network pre-trained on a large corpus (e.g. ImageNet) and then fine-tuned on a target dataset can be pruned much more efficiently (up to 80% parameter reduction) than the same network trained from scratch.
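To make the setting concrete, the sketch below sets up the comparison the abstract describes: the same VGG16 architecture is instantiated either with ImageNet pre-trained weights (to be fine-tuned) or with random initialization (to be trained from scratch), and is then pruned. This is a minimal sketch assuming PyTorch/torchvision and magnitude-based (L1) unstructured pruning via torch.nn.utils.prune; the abstract does not specify the paper's exact pruning criterion or training schedule, and the training step is deliberately elided.

```python
# Minimal sketch (not the paper's exact protocol): build a VGG16 that is either
# initialized from ImageNet weights (fine-tuning regime) or randomly initialized
# (training from scratch), prune 80% of its weights, and measure the sparsity.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

def build_vgg16(pretrained: bool) -> nn.Module:
    """Return a VGG16 with 10 output classes (e.g. CIFAR10), optionally ImageNet pre-trained."""
    weights = models.VGG16_Weights.IMAGENET1K_V1 if pretrained else None
    model = models.vgg16(weights=weights)
    model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 10)
    return model

def prune_global(model: nn.Module, amount: float = 0.8) -> nn.Module:
    """Globally remove the `amount` fraction of weights with the smallest magnitude."""
    to_prune = [(m, "weight") for m in model.modules()
                if isinstance(m, (nn.Conv2d, nn.Linear))]
    prune.global_unstructured(to_prune,
                              pruning_method=prune.L1Unstructured,
                              amount=amount)
    return model

def sparsity(model: nn.Module) -> float:
    """Fraction of zero-valued weights over all prunable layers."""
    zeros, total = 0, 0
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            zeros += int(torch.sum(m.weight == 0))
            total += m.weight.nelement()
    return zeros / total

if __name__ == "__main__":
    for pretrained in (True, False):
        model = build_vgg16(pretrained)
        # ... train here: fine-tune (pretrained=True) or train from scratch ...
        prune_global(model, amount=0.8)
        print(f"pretrained={pretrained}: sparsity after pruning = {sparsity(model):.2%}")
```

The paper's finding is that, at a comparable accuracy, the fine-tuned network tolerates much higher pruning ratios than the one trained from scratch; the sketch only sets up that comparison and omits the training and evaluation loops.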

        Published In

        APPIS 2020: Proceedings of the 3rd International Conference on Applications of Intelligent Systems
        January 2020, 214 pages
        ISBN: 9781450376303
        DOI: 10.1145/3378184

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. CNN compression
        2. Fine-tuning
        3. Neural Network Pruning

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        APPIS 2020


        Cited By

        • (2022) One-Cycle Pruning: Pruning Convnets With Tight Training Budget. 2022 IEEE International Conference on Image Processing (ICIP), 4128-4132. https://doi.org/10.1109/ICIP46576.2022.9897980. Online publication date: 16-Oct-2022.
        • (2022) Methods for Pruning Deep Neural Networks. IEEE Access, 10, 63280-63300. https://doi.org/10.1109/ACCESS.2022.3182659. Online publication date: 2022.
        • (2022) Improve Convolutional Neural Network Pruning by Maximizing Filter Variety. Image Analysis and Processing – ICIAP 2022, 379-390. https://doi.org/10.1007/978-3-031-06427-2_32. Online publication date: 15-May-2022.
        • (2021) Literature Review of Deep Network Compression. Informatics, 8(4), 77. https://doi.org/10.3390/informatics8040077. Online publication date: 17-Nov-2021.
        • (2021) A Smoothed LASSO-Based DNN Sparsification Technique. IEEE Transactions on Circuits and Systems I: Regular Papers, 68(10), 4287-4298. https://doi.org/10.1109/TCSI.2021.3097765. Online publication date: Oct-2021.
