DOI: 10.1145/3517207.3526982 · EuroSys Conference Proceedings · Poster

DyFiP: explainable AI-based dynamic filter pruning of convolutional neural networks

Published: 05 April 2022

Abstract

Filter pruning is one of the most effective ways to accelerate Convolutional Neural Networks (CNNs). Most existing work focuses on static pruning of CNN filters. For dynamic pruning of CNN filters, existing approaches switch between different branches of a CNN or exit early depending on the hardness of a sample. These approaches can reduce the average inference latency, but they cannot reduce the longest-path inference latency. In contrast, we present a novel approach to dynamic filter pruning that combines explainable AI with an early coarse prediction in the intermediate layers of a CNN. This coarse prediction is performed by a simple branch trained for top-k classification. The branch either predicts the output class with high confidence, in which case the remaining computations are skipped, or it narrows the output down to a subset of possible classes. After this coarse prediction, only the filters that are important for this subset of classes are evaluated. The importance of each filter for each output class is obtained using explainable AI. With this concept of dynamic pruning, we reduce not only the average inference latency but also the longest-path inference latency. Our proposed architecture for dynamic pruning can be deployed on different hardware platforms.
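The control flow described in the abstract can be illustrated with a minimal sketch. This is a hypothetical illustration, not the paper's implementation: the coarse branch is modeled as a single linear layer, `dyfip_forward` and its parameters are invented names, and the per-class filter importances (which the paper derives with an explainable-AI attribution method) are assumed to be precomputed offline.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def dyfip_forward(features, branch_weights, importance,
                  conf_thresh=0.9, k=3, top_m=4):
    """Hypothetical sketch of DyFiP-style dynamic inference.

    branch_weights: one weight vector per class for the coarse top-k branch.
    importance: importance[c][f] = relevance of filter f to class c,
        assumed precomputed with an XAI attribution method.
    Returns (early_exit_class_or_None, indices_of_filters_to_evaluate).
    """
    # Coarse prediction from intermediate features via the simple branch.
    logits = [sum(w * x for w, x in zip(row, features)) for row in branch_weights]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= conf_thresh:
        # Confident coarse prediction: exit early, skip remaining layers.
        return best, []
    # Otherwise narrow the output to the k most likely classes ...
    candidates = sorted(range(len(probs)), key=probs.__getitem__)[-k:]
    # ... and keep only the filters most relevant to those classes.
    num_filters = len(importance[0])
    scores = [max(importance[c][f] for c in candidates) for f in range(num_filters)]
    active = sorted(range(num_filters), key=scores.__getitem__)[-top_m:]
    return None, sorted(active)
```

In a real deployment the early-exit path would skip the remaining convolutional layers entirely, while the subset path would evaluate only the `active` filters in each subsequent layer, which bounds both the average-case and the longest-path latency.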





Published In

EuroMLSys '22: Proceedings of the 2nd European Workshop on Machine Learning and Systems
April 2022
121 pages
ISBN: 978-1-4503-9254-9
DOI: 10.1145/3517207
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Poster

Conference

EuroSys '22

Acceptance Rates

Overall acceptance rate: 18 of 26 submissions (69%)



Article Metrics

  • Downloads (last 12 months): 44
  • Downloads (last 6 weeks): 5

Reflects downloads up to 25 Jan 2025


Cited By

  • (2024) Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning. ACM Computing Surveys 56(10): 1-40. DOI: 10.1145/3657283. Online: 14 May 2024.
  • (2024) Leveraging Temporal Patterns: Automated Augmentation to Create Temporal Early Exit Networks for Efficient Edge AI. IEEE Access 12: 169787-169804. DOI: 10.1109/ACCESS.2024.3497158.
  • (2024) Approximate Computing: Concepts, Architectures, Challenges, Applications, and Future Directions. IEEE Access 12: 146022-146088. DOI: 10.1109/ACCESS.2024.3467375.
  • (2024) Hardware-Aware Evolutionary Explainable Filter Pruning for Convolutional Neural Networks. International Journal of Parallel Programming 52(1-2): 40-58. DOI: 10.1007/s10766-024-00760-5. Online: 22 Feb 2024.
  • (2023) Robust and Tiny Binary Neural Networks using Gradient-based Explainability Methods. Proceedings of the 3rd Workshop on Machine Learning and Systems: 87-93. DOI: 10.1145/3578356.3592595. Online: 8 May 2023.
  • (2023) Empirical evaluation of filter pruning methods for acceleration of convolutional neural network. Multimedia Tools and Applications 83(18): 54699-54727. DOI: 10.1007/s11042-023-17656-0. Online: 7 Dec 2023.
  • (2022) MOSP: Multi-Objective Sensitivity Pruning of Deep Neural Networks. 2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC): 1-8. DOI: 10.1109/IGSC55832.2022.9969374. Online: 24 Oct 2022.
