Abstract
Dynamic inference networks improve computational efficiency by executing only a subset of network components, i.e., an executing path, conditioned on each input sample. Prevalent methods typically assign a router to each computational block, so that the block can be either executed or skipped. However, such inference mechanisms make the optimization of dynamic inference networks prone to instability. First, a dynamic inference network is more sensitive to its routers than to its computational blocks. Second, the components executed by the network vary from sample to sample, resulting in unstable feature evolution throughout the network. To alleviate these problems, we propose SP-Nets, which slow down the progress from two aspects. First, we design a dynamic auxiliary module that exploits historical information to slow down the progress of the routers. Second, we regularize the directions of feature evolution across the network to smooth feature extraction from the perspective of information flow. We conduct extensive experiments on three widely used benchmarks and show that our proposed SP-Nets achieve state-of-the-art performance in terms of both efficiency and accuracy.
H. Wang and W. Zhang—Equal contribution to this work.
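To make the abstract's two ideas concrete, below is a minimal PyTorch sketch of how they could fit together. Everything here is illustrative rather than the paper's actual implementation: the Gumbel-softmax gate is one common differentiable-routing estimator; the EMA (mean-teacher-style) copy of each router is our assumed reading of the dynamic auxiliary module that injects historical information; and the cosine penalty between consecutive feature updates is one plausible form of the feature-evolution regularizer. Names such as `SlowProgressNet`, `Router`, and `momentum` are ours, not the paper's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Router(nn.Module):
    """Tiny per-block gate: decides execute vs. skip from pooled features."""

    def __init__(self, channels: int):
        super().__init__()
        self.fc = nn.Linear(channels, 2)  # logits for {skip, execute}

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.fc(F.adaptive_avg_pool2d(x, 1).flatten(1))
        # Gumbel-softmax keeps the hard binary decision differentiable
        # (a common estimator in dynamic routing; the paper's may differ).
        return F.gumbel_softmax(logits, tau=1.0, hard=True)[:, 1]


class SlowProgressNet(nn.Module):
    """Gated residual blocks + EMA routers + direction regularizer.

    Assumes each block preserves the feature shape (stride-1 residuals).
    """

    def __init__(self, blocks: nn.ModuleList, channels: int,
                 momentum: float = 0.999):
        super().__init__()
        self.blocks = blocks
        self.routers = nn.ModuleList(Router(channels) for _ in blocks)
        # EMA ("historical") copies of the routers: a mean-teacher-style
        # stand-in for the dynamic auxiliary module in the abstract.
        self.ema_routers = nn.ModuleList(Router(channels) for _ in blocks)
        for r, r_ema in zip(self.routers, self.ema_routers):
            r_ema.load_state_dict(r.state_dict())
            for p in r_ema.parameters():
                p.requires_grad_(False)
        self.momentum = momentum

    @torch.no_grad()
    def update_ema(self):
        # Slowly track the live routers; call after each optimizer step.
        for r, r_ema in zip(self.routers, self.ema_routers):
            for p, q in zip(r.parameters(), r_ema.parameters()):
                q.mul_(self.momentum).add_(p, alpha=1 - self.momentum)

    def forward(self, x: torch.Tensor):
        consist_loss, smooth_loss, prev_delta = 0.0, 0.0, None
        for block, router, ema_router in zip(self.blocks, self.routers,
                                             self.ema_routers):
            gate = router(x)
            with torch.no_grad():
                ema_gate = ema_router(x)
            # Pull the live router toward its historical (EMA) decisions,
            # slowing its progress.
            consist_loss = consist_loss + F.mse_loss(gate, ema_gate)
            out = x + gate.view(-1, 1, 1, 1) * block(x)  # skip = identity
            # Penalize abrupt turns in the feature-evolution direction
            # between consecutive blocks (one plausible smoothness term).
            delta = (out - x).flatten(1)
            if prev_delta is not None:
                cos = F.cosine_similarity(delta, prev_delta, dim=1)
                smooth_loss = smooth_loss + (1 - cos).mean()
            prev_delta = delta
            x = out
        return x, consist_loss, smooth_loss
```

In training, one would add weighted versions of `consist_loss` and `smooth_loss` to the task loss and call `update_ema()` after every optimizer step, analogous to mean-teacher training; a rate penalty on the gates would additionally control the computational budget.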
Acknowledgement
This work was supported by the Alibaba Innovative Research (AIR) Program, the Alibaba Research Intern Program, the National Key Research and Development Program of China under Grant 2020AAA0107400, the Zhejiang Provincial Natural Science Foundation of China under Grant LR19F020004, and the National Natural Science Foundation of China under Grant U20A20222.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, H. et al. (2022). SP-Net: Slowly Progressing Dynamic Inference Networks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_14