research-article

AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications

Authors: Sicong Liu, Junzhao Du

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 5, Issue 1

Article No.: 24, Pages 1 - 22

https://doi.org/10.1145/3448125

Published: 30 March 2021

Abstract

Many mobile and wearable applications powered by deep learning (e.g., DNNs) now sense their ambient surroundings continuously and unobtrusively to enhance all aspects of human lives. To enable robust and private mobile sensing, DNNs tend to be deployed locally on resource-constrained mobile devices via model compression. Current practice, whether hand-crafted DNN compression techniques that optimize DNN-related performance (e.g., parameter size) or on-demand DNN compression methods that optimize hardware-dependent metrics (e.g., latency), cannot run locally online because it requires offline retraining to ensure accuracy. Moreover, none of these approaches supports runtime-adaptive compression that accounts for the dynamic deployment context of mobile applications. To address these challenges, we present AdaSpring, a context-adaptive and self-evolutionary DNN compression framework that enables runtime-adaptive DNN compression locally online. Specifically, it presents the ensemble training of a retraining-free and self-evolutionary network that integrates multiple alternative DNN compression configurations (i.e., compressed architectures and weights). It then introduces a runtime search strategy to quickly find the most suitable compression configuration and evolve the corresponding weights. Evaluated on five tasks across three platforms and in a real-world case study, AdaSpring achieves up to 3.1x latency reduction and 4.2x energy efficiency improvement in DNNs compared to hand-crafted compression techniques, while incurring only ≤ 6.2 ms runtime-evolution latency.
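The idea of searching at runtime for a suitable compression configuration can be illustrated with a minimal sketch. This is not AdaSpring's actual search algorithm, which the abstract does not detail. It assumes a small set of pre-profiled candidates and simply picks the most accurate one that fits the current latency and energy budgets. The `Candidate` type, the `pick_config` function, the configuration names, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One pre-profiled compression configuration (all values are made up)."""
    name: str
    accuracy: float    # validation accuracy in [0, 1]
    latency_ms: float  # profiled inference latency on the device
    energy_mj: float   # profiled energy per inference


def pick_config(
    candidates: Sequence[Candidate],
    latency_budget_ms: float,
    energy_budget_mj: float,
) -> Optional[Candidate]:
    """Return the most accurate candidate that satisfies both budgets."""
    feasible = [
        c for c in candidates
        if c.latency_ms <= latency_budget_ms and c.energy_mj <= energy_budget_mj
    ]
    if not feasible:
        return None  # no configuration fits the current context
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical candidate pool; a real system would profile these on-device.
CANDIDATES = [
    Candidate("full", 0.92, 40.0, 120.0),
    Candidate("pruned-50", 0.90, 22.0, 70.0),
    Candidate("factorized", 0.88, 15.0, 45.0),
    Candidate("pruned-75", 0.85, 10.0, 30.0),
]

if __name__ == "__main__":
    # The budgets stand in for a changing deployment context (e.g. battery level).
    print(pick_config(CANDIDATES, latency_budget_ms=25.0, energy_budget_mj=80.0))
```

Because the candidates are profiled ahead of time, selection in this sketch is a single pass over a short list, which is one way a context change could be handled without retraining.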

Index Terms

AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications
1. Human-centered computing
  1. Ubiquitous and mobile computing
    1. Ubiquitous and mobile computing systems and tools

Information & Contributors

Information

Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies Volume 5, Issue 1

March 2021

1272 pages

EISSN: 2474-9567

DOI: 10.1145/3459088

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 March 2021

Published in IMWUT Volume 5, Issue 1

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Key R&D Program of China
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities
National Science Fund for Distinguished Young Scholars

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 69
Total Downloads: 433
Downloads (Last 12 months): 70
Downloads (Last 6 weeks): 8

Reflects downloads up to 11 Dec 2024

Citations

Cited By

Yin W, Xu D, Huang G, Zhang Y, Wei S, Xu M, Liu X, Shu Y, Liu J, Tan R, He Y, Chen J (2024). PieBridge: Fast and Parameter-Efficient On-Device Training via Proxy Networks. Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 126-140. DOI: 10.1145/3666025.3699327. Online publication date: 4-Nov-2024
https://dl.acm.org/doi/10.1145/3666025.3699327
Panopoulos I, Venieris S, Venieris I (2024). CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads. ACM Transactions on Embedded Computing Systems 23:4, 1-32. DOI: 10.1145/3665868. Online publication date: 29-Jun-2024
https://dl.acm.org/doi/10.1145/3665868
Zhang Q, Lan Y, Guo K, Wang D (2024). Lipwatch: Enabling Silent Speech Recognition on Smartwatches using Acoustic Sensing. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:2, 1-29. DOI: 10.1145/3659614. Online publication date: 15-May-2024
https://dl.acm.org/doi/10.1145/3659614
Zhang Q, Liu K, Wang D (2024). Sensing to Hear through Memory. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:2, 1-31. DOI: 10.1145/3659598. Online publication date: 15-May-2024
https://dl.acm.org/doi/10.1145/3659598
Xu C, Zheng X, Ren Z, Liu L, Ma H (2024). UHead. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:1, 1-28. DOI: 10.1145/3643551. Online publication date: 6-Mar-2024
https://dl.acm.org/doi/10.1145/3643551
Wang S, Zhong L, Fu Y, Chen L, Ren J, Zhang Y (2024). UFace. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:1, 1-27. DOI: 10.1145/3643546. Online publication date: 6-Mar-2024
https://dl.acm.org/doi/10.1145/3643546
Cui M, Xie B, Wang Q, Xiong J, Ganesan D, Lane N, Shi W (2024). EVLeSen: In-Vehicle Sensing with EV-Leaked Signal. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, 679-693. DOI: 10.1145/3636534.3649389. Online publication date: 29-May-2024
https://dl.acm.org/doi/10.1145/3636534.3649389
Wang X, Fan G, Ding R, Jin H, Hao W, Tao M (2024). Water Salinity Sensing with UAV-Mounted IR-UWB Radar. ACM Transactions on Sensor Networks 20:4, 1-37. DOI: 10.1145/3633515. Online publication date: 11-May-2024
https://dl.acm.org/doi/10.1145/3633515
Zhang Y, Han F, Yang P, Feng Y, Yan Y, Guan R (2024). Wi-Cyclops: Room-Scale WiFi Sensing System for Respiration Detection Based on Single-Antenna. ACM Transactions on Sensor Networks 20:4, 1-24. DOI: 10.1145/3632958. Online publication date: 11-May-2024
https://dl.acm.org/doi/10.1145/3632958
Wei Y, Xiong J, Liu H, Yu Y, Pan J, Du J (2024). AdaStreamLite. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:4, 1-29. DOI: 10.1145/3631460. Online publication date: 12-Jan-2024
https://dl.acm.org/doi/10.1145/3631460
