research-article

Open access

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Authors:

Shaolei Ren

Proceedings of the ACM on Measurement and Analysis of Computing Systems, Volume 5, Issue 3

Article No.: 34, Pages 1 - 34

https://doi.org/10.1145/3491046

Published: 15 December 2021

Abstract

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. Although state-of-the-art approaches commonly build a latency predictor for each target device, doing so is very time-consuming and does not scale to extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity: the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can reuse architectures searched for one proxy device on new target devices without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique that significantly boosts the latency monotonicity. Finally, we validate our approach with experiments on devices of different platforms and on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.
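The idea of latency monotonicity can be illustrated with a small sketch. The abstract does not name a metric, so the code below assumes Spearman's rank correlation as one plausible way to quantify how well two devices agree on latency rankings, and it uses a simple (lower latency, higher accuracy) dominance rule for the Pareto set. The helper names and all numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def pareto_set(latency: Sequence[float], accuracy: Sequence[float]) -> Set[int]:
    """Indices of architectures not dominated in (lower latency, higher accuracy)."""
    keep = set()
    for i in range(len(latency)):
        dominated = any(
            latency[j] <= latency[i] and accuracy[j] >= accuracy[i]
            and (latency[j] < latency[i] or accuracy[j] > accuracy[i])
            for j in range(len(latency))
        )
        if not dominated:
            keep.add(i)
    return keep


# Made-up numbers for five hypothetical architectures (not from the paper).
accuracy = [70.1, 72.4, 71.0, 74.8, 73.9]
proxy_ms = [10.0, 14.0, 16.0, 25.0, 30.0]
target_ms = [21.0, 33.0, 35.0, 61.0, 80.0]  # same latency ordering as the proxy
other_ms = [21.0, 35.0, 33.0, 80.0, 61.0]   # two adjacent pairs swapped

srcc_same = spearman(proxy_ms, target_ms)
srcc_other = spearman(proxy_ms, other_ms)
same_front = pareto_set(proxy_ms, accuracy) == pareto_set(target_ms, accuracy)
other_front = pareto_set(proxy_ms, accuracy) == pareto_set(other_ms, accuracy)
```

In this toy data the first target device ranks the architectures exactly as the proxy does, so the coefficient is 1 and both devices yield the same Pareto set. The second device swaps two pairs of rankings, the coefficient drops to 0.8, and the Pareto set changes, which is the situation the abstract's proxy adaptation step is meant to address.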

Index Terms

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search
1. Computing methodologies
  1. Artificial intelligence
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Information & Contributors

Information

Published In

Proceedings of the ACM on Measurement and Analysis of Computing Systems Volume 5, Issue 3

POMACS

December 2021

435 pages

EISSN:2476-1249

DOI:10.1145/3506735

Editors: Augustin Chaintreau (Columbia University), Leana Golubchik (University of Southern California), Zhi-Li Zhang (University of Minnesota)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

NSF

Bibliometrics

Total Citations: 14
Total Downloads: 1,138
Downloads (Last 12 months): 374
Downloads (Last 6 weeks): 36

Reflects downloads up to 19 Dec 2024
