[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article
Open access

One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

Published: 15 December 2021 Publication History

Abstract

Convolutional neural networks (CNNs) are used in numerous real-world applications such as vision-based autonomous driving and video content analysis. To run CNN inference on various target devices, hardware-aware neural architecture search (NAS) is crucial. A key requirement of efficient hardware-aware NAS is the fast evaluation of inference latencies in order to rank different architectures. While building a latency predictor for each target device has been commonly used in state of the art, this is a very time-consuming process, lacking scalability in the presence of extremely diverse devices. In this work, we address the scalability challenge by exploiting latency monotonicity --- the architecture latency rankings on different devices are often correlated. When strong latency monotonicity exists, we can re-use architectures searched for one proxy device on new target devices, without losing optimality. In the absence of strong latency monotonicity, we propose an efficient proxy adaptation technique to significantly boost the latency monotonicity. Finally, we validate our approach and conduct experiments with devices of different platforms on multiple mainstream search spaces, including MobileNet-V2, MobileNet-V3, NAS-Bench-201, ProxylessNAS and FBNet. Our results highlight that, by using just one proxy device, we can find almost the same Pareto-optimal architectures as the existing per-device NAS, while avoiding the prohibitive cost of building a latency predictor for each device.

References

[1]
Mohamed S Abdelfattah, Abhinav Mehrotra, Łukasz Dudziak, and Nicholas Donald Lane. Zero-cost proxies for lightweight NAS. In ICLR, 2021.
[2]
AI-Benchmark. Performance of mobile phones. http://ai-benchmark.com/ranking_detailed.html.
[3]
Haldun Akoglu. User's guide to correlation coefficients. Turkish Journal of Emergency Medicine, 18(3):91 -- 93, 2018.
[4]
Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, and Quoc Le. Understanding and simplifying one-shot architecture search. In ICML, 2018.
[5]
Gabriel Bender, Hanxiao Liu, Bo Chen, Grace Chu, Shuyang Cheng, Pieter-Jan Kindermans, and Quoc V. Le. Can weight sharing outperform random architecture search? An investigation with TuNAS. In CVPR, 2020.
[6]
Ermao Cai, Da-Cheng Juan, Dimitrios Stamoulis, and Diana Marculescu. NeuralPower: Predict and deploy energy-efficient convolutional neural networks. In ACML, 2017.
[7]
Han Cai. Latency lookup tables of mobile devices. https://file.lzhu.me/hancai/.
[8]
Han Cai. Latency lookup tables of mobile devices and GPUs. https://file.lzhu.me/LatencyTools/tvm_lut/.
[9]
Han Cai, Chuang Gan, and Song Han. Once for all: Train one network and specialize it for efficient deployment. In ICLR, 2019.
[10]
Han Cai, Ligeng Zhu, and Song Han. ProxylessNas: Direct neural architecture search on target task and hardware. In ICLR, 2019.
[11]
Wuyang Chen, Xinyu Gong, and Zhangyang Wang. Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. In ICLR, 2021.
[12]
Hsin-Pai Cheng, Tunhou Zhang, Yukun Yang, Feng Yan, Harris Teague, Yiran Chen, and Hai Li. MSNet: Structural wired neural architecture search for internet of things. In ICCV Workshop, 2019.
[13]
Grace Chu, Okan Arikan, Gabriel Bender, Weijun Wang, Achille Brighton, Pieter-Jan Kindermans, Hanxiao Liu, Berkin Akin, Suyog Gupta, and Andrew Howard. Discovering multi-hardware mobile models via architecture search, 2020.
[14]
Xiaoliang Dai, Alvin Wan, Peizhao Zhang, Bichen Wu, Zijian He, Zhen Wei, Kan Chen, Yuandong Tian, Matthew Yu Yu, Peter Vajda, and Joseph E. Gonzalez. Fbnetv3: Joint architecture-recipe search using predictor pretraining. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16276--16285, 2021.
[15]
Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, et al. ChamNet: Towards efficient network design through platform-aware model adaptation. In CVPR, 2019.
[16]
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, 2009.
[17]
Xuanyi Dong and Yi Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In ICLR, 2020.
[18]
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In ICLR, 2021.
[19]
Lukasz Dudziak, Thomas Chau, Mohamed S. Abdelfattah, Royson Lee, Hyeji Kim, and Nicholas D. Lane. BRP-NAS: Prediction-based nas using GCNs. In NeurIPS, 2020.
[20]
Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. Neural architecture search: A survey. Journal of Machine Learning Research, 20(55):1--21, 2019.
[21]
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. http://www.deeplearningbook.org.
[22]
Google. Tensorflow lite image classification app. https://www.tensorflow.org/lite/models/image_classification/overview.
[23]
Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, and Jian Sun. Single path one-shot neural architecture search with uniform sampling. In ECCV, 2020.
[24]
Mark Hill and Vijay Janapa Reddi. Gables: A roofline model for mobile SoCs. In HPCA, 2019.
[25]
Andrey Ignatov, Radu Timofte, Andrei Kulik, Seungsoo Yang, Ke Wang, Felix Baum, Max Wu, Lirong Xu, and Luc Van Gool. Ai benchmark: All about deep learning on smartphones in 2019. In ICCVW, 2019.
[26]
Weiwen Jiang, Lei Yang, Sakyasingha Dasgupta, Jingtong Hu, and Yiyu Shi. Standing on the shoulders of giants: Hardware and neural architecture co-search with hot start. IEEE Transactions on Computer-Aided Design of Integrated CIrcuits and Systems, 2020.
[27]
Sheng-Chun Kao, Arun Ramamurthy, and Tushar Krishna. Generative design of hardware-aware dnns. 2020.
[28]
Hayeon Lee, Sewoong Lee, Song Chong, and Sung Ju Hwang. HELP: hardware-adaptive efficient latency predictor for nas via meta-learning. In NeurIPS, 2021.
[29]
Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, and Yingyan Lin. HW-NAS-Bench: Hardware-aware neural architecture search benchmark. In ICLR, 2021.
[30]
Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. Progressive neural architecture search. In ECCV, 2018.
[31]
Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable architecture search. In ICLR, 2019.
[32]
Bingqian Lu, Jianyi Yang, and Shaolei Ren. Poster: Scaling up deep neural network optimization for edge inference. In IEEE/ACM Symposium on Edge Computing (SEC), 2020.
[33]
Qing Lu, Weiwen Jiang, Xiaowei Xu, Yiyu Shi, and Jingtong Hu. On neural architecture search for resource-constrained hardware platforms. In ICCAD, 2019.
[34]
Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-yao Huang, Zhihui Li, Xiaojiang Chen, and Xin Wang. A comprehensive survey of neural architecture search: Challenges and solutions. ACM Comput. Surv., 54(4), May 2021.
[35]
Binxin Ru, Xingchen Wan, Xiaowen Dong, and Michael Osborne. Neural architecture search using Bayesian optimisation with weisfeiler-lehman kernel. In ICLR, 2021.
[36]
Manas Sahni, Shreya Varshini, Alind Khare, and Alexey Tumanov. Compofa textendash compound once-for-all networks for faster multi-platform deployment. In International Conference on Learning Representations, 2021.
[37]
Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In CVPR, 2018.
[38]
Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, and Tong Zhang. Multi-objective neural srchitecture search via predictive network performance optimization. arXiv preprint arXiv:1911.09336, 2019.
[39]
Dimitrios Stamoulis, Ruizhou Ding, Di Wang, Dimitrios Lymberopoulos, Bodhi Priyantha, Jie Liu, and Diana Marculescu. Single-path NAS: Designing hardware-efficient ConvNets in less than 4 hours. In ECML-PKDD, 2019.
[40]
Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V. Le. MnasNet: Platform-aware neural architecture search for mobile. In CVPR, 2019.
[41]
Mingxing Tan and Quoc Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In ICML, 2019.
[42]
Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, and Song Han. HAT: Hardware-aware transformers for efficient natural language processing. In ACL, 2020.
[43]
Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, and Song Han. APQ: Joint search for network architecture, pruning and quantization policy. In CVPR, 2020.
[44]
Samuel Williams, Andrew Waterman, and David Patterson. Roofline: an insightful visual performance model for multicore architectures. Communications of the ACM, 2009.
[45]
Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, and Kurt Keutzer. FBNet: Hardware-aware efficient ConvNet design via differentiable neural architecture search. In CVPR, 2019.
[46]
Carole-Jean Wu, David Brooks, Kevin Chen, Douglas Chen, Sy Choudhury, Marat Dukhan, Kim Hazelwood, Eldad Isaac, Yangqing Jia, Bill Jia, Tommer Leyvand, Hao Lu, Yang Lu, Lin Qiao, Brandon Reagen, Joe Spisak, Fei Sun, Andrew Tulloch, Peter Vajda, Xiaodong Wang, Yanghan Wang, Bram Wasti, Yiming Wu, Ran Xian, Sungjoo Yoo, and Peizhao Zhang. Machine learning at Facebook: Understanding inference at the edge. In HPCA, 2019.
[47]
Tien-Ju Yang, Andrew Howard, Bo Chen, Xiao Zhang, Alec Go, Mark Sandler, Vivienne Sze, and Hartwig Adam. Netadapt: Platform-aware neural network adaptation for mobile applications. In ECCV, 2018.
[48]
Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, and Quoc Le. Bignas: Scaling up neural architecture search with big single-stage models. In ECCV, 2020.
[49]
Li Lyna Zhang, Shihao Han, Jianyu Wei, Ningxin Zheng, Ting Cao, Yuqing Yang, and Yunxin Liu. https://github.com/microsoft/nn-meter.
[50]
Li Lyna Zhang, Shihao Han, Jianyu Wei, Ningxin Zheng, Ting Cao, Yuqing Yang, and Yunxin Liu. nn-meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices. In MobiSys, 2021.
[51]
Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, and Tian Guo. Few-shot neural architecture search. In ICML, 2021.
[52]
Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. In ICLR, 2017.

Cited By

View all
  • (2024)Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future EnvisionACM Transactions on Embedded Computing Systems10.1145/370172824:1(1-100)Online publication date: 24-Oct-2024
  • (2024)Graft: Efficient Inference Serving for Hybrid Deep Learning With SLO Guarantees via DNN Re-AlignmentIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2023.334051835:2(280-296)Online publication date: 1-Feb-2024
  • (2024)Domino-Pro-Max: Toward Efficient Network Simplification and Reparameterization for Embedded Hardware SystemsIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.340154443:12(4532-4545)Online publication date: Dec-2024
  • Show More Cited By

Index Terms

  1. One Proxy Device Is Enough for Hardware-Aware Neural Architecture Search

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image Proceedings of the ACM on Measurement and Analysis of Computing Systems
      Proceedings of the ACM on Measurement and Analysis of Computing Systems  Volume 5, Issue 3
      POMACS
      December 2021
      435 pages
      EISSN:2476-1249
      DOI:10.1145/3506735
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 15 December 2021
      Published in POMACS Volume 5, Issue 3

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. automl
      2. hardware-aware
      3. neural architecture search
      4. scalability

      Qualifiers

      • Research-article

      Funding Sources

      • NSF

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)374
      • Downloads (Last 6 weeks)36
      Reflects downloads up to 19 Dec 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future EnvisionACM Transactions on Embedded Computing Systems10.1145/370172824:1(1-100)Online publication date: 24-Oct-2024
      • (2024)Graft: Efficient Inference Serving for Hybrid Deep Learning With SLO Guarantees via DNN Re-AlignmentIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2023.334051835:2(280-296)Online publication date: 1-Feb-2024
      • (2024)Domino-Pro-Max: Toward Efficient Network Simplification and Reparameterization for Embedded Hardware SystemsIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2024.340154443:12(4532-4545)Online publication date: Dec-2024
      • (2024)An Active Learning based Latency Prediction Approach for Neural Network Architecture2024 4th International Conference on Neural Networks, Information and Communication (NNICE)10.1109/NNICE61279.2024.10498710(967-971)Online publication date: 19-Jan-2024
      • (2024)Inference latency prediction for CNNs on heterogeneous mobile devices and ML frameworksPerformance Evaluation10.1016/j.peva.2024.102429165(102429)Online publication date: Aug-2024
      • (2023)AIO-PProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence10.1609/aaai.v37i8.26101(9180-9189)Online publication date: 7-Feb-2023
      • (2023)Me, My Health, and My Watch: How Children with ADHD Understand Smartwatch Health DataACM Transactions on Computer-Human Interaction10.1145/357700830:4(1-25)Online publication date: 12-Sep-2023
      • (2023)AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference With TransformersIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2023.327399242:11(4038-4051)Online publication date: 8-May-2023
      • (2023)Review of Inference Time Prediction Approaches of DNN: Emphasis on Service robots with cloud-edge-device architecture2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)10.1109/ROBIO58561.2023.10354654(1-6)Online publication date: 4-Dec-2023
      • (2023)Prediction of Inference Energy on CNN Accelerators Supporting Approximate Circuits2023 26th International Symposium on Design and Diagnostics of Electronic Circuits and Systems (DDECS)10.1109/DDECS57882.2023.10139724(45-50)Online publication date: 3-May-2023
      • Show More Cited By

      View Options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Login options

      Full Access

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media