
Block-Level Surrogate Models for Inference Time Estimation in Hardware-Aware Neural Architecture Search

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13717)

Abstract

Hardware-Aware Neural Architecture Search (HA-NAS) is an attractive approach for discovering network architectures that balance task accuracy and deployment efficiency. In an iterative search algorithm, inference time is typically determined at every step by directly profiling architectures on hardware. This imposes limitations on the scalability of search processes because access to specialized devices for profiling is required. As such, the ability to assess inference time without hardware access is an important aspect to enable deep learning on resource-constrained embedded devices. Previous work estimates inference time by summing individual contributions of the architecture’s parts. In this work, we propose using block-level inference time estimators to find the network-level inference time. Individual estimators are trained on collected datasets of independently sampled and profiled architecture block instances. Our experiments on isolated blocks commonly found in classification architectures show that gradient boosted decision trees serve as an accurate surrogate for inference time. More specifically, their Spearman correlation coefficient exceeds 0.98 on all tested platforms. When such blocks are connected in sequence, the sum of all block estimations correlates with the measured network inference time, having Spearman correlation coefficients above 0.71 on evaluated CPUs and an accelerator platform. Furthermore, we demonstrate the applicability of our Surrogate Model (SM) methodology in its intended HA-NAS context. To this end, we evaluate and compare two HA-NAS processes: one that relies on profiling via hardware-in-the-loop and one that leverages block-level surrogate models. We find that both processes yield similar Pareto-optimal architectures. This shows that our method facilitates a similar task-performance outcome without relying on hardware access for profiling during architecture search.
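To make the estimation pipeline concrete, the following is a minimal sketch, assuming scikit-learn gradient boosted regressors as the block-level surrogates and a synthetic stand-in for the profiled latencies; the block feature set (kernel size, channel counts, stride), the toy latency function, and the three-block network are illustrative assumptions rather than the authors' implementation.

import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def make_block_dataset(n=500):
    # Hypothetical block configurations: [kernel_size, in_channels, out_channels, stride].
    X = np.column_stack([
        rng.choice([3, 5, 7], n),
        rng.integers(16, 256, n),
        rng.integers(16, 256, n),
        rng.choice([1, 2], n),
    ])
    # Stand-in for on-device latency in ms; in practice y would come from profiling
    # independently sampled block instances on the target hardware.
    y = 1e-5 * X[:, 0] ** 2 * X[:, 1] * X[:, 2] / X[:, 3] ** 2 + rng.normal(0.0, 0.05, n)
    return X, y

# Train one surrogate per block position/type in the search space.
surrogates = []
for _ in range(3):
    X, y = make_block_dataset()
    surrogates.append(GradientBoostingRegressor().fit(X, y))

def estimate_network_latency(block_configs):
    # Network-level estimate = sum of the per-block surrogate predictions.
    return sum(
        model.predict(np.asarray(cfg, dtype=float).reshape(1, -1))[0]
        for model, cfg in zip(surrogates, block_configs)
    )

print(estimate_network_latency([[3, 32, 64, 1], [5, 64, 128, 2], [3, 128, 128, 1]]))

# Rank-correlation check of the kind reported above (Spearman's rho).
X_test, y_test = make_block_dataset(200)
rho, _ = spearmanr(surrogates[0].predict(X_test), y_test)
print(f"block-level Spearman rho: {rho:.3f}")

Within a HA-NAS loop, per-block estimates of this kind replace hardware-in-the-loop profiling when ranking candidate architectures by inference time.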


Notes

  1. Note that the experiments are meant to demonstrate our methodology. The experiments are not intended to benchmark specific hardware and deployment toolchains, hence those details are left out.


Author information


Corresponding author

Correspondence to Willem Sanberg.



Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Stolle, K., Vogel, S., van der Sommen, F., Sanberg, W. (2023). Block-Level Surrogate Models for Inference Time Estimation in Hardware-Aware Neural Architecture Search. In: Amini, M.R., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science (LNAI), vol 13717. Springer, Cham. https://doi.org/10.1007/978-3-031-26419-1_28


  • DOI: https://doi.org/10.1007/978-3-031-26419-1_28

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26418-4

  • Online ISBN: 978-3-031-26419-1

  • eBook Packages: Computer Science, Computer Science (R0)
