Abstract
Hardware-Aware Neural Architecture Search (HA-NAS) is an attractive approach for discovering network architectures that balance task accuracy and deployment efficiency. In an iterative search algorithm, inference time is typically determined at every step by profiling candidate architectures directly on hardware. This limits the scalability of the search process, because access to specialized devices is required throughout. The ability to assess inference time without hardware access is therefore an important enabler for deploying deep learning on resource-constrained embedded devices. Previous work estimates inference time by summing the individual contributions of an architecture's parts. In this work, we propose block-level inference time estimators whose predictions are combined into a network-level inference time estimate. Each estimator is trained on a dataset of independently sampled and profiled instances of one architecture block. Our experiments on isolated blocks commonly found in classification architectures show that gradient boosted decision trees serve as an accurate surrogate for inference time: their Spearman correlation coefficient exceeds 0.98 on all tested platforms. When such blocks are connected in sequence, the sum of the block estimates correlates with the measured network inference time, with Spearman correlation coefficients above 0.71 on the evaluated CPUs and an accelerator platform. Furthermore, we demonstrate the applicability of our Surrogate Model (SM) methodology in its intended HA-NAS context. To this end, we evaluate and compare two HA-NAS processes: one that relies on hardware-in-the-loop profiling and one that leverages block-level surrogate models. Both processes yield similar Pareto-optimal architectures, showing that our method achieves a comparable task-performance outcome without requiring hardware access for profiling during the architecture search.
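The abstract summarizes the method at a high level; below is a minimal, self-contained sketch of the idea, assuming scikit-learn's GradientBoostingRegressor as the gradient boosted decision tree surrogate and a synthetic profile_block function in place of on-device profiling. The block features, sampling ranges, and data in this sketch are illustrative assumptions, not the authors' implementation.

```python
# Sketch of block-level surrogate models for inference time estimation:
# fit a gradient boosted decision tree on profiled block instances, then
# estimate network-level latency as the sum of block-level predictions.
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def profile_block(config):
    # Stand-in for on-device profiling of one sampled block instance;
    # the latency model here is fabricated purely for illustration.
    c_in, c_out, kernel, stride = config
    return 1e-4 * c_in * c_out * kernel**2 / stride + rng.normal(0, 0.01)

def sample_config():
    # Illustrative block configuration: (in channels, out channels, kernel, stride).
    return (rng.integers(16, 256), rng.integers(16, 256),
            rng.choice([1, 3, 5]), rng.choice([1, 2]))

# Dataset of independently sampled and profiled block instances.
configs = [sample_config() for _ in range(500)]
latencies = [profile_block(c) for c in configs]

# One surrogate per block type; here a single block type for brevity.
surrogate = GradientBoostingRegressor(n_estimators=200)
surrogate.fit(np.array(configs, dtype=float), latencies)

def estimate_network_latency(block_configs, model):
    # Network-level estimate: sum of block-level surrogate predictions.
    return float(model.predict(np.array(block_configs, dtype=float)).sum())

# Rank-correlation check between estimated and "measured" network latency
# over a handful of candidate architectures built from blocks in sequence.
candidates = [[sample_config() for _ in range(rng.integers(3, 8))]
              for _ in range(50)]
estimated = [estimate_network_latency(c, surrogate) for c in candidates]
measured = [sum(profile_block(b) for b in c) for c in candidates]
rho, _ = spearmanr(estimated, measured)
print(f"Spearman rho between estimated and measured network latency: {rho:.3f}")
```

In an actual HA-NAS setup, one surrogate would be trained per block type and per target platform, and profile_block would be replaced by measurements collected on the target device.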
Notes
1. Note that the experiments are meant to demonstrate our methodology, not to benchmark specific hardware and deployment toolchains; hence, those details are left out.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Stolle, K., Vogel, S., van der Sommen, F., Sanberg, W. (2023). Block-Level Surrogate Models for Inference Time Estimation in Hardware-Aware Neural Architecture Search. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13717. Springer, Cham. https://doi.org/10.1007/978-3-031-26419-1_28
DOI: https://doi.org/10.1007/978-3-031-26419-1_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26418-4
Online ISBN: 978-3-031-26419-1