DOI: 10.5555/2830840.2830854

Big/little deep neural network for ultra low power inference

Published: 04 October 2015

Abstract

Deep neural networks (DNNs) have recently proven their effectiveness in complex data analyses such as object/speech recognition. As their applications expand to mobile devices, their energy efficiency is becoming critical. In this paper, we propose a novel concept called big/LITTLE DNN (BL-DNN), which significantly reduces the energy consumption required for DNN execution at a negligible loss of inference accuracy. The BL-DNN consists of a little DNN (consuming low energy) and a full-fledged big DNN. To reduce energy consumption, the BL-DNN aims at avoiding execution of the big DNN whenever possible. The key idea is to execute the little DNN first (without executing the big DNN) and simply use its result as the final inference result, as long as that result is estimated to be accurate. If the little DNN's result is not considered accurate, the big DNN is executed to produce the final inference result. This approach reduces total energy consumption by obtaining the inference result with only the little, energy-efficient DNN in most cases, while maintaining a similar level of inference accuracy by selectively executing the big DNN. We present design-time and runtime methods to control the execution of the big DNN under a trade-off between energy consumption and inference accuracy. Experiments with state-of-the-art DNNs for ImageNet and MNIST show that the proposed BL-DNN offers up to 53.7% (ImageNet) and 94.1% (MNIST) reductions in energy consumption at losses of only 0.90% (ImageNet) and 0.12% (MNIST) in inference accuracy, respectively.
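To make the control flow concrete, below is a minimal Python sketch of the gating idea. It is not the authors' implementation: `little_dnn` and `big_dnn` are assumed to be callables returning class-probability vectors, and testing the little DNN's maximum softmax score against a threshold `tau` is just one plausible way to estimate whether its result is accurate.

```python
import numpy as np

def bl_dnn_infer(x, little_dnn, big_dnn, tau=0.9):
    """Big/little inference sketch (hypothetical interface).

    Run the energy-efficient little DNN first; if its top softmax
    score clears the threshold `tau`, accept its prediction and skip
    the big DNN. Otherwise fall back to the full big DNN.
    """
    probs = little_dnn(x)              # cheap forward pass
    if np.max(probs) >= tau:           # result estimated to be accurate
        return int(np.argmax(probs))   # big DNN execution avoided
    return int(np.argmax(big_dnn(x)))  # fall back to the big DNN

def pick_threshold(val_set, little_dnn, big_dnn, max_acc_loss=0.01):
    """Design-time sketch: sweep candidate thresholds and return the
    smallest `tau` (i.e., skipping the big DNN most often) whose
    validation accuracy stays within `max_acc_loss` of running the
    big DNN alone. All names here are illustrative assumptions.
    """
    big_acc = np.mean([int(np.argmax(big_dnn(x))) == y
                       for x, y in val_set])
    for tau in np.arange(0.5, 1.01, 0.05):   # ascending sweep
        acc = np.mean([bl_dnn_infer(x, little_dnn, big_dnn, tau) == y
                       for x, y in val_set])
        if acc >= big_acc - max_acc_loss:
            return float(tau)
    return 1.0  # degenerate case: always run the big DNN
```

Lower values of `tau` skip the big DNN more often, saving more energy at some cost in accuracy; it is exactly this trade-off that the paper manages with its design-time and runtime control methods.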




    Published In

    CODES '15: Proceedings of the 10th International Conference on Hardware/Software Codesign and System Synthesis
    October 2015
    242 pages
    ISBN: 9781467383219

    Publisher

    IEEE Press


    Author Tags

    1. deep neural network
    2. low power

    Qualifiers

    • Research-article

    Conference

    ESWEEK'15: Eleventh Embedded System Week
    October 4 - 9, 2015
    Amsterdam, The Netherlands

    Acceptance Rates

    Overall acceptance rate: 280 of 864 submissions, 32%


    Cited By

    • (2022) "Human Activity Recognition on Microcontrollers with Quantized and Adaptive Deep Neural Networks," ACM Transactions on Embedded Computing Systems, 21(4):1-28. DOI: 10.1145/3542819. Online publication date: 23-Aug-2022.
    • (2022) "Embedding Temporal Convolutional Networks for Energy-efficient PPG-based Heart Rate Monitoring," ACM Transactions on Computing for Healthcare, 3(2):1-25. DOI: 10.1145/3487910. Online publication date: 3-Mar-2022.
    • (2021) "ApproxNet: Content and Contention-Aware Video Object Classification System for Embedded Clients," ACM Transactions on Sensor Networks, 18(1):1-27. DOI: 10.1145/3463530. Online publication date: 5-Oct-2021.
    • (2021) "An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning Level," ACM Transactions on Design Automation of Electronic Systems, 26(6):1-20. DOI: 10.1145/3460972. Online publication date: 1-Aug-2021.
    • (2019) "Lightweight prediction based big/little design for efficient neural network inference," Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, p. 356. DOI: 10.1145/3318216.3363457. Online publication date: 7-Nov-2019.
    • (2019) "Dynamic Beam Width Tuning for Energy-Efficient Recurrent Neural Networks," Proceedings of the 2019 Great Lakes Symposium on VLSI, pp. 69-74. DOI: 10.1145/3299874.3317974. Online publication date: 13-May-2019.
    • (2019) "Pack and Detect," Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pp. 150-156. DOI: 10.1145/3297001.3297020. Online publication date: 3-Jan-2019.
    • (2018) "Dynamic Bit-width Reconfiguration for Energy-Efficient Deep Learning Hardware," Proceedings of the International Symposium on Low Power Electronics and Design, pp. 1-6. DOI: 10.1145/3218603.3218611. Online publication date: 23-Jul-2018.
    • (2017) "MEC," Proceedings of the 34th International Conference on Machine Learning, Vol. 70, pp. 815-824. DOI: 10.5555/3305381.3305466. Online publication date: 6-Aug-2017.
    • (2017) "The cascading neural network," Knowledge and Information Systems, 52(3):791-814. DOI: 10.1007/s10115-017-1029-1. Online publication date: 1-Sep-2017.
