DOI: 10.1145/3666025.3699364
research-article
Open access

Intermittent Inference: Trading a 1% Accuracy Loss for a 1.9x Throughput Speedup

Published: 04 November 2024 Publication History

Abstract

We present INTERCEPT, a compile-time toolchain that enables manifold throughput improvements for intermittent DNN inference on IoT devices, in exchange for an accuracy loss of at most 1%. Intermittently-computing IoT devices rely on ambient energy harvesting and compute opportunistically, whenever energy is available. They use non-volatile memory (NVM) to persist intermediate results in anticipation of energy failures. Without requiring changes to existing models, and by exploiting the features of STT-MRAM as NVM, INTERCEPT optimizes the placement and configuration of state persistence operations during inference. This happens off-line with no user intervention, while enforcing the 1% bound on accuracy loss. Our results, obtained across three platforms and six diverse neural networks, indicate that INTERCEPT provides a 40% energy gain in a single inference process, on average. With the same energy budget, this yields a 1.9x throughput speedup.
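
To make the relation between per-inference energy gain and throughput concrete, here is a minimal sketch of the arithmetic. It assumes a fixed energy budget, so throughput scales with the inverse of the energy spent per inference. The abstract does not say how its averages were computed. The function name and the per-configuration gains below are hypothetical and are not taken from the paper.

```python
def throughput_speedup(energy_gain: float) -> float:
    """Relative throughput under a fixed energy budget when each
    inference costs (1 - energy_gain) of the baseline energy."""
    if not 0.0 <= energy_gain < 1.0:
        raise ValueError("energy_gain must be in [0, 1)")
    return 1.0 / (1.0 - energy_gain)


# Hypothetical per-configuration energy gains (NOT taken from the paper).
hypothetical_gains = [0.10, 0.40, 0.70]

mean_gain = sum(hypothetical_gains) / len(hypothetical_gains)
speedups = [throughput_speedup(g) for g in hypothetical_gains]
mean_speedup = sum(speedups) / len(speedups)

print(f"speedup at the mean gain: {throughput_speedup(mean_gain):.2f}x")
print(f"mean of per-configuration speedups: {mean_speedup:.2f}x")
```

Because 1 / (1 - g) is convex in g, the mean of the per-configuration speedups is never smaller than the speedup computed at the mean gain. An average energy gain and an average throughput speedup therefore need not map onto each other through a single formula.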



    Published In

    SenSys '24: Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems
    November 2024
    950 pages
    ISBN: 9798400706974
    DOI: 10.1145/3666025
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. intermittent computing
    2. deep neural network (DNN) inference
    3. energy efficiency

    Acceptance Rates

    Overall Acceptance Rate 174 of 867 submissions, 20%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 252
    • Downloads (Last 12 months): 252
    • Downloads (Last 6 weeks): 115
    Reflects downloads up to 05 Jan 2025
