Characterizing GPU Overclocking Faults

  • Conference paper
Computer Security – ESORICS 2021 (ESORICS 2021)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12972)

Abstract

Graphics Processing Units (GPUs) are powerful parallel processors that are becoming common on computers. They are used in many high-performance tasks such as crypto-mining and neural-network training. It is common to overclock a GPU to gain performance; however, this practice may introduce calculation faults. In our work, we lay the foundations for exploiting these faults by characterizing their formation and structure. We find that temperature is a contributing factor to the fault rate, but is not the sole cause. We also find that faults are a byte-wide phenomenon: individual bit-flips are rare. Surprisingly, we find that the vast majority of byte faults are in fact byte-flips: all 8 bits are simultaneously negated. Finally, we find strong evidence that faults are triggered by memory-remnant reads aligned to the 32-byte memory transaction size.

References

  1. Agoyan, M., Dutertre, J., Mirbaha, A., Naccache, D., Ribotta, A., Tria, A.: Single-bit DFA using multiple-byte laser fault injection. In: 2010 IEEE International Conference on Technologies for Homeland Security (HST), pp. 113–119 (2010)

  2. Agoyan, M., Dutertre, J.-M., Naccache, D., Robisson, B., Tria, A.: When clocks fail: on critical paths and clock faults. In: Gollmann, D., Lanet, J.-L., Iguchi-Cartigny, J. (eds.) CARDIS 2010. LNCS, vol. 6035, pp. 182–193. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12510-2_13

  3. ArchWiki. NVIDIA/Tips and tricks. https://wiki.archlinux.org/index.php/NVIDIA/Tips_and_tricks

  4. Barenghi, A., Bertoni, G.M., Breveglieri, L., Pellicioli, M., Pelosi, G.: Low voltage fault attacks to AES. In: 2010 IEEE International Symposium on Hardware-Oriented Security and Trust (HOST), pp. 7–12. IEEE (2010)

  5. Bialas, P., Strzelecki, A.: Benchmarking the cost of thread divergence in CUDA. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds.) PPAM 2015. LNCS, vol. 9573, pp. 570–579. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-32149-3_53

  6. Biham, E., Shamir, A.: Differential fault analysis of secret key cryptosystems. In: Kaliski, B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 513–525. Springer, Heidelberg (1997). https://doi.org/10.1007/BFb0052259

  7. Boneh, D., DeMillo, R.A., Lipton, R.J.: On the importance of checking cryptographic protocols for faults. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 37–51. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-69053-0_4

  8. Nvidia developer forum. Unified Memory vs Pinned Memory. https://forums.developer.nvidia.com/t/unified-memory-vs-pinned-host-memory-vs-gpu-global-memory/34640

  9. Dusart, P., Letourneux, G., Vivolo, O.: Differential fault analysis on A.E.S. In: Zhou, J., Yung, M., Han, Y. (eds.) ACNS 2003. LNCS, vol. 2846, pp. 293–306. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45203-4_23

  10. Ekbote, B., Hire, V., Mahajan, P., Sisodia, J.: Blockchain based remittances and mining using CUDA. In: 2017 International Conference On Smart Technologies for Smart Nation (SmartTechCon), pp. 908–911. IEEE (2017)

  11. Nvidia Forum. Run CUDA on dedicated GPU. https://forums.developer.nvidia.com/t/solved-run-cuda-on-dedicated-nvidia-gpu-while-connecting-monitors-to-intel-hd-graphics-is-this-possible/47690/2/

  12. Gawande, N.A., Daily, J.A., Siegel, C., Tallent, N.R., Vishnu, A.: Scaling deep learning workloads: NVIDIA DGX-1/Pascal and intel knights landing. Future Gener. Comput. Syst. 108, 1162–1172 (2020)

  13. Giraud, C.: DFA on AES. In: Dobbertin, H., Rijmen, V., Sowa, A. (eds.) AES 2004. LNCS, vol. 3373, pp. 27–41. Springer, Heidelberg (2005). https://doi.org/10.1007/11506447_4

  14. Gratchoff, J.: Proving the wild jungle jump. Technical report, University of Amsterdam (2015). https://homepages.staff.os3.nl/~delaat/rp/2014-2015/p48/report.pdf

  15. Harris, M.: Unified Memory in CUDA 6. https://developer.nvidia.com/blog/unified-memory-in-cuda-6/

  16. integralfx. DDR4 Overclocking Guide. https://github.com/integralfx/MemTestHelper/blob/master/DDR4

  17. Jiang, Z.H., Fei, Y., Kaeli, D.: A complete key recovery timing attack on a GPU. In: 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 394–405. IEEE (2016)

  18. Kemal. Scripts to overclock & start bitcoin miners on boot. https://gist.github.com/disq/995082

  19. Kovacs, B.: Nvidia overclock scripts. https://github.com/brandonkovacs/nvidia-overclock-scripts

  20. Landaverde, R., Zhang, T., Coskun, A.K., Herbordt, M.: An investigation of unified memory access performance in CUDA. In: 2014 IEEE High Performance Extreme Computing Conference (HPEC), pp. 1–6. IEEE (2014)

  21. Lapid, B., Wool, A.: Cache-attacks on the ARM TrustZone implementations of AES-256 and AES-256-GCM via GPU-based analysis. In: Cid, C., Jacobson Jr. M. (eds.) SAC 2018. LNCS, vol. 11349, pp. 235–256. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-10970-7_11

  22. Lee, S., Kim, Y., Kim, J., Kim, J.: Stealing webpages rendered on your browser by exploiting GPU vulnerabilities. In: 2014 IEEE Symposium on Security and Privacy, pp. 19–33. IEEE (2014)

  23. Liao, N., Cui, X., Liao, K., Wang, T., Yu, D., Cui, X.: Improving DFA attacks on AES with unknown and random faults. Sci. China Inf. Sci. 60(4), 1–14 (2016). https://doi.org/10.1007/s11432-016-0071-7

  24. Liu, Y., Cui, X., Cao, J., Zhang, X.: A hybrid fault model for differential fault attack on AES. In: 2017 IEEE 12th International Conference on ASIC (ASICON), pp. 784–787. IEEE (2017)

  25. Manavski, S.A.: CUDA compatible GPU as an efficient hardware accelerator for AES cryptography. In: 2007 IEEE International Conference on Signal Processing and Communications, pp. 65–68. IEEE (2007)

  26. Moro, N., Dehbaoui, A., Heydemann, K., Robisson, B., Encrenaz, E.: Electromagnetic fault injection: towards a fault model on a 32-bit microcontroller. In: 2013 Workshop on Fault Diagnosis and Tolerance in Cryptography, pp. 77–88. IEEE (2013)

  27. Murakami, T., Kasahara, R., Saito, T.: An implementation and its evaluation of password cracking tool parallelized on GPGPU. In: 2010 10th International Symposium on Communications and Information Technologies, pp. 534–538. IEEE (2010)

  28. Nvidia. Cuda-C-Programming-Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html/

  29. Nvidia. Nvidia System Management Interface. https://developer.nvidia.com/nvidia-system-management-interface/

  30. Nvidia. Using the nvidia-settings Utility. https://download.nvidia.com/XFree86/Linux-x86_64/396.51/README/nvidiasettings.html/

  31. Nvidia. Everything you need to know about unified memory. https://on-demand.gputechconf.com/gtc/2018/presentation/s8430-everything-you-need-to-know-about-unified-memory.pdf, 2018

  32. Piret, G., Quisquater, J.-J.: A differential fault attack technique against SPN structures, with application to the AES and Khazad. In: Walter, C.D., Koç, Ç.K., Paar, C. (eds.) CHES 2003. LNCS, vol. 2779, pp. 77–88. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45238-6_7

  33. Gerardo Ravago. CUDA bitcoin miner. https://github.com/geedo0/cuda_bitcoin_miner

  34. Jan S. CUDA-AES. https://github.com/franneck94/CUDA-AES

  35. Sabbagh, M., Fei, Y., Kaeli, D.: A novel GPU overdrive fault attack. In: 2020 57th ACM/IEEE Design Automation Conference (DAC), pp. 1–6 (2020)

  36. Selmane, N., Guilley, S., Danger, J.: Practical setup time violation attacks on AES. In: 2008 Seventh European Dependable Computing Conference, pp. 91–96 (2008)

  37. Online tech tips. How to overclock your GPU safely to boost performance. https://www.online-tech-tips.com/computer-tips/overclock-gpu-safely-boost-performance/

  38. George Thessalonikefs. Electromagnetic fault injection characterization. Master’s thesis, University of Amsterdam (2014). https://homepages.staff.os3.nl/~delaat/rp/2013-2014/p67/report.pdf

  39. Timmers, N., Mune, C.: Escalating privileges in Linux using voltage fault injection. In: 2017 Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), pp. 1–8 (2017)

  40. Timmers, N., Spruyt, A., Witteman, M.: Controlling PC on ARM using fault injection. In: 2016 Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC), pp. 25–35. IEEE (2016)

  41. Ville Timonen. GPU Burn. https://github.com/wilicc/gpu-burn

  42. Wong, H., Papadopoulou, M.-M., Sadooghi-Alvandi, M., Moshovos, A.: Demystifying GPU microarchitecture through microbenchmarking. In: 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), pp. 235–246. IEEE (2010)

  43. Zhu, Z., Kim, S., Rozhanski, Y., Hu, Y., Witchel, E., Silberstein, M.: Understanding the security of discrete GPUs. In: Proceedings of the General Purpose GPUs, pp. 1–11 (2017)

Author information

Corresponding authors

Correspondence to Eldad Zuberi or Avishai Wool.

A Appendix

A.1 CUDA Basics

CUDA kernels are declared using the \(\mathtt{\_\_global\_\_}\) declaration specifier and can be invoked using the syntax in Algorithm 2. Kernels are executed in blocks, where each block consists of multiple threads. The parameters numBlocks and threadsPerBlock specify the execution configuration. Each thread that executes the kernel is given unique thread/block IDs that are accessible within the kernel through built-in variables. All threads of a block reside on the same processor core and must share the memory resources of that core. Therefore, the number of threads per block is limited (up to 1024 on current GPUs). Instructions are issued and executed in groups of 32 threads, called warps.

[Algorithm 2: kernel invocation syntax (listing not reproduced)]
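As a minimal sketch of the declaration and launch syntax described above (an illustration, not the paper's Algorithm 2), the fragment below declares a kernel and launches it with an execution configuration. The kernel name vecAdd, the host helper addOnGpu, and the choice of 256 threads per block are assumptions made for this example; error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Kernel: each thread adds one pair of elements. The __global__ specifier
// marks a function that runs on the GPU and is launched from the host.
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    // Built-in variables give each thread a unique global index.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // guard: the last block may be only partly used
        c[i] = a[i] + b[i];
}

// Hypothetical host helper: copies the inputs to the device, launches the
// kernel, and copies the result back. Assumes a and b have equal length.
std::vector<float> addOnGpu(const std::vector<float> &a,
                            const std::vector<float> &b)
{
    int n = static_cast<int>(a.size());
    std::vector<float> c(n);
    if (n == 0) return c;

    size_t bytes = n * sizeof(float);
    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice);

    // Execution configuration: enough blocks to cover all n elements.
    int threadsPerBlock = 256;  // a multiple of the warp size, at most 1024
    int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vecAdd<<<numBlocks, threadsPerBlock>>>(dA, dB, dC, n);

    // This blocking copy waits for the kernel to finish on the default stream.
    cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return c;
}
```

Calling addOnGpu from a host main function with two equally sized vectors returns their element-wise sum. Rounding the block count up is why the kernel needs the bounds check on i.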

Thread blocks are required to execute independently: it must be possible to execute them in any order, in parallel or in series. Threads within a block can cooperate by sharing data through shared memory and by synchronizing their execution to coordinate memory accesses. Synchronization points can be declared using intrinsic functions, e.g., \(\mathtt{\_\_syncthreads()}\).
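The cooperation pattern described above can be sketched as a per-block sum: the threads of a block stage their inputs in shared memory, synchronize, and then reduce the values in a tree. The names blockSum, sumOnGpu and kThreads are hypothetical, and the block size of 256 is an assumption (the reduction as written needs a power of two). Each block writes only its own output slot, so blocks remain independent of one another.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // threads per block (assumed; power of two)

// Each block sums its slice of `in` into one entry of `blockSums`.
__global__ void blockSum(const int *in, int *blockSums, int n)
{
    __shared__ int tile[kThreads];  // visible to all threads of this block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0;
    __syncthreads();  // all loads must finish before any thread reads tile

    // Tree reduction: halve the number of active threads at each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();  // placed outside the if so every thread reaches it
    }
    if (threadIdx.x == 0)
        blockSums[blockIdx.x] = tile[0];
}

// Hypothetical host helper: launches blockSum and adds the per-block
// results on the CPU. Error checking is omitted for brevity.
long long sumOnGpu(const std::vector<int> &v)
{
    int n = static_cast<int>(v.size());
    if (n == 0) return 0;
    int numBlocks = (n + kThreads - 1) / kThreads;

    int *dIn, *dSums;
    cudaMalloc(&dIn, n * sizeof(int));
    cudaMalloc(&dSums, numBlocks * sizeof(int));
    cudaMemcpy(dIn, v.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    blockSum<<<numBlocks, kThreads>>>(dIn, dSums, n);

    std::vector<int> sums(numBlocks);
    cudaMemcpy(sums.data(), dSums, numBlocks * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dSums);

    long long total = 0;
    for (int s : sums) total += s;
    return total;
}
```

The barrier inside the loop sits outside the conditional because every thread of the block has to reach the same synchronization point.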

A.2 Future Work

Future work involves leveraging the characterization of faults presented in this paper towards the development of efficient tailored exploitation algorithms and methods. Examples include:

Breaking Cryptographic Calculations Implemented on GPUs. One can speculate that the byte-flip phenomenon may be combined with the work done by Sabbagh et al. [35]. Since their work relies on exploiting an instrumented AES, our characterization might enable the attack to target non-instrumented kernels and reduce the number of messages required to break the encryption. It also seems that byte-flips may be used to improve attacks on public-key calculations performed on a GPU.

Faulty Instructions. During our tests we observed that as the fault rate increased, the graphics card occasionally stopped responding (API calls failed), crashed, or became extremely slow. We also received kernel crashes with error codes such as “An illegal instruction was encountered” and “Invalid program counter”. This suggests that the GPU is vulnerable not only to data corruption but also to instruction corruption [14, 26, 38, 39, 40], since code registers, in addition to data registers, are also vulnerable to the faults caused by overclocking.

The knowledge in this paper may allow an attacker to develop code that triggers precise and predictable faults, effectively hiding malicious instructions in legitimate code. To design this, the attacker could create a more “prone-to-errors” region of the code (e.g., by performing many loops in a specific alignment). The attacker also knows that the fault value is likely to be a byte-flip. By studying GPU opcodes and their inverses, the attacker can then craft their own instruction in the misread CUDA code. A similar technique can be used to leverage the faults to modify the Program Counter register.
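The byte-flip fault model referred to above can be written as an XOR of one byte position with 0xFF. The sketch below is illustrative only: it works on a plain 32-bit word rather than on real GPU instruction encodings, which are not given here, and the helper names byteFlip and byteFlipIndex are hypothetical.

```cuda
#include <cstdint>

// Assumed fault model: all 8 bits of one byte in a 32-bit word are negated.
// byteIndex is in 0..3, where 0 is the least significant byte.
__host__ __device__ inline std::uint32_t byteFlip(std::uint32_t word,
                                                  int byteIndex)
{
    return word ^ (0xFFu << (8 * byteIndex));
}

// Returns the index of the flipped byte if `observed` differs from
// `expected` by exactly one byte-flip, and -1 otherwise (for example,
// for a single bit-flip or for no fault at all).
__host__ __device__ inline int byteFlipIndex(std::uint32_t expected,
                                             std::uint32_t observed)
{
    std::uint32_t diff = expected ^ observed;
    for (int k = 0; k < 4; ++k)
        if (diff == (0xFFu << (8 * k)))
            return k;
    return -1;
}
```

For example, byteFlip(0x12345678, 1) yields 0x1234A978, because 0x56 XOR 0xFF is 0xA9. Applying the same flip twice restores the original word, so the word that would be misread as a chosen target under this model is simply the target with that byte inverted.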

Other GPUs. Our tests were conducted on an Nvidia GPU; similar work can be carried out to characterize the faults on other GPUs.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zuberi, E., Wool, A. (2021). Characterizing GPU Overclocking Faults. In: Bertino, E., Shulman, H., Waidner, M. (eds) Computer Security – ESORICS 2021. ESORICS 2021. Lecture Notes in Computer Science, vol 12972. Springer, Cham. https://doi.org/10.1007/978-3-030-88418-5_6

  • DOI: https://doi.org/10.1007/978-3-030-88418-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88417-8

  • Online ISBN: 978-3-030-88418-5

  • eBook Packages: Computer Science, Computer Science (R0)
