Abstract
Recently, the National Institute of Standards and Technology (NIST) in the U.S. initiated a global competition to standardize lightweight authenticated encryption with associated data (AEAD) and hash functions. Gimli is one of the Round 2 candidates, designed to be implemented efficiently across various platforms, including hardware (VLSI and FPGA), microprocessors, and microcontrollers. However, the performance of Gimli on massively parallel architectures such as graphics processing units (GPUs) is still unknown. A high-performance Gimli implementation on GPUs can be especially useful for Internet of Things (IoT) applications, in which gateway devices and cloud servers must handle a massive number of communications protected by AEAD. In this paper, we show that, with careful optimization, Gimli can be implemented efficiently on desktop and embedded GPUs to achieve extremely high throughput. Our experiments show that the proposed Gimli implementation achieves 661.44 KB/s (encryption), 892.24 KB/s (decryption), and 4344.46 KB/s (hashing) on state-of-the-art GPUs.
Data availability
This paper uses the code from NIST Lightweight Cryptography Standardization as a starting point to develop the optimized implementation on GPU. The code for Gimli hash and authenticated encryption can be found here: https://csrc.nist.gov/projects/lightweight-cryptography/round-2-candidates.
Acknowledgements
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2019H1D3A1A01102607, 2020R1A2B5B01002145, 2021R1A6A3A13038773).
Author information
Contributions
All authors contributed equally to the final dissemination of the research investigation as a full article. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflicts of interest
There is no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Han, K., Lee, WK. & Hwang, S.O. cuGimli: optimized implementation of the Gimli authenticated encryption and hash function on GPU for IoT applications. Cluster Comput 25, 433–450 (2022). https://doi.org/10.1007/s10586-021-03415-z
DOI: https://doi.org/10.1007/s10586-021-03415-z