Abstract
In this paper, we exploit AI accelerators to implement cryptographic algorithms. To the best of our knowledge, this is the first attempt to implement quantum-safe Lattice-Based Cryptography (LBC) on an AI accelerator. However, AI accelerators are designed for machine learning workloads (e.g., convolution operations) and cannot directly apply their computing power to cryptographic computation. Noting that polynomial multiplication over rings is a time-consuming computation in LBC, we use a straightforward approach to adapt the AI accelerator to polynomial multiplication over rings. Additional non-trivial optimizations, such as using low-latency shared memory and coalescing memory accesses, minimize the overhead of this transformation. Moreover, based on the NVIDIA AI accelerator, Tensor Core, we implement a prototype system named TESLAC and conduct a comprehensive set of experiments to evaluate its performance. The experimental results show that TESLAC can reach tens of millions of operations per second, a speedup of two orders of magnitude over the AVX2-accelerated reference implementation. In particular, with some additional techniques, TESLAC can also be scaled to other LBC schemes with a larger modulus q.
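The abstract does not spell out how polynomial multiplication is mapped onto the accelerator. As a minimal sketch of one common way such a mapping can work, the Python below rewrites multiplication in Z_q[x]/(x^n + 1) as a matrix-vector product, which is the kind of operation a matrix-multiply unit computes. The ring x^n + 1, the toy parameters n = 4 and q = 251, and all function names are assumptions chosen for illustration; they are not taken from TESLAC.

```python
def negacyclic_matrix(a, q):
    """Return the n x n matrix A such that A @ b equals a(x) * b(x)
    in Z_q[x]/(x^n + 1). Column j holds the coefficients of a(x) * x^j."""
    n = len(a)
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            if i >= j:
                row.append(a[i - j] % q)
            else:
                # Wrap-around terms pick up a minus sign because x^n = -1.
                row.append((-a[n + i - j]) % q)
        rows.append(row)
    return rows


def polymul_matrix(a, b, q):
    """Multiply a(x) and b(x) via one matrix-vector product (hypothetical helper)."""
    n = len(a)
    A = negacyclic_matrix(a, q)
    return [sum(A[i][j] * b[j] for j in range(n)) % q for i in range(n)]


def polymul_schoolbook(a, b, q):
    """Reference schoolbook multiplication with reduction modulo x^n + 1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c


if __name__ == "__main__":
    q = 251                      # toy byte-sized modulus, assumed for illustration
    a = [1, 2, 3, 4]             # coefficients in ascending degree order
    b = [5, 6, 7, 8]
    print(polymul_matrix(a, b, q))
    print(polymul_schoolbook(a, b, q))
```

Stacking several polynomials b as columns turns the matrix-vector product into a matrix-matrix product. Whether the intermediate sums stay exact then depends on the accelerator's number formats, which is presumably why a larger modulus q needs extra techniques.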
This work was partially supported by the National Key R&D Program of China under Award 2018YFB0804401 and the National Natural Science Foundation of China under Award No. 61902392.
Notes
1. NVIDIA has launched several generations of Tensor Core. Unless otherwise stated, Tensor Core in this paper refers to the one on the Volta architecture.
A Appendix
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Wan, L., Zheng, F., Lin, J. (2021). TESLAC: Accelerating Lattice-Based Cryptography with AI Accelerator. In: Garcia-Alfaro, J., Li, S., Poovendran, R., Debar, H., Yung, M. (eds) Security and Privacy in Communication Networks. SecureComm 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 398. Springer, Cham. https://doi.org/10.1007/978-3-030-90019-9_13
DOI: https://doi.org/10.1007/978-3-030-90019-9_13
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90018-2
Online ISBN: 978-3-030-90019-9
eBook Packages: Computer Science, Computer Science (R0)