[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

Split-Radix Based Compact Hardware Architecture for CRYSTALS-Kyber

Published: 01 January 2024 Publication History

Abstract

Facing the threat of large-scale quantum computers to traditional public-key cryptography, the National Institute of Standards and Technology has conducted Post-Quantum Cryptography algorithms evaluation for a long time, and CRYSTALS-Kyber has been selected to enter the standardization process. In the previous literature, hardware designs can significantly improve the performance of CRYSTALS-Kyber, and the most time-consuming operations are Number Theoretic Transform (NTT) and point-wise multiplication (PWM). However, the split-radix algorithm, which has a lower theoretical complexity in the FFT, has rarely been studied in the NTT. In this paper, we studied whether there are advantages of introducing split-radix algorithms into the NTT defined by CRYSTALS-Kyber and detailed derived the split-radix algorithms for the forward and inverse NTT without pre- or post-processing. By further optimizing the split-radix algorithm for the forward NTT, one of the three modular multipliers in the <inline-formula><tex-math notation="LaTeX">$\boldsymbol{L}$</tex-math><alternatives><mml:math><mml:mi mathvariant="bold-italic">L</mml:mi></mml:math><inline-graphic xlink:href="guo-ieq1-3320040.gif"/></alternatives></inline-formula>-shaped butterfly unit is replaced by shifting-and-addition, which will reduce the hardware resource consumption. Besides, we proposed a recombined formula for PWM, which reduces the capacity of the intermediate data RAM for PWM by 25%. Together with the proposed hardware scheduling method, the above algorithms can improve performance and save hardware resources.

References

[1]
G. Alagic et al., “Status report on the third round of the NIST post-quantum cryptography standardization process,” NIST Interagency/Internal Rep. (NISTIR), Gaithersburg, MD, USA, 2022. [Online]. Available: https://doi.org/10.6028/NIST.IR.8413
[2]
R. Avanzi et al., “CRYSTALS-Kyber algorithm specifications and supporting documentation.” 2022. Accessed: Feb. 10, 2023. [Online]. Available: https://csrc.nist.gov/Projects/post-quantum-cryptography/post-quantum-cryptography-standa-rdization/round-3-submissions
[3]
L. Botros, M. Kannwischer, and P. Schwabe, “Memory-efficient high-speed implementation of Kyber on Cortex-M4,” in Prog. Cryptol.—AFRICACRYPT—11th Int. Conf. Cryptol. Afr., Rabat, Morocco, Jul. 9–11, 2019, pp. 209–228.
[4]
E. Alkim, Y. Bilgin, M. Cenk, and F. Gérard, “Cortex-M4 optimizations for R,M LWE schemes,” IACR Trans. Cryptogr. Hardw. Embed. Syst., no. 3, pp. 336–357, 2020.
[5]
M. Bisheh-Niasar, R. Azarderakhsh, and M. Mozaffari-Kermani, “High-speed NTT-based polynomial multiplication accelerator for post-quantum cryptography,” in Proc. IEEE 28th Symp. Comput. Arithmetic (ARITH), 2021, pp. 94–101.
[6]
Y. Xing and S. Li, “A compact hardware implementation of CCA-secure key exchange mechanism CRYSTALS-KYBER on FPGA,” IACR Trans. Cryptograph. Hardw. Embedded Syst., no. 2, pp. 328–356, 2021.
[7]
W. Guo, S. Li, and L. Kong, “An efficient implementation of KYBER,” IEEE Trans. Circuits Syst.,II, Exp. Briefs, vol. 69, no. 3, pp. 1562–1566, Mar. 2022.
[8]
M. Bisheh-Niasar, R. Azarderakhsh, and M. Mozaffari-Kermani, “Instruction-set accelerated implementation of CRYSTALS-Kyber,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 68, no. 11, pp. 4648–4659, Nov. 2021.
[9]
V. Dang, K. Mohajerani, and K. Gaj, “High-speed hardware architectures and FPGA benchmarking of CRYSTALS-Kyber, NTRU, and Saber,” IEEE Trans. Comput., vol. 72, no. 2, pp. 306–320, Feb. 2023.
[10]
P. Duhamel and H. Hollmann, “‘Split radix’ FFT algorithm,” Electron. Lett., vol. 20, no. 1, pp. 14–16, Jan. 1984.
[11]
D. Takahashi, “An extended split-radix FFT algorithm,” IEEE Signal Process. Lett., vol. 8, no. 5, pp. 145–147, May 2001.
[12]
F. Winkler, Polynomial Algorithms in Computer Algebra (Texts and Monographs in Symbolic Computation), 1st ed. Vienna, Austria: Springer, Aug. 1996.
[13]
S. Roy, F. Vercauteren, N. Mentens, D. Chen, and I. Verbauwhede, “Compact ring-LWE cryptoprocessor,” in Proc. Cryptographic Hardware Embedded Syst. (CHES), in Lecture Notes in Computer Science, vol. 8731, L. Batina and M. Robshaw, Eds. Berlin, Heidelberg: Springer. [Online]. Available: https://doi.org/10.1007/978-3-662-44709-3_21
[14]
T. Pöppelmann, T. Oder, and T. Güneysu, “High-performance ideal lattice-based cryptography on 8-bit ATxmega microcontrollers,” in Proc. Prog. Cryptol. (LATINCRYPT), vol. 9230. Cham, Germany: Springer. [Online]. Available: https://doi.org/10.1007/978-3-319-22174-8_19
[15]
N. Zhang, B. Yang, C. Chen, S. Yin, S. Wei, and L. Liu, “Highly efficient architecture of NewHope-NIST on FPGA using low-complexity NTT/INTT,” IACR Trans. Cryptogr. Hardw. Embed. Syst., vol. 2020, no. 2, pp. 49–72, 2020. [Online]. Available: https://doi.org/10.13154/tches.v2020.i2.49-72
[16]
X. Chen, B. Yang, S. Yin, S. Wei, and L. Liu, “CFNTT: Scalable radix-2/4 NTT multiplication architecture with an efficient conflict-free memory mapping scheme,” IACR Trans. Cryptogr. Hardw. Embed. Syst., no. 1, pp. 94–126, 2022. [Online]. Available: https://doi.org/10.46586/tches.v2022.i1.94-126
[17]
D. Chen, “High-speed polynomial multiplication architecture for ring-LWE and SHE cryptosystems,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 62, no. 1, pp. 157–166, Jan. 2015.
[18]
X. Feng and S. Li, “Accelerating an FHE integer multiplier using negative wrapped convolution and ping-pong FFT,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 66, no. 1, pp. 121–125, Jan. 2019.
[19]
L. Johnson, “Conflict free memory addressing for dedicated FFT hardware,” IEEE Trans. Circuits Syst. II. Analog Digit. Signal Process., vol. 39, no. 5, pp. 312–316, May 1992.
[20]
N. Zhang, “NTTU: An area-efficient low-power NTT-uncoupled architecture for NTT-based multiplication,” IEEE Trans. Comput., vol. 69, no. 4, pp. 520–533, Apr. 2020.
[21]
A. Aikata, A. Mert, M. Imran, S. Pagliarini, and S. Roy, “KaLi: A crystal for post-quantum security using Kyber and Dilithium,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 70, no. 2, pp. 747–758, Feb. 2023.

Cited By

View all
  • (2024)A Highly Hardware Efficient ML-KEM Accelerator with Optimised Architectural LayersACM Transactions on Embedded Computing Systems10.1145/370846924:2(1-24)Online publication date: 18-Dec-2024

Index Terms

  1. Split-Radix Based Compact Hardware Architecture for CRYSTALS-Kyber
        Index terms have been assigned to the content through auto-classification.

        Recommendations

        Comments

        Please enable JavaScript to view thecomments powered by Disqus.

        Information & Contributors

        Information

        Published In

        cover image IEEE Transactions on Computers
        IEEE Transactions on Computers  Volume 73, Issue 1
        Jan. 2024
        300 pages

        Publisher

        IEEE Computer Society

        United States

        Publication History

        Published: 01 January 2024

        Qualifiers

        • Research-article

        Contributors

        Other Metrics

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months)0
        • Downloads (Last 6 weeks)0
        Reflects downloads up to 20 Jan 2025

        Other Metrics

        Citations

        Cited By

        View all
        • (2024)A Highly Hardware Efficient ML-KEM Accelerator with Optimised Architectural LayersACM Transactions on Embedded Computing Systems10.1145/370846924:2(1-24)Online publication date: 18-Dec-2024

        View Options

        View options

        Media

        Figures

        Other

        Tables

        Share

        Share

        Share this Publication link

        Share on social media