Abstract
The arithmetic in a finite field constitutes the core of public key cryptography like RSA, ECC or pairing-based cryptography. This paper discusses an efficient hardware implementation of the Coarsely Integrated Operand Scanning (CIOS) method of Montgomery modular multiplication combined with an effective systolic architecture designed with a two-dimensional array of processing elements. The systolic architecture increases the speed of calculation by combining the concepts of pipelining and the parallel processing into a single concept. We propose the CIOS method for the Montgomery multiplication using a systolic architecture. As far as we know, this is the first implementation of such design. The proposed architectures are designed for field programmable gate array platforms. They targeted to reduce the number of clock cycles of the modular multiplication. The presented implementation results of the CIOS algorithms focus on different security levels useful in cryptography. This architecture has been designed in order to use the flexible DSP48 on Xilinx Field-Programmable Gate Array’s. Our architecture is scalable and depends only on the number and size of words. For instance, we provide results of implementation for 8-, 16-, 32- and 64-bit-long words in 33, 66, 132 and 264 clock cycles. We highlight the fact that for a given number of word, the number of clock cycles is constant. We propose a general version of our systolic architecture presented in SPACE2016.
Similar content being viewed by others
References
Bigou K, Tisserand A (2015) Single base modular multiplication for efficient hardware RNS implementations of ECC. In: Conference on cryptographic hardware and embedded systems, pp 123–140
Fan J, Sakiyama K, Verbauwhede I (2007) Montgomery modular multiplication algorithm on multi-core systems. In: IEEE workshop on signal processing systems, 2007, pp 261–266
Hariri A, Reyhani-Masoleh A (2009) Bit-serial and bit-parallel Montgomery multiplication and squaring over GF. IEEE Trans Comput 58(10):1332–1345
Harris D, Krishnamurthy R, Anders M, Mathew S, Hsu S (2005) An improved unified scalable radix-2 Montgomery multiplier. In: 17th IEEE symposium on computer arithmetic, 2005. ARITH-17 2005, pp 172–178
Huang M, Gaj K, El-Ghazawi T (2011) New hardware architectures for Montgomery modular multiplication algorithm. IEEE Trans Comput 60(7):923–936
Huang M, Gaj K, Kwon S, El-Ghazawi T (2008) An optimized hardware architecture for the Montgomery multiplication algorithm. In: Cramer R (ed) Public key cryptography – PKC 2008, vol 4939 of lecture notes in computer science. Springer, Berlin, pp 214–228
Lee K -I (2007) Algorithm and VLSI architecture design for h.264/AVC Inter Frame Coding. PhD thesis, National Cheng Kung University, Tainan, Taiwan
Iwamura K, Matsumoto T, Imai H (1993) High-speed implementation methods for RSA scheme. In: Rueppel R A (ed) Advances in cryptology—EUROCRYPT’ 92, vol 658 of lecture notes in computer science. Springer, Berlin, pp 221–238
Iwamura K, Matsumoto T, Imai H (1993) Systolic-arrays for modular exponentiation using Montgomery method. In: Rueppel R A (ed) Advances in cryptology — EUROCRYPT’ 92, vol 658 of lecture notes in computer science. Springer, Berlin, pp 477–481
Joux A (2004) A one round protocol for tripartite Diffie-Hellman. J Cryptol 17(4):263–276
Koç C K, Acar T, Kaliski B S Jr (1996) Analyzing and comparing Montgomery multiplication algorithms. IEEE Micro 16(3):26–33
Koblitz N (1987) Elliptic curve cryptosystems. Math Comput 48(177):203–209
Manochehri K, Pourmozafari S, Sadeghiyan B (2010) Montgomery and RNS for RSA hardware implementation. Cpmput Inform 29:849–880
Kung H T (1982) Why systolic architectures? Computer 15(1):37–46
Miller V (1986) Use of elliptic curves in cryptography. In: Williams H C (ed) Advances in cryptology—CRYPTO ’85 proceedings, vol 218 of lecture notes in computer science. Springer, Berlin, pp 417–426
Montgomery P L (1985) Modular multiplication without trial division. Math Comput 44(170):519–521
Ors S B, Batina L, Preneel B, Vandewalle J (2003) Hardware implementation of a Montgomery modular multiplier in a systolic array
Perin G, Mesquita D G, Martins J B (2011) Montgomery modular multiplication on reconfigurable hardware: systolic versus multiplexed implementation. Int J Reconfig Comput 2011:61–610
Rivest R L, Shamir A, Adleman L (1978) A method for obtaining digital signatures and public-key cryptosystems. Commun ACM 21:120–126
Tenca A F, Koç Ç K (1999) A scalable architecture for Montgomery multiplication. In: Koç cCK, Paar C (eds) Cryptographic hardware and embedded systems, first international workshop, CHES’99, Worcester, MA, USA, August 12–13, 1999, Proceedings, vol 1717 of Lecture Notes in Computer Science. Springer, pp 94–108
Vucha M, Rajawat A (2011) Design and FPGA implementation of systolic array architecture for matrix multiplication. Int J Comput Appl 26(3):0975–8887
Author information
Authors and Affiliations
Corresponding author
Appendices
Appendix A
1.1 A.1 Code Sage NW-8
1.2 A.2 Code Sage NW-16
Appendix B: Architecture
1.1 B.1 Execution
Rights and permissions
About this article
Cite this article
Mrabet, A., El-Mrabet, N., Lashermes, R. et al. A Scalable and Systolic Architectures of Montgomery Modular Multiplication for Public Key Cryptosystems Based on DSPs. J Hardw Syst Secur 1, 219–236 (2017). https://doi.org/10.1007/s41635-017-0018-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s41635-017-0018-x