CN115270155A

CN115270155A - Method for obtaining maximum common divisor of big number expansion and hardware architecture

Info

Publication number: CN115270155A
Application number: CN202210910137.5A
Authority: CN
Inventors: 王中风; 欧伦; 朱丹阳; 田静
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2022-07-29
Filing date: 2022-07-29
Publication date: 2022-11-01

Abstract

The present application relates to the technical field of computer applications, and provides a method and hardware architecture for obtaining the greatest common divisor of a large number expansion, a control module, a GCD calculation unit, a Bezuo coefficient calculation unit, a first multiplexer, and a second multiplexer , the termination module and the input_valid signal, the GCD calculation unit and the Bezuo coefficient calculation unit are used to iteratively update the intermediate variables according to the control signal of the control module, and the introduction of the δ parameter avoids comparing the sizes of the intermediate variables a and b, based on the k-ary algorithm At the same time, the calculation and redundant form of the Bezuo coefficient are introduced. In the hardware implementation, only simple addition and subtraction and shift operations are required, which greatly reduces the time required for the addition operation, achieves the purpose of increasing the clock frequency, and reduces the iteration cycle. Reduce total run time.

Description

A method and hardware architecture for obtaining the extended greatest common divisor of large numbers

技术领域technical field

本申请涉及计算机应用技术领域，具体涉及脉一种获取大数拓展最大公约数的方法及硬件架构。This application relates to the field of computer application technology, in particular to a method and hardware architecture for obtaining the extended greatest common divisor of large numbers.

背景技术Background technique

随着互联网的发展，在通信过程中保证信息的安全十分重要，因此密码技术受到越来越广泛的关注。在密码学中，出于安全的需要，除了使用大位宽的数据以外，还包括应用于数据传输、存储和身份认证等方面的加解密算法和签名算法，例如二次型可验证延迟函数(Verifiable delay function in binary quadratic forms，VDFs)和模逆运算等，在这些运算当中，拓展最大公约数(Extended greatest common divisor，XGCD)由于其计算的复杂性，是其中的核心单元。With the development of the Internet, it is very important to ensure the security of information in the communication process, so cryptographic technology has received more and more attention. In cryptography, for the sake of security, in addition to using data with large bit width, it also includes encryption and decryption algorithms and signature algorithms applied to data transmission, storage and identity authentication, such as the quadratic verifiable delay function ( Verifiable delay function in binary quadratic forms, VDFs) and modular inverse operations, etc. Among these operations, the Extended greatest common divisor (XGCD) is the core unit due to its computational complexity.

在基于RSA(RSA algorithm)的公钥密码体制中，发送方利用公钥对文件明文进行加密，得到文件密文，接收方利用该公钥对应的私钥对文件密文进行解密。公钥由模数和公钥指数组成，私钥由模数和私钥指数组成，在公私钥对的生成过程中，首先选取两个大质数p、q得到模数，利用p和q的互质数目的乘积得到第一参数，根据第一参数选取公钥指数，其中，私钥指数为公钥指数关于第一参数的模反元素，因此，可以使用公钥指数和第一参数作为XGCD的输入，输出的对应的贝祖系数即为私钥指数。In the public key cryptosystem based on RSA (RSA algorithm), the sender uses the public key to encrypt the plaintext of the file to obtain the ciphertext of the file, and the receiver uses the private key corresponding to the public key to decrypt the ciphertext of the file. The public key is composed of modulus and public key exponent, and the private key is composed of modulus and private key exponent. In the process of generating the public-private key pair, firstly select two large prime numbers p and q to obtain the modulus, and use the interaction of p and q to obtain the modulus. The product of the prime number gets the first parameter, and the public key index is selected according to the first parameter, wherein the private key index is the inverse element of the public key index with respect to the first parameter, therefore, the public key index and the first parameter can be used as the input of XGCD , the corresponding Bezou coefficient output is the private key index.

现有技术公开了一种基于拓展欧几里得算法(Extended Euclidean algorithm，简称EEA)及其变体的XGCD计算方法，该方法所需的迭代次数少，由于CPU的频率固定，较少的迭代次数意味着计算所需时间很短，所以其被广泛用于软件实现中，例如GMP大数库(GNUMultiple Precision Arithmetic Library)。然而，在硬件实现方面，上述方法需要用到除法，而大数除法的实现十分复杂，会导致关键路径很长，且其具有数据依赖性，需要串行地迭代进行，很难通过并行处理来提高计算速度，导致硬件实现上复杂度高且计算效率低，在高速的密码学应用和硬件实现中成为瓶颈。The prior art discloses an XGCD calculation method based on the Extended Euclidean algorithm (EEA for short) and its variants. The number of iterations required by this method is small. Due to the fixed frequency of the CPU, fewer iterations The number of times means that the time required for calculation is very short, so it is widely used in software implementations, such as the GMP large number library (GNUMultiple Precision Arithmetic Library). However, in terms of hardware implementation, the above method needs to use division, and the implementation of large number division is very complicated, which will lead to a long critical path, and it has data dependence, which needs to be iteratively performed serially, and it is difficult to solve it through parallel processing. Improving the calculation speed leads to high complexity and low calculation efficiency in hardware implementation, which becomes a bottleneck in high-speed cryptography applications and hardware implementation.

现有技术还公开基于k-ary算法的XGCD计算方法和基于Two-bit PM算法的XGCD计算方法，其中，k-ary算法通过设计k值可以只需要用到简单的运算，并且相比于PM算法和Two-bit PM算法，不同的k值可以在面积和运算速度上获得一个很好的折中，因此更为灵活，适用于更多的设计需求。然而对于k-ary算法，大多数研究未提及贝祖系数的计算方法，极少数基于k-ary算法的计算贝祖系数的方法又需要用到模乘与除法，对于硬件实现而言都是十分复杂的。The prior art also discloses the XGCD calculation method based on the k-ary algorithm and the XGCD calculation method based on the Two-bit PM algorithm, wherein the k-ary algorithm only needs to use simple calculations by designing the k value, and compared to PM Algorithm and Two-bit PM algorithm, different k values can obtain a good compromise between area and operation speed, so it is more flexible and suitable for more design requirements. However, for the k-ary algorithm, most studies did not mention the calculation method of the Bezou coefficient, and very few methods for calculating the Bezou coefficient based on the k-ary algorithm also need to use modular multiplication and division, which is a must for hardware implementation. Very complicated.

发明内容Contents of the invention

本申请提供一种获取大数拓展最大公约数的方法及硬件架构，以降低复杂度从而适应于硬件实现。The present application provides a method and hardware architecture for obtaining the extended greatest common divisor of large numbers, so as to reduce complexity and be suitable for hardware implementation.

本申请第一方面提供一种获取大数拓展最大公约数的方法，应用于公私钥对的生成，包括：The first aspect of this application provides a method for obtaining the extended greatest common divisor of large numbers, which is applied to the generation of public-private key pairs, including:

获取公私钥对生成过程中的第一参数和公钥指数；Obtain the first parameter and public key exponent in the process of generating the public-private key pair;

获取所述方法中预设的k值，所述k为2的幂次；Obtain the k value preset in the method, and the k is a power of 2;

根据所述第一参数和所述公钥指数，初始化后续迭代中所需的中间变量，所述中间变量包括第一迭代变量、第二迭代变量、第三迭代变量、第四迭代变量、第五迭代变量、第六迭代变量和第七迭代变量，其中，所述第七迭代变量的初始值为0；所述第一迭代变量、所述第二迭代变量、所述第三迭代变量、所述第四迭代变量、所述第五迭代变量和所述第六迭代变量满足xA+yB＝a和zA+wB＝b，初始值为x＝1，y＝0，z＝0，w＝1，a＝A和b＝B，其中，A为第一参数、B为公钥指数、a为第一迭代变量、b为第二迭代变量、x为第三迭代变量、y为第四迭代变量、z为第五迭代变量、w为第六迭代变量；According to the first parameter and the public key index, initialize the intermediate variables required in subsequent iterations, the intermediate variables include the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable iteration variable, the sixth iteration variable and the seventh iteration variable, wherein the initial value of the seventh iteration variable is 0; the first iteration variable, the second iteration variable, the third iteration variable, the The fourth iteration variable, the fifth iteration variable and the sixth iteration variable satisfy xA+yB=a and zA+wB=b, the initial values are x=1, y=0, z=0, w=1, a=A and b=B, wherein, A is the first parameter, B is the public key index, a is the first iteration variable, b is the second iteration variable, x is the third iteration variable, y is the fourth iteration variable, z is the fifth iteration variable, w is the sixth iteration variable;

判断第一迭代变量或第二迭代变量是否为偶数；Judging whether the first iteration variable or the second iteration variable is an even number;

如果第一迭代变量为偶数，则选取第一数据，以及，使用第一迭代变量除以第一数据的值来更新第一迭代变量，使用第三迭代变量除以第一数据的值来更新第三迭代变量，使用第四迭代变量除以第一数据的值来更新第四迭代变量，使用第七迭代变量减去log₂u来更新第七迭代变量，其中，u为第一数据，所述第一数据为小于或等于k且能被第一迭代变量整除的2的幂次的数据集合中最大的数；If the first iteration variable is an even number, the first data is selected, and the first iteration variable is updated by dividing the first iteration variable by the value of the first data, and the third iteration variable is updated by dividing the value of the first data by the third iteration variable. Three iteration variables, using the fourth iteration variable divided by the value of the first data to update the fourth iteration variable, using the seventh iteration variable to subtract log ₂ u to update the seventh iteration variable, where u is the first data, the The first data is the largest number in the power of 2 data set that is less than or equal to k and can be divisible by the first iteration variable;

如果更新后的第一迭代变量为0，则第一参数和公钥指数的最大公约数为第二迭代变量，第一参数对应的贝祖系数为第五迭代变量，公钥指数对应的贝祖系数为第六迭代变量。If the updated first iteration variable is 0, then the greatest common divisor of the first parameter and the public key exponent is the second iteration variable, the Beizu coefficient corresponding to the first parameter is the fifth iteration variable, and the Beizu coefficient corresponding to the public key index is The coefficient is the sixth iteration variable.

可选地，还包括：Optionally, also include:

如果更新后的第一迭代变量为非0，则重新判断第一迭代变量或第二迭代变量是否为偶数。If the updated first iteration variable is non-zero, it is re-determined whether the first iteration variable or the second iteration variable is an even number.

可选地，在所述判断第一迭代变量或第二迭代变量是否为偶数之后，还包括：Optionally, after the judging whether the first iteration variable or the second iteration variable is an even number, it also includes:

如果第二迭代变量为偶数，则选取第二数据，以及，使用第二迭代变量除以第二数据的值来更新第二迭代变量，使用第五迭代变量除以第二数据的值来更新第五迭代变量，使用第六迭代变量除以第二数据的值来更新第六迭代变量，使用第七迭代变量加上log₂v来更新第七迭代变量，其中，v为第二数据，所述第二数据为小于或等于k且能被第二迭代变量整除的2的幂次的数据集合中最大的数；If the second iteration variable is an even number, select the second data, and use the value of the second iteration variable divided by the second data to update the second iteration variable, and use the fifth iteration variable to divide the value of the second data to update the second iteration variable. Five iteration variables, using the sixth iteration variable divided by the value of the second data to update the sixth iteration variable, using the seventh iteration variable plus log ₂ v to update the seventh iteration variable, wherein, v is the second data, said The second data is the largest number in the power of 2 data set that is less than or equal to k and can be divisible by the second iteration variable;

如果更新后的第二迭代变量为0，则第一参数和公钥指数的最大公约数为第一迭代变量，第一参数对应的贝祖系数为第三迭代变量，公钥指数对应的贝祖系数为第四迭代变量。If the updated second iteration variable is 0, then the greatest common divisor of the first parameter and the public key exponent is the first iteration variable, the Bezu coefficient corresponding to the first parameter is the third iterative variable, and the Bezu coefficient corresponding to the public key exponent is The coefficient is the fourth iteration variable.

可选地，还包括：Optionally, also include:

如果更新后的第二迭代变量为非0，则重新判断第一迭代变量或第二迭代变量是否为偶数。If the updated second iteration variable is non-zero, it is re-determined whether the first iteration variable or the second iteration variable is an even number.

如果第一迭代变量和第二迭代变量均为奇数，则选取第三数据和第四数据，并计算c＝(pa+qb)/k，α＝(px+qz)/k，β＝(py+qw)/k，其中，p为第三数据，q为第四数据，第三数据和第四数据满足

且pa+qb＝0(mod k)，以及，根据第三数据和第四数据更新第一迭代变量、第二迭代变量、第三迭代变量、第四迭代变量、第五迭代变量、第六迭代变量和第七迭代变量，其中：如果第七迭代变量为负数，则使用(c，α，β)更新(b，z，w)，且使用第七迭代变量加上log₂k/q-1来更新第七迭代变量；如果第七迭代变量为非负数，则使用(c，α，β)更新(a，x，y)，且使用第七迭代变量减去log₂k/q-1来更新第七迭代变量；If both the first iteration variable and the second iteration variable are odd numbers, then select the third data and the fourth data, and calculate c=(pa+qb)/k, α=(px+qz)/k, β=(py +qw)/k, wherein, p is the third data, q is the fourth data, the third data and the fourth data satisfy

And pa+qb=0(mod k), and update the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable, the sixth iteration variable according to the third data and the fourth data variable and the seventh iteration variable, where: if the seventh iteration variable is negative, use (c, α, β) to update (b, z, w), and use the seventh iteration variable plus log ₂ k/q-1 to update the seventh iteration variable; if the seventh iteration variable is non-negative, use (c, α, β) to update (a, x, y), and use the seventh iteration variable to subtract log ₂ k/q-1 to update the seventh iteration variable;

本申请第二方面提供一种获取大数拓展最大公约数的硬件架构，用于计算XA+YB＝GCD(A，B)＝G，其中，G、X和Y为运算结果，A为第一操作数，B为第二操作数，第一操作数和第二操作数均为奇数，其特征在于，包括：控制模块、GCD计算单元、贝祖系数计算单元、第一多路选择器、第二多路选择器、终止模块和input_valid信号；The second aspect of the present application provides a hardware architecture for obtaining the extended greatest common divisor of large numbers, which is used to calculate XA+YB=GCD(A, B)=G, where G, X and Y are the operation results, and A is the first Operand, B is the second operand, the first operand and the second operand are odd numbers, it is characterized in that, comprising: control module, GCD calculation unit, Bezou coefficient calculation unit, the first multiplexer, the first Two multiplexers, a termination module and an input_valid signal;

GCD计算单元，所述GCD计算单元的输入端分别与所述第一多路选择器的输出端和所述控制模块的输出端连接，所述GCD计算单元的输出端分别与所述控制模块的输入端、所述终止模块的输入端和所述第一多路选择器的输入端连接，所述GCD计算单元用于将输入的第一操作数和第二操作数分别作为第一迭代变量和第二迭代变量的初始值，以及，根据所述控制模块的控制信号，更新每次输入的第一迭代变量或第二迭代变量，同时更新初始值为0的第七迭代变量，输出当前次迭代完成的第一迭代变量、第二迭代变量和第七迭代变量；GCD calculation unit, the input end of the GCD calculation unit is respectively connected with the output end of the first multiplexer and the output end of the control module, and the output end of the GCD calculation unit is respectively connected with the output end of the control module The input terminal, the input terminal of the termination module and the input terminal of the first multiplexer are connected, and the GCD calculation unit is used to use the input first operand and the second operand as the first iteration variable and the second operand respectively. The initial value of the second iteration variable, and, according to the control signal of the control module, update the first iteration variable or the second iteration variable input each time, update the seventh iteration variable with an initial value of 0, and output the current iteration completed first iteration variant, second iteration variant and seventh iteration variant;

贝祖系数计算单元，所述贝祖系数计算单元的输入端分别与所述第二多路选择器的输出端和所述控制模块的输出端连接，所述贝祖系数计算单元的输出端与所述第二多路选择器的输入端连接，所述贝祖系数计算单元用于将输入的1、0、0、1分别作为第三迭代变量、第四迭代变量、第五迭代变量和第六迭代变量的初始值，以及，根据所述控制模块的控制信号，更新每次输入的第三迭代变量、第四迭代变量、第五迭代变量或第六迭代变量，输出当前次迭代完成的第三迭代变量、第四迭代变量、第五迭代变量和第六迭代变量，其中，所述GCD计算单元和所述贝祖系数计算单元接收的控制模块的控制信号为同步信号；A Bezou coefficient calculation unit, the input end of the Bezou coefficient calculation unit is respectively connected to the output end of the second multiplexer and the output end of the control module, and the output end of the Bezou coefficient calculation unit is connected to The input end of the second multiplexer is connected, and the Bezou coefficient calculation unit is used to use the input 1, 0, 0, and 1 as the third iteration variable, the fourth iteration variable, the fifth iteration variable and the third iteration variable respectively. The initial value of the six iteration variables, and, according to the control signal of the control module, update the third iteration variable, the fourth iteration variable, the fifth iteration variable or the sixth iteration variable input each time, and output the first iteration completed by the current iteration. Three iteration variables, the fourth iteration variable, the fifth iteration variable and the sixth iteration variable, wherein the control signal of the control module received by the GCD calculation unit and the Bezou coefficient calculation unit is a synchronization signal;

input_valid信号，用于向所述第一多路选择器和所述第二多路选择器传输信号，其中，所述input_valid信号用于控制所述第一多路选择器选择第一操作数、第二操作数或上一次迭代输出的第一迭代变量、第二迭代变量作为所述GCD计算单元的每次迭代计算的输入，以及用于控制所述第二多路选择器选择0、1或上一次迭代输出的第三迭代变量、第四迭代变量、第五迭代变量和第六迭代变量作为所述贝祖系数计算单元的每次迭代计算的输入；The input_valid signal is used to transmit signals to the first multiplexer and the second multiplexer, wherein the input_valid signal is used to control the first multiplexer to select the first operand, the second Two operands or the first iteration variable output by the last iteration, the second iteration variable are used as the input of each iteration calculation of the GCD calculation unit, and are used to control the second multiplexer to select 0, 1 or the upper The third iterative variable, the fourth iterative variable, the fifth iterative variable and the sixth iterative variable output by one iteration are used as the input of each iteration calculation of the Bezou coefficient calculation unit;

控制模块，所述控制模块内置k值，用于接收每次迭代完成的第一迭代变量、第二迭代变量和第七迭代变量，以及，根据内置的k值、第一迭代变量、第二迭代变量和第七迭代变量，输出控制信号以控制所述GCD计算单元和所述贝祖系数计算单元的计算结果；A control module, the built-in k value of the control module is used to receive the first iteration variable, the second iteration variable and the seventh iteration variable completed by each iteration, and, according to the built-in k value, the first iteration variable, the second iteration variable and the seventh iteration variable, outputting a control signal to control the calculation results of the GCD calculation unit and the Bezou coefficient calculation unit;

终止模块，用于接收每次迭代完成的第一迭代变量和第二迭代变量，以及，如果当前次迭代的第一迭代变量或第二迭代变量等于0，则终止所有变量的迭代更新；当第一迭代变量为0时确定(G，X，Y)分别为当前次迭代次的第二迭代变量、第五迭代变量和第六迭代变量；当第二迭代变量为0时确定(G，X，Y)分别为当前次迭代次的第一迭代变量、第三迭代变量和第四迭代变量。The termination module is used to receive the first iteration variable and the second iteration variable completed by each iteration, and if the first iteration variable or the second iteration variable of the current iteration is equal to 0, then terminate the iterative update of all variables; when the first When the first iteration variable is 0, it is determined that (G, X, Y) are respectively the second iteration variable, the fifth iteration variable and the sixth iteration variable of the current iteration; when the second iteration variable is 0, it is determined (G, X, Y) are respectively the first iteration variable, the third iteration variable and the fourth iteration variable of the current iteration.

可选地，当所述k值为8时，所述GCD计算单元包括第一移位器、第二移位器、第三移位器、第四移位器、第五移位器、第六移位器、第七移位器、第八移位器、第一GCD选择器和第二GCD选择器，所述第一GCD选择器的输入端分别与第一移位器、第二移位器、第三移位器、第四移位器、第五移位器和第一迭代变量连接，所述第二GCD选择器的输入端分别与第四移位器、第五移位器、第六移位器、第七移位器、第八移位器和第二迭代变量连接；所述第一GCD选择器和所述第二GCD选择器连接所述控制模块的控制信号，用于根据控制信号选择对应的输出；Optionally, when the k value is 8, the GCD calculation unit includes a first shifter, a second shifter, a third shifter, a fourth shifter, a fifth shifter, a Six shifters, the seventh shifter, the eighth shifter, the first GCD selector and the second GCD selector, the input terminals of the first GCD selector are respectively connected to the first shifter, the second shifter The shifter, the third shifter, the fourth shifter, the fifth shifter and the first iterative variable are connected, and the input terminals of the second GCD selector are respectively connected with the fourth shifter and the fifth shifter , the sixth shifter, the seventh shifter, the eighth shifter and the second iterative variable connection; the first GCD selector and the second GCD selector are connected to the control signal of the control module, using To select the corresponding output according to the control signal;

所述第一移位器输入第一迭代变量，用于完成对第一迭代变量的右移一位操作；The first shifter inputs the first iteration variable, and is used to complete the right shift operation of the first iteration variable by one bit;

所述第二移位器输入第一迭代变量，用于完成对第一迭代变量的右移两位操作；The second shifter inputs the first iteration variable, and is used to complete the right shift operation of the first iteration variable by two bits;

所述第三移位器输入第一迭代变量，用于完成对第一迭代变量的右移三位操作；The third shifter inputs the first iteration variable, and is used to complete the three-bit right shift operation on the first iteration variable;

所述第四移位器输入第一迭代变量与第二迭代变量的和或差，用于完成对第一迭代变量与第二迭代变量的和或差的右移三位操作；The fourth shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a three-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable;

所述第五移位器输入第一迭代变量与第二迭代变量的和或差，用于完成对第一迭代变量与第二迭代变量的和或差的右移两位操作；The fifth shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a two-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable;

所述第六移位器输入第二迭代变量，用于完成对第二迭代变量的右移一位操作；The sixth shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by one bit;

所述第七移位器输入第二迭代变量，用于完成对第二迭代变量的右移两位操作；The seventh shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by two bits;

所述第八移位器输入第二迭代变量，用于完成对第二迭代变量的右移三位操作。The eighth shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by three bits.

可选地，当所述k值为4时，所述GCD计算单元包括第一移位器、第二移位器、第三移位器、第四移位器、第五移位器、第一GCD选择器和第二GCD选择器，所述第一GCD选择器的输入端分别与第一移位器、第二移位器、第三移位器和第一迭代变量连接，所述第二GCD选择器的输入端分别与第三移位器、第四移位器、第五移位器和第二迭代变量连接；所述第一GCD选择器和所述第二GCD选择器连接控制模块的控制信号，用于根据控制信号选择对应的输出；Optionally, when the k value is 4, the GCD calculation unit includes a first shifter, a second shifter, a third shifter, a fourth shifter, a fifth shifter, a A GCD selector and a second GCD selector, the input terminals of the first GCD selector are respectively connected with the first shifter, the second shifter, the third shifter and the first iteration variable, and the first The input ends of the two GCD selectors are respectively connected with the third shifter, the fourth shifter, the fifth shifter and the second iterative variable; the first GCD selector and the second GCD selector are connected to control The control signal of the module is used to select the corresponding output according to the control signal;

所述第三移位器输入第一迭代变量与第二迭代变量的和或差，用于完成对第一迭代变量与第二迭代变量的和或差的右移两位操作；The third shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a two-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable;

所述第四移位器输入第二迭代变量，用于完成对第二迭代变量的右移一位操作；The fourth shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by one bit;

所述第五移位器输入第二迭代变量，用于完成对第二迭代变量的右移两位操作。The fifth shifter inputs the second iterative variable, and is used to complete the right shift operation of the second iterative variable by two bits.

可选地，当所述k值为2时，所述GCD计算单元包括第一移位器、第二移位器、第三移位器、第一GCD选择器和第二GCD选择器，所述第一GCD选择器的输入端分别与第一移位器、第二移位器和第一迭代变量连接，所述第二GCD选择器的输入端分别与第二移位器、第三移位器和第二迭代变量连接；所述第一GCD选择器和所述第二GCD选择器连接控制模块的控制信号，用于根据控制信号选择对应的输出；Optionally, when the k value is 2, the GCD calculation unit includes a first shifter, a second shifter, a third shifter, a first GCD selector and a second GCD selector, so The input end of the first GCD selector is respectively connected with the first shifter, the second shifter and the first iteration variable, and the input end of the second GCD selector is respectively connected with the second shifter, the third shifter The positioner is connected to the second iteration variable; the first GCD selector and the second GCD selector are connected to the control signal of the control module, for selecting the corresponding output according to the control signal;

所述第二移位器输入第一迭代变量与第二迭代变量的和或差，用于完成对第一迭代变量与第二迭代变量的和或差的右移一位操作；The second shifter inputs the sum or difference of the first iteration variable and the second iteration variable, and is used to complete the right shift operation of the sum or difference of the first iteration variable and the second iteration variable;

所述第三移位器输入第二迭代变量，用于完成对第二迭代变量的右移一位操作。The third shifter inputs the second iterative variable, and is used to complete the right shift operation of the second iterative variable by one bit.

可选地，所述第一迭代变量、所述第二迭代变量、所述第三迭代变量、所述第四迭代变量、所述第五迭代变量和所述第六迭代变量使用冗余表示。Optionally, the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable, and the sixth iteration variable use redundant representations.

由以上技术方案可知，本申请提供的基于改进k-ary算法的获取大数拓展最大公约数的方法，以及其冗余表示的XGCD硬件实现方式，与现有技术相比：From the above technical solutions, it can be seen that the method for obtaining the extended greatest common divisor of large numbers based on the improved k-ary algorithm provided by this application, and the XGCD hardware implementation of its redundant representation, compared with the prior art:

(1)δ参数的引入，避免了比较a，b的大小，无论是对于冗余形式还是非冗余形式，都是十分有效的。(1) The introduction of the δ parameter avoids comparing the size of a and b, which is very effective for both redundant and non-redundant forms.

(2)在计算最大公约数的同时引入贝祖系数算法，计算贝祖系数只需要简单的加减法以及移位操作，在每个迭代中都保证xA+yB＝a，zA+wB＝b的成立。(2) The Bézout coefficient algorithm is introduced while calculating the greatest common divisor. The calculation of the Bézout coefficient only requires simple addition, subtraction and shift operations, and xA+yB=a and zA+wB=b are guaranteed in each iteration established.

(3)冗余形式的引入，大幅度降低加法运算所需的时间，从而达到提高时钟频率的目的。(3) The introduction of the redundant form greatly reduces the time required for the addition operation, thereby achieving the purpose of increasing the clock frequency.

综上，本申请实施例提供的方案通过引入中间变量参数δ，避免在每次迭代之中比较两个大位宽数据a，b，同时引入贝祖系数的计算和冗余形式，避免加法中进位传播造成的延迟。本申请实施例的方法和硬件架构可以减少迭代周期，缩短总运行时间，使得关键路径大幅度缩小，有非常显著的速度提升效果。In summary, the solution provided by the embodiment of the present application avoids comparing two large-bit-width data a and b in each iteration by introducing the intermediate variable parameter δ, and at the same time introduces the calculation and redundant form of the Bezou coefficient to avoid adding Delay caused by carry propagation. The method and hardware architecture of the embodiment of the present application can reduce the iteration cycle, shorten the total running time, greatly reduce the critical path, and have a very significant speed improvement effect.

附图说明Description of drawings

图1为本申请实施例提供的获取大数拓展最大公约数方法的流程示意图；Fig. 1 is a schematic flow chart of the method for obtaining the extended greatest common divisor of large numbers provided by the embodiment of the present application;

图2为本申请实施例提供的获取大数拓展最大公约数的硬件架构的结构示意图；FIG. 2 is a schematic structural diagram of a hardware architecture for obtaining the extended greatest common divisor of large numbers provided by the embodiment of the present application;

图3为本申请实施例提供的GCD计算单元的部分结构示意图；Fig. 3 is a partial structural schematic diagram of the GCD calculation unit provided by the embodiment of the present application;

图4为本申请实施例提供的贝祖系数计算单元的部分结构示意图。FIG. 4 is a schematic diagram of a partial structure of a Bezou coefficient calculation unit provided in an embodiment of the present application.

具体实施方式Detailed ways

参见图1，本申请实施例基于k-ary算法，在现有技术的k-ary算法上进行改进，降低复杂度且适应于硬件实现，以下将结合图1，详细阐述本申请实施例提供的XGCD计算流程的原理。Referring to Fig. 1, the embodiment of the present application is based on the k-ary algorithm, which is improved on the k-ary algorithm of the prior art, reduces the complexity and is suitable for hardware implementation. The following will describe in detail the implementation provided by the embodiment of the present application in conjunction with Fig. 1 The principle of XGCD calculation process.

为方便后续的说明，首先定义符号如下：To facilitate the subsequent description, first define the symbols as follows:

A和B代表初始输入，A为第一操作数，B为第二操作数，且A和B均为奇数，在整个计算过程中，A和B是定值。A and B represent the initial input, A is the first operand, B is the second operand, and both A and B are odd numbers. During the entire calculation process, A and B are fixed values.

最终的输出为G，X和Y，满足XA+YB＝GCD(A，B)＝G。The final output is G, X and Y, satisfying XA+YB=GCD(A, B)=G.

x，y，z，w，a，b为迭代的中间变量，分别满足xA+yB＝a，zA+wB＝b，初始值为x＝1，y＝0，z＝0，w＝1，a＝A和b＝B。x, y, z, w, a, b are the intermediate variables of the iteration, respectively satisfying xA+yB=a, zA+wB=b, the initial value is x=1, y=0, z=0, w=1, a=A and b=B.

δ为引入的参数，初始值为0。δ is an introduced parameter with an initial value of 0.

k为预设定值，k值为2的幂次，以使在计算过程中除以k操作能够适用移位来替代，省去除法运算，大幅度减少硬件复杂度和关键路径长度。k越大，每次迭代后a，b减少的位数越多，所需周期更少，但是较大的k会导致硬件面积增大，使线延迟增大。k is a preset value, and the k value is a power of 2, so that the operation of dividing by k in the calculation process can be replaced by a shift, which saves the division operation and greatly reduces the hardware complexity and the length of the critical path. The larger k is, the more digits of a and b are reduced after each iteration, and the required cycle is less, but a larger k will lead to an increase in hardware area and increase the line delay.

本申请实施例提供的XGCD计算流程的原理如下：The principle of the XGCD calculation process provided in the embodiment of this application is as follows:

在初始化迭代的中间变量之后，首先判断a和b的奇偶性，如果a是偶数，则获取数据u，其中u为2的幂次且≤k，且u为能被a整除的2的幂次的数据集合中最大的数，之后使用

来更新替代(a，x，y)，并将参数δ减去log₂u以更新替代原来的δ；如果b是偶数，则获取数据v，其中v为2的幂次且≤k，且v为能被a整除的2的幂次的数据集合中最大的数，之后使用

来更新替代(b，z，w)，并将参数δ加上log₂u以更新替代原来的δ，需要说明的是a和b不可能同时为偶数；如果a和b均为奇数，则获取一组(p，q)，其中

并且满足pa+qb＝0(mod k)，再计算c＝(pa+q)/k，α＝(px+qz)/k，β＝(py+qw)/k，如果参数δ小于0，则用(c，α，β)更新替代(b，z，w)，并且参数δ加上log₂k/q-1以更新替代原来的δ，如果参数δ大于或等于0，则用(c，α，β)替代(a，x，y)，并且参数δ减去log₂k/q-1以更新替代原来的δ。循环执行上述的判断a和b的奇偶性以及对应的更新变量的步骤，直至a或b为0，此时得到最终输出，如果a为0，则(G，X，Y)＝(b，z，w)，如果b为0，则(G，X，Y)＝(a，x，y)。After initializing the intermediate variables of the iteration, first judge the parity of a and b, if a is even, then obtain the data u, where u is a power of 2 and ≤ k, and u is a power of 2 that can be divisible by a The largest number in the data set, then use

To update and replace (a, x, y), and subtract log ₂ u from parameter δ to update and replace the original δ; if b is even, then obtain data v, where v is a power of 2 and ≤ k, and v It is the largest number in the power of 2 data set that can be divisible by a, and then use

To update and replace (b, z, w), and add log ₂ u to the parameter δ to update and replace the original δ. It should be noted that a and b cannot be even numbers at the same time; if a and b are both odd numbers, then get A set of (p, q), where

And satisfy pa+qb=0(mod k), then calculate c=(pa+q)/k, α=(px+qz)/k, β=(py+qw)/k, if parameter δ is less than 0, Then use (c, α, β) to replace (b, z, w), and add log ₂ k/q-1 to the parameter δ to update and replace the original δ. If the parameter δ is greater than or equal to 0, use (c , α, β) to replace (a, x, y), and the parameter δ is subtracted from log ₂ k/q-1 to update and replace the original δ. Perform the above steps of judging the parity of a and b and the corresponding updating variables in a loop until a or b is 0, and the final output is obtained at this time. If a is 0, then (G, X, Y)=(b, z , w), if b is 0, then (G, X, Y)=(a, x, y).

本申请实施例通过引入参数δ，避免在每次迭代循环中比较两个大位宽数据a，b，以减少延迟以及后续在硬件实现时的硬件资源消耗，且在计算k-ary中同时计算贝祖系数。此外，在一部分优选实施例中，引入冗余形式，除去参数δ外，其余的变量均使用冗余表示。由于操作数都是大位宽数据，加法的进位传播延迟往往比较大，为了进一步降低关键路径，避免加法中进位传播造成的延迟，引入冗余形式在迭代的大数加法运算中非常有效。基于冗余形式改进的k-ary算法只需要进行加减运算，在迭代次数很多时，使用冗余形式带来的速度的提升十分可观。In the embodiment of the present application, by introducing the parameter δ, it is avoided to compare two large-bit-width data a, b in each iterative cycle, so as to reduce delay and subsequent hardware resource consumption in hardware implementation, and calculate k-ary at the same time Bezu coefficient. In addition, in some preferred embodiments, a redundant form is introduced, except for the parameter δ, the rest of the variables are represented by redundancy. Since the operands are all large-bit-width data, the carry propagation delay of addition is often relatively large. In order to further reduce the critical path and avoid the delay caused by carry propagation in addition, the introduction of redundant forms is very effective in iterative large number addition operations. The improved k-ary algorithm based on the redundant form only needs to perform addition and subtraction operations. When the number of iterations is large, the speed improvement brought about by using the redundant form is very considerable.

本申请实施例使用C++对上述的XGCD计算流程进行软件建模以用于测试，当k＝8时，对于实际值为1024比特的输入，平均一次XGCD计算需要908次迭代，比起Two-bit PM所需的1195次迭代，减少了约25％。如果k更小，或者是输入更大的话，迭代周期会增加。The embodiment of the present application uses C++ to carry out software modeling on the above-mentioned XGCD calculation process for testing. When k=8, for an input with an actual value of 1024 bits, an average XGCD calculation requires 908 iterations, compared to Two-bit 1195 iterations required for PM, a reduction of about 25%. If k is smaller, or if the input is larger, the iteration period will increase.

基于上述XGCD计算流程的原理，本申请实施例提供一种在公私钥对场景中获取私钥指数的方法，该方法包括步骤S1至步骤S6。Based on the principle of the above-mentioned XGCD calculation process, this embodiment of the present application provides a method for obtaining a private key index in a public-private key pair scenario, and the method includes steps S1 to S6.

S1、获取公私钥对生成过程中的第一参数和公钥指数。S1. Obtain the first parameter and the public key index in the process of generating the public-private key pair.

在公私钥对的生成过程中，公钥由模数和公钥指数组成，私钥由模数和私钥指数组成，首先选取两个大质数p、q得到模数，利用p和q的互质数目的乘积得到第一参数，根据第一参数选取公钥指数，其中，私钥指数为公钥指数关于第一参数的模反元素，因此，可以使用公钥指数和第一参数作为XGCD的输入，输出的对应的贝祖系数即为私钥指数。In the process of generating the public-private key pair, the public key is composed of the modulus and the public key exponent, and the private key is composed of the modulus and the private key exponent. First, two large prime numbers p and q are selected to obtain the modulus. The product of the prime number gets the first parameter, and the public key index is selected according to the first parameter, wherein the private key index is the inverse element of the public key index with respect to the first parameter, therefore, the public key index and the first parameter can be used as the input of XGCD , the corresponding Bezou coefficient output is the private key index.

S2、获取所述方法中预设的k值，所述k为2的幂次。S2. Obtain a value of k preset in the method, where k is a power of 2.

S3、生成第一迭代变量、第二迭代变量、第三迭代变量、第四迭代变量、第五迭代变量、第六迭代变量和第七迭代变量，其中，所述第七迭代变量的初始值为0；所述第一迭代变量、所述第二迭代变量、所述第三迭代变量、所述第四迭代变量、所述第五迭代变量和所述第六迭代变量满足xA+yB＝a和zA+wB＝b，其中，A为第一参数、B为公钥指数、a为第一迭代变量、b为第二迭代变量、x为第三迭代变量、y为第四迭代变量、z为第五迭代变量、w为第六迭代变量。S3. Generate a first iteration variable, a second iteration variable, a third iteration variable, a fourth iteration variable, a fifth iteration variable, a sixth iteration variable, and a seventh iteration variable, wherein the initial value of the seventh iteration variable is 0; the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable and the sixth iteration variable satisfy xA+yB=a and zA+wB=b, where A is the first parameter, B is the public key index, a is the first iteration variable, b is the second iteration variable, x is the third iteration variable, y is the fourth iteration variable, z is The fifth iteration variable, w is the sixth iteration variable.

根据第一参数和公钥指数，初始化后续需要迭代的中间变量，包括第一迭代变量、第二迭代变量、第三迭代变量、第四迭代变量、第五迭代变量、第六迭代变量和第七迭代变量。初始化所述第一迭代变量、所述第二迭代变量、所述第三迭代变量、所述第四迭代变量、所述第五迭代变量和所述第六迭代变量，其中，x＝1，y＝0，z＝0，w＝1，a＝A和b＝B。According to the first parameter and the public key index, initialize the intermediate variables that need to be iterated, including the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable, the sixth iteration variable and the seventh iteration variable iteration variable. Initialize the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable and the sixth iteration variable, where x=1, y =0, z=0, w=1, a=A and b=B.

S4、判断第一迭代变量或第二迭代变量是否为偶数。S4. Determine whether the first iteration variable or the second iteration variable is an even number.

需要说明的是第一迭代变量和第二迭代变量不可能同为偶数。It should be noted that the first iteration variable and the second iteration variable cannot both be even numbers.

S5、如果第一迭代变量为偶数，则选取第一数据，以及，使用第一迭代变量除以第一数据的值来更新第一迭代变量，使用第三迭代变量除以第一数据的值来更新第三迭代变量，使用第四迭代变量除以第一数据的值来更新第四迭代变量，使用第七迭代变量减去log₂u来更新第七迭代变量，其中，u为第一数据，所述第一数据为小于或等于k且能被第一迭代变量整除的2的幂次的数据集合中最大的数。S5. If the first iteration variable is an even number, select the first data, and use the value of the first iteration variable to divide the first data to update the first iteration variable, and use the third iteration variable to divide the value of the first data to update Update the third iterative variable, use the fourth iterative variable divided by the value of the first data to update the fourth iterative variable, use the seventh iterative variable minus log ₂ u to update the seventh iterative variable, where u is the first data, The first data is the largest number in a power of 2 data set that is less than or equal to k and can be divisible by the first iteration variable.

S6、如果更新后的第一迭代变量为0，则第一参数和公钥指数的最大公约数为第二迭代变量，第一参数对应的贝祖系数为第五迭代变量，公钥指数对应的贝祖系数为第六迭代变量。S6. If the updated first iteration variable is 0, the greatest common divisor of the first parameter and the public key index is the second iteration variable, the Bezou coefficient corresponding to the first parameter is the fifth iteration variable, and the public key index corresponds to Bézout coefficient is the sixth iteration variable.

进一步地，如果更新后的第一迭代变量为非0，则重新判断第一迭代变量或第二迭代变量是否为偶数。Further, if the updated first iteration variable is non-zero, it is re-judged whether the first iteration variable or the second iteration variable is an even number.

进一步地，在判断第一迭代变量或第二迭代变量是否为偶数时，如果第二迭代变量为偶数，则选取第二数据，以及，使用第二迭代变量除以第二数据的值来更新第二迭代变量，使用第五迭代变量除以第二数据的值来更新第五迭代变量，使用第六迭代变量除以第二数据的值来更新第六迭代变量，使用第七迭代变量加上log₂v来更新第七迭代变量，其中，v为第二数据，所述第二数据为小于或等于k且能被第二迭代变量整除的2的幂次的数据集合中最大的数。Further, when judging whether the first iteration variable or the second iteration variable is an even number, if the second iteration variable is an even number, then select the second data, and use the value of dividing the second iteration variable by the second data to update the first Two iteration variables, use the fifth iteration variable divided by the value of the second data to update the fifth iteration variable, use the sixth iteration variable to divide the value of the second data to update the sixth iteration variable, use the seventh iteration variable plus log ₂ v to update the seventh iteration variable, wherein, v is the second data, and the second data is the largest number in the power of 2 data set that is less than or equal to k and divisible by the second iteration variable.

进一步地，如果更新后的第二迭代变量为0，则第一参数和公钥指数的最大公约数为第一迭代变量，第一参数对应的贝祖系数为第三迭代变量，公钥指数对应的贝祖系数为第四迭代变量。Further, if the updated second iterative variable is 0, then the greatest common divisor of the first parameter and the public key exponent is the first iterative variable, the Bezou coefficient corresponding to the first parameter is the third iterative variable, and the public key exponent corresponds to The Bézout coefficient of is the fourth iteration variable.

进一步地，如果更新后的第二迭代变量为非0，则重新判断第一迭代变量或第二迭代变量是否为偶数。Further, if the updated second iteration variable is non-zero, it is re-judged whether the first iteration variable or the second iteration variable is an even number.

进一步地，在判断第一迭代变量或第二迭代变量是否为偶数时，如果第一迭代变量和第二迭代变量均为奇数，则选取第三数据和第四数据，并计算c＝(pa+qb)/k，α＝(px+qz)/k，β＝(py+qw)/k，其中，p为第三数据，q为第四数据，第三数据和第四数据满足

且pa+qb＝0(mod k)，以及，根据第三数据和第四数据更新第一迭代变量、第二迭代变量、第三迭代变量、第四迭代变量、第五迭代变量、第六迭代变量和第七迭代变量，其中：如果第七迭代变量为负数，则使用(c，α，β)更新(b，z，w)，且使用第七迭代变量加上log₂k/q-1来更新第七迭代变量；如果第七迭代变量为非负数，则使用(c，α，β)更新(a，x，y)，且使用第七迭代变量减去log₂k/q-1来更新第七迭代变量。如果更新后的第一迭代变量或第二迭代变量为0，则终止迭代采用前述方法对应输出，如果更新后的第一迭代变量或第二迭代变量都不为0，则继续重新判断第一迭代变量或第二迭代变量是否为偶数从而对中间变量迭代更新。Further, when judging whether the first iteration variable or the second iteration variable is an even number, if the first iteration variable and the second iteration variable are all odd numbers, then select the third data and the fourth data, and calculate c=(pa+ qb)/k, α=(px+qz)/k, β=(py+qw)/k, wherein, p is the third data, q is the fourth data, the third data and the fourth data satisfy

And pa+qb=0(mod k), and update the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable, the sixth iteration variable according to the third data and the fourth data variable and the seventh iteration variable, where: if the seventh iteration variable is negative, use (c, α, β) to update (b, z, w), and use the seventh iteration variable plus log ₂ k/q-1 to update the seventh iteration variable; if the seventh iteration variable is non-negative, use (c, α, β) to update (a, x, y), and use the seventh iteration variable to subtract log ₂ k/q-1 to Update the seventh iteration variable. If the updated first iteration variable or the second iteration variable is 0, terminate the iteration and use the above method to output correspondingly, if the updated first iteration variable or the second iteration variable is not 0, then continue to re-judge the first iteration Whether the variable or the second iterative variable is an even number to iteratively update the intermediate variable.

进一步地，所述第一迭代变量、所述第二迭代变量、所述第三迭代变量、所述第四迭代变量、所述第五迭代变量和所述第六迭代变量均使用冗余表示。Further, the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable and the sixth iteration variable all use redundant representations.

参见图2，针对本申请实施例前述提供的XGCD计算流程，本申请实施例还提供一种获取大数拓展最大公约数的硬件架构，包括控制模块(control)、GCD计算单元、贝祖系数计算单元、第一多路选择器、第二多路选择器、终止模块(termination)和input_valid信号。Referring to Figure 2, for the XGCD calculation process provided above in the embodiment of the present application, the embodiment of the present application also provides a hardware architecture for obtaining the extended greatest common divisor of large numbers, including a control module (control), a GCD calculation unit, and a Bézout coefficient calculation unit, first multiplexer, second multiplexer, termination and input_valid signal.

所述GCD计算单元用于更新第一迭代变量a、第二迭代变量b和第七迭代变量δ。GCD计算单元的输入端分别与所述第一多路选择器的输出端和所述控制模块的输出端连接，所述GCD计算单元的输出端分别与所述控制模块的输入端、所述终止模块的输入端和所述第一多路选择器的输入端连接，所述GCD计算单元用于将输入的第一操作数A和第二操作数B分别作为第一迭代变量和第二迭代变量的初始值，以及，根据所述控制模块的控制信号，更新每次输入的第一迭代变量a_i或第二迭代变量b_i，其中i为当前迭代次数，同时更新初始值为0的第七迭代变量δ，输出当前次迭代完成的第一迭代变量a_i+1、第二迭代变量b_i+1和第七迭代变量，作为下一次迭代的输入。The GCD calculation unit is used for updating the first iteration variable a, the second iteration variable b and the seventh iteration variable δ. The input end of the GCD calculation unit is respectively connected with the output end of the first multiplexer and the output end of the control module, and the output end of the GCD calculation unit is respectively connected with the input end of the control module, the termination The input end of the module is connected to the input end of the first multiplexer, and the GCD calculation unit is used to use the input first operand A and the second operand B as the first iteration variable and the second iteration variable respectively The initial value of , and, according to the control signal of the control module, update the first iteration variable a _i or the second iteration variable b _i input each time, where i is the current number of iterations, and update the seventh with an initial value of 0 at the same time The iteration variable δ outputs the first iteration variable a _i+1 , the second iteration variable b _i+1 and the seventh iteration variable completed in the current iteration as the input of the next iteration.

所述贝祖系数计算单元用于更新第三迭代变量x、第四迭代变量y、第五迭代变量z、第六迭代变量w，所述贝祖系数计算单元的输入端分别与所述第二多路选择器的输出端和所述控制模块的输出端连接，所述贝祖系数计算单元的输出端与所述第二多路选择器的输入端连接，所述贝祖系数计算单元用于将输入的1、0、0、1分别作为第三迭代变量、第四迭代变量、第五迭代变量和第六迭代变量的初始值，以及，根据所述控制模块的控制信号，更新每次输入的第三迭代变量、第四迭代变量、第五迭代变量或第六迭代变量，输出当前次迭代完成的第三迭代变量、第四迭代变量、第五迭代变量和第六迭代变量，其中，所述GCD计算单元和所述贝祖系数计算单元接收的控制模块的控制信号为同步信号，通过循环迭代计算GCD以及贝祖系数，是同步得到的。The Bezou coefficient calculation unit is used to update the third iterative variable x, the fourth iterative variable y, the fifth iterative variable z, and the sixth iterative variable w, and the input terminals of the Bezou coefficient calculation unit are respectively connected to the second The output end of the multiplexer is connected to the output end of the control module, the output end of the Bezou coefficient calculation unit is connected to the input end of the second multiplexer, and the Bezou coefficient calculation unit is used for Using the input 1, 0, 0, and 1 as the initial values of the third iteration variable, the fourth iteration variable, the fifth iteration variable, and the sixth iteration variable, and, according to the control signal of the control module, updating each input The third iteration variable, the fourth iteration variable, the fifth iteration variable or the sixth iteration variable, output the third iteration variable, the fourth iteration variable, the fifth iteration variable and the sixth iteration variable completed in the current iteration, where the The control signal of the control module received by the GCD calculation unit and the Bézout coefficient calculation unit is a synchronous signal, and the GCD and the Bézout coefficient are calculated synchronously through cyclic iterations.

input_valid信号，用于向所述第一多路选择器和所述第二多路选择器传输信号，其中，所述input_valid信号用于控制所述第一多路选择器选择第一操作数、第二操作数或上一次迭代输出的第一迭代变量、第二迭代变量作为所述GCD计算单元的每次迭代计算的输入，以及用于控制所述第二多路选择器选择0、1或上一次迭代输出的第三迭代变量、第四迭代变量、第五迭代变量和第六迭代变量作为所述贝祖系数计算单元的每次迭代计算的输入。The input_valid signal is used to transmit signals to the first multiplexer and the second multiplexer, wherein the input_valid signal is used to control the first multiplexer to select the first operand, the second Two operands or the first iteration variable output by the last iteration, the second iteration variable are used as the input of each iteration calculation of the GCD calculation unit, and are used to control the second multiplexer to select 0, 1 or the upper The third iterative variable, the fourth iterative variable, the fifth iterative variable and the sixth iterative variable output by one iteration are used as the input of each iteration calculation of the Bezou coefficient calculation unit.

例如，如果input_valid信号为1，代表一组新的数据输入，此时GCD计算单元与贝祖系数计算单元会被初始化，第一多路选择器会选择将新输入的A，B作为GCD计算单元的输入，第二多路选择器会选择将0，1作为贝祖系数计算单元的输入，从而对中间变量进行初始化。如果input_valid信号为0，则代表没有新的数据，则继续迭代计算原来的GCD和贝祖系数，会选取上一次迭代的输出作为当前次迭代的输入。For example, if the input_valid signal is 1, it represents a new set of data input. At this time, the GCD calculation unit and the Bezou coefficient calculation unit will be initialized, and the first multiplexer will select the newly input A and B as the GCD calculation unit. , the second multiplexer will select 0, 1 as the input of the Bezou coefficient calculation unit, so as to initialize the intermediate variable. If the input_valid signal is 0, it means that there is no new data, then continue to iteratively calculate the original GCD and Bezou coefficients, and select the output of the previous iteration as the input of the current iteration.

所述控制模块内置k值，用于接收每次迭代完成的第一迭代变量、第二迭代变量和第七迭代变量，以及，根据内置的k值、第一迭代变量、第二迭代变量和第七迭代变量，输出控制信号以决定所述GCD计算单元和所述贝祖系数计算单元每次迭代的输出结果。The built-in k value of the control module is used to receive the first iteration variable, the second iteration variable and the seventh iteration variable completed by each iteration, and, according to the built-in k value, the first iteration variable, the second iteration variable and the seventh iteration variable Seven iteration variables, outputting control signals to determine the output results of each iteration of the GCD calculation unit and the Bezou coefficient calculation unit.

终止模块，用于接收每次迭代完成的第一迭代变量和第二迭代变量，以及，如果当前次迭代的第一迭代变量或第二迭代变量等于0，则终止所有变量的迭代更新，并输出output_valid信号以表示当第一迭代变量为0时确定(G，X，Y)分别为当前次迭代次的第二迭代变量、第五迭代变量和第六迭代变量；当第二迭代变量为0时确定(G，X，Y)分别为当前次迭代次的第一迭代变量、第三迭代变量和第四迭代变量。The termination module is used to receive the first iteration variable and the second iteration variable completed by each iteration, and if the first iteration variable or the second iteration variable of the current iteration is equal to 0, then terminate the iterative update of all variables and output The output_valid signal indicates that when the first iteration variable is 0, it is determined that (G, X, Y) are respectively the second iteration variable, the fifth iteration variable, and the sixth iteration variable of the current iteration; when the second iteration variable is 0 Determine (G, X, Y) as the first iteration variable, the third iteration variable and the fourth iteration variable of the current iteration, respectively.

参见图3，示例性地，在k＝8时，GCD计算单元包括第一移位器、第二移位器、第三移位器、第四移位器、第五移位器、第六移位器、第七移位器、第八移位器、第一GCD选择器和第二GCD选择器，所述第一GCD选择器的输入端分别与第一移位器、第二移位器、第三移位器、第四移位器、第五移位器和第一迭代变量连接，所述第二GCD选择器的输入端分别与第四移位器、第五移位器、第六移位器、第七移位器、第八移位器和第二迭代变量连接；所述第一GCD选择器和所述第二GCD选择器连接所述控制模块的控制信号，用于根据控制信号选择对应的输出。Referring to FIG. 3 , for example, when k=8, the GCD calculation unit includes a first shifter, a second shifter, a third shifter, a fourth shifter, a fifth shifter, a sixth a shifter, a seventh shifter, an eighth shifter, a first GCD selector and a second GCD selector, the input terminals of the first GCD selector are respectively connected to the first shifter and the second shifter Device, the 3rd shifter, the 4th shifter, the 5th shifter are connected with the first iterative variable, the input end of the second GCD selector is respectively connected with the 4th shifter, the 5th shifter, The sixth shifter, the seventh shifter, the eighth shifter and the second iteration variable are connected; the first GCD selector and the second GCD selector are connected to the control signal of the control module for Select the corresponding output according to the control signal.

所述第一移位器输入第一迭代变量a，用于完成对第一迭代变量a的右移一位操作。The first shifter inputs the first iterative variable a, and is used to complete the right shift operation of the first iterative variable a by one bit.

所述第二移位器输入第一迭代变量a，用于完成对第一迭代变量a的右移两位操作。The second shifter inputs the first iterative variable a, and is used to complete the right shift operation of the first iterative variable a by two bits.

所述第三移位器输入第一迭代变量a，用于完成对第一迭代变量a的右移三位操作。The third shifter is input to the first iterative variable a, and is used to perform a three-bit right shift operation on the first iterative variable a.

所述第四移位器输入第一迭代变量a与第二迭代变量b的和或差，用于完成对第一迭代变量a与第二迭代变量b的和或差的右移三位操作。The fourth shifter inputs the sum or difference of the first iterative variable a and the second iterative variable b, and is used to perform a three-bit right shift operation on the sum or difference of the first iterative variable a and the second iterative variable b.

所述第五移位器输入第一迭代变量a与第二迭代变量b的和或差，用于完成对第一迭代变量a与第二迭代变量b的和或差的右移两位操作。The fifth shifter inputs the sum or difference of the first iterative variable a and the second iterative variable b, and is used to perform a two-bit right shift operation on the sum or difference of the first iterative variable a and the second iterative variable b.

在k＝8时，根据前述的XGCD计算流程的原理可知，当a是偶数时，获取数据u，其中u≤k，且u为能被a整除的2的幂次的数据集合中最大的数，因此u的值在迭代过程中可能为2、4、8，之后使用

来更新替代a，则在GCD计算单元的硬件架构中，设计第一移位器、第二移位器和第三移位器。同理，a和b均为奇数时，获取一组(p，q)可能为(1，1)、(1，-1)、(2，2)和(2，-2)，因此设计第四移位器和第五移位器。When k=8, according to the principle of the aforementioned XGCD calculation process, when a is an even number, the data u is obtained, where u≤k, and u is the largest number in the power of 2 data set divisible by a , so the value of u may be 2, 4, 8 during the iteration process, and then use

To update and replace a, then in the hardware architecture of the GCD computing unit, design the first shifter, the second shifter and the third shifter. Similarly, when a and b are both odd numbers, the obtained set of (p, q) may be (1, 1), (1, -1), (2, 2) and (2, -2), so the design of the first Quad shifter and fifth shifter.

第二迭代变量b的更新与此类似，并且可复用其中第四移位器和第五移位器的部分硬件，在GCD计算单元中设计第六移位器、第七移位器和第八移位器用以在第二迭代变量为偶数时更新第二迭代变量。The update of the second iteration variable b is similar to this, and part of the hardware of the fourth shifter and the fifth shifter can be reused, and the sixth shifter, the seventh shifter and the fifth shifter can be designed in the GCD calculation unit The eight shifters are used to update the second iteration variable when the second iteration variable is even.

所述第六移位器输入第二迭代变量b，用于完成对第二迭代变量的右移一位操作。The sixth shifter inputs the second iterative variable b, and is used to complete the right shift operation of the second iterative variable by one bit.

所述第七移位器输入第二迭代变量b，用于完成对第二迭代变量的右移两位操作。The seventh shifter inputs the second iterative variable b, and is used to complete the right shift operation of the second iterative variable by two bits.

所述第八移位器输入第二迭代变量b，用于完成对第二迭代变量的右移三位操作。The eighth shifter inputs the second iterative variable b, and is used to complete the three-bit right shift operation on the second iterative variable.

当控制模块接收每次迭代完成的第一迭代变量、第二迭代变量和第七迭代变量，以及根据内置的k值，控制GCD计算单元中的第一GCD选择器和第二GCD选择器选择对应的输出，用以更新第一迭代变量和第二迭代变量，例如在一种情况下，控制模块接收迭代完成的第一迭代变量为偶数，控制模块根据第一迭代变量值和k值选取u为4，则控制模块发出控制信号，该控制信号控制下一次迭代的GCD计算单元中的第一GCD选择器选择第二移位器作为第一迭代变量的更新值输出，控制第二GCD选择器选择原来输入的第二迭代变量值作为更新值输出。When the control module receives the first iteration variable, the second iteration variable and the seventh iteration variable completed in each iteration, and according to the built-in k value, it controls the first GCD selector and the second GCD selector in the GCD calculation unit to select the corresponding The output of is used to update the first iteration variable and the second iteration variable. For example, in one case, the first iteration variable received by the control module after iteration is even, and the control module selects u according to the first iteration variable value and k value as 4. The control module sends a control signal, which controls the first GCD selector in the GCD calculation unit of the next iteration to select the second shifter as the update value output of the first iteration variable, and controls the second GCD selector to select The originally input second iteration variable value is output as an updated value.

示例性地，k值为4时，所述GCD计算单元包括第一移位器、第二移位器、第三移位器、第四移位器、第五移位器、第一GCD选择器和第二GCD选择器，所述第一GCD选择器的输入端分别与第一移位器、第二移位器、第三移位器和第一迭代变量连接，所述第二GCD选择器的输入端分别与第三移位器、第四移位器、第五移位器和第二迭代变量连接；所述第一GCD选择器和所述第二GCD选择器连接控制模块的控制信号，用于根据控制信号选择对应的输出。Exemplarily, when the k value is 4, the GCD calculation unit includes a first shifter, a second shifter, a third shifter, a fourth shifter, a fifth shifter, a first GCD selection and a second GCD selector, the input terminals of the first GCD selector are respectively connected with the first shifter, the second shifter, the third shifter and the first iterative variable, and the second GCD selects The input end of device is respectively connected with the 3rd shifter, the 4th shifter, the 5th shifter and the second iterative variable; The control of described first GCD selector and described second GCD selector connection control module Signal, used to select the corresponding output according to the control signal.

所述第一移位器输入第一迭代变量，用于完成对第一迭代变量的右移一位操作。The first shifter inputs the first iteration variable, and is used to complete the right shift operation on the first iteration variable by one bit.

所述第二移位器输入第一迭代变量，用于完成对第一迭代变量的右移两位操作。The second shifter inputs the first iteration variable, and is used to complete the right shift operation of the first iteration variable by two bits.

所述第三移位器输入第一迭代变量与第二迭代变量的和或差，用于完成对第一迭代变量与第二迭代变量的和或差的右移两位操作。The third shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a two-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable.

所述第四移位器输入第二迭代变量，用于完成对第二迭代变量的右移一位操作。The fourth shifter inputs the second iterative variable, and is used to complete the right shift operation of the second iterative variable by one bit.

示例性地，k值为2时，所述GCD计算单元包括第一移位器、第二移位器、第三移位器、第一GCD选择器和第二GCD选择器，所述第一GCD选择器的输入端分别与第一移位器、第二移位器和第一迭代变量连接，所述第二GCD选择器的输入端分别与第二移位器、第三移位器和第二迭代变量连接；所述第一GCD选择器和所述第二GCD选择器连接控制模块的控制信号，用于根据控制信号选择对应的输出。Exemplarily, when the value of k is 2, the GCD calculation unit includes a first shifter, a second shifter, a third shifter, a first GCD selector and a second GCD selector, and the first The input end of the GCD selector is respectively connected with the first shifter, the second shifter and the first iteration variable, and the input end of the second GCD selector is respectively connected with the second shifter, the third shifter and The second iteration variable is connected; the first GCD selector and the second GCD selector are connected to the control signal of the control module, and are used to select the corresponding output according to the control signal.

所述第二移位器输入第一迭代变量与第二迭代变量的和或差，用于完成对第一迭代变量与第二迭代变量的和或差的右移一位操作。The second shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a one-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable.

进一步地，本申请实施例提供的获取大数拓展最大公约数的硬件架构可以引入冗余形式，其中，第一迭代变量、所述第二迭代变量、所述第三迭代变量、所述第四迭代变量、所述第五迭代变量和所述第六迭代变量均使用冗余表示。需要说明的是，本申请实施例上述移位器的移位代表的是实际值的移位，而在冗余形式下，如图3所示，冗余形式的移位为实际值移位的2倍，例如，在实际值情况下，k＝8时，所述第一移位器用于完成对第一迭代变量a的右移一位操作，所述第二移位器用于完成对第一迭代变量a的右移两位操作，所述第三移位器用于完成对第一迭代变量a的右移三位操作；那么在冗余形式下，k＝8时，所述第一移位器用于完成对第一迭代变量a的右移两位操作，所述第二移位器用于完成对第一迭代变量a的右移四位操作，所述第三移位器用于完成对第一迭代变量a的右移六位操作，以此类推。Further, the hardware architecture for obtaining the extended greatest common divisor of large numbers provided by the embodiment of the present application can introduce redundancy, wherein the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable The iteration variable, the fifth iteration variable and the sixth iteration variable all use redundant representations. It should be noted that the shift of the above-mentioned shifter in the embodiment of the present application represents the shift of the actual value, and in the redundant form, as shown in Figure 3, the shift of the redundant form is the shift of the actual value 2 times, for example, in the actual value case, when k=8, the first shifter is used to complete the right shift operation of the first iteration variable a, and the second shifter is used to complete the first The two-bit right shift operation of the iterative variable a, the third shifter is used to complete the three-bit right shift operation on the first iterative variable a; then in redundant form, when k=8, the first shift The second shifter is used to complete the right shift operation of the first iteration variable a by two bits, the second shifter is used to complete the right shift operation of the first iteration variable a by four bits, and the third shifter is used to complete the operation of the first iteration variable a Iterates the right shift of variable a by six bits, and so on.

由上面的流程可以看出，当k为2的幂次时，整个计算过程中只有简单的加减法以及移位运算，通过引入冗余形式可使加减法中没有进位传播，这能进一步减少关键路径，提高计算速度。冗余形式下，加法的关键路径不再与位宽有关，而是一个固定的值，例如Carry-Save形式下，加法的延迟约为两个1位全加器，这相比于1024位全加器速度提升数百倍，大幅度缩短了运算时间。使用冗余形式的缺点在于，计算结束之后需要进行一次冗余到非冗余的转换，但是无论中间进行了多少次加法，也都仅仅只需要一次转换。而相比于数百上千次迭代中使用冗余加法所带来的受益，计算结束之后进行的冗余到非冗余的转换所增加的延迟可以忽略不计。在本申请的优选实施例中，除了第七迭代变量δ使用非冗余表示，其余的变量都是使用冗余表示。It can be seen from the above process that when k is a power of 2, there are only simple addition and subtraction and shift operations in the entire calculation process. By introducing redundant forms, there is no carry propagation in addition and subtraction, which can further Reduce critical paths and increase calculation speed. In the redundant form, the key path of addition is no longer related to the bit width, but a fixed value. For example, in the Carry-Save form, the delay of addition is about two 1-bit full adders, which is compared to 1024-bit full adders. The speed of the adder has been increased hundreds of times, and the operation time has been greatly shortened. The disadvantage of using the redundant form is that a redundant-to-non-redundant conversion is required after the calculation, but no matter how many additions are performed in between, only one conversion is required. Compared with the benefit of using redundant addition in hundreds of thousands of iterations, the added delay of redundant to non-redundant conversion after the calculation is negligible. In a preferred embodiment of the present application, except for the seventh iteration variable δ, which uses non-redundant representation, all other variables use redundant representation.

参见图4，关于贝祖系数计算单元，用于对第三迭代变量x、第四迭代变量y、第五迭代变量z和第六迭代变量w迭代更新，对于贝祖系数的计算，整体上也和GCD计算单元的硬件架构类似，唯独移位操作需要单独处理。Referring to Fig. 4, the Bezou coefficient calculation unit is used to iteratively update the third iterative variable x, the fourth iterative variable y, the fifth iterative variable z, and the sixth iterative variable w. For the calculation of the Bezou coefficient, the overall Similar to the hardware architecture of the GCD calculation unit, only the shift operation needs to be processed separately.

与a，b直接移位截断不同，系数x，y，z，w不能直接通过移位来进行截断，原因在于无法保证在每次迭代中x，y，z，w都保持着和a，b一样的可除性，例如，1*A+1*B＝a中，如果a为偶数，那么可以直接进行移位操作。由于是冗余形式，除以2变为向右移动2位，但是却不能对直接将系数1也向右移动两位，因为1不能整除2。但是通过加减初始输入A，B能够有效解决这个问题。以k＝4时为例，x除以4可能会变换成以下四种情况中的一种：Unlike a, b directly shifting and truncating, the coefficients x, y, z, w cannot be truncated directly by shifting, because there is no guarantee that x, y, z, w will maintain the sum of a, b in each iteration The same divisibility, for example, in 1*A+1*B=a, if a is an even number, then the shift operation can be performed directly. Because it is a redundant form, dividing by 2 becomes moving 2 bits to the right, but it cannot directly move the coefficient 1 to the right by 2 bits, because 1 cannot divide 2 evenly. But by adding and subtracting the initial input A, B can effectively solve this problem. Taking k=4 as an example, dividing x by 4 may transform into one of the following four situations:

由于A，B为初始值，且在运算过程中不发生变化，因此类似于3B的值可以提前计算并保存。于是，硬件上可以同时计算这四个值，最后根据x和B的低位来选择正确的结果。对于k＝8，需要同时计算8个值，最后从中选出正确结果。图4为一个硬件架构的例子，在k＝8时用以更新迭代x，其中，LUT通过输入x[5:0]、B[5:0]，输出对应的L，用于选择正确的LB值，将其输入到RSD加法器中，使得x+L*B能够整除8，根据x和B确定L的值，使得x+L*B能够整除8，其他系数的更新与此类似。Since A and B are initial values and do not change during the operation, values similar to 3B can be calculated and saved in advance. Therefore, the hardware can calculate these four values at the same time, and finally select the correct result according to the low bits of x and B. For k=8, it is necessary to calculate 8 values at the same time, and finally select the correct result from them. Figure 4 is an example of a hardware architecture, which is used to update the iteration x when k=8, where the LUT outputs the corresponding L by inputting x[5:0] and B[5:0], which is used to select the correct LB Value, input it into the RSD adder, so that x+L*B can divide 8, and determine the value of L according to x and B, so that x+L*B can divide 8, and the update of other coefficients is similar.

本申请实施例通过HDL硬件描述语言进行了实现，并在ASIC进行了仿真。当k＝8时，使用冗余有符号数(Redundant signed digit，简称RSD)形式，通过使用TSMC 28-nm库进行ASIC综合，可以展现本方案带来的加速效果。当然，为了保证比较的公平性，对使用其他技术实现的方案的结果按比例进行调整，结果如下表所示：The embodiment of the present application is realized by HDL hardware description language, and simulated in ASIC. When k=8, use the form of redundant signed digit (RSD for short), and use the TSMC 28-nm library to perform ASIC synthesis, which can show the acceleration effect brought by this solution. Of course, in order to ensure the fairness of the comparison, the results of the solutions implemented using other technologies are adjusted proportionally, and the results are shown in the following table:

根据技术之间的差异对结果进行调整之后，本申请实施例的技术方案相比起软件实现速度提升了18倍，比现有技术(D.Zhu,Y.Song,J.Tian,Z.Wang,and H.Yu,“Anefficient accelerator of the squaring for the verifiable delay function overa class group,”in 2020IEEE Asia Pacific Conference on Circuits and Systems(APCCAS).IEEE,2020,pp.137–140.)快12倍，比起目前最快的方案(K.Sreedhar,H.Mark,and C.Torng,“Afast large-integer extended gcd algorithm and hardware designfor verifiable delay functions and modular inversion,”Cryptology ePrintArchive,Report 2021/1292,2021.)也有1.5倍的提升。After adjusting the results according to the differences between the technologies, the technical solution of the embodiment of the present application is 18 times faster than the software implementation speed, and compared with the prior art (D.Zhu, Y.Song, J.Tian, Z.Wang , and H. Yu, “A efficient accelerator of the squaring for the verifiable delay function overa class group,” in 2020IEEE Asia Pacific Conference on Circuits and Systems(APCCAS). IEEE, 2020, pp.137–140.) 12 times faster, Compared with the current fastest scheme (K.Sreedhar, H.Mark, and C.Torng, "Afast large-integer extended gcd algorithm and hardware design for verifiable delay functions and modular inversion," Cryptology ePrintArchive, Report 2021/1292, 2021. ) also has a 1.5x boost.

同时本申请实施例也对不同的k用TSMC 28-nm进行了ASIC综合，结果如下表所示：At the same time, the embodiment of this application also carried out ASIC synthesis for different k using TSMC 28-nm, and the results are shown in the following table:

kk 22 44 88 周期数number of cycles 21512151 11951195 908908 时钟频率Clock frequency 3.03GHz3.03GHz 2.5GHz2.5GHz 1.9GHz1.9GHz 面积area 0.201mm20.201mm2 0.404mm20.404mm2 0.316mm20.316mm2 总时间total time 709.83709.83 478ns478ns 472.16ns472.16ns

根据不同的设计需求可以选择不同的k值，相比于目前已有的方案，不仅拥有相近甚至更好的性能，并且更具灵活性。需要说明的是，使用不同的冗余方式，都能取得类似的效果，但是可能会有一些不同的小问题需要处理以适应本申请，例如RSD形式，在迭代进行加法操作之前需要对位数进行一些调整，避免位数无限制增长。使用不同的k值，取得某些性能参数的提高，例如迭代周期，时钟频率，面积等。Different k values can be selected according to different design requirements. Compared with existing solutions, it not only has similar or even better performance, but also has more flexibility. It should be noted that using different redundancy methods can achieve similar effects, but there may be some different small problems that need to be dealt with to adapt to this application. Some adjustments to avoid unlimited growth of digits. Using different k values, some performance parameters are improved, such as iteration cycle, clock frequency, area, etc.

由以上技术方案可知，本申请实施例提供一种基于改进k-ary算法的获取大数拓展最大公约数的方法，以及其冗余表示的XGCD硬件实现方式，与现有技术相比：It can be seen from the above technical solutions that the embodiment of the present application provides a method for obtaining the extended greatest common divisor of large numbers based on the improved k-ary algorithm, and its XGCD hardware implementation of redundant representation, compared with the prior art:

(4)x，y，z，w系数的移位操作的特别处理，因为并不能保证系数的可除性。(4) Special treatment for the shift operation of x, y, z, w coefficients, because the divisibility of coefficients cannot be guaranteed.

以上所述的本申请实施方式并不构成对本申请保护范围的限定。The embodiments of the present application described above are not intended to limit the scope of protection of the present application.

Claims

1. A method for obtaining the extended greatest common divisor of large numbers, applied to the generation of public-private key pairs, characterized in that it includes:

Obtain the first parameter and public key exponent in the process of generating the public-private key pair;

Obtain the k value preset in the method, and the k is a power of 2;

According to the first parameter and the public key index, initialize the intermediate variables required in subsequent iterations, the intermediate variables include the first iteration variable, the second iteration variable, the third iteration variable, the fourth iteration variable, the fifth iteration variable iteration variable, the sixth iteration variable and the seventh iteration variable, wherein the initial value of the seventh iteration variable is 0; the first iteration variable, the second iteration variable, the third iteration variable, the The fourth iteration variable, the fifth iteration variable and the sixth iteration variable satisfy xA+yB=a and zA+wB=b, the initial values are x=1, y=0, z=0, w=1, a=A and b=B, wherein, A is the first parameter, B is the public key index, a is the first iteration variable, b is the second iteration variable, x is the third iteration variable, y is the fourth iteration variable, z is the fifth iteration variable, w is the sixth iteration variable;

Judging whether the first iteration variable or the second iteration variable is an even number;

If the first iteration variable is an even number, the first data is selected, and the first iteration variable is updated by dividing the first iteration variable by the value of the first data, and the third iteration variable is updated by dividing the value of the first data by the third iteration variable. Three iteration variables, using the fourth iteration variable divided by the value of the first data to update the fourth iteration variable, using the seventh iteration variable to subtract log ₂ u to update the seventh iteration variable, where u is the first data, the The first data is the largest number in the power of 2 data set that is less than or equal to k and can be divisible by the first iteration variable;

If the updated first iteration variable is 0, then the greatest common divisor of the first parameter and the public key exponent is the second iteration variable, the Beizu coefficient corresponding to the first parameter is the fifth iteration variable, and the Beizu coefficient corresponding to the public key index is The coefficient is the sixth iteration variable.

2. A method for obtaining the extended greatest common divisor of large numbers according to claim 1, further comprising:

If the updated first iteration variable is non-zero, it is re-determined whether the first iteration variable or the second iteration variable is an even number.

3. A method for obtaining the expanded greatest common divisor of large numbers according to claim 1, characterized in that, after said judging whether the first iteration variable or the second iteration variable is an even number, it also includes:

If the second iteration variable is an even number, select the second data, and use the value of the second iteration variable divided by the second data to update the second iteration variable, and use the fifth iteration variable to divide the value of the second data to update the second iteration variable. Five iteration variables, using the sixth iteration variable divided by the value of the second data to update the sixth iteration variable, using the seventh iteration variable plus log ₂ v to update the seventh iteration variable, wherein, v is the second data, said The second data is the largest number in the power of 2 data set that is less than or equal to k and can be divisible by the second iteration variable;

If the updated second iteration variable is 0, then the greatest common divisor of the first parameter and the public key exponent is the first iteration variable, the Bezu coefficient corresponding to the first parameter is the third iterative variable, and the Bezu coefficient corresponding to the public key exponent is The coefficient is the fourth iteration variable.

4. A method for obtaining the extended greatest common divisor of large numbers according to claim 3, further comprising:

If the updated second iteration variable is non-zero, it is re-determined whether the first iteration variable or the second iteration variable is an even number.

5. A kind of method for obtaining the extended greatest common divisor of large numbers according to claim 1, characterized in that, after said judging whether the first iteration variable or the second iteration variable is an even number, it also includes:

If both the first iteration variable and the second iteration variable are odd numbers, then select the third data and the fourth data, and calculate c=(pa+qb)/k, α=(px+qz)/k, β=(py +qw)/k, wherein, p is the third data, q is the fourth data, the third data and the fourth data satisfy

6. A hardware architecture for obtaining the extended greatest common divisor of large numbers, which is used to calculate XA+YB=GCD(A, B)=G, wherein G, X and Y are operation results, A is the first operand, and B It is the second operand, the first operand and the second operand are both odd numbers, and it is characterized in that it includes: a control module, a GCD calculation unit, a Bezou coefficient calculation unit, a first multiplexer, and a second multiplexer device, termination module and input_valid signal;

GCD calculation unit, the input end of the GCD calculation unit is respectively connected with the output end of the first multiplexer and the output end of the control module, and the output end of the GCD calculation unit is respectively connected with the output end of the control module The input terminal, the input terminal of the termination module and the input terminal of the first multiplexer are connected, and the GCD calculation unit is used to use the input first operand and the second operand as the first iteration variable and the second operand respectively. The initial value of the second iteration variable, and, according to the control signal of the control module, update the first iteration variable or the second iteration variable input each time, update the seventh iteration variable with an initial value of 0, and output the current iteration completed first iteration variant, second iteration variant and seventh iteration variant;

A Bezou coefficient calculation unit, the input end of the Bezou coefficient calculation unit is respectively connected to the output end of the second multiplexer and the output end of the control module, and the output end of the Bezou coefficient calculation unit is connected to The input end of the second multiplexer is connected, and the Bezou coefficient calculation unit is used to use the input 1, 0, 0, and 1 as the third iteration variable, the fourth iteration variable, the fifth iteration variable and the third iteration variable respectively. The initial value of the six iteration variables, and, according to the control signal of the control module, update the third iteration variable, the fourth iteration variable, the fifth iteration variable or the sixth iteration variable input each time, and output the first iteration completed by the current iteration. Three iteration variables, the fourth iteration variable, the fifth iteration variable and the sixth iteration variable, wherein the control signal of the control module received by the GCD calculation unit and the Bezou coefficient calculation unit is a synchronization signal;

The input_valid signal is used to transmit signals to the first multiplexer and the second multiplexer, wherein the input_valid signal is used to control the first multiplexer to select the first operand, the second Two operands or the first iteration variable output by the last iteration, the second iteration variable are used as the input of each iteration calculation of the GCD calculation unit, and are used to control the second multiplexer to select 0, 1 or the upper The third iterative variable, the fourth iterative variable, the fifth iterative variable and the sixth iterative variable output by one iteration are used as the input of each iteration calculation of the Bezou coefficient calculation unit;

A control module, the built-in k value of the control module is used to receive the first iteration variable, the second iteration variable and the seventh iteration variable completed by each iteration, and, according to the built-in k value, the first iteration variable, the second iteration variable and the seventh iteration variable, outputting a control signal to control the calculation results of the GCD calculation unit and the Bezou coefficient calculation unit;

The termination module is used to receive the first iteration variable and the second iteration variable completed by each iteration, and if the first iteration variable or the second iteration variable of the current iteration is equal to 0, then terminate the iterative update of all variables; when the first When the first iteration variable is 0, it is determined that (G, X, Y) are respectively the second iteration variable, the fifth iteration variable and the sixth iteration variable of the current iteration; when the second iteration variable is 0, it is determined (G, X, Y) are respectively the first iteration variable, the third iteration variable and the fourth iteration variable of the current iteration.

7. A kind of hardware architecture for obtaining the extended greatest common divisor of large numbers according to claim 6, characterized in that, when the k value is 8, the GCD calculation unit includes a first shifter, a second shifter shifter, third shifter, fourth shifter, fifth shifter, sixth shifter, seventh shifter, eighth shifter, first GCD selector and second GCD selector , the input terminals of the first GCD selector are respectively connected with the first shifter, the second shifter, the third shifter, the fourth shifter, the fifth shifter and the first iteration variable, so The input end of the second GCD selector is respectively connected with the fourth shifter, the fifth shifter, the sixth shifter, the seventh shifter, the eighth shifter and the second iteration variable; A GCD selector and the second GCD selector are connected to the control signal of the control module, and are used to select the corresponding output according to the control signal;

The first shifter inputs the first iteration variable, and is used to complete the right shift operation of the first iteration variable by one bit;

The second shifter inputs the first iteration variable, and is used to complete the right shift operation of the first iteration variable by two bits;

The third shifter inputs the first iteration variable, and is used to complete the three-bit right shift operation on the first iteration variable;

The fourth shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a three-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable;

The fifth shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a two-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable;

The sixth shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by one bit;

The seventh shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by two bits;

The eighth shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by three bits.

8. A kind of hardware architecture for obtaining the extended greatest common divisor of large numbers according to claim 6, characterized in that, when the k value is 4, the GCD calculation unit includes a first shifter, a second shifter a shifter, a third shifter, a fourth shifter, a fifth shifter, a first GCD selector and a second GCD selector, the input terminals of the first GCD selector are respectively connected to the first shifter , the second shifter, the third shifter and the first iterative variable are connected, and the input terminals of the second GCD selector are respectively connected with the third shifter, the fourth shifter, the fifth shifter and the first Two iteration variables are connected; the first GCD selector and the second GCD selector are connected to the control signal of the control module, and are used to select the corresponding output according to the control signal;

The third shifter inputs the sum or difference of the first iterative variable and the second iterative variable, and is used to perform a two-bit right shift operation on the sum or difference of the first iterative variable and the second iterative variable;

The fourth shifter inputs the second iteration variable, and is used to complete the right shift operation of the second iteration variable by one bit;

The fifth shifter inputs the second iterative variable, and is used to complete the right shift operation of the second iterative variable by two bits.

9. A kind of hardware architecture for obtaining the extended greatest common divisor of large numbers according to claim 6, characterized in that, when the k value is 2, the GCD calculation unit includes a first shifter, a second shifter A shifter, a third shifter, a first GCD selector and a second GCD selector, the input terminals of the first GCD selector are respectively connected with the first shifter, the second shifter and the first iteration variable , the input end of the second GCD selector is respectively connected with the second shifter, the third shifter and the second iterative variable; the first GCD selector and the second GCD selector are connected to the control module The control signal is used to select a corresponding output according to the control signal;

The second shifter inputs the sum or difference of the first iteration variable and the second iteration variable, and is used to complete the right shift operation of the sum or difference of the first iteration variable and the second iteration variable;

The third shifter inputs the second iterative variable, and is used to complete the right shift operation of the second iterative variable by one bit.

10. A hardware architecture for obtaining the extended greatest common divisor of large numbers according to claim 6, wherein the first iteration variable, the second iteration variable, the third iteration variable, and the second iteration variable Four iteration variables, the fifth iteration variable and the sixth iteration variable use redundant representations.