CN113504895A - Elliptic curve multi-scalar point multiplication calculation optimization method and optimization device

Info

Publication number: CN113504895A
Application number: CN202110791569.4A
Authority: CN
Inventors: 高鸣宇; 张烨; 董江彬
Original assignee: Tsinghua University
Current assignee: Shenzhen Zhixin Huaxi Information Technology Co ltd
Priority date: 2021-07-13
Filing date: 2021-07-13
Publication date: 2021-10-15
Anticipated expiration: 2041-07-13
Also published as: CN113504895B

Abstract

The application discloses an elliptic curve multi-scalar point multiplication calculation optimization method and an elliptic curve multi-scalar point multiplication calculation optimization device. A bucket matrix is designed to cache the intermediate variable points of the main calculation process, so that output of the Pippenger intermediate quantities is avoided and the calculation runs continuously in a pipelined manner until all rounds are finished; only then are transverse reduction and longitudinal reduction performed. The number of serial calculation phases is thereby reduced from thousands to one, most of the overhead of serial-parallel conversion, synchronous locking and the like is eliminated, the continuous working time of the pipeline is effectively prolonged, and the overall performance is improved.

Description

Elliptic curve multi-scalar point multiplication calculation optimization method and optimization device

Technical Field

The application relates to the technical field of cryptography, and in particular to an elliptic curve multi-scalar point multiplication calculation optimization method and an elliptic curve multi-scalar point multiplication calculation optimization device.

Background

In the related art, FIG. 1 shows the Pippenger algorithm as used for one round of calculation in the MSM module of PipeZK. This calculation has to be performed $\lceil N/M \rceil$ times in total, producing $\lceil N/M \rceil$ points; finally, these $\lceil N/M \rceil$ points are taken as the input of one more calculation, thereby obtaining the final result. Here N is the total amount of input data and M is the amount of input data of a single round; the former is typically on the order of $10^6$, and the latter is often 1000, 1024, or the like.

In FIG. 1, the buckets, which temporarily store the result of one point, hold different contents for each $G_i$, so the buckets need to be emptied before each different $G_i$ is calculated. Longitudinal reduction ($Q = \sum_i 2^{i\zeta} G_i$) and transverse reduction ($G_i = \sum_j j \cdot B_j$) are then performed. This is a serial phase of the algorithm, and no parallel algorithm is currently available for it, so the serial phase has to be executed $\lceil N/M \rceil$ times in total. For millions of input data, this phase is executed thousands of times and cannot be parallelized, which causes a certain performance loss.

Therefore, the disadvantages of the related art are:

1) Each round of calculation needs longitudinal reduction, and the calculation process can hardly be parallelized, thereby causing performance loss.

2) Each round of calculation needs to be transversely reduced, and the calculation process can hardly be parallelized, so that the performance is lost.

3) Each round of calculation needs to be written back to the memory, which brings extra logic control and certain difficulty in implementation.

Disclosure of Invention

The present application is directed to solving, at least to some extent, one of the technical problems in the related art.

Therefore, an object of the present application is to provide an elliptic curve multi-scalar dot product calculation optimization method, which effectively improves the continuous working time of a pipeline, thereby improving the overall performance.

Another objective of the present application is to provide an elliptic curve multi-scalar point multiplication calculation optimization device.

In order to achieve the above object, an embodiment of an aspect of the present application provides an elliptic curve multi-scalar point multiplication calculation optimization method, including:

in a main calculation process, caching intermediate variable points in the main calculation process by using one row of a bucket matrix;

canceling the transverse reduction at the end of the main calculation process in the PipeZK algorithm, while keeping the contents of the bucket matrix and continuing to the main calculation process of the next bit order;

canceling part of the calculation process of the longitudinal reduction at the end of each round;

and after all rounds are finished, performing transverse reduction and longitudinal reduction calculations on the bucket matrix, and outputting an elliptic curve point from the bucket matrix.

According to the elliptic curve multi-scalar point multiplication calculation optimization method of the embodiment of the application, a bucket matrix is designed to cache the intermediate variable points of the main calculation process, so that output of the Pippenger intermediate quantities is avoided and the calculation runs continuously in a pipelined manner until all rounds are finished; only then are transverse reduction and longitudinal reduction performed. The number of serial calculation phases is reduced from thousands to one, and the serial-parallel conversion stages between sub-batches are eliminated, which raises the degree of parallelism of the algorithm, effectively prolongs the continuous working time of the pipeline, and thereby improves the overall performance.

In addition, the elliptic curve multi-scalar dot product calculation optimization method according to the above embodiment of the present application may further have the following additional technical features:

Further, in one embodiment of the present application, the bucket matrix has $\lceil \lambda/\zeta \rceil$ rows and $2^{\zeta}-1$ columns in total, where λ is the bit width of the coefficient and ζ is the bit segment width.

Further, in an embodiment of the present application, performing the transverse reduction and longitudinal reduction calculations on the bucket matrix includes: performing transverse reduction on each row of the bucket matrix to obtain a plurality of elliptic curve points, and performing longitudinal reduction on the plurality of elliptic curve points.

Further, in an embodiment of the present application, performing the transverse reduction and longitudinal reduction calculations on the bucket matrix includes: performing longitudinal reduction on each column of the bucket matrix to obtain a plurality of elliptic curve points, and performing transverse reduction on the plurality of elliptic curve points.

Further, in one embodiment of the present application, in the horizontal reduction calculation, the calculations of different rows are performed in parallel or in series; in the vertical reduction calculation, the calculations of different columns are performed in parallel or in series.

In order to achieve the above object, another embodiment of the present application provides an elliptic curve multi-scalar point multiplication calculation optimization apparatus, including:

the cache module is used for caching the intermediate variable points in the main calculation process by utilizing one row of the bucket matrix in the main calculation process;

the first optimization module is used for canceling the transverse reduction at the end of the main calculation process in the PipeZK algorithm, while keeping the contents of the bucket matrix and continuing to the main calculation process of the next bit order;

the second optimization module is used for canceling part of the calculation process of the longitudinal reduction at the end of each round;

and the output module is used for performing transverse reduction and longitudinal reduction calculations on the bucket matrix after all rounds are finished, and outputting an elliptic curve point from the bucket matrix.

The elliptic curve multi-scalar point multiplication calculation optimization device provided by the embodiment of the application caches the intermediate variable points of the main calculation process in a bucket matrix, so that output of the Pippenger intermediate quantities is avoided and the calculation runs continuously in a pipelined manner until all rounds are finished; only then are transverse reduction and longitudinal reduction performed. The number of serial calculation phases is reduced from thousands to one, and the serial-parallel conversion stages between sub-batches are eliminated, which raises the degree of parallelism of the algorithm, effectively prolongs the continuous working time of the pipeline, and thereby improves the overall performance.

In addition, the elliptic curve multi-scalar dot product calculation optimization device according to the above embodiment of the present application may further have the following additional technical features:

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a schematic diagram of a prior art solution calculation process;

FIG. 2 is a flowchart illustrating an optimization method for multi-scalar dot product calculation of an elliptic curve according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a calculation flow of an elliptic curve multi-scalar point multiplication calculation optimization method according to an embodiment of the present application;

FIG. 4 is a diagram illustrating a hardware architecture and a computing flow according to an embodiment of the present application;

FIG. 5 is a schematic diagram of an access order to coefficient data in various rounds according to an embodiment of the present application;

FIG. 6 is a schematic diagram of an access sequence to buckets when a final result is solved after all rounds of calculation are completed according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an elliptic curve multi-scalar point multiplication calculation optimization device according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

The definitions and related terms in this application are first introduced.

Bit segment:

A bit segment is a part of the binary representation of a number; for example, the 2nd to 4th bits of the binary number 101011 form a bit segment of that number.

Bit segments are denoted below using the notation:

a[p:q]

where a is a number, p is the lowest index of the bit segment, and q is the highest index of the bit segment; indices start from 0, with 0 denoting the lowest bit. For example:

a[0:3]

denotes the bit segment consisting of bit 0 through bit 3 of the binary representation of the number a; the width of this bit segment is 4.
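The notation a[p:q] can be illustrated with a minimal Python sketch. The helper name `bit_segment` is hypothetical and is introduced here only for illustration; it assumes the inclusive, zero-based indexing described above.

```python
def bit_segment(a: int, p: int, q: int) -> int:
    """Return a[p:q]: bits p..q of a, inclusive, with bit 0 the lowest bit."""
    if not 0 <= p <= q:
        raise ValueError("need 0 <= p <= q")
    width = q - p + 1                      # e.g. a[0:3] has width 4
    return (a >> p) & ((1 << width) - 1)   # shift down, then mask off `width` bits


a = 0b101011
low_segment = bit_segment(a, 0, 3)   # bits 0..3 of 101011
mid_segment = bit_segment(a, 2, 4)   # bits 2..4 of 101011
```

Here `low_segment` is the binary value 1011 and `mid_segment` is the binary value 010.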

Transverse reduction:

Transverse reduction refers to the following calculation process:

$$G = \sum_{i=1}^{2^{\zeta}-1} i \cdot B_i$$

where G is the output of the calculation process and is an elliptic curve point; the $B_i$ are the inputs of the calculation process, each $B_i$ being an elliptic curve point, $2^{\zeta}-1$ in total; and ζ is the bit segment width, a constant parameter of the calculation process, often taken as 4.

Longitudinal reduction:

Longitudinal reduction refers to the following calculation process:

$$Q = \sum_{i=0}^{\lceil \lambda/\zeta \rceil - 1} 2^{i\zeta} \cdot G_i$$

where Q is the output of the calculation process and is an elliptic curve point; the $G_i$ are the inputs of the calculation process, each $G_i$ being an elliptic curve point, $\lceil \lambda/\zeta \rceil$ in total; λ is the binary bit width of the coefficient field and depends on the input, with common values being 256, 384, 512, etc.; and ζ is the bit segment width, a constant parameter of the calculation process, often taken as 4.
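As a rough illustration of the two reductions, the following Python sketch evaluates them directly as weighted sums. It is only a sketch under a loud assumption: plain integers under ordinary addition stand in for elliptic curve points, so that `i * b` plays the role of multiplying a point by a constant. The function names are hypothetical.

```python
# Assumption: plain integers under addition stand in for elliptic curve points.

def transverse_reduction(buckets):
    """G = sum over i of i * B_i, where buckets[0] holds B_1."""
    return sum(i * b for i, b in enumerate(buckets, start=1))


def longitudinal_reduction(row_results, zeta):
    """Q = sum over i of 2^(i*zeta) * G_i, where row_results[0] holds G_0."""
    return sum((1 << (i * zeta)) * g for i, g in enumerate(row_results))


zeta = 2                                       # small width, for illustration only
g = transverse_reduction([5, 0, 7])            # 1*5 + 2*0 + 3*7
q = longitudinal_reduction([1, 2, 3], zeta)    # 1 + 4*2 + 16*3
```

With these inputs `g` is 26 and `q` is 57. On a real curve each product would be replaced by point doublings and point additions.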

Round:

A round is one iteration of the outer loop of the algorithm. Specifically, the algorithm consists of two nested loops, an outer loop and an inner loop; each iteration of the outer loop reads in M elliptic curve points and M coefficients, where M is usually 1K. A round refers to one such iteration.

Each round is hereinafter denoted and distinguished as the k-th round, where k is called the round number or round index, starts from 0, and has a maximum value of $\lceil N/M \rceil - 1$, where N is the input size of the whole algorithm and $\lceil N/M \rceil$ denotes the result of N/M rounded up.

Bit order:

A bit order is one iteration of the inner loop of the algorithm. Specifically, the algorithm consists of an outer loop and an inner loop; each iteration of the inner loop reads in M elliptic curve points together with the corresponding bit segments of the M coefficients, with p and q as the bit segment parameters. That is, M bit segments and M elliptic curve points are read in total; M is usually 1K, and p and q depend on the loop parameters. A bit order refers to one such iteration.

Each bit order is hereinafter denoted and distinguished as the k-th bit order, where k is called the bit order number or bit order index, starts from 0, and has a maximum value of $\lceil \lambda/\zeta \rceil - 1$.

For a given bit order, the lowest and highest indices of its bit segment are functions of the bit order number. The function from the bit order number to the lowest index of the bit segment is called the low-index mapping, the function from the bit order number to the highest index of the bit segment is called the high-index mapping, and the function from the bit order number to the tuple consisting of the lowest and highest indices is called the index mapping.
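The loop bounds and one possible index mapping can be sketched in Python as follows. The parameter values are illustrative, and the particular mapping (consecutive, non-overlapping segments starting at bit 0) is an assumption made here for the example; the names `ceil_div` and `index_mapping` are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


N, M = 1_000_000, 1024        # illustrative input size and per-round size
lam, zeta = 256, 4            # illustrative coefficient width and segment width

num_rounds = ceil_div(N, M)              # round indices run 0 .. num_rounds - 1
num_bit_orders = ceil_div(lam, zeta)     # bit order indices run 0 .. num_bit_orders - 1


def index_mapping(j: int, zeta: int, lam: int):
    """Assumed mapping: bit order j covers bits j*zeta .. j*zeta + zeta - 1."""
    low = j * zeta                        # low-index mapping
    high = min(low + zeta - 1, lam - 1)   # high-index mapping, clipped at the top bit
    return low, high
```

With these values there are 977 rounds and 64 bit orders, and the last bit order covers bits 252 to 255.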

Main calculation process:

The main calculation process is the principal computation inside the inner loop of the algorithm. It receives a number of elliptic curve points and an equal number of coefficient bit segments, and outputs $2^{\zeta}-1$ elliptic curve points. The process examines the coefficient bit segment corresponding to each elliptic curve point and, using the value of the bit segment as an index, sends the elliptic curve point to the bucket with the corresponding label. When an elliptic curve point is sent to a bucket, if the bucket holds no elliptic curve point, the point is filled into the bucket directly; if the bucket already holds an elliptic curve point, the sum of the point in the bucket and the current point is filled into the bucket. The process ends when all elliptic curve points and coefficients have been traversed, and the contents of all buckets are output.

The calculation process requires the input elliptic curve points, the input coefficients, the coefficient bit segment parameters, and the bucket positions to be specified.
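The main calculation process can be sketched as below. This is a minimal illustration, not the patented hardware: plain integers under addition stand in for elliptic curve points, `None` marks an empty bucket, and a bit segment value of 0 is assumed to select no bucket (which is why there are only 2^ζ - 1 buckets). The function name is hypothetical.

```python
# Assumption: plain integers under addition stand in for elliptic curve points.

def main_process(points, coeffs, p, q, bucket_row):
    """Scatter each point into bucket_row according to coefficient bits p..q.

    bucket_row has 2^(q-p+1) - 1 slots; slot v-1 is the bucket labeled v.
    """
    mask = (1 << (q - p + 1)) - 1
    for point, k in zip(points, coeffs):
        v = (k >> p) & mask              # value of the bit segment k[p:q]
        if v == 0:
            continue                     # value 0 contributes nothing
        if bucket_row[v - 1] is None:
            bucket_row[v - 1] = point    # empty bucket: fill directly
        else:
            bucket_row[v - 1] = bucket_row[v - 1] + point  # occupied: accumulate
    return bucket_row


row = main_process([10, 20, 30, 40], [0b01, 0b11, 0b01, 0b00], 0, 1, [None] * 3)
```

In this example the points 10 and 30 both land in bucket 1, the point 20 lands in bucket 3, and the point 40 is skipped because its bit segment is 0.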

The present application mainly optimizes the MSM module of the PipeZK accelerator architecture. That is, the prior art implementation is PipeZK; the other technical solutions mainly accelerate by using a distributed system or a GPU, or use an FPGA but do not target multi-scalar point multiplication on elliptic curves. In the original scheme, when the input size is N, the data is divided into several arrays of the same length that are processed in batches, with the array length and the number of arrays kept as close to each other as possible. Each batch generates one elliptic curve point; finally, the generated elliptic curve points are processed once more to obtain a final point, which is output by the MSM module.

Zero-knowledge proof: a zero-knowledge proof is a very useful privacy-preserving cryptographic protocol that can be widely used in many application scenarios such as blockchains. Through a zero-knowledge proof, the prover can convince the verifier that the prover knows a certain piece of knowledge without revealing any information about the knowledge itself.

PipeZK: the PipeZK accelerator is an accelerator for a zero-knowledge proof algorithm called the zero-knowledge succinct non-interactive argument of knowledge, commonly abbreviated as zk-SNARK. Existing schemes for accelerating zk-SNARK use a distributed system, a GPU, or an FPGA. The first two have no overlap with the present application; the third uses FPGA acceleration, but at present it mainly accelerates individual operations on elliptic curves and does not directly accelerate the multi-scalar point multiplication process.

MSM (Multi-scalar Multiplication): multi-scalar point multiplication on an elliptic curve. The input is a set of coefficients and the coordinates of a set of points on the elliptic curve, and the output is one elliptic curve point. The calculation multiplies each coefficient by its corresponding elliptic curve point and adds up all the multiplication results, i.e. $\mathbf{Q} = \sum_i k_i \mathbf{P}_i$, where $k_i$ is the i-th coefficient and $\mathbf{P}_i$ is the i-th elliptic curve point; bold symbols denote elliptic curve points and non-bold symbols denote ordinary numbers. Besides the MSM module, PipeZK also has a POLY module; the present application optimizes only the MSM.
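For reference, the MSM definition can be written out naively on a toy curve. The sketch below is illustrative only: the curve y^2 = x^3 + 7 over the 17-element field, the base point (1, 5), and all function names are assumptions chosen for readability, and the field is far too small for any cryptographic use.

```python
P_MOD = 17          # toy prime field (assumption, for illustration only)
A, B = 0, 7         # toy curve y^2 = x^3 + A*x + B
INF = None          # point at infinity (group identity)


def padd(p1, p2):
    """Affine elliptic curve point addition, including doubling."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                                   # P + (-P)
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def smul(k, pt):
    """k * pt by double-and-add."""
    acc = INF
    while k > 0:
        if k & 1:
            acc = padd(acc, pt)
        pt = padd(pt, pt)
        k >>= 1
    return acc


def msm_naive(coeffs, points):
    """Q = sum of k_i * P_i, computed term by term."""
    acc = INF
    for k, pt in zip(coeffs, points):
        acc = padd(acc, smul(k, pt))
    return acc


G = (1, 5)                                   # on the toy curve: 5^2 = 1^3 + 7 (mod 17)
points = [G, smul(2, G), smul(5, G)]
coeffs = [2, 3, 7]
Q = msm_naive(coeffs, points)                # (2*1 + 3*2 + 7*5) * G = 43 * G
```

Each term costs a full scalar multiplication here, which is the cost that bucket-based methods such as Pippenger aim to reduce.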

Elliptic curve: an important foundation in the field of cryptography. Most elliptic-curve-based cryptographic algorithms rely on the difficulty of recovering k from Q = kP (similarly, the well-known RSA algorithm relies on the difficulty of factoring the product of two large primes). Two operations can be performed on an elliptic curve: adding two elliptic curve points, and multiplying an elliptic curve point by an ordinary positive integer. Neither is computed by direct addition or multiplication; each involves a relatively complex solving process whose computational cost is relatively large. Since an MSM often involves millions of elliptic curve points, the computational cost becomes very large.

The optimization method for elliptic curve multi-scalar point multiplication calculation proposed by the embodiment of the application is described below with reference to the attached drawings.

FIG. 2 is a flowchart of an elliptic curve multi-scalar point multiplication calculation optimization method according to an embodiment of the present application.

As shown in FIG. 2, the elliptic curve multi-scalar point multiplication calculation optimization method includes the following steps:

In step S1, in the main calculation process, the intermediate variable points of the main calculation process are cached by using one row of the bucket matrix.

Specifically, a bucket matrix, rather than a single group of buckets, is designed to cache the intermediate variable points of the different main calculation processes. Output of the Pippenger intermediate quantities is thereby avoided, and the calculation runs continuously in a pipelined manner until all rounds are finished; only then are transverse reduction and longitudinal reduction performed. In this way the number of serial calculation phases can be reduced from thousands to one, most of the overhead of serial-parallel conversion, synchronous locking and the like is eliminated, and the calculation performance of the MSM is improved.

Note that a Bucket (Bucket) is a name for a storage unit of one elliptic curve point. Pippenger is a calculation method for multiple scalar point multiplication on elliptic curves.

Specifically, let λ be the bit width of the coefficient (that is, each coefficient is a λ-digit number in binary), with common values being 256, 384, 512, and so on; let ζ be the bit segment width, with a common value being 4. A bucket matrix is designed with $\lceil \lambda/\zeta \rceil$ rows and $2^{\zeta}-1$ columns in total.

In step S2, the transverse reduction at the end of the main calculation process in the PipeZK algorithm is canceled, while the contents of the bucket matrix are kept and carried over to the main calculation process of the next bit order.

In step S3, part of the longitudinal reduction calculation at the end of each round is canceled.

In step S4, after all rounds are finished, transverse reduction and longitudinal reduction calculations are performed on the bucket matrix, and an elliptic curve point is output from the bucket matrix.
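Steps S1 to S4 can be sketched end to end as follows. This is a software illustration under stated assumptions, not the pipelined hardware: plain integers under addition stand in for elliptic curve points, the integer 0 serves as the empty bucket, row j of the matrix is assumed to cache bit order j with bit segment jζ .. jζ+ζ-1, and the function names are hypothetical.

```python
import random

# Assumption: plain integers under addition stand in for elliptic curve points.

def ceil_div(a, b):
    return -(-a // b)


def msm_bucket_matrix(points, coeffs, lam, zeta, m):
    rows = ceil_div(lam, zeta)
    cols = (1 << zeta) - 1                       # also the mask for one bit segment
    matrix = [[0] * cols for _ in range(rows)]   # 0 = identity of the stand-in group
    # Outer loop: rounds, each reading about m points and m coefficients.
    for start in range(0, len(points), m):
        pts = points[start:start + m]
        ks = coeffs[start:start + m]
        # Inner loop: bit orders; S1 caches bit order j in row j.
        for j in range(rows):
            for pt, k in zip(pts, ks):
                v = (k >> (j * zeta)) & cols
                if v:
                    matrix[j][v - 1] += pt
            # S2: no transverse reduction here; the row contents are kept.
        # S3: no longitudinal reduction at the end of the round.
    # S4: a single transverse reduction per row, then one longitudinal reduction.
    row_sums = [sum((c + 1) * b for c, b in enumerate(row)) for row in matrix]
    return sum((1 << (j * zeta)) * g for j, g in enumerate(row_sums))


rng = random.Random(0)
pts = [rng.randrange(1, 1000) for _ in range(10)]
ks = [rng.randrange(1 << 16) for _ in range(10)]
result = msm_bucket_matrix(pts, ks, lam=16, zeta=4, m=3)
expected = sum(k * p for k, p in zip(ks, pts))   # naive definition of the MSM
```

The point of the sketch is that the serial reductions appear exactly once, after the loops, while the bucket matrix carries all intermediate state across rounds; `result` equals `expected`.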

Further, in one embodiment of the present application, performing the transverse reduction and longitudinal reduction calculations on the bucket matrix comprises: performing transverse reduction on each row of the bucket matrix to obtain a plurality of elliptic curve points, and performing longitudinal reduction on the plurality of elliptic curve points.

Further, in one embodiment of the present application, performing the transverse reduction and longitudinal reduction calculations on the bucket matrix comprises: performing longitudinal reduction on each column of the bucket matrix to obtain a plurality of elliptic curve points, and performing transverse reduction on the plurality of elliptic curve points.

Specifically, after all rounds are finished, the transverse reduction and longitudinal reduction calculations on the bucket matrix can be performed in two ways, both of which output one elliptic curve point.

The first way performs transverse reduction first and longitudinal reduction second. On the bucket matrix, each row is transversely reduced, giving $\lceil \lambda/\zeta \rceil$ elliptic curve points; one longitudinal reduction is then performed on these $\lceil \lambda/\zeta \rceil$ elliptic curve points. The transverse reductions of different rows are generally executed in parallel but can also be executed serially, and the order in which the rows are processed does not affect the result.

The second way performs longitudinal reduction first and transverse reduction second. On the bucket matrix, each column is longitudinally reduced, giving $2^{\zeta}-1$ elliptic curve points; one transverse reduction is then performed on these $2^{\zeta}-1$ elliptic curve points. The longitudinal reductions of different columns are generally executed in parallel but can also be executed serially, and the order in which the columns are processed does not affect the result.
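That the two orders agree follows from both reductions being weighted sums. A small Python check, assuming plain integers as stand-ins for elliptic curve points and assuming row r carries the weight 2^(rζ), might look like this (names are hypothetical):

```python
import random

# Assumption: plain integers under addition stand in for elliptic curve points.

zeta, rows = 3, 4
cols = (1 << zeta) - 1
rng = random.Random(0)
matrix = [[rng.randrange(1000) for _ in range(cols)] for _ in range(rows)]


def transverse(buckets):
    """Weighted sum with weights 1, 2, 3, ... across one row."""
    return sum((c + 1) * b for c, b in enumerate(buckets))


def longitudinal(values):
    """Weighted sum with weights 2^(r*zeta) down one column."""
    return sum((1 << (r * zeta)) * v for r, v in enumerate(values))


# First way: transverse reduction per row, then one longitudinal reduction.
first = longitudinal([transverse(row) for row in matrix])

# Second way: longitudinal reduction per column, then one transverse reduction.
columns = [[matrix[r][c] for r in range(rows)] for c in range(cols)]
second = transverse([longitudinal(col) for col in columns])
```

Both `first` and `second` equal the double sum of 2^(rζ) · (c+1) · matrix[r][c], so the choice between them can be made on hardware cost alone.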

FIG. 3 shows the calculation flow of the elliptic curve multi-scalar point multiplication calculation optimization method.

In the main calculation process, the original input needs to be divided into several subsets. The usual approach is to divide the original input as evenly as possible into $\lceil N/M \rceil$ subsets, each containing approximately M elliptic curve points and M coefficients. However, the division may also be unordered and uneven without affecting the calculation result, and the number of subsets does not affect the final result either. The division only requires that the correspondence between each elliptic curve point and its coefficient is preserved within a subset, and that each elliptic curve point and each coefficient is used exactly once.
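The independence from the partition can be checked with a short sketch. It again assumes plain integers as stand-ins for elliptic curve points and the row-j-equals-bit-order-j layout; the function names and the particular uneven split are made up for the example.

```python
import random

# Assumption: plain integers under addition stand in for elliptic curve points.

def accumulate(subsets, zeta, rows):
    """Fill a bucket matrix from subsets of (point, coefficient) pairs."""
    cols = (1 << zeta) - 1
    matrix = [[0] * cols for _ in range(rows)]
    for subset in subsets:                  # one subset per round
        for j in range(rows):               # one row per bit order
            for pt, k in subset:
                v = (k >> (j * zeta)) & cols
                if v:
                    matrix[j][v - 1] += pt
    return matrix


def reduce_matrix(matrix, zeta):
    """Transverse and longitudinal reduction folded into one weighted sum."""
    return sum((1 << (j * zeta)) * (c + 1) * b
               for j, row in enumerate(matrix) for c, b in enumerate(row))


rng = random.Random(1)
pairs = [(rng.randrange(1, 10**6), rng.randrange(1 << 12)) for _ in range(20)]

even = [pairs[i:i + 5] for i in range(0, 20, 5)]           # 4 subsets of 5 pairs
shuffled = pairs[:]
rng.shuffle(shuffled)
uneven = [shuffled[:1], shuffled[1:12], shuffled[12:]]     # 3 subsets of 1, 11, 8 pairs

q_even = reduce_matrix(accumulate(even, 4, 3), 4)
q_uneven = reduce_matrix(accumulate(uneven, 4, 3), 4)
```

Both partitions keep every point paired with its own coefficient and use every pair exactly once, so `q_even` and `q_uneven` coincide.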

One extreme way of circumventing the above partitioning feature is to alter the input data, that is, to repeat a certain elliptic curve point several times or to merge repeated elliptic curve points, while keeping the corresponding coefficient equal to the original corresponding coefficient.

There are a number of mapping methods for the subscript mapping of the bit segments in the main computation process portion of fig. 3.

If the round number is i and the bit order number is j, the bit segment can be chosen as shown in FIG. 3, with $j\zeta$ as the lowest index and $j\zeta + \zeta - 1$ as the highest index of the bit segment. Alternatively, […] can be used as the lowest index of the bit segment and […] as the highest index, in which case the selected bucket position should be changed accordingly to row […] of the bucket matrix.

For other bit segment selection methods, the corresponding bucket positions change accordingly, but any selection must have the following core characteristics: first, the bit segments do not overlap; second, the union of all selected bit segments is exactly the whole bit range of the coefficient; third, the bucket positions do not overlap; fourth, the union of all selected bucket positions is exactly the entire bucket matrix. Any bit segment selection and bucket position selection method that satisfies these four conditions yields correct results.

There are also several mapping methods for the row and column indices of the bucket matrix. In addition to the above method, in which the bit segment and the bucket position are changed together, the results of the main calculation process may be stored directly in other designated positions of the bucket matrix, with the input of the transverse reduction or the longitudinal reduction adjusted at the same time, thereby bypassing the index relationship shown in FIG. 3.

Such a method has two characteristics. First, for each bucket accessed in the main calculation process, the value of the corresponding coefficient bit segment must equal the coefficient by which that bucket is multiplied in the transverse reduction. Second, for each bucket accessed in the main calculation process, the lowest index of the corresponding bit segment must equal the base-2 logarithm of the coefficient by which that bucket is multiplied in the longitudinal reduction. Any bucket position selection method satisfying these conditions can implement the elliptic curve multi-scalar point multiplication calculation optimization method of the embodiment of the present application.

Specifically, FIG. 4 is a schematic diagram of the main components and calculation flow of the MSM module optimized in the present application; it illustrates only the flow and related components of one round of calculation. In FIG. 4, the coefficient bit width is 256 bits, the bit width of each coordinate component of an elliptic curve point is 768 bits, and N is 1048576, so the amount of data in a single round is 1024. (These parameters serve only to illustrate the schematic; the actual technical solution may use other reasonable values.)

In FIG. 4, "1024 Scalar" at the upper left represents the coefficient data, and "1024 Point" at the upper right represents the elliptic curve point data. The cylinders at the lower left represent the buffer area, i.e. the bucket matrix; each cylinder is a bucket and can store one elliptic curve point. The PADD unit at the lower right is the elliptic curve point addition unit, which adds two elliptic curve points and outputs one elliptic curve point. The three boxes at the lower right represent FIFO queues that cache all pairs of elliptic curve points waiting to be processed by the PADD unit.

In the scheme before optimization, there is only one row of buckets, and the value of a certain segment of the coefficient determines which bucket a given elliptic curve point is stored in. In the optimized scheme, the buckets form $\lceil \lambda/\zeta \rceil$ rows and $2^{\zeta}-1$ columns, λ being the bit width of the coefficient. In principle the parameter ζ can be changed; the larger ζ is, the larger the resource overhead, and choosing ζ = 4 as the bit segment width is a compromise between resources and performance.

Taking this round of calculation as an example, there are 1024 coefficients in total; as shown in FIG. 5, idx0 denotes the first coefficient, idx1 the second coefficient, and so on up to idx1023, the 1024th coefficient. Each coefficient is 256 bits wide and is divided into slices of 4 bits, 64 slices in total. The highest bit of each coefficient is on the left and the lowest bit is on the right, and the slices are labeled group 0, group 1, ..., group 63 from right to left.

The specific calculation flow is as follows. First, group 63, i.e. the leftmost group, is accessed for idx0, idx1, ..., idx1023 from top to bottom. During this pass, if the value of the slice is 0, the elliptic curve point corresponding to the coefficient is discarded; if the value is 1, the point is placed in the first bucket of the first row; if the value is 2, the point is placed in the second bucket of the first row; and so on. Once group 63 of all 1024 coefficients has been accessed, the flow returns to the first coefficient and accesses group 62: if the value is 0, the point is discarded; if the value is 1, the point is placed in the first bucket of the second row; if the value is 2, in the second bucket of the second row; if the value is 3, in the third bucket of the second row; and so on.
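The access order described here, with the leftmost group filling the first row, can be sketched as below. The sketch assumes plain integers as stand-ins for elliptic curve points, assumes the coefficient width is a multiple of the slice width, and uses `None` for an empty bucket; the function name is hypothetical.

```python
# Assumption: plain integers under addition stand in for elliptic curve points.

def fill_bucket_matrix(points, coeffs, lam=256, zeta=4):
    groups = lam // zeta                  # e.g. 64 slices for 256-bit coefficients
    cols = (1 << zeta) - 1                # e.g. 15 buckets per row; also the slice mask
    matrix = [[None] * cols for _ in range(groups)]
    for row in range(groups):
        group = groups - 1 - row          # row 0 takes the leftmost (highest) group
        for pt, k in zip(points, coeffs):            # idx0, idx1, ... from top to bottom
            v = (k >> (group * zeta)) & cols
            if v == 0:
                continue                  # slice value 0: the point is discarded
            cur = matrix[row][v - 1]
            matrix[row][v - 1] = pt if cur is None else cur + pt
    return matrix


# Two 8-bit coefficients, 0x12 and 0x21, give a 2-row matrix.
small = fill_bucket_matrix([100, 200], [0x12, 0x21], lam=8, zeta=4)
```

In the small example, row 0 is filled from the high slices (1 and 2) and row 1 from the low slices (2 and 1), so the point 100 sits in the first bucket of row 0 and in the second bucket of row 1.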

As shown in FIG. 6, the longitudinal reduction step is most efficiently implemented in hardware, but it can also be computed in software and is still more efficient than the general calculation method. Specifically, each column is processed from top to bottom: take the first point of the column and multiply it by $2^{\zeta}$, add the second point of the column and multiply by $2^{\zeta}$, add the third point of the column, and so on (after the point of the last row is added, the result is not multiplied by $2^{\zeta}$). The result is written back into the first row of the column, i.e. row 0. In a hardware implementation, multiplication by $2^{\zeta}$ should be decomposed into ζ point doubling operations, as in a software implementation.
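This top-to-bottom column pass is a Horner-style evaluation. A minimal sketch, assuming plain integers as stand-ins for elliptic curve points (so that doubling is `pt + pt`) and assuming empty buckets hold the identity 0, is given below; the function names are hypothetical.

```python
# Assumption: plain integers under addition stand in for elliptic curve points.

def double(pt):
    """Stand-in for elliptic curve point doubling."""
    return pt + pt


def reduce_column(column, zeta):
    """Longitudinal reduction of one column; column[0] is the top row."""
    acc = column[0]
    for pt in column[1:]:
        for _ in range(zeta):     # multiply by 2^zeta as zeta doublings
            acc = double(acc)
        acc = acc + pt            # then add the point of the next row
    return acc                    # no multiplication after the last row


col_result = reduce_column([3, 5, 7], zeta=4)   # ((3 * 16) + 5) * 16 + 7
```

For this column the result is 855, which equals 3·256 + 5·16 + 7; in the scheme described, this value would be written back to row 0 of the column.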

Finally, the contents of row 0 are taken out and processed. This step can be implemented in either hardware or software, both with good results. Specifically, the first column is multiplied by 1, the second column by 2, the third column by 3, and so on, and the results are added together. Multiplying an elliptic curve point by a constant should be decomposed into point doubling and point addition operations.
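The final pass over row 0 can be sketched with a double-and-add multiplication. As before, plain integers under addition stand in for elliptic curve points, the integer 0 plays the role of the identity, and the function names are hypothetical.

```python
# Assumption: plain integers under addition stand in for elliptic curve points.

def small_multiple(k, pt):
    """k * pt using only doubling and addition (double-and-add)."""
    acc = 0                      # identity of the stand-in group
    addend = pt
    while k:
        if k & 1:
            acc = acc + addend   # point addition
        addend = addend + addend # point doubling
        k >>= 1
    return acc


def reduce_row0(row0):
    """Multiply column c (counting from 1) by c and add everything up."""
    total = 0
    for c, pt in enumerate(row0, start=1):
        total = total + small_multiple(c, pt)
    return total


final = reduce_row0([4, 5, 6])   # 1*4 + 2*5 + 3*6
```

Here `final` is 32. Since the multipliers are at most 2^ζ - 1, each multiplication needs only a handful of doublings and additions.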

Because the transverse and longitudinal reductions in the original scheme cannot be parallelized and can only be executed serially, subsequent calculations must wait for them to complete, and most of the hardware is idle during the serial calculation. The present application eliminates this serial overhead by reducing the number of serial phases from thousands to one, so that the hardware is almost continuously in a working state and need not wait for the preceding calculation to end, which improves the overall acceleration performance.

As another possible implementation, the same calculation result may be obtained by eliminating only the horizontal reduction in the loop or only the vertical reduction in the loop, with the bucket matrix and the associated storage structure changed accordingly. The acceleration achieved this way is smaller than that of the full bucket matrix scheme, but performance is still higher than that of PipeZK.

According to the elliptic curve multi-scalar dot multiplication calculation optimization method provided by the embodiment of the application, a bucket matrix is designed to cache the intermediate variable points of the main calculation process, so that the Pippenger intermediate quantities need not be output. The calculation runs continuously in a pipelined manner until all calculations are finished, and only then are the horizontal reduction and the vertical reduction performed. The total number of serial calculations is thereby reduced from thousands to one, and the serial-parallel conversion stage between sub-batches is optimized away. Eliminating this conversion increases the parallelism of the algorithm and effectively extends the continuous working time of the pipeline, which improves overall performance.

Next, an elliptic curve multi-scalar point multiplication calculation optimization apparatus proposed according to an embodiment of the present invention is described with reference to the drawings.

FIG. 7 is a schematic structural diagram of an apparatus for optimizing multi-scalar dot product calculation of an elliptic curve according to an embodiment of the present invention.

As shown in fig. 7, the elliptic curve multi-scalar dot product calculation optimizing device includes: a cache module 100, a first optimization module 200, a second optimization module 300, and an output module 400.

The cache module 100 is configured to cache the intermediate variable points of the main calculation process using one row of the bucket matrix.

The first optimization module 200 is configured to cancel the horizontal reduction performed at the end of the main calculation process in the PipeZK algorithm while retaining the contents of the bucket matrix, and to proceed to the main calculation process of the next round.

The second optimization module 300 is configured to cancel the vertical reduction part of the calculation process at the end of each round.

The output module 400 is configured to perform the horizontal reduction and vertical reduction calculations on the bucket matrix after all rounds have finished, and to output the resulting elliptic curve points from the bucket matrix.

It should be noted that the foregoing explanation of the method embodiment is also applicable to the apparatus of this embodiment, and is not repeated herein.

According to the elliptic curve multi-scalar dot multiplication calculation optimization device provided by the embodiment of the application, a bucket matrix is designed to cache the intermediate variable points of the main calculation process, so that the Pippenger intermediate quantities need not be output. The calculation runs continuously in a pipelined manner until all calculations are finished, and only then are the horizontal reduction and the vertical reduction performed. The total number of serial calculations is thereby reduced from thousands to one, and the serial-parallel conversion stage between sub-batches is optimized away. Eliminating this conversion increases the parallelism of the algorithm and effectively extends the continuous working time of the pipeline, which improves overall performance.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art can combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. An elliptic curve multi-scalar dot product calculation optimization method is characterized by comprising the following steps:

2. The method of claim 1, wherein the bucket matrix is common

3. The method of claim 1, wherein performing the horizontal reduction and vertical reduction calculations on the bucket matrix comprises:

and performing horizontal reduction on each row of the bucket matrix to obtain a plurality of elliptic curve points, and performing vertical reduction on the plurality of elliptic curve points.

4. The method of claim 1, wherein performing the horizontal reduction and vertical reduction calculations on the bucket matrix comprises:

and performing vertical reduction on each column of the bucket matrix to obtain a plurality of elliptic curve points, and performing horizontal reduction on the plurality of elliptic curve points.

5. The method according to any one of claims 1 to 4,

in the horizontal reduction calculation, the calculation of different rows is executed in parallel or in series;

in the vertical reduction calculation, the calculations of different columns are performed in parallel or in series.

6. An elliptic curve multi-scalar dot product calculation optimization device, comprising:

7. The apparatus of claim 6, wherein the bucket matrix is common

8. The apparatus of claim 6, wherein the performing horizontal reduction and vertical reduction calculations on the bucket matrix comprises:

9. The apparatus of claim 6, wherein the performing horizontal reduction and vertical reduction calculations on the bucket matrix comprises:

10. The apparatus according to any one of claims 6 to 9,