FR2791157A1

FR2791157A1 - Integrated circuit comprises means for effecting modular operations according to Montgomery's method for use in cryptography, has a single multiplication accumulator circuit

Info

Publication number: FR2791157A1
Application number: FR9903410A
Authority: FR
Inventors: Alain Pomet
Original assignee: STMicroelectronics SA
Current assignee: STMicroelectronics SA
Priority date: 1999-03-17
Filing date: 1999-03-17
Publication date: 2000-09-22

Abstract

The circuit comprises a coprocessor (200) for effecting modular operations according to Montgomery's methods. The coprocessor has a single multiplication accumulator circuit (231). An Independent claim is made for a procedure for carrying out a modular operation using Montgomery's method.

Description

Dispositif et procédé de mise en oeuvre d'une opérationDevice and method for carrying out an operation

modulaire élémentaire selon la méthode de Montgomery. elementary modular according to Montgomery's method.

L'invention concerne un dispositif et un procédé de mise en oeuvre d'une opération modulaire élémentaire selon la méthode de Montgomery. Cette méthode permet d'effectuer des calculs modulaires dans un corps fini (ou corps de Galois) sans effectuer de divisions. Classiquement, les opérations modulaires dans les corps finis sont utilisées en cryptographie pour des applications telles que l'authentification de messages, The invention relates to a device and a method for implementing an elementary modular operation according to the Montgomery method. This method allows you to perform modular calculations in a finite field (or Galois body) without dividing. Conventionally, modular operations in finite fields are used in cryptography for applications such as message authentication,

l'identification d'un utilisateur et l'échange de clés. the identification of a user and the exchange of keys.

De tels exemples d'applications sont décrits par exemple Examples of such applications are described, for example

dans la demande de brevet FR-A-2 679 054 (ci-après Dl). in the patent application FR-A-2 679 054 (hereinafter D1).

On trouve dans le commerce des circuits intégrés dédiés à de telles applications, par exemple le produit fabriqué par STMicroelectronics S.A. et référencé STl6CF54, bâti autour d'une association de type unité centrale - coprocesseur arithmétique, et dédié à la mise en oeuvre de calculs modulaires. Le coprocesseur utilisé permet de traiter des multiplications modulaires, en utilisant la méthode de Montgomery. Il fait l'objet de la Commercially available integrated circuits dedicated to such applications, for example the product manufactured by STMicroelectronics SA and referenced STl6CF54, built around an association of CPU type - arithmetic coprocessor, and dedicated to the implementation of modular calculations . The coprocessor used makes it possible to process modular multiplications, using the Montgomery method. It is the subject of

demande de brevet EP-A-0 601 907 (ci-après D2). patent application EP-A-0 601 907 (hereinafter D2).

L'opération de base, dite Pfield, consiste, à partir de trois données binaires, A (multiplicande), B (multiplieur inférieur à N) et N (modulo), codées sur un nombre entier n de bits, à produire une donnée binaire notée P(A, B)N codée sur n bits, telle que P(A, B)N = A * B * I mod N, avec I = 2-n mod N. Pour cela, on considère que les données sont codées sur m mots de k bits, avec m * k = n, et on fournit les mots A et B à un circuit de multiplication ayant une entrée série, une entrée The basic operation, called Pfield, consists, from three binary data, A (multiplicand), B (multiplier less than N) and N (modulo), coded on an integer n of bits, to produce a binary data denoted P (A, B) N coded on n bits, such that P (A, B) N = A * B * I mod N, with I = 2-n mod N. For this, it is considered that the data are coded on m words of k bits, with m * k = n, and the words A and B are supplied to a multiplication circuit having a serial input, an input

parallèle et une sortie série.parallel and a serial output.

Pour le coprocesseur décrit dans D2, on a k = 32 et m = 8 ou 16. La figure 1 représente le coprocesseur d'arithmétique modulaire dévoilé par D2. Ce coprocesseur comprend les éléments suivants: - trois registres à décalage 10, 11, et 12, de m * k bits, destinés à recevoir respectivement le multiplieur B, le résultat S, et le modulo N, - des multiplexeurs 13 à 15 dont les sorties sont reliées respectivement aux entrées des registres 10 à 12, - trois registres à décalage 16, 17, et 18, de k bits, ayant une entrée série et une sortie parallèle, destinés à recevoir respectivement k bits du multiplicande A, un paramètre de calcul noté J0, un résultat intermédiaire noté Y0, - deux circuits de multiplication 19 et 20 ayant chacun une entrée série, une entrée parallèle de k bits et une sortie série, - deux bascules parallèles 21 et 22 de k bits qui servent de tampon aux circuits de multiplication 19 et , - un multiplexeur 23 qui sert à relier le registre 22 soit au registre 17, soit au registre 18, - trois multiplexeurs 24, 25 et 26 servant à aiguiller les données sur les entrées des circuits de multiplication 19 et 20, - trois circuits de soustraction 27, 28, et 29 comportant chacun deux entrées séries et une sortie série, deux circuits d'addition 30 et 31, ayant chacun deux entrées séries et une sortie série, - trois cellules à retard 32, 33 et 34 qui sont en fait des registres à décalage de k bits, et qui servent à retarder les données de k cycles d'horloge pour masquer le temps de calcul des circuits de multiplication 19 et , - un circuit de comparaison 35, - deux multiplexeurs 36 et 37 qui permettent le contrôle des circuits de soustraction 27 et 28, - un multiplexeur 38, et For the coprocessor described in D2, we have k = 32 and m = 8 or 16. FIG. 1 represents the modular arithmetic coprocessor disclosed by D2. This coprocessor comprises the following elements: three shift registers 10, 11, and 12, of m * k bits, intended to receive respectively the multiplier B, the result S, and the modulo N, - multiplexers 13 to 15 whose outputs are respectively connected to the inputs of the registers 10 to 12, - three shift registers 16, 17, and 18, of k bits, having a serial input and a parallel output, intended to receive respectively k bits of the multiplicand A, a parameter of a calculation denoted J0, an intermediate result denoted by Y0, two multiplication circuits 19 and 20 each having a serial input, a parallel input of k bits and a serial output, two parallel latches 21 and 22 of k bits which serve as a buffer for multiplication circuits 19 and, - a multiplexer 23 which serves to connect the register 22 either to the register 17 or to the register 18, - three multiplexers 24, 25 and 26 serving to route the data on the inputs of the multiplier circuits. cation 19 and 20, three subtraction circuits 27, 28, and 29 each having two series inputs and one series output, two addition circuits 30 and 31, each having two series inputs and one series output, three delay cells. 32, 33 and 34 which are in fact k-bit shift registers, and which serve to delay the data of k clock cycles to mask the calculation time of the multiplication circuits 19 and, - a comparison circuit 35, two multiplexers 36 and 37 which allow the control of the subtraction circuits 27 and 28, a multiplexer 38, and

- un démultiplexeur 39.a demultiplexer 39.

Pour plus de détails sur la réalisation de ces For more details on the realization of these

éléments on peut se référer à D2. elements we can refer to D2.

Pour réaliser une opération élémentaire dite PField du type PField(A, B)N = A * B * I mod N, A et B étant codés sur m mots de k bits, et I étant une erreur égale à 2-m*k, on réalise m fois l'itération de boucle suivante avec i un indice variant de 0 à m-1: X = S(i-1) + Ai * B, Y0 = (X * J0) mod 2k, To carry out a PField elementary operation PField (A, B) N = A * B * I mod N, A and B being coded on m words of k bits, and I being an error equal to 2-m * k, the following loop iteration is carried out m times with i an index varying from 0 to m-1: X = S (i-1) + Ai * B, Y0 = (X * J0) mod 2k,

Z = X + (N * Y0)Z = X + (N * Y0)

S(i) = Z \ 2k, \ étant une division entière, si S(i) est supérieur à N alors on soustrait N à S(i) avant la prochaine itération, avec S(-1) = 0, Ai étant le mot de k bits de poids i, J0 étant un mot de k bits défini par l'équation S (i) = Z \ 2k, \ being an integer division, if S (i) is greater than N then we subtract N to S (i) before the next iteration, with S (-1) = 0, where Ai is the word of k bits of weight i, J0 being a k-bit word defined by the equation

((N * J0) + 1) mod 2k = 0.((N * J0) + 1) mod 2k = 0.

Le coprocesseur de la figure 1 permet d'effectuer une itération complète par un décalage simultané de m * k bits des registres 10 à 12 contenant respectivement B, S(i-1) et N suivi d'un décalage de 2 * k bits du registre 12 pour mémoriser S(i), le mot Ai étant chargé dans le registre 21, et le mot J0 étant chargé dans le registre 17. Pour réaliser le calcul complet de PField(A, B)N, il suffit de répéter m fois chaque itération en changeant le mot Ai contenu dans le registre 21 lors de chaque itération. L'opération " X = S(i-1) + Ai * B " se fait à l'aide des circuits de multiplication 19 et d'addition 30. L'opération " Y0 = (X * J0) mod 2k " se fait, lors des k premiers décalages, dans le circuit de multiplication 20, en ayant pris soin de mémoriser J0 dans le registre 22, le résultat Y0 étant mémorisé dans le registre 18. L'opération " Z = X + (N * Y0) ", N et X ayant été retardés de k bits dans les cellules à retard 32 et 34 et YO0 ayant été mis dans le registre 22, s'effectue à l'aide des circuits de multiplication 20 et d'addition 31. L'opération " S(i) = Z \ 2k " est réalisée par décalage de k bits. La comparaison de S(i) avec N s'effectue par la soustraction de N à S(i) dans le circuit de soustraction 29, N étant retardé de k bits dans la cellule 33, un éventuel débordement étant détecté et mémorisé dans le circuit de comparaison 35 pour connaître le résultat de la comparaison. La soustraction de N à S(i) se faisant lors de l'itération suivante dans The coprocessor of FIG. 1 makes it possible to carry out a complete iteration by a simultaneous shift of m * k bits of the registers 10 to 12 respectively containing B, S (i-1) and N followed by a 2 * k bits offset of register 12 to memorize S (i), the word Ai being loaded in the register 21, and the word J0 being loaded into the register 17. To perform the complete calculation of PField (A, B) N, simply repeat m times each iteration by changing the word Ai contained in the register 21 during each iteration. The operation "X = S (i-1) + Ai * B" is done by means of the multiplication circuits 19 and the addition circuit 30. The operation "Y0 = (X * J0) mod 2k" is performed during the first k offsets in the multiplication circuit 20, having taken care to store J0 in the register 22, the result Y0 being stored in the register 18. The operation "Z = X + (N * Y0)" , N and X having been delayed by k bits in the delay cells 32 and 34 and YO0 having been set in the register 22, is carried out by means of the multiplication and addition circuits 31. The operation " S (i) = Z \ 2k "is performed by k-bit shift. The comparison of S (i) with N is carried out by the subtraction of N to S (i) in the subtraction circuit 29, N being delayed by k bits in the cell 33, a possible overflow being detected and stored in the circuit 35 for the result of the comparison. The subtraction of N to S (i) occurring during the next iteration in

le circuit de soustraction 28.the subtraction circuit 28.

De nombreuses améliorations ont été réalisées sur ce circuit. Les améliorations ayant pour but d'aller plus vite et/ou de réduire la taille du circuit et/ou de réduire la consommation du circuit et/ou d'apporter des fonctionnalités supplémentaires sans augmenter la taille du circuit de manière considérable. L'homme du métier peut se reporter entre autre aux publications des demandes de brevets européens EP - 0 712 070, EP - 0 712 Many improvements have been made on this circuit. The improvements are intended to go faster and / or reduce the size of the circuit and / or reduce the consumption of the circuit and / or provide additional functionality without increasing the size of the circuit significantly. The person skilled in the art can refer inter alia to the publications of the European patent applications EP-0 712 070, EP-0 712

071, EP - 0 712 072, EP - 0 778 518, EP - 0 784 262, EP - 071, EP - 0 712 072, EP - 0 778 518, EP - 0 784 262, EP -

0 785 502, EP - 0 785 503, EP - 0 793 165, EP - 0 853 0 785 502, EP-0 785 503, EP-0 793 165, EP-0 853

275, et également à la publication de la demande de 275, and also to the publication of the application for

brevet internationale WO/97 25668.International Patent WO / 97 25668.

Il est également connu, de la publication de la demande de brevet européen EP - 0 566 498 (ci-après D3), un autre circuit permettant de calculer l'opération élémentaire P(A, B)N = A * B * I mod N, avec I = 2-n et n la taille de A, B ou N. Le circuit de D3 utilise un unique circuit de multiplication parallèle/série, représenté dans D3 sous la forme d'un additionneur parallèle couplé à un registre à décalage. Le circuit de D3 ne reproduit pas exactement l'algorithme de Montgomery et utilise une donnée intermédiaire égale à (N-1)/2+l. Le circuit de D3 utilise un circuit de multiplication disposant d'une entrée parallèle de n bits et se limite à des opérandes de calcul de taille figée. Par ailleurs, la taille du circuit de D3 est proportionnelle à la taille des opérandes utilisés, la surface ainsi occupée étant considérable. La présente invention a pour but d'améliorer l'état de la technique en proposant un coprocesseur dont les performances en vitesse de traitement sont améliorées par rapport au circuit de D2, tout en occupant une surface plus réduite de silicium. Le coprocesseur de l'invention utilise un unique circuit de multiplication d'architecture voisine de D3. La présente invention se rapporte également au procédé de calcul mis en oeuvre pour It is also known, from the publication of the European patent application EP-0 566 498 (hereinafter D3), another circuit making it possible to calculate the elementary operation P (A, B) N = A * B * I mod N, with I = 2-n and n the size of A, B or N. The D3 circuit uses a single parallel / series multiplication circuit, represented in D3 as a parallel adder coupled to a shift register . The circuit of D3 does not exactly reproduce the Montgomery algorithm and uses an intermediate data equal to (N-1) / 2 + 1. The circuit of D3 uses a multiplication circuit having a parallel input of n bits and is limited to fixed-size calculation operands. Moreover, the size of the circuit of D3 is proportional to the size of the operands used, the surface thus occupied being considerable. The present invention aims to improve the state of the art by providing a coprocessor whose processing speed performance is improved compared to the D2 circuit, while occupying a smaller area of silicon. The coprocessor of the invention uses a single architecture multiplication circuit close to D3. The present invention also relates to the calculation method implemented for

réaliser une opération modulaire.perform a modular operation.

L'invention a pour objet un circuit intégré comprenant un coprocesseur d'arithmétique modulaire comportant des moyens de mémorisation pour mémoriser et fournir en série des premier et deuxième opérandes A et B, un modulo N et un résultat S, avec A un entier codé sur a * k bits, a étant un entier non nul au plus égal à m, et avec B, N et S qui sont des entiers codés sur au plus m * k bits, m et k étant des entiers supérieurs à 1, et des moyens de calcul pour effectuer des opérations modulaires selon la méthode de Montgomery, caractérisé en ce que les moyens de calcul comportent un circuit pour calculer une donnée intermédiaire Yo; une première bascule de k bits pour mémoriser un mot Ai de k bits de A; une deuxième bascule de k bits pour mémoriser soit le mot de poids le plus faible de N soit Y0; un circuit parallèle d'addition connecté pour additionner les mots contenus dans les première et deuxième bascules; un dispositif de sélection couplé aux sorties des première et deuxième bascules et du circuit parallèle d'addition afin de fournir sur une sortie parallèle soit le mot contenu dans la première bascule, soit le mot contenu dans la deuxième bascule, soit la somme des mots contenus dans les première et deuxième bascules, soit "zéro", en fonction d'une part d'un bit de B, et d'autre part soit d'un bit de Y0 soit d'un bit de N; un circuit accumulateur qui additionne, décale d'un bit et mémorise les mots fournis successivement par le dispositif de sélection avec un bit d'un résultat actualisé S, le bit sortant du circuit The invention relates to an integrated circuit comprising a modular arithmetic coprocessor comprising storage means for storing and supplying in series first and second operands A and B, a modulo N and a result S, with A an integer coded on a * k bits, a being a non-zero integer at most equal to m, and with B, N and S being integers coded on at most m * k bits, m and k being integers greater than 1, and means method for performing modular operations according to the Montgomery method, characterized in that the calculating means comprise a circuit for calculating an intermediate data Yo; a first k-bit latch for storing a word Ai of k bits of A; a second k-bit latch for storing either the least significant word of N or Y0; a connected addition parallel circuit for adding the words contained in the first and second latches; a selection device coupled to the outputs of the first and second flip-flops and the parallel addition circuit to provide on a parallel output either the word contained in the first flip-flop, or the word contained in the second flip-flop, or the sum of the words contained in the first and second flip-flops, ie "zero", as a function of one part of a bit of B, and on the other hand of either a bit of Y0 or a bit of N; an accumulator circuit which adds, shifts by one bit and stores the words successively supplied by the selection device with a bit of an updated result S, the bit coming out of the circuit

accumulateur devenant un nouveau résultat actualisé. accumulator becoming a new updated result.

Une amélioration de l'invention consiste en ce que les moyens de calcul comportent un premier registre à décalage de k bits pour recevoir d'une part un mot de k bits de A et transmettre ledit mot en parallèle à la première bascule et d'autre part N afin de retarder N de An improvement of the invention consists in that the calculation means comprise a first k-bit shift register for receiving on the one hand a k-bit word of A and transmitting said word in parallel to the first flip-flop and on the other hand N share in order to delay N of

k cycles d'un signal d'horloge.k cycles of a clock signal.

L'invention concerne également un procédé pour effectuer une opération modulaire selon la méthode de Montgomery par décalage en série de premier et deuxième opérandes A et B, d'un modulo N et d'un résultat actualisé à travers des moyens de calcul, avec A un entier codé sur a * k bits, a étant un entier non nul au plus égal à m, et avec B, N et S des entiers codés sur au plus m * k bits, m et k étant des entiers supérieurs à 1 caractérisé en ce qu'il comporte la répétition des étapes suivantes, i étant un indice variant de 0 à m-l: mémorisation d'un mot Ai de k bits correspondant au mot de poids i de A dans une première bascule de k bits; calcul d'une donnée intermédiaire Y0 telle que Y0 = ((S(i-1) + (Ai * B)) * J0) mod 2k, avec S(i-1) qui correspond au (i-1)-ième résultat actualisé, S(-1) étant égal à 0, et J0 étant un mot de k bits résolvant l'équation ((J0 * N) + 1) mod 2k = 0; mémorisation du mot de k bits de poids faible de N puis de Y0 dans une deuxième bascule de k bits; addition dans un circuit parallèle d'addition des mots contenus dans les première et deuxième bascules; sélection et fourniture soit du mot contenu dans la première bascule, soit du mot contenu dans la deuxième bascule, soit de la somme des mots contenus dans les première et deuxième bascules, soit "zéro", en fonction d'une part d'un bit de B, et d'autre part soit d'un bit de Y0 soit d'un bit de N; Additions successives dans un circuit accumulateur des mots fournis par le dispositif de sélection pour chaque paire de bits de B et de N, le résultat de chaque addition étant additionné à un bit du précédent résultat actualisé S(i-1) puis décalé d'un bit et mémorisé entre chaque addition, le bit sortant de l'accumulateur lors du décalage correspondant à un The invention also relates to a method for performing a modular operation according to the Montgomery method by series shifting of first and second operands A and B, a modulo N and an updated result through calculation means, with A an integer coded on a * k bits, a being a non-zero integer at most equal to m, and with B, N and S being integers encoded on at most m * k bits, m and k being integers greater than 1, characterized in it comprises the repetition of the following steps, i being an index varying from 0 to ml: storing a word Ai of k bits corresponding to the weight word i of A in a first flip-flop of k bits; calculating an intermediate data Y0 such that Y0 = ((S (i-1) + (Ai * B)) * J0) mod 2k, with S (i-1) corresponding to the (i-1) -th result updated, S (-1) being equal to 0, and J0 being a k-bit word solving the equation ((J0 * N) + 1) mod 2k = 0; storing the word of k least significant bits of N and Y0 in a second flip-flop of k bits; adding in a parallel addition circuit the words contained in the first and second latches; selecting and supplying either the word contained in the first flip-flop, or the word contained in the second flip-flop, or the sum of the words contained in the first and second flip-flops, or "zero", depending on one hand a bit B, and on the other hand either a bit of Y0 or a bit of N; Successive additions in an accumulator circuit of the words supplied by the selection device for each pair of bits of B and N, the result of each addition being added to a bit of the previous updated result S (i-1) and then shifted by one bit and stored between each addition, the bit coming out of the accumulator during the shift corresponding to a

nouveau résultat actualisé S(i).new updated result S (i).

Une amélioration consiste en ce que l'on effectue une comparaison du résultat sortant de l'accumulateur avec N retardé de k cycles d'un signal d'horloge, et en ce que l'on utilise un même premier registre à décalage de k bits pour retarder N et pour pouvoir charger les mots Ai dans la première bascule. De plus, la mémorisation d'un mot Ai dans la première bascule s'effectue par k décalages du mot Ai dans le premier registre puis chargement parallèle dans la première bascule après que le mot Ai ait été chargé en entier dans le premier registre. Le calcul de chaque bit de la donnée intermédiaire Y0 s'effectue un cycle d'horloge avant que An improvement consists in that a comparison is made of the output of the accumulator with N delayed by k cycles of a clock signal, and that the same first k-bit shift register is used. to delay N and to be able to load the words Ai in the first flip-flop. In addition, the memorization of a word Ai in the first flip-flop is performed by k offsets of the word Ai in the first register and then parallel loading in the first flip-flop after the word Ai has been loaded in full in the first register. The calculation of each bit of the intermediate data Y0 is carried out one clock cycle before

l'on ait besoin dudit chaque bit.each of these bits is needed.

L'invention sera mieux comprise et d'autres particularités et avantages apparaîtront à la lecture de The invention will be better understood and other features and advantages will become apparent on reading

la description qui va suivre, la description faisant the description that follows, the description

référence aux dessins annexés parmi lesquels: la figure 1 représente un coprocesseur d'arithmétique modulaire selon l'état de la technique, la figure 2 représente un coprocesseur d'arithmétique modulaire selon l'invention, et les figures 3 à 7 représentent de manière détaillée 1 represents a modular arithmetic coprocessor according to the state of the art, FIG. 2 represents a modular arithmetic coprocessor according to the invention, and FIGS. 3 to 7 represent in detail

différents éléments du coprocesseur de la figure 2. different elements of the coprocessor of FIG.

La figure 1 ayant été décrite précédemment et représentant l'état de la technique, elle ne sera pas FIG. 1 having been described above and representing the state of the art, it will not be

décrite plus en détail.described in more detail.

La figure 2 représente le coprocesseur 200 d'arithmétique modulaire, selon un mode préféré de réalisation. Afin de ne pas surcharger le schéma, seul le cheminement des données a été représenté sur cette figure 2. Une machine d'état (non représentée) envoie les signaux de commande nécessaires aux différents éléments fonctionnels du coprocesseur 200. Le coprocesseur 200 comporte les éléments suivants: - Des premier à quatrième dispositifs de mémorisation 201 à 204 contenant respectivement des données A, B, N et S. Les données A, B, N et S sont des données codées sur au plus m mots de k bits. Les dispositifs de mémorisation 201 à 204 permettent de fournir de manière indépendante n'importe quel mot de k bits de la donnée mémorisée. Chaque dispositif de mémorisation 201 à 204 dispose de première et deuxième Fig. 2 shows coprocessor 200 of modular arithmetic, according to a preferred embodiment. In order not to overload the diagram, only the data flow has been represented in this FIG. 2. A state machine (not shown) sends the necessary control signals to the various functional elements of the coprocessor 200. The coprocessor 200 comprises the elements following: - First to fourth storage devices 201 to 204 respectively containing data A, B, N and S. The data A, B, N and S are data encoded on at most m words of k bits. The storage devices 201 to 204 make it possible to independently supply any k-bit word of the stored data item. Each storage device 201 to 204 has first and second

entrées séries et d'une sortie de donnée de type série. serial inputs and serial data output.

La première entrée de chaque dispositif de mémorisation The first input of each storage device

201 à 204 est connectée à une borne d'entrée Din. 201 to 204 is connected to a Din input terminal.

- Des premier et deuxième circuits de soustraction 205 et 206 de type série disposent de première et deuxième entrées et d'une sortie de type série. La première entrée du premier circuit de soustraction 205 est connectée à la sortie du deuxième dispositif de mémorisation 202. La première entrée du deuxième circuit de soustraction 206 est connectée à la sortie du First and second series-type subtraction circuits 205 and 206 have first and second inputs and a serial type output. The first input of the first subtraction circuit 205 is connected to the output of the second storage device 202. The first input of the second subtraction circuit 206 is connected to the output of the second

quatrième dispositif de mémorisation 204. fourth storage device 204.

- Des premier et deuxième multiplexeurs 207 et 208 sont couplés respectivement aux deuxièmes entrées des First and second multiplexers 207 and 208 are respectively coupled to the second inputs of the

premier et deuxième circuits de soustraction 205 et 206. first and second subtraction circuits 205 and 206.

Les premier et deuxième multiplexeurs 207 et 208 disposent de deux entrées chacun, l'une des entrées recevant un "zéro" logique, et l'autre des entrées étant connectée à la sortie du troisième dispositif de mémorisation 203. L'association des premier et deuxième circuits de soustraction 205 et 206 avec les premier et deuxième multiplexeurs 207 et 208 permet de soustraire soit "zéro" soit la donnée sortante du troisième dispositif de mémorisation 203 aux données sortantes des deuxième et quatrième dispositifs de mémorisation 202 et 204. - Des premier à quatrième circuits de retard 211 à 214 servent à synchroniser les données en les retardant d'un cycle du signal d'une horloge de cadencement. Chacun des circuits de retard 211 à 214 dispose d'une entrée et d'une sortie, chaque circuit de retard étant par exemple constitué d'une simple bascule synchrone de type D. L'entrée du premier circuit de retard 211 est connectée à The first and second multiplexers 207 and 208 have two inputs each, one of the inputs receiving a "zero" logic, and the other of the inputs being connected to the output of the third storage device 203. The association of the first and second second subtraction circuits 205 and 206 with the first and second multiplexers 207 and 208 allows subtracting either "zero" or the outgoing data from the third storage device 203 to the outgoing data of the second and fourth storage devices 202 and 204. - First Fourth delay circuits 211 to 214 serve to synchronize the data by delaying them by one cycle of a clock signal. Each of the delay circuits 211 to 214 has an input and an output, each delay circuit consisting for example of a simple D-type synchronous flip-flop. The input of the first delay circuit 211 is connected to

la sortie du premier circuit de soustraction 205. the output of the first subtraction circuit 205.

L'entrée du deuxième circuit de retard 212 est connectée The input of the second delay circuit 212 is connected

à la sortie du troisième dispositif de mémorisation 203. at the output of the third storage device 203.

L'entrée du troisième circuit de retard 213 est connectée à la sortie du deuxième circuit de retard 212. L'entrée du quatrième circuit de retard 214 est connectée à la The input of the third delay circuit 213 is connected to the output of the second delay circuit 212. The input of the fourth delay circuit 214 is connected to the

sortie du deuxième circuit de soustraction 206. output of the second subtraction circuit 206.

- Un premier registre 221 à décalage de k bits dispose d'une entrée série, d'une sortie série et d'une sortie parallèle. Ce premier registre 221 sert d'une part de registre tampon pour les mots de A et d'autre part de retardateur de k cycles d'horloge pour N. - Un deuxième registre 222 à décalage de k bits dispose d'une entrée série et d'une sortie parallèle. Ce deuxième registre 222 sert d'une part de registre tampon pour le mot No de poids le plus faible de N et d'autre A first k-bit shift register 221 has a serial input, a serial output and a parallel output. This first register 221 serves firstly as a buffer register for the words of A and secondly as a timer for k clock cycles for N. - A second k-bit offset register 222 has a serial input and a parallel output. This second register 222 serves firstly as a buffer register for the word No of the weakest weight of N and other

part pour une donnée intermédiaire Y0. share for an intermediate data Y0.

- Un troisième multiplexeur 223 est associé au premier registre 221. Le troisième multiplexeur 223 dispose de trois entrées et d'une sortie, la sortie étant connectée à l'entrée du premier registre 221. L'une des entrées du troisième multiplexeur 223 est connectée à la sortie du premier dispositif de mémorisation 201. Une autre des entrées du troisième multiplexeur est connectée à la sortie du premier circuit de soustraction 205. La dernière des entrées du troisième multiplexeur 223 est A third multiplexer 223 is associated with the first register 221. The third multiplexer 223 has three inputs and one output, the output being connected to the input of the first register 221. One of the inputs of the third multiplexer 223 is connected. at the output of the first storage device 201. Another of the inputs of the third multiplexer is connected to the output of the first subtraction circuit 205. The last of the inputs of the third multiplexer 223 is

connectée à la sortie du troisième circuit de retard 213. connected to the output of the third delay circuit 213.

- Un quatrième multiplexeur 224 est associé au deuxième registre 222. Le quatrième multiplexeur 224 dispose de première et deuxième entrées et d'une sortie, la sortie étant connectée à l'entrée du deuxième registre 222. La première entrée du quatrième multiplexeur 224 est A fourth multiplexer 224 is associated with the second register 222. The fourth multiplexer 224 has first and second inputs and an output, the output being connected to the input of the second register 222. The first input of the fourth multiplexer 224 is

- Des première et deuxième bascules 225 et 226 à verrouillage de k bits servent à mémoriser pendant le calcul d'une part un mot de A et d'autre part le mot No de First and second latches 225 and 226 with a lock of k bits serve to memorize during the calculation on the one hand a word of A and on the other hand the word No of

poids le plus faible de N ou la donnée intermédiaire Y0. lowest weight of N or intermediate data Y0.

Chacune des bascules 225 et 226 comporte une entrée parallèle et une sortie parallèle, les entrées des première et deuxième bascules 225 et 226 étant respectivement connectées aux sorties parallèles des Each of the flip-flops 225 and 226 has a parallel input and a parallel output, the inputs of the first and second flip-flops 225 and 226 being respectively connected to the parallel outputs of the flip-flops.

premier et deuxième registres 221 et 222. first and second registers 221 and 222.

- Un circuit d'addition 227, disposant de deux entrées parallèles et d'une sortie parallèle, a ses deux entrées connectées respectivement aux sorties des première et deuxième bascules 225 et 226. La sortie du circuit d'addition 227 fournit ainsi la somme des An addition circuit 227, having two parallel inputs and a parallel output, has its two inputs respectively connected to the outputs of the first and second flip-flops 225 and 226. The output of the addition circuit 227 thus provides the sum of the

contenus des première et deuxième bascules 225 et 226. contents of the first and second flip-flops 225 and 226.

- Un dispositif de sélection 228 est connecté aux sorties des première et deuxième bascules 225 et 226 et à la sortie du circuit d'addition 227 afin de pouvoir fournir sur une sortie parallèle soit le contenu de la première bascule 225, soit le contenu de la deuxième bascule 226, soit la somme des contenus des première et deuxième bascules 225 et 226, soit "zéro". Le dispositif de sélection 228 dispose en outre de première et deuxième entrées de sélection qui reçoivent respectivement un premier signal de sélection SELA et un deuxième signal de sélection SELY. Lorsque les premier et deuxième signaux SELA et SELY de sélection sont tous deux à un niveau logique "zéro", alors la sortie du dispositif de sélection 228 fournit sur sa sortie le nombre "zéro" codé sur k + 1 bits. Lorsque le premier signal de sélection Il SELA est à un niveau logique " un " et que le deuxième signal de sélection SELY est à un niveau logique " zéro ", alors la sortie du dispositif de sélection 228 fournit sur sa sortie le contenu de la première bascule 225. Lorsque le premier signal de sélection SELA est à un niveau logique " zéro " et que le deuxième signal de sélection SELY est à un niveau logique " un ", alors la sortie du dispositif de sélection 228 fournit sur sa sortie le contenu de la deuxième bascule 226. Lorsque les premier et deuxième signaux SELA et SELY de sélection sont tous deux à un niveau logique " un ", alors la sortie du dispositif de sélection 228 fournit sur sa sortie la somme des contenus des première et deuxième A selection device 228 is connected to the outputs of the first and second flip-flops 225 and 226 and to the output of the addition circuit 227 in order to be able to provide on a parallel output either the contents of the first flip-flop 225 or the contents of the second flip-flop 226, the sum of the contents of the first and second latches 225 and 226, or "zero". The selection device 228 further has first and second selection inputs which respectively receive a first selection signal SELA and a second selection signal SELY. When the first and second selection signals SELA and SELY are both at a "zero" logic level, then the output of the selection device 228 provides on its output the number "zero" encoded on k + 1 bits. When the first selection signal Il SELA is at a logic level "one" and the second selection signal SELY is at a logic level "zero", then the output of the selection device 228 provides on its output the contents of the first flip-flop 225. When the first selection signal SELA is at a logic level "zero" and the second selection signal SELY is at a logic level "one", then the output of the selection device 228 provides on its output the content of the second flip-flop 226. When the first and second selection signals SELA and SELY are both at a "one" logic level, then the output of the selection device 228 provides on its output the sum of the contents of the first and second

bascules 225 et 226.flip-flops 225 and 226.

- Un cinquième multiplexeur 229, disposant de deux entrées et d'une sortie, a sa sortie connectée à la première entrée de sélection du dispositif de sélection 228. l'une des entrées du cinquième multiplexeur 229 est A fifth multiplexer 229, having two inputs and an output, has its output connected to the first selection input of the selection device 228. one of the inputs of the fifth multiplexer 229 is

connectée à la sortie du premier circuit de retard 211. connected to the output of the first delay circuit 211.

L'autre des entrées du cinquième multiplexeur 229 reçoit The other of the inputs of the fifth multiplexer 229 receives

un " zéro " logique.a logical "zero"

- Un sixième multiplexeur 230, disposant de première à troisième entrées et d'une sortie, a sa sortie connectée à la deuxième entrée de sélection du dispositif de sélection 228. La première entrée du sixième multiplexeur 230 reçoit un " zéro " logique. La deuxième entrée du sixième multiplexeur 230 est connectée à la A sixth multiplexer 230, having first to third inputs and an output, has its output connected to the second selection input of the selection device 228. The first input of the sixth multiplexer 230 receives a logical "zero". The second input of the sixth multiplexer 230 is connected to the

sortie du troisième circuit de retard 213. output of the third delay circuit 213.

- Un circuit accumulateur 231 effectue une double multiplication par addition successive des mots sortant du dispositif de sélection 228. Le circuit accumulateur 231 comporte une entrée parallèle connectée à la sortie du dispositif de sélection 228, une entrée série connectée à la sortie du quatrième circuit de retard 214, une sortie de résultat et trois sorties de valeurs internes notées IO à I2. Lors de chaque cycle de l'horloge de séquencement du coprocesseur 200, le circuit accumulateur additionne un bit présent à l'entrée série avec un mot présent à l'entrée parallèle et avec un résultat interne. Le nouveau résultat est ensuite décalé pour devenir un nouveau résultat interne. - Un septième multiplexeur 233 dispose de deux entrées et d'une sortie. L'une des entrées du septième multiplexeur 233 est connectée à la sortie de résultat du circuit accumulateur 231. La sortie du septième multiplexeur 233 est connectée aux deuxièmes entrées des - An accumulator circuit 231 performs a double multiplication by successive addition of the words leaving the selection device 228. The accumulator circuit 231 comprises a parallel input connected to the output of the selection device 228, a serial input connected to the output of the fourth circuit delay 214, a result output and three outputs of internal values denoted I0 to I2. During each cycle of the coprocessor sequencing clock 200, the accumulator circuit adds a bit present at the serial input with a word present at the parallel input and with an internal result. The new result is then shifted to become a new internal result. A seventh multiplexer 233 has two inputs and one output. One of the inputs of the seventh multiplexer 233 is connected to the result output of the accumulator circuit 231. The output of the seventh multiplexer 233 is connected to the second inputs of the second multiplexer 233.

dispositifs de mémorisation 201 à 204. storage devices 201 to 204.

- Une porte ET 234 dispose de première et deuxième entrées et d'une sortie. La première entrée de la porte ET 234 est connectée à la sortie du premier circuit de soustraction 205. La deuxième entrée de la porte ET 234 est connectée à un conducteur de la sortie parallèle de la première bascule 225 qui correspond au bit de poids le An AND gate 234 has first and second inputs and an output. The first input of the AND gate 234 is connected to the output of the first subtraction circuit 205. The second input of the AND gate 234 is connected to a conductor of the parallel output of the first flip-flop 225 which corresponds to the least significant bit.

plus faible du mot contenu dans la première bascule 225. weaker of the word contained in the first flip-flop 225.

- Une porte OU Exclusif 235 dispose de première à cinquième entrées et d'une sortie. Une première entrée de la porte OU Exclusif 235 est connectée à la sortie de la porte ET 234. une deuxième entrée de la porte OU Exclusif 235 est connectée à la sortie du deuxième circuit de soustraction 206. les troisième à cinquième entrées de la porte OU Exclusif 235 sont connectées respectivement aux trois sorties de valeurs internes du circuit accumulateur 231. - Un huitième multiplexeur 236 dispose de deux entrées et d'une sortie. L'une des entrées du huitième multiplexeur 236 est connectée à la sortie série du premier registre 221. L'autre des entrées du huitième multiplexeur 236 est connectée à la sortie série du premier circuit de retard 211. La sortie du huitième multiplexeur 236 est connectée à l'autre entrée du - Exclusive OR gate 235 has first to fifth inputs and an output. A first input of the exclusive-OR gate 235 is connected to the output of the AND gate 234. a second input of the exclusive-OR gate 235 is connected to the output of the second subtraction circuit 206. the third to fifth inputs of the gate OR Exclusive 235 are respectively connected to the three internal value outputs of the accumulator circuit 231. - An eighth multiplexer 236 has two inputs and one output. One of the inputs of the eighth multiplexer 236 is connected to the serial output of the first register 221. The other of the inputs of the eighth multiplexer 236 is connected to the serial output of the first delay circuit 211. The output of the eighth multiplexer 236 is connected at the other entrance of

septième multiplexeur 233.seventh multiplexer 233.

- Un circuit de comparaison 232 disposant de deux entrées compare bit à bit le résultat sortant du circuit accumulateur 231 avec la donnée qui sort en série du huitième multiplexeur 236. Le résultat de la comparaison est ensuite transmis à un circuit de gestion (non représenté) du coprocesseur 200. La figure 2 représente un cheminement de données entre différents éléments fonctionnels. Le cheminement représenté à l'aide des conducteurs de liaisons et des différents multiplexeurs peut présenter de nombreuses variantes, l'important étant d'assurer des échanges de données entre les différents éléments de calcul et de mémorisation. Certains éléments de la figure 2 ne correspondent pas exactement à des éléments standards couramment utilisés par l'homme du métier. Les figures 3 à 7 A comparison circuit 232 having two inputs compares bit-by-bit the output from the accumulator circuit 231 with the data output in series from the eighth multiplexer 236. The result of the comparison is then transmitted to a management circuit (not shown) coprocessor 200. Figure 2 shows a data path between different functional elements. The path represented with the aid of the connection conductors and the various multiplexers can have many variants, the important thing being to ensure data exchanges between the different elements of calculation and storage. Some elements of Figure 2 do not exactly correspond to standard elements commonly used by those skilled in the art. Figures 3 to 7

précisent la structure de ces différents éléments. specify the structure of these different elements.

La figure 3 correspond à l'un des dispositifs de mémorisation 201 à 204. Le dispositif de mémorisation 201 comporte deux multiplexeurs 301 et 302 et des premier à FIG. 3 corresponds to one of the storage devices 201 to 204. The storage device 201 comprises two multiplexers 301 and 302 and

m-ième registres à décalage 303 notés également Ri à Rm. m-th shift registers 303 also denoted Ri to Rm.

Le multiplexeur 301 comporte des première à quatrième entrées et une sortie. Les première et deuxième entrées du multiplexeur 301 constituent les première et deuxième entrées du dispositif de mémorisation 201. La troisième entrée du multiplexeur 301 reçoit un "zéro" logique. Les premier à m-ième registres 303 sont des registres de k bits, à décalage, qui disposent d'une entrée série et d'une sortie série. Les entrées des premier à m-ième registres 303 sont connectées ensembles The multiplexer 301 has first to fourth inputs and an output. The first and second inputs of the multiplexer 301 constitute the first and second inputs of the storage device 201. The third input of the multiplexer 301 receives a logical "zero". The first to mth registers 303 are k-bit shift registers which have a serial input and a serial output. The inputs from the first to the mth registers 303 are connected together

à la sortie du multiplexeur 301.at the output of the multiplexer 301.

Le multiplexeur 302 comporte des première à m-ième entrées et une sortie. Les premières à m-ième entrées du multiplexeur 302 sont respectivement connectées aux sorties des premier à m-ième registres 303. La sortie du multiplexeur 302 est connectée à la quatrième entrée du Multiplexer 302 has first to mth inputs and an output. The first m-th inputs of the multiplexer 302 are respectively connected to the outputs of the first to mth registers 303. The output of the multiplexer 302 is connected to the fourth input of the multiplexer 302.

multiplexeur 301.multiplexer 301.

Des signaux de commande (non représentés) servent à sélectionner les entrées des multiplexeurs 301 et 302 et à valider le décalage de manière indépendante dans chacun des registres 303. Lorsque l'on désire mémoriser une donnée de m * k bits dans le dispositif de mémorisation 201, ladite donnée est rangée par mot de k bits dans chacun des registres 303. Pour mémoriser la donnée, il suffit d'effectuer k décalages du premier registre 303, puis k décalages du deuxième registre 303 et ainsi de suite jusqu'au m-ième registre 303, le multiplexeur 301 Control signals (not shown) serve to select the inputs of the multiplexers 301 and 302 and to validate the offset independently in each of the registers 303. When it is desired to store a data item of m * k bits in the storage device 201, said data is stored by word of k bits in each of the registers 303. To store the data, it suffices to carry out k offsets of the first register 303, then k offsets of the second register 303 and so on up to n register 303, the multiplexer 301

sélectionnant la source de la donnée. selecting the source of the data.

Pour fournir une donnée codée sur m * k bits, il suffit de décaler les uns après les autres les registres To provide data coded on m * k bits, it is sufficient to shift one after the other the registers

303 dans l'ordre de mémorisation de la donnée. 303 in the order of storing the data.

Le bouclage de la sortie du multiplexeur 302 sur la quatrième entrée du multiplexeur 301 permet de faire entrer dans l'un des registres 303 le mot de k bits qui est sorti simultanément. Ce bouclage assure la mémorisation des données que l'on sort pour des usages multiples. Comme on peut le remarquer il est possible d'utiliser de manière indépendante n'importe quel mot de The looping of the output of the multiplexer 302 on the fourth input of the multiplexer 301 makes it possible to enter in one of the registers 303 the k-bit word which is output simultaneously. This looping ensures the storage of data that is output for multiple uses. As can be seen it is possible to use independently any word of

k bits d'une donnée comportant plusieurs mots de k bits. k bits of a data item comprising several words of k bits.

Il est également possible de faire entrer un mot de k bits dans l'un des registres 303 pendant que l'on sort un It is also possible to enter a k-bit word in one of the registers 303 while

mot de k bits d'un autre des registres 303. k-bit word of another 303 registers.

La figure 4 représente le premier (ou le deuxième) circuit de soustraction 205 (ou 206). Le circuit de soustraction 205 comporte deux inverseurs 401 et 402, un additionneur élémentaire et deux bascules 404 et 405 de mémorisation de type D, connectés selon une technique Figure 4 shows the first (or second) subtraction circuit 205 (or 206). The subtraction circuit 205 comprises two inverters 401 and 402, an elementary adder and two latches 404 and 405 of type D storage connected by a technique

connue, comme indiqué sur la figure 4. known as shown in Figure 4.

Ce circuit de soustraction 205 produit un retard systématique d'un cycle d'horloge sur les données le traversant. Le deuxième circuit de retard 212 sert à compenser les retards produits sur les données qui sortent du troisième dispositif de mémorisation 203. De même, on pourrait également compenser les retards sur la sortie du premier dispositif de mémorisation 201. Cependant, les données sortantes du premier dispositif de mémorisation 201 n'ont pas besoin d'être synchroniser This subtraction circuit 205 produces a systematic delay of a clock cycle on the data passing therethrough. The second delay circuit 212 serves to compensate for the delays produced on the data coming out of the third storage device 203. Similarly, the delays on the output of the first storage device 201 could also be compensated. However, the outgoing data of the first memory device 201 do not need to be synchronize

avec les autres données.with the other data.

L'utilisation de circuit de soustraction 205 tel que représenté sur la figure 4 permet également de s'affranchir des premier, troisième et quatrième circuits de retard 211, 213 et 214. En effet, la bascule 404 produit un retard identique. Il suffit d'extraire le signal à l'entrée de la bascule 404 et de l'inverser pour obtenir le prochain bit sortant. Un inconvénient est de ne pas avoir un signal stable dès le front actif du signal d'horloge. Pour les systèmes fonctionnant avec une fréquence d'horloge peu élevée, cela permet d'économiser trois bascules D. Le circuit de la figure 5 représente le circuit de comparaison 232 de manière détaillée. Le circuit de comparaison 232 correspond à un circuit de soustraction sur lequel on extrait la retenue mémorisée et la donnée qui arrive sur la première entrée du circuit de soustraction, le circuit de soustraction étant bien évidemment simplifié. La retenue mémorisée estinversée puis rentre dans un OU logique avec la donnée présente sur la première entrée. Le résultat sortant du OU logique lorsque la totalité des données est rentrée dans le circuit de comparaison 232 permet de savoir laquelle des deux données est supérieure à l'autre. Le résultat est The use of subtraction circuit 205 as shown in Figure 4 also eliminates the first, third and fourth delay circuits 211, 213 and 214. Indeed, the latch 404 produces an identical delay. It is sufficient to extract the signal at the input of the flip-flop 404 and to invert it to obtain the next outgoing bit. A disadvantage is not to have a stable signal from the active edge of the clock signal. For systems operating with a low clock frequency, this saves three D flip-flops. The circuit of FIG. 5 represents the comparison circuit 232 in detail. The comparison circuit 232 corresponds to a subtraction circuit on which the stored hold is extracted and the data which arrives at the first input of the subtraction circuit, the subtraction circuit being obviously simplified. The stored hold is reversed and then enters a logical OR with the data present on the first entry. The output of the logical OR when all the data is entered in the comparison circuit 232 allows to know which of the two data is greater than the other. The result is

mémorisé dans une bascule D 501.stored in a D flip-flop 501.

La bascule D 501 dispose d'une entrée de donnée, d'une entrée d'horloge, d'une entrée de forçage à " un ", D flip-flop 501 has a data input, a clock input, a forcing input at "one",

d'une entrée de forçage à " zéro ", et d'une sortie. a "zero" forcing input, and an output.

L'entrée de donnée reçoit la donnée sortant du OU logique, l'entrée d'horloge reçoit un signal de chargement LD dont le front montant correspond à l'instant ou l'on désire obtenir le résultat de la comparaison. Les entrées de forçage à " un " et à " zéro ", reçoivent des signaux de prépositionnement ST et RST pour initialiser le circuit de comparaison 232. La sortie de la bascule 501 est connectée à un dispositif de The data input receives the data output from the logical OR, the clock input receives a load signal LD whose rising edge corresponds to the moment when it is desired to obtain the result of the comparison. The "one" and "zero" forcing inputs receive prepositioning signals ST and RST to initialize the comparison circuit 232. The output of the flip-flop 501 is connected to a device of FIG.

séquencement (non représenté) du coprocesseur 200. sequencing (not shown) of the coprocessor 200.

La figure 6 représente un élément du dispositif de sélection 228. Le dispositif de sélection comporte k + 1 éléments de ce type. Cet élément est constitué de trois porte ET 601 à 603 à trois entrées, deux portes ET 601 et 603 ayant une entrée inverseuse, et d'une porte OU 604 à trois entrées. Le rôle de cet élément est le même que celui d'un multiplexeur à quatre entrées dont la quatrième entrée reçoit un "zéro" logique. Dans le dispositif de sélection 228, l'élément correspondant au bit de poids le plus fort ne comporte que la porte ET centrale 602 car les première et deuxième bascules 224 et FIG. 6 represents an element of the selection device 228. The selection device comprises k + 1 elements of this type. This element consists of three AND gate 601 to 603 with three inputs, two AND gates 601 and 603 having an inverting input, and an OR gate 604 with three inputs. The role of this element is the same as that of a four-input multiplexer whose fourth input receives a logical "zero". In the selection device 228, the element corresponding to the most significant bit comprises only the central AND gate 602 because the first and second latches 224 and

225 ne disposent que de k bits.225 only have k bits.

La figure 7 représente un ensemble constitué par le circuit accumulateur 231 et le dispositif de sélection 228. L'ensemble ainsi constitué réalise deux multiplications avec addition des deux produits et addition d'une autre donnée en série. Si on appelle LATCHA la donnée présente dans la première bascule 225, LATCHY la donnée présente dans la deuxième bascule 226, SELA la donnée arrivant en série sur la première entrée de sélection du dispositif de sélection 228, SELY la donnée arrivant en série sur la deuxième entrée de sélection du dispositif de sélection, ES la donnée arrivant en série sur l'entrée série de l'accumulateur 231, et RES le résultat sortant de l'accumulateur 231, alors on effectue l'opération suivante: FIG. 7 represents an assembly constituted by the accumulator circuit 231 and the selection device 228. The assembly thus constituted produces two multiplications with the addition of the two products and the addition of another serial datum. If we call LATCHA the data present in the first latch 225, LATCHY the data present in the second flip-flop 226, SELA the data arriving in series on the first selection input of the selection device 228, SELY the data arriving in series on the second selecting input of the selection device, ES the data arriving in series on the serial input of the accumulator 231, and RES the result coming out of the accumulator 231, then the following operation is carried out:

RES = (SELY * LATCHY) + (SELA * LATCHA) + ES RES = (SELY * LATCHY) + (SELA * LATCHA) + ES

La structure du circuit accumulateur 231 correspond à une structure standard d'accumulateur. Ledit circuit 231 comporte: - des première à kième bascules d'accumulation 701 à 704, par exemple de type D, disposant chacune d'une entrée de donnée et d'une sortie, l'entrée de donnée de la première bascule 701 étant connectée au conducteur de poids le plus fort (c'est à dire de poids k) de la sortie parallèle du dispositif de sélection 228; - des première à (k+1)-ième bascules de retenue 705 à 709, par exemple de type D, disposant chacune d'une entrée de donnée et d'une sortie; - une bascule de résultat 710, par exemple de type D, disposant d'une entrée de donnée et d'une sortie, la sortie de cette bascule de résultat correspondant à la sortie de l'accumulateur 231; - des premier à (k+l)-ième additionneurs 711 à 715 standards (ou additionneur complet) disposant chacun de première à troisième entrées, d'une sortie de résultat, et d'une sortie de retenue, les premières entrées des premier à k-ième additionneurs 711 à 714 étant connectées au dispositif de sélection 228 pour recevoir respectivement les bits de poids k-1 à 0, les deuxièmes entrées des premier à k-ième additionneurs 711 à 714 étant connectées respectivement aux sorties des première à k-ième bascules d'accumulation 701 à 704, la première entrée du (k+l)-ième additionneur 715 étant connectée à la sortie de résultat du k-ième additionneur 714, la deuxième entrée du (k+l)-ième additionneur 715 correspondant à l'entrée série de l'accumulateur 231 qui reçoit la donnée ES, les troisièmes entrées des premier à (k+1)-ième additionneurs 711 à 715 étant respectivement connectées aux sorties des première à (k+l)-ième bascules de retenue 705 à 709, les sorties de résultat des premier à (k-1)-ième additionneurs 711 à 713 étant respectivement connectées aux entrées de données des deuxième à k-ième bascules d'accumulation 702 à 704, la sortie de résultat du (k+l)-ième additionneur 715 étant connectée à l'entrée de la bascule de résultat 710, les sorties de retenue des premier à (k+l)-ième additionneurs 711 à 715 étant respectivement connectées aux entrées de données des première à (k+1)-ième bascules de retenue 705 à 709. Les sorties internes Io à I2 correspondent respectivement aux sorties de retenue des k+l-ième et k-ième additionneurs 715 et 714 et à la sortie de The structure of the accumulator circuit 231 corresponds to a standard battery structure. Said circuit 231 comprises: from first to kth accumulation latches 701 to 704, for example of type D, each having a data input and an output, the data input of the first flip-flop 701 being connected the driver with the strongest weight (that is to say, weight k) of the parallel output of the selection device 228; from first to (k + 1) second latch latches 705 to 709, for example of type D, each having a data input and an output; a result latch 710, for example of type D, having a data input and an output, the output of this result flip-flop corresponding to the output of the accumulator 231; - from first to (k + 1) -theaders 711 to 715 standard (or full adder) each having first to third inputs, a result output, and a carry output, the first inputs of the first to k-th adders 711 to 714 being connected to the selection device 228 for respectively receiving the bits of weight k-1 to 0, the second inputs of the first to k-th adders 711 to 714 being respectively connected to the outputs of the first to k- the second input of the (k + 1) adder 715 being connected to the output output of the k-th adder 714, the second input of the corresponding (k + 1) adder 715 at the serial input of the accumulator 231 which receives the data ES, the third inputs of the first to (k + 1) -th adders 711 to 715 being respectively connected to the outputs of the first to (k + 1) th flip-flops of hold 705 to 7 09, the result outputs of the first through (k-1) th adders 711 to 713 being respectively connected to the data inputs of the second to kth accumulating latches 702 to 704, the result output of the (k + 1 the second adder 715 being connected to the input of the result latch 710, the retaining outputs of the first to (k + 1) th adders 711 to 715 being respectively connected to the data inputs of the first to (k + 1 The second internal latches 705 to 709. The internal outputs I0 to I2 respectively correspond to the retaining outputs of the k + l-th and k-th adders 715 and 714 and to the output of

résultat du (k-1)-ième additionneur 713. result of the (k-1) th adder 713.

Dans la pratique, les bascules de retenue, d'accumulation et de résultat 701 à 710 comportent In practice, the restraint, accumulation and result latches 701 to 710 comprise

également des entrées d'horloge et de forçage à zéro. also clock entries and forcing at zero.

Toutes les entrées d'horloge desdites bascules 701 à 710 sont connectées ensembles et reçoivent un même signal d'horloge. De même, toutes les entrées de forçage à zéro sont connectées ensemble pour être remise à zéro simultanément avant chaque calcul. Ces entrées ne sont pas représentées pour ne pas surcharger inutilement les dessins. Le fonctionnement du dispositif décrit sur cette figure 7 est relativement simple. Lors de chaque cycle du signal d'horloge qui synchronise le coprocesseur, on additionne successivement soit LATCHA, soit LATCHY, soit LACHA + LATCHY, soit zéro avec le contenu des bascules de retenue 705 à 709 et avec le bit arrivant de la donnée ES au contenu des bascules d'accumulation 701 à 704, le mot contenu dans les bascules d'accumulation 701 à 704 étant décalé successivement, de sorte que le bit contenu dans la bascule de résultat 710 corresponde au bit qui sort de All clock inputs of said flip-flops 701 to 710 are connected together and receive the same clock signal. Likewise, all zero forcing inputs are connected together to be reset simultaneously before each calculation. These entries are not shown so as not to overload the drawings unnecessarily. The operation of the device described in this Figure 7 is relatively simple. During each cycle of the clock signal which synchronizes the coprocessor, one successively adds either LATCHA, LATCHY or LACHA + LATCHY, or zero with the contents of the latches 705 to 709 and with the bit coming from the data ES to the contents of the accumulation latches 701 to 704, the word contained in the accumulation latches 701 to 704 being shifted successively, so that the bit contained in the flip-flop of result 710 corresponds to the bit that comes out of

l'accumulateur 231.the accumulator 231.

Avant de commencer un calcul, on effectue une remise à zéro de toutes les bascules d'accumulation, de retenue et de résultat 701 à 710. Puis, la double multiplication s'effectue ensuite par décalage simultané des données SELA, SELY et ES, à chaque cycle du signal d'horloge. Les bits de SELA et de SELY déterminent quelle(s) donnée(s) parmi LATCHA et LATCHY doivent être accumulées (voir le fonctionnement du dispositif de sélection 228). Lorsque la totalité des bits des données SELA et SELY a été décalée (soit après m * k cycles d'horloge), on fournit des "0" (pendant k+l cycles d'horloge) à la place des données SELA, SELY et ES afin de sortir la fin du résultat encore contenu dans les Before starting a calculation, all the accumulation, holding and result latches 701 to 710 are reset. Then, the double multiplication is then carried out by simultaneously shifting the data SELA, SELY and ES to each cycle of the clock signal. The bits of SELA and SELY determine which data (s) among LATCHA and LATCHY should be accumulated (see the operation of the selection device 228). When all the bits of the data SELA and SELY have been shifted (ie after m * k clock cycles), "0" (during k + 1 clock cycles) are provided in place of the data SELA, SELY and ES in order to get out the end of the result still contained in the

bascules d'accumulation 701 à 704.accumulation latches 701 to 704.

Si lesdites données sont codées sur des nombres de bits différents, il convient de compléter chaque donnée à If these data are coded on different numbers of bits, each data must be completed.

l'aide de " 0 ".using "0".

A présent que la description structurelle et Now that the structural description and

fonctionnelle des éléments composant le coprocesseur a été faite, il convient d'expliquer à présent le fonctionnement global du coprocesseur. Les explications qui vont suivrent permettront à l'homme du métier de synchroniser de manière globale le coprocesseur afin d'obtenir les opérations désirées. Par la suite, on utilise les données A, B et N qui sont des entiers non nuls codés sur respectivement a * k, b * k et n * k bits, avec a, b et n des entiers non nuls inférieurs à m. Dans the coprocessor component elements have been made, it is now necessary to explain the overall operation of the coprocessor. The explanations that follow will enable the skilled person to synchronize globally the coprocessor to obtain the desired operations. Subsequently, we use the data A, B and N which are non-zero integers coded on respectively a * k, b * k and n * k bits, with a, b and n non-zero integers less than m. In

un premier temps, on considère N impair. At first, we consider N odd.

Opération élémentaire Pfield(A, B)N = A*B*I mod N: A) Initialisation du coprocesseur: - On charge les données A, B, N respectivement dans les premier à troisième dispositifs de mémorisation 201 à 203; - On charge des zéros dans le quatrième dispositif de mémorisation 204, la donnée étant appelée S(-l); - On initialise le dispositif de comparaison 232 pour que la dernière comparaison indique que N est supérieur à Elementary operation Pfield (A, B) N = A * B * I mod N: A) Initialization of the coprocessor: - The data A, B, N are respectively loaded in the first to third storage devices 201 to 203; Zeros are loaded in the fourth storage device 204, the data being called S (-1); - The comparison device 232 is initialized so that the last comparison indicates that N is greater than

S(-1).S (-1).

B) Répétition a fois de la boucle de calcul suivante, avec i un indice variant de 0 à a-l: B-l) On charge simultanément le i-ième mot Ai de poids faible de A dans le premier registre 221 et le mot No de poids le plus faible de N dans le B) Repetition of the following computation loop at a time, with i an index varying from 0 to al: B1). The lower-most i-word A i is simultaneously loaded with A in the first register 221 and the word No of the weight. weaker N in the

deuxième registre 222.second register 222.

B-2) Puis, on charge simultanément les mots Ai et No respectivement dans les première et deuxième bascules 225 et 226. B-3) On initialise à zéro les circuits de soustraction 205 et 206, les circuits de retard 211 à 214, le premier registre 221 et toutes les B-2) Then, the words Ai and No respectively are loaded simultaneously in the first and second flip-flops 225 and 226. B-3) The subtraction circuits 205 and 206, the delay circuits 211 to 214, the subtraction circuits are initialized to zero. first register 221 and all

bascules 701 à 710 de l'accumulateur 231. latches 701 to 710 of the accumulator 231.

B-4) On décale simultanément de deux unités les mots B, N et S(i-1) contenus dans les deuxième à quatrième dispositifs de mémorisation 202 à 204, des zéros étant fournis sur les première et B-4) The words B, N and S (i-1) contained in the second to fourth storage devices 202 to 204 are simultaneously shifted by two units, zeros being provided on the first and fourth

deuxième entrées du dispositif de sélection 228. second inputs of the selection device 228.

B-5) On effectue k décalages successifs sur les deuxième à quatrième dispositifs de mémorisation 202 à 204, sur les premier et deuxième registres 221 et 222. La sortie de la porte OU Exclusif 235 fournit successivement les k bits d'une donnée Y0, avec Y0 = ((S(i-1) + Ai * B) * J0) mod 2k, J0 B-5) Successive shifts are performed on the second to fourth storage devices 202 to 204 on the first and second registers 221 and 222. The output of the exclusive-OR gate 235 successively supplies the k bits of a data Y0, with Y0 = ((S (i-1) + Ai * B) * J0) mod 2k, J0

étant défini par l'équation ((N*Jo)+1) mod 2k = 0. being defined by the equation ((N * Jo) +1) mod 2k = 0.

La sortie de la porte OU Exclusif est reliée d'une part à l'entrée du deuxième registre 222 et d'autre part à la deuxième entrée de sélection du dispositif de sélection 228. La donnée B est fournie à la première entrée de sélection du dispositif de sélection. L'entrée du premier registre 221 reçoit bit à bit le mot No de poids faible de N. L'entrée série de l'accumulateur reçoit S(i-1) si la dernière comparaison indique que S(i-1) < N, ou reçoit S(i-1) - N si la dernière comparaison indique que S(i-1) 2 N (la soustraction s'effectuant dans le deuxième circuit de soustraction 206). La sortie de l'accumulateur 231 fournit k bits égaux à zéro The output of the Exclusive OR gate is connected firstly to the input of the second register 222 and secondly to the second selection input of the selection device 228. The data B is supplied to the first selection input of the selection device. The input of the first register 221 receives bit by bit the word N of low weight of N. The serial input of the accumulator receives S (i-1) if the last comparison indicates that S (i-1) <N, or receives S (i-1) -N if the last comparison indicates that S (i-1) 2 N (the subtraction taking place in the second subtraction circuit 206). The output of the accumulator 231 provides k bits equal to zero

qui ne sont pas mémorisés.which are not memorized.

B-6) On transfert le contenu du deuxième registre 222 B-6) The contents of the second register 222 are transferred

égal à Y0 dans la deuxième bascule 226. equal to Y0 in the second flip-flop 226.

B-7) On effectue (n-l) * k décalages successifs sur les deuxième à quatrième dispositifs de mémorisation 202 à 204 et sur le premier registre 221. La donnée B est fournie à la première entrée de sélection du dispositif de sélection. Les n-l mots N1 à Nn-1 de poids fort de N sont fournis bit à bit d'une part à l'entrée du premier registre o10 221 et d'autre part à la deuxième entrée de B-7) Successive shifts are made (n-l) * on the second to fourth storage devices 202 to 204 and on the first register 221. The data B is supplied to the first selection input of the selection device. The n-1 words N1 to Nn-1 of most significant of N are provided bit by bit on the one hand at the input of the first register o10 221 and secondly on the second input of

sélection du dispositif de sélection 228. selection of the selection device 228.

L'entrée série du circuit accumulateur 231 reçoit S(i-1) si la dernière comparaison indique que S(i-1) < N, ou reçoit S(i-1) - N si la dernière comparaison indique que S(i-l) 2 N (la soustraction s'effectuant dans le deuxième circuit de soustraction 206). La sortie du circuit accumulateur 231 fournit les (n-l) * k bits de poids faible de S(i) qui sont mémorisés The serial input of the accumulator circuit 231 receives S (i-1) if the last comparison indicates that S (i-1) <N, or receives S (i-1) - N if the last comparison indicates that S (il) 2 N (the subtraction taking place in the second subtraction circuit 206). The output of the accumulator circuit 231 supplies the (n-1) * k least significant bits of S (i) which are stored

dans le quatrième dispositif de mémorisation 204. in the fourth storage device 204.

Les (n-l) * k bits de poids faible de S(i) sont comparés avec les (n-l) * k bits de poids faible The (n-1) * k least significant bits of S (i) are compared with the (n-1) * k least significant bits

de N dans le circuit de comparaison 232. N in the comparison circuit 232.

B-8) On effectue k+l décalages successifs sur le quatrième dispositif de mémorisation 204 et sur le premier registre 221. Les première et deuxième entrées de sélection du dispositif de sélection 228 reçoivent des zéros pour pouvoir fournir les k bits de poids fort de S(i) et finir la comparaison de S(i) avec N. Le résultat de la comparaison est mémorisé pour la prochaine itération. C) A l'issue de la dernière itération, le résultat S(a-l) mémorisé dans le quatrième dispositif de mémorisation subit une nouvelle soustraction de N si S(a-1) 2 N. La soustraction s'effectue par décalage simultané de S(a-1) et N dans le deuxième circuit de soustraction 206. Pour récupérer le résultat de la soustraction, on fournit des zéros aux entrées de sélection du dispositif de sélection 228 afin de rendre transparent le circuit accumulateur 231. L'homme du métier peut remarquer que la donnée de calcul J0 n'est plus calculée. En effet, les connexions réalisées sur les entrées de la porte OU Exclusif permettent de calculer directement Y0 avec un décalage B-8) k + 1 successive offsets are performed on the fourth storage device 204 and on the first register 221. The first and second selection inputs of the selection device 228 receive zeros to be able to supply the k most significant bits of S (i) and finish the comparison of S (i) with N. The result of the comparison is stored for the next iteration. C) At the end of the last iteration, the result S (a1) stored in the fourth storage device undergoes a new subtraction of N if S (a-1) 2 N. The subtraction is carried out by simultaneous shift of S (a-1) and N in the second subtraction circuit 206. To recover the result of the subtraction, zeros are supplied to the selection inputs of the selection device 228 in order to make transparent the accumulator circuit 231. The skilled person can notice that the calculation data J0 is no longer calculated. In fact, the connections made on the inputs of the exclusive OR gate make it possible to directly calculate Y0 with an offset

d'un cycle d'horloge par rapport au calcul de S(i). of a clock cycle with respect to the calculation of S (i).

De même, l'homme du métier peut remarquer que l'on commence le calcul avec No dans la deuxième bascule 222 puis que l'on remplace No par Y0. Le résultat est exactement le même que si l'on avait Y0 dans la deuxième bascule 222 en fournissant N en série sur la deuxième Similarly, one skilled in the art can notice that one begins the calculation with No in the second flip-flop 222 and that No is replaced by Y0. The result is exactly the same as if we had Y0 in the second flip-flop 222 supplying N in series on the second

entrée de sélection depuis le début du calcul. selection input from the beginning of the calculation.

L'homme du métier peut remarquer qu'avantageusement le premier registre 221 est utilisé d'une part pour transférer un mot dans la première bascule 225, et Those skilled in the art can remark that advantageously the first register 221 is used on the one hand to transfer a word in the first flip-flop 225, and

d'autre part pour retarder N de k cycles d'horloge. on the other hand to delay N of k clock cycles.

Multiplication modulaire: Pour effectuer une multiplication modulaire, il suffit d'effectuer deux opérations élémentaires Pfield en introduisant un paramètre H de correction d'erreur. On effectue ainsi soit Pfield(H, Pfield(A, B)N)N, soit Pfield(A, Pfield(H, B)N)N, avec H = 2(a+n)*k mod N. Calcul de Ac mod N On pose C un entier codé sur c bits et dont le bit de poids 2c1 est égal à 1. On considère que A et N sont codés sur un même nombre de bit égal à n*k bits. Si A est de taille inférieure à N, alors on complète A avec des Modular Multiplication: To perform a modular multiplication, it is sufficient to perform two Pfield elementary operations by introducing an error correction parameter H. We thus do either Pfield (H, Pfield (A, B) N) N, or Pfield (A, Pfield (H, B) N) N, with H = 2 (a + n) * k mod N. Calculation of Ac mod N We put C an integer coded on c bits and whose bit of weight 2c1 is equal to 1. We consider that A and N are encoded on the same number of bit equal to n * k bits. If A is smaller than N, then we complete A with

zéros en bits de poids fort.zeros in most significant bits.

a) on calcule H = 22*n*k mod N. b) on calcule R(1) = Pfield (H, A), et on mémorise R(1) dans les premier et deuxième dispositifs de mémorisation 201 et 202, le contenu du premier dispositif 201 étant appelé A, et le contenu de deuxième dispositif 202 étant appelé B. c) on effectue une boucle indexée par un indice i variant de 2 à c: c-l) On effectue une opération Pfield(B, B)N, en chargeant les mots de B à la place des mots de A lors de l'étape B-1. Le résultat est stocké dans a) calculating H = 22 * n * k mod N. b) calculating R (1) = Pfield (H, A), and storing R (1) in the first and second storage devices 201 and 202, the the contents of the first device 201 being called A, and the second device content 202 being called B. c) a loop indexed by an index i ranging from 2 to c: cl is performed. A Pfield (B, B) N operation is performed. , loading the words of B in place of the words of A in step B-1. The result is stored in

le deuxième dispositif de mémorisation 202. the second storage device 202.

c-2) Si le bit de poids 2c-i de C est égal à 1 alors on effectue également une opération Pfield(A, B)N, et on stocke le résultat dans le deuxième dispositif c-2) If the weight bit 2c-i of C is equal to 1 then Pfield (A, B) N is also performed, and the result is stored in the second device

de mémorisation 202.memorizing 202.

d) On charge " 1 " codé sur n*k bits dans le premier d) We load "1" coded on n * k bits in the first

dispositif de mémorisation 201.storage device 201.

e) On effectue une opération Pfield(l, B)N pour obtenir le e) Perform a Pfield (l, B) N operation to obtain the

résultat final.final result.

Calcul de H = 2(n+p)*k mod N, p étant un entier Pour effectuer le calcul de H, on neutralise une partie des éléments du coprocesseur 200. Le cinquième multiplexeur 229 est positionné pour fournir des " zéro " sur sa sortie. On charge une donnée égale à " 1 " codé sur k bits dans la deuxième bascule 226. Le sixième multiplexeur 230 est positionné pour relier la sortie du troisième circuit de retard 213 à la deuxième entrée de sélection du dispositif de sélection 228. Le huitième multiplexeur 236 est positionné pour relier l'entrée du comparateur 232 à la sortie du premier circuit de retard 211. L'ensemble résultant de ces différentes neutralisations transforme le coprocesseur 200 en circuit de calcul de H par soustractions successives. Un tel circuit est décrit dans la demande européenne n 0O 601 907. Calculation of H = 2 (n + p) * k mod N, where p is an integer To perform the calculation of H, a portion of the elements of the coprocessor 200 are neutralized. The fifth multiplexer 229 is set to provide "zero" on its exit. A data item equal to "1" coded on k bits is loaded into the second flip-flop 226. The sixth multiplexer 230 is positioned to connect the output of the third delay circuit 213 to the second selection input of the selection device 228. The eighth multiplexer 236 is positioned to connect the input of the comparator 232 to the output of the first delay circuit 211. The resulting set of these different neutralizations converts the coprocessor 200 into a circuit for calculating H by successive subtractions. Such a circuit is described in European Application No. 601907.

Claims

An integrated circuit comprising a coprocessor (200) of modular arithmetic comprising: - storage means (201 to 204) for storing and supplying in series first and second operands A and B, a modulo N and a result S, with An integer coded on a * k bits, where a is a non-zero integer at most equal to m, and with B, N and S which are integers coded on at most m * k bits, m and k being integers greater than 1; computing means for performing modular operations according to the Montgomery method, characterized in that the computing means comprise: - a circuit (234, 235) for calculating an intermediate data Y0; a first flip-flop (225) of k bits for storing a word Ai of k bits of A; a second flip-flop (226) of k bits for storing either the least significant word of N or Yo; - an addition parallel circuit (227) connected to add the words contained in the first and second flip-flops (225, 226); a selection device (228) coupled to the outputs of the first and second flip-flops (225, 226) and the parallel addition circuit (227) for providing on a parallel output the word contained in the first flip-flop (225), either the word contained in the second flip-flop (226), or the sum of the words contained in the first and second flip-flops (225, 226), or "zero", as a function, on the one hand, of a bit of B, and on the other hand, either a bit of Y0 or a bit of N; an accumulator circuit (231) which adds, shifts by one bit and stores the words successively supplied by the selection device (228) with a bit of an updated result S, the bits leaving the accumulator circuit (231) becoming a new updated result.

2. Circuit according to claim 1, characterized in that the calculation means comprise a first register (221) with a shift of k bits to receive on the one hand a word of k bits of A and transmit said word in parallel to the first flip-flop (225) and on the other hand N

to delay N of k cycles of a clock signal.

Circuit according to Claim 1, characterized in that the circuit for calculating the data Y0 consists of a five-input exclusive OR gate (235) which receives a first data item from a logical AND gate ( 234) between the least significant bit of the word of A present in the first flip-flop (225) and the next bit of B to be supplied to the selection device (228), a second datum corresponding to the next bit of S to be supplied to the accumulator circuit (231), and third to fifth data which correspond to values

internals of the accumulator circuit (231).

4. A method for performing a modular operation according to the Montgomery method by series shift of first and second operands A and B, a modulo N and an updated result through calculation means, with A an integer coded on a * k bits, a being a non-zero integer at most equal to m, and with B, N and S being integers coded on at most m * k bits, m and k being integers greater than 1, characterized in that it comprises the repetition of the following steps, i being an index varying from 0 to m-1: storing a word Ai of k bits corresponding to a word of weight i of A in a first flip-flop (226) of k bits ; calculating an intermediate data Y0 such that Yo = ((S (i-1) + (Ai * B)) * J0) mod 2k, with S (i-1) corresponding to (i-1) an updated result, S (-1) being equal to 0, and OJ being a word of k bits solving the equation ((J0 * N) + 1) mod 2k = 0; storing the word of k least significant bits of N and then Y0 in a second flip-flop (226) of k bits; adding in a parallel addition circuit (227) the words contained in the first and second latches (225, 226); selecting and supplying either the word contained in the first flip-flop (225) or the word contained in the second flip-flop (226), or the sum of the words contained in the first and second flip-flops (225, 226), or "zero ", as a function of a bit of B, and secondly of a bit of Y0 or a bit of N; Successive additions in an accumulator circuit (231) of the words supplied by the selection device (228) for each pair of bits of B and N, the result of each addition being added to a bit of the previous updated result S (i). 1) then shifted by one bit and stored between each addition, the bit coming out of the accumulator (231) during the shift corresponding to a

new updated result S (i).

5. Method according to claim 4, characterized in that a comparison is made of the output of the accumulator (231) with N delayed by k cycles of a clock signal, and in that one uses the same first register (221) with a shift of k bits to delay N and to be able to load the words Ai in the first

flip flop (225).

6. Method according to one of claims 4 or 5,

characterized in that the memorization of a word Ai in the first flip-flop (225) is effected by k offsets of the word Ai in the first register (221) and then parallel loading in the first flip-flop (225) after the word Ai has been loaded in full in the first register

(221).

7. Method according to one of claims 4 to 6,

characterized in that the calculation of each bit of the intermediate data Y0 is carried out one clock cycle

before we need that each bit.

8. The method according to claim 4, wherein the following steps are performed: A) initialization of the coprocessor: loading of the data A, B, N respectively in first to third storage devices (201 to

203);

- loading zeros in a fourth storage device (204), the data being called S (-1); initializing a comparison device (232) so that a last comparison indicates that N is greater than S (-1); B) repetition of the next loop with i, an index varying from 0 to a-1: B1) simultaneous loading of the i-th least significant word Ai of A in a first register (221) of k bits and the word The lowest weight number of N in a second register (222) of k bits; B-2) then, simultaneous loading of the words Ai and No respectively in the first and second latches (225, 226); B-3) initializing the subtraction circuits (205, 206), the delay circuits (211 to 214), the first register (221) and the accumulator circuit (231); B-4) simultaneously shifting the words B, N and S (i-1) contained in the second to fourth storage devices (202 to 204), zeros being provided to the selection device

(228);

B-5) performing k successive offsets on the second to fourth storage devices (202 to 204), on the first and second registers (221, 222), computing means successively supplying the k bits of the data Y0 d ' a part to the second register (222) and secondly to the selection device (228), the data B being also supplied to the selection device (228), the first register (221) receiving bit by bit the word No of k least significant bits of N, the accumulator (231) receiving in series S (i-1) if the last comparison indicates O10 as S (i-1) <N, or receiving in series S (i-1) - N if the last comparison indicates that S (i-1) 2 N; B-6) transferring the contents of the second register (222) equal to Y0 in the second flip-flop (226); B-7) performing (nl) * k successive offsets on the second to fourth storage devices (202 to 204) and on the first register (221), the data B being supplied to the selection device (228), the n -1 words N1 to Nn-1 of k most significant bits of N being supplied bit by bit on the one hand to the first register (221) and on the other hand to the selection device (228), the accumulator (231) receiving in series S (i-1) if the last comparison indicates that S (i-1) <N or receiving in series S (i-1) - N if the last comparison indicates that S (i-1) 2 N, l accumulator (231) supplying the (n-1) * k least significant bits of S (i) which are stored in the fourth storage device (204), the (nl) * k least significant bits of S (i) ) being compared with the (nl) * k least significant bits of N in the comparison circuit

(232);

B-8) performing k + 1 successive offsets on the fourth storage device (204) and on the first register (221), in order to be able to memorize the k most significant bits of S (i) and to finish the comparison of S ( i) with N, the result of the comparison being stored for the next iteration.