JPH1063641A

JPH1063641A - Product sum circuit with shift

Info

Publication number: JPH1063641A
Application number: JP8214500A
Authority: JP
Inventors: Takashi Yomo; 孝四方; Mari Kobayashi; 真理小林
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-08-14
Filing date: 1996-08-14
Publication date: 1998-03-06

Abstract

PROBLEM TO BE SOLVED: To provide a product-sum circuit capable of executing a product-sum operation and arithmetic shift processing simultaneously in one machine cycle. SOLUTION: The product-sum circuit is connected to first to third registers for storing data and is provided with a first shift circuit 21 that arithmetically left-shifts first input data output from the first register by 0 or a prescribed number of bits in accordance with a first shift control signal, a multiplier 22 that multiplies second input data from the second register by the output of the first shift circuit, an adder 23 that adds third input data from the third register to the product output by the multiplier 22, and a second shift circuit 24 that arithmetically right-shifts the output of the adder 23 by 0 or a prescribed number of bits in accordance with a second shift control signal.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a product-sum circuit used in a digital signal processor (DSP) or the like, and more particularly to an improved product-sum circuit capable of performing accurate, high-precision operations in a small number of instruction cycles.

[0002]

[Description of the Related Art] Various product-sum circuits have been proposed that combine a multiplier, which obtains the product of a multiplier and a multiplicand, with an adder, which obtains the sum of that product and the next product. In particular, the product-sum circuit provided in a DSP is generally a combination of a multiplication circuit using the Booth algorithm and an ordinary addition circuit.

[0003] Such product-sum circuits play a central role in recent multimedia-related technologies, because compression processing and similar tasks require product-sum operations to be performed a great many times.

[0004] Meanwhile, the multiplication circuit in such a DSP generally uses a fixed-point data format consisting of a sign bit and fractional bits below the radix point. With this data format, every numerical value is expressed as a value of -1 or more and less than +1, and the hardware can be simplified.
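As an illustration of this format, the following minimal Python sketch encodes and decodes a value as one sign bit plus fractional bits. The function names `to_fixed` and `from_fixed`, the default word width, and the choice of truncation for quantization are assumptions made for the example and are not taken from the patent.

```python
import math

def to_fixed(x, width=20):
    """Encode x in [-1, +1) as a signed integer with width-1 fractional bits."""
    frac_bits = width - 1                      # one bit is reserved for the sign
    if not -1.0 <= x < 1.0:
        raise ValueError("value does not fit the sign-plus-fraction format")
    return math.floor(x * (1 << frac_bits))    # truncate toward minus infinity

def from_fixed(q, width=20):
    """Decode a signed integer with width-1 fractional bits back to a float."""
    return q / (1 << (width - 1))

# With an 8-bit word the representable range is -128/128 .. 127/128.
print(to_fixed(0.5, 8), to_fixed(-1.0, 8), from_fixed(127, 8))
```

A value such as 1.0 or 2.5 raises an error here, which is the situation the normalization discussed later in the description is meant to handle.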

[0005] In addition, DSPs are required to support double-precision operations so that multiplication can be performed with higher precision. In a double-precision operation, a multiplier and a multiplicand with twice the usual number of bits are multiplied, so it is necessary to perform a product-sum operation in which partial products are obtained and then summed.

[0006]

[Problems to Be Solved by the Invention] However, with a fixed-point data format, numerical data of 1 or more, or smaller than -1, must first be normalized into the range -1 to +1. To do so, a multiplicand scaled by 1/2, 1/4, or 1/8, as appropriate for the numerical data, is read from a table, and after the multiplication an inverse normalization step that multiplies the result by 2, 4, or 8, respectively, is required. Because this inverse normalization can be realized as a kind of data bit shift, the operation result is shifted by the corresponding shift amount in a shift circuit.

[0007] In double-precision operations, a step of shifting data bits as appropriate is also needed to align the digits of the partial products. After its digits are aligned, each partial product is summed with the other partial products.

[0008] Therefore, when a conventional product-sum circuit is used to multiply numerical data that does not fit the data format, or to perform double-precision operations, data shift instructions must be executed frequently. In a DSP, where such operations occur often, this has increased the number of instructions.

[0009] Accordingly, an object of the present invention is to provide a product-sum circuit that can realize more accurate, higher-precision operations with a small number of instructions.

[0010] A further object of the present invention is to provide a product-sum circuit to which an optimized shift circuit is added.

[0011]

[Means for Solving the Problems] According to the present invention, the above object is achieved by providing a product-sum circuit that is connected to first, second, and third registers for holding data and comprises: a first shift circuit that arithmetically left-shifts first input data from the first register by 0 or a predetermined number of bits in accordance with a first shift control signal; a multiplication circuit that multiplies second input data from the second register by the output of the first shift circuit; and an addition circuit that adds third input data from the third register to the multiplication output of the multiplication circuit.

[0012] According to the present invention, the above object is also achieved by providing a product-sum circuit that is connected to first, second, and third registers for holding data and comprises: a multiplication circuit that multiplies first input data from the first register by second input data from the second register; an addition circuit that adds third input data from the third register to the multiplication output of the multiplication circuit; and a second shift circuit that arithmetically right-shifts the output of the addition circuit by 0 or a predetermined number of bits in accordance with a second shift control signal.

[0013] According to the present invention, the above object is further achieved by a product-sum circuit provided with both the first and second shift circuits described above.

[0014]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. The technical scope of the present invention is not, however, limited to these embodiments.

[0015] FIG. 1 is a schematic block diagram of a DSP according to an embodiment of the present invention. In the DSP of FIG. 1, the program counter PC is supplied with the system clock CLK and counts up every machine cycle. The output of the program counter PC is supplied as an address to the ROM in which instruction codes are stored. The instruction code corresponding to the address is output from the ROM to the instruction registers IR0 and IR1, and the decoder DEC decodes the instruction codes and supplies control signals to each block.

[0016] AALU1 and AALU2 are address operation circuits that calculate addresses for the memories RAM1 and RAM2. The data in RAM1 and RAM2 at the calculated addresses is stored in the registers reg1 and reg2 via the A bus (Abus) and the B bus (Bbus), multiplied in the product-sum circuit 10, and added to the value in the register reg3, and the final operation result is stored in the register reg3.

[0017] Providing two instruction registers, IR0 and IR1, enables pipeline processing. In addition, because two buses, Abus and Bbus, are provided, two data items can be transferred simultaneously from the two memories RAM1 and RAM2 to the two registers reg1 and reg2 in one machine cycle.

[0018] FIG. 2 is a timing flowchart for the case where the DSP of FIG. 1 multiplies a multiplier in the memory RAM1 by a multiplicand in the memory RAM2 and adds the product to the value in the register reg3. In this example, when the program counter PC is n-1, the instruction code "transfer data from the memory RAM1 to the register reg1" is read from the instruction memory ROM and stored in the instruction register IR0 in the next machine cycle (n). At the same time, in accordance with the value n of the program counter PC, the instruction code "transfer data from the memory RAM2 to the register reg2" is read from the ROM and stored in the instruction register IR0 in the next machine cycle (n+1). At that time, the first instruction code is transferred to the instruction register IR1.

[0019] In machine cycle (n+1), the address operation circuit AALU1 calculates an address in accordance with the control signal from the decoder DEC and generates an address value for the memory RAM1. In machine cycle (n+2), the multiplier data corresponding to that address value is supplied to the register reg1. Similarly, the multiplicand is stored in the register reg2 in the next machine cycle.

[0020] In machine cycle (n+3), the product-sum circuit 10 multiplies the data in the registers reg1 and reg2 and adds the product to the value in the register reg3, and the final operation result is stored in the register reg3.

[0021] By using the two buses Abus and Bbus simultaneously, two data items can be transferred from the memories RAM1 and RAM2 to the registers reg1 and reg2 at the same time. In that case, one machine cycle fewer is needed than in the example of FIG. 2.

[0022] FIG. 3 is a diagram illustrating an example of a fixed-point data format. Part (A) of the figure shows the data format, which consists of a sign bit S and bits representing the fractional part. The range of values that can be expressed in this data format is therefore -1 or more and less than +1.

[0023] Part (B) of the figure illustrates the processing performed when the multiplier data D1 is multiplied by a multiplicand D2 of +1 or more, which the data format cannot represent. In this example, the multiplicand D2 is multiplied by 1/4, and the resulting coefficient, normalized so that it fits the data format, is used. Normally, the coefficient data corresponding to the multiplicand is read from a coefficient table stored in a ROM or the like and supplied to the product-sum circuit. After the two data items D1 and D2 are multiplied, the product is multiplied by 4 for inverse normalization. Specifically, it is shifted left by 2 bits to obtain the final data D3(s). This shift is called arithmetic left shift processing. An arithmetic left shift by the same number of bits as the initial normalization is therefore required.

[0024] Thus, to operate in the product-sum circuit 10 on a number that the data format cannot represent, a normalized coefficient from the coefficient table described above is supplied to the product-sum circuit 10, the multiplication is performed, arithmetic left shift processing is then performed for inverse normalization, and finally the addition is performed. An extra program instruction must therefore be executed for the arithmetic left shift processing for inverse normalization.
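The conventional flow described above (normalized coefficient, multiplication, then a separate left shift) can be sketched in Python as follows. The 15 fractional bits, the helper names, and the sample values 0.25 and 2.5 are assumptions chosen for illustration; only the 1/4 scaling followed by a 2-bit left shift comes from the description.

```python
FRAC_BITS = 15   # assumed number of fractional bits for this example

def to_q(x):
    """Quantize a value in [-1, +1) to an integer with FRAC_BITS fractional bits."""
    return int(round(x * (1 << FRAC_BITS)))

def multiply_with_denormalization(d1_q, coefficient, shift):
    """Multiply d1 by a coefficient of 1 or more via a normalized table entry."""
    d2_normalized = to_q(coefficient / (1 << shift))   # table entry, e.g. D2 / 4
    product = (d1_q * d2_normalized) >> FRAC_BITS      # ordinary fixed-point multiply
    return product << shift                            # separate step: inverse normalization

d1 = to_q(0.25)
d3 = multiply_with_denormalization(d1, 2.5, 2)   # 2.5 is stored as 2.5 / 4 = 0.625
print(d3 / (1 << FRAC_BITS))                     # 0.625, i.e. 0.25 * 2.5
```

The final `<< shift` line is the extra instruction that the conventional circuit has to spend a machine cycle on.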

[0025] FIG. 4 is a diagram illustrating the multiplication of double-precision data. Double-precision data has twice the usual number of bits, for example 40 bits where data normally consists of 20 bits. To multiply such data using a product-sum circuit designed for 20-bit data, as shown in FIG. 4, the data Da is divided into 20-bit data A1 and A0, the data Db is likewise divided into 20-bit data B1 and B0, and the partial products are summed. That is, the partial products A0×B0, A1×B0, A0×B1, and A1×B1 are added after the necessary digit alignment. The partial product A0×B0 must therefore be shifted right by 20 bits for digit alignment, and the accumulated sum of the partial products A0×B0, A1×B0, and A0×B1 must also be shifted right for digit alignment.
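The digit alignment follows from writing each 40-bit operand as two 20-bit halves. A short worked expansion, assuming A1 and B1 denote the upper halves and A0 and B0 the lower halves (the naming used for the partial products), is:

```latex
\begin{aligned}
D_a &= A_1 \cdot 2^{20} + A_0, \qquad D_b = B_1 \cdot 2^{20} + B_0, \\
D_a D_b &= A_1 B_1 \cdot 2^{40} + (A_1 B_0 + A_0 B_1) \cdot 2^{20} + A_0 B_0 .
\end{aligned}
```

Relative to A1×B1, the cross terms sit 20 bits lower and A0×B0 sits 40 bits lower. This is why A0×B0 is shifted right before the cross terms are added, and why the accumulated sum is shifted right again before A1×B1 is added.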

[0026] The multiplication result D finally obtained is, put simply, 80-bit data. In general, the lower 40 bits, C1 and C0, are discarded. They are discarded as bits that overflow the format when each arithmetic right shift is performed. Discarding the lower bits does not in itself affect the precision.

[0027] As described above, multiplication of double-precision data also requires arithmetic right shift processing. To perform it with the product-sum circuit 10 of FIG. 1, right shift instructions must be executed as appropriate after the partial products are obtained.

[0028] FIG. 5 is a schematic block diagram of a product-sum circuit with shift circuits according to an embodiment of the present invention. As in the conventional circuit, it comprises a multiplication circuit 22 that multiplies a multiplier B by a multiplicand A and an addition circuit 23 that adds data C to the multiplication result. In this example, a shift circuit 21 is additionally provided in the stage preceding the multiplication circuit 22; when a multiplicand that the fixed-point format cannot represent is supplied as a normalized coefficient, the shift circuit 21 performs the arithmetic left shift for inverse normalization. A shift circuit 24, which performs the arithmetic right shift for aligning the digits of partial products in double-precision operations, is provided in the stage following the addition circuit 23.

[0029] With this configuration, the arithmetic left shift processing needed when multiplying by a normalized multiplicand A and the arithmetic right shift processing needed in double-precision operations can be carried out together within the code of a single multiplication instruction.

[0030] For example, when a normalized coefficient is read from the coefficient table described above and supplied to the product-sum circuit 10 as the multiplicand, adding the normalization shift amount to the instruction code allows the arithmetic left shift for inverse normalization to be performed within the product-sum circuit 10 together with the product-sum operation. Whether a shift is performed, and by how much, is controlled by the shift bits SF supplied to the shift circuit 21 as a control signal. The arithmetic left shift could instead be performed after the multiplication, but the data after multiplication normally has more bits, which would make the shift circuit larger. In the example of FIG. 5 the shift circuit is therefore placed in the stage preceding the multiplication circuit 22.
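The data path of FIG. 5 can be summarized as a small behavioral model. The sketch below is an integer-only approximation written for illustration: the function name `product_sum` and its parameter names are assumptions, word widths and overflow are ignored, and Python's `>>` is relied on as an arithmetic (sign-preserving) right shift.

```python
def product_sum(a, b, c, sf=0, asf=0):
    """Return ((a << sf) * b + c) >> asf, in the order of blocks 21, 22, 23, 24."""
    shifted_a = a << sf          # shift circuit 21: arithmetic left shift of input A
    product = shifted_a * b      # multiplication circuit 22
    total = product + c          # addition circuit 23
    return total >> asf          # shift circuit 24: arithmetic right shift

# Shifting A before the multiplier gives the same value as shifting the product,
# while the shifter only has to handle the narrower pre-multiplication word.
print(product_sum(5, 3, 2, sf=2), ((5 * 3) << 2) + 2)
```

Setting `sf` and `asf` to 0 reduces the model to an ordinary multiply-accumulate, which matches the "0 or a predetermined number of bits" wording of the shift control signals.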

[0031] Thus, whereas two machine cycles were conventionally required, one for the multiplication instruction code and one for the arithmetic left shift instruction code, the product-sum circuit of FIG. 5 also performs the left shift for inverse normalization with a single instruction code, so the operation can be completed in one machine cycle.

[0032] Next, when double-precision multiplication is performed with the product-sum circuit of FIG. 5, the program is as follows.

[0033]

[Table 1]

[0034] In contrast, when a conventional circuit without the shift circuit 24 is used, the program is as follows.

[0035]

[Table 2]

[0036] The above examples perform the double-precision multiplication shown in FIG. 4. They are assembler programs that multiply the 40-bit data Da in the AX register by the 40-bit data Db in the BX register and store the multiplication result in the DX register. The AX register corresponds, for example, to the register reg2 in FIG. 1, the BX register to the register reg1, and the DX register to the register reg3.

[0037] In program example I above, the left column is the operation code and the right column the operands. In this program, "MULUA DX, A0, B0" first multiplies the data A0 and B0, arithmetically shifts the product (a partial product) right by 20 bits, and stores the result in the DX register. This arithmetic right shift is performed by the shift circuit 24 in the product-sum circuit of FIG. 5, so a 20-bit right shift command is issued through the control signal ASF. By using the product-sum circuit of FIG. 5 in this way, the multiplication and the arithmetic right shift are performed with one instruction code in one machine cycle.

[0038] Next in program I, "MSMS DX, A1, B0" multiplies the data A1 and B0 and adds the data in the DX register. Then "MSMSA DX, A0, B1" multiplies the data A0 and B1, adds the data in the DX register, performs a 20-bit arithmetic right shift, and stores the result in the DX register. This arithmetic right shift is also performed by the shift circuit 24 in FIG. 5, so, as above, the multiplication and the arithmetic right shift are performed with one instruction code in one machine cycle.

[0039] Finally, "MSM DX, A1, B1" multiplies the data A1 and B1 and adds the data in the DX register, and the final double-precision multiplication result is stored in the DX register.
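The four steps described for program I can be traced with a short Python sketch. This is an illustrative model rather than the patent's assembler listing: the function names are assumptions, Python integers stand in for the registers, the upper halves are treated as signed and the lower halves as unsigned, and register width limits are not modeled.

```python
SHIFT = 20   # width of each half in the 20-bit example

def split(value):
    """Split a 40-bit value into (upper half, lower half); the lower half is unsigned."""
    return value >> SHIFT, value & ((1 << SHIFT) - 1)

def program_one(a1, a0, b1, b0):
    """Follow the four described steps and return the final DX value."""
    dx = (a0 * b0) >> SHIFT        # step 1: multiply, then shift right by 20 bits
    dx = dx + a1 * b0              # step 2: multiply and accumulate
    dx = (dx + a0 * b1) >> SHIFT   # step 3: multiply, accumulate, shift right by 20 bits
    dx = dx + a1 * b1              # step 4: multiply and accumulate
    return dx

da, db = 0x1234567890, -123456789012
a1, a0 = split(da)
b1, b0 = split(db)
# The result equals the full product with its lower 40 bits discarded.
print(program_one(a1, a0, b1, b0) == (da * db) >> 40)
```

Each of the four lines in `program_one` corresponds to one instruction, so the two right shifts cost no additional steps.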

[0040] By contrast, when a conventional product-sum circuit without a shift circuit is used, the shift instructions "LSR DX, 20" and "ASR DX, 20" must be executed for digit alignment, as in program II. This takes two more machine cycles than when the product-sum circuit with the shift circuit 24 of FIG. 5 is used.

[0041] No example assembler program is given for the fixed-point data format, but the advantage of the product-sum circuit of FIG. 5 is clear there as well. As described above, using the shift circuit 21 eliminates the instruction cycle for the arithmetic left shift, so fewer machine cycles are needed than with a conventional product-sum circuit that lacks the shift circuit 21.

[0042] FIG. 6 is a detailed block circuit diagram of the product-sum circuit of FIG. 5. This circuit example shows a multiplication circuit 22 that takes 4-bit data A1 to A4 as the multiplicand and 4-bit data B1 to B4 as the multiplier, and an addition circuit 23 that adds the multiplication result to the data C. The operation result is output as data X1 to X11, extended to 11 bits.

[0043] The shift circuit 21 in FIG. 6 performs the arithmetic left shift for inverse normalization on data that was normalized because it could not be represented in the fixed-point data format. In this example, the control signals SF1 and SF2 specify no shift, a 1-bit shift, or a 2-bit shift, and the 4-bit data A1 to A4 is accordingly left unshifted or arithmetically shifted left by 1 or 2 bits. The shift circuit 21 therefore consists of circuits 211 to 216 for 6 bits, each of which selects which bit to take as its input in accordance with the control signals SF1 and SF2.

[0044] The multiplication circuit 22 is an example built on the Booth algorithm. This algorithm is generally well known and is explained, for example, in "Digital Signal Processing System" by Yoshihiro Mochida, Nobuaki Takahashi, Toshitaka Tsuda, and Koichi Honma, published by the University of Tokyo Press. According to the Booth algorithm, combinations of multiplier bits are decoded, and the multiplicand is shifted as appropriate and added or subtracted according to the decoding result. This reduces the number of partial-product additions and subtractions and simplifies the circuit accordingly.
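To make the decoding step concrete, here is a minimal Python sketch of radix-4 Booth recoding, the common variant in which overlapping groups of three multiplier bits are decoded into a digit from {-2, -1, 0, +1, +2}. The text does not state which variant FIG. 6 uses; radix-4 is an assumption here, and the function names are illustrative.

```python
def booth_radix4_digits(b, n_bits=4):
    """Recode a signed n_bits multiplier into radix-4 Booth digits, lowest digit first."""
    u = b & ((1 << n_bits) - 1)          # two's-complement bit pattern of b
    digits = []
    prev = 0                             # implicit bit to the right of bit 0
    for i in range(0, n_bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits

def booth_multiply(a, b, n_bits=4):
    """Sum the partial products digit * a * 4**i selected by the Booth digits."""
    return sum(d * a * (1 << (2 * i))
               for i, d in enumerate(booth_radix4_digits(b, n_bits)))

print(booth_radix4_digits(7), booth_multiply(-12, 7))   # [-1, 2] -84
```

Each digit selects 0, ±A, or ±2A (a one-bit shift of the multiplicand), so a 4-bit multiplier needs two partial products instead of four.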

[0045] This example of the multiplication circuit 22 has Booth decoders 221 and 222 that decode the multiplier B1 to B4; exclusive-OR circuits 223 and 224 that combine the decode signals with a control signal SG specifying addition or subtraction; Booth selectors 225 to 236 that are each supplied with the 3-bit shift bits SFB1 and SFB2 and shift the output of the shift circuit 21 as appropriate; and adders 237 to 245 that add the outputs of the Booth selectors. As a result, the product of the two data items A and B is output as data M1 to M11.

[0046] The addition circuit 23 consists of adders 250 to 260 that compute the sum of the data M1 to M11 and the data C1 to C11. When a carry bit is generated, each adder passes it to the next higher adder.

[0047] The shift circuit 24 consists of eleven selectors and performs a 4-bit arithmetic right shift in accordance with the shift control signal ASF. When a right shift is commanded in this example, the outputs of the adders 254 to 260 are supplied to selectors 24(1) to 24(7) of the shift circuit 24, and 0 is forcibly selected for selectors 24(8) to 24(11). When no arithmetic right shift is needed, no shift is performed.
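The selector behavior can be modeled in a few lines of Python. The sketch assumes that index 0 of the list corresponds to the output of adder 250 and to selector 24(1), and that the four uppermost selectors are the ones forced to 0, since the seven adder outputs 254 to 260 occupy the lower seven positions. The helper names are illustrative.

```python
WIDTH = 11   # eleven selectors
SHIFT = 4    # fixed 4-bit right shift in this example

def shift_circuit_24(adder_out, asf):
    """adder_out: 11 bits, index 0 = least significant. asf: True to shift."""
    if not asf:
        return list(adder_out)               # no shift requested: pass through
    out = []
    for k in range(WIDTH):
        src = k + SHIFT                      # selector k takes the bit 4 places higher
        out.append(adder_out[src] if src < WIDTH else 0)   # top four are forced to 0
    return out

def to_bits(value):
    return [(value >> i) & 1 for i in range(WIDTH)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

value = 0b10110110101
print(from_bits(shift_circuit_24(to_bits(value), True)) == value >> 4)
```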

[0048] When 6-bit data obtained by shifting the 4-bit data A is multiplied by the 4-bit data B, a 10-bit output is sufficient for the result. However, in a program routine that repeats the product-sum operation many times, the intermediate result may overflow those 10 bits. So that the overflowing bit can be retained rather than lost, the addition circuit 23 and the shift circuit 24 in the example of FIG. 6 are provided with one extra bit (X11). The addition circuit 23 and the shift circuit 24 may of course be provided with several extra bits.

[0049] In the product-sum circuits shown in FIGS. 5 and 6, the shift circuits 21 and 24 are controlled individually by the control signals SF and ASF. Naturally, both shift circuits may perform the arithmetic left shift and the arithmetic right shift within a single machine cycle.

[0050] Although FIGS. 5 and 6 show an example of a product-sum circuit provided with both shift circuits 21 and 24, a product-sum circuit provided with only one of the shift circuits can also achieve the object of the present invention.

[0051]

[Effects of the Invention] As described above, according to the present invention, an arithmetic left shift circuit for the inverse normalization of coefficients normalized to fit the fixed-point data format, and an arithmetic right shift circuit needed for double-precision operations and the like, are added within the product-sum circuit. As a result, shift instructions that were conventionally executed in machine cycles separate from the product-sum operation can be omitted, enabling operations to be completed in fewer machine cycles.

[Brief description of the drawings]

FIG. 1 is a schematic block diagram of a DSP according to an embodiment of the present invention.

FIG. 2 is a timing flowchart for the case where the DSP of FIG. 1 multiplies a multiplier in the memory RAM1 by a multiplicand in the memory RAM2 and adds the product to the value in the register reg3.

FIG. 3 is a diagram illustrating an example of a fixed-point data format.

FIG. 4 is a diagram illustrating the multiplication of double-precision data.

FIG. 5 is a schematic block diagram of a product-sum circuit with shift circuits according to the embodiment of the present invention.

FIG. 6 is a detailed block circuit diagram of the product-sum circuit of FIG. 5.

[Explanation of symbols]

10: product-sum circuit
21: shift circuit (first shift circuit)
22: multiplication circuit
23: addition circuit
24: shift circuit (second shift circuit)
reg1, reg2, reg3: registers

Claims

[Claims]

1. A product-sum circuit connected to first, second, and third registers for holding data, comprising: a first shift circuit that arithmetically left-shifts first input data from the first register by 0 or a predetermined number of bits in accordance with a first shift control signal; a multiplication circuit that multiplies second input data from the second register by the output of the first shift circuit; and an addition circuit that adds third input data from the third register to the multiplication output of the multiplication circuit.

2. A product-sum circuit connected to first, second, and third registers for holding data, comprising: a multiplication circuit that multiplies first input data from the first register by second input data from the second register; an addition circuit that adds third input data from the third register to the multiplication output of the multiplication circuit; and a second shift circuit that arithmetically right-shifts the output of the addition circuit by 0 or a predetermined number of bits in accordance with a second shift control signal.

3. A product-sum circuit connected to first, second, and third registers for holding data, comprising: a first shift circuit that arithmetically left-shifts first input data from the first register by 0 or a predetermined number of bits in accordance with a first shift control signal; a multiplication circuit that multiplies second input data from the second register by the output of the first shift circuit; an addition circuit that adds third input data from the third register to the multiplication output of the multiplication circuit; and a second shift circuit that arithmetically right-shifts the output of the addition circuit by 0 or a predetermined number of bits in accordance with a second shift control signal.

4. The product-sum circuit according to claim 2, wherein the second shift circuit is configured to handle a number of bits larger than the sum of the number of bits of the first input data and the number of bits of the second input data.

5. The product-sum circuit according to claim 3, wherein the second shift circuit is configured to handle a number of bits larger than the sum of the number of bits of the first shift circuit and the number of bits of the second input data.

6. The product-sum circuit according to claim 1, wherein the addition circuit is configured to handle a number of bits larger than the sum of the number of bits of the first shift circuit and the number of bits of the second input data.

7. The product-sum circuit according to claim 2, wherein the addition circuit is configured to handle a number of bits larger than the sum of the number of bits of the first input data and the number of bits of the second input data.