Embodiment
Fig. 1 is a diagram of the microprocessor with the macro instruction set symmetric parallel architecture.
The basic operating structure of the macro instruction set symmetric parallel system is a compound symmetric separated storage structure. As shown in Fig. 1, it comprises:
* four independent address pointer generation components, FPCA, YPCA, FSD and FRD, each of which can be controlled to operate in random storage mode or in FILO or FIFO sequential storage mode;
* four independent groups of dual data port processing components, FD, FZ, FTNSF and FT, built on the double-latch register structure;
* an internal very long instruction word (VLIW) flag component, FIF, built from registers, whose hardware logic, organization and control relationships can be changed by programming;
* a long instruction control and processing logic component: FDIF;
* a three-dimensional ping-pong decoding controller: FCC;
* a register file corresponding to 8 independent data buses, with 4 double-latch registers per data bus: TH, NH, SH, FH, TL, NL, SL, FL, I, J, K, R, IH, JH, KH, RH, D, D1, D2, D3, DR, DR1, DR2, DR3, Z, Z1, Z2, Z3, ZM, ZM1, ZM2, ZM3, used to temporarily store internal and external data, instruction I/O operation modes and results;
* a synchronous cyclic pulse clock signal generator: FCLK;
* a TIMD bus that collects the output operation modes of all components and can transfer those outputs to the input of each component.
As shown in Figs. 1a and 1b, the first address component FPCA and the first data port component FD are identical in function and structure to the second address component YPCA and the second data port component FZ, and their memories are independent and symmetric; FDR and FRR in the FD component are identical in function and structure to FZM and FMM in the FZ component, and their data I/O is symmetric.
As shown in Figs. 1c and 1d, the third address generation component FSD and the third data port component FTNSF are identical in function and structure to the fourth address component FRD and the fourth data port component FT, and their memories are independent and symmetric; FTI and FTJ in the FT component are identical in function and structure to FTL and FTH in the FTNSF component, and their data I/O is symmetric.
As shown in Fig. 1e, in the three-dimensional ping-pong decoding controller FCC, the first decoder FCCP and the second decoder FCCB, which decode in ping-pong fashion, are identical in function, structure and control mode and operate symmetrically.
The four groups of data ports (FD, FZ, FTNSF, FT), the eight independent data buses (FDR, FRR, FZM, FMM, FTL, FTH, FTI, FTJ) and the four independent address generators (FPCA, YPCA, FSD, FRD) each connect in bus form to a specific memory or to other devices and transfer data, with FD and FZ being the main sources of instruction, data and control stream input. The three-dimensional ping-pong symmetric decoding component FCC, together with a TIMD bus that connects every port and the internal register components, realizes data transfer inside the processor and between its inside and outside. The symmetric port components can be controlled to serve as instruction I/O, data I/O or control I/O.
The above features constitute the main structure of the macro instruction set symmetric parallel system, in which the address, data and control components are symmetric and paired two by two in function and structure: the compound symmetric separated storage structure.
As shown in Fig. 1f, the instruction/data input produced by any data port of this architecture is received and temporarily stored by the port register, producing the first line output mode of instructions/data. Temporary storage in independent serial or parallel registers of the double-latch structure produces the second line output mode of data/instructions. The internal register flag word and the decoder act together as combinational logic to generate combined control signals that control and operate every component of the system; the operation results of the components form the third line output mode of data/instructions. These results are returned to the multiplexer of each component, linking the data input and output operations of all components. All line modes converge on the internal TIMD bus, whose multiplexer forms the fourth line output mode of instructions/data; this bus allows data and instructions to be transferred between all internal and external components.
The data/instruction I/O operation of each component is selected by its internal multiplexer among the second line mode and the first, third and fourth line modes, which allows instructions, data and results to be transferred between components. All line I/O modes are collected by the multiplexer of the TIMD bus, which realizes a bus operation mode in which internal instructions, data and results are passed between internal components over the TIMD bus.
The features of this architecture are as follows:
* the basic structure is simple: the first, second, third and fourth address/data port components share a consistent, repeated basic structure with identical functions and symmetric operation, and the components can be reorganized under the VLIW control hierarchy;
* the data path of each internal component can be controlled, via the TIMD bus, to perform centralized or distributed data operations; when the architecture is reorganized or a different operation mode is selected, data I/O over the four line transfer modes effectively supports the reorganization of the system structure and the needs of different applications.
Together with the instruction/data input modes, the data I/O operation modes produced by each data port component are selected and transferred through the internal first, second, third and fourth line modes. This lets the architecture support multi-functional parallel operation, so that the behavioral operation process produced by data association processing reflects the function of a macro language primitive.
Fig. 2 is a structure diagram of the multiplexed-data double-latch register.
The basic operating device of the macro instruction set symmetric parallel architecture is the multiplexed-data double-latch register structure shown in Fig. 2. It consists of two independent multiplexers and two independent latches, which can be combined in different forms as shown in Figs. 2a and 2b. It has four important features:
(1) The control ends of the first and second latches are controlled by two independent level signals (L, L1) or clocks (CLKA, CLKB). When the first latch is closed because its control signal CLKA is inactive and the second latch is transparent because its control signal CLKB is active, the values of Q2 and Q are equal within the clock cycle or at that moment.
(2) Because the control ends of the first and second latches are controlled independently, applying a combinational logic signal to either control end can make the values of Q and Q2 differ in a given cycle or at a given moment.
(3) The structure of the first and second latches has two output ends (Q, Q2) and offers self-holding and multi-data-source (D1, D2) operation, so it can respond to data transfer operations in the first, second, third and fourth line modes.
(4) The select ends of the first and second multiplexers are controlled by two independent coded level signals (CD, CD1); under synchronous or asynchronous select control they effectively support data multiplexing and serial or parallel register operation.
The data multiplexing form is shown in Fig. 2c. After the first multiplexer selects data A1 at input D and the first latch stores it, CD1 makes the second multiplexer MUX2 select Q2, so data A1 is passed to Q1. The falling CLKB signal then stores A1 at the Q end; once the second latch has closed, CD1 selects Q, so Q and Q1 are equal within the operating cycle. As shown in Fig. 2d, after data A1 is stored at the Q end, CD makes the first multiplexer select Q, so Q and Q2 are also equal within the operating cycle. Thus when the select controls CD and CD1 operate synchronously, Q, Q1 and Q2 are equal; this is the multiplexed form, which supports primary and secondary arithmetic operations on a given datum. When CD and CD1 operate asynchronously, Q and Q1 are equal while Q2 differs from them, which constitutes the parallel register operation mode.
Non-destructive data operation is shown in Fig. 2e. When either latch control signal (for example CLKB) is gated by combinational logic, the D value is held at the Q end, so the operation on D becomes non-destructive; after the operation ends, CD can make the first multiplexer select Q at any time to restore the D value.
Inserted data operation is shown in Fig. 2f. By controlling the first and second multiplexers with CD and CD1, or by controlling CLKA and CLKB, the first and second latches can each produce a different or identical group of data, so that within one pulse clock cycle a register made of two latches does the work of two registers, with multiple data sources and data types.
The self-holding circuit is shown in Fig. 2. Under asynchronous control by CD and CD1 and by CLKA and CLKB respectively, a register of this two-latch structure holds its own data and can be organized into serial or parallel register structures.
Table 2-1 Signal description of the double-latch register
Signal name | Function | Active value |
CD | Select signal of the first data multiplexer | Chosen according to the application coding |
CD1 | Select signal of the second data multiplexer | Chosen according to the application coding |
CLKA | Control signal of the first latch; may be a clock signal or a control level signal. | Active low: when CLKA is "0", the D data enters LATCH_1 and is latched; when CLKA is "1", the LATCH_1 data remains unchanged. |
CLKB | Control signal of the second latch; may be a clock signal or a control level signal. | Active high: when CLKB is "1", the Q1 data enters LATCH_2 and is latched; when CLKB is "0", the LATCH_2 data remains unchanged. |
D2 | Data sources of the first multiplexer, covering data from various sources: Q3, Q4 ... Qn | --- |
D1 | Data sources of the second multiplexer, covering data from various sources: Q3, Q4 ... Qn | --- |
Q2 | Output data of the first latch | --- |
Q | Output data of the second latch | --- |
D | Output data of the first data multiplexer MUX_1 | --- |
Q1 | Output data of the second data multiplexer MUX_2 | --- |
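The register behaviour in features (1) to (4) and Table 2-1 can be illustrated with a small software model. This is only a behavioural sketch under stated assumptions: the class name, the method name and the string codes used for CD and CD1 ("D2", "Q", "Q2", "D1") are hypothetical, since the text leaves the select coding to the application, and timing within a cycle is reduced to a single evaluation pass.

```python
class DoubleLatchRegister:
    """Behavioural sketch: MUX_1 -> LATCH_1 (Q2) -> MUX_2 -> LATCH_2 (Q)."""

    def __init__(self):
        self.q2 = 0  # output of the first latch
        self.q = 0   # output of the second latch

    def apply(self, clka, clkb, cd="D2", cd1="Q2", d2=0, d1=0):
        # MUX_1 picks the input D of LATCH_1 (select codes are assumed names).
        d = {"D2": d2, "Q": self.q}[cd]
        if clka == 0:          # CLKA is active low: latch 1 is transparent
            self.q2 = d
        # MUX_2 picks the input Q1 of LATCH_2.
        q1 = {"Q2": self.q2, "Q": self.q, "D1": d1}[cd1]
        if clkb == 1:          # CLKB is active high: latch 2 is transparent
            self.q = q1
        return self.q2, self.q


reg = DoubleLatchRegister()
loaded = reg.apply(clka=0, clkb=0, d2=5)      # latch 1 takes 5, latch 2 holds 0
passed = reg.apply(clka=1, clkb=1)            # latch 1 closed, latch 2 copies Q2
scratch = reg.apply(clka=0, clkb=0, d2=7)     # Q2 is overwritten, Q still holds 5
restored = reg.apply(clka=0, clkb=0, cd="Q")  # MUX_1 selects Q to restore Q2
```

The last two calls mirror the non-destructive operation: the second latch keeps the earlier value while the first latch is reused, and selecting Q through the first multiplexer brings the value back.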
Fig. 3 is a diagram of the FPCA and YPCA address components.
The symmetric address component, as shown in Fig. 3, comprises:
* an address multiplexer MUX, which accepts data input in the first, second, third and fourth line modes of this microprocessor architecture. One input comes from the internal data bus TIMD BUS; the other three come from the current address pointer PC, the increment pointer PCINC and the decrement pointer PCDEC. The select signal MPC of the MUX comes from the control field of the VLIW flag format word of the FCC decoding unit or from the first line mode output. The output bus AA of the MUX is connected to the address error comparator COM, the incrementer INC, the decrementer DEC and the current address register PC;
* an incrementer INC and a decrementer DEC, which compute the increment and the decrement of the selected current address. Both take their input from the output AA of the address multiplexer MUX, and their outputs are connected to the increment pointer register PCINC and the decrement pointer register PCDEC respectively;
* three registers that can serve as address pointers in serial first-in-first-out, first-in-last-out and random storage modes: the address pointer register PC, the increment pointer register PCINC and the decrement pointer register PCDEC;
* an address overflow error comparator COM, which judges whether the current address pointer has overflowed and arbitrates and handles the storage characteristics. Its inputs are the output line AA of the MUX and the dedicated data line A1, where A1 serves as the limit address or base address; after comparison, COM outputs the address line A and the error flag signal A_err;
* a converter ASC that manages synchronous or asynchronous control timing. Under control of the MASC signal, ASC applies synchronous/asynchronous timing control to the address value [A] on the input address line A and generates the final memory access address, forming the third line output mode, which is output at the address end ADDR.
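The data path listed above (MUX, INC, DEC, the three pointer registers and COM) can be sketched as a per-cycle software model. This is an illustrative sketch, not the circuit itself: the class and method names are hypothetical, A1 is treated as a limit address with an error raised only when AA exceeds it (the text does not give the comparison rule), and the ASC timing stage is left out.

```python
from dataclasses import dataclass


@dataclass
class AddressUnit:
    limit: int      # stands in for A1, assumed here to be an upper limit address
    pc: int = 0     # current address pointer register
    pcinc: int = 0  # increment pointer register
    pcdec: int = 0  # decrement pointer register

    def cycle(self, mpc, timd=0):
        """Select one source with MPC, then update all three pointer registers."""
        sources = {"TIMD": timd, "PC": self.pc,
                   "PCINC": self.pcinc, "PCDEC": self.pcdec}
        aa = sources[mpc]              # output AA of the address MUX
        a_err = aa > self.limit        # COM: assumed overflow rule
        # Registers take their new values for the next cycle.
        self.pc, self.pcinc, self.pcdec = aa, aa + 1, aa - 1
        return aa, a_err


unit = AddressUnit(limit=16)
trace = [unit.cycle("TIMD", timd=14)] + [unit.cycle("PCINC") for _ in range(3)]
```

Here the first cycle loads an initial pointer from TIMD, and the following cycles keep selecting PCINC, which walks the address upward until the assumed limit check flags an error.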
The basic features of the symmetric address component are as follows:
(1) As shown in Fig. 3a, after the system is reset the MPC signal makes the address multiplexer MUX select the fourth line mode input from the internal data bus TIMD, which indicates the initial address pointer of the current storage mode. After processing by the error comparator COM, this pointer yields the actual memory access address [A], which passes through the synchronous/asynchronous timing control component ASC and is output on the address bus ADDR in the third line mode. At the same time, the current address value [AA] selected by the MUX passes through the incrementer INC and the decrementer DEC to produce the address increment and decrement. The increment is stored in the next cycle by the double-latch structure of the increment pointer register PCINC, forming the second line mode output, and can be selected as the memory access address pointer in the next cycle (for details of the double-latch register, see the description of Fig. 2). Until the MUX selects a newly defined address pointer, the increment pointer register serves as the only pointer in random storage mode. The pointer register PC stores the address value [AA] selected in the current cycle, and the decrement pointer register PCDEC stores the decrement of the currently selected address, that is, the address value of the previous cycle's operation. On a system interrupt, the combined control signals produced by the current VLIW control hierarchy determine which of the three address registers outputs its value in the third line mode for protection.
(2) As shown in Figs. 3b and 3c, when the system uses first-in-last-out storage mode, under control of the MPC signal the address multiplexer MUX selects the fourth line mode input from the internal data bus TIMD to determine an initial address value. After processing by the error comparator COM, the actual memory access address [A] is produced, passes through the synchronous/asynchronous timing control component ASC, and is output on the address bus ADDR in the third line mode. At the same time, the increment of the currently selected address value [AA], produced by the incrementer INC, is stored in the increment pointer register PCINC in the next cycle, forming the second line output mode, and serves as the address output for a read access to the storage unit in the next cycle. The decrement of [AA], produced by the decrementer DEC, is stored in the decrement pointer register PCDEC in the next cycle and serves as the address output for a write access to the storage unit in the next cycle. The current pointer register PC holds the address of the current sequential storage location.
(3) As shown in Figs. 3d and 3e, when the system uses first-in-first-out storage mode, under control of the MPC signal the multiplexer MUX selects the fourth line mode input from the internal data bus TIMD to determine an initial address value. After processing by the error comparator COM, the actual memory access address [A] is produced, passes through the synchronous/asynchronous timing control component ASC, and is output on the address bus ADDR in the third line mode. In this mode the increment pointer register PCINC serves as the write access pointer WP of the storage unit, and the current pointer register PC serves as the read access pointer RP of the storage unit, forming the second line output mode.
More specifically, when a write to the storage unit is performed, the increment of the selected address value [AA] is stored in the increment pointer register PCINC in the next cycle and serves as the storage unit address pointer for the next write, while the read access address pointer (the current pointer register PC) keeps its value unchanged. When a read of the storage unit is performed, the increment of the selected address value [AA] is stored in the current pointer register PC in the next cycle and serves as the storage unit address pointer for the next read, while the write access address pointer (the increment pointer register PCINC) keeps its value unchanged. When the system raises an overflow interrupt, the combined control signals produced by the VLIW control hierarchy determine whether the increment pointer register or the current address pointer PC is output in the third line mode for saving.
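The first-in-first-out pointer rule just described, in which a write advances only PCINC and a read advances only PC, can be sketched as follows. This is a minimal illustration under assumptions: the class and method names are hypothetical, both pointers are assumed to start at the same base address, a Python dictionary stands in for the memory, and overflow handling by COM is omitted.

```python
class FifoPointers:
    """Sketch of FIFO addressing: PCINC acts as write pointer WP, PC as read pointer RP."""

    def __init__(self, base):
        self.pc = base     # read pointer RP (assumed to start at the base address)
        self.pcinc = base  # write pointer WP (assumed to start at the base address)

    def write_address(self):
        aa = self.pcinc        # MUX selects the write pointer
        self.pcinc = aa + 1    # only PCINC advances; PC is left unchanged
        return aa

    def read_address(self):
        aa = self.pc           # MUX selects the read pointer
        self.pc = aa + 1       # only PC advances; PCINC is left unchanged
        return aa


memory = {}
fifo = FifoPointers(base=0x40)
for value in ("a", "b", "c"):
    memory[fifo.write_address()] = value
first_two = [memory[fifo.read_address()] for _ in range(2)]
```

Because the two pointers move independently, the values come back in the order in which they were written.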
The symmetric address components are identical in function, structure, operation and control mode. When each component operates independently, its address generation, storage operation mode and read/write control are determined within the cycle, by way of the combined control signals, from the instruction/data input produced by its paired FD or FZ data port component, and the data read/write I/O operation is carried out through the data port on the rising edge of the next cycle clock, as shown in Fig. 3f.
When the FPCA and YPCA components operate jointly and one of them (FPCA or YPCA) selects random storage mode, the other (YPCA or FPCA) selects first-in-last-out stack storage mode. The addresses generated and the data read/write operations performed by the component in stack mode are then controlled by the combined control signals produced from the instruction/data input at the data port of the component in random storage mode. As shown in Fig. 3f, instruction A, executed in cycle T1, requires the data port component in stack mode to perform a write in cycle T2, and instruction B, read in on the CLK rising edge of cycle T2, requires a read on that data port component in cycle T3. The write and read produced in cycles T2 and T3 are therefore redundant in cycle T3, and under the combined control of instructions A and B the external operation of the stack-mode data port component in cycles T2 and T3 is a high-impedance state with neither read nor write. See the description of Fig. 6.
The three storage operation modes of the symmetric address component, namely first-in-first-out, first-in-last-out stack and random storage, are controlled by two subcomponents, FIF_PS and FIF_FIFO, of the internal VLIW register flag component FIF.
FIF_PS and FIF_FIFO each consist of two multiplexers (MUX) and a flip-flop (DFF), as shown in Fig. 3g. Their basic feature is that the operation mode of the address component (first-in-first-out, first-in-last-out stack or random storage) can be defined either by external hardwired flag logic or by the internal VLIW register flag word.
Before the processor is reset, the external hardwired flag pins PS_PIN and FIFO_PIN are first set by jumpers. During the reset cycle the RSTn signal is active, so MUX1 selects PS_PIN and MUX3 selects FIFO_PIN; at the same time, because RSTn is active, MUX2 and MUX4 select the outputs of MUX1 and MUX3 respectively, and these are stored by the flip-flops DFF1 and DFF2 on the CLK rising edge of the next cycle. This realizes the initial definition of the address component's operation mode by external hardwiring during reset.
After the processor is reset, MUX2 and MUX4 select the state values held by DFF1 and DFF2 respectively. When an instruction redefines the operation mode of the address component, the PS_Ins and FIFO_Ins signals are active, so MUX1 and MUX3 select the YPS and YFIFO signals respectively; these pass through MUX2 and MUX4, are stored in DFF1 and DFF2, and are finally output as the PS and FIFO signals that control the operation mode of the address component.
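The two-multiplexer-plus-flip-flop arrangement used by FIF_PS and FIF_FIFO can be sketched as one small class instantiated twice. This is a behavioural sketch under assumptions: the class, method and argument names are hypothetical, one call to clock() stands for one CLK rising edge, and the meaning of the stored values 0 and 1 is not specified in the text and is not assigned here.

```python
class ModeFlag:
    """Sketch of one flag path: pin/field MUX, hold/load MUX, then a DFF."""

    def __init__(self):
        self.dff = 0  # value held by the flip-flop (the PS or FIFO output)

    def clock(self, rst_active, pin=0, ins_valid=False, field=0):
        # First MUX: the jumper pin during reset, otherwise the instruction field.
        first = pin if rst_active else field
        # Second MUX: load the first MUX output during reset or when the
        # instruction redefines the mode, otherwise recirculate the DFF value.
        second = first if (rst_active or ins_valid) else self.dff
        self.dff = second  # captured on the CLK rising edge
        return self.dff


ps, fifo = ModeFlag(), ModeFlag()
ps.clock(rst_active=True, pin=1)      # PS_PIN sampled during reset
fifo.clock(rst_active=True, pin=0)    # FIFO_PIN sampled during reset
after_reset = (ps.dff, fifo.dff)

ps.clock(rst_active=False)            # no instruction: values are held
fifo.clock(rst_active=False)
held = (ps.dff, fifo.dff)

ps.clock(rst_active=False, ins_valid=True, field=0)    # YPS loaded by instruction
fifo.clock(rst_active=False, ins_valid=True, field=1)  # YFIFO loaded by instruction
redefined = (ps.dff, fifo.dff)
```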
The converter ASC for the synchronous/asynchronous control timing of the address port, shown in Fig. 3h-1, comprises a latch LAT, a flip-flop DFF and a multiplexer MUX. The system can apply synchronous or asynchronous control to the timing of the address output, or switch from synchronous to asynchronous and from asynchronous to synchronous. The controlling component is FIF_ASC, a subcomponent of the internal VLIW register flag control component FIF. FIF_ASC consists of two multiplexers, MUX1 and MUX2, and a flip-flop DFF.
Synchronous memory access timing means that stable address and read/write control signals for the memory access are provided before the synchronous clock edge and are latched by that edge. The latched address remains stable for the whole memory cycle for the data read/write operation. Once the address has been latched synchronously, the address component is allowed to change to a new address value for use in the next memory cycle.
Asynchronous memory access timing means that the memory access address signal is not latched by a synchronous clock; the address component produces and controls it and keeps the address pointer stable for the whole cycle. The address component may not switch to a new address until the data read/write operation of the cycle has finished.
Synchronous memory access timing is shown in Fig. 3i. Address A passes through the latch LAT while its control signal CLK4 is high, is selected by the M_ASC multiplexer, and becomes the ADDR output; it is latched by the synchronous clock outside the chip and held for the whole memory cycle to carry out the memory data read/write operation.
Asynchronous memory access timing is shown in Fig. 3j. Address A is latched on the rising edge of CLK, the control signal of the flip-flop DFF, is selected by the M_ASC multiplexer, and becomes the ADDR output, serving as the memory access address pointer. This pointer remains stable for the whole memory cycle, until the next rising edge of CLK latches a new address pointer.
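The difference between the two address paths, a level-sensitive latch for synchronous timing and an edge-triggered flip-flop for asynchronous timing, can be sketched as below. This is an illustration under assumptions: the class and method names are hypothetical, and the MASC select is written as the strings "sync" and "async" because the text does not state which signal level selects which path.

```python
class AscConverter:
    """Sketch of ASC: LAT (transparent while CLK4 is high), DFF (CLK rising edge), MUX."""

    def __init__(self):
        self.lat = 0        # latch LAT output
        self.dff = 0        # flip-flop DFF output
        self._prev_clk = 0  # previous CLK level, used to detect a rising edge

    def step(self, a, clk, clk4, masc):
        if clk4 == 1:                          # latch follows A while CLK4 is high
            self.lat = a
        if clk == 1 and self._prev_clk == 0:   # flip-flop captures A on a rising edge
            self.dff = a
        self._prev_clk = clk
        # M_ASC multiplexer: choose which stored address drives ADDR.
        return self.lat if masc == "sync" else self.dff


asc = AscConverter()
addr = [
    asc.step(a=0x20, clk=1, clk4=1, masc="sync"),   # latch transparent
    asc.step(a=0x21, clk=0, clk4=0, masc="sync"),   # latch closed, address held
    asc.step(a=0x22, clk=1, clk4=0, masc="async"),  # rising edge captures 0x22
    asc.step(a=0x23, clk=1, clk4=0, masc="async"),  # no new edge, address held
]
```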
The synchronous/asynchronous timing of the address port can be controlled in two ways, as shown in Fig. 3h-1:
(1) Hardwired flag definition at reset
Before the microprocessor is reset, the external pin ASC_PIN is first set by a jumper. After the processor enters the reset cycle, the RSTn signal is active, MUX1 selects ASC_PIN, MUX2 selects the output signal MASC1 of MUX1, and on the CLK rising edge of the next cycle this value is stored in the flip-flop DFF. This realizes the initial definition, by hardwired flag at reset, of the synchronous/asynchronous operation mode of the address port.
(2) Internal VLIW register flag definition
After the processor reset ends, MUX1 selects the instruction control field signal ASC_Yu for synchronous/asynchronous timing as its output, and MUX2 selects the reset initial state MASC2 held by the flip-flop DFF. When an external VLIW flag word is loaded into the internal VLIW register flag word, the ASC_Val signal is active and MUX2 selects the output signal MASC1 of MUX1; on the CLK rising edge of the next cycle the signal ASC_Yu loaded by the external VLIW is stored in the flip-flop DFF. After a timing switch time of one cycle, the MASC output signal controls the synchronous/asynchronous operation mode of the address port. For the timing, see Figs. 3k and 3l.
The synchronous/asynchronous timing switching device of the data port component, shown in Fig. 3h-2, comprises two latches and a multiplexer. The system controls the conversion between synchronous and asynchronous latch timing for the input data of the data port component; the controlling component is FIF_ASCd, a subcomponent of the internal VLIW register flag control component FIF. FIF_ASCd consists of three multiplexers, MUX1, MUX2 and MUX3, and a flip-flop DFF.
The synchronous/asynchronous timing of the data port component can be controlled in three ways, as shown in Fig. 3h-2:
(1) Hardwired flag definition at reset
Before the microprocessor is reset, the external pin ASCd_PIN is first set by a jumper. After the processor enters the reset cycle, the RSTn signal is active, MUX1 selects ASCd_PIN, MUX2 selects the output signal MASCd1 of MUX1, and on the CLK rising edge of the next cycle this value is stored in the flip-flop DFF. During reset the ASCd_Ins signal is inactive, so MUX3 selects the output of the flip-flop DFF, which is connected as the MASCd signal to the input data synchronous/asynchronous timing control device of the data port component, thereby realizing the hardwired flag definition at reset.
(2) Internal VLIW register flag definition
After the processor reset ends, MUX1 selects the instruction control field signal ASCd_Yu for data synchronous/asynchronous timing as its output, while the flip-flop DFF holds the state that the external pin ASCd_PIN had during reset. When an external VLIW flag word is loaded into the internal VLIW register flag word, the ASCd_Val signal is active, MUX2 selects the output signal MASCd1 of MUX1, and this value is stored in the flip-flop DFF on the CLK rising edge of the next cycle.
When the processor is currently in synchronous timing, MUX3 uses the ASCd_Ins signal, in the second half of the cycle in which the instruction control is active, to select the ASCd_Yu signal from the instruction control, so that the system switches to asynchronous mode within the current cycle. In the following cycle ASCd_Ins becomes inactive and MUX3 returns to selecting the output of the flip-flop DFF; because DFF has by then latched the asynchronous mode requested by the ASCd_Yu instruction field, the processor stays in asynchronous mode until an instruction sets it again. For the timing, see Fig. 3m.
When the processor is currently in asynchronous timing, MUX3 uses the ASCd_Ins signal, in the second half of the cycle in which the instruction control is active, to select the ASCd_Yu signal from the instruction control, so that the system switches to synchronous mode within the current cycle. In the following cycle ASCd_Ins becomes inactive and MUX3 returns to selecting the output of the flip-flop DFF; because DFF has by then latched the synchronous mode requested by the ASCd_Yu instruction field, the processor stays in synchronous mode until an instruction sets it again. For the timing, see Fig. 3n.
(3) VLIW control field flag definition
When the VLIW control field flag dynamically sets the synchronous/asynchronous timing state of the data port component, the ASCd_Ins signal is active and MUX3 selects the current instruction control field ASCd_Yu as the output signal MASCd, thereby changing the synchronous/asynchronous timing mode of the data port component within this cycle. After the instruction finishes executing, ASCd_Ins becomes inactive, MUX3 returns to selecting the output of the flip-flop DFF, and the synchronous/asynchronous timing of the data port component automatically reverts to the mode in effect before the instruction was executed. For the timing, see Figs. 3o and 3p.
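The three ways of setting the data port timing can be sketched together: a stored flag loaded at reset or through ASCd_Val, and a per-cycle override through ASCd_Ins that does not alter the stored flag. This is a behavioural sketch under assumptions: the class, method and argument names are hypothetical, one clock() call stands for one CLK rising edge, and the values 0 and 1 are placeholders because the text does not say which level means synchronous.

```python
class AscdFlag:
    """Sketch of FIF_ASCd: MUX1/MUX2 feed a DFF, MUX3 allows a one-cycle override."""

    def __init__(self):
        self.dff = 0  # stored timing flag

    def clock(self, rst_active, pin=0, val=False, yu=0):
        mux1 = pin if rst_active else yu   # ASCd_PIN during reset, else ASCd_Yu
        if rst_active or val:              # MUX2 loads MUX1; otherwise DFF holds
            self.dff = mux1

    def mascd(self, ins=False, yu=0):
        # MUX3: the instruction field while ASCd_Ins is active, else the DFF output.
        return yu if ins else self.dff


flag = AscdFlag()
flag.clock(rst_active=True, pin=1)         # way (1): hardwired flag at reset
at_reset = flag.mascd()
overridden = flag.mascd(ins=True, yu=0)    # way (3): override for this cycle only
reverted = flag.mascd()                    # stored flag is visible again
flag.clock(rst_active=False, val=True, yu=0)  # way (2): flag word loaded into DFF
reloaded = flag.mascd()
```

The override in way (3) changes only the MUX3 output, which is why the mode reverts once ASCd_Ins goes inactive, whereas way (2) changes the stored value itself.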
Table 3-1 Signal notes for Fig. 3
TIMD---internal data bus
PC---current address pointer register data bus
PCINC---increment address pointer register data bus
PCDEC---decrement address pointer register data bus
MPC---select control signal of the address multiplexer MUX_PC
AA---selected output signal of the address multiplexer MUX_PC
CLKA1---first latch control signal of the increment register PCINC
CLKA2---first latch control signal of the decrement address register PCDEC
CLKA3---first latch control signal of the current address register PC
D1---data bus from other components to MUX_INC, the multiplexer between the first and second latches of the increment register
D2---data bus from other components to MUX_DEC, the multiplexer between the first and second latches of the decrement address register
D3---data bus from other components to MUX_PC, the multiplexer between the first and second latches of the current address register
CD1---select control signal of MUX_INC, the multiplexer between the first and second latches of the increment register
CD2---select control signal of MUX_DEC, the multiplexer between the first and second latches of the decrement address register
CD3---select control signal of MUX_PC, the multiplexer between the first and second latches of the current address register
CLKB1---second latch control signal of the increment register PCINC
CLKB2---second latch control signal of the decrement address register PCDEC
CLKB3---second latch control signal of the current address register PC
MASC---select signal that controls the synchronous/asynchronous address timing
A_err---address error flag signal
Fig. 4 is a diagram of the FALU computation component with a variable operation sequence.
The macro instruction set symmetric parallel architecture includes the FALU component, which provides multiple computation functions. As shown in Fig. 4, it comprises:
* two add/subtract arithmetic units, FAU1 and FAU2. FAU1 can be used on its own or can cooperate with the FINC unit to compute address and data offsets; FAU2 is one of the arithmetic units that make up the variable sequence;
* a logic operation unit FLOG, one of the arithmetic units that make up the variable sequence;
* a shift operation unit FSHIFT, which together with FAU2 and FLOG forms the variable sequence arithmetic unit;
* an increment unit FINC, which cooperates with the FAU1 unit to compute address and data offsets;
* thirteen data multiplexers, of which:
MUX1 selects among data of the four line modes for the FINC unit;
MUX2 and MUX3 select among data of the four line modes for the FAU1 unit;
MUX4 and MUX5 select among data of the four line modes for the FLOG unit;
MUX6 and MUX7 select among data of the four line modes for the FAU2 unit;
MUX8 selects among data of the four line modes for the FSHIFT unit;
MUX9 selects either four-line-mode data or an internal operation result for the FAU2 unit;
MUX10 selects either four-line-mode data or an internal operation result for the FLOG unit;
MUX11 selects either four-line-mode data or an internal operation result for the FSHIFT unit;
MUX12 selects between the operation results of FAU1 and FLOG for output in the third line mode;
MUX13 selects among the operation results of FAU2, FLOG and FSHIFT for output in the third line mode.
* a flip-flop DFF, used to store the address or data offset calculation result.
The variable-sequence arithmetic unit, shown in Fig. 4a, is part of the FALU parallel arithmetic component. It consists of three arithmetic units (FAU2, FLOG and FSHIFT) and nine gates (MUX4, MUX5, MUX6, MUX7, MUX8, MUX9, MUX10, MUX11 and MUX13).
Every pair of arithmetic units in the variable-sequence unit is interconnected by a data path used to pass operation results between them. The result output of each arithmetic unit can serve as an operand input of the other arithmetic units, and each unit can likewise receive the result outputs of the other units as its own operands. As shown in Fig. 4a, the operation result of the FAU2 unit is connected to the inputs of MUX10 and MUX11 through the AU2 data bus; under the control of the M10 and M11 gating signals it can be gated into the FLOG and FSHIFT units as an operand. In the same way, the operation results of FLOG and FSHIFT can be sent to the other arithmetic units. When an operand is processed by two or more arithmetic units in succession, an operation sequence is formed, such as an add, multiply sequence.
From this structure it follows that the operands of each arithmetic unit can come not only from the register components or memory components but also from the results of the other arithmetic units. Operand selection is performed by the MUX gates, whose select inputs are controlled by signals produced by the internal very long instruction word (VLIW) register identification component FIF_ALU. Changing the internal VLIW register identification therefore changes the operand sources of the arithmetic units, and with them the order of the arithmetic operations. The basic variable operation sequences are as follows:
* The arithmetic-then-logic sequence is shown in Fig. 4b. The FAU2 arithmetic unit needs two operands: the first is taken directly from the fourth-line-mode output of the data bus TIMD bus, and the second is also taken from the fourth-line-mode output of the TIMD bus, selected by the MUX9 gate under the control of the M9 signal. The two operands enter FAU2, which performs an addition or subtraction and outputs the result [AU2]. [AU2] serves as one operand of the logic unit and is gated into FLOG by the MUX10 gate under the control of M10. The other operand of FLOG comes from the internal data bus TIMD bus. The logic operation on these two operands produces [LOG], and the MUX13 gate, controlled by M13, selects [LOG] as the final result of the arithmetic-logic sequence.
* The logic-then-arithmetic sequence is shown in Fig. 4c. The FLOG logic unit needs two operands: the first is taken directly from the fourth-line-mode output of the data bus TIMD bus, and the second is also taken from the fourth-line-mode output of the TIMD bus, selected by the MUX10 gate under the control of the M10 signal. The two operands enter FLOG, which performs the logic operation and outputs the result [LOG]. [LOG] serves as one operand of the arithmetic unit and is gated into FAU2 by the MUX9 gate under the control of M9. The other operand of FAU2 comes from the fourth-line-mode output of the internal data bus TIMD bus. The arithmetic operation on these two operands produces [AU2], and the MUX13 gate, controlled by M13, selects [AU2] as the final result of the logic-arithmetic sequence.
* The arithmetic-then-shift sequence is shown in Fig. 4d. The FAU2 arithmetic unit needs two operands: the first is taken directly from the fourth-line-mode output of the data bus TIMD bus, and the second is also taken from the fourth-line-mode output of the TIMD bus, selected by the MUX9 gate under the control of the M9 signal. The two operands enter FAU2, which performs an addition or subtraction and outputs the result [AU2]. [AU2] serves as the operand of the shift unit and is gated into FSHIFT by the MUX11 gate under the control of M11. Because a shift is a single-operand operation, no further operand is needed. Shifting [AU2] produces [SHIFT], and the MUX13 gate, controlled by M13, selects [SHIFT] as the final result of the arithmetic-shift sequence.
* The shift-then-arithmetic sequence is shown in Fig. 4e. The FSHIFT shift unit needs only one operand, which is taken from the fourth-line-mode output of the data bus TIMD bus, selected by the MUX11 gate under the control of the M11 signal. The data enter FSHIFT, which performs the shift and outputs the result [SHIFT]. [SHIFT] serves as one operand of the arithmetic unit and is gated into FAU2 by the MUX9 gate under the control of M9. The other operand of FAU2 comes from the fourth-line-mode output of the internal data bus TIMD bus. The arithmetic operation on these two operands produces [AU2], and the MUX13 gate, controlled by M13, selects [AU2] as the final result of the shift-arithmetic sequence.
* The logic-then-shift sequence is shown in Fig. 4f. The FLOG logic unit needs two operands: the first is taken directly from the fourth-line-mode output of the data bus TIMD bus, and the second is also taken from the fourth-line-mode output of the TIMD bus, selected by the MUX10 gate under the control of the M10 signal. The two operands enter FLOG, which performs the logic operation and outputs the result [LOG]. [LOG] serves as the operand of the shift unit and is gated into FSHIFT by the MUX11 gate under the control of M11. Because a shift is a single-operand operation, no further operand is needed. Shifting [LOG] produces [SHIFT], and the MUX13 gate, controlled by M13, selects [SHIFT] as the final result of the logic-shift sequence.
* The shift-then-logic sequence is shown in Fig. 4g. The FSHIFT shift unit needs only one operand, which is taken from the fourth-line-mode output of the data bus TIMD bus, selected by the MUX11 gate under the control of the M11 signal. The data enter FSHIFT, which performs the shift and outputs the result [SHIFT]. [SHIFT] serves as one operand of the logic unit and is gated into FLOG by the MUX10 gate under the control of M10. The other operand of FLOG comes from the fourth-line-mode output of the internal data bus TIMD bus. The logic operation on these two operands produces [LOG], and the MUX13 gate, controlled by M13, selects [LOG] as the final result of the shift-logic sequence.
The variable-sequence design method applies not only to three arithmetic units but equally to a larger number of arithmetic units. Its main feature is that an operand may come from the result of any functional unit, and both operands of a two-operand unit may come from the results of any arithmetic units.
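The six two-stage sequences described above can be sketched in software as follows. This is only an illustrative model, not the hardware itself: the function name `variable_sequence`, the 32-bit data width, and the choice of a single operation per unit (add for FAU2, AND for FLOG, logical shift left for FSHIFT) are assumptions made for brevity. The `first` and `second` arguments play the role of the M9, M10, M11 and M13 gating signals.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit data width


def variable_sequence(first, second, x, y, z=0, shift_amount=1):
    """Run unit `first` on TIMD operands x and y, then feed its result to `second`.

    z is the other TIMD operand of the second unit (ignored by FSHIFT,
    which is a single-operand unit). Unit behaviour is simplified:
    FAU2 adds, FLOG ANDs, FSHIFT shifts left (all assumptions).
    """
    if first == second:
        raise ValueError("a sequence needs two different units")
    units = {
        "FAU2": lambda a, b: (a + b) & MASK,
        "FLOG": lambda a, b: a & b,
        "FSHIFT": lambda a, _b: (a << shift_amount) & MASK,
    }
    intermediate = units[first](x, y)       # e.g. [AU2], routed by MUX10 or MUX11
    return units[second](intermediate, z)   # final result, as selected by MUX13


# Arithmetic-then-logic (Fig. 4b): (5 + 3) AND 0xC
result = variable_sequence("FAU2", "FLOG", 5, 3, 0xC)
```

Changing only the two unit names reorders the sequence, which mirrors how changing the FIF_ALU identification changes the operand sources without changing the arithmetic units.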
Table 4-1 Signal name notes
TIMD---internal data bus
INC---operation result data bus of the add-one unit
AU1---operation result data bus of add/subtract arithmetic unit 1
AU2---operation result data bus of add/subtract arithmetic unit 2
LOG---operation result data bus of the logic unit
SHIFT---operation result data bus of the shift unit
ALU1---arithmetic unit operation result 1 output data bus
ALU2---arithmetic unit variable-sequence operation result data bus
AU---arithmetic unit address or data offset operation result data bus
Rau---offset result register output data bus
LRau---offset result register latch control signal
YAU1---function control signal of the FAU1 arithmetic unit
YAU2---function control signal of the FAU2 arithmetic unit
YLOG---function control signal of the FLOG logic unit
YSHC---shift function control signal of the FSHIFT shift unit
YSHB---shift amount control signal of the FSHIFT shift unit
Table 4-2 FAU1 and FAU2 device function table
YAU1/YAU2 | Function |
00 | Addition |
01 | Addition with carry |
10 | Subtraction |
11 | Subtraction with borrow |
Table 4-3 FLOG device function table
YLOG | Function |
00 | Operand transfer |
01 | Logical AND |
10 | Logical OR |
11 | Logical XOR |
Table 4-4 FSHIFT device function table
YSHC | Function |
000 | Operand transfer |
001 | Logical shift left |
010 | Logical shift right |
011 | Rotate left |
100 | Rotate right |
101 | Arithmetic shift right |
110 | Arithmetic shift left |
111 | Operand negation |
YSHB controls the number of bit positions shifted in a shift operation; the shift amount ranges from 0 to 31.
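The function tables above can be read as small decoders. The following sketch is one possible software interpretation under stated assumptions: a 32-bit word (suggested by the 0 to 31 shift range), an explicit `carry` argument standing in for the carry/borrow flag, arithmetic shift left treated as identical to logical shift left, and code 111 treated as a bitwise complement. The function names `fau`, `flog` and `fshift` are hypothetical.

```python
WIDTH = 32                      # assumed word width
MASK = (1 << WIDTH) - 1


def fau(code, a, b, carry=0):
    """Table 4-2: add, add with carry, subtract, subtract with borrow."""
    if code == 0b00:
        return (a + b) & MASK
    if code == 0b01:
        return (a + b + carry) & MASK
    if code == 0b10:
        return (a - b) & MASK
    if code == 0b11:
        return (a - b - carry) & MASK
    raise ValueError("YAU code must be 2 bits")


def flog(code, a, b):
    """Table 4-3: operand transfer, AND, OR, XOR."""
    table = {0b00: a, 0b01: a & b, 0b10: a | b, 0b11: a ^ b}
    return table[code] & MASK


def fshift(yshc, a, yshb):
    """Table 4-4: yshc selects the function, yshb is the shift amount (0..31)."""
    a &= MASK
    n = yshb & (WIDTH - 1)
    if yshc == 0b000:                                   # operand transfer
        return a
    if yshc in (0b001, 0b110):                          # logical / arithmetic left
        return (a << n) & MASK
    if yshc == 0b010:                                   # logical shift right
        return a >> n
    if yshc == 0b011:                                   # rotate left
        return ((a << n) | (a >> (WIDTH - n))) & MASK
    if yshc == 0b100:                                   # rotate right
        return ((a >> n) | (a << (WIDTH - n))) & MASK
    if yshc == 0b101:                                   # arithmetic shift right
        signed = a - (1 << WIDTH) if a >> (WIDTH - 1) else a
        return (signed >> n) & MASK
    if yshc == 0b111:                                   # negation (assumed bitwise NOT)
        return ~a & MASK
    raise ValueError("YSHC code must be 3 bits")
```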
Fig. 5 is the architecture diagram of the parallel register file
The macro-instruction-set symmetric parallel architecture microprocessor includes a symmetric parallel register component. As shown in Fig. 5, it comprises:
* two groups of internal data register files, each containing four multi-bit, multiplexed-data dual-latch registers DLAT (see Fig. 2);
* the internal data bus TIMD bus, which is connected to the data input (Fig. 2, Q end) of each internal register DLAT; data output in the fourth line mode are selected by MUX1 and stored in the DLAT, as shown in Fig. 5;
* the data of each internal register can be output to the TI bus in the third line mode through the second latch LATCH2 of the dual latch, and are merged onto the internal data bus TIMD bus;
* the MUX_H and MUX_L data gates each select among the four registers of one register file and pass the selected register to the MUX_MH or MUX_ML gate;
* the MUX_MH and MUX_ML gates select between the output data of the two register files and pass the selected path to PAD_H and PAD_L;
* PAD_H and PAD_L are the data path nodes between the internal registers and the memories;
* RAM_H and RAM_L are two data memories.
The main features of the symmetric parallel register component are:
(1) Fully symmetric structure. As shown in Fig. 5a, the data register FTNSF of the symmetric parallel architecture is completely symmetric and consistent in structure with the control registers FT, FD and FZ, and the structure of every register file is also fully symmetric.
(2) Composition from multiple register files. As shown in Fig. 5a, the data register FTNSF and the control register FT of the symmetric parallel architecture each contain two register files, and each file contains four registers (DLAT). The register files can be used independently or jointly.
(3) The multiplexed-data dual-latch register as the basic structure. Each DLAT in Fig. 5a uses the compound dual-latch register structure described for Fig. 2.
(4) Multiple data input/output operating points working fully in parallel. As shown in Fig. 5b, for the T, N, S and F registers in each register file, in addition to the common data input TIMD bus, each register has its own independent data input (D0, D1, D2, D3). By controlling the MUX1 gate of each DLAT dual-latch structure, data can enter the T, N, S and F registers in parallel. The outputs T, N, S and F of the second latches (LATCH2) of the four registers can simultaneously deliver their data to different data operating points for processing.
(5) Connection to multiple memory banks. As shown in Fig. 5, each register file can be connected to the external memories RAM_H and RAM_L through the data gates MUX_H, MUX_MH, MUX_L and MUX_ML. The MUX_H and MUX_L gates select among the four registers of each register file and choose one as the data currently to be written to data memory; the MUX_MH and MUX_ML gates receive the data selected by MUX_H and MUX_L and write them to the RAM_H or RAM_L memory through PAD_H and PAD_L.
(6) Operating modes that can be reconfigured and redefined. Through the internal very long instruction word (VLIW) register identification structure (FIF in Fig. 1), the operation of the parallel registers can be redefined. The parallel registers can operate in the following modes:
* parallel operation of the register file
As shown in Fig. 5b, each register can act as an independent data register and be operated on in parallel by the arithmetic units. When the parallel data registers are combined with two FALU arithmetic components, they can supply eight operands simultaneously to the arithmetic units FAU1 and FAU2, the logic unit FLOG and the shift unit FSF of the two components, and can simultaneously receive the results of eight arithmetic operations. The structure is characterised by multiple inputs and multiple outputs.
* first-in last-out register---stack operation mode (FILO)
Shown in Fig. 5 c is (FILO) register architecture first-in last-out, when inner very long instruction word (VLIW) register identification parts FIF_FIFO and FIF_PS define inner parallel register parts and are the FILO mode of operation, referring to table 5-1, these parts carry out data transfer operation with FILO (storehouse) working rule.The T data register is the stack top of internal data storehouse, the F data register is the stack tail of internal data storehouse, unit, SP pointed memory stack top, the inferior stack top of SP+1 pointed stacked memory (second), first can be used for stacked dummy cell the SP-1 pointed, and the content of F data register is consistent with the stack top location (unit of SP pointed) in the storer.
During a push, as shown in Fig. 5c, the data of the S register are output in the third line mode from the second latch LATCH2 of its dual latch DLAT and, through the data gate MUX_1, written in the fourth line mode to cell SP-1 of the memory RAM. The write data become valid at the falling edge t1 of clock CLK1 and are held until the next rising clock edge t2. As shown in Fig. 5d, in the same clock cycle the data Data input in the second or third line mode are sent to the T register. The data of the T register are output in the third line mode from the second latch LATCH2 of that register to the first data gate MUX_1 of the N register's dual latch; the data gate in front of the first latch of the N register selects the T register data arriving in the third line mode, latches them into the first latch LATCH1 at the rising edge t2 of the CLK clock, and saves them into the second latch LATCH2 at the falling edge t3 of CLK1. In the same way, the data of the N register are sent to the S register and the data of the S register are sent to the F register. The data in the internal registers are thus shifted by one position from before the push to after it, as shown in Fig. 5e.
During a pop, as shown in Fig. 5c, the data flow in the direction opposite to a push. The data of the T register are output in the third line mode from the second latch LATCH2 of its dual latch. At the same time, the N, S and F register data are output in the third line mode from their respective second latches LATCH2 to the MUX_1 gates in front of the first latches of the T, N and S registers, and are latched into the first latches LATCH1 at the rising edge t1 of the CLK clock; as shown in Fig. 5f, they are latched into the second latches LATCH2 by the falling edge t2 of CLK1. Because the memory read/write cycle is longer than the register transfer time, the data in memory cell SP+1 reach the MUX_1 gate of the F register only during the negative half-cycle t3 of the CLK clock; they are latched into the first latch LATCH1 of the F register at the rising edge t4 of the CLK clock and enter the second latch LATCH2 of the F register at t5.
Because the F register overlaps with the memory stack top, a pop has a "use first, replenish afterwards" characteristic: the stack-top data in memory are first used by internal operations through the overlapping register F, and F is then replenished through a memory read cycle.
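The push and pop data movements described above can be summarised in a short behavioural sketch. It ignores latch timing entirely and only tracks where the data end up after each operation. The class name `FiloStack`, the memory size and the downward-growing stack are assumptions for illustration; the property it demonstrates is that F always mirrors memory cell SP.

```python
class FiloStack:
    """Behavioural model of the T, N, S, F register stack overlapping memory."""

    def __init__(self, mem_size=16):
        self.mem = [0] * mem_size
        self.sp = mem_size - 1            # assumed: stack grows toward lower addresses
        self.t = self.n = self.s = self.f = 0

    def push(self, data):
        # S is written to the empty cell SP-1 and, in the same cycle, copied to F.
        self.mem[self.sp - 1] = self.s
        self.sp -= 1
        # All registers shift by one position: Data->T, T->N, N->S, S->F.
        self.t, self.n, self.s, self.f = data, self.t, self.n, self.s

    def pop(self):
        out = self.t
        # "Use first": F moves into S at once; "replenish afterwards": F is
        # refilled from memory cell SP+1 by the slower memory read.
        refill = self.mem[self.sp + 1]
        self.t, self.n, self.s, self.f = self.n, self.s, self.f, refill
        self.sp += 1
        return out

    def overlap_holds(self):
        return self.f == self.mem[self.sp]


stack = FiloStack()
for value in range(1, 7):
    stack.push(value)
popped = [stack.pop() for _ in range(6)]
```

The sketch performs no bounds checking, so a pop on an empty stack or a push beyond the memory size is outside what it models.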
* first-in first-out register---queue operation mode (FIFO)
FILO serial operation is characterised by a single operating point: data both enter and leave at the top of the stack. First-in first-out (FIFO) serial operation is characterised by two operating points: data are enqueued at the tail of the queue and dequeued at its head.
Fig. 5g shows the first-in first-out (FIFO) register structure. When the internal very long instruction word (VLIW) register identification components FIF_FIFO and FIF_PS define the internal parallel register component as operating in FIFO mode (see Table 5-1), the T register serves as the head of the internal data register queue, and an internal tail pointer Nil is added to indicate the current tail. Externally, two pointers A and B are set up: A points to the head of the memory queue and B points to its tail. The A pointer shares the register SP with the current address pointer of the FILO mode, and the B pointer shares the register SPINC with the increment pointer of the FILO mode. As shown in Fig. 5h, when the queue is empty, Nil points to T, indicating that the internal registers are empty, and the A and B pointers coincide and point to address 0 of the memory, indicating that the memory is empty. This is the initial state of FIFO operation.
Enqueuing data falls into two cases:
(1) As shown in Fig. 5i, when the Nil pointer value is less than "100", the internal register queue formed by the T, N, S and F registers is not yet full. The enqueued data are sent in the second or third line mode to the tail register pointed to by Nil, and Nil is then incremented to point to the next empty register. When Nil equals "100" after the increment, the internal register queue is full.
(2) As shown in Fig. 5j, when the Nil pointer value equals "100" and the internal register queue is full, the enqueued data, input in the second or third line mode, are sent in the fourth line mode to the memory tail cell pointed to by B. The B pointer is then incremented by the add-one unit INC and written back to the SPINC register, so that it points to the next available memory cell. The A pointer (SP register) remains unchanged.
Dequeuing data also falls into two cases:
(1) As shown in Fig. 5k, when the Nil pointer value is less than "100" and the internal register queue is not full, a dequeue is similar to a pop: through third-line-mode transfers, the T register data are sent out, the content of the N register is sent to the T register, the content of the S register is sent to the N register, the content of the F register is sent to the S register, and the Nil pointer is decremented by 1.
(2) As shown in Fig. 5l, when the Nil pointer value equals "100" and the internal register queue is full, a dequeue proceeds as follows: through third-line-mode transfers, the T register data are sent out, the content of the N register is sent to the T register, the content of the S register is sent to the N register, and the content of the F register is sent to the S register; the data in the cell indicated by the memory queue head pointer A pass through the I/O PAD and data gating and are sent to the F register in the first line mode. The A pointer is incremented by the add-one unit INC and written back to the SP register, so that it points to the next stored datum. The B pointer (SPINC register) remains unchanged.
The data storage model of FILO is linear, whereas that of FIFO is ring-shaped. Fig. 5m shows the memory models of FIFO and FILO. FIFO operations move in a single direction around the ring-shaped memory model, and the operating point determines the operation: when the operating point is the head of the queue the operation is a dequeue, and when it is the tail of the queue the operation is an enqueue. FILO uses the stack top as its only operating point and moves up and down, and the direction of movement determines the operation: when the stack top moves up the operation is a push, and when it moves down the operation is a pop.
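The enqueue and dequeue cases above can be sketched as a small behavioural model. The class name `FifoQueue`, the memory size and the modulo wrap-around used for the ring are assumptions. The text does not say what happens when the registers are full but the memory queue is empty (A equals B); the sketch assumes that a dequeue then behaves like case (1) and decrements Nil. Overflow of the ring is not checked.

```python
class FifoQueue:
    """Behavioural model of the T, N, S, F register queue backed by a ring memory."""

    REGS = 4  # T, N, S, F; Nil == 4 (binary "100") means the registers are full

    def __init__(self, mem_size=8):
        self.regs = [None] * self.REGS   # index 0 is T (head), index 3 is F
        self.nil = 0                     # internal tail pointer
        self.mem = [None] * mem_size     # ring-shaped memory queue (assumed size)
        self.a = 0                       # memory head pointer (shares SP)
        self.b = 0                       # memory tail pointer (shares SPINC)

    def enqueue(self, data):
        if self.nil < self.REGS:         # case (1): registers not yet full
            self.regs[self.nil] = data
            self.nil += 1
        else:                            # case (2): spill to the memory tail
            self.mem[self.b] = data
            self.b = (self.b + 1) % len(self.mem)

    def dequeue(self):
        if self.nil == 0:
            raise IndexError("dequeue from an empty queue")
        out = self.regs[0]
        self.regs[:-1] = self.regs[1:]   # N->T, S->N, F->S
        if self.nil == self.REGS and self.a != self.b:
            # case (2): refill F from the memory head, A advances, B unchanged
            self.regs[-1] = self.mem[self.a]
            self.a = (self.a + 1) % len(self.mem)
        else:
            # case (1), and the assumed handling of an empty memory queue
            self.regs[-1] = None
            self.nil -= 1
        return out


queue = FifoQueue()
for value in range(1, 8):
    queue.enqueue(value)
drained = [queue.dequeue() for _ in range(7)]
```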
* serial operation between register files
The symmetric parallel registers support not only serial and parallel control among the individual registers but also serial and parallel operation between register files.
In serial operation between register files, the two register groups are chained into a single unit in FIFO or FILO form, and the memory banks attached to the two register files are likewise chained into a single unit. Fig. 5n shows the structure of serial operation between register files. When TH, NH, SH and FH all hold data, TL, NL, SL and FL can be used for expansion; when the internal register data counter indicates that both internal register files are full, expansion can continue into the memory bank of this register file, according to the definition of the internal very long instruction word (VLIW) register identification FIF_RFE; when the internal register data counter indicates that the first memory bank is full, expansion can continue into the second memory bank. The expansion follows the FIFO enqueue/dequeue or the FILO push/pop operating mode.
* parallel operation between register files
Parallel operation between register files can take two forms, namely expansion of the data width and parallel multi-channel data operation:
(1) Expansion of the data width merges the two register files into one register file of double word length, in which TH, NH, SH and FH hold the high half of each data word and TL, NL, SL and FL hold the low half, as shown in Fig. 5o. This doubles the data word length, greatly improves the precision of data operations, and effectively supports high-precision scientific computation.
(2) Parallel multi-channel data operation is the parallel execution of independent data operations. The structure shown in Fig. 5b is in fact such a parallel structure: TH, NH, SH, FH and TL, NL, SL, FL form two fully independent data entities that can operate independently without interfering with each other, and data can be exchanged between the two register files over the internal data bus TIMD. This form of parallelism favours the parallel execution of different processes; for example, an addition on the TH and NH registers and a logic operation on the TL and NL registers can be carried out at the same time, which effectively supports the operation of the variable-sequence arithmetic unit. Serial and parallel operation between the data groups is likewise controlled by the internal identification FIF_RFE of this microprocessor architecture, as shown in Table 5-3.
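As an illustration of the data width expansion in case (1), the following sketch treats a high-half register and a low-half register as one double-length value. The 32-bit half-word width and the function names `join_words`, `split_word` and `add_double` are assumptions; the carry propagation shown here is simply ordinary multi-word addition, not a description of the actual datapath.

```python
WORD_BITS = 32                       # assumed width of one register
MASK = (1 << WORD_BITS) - 1


def join_words(high, low):
    """Combine a high half (e.g. TH) and a low half (e.g. TL) into one value."""
    return ((high & MASK) << WORD_BITS) | (low & MASK)


def split_word(value):
    """Split a double-length value back into (high, low) halves."""
    return (value >> WORD_BITS) & MASK, value & MASK


def add_double(th, tl, nh, nl):
    """Add (TH:TL) + (NH:NL); the carry of the low halves feeds the high halves."""
    low_sum = (tl & MASK) + (nl & MASK)
    carry = low_sum >> WORD_BITS
    high_sum = ((th & MASK) + (nh & MASK) + carry) & MASK
    return high_sum, low_sum & MASK


# 0x00000000_FFFFFFFF + 0x00000000_00000001 carries into the high half.
high, low = add_double(0, 0xFFFFFFFF, 0, 1)
```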
Through the flags of the internal very long instruction word (VLIW) register identification components FIF_FIFO and FIF_PS, the macro-instruction-set symmetric parallel architecture can organise the internal registers and the external data memory into the operating forms described above, making them programmable, reconfigurable and relocatable.
Table 5-1 Register file serial/parallel mode control
FIF_FIFO | FIF_PS | Register file operation state |
0 | 0 | Serial operation, FILO mode |
1 | 0 | Serial operation, FIFO mode |
x | 1 | Parallel operation mode |
Table 5-2 Signal descriptions for Fig. 5
Signal name | Function | Relevant diagram |
TIMD | Internal data bus | 5 |
TI_BUS | Internal register file bus | 5 |
CLK | First-latch control signal of the dual latch | 5a~5r |
CLK1 | Second-latch control signal of the dual latch | 5a~5r |
Nil | Internal queue tail pointer | 5j,5k,5l,5m,5n,5o |
Table 5-3 Register file expansion mode control
FIF_RFE | Expansion mode |
0 | Serial expansion |
1 | Parallel expansion |
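The flag encodings of Tables 5-1 and 5-3 amount to a small decoder, sketched below. The function names and the returned strings are hypothetical; only the bit assignments are taken from the tables.

```python
def register_file_mode(fif_fifo, fif_ps):
    """Decode Table 5-1: FIF_PS = 1 selects parallel mode regardless of FIF_FIFO."""
    if fif_ps == 1:
        return "parallel"
    return "serial FIFO" if fif_fifo == 1 else "serial FILO"


def expansion_mode(fif_rfe):
    """Decode Table 5-3: 0 is serial expansion, 1 is parallel expansion."""
    return "parallel expansion" if fif_rfe == 1 else "serial expansion"


mode = register_file_mode(fif_fifo=1, fif_ps=0)
```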
Table 5-4 Functional unit descriptions for Fig. 5
Functional unit | Function | Relevant diagram |
DLAT | Dual latch, see the "Fig. 2 explanation" | 5,5a~5e,5h,5j,5l,5m,5n,5o |
TNSF | The register group of one register file; it can be any one of the groups TH, NH, SH, FH; TL, NL, SL, FL; IH, JH, KH, RH; IL, JL, KL, RL | 5,5a~5r |
T_Reg_Latch1 N_Reg_Latch1 S_Reg_Latch1 F_Reg_Latch1 | First latches of the T, N, S and F dual-latch structures | 5f,5l |
T_Reg_Latch2 N_Reg_Latch2 S_Reg_Latch2 F_Reg_Latch2 | Second latches of the T, N, S and F dual-latch structures | 5f,5l |
Fig. 6 is a diagram of the FRD and FSD address components
The address component FSD is connected to the data register FTNSF, and the address component FRD is connected to the register FT. As shown in Fig. 6, the main feature of these two address components is that they use data overlap to solve the redundancy problem of consecutive data read and write operations.
As shown in Fig. 6a, the address components FSD and FRD are dual-pointer structures. Each contains a pair of symmetric address pointer management components (Pointer1, Pointer2) that point to two memories respectively. Each address pointer comprises:
* an address multiplexer MUX, which accepts data inputs in the first, second, third and fourth line modes of this microprocessor architecture. One input comes from the internal data bus TIMD bus, and the other three come from the current address pointer SP, the increment pointer SPINC and the decrement pointer SPDEC. The MUX gating signal MSP comes from the control field of the very long instruction word (VLIW) identification format word of the FCC decoding unit, or from a first-line-mode input. The MUX output bus AA is connected to the address error comparator COM, the add-one unit INC, the subtract-one unit DEC and the current address register SP;
* an add-one unit INC and a subtract-one unit DEC, which compute the increment and the decrement of the selected current address pointer. Their inputs both come from the output of the address gate MUX, and their outputs are connected to the increment pointer register SPINC and the decrement pointer register SPDEC respectively;
* three address pointer registers, namely the address pointer register SP, the increment pointer register SPINC and the decrement pointer register SPDEC, which can be used in the serial first-in first-out (FIFO) mode, the first-in last-out (FILO) mode and the random-access mode;
* a base address management component COM for pointer address overflow error control and for partitioning and paging. It detects overflow of the current address pointer and manages data partitioning and paging. Its inputs are AA, output by the address gate in the third line mode, and the data A1, output by the internal data bus TIMD bus in the fourth line mode. A1 serves as the partition or page base address and AA as the address within the page; after processing by COM they are combined into the physical address output;
* a synchronous/asynchronous control timing converter ASC, which, under the control of the MASC signal, applies synchronous or asynchronous timing control to the input address pointer A and outputs it through the address output pins to the attached memory.
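The components listed above can be pictured with the following behavioural sketch of one address pointer. It is a simplified model under several assumptions: the class name `AddressPointer` and its method `step` are hypothetical, the page size is arbitrary, the overflow check is taken to mean "the in-page address falls outside the page", and COM is assumed to form the physical address by adding the base address A1 to the in-page address AA. The ASC timing stage is not modelled.

```python
class AddressPointer:
    """One address pointer: MUX, INC, DEC, the SP/SPINC/SPDEC registers and COM."""

    def __init__(self, page_size=256):
        self.page_size = page_size   # assumed page size for the overflow check
        self.sp = 0                  # current address pointer
        self.spinc = 1               # increment pointer (SP + 1)
        self.spdec = -1              # decrement pointer (SP - 1)

    def step(self, select, timd=0, base=0):
        """Select an address source, update the registers, return (address, a_err)."""
        sources = {
            "TIMD": timd,            # address loaded from the internal data bus
            "SP": self.sp,
            "SPINC": self.spinc,
            "SPDEC": self.spdec,
        }
        aa = sources[select]         # MUX output bus AA, chosen by the MSP signal
        # AA feeds SP, INC and DEC in parallel.
        self.sp, self.spinc, self.spdec = aa, aa + 1, aa - 1
        # COM: flag an overflow and combine base (A1) with the in-page address (AA).
        a_err = not (0 <= aa < self.page_size)
        return base + aa, a_err


pointer = AddressPointer(page_size=16)
address, error = pointer.step("TIMD", timd=5, base=0x100)
```

Selecting "SPDEC" on consecutive steps walks the pointer downward, as in a sequence of pushes, while "SPINC" walks it upward, as in a sequence of pops.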
The macro-instruction-set symmetric parallel architecture adopts the symmetric parallel register structure, so the address components use a dual-pointer structure that provides the data overlap feature.
As shown in Fig. 6a, the two pointers SPH and SPL point to two symmetric data storage areas, in order to support parallel data processing. SPH and SPL point to the storage cells in the memories RAM_H and RAM_L where data are to be read or written. In FILO operation, the content of this cell is identical to the content of the internal stack-bottom data register F; this is the data overlap.
Together with the decoded signals of the very long instruction word (VLIW) register identification, the data overlap gives data in the FILO memory model a "use first, replenish afterwards" behaviour, which effectively controls the redundancy of push and pop operations on the external stack memory. As shown in Fig. 6b, SP points to the data push/pop operating point in memory, that is, the stack-top cell (Dn) of the FILO memory stack; SP-1 points to the next available empty cell, and SP+1 points to the second cell from the top of the memory stack (Dn-1). T, N, S and F in the figure are one group of the parallel data registers (see the description of Fig. 5a) and form the internal stack-top cells corresponding to this data memory stack. The content of the F register is always kept consistent with cell SP in memory. Fig. 6c shows how the SP pointer changes during a push.
When a push occurs, the data of the internal stack register S are sent in the third line mode to the fourth register, F, and at the same time are sent through the port in the fourth line mode to the empty stack-top cell pointed to by SP-1; in the third line mode the N register data are sent to S, the T register data are sent to N, and the pushed data Data are sent to T. Because every internal register uses the dual-latch structure (see the description of the DLAT in Fig. 2), this whole series of operations completes in one cycle; for the data path, see the description of Fig. 5c. After the pushed data have been loaded, the SP pointer is automatically decremented by the subtract-one unit (DEC) in Fig. 6a and points to the new stack-top cell. At this point the data in cell SP are still consistent with the F register.
When a pop operation is valid, as shown in Fig. 6d, the stack-top data in memory must be read into the internal registers. Because the content of the memory stack-top cell is consistent with the content of the F register, a pop does not delay the data transfer even though the memory read/write cycle is longer than the register transfer cycle. As shown in Fig. 6e, the data of the F register, together with those of the S, N and T registers, complete the pop in the T1 cycle in the third line mode, and the result of the pop appears in the registers in the T2 cycle. The FSD component generates the pop address SP+1 in the negative half-cycle of the CLK clock in the T1 cycle; after the COM error comparison and the ASC synchronous/asynchronous control shown in Fig. 6a, the address pointer output pin PAD sends the pop address to the memory at the CLK rising edge t1 and a read is performed. After one memory read cycle, the memory places the read data on the data port in the first line mode during the negative half-cycle of the CLK clock in the T2 cycle; after data gating, the F register of the FTNSF component stores the read data in its dual-latch register in the T3 cycle.
The data-overlap feature eliminates redundant transfers to and from the external stack memory. Because FILO is a memory model with a single operating point, all data enters and leaves at the stack top. In this microprocessor architecture, when a push and a pop occur back to back (push → pop or pop → push), the pair is equivalent, from the external memory's point of view, to no memory access at all: the data changes only among the internal stack-top registers, so the external memory stays as it was. This avoids the increase in power consumption caused by frequent memory reads and writes. Fig. 6f shows the pointer and data during consecutive push and pop operations. The data changes during consecutive push and pop operations are produced by the control structure described in the "Fig. 2 explanation".
Fig. 6 g is a stack pointer at stacked → variation sequential chart when going out stack operation.Operate in the T1 cycle when popping when effective, address strobe device MUX chooses SPDEC decrement pointer register data and goes into stack address output as current, among the T2 cycle writes the data in the source register unit of this pointed.Operate in T2 when popping in the cycle effectively the time, the address pointer parts are discarded the stack-incoming operation of carrying out in the cycle at T2, and read-write operation is controlled the OE signal and is set to not read not write operation, and at this moment, stack memory is high-impedance state, does not carry out the output of data; Go out stack operation control address pointer gate MUX in the T2 cycle and choose SPINC increment pointer register data, recover the data of stacked prior pointer.By stacked → go out the synergy of stack operation, any change does not take place in the stack memory data, going into out entirely of data finished between internal register.
When going out stack operation → stack-incoming operation, shown in Fig. 6 h, popping, it is effective to operate in the T1 cycle, T data register second latch data is sent in the tertiary circuit mode, T, N, S first latch are sent to the data of N, S, F second latch in first latch through the tertiary circuit mode at CLK rising edge t place through MUX_1 data strobe tertiary circuit mode; In the T2 cycle when stacked, T data register first latch enters the first latch LATCH by the stacked data of MUX_1 gating, the data of the second latch gating, first latch are new data more at the t3 place, N, S, F register then by CD2 end are controlled the unlatching that suppress its second latch at T2 in the cycle by decoder unit, make second latch still keep former data, on latch the data of then selecting second latch and recover former data content, like this, through go out → stack-incoming operation after, except that the T register has upgraded the data, other register all keeps former data constant.
The operation of indicating members shown in Fig. 6 i, acts on the T1 cycle through the combination control signal and goes out stack operation, selects the output of SPINC increment pointer register data through OPADD gating MUX, and the T2 cycle that operates in that control is popped carries out; When the T2 cycle, the combination control signal effect through newly producing produced stack-incoming operation, combinational logic was then controlled the stacked data port and is high-impedance state, makes storer not read not write, and MUX address strobe device is selected SPDEC decrement pointer register data, recovers former SP pointer value.
When consecutive pushes occur, the address pointer is decremented once per cycle and the pushed data is written to the stack memory in sequence, as shown in Fig. 6j.
During consecutive pops, as shown in Fig. 6k, the popped data is sent directly to the source register so that it is available in time for data operations. After the operation ends, the F register is refilled. During consecutive pops the stack pointer is incremented by 1 each time and the stack-top data is read out successively.
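The push and pop behaviour described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name StackTop, the deferred-write mechanism and the access counters are hypothetical and stand in for the double-latch hardware; only the push → pop cancellation is modelled, and the COM overflow check is omitted.

```python
class StackTop:
    """Toy model of the T/N/S/F stack-top registers backed by external memory.

    Invariant taken from the text: the memory cell addressed by SP always
    holds the same value as the F register.
    """

    def __init__(self, depth=16):
        self.mem = [0] * depth
        self.sp = depth - 1
        self.t = self.n = self.s = self.f = 0
        self.writes = 0          # external memory writes actually performed
        self.reads = 0           # external memory reads actually performed
        self._pending = None     # (address, value, old F) of an uncommitted push

    def flush(self):
        """Commit a deferred push write (assumption: models the T2 write)."""
        if self._pending is not None:
            addr, value, _ = self._pending
            self.mem[addr] = value
            self.writes += 1
            self._pending = None

    def push(self, data):
        self.flush()
        # Old S goes to F and to the cell at SP-1; N -> S, T -> N, Data -> T.
        self._pending = (self.sp - 1, self.s, self.f)
        self.f, self.s, self.n, self.t = self.s, self.n, self.t, data
        self.sp -= 1

    def pop(self):
        value = self.t
        if self._pending is not None:
            # Push followed directly by pop: discard the write, restore F
            # from the retained copy, and leave external memory untouched.
            _, _, refill = self._pending
            self._pending = None
        else:
            refill = self.mem[self.sp + 1]
            self.reads += 1
        self.t, self.n, self.s, self.f = self.n, self.s, self.f, refill
        self.sp += 1
        return value


stack = StackTop()
for v in (1, 2, 3, 4):
    stack.push(v)
stack.flush()
before = (stack.t, stack.n, stack.s, stack.f, stack.sp, stack.writes)
stack.push(99)
popped = stack.pop()
after = (stack.t, stack.n, stack.s, stack.f, stack.sp, stack.writes)
print(popped, before == after, stack.reads)
```

In this model a push immediately followed by a pop leaves the registers, the pointer and the memory access counters exactly as they were, which is the effect the overlap feature is meant to achieve.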
Table 6-1 Fig. 6 signal description
Signal name | Function | Related figures |
TIMD | Internal data bus; the data of every internal component can be placed on the TIMD bus | 6,6b |
SPHA SPLA | Address pointer output data bus | 6,6b,6f,6h,6j,6k, 6l |
MAS | Synchronous/asynchronous address control; see the "Fig. 3 explanation" | 6,6b,6f,6h,6j,6k, 6l |
SPH SPL | Current address pointer register data | 6,6b,6f,6h,6j,6k, 6l |
SPINC | Increment pointer data | 6,6b,6f,6h,6j,6k, 6l |
SPDEC | Decrement pointer data | 6,6b,6f,6h,6j,6k, 6l |
AA | Output signal of the address pointer selector | |
A | Output signal of the address overflow detector COM | |
A1 | Base address signal from the TIMD bus | |
CDT CDN CDS CDF | Selection control signals of the first selector of the T, N, S and F registers | 6g |
CDT1 CDN1 CDS1 CDF1 | Selection control signals of the second selector of the T, N, S and F registers | 6g |
Table 6-2 Fig. 6 functional component description
Component | Function | Related figures |
ASC | Address synchronous/asynchronous control component; see the "Fig. 3 explanation" | 6,6b,6f,6h,6j, 6k,6l |
COM | Address overflow detection component | 6,6b,6f,6h,6j, 6k,6l |
MUX | Address source selector | 6,6b,6f,6h,6j, 6k,6l |
SP | Current address pointer latch | 6 |
SPINC | Increment pointer latch | 6 |
SPDEC | Decrement pointer latch | 6 |
TL,NL,SL,FL TH,NH,SH,FH | FTNSFH internal register stack FTNSFL internal register stack | 6a |
IL,JL,KL,RL IH,JH,KH,RH | FTI internal register stack FTJ internal register stack | 6b |
TNSF | One register group of the FTNSF register file; it can be either TH, NH, SH, FH or TL, NL, SL, FL | 6c,6d,6e,6f,6g |
T_Reg_Latch1 N_Reg_Latch1 S_Reg_Latch1 F_Reg_Latch1 | First latch in the double-latch structure of the T, N, S and F registers | 6i |
T_Reg_Latch2 N_Reg_Latch2 S_Reg_Latch2 F_Reg_Latch2 | Second latch in the double-latch structure of the T, N, S and F registers | 6i |
The data port components FD and FZ of the macroinstruction-set symmetric parallel architecture can swap bytes as part of data input and output, as shown in Fig. 7 and Fig. 7a. The byte swap component consists of two devices:
* FSWAPI---the byte swap device for data input. Its input comes from the DIL data bus of the second line mode; its output is the D data bus of the third line mode; its control input SWAPI comes from subcomponent FIF_DHL of the internal very long instruction word (VLIW) register identification control component FIF and controls how the bytes of input data are swapped. The function of the FSWAPI device is given in Table 7-2.
* FSWAPO---the byte swap device for data output. Its input comes from the output DSWAPO data bus of the fourth line mode; its output is the DB data bus; its control input SWAPO comes from subcomponent FIF_DHL of the internal very long instruction word (VLIW) register identification control component FIF and controls how the bytes of output data are swapped. The function of the FSWAPO device is given in Table 7-3.
Memory can be regarded simply as an ordered array of bytes, each with a unique address. Data that needs more than one byte of storage is placed in several consecutive bytes. Multi-byte data can be stored in two byte orders, "little-endian" and "big-endian". In "big-endian" order the high byte of a word is placed at the low address and the low byte at the high address; in "little-endian" order the high byte of a word is placed at the high address and the low byte at the low address.
Fig. 7b shows the two representations of the 64-bit data 123456789abcdef0H at address a.
The macroinstruction-set symmetric parallel architecture processor supports both "little-endian" and "big-endian" byte addressing, and the mode can be selected under program control, so an application can be built for either byte order. In addition to supporting both byte orders, the microprocessor can apply byte transformations to input and output data as the program requires; see Table 7-6 and Table 7-8.
The data I/O byte swap control subcomponent FIF_DHL, shown in Fig. 7a, comprises three parts:
* FIF_BE, the byte enable control signal generator;
* FIF_SWAPI, the data input byte swap control signal generator;
* FIF_SWAPO, the data output byte swap control signal generator.
(1) FIF_BE, the byte enable control signal generator
FIF_BE is shown in Fig. 7c1 and comprises:
* the MUXB1 selector, which selects the data source of the byte address;
* the ADD_Badr incrementer, which increments the byte address during consecutive data storage operations;
* the DFFB1 flip-flop, which holds the byte address;
* the GEN_BE logic device, which converts the representation of a byte operation: it turns the start byte address and byte width into byte enable control signals;
* the LAT_BE latch, which latches the byte enable control signals under synchronous timing;
* the DFF_BE flip-flop, which holds the byte enable control signals under asynchronous timing.
Its basic behaviour is as follows:
During the microprocessor reset cycle the RSTn signal is active (high) and the DFFB1 flip-flop is cleared, which gives the initial value of the byte address. This value is fed over the Badr signal line to one input of the MUXB1 selector; the other input of MUXB1 comes from the YBadr signal. Because the select signal Badr_Ins is inactive (low), MUXB1 selects Badr and passes it to the ADD_Badr incrementer. During reset CTMS15 is inactive (low), so ADD_Badr does not increment its input Badr1 and passes it through unchanged; the value is stored in the DFFB1 flip-flop on the CLK rising edge of the next cycle. Until reset ends, DFFB1 therefore holds the initial value of the byte address.
After reset, when an instruction directs a data I/O operation, the Badr_Ins signal is active and MUXB1 selects YBadr and passes it to the ADD_Badr incrementer. This device converts the byte address of the input signal for big-endian or little-endian order according to Table 7-4 and, in the case of a repeated operation, also increments the byte address. The resulting Badr signal goes to the GEN_BE logic device, where, under the constraint of the Size byte width signal, it is converted into byte enable signals; the BE signal is then output under synchronous or asynchronous timing control. For the synchronous/asynchronous timing control, see the "Fig. 3 explanation".
After the microprocessor completes the data I/O operation, the CLR signal is active (high) and DFFB1 is cleared again.
The Badr byte address signal is also sent to the input and output byte swap control generators FIF_SWAPI and FIF_SWAPO.
(2) FIF_SWAPI, the data input byte swap control signal generator
FIF_SWAPI is shown in Fig. 7c2 and comprises:
* the two selectors MUXI1 and MUXI2, which select the source of the program's byte swap control for input data;
* the DFFI1 flip-flop, which holds the program's byte swap control for input data;
* the GENI logic device, which generates the final input data byte swap control;
* the DFFI2 flip-flop, which holds the final input data byte swap control.
Its basic behaviour is as follows:
During the microprocessor reset cycle the RSTn signal is active (high) and the DFFI1 flip-flop is cleared, which gives the initial value of the VLIW byte swap control for data input. This value is fed back over the SWAPI1 signal to one input of the MUXI1 selector; the other input of the selector comes from the YSWAPI signal. Because the SWAPI_Ins signal is inactive (low) during reset, MUXI1 selects SWAPI1 and passes it to the DFFI1 flip-flop, where it is stored on the CLK rising edge of the next cycle. The Badr signal supplied by the byte enable generator is also zero during reset. It enters the GENI logic device together with the SWAPI2 signal; as Table 7-7 shows, the GENI output is then also zero, and it is stored in the DFFI2 flip-flop on the CLK rising edge of the next cycle to form the final input data byte swap control signal SWAPI. This signal is connected to the control input of the FSWAPI device in the FD component. Because SWAPI is zero at this point, the FSWAPI device passes the input data straight through without swapping bytes.
After reset ends, when a data fetch is performed, the SWAPI_Ins signal becomes active (high), so MUXI1 selects the YSWAPI signal and passes it to DFFI1, where it is stored on the CLK rising edge of the next cycle; this gives the VLIW byte swap control for input data a new setting. At the same time SWAPI_Ins makes the MUXI2 device select YSWAPI, so the output signal SWAPI2 takes the new setting within the current cycle. In the next cycle, after SWAPI_Ins becomes inactive, MUXI2 switches back to SWAPI1, which by then already holds the new setting. Once the SWAPI2 signal is established, it enters the GENI device together with the Badr signal from the FIF_BE byte enable generator. The logic operation of Table 7-7 is applied and the result is output as SWAPI3, which DFFI2 stores on the CLK rising edge of the next cycle and outputs as the SWAPI signal that controls byte swapping of input data in the FD component.
(3) FIF_SWAPO, the data output byte swap control signal generator
FIF_SWAPO is shown in Fig. 7c3 and comprises:
* the two selectors MUXO1 and MUXO2, which select the source of the VLIW byte swap control for data output;
* the DFFO1 flip-flop, which holds the program's byte swap control for output data;
* the GENO logic device, which generates the final output data byte swap control;
* the DFFO2 flip-flop, which holds the final output data byte swap control.
Its basic behaviour corresponds to that of the data input byte swap control generator.
Fig. 7 d carries out the operational instances that " big or small tail " byte exchanges and the byte exchange is carried out in programmed control to 16 bit data of input.Starting condition is as follows:
Size=01, the expression data width is 16;
YBadr=010, the expression byte address is 2;
YSWAPI2=01, representation program requires to carry out the exchange of high low byte;
BER=0 is expressed as " big tail end " mode;
CTMS15N=0, the expression repetitive operation is invalid.
At first by Fig. 7 c1, the YBadr signal is input to the ADD_Badr logical device by the MUXB1 gate, and by table 7-4 as can be known, output signal Badr2 is 100, and process DFFB1 trigger is delivered to the GENI device of FIF_SWAPI, sees Fig. 7 c2.Simultaneously, the data of YSWAPI signal also are sent to the GENI device by MUXI1, DFF1 and MUXI2 device path, by table 7-7 as can be known, GENI output signal SWAPI3 is 101, deliver to after these data are preserved by DFFI2 in the FSWAPI input data exchange operation device in the FD parts, finish the byte swap operation of data by table 7-2.
Table 7-1 Fig. 7a signal notes
DIL---input data bus of the FSWAPI data input byte swap device
D---output data bus of the FSWAPI data input byte swap device
SWAPI---control signal of the FSWAPI data input byte swap device
DSWAPO---input data bus of the FSWAPO data output byte swap device
DB---output data bus of the FSWAPO data output byte swap device
SWAPO---control signal of the FSWAPO data output byte swap device
BE---byte enable control signal
Badr_err---byte address error signal
Table 7-2 FSWAPI data input byte swap device function table
SWAPI | DIL<63:0> | D<63:0> |
000 | ABCDEFGH | ABCDEFGH |
001 | ABCDEFGH | BADCFEHG |
010 | ABCDEFGH | CDABGHEF |
011 | ABCDEFGH | DCBAHGFE |
100 | ABCDEFGH | EFGHABCD |
101 | ABCDEFGH | FEHGBADC |
110 | ABCDEFGH | GHEFCDAB |
111 | ABCDEFGH | HGFEDCBA |
Note: in the table, letter A denotes DIL<63:56>, letter B denotes DIL<55:48>,
letter C denotes DIL<47:40>, letter D denotes DIL<39:32>,
letter E denotes DIL<31:24>, letter F denotes DIL<23:16>,
letter G denotes DIL<15:8>, letter H denotes DIL<7:0>.
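The permutations in Table 7-2 follow a regular pattern: output byte lane i carries input byte lane i XOR SWAPI, with lane 0 taken as A (DIL<63:56>). The sketch below expresses the table that way. The XOR formulation and the function name fswapi are observations made for this illustration and are not stated in the source; the check against the table rows is included so the claim can be verified.

```python
def fswapi(swapi: int, dil: bytes) -> bytes:
    """Model of the FSWAPI byte permutation for a 3-bit SWAPI control value.

    dil[0] is byte A (DIL<63:56>) and dil[7] is byte H (DIL<7:0>).
    """
    if not 0 <= swapi <= 7:
        raise ValueError("SWAPI is a 3-bit value")
    if len(dil) != 8:
        raise ValueError("DIL is 8 bytes wide")
    return bytes(dil[lane ^ swapi] for lane in range(8))


# Rows of Table 7-2, used to check the XOR formulation.
TABLE_7_2 = {
    0b000: b"ABCDEFGH", 0b001: b"BADCFEHG", 0b010: b"CDABGHEF",
    0b011: b"DCBAHGFE", 0b100: b"EFGHABCD", 0b101: b"FEHGBADC",
    0b110: b"GHEFCDAB", 0b111: b"HGFEDCBA",
}

for code, expected in TABLE_7_2.items():
    assert fswapi(code, b"ABCDEFGH") == expected
print(fswapi(0b101, b"ABCDEFGH").decode())
```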
Table 7-3 FSWAPO data output byte swap device function table
SWAPO | DSWAPO<63:0> | DB<63:0> |
000 | ABCDEFGH | HHHHHHHH |
001 | ABCDEFGH | GGGGGGGG |
010 | ABCDEFGH | GHGHGHGH |
011 | ABCDEFGH | EFEFEFEF |
100 | ABCDEFGH | EFGHEFGH |
101 | ABCDEFGH | ABCDABCD |
110 | ABCDEFGH | ABCDEFGH |
111 | ABCDEFGH | EFGHABCD |
Note: in the table, letter A denotes DSWAPO<63:56>, letter B denotes DSWAPO<55:48>,
letter C denotes DSWAPO<47:40>, letter D denotes DSWAPO<39:32>,
letter E denotes DSWAPO<31:24>, letter F denotes DSWAPO<23:16>,
letter G denotes DSWAPO<15:8>, letter H denotes DSWAPO<7:0>.
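Table 7-4 ADD_Badr byte address increment function table
Badr1 | CTMS15N | BER | Size | Badr2 |
000~111 | 0 | 0 | 00 | Badr1 with bits <2:0> inverted |
000~111 | 0 | 0 | 01 | Badr1 with bits <2:1> inverted |
000~111 | 0 | 0 | 10 | Badr1 with bit <2> inverted |
000~111 | 0 | 0 | 11 | Badr1<2:0> |
000~111 | 0 | 1 | 00 01 10 11 | Badr1<2:0> |
000~111 | 1 | 0 | 00 | (Badr1+1) with bits <2:0> inverted |
000~111 | 1 | 0 | 01 | (Badr1+2) with bits <2:1> inverted |
000~111 | 1 | 0 | 10 | (Badr1+4) with bit <2> inverted |
000~111 | 1 | 0 | 11 | Badr1<2:0> |
000~111 | 1 | 1 | 00 | Badr1+1 |
000~111 | 1 | 1 | 01 | Badr1+2 |
000~111 | 1 | 1 | 10 | Badr1+4 |
000~111 | 1 | 1 | 11 | Badr1<2:0> |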
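The address conversion of Table 7-4 can be sketched as follows. The reading of the table used here is an assumption: with BER=0 ("big-endian") the address bits above the access width are inverted, with BER=1 the address is passed through, and a repeated operation (CTMS15N=1) first adds the access width, truncated to three bits. This reading agrees with the worked example of Fig. 7d (Size=01, YBadr=010, BER=0 gives Badr2=100). The function name add_badr is hypothetical.

```python
def add_badr(badr1: int, size: int, ber: int, ctms15n: int) -> int:
    """Sketch of ADD_Badr: returns Badr2 for a 3-bit byte address Badr1.

    size: 0b00=1 byte, 0b01=2 bytes, 0b10=4 bytes, 0b11=8 bytes.
    """
    width = 1 << size                    # access width in bytes
    addr = badr1 & 0b111
    if ctms15n and width < 8:
        addr = (addr + width) & 0b111    # assumed 3-bit wrap-around
    if ber == 0:
        addr ^= 8 - width                # invert the bits above the width
    return addr


# Worked example of Fig. 7d: 16-bit access at byte address 2, big-endian.
print(format(add_badr(0b010, 0b01, ber=0, ctms15n=0), "03b"))
```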
Table 7-5 GEN_BE byte enable generator function table
Size | Badr | BE | Badr_err |
00 | 000 | 01h | 0 |
00 | 001 | 02h | 0 |
00 | 010 | 04h | 0 |
00 | 011 | 08h | 0 |
00 | 100 | 10h | 0 |
00 | 101 | 20h | 0 |
00 | 110 | 40h | 0 |
00 | 111 | 80h | 0 |
01 | 000 | 03h | 0 |
01 | 010 | 0ch | 0 |
01 | 100 | 30h | 0 |
01 | 110 | c0h | 0 |
01 | xx1 | xxh | 1 |
10 | 000 | 0fh | 0 |
10 | 100 | f0h | 0 |
10 | x01 | xxh | 1 |
10 | x10 | xxh | 1 |
10 | x11 | xxh | 1 |
11 | 000 | ffh | 0 |
11 | xx1 | xxh | 1 |
11 | x1x | xxh | 1 |
11 | 1xx | xxh | 1 |
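Table 7-5 amounts to placing a run of 1, 2, 4 or 8 enable bits at an aligned byte address and flagging misaligned addresses. A minimal sketch of that mapping is given below; the function name gen_be and the use of None for the "don't care" BE value are choices made for this illustration.

```python
def gen_be(size: int, badr: int):
    """Return (BE, Badr_err) for a 2-bit Size and a 3-bit byte address.

    BE is None when the address is misaligned (shown as xxh in Table 7-5).
    """
    width = 1 << size                 # 1, 2, 4 or 8 bytes
    if badr % width:
        return None, 1                # misaligned: Badr_err = 1
    mask = (1 << width) - 1           # one enable bit per byte accessed
    return mask << badr, 0


for size, badr in ((0b00, 0b011), (0b01, 0b010), (0b10, 0b100), (0b11, 0b000)):
    be, err = gen_be(size, badr)
    print(f"Size={size:02b} Badr={badr:03b} -> BE={be:02x}h Badr_err={err}")
```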
Table 7-6 SWAPI2 VLIW control of data input byte swapping
SWAPI2 \ Size | 00 | 01 | 10 | 11 |
00 | No swap | No swap | No swap | No swap |
01 | Reserved | High/low byte swap | High/low 16-bit swap | High/low 32-bit swap |
10 | Reserved | Reserved | High/low byte swap | High/low 16-bit swap |
11 | Reserved | Reserved | Reserved | High/low byte swap |
Table 7-7 GENI logic device function table
Size | SWAPI2 | Badr | SWAPI3 |
00 | 00 | 000~111 | Badr |
01 | 00 | xx0 | Badr |
01 | 01 | xx0 | Badr+1 |
10 | 00 | x00 | Badr |
10 | 01 | x00 | Badr+2 |
10 | 10 | x00 | Badr+1 |
11 | 00 | 000 | Badr |
11 | 01 | 000 | Badr+4 |
11 | 10 | 000 | Badr+2 |
11 | 11 | 000 | Badr+1 |
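Table 7-7 can be written as a small lookup that adds an offset to the aligned byte address. The sketch below copies the offsets from the table; the function name geni and the decision to raise an error for combinations the table does not list are assumptions made for this example.

```python
# (Size, SWAPI2) -> offset added to Badr, copied from Table 7-7.
GENI_OFFSET = {
    (0b00, 0b00): 0,
    (0b01, 0b00): 0, (0b01, 0b01): 1,
    (0b10, 0b00): 0, (0b10, 0b01): 2, (0b10, 0b10): 1,
    (0b11, 0b00): 0, (0b11, 0b01): 4, (0b11, 0b10): 2, (0b11, 0b11): 1,
}


def geni(size: int, swapi2: int, badr: int) -> int:
    """Return SWAPI3 for the given Size, SWAPI2 and aligned byte address."""
    if badr % (1 << size):
        raise ValueError("Badr must be aligned to the access width")
    if (size, swapi2) not in GENI_OFFSET:
        raise ValueError("combination not listed in Table 7-7")
    return badr + GENI_OFFSET[(size, swapi2)]


# Worked example of Fig. 7d: Size=01, SWAPI2=01, Badr=100 gives SWAPI3=101.
print(format(geni(0b01, 0b01, 0b100), "03b"))
```

Because Badr is aligned to the access width, the added offset never carries out of the three address bits.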
Table 7-8 SWAPO2 VLIW control of data output byte swapping
SWAPO2 \ Size | 00 | 01 | 10 | 11 |
0 | No swap | No swap | No swap | No swap |
1 | Reserved | High/low byte swap | High/low 16-bit swap | High/low 32-bit swap |
Table 7-9 GENO logic device function table
Size | SWAPO2 | Badr | SWAPO3 |
00 | 0 | 000~111 | 000 |
00 | 1 | 000~111 | 001 |
01 | 0 | xx0 | 010 |
01 | 1 | xx0 | 011 |
10 | 0 | x00 | 100 |
10 | 1 | x00 | 101 |
11 | 0 | 000 | 110 |
11 | 1 | 000 | 111 |
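In Table 7-9 the SWAPO3 code is simply the two Size bits followed by the SWAPO2 bit, and that code then selects one of the output patterns of Table 7-3. The sketch below combines the two tables; the function names geno and fswapo are hypothetical, and the patterns are copied from Table 7-3 rather than derived.

```python
def geno(size: int, swapo2: int) -> int:
    """Return the 3-bit SWAPO3 code: Size in bits <2:1>, SWAPO2 in bit <0>."""
    return ((size & 0b11) << 1) | (swapo2 & 1)


# SWAPO code -> output pattern, copied from Table 7-3 (A = DSWAPO<63:56>).
FSWAPO_PATTERN = {
    0b000: "HHHHHHHH", 0b001: "GGGGGGGG", 0b010: "GHGHGHGH",
    0b011: "EFEFEFEF", 0b100: "EFGHEFGH", 0b101: "ABCDABCD",
    0b110: "ABCDEFGH", 0b111: "EFGHABCD",
}


def fswapo(swapo: int, dswapo: bytes) -> bytes:
    """Apply the Table 7-3 pattern to an 8-byte value (dswapo[0] is byte A)."""
    if len(dswapo) != 8:
        raise ValueError("DSWAPO is 8 bytes wide")
    return bytes(dswapo["ABCDEFGH".index(c)] for c in FSWAPO_PATTERN[swapo])


code = geno(0b10, 0)                     # 32-bit output, no swap
print(format(code, "03b"), fswapo(code, b"ABCDEFGH").decode())
```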
Fig. 7 b 64-Bit data 123456789abcdef0H is in two kinds of expressions at address a place.
-←1 Byte→- |
f0 |
de |
bc |
9a |
78 |
56 |
34 |
12 |
|
?a+7a+6a+5?a+4a+3?a+2a+1?a
" big tail end " mode
(BiG?Endian)
-←1 Byte→- |
12 |
34 |
56 |
78 |
9a |
bc |
dc |
f0 |
|
?a+7a+6a+5?a+4a+3?a+2a+1?a
" little tail end " mode
(Small?Endian)
The data port components FD and FZ of the macroinstruction-set symmetric parallel architecture can also compare input data for equality at different data widths, which makes operations such as character search and string matching possible, as shown in Fig. 7. The character and string comparison control is built around the FSCAN device:
* FSCAN---the data equality comparator. This device has three inputs, which come from the IL data bus of the first line mode, the D data bus of the second line mode and the TIMD internal data bus of the fourth line mode. Its outputs are the FLAGd signal, which is the equality comparison flag, and the SNO signal, which gives the byte position at which the data is equal and is meaningful only when FLAGd is high. FSCAN also has three control inputs: the SCOMP, STRING and SCAN signals. The control signal SCOMP sets the effective data width of the equality comparison and comes from subcomponent FIF_SCOMP of the VLIW control component FIF. The function of the FSCAN device is given in Table 7-10.
The FIF_SCOMP control subcomponent, shown in Fig. 7e, comprises:
* a selector MUX, which selects the source of the data equality comparison mode;
* a flip-flop DFF, which holds the data equality comparison mode.
The basic behaviour of the FIF_SCOMP control subcomponent is as follows:
During the microprocessor reset cycle the RSTn signal is active (high) and the DFF flip-flop is cleared, which sets the initial value of the data equality comparison mode. This value is fed over the SCOMP signal line to one input of the MUX selector; the other input of MUX comes from the YSCOMP signal. Because the select signal SCOMP_Ins is inactive (low) during reset, MUX selects SCOMP as its output and passes it to the input of the DFF flip-flop, where it is stored on the CLK rising edge of the next cycle. Until reset ends, DFF therefore holds the initial comparison mode value "zero". Its output signal SCOMP is connected to the data equality comparator FSCAN in the FD component; by Table 7-10 the comparison mode at this point is a full 64-bit comparison. However, because the STRING and SCAN signals are both inactive (low) during reset, the FSCAN output signals FLAGd and SNO are also inactive (low).
After reset ends, when the data comparison mode is set, the SCOMP_Ins signal becomes active (high), so MUX selects the YSCOMP signal and passes it to the DFF flip-flop, where it is stored on the CLK rising edge of the next cycle; this sets the data comparison mode. The new value reaches the data equality comparator FSCAN of the FD component over the DFF output line SCOMP and controls the comparison.
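The selector-plus-flip-flop arrangement of FIF_SCOMP (and, in the same way, of the first stage of FIF_SWAPI) behaves like a register with a load enable. The following sketch models that behaviour one clock edge at a time; the class name HoldRegister and its method names are hypothetical, and timing details such as setup and hold are ignored.

```python
class HoldRegister:
    """Register that reloads from Y when the select input is active, else holds."""

    def __init__(self):
        self.q = 0

    def reset(self):
        # RSTn active: the flip-flop is cleared.
        self.q = 0

    def clock(self, y_value: int, ins: int) -> int:
        # MUX: select Y when *_Ins is active, otherwise feed Q back.
        selected = y_value if ins else self.q
        # DFF: store the selected value on the rising clock edge.
        self.q = selected
        return self.q


scomp = HoldRegister()
scomp.reset()
print(scomp.clock(y_value=0b1000, ins=0))   # SCOMP_Ins inactive: holds 0
print(scomp.clock(y_value=0b1000, ins=1))   # SCOMP_Ins active: loads 0b1000
print(scomp.clock(y_value=0b0010, ins=0))   # holds 0b1000
```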
When both FSCAN control signals STRING and SCAN are inactive (low), the FSCAN output signal FLAGd is low; the SNO signal may take any value but has no meaning.
When the STRING signal is active (high) and the SCAN signal is inactive (low), the FSCAN device compares the data on the IL data bus of the first line mode with the data on the D data bus of the second line mode for equality. As Table 7-10 shows, when SCOMP equals "1000" only the comparison of the low 32 bits of the 64-bit data is effective: FSCAN compares the low 32 bits of [IL] and [D], and FLAGd is high if they are equal and low if they are not. While the STRING signal is active, the SNO signal never has any meaning. The operation is analogous for other values of SCOMP.
When the STRING signal is inactive (low) and the SCAN signal is active (high), the FSCAN device compares the data on the IL data bus of the first line mode with the data on the TIMD internal data bus of the fourth line mode for equality. As Table 7-10 shows, when SCOMP equals "0010" the device performs an 8-bit equality search: FSCAN takes the low 8 bits of [TIMD] as the target character (which can also be regarded as two BCD digits) and compares it for equality with each of the 8 characters (64 bits) of [IL]. When one or more of the 8 characters of [IL] equal the target character, the FLAGd signal goes high and the SNO signal gives the byte position of the equal character; if several characters are equal, SNO gives the byte position of the first equal character, the comparison order being from the low byte to the high byte. When none of the 8 characters of [IL] equals the target character, the FLAGd signal is low; the SNO signal may take any value but has no meaning.
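Table 7-10 FSCAN data equality comparator function table
STRING | SCAN | SCOMP | FLAGd | SNO |
0 | 0 | 0000~1111 | 0 | Invalid |
0 | 1 | 0000 | [IL] and [TIMD] are compared for equality over all 64 bits. FLAGd=1 when equal; FLAGd=0 when not equal | 000 |
0 | 1 | 0010 | The low 8 bits of [IL] are compared for equality with each of the eight 8-bit values of [TIMD]. FLAGd=1 when one or more 8-bit values are equal; otherwise FLAGd=0 | 000~111; SNO is the byte position of the first equal 8-bit value |
0 | 1 | 0100 | The low 16 bits of [IL] are compared for equality with each of the four 16-bit values of [TIMD]. FLAGd=1 when one or more 16-bit values are equal; otherwise FLAGd=0 | xx0; SNO is the byte position of the first equal 16-bit value |
0 | 1 | 1000 | The low 32 bits of [IL] are compared for equality with each of the two 32-bit values of [TIMD]. FLAGd=1 when the high 32 bits or the low 32 bits of [TIMD] equal the low 32 bits of [IL]; otherwise FLAGd=0 | 000 or 111; SNO=000 when the low 32 bits of [TIMD] equal the low 32 bits of [IL]; SNO=000 when the low 32 bits of [TIMD] equal the high 32 bits of [IL] |
1 | 0 | 0000 | FLAGd=1 when [IL] and [D] are equal over all 64 bits | Invalid |
1 | 0 | 0001 | FLAGd=1 when the low 4 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 0010 | FLAGd=1 when the low 8 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 0011 | FLAGd=1 when the low 12 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 0100 | FLAGd=1 when the low 16 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 0101 | FLAGd=1 when the low 20 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 0110 | FLAGd=1 when the low 24 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 0111 | FLAGd=1 when the low 28 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1000 | FLAGd=1 when the low 32 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1001 | FLAGd=1 when the low 36 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1010 | FLAGd=1 when the low 40 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1011 | FLAGd=1 when the low 44 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1100 | FLAGd=1 when the low 48 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1101 | FLAGd=1 when the low 52 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1110 | FLAGd=1 when the low 56 bits of [IL] and [D] are equal | Invalid |
1 | 0 | 1111 | FLAGd=1 when the low 60 bits of [IL] and [D] are equal | Invalid |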
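The two FSCAN modes can be sketched in software as follows. The sketch covers the STRING mode (low-order bit comparison, with SCOMP counted in 4-bit units and 0000 meaning all 64 bits) and the 8-bit SCAN search as described in the prose, where the low byte of [TIMD] is the target and [IL] is scanned from the low byte upward. The function names, the numbering of byte position 0 as IL<7:0>, and the use of None for a meaningless SNO are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1


def string_compare(scomp: int, il: int, d: int) -> int:
    """STRING mode: FLAGd=1 when the low 4*SCOMP bits of IL and D are equal.

    SCOMP=0b0000 selects the full 64-bit comparison.
    """
    bits = 64 if scomp == 0 else 4 * scomp
    mask = (1 << bits) - 1
    return int((il & mask) == (d & mask))


def scan_bytes(il: int, timd: int):
    """8-bit SCAN search: return (FLAGd, SNO) for the first matching byte of IL."""
    target = timd & 0xFF
    il &= MASK64
    for position in range(8):            # low byte first
        if (il >> (8 * position)) & 0xFF == target:
            return 1, position
    return 0, None                        # SNO has no meaning when FLAGd=0


print(string_compare(0b1000, 0xAAAABBBB12345678, 0xCCCCDDDD12345678))
print(scan_bytes(0x1122334455667788, 0x55))
```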
In the compound-symmetric split storage structure of the macroinstruction-set symmetric parallel system, the two data buses of each symmetric data port group have a selectable operating mode: they can be used independently or merged to carry out data I/O.
As defined by the internal very long instruction word (VLIW) register identification component FIF, when the two data buses of a data port group are connected to different memory banks or other devices and a data input is produced in one machine cycle, the buses of that data port operate in independent mode, as shown in Fig. 7f.
As defined by the internal very long instruction word (VLIW) register identification component FIF, when the two data buses of a data port group are connected to the same memory bank or the same device and a data input is produced in one machine cycle, the buses of that data port operate in merged mode, as shown in Fig. 7g.
The way the data buses are used is controlled in two ways:
(1) At reset, by the definition in the external hard-wired identification logic
(2) At run time, by the external very long instruction word (VLIW) identification word
The external hard-wired identification logic allows the operating mode of the data buses to be selected. At reset, this logic defines how the data buses are used, as shown in Fig. 7h. Before the microprocessor is reset, jumpers are set on STRC_PIN<2:0> in the external hard-wired identification logic to select how the buses of the current microprocessor data port are connected to the external device or memory.
STRC_PIN<2:0> is defined in Table 7-11.
When the microprocessor and the external device or memory form the merged mode, as in Fig. 7g, memory RAM_A is connected to both the FDR and FRR data buses of the FD port, forming a memory bus of double data-bus width, and the initialization program that runs after reset is stored at the reset address cell addr_rest of RAM_A. In this case the STRC_PIN<2:0> signals are set to the "010" state with jumpers. When the microprocessor is reset, as shown in Fig. 7, the external pin signals STRC_PIN<2:0> are read into register STRC_REG in the internal very long instruction word (VLIW) register identification component FIF, and the register data STRC is delivered to the FRST and FPCA components. According to the state of STRC, the reset component FRST directs the FPCA address component to generate the system reset boot address addr_rest and to send it from address port FPCA_addr to RAM_A. The first instruction of the reset initialization is read from cell addr_rest, enters the microprocessor chip over the FDR and FRR data buses of the FD component, is latched by DLAT, and is output from the Q1 and Q2 buses to the FCC decoding component for decoding and execution.
When the microprocessor and the external device or memory are connected as in Fig. 7g and the reset boot program is stored in the external device or memory attached to the microprocessor's FZM port, the connection is likewise in merged mode. In this case STRC_PIN<2:0> is set to the "110" state with jumpers before reset. When the microprocessor is reset, as shown in Fig. 7, the value of STRC_PIN<2:0> in the external hard-wired identification logic is read into register STRC_REG in the internal very long instruction word (VLIW) register identification component FIF, and the address component YPCA is directed to generate the reset address addr_rest, which is sent through the YPCA_addr address port to RAM2. The first instruction of the reset boot program is read from cell addr_rest, enters the double latch DLAT through the FZM and FMM ports, and on the rising edge of the CLK clock is output from the Q1 and Q2 terminals to the FCC decoding circuit for decoding and execution.
When the memory or external device is connected to the microprocessor as shown in Fig. 7f, it uses the data buses in independent mode. In this case, if the reset boot program is in RAM_A, STRC_PIN<2:0> is set to the "000" state; the reset operation of the microprocessor is shown in Fig. 7. The FIF component fills STRC_REG according to the state of STRC_PIN<2:0> and directs the FPCA address component to generate the reset address addr_rest. This address is sent to the FD component and combined with the reset boot instruction INS_Rset produced by the FCC control and decoding component to form a boot request instruction, which is sent out in two cycles. The instruction format is shown in Fig. 7i; it is an instruction of double data-bus width. In the first cycle the boot request signal REQ_C is made active ("0") and the first format (Format_1) of the reset boot request instruction is sent to RAM_A through the FDR port; this format carries information such as the fetch address and size. In the second cycle the second format (Format_2) of the reset boot instruction is sent to RAM_A through the FDR port; this format mainly carries instructions that set certain states. After sending the reset boot request instruction, the microprocessor waits until the data valid signal C_REQ becomes active. It then reads the reset instruction that RAM_A has sent to the FDR port into the DLAT latch of the FD component and passes it from the latch data output Q to the FCC decoding component for execution, as shown in Fig. 7j.
When the memory or external device uses the data buses in independent mode and STRC_PIN<2:0> is set to the "001" state, the reset boot program of the microprocessor is stored in RAM_B, as shown in Fig. 7f, and the microprocessor reads the first instruction of the reset boot program from RAM_B to initialize itself. On reset, therefore, the reset address is generated by the FPCA component and sent through the FPCA_Addr address port to the RAM_B memory to read the instruction in cell addr_rest, which is passed through the FRR port to the FCC component for decoding and execution, as shown in Fig. 7.
When the memory or external device uses the data buses in independent mode and STRC_PIN<2:0> is set to the "100" state, the reset boot program of the microprocessor is stored in RAM_C, as shown in Fig. 7f. The reset boot process is then similar to that for the "000" state of STRC_PIN, except that the reset address is generated by YPCA, the reset instruction is sent to RAM_C in two cycles through the FZM port of the FZ component, and the reset operation is executed once the boot instruction is valid.
When the memory or external device uses the data buses in independent mode and STRC_PIN<2:0> is set to the "101" state, the reset boot program of the microprocessor is stored in RAM_D, as shown in Fig. 7f. On reset, the reset address is generated by YPCA and sent through the YPCA_addr port to RAM_D to read the instruction in cell Addr_rest, which is passed through the FMM port to the FCC component for decoding and execution.
The internal state can be dynamically redefined by an external very long instruction word (VLIW) identification format word. Instruction operation control follows the needs of the data transfer operation and manages the data paths used during the transfer. The following data transfer channels can be defined, as shown in Fig. 7k:
* D_R_MOV---data transfer between the FDR port and the FRR port;
* D_Z_MOV---data transfer between the FDR port and the FZM port;
* R_Z_MOV---data transfer between the FRR port and the FZM port;
* D_M_MOV---data transfer between the FDR port and the FMM port;
* R_M_MOV---data transfer between the FRR port and the FMM port;
* Z_M_MOV---data transfer between the FZM port and the FMM port;
* DR_ZM_MOV---data transfer between the FD component and the FZ component when the buses are merged.
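The six port-to-port channels listed above cover every pair among the four data ports FDR, FRR, FZM, and FMM. The following is a minimal illustrative sketch of that mapping, not a description of the hardware; the `Channel` enum layout and the helper `channel_for` are assumptions introduced here, and the merged-bus channel is deliberately left out.

```python
from enum import Enum


class Channel(Enum):
    """Port pairs of the six port-to-port transfer channels (taken from the list above)."""
    D_R_MOV = ("FDR", "FRR")
    D_Z_MOV = ("FDR", "FZM")
    R_Z_MOV = ("FRR", "FZM")
    D_M_MOV = ("FDR", "FMM")
    R_M_MOV = ("FRR", "FMM")
    Z_M_MOV = ("FZM", "FMM")


def channel_for(port_a: str, port_b: str) -> Channel:
    """Hypothetical helper: find the channel joining two distinct ports, in either order."""
    wanted = frozenset((port_a, port_b))
    for channel in Channel:
        if frozenset(channel.value) == wanted:
            return channel
    raise ValueError(f"no channel between {port_a!r} and {port_b!r}")


print(channel_for("FMM", "FRR").name)  # R_M_MOV
```

The lookup is order-independent because the source describes each channel as a transfer "between" two ports rather than in one direction.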
In D_R_MOV, as shown in Fig. 7l, FPCA issues a two-format read request instruction for the read data address: the address is sent from the FDR port in the first cycle, and the read instruction is sent in the second cycle. When the read data is valid, the FPCA component sends the write address from the FPCA_addr port and the write data through the FRR port, and the data is written into RAM_B. In a merged-bus transfer, FPCA and YPCA send the read and write addresses through the address ports FPCA_addr and YPCA_addr respectively, and the data passes together from the FDR and FRR ports through the internal data bus TIMD to the FZM and FMM outputs, or from FZM and FMM to the FDR and FRR outputs.
Data transfer operations between the other ports are similar to the two modes described above.
Table 7-11 STRC_PIN state control table
STRC_PIN<2:0> | Data bus state | Reset boot state | Signal source |
000 | Data buses used independently | Boot from FDR port | External hardware identification signal input pin STRC_PIN (all states) |
001 | Data buses used independently | Boot from FRR port | |
010 | Data buses merged | Boot from FD port | |
011 | Reserved | Reserved | |
100 | Data buses used independently | Boot from FZM port | |
101 | Data buses used independently | Boot from FMM port | |
110 | Data buses merged | Boot from FZ port | |
111 | Reserved | Reserved | |
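As a hedged illustration of how Table 7-11 could be consulted in a software model, the table can be written as a simple lookup. This is a sketch only: the names `STRC_PIN_STATES` and `decode_strc_pin` and the string labels are assumptions introduced here, not identifiers from the design.

```python
# Table 7-11 as a lookup: STRC_PIN<2:0> -> (data bus state, reset boot source).
# Codes 011 and 111 are reserved in the table and are therefore absent here.
STRC_PIN_STATES = {
    0b000: ("independent", "FDR"),
    0b001: ("independent", "FRR"),
    0b010: ("merged", "FD"),
    0b100: ("independent", "FZM"),
    0b101: ("independent", "FMM"),
    0b110: ("merged", "FZ"),
}


def decode_strc_pin(pins: int):
    """Hypothetical helper: return (bus_state, boot_source) for a 3-bit pin value.

    Reserved codes yield ("reserved", None).
    """
    if not 0 <= pins <= 0b111:
        raise ValueError("STRC_PIN is a 3-bit value")
    return STRC_PIN_STATES.get(pins, ("reserved", None))


print(decode_strc_pin(0b100))  # ('independent', 'FZM')
```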
Table 7-12 Signal description table for Fig. 7f~7s
Signal name | Function | Definition | Relevant figures |
STRC_PIN | External state control signal; defines the operating mode of the data bus | See Table 7-11 | 7h~7s |
STRC | Output signal of the internal state register STRC_REG; controls the operating mode of the data bus | See Table 9-1 | 7h~7s |
CLK | Control clock of the data upper latch | CLK=0: data enters the upper latch of DLAT; CLK=1: the upper latch of DLAT holds its value | 7i, 7j, 7l, 7n, 7o, 7r, 7s |
LDR | Lower-latch control signal of the FDR input double latch | LDR=1: upper-latch data can enter the lower latch; LDR=0: the lower latch holds its value | 7i, 7j, 7l, 7o, 7r, 7s |
LRR | Lower-latch control signal of the FRR input double latch | LRR=1: upper-latch data can enter the lower latch; LRR=0: the lower latch holds its value | 7i, 7j, 7l, 7o, 7r, 7s |
Q1 | Upper-latch control signal of the data input double latches of the FDR, FRR, FZM, and FMM components | Input received during the cycle enters the FCC decoding unit as an instruction | 7i, 7j, 7l, 7o, 7r, 7s |
Table 7-13 Functional unit description table for Fig. 7f~7s
Functional unit | Function | Relevant figures |
FRR, FDR | The two data I/O bus control units of the FD port | 7h~7s |
FZM, FMM | The two data I/O bus control units of the FZ port | 7h~7s |
FPCA | Data access address generation component of the FD port | 7h~7s |
FCC | Three-dimensional ping-pong decoder of the macroinstruction set parallel system; see "Fig. 9 explanation" | 7h~7s |
YPCA | Data access address generation component of the FZ port | 7h~7s |
MUX | Multi-way data selector | 7h~7s |
FIF | Internal state identification control component | 7h~7l, 7o~7s |
FSWAPO | Output data byte swap control component | 7i, 7j, 7l, 7o, 7r, 7s |
FSWAPI | Input data byte swap control component | 7i, 7j, 7l, 7o, 7r, 7s |
FDH2 | Byte access big-endian/little-endian control component | 7i, 7j, 7l, 7o, 7r, 7s |
FSCAN | Byte compare control component | 7i, 7j, 7l, 7o, 7r, 7s |
DLAT | Input data double latch | 7i, 7j, 7l, 7o, 7r, 7s |
FDIF | Instruction preprocessing component | 7i, 7j, 7l, 7o, 7r, 7s |
FALU | Arithmetic unit | 7i, 7j, 7l, 7o, 7r, 7s |
DFF | Data output control register | 7i, 7j, 7l, 7o, 7r, 7s |
Fig. 8 is a structural diagram of the very long instruction word (VLIW) control system
The VLIW control system is an identification system made up of an internal register identification system, an external memory identification system, and hardwired identification. In application it takes the form of a macrolanguage that resembles assembly language but has a higher semantic level. Its important application feature is:
Programs in the macrolanguage of this system must be designed according to all the features of this architecture and the basic elements of human operation behavior, under the combined action of the external hardwired identification, the external memory identification, and the internal register identification of the VLIW control system.
The main method of use is shown in Fig. 8:
(1) The AS software engineer first analyzes the features of the human operation behavior: which hardware structures and operation structures these features consist of, which operations are parallel, sequential, or dependent, and which are reused or redundant. The high-level semantic, syntactic, and pragmatic relations of the behavior requirement are then decomposed into independent dependent operations, redundant operations, reuse operations, serial operations, parallel operations, control operations, computation operations, and storage operations.
(2) Select the operation mode of this architecture, and determine the organization and structure definitions of the architecture that support the above operation requirements.
(3) Following the design rules of the VLIW control system (assembly, replacement, ordering, delay, optimization, and pipeline control), arrange and combine the high-level behavior process into an operation code stream that can be executed in one or more cycles. As a whole, this code stream reflects the elements of the requirement, including the timing relations, architecture, logical operations, and organization and control relations of each behavior's requirement elements, described with a determined data structure (for example, a data structure supporting the high-level languages LISP, FORTH, or FP).
(4) Using compilation software or manual coding, compile or design the operation elements reflected by the chosen data structure into a code stream of the macrolanguage of the macroinstruction set symmetrical parallel system, forming the identification code sequence of the VLIW control system. This sequence is input to the microprocessor through the data port selected and defined by the system, to carry out the control operation.
In fact, all macrolanguage codes of the macroinstruction set are produced by this method at design time and are provided in the relevant service manual.
The macroinstruction set symmetrical parallel system does not derive its control and code design from elementary instructions. Instead, control and code design are formed from the features of the system and from the effective combination, processing, and assembly of those features, so that they reflect high-level behavior operation requirements.
The form of expression of the macroinstruction set symmetrical parallel architecture, the macrolanguage, is the overall expression and reflection of all the features of this architecture, as shown in Fig. 8a. Its key features are:
* Using the ability of memory to store data, the features of the functional requirement identification, the control of the long instruction identification system, the internal component structures, and the relations of the data paths are reflected in the external memory identification system;
* The external memory identification system can store the designed operation behavior and the operation features of all elements used with this system. These features comprise multi-bit identification control fields for dependent, redundant, reuse, serial, parallel, control, computation, and storage operations. Following the identification rules of the VLIW control system, the fields are arranged into a coded word sequence, stored as consecutive multi-bit words, that conforms to the VLIW control system. According to the machine cycle timing of the system, the sequence is input through a data port to the decoder, producing the combinational logic control signals of the external identification, the internal identification, the VLIW logic control unit FDIF, and the decoder FCC. The basic requirements of the operation behavior thereby become the signals of the computer control process and are sent to each functional unit of the system. The semantic, syntactic, and pragmatic relations of the macrolanguage of the macroinstruction set symmetrical parallel system are described in detail in "Figure 10".
* The external identification system also includes settings selected by the bonding wires of the hardware circuit. These are input directly to the microprocessor and determine the initial state of the architecture during reset, as described in detail in "Figure 12".
* The internal register identification system uses the ability of registers to store information to define the composition, logical, and operational relations of the internal architecture, and constitutes the control and constraint that the VLIW control system places on the hardware structure. It is an important part of how the VLIW control system mediates the man-machine interface.
The internal VLIW register identification system can be programmed by the external VLIW system to select the operation mode, control mode, and basic structure definition. The process of hardware organization, operation, and control, and the flow of the related software processing, are set and changed dynamically through the external VLIW storage identification system. This effectively controls the operation of the internal functional units and reflects the controlled operating state of all components, data paths, and the system.
The key features of the internal VLIW identification system are:
* Using registers, and through input, loading, and selection by the external VLIW identification, it defines and controls the logical, composition, and operational relations of the components;
* The output signals of the identification registers, combined with the data of the external storage identification system, work together with the decoder to generate the combinational logic control signals and reflect the macrolanguage primitive functions in real time.
The role of the internal VLIW identification control system is described in detail in "Fig. 3-7, 9".
The form of expression of the VLIW control system is built from the basic behavior elements of computer operation and takes effect through the combined action of the external hardwired identification, the external memory identification, and the internal register identification.
As Fig. 8 b, 8c, shown in 8d and the table 8d1~8d30, macroinstruction set symmetrical expression parallel architecture has been realized a kind of sign system of the very long instruction word (VLIW) hierarchy of control, sign format, the sign control domain, and the assembly unit of function, replace, ordering, time-delay, the grand process operation of test and coding, produced the structure that acts on this system hardware, tissue, the change of control and logical relation, make it to interact, make hardware, the architecture of software can be by reorganization and grand processing, and these features make this system support complexity in when operation, orderly or unordered, determinacy or nondeterministic algorithm, plurality of data structures, the operation of multiple application requirements and direct reflection behavior operating process.
First feature of the very long instruction word (VLIW) hierarchy of control is:
* the display form of the outside hardwired sign system of the very long instruction word (VLIW) hierarchy of control is the sign control that forms according to the combinational logic relation of each tag line or every group id line, its application characteristic is only in the control action of this system reseting period generation to this system, shown in Fig. 8 b and as described in " Figure 12 ".
* the display form of the internal indicator register system sign format of the very long instruction word (VLIW) hierarchy of control is to reflect sign to all inner structures, logic, control and operation with register or latch.
Its application characteristic is:
(1) all signs can be stored and revise;
(2) in internal indicator register system according to the state code of each sign format control domain, produce effect to the architecture sign, shown in Fig. 8 c and " Fig. 2~7,9 " described.
The external storage identification format of the VLIW control system takes three forms, as shown in Fig. 8e:
(1) Single-instruction identification format operation system. Its essential feature is that the instruction operation process is independent and control can be completed in one clock cycle; the width of its instruction identification format is the width of one data bus.
(2) Double-instruction identification format operation system. Its essential feature is that the instruction operation process involves dependent and parallel operations: a dependent operation completes control in three clock cycles, and a parallel operation in two clock cycles. The width of its instruction identification format is twice the data bus width.
(3) Multiple-instruction identification format operation system. Its essential feature is that the instruction operation process gives rise to data conflicts and resource dependencies and needs multiple clock cycles to complete; its instruction width is the width of n data buses, and it needs n+1 cycles to complete.
The VLIW control identification system applies control and identification in two respects: the identification format and the identification control field. The first is control over the assembly, replacement, ordering, and delay of identification control fields; the second is control over the combined assembly of identification formats, which forms the design of external pipeline operation and optimization.
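The width and cycle counts of the three formats can be summarized in a small sketch. This is an illustrative model only; the function name `format_cost` and its parameters are assumptions introduced here, and the double-format width of two bus words is itself an assumption based on the two instruction words that the Fig. 8f2 primitive is said to occupy.

```python
def format_cost(kind: str, n: int = 0, dependent: bool = False):
    """Hypothetical helper: return (width_in_bus_words, clock_cycles) for a format.

    kind      -- "single", "double", or "multiple"
    n         -- number of bus words, used only for "multiple"
    dependent -- for "double": True for a dependent operation, False for a parallel one
    """
    if kind == "single":
        return (1, 1)                      # one bus word, one clock cycle
    if kind == "double":
        # Assumed width of two bus words; three cycles if dependent, two if parallel.
        return (2, 3 if dependent else 2)
    if kind == "multiple":
        if n < 1:
            raise ValueError("n must be a positive number of bus words")
        return (n, n + 1)                  # n bus words need n+1 cycles
    raise ValueError(f"unknown format kind: {kind!r}")


print(format_cost("multiple", n=4))  # (4, 5)
```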
As shown in Fig. 8f, the three instruction identification formats of the external memory identification system can themselves be assembled and combined at design time:
* A single instruction format can be assembled into the second format of a double instruction format, or into the second and third formats of a multiple instruction format. This assembly expresses, within a behavior operation and on the basis of data communication, the simultaneous completion of the data or instruction operations of one or two independent behaviors;
* A double instruction format can be assembled with itself into the second and third formats of a double or multiple instruction format. This assembly reflects a behavior operation with several dependent operation requirements and corresponding timing requirements; it is the pipeline design of the behavior operation;
* A multiple instruction format can be assembled with itself and in mixed form. This assembly reflects, to a greater extent, the external form of dependent, redundant, reuse, independent, parallel, and sequential operations; the different combinations of structure and assembly give human operation behavior its external control, optimization, and pipeline operation.
As shown in Fig. 8f1, 8f2, and 8f3, the various combinations correspond to various macrolanguage primitives.
An immediate-value store operation is the basis of the macrolanguage primitives. Because the operation is independent, its VLIW external memory identification uses the single-format control operation system.
As shown in Fig. 8f1, in memory this macrolanguage primitive occupies one instruction word, and the single-instruction identification format contains the stored immediate data. After the instruction is read into the decoder from the selected data port, it can be executed in a single cycle.
When a macrolanguage primitive performs a register-addressed fetch and an immediate-value access at the same time, this also constitutes a macrolanguage primitive. In this case, the single-format instruction is assembled into a double instruction format.
As shown in Fig. 8f2, in external memory this macrolanguage primitive is a double-instruction identification format word operation that occupies two instruction words, with the register-addressed data at some location a of the memory bank. In the identification format, the single-instruction identification format is assembled into the second format of the double-format instruction identification. In the timing diagram, the first instruction identification format produces the register-addressing control operation in the first cycle; in the second cycle, the second instruction identification format, that is, a single-instruction identification format, completes the immediate-value fetch; in the third cycle, the data at memory location a is obtained by register addressing. The assembly of the instruction formats thus reflects the process of independent and parallel operation and the macro-processing of the control language.
Higher-level macrolanguage primitives can also be produced by assembling timing-related instruction formats.
A macrolanguage primitive that performs interval approximation with the 1/2 algorithm and selects the control operation in real time reflects both dependent (conditional) operation and parallel operation. Its functional requirement is:
Compare an immediate value C, indicated by the R register, with the interval (A, B):
* if C is in the interval [A, B], compute (A+C)/2 and assign it to A; fetch with register addressing, and call subroutine 1;
* if C is not in the interval [A, B], find the maximum and minimum of A, B, and C and use them as A and B; increment the R register address pointer, and call subroutine 2.
The procedural model of this macrolanguage primitive operation is:
IF (PERIOD A, B → C)    / test whether C is in the interval [A, B]
THEN A=(A+C)/2, CALL 1    / condition satisfied
ELSE MAXMIN(A, B, C), R=R+1, CALL 2    / condition not satisfied
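The behavior described by this procedural model can be sketched in ordinary software terms as follows. This is a minimal illustration, not the macrolanguage itself: the function name `interval_step` is hypothetical, the two subroutines are passed in as plain callables, the register-addressed fetch is not modeled, and assigning the minimum to A and the maximum to B is an assumption (the text only says the two values are used "as A and B").

```python
def interval_step(a, b, c, r, subroutine_1, subroutine_2):
    """Hypothetical model of one step of the 1/2-algorithm interval approximation.

    Returns the updated (a, b, r).
    """
    if a <= c <= b:
        # Condition satisfied: halve the distance from A to C, then call subroutine 1.
        a = (a + c) / 2
        subroutine_1()
    else:
        # Condition not satisfied: rebuild the interval from the extremes of A, B, C
        # (assumed: A takes the minimum, B the maximum), advance the R pointer,
        # then call subroutine 2.
        a, b = min(a, b, c), max(a, b, c)
        r += 1
        subroutine_2()
    return a, b, r


a, b, r = interval_step(0, 8, 4, 0, lambda: None, lambda: None)
print(a, b, r)  # 2.0 8 0
```

In the hardware description the comparison, the address generation, and the fetch overlap across clock cycles; this sequential sketch captures only the functional result.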
As shown in Fig. 8f3, the storage identification format of this primitive occupies two instruction identification words and completes in three clock cycles; it is a mixed assembly of identification formats. From the description of Fig. 8a: in the first cycle of this instruction, the comparison of C with the interval [A, B] is completed and the resulting state is passed to the second cycle; at the same time, the address N for the fetch in register N addressing mode is sent out on the rising edge of the second clock cycle and the data is read in. In the second clock cycle, the conditional result of the comparison is available. Then:
* When the condition is satisfied, the entry address of subroutine 1 is selected and sent out on the rising edge of the third clock cycle; at the same time the current address pointer PC+2 is saved, and (A+C)/2 is computed and assigned to A. In the third clock cycle, the subroutine entry pointer is incremented and sent out on the rising edge of the fourth clock cycle, and the data read in by register N addressing is processed as the entry parameter of subroutine 1. This completes the operation.
* When the condition is not satisfied, in the second clock cycle the entry address of subroutine 2 is selected and sent out on the rising edge of the third clock cycle; at the same time the current address pointer PC+2 is saved, and the maximum and minimum of the three numbers A, B, and C are determined. In the third clock cycle, the data obtained by register N addressing is discarded, the entry pointer of the R value is adjusted to point to C1, and the incremented entry pointer of subroutine 2 is sent out on the rising edge of the fourth clock cycle. This completes the macrolanguage primitive.
The high-level semantics produced by the macro machine process of the macrolanguage depend on the assembly of instruction identification formats; the assembly process can form an external pipeline and its optimization.
Another key feature of the external identification system of the VLIW control system is:
* Recombination of the identification formats of this system yields multiple combinations, arising from the requirements of dependent operations, redundant operations, and control timing in the operation process. When executed, such a combination and the resulting VLIW control word sequence reflect a result similar to external superpipelined operation.
* Recombination of the identification control fields within an identification format of this system yields multiple combinations, arising from the requirements of sequential or parallel operation in the operation process. When executed, such a combination and the resulting VLIW control stream reflect a result similar to an external microcode optimization design.
The external VLIW identification format word consists of several multi-bit identification control fields, each of which can take multiple codes. The combination of several fields with the codes of each field can construct the primitive function operations of many macrolanguages. The semantics and instruction sequences of the macrolanguage built for different application requirements produce various combinations of instruction identification formats and the corresponding operation control of each functional component.
As shown in Fig. 8g, in the instruction identification system every instruction identification format contains an instruction/data control field and a random address pointer protect/no-protect control field. The instruction/data control field indicates whether what is read from the data port in the next cycle is an instruction or data, and produces coded control that separates data from instructions (when data and instructions come from the same data bus). The protect/no-protect control field indicates whether the memory address of the next instruction is protected when this instruction is executed.
When the codes of the protect/no-protect control field and the instruction/data field of the instruction identification format select an instruction with context protection and storage, the macro-semantic primitive operation shown in Fig. 8f2 becomes an immediate-value fetch followed by a jump to a subroutine, as shown in Fig. 8g1.
When the code of the instruction/data field in the instruction identification format selects data, the macro-semantic primitive operation shown in Fig. 8f1 becomes a fetch from the storage unit following the instruction, as shown in Fig. 8g2.
When the code of the instruction/data field in the instruction identification format selects instruction, the macro-semantic fetch in register N addressing mode of Fig. 8f3 becomes the insertion of one instruction operation before calling subroutine 1, or the optimization of the first instruction at the current subroutine address, as shown in Fig. 8g3 and Fig. 8g4.
As described for Fig. 8a, the macro-processing of VLIW system hardware and software takes place between instructions, between identification formats, and between identification control fields; the choice of identification control field codes likewise reflects the macro process of the macrolanguage primitives.
In the VLIW control system, the external VLIW memory identification system, which represents the basic macrolanguage functions of the computer, consists of several identification control fields. Each field is a complete operating function that corresponds to a basic element of human operation and that the computer can recognize. Through assembly, replacement, ordering, delay, and similar means, the identification control fields construct the macrolanguage primitives of many application functions.
The application features of the identification format fields of the VLIW control system are as follows:
* Assembly of identification control fields in the long instruction identification format is an implementation based on redundant and parallel use of hardware resources, aimed at high-level semantic functions.
As shown in Fig. 8h, once the application behavior has been determined through the design process described for Fig. 8, the identification control fields are programmed into the VLIW identification format according to the rules of the assembly field, under the action of the long instruction assembly field, to form the macrolanguage code.
A VLIW identification control system consists of several identification control fields, and its width is a multiple of the data bus width. The identification control fields reflect the features of the whole hardware architecture, but the length of an instruction word is limited and not all identification controls can be assembled into one instruction word. The assembly process is therefore similar to a computer architecture with multiple instruction formats: through the selection of the assembly field, a given identification instruction format can yield many combinations of control fields.
Its key features are:
(1) the operating position of an identification control field in the long instruction identification format can be selected under the control of the assembly field;
(2) the operation order of the identification control fields in the long instruction identification format can be selected under the control of the assembly field.
* Replacement of identification control fields in the long instruction identification format is an implementation based on reuse of hardware resources and dynamic change of operational relations, aimed at satisfying multiple application requirements.
As shown in Fig. 8i, when a behavior operation requires reuse of hardware resources or of data, the replacement field among the long instruction identification control fields lets the external long instruction identification format and the internal register identification format substitute for each other, so that the internal and external identification control fields act alternately on the architecture and running state.
As shown in the figure, replacement has two important operations:
(1) When an instruction identification control field in the external VLIW storage identification system coexists with an identification control field formed by internal registers, the control operation field of the external memory identification system substitutes for the corresponding control operation field of the internal register identification system, so that the architecture changes its operation mode according to the external storage identification;
(2) When the instruction identification format of the external VLIW identification system does not contain a complete assembled identification control operation, it can direct the control field of the internal register identification system to take effect, or it can mask the control field of the internal register identification system and return it to its initial state.
Its key features are:
(1) the replacement field of the external identification control fields allows the instruction identification format to take over the role of the internal register identification during operation, that is, to change the operational and control relations dynamically;
(2) for identification control fields that are implicit in the long instruction identification format, the replacement field lets the system use the internal identification control fields to achieve reuse and operation of resources.
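The two replacement operations amount to a selection rule between the external field, the internal register field, and the initial state. The following is a minimal sketch of that rule under stated assumptions: the function `effective_control` and its parameters are hypothetical names, control fields are modeled as opaque values, and a missing external field is represented by `None`.

```python
def effective_control(external, internal, initial, mask_internal=False):
    """Hypothetical model of the replacement field's selection rule.

    external      -- control field carried by the external format word, or None if
                     the word does not carry a complete control operation
    internal      -- control field currently held in the internal register
    initial       -- initial-state value of that control field
    mask_internal -- when no external field is present, True masks the internal
                     field (falling back to the initial state)
    """
    if external is not None:
        return external          # operation (1): external field replaces internal
    if mask_internal:
        return initial           # operation (2), masked: revert to the initial state
    return internal              # operation (2), unmasked: internal field takes effect


print(effective_control(None, "merged", "independent"))  # merged
```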
* identify the ordering of control domain in the long instruction sign format, be with the serial of operation and the right of priority of operating process, realize the rearrangement of control stream to change execution sequence between the sign control domain, with the versatility of pursuing semantic behavior is a kind of implementation of target, shown in Fig. 8 j.This operation be with the ordering territory be control, when a serial occurs simultaneously with parallel operation and has preferential resource occupation or when data are relevant or sequential is correlated with, the control that can utilize the ordering territory is resequenced the position of sign control domain definite in the long instruction sign format and operated function by the demand of preferential resource occupation.
Ordering and explanation to the arithmetic operation territory are described in " Fig. 4 ", and be basic identical to the ordering in other control operation territory.Data operation result's ordering will be by the taking of preferential resource, and the effect by the ordering territory produces.When the operation in the clock period has produced two results and need deliver to same destination register to the result when register Rn (A+B and C+D send same), the effect in ordering territory be the indication result of A+B or C+D which is preferentially sent into as a result, and the value that is not admitted to of indication is to keep or discarded.
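The register write conflict described above can be illustrated with a short sketch. It is a simplified model, assuming the ordering field carries two pieces of information (which result has priority, and whether the other is kept); the function name and parameter names are hypothetical.

```python
def resolve_write_conflict(first, second, prefer_first, keep_other):
    """Hypothetical model of the ordering field for two results aimed at one register.

    Returns (value_written_to_register, held_value_or_None).
    """
    winner, loser = (first, second) if prefer_first else (second, first)
    # The value that loses priority is either held for later use or discarded.
    return winner, (loser if keep_other else None)


A, B, C, D = 3, 4, 5, 6
print(resolve_write_conflict(A + B, C + D, prefer_first=False, keep_other=True))  # (11, 7)
```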
Its key features are:
(1) the functions of all identifier control fields reflected in the long-instruction identifier format can be recombined according to application requirements, while the function realized by each identifier control field does not change;
(2) a behavioral operation requirement is reflected in the operation of specific identifier control fields; through ordering alone, these fields can express macro-language primitives for multiple operation behaviors.
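The destination-register case described above (A+B and C+D both targeting Rn) can be sketched as follows. This is a minimal illustration only: the function name `resolve_write` and the two boolean parameters standing in for the ordering field are hypothetical, since the text does not give the field's bit encoding.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Result:
    name: str   # label of the producing operation, e.g. "A+B"
    value: int  # computed value


def resolve_write(first: Result, second: Result,
                  first_has_priority: bool,
                  keep_other: bool) -> Tuple[int, Optional[int]]:
    """Pick the value written to the shared destination register.

    first_has_priority and keep_other are hypothetical stand-ins for
    what the ordering field indicates: which result goes in first, and
    whether the other value is retained or discarded.
    Returns (value written to Rn, retained value or None).
    """
    winner, other = (first, second) if first_has_priority else (second, first)
    retained = other.value if keep_other else None
    return winner.value, retained


a_plus_b = Result("A+B", 3)
c_plus_d = Result("C+D", 7)
written, retained = resolve_write(a_plus_b, c_plus_d,
                                  first_has_priority=False, keep_other=True)
```

With these inputs C+D is written to Rn and the A+B value is retained rather than discarded.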
* Delay of the identifier control fields in the long-instruction identifier format is based on the data resource conflicts and operation control dependencies produced by the program flow. It is an implementation that aims to use the available resources to give maximum support to the high-level pragmatics of the macro language, as shown in Fig. 8k. This operation is controlled by the delay field: when the operation behavior requirement produces a data resource conflict or an operation control dependency, the delay field can place the identifier control fields in different cycles for operation.
When an operation is data-dependent, as shown in Fig. 8f2, and the arithmetic operation field in the first format of a two-instruction format can operate only after its data has been obtained by register random addressing in the third cycle, the delay field uses the decode counter to postpone the arithmetic operation field of the first instruction format until the third cycle. Decoding then takes effect in the third cycle and the dependent operation is carried out, while the delayed operation control field is held in the decode register. The effect of the delay field is thus to let the dependent field of an instruction identifier format in a given cycle operate on data obtained in a later cycle.
Its key features are:
(1) the operating process of the computer has a flexible timing control relationship and can use resources to the greatest extent;
(2) dependency and conflict problems are handled and resolved automatically in hardware, so that control of the instruction stream is not interrupted and efficiency is improved.
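The delay mechanism can be pictured as a per-field countdown. The sketch below is an assumption-laden model, not the hardware: each control field carries a delay count (the decode counter), is held until the count reaches zero, and then executes. The function name `run` and the field names are hypothetical.

```python
from typing import Dict, List


def run(fields: Dict[str, int], cycles: int) -> List[List[str]]:
    """Simulate delayed execution of control fields.

    fields maps a field name to the number of cycles it must be
    delayed (0 = execute in the first cycle). Returns, for each cycle,
    the names of the fields executed in that cycle.
    """
    counters = dict(fields)          # fields still held in the decode register
    executed: List[List[str]] = []
    for _ in range(cycles):
        ready = [name for name, count in counters.items() if count == 0]
        executed.append(ready)
        for name in ready:
            del counters[name]       # field has been executed
        for name in counters:
            counters[name] -= 1      # decode counter ticks down
    return executed


# The data fetch runs in cycle 1; the arithmetic field is delayed by
# two cycles, so it executes in cycle 3, when its data is available.
timeline = run({"fetch": 0, "arithmetic": 2}, cycles=3)
```

Here `timeline` lists the fetch in the first cycle, nothing in the second, and the arithmetic field in the third, matching the third-cycle example in the text.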
The external identifiers of the very long instruction word (VLIW) control system can be read through a selected data port: on the rising edge of the memory cycle of the system timing, the externally stored identifier instruction is read into the decoder to carry out the control operation. It is characterized as follows:
(1) For the single-instruction identifier format, as shown in Fig. 8l, data is read from the selected data port and transferred to the decoder in one memory cycle, the operation is completed in one clock cycle, and the result is saved on the rising edge of the next clock cycle. When the frequency of the memory cycle is less than two clock cycles, each memory cycle can complete the operations of two single-instruction identifier formats.
(2) For the two-instruction identifier format, as shown in Fig. 8m, data is read continuously from a given port and transferred to the decoder over two memory cycles. Control of dependent and delayed operations is completed in the first CPU clock cycle, and control of parallel or independent operations in the second CPU clock cycle. For a dependent or delayed operation, control of the whole operation is completed in the third CPU clock cycle, and the result is saved on the rising edge of the third or fourth CPU clock cycle.
(3) For the multi-instruction identifier format, as shown in Fig. 8n, data is read continuously from a given port and transferred to the decoder over at least three memory cycles. Control of dependent and delayed operations is completed in the first CPU clock cycle; control of compound, parallel, or independent operations is completed in the second; the dependent or delayed operation is carried out in the third; the compound operation is completed in the fourth; and the result is saved on the rising edge of the fifth CPU clock cycle.
(4) For the combined-instruction identifier format, as shown in Fig. 8o, two-cycle or three-cycle operation is carried out depending on the combination and assembly; because of the assembly of instruction identifier formats, the operation in each cycle determines the operating process of the next cycle.
Table 8b-1 STRC_PIN---external hard-wired identifier for the communication structure
STRC_PIN<2:0> | Data bus status | Reset boot state |
000 | Data buses used independently | Boot from port FDR |
001 | Data buses used independently | Boot from port FRR |
010 | Data buses merged | Boot from port FD |
011 | Reserved | Reserved |
100 | Data buses used independently | Boot from port FZM |
101 | Data buses used independently | Boot from port FMM |
110 | Data buses merged | Boot from port FZ |
111 | Reserved | Reserved |
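As a reading aid, Table 8b-1 can be expressed as a lookup keyed by the three pin bits. The sketch below only restates the table; the function name `decode_strc_pin` and the strings "independent" and "merged" are illustrative choices, and reserved codes are returned as `None`.

```python
from typing import Optional, Tuple

# (data bus status, boot port) for each STRC_PIN<2:0> code in Table 8b-1.
# Codes 011 and 111 are reserved and therefore absent.
STRC_PIN_TABLE = {
    0b000: ("independent", "FDR"),
    0b001: ("independent", "FRR"),
    0b010: ("merged", "FD"),
    0b100: ("independent", "FZM"),
    0b101: ("independent", "FMM"),
    0b110: ("merged", "FZ"),
}


def decode_strc_pin(code: int) -> Optional[Tuple[str, str]]:
    """Return (bus status, boot port), or None for a reserved code."""
    if not 0 <= code <= 0b111:
        raise ValueError("STRC_PIN is a 3-bit value")
    return STRC_PIN_TABLE.get(code)


bus_status, boot_port = decode_strc_pin(0b110)
```

In this encoding bit 1 distinguishes merged from independent bus use, and bit 2 distinguishes the FD-side ports (FDR, FRR, FD) from the FZ-side ports (FZM, FMM, FZ).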
Table 8b-2 ASC_PIN---external hard-wired identifier for address port synchronous/asynchronous timing
ASC_PIN | Function |
0 | Synchronous timing control mode |
1 | Asynchronous timing control mode |
Table 8b-3 ASCd_PIN---external hard-wired identifier for data port synchronous/asynchronous timing
ASCd_PIN | Function |
0 | Data latch operates in synchronous timing control mode |
1 | Data latch operates in asynchronous timing control mode |
Table 8b-4 PS_PIN---external hard-wired identifier for address unit random/serial storage mode
PS_PIN | Function |
0 | Random storage mode |
1 | Serial storage mode |
Table 8b-5 FIFO_PIN---external hard-wired identifier for address unit serial first-in first-out/first-in last-out
FIFO_PIN | Function |
0 | First-in first-out |
1 | First-in last-out |
Table 8b-6 IA_PIN---external hard-wired identifier for the initial address
IA_PIN | Function |
0 | Reset at the low-end address 000000H of the memory |
1 | Reset at the high-end address FFFFFFH of the memory |
Table 8b-7 FDP_PIN---external hard-wired identifier for the ping-pong decoding line
FDP_PIN<1:0> | Function |
00 | The first line mode is the first decoding line |
01 | The second line mode is the first decoding line |
10 | The third line mode is the first decoding line |
11 | The fourth line mode is the first decoding line |
Fig. 8 c note
The first address port internal indicator territory
The sequential mode during synchronous/asynchronous of ASC1_IF---first address port address output
The internal indicator territory
PC1_IF---the first address port address strobe internal indicator territory
MPNR1_IF---first address port partition holding size internal indicator territory
VPM1_IF---the first address port storage administration mode internal indicator territory
BER1_IF---the first address port byte addressing mode internal indicator territory
The second address port internal indicator territory
The sequential mode during synchronous/asynchronous of ASC2_IF---second address port address output
The internal indicator territory
PC2_IF---the second address port address strobe internal indicator territory
MPNR2_IF---second address port partition holding size internal indicator territory
VPM2_IF---the second address port storage administration mode internal indicator territory
BER2_IF---the second address port byte addressing mode internal indicator territory
Three-address port internal indicator territory
The sequential mode during synchronous/asynchronous of ASC3_IF---three-address port address output
The internal indicator territory
PC3_IF---three-address port address gating internal indicator territory
MPNR3_IF---three-address port partition holding size internal indicator territory
VPM3_IF---three-address port storage administration mode internal indicator territory
BER3_IF---three-address port byte addressing mode internal indicator territory
Four-address port internal indicator territory
The sequential mode during synchronous/asynchronous of ASC4_IF---four-address port address output
The internal indicator territory
PC4_IF---four-address port address gating internal indicator territory
MPNR4_IF---four-address port partition holding size internal indicator territory
YPM4_IF---four-address port storage administration mode internal indicator territory
BER4_IF---four-address port byte addressing mode internal indicator territory
First data port internal identifier fields
SWAPI1_IF---first data port data input byte swap internal identifier field
SWAPO1_IF---first data port data output byte swap internal identifier field
ASCd1_IF---first data port data synchronous/asynchronous timing mode internal identifier field
I/D1_IF---first data port instruction/data internal identifier field
RPS1_IF---first data port register parallel/serial use internal identifier field
RFIFO1_IF---first data port first-in first-out/first-in last-out operating mode internal identifier field
Size1_IF---first data port input/output data byte width internal identifier field
Second data port internal identifier fields
SWAPI2_IF---second data port data input byte swap internal identifier field
SWAPO2_IF---second data port data output byte swap internal identifier field
ASCd2_IF---second data port data synchronous/asynchronous timing mode internal identifier field
I/D2_IF---second data port instruction/data internal identifier field
RPS2_IF---second data port register parallel/serial use internal identifier field
RFIFO2_IF---second data port first-in first-out/first-in last-out operating mode internal identifier field
Size2_IF---second data port input/output data byte width internal identifier field
Third data port internal identifier fields
SWAPI3_IF---third data port data input byte swap internal identifier field
SWAPO3_IF---third data port data output byte swap internal identifier field
ASCd3_IF---third data port data synchronous/asynchronous timing mode internal identifier field
I/D3_IF---third data port instruction/data internal identifier field
RPS3_IF---third data port register parallel/serial use internal identifier field
RFIFO3_IF---third data port first-in first-out/first-in last-out operating mode internal identifier field
Size3_IF---third data port input/output data byte width internal identifier field
Fourth data port internal identifier fields
SWAPI4_IF---fourth data port data input byte swap internal identifier field
SWAPO4_IF---fourth data port data output byte swap internal identifier field
ASCd4_IF---fourth data port data synchronous/asynchronous timing mode internal identifier field
I/D4_IF---fourth data port instruction/data internal identifier field
RPS4_IF---fourth data port register parallel/serial use internal identifier field
RFIFO4_IF---fourth data port first-in first-out/first-in last-out operating mode internal identifier field
Size4_IF---fourth data port input/output data byte width internal identifier field
Decode operation internal identifier fields
Decm_IF---decoder input storage port data internal identifier field
Decs_IF---decoder input line mode internal identifier field
Decpp_IF---ping-pong decoding mode internal identifier field
SWITCH_IF---replacement operation internal identifier field
ORDER_IF---operation ordering internal identifier field
DELAY_IF---delay operation internal identifier field
ASSORTMENT_IF---assembly operation internal identifier field
Arithmetic operation internal identifier fields
OP2_IF---operation class internal identifier field
ALUOPs_IF---serial operation internal identifier field
ALUOPp_IF---parallel operation internal identifier field
ALUs_IF---operand source internal identifier field
ALUd_IF---operation result destination internal identifier field
AU_IF---arithmetic operation internal identifier field
LOG_IF---logical operation internal identifier field
SHC_IF---shift operation internal identifier field
SHB_IF---shift count internal identifier field
Cond_IF---condition test source internal identifier field
FLAG_IF---logic true/false internal identifier field
Ncc_IF---carry identifier field
Icc_IF---logic condition code internal identifier field
Processor state and other internal identifier fields
OUSU_IF---processor system state internal identifier field
TIMER_IF---counter internal identifier field
Sys_IF---system operating state internal identifier field
MPI_IF---communication operation state internal identifier field
FM_IF---instruction format internal identifier field
OP_IF---instruction basic operation internal identifier field
EI_IF---interrupt mask internal identifier
PIL_IF---interrupt priority level internal identifier
IBAR_IF---interrupt base address internal identifier
Fig. 8 d note
The first address port storaging mark territory
Sequential mode is deposited during the synchronous/asynchronous of ASC1---first address port address output
The storage identification field
PC1---the first address port address strobe storaging mark territory
MPNR1---first address port partition holding size storaging mark territory
YPM1---the first address port storage administration mode storaging mark territory
BER1---the first address port byte addressing mode storaging mark territory
The second address port storaging mark territory
Sequential mode is deposited during the synchronous/asynchronous of ASC2---second address port address output
The storage identification field
PC2---the second address port address strobe storaging mark territory
MPNR2---second address port partition holding size storaging mark territory
YPM2---the second address port storage administration mode storaging mark territory
BER2---the second address port byte addressing mode storaging mark territory
Three-address port storaging mark territory
Sequential mode is deposited during the synchronous/asynchronous of ASC3---three-address port address output
The storage identification field
PC3---three-address port address gating storaging mark territory
MPNR3---three-address port partition holding size storaging mark territory
YPM3---three-address port storage administration mode storaging mark territory
BER3---three-address port byte addressing mode storaging mark territory
Four-address port storaging mark territory
Sequential mode is deposited during the synchronous/asynchronous of ASC4---four-address port address output
The storage identification field
PC4---four-address port address gating storaging mark territory
MPNR4---four-address port partition holding size storaging mark territory
YPM4---four-address port storage administration mode storaging mark territory
BER4---four-address port byte addressing mode storaging mark territory
First data port storage identifier fields
SWAPI1---first data port data input byte swap storage identifier field
SWAPO1---first data port data output byte swap storage identifier field
ASCd1---first data port data synchronous/asynchronous timing mode storage identifier field
I/D1---first data port instruction/data storage identifier field
RPS1---first data port register parallel/serial use storage identifier field
RFIFO1---first data port first-in first-out/first-in last-out operating mode storage identifier field
Size1---first data port input/output data byte width storage identifier field
Second data port storage identifier fields
SWAPI2---second data port data input byte swap storage identifier field
SWAPO2---second data port data output byte swap storage identifier field
ASCd2---second data port data synchronous/asynchronous timing mode storage identifier field
I/D2---second data port instruction/data storage identifier field
RPS2---second data port register parallel/serial use storage identifier field
RFIFO2---second data port first-in first-out/first-in last-out operating mode storage identifier field
Size2---second data port input/output data byte width storage identifier field
Third data port storage identifier fields
SWAPI3---third data port data input byte swap storage identifier field
SWAPO3---third data port data output byte swap storage identifier field
ASCd3---third data port data synchronous/asynchronous timing mode storage identifier field
I/D3---third data port instruction/data storage identifier field
RPS3---third data port register parallel/serial use storage identifier field
RFIFO3---third data port first-in first-out/first-in last-out operating mode storage identifier field
Size3---third data port input/output data byte width storage identifier field
Fourth data port storage identifier fields
SWAPI4---fourth data port data input byte swap storage identifier field
SWAPO4---fourth data port data output byte swap storage identifier field
ASCd4---fourth data port data synchronous/asynchronous timing mode storage identifier field
I/D4---fourth data port instruction/data storage identifier field
RPS4---fourth data port register parallel/serial use storage identifier field
RFIFO4---fourth data port first-in first-out/first-in last-out operating mode storage identifier field
Size4---fourth data port input/output data byte width storage identifier field
Decode operation storage identifier fields
Decm---decoder input storage port data storage identifier field
Decs---decoder input line mode storage identifier field
Decpp---ping-pong decoding mode storage identifier field
SWITCH---replacement operation storage identifier field
ORDER---operation ordering storage identifier field
DELAY---delay operation storage identifier field
ASSORTMENT---assembly operation storage identifier field
Arithmetic operation storage identifier fields
OP2---operation class storage identifier field
ALUOPs---serial operation storage identifier field
ALUOPp---parallel operation storage identifier field
ALUs---operand source storage identifier field
ALUd---operation result destination storage identifier field
AU---arithmetic operation storage identifier field
LOG---logical operation storage identifier field
SHC---shift operation storage identifier field
SHB---shift count storage identifier field
Cond---condition test source storage identifier field
FLAG---logic true/false storage identifier field
Ncc---carry identifier field
Icc---logic condition code storage identifier field
Processor state and other storage identifier fields
OUSU---processor system state storage identifier field
TIMER---counter storage identifier field
Sys---system operating state storage identifier field
MPI---communication operation state storage identifier field
FM---instruction format storage identifier field
OP---instruction basic operation storage identifier field
EI---interrupt mask storage identifier
PIL---interrupt priority level storage identifier
IBAR---interrupt base address storage identifier
Table 8d-1 ASC1---
ASC1 | Function |
0 | Address output operates in synchronous timing mode |
1 | Address output operates in asynchronous timing mode |
Table 8d-2 PC1---
PC1 | Function |
000 001 010 011 100 101 110 111 | Selects one of eight address sources: the program pointer decrement register, the program pointer increment register, the current pointer register, the operation result register, or the first, second, third, or fourth data port register |
Table 8d-3 YPM1---
YPM1 | Function |
0 | Absolute address management mode |
1 | Paged address management mode |
Table 8d-4 BER1---
BER1 | Function |
0 | Byte addressing mode is big-endian |
1 | Byte addressing mode is little-endian |
Table 8d-5 SWAPI1---options for byte swapping of VLIW-controlled data input
SWAPI1 \ Size | 00 | 01 | 10 | 11 |
00 01 10 11 | No swap; reserved | No swap; high/low byte swap; reserved | No swap; high/low byte swap; high/low 16-bit swap; reserved | No swap; high/low byte swap; high/low 16-bit swap; high/low 32-bit swap |
Table 8d-6 SWAPO1---options for byte swapping of VLIW-controlled data output
SWAPO1 \ Size | 00 | 01 | 10 | 11 |
0 | No swap | No swap | No swap | No swap |
1 | Reserved | High/low byte swap | High/low 16-bit swap | High/low 32-bit swap |
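The output swap in Table 8d-6 amounts to exchanging the upper and lower halves of the datum, with the datum width taken from the Size field (Table 8d-11: 00, 01, 10, 11 for 8, 16, 32, 64 bits). The following is a small sketch of that reading; the function names are hypothetical, and raising an error for the reserved 8-bit case is an assumption made here for illustration.

```python
def swap_halves(value: int, width_bits: int) -> int:
    """Exchange the high and low halves of a width_bits-wide value."""
    half = width_bits // 2
    mask = (1 << half) - 1
    return ((value & mask) << half) | ((value >> half) & mask)


def apply_swapo(value: int, size_code: int, swapo: int) -> int:
    """Apply the SWAPO bit to an output datum.

    size_code is the 2-bit Size field; swapo is the 1-bit SWAPO field.
    """
    width = 8 << size_code           # 00->8, 01->16, 10->32, 11->64 bits
    if not swapo:
        return value                 # 0: no swap for every size
    if width == 8:
        raise ValueError("SWAPO=1 with 8-bit data is reserved")
    return swap_halves(value, width)


swapped16 = apply_swapo(0x1234, 0b01, 1)        # high/low byte swap
swapped32 = apply_swapo(0x11223344, 0b10, 1)    # high/low 16-bit swap
```

For the 16-bit case the two bytes trade places, and for the 32-bit case the two 16-bit halves do, as the table's columns describe.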
Table 8d-7 ASCd1---
ASCd1 | Function |
0 | Data latch operates in synchronous timing mode |
1 | Data latch operates in asynchronous timing mode |
Table 8d-8 I/D1---
I/D1 | Function |
0 | The data port input in the next cycle is an instruction |
1 | The data port input in the next cycle is data |
Table 8d-9 RPS1---
RPS1 | Function |
0 | Parallel operation |
1 | Serial operation |
Table 8d-10 RFIFO1---
RFIFO1 | Function |
0 | First-in first-out (FIFO) |
1 | First-in last-out (FILO) |
Table 8d-11 Size1---
Size1 | Function |
00 | 8-bit data operation |
01 | 16-bit data operation |
10 | 32-bit data operation |
11 | 64-bit data operation |
Table 8d-12 Decm---
Decm | Function |
00 | Decode from the first data port |
01 | Decode from the second data port |
10 | Decode from the third data port |
11 | Decode from the fourth data port |
Table 8d-13 Decs---
Decs | Function |
00 | Decode data of the first line mode |
01 | Decode data of the second line mode |
10 | Decode data of the third line mode |
11 | Decode data of the fourth line mode |
Table 8d-14 Decpp---
Decpp | Function |
00 | Serial decoding mode |
01 | Parallel decoding mode |
10 | Periodic decoding mode |
11 | Restrictive decoding mode |
Table 8d-15 DELAY---delay identifier field
DELAY | Function |
00 | No delay |
01 | Operation delayed by 1 cycle |
10 | Operation delayed by 2 cycles |
11 | Operation delayed by 3 cycles |
Table 8d-16 OUSU---
OUSU | Function |
00 | Processor is in the OK state |
01 | Processor is in the UT state |
10 | Processor is in the OS state |
11 | Processor is in the user state |
Table 8d-17 MPI---
MPI | Function |
00 | Multiprocessor common instruction |
01 | Multiprocessor 1 instruction |
10 | Multiprocessor 2 instruction |
11 | Multiprocessor 3 instruction |
Table 8d-18 FM---instruction format identifier field
FM | Function |
00 | Single format |
01 | Multi-format, first format |
10 | Multi-format, intermediate format |
11 | Multi-format, final format |
Table 8d-19 OP---basic operation identifier field
OP | Function |
00 | CALL: subroutine call operation |
01 | IF: conditional branch operation |
10 | LOAD: fetch operation |
11 | STORE: store operation |
Table 8d-20 EI---
EI | Function |
0 | Interrupt mask is on |
1 | Interrupt mask is off |
Table 8d-21 PIL---
PIL | Function |
000 | Mask interrupts above level 0 |
001 | Mask interrupts above level 1 |
010 | Mask interrupts above level 2 |
011 | Mask interrupts above level 3 |
100 | Mask interrupts above level 4 |
101 | Mask interrupts above level 5 |
110 | Mask interrupts above level 6 |
111 | Mask interrupts above level 7 |
Table 8d-22 OP2---operation identifier field
OP2 | Function |
000 | Data transfer operation |
001 | Arithmetic operation |
010 | Logical operation |
011 | Shift operation |
100 | Parallel operation |
101 | Serial operation |
110 | Data compare operation |
111 | Reserved |
Table 8d-23 ALUOPs---
ALUOPs | Function |
000 | Arithmetic then logical operation, serial |
001 | Logical then arithmetic operation, serial |
010 | Arithmetic then shift operation, serial |
011 | Shift then arithmetic operation, serial |
100 | Logical then shift operation, serial |
101 | Shift then logical operation, serial |
110 111 | Reserved |
Table 8d-24 ALUOPp---
ALUOPp | Function |
00 | Arithmetic and arithmetic operations in parallel |
01 | Arithmetic and logical operations in parallel |
10 | Arithmetic and shift operations in parallel |
11 | Logical and shift operations in parallel |
Table 8d-25 ALUs---
ALUs | Function |
00 | Operand comes from the first line mode |
01 | Operand comes from the second line mode |
10 | Operand comes from the third line mode |
11 | Operand comes from the fourth line mode |
Table 8d-26 ALUd---
ALUd | Function |
00 | Operation result is output to the first data port register |
01 | Operation result is output to the second data port register |
10 | Operation result is output to the third data port register |
11 | Operation result is output to the fourth data port register |
Table 8d-27 AU---
AU | Function |
000 | Add |
001 | Add with carry |
010 | Subtract |
011 | Subtract with borrow |
100 | Add, affecting the Icc state |
101 | Add with carry, affecting the Icc state |
110 | Subtract, affecting the Icc state |
111 | Subtract with borrow, affecting the Icc state |
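In the AU encoding above, the low two bits select among add, add with carry, subtract, and subtract with borrow, while the high bit marks the variants that affect the Icc state. A rough sketch of such a decode follows; the 32-bit operand width, the function name `au_execute`, and the way carry and borrow are reported are all assumptions, since the text does not specify the word size or the contents of Icc.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit operand width (not stated in the text)


def au_execute(code: int, a: int, b: int, carry_in: int = 0):
    """Decode a 3-bit AU code and perform the operation.

    Returns (result, carry_or_borrow_out, affects_icc).
    """
    if not 0 <= code <= 0b111:
        raise ValueError("AU is a 3-bit field")
    base = code & 0b011                  # which arithmetic operation
    affects_icc = bool(code & 0b100)     # codes 100-111 affect Icc
    if base == 0b00:
        raw = a + b                      # add
    elif base == 0b01:
        raw = a + b + carry_in           # add with carry
    elif base == 0b10:
        raw = a - b                      # subtract
    else:
        raw = a - b - carry_in           # subtract with borrow
    result = raw & MASK32
    carry_out = int(raw > MASK32 or raw < 0)
    return result, carry_out, affects_icc


wrapped = au_execute(0b101, MASK32, 0, carry_in=1)
```

The last call adds the incoming carry to an all-ones operand, so the 32-bit result wraps to zero with a carry out, and the code 101 reports that Icc is affected.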
Table 8d-28 LOG---
LOG | Function |
000 | Logical AND |
001 | Logical OR |
010 | Logical XOR |
011 | Reserved |
100 | Logical AND, modifying the Icc state |
101 | Logical OR, modifying the Icc state |
110 | Logical XOR, modifying the Icc state |
111 | Reserved |
Table 8d-29 SHC---
SHC | Function |
000 | Logical shift left |
001 | Logical shift right |
010 | Rotate left |
011 | Rotate right |
100 | Arithmetic shift left |
101 | Arithmetic shift right |
110 111 | Reserved |
Table 8d-30 Cond---operating condition identifier field
Cond | Function |
00 | Unconditional operation |
01 | Test the logic true/false flag FLAG |
10 | Test the condition code Icc |
11 | Test the carry flag Ncc |
Fig. 9 is a block diagram of the FCC three-dimensional ping-pong decode controller
The symmetric three-dimensional ping-pong decoding unit FCC, as shown in Fig. 9, comprises:
* two independent instruction input selectors MUX1 and MUX2, which gate the instruction inputs of the first, second, third, and fourth lines. The instruction/data input modes include the fourth line mode of the internal data bus TIMDBUS and the first line mode of the data ports. The gating control inputs are controlled by the internal VLIW register identifier component FIF, and the gated instruction/data outputs are connected to the two decoders FCCB and FCCP respectively;
* two independent decoding units FCCB and FCCP, each of which decodes its instruction/data input; the output signals they generate are combined with the data in the instruction identifier preprocessing register FDIF to form the control signals;
* a three-dimensional decode state control component DC, which controls the operating mode of the decoding units FCCB and FCCP. The control inputs of the DC component come from the relevant states of the internal VLIW register identifier component FIF and from the external hard-wired logic identifiers; the decoding units complete the pre-decode operation under DC control.
In Fig. 9, STRC is the internal VLIW register identifier that selects the operating mode of the data port components; it indicates the definition of the operating mode of the current data port component, see Table 9-1. FPP1, FPP2, and FPP3 are, respectively, the three-dimensional ping-pong decoding mode control signal, the state control signal, and the periodic/restrictive decode control signal; they indicate the decode control mode and state of the current FCC, see Table 9-2, Table 9-3, Table 9-3.
Under the control of the internal VLIW register identifiers, the three-dimensional ping-pong decoding unit provides the following decoding modes:
(1) Parallel decoding
When the definition of a data port component (FD, FZ, FTNSF, or FT) selects independent use of its two data buses, the decoding mode is parallel. The two decoders of the FCC component (FCCB and FCCP) can simultaneously decode two instruction data items input on two different buses, as shown in Fig. 9a. In independent use, each memory cycle can accept a VLIW identifier word of twice the data bus width. When the operations of the two instructions are mutually independent, they can be executed in parallel at the same time, as shown in Fig. 9b. When the operations of the two instructions are dependent, the dependent instruction is sent to the pipeline queue in the FDIF instruction preprocessing component for delay processing, as shown in Fig. 9c; the dependent instruction is then held automatically in the instruction preprocessing component until the dependency control of the combined control signals produced by decoding is released.
(2) Serial decoding
When the definition of a data port component (FD, FZ, FTNSF, or FT) selects merged use of its two data buses, the decoding mode is serial. The two decoders of the FCC component (FCCB and FCCP) gate the input one item at a time, by execution cycle, following the sequence of the two input instruction data streams, and decode and execute it. At any one time, each decoder can execute the instruction data of one data bus input in the first or fourth line mode. In merged use, the internal CPU clock period is allowed to be twice the memory read/write cycle, so that in each memory cycle the processor can accept a VLIW identifier word sequence of four times the data bus width; within one memory cycle the two decoders gate, according to the selected operating mode, the VLIW identifier words of four times the data bus width from the two data port components, as shown in Fig. 9d. When the operations of the two instructions are mutually independent, the two instruction words are executed together, in order; when the operations of the two instructions are dependent, the dependent instruction is sent to the pipeline queue in the FDIF instruction preprocessing component for delay processing until the dependency control of the combined control signals produced by decoding is released.
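In both the parallel and the serial mode, the rule is the same: independent instructions proceed together, while a dependent instruction is parked in the FDIF queue. A toy model of that decision is sketched below. The text does not define "dependent" precisely, so the check used here (the second instruction reads or overwrites the register the first one writes) is an assumption, as are the names `Instr`, `depends_on`, and `dispatch`.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    name: str
    dest: str               # register written
    srcs: Tuple[str, ...]   # registers read


def depends_on(later: Instr, earlier: Instr) -> bool:
    """Assumed dependency test: later reads or overwrites earlier's result."""
    return earlier.dest in later.srcs or earlier.dest == later.dest


def dispatch(first: Instr, second: Instr) -> Tuple[List[str], List[str]]:
    """Return (instructions issued now, instructions queued in FDIF)."""
    if depends_on(second, first):
        return [first.name], [second.name]
    return [first.name, second.name], []


i1 = Instr("i1", dest="R1", srcs=("A", "B"))
i2 = Instr("i2", dest="R2", srcs=("C", "D"))
i3 = Instr("i3", dest="R3", srcs=("R1", "C"))

independent_case = dispatch(i1, i2)   # both issue together
dependent_case = dispatch(i1, i3)     # i3 waits in the FDIF queue
```

The queued instruction would be released once the dependency clears; that release step is left out of this sketch.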
(3) ping-pong decoding
According to the definition of the internal very long instruction word (VLIW) register identifier FIF_FPP1, the FCC decoding component can select and respond to the instructions/data distributed by the first or second gate, which are input from the four symmetric storage modes in the first, second, third or fourth line mode, and perform ping-pong decoding to produce multiple combined decoding signals. Responding to the instructions/data distributed by the first gate is the "ping" decoding operation; responding to the instructions/data distributed by the second gate is the "pang" decoding operation. According to the definition of the internal VLIW register identifier FIF_FPP3, the "ping" and "pang" decoding operations can be switched either periodically or restrictively.
Periodic ping-pong decoding: within each cycle pulse period of the first clock signal, the instructions or data input in the first, second, third and fourth line modes and distributed by the first or second gate are selected cyclically, in order, for ping-pong decoding, as shown in Fig. 9e.
Restrictive ping-pong decoding: over a number of cycle pulse periods of the first clock signal, as indicated by the internal VLIW register identifier, the sequential instructions or data input on one fixed line mode and distributed by the first or second gate are selected for ping-pong decoding, until the internal VLIW register identifier or an external VLIW identifier format word requires a change of ping-pong state, as shown in Fig. 9f.
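The difference between the two switching policies can be illustrated with a small selection model. It is a sketch under assumptions: the function names are hypothetical, one line mode is selected per clock cycle, and a switch request is assumed to take effect in the cycle in which it arrives.

```python
from itertools import cycle, islice

# The four line modes named in the text.
LINES = ("first", "second", "third", "fourth")


def periodic_selection(n_cycles):
    """Periodic policy: rotate through the four line modes, one per clock cycle."""
    return list(islice(cycle(LINES), n_cycles))


def restrictive_selection(fixed_line, switch_requests):
    """Restrictive policy: stay on one line mode until a switch is requested.

    switch_requests holds one entry per clock cycle: None to stay on the
    current line, or the name of the line to switch to (assumed to model
    a change required by an internal or external identifier word).
    """
    current = fixed_line
    selected = []
    for request in switch_requests:
        if request is not None:
            current = request
        selected.append(current)
    return selected
```

With six cycles the periodic policy wraps around after the fourth line mode, whereas the restrictive policy keeps returning the same line mode until a request names another one.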
(4) three-dimensional decoding
According to the definition of the internal very long instruction word (VLIW) register identifier FIF_FPP2, the FCC decoding component can apply data gating control to the first or second gate, producing one-dimensional, two-dimensional or three-dimensional ping-pong decoding.
Within a cycle pulse period of the first clock signal, the first and second gates are both locked onto the instruction and data input of the first, second, third or fourth line mode of the same storage mode; the resulting decoding mode is called one-dimensional decoding, as shown in Fig. 9g. One-dimensional decoding can use serial operation of a single decoding unit or parallel operation of the two decoding units, in ping-pong or periodic manner, to produce control signals.
Within a cycle pulse period of the first clock signal, the first and second gates operate respectively on the instructions and data input in the first, second, third or fourth line mode of two selected storage modes; the resulting decoding mode is called two-dimensional decoding, as shown in Fig. 9h. Two-dimensional decoding can use parallel operation of the two decoding units, in ping-pong or periodic manner, to produce control signals.
Within a cycle pulse period of the first clock signal, the first and second gates operate respectively on the instructions or data input in the first, second, third or fourth line mode of two selected storage modes and on the instructions output by the internal functional components in the third line mode; the resulting decoding mode is called three-dimensional decoding, as shown in Fig. 9i. Three-dimensional decoding can use parallel operation of the two decoding units, in ping-pong or periodic manner, to produce control signals.
The characteristic of three-dimensional ping-pong decoding is that decoding is performed on several instruction input streams. In ping-pong decoding, the decoders make a periodic or restrictive selection among the instructions distributed by the first and second gates, which realizes decoding of multiple instruction streams. Three-dimensional decoding instead controls the first and second gates directly, and the gates select among the instruction streams input in the several line modes. Three-dimensional ping-pong decoding combines the two and realizes parallel decoding of multiple instruction streams. As shown in Fig. 9j, in cycle T1 decoder FCCB decodes the 128-bit VLIW input in the first line mode of the FD port, and decoder FCCP decodes the 128-bit VLIW input in the first line mode of the FZ port. In cycle T2 decoder FCCB decodes the 128-bit VLIW input in the first line mode of the FZ port, while decoder FCCP can decode a 128-bit VLIW that is generated by an internal arithmetic functional component, or produced by the internal super-instruction identification component FIF, or input in the third line mode. By combining three-dimensional decoding with ping-pong decoding, the FCC decoding component can complete the decoding of 256 bits of VLIW within the same memory cycle.
Table 9-1 STRC state control table
STRC<2:0> | State definition: data bus status | State definition: reset boot state | Signal source |
000 | Data buses used independently | Boot from the FDR port | Output of the STRC_REG register in the internal state identification component FIF |
001 | Data buses used independently | Boot from the FRR port | |
010 | Data buses merged | Boot from the FD port | |
011 | Reserved | Reserved | |
100 | Data buses used independently | Boot from the FZM port | |
101 | Data buses used independently | Boot from the FMM port | |
110 | Data buses merged | Boot from the FZ port | |
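Table 9-1 can be read as a simple lookup from the three STRC bits to a bus usage and a boot port. The sketch below encodes only the rows listed in the table; the names `STRC_TABLE` and `decode_strc` are hypothetical, and returning `None` for the reserved code 011 and for the unlisted code 111 is an assumption.

```python
# Rows of Table 9-1: STRC<2:0> -> (data bus usage, reset boot port).
STRC_TABLE = {
    0b000: ("independent", "FDR"),
    0b001: ("independent", "FRR"),
    0b010: ("merged", "FD"),
    0b100: ("independent", "FZM"),
    0b101: ("independent", "FMM"),
    0b110: ("merged", "FZ"),
}


def decode_strc(value):
    """Return (bus usage, boot port) for a 3-bit STRC value.

    Returns None for 011 (reserved in the table) and 111 (not listed);
    how the hardware treats those codes is not stated in the text.
    """
    if not 0 <= value <= 0b111:
        raise ValueError("STRC is a 3-bit field")
    return STRC_TABLE.get(value)
```

For instance, `decode_strc(0b010)` yields the merged-bus configuration booting from the FD port.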
Table 9-2 FPP1 three-dimensional ping-pong decoding mode control table
FPP1<1:0> | Decoding mode |
00 | Serial decoding |
01 | Parallel decoding |
10 | Ping-pong decoding |
11 | Reserved |
Table 9-3 FPP2 three-dimensional ping-pong decoding state control table
FPP2<1:0> | Decoding state control |
00 | One-dimensional decoding |
01 | Two-dimensional decoding |
10 | Three-dimensional decoding |
11 | Reserved |
Table 9-4 FPP3 periodic/restrictive decoding control table
FPP3<2:0> | Function |
1xx | Periodic ping-pong decoding |
000 | Restrictive decoding, first line |
001 | Restrictive decoding, second line |
010 | Restrictive decoding, third line |
011 | Restrictive decoding, fourth line |
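The three FPP fields can be decoded with small lookups, as sketched below. The field values follow Tables 9-2 to 9-4 as read from the source; the function and dictionary names are hypothetical and are not part of the described hardware.

```python
# Table 9-2: FPP1<1:0> -> decoding mode.
FPP1_MODES = {0b00: "serial", 0b01: "parallel", 0b10: "ping-pong", 0b11: "reserved"}

# Table 9-3: FPP2<1:0> -> decoding dimension.
FPP2_STATES = {
    0b00: "one-dimensional",
    0b01: "two-dimensional",
    0b10: "three-dimensional",
    0b11: "reserved",
}

LINES = ("first", "second", "third", "fourth")


def decode_fpp3(value):
    """Table 9-4: bit 2 set means periodic; otherwise bits 1:0 pick the fixed line."""
    if not 0 <= value <= 0b111:
        raise ValueError("FPP3 is a 3-bit field")
    if value & 0b100:
        return ("periodic", None)
    return ("restrictive", LINES[value & 0b11])


def decode_fpp(fpp1, fpp2, fpp3):
    """Combine the three fields into one readable configuration."""
    return {
        "mode": FPP1_MODES[fpp1],
        "dimension": FPP2_STATES[fpp2],
        "switching": decode_fpp3(fpp3),
    }
```

A call such as `decode_fpp(0b10, 0b10, 0b100)` describes three-dimensional ping-pong decoding with periodic switching.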
Fig. 10 is the architecture diagram supporting special-purpose and general-purpose microprocessor structures and high-level semantics
The macroinstruction set symmetrical parallel architecture supports special-purpose and general-purpose microprocessor structures and the high-level primitives of special-purpose and general-purpose computer high-level languages. As shown in Fig. 10, it comprises:
* four independent address port components, six address pointer bus components, four data port components, eight data bus components, four groups of register components with eight multi-bit registers in each group, and two symmetric decoding components that allow parallel and serial decoding;
* each component can accept data input in the first, second, third and fourth line modes and, in the second and third line modes, exchange data with the other components and forward output;
* the four independent address/data ports can be selected for different organizational structures and operating modes, as described in Fig. 3 and Fig. 5; the four groups of independent data ports and eight buses can be selectively defined for serial or parallel operation and for independent or merged use, as described in Fig. 5 and Fig. 6; the four groups of independent registers can form serial or parallel operating modes, and allow random parallel interconnection and serial stack-type interconnection with external registers, as described in Fig. 6;
* the very long instruction word (VLIW) control system consists of three parts: external hard-wired logic, internal registers, and external memory identifier words. In coordination with the port data/instruction I/O modes these parts act jointly and, after decoding by the decoders, produce multiple combined control signals that control the data paths, data transfer, data processing and algorithm execution of each individual component of this architecture, as described in Fig. 8 and Fig. 9;
* a functional component for data processing, algorithm execution logic and comparison/test. This component can be selectively defined for different operation sequences, controlling different data paths and changing its structure, organization and operating relations, as described in Fig. 4 and Fig. 7.
By selecting and defining different operating modes, logical relations, data paths and structural organizations, the system can support the structure of a general-purpose or special-purpose microprocessor and the high-level semantic, syntactic and pragmatic relations of high-level languages.
Through the operation described in Fig. 12, and in coordination with the instruction/data input mode, an external VLIW identifier control word can reload the internal identifier system as required and bring it into effect; the organizational structure, logical relations, operating modes, data paths, timing and so on can all be selectively defined as in Fig. 3, Fig. 4, Fig. 5, Fig. 6, Fig. 7, Fig. 8 and Fig. 9.
This architecture supports general-purpose microprocessor structures and the grammatical relations of infix notation, as shown in Fig. 10a.
(1) The first and second address port components and data port components are defined by the internal identifier registers as random-access storage mode and parallel I/O mode. The first address port component and data port component are mainly used for instruction/data I/O, and the second address port component and data port component are mainly used for data I/O.
(2) The generation of the address pointer of the second address port component is controlled by the address pointer generated by the first address port component. When an instruction controls the operation of the system, the data port operates in accordance with the instruction semantics; the address pointer is selected under the multiple combined control signals, and the I/O operation of the data port is carried out.
(3) The third address port component and data port component are used for the communication bus and are defined as random-access and parallel operating modes. These components are controlled by the multiple control signals produced jointly by the very long instruction word (VLIW) control system and the decoders, and can be directed to perform a data communication, an I/O operation, or other system control operations.
(4) The first, second and third register components are all selectively defined as parallel data I/O operating modes; their data control, processing and replacement are controlled by the multiple control signals produced jointly by the VLIW control system and the decoders.
(5) The fourth address port component and data port component are selectively defined as first-in last-out stack addressing mode and serial I/O mode. The fourth register component and the fourth address/data port components form a joint internal/external first-in last-out stack operating mode.
(6) The fourth address port component and data port component are controlled by the multiple control signals produced jointly by the VLIW control system and the decoders and by the random control generated by the system, so that they can control and save the data of the data paths, instruction breakpoints, system status and each internal register of this architecture.
(7) In coordination with the instructions input from the first storage mode and the data input from the second storage mode, and under the multiple combined signals, the data processing of the second, third and fourth storage mode I/O and the related dependent operations, calculation operations and control operations are controlled.
Thus this architecture can support split storage of the instruction operations and data operations formed from high-level semantics, with independent I/O operations and a single-stack control data structure, and can support instruction pipelining, instruction/data dependent operations, and the semantic, syntactic and pragmatic relations expressed in infix form.
This architecture supports special-purpose microprocessor structures and high-level primitives in postfix notation, as shown in Fig. 10b.
(1) The first and second address port components and data port components are defined by the internal identifier registers as random-access storage mode and parallel I/O mode. These ports become two independent instruction I/O systems; through the two symmetric internal instruction decoders, each can produce multiple combined control signals that control the components of the system separately or control one another.
(2) The third address port component and data port component are divided into two independent address and data operation ports, providing operation on another storage mode for the first and second address/data components.
(3) The fourth address port component and data port component are divided into two independent address and data operation ports, providing operation on another storage mode for the first and second address/data components.
(4) The third and fourth data port components and address/data port components form a joint internal/external first-in last-out stack operating mode.
(5) The first and second register components are selected as independent parallel operating modes.
(6) The first address port component and data port component of this architecture, together with the third and fourth address port components and data port components, carry out the multiple combined signal control generated from the instruction data input through the first address port component and data port component.
(7) The second address port component and data port component of this architecture, together with the other storage-mode pair of the third and fourth address port components and data port components, carry out the multiple combined signal control generated from the instruction data input through the second address port component and data port component.
(8) The first and second address port components and data port components are used to store the index dictionary and the target dictionary. The third address port component and data port component are used to store parameters and data expressed in reverse Polish notation. The fourth address port component and data port component perform system control, address pointer control, branch transfer control and the control of the various system program structures.
(9) In coordination with the instructions/data input from the first and second storage modes, and under the multiple combined control signals, interacting control is exercised: data processing control is applied to the third and fourth storage modes, and the delay operations, application operations and calculation operations required by the instruction/instruction and data/instruction interactions of the first and second storage modes are completed.
Thus this architecture can support a split dual-dictionary structure and the dual-stack syntactic and pragmatic relations expressed in postfix form.
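As a reminder of why postfix notation maps naturally onto a first-in last-out stack, the following is a minimal software sketch of a postfix evaluator. It is a generic textbook technique, not the patented hardware; the operator set and the function name `eval_postfix` are assumptions made for illustration.

```python
import operator

# Binary operators supported by this sketch (an arbitrary illustrative set).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(tokens):
    """Evaluate a postfix (reverse Polish) token list with a single stack.

    Operands are pushed; each operator pops its two operands (right
    operand first) and pushes the result, so no parentheses or
    precedence rules are needed.
    """
    stack = []
    for token in tokens:
        if token in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]
```

For example, the infix expression (3 + 4) * 2 is written `3 4 + 2 *` in postfix, and `eval_postfix("3 4 + 2 *".split())` evaluates it with three pushes and two operator steps.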
This architecture supports special-purpose microprocessor structures and high-level primitives in prefix notation, as shown in Fig. 10c.
(1) The first address port component and data port component are defined by the internal identifier registers as random-access storage mode and parallel I/O mode, and the first register component is selectively defined as parallel data operating mode.
(2) The second address port component and data port component are selectively defined as first-in last-out stack storage mode and serial I/O mode; the second register component and the second address port component and data port component form a joint internal/external first-in last-out stack operating mode.
(3) The third address port component and data port component are selectively defined as first-in first-out (FIFO) stack operating mode and parallel operating mode. The third and fourth register components are defined as serial operating mode.
(4) Using the multiple combined signals generated from the instruction data input through the first address port component and data port component, this architecture controls the second, third and fourth address port components and data port components to form data I/O storage modes, supporting the processing of a binary tree structure and the syntactic and pragmatic relations expressed in prefix form.
(5) In the architecture described in Fig. 8 and Fig. 9, data input in FILO or FIFO mode can be transferred to the instruction I/O mode, forming the intelligent high-level semantics in which data generate instruction operations.
As described in Fig. 8, high-level semantics are supported by designing the pipelined arrangement and optimized ordering of the external very long instruction word (VLIW) identifier words, which yields a code sequence word. After receiving and decoding this input, the system controls each component to perform the corresponding operation, that is: controlling dependent operations, using delayed operations to resolve redundant operation steps, processing data, executing multiple algorithms, testing system status and comparing data, which together constitute high-level behavioral semantics. This reflects the relationship between the operating modes, organization and structural features of the architecture and the macro-language primitives when high-level semantics are computed on general-purpose and special-purpose computers.
As described in Fig. 8f, consider a high-level semantic operation that performs conditional selection control by interval approximation with a one-half (bisection) algorithm. Under the selected definition of the general-purpose and special-purpose processor architecture, split instruction/data I/O operation is allowed, and an external compiler can place the split instruction and data structures in the first or second storage space.
As shown in Fig. 10d, the first address/data port receives the two identifier format sequence words of the VLIW system, and the second data port receives the identifier format words of the first and second instructions of subroutines A and B.
The data produced by the second address port component and data port component are controlled by the multiple combined signals generated from the instruction/data input through the first address port component and data port component. The algorithm performs interval approximation and implements the primitive for conditional selection control. It covers the operational demands of many kinds of human behavior: conditional operation; calculation; data access; program transfer; data exchange; serial computation; reuse of the constituent functional components; the redundancy created by interval comparison; delayed selection, after the test, of the instruction/data input from the second storage mode; and, once the branch path in the program has been determined, parallel operation of the dual data ports. These demands are arranged efficiently in the external VLIW control identifier words.
As shown in Fig. 10d, the first and second instructions at the entry of each of the two called subroutines are compiled into the data memory managed by the second address component, while their main bodies, starting from the third instruction, are compiled into the data memory managed by the first address component; this forms dual-port parallel instruction/data input and processing.
As shown in Fig. 10d-1, in coordination with the instruction/data I/O mode of the first address/data port component, the first format of the external very long instruction word (VLIW) identifier word sequence is input to the decoder in the first cycle; at the same time, the first instruction of candidate subroutine A is input to the decoder through the second address/data port component. In the first clock cycle the combined control signal automatically executes the condition algorithm and performs the test of C against the interval [A, B]. At the same time it directs that the data of subroutine B be read from the first data port under the address pointer of the second address/data port component, directs the second address/data port to read the first instruction of subroutine B, and delays execution of the first instruction of subroutine A.
At the rising edge of the second clock cycle, the second format of this instruction is obtained from the first port, together with the first instruction of subroutine B from the second data port. The result of the condition selection is obtained in the second cycle. According to that result, the encoded control signal outputs the transfer address of the selected subroutine to the first address/data port component at the rising edge of the third clock cycle, and in the second cycle executes the first instruction of the selected subroutine A or B together with the instruction operation combined into the second identifier format of the VLIW format word. That is, it completes the calculation A=(A+C)/2 or MAXMIN(A, B, C), outputs the entry address of the selected subroutine A or B to the address/data port as the transfer branch operation, and executes either the delayed first instruction of subroutine A input in the first cycle or the first instruction of subroutine B input in the second cycle.
At the rising edge of the third clock cycle, a data item of subroutine B is read in from the first data port and, at the same time, the second instruction of subroutine A is obtained from the second data port. In the third cycle, according to the condition result, the first address port component and data port component are controlled to form the sequential address pointer, the second address port component and data port component read the second instruction of subroutine B, and at the same time either subroutine B processes its input data or subroutine A performs its data pointer correction together with its second instruction. Instruction operations not selected by the condition test result are discarded.
At the rising edge of the fourth clock cycle, the third instruction of the selected subroutine is read into the decoder through the first data port, and the second instruction of subroutine B is read into the decoder through the second address/data port component; in the fourth cycle, the third instruction from the first data port and the second instruction from the second data port are executed.
In three clock cycles this macro primitive completes the following. When the test condition is satisfied, 9 instructions are executed: condition handling, fetching the second format word, condition test, data fetch, transfer address, data processing, interval calculation, execution of the first instruction of subroutine A, and fetching the second instruction. When the test condition is not satisfied, 10 instructions are executed: condition handling, fetching the second format word, condition test, fetching the first instruction of subroutine B, transfer address control, interval calculation, pointer calculation, fetching the second instruction, and execution of the first and second instructions. This realizes externally optimized, artificial-intelligence-style code design and pipeline arrangement, so that the structure and operation of the system meet the semantic, syntactic and pragmatic requirements of high-level languages and improve application efficiency, executing more than three macro-language primitives in each clock cycle.
The VLIW control system of this architecture extends from internal to external control and provides parallel multiprocessing operation. It supports the pragmatic relations of special-purpose and general-purpose computer data structures and the grammatical relations of macro-language primitives, so that behavioral operational semantics are reflected directly as computer operations.
Fig. 11 is the operation timing diagram
The macroinstruction set symmetrical parallel architecture microprocessor uses four system clock operation timings:
* the first timing clock signal CLK;
* the second timing clock signal CLK1;
* the third timing clock signal CLK3;
* the fourth timing clock signal CLK4.
The basic characteristics of these four system operation timings are:
* The CLK clock signal is a periodic clock whose ratio of high level to low level is 1:1.
* The CLK1 clock signal is a periodic clock whose ratio of high level to low level is 1:3; CLK1 is active high and is kept synchronous with CLK.
* The CLK3 clock signal is a periodic clock whose ratio of high level to low level is 1:3; the active phase of the CLK3 high level differs from the CLK clock by 135°.
* The CLK4 clock signal is a periodic clock whose ratio of high level to low level is 1:3; the active phase of the CLK4 high level differs from the CLK clock by 270°.
The timing relationship of the four system clock signals is shown in Fig. 11.
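The stated ratios and phase offsets can be visualized with a small sampled-waveform sketch. Two assumptions are made here that the text does not confirm: all four clocks are taken to share one period, and a phase difference is modeled as a delay of the rising edge relative to CLK. The function name `waveform` and the choice of eight samples per period are likewise illustrative.

```python
def waveform(high, low, phase_deg=0, slots=8):
    """Sample one clock period into `slots` values (1 = high, 0 = low).

    high:low is the ratio of high time to low time, and phase_deg delays
    the rising edge within the period (assumed interpretation).
    """
    if (slots * high) % (high + low) or (slots * phase_deg) % 360:
        raise ValueError("slots is too coarse for this ratio or phase")
    n_high = slots * high // (high + low)
    shift = slots * phase_deg // 360
    base = [1] * n_high + [0] * (slots - n_high)
    # Rotate the base pattern right by `shift` samples.
    return [base[(i - shift) % slots] for i in range(slots)]


CLK = waveform(1, 1)         # 1:1 ratio
CLK1 = waveform(1, 3)        # 1:3 ratio, synchronous with CLK
CLK3 = waveform(1, 3, 135)   # 1:3 ratio, 135 degrees after CLK
CLK4 = waveform(1, 3, 270)   # 1:3 ratio, 270 degrees after CLK
```

Under these assumptions the high phases of CLK1, CLK3 and CLK4 never overlap, which is consistent with each clock controlling a distinct group of operations within one period.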
The four system operation timing clock signals described above control the operation of each internal functional component of the microprocessor and the latching of data. Their main functions are as follows:
(1) first timing clock signal CLK
* data latch control of the input data register of each port for input in the first line mode;
* output enable control of the output data of each port for output in the fourth line mode;
* control of the data output of each internal state and internal register in the third line mode.
(2) second timing clock signal CLK1
* control of the updating of each internal state and internal register in the second line mode;
* decoding timing control of the three-dimensional ping-pong decoding component;
* operand gating control for starting the arithmetic unit in the third line mode.
(3) third timing clock signal CLK3
* timing control of interrupt sampling and response;
* timing control of interrupt context protection and update.
(4) fourth timing clock signal CLK4
* address pointer output control timing;
* synchronous/asynchronous address operation control timing;
* communication request/acknowledge control timing.
Fig. 12 is the system reset and initialization diagram
The reset and initialization process of the macroinstruction set symmetrical parallel architecture microprocessor is as follows:
(1) the reset signal becomes active, indicating that the system reset period has begun;
(2) the external hard-wired identifier logic selects and defines the initial state, operating mode and organizational structure of the microprocessor;
(3) the internal state identifier registers are initialized according to the definition of the external hard-wired identifier logic;
(4) each internal functional component is initialized in the fourth line mode according to the definition of the internal state identifiers;
(5) according to the state identifier definition of the internal identifier registers, the address component forms the initial address, which is selected and output to the designated address port;
(6) according to the state identifier definition of the internal identifier registers, the designated storage mode is set to the corresponding storage wait cycle and the designated port is set to the input state, waiting for the first instruction to be read and executed;
(7) the first instruction is input from the designated data port, sent to the decoding component in the first line mode, and decoded and executed.
As shown in Fig. 12, the whole reset operation is controlled by the system reset control component FRST, and all reset initialization operations are performed in the fourth line mode. As shown in Fig. 12a, during the system reset period the FRST component generates a series of reset cycle control signals RST1, RST2, RST3 and RST4 to control the initialization of all internal components.
The external hard-wired logic can set the initial reset state of the microprocessor:
* selection of the data bus usage mode
The definition of the external hard-wired logic input STRC_PIN<2:0> selects whether the data buses of the four symmetric data port components of the microprocessor use the independent connection mode or the merged connection mode, as shown in table 7-1.
* identification of the reset address port and data port
The definition of the external hard-wired logic input STRC_PIN<2:0> selects one of the four symmetric data port components of the microprocessor as the reset boot port and causes the reset address to be sent from this port so that the first instruction can be read and executed, as shown in table 12-1.
* identification of the storage operation timing
The definition of the external hard-wired logic inputs ASC_PIN and ASCd_PIN selects whether the eight data buses and six address buses of the four symmetric data port components of the microprocessor use synchronous or asynchronous storage timing, as shown in table 12-1.
* identification of the ping-pong decoding line
The definition of the external hard-wired logic input FDP_PIN<1:0> selects the first, second, third or fourth line mode of the decoding component of the microprocessor as the first decoding line mode, as shown in table 12-2.
* identification of the storage operation mode
The definition of the external hard-wired logic inputs PS_PIN and FIFO_PIN selects whether the four symmetric data port components of the microprocessor operate in random-access storage mode, first-in first-out (FIFO) storage mode or first-in last-out (FILO) stack storage mode, as shown in table 12-3.
* identification of the reset address
The definition of the external hard-wired logic input IA_PIN selects whether the reset initialization program of the microprocessor is stored at the high end (at FFFF) or the low end (at 0000) of the execution memory, so that after reset the microprocessor forms the corresponding address and reads the reset initialization instruction sequence, as shown in table 12-4.
As shown in Fig. 12a, according to the definition of the external hard-wired logic inputs and following the reset cycle signals RSTn generated by the FRST component, the macroinstruction set symmetrical parallel architecture microprocessor completes the initialization of the whole hardware system in four cycles:
(1) cycle T0: the system reset period becomes active
When the reset signal RESET input from the external hard-wired logic is active (low), the status registers of all internal data registers are cleared and held in that state. At the same time the FCLK clock generation circuit is reset and produces the system working clocks: the first, second, third and fourth timing clock signals. At the rising edge of the first pulse of the first timing clock signal CLK, the system reset component FRST makes the system-reset-period-valid signal CRS active (high), indicating that the system reset period has begun.
(2) cycle T1: first reset cycle
At the rising edge of the first pulse of the first timing clock signal CLK after the RST signal becomes active (high), the system reset component FRST sets the RST1 signal active (high), indicating that the system has begun the first reset cycle. In the first reset cycle the initial value of each internal state register is set according to the definition of the external hard-wired logic inputs, where:
* FDR=0, FRR=0, FZM=0, FMM=0: clear each port data register
* IL=0: clear the instruction latch
* PSR=0: clear the processor status register
* CSR=0: clear the communication state register
* INTR=0: clear the interrupt control register
* MULR=0: set the memory upper-limit address to address 0
* MDLR=FFFFFFFH: set the memory lower-limit address to the maximum possible address
* UDLR=FFFFFFFH: set the user lower-limit address to the maximum possible address
* MPNR=0: set memory paging to 256*64bit per page
* PEMG=0: set the page table state to invalid
* OK=1: set the system supervisor mode
* YPM=1: set the absolute-address addressing mode
* set the system big-endian/little-endian state FIF_BE according to the externally hardwired logic
* set the boot port state FIF_STRC and the ping-pong decoding unit state FIF_FDP according to the externally hardwired logic inputs STRC_PIN<2:0> and FDP_PIN
* set the data/address timing states FIF_ASC and FIF_ASCd according to the externally hardwired logic inputs ASC_PIN and ASCd_PIN
* set the storage mode states FIF_FILO and FIF_PS according to the externally hardwired logic inputs FILO_PIN and PS_PIN
* set the initial reset address value according to the externally hardwired logic input IA_PIN, that is, set the TH and TL registers to the initial value FFFF or 0000.
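The register values listed above can be collected into a small sketch. This is only an illustration of the list, not the hardware itself. The function name `t1_register_state`, the `pins` dictionary, and the flag `reset_at_high_end` (standing for whichever IA_PIN level selects the high end) are hypothetical. The sketch also assumes that each FIF_* state is a direct copy of its pin level, which the text does not state. FIF_BE is left out because the text does not name its pin.

```python
MAX_ADDR = 0xFFFFFFF  # FFFFFFFH, the "maximum possible address" in the list above


def t1_register_state(pins, reset_at_high_end):
    """Return a dict of register values as they stand at the end of cycle T1.

    pins: hypothetical mapping of pin name -> sampled level.
    reset_at_high_end: hypothetical flag for the IA_PIN selection.
    """
    # Fixed initial values taken from the list in the text.
    state = {
        "FDR": 0, "FRR": 0, "FZM": 0, "FMM": 0,   # port data registers
        "IL": 0, "PSR": 0, "CSR": 0, "INTR": 0,   # latch, status, interrupt
        "MULR": 0, "MDLR": MAX_ADDR, "UDLR": MAX_ADDR,
        "MPNR": 0, "PEMG": 0, "OK": 1, "YPM": 1,
    }
    # Assumption: each FIF_* state field simply copies its hardwired pin.
    state["FIF_STRC"] = pins["STRC_PIN"] & 0b111   # STRC_PIN<2:0>
    state["FIF_FDP"] = pins["FDP_PIN"] & 0b11      # FDP_PIN<1:0>
    state["FIF_ASC"] = pins["ASC_PIN"] & 1
    state["FIF_ASCd"] = pins["ASCd_PIN"] & 1
    state["FIF_FILO"] = pins["FILO_PIN"] & 1
    state["FIF_PS"] = pins["PS_PIN"] & 1
    # TH and TL hold the initial reset address value, FFFF or 0000.
    initial = 0xFFFF if reset_at_high_end else 0x0000
    state["TH"] = initial
    state["TL"] = initial
    return state
```

Calling `t1_register_state` with all pins at 0 and `reset_at_high_end=True` gives TH and TL equal to FFFF while every cleared register stays at 0.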
(3) Cycle T2: the second reset cycle
At the rising edge of the first first-timing clock signal CLK after the RST1 signal becomes active (high), the system reset component FRST drives the RST2 signal active (high) and at the same time drives the RST1 signal inactive (low), indicating that the first reset cycle has ended and the second reset cycle has begun. In the second reset cycle the TH and TL registers output their data to each address component in the fourth line mode, resetting the address pointers PC, SP, RP and YPC.
(4) Cycle T3: the third reset cycle
At the rising edge of the first first-timing clock signal CLK after the RST2 signal becomes active (high), the system reset component FRST drives the RST3 signal active (high) and at the same time drives the RST2 signal inactive (low), indicating that the second reset cycle has ended and the third reset cycle has begun. In the third reset cycle the TH and TL registers are cleared in preparation for resetting the other internal data registers.
(5) Cycle T4: the fourth reset cycle
At the rising edge of the first first-timing clock signal CLK after the RST3 signal becomes active (high), the system reset component FRST drives the RST4 signal active (high) and at the same time drives the RST3 signal inactive (low), indicating that the third reset cycle has ended and the fourth reset cycle has begun. In the fourth reset cycle the TH and TL registers deliver their data, in the fourth line mode, to each internal data register file and internal register, resetting them to the value zero.
(6) Reset end cycle
At the rising edge of the first first-timing clock signal CLK after the RST4 signal becomes active (high), the system reset component FRST drives the RST4 signal inactive (low) and determines from the externally hardwired RESET signal whether all peripheral devices have finished their reset operations. While the RESET signal is still active (low), the FRST component keeps the CRS signal active (high), holding the microprocessor in a quiescent state while it waits for the other peripheral devices or coprocessors to finish resetting. At the rising edge of the first first-timing clock signal CLK after the RESET signal becomes inactive (high), the system reset component FRST drives the CRS signal inactive (low) and opens the reset boot port specified by the externally hardwired logic input. The reset address is sent out from this port, and the data bus is then placed in the input state so that the first instruction can be read and passed to the decoding unit for execution.
At this point the microprocessor hardware system has completed the entire reset process and proceeds to the software system boot and initialization process.
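The reset sequence described above can be sketched as a small simulation that advances one step per rising edge of CLK. This is a behavioral illustration only. The function name `reset_trace`, the phase labels "END" and "RUN", and the idea of passing RESET as a list of levels sampled at each edge are assumptions made for the sketch, not part of the described hardware.

```python
# Phase order from the text; "END" is the reset end cycle, during which
# CRS stays high until the external RESET input is released.
ORDER = ["T0", "T1", "T2", "T3", "T4", "END"]
ACTIVE = {"T1": "RST1", "T2": "RST2", "T3": "RST3", "T4": "RST4"}


def reset_trace(reset_levels):
    """Return one (phase, signals) entry per CLK rising edge.

    reset_levels: RESET pin level at each rising edge (0 = active, low).
    """
    trace = []
    idx = -1  # -1 means the reset sequence has not started yet
    for reset in reset_levels:
        if idx == -1:
            if reset == 1:
                continue          # RESET not asserted: nothing happens
            idx = 0               # RESET low: enter T0, CRS goes high
        elif idx < len(ORDER) - 1:
            idx += 1              # T0 -> T1 -> ... -> T4 -> END
        elif reset == 1:
            # RESET released during END: CRS goes low, boot can start.
            signals = {"CRS": 0, "RST1": 0, "RST2": 0, "RST3": 0, "RST4": 0}
            trace.append(("RUN", signals))
            break
        phase = ORDER[idx]
        signals = {"CRS": 1, "RST1": 0, "RST2": 0, "RST3": 0, "RST4": 0}
        if phase in ACTIVE:
            signals[ACTIVE[phase]] = 1   # exactly one RSTn high in T1..T4
        trace.append((phase, signals))
    return trace
```

In this model only one RSTn signal is high in each of T1 to T4, and CRS stays high from T0 until the first edge after RESET returns high, which matches the hand-over from each cycle to the next in the text.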
Table 12-1 Timing control of storage operations
ASCd_PIN, ASC_PIN | Function |
0 | Synchronous storage operation |
1 | Asynchronous storage operation |
Table 12-2 Ping-pong decoding line identification
FDP_PIN<1:0> | Function |
00 | The first line mode is the first decoding line |
01 | The second line mode is the first decoding line |
10 | The third line mode is the first decoding line |
11 | The fourth line mode is the first decoding line |
Table 12-3 Storage operation mode identification
PS_PIN | FIFO_PIN | Function |
0 | 0 | FILO storage operation mode |
0 | 1 | FIFO storage operation mode |
1 | x | Random storage operation mode |
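Reading the rows of Table 12-3 as (PS_PIN, FIFO_PIN) = (0, 0) for FILO, (0, 1) for FIFO and (1, x) for random access, the selection can be sketched as follows. The function name `storage_mode` and the returned strings are hypothetical and serve only to illustrate the table.

```python
def storage_mode(ps_pin, fifo_pin):
    """Decode the storage operation mode from the two hardwired pins.

    ps_pin = 1 selects random access regardless of fifo_pin (the "x" row);
    otherwise fifo_pin chooses between FILO (0) and FIFO (1).
    """
    if ps_pin == 1:
        return "random"
    return "FIFO" if fifo_pin == 1 else "FILO"
```

Checking PS_PIN first mirrors the don't-care entry in the table: once random access is selected, the level of the other pin has no effect.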
Table 12-4 Initial address identification
IA_PIN | Function |
0 | Reset at the low-end address 000000H of the memory |
1 | Reset at the high-end address FFFFFFH of the memory |