CN116755923B - Single event upset resistant memory architecture FPGA - Google Patents

Info

Publication number
CN116755923B
Authority
CN
China
Prior art keywords
memory
check
data
register
parity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310774573.9A
Other languages
Chinese (zh)
Other versions
CN116755923A (en
Inventor
单悦尔
张艳飞
徐彦峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Zhongwei Yixin Co Ltd
Original Assignee
Wuxi Zhongwei Yixin Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Zhongwei Yixin Co Ltd
Priority to CN202310774573.9A
Publication of CN116755923A
Application granted
Publication of CN116755923B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G06F11/1032 Simple parity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/7814 Specially adapted for real time processing, e.g. comprising hardware timers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/7821 Tightly coupled to memory, e.g. computational memory, smart memory, processor in memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The application discloses a single event upset resistant memory architecture FPGA and relates to the technical field of FPGAs. Resource modules located in the same sub-region of the memory architecture FPGA are connected through interconnection resources inside the FPGA to implement a memory computing unit. Registers in the memory computing unit are replaced by parity registers, and a check circuit is implemented using the resource modules. During a memory computing operation the check circuit checks the parity registers; when the check determines that register data is erroneous, the memory computing unit is promptly triggered to execute the memory computing operation again, which avoids operation errors caused by transient errors due to single event upsets. Operation accuracy can therefore be ensured when parallel multi-core memory computing is implemented through multiple memory computing units, and the FPGA offers high data processing efficiency, operation speed and operation reliability.

Description

Single event upset resistant memory architecture FPGA
Technical Field
The application relates to the technical field of FPGA (field programmable gate array), in particular to a single event upset resistant memory architecture FPGA.
Background
With the development of new technologies such as the Internet of Things, cloud computing and artificial intelligence, data-intensive applications implemented on hardware platforms are also growing rapidly. The FPGA (Field Programmable Gate Array) is a commonly used hardware platform because of its parallelism and reconfigurability.
Existing FPGAs, like most conventional hardware platforms, adopt the von Neumann architecture. An FPGA contains a large number of resource modules such as CLBs, DSPs and BRAMs, and these resource modules are connected by highly configurable interconnection lines to form the computing units and storage units of the FPGA. BRAM is the main on-chip storage resource: on-chip data is mainly stored in BRAM and is transferred to a computing unit when a computation is performed.
Therefore, like other hardware platforms that adopt the von Neumann architecture, an existing FPGA must repeatedly transfer large amounts of data between the storage units and the computing units while executing computing tasks. This frequent, large-volume data movement not only consumes a large amount of wiring and interconnection resources but also causes considerable delay and energy loss, which limits the data processing efficiency of the FPGA.
Disclosure of Invention
In view of the above problems and technical requirements, the inventors propose a single event upset resistant memory architecture FPGA. The technical solution of the application is as follows:
The application provides a single event upset resistant memory architecture FPGA. The memory architecture FPGA comprises a plurality of resource modules arranged according to a preset structure, interconnection resources arranged around the resource modules, and a global read-write channel implemented by additional hardware resources. The resource modules located in the same sub-region are connected through the interconnection resources inside the FPGA to implement one memory computing unit. Each memory computing unit comprises a processor and a local storage unit; the local storage unit stores the instructions and data of the memory computing unit, and the processor is connected to the local storage unit of the same memory computing unit through interconnection resources and operates according to the instructions and data stored there. The resource modules in different sub-regions implement a plurality of memory computing units, and the local storage unit in each memory computing unit exchanges data with the outside of the FPGA chip through the global read-write channel;
at least one memory computing unit further comprises a check circuit, and at least one register in that memory computing unit is a parity register comprising data bits and check bits. While the memory computing unit executes a memory computing operation, the check circuit in the memory computing unit performs a data check on the parity register; when the check fails, it determines that the register data of the parity register is erroneous and triggers the memory computing unit to execute the current memory computing operation again.
When a memory computing unit executes a memory computing operation, the memory computing instruction and the initial data are written into the local storage unit of the memory computing unit through the global read-write channel, the processor in the memory computing unit executes the memory computing operation on the initial data according to the memory computing instruction, and the memory computing instruction and the initial data stored in the local storage unit remain unchanged while the operation is executed;
when the register data of a parity register is determined to be erroneous, the processor in the memory computing unit executes the memory computing operation again on the initial data according to the memory computing instruction.
A further technical solution is that the check circuit comprises a check code generator and a check code checker, and while the memory computing unit executes a memory computing operation:
when register data is written into the data bits of a parity register, the check code generator generates the corresponding check code from the register data and writes it to the check bits;
the check code checker performs a data check on the contents of the data bits and check bits of the parity register, reads the register data from the data bits when the check succeeds, and triggers the memory computing unit to execute the current memory computing operation again when the check fails.
A further technical solution is that one memory computing unit comprises a plurality of parity registers, and the memory computing unit is triggered to execute the current memory computing operation again when the check circuit determines that the register data of at least one parity register is erroneous.
When one memory computing unit comprises a plurality of parity registers, the check circuit comprises a plurality of check device groups, each parity register corresponds to one check device group, and each check device group comprises a check code generator and a check code checker; when at least one check code checker determines that the check has failed, the memory computing unit is triggered to execute the current memory computing operation again.
A further technical solution is that, when one memory computing unit comprises a plurality of parity registers:
at least two parity registers share the same check code generator, which generates a check code from the register data of each of these parity registers and writes it to the corresponding check bits, and parity registers that share a check code generator are not written at the same time;
and/or at least two parity registers share the same check code checker, which checks the register data of each of these parity registers according to the contents of its data bits and check bits, and parity registers that share a check code checker are not read at the same time; when at least one check code checker determines that the check has failed, the memory computing unit is triggered to execute the current memory computing operation again.
A further technical solution is that the parity register comprises K data bits and 1 check bit, and the check circuit checks the parity register using a parity check method:
when register data is written to the data bits of the parity register, the check code generator generates check code 1 and writes it to the check bit if the binary K-bit register data contains an even number of 1s, and generates check code 0 and writes it to the check bit if the data contains an odd number of 1s; the check code checker determines that the check succeeds when the K+1 bits of the parity register contain an odd number of 1s, and that the check fails when they contain an even number of 1s;
or, when register data is written to the data bits of the parity register, the check code generator generates check code 0 and writes it to the check bit if the binary K-bit register data contains an even number of 1s, and generates check code 1 and writes it to the check bit if the data contains an odd number of 1s; the check code checker determines that the check succeeds when the K+1 bits of the parity register contain an even number of 1s, and that the check fails when they contain an odd number of 1s.
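The two alternatives above can be written compactly with exclusive-OR. The symbols $d_0,\dots,d_{K-1}$ for the data bits and $p$ for the check bit are notation introduced here for illustration and do not appear in the source.

```latex
% Odd-parity scheme (first alternative): the K+1 stored bits hold an odd number of 1s.
p_{\text{odd}} = \overline{d_0 \oplus d_1 \oplus \cdots \oplus d_{K-1}},
\qquad
\text{check succeeds} \iff d_0 \oplus d_1 \oplus \cdots \oplus d_{K-1} \oplus p_{\text{odd}} = 1

% Even-parity scheme (second alternative): the K+1 stored bits hold an even number of 1s.
p_{\text{even}} = d_0 \oplus d_1 \oplus \cdots \oplus d_{K-1},
\qquad
\text{check succeeds} \iff d_0 \oplus d_1 \oplus \cdots \oplus d_{K-1} \oplus p_{\text{even}} = 0
```

As a worked case with K = 4 and register data 1011 (three 1s), the odd-parity scheme stores check bit 0 and the even-parity scheme stores check bit 1. In either scheme, flipping any single one of the five stored bits inverts the exclusive-OR of all five and so makes the check fail.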
A further technical solution is that each check code checker has a corresponding failure flag; each check code checker keeps its failure flag at the invalid signal when the check succeeds and sets it to the valid signal when the check fails;
when a failure flag switches from the invalid signal to the valid signal, an interrupt signal is generated and sent to the processor in the same memory computing unit; on receiving the interrupt signal, the processor re-executes the current memory computing operation and resets all failure flags to the invalid signal.
A further technical solution is that the valid signal of the failure flag generates the interrupt signal after a predetermined delay and sends it to the processor in the same memory computing unit.
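The flag-and-interrupt behaviour described above can be modelled in software as follows. This is a minimal sketch under stated assumptions, not the patented circuit: the class name `InterruptLogic`, its methods, the cycle-based `tick()` model and the default delay of two cycles are all hypothetical, since the source only states that the delay is "predetermined".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InterruptLogic:
    """Hypothetical model: one failure flag per checker, interrupt raised after a delay."""

    num_checkers: int
    delay_cycles: int = 2  # assumed value; the source only says "predetermined"
    flags: List[bool] = field(default_factory=list)
    _countdown: int = -1  # -1 means no interrupt is pending

    def __post_init__(self) -> None:
        # All failure flags start at the invalid (False) level.
        self.flags = [False] * self.num_checkers

    def report(self, checker: int, check_failed: bool) -> None:
        """A checker leaves its flag invalid on success and sets it valid on failure."""
        if check_failed and not self.flags[checker]:
            self.flags[checker] = True
            if self._countdown < 0:  # the first invalid-to-valid edge starts the delay
                self._countdown = self.delay_cycles

    def tick(self) -> bool:
        """Advance one clock cycle; return True in the cycle the interrupt fires."""
        if self._countdown < 0:
            return False
        if self._countdown == 0:
            self._countdown = -1
            return True
        self._countdown -= 1
        return False

    def acknowledge(self) -> None:
        """Processor side: on re-execution, reset every failure flag to invalid."""
        self.flags = [False] * self.num_checkers
        self._countdown = -1
```

With `delay_cycles=2`, a failure reported by any checker makes `tick()` return `True` on the third call after the report; the processor model would then restart the operation and call `acknowledge()`.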
A further technical solution is that all registers in the memory computing unit are parity registers, or only some of the registers are parity registers.
The beneficial technical effects of the application are as follows:
The application discloses a single event upset resistant memory architecture FPGA in which resource modules in the same sub-region are connected through interconnection resources inside the FPGA to implement a memory computing unit. Registers in the memory computing unit are replaced by parity registers, and the parity registers are checked by a check circuit. When the check determines that register data is erroneous, the memory computing unit is promptly triggered to execute the memory computing operation again, which avoids operation errors caused by transient errors due to single event upsets. Operation reliability is thereby improved when parallel multi-core memory computing is implemented through multiple memory computing units, and the FPGA offers high data processing efficiency, operation speed and operation reliability. The memory architecture FPGA is mainly suited to conditions in which radiation particles are few and of short duration, so that transient errors caused by single event upsets are sparse. While the memory architecture FPGA runs parallel memory computing on multiple memory computing units, a detected register data error generates an interrupt that stops the operation and eliminates the effect of the error, so the overall flow of the memory computing operation is not greatly affected.
The processor and the local storage unit of each memory computing unit formed in the FPGA lie in the same sub-region, and during execution the processor only needs to read and write data in the adjacent local storage unit. This reduces the wiring and interconnection resources consumed by data transfer between the processor and the local storage unit, lowers the data transfer delay and energy consumption, and improves the data processing efficiency of the FPGA. In addition, the FPGA can perform multi-core parallel computation through a plurality of memory computing units. The processors of the memory computing units do not conflict with one another, each unit can execute at full speed without being limited by memory bandwidth, and the overall data processing efficiency and computation speed of the FPGA are further improved.
Drawings
Fig. 1 is a schematic diagram of the layout of the resource modules in the memory architecture FPGA and of the sub-regions obtained by division, in one example of the application.
Fig. 2 is a schematic circuit diagram of the memory computing units implemented based on Fig. 1, taking as an example the case in which the instruction register in each memory computing unit is a parity register.
Fig. 3 is a schematic diagram of the connection of a check circuit to a parity register and a processor in one embodiment of the application.
Fig. 4 is a schematic diagram of a connection of a check circuit to a plurality of parity registers and a processor in one embodiment of the application.
Fig. 5 is a schematic diagram of another connection of a check circuit to a plurality of parity registers and a processor in one embodiment of the application.
Fig. 6 is another circuit configuration diagram of the local memory units of the respective memory units of fig. 2 connected to the outside of the FPGA chip.
Fig. 7 is another circuit configuration diagram of the local memory units of the respective memory units of fig. 2 connected to the outside of the FPGA chip.
Fig. 8 is another circuit configuration diagram of the local memory units of the respective memory units of fig. 2 connected to the outside of the FPGA chip.
Detailed Description
The following describes the embodiments of the present application further with reference to the drawings.
The application discloses a single event upset resistant memory architecture FPGA, which comprises a plurality of resource modules arranged according to a preset structure and interconnection resources arranged around the resource modules. The preset structure can be chosen according to actual requirements. Taking the currently mainstream Column-Based structure as an example, the resource modules are arranged in rows and columns to form a two-dimensional array in which every column contains resource modules of the same type and every resource module is surrounded by interconnection resources. The types of resource modules in the FPGA include configurable logic blocks (CLB), configurable memory blocks (BRAM) and configurable multiply-add blocks (DSP). The CLB contains units such as lookup tables and registers and can be configured to implement various logic functions. The DSP contains an arithmetic unit that can perform operations such as multiplication, addition and multi-bit logic. The BRAM contains block memory cells and can implement single-port and dual-port memories and FIFO memories of various widths and depths. This part is the internal structure of existing FPGAs and is not described further in the application.
All resource modules arranged according to the preset structure in the memory architecture FPGA are divided into a plurality of sub-regions, and each sub-region contains a certain number of resource modules. The resource modules in the same sub-region are connected through interconnection resources inside the FPGA to implement one memory computing unit, and the resource modules in different sub-regions implement a plurality of memory computing units.
Each memory computing unit includes its own processor and local storage unit, and the processor is connected to the local storage unit of the same memory computing unit through interconnection resources. The local storage unit stores the instructions and data of the memory computing unit, and the processor operates according to the instructions and data stored in the connected local storage unit.
In one embodiment, the two memory computing units implemented by the resource modules in any two sub-regions use the same or different instruction sets, and two memory computing units that use the same instruction set have the same or different bit widths. In the application, the memory computing units formed in the memory architecture FPGA may implement various typical general-purpose 8-bit processors, including RISC, 8051 and 6802.
In one embodiment, any two sub-regions obtained by division contain the same number and types of resource modules; in another embodiment, any two sub-regions differ in at least one of the number and the types of resource modules they contain. That is, the sub-regions may or may not be identical. Moreover, either every resource module in the memory architecture FPGA belongs to a sub-region, or only some of the resource modules are divided into sub-regions; that is, the memory architecture FPGA can be divided into sub-regions globally or locally. For example, in the example shown in Fig. 1, the resource modules in the FPGA are divided into 8 sub-regions, each containing the same number and types of resource modules, and the area outlined by each dashed box represents one sub-region.
The application does not limit the number and types of resource modules contained in each sub-region, provided they are sufficient to implement the devices required by one memory computing unit. The processor in a memory computing unit is implemented mainly with DSPs, the local storage unit is implemented mainly with BRAMs, and a certain number of CLBs are also needed.
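The division into sub-regions can be illustrated with a small software model. This is only a sketch under assumptions: the grid size, the column types in `COLUMN_TYPES`, the rectangular region shape and the minimum resource counts in `NEED` are hypothetical values chosen so that the grid splits into 8 identical sub-regions; the source does not give the dimensions of Fig. 1.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical column-based layout: every column holds a single resource type.
COLUMN_TYPES: List[str] = ["CLB", "CLB", "BRAM", "DSP", "CLB", "CLB", "BRAM", "DSP"]
ROWS = 8  # assumed number of rows in the array

# Assumed minimum resources for one processor plus local storage unit.
NEED: Dict[str, int] = {"DSP": 1, "BRAM": 1, "CLB": 2}


def partition(rows: int, column_types: List[str],
              region_rows: int, region_cols: int) -> Dict[Tuple[int, int], Counter]:
    """Split the array into rectangular sub-regions and count the modules in each."""
    regions: Dict[Tuple[int, int], Counter] = {}
    for r0 in range(0, rows, region_rows):
        height = min(r0 + region_rows, rows) - r0
        for c0 in range(0, len(column_types), region_cols):
            modules: Counter = Counter()
            for col_type in column_types[c0:c0 + region_cols]:
                modules[col_type] += height  # one module per row in this column
            regions[(r0 // region_rows, c0 // region_cols)] = modules
    return regions


def can_host_unit(modules: Counter, need: Dict[str, int]) -> bool:
    """True if the sub-region holds enough modules of every required type."""
    return all(modules[kind] >= count for kind, count in need.items())
```

Calling `partition(ROWS, COLUMN_TYPES, region_rows=2, region_cols=4)` yields 8 sub-regions that each hold 4 CLBs, 2 BRAMs and 2 DSPs, and `can_host_unit` accepts each of them under the assumed `NEED`.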
A memory computing unit implemented by the resource modules in a sub-region also includes a number of registers, such as an instruction register and a data register. When the processor reads an instruction from the local storage unit, the instruction is first written into the instruction register IR and then passed from the instruction register IR to the processor. When the processor reads data from the local storage unit, the data is first written into the data register DR and then passed from the data register DR to the processor. Correspondingly, when the processor writes data to the local storage unit, the data is first written into the data register DR, which then writes it into the local storage unit. Besides the instruction register and the data register, the memory computing unit also includes the general-purpose registers needed to build the circuit. In a complex electromagnetic environment, registers are subject to single event upsets caused by radiation and similar effects. A single event upset is transient and non-destructive, but it can change the content of a register and thereby cause persistent functional errors, so that the memory computing unit computes incorrectly.
Therefore, to improve reliability, at least one memory computing unit in the memory architecture FPGA of the application includes a parity register. In a memory computing unit that includes a parity register, a check circuit is implemented with the resource modules of the corresponding sub-region; the check circuit is connected to the parity register in order to check it, and is also connected to the processor in the same memory computing unit. By function, the parity register can be any of a data register, an instruction register or a general-purpose register. It is the counterpart of a conventional register: a conventional register contains only data bits, whereas the parity register contains check bits in addition to the data bits, so it can be regarded as a replacement for and upgrade of the conventional register. The number of check bits is configured as required. At least one parity register exists in the memory computing unit; conventional registers in the circuit that are susceptible to single event upsets are replaced by parity registers as needed, and the replacement does not affect the logic function of the memory computing unit. In one embodiment, all registers in a memory computing unit are replaced by parity registers, or only some registers are parity registers while the remaining registers, which are less susceptible to single event upsets, remain conventional registers. In one embodiment, every memory computing unit formed in the memory architecture FPGA contains parity registers and a check circuit. In another embodiment, only some of the memory computing units formed in the memory architecture FPGA contain parity registers and a check circuit.
For example, based on the 8 sub-regions obtained by dividing the memory architecture FPGA in Fig. 1, the FPGA implements 8 memory computing units through the resource modules in the 8 sub-regions. Each memory computing unit includes a processor and a local storage unit, the instruction register IR in each memory computing unit is implemented as a parity register, and each memory computing unit includes a check circuit. The resulting circuit structure of the 8 memory computing units is shown in Fig. 2. Fig. 2 does not show the complete circuit structure in detail, so the general-purpose registers in the memory computing units are not shown.
Compared with a conventional FPGA, the memory architecture FPGA of the application contains, in addition to resource modules and interconnection resources, a global read-write channel implemented by additional hardware resources. The local storage unit in each memory computing unit exchanges data with the outside of the FPGA chip through the global read-write channel, as shown in Fig. 2.
The application forms a plurality of memory computing units in the memory architecture FPGA, and each memory computing unit supports independent memory computing operation in the manner of a soft core. The FPGA can therefore implement multi-core memory computing jointly through the memory computing units formed inside it, in the following steps:
1. The memory computing instructions and initial data of each memory computing unit are written from outside the FPGA chip, through the global read-write channel, into the local storage unit of the corresponding memory computing unit.
2. All memory computing units are started simultaneously and operate in parallel. The processor in each memory computing unit executes the memory computing operation on the initial data according to the memory computing instructions in the local storage unit of that memory computing unit, and stores the result in the same local storage unit.
For any memory computing unit that includes a parity register and a check circuit, the check circuit performs a data check on the parity register while the memory computing unit executes a memory computing operation, and determines that the register data of the parity register is erroneous when the check fails. Continuing the operation would then produce a wrong result, so the memory computing unit is triggered to execute the current memory computing operation again. Because a single event upset is transient and non-destructive, the register data error it caused is cleared when the current operation is re-executed, which ensures operation reliability.
To allow the memory computing unit to re-execute the current memory computing operation, after the memory computing instructions and initial data have been written into the local storage unit in step 1, they are kept unchanged in the local storage unit for as long as the processor is executing the operation. The processor can therefore re-execute the current memory computing operation on the initial data according to the memory computing instructions in the local storage unit.
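The re-execution behaviour can be sketched in software as a retry loop over unchanged inputs. This is a minimal illustration under assumptions, not the hardware mechanism: `ParityError`, `run_unit`, `make_flaky_sum` and the `max_attempts` bound are hypothetical names and choices introduced here, and the source does not state any limit on the number of re-executions.

```python
from typing import Callable, Sequence, Tuple


class ParityError(Exception):
    """Raised by the (hypothetical) check circuit model when a parity check fails."""


def run_unit(instructions: Sequence[str],
             initial_data: Sequence[int],
             execute: Callable[[Tuple[str, ...], Tuple[int, ...]], int],
             max_attempts: int = 5) -> Tuple[int, int]:
    """Run one operation, restarting from the unchanged inputs after a parity failure.

    Returns (result, number_of_attempts). `max_attempts` is an assumed safety bound.
    """
    # Model of the local storage unit: instructions and initial data stay unchanged.
    program = tuple(instructions)
    data = tuple(initial_data)
    for attempt in range(1, max_attempts + 1):
        try:
            return execute(program, data), attempt
        except ParityError:
            continue  # transient upset: re-execute the same operation from the start
    raise RuntimeError("parity check kept failing")


def make_flaky_sum(upsets: int) -> Callable[[Tuple[str, ...], Tuple[int, ...]], int]:
    """Hypothetical processor model: sums the data, but fails `upsets` times first."""
    remaining = [upsets]

    def execute(program: Tuple[str, ...], data: Tuple[int, ...]) -> int:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ParityError("register data error")
        return sum(data)

    return execute
```

For example, `run_unit(["ADD"], [1, 2, 3], make_flaky_sum(1))` fails once, restarts from the same inputs, and returns the result together with an attempt count of 2. Storing the inputs as tuples mirrors the requirement that the instructions and initial data are not modified during execution.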
In one embodiment, referring to Fig. 3, the check circuit in any memory computing unit includes a check code generator and a check code checker, and while the memory computing unit executes a memory computing operation:
When register data data_reg is written into the data bits of the parity register, the check code generator generates the corresponding check code data_par from data_reg and writes it to the check bits of the parity register, so that the data bits hold the register data data_reg and the check bits hold the check code data_par. When register data is to be read from the parity register, the check code checker performs a data check on the contents of the data bits and check bits, reads the register data data_reg from the data bits when the check succeeds, and triggers the memory computing unit to execute the current memory computing operation again when the check fails.
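The write and read paths described above can be modelled as a small class. This is a sketch under assumptions: the names `data_reg` and `data_par` come from the text, while `ParityRegister`, `RegisterDataError`, the `upset()` fault-injection helper and the choice of a single check bit are hypothetical. The checker is modelled by regenerating the check code and comparing it with the stored one, which for a single parity bit is equivalent to counting the 1s in all stored bits.

```python
class RegisterDataError(Exception):
    """Signals a failed check; the unit would then re-execute the current operation."""


class ParityRegister:
    """Hypothetical K-bit register with one check bit (odd parity by default)."""

    def __init__(self, k: int, odd: bool = True) -> None:
        self.k = k
        self.odd = odd
        self.data_reg = 0
        self.data_par = self._generate(0)

    def _generate(self, value: int) -> int:
        """Check code generator: derive the check bit from the K data bits."""
        even_ones = bin(value).count("1") % 2 == 0
        # Odd parity stores 1 for an even count of 1s; even parity stores 0.
        return int(even_ones) if self.odd else int(not even_ones)

    def write(self, value: int) -> None:
        """Write the data bits and update the check bit in the same step."""
        if not 0 <= value < (1 << self.k):
            raise ValueError("value does not fit in K bits")
        self.data_reg = value
        self.data_par = self._generate(value)

    def read(self) -> int:
        """Check code checker: return the data only if the stored bits are consistent."""
        if self._generate(self.data_reg) != self.data_par:
            raise RegisterDataError("parity check failed")
        return self.data_reg

    def upset(self, bit: int) -> None:
        """Fault injection: flip one stored bit; bit == k addresses the check bit."""
        if bit == self.k:
            self.data_par ^= 1
        else:
            self.data_reg ^= 1 << bit
```

After `reg = ParityRegister(8)` and `reg.write(0b10110010)`, a call to `reg.read()` returns the value unchanged; after `reg.upset(3)` the same call raises `RegisterDataError`, which stands in for the trigger to re-execute.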
The way the check code checker checks the data bits and the check bit of the parity register corresponds to the way the check code generator generates data_par from data_reg. In one embodiment, the parity register includes K data bits and 1 check bit, so that only one additional bit is needed compared with a conventional register. The check circuit checks the parity register using a parity check method, as follows:
(1) With odd parity, when register data is written to the data bits of the parity register, the check code generator generates check code 1 and writes it to the check bit if the binary K-bit register data contains an even number of 1s, and generates check code 0 and writes it to the check bit if the data contains an odd number of 1s.
Accordingly, when register data is to be read from the parity register, the check code checker determines that the check succeeds if the K+1 bits of the parity register contain an odd number of 1s, and that it fails if they contain an even number of 1s.
To achieve this, in one implementation of this embodiment, the check code generator is a K-input exclusive-NOR (XNOR) gate and the check code checker is a (K+1)-input XNOR gate.
(2) With even parity, when register data is written to the data bits of the parity register, the check code generator generates check code 0 and writes it to the check bit if the binary K-bit register data contains an even number of 1s, and generates check code 1 and writes it to the check bit if the data contains an odd number of 1s.
Accordingly, when register data is to be read from the parity register, the check code checker determines that the check succeeds if the K+1 bits of the parity register contain an even number of 1s, and that it fails if they contain an odd number of 1s.
To achieve this, in one implementation of this embodiment, the check code generator is a K-input exclusive-OR (XOR) gate and the check code checker is a (K+1)-input XOR gate.
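The two parity variants can be checked with a short Python model. This is an illustrative sketch: the function names are hypothetical, and a multi-input XNOR gate is assumed here to be the complement of the multi-input XOR, which is the reading that matches the check codes described above.

```python
from functools import reduce
from operator import xor


def xor_reduce(bits):
    """Model a multi-input XOR gate: 1 when the inputs hold an odd number of 1s."""
    return reduce(xor, bits, 0)


def generate_check_bit(data_bits, odd=True):
    """Check code generator: XNOR for the odd check, XOR for the even check."""
    x = xor_reduce(data_bits)
    return x ^ 1 if odd else x


def check_fails(data_bits, check_bit, odd=True):
    """Check code checker over the K data bits plus the check bit.

    The gate output is 1 exactly when the check fails.
    """
    x = xor_reduce(list(data_bits) + [check_bit])
    return bool(x ^ 1) if odd else bool(x)
```

For example, with `data = [1, 0, 1, 1]` (three 1s) the odd check stores check code 0 and the even check stores check code 1. Flipping any single stored bit makes `check_fails` return `True` in both variants. A single check bit detects any odd number of flipped bits; an even number of flips leaves the parity unchanged and goes undetected.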
Regardless of the implementation, in one embodiment the check code checker does not directly trigger the memory computing unit to re-execute the current memory computing operation when a check fails. Instead, each check code checker has a corresponding failure flag that defaults to an invalid signal. The flag is kept invalid when the check code checker determines that a check succeeds and is set to a valid signal when it determines that a check fails. When the failure flag switches from the invalid signal to the valid signal, an interrupt signal is generated and sent to the processor in the same memory computing unit; on receiving it, the processor re-executes the current memory computing operation and resets all failure flags to the invalid signal. A failure flag that stays valid generates the interrupt signal only once, which avoids repeated interrupts. Without the failure flag, a check code checker whose checks fail repeatedly would keep sending interrupt signals to the processor until the processor re-executes the current operation.
In addition, the failure flag can generate the interrupt signal immediately on switching from the invalid signal to the valid signal and send it to the processor in the same memory computing unit. Alternatively, the failure flag generates and sends the interrupt signal after a preset delay following the switch; the delay can be adjusted as needed.
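The once-only interrupt behaviour of the failure flag can be sketched as follows. `FailureFlag` and its methods are hypothetical names for this illustration, and the optional preset delay is not modelled.

```python
class FailureFlag:
    """Software model of one failure flag; defaults to the invalid signal."""

    def __init__(self):
        self.valid = False
        self.interrupts_sent = 0

    def report(self, check_failed: bool) -> bool:
        """Feed one check result; return True only if an interrupt is raised now."""
        if check_failed and not self.valid:
            # Invalid -> valid transition: raise exactly one interrupt.
            self.valid = True
            self.interrupts_sent += 1
            return True
        # A flag that is already valid stays silent on further failures.
        return False

    def reset(self):
        """Called by the processor when it re-executes the current operation."""
        self.valid = False
```

Feeding the results `False, True, True, True` into one flag raises a single interrupt on the first failure; only after `reset()` can a new failure raise another one.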
Fig. 3 and the above embodiments describe a check circuit that checks a single parity register. In another embodiment, a memory computing unit includes a plurality of parity registers and its check circuit performs data checks on all of them; when the check circuit determines that the register data of at least one parity register is erroneous, it triggers the memory computing unit to re-execute the current memory computing operation.
When a memory computing unit includes a plurality of parity registers, one implementation is for the check circuit to include a plurality of checker groups, one per parity register, each containing a check code generator and a check code checker. The connections and the checking process between each checker group and its parity register are as described in the foregoing embodiments and are not repeated here. For example, in fig. 4 the check circuit includes two checker groups corresponding to two parity registers. The memory computing unit is triggered to re-execute the current memory computing operation when at least one check code checker determines that a check fails. As shown in fig. 4, a common implementation connects the failure flag of the check code checker in each checker group to one input of an OR gate; when any failure flag switches from the invalid signal to the valid signal, the output of the OR gate outputs an interrupt signal, which is sent to the processor in the same memory computing unit.
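The OR-gate combination of several failure flags can be modelled in a few lines. This sketch assumes that the interrupt corresponds to a rising edge on the OR output, and `or_gate` and `count_interrupts` are hypothetical helper names.

```python
def or_gate(flags):
    """OR gate over the failure flags of all checker groups."""
    return int(any(flags))


def count_interrupts(snapshots):
    """Count rising edges of the OR output over successive flag snapshots."""
    previous = 0
    count = 0
    for flags in snapshots:
        level = or_gate(flags)
        if level and not previous:
            count += 1
        previous = level
    return count
```

Under this assumption, a second flag turning valid while the first is still valid does not raise another interrupt, because the OR output is already high. The processor resetting all flags brings the output low again and re-arms the line.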
When a memory computing unit includes a plurality of parity registers, another implementation is as follows. At least two parity registers share the same check code generator, which generates a check code from the register data of each of those parity registers and writes it to the corresponding check bit. Parity registers that share a check code generator do not write register data at the same time; under that condition, sharing reduces the number of check code generators without affecting function. And/or, at least two parity registers share the same check code checker, which checks the register data of each of those parity registers from the contents of its data bits and check bit. Parity registers that share a check code checker do not read register data at the same time; under that condition, sharing reduces the number of check code checkers without affecting function. For example, in fig. 5, which corresponds to fig. 4, the two parity registers share one check code generator and one check code checker: the check code generator generates data_par1 from data_reg1 of one parity register and writes it to that register's check bit, and generates data_par2 from data_reg2 of the other parity register and writes it to that register's check bit.
The check code checker performs a data check on the contents of the data bits and the check bit of each parity register. When the check fails for any parity register, the checker sets the corresponding failure flag to the valid signal, which causes the failure flag to send an interrupt signal to the processor, as in the embodiments above. When the memory computing unit contains a plurality of check code checkers, the memory computing unit is triggered to re-execute the current memory computing operation when at least one of them determines that a check fails, in the same way as in the embodiments above, which is not repeated here.
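A shared generator and checker can be modelled as one object that serves several parity registers in turn. The sketch below uses the even check for brevity; `SharedCheckUnit` is a hypothetical name, and the rule that sharing registers never access the unit simultaneously is reflected only by the calls being sequential.

```python
def parity(bits):
    """Even-check code: 1 when the data holds an odd number of 1s."""
    return sum(bits) % 2


class SharedCheckUnit:
    """One check code generator and one checker serving several registers."""

    def __init__(self, count):
        # Each entry becomes (data bits, check bit) once it is written.
        self.registers = [None] * count

    def write(self, index, data_bits):
        # The shared generator computes the check bit for whichever
        # register is being written in this cycle.
        self.registers[index] = (list(data_bits), parity(data_bits))

    def read(self, index):
        # The shared checker validates whichever register is being read.
        data_bits, check_bit = self.registers[index]
        if parity(data_bits) != check_bit:
            raise ValueError(f"parity check failed on register {index}")
        return list(data_bits)
```

In hardware the saving comes from instantiating the gates once instead of once per register; the price is the constraint that the sharing registers are never written (or read) in the same cycle.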
3. The operation results in the local storage unit of each memory computing unit are read out of the FPGA chip through the global read-write channel.
On this basis, the memory architecture FPGA can, as needed, perform multiple memory computing operations in sequence on the memory computing units implemented inside it until all operations are complete. The memory architecture FPGA can use all or some of the memory computing units for any given memory computing operation, and the units used for different operations can be the same or different.
Because the processor and the local storage unit of each memory computing unit are in the same sub-region, the processor only needs to read and write the memory computing instructions and the initial data in the adjacent local storage unit during execution. This reduces the routing and interconnection resources consumed by data transmission between the processor and the local storage unit, lowers transmission delay and energy consumption, and improves the data processing efficiency of the memory architecture FPGA. In addition, in a multi-core scenario, if several operation cores read and write the same main memory at the same time, the read-write bandwidth of the main memory becomes a bottleneck that limits data processing efficiency. The problem is especially pronounced in high-performance computing: an FPGA supporting artificial intelligence or machine learning applications usually contains a large number of relatively simple operation cores, possibly more than 100, which places an even greater demand on main memory bandwidth. The memory architecture FPGA of the present application is also a multi-core scenario, but the processor in each memory computing unit reads and writes its own local storage unit instead of all processors reading and writing the same memory. The processors therefore do not contend with one another, each memory computing unit can operate at full speed without being limited by memory bandwidth, and the overall data processing efficiency and operation speed are improved.
The memory architecture FPGA of the present application is thus suited to scenarios with a large number of memory computing units and can maintain a high operation rate even with more than 100 of them, so it better supports artificial intelligence and machine learning scenarios.
In one embodiment, when the resource modules in the memory architecture FPGA of the present application are arranged in a column-based structure, the BRAM modules occupy one or more columns. Since the local storage unit in each memory computing unit is implemented with BRAM modules, the BRAM modules that implement the local storage units are located in one or more columns. The BRAM modules that implement local storage units and are located in the same column exchange data with the outside of the memory architecture FPGA chip through the same global read-write channel, and through each global read-write channel the outside of the chip can exchange data with any local storage unit connected to that channel. On this basis, the present application provides two methods for data transmission between the local storage unit in each memory computing unit and the outside of the memory architecture FPGA chip:
In one embodiment, the BRAM modules that implement local storage units and are located in the same column are connected through the same global read-write channel to an on-chip read-write interface of the memory architecture FPGA for data transmission with the outside of the chip, and different global read-write channels are connected to different on-chip read-write interfaces. In the example shown in fig. 2, 8 memory computing units are formed inside the memory architecture FPGA. The local storage units in memory computing units 1 to 4 are implemented by BRAM modules located in the same column and are connected through global read-write channel 1 to on-chip read-write interface 1 of the memory architecture FPGA for data transmission with the outside of the chip. The local storage units in memory computing units 5 to 8 are likewise implemented by BRAM modules located in the same column and are connected through global read-write channel 2 to on-chip read-write interface 2 of the memory architecture FPGA.
The method of the above embodiment usually occupies several on-chip read-write interfaces of the memory architecture FPGA, and these interfaces are a scarce resource. In another embodiment, therefore, only one on-chip read-write interface is occupied. The BRAM modules that implement local storage units and are located in the same column are connected through the same global read-write channel to one active end of a multiplexer MUX, and the fixed end of the multiplexer MUX is connected to the on-chip read-write interface of the memory architecture FPGA. Gating an active end of the multiplexer MUX selects the corresponding global read-write channel, and the outside of the chip can then exchange data with any local storage unit connected to the selected channel. For example, in fig. 6, which corresponds to fig. 2, the local storage units in memory computing units 1 to 4 are connected through global read-write channel 1 to one active end of the multiplexer MUX, the local storage units in memory computing units 5 to 8 are connected through global read-write channel 2 to another active end, and the fixed end of the multiplexer MUX is connected to the on-chip read-write interface of the memory architecture FPGA for data transmission with the outside of the chip. Compared with fig. 2, this method occupies only one on-chip read-write interface. Note that fig. 6 mainly illustrates how the local storage units are connected to the outside of the memory architecture FPGA chip, so the check circuit and the parity registers in each memory computing unit are omitted; the subsequent figures omit them likewise.
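The multiplexer arrangement of fig. 6 can be pictured with a small Python model. It is a behavioural sketch only: `ChannelMux` and its methods are hypothetical names, and each local storage unit is represented by a plain list.

```python
class ChannelMux:
    """Single on-chip interface in front of several global read-write channels."""

    def __init__(self, channels):
        # channels maps a channel id to {unit id: list used as local storage}.
        self.channels = channels
        self.selected = None

    def select(self, channel_id):
        """Gate one active end, i.e. choose one global read-write channel."""
        if channel_id not in self.channels:
            raise KeyError(channel_id)
        self.selected = channel_id

    def write(self, unit_id, words):
        # Only units attached to the selected channel are reachable.
        self.channels[self.selected][unit_id][:] = words

    def read(self, unit_id):
        return list(self.channels[self.selected][unit_id])
```

With channel 1 serving units 1 to 4 and channel 2 serving units 5 to 8, selecting channel 2 makes unit 6 accessible, while trying to reach unit 6 with channel 1 selected raises `KeyError`. The trade-off compared with fig. 2 is that the channels share one interface and are therefore accessed one at a time.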
In a traditional von Neumann architecture, the instructions and data of each memory computing unit would be stored together. To improve data processing efficiency, in one embodiment the instructions and data of each memory computing unit are instead stored separately and independently in a Harvard-type structure: the local storage unit in a memory computing unit comprises an instruction local storage unit, which stores the unit's instructions, and a data local storage unit, which stores the unit's data. Instructions and data then do not interfere with each other while the processor runs, which improves data processing efficiency. The instruction local storage unit is implemented by at least one BRAM module and the data local storage unit by at least one BRAM module, so in this case a typical sub-region contains at least one DSP, two BRAMs and 300 CLBs.
When the local storage unit in each storage unit comprises an instruction local storage unit and a data local storage unit, the storage sizes of the instruction local storage units of any two storage units are the same or different, and the storage sizes of the data local storage units of any two storage units are the same or different.
When the local storage unit in each memory computing unit includes an instruction local storage unit and a data local storage unit: (1) In one embodiment, the instruction local storage unit and the data local storage unit in a memory computing unit share one global read-write channel, so the BRAM modules that implement instruction and data local storage units and are located in the same column exchange data with the outside of the memory architecture FPGA chip through the same global read-write channel. In this embodiment, the global read-write channel may be connected to an on-chip read-write interface of the memory architecture FPGA as in the embodiment of fig. 2, or to an active end of the multiplexer MUX as in the embodiment of fig. 6. Taking the structure of fig. 2 as an example, as shown in fig. 7, the instruction and data local storage units in memory computing units 1 to 4 are connected to global read-write channel 1, and those in memory computing units 5 to 8 are connected to global read-write channel 2. (2) In another embodiment, the instruction local storage unit and the data local storage unit use separate global read-write channels: the instruction local storage unit in a memory computing unit transfers instructions through an instruction global read-write channel, and the data local storage unit transfers data through a data global read-write channel.
The BRAM modules that implement instruction local storage units and are located in the same column exchange data with the outside of the memory architecture FPGA chip through the same instruction global read-write channel, and the BRAM modules that implement data local storage units and are located in the same column do so through the same data global read-write channel. In this embodiment too, each instruction global read-write channel and each data global read-write channel may be connected to an on-chip read-write interface of the memory architecture FPGA as in the embodiment of fig. 2, or to an active end of the multiplexer MUX as in the embodiment of fig. 6. Taking the structure of fig. 6 as an example, as shown in fig. 8, the instruction local storage units in memory computing units 1 to 4 are all connected to instruction global read-write channel 1 and their data local storage units to data global read-write channel 1; the instruction local storage units in memory computing units 5 to 8 are all connected to instruction global read-write channel 2 and their data local storage units to data global read-write channel 2; and instruction global read-write channel 1, data global read-write channel 1, instruction global read-write channel 2 and data global read-write channel 2 are each connected to one active end of the multiplexer MUX.
In another embodiment, the resource modules in each sub-region are further configured to implement a control signal and a status signal of a corresponding one of the memory units, the control signal being configured to control an operating state of the memory unit, and the status signal being configured to indicate the operating state of the memory unit. The control signals and the state signals of the memory unit are connected with other circuit structures in the FPGA, or the control signals and the state signals of the memory unit are connected to the outside of the FPGA. I.e. the control signals and status signals required by each memory unit when performing a memory operation are implemented by the resource modules within the respective sub-area.
The control signal of each memory unit comprises at least one of an enabling signal, a clock signal, a reset signal and an interrupt signal, wherein the enabling signal is used for controlling the corresponding memory unit to execute an instruction or stop executing the instruction, the reset signal is used for resetting the memory unit, and the interrupt signal provides external interrupt capability. The control signals of any two memory units are the same or different.
The above is only a preferred embodiment of the present application, and the present application is not limited to the above examples. It is to be understood that other modifications and variations which may be directly derived or contemplated by those skilled in the art without departing from the spirit and concepts of the present application are deemed to be included within the scope of the present application.

Claims (10)

1. A single event upset resistant memory architecture FPGA, characterized in that the memory architecture FPGA internally comprises a plurality of resource modules arranged in a predetermined structure, interconnection resources arranged around the resource modules, and a global read-write channel implemented by additional hardware resources; the resource modules in a same sub-region are connected through the interconnection resources inside the FPGA to implement a memory computing unit; each memory computing unit comprises a processor and a local storage unit, the local storage unit of each memory computing unit is used to store the instructions and data of that memory computing unit, and the processor in each memory computing unit is connected to the local storage unit in the same memory computing unit through the interconnection resources and operates according to the instructions and data stored in the local storage unit; the resource modules in different sub-regions correspondingly implement a plurality of memory computing units, and the local storage unit in each memory computing unit performs data transmission with the outside of the FPGA chip through the global read-write channel;
at least one register in the memory computing unit is a parity register, and the parity register comprises data bits and a check bit; while the memory computing unit executes a memory computing operation, a check circuit in the memory computing unit performs a data check on the parity register in the memory computing unit, determines that the register data of the parity register is erroneous when the check fails, and triggers the memory computing unit to re-execute the current memory computing operation.
2. The memory architecture FPGA of claim 1, wherein when the memory unit performs a memory operation, a memory instruction and initial data are written into a local storage unit of the memory unit via the global read-write channel, a processor in the memory unit performs the memory operation based on the initial data according to the memory instruction, and the memory instruction and the initial data stored in the local storage unit remain unchanged during the execution of the memory operation;
when it is determined that the register data of the parity register is erroneous, a processor in the memory arithmetic unit performs a memory arithmetic operation based on the initial data again in accordance with the memory arithmetic instruction.
3. The memory architecture FPGA of claim 1 wherein the check circuitry comprises a check code generator and a check code checker, during execution of a memory operation by the memory unit:
When register data is written into the data bits of the parity register, the check code generator generates a corresponding check code from the register data and writes it into the check bit;
And the check code checker performs data check according to the data bit of the parity register and the content of the check bit, reads register data from the data bit of the parity register when the check is determined to be successful, and triggers the memory computing unit to execute the current memory computing operation again when the check is determined to be failed.
4. The memory architecture FPGA of claim 3 wherein one memory cell includes a plurality of parity registers therein, the memory cell being triggered to re-perform a current memory operation when the check circuitry determines that there is a register data error for at least one parity register.
5. The memory architecture FPGA of claim 4 wherein, when a memory unit includes a plurality of parity registers, the check circuit includes a plurality of checker groups, each parity register corresponding to a respective checker group, each checker group including a check code generator and a check code checker; and the memory computing unit is triggered to re-execute the current memory computing operation when at least one check code checker determines that the check fails.
6. The memory architecture FPGA of claim 4 wherein, when a plurality of parity registers are included in a memory cell:
At least two parity registers share the same check code generator, the check code generator generates respective check codes according to the register data of the two parity registers and updates and writes corresponding check bits, and the parity registers sharing the same check code generator do not write the register data at the same time;
And/or at least two parity registers share the same check code checker, the check code checker performs data check on the register data of the corresponding parity register according to the data bit of each parity register and the content of the check bit, and the parity registers sharing the same check code checker do not read out the register data at the same time; and triggering the memory computing unit to re-execute the current memory computing operation when at least one check code checker determines that the check fails.
7. The storage architecture FPGA of claim 3 wherein said parity register comprises K data bits and 1 check bit, said check circuit performing a data check on said parity register based on a parity check method:
When register data is written to the data bits of the parity register, the check code generator generates check code 1 and writes it to the check bit if the binary K-bit register data includes an even number of 1s, and generates check code 0 and writes it to the check bit if the binary K-bit register data includes an odd number of 1s; the check code checker determines that the check succeeds when it detects that the K+1 bits of the parity register include an odd number of 1s, and determines that the check fails when it detects that the K+1 bits of the parity register include an even number of 1s;
Or, when register data is written to the data bits of the parity register, the check code generator generates check code 0 and writes it to the check bit if the binary K-bit register data includes an even number of 1s, and generates check code 1 and writes it to the check bit if the binary K-bit register data includes an odd number of 1s; the check code checker determines that the check succeeds when it detects that the K+1 bits of the parity register include an even number of 1s, and determines that the check fails when it detects that the K+1 bits of the parity register include an odd number of 1s.
8. The storage architecture FPGA of claim 3 wherein each said check code checker has a corresponding fail flag, each said check code checker maintaining the fail flag as an invalid signal if a check is determined to be successful and setting the fail flag as a valid signal if a check is determined to be failed;
When the failure flag switches from the invalid signal to the valid signal, an interrupt signal is generated and sent to the processor in the same memory unit, and on receiving the interrupt signal the processor re-executes the current memory operation and resets all failure flags to the invalid signal.
9. The memory architecture FPGA of claim 8 wherein the valid signal of the failure flag generates an interrupt signal after a predetermined time delay and sends the interrupt signal to a processor in the same memory unit.
10. The memory architecture FPGA of claim 1 wherein all or part of the registers in the memory cells are parity registers.
CN202310774573.9A 2023-06-27 2023-06-27 Single event upset resistant memory architecture FPGA Active CN116755923B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310774573.9A CN116755923B (en) 2023-06-27 2023-06-27 Single event upset resistant memory architecture FPGA

Publications (2)

Publication Number Publication Date
CN116755923A CN116755923A (en) 2023-09-15
CN116755923B true CN116755923B (en) 2024-11-08

Family

ID=87956848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310774573.9A Active CN116755923B (en) 2023-06-27 2023-06-27 Single event upset resistant memory architecture FPGA

Country Status (1)

Country Link
CN (1) CN116755923B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740518A (en) * 2016-01-25 2016-07-06 深圳市同创国芯电子有限公司 FPGA resource placement method and apparatus
CN113407258A (en) * 2021-07-05 2021-09-17 武汉理工大学 Self-adaptive resource allocation layout and wiring method and system of storage and computation integrated architecture

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100386253B1 (en) * 2000-11-28 2003-06-02 엘지전자 주식회사 Write data conviction circuit for fpga register in using parity bit
US7036059B1 (en) * 2001-02-14 2006-04-25 Xilinx, Inc. Techniques for mitigating, detecting and correcting single event upset effects in systems using SRAM-based field programmable gate arrays
CN113608918B (en) * 2021-08-19 2023-04-28 无锡中微亿芯有限公司 FPGA with automatic error checking and correcting function for programmable logic module
US20230195661A1 (en) * 2021-12-17 2023-06-22 Dspace Gmbh Method for data communication between subregions of an fpga

Also Published As

Publication number Publication date
CN116755923A (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN111433758B (en) Programmable operation and control chip, design method and device thereof
US10230402B2 (en) Data processing apparatus
Leem et al. ERSA: Error resilient system architecture for probabilistic applications
US12117903B2 (en) FPGA acceleration system for MSR codes
CA1123110A (en) Floating point processor having concurrent exponent/mantissa operation
CA3147217A1 (en) Compiler flow logic for reconfigurable architectures
CN105320579A (en) Self-repairing dual-redundancy assembly line oriented to SPARC V8 processor and fault-tolerant method
CN104008021A (en) Precision exception signaling for multiple data architecture
CN114649048A (en) Concurrent computation and ECC for matrix vector operations in memory
CN117992396B (en) Stream tensor processor
CN116755923B (en) Single event upset resistant memory architecture FPGA
CN118035006B (en) Control system capable of being dynamically configured for independent and lockstep operation of three-core processor
KR20200139178A (en) Data processing engine tile architecture for integrated circuits
CN117852600B (en) Artificial intelligence chip, method of operating the same, and machine-readable storage medium
US11675906B1 (en) Simultaneous multi-processor (SiMulPro) apparatus, simultaneous transmit and receive (STAR) apparatus, DRAM interface apparatus, and associated methods
US10534625B1 (en) Carry chain logic in processor based emulation system
Liu et al. Recent advances on reliability of FPGAs in a radiation environment
KR102561205B1 (en) Mobilenet hardware accelator with distributed sram architecture and channel stationary data flow desigh method thereof
US10177794B1 (en) Method and apparatus for error detection and correction
Chakraborty et al. Design of a reliable cache system for heterogeneous CMPs
CN116737470A (en) Memory architecture FPGA for overcoming single event upset failure by backup backtracking
CN112486904A (en) Register file design method and device for reconfigurable processing unit array
US20240078312A1 (en) Simultaneous Multi-Processor (SiMulPro) Apparatus, Simultaneous Transmit And Receive (STAR) Apparatus, DRAM Interface Apparatus, and Associated Methods
Liang et al. C-DMR: a cache-based fault-tolerant protection method for register file
Portaluri et al. Design Techniques for Multi-Core Neural Network Accelerators on Radiation-Hardened FPGAs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant