WO2021017378A1 - FPGA-based convolution parameter acceleration device and data read-write method
- Publication number: WO2021017378A1 (application PCT/CN2019/126433)
- Authority: WIPO (PCT)
- Prior art keywords: convolution parameters, read, last, write, convolution
- Prior art date: 2019-08-01
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Definitions
- The present invention relates to the technical field of integrated circuits, and in particular to an FPGA-based convolution parameter acceleration device and a data read-write method.
- CNN: Convolutional Neural Network.
- At present, CNN neural networks run mainly on computer platforms: the CNN architecture is deployed on a development-side computer and trained on massive data until suitable weight coefficients are generated. Once the weight coefficients are fixed, the same architecture, minus the training part, can be deployed on the product side, where the generated weight coefficients drive the CNN neural network directly. When portability and practicality must be considered on the product side, high-end servers and workstations are generally not used to build CNN neural networks; under the requirements of reduced cost and size, embedded development becomes the first choice.
- Designing CNN neural network convolution accelerators has therefore become a popular direction, for example "FPGA Parallel Architecture Design of CNN Algorithms" (Wang Wei et al., Microelectronics and Computers, April 2019) and "Convolutional Neural Network Accelerator Design and Application Research Based on the ZYNQ Platform" (Deng Shuai, Beijing University of Technology, May 2018). The latter only describes the theoretical process, without giving an actual design model or performance analysis. The former proposes a more specific neural network convolution accelerator model which, according to the paper's own analysis, improves performance considerably, but its hardware-level implementation still suffers from insufficient on-chip data throughput, making practical application difficult.
- For example, the yoloV2 image classification algorithm requires 17.4 G operations for each frame of image; with that design, a processing speed of only 1.15 frames/s can be achieved even under seamless data feeding.
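- (As a rough cross-check of those two figures, and assuming the 17.4 G count covers all per-frame operations, the implied sustained throughput of that design is about 17.4 G operations/frame × 1.15 frames/s ≈ 20 G operations/s.)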
- For the convolution process of a CNN neural network accelerator in the prior art, refer to Figure 1.
- Existing CNN neural network accelerator technology is not yet mature. The main problems are relatively high cost and low data throughput; the latter causes excessive computation delay, so neither real-time operation nor low cost can be satisfied.
- A CNN neural network is a complex system, and existing public designs are basically monolithic, without a configurable modular design. This makes design changes, upgrades and porting inefficient and reduces design reusability.
- The purpose of the present invention is to provide an FPGA-based convolution parameter acceleration device and a data read-write method, so as to solve the technical problems of slow data processing and insufficient data throughput in the prior art.
- To this end, the present application discloses an FPGA-based method for reading and writing convolution parameter data, including:
- judging whether an input is the last one of a group of convolution parameters; if it is not the last one, the write control counter increments and an address in a first random read-write memory is allocated to each of the group of convolution parameters; if it is the last one, the write control counter is cleared;
- judging whether an output is the last one of a group of convolution parameters; if it is not the last one, the first random read-write memory outputs one of the group of convolution parameters according to the address and the first read control counter increments; if it is the last one, the first read control counter is cleared and the second read control counter increments by 1;
- judging whether the group of convolution parameters has been output a predetermined number of times; if so, the first read control counter and the second read control counter are both cleared.
- In some embodiments, while a group of convolution parameters is written into the first random read-write memory, a second random read-write memory outputs another group of convolution parameters; or, while the first random read-write memory outputs a group of convolution parameters, another group of convolution parameters is written into the second random read-write memory.
- After the input of a group of convolution parameters is completed, the method further includes: judging whether an input is the last one of another group of convolution parameters; if it is not the last one, the write control counter increments and an address in the second random read-write memory is allocated to each of the other group of convolution parameters.
- After the output of a group of convolution parameters is completed, the method further includes: judging whether an output is the last one of another group of convolution parameters; if it is not the last one, the second random read-write memory outputs one of the other group of convolution parameters and the first read control counter increments.
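- To make the counter behavior concrete, the following is a minimal behavioral sketch in Python (not the patent's RTL; class and method names such as `WriteAddressControl` and `on_output` are hypothetical, and a fixed group size stands in for the real configuration):

```python
class WriteAddressControl:
    """Write side: allocates one RAM address per incoming convolution
    parameter and clears the counter after the last one of a group."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.write_counter = 0          # "write control counter"

    def on_input(self, ram, value):
        ram[self.write_counter] = value            # address allocated to this parameter
        if self.write_counter == self.group_size - 1:
            self.write_counter = 0                 # last one of the group: cleared
        else:
            self.write_counter += 1                # not the last one: increments


class ReadAddressControl:
    """Read side: streams a group out by address; after a full pass the
    first counter clears and the second counts passes, and after the
    predetermined number of passes both counters clear."""

    def __init__(self, group_size, repeats):
        self.group_size = group_size
        self.repeats = repeats          # "predetermined number of times"
        self.read_counter = 0           # "first read control counter"
        self.pass_counter = 0           # "second read control counter"

    def on_output(self, ram):
        value = ram[self.read_counter]             # output one parameter by address
        if self.read_counter == self.group_size - 1:
            self.read_counter = 0                  # last one: first counter cleared
            self.pass_counter += 1                 # one full pass of the group done
            if self.pass_counter == self.repeats:
                self.pass_counter = 0              # predetermined passes done: both cleared
        else:
            self.read_counter += 1                 # not the last one: increments
        return value
```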
- The present application also discloses an FPGA-based convolution parameter acceleration device, including:
- at least one random read-write memory, configured to store convolution parameters;
- a write address control unit, configured to judge whether an input is the last one of a group of convolution parameters and, if it is not the last one, to increment the write control counter and allocate an address in the first random read-write memory to each of the group of convolution parameters;
- a read address control unit, configured to judge whether an output is the last one of a group of convolution parameters and, if it is not the last one, to have the first random read-write memory output one of the group of convolution parameters according to the address and increment the first read control counter, and further to judge whether the group of convolution parameters has been output a predetermined number of times and, if so, to clear the first read control counter and the second read control counter.
- In some embodiments the device includes first and second random read-write memories: while a group of convolution parameters is written into the first random read-write memory, the second random read-write memory outputs another group of convolution parameters; or, while the first random read-write memory outputs a group of convolution parameters, another group of convolution parameters is written into the second random read-write memory.
- The write address control unit is also configured to judge whether an input is the last one of another group of convolution parameters and, if it is not the last one, to allocate an address in the second random read-write memory to each of the other group of convolution parameters and increment the write control counter.
- The read address control unit is also configured to judge whether an output is the last one of another group of convolution parameters and, if it is not the last one, to have the second random read-write memory output one of the other group of convolution parameters and increment the first read control counter.
- The FPGA-based convolution parameter acceleration device of the present invention uses minimal logic resources to form a compact convolution parameter management unit. Its interface is simple and easy to use, resource occupancy is low, porting is easy, and the input and output paths are short. Because two random read-write memories are used internally, data can be read and written at the same time and output continuously, maintaining the peak state over long periods, which greatly improves parallelism and achieves high data throughput.
- Figure 1 shows a process diagram of the convolution technology in a CNN neural network model in the prior art.
- Figure 2 shows a schematic diagram of an acceleration device in an embodiment of the present invention.
- Figure 3 shows a schematic diagram of an acceleration device in another embodiment of the present invention.
- Figure 4 shows a process diagram of data writing in an embodiment of the present invention.
- Figure 5 shows a process diagram of data output in an embodiment of the present invention.
- CNN: Convolutional Neural Network
- Convolution parameters: convolution kernel parameters in a CNN
- FPGA: Field Programmable Gate Array
- RAM: Random Access Memory
- Referring to Figure 2, the acceleration device 100 includes:
- at least one random read-write memory, shown in Figure 2 as a first random read-write memory 101, configured to store convolution parameters;
- a write address control unit 201, configured to judge whether an input is the last one of a group of convolution parameters; if it is not the last one, the write control counter (not shown in the figure) in the write address control unit 201 increments by 1 and an address in the first random read-write memory 101 is allocated to each of the group of convolution parameters;
- a read address control unit 202, configured to judge whether an output is the last one of a group of convolution parameters; if it is not the last one, the first random read-write memory 101 outputs one of the group of convolution parameters according to the address and the first read control counter in the read address control unit 202 increments by 1; the unit then judges whether the group of convolution parameters has been output a predetermined number of times and, if so, the first read control counter and the second read control counter (not shown in the figure) in the read address control unit 202 are cleared.
- In this way, minimal logic resources form a compact convolution parameter management acceleration unit whose interface is simple and easy to use, with low resource occupancy, easy porting, and short input and output paths.
- In another embodiment, the acceleration device of the present application includes a first random read-write memory 101 and a second random read-write memory 102: while a group of convolution parameters is written into the first random read-write memory 101, the second random read-write memory 102 outputs another group of convolution parameters; or, while the first random read-write memory 101 outputs a group of convolution parameters, another group of convolution parameters is written into the second random read-write memory 102.
- The write address control unit 201 is also configured to judge whether an input is the last one of another group of convolution parameters; if it is not the last one, an address in the second random read-write memory 102 is allocated to each of the other group of convolution parameters and the write control counter increments by 1.
- The read address control unit 202 is also configured to judge whether an output is the last one of another group of convolution parameters; if it is not the last one, the second random read-write memory 102 outputs one of the other group of convolution parameters and the first read control counter in the read address control unit 202 increments.
- Because the acceleration device uses two RAMs internally, data can be read and written at the same time: one RAM writes data while the other reads data. This realizes parallel data processing and continuous output, and maintains the peak state over long periods.
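- Building on the sketch above, the two-RAM arrangement can be modeled as a hypothetical `PingPongBuffer` in which one RAM is written while the other is read; this is only an illustrative model of the parallelism described, not the actual FPGA implementation:

```python
class PingPongBuffer:
    """Dual-RAM (ping-pong) model: each step writes one parameter of a
    new group into the write-side RAM while the read-side RAM outputs
    one parameter of the previous group, so input and output overlap."""

    def __init__(self, group_size, repeats):
        self.rams = [[None] * group_size, [None] * group_size]
        self.write_side = 0                        # index of the RAM being written
        self.writer = WriteAddressControl(group_size)
        self.reader = ReadAddressControl(group_size, repeats)

    def step(self, value_in):
        # Write into one RAM and read from the other in the same step.
        self.writer.on_input(self.rams[self.write_side], value_in)
        return self.reader.on_output(self.rams[1 - self.write_side])

    def swap(self):
        # Once one group is fully written and the other fully output,
        # the roles of the two RAMs are exchanged.
        self.write_side = 1 - self.write_side
```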
- Another embodiment of the present application also discloses an FPGA-based data reading and writing method, including:
- Referring to Figure 4, the FPGA-based data writing flow includes:
- step 11: judge whether there is data input; if there is no data input, go to step 15, where the write control counter of the write address control unit is cleared;
- step 12: judge whether the input is the last one of the group of convolution parameters; if it is not the last one, go to step 13, where the write control counter increments by 1, and then to step 14, where an address in the first random read-write memory is allocated to each of the group of convolution parameters; then return to step 11;
- if it is the last one of the group of convolution parameters, go to step 15, where the write control counter is cleared.
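- A brief usage sketch of the hypothetical `WriteAddressControl` model, walking through steps 11 to 15 for one group (the 3x3 kernel values are illustrative only):

```python
ram = [None] * 9
writer = WriteAddressControl(group_size=9)
for v in [3, 1, 4, 1, 5, 9, 2, 6, 5]:        # nine parameters, e.g. a 3x3 kernel
    writer.on_input(ram, v)                  # steps 12-14: allocate address, write
assert ram == [3, 1, 4, 1, 5, 9, 2, 6, 5]    # each parameter got its own address
assert writer.write_counter == 0             # step 15: cleared after the last one
```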
- Referring to Figure 5, the FPGA-based data reading flow includes:
- step 21: judge whether there is data output; if there is no data output, go to step 26, where the first read control counter and the second read control counter are cleared;
- step 22: judge whether the output is the last one of the group of convolution parameters; if it is the last one, go to step 27, where the first read control counter is cleared and the second read control counter increments by 1, indicating that one full pass of the group of convolution parameters has been output;
- step 23: if it is not the last one, the first random read-write memory 101 outputs one of the group of convolution parameters according to the address, and at step 24 the first read control counter increments by 1; then return to step 21;
- step 25: judge whether the group of convolution parameters has been output the predetermined number of times; if not, return to step 21 to judge again whether to output data;
- if the predetermined number of times has been completed, go to step 26, where the first read control counter and the second read control counter are cleared.
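- Likewise, a short usage sketch of the hypothetical `ReadAddressControl` model for steps 21 to 27, streaming one group a predetermined four times (all values illustrative):

```python
ram = list(range(9))                                   # a group already written
reader = ReadAddressControl(group_size=9, repeats=4)
stream = [reader.on_output(ram) for _ in range(9 * 4)]
assert stream == list(range(9)) * 4                    # group output 4 times over
assert reader.read_counter == 0 and reader.pass_counter == 0  # step 26: cleared
```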
- In some embodiments, while a group of convolution parameters is written into the first random read-write memory 101, the second random read-write memory 102 outputs another group of convolution parameters; or, while the first random read-write memory 101 outputs a group of convolution parameters, another group of convolution parameters is written into the second random read-write memory 102.
- After a group of convolution parameters has been stored in the first random read-write memory 101, the method further includes: judging whether an input is the last one of another group of convolution parameters; if it is not the last one, the write control counter increments and an address in the second random read-write memory is allocated to each of the other group of convolution parameters. The other group of convolution parameters is thus written into the second random read-write memory while the first random read-write memory can output data at the same time.
- After the output of a group of convolution parameters from the first random read-write memory 101 is completed, the method further includes: judging whether an output is the last one of another group of convolution parameters; if it is not the last one, the second random read-write memory outputs one of the other group of convolution parameters and the first read control counter increments. The second random read-write memory thus outputs data while the first random read-write memory can write data at the same time.
- Because the acceleration device uses two RAMs, data can be read and written at the same time: one RAM writes data while the other reads data, realizing parallel data processing, continuous output, and long-term retention of the peak state.
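- A short driving example of the hypothetical `PingPongBuffer` model sketched earlier, preloading one group and then writing the next group while the first streams out (values illustrative):

```python
buf = PingPongBuffer(group_size=9, repeats=1)
for v in range(9):                            # preload group A into RAM 0
    buf.writer.on_input(buf.rams[0], 10 + v)
buf.swap()                                    # RAM 0 becomes the read side
out = [buf.step(20 + i) for i in range(9)]    # write group B, read group A
assert out == [10 + v for v in range(9)]      # group A streamed out unchanged
assert buf.rams[buf.write_side][:3] == [20, 21, 22]  # group B landed in RAM 1
```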
- The method embodiment and the device embodiment correspond to each other: the technical details of each can be applied to the other, and each module shown in the device embodiment can be understood with reference to the relevant description of the data read-write method.
- The function of each module shown in the device embodiment can be realized by a program (executable instructions) running on a processor, or by a specific logic circuit. If the acceleration device of the embodiments of the present application is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product.
- The computer software product is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the various embodiments of the present application.
- The aforementioned storage media include media that can store program code, such as USB flash drives, removable hard disks, read-only memory (ROM), magnetic disks and optical disks. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
- FPGA-readable storage media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
- The information can be computer-readable instructions, data structures, program modules, or other data.
- Examples of storage media for FPGA configuration files include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by computing devices. As defined herein, FPGA-readable storage media do not include transitory media such as modulated data signals and carrier waves.
- When an act is said to be performed based on a certain element, it means the act is performed at least based on that element; this covers both performing the act based only on that element and performing it based on that element together with other elements.
- Expressions such as "multiple" and "various" include two or two kinds, as well as more than two or more than two kinds.
Claims (10)
- 1. An FPGA-based method for reading and writing convolution parameter data, including: judging whether an input is the last one of a group of convolution parameters; if it is not the last one, the write control counter increments and an address in a first random read-write memory is allocated to each of the group of convolution parameters; judging whether an output is the last one of a group of convolution parameters; if it is not the last one, the first random read-write memory outputs one of the group of convolution parameters according to the address and the first read control counter increments; judging whether the group of convolution parameters has been output a predetermined number of times; if so, the first read control counter and the second read control counter are cleared.
- 2. The method according to claim 1, wherein if the input is the last one of the group of convolution parameters, the write control counter is cleared.
- 3. The method according to claim 1, wherein if the output is the last one of the group of convolution parameters and the group has not yet been output the predetermined number of times, the first read control counter is cleared and the second read control counter increments by 1.
- 4. The method according to claim 1, wherein while a group of convolution parameters is written into the first random read-write memory, a second random read-write memory outputs another group of convolution parameters; or, while the first random read-write memory outputs a group of convolution parameters, another group of convolution parameters is written into the second random read-write memory.
- 5. The method according to claim 1, wherein after the input of a group of convolution parameters is completed, the method further includes: judging whether an input is the last one of another group of convolution parameters; if it is not the last one, the write control counter increments and an address in the second random read-write memory is allocated to each of the other group of convolution parameters.
- 6. The method according to claim 1, wherein after the output of a group of convolution parameters is completed, the method further includes: judging whether an output is the last one of another group of convolution parameters; if it is not the last one, the second random read-write memory outputs one of the other group of convolution parameters and the first read control counter increments.
- 7. An FPGA-based convolution parameter acceleration device, including: at least one random read-write memory configured to store convolution parameters; a write address control unit configured to judge whether an input is the last one of a group of convolution parameters and, if it is not the last one, to increment the write control counter and allocate an address in a first random read-write memory to each of the group of convolution parameters; and a read address control unit configured to judge whether an output is the last one of a group of convolution parameters and, if it is not the last one, to have the first random read-write memory output one of the group of convolution parameters according to the address and increment the first read control counter, and to judge whether the group of convolution parameters has been output a predetermined number of times and, if so, to clear the first read control counter and the second read control counter.
- 8. The device according to claim 7, including first and second random read-write memories, wherein while a group of convolution parameters is written into the first random read-write memory, the second random read-write memory outputs another group of convolution parameters; or, while the first random read-write memory outputs a group of convolution parameters, another group of convolution parameters is written into the second random read-write memory.
- 9. The device according to claim 7, including first and second random read-write memories, wherein the write address control unit is further configured to judge whether an input is the last one of another group of convolution parameters and, if it is not the last one, to allocate an address in the second random read-write memory to each of the other group of convolution parameters and increment the write control counter.
- 10. The device according to claim 7, including first and second random read-write memories, wherein the read address control unit is further configured to judge whether an output is the last one of another group of convolution parameters and, if it is not the last one, to have the second random read-write memory output one of the other group of convolution parameters and increment the first read control counter.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910708612.9 | 2019-08-01 | ||
CN201910708612.9A CN110390392B (en) | 2019-08-01 | 2019-08-01 | Convolution parameter accelerating device based on FPGA and data reading and writing method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021017378A1 true WO2021017378A1 (en) | 2021-02-04 |
Family
ID=68288406
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/126433 WO2021017378A1 (en) | 2019-08-01 | 2019-12-18 | Fpga-based convolution parameter acceleration device and data read-write method |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110390392B (en) |
WO (1) | WO2021017378A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110390392B (en) * | 2019-08-01 | 2021-02-19 | 上海安路信息科技有限公司 | Convolution parameter accelerating device based on FPGA and data reading and writing method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180341495A1 (en) * | 2017-05-26 | 2018-11-29 | Purdue Research Foundation | Hardware Accelerator for Convolutional Neural Networks and Method of Operation Thereof |
CN109409509A (en) * | 2018-12-24 | 2019-03-01 | 济南浪潮高新科技投资发展有限公司 | A kind of data structure and accelerated method for the convolutional neural networks accelerator based on FPGA |
CN109784489A (en) * | 2019-01-16 | 2019-05-21 | 北京大学软件与微电子学院 | Convolutional neural networks IP kernel based on FPGA |
CN110390392A (en) * | 2019-08-01 | 2019-10-29 | 上海安路信息科技有限公司 | Deconvolution parameter accelerator, data read-write method based on FPGA |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5764374A (en) * | 1996-02-05 | 1998-06-09 | Hewlett-Packard Company | System and method for lossless image compression having improved sequential determination of golomb parameter |
EP1089475A1 (en) * | 1999-09-28 | 2001-04-04 | TELEFONAKTIEBOLAGET L M ERICSSON (publ) | Converter and method for converting an input packet stream containing data with plural transmission rates into an output data symbol stream |
CN100466601C (en) * | 2005-04-28 | 2009-03-04 | 华为技术有限公司 | Data read/write device and method |
CN101257313B (en) * | 2007-04-10 | 2010-05-26 | 深圳市同洲电子股份有限公司 | Deconvolution interweave machine and method realized based on FPGA |
CN104461934B (en) * | 2014-11-07 | 2017-06-30 | 北京海尔集成电路设计有限公司 | A kind of time solution convolutional interleave device and method of suitable DDR memory |
CN106940815B (en) * | 2017-02-13 | 2020-07-28 | 西安交通大学 | Programmable convolutional neural network coprocessor IP core |
CN108169727B (en) * | 2018-01-03 | 2019-12-27 | 电子科技大学 | Moving target radar scattering cross section measuring method based on FPGA |
CN108154229B (en) * | 2018-01-10 | 2022-04-08 | 西安电子科技大学 | Image processing method based on FPGA (field programmable Gate array) accelerated convolutional neural network framework |
CN109086867B (en) * | 2018-07-02 | 2021-06-08 | 武汉魅瞳科技有限公司 | Convolutional neural network acceleration system based on FPGA |
CN109032781A (en) * | 2018-07-13 | 2018-12-18 | 重庆邮电大学 | A kind of FPGA parallel system of convolutional neural networks algorithm |
CN109214281A (en) * | 2018-07-30 | 2019-01-15 | 苏州神指微电子有限公司 | A kind of CNN hardware accelerator for AI chip recognition of face |
CN109359729B (en) * | 2018-09-13 | 2022-02-22 | 深思考人工智能机器人科技(北京)有限公司 | System and method for realizing data caching on FPGA |
CN109711533B (en) * | 2018-12-20 | 2023-04-28 | 西安电子科技大学 | Convolutional neural network acceleration system based on FPGA |
- 2019-08-01: CN CN201910708612.9A patent/CN110390392B/en active Active
- 2019-12-18: WO PCT/CN2019/126433 patent/WO2021017378A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110390392B (en) | 2021-02-19 |
CN110390392A (en) | 2019-10-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19939614; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 19939614; Country of ref document: EP; Kind code of ref document: A1 |
| 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.10.2022) |