
CN113269317B - Pulse neural network computing array - Google Patents

Pulse neural network computing array

Info

Publication number
CN113269317B
CN113269317B (application CN202110400723.0A)
Authority
CN
China
Prior art keywords
neural network
membrane potential
pulse
comparator
pooling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110400723.0A
Other languages
Chinese (zh)
Other versions
CN113269317A (en)
Inventor
李丽
沈思睿
傅玉祥
陈沁雨
徐瑾
王心沅
何书专
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202110400723.0A priority Critical patent/CN113269317B/en
Publication of CN113269317A publication Critical patent/CN113269317A/en
Application granted granted Critical
Publication of CN113269317B publication Critical patent/CN113269317B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a spiking neural network computing array that supports continuous convolution-pooling operation and parallel inference of the spiking neural network, improving execution efficiency during algorithm inference. The invention comprises a spiking neural network computing cluster formed from a plurality of spiking neural network computing units, each of which comprises a membrane potential accumulator, a pulse emitter, a pooling buffer, and a pooling comparator. The membrane potential accumulator is electrically connected to the pulse emitter, and the pulse emitter is electrically connected to the pooling buffer and the pooling comparator. The membrane potential accumulator accumulates the input pulse sequence; the pulse emitter decides, from the membrane potential supplied by the accumulator, whether to emit a pulse to the next stage; the pooling buffer counts and buffers the emitter's pulses; and the pooling comparator performs comparison operations on the buffered input.

Description

Pulse neural network computing array
Technical Field
The invention relates to the hardware implementation of neural network computation, and in particular to a spiking neural network computing array.
Background
Spiking neural networks (SNNs) use spiking neurons as their computing units and can emulate the information coding and processing of the human brain. Unlike conventional deep neural networks, which transmit information as explicit numeric values, a spiking neural network transmits information through the timing of the spikes in a spike train, providing sparse yet highly accurate computation. A spiking neuron accumulates its inputs into a membrane potential and fires a spike when a specific threshold is reached, enabling event-driven computation. Owing to the sparsity of spike events and the event-driven form of computation, spiking neural networks offer excellent energy efficiency and are currently the neural network structure closest to the biological brain.
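The accumulate-and-fire behavior described above can be sketched in a few lines of Python. This is a behavioral illustration only, not part of the patent; the threshold value 32 is borrowed from the configurable default mentioned later in the description, and all names are illustrative.

```python
def integrate_and_fire(weights, spikes, threshold=32, v=0):
    """Behavioral sketch of one spiking neuron over a time step.

    weights: per-synapse weights; spikes: 0/1 input spike vector.
    Accumulates weighted input spikes into the membrane potential v,
    fires when v exceeds the threshold, and subtracts the threshold
    on firing (soft reset), as the emitter in the description does.
    """
    for w, s in zip(weights, spikes):
        if s:                 # event-driven: accumulate only on a spike
            v += w
    fired = v > threshold
    if fired:
        v -= threshold        # subtract threshold rather than reset to zero
    return fired, v
```

For example, with weights [20, 15] and both input spikes present, the potential reaches 35, exceeds the threshold of 32, a spike fires, and the residual potential is 3.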
Given the characteristics of spiking-network data, a conventional neural network computing array structure cannot exploit this sparsity to the fullest, so the arithmetic array must be designed around the computational characteristics of spiking neural networks. Although some spiking neural network computing arrays have been disclosed, most support only fully connected and convolutional structures. These designs support the pooling layer of a spiking neural network poorly and require significant hardware overhead to complete the pooling operation; such arrays cannot adapt to diverse network structures and therefore generalize poorly.
Because of the event-driven nature of the spiking neural network algorithm model, inference requires accelerating convergence and refreshing the membrane potential. How to design the computing structure to match the event refresh rate and to speed up inference convergence is likewise part of the study of spiking neural network computing units.
In summary, how to design a spiking neural network computing array with high resource utilization, low operation latency, and low cost that supports every layer of the network well is a problem to be solved by existing spiking neural network technology.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a spiking neural network computing array that better supports the pooling layer while comprehensively balancing the precision, area, power consumption, and operation latency of the hardware implementation.
The technical scheme is as follows: to address the above problems, the invention provides a spiking neural network computing array that supports continuous convolution-pooling operation and parallel inference of the spiking neural network, improving execution efficiency during algorithm inference. The invention comprises a spiking neural network computing cluster formed from a plurality of computing units, each comprising a membrane potential accumulator, a pulse emitter, a pooling buffer, and a pooling comparator. The membrane potential accumulator is electrically connected to the pulse emitter, and the pulse emitter is electrically connected to the pooling buffer and the pooling comparator. The membrane potential accumulator accumulates the input pulse sequence; the pulse emitter decides, from the membrane potential supplied by the accumulator, whether to emit a pulse to the next stage; the pooling buffer counts and buffers the emitter's pulses; and the pooling comparator performs comparison operations on the buffered input.
In a further embodiment, the membrane potential accumulator comprises at least one membrane potential loading unit and at least one fixed point accumulator, the membrane potential loading unit and the fixed point accumulator being electrically connected to each other.
In a further embodiment, the pulse transmitter comprises at least one fixed point comparator and at least one fixed point subtractor, the fixed point comparator and the fixed point subtractor being electrically connected to each other.
In a further embodiment, the pooled buffer includes at least one pooled count load unit, at least one counter, and at least one first-in first-out data buffer, the pooled count load unit, counter, and first-in first-out data buffer being electrically connected to each other;
The pooling comparator comprises at least one fixed point number comparator and at least one data manager, wherein the fixed point number comparator and the data manager are electrically connected with each other.
In a further embodiment, the membrane potential loading unit is activated simultaneously with the fixed point accumulator, the loading process being overridden by the accumulation process.
In a further embodiment, the output of the counter is electrically connected to both the fifo and the pooling comparator.
In a further embodiment, the data manager includes a first-in first-out data buffer reading unit, a data packet parsing unit, and a data allocation unit; the first-in first-out data buffer reading unit is electrically connected with the data packet analysis unit, and the data packet analysis unit is electrically connected with the data distribution unit.
In a further embodiment, each spiking neural network computing unit comprises an input port and an output port, the input port being electrically connected to the output port through the membrane potential accumulator, pulse emitter, pooling buffer, and pooling comparator.
In a further embodiment, the plurality of impulse neural network computing units form an impulse neural network computing cluster, the impulse neural network computing units within the cluster sharing the same set of control logic.
In a further embodiment, the input of the impulse neural network specific computing array is electrically connected to the input of the impulse neural network computing cluster, and the output of the impulse neural network computing cluster is electrically connected to the output of the impulse neural network specific computing array.
The beneficial effects are that: the spiking neural network-specific computing array can realize several different operation modes through external configuration signals. In addition, the invention realizes parallel computation of the spiking neural network through an 8x8 computing-unit structure, giving good operational flexibility and hardware resource utilization. The accumulation method fully exploits the sparsity of the spiking neural network, resolves the massive redundancy of conventional computing arrays, and improves the operational efficiency of the spiking neural network.
Drawings
FIG. 1 is a schematic diagram of the spiking neural network computing array of the present invention.
Fig. 2 is a schematic diagram of the structure of the spiking neural network computing unit in the present invention.
FIG. 3 is a schematic diagram of the membrane potential accumulator according to the present invention.
Fig. 4 is a schematic diagram of the structure of the pulse emitter according to the present invention.
FIG. 5 is a schematic diagram of the structure of the pooling buffer according to the present invention.
FIG. 6 is a schematic diagram of the pooling comparator according to the present invention.
Fig. 7 is a schematic diagram of the structure of a single convolutional layer calculation of a spiking neural network.
FIG. 8 is a schematic diagram of the structure of continuous convolution-pooling layer calculation of a spiking neural network.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the invention.
Example 1
Referring to fig. 1, SNPU denotes a spiking neural network processing unit, and SNPU CLUSTER denotes a cluster of such units. The spiking neural network computing array of the invention comprises computing clusters formed from a plurality of computing units. A single computing unit performs computation on the raw data; units within a cluster compute different rows of an image in parallel, while inter-cluster parallelism computes different convolution feature maps of the same image in parallel. The computing array of this embodiment comprises 8 computing clusters, each containing 8 computing units, so that 64 data paths can be processed in parallel. As shown in fig. 2, the computing unit of the invention comprises a membrane potential accumulator, a pulse emitter, a pooling buffer, and a pooling comparator. The membrane potential accumulator is electrically connected to the pulse emitter, and the pulse emitter is electrically connected to the pooling buffer and the pooling comparator.
Describing the above components in further detail with reference to fig. 3: the membrane potential accumulator accumulates the input pulse sequence. Specifically, it comprises a membrane potential loading unit and a fixed-point accumulator. The membrane potential loading unit comprises an initialization input port, a load enable port, a membrane potential load port, and a membrane potential register.
The membrane potential loading unit resets the membrane potential register to 0 when the initialization input port is 1'b1; when the initialization input port is 1'b0, it reads the value of the membrane potential load port into the register according to the enable signal on the load enable port. The fixed-point accumulator has one 8-bit weight input port, one 1-bit input enable port, and one 8-bit membrane potential output port. When the input enable port is 1'b1, the fixed-point accumulator accumulates onto the value of the membrane potential register and outputs the 8-bit membrane potential value.
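As a rough single-cycle behavioral model of this register logic (the signal names and the 8-bit wraparound are assumptions for illustration, not specified by the patent), the priority order follows the text: initialization clears the register, and, per the embodiment above, the loading process is overridden by the accumulation process.

```python
def membrane_accumulator_step(v_reg, init, load_en, load_val, acc_en, weight):
    """One-cycle behavioral model of the membrane potential accumulator.

    init clears the register; otherwise accumulation takes precedence
    over loading ("the loading process being overridden by the
    accumulation process"); a load restores a saved membrane potential.
    Values wrap to 8 bits (an assumption about overflow behavior).
    """
    if init:
        return 0
    if acc_en:                       # accumulation overrides loading
        return (v_reg + weight) & 0xFF
    if load_en:
        return load_val & 0xFF
    return v_reg
```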
Referring to fig. 4, the pulse emitter decides, from the membrane potential value supplied by the accumulator, whether to emit a pulse to the next stage. Specifically, the pulse emitter comprises a fixed-point comparator and a fixed-point subtractor. The fixed-point comparator has two 8-bit inputs and one 1-bit output: it compares the input membrane potential value with the emission threshold, and when the membrane potential exceeds the threshold the comparator output is pulled high; when the membrane potential is below the threshold, the output is not pulled high. The fixed-point subtractor has two 8-bit inputs and one 8-bit output: when the comparator output is pulled high, it performs a fixed-point subtraction on the membrane potential and outputs the membrane potential value minus the threshold. The membrane potential threshold is configurable and may be set to 32 in this design.
As shown in fig. 5, the pooling buffer counts and buffers the pulses of the pulse emitter. Specifically, it comprises a pooling count loading unit, a counter, and a first-in first-out (FIFO) data buffer. The pooling count loading unit comprises a 1-bit initialization input port, a 1-bit load enable port, and an 8-bit pooling count load port. It resets the pooling counter to 0 when the initialization input port is 1'b1; when the initialization input port is 1'b0, it reads the value of the pooling count load port into the counter according to the enable signal on the load enable port. The counter has a 1-bit input port and an 8-bit output port; it increments according to pulse emission and outputs the current pooling count after each time step. The FIFO data buffer has a 1-bit input enable port, a 9-bit input port, a 1-bit output enable port, and a 9-bit output port. Its data format is {emission state (state), pooling count (pool_cnt)}. The FIFO has a depth of 128, which exceeds the length of one image row. The counter outputs its 9-bit data to both the pooling comparator and the FIFO.
Referring to fig. 6, the pooling comparator performs a comparison operation on the buffer's input, finding the maximum value within a mask (pooling window) of the input image. Specifically, it comprises a data manager and a fixed-point comparator. The data manager is connected to the output of the pooling buffer and reads the pooling counts and emission information of the upper and lower rows simultaneously, taking two 9-bit values from the FIFO and two from the pooling counter. The fixed-point comparator has four 8-bit inputs and one 2-bit output: it compares the four 8-bit inputs against one another to find the maximum, which is indicated by the 2-bit output. The four 8-bit numbers comprise two pooling counts fetched in order from the FIFO and two pooling counts output directly by the pooling counter; they correspond to four adjacent pixels spanning two rows of the image, i.e. a 2x2 pooling window.
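The pooling path just described (buffer one row of spike counts in the FIFO, then compare a 2x2 window of buffered and fresh counts) can be sketched as follows. This is a software illustration under the stated assumptions; where the patent's comparator emits a 2-bit index of the maximum, the sketch returns the maximum value itself.

```python
from collections import deque

def maxpool_2x2_spike_counts(rows):
    """Behavioral sketch of the pooling buffer + pooling comparator path.

    rows: rows of per-pixel spike counts (even width and height assumed).
    The upper row's counts wait in a FIFO (depth >= one image row,
    128 in the text) until the lower row arrives; the comparator then
    takes two buffered counts and two fresh counts, a 2x2 window,
    and keeps their maximum.
    """
    fifo = deque(maxlen=128)         # holds the previous row's counts
    out = []
    for r, row in enumerate(rows):
        if r % 2 == 0:
            fifo.extend(row)         # buffer the upper row
        else:
            pooled = []
            for c in range(0, len(row), 2):
                upper = (fifo.popleft(), fifo.popleft())
                lower = (row[c], row[c + 1])
                pooled.append(max(*upper, *lower))   # 4-input comparison
            out.append(pooled)
    return out
```

For example, the two rows [1, 3, 0, 2] and [4, 0, 1, 1] pool to [4, 2].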
The spiking neural network computing array of the invention realizes several different operation modes through external configuration signals. When the mode select signal is set to 1'b0 (mode 0), the array sends the output of the pulse emitter directly to the array's output port for computation by the next network layer (see fig. 7): the membrane potential is computed and the emitter output is determined from it. In mode 0, only the membrane potential and the pulse emitter output are computed.
When the mode select signal is set to 1'b1 (mode 1), the emitter output inside the array is fed into the pooling buffer; the pooling layer is computed by the pooling buffer and pooling comparator and the result is then sent to the array's output port (see fig. 8). The membrane potential value is computed first, and the layer's output is then sent to the pooling buffer and pooling comparator to find the maximum within the mask. In mode 1, the membrane potential and the emitter values are fed continuously into the pooling unit for computation; at the same time, the membrane potential is also output and stored.
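The two externally configured modes can be summarized in a small sketch (names are illustrative, and the real array operates on hardware pulse streams rather than Python lists):

```python
def snn_array_step(mode_sel, spike_row_pair):
    """Sketch of the two operation modes selected by the mode signal.

    mode_sel = 0: emitter outputs pass straight to the array output
    (convolution layer only, as in fig. 7).
    mode_sel = 1: emitter outputs pass through the pooling path
    (2x2 max over two rows) before the array output, as in fig. 8.
    spike_row_pair: two rows of per-pixel spike counts (illustrative).
    """
    upper, lower = spike_row_pair
    if mode_sel == 0:
        return upper + lower                      # pass-through
    # mode 1: 2x2 max pooling over the two rows
    return [max(upper[c], upper[c + 1], lower[c], lower[c + 1])
            for c in range(0, len(upper), 2)]
```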
The spiking neural network computing array realizes parallel computation of the spiking neural network through its 8x8 computing-unit structure, giving good operational flexibility and hardware resource utilization. The accumulation method fully exploits the sparsity of the spiking neural network, resolves the massive redundancy of conventional computing arrays, and improves the operational efficiency of the spiking neural network.
The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to the specific details of the above embodiments, and various equivalent changes can be made to the technical solution of the present invention within the scope of the technical concept of the present invention, and all the equivalent changes belong to the protection scope of the present invention.

Claims (9)

1. The impulse neural network computing array is characterized by comprising impulse neural network computing clusters formed by a plurality of impulse neural network computing units;
The pulse neural network calculation unit comprises a membrane potential accumulator, and is used for performing accumulation operation on an input pulse sequence; the membrane potential accumulator comprises at least one membrane potential loading unit and at least one fixed-point accumulator, and the membrane potential loading unit and the fixed-point accumulator are electrically connected with each other;
The membrane potential loading unit resets a membrane potential register to 0 when the initialization input port is 1'b1, and when the initialization input port is 1'b0 reads the value of the membrane potential load port into the register according to the enable signal on the load enable port;
The fixed-point accumulator is provided with one 8-bit weight input port, one 1-bit input enable port, and one 8-bit membrane potential output port; when the input enable port is 1'b1, the fixed-point accumulator accumulates the values of the membrane potential register and outputs an 8-bit membrane potential value; the pulse emitter is electrically connected to the membrane potential accumulator and decides, from the membrane potential supplied by the accumulator, whether to emit a pulse to the next stage;
The pooling buffer area is electrically connected with the pulse emitter and is used for counting and buffering the pulses of the pulse emitter;
The pooling comparator is electrically connected with the pulse transmitter and is used for comparing and operating the input of the buffer area;
The pulse emitter decides, from the membrane potential value supplied by the accumulator, whether to emit a pulse to the next stage, and comprises a fixed-point comparator and a fixed-point subtractor; the fixed-point comparator has two 8-bit inputs and one 1-bit output; it compares the input membrane potential value with the emission threshold, and when the membrane potential exceeds the threshold the comparator output is pulled high; when the membrane potential is below the threshold, the output is not pulled high; the fixed-point subtractor has two 8-bit inputs and one 8-bit output; when the comparator output is pulled high, the subtractor performs a fixed-point subtraction on the membrane potential and outputs the membrane potential value minus the threshold.
2. The pulsed neural network computational array of claim 1, wherein the pulsed transmitter comprises at least one fixed-point comparator and at least one fixed-point subtractor, the fixed-point comparator and fixed-point subtractor being electrically connected to each other.
3. The pulsed neural network computing array of claim 1, wherein the pooling buffer comprises at least one pooled count load unit, at least one counter, and at least one first-in-first-out data buffer, the pooled count load unit, counter, and first-in-first-out data buffer being electrically connected to one another;
The pooling comparator comprises at least one fixed point number comparator and at least one data manager, wherein the fixed point number comparator and the data manager are electrically connected with each other.
4. The pulsed neural network computational array of claim 1 wherein the membrane potential loading unit is activated simultaneously with the fixed-point accumulator and the loading process is overridden by the accumulation process.
5. A pulsed neural network computational array according to claim 3, wherein the counter outputs are simultaneously electrically connected to the fifo and the pooling comparator.
6. A pulsed neural network computational array according to claim 3, wherein the data manager comprises a first-in first-out data buffer reading unit, a data packet parsing unit, and a data distribution unit; the first-in first-out data buffer reading unit is electrically connected with the data packet analysis unit, and the data packet analysis unit is electrically connected with the data distribution unit.
7. A pulsed neural network computational array according to any one of claims 1 to 6, wherein each pulsed neural network computational cell comprises an input port and an output port, the input port being electrically connected to the output port through a membrane potential accumulator, a pulse emitter, a pooling buffer, and a pooling comparator.
8. The impulse neural network computing array of claim 7, wherein the plurality of impulse neural network computing units form an impulse neural network computing cluster, and impulse neural network computing units within the cluster share the same set of control logic.
9. The impulse neural network computing array of claim 8, wherein an input of the impulse neural network-specific computing array is electrically connected to an input of the impulse neural network computing cluster, and an output of the impulse neural network computing cluster is electrically connected to an output of the impulse neural network-specific computing array.
CN202110400723.0A 2021-04-14 2021-04-14 Pulse neural network computing array Active CN113269317B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110400723.0A CN113269317B (en) 2021-04-14 2021-04-14 Pulse neural network computing array

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110400723.0A CN113269317B (en) 2021-04-14 2021-04-14 Pulse neural network computing array

Publications (2)

Publication Number Publication Date
CN113269317A CN113269317A (en) 2021-08-17
CN113269317B true CN113269317B (en) 2024-05-31

Family

ID=77229076

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110400723.0A Active CN113269317B (en) 2021-04-14 2021-04-14 Pulse neural network computing array

Country Status (1)

Country Link
CN (1) CN113269317B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113902106B (en) * 2021-12-06 2022-02-22 成都时识科技有限公司 Pulse event decision device, method, chip and electronic equipment


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11501130B2 (en) * 2016-09-09 2022-11-15 SK Hynix Inc. Neural network hardware accelerator architectures and operating method thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816102A (en) * 2017-11-22 2019-05-28 英特尔公司 Reconfigurable nerve synapse core for spike neural network
CN110046695A (en) * 2019-04-09 2019-07-23 中国科学技术大学 A kind of configurable high degree of parallelism spiking neuron array
CN110348564A (en) * 2019-06-11 2019-10-18 中国人民解放军国防科技大学 SCNN reasoning acceleration device based on systolic array, processor and computer equipment
CN111325321A (en) * 2020-02-13 2020-06-23 中国科学院自动化研究所 Brain-like computing system based on multi-neural network fusion and execution method of instruction set
CN112232486A (en) * 2020-10-19 2021-01-15 南京宁麒智能计算芯片研究院有限公司 Optimization method of YOLO pulse neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Maurizio Capra, Beatrice Bussolino, Alberto Marchisio, Guido Masera, Maurizio Martina, Muhammad Shafique. Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead. IEEE Access, vol. 8, 2020, full text. *
Zhao Liang. Memristor-based spiking neural network design and its application in image classification. China Master's Theses Full-text Database, 2020-03-15, full text. *

Also Published As

Publication number Publication date
CN113269317A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
Liu et al. Neu-NoC: A high-efficient interconnection network for accelerated neuromorphic systems
CN109447241B (en) Dynamic reconfigurable convolutional neural network accelerator architecture for field of Internet of things
CN110751280A (en) Configurable convolution accelerator applied to convolutional neural network
CN110516801A (en) A kind of dynamic reconfigurable convolutional neural networks accelerator architecture of high-throughput
CN109302357B (en) On-chip interconnection structure for deep learning reconfigurable processor
CN112418396B (en) Sparse activation perception type neural network accelerator based on FPGA
CN113269317B (en) Pulse neural network computing array
CN113298237A (en) Convolutional neural network on-chip training accelerator based on FPGA
CN112862091B (en) Resource multiplexing type neural network hardware accelerating circuit based on quick convolution
Liang et al. A 1.13 μJ/classification spiking neural network accelerator with a single-spike neuron model and sparse weights
CN109800872B (en) Neuromorphic processor based on segmented multiplexing and parameter quantification sharing
CN113283587A (en) Winograd convolution operation acceleration method and acceleration module
CN116167424B (en) CIM-based neural network accelerator, CIM-based neural network accelerator method, CIM-based neural network storage processing system and CIM-based neural network storage processing equipment
CN111723924B (en) Deep neural network accelerator based on channel sharing
Yan et al. Acceleration and optimization of artificial intelligence CNN image recognition based on FPGA
CN110555519A (en) Low-complexity convolutional neural network based on symbol random computation
CN112346704B (en) Full-streamline type multiply-add unit array circuit for convolutional neural network
CN114116052A (en) Edge calculation method and device
CN115440226A (en) Low-power-consumption system applied to voice keyword recognition based on impulse neural network
Liu et al. Energy-Efficient and Low-Latency Optical Network-on-Chip Architecture and Mapping Solution for Artificial Neural Networks
Shen et al. Learning-based adaptation to applications and environments in a reconfigurable network-on-chip
US20240296142A1 (en) Neural network accelerator
CN113296953A (en) Distributed computing architecture, method and device of cloud side heterogeneous edge computing network
Lin et al. Design of convolutional neural network SoC system based on FPGA
CN116882467B (en) Edge-oriented multimode configurable neural network accelerator circuit structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant