CN108345938A - Neural network processor including a bit conversion device, and method thereof - Google Patents
Neural network processor including a bit conversion device, and method thereof
- Publication number
- CN108345938A CN108345938A CN201810170612.3A CN201810170612A CN108345938A CN 108345938 A CN108345938 A CN 108345938A CN 201810170612 A CN201810170612 A CN 201810170612A CN 108345938 A CN108345938 A CN 108345938A
- Authority
- CN
- China
- Prior art keywords
- bit
- data
- neural network
- conversion
- bit conversion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06N3/063 — Computing arrangements based on biological models; neural networks; physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
- G06N3/045 — Neural networks; architecture, e.g. interconnection topology; combinations of networks
- G06N3/048 — Neural networks; architecture; activation functions
Abstract
The present invention provides a neural network processor, and a method of performing bit conversion on neural network data using the neural network processor. The neural network processor includes a bit conversion device, which comprises an input interface, a control unit, a data conversion unit, and an output interface. The control unit generates a control signal for the data conversion unit; the input interface receives original data; the data conversion unit performs bit conversion on the original data according to the control signal, converting the original data into a bit conversion result expressed with fewer bits; and the output interface outputs the bit conversion result out of the bit conversion device. By reducing the number of bits used to express data, the invention can reduce the hardware cost and energy consumption required for computation and increase computation speed.
Description
Technical Field
The invention relates to the field of artificial intelligence, and in particular to improvements to neural network processors.
Background
Deep learning, a branch of artificial intelligence, has developed rapidly in recent years. It has been widely applied to high-level abstract cognitive problems such as image recognition, speech recognition, natural language understanding, weather prediction, gene expression analysis, content recommendation, and intelligent robotics, and has demonstrated excellent performance in these fields. This has made the development and improvement of artificial intelligence technology a research hotspot in both academia and industry.
The deep neural network is one of the most highly developed perception models in the field of artificial intelligence. It simulates the neural connection structure of the human brain by establishing a model that describes data characteristics through multiple transformation stages and layers, and it has brought breakthrough progress to large-scale data processing tasks involving images, video, audio, and the like. A deep neural network is a computational model comprising a large number of nodes interconnected in a mesh structure; these nodes are called the neurons of the network. The strength of the connection between two nodes represents the weighted value of the signal passed between them, i.e., the weight, corresponding to memory in a biological neural network.
Dedicated processors for neural network computation, i.e., neural network processors, have been developed accordingly. In the actual computation of a neural network, operations such as convolution, activation, and pooling must be performed repeatedly on large amounts of data, which consumes a great deal of computation time and seriously affects the user experience. Reducing the computation time of the neural network has therefore become a key goal in improving neural network processors.
Disclosure of Invention
Accordingly, the present invention is directed to overcoming the above-mentioned drawbacks of the prior art, and provides a neural network processor including a bit conversion apparatus, the bit conversion apparatus including:
an input interface, a control unit, a data conversion unit, and an output interface;
wherein,
the control unit is used for generating a control signal aiming at the data conversion unit;
the input interface is used for receiving original data;
the data conversion unit is used for performing bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with fewer bits;
the output interface is used for outputting the bit conversion result out of the bit conversion device.
Preferably, in the neural network processor, the control unit is configured to determine a rule for performing bit conversion according to a set parameter or an input parameter, so as to generate the control signal;
wherein the parameter includes information related to a number of bits of the original data and a number of bits of the bit conversion result.
Preferably, in the neural network processor, the data conversion unit is configured to determine reserved bits and truncated bits in the original data according to the control signal, and to determine the bit conversion result according to the reserved bits of the original data and the highest bit of the truncated bits.
Preferably, in the neural network processor, the data conversion unit is configured to determine reserved bits and truncated bits in the original data according to the control signal, and to use the reserved bits of the original data as the bit conversion result.
Preferably, in the neural network processor, the data conversion unit is configured to perform bit conversion on the original data according to the control signal, so that the original data is converted into a bit conversion result expressed with half the original number of bits.
A method for performing bit conversion on data of a neural network by using the neural network processor, comprising:
1) the control unit generates a control signal for the data conversion unit;
2) the input interface receives original data needing to perform bit conversion from the outside of the bit conversion device;
3) the data conversion unit performs bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with fewer bits;
4) the output interface outputs the bit conversion result out of the bit conversion device.
Preferably, according to the method, step 1) comprises:
1-1) the control unit determining a rule for performing bit conversion according to a set parameter or an input parameter;
1-2) the control unit generating a control signal corresponding to the rule;
wherein the parameter includes information related to a number of bits of the original data and a number of bits of the bit conversion result.
Preferably, according to the method, step 3) comprises:
the data conversion unit determines the bit conversion result according to the control signal, based on the reserved bits of the original data and the highest bit of the truncated bits of the original data.
Preferably, according to the method, step 3) comprises:
the data conversion unit uses the reserved bits of the original data as the bit conversion result according to the control signal.
Preferably, according to the method, when the buffering of the neural network data has been completed but the convolution operation has not yet been performed, the buffered neural network data is input to the bit conversion device to perform steps 1)-4); or, when the convolution operation on the data has been completed but the activation operation has not yet been performed, the result of the convolution operation is input to the bit conversion device to perform steps 1)-4).
A computer-readable storage medium, in which a computer program is stored which, when executed, is adapted to carry out the method of any of the above.
Compared with the prior art, the invention has the advantages that:
the invention provides a bit conversion device for a neural network processor, which can be used for adjusting the bit number adopted by the expression data in various calculation processes of the neural network. By reducing the number of bits used to represent the data, the hardware cost required for the computation may be reduced, the computation speed may be increased, the need for data storage space for the neural network processor may be reduced, and the energy consumption to perform the neural network computation may be reduced.
Drawings
Embodiments of the invention are further described below with reference to the accompanying drawings, in which:
FIG. 1 shows a block diagram of a bit conversion apparatus according to one embodiment of the present invention;
FIG. 2 is a diagram of the connection relationships among the units of the bit conversion apparatus according to an embodiment of the present invention;
FIG. 3 is a flow chart of a method for bit-converting neural network data using the bit conversion apparatus shown in FIG. 1 according to an embodiment of the present invention;
FIG. 4a is a hardware configuration diagram for performing bit conversion in the "rounding mode" in the data conversion unit of a bit conversion apparatus according to an embodiment of the present invention;
FIG. 4b is a hardware configuration diagram for performing bit conversion in the "direct truncation mode" in the data conversion unit of a bit conversion apparatus according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
As described above, in designing a neural network processor it is desirable to reduce the computation time of the neural network. In this regard, the inventors believe that the computation time can be reduced by appropriately reducing the number of bits of the data involved in the neural network calculation, for example by using fewer bits to represent data that would otherwise require more bits. The reason is that, as the inventors found in studying the prior art, neural network algorithms have relatively high fault tolerance with respect to intermediate results: even if representing data with fewer bits changes the precision of the data involved in the calculation and thus affects the accuracy of intermediate results, this has little influence on the final output of the neural network.
In the present invention, reducing the number of bits of the data used in calculation in this manner is referred to as a "clipping operation" on the data, and the process of adjusting the number of binary bits required to express a value is referred to as "bit conversion". For example, the decimal value 0.5 expressed in Q7 fixed-point format is 01000000 (here, Q7 uses the leftmost of 8 bits as a sign bit and the remaining 7 bits to express the fractional part, thereby expressing a fraction between -1 and 1 with a precision of 2^-7). When performing bit conversion, the value originally expressed in Q7 can instead be expressed in Q3, giving the result 0100 (like Q7, Q3 uses the leftmost bit as a sign bit, except that it uses 3 bits to express the fractional part and can express a fraction between -1 and 1 with a precision of 2^-3).
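For illustration, the Q7/Q3 formats above can be modeled in a few lines of Python. This sketch is not part of the patent; it assumes two's-complement encoding (consistent with the worked example in the detailed description, where 10101111 represents -0.6328125), and the function names `to_q` and `from_q` are our own.

```python
# Illustrative sketch: Qn fixed-point format, where the leftmost of (n+1)
# bits is the sign bit and the remaining n bits express a fraction in
# (-1, 1) with precision 2^-n. Two's complement is assumed.

def to_q(value: float, n: int) -> str:
    """Encode a value in [-1, 1) as an (n+1)-bit two's-complement Qn string."""
    assert -1.0 <= value < 1.0
    raw = round(value * (1 << n)) & ((1 << (n + 1)) - 1)  # wrap negatives
    return format(raw, f"0{n + 1}b")

def from_q(bits: str) -> float:
    """Decode an (n+1)-bit two's-complement Qn string back to a float."""
    n = len(bits) - 1
    raw = int(bits, 2)
    if bits[0] == "1":                 # negative: undo two's complement
        raw -= 1 << (n + 1)
    return raw / (1 << n)

print(to_q(0.5, 7))   # 01000000  (Q7)
print(to_q(0.5, 3))   # 0100      (Q3)
```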
Based on the above analysis, the present invention provides a bit conversion apparatus for a neural network processor. The bit conversion apparatus may determine the rule for performing bit conversion according to a set parameter, or based on user input, and then perform bit conversion on the data. Through such conversion, the neural network processor can process smaller amounts of data, thereby improving processing speed and reducing energy consumption. The inventors believe that, in combinational logic circuits, the speed of a data operation is inversely proportional to the number of bits in the numerical expression, while the energy consumption of the operation is proportional to that number of bits; therefore, bit conversion of the data can both accelerate calculation and reduce power consumption.
Fig. 1 shows a bit conversion apparatus 101 according to an embodiment of the present invention, including: an input bus unit 102 as an input interface, a data conversion unit 103, an output bus unit 104 as an output interface, and a control unit 105.
The input bus unit 102 is used for acquiring the neural network data that needs to undergo bit conversion and providing it to the data conversion unit 103.
The data conversion unit 103 performs bit conversion on the neural network data from the input bus unit 102 according to a rule for performing bit conversion, which may be preset or determined based on parameters input by a user.
The output bus unit 104 outputs the bit conversion result obtained by the data conversion unit 103 out of the bit conversion device 101, to be supplied to a device performing subsequent processing in the neural network processor.
The control unit 105 determines the rule of bit conversion and selects the corresponding bit conversion mode to control the data conversion unit 103 to perform the bit conversion operation. The control unit 105 may determine the rule by analyzing preset parameters or parameters input by the user, selecting from conversion modes set in advance. The parameters may include the number of bits of the data to be converted and the number of bits of the converted data, or the binary representation used by the data to be converted and the representation expected for the converted data, such as Q7 or Q3. For example, based on the parameters input by the user, it may be determined to convert neural network data represented in Q7 into a Q3 representation. In reducing the bits used for expression, a "rounding" approach may be used, e.g., converting 01011000 to 0110, or a "direct truncation" approach may be used, e.g., converting 01011000 to 0101. The conversion method, such as "rounding" or "direct truncation", may be input by the user or may be fixed in advance.
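To make the two approaches concrete, the following Python sketch models them on binary strings. It is an illustration rather than the patent's circuit: the function names are ours, and we assume the adder output simply wraps on overflow, which the text does not specify.

```python
# A minimal sketch of the two conversion modes, operating on binary strings.

def direct_truncation(bits: str, out_width: int) -> str:
    """Keep the high-order (reserved) bits and discard the truncated bits."""
    return bits[:out_width]

def rounding(bits: str, out_width: int) -> str:
    """Pass the sign bit through and add the highest truncated bit to the
    remaining reserved bits, mirroring the adder of FIG. 4a described below."""
    sign = bits[0]
    body = int(bits[1:out_width], 2) + int(bits[out_width])
    body &= (1 << (out_width - 1)) - 1     # adder output is out_width-1 bits
    return sign + format(body, f"0{out_width - 1}b")

print(rounding("01011000", 4))            # 0110, as in the example above
print(direct_truncation("01011000", 4))   # 0101
```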
In some embodiments, the input bus unit 102 and/or the output bus unit 104 may receive and/or transmit multiple data to be converted in parallel.
FIG. 2 is a diagram of the connection relationships among the units of the bit conversion apparatus according to an embodiment of the present invention. In this embodiment, the input bus is 128 bits wide and the output bus is 64 bits wide. The control unit receives, from outside the bit conversion device, parameters input by a user, and generates a mode-switching signal for the data conversion unit according to the determined bit conversion rule, so that the data conversion unit knows which mode to use for bit conversion under the current conditions. The control unit may further generate an input control signal for controlling the input bus unit to start or suspend receiving data, and an output control signal for controlling the output bus unit to start or suspend outputting the bit conversion result.
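A software model of these three signals might look as follows; all names here are assumptions for illustration, not identifiers from the patent.

```python
# Illustrative model of the control unit's outputs: a mode-switch signal for
# the data conversion unit, plus input/output control signals for the buses.

from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    ROUNDING = "rounding"
    DIRECT_TRUNCATION = "direct_truncation"

@dataclass
class ControlSignals:
    mode_switch: Mode      # which conversion rule the data conversion unit applies
    input_enable: bool     # start (True) or suspend (False) receiving data
    output_enable: bool    # start (True) or suspend (False) outputting results

def make_control_signals(in_bits: int, out_bits: int, mode: Mode) -> ControlSignals:
    # The preferred rule described below halves the bit width; check that here.
    assert out_bits * 2 == in_bits, "preferred configuration halves the width"
    return ControlSignals(mode_switch=mode, input_enable=True, output_enable=True)
```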
The following describes, according to an embodiment, the process of bit-converting neural network data using the bit conversion apparatus shown in FIG. 1. Referring to FIG. 3, the method includes:
Step 1. The rule of bit conversion to be used is determined by the control unit 105 in the bit conversion device 101 based on preset conversion requirement parameters or parameters input by the user. These parameters include information on the number of bits of the neural network data to be converted and the number of bits after conversion. They may further include the truncation rule to apply when performing bit conversion, such as "rounding" or "direct truncation".
Based on the above rule, the control unit 105 selects from bit conversion modes set in advance. According to an embodiment of the invention, the bit conversion modes include a "rounding mode" and a "direct truncation mode"; the processing for these two modes is described in the following steps.
Step 2. The input bus unit 102 in the bit conversion device 101 provides the neural network data it has obtained, which needs to undergo bit conversion, to the data conversion unit 103.
The input bus unit 102 here may include a plurality of interfaces capable of receiving data in parallel, so as to receive, in parallel, the neural network data requiring bit conversion from outside the bit conversion device 101. Similarly, it may also include a plurality of interfaces capable of outputting data in parallel, thereby supplying the data in parallel to the data conversion unit 103 for bit conversion.
Step 3. The data conversion unit 103 performs bit conversion on the neural network data according to the rule of bit conversion determined by the control unit 105.
In this step, the data conversion unit 103 may receive a control signal from the control unit 105 in order to perform bit conversion according to the rule.
The inventors find that when reducing the number of bits of data used for calculation, if the reduced bit count is greater than or equal to half the bit count of the original data, the neural network processor can strike a good compromise among hardware cost, processing speed, and accuracy. Therefore, in the present invention it is preferable to reduce the bit count of the neural network data to half the original, for example using a fixed hardware structure that converts 32-bit data into 16-bit data, 16-bit data into 8-bit data, 8-bit data into 4-bit data, 4-bit data into 2-bit data, and 2-bit data into 1-bit data.
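As a sketch of this fixed halving chain (reusing the illustrative `direct_truncation` helper from the earlier sketch, and chosen input value included purely as an example), successive passes narrow a 32-bit value step by step:

```python
# Successively halving a 32-bit value down to 1 bit with direct truncation
# (illustrative only; a real device would use fixed hardware per stage).
data = format(0xDEADBEEF, "032b")
for width in (16, 8, 4, 2, 1):
    data = direct_truncation(data, width)
    print(f"{width:2d} bits: {data}")
```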
In the process of performing bit conversion, according to the rule, the bits of the neural network data to be converted may be divided into reserved bits and truncated bits, where the reserved bits are one or more of the higher-order bits of the data and the truncated bits are the remaining bits. For example, for the 8-bit data 10101111, if the bit count is reduced to half the original, the reserved bits are 1010 and the truncated bits are 1111.
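In string terms, and consistent with the helpers sketched above, the split is simply:

```python
# Splitting 8-bit data into a 4-bit reserved part and a 4-bit truncated part.
bits = "10101111"
reserved, truncated = bits[:4], bits[4:]
print(reserved, truncated)   # 1010 1111
```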
FIG. 4a shows a hardware structure for performing bit conversion in the "rounding mode" in the data conversion unit 103 according to an embodiment of the present invention. Sixteen 8-bit neural network data requiring bit conversion are input in parallel to the data conversion unit 103. For each 8-bit datum, the bits of its 4-bit reserved portion other than the sign bit (e.g., a1, a2, a3) and the highest bit of the corresponding truncated portion (e.g., a4) are used as the two inputs of an adder, and the output of the adder together with the sign bit of the datum serves as the result of performing bit conversion on that 8-bit datum.
As illustrated with reference to FIG. 4a, in the "rounding mode", suppose the neural network data input to the conversion unit 103 is 10101111 (two's complement), which represents the decimal value -0.6328125 and whose truncated bits are 1111. The highest truncated bit, 1, is added to the 3 non-sign reserved bits, 010, and together with the sign bit this yields the bit conversion result 1011 (two's complement), which represents the decimal value -0.625.
FIG. 4b shows a hardware structure for performing bit conversion in the "direct truncation mode" in the data conversion unit 103 according to an embodiment of the present invention. Sixteen 8-bit neural network data requiring bit conversion are input in parallel to the data conversion unit 103, and the 4-bit reserved portion of each 8-bit datum (e.g., a0, a1, a2, a3) is used directly as the result of performing bit conversion on that datum.
As illustrated with reference to FIG. 4b, in the "direct truncation mode", assuming the neural network data input to the conversion unit 103 is 10101111 (two's complement), the result after performing the bit conversion is 1010.
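Both worked examples can be checked with the illustrative helpers sketched earlier, assuming two's-complement Q7 input and Q3 output, which matches the decimal values quoted above:

```python
# Verifying the FIG. 4a / FIG. 4b examples with the earlier sketches.
x = "10101111"
print(from_q(x))                         # -0.6328125  (Q7 input)
print(from_q(rounding(x, 4)))            # -0.625      (rounding mode: 1011)
print(from_q(direct_truncation(x, 4)))   # -0.75       (direct truncation: 1010)
```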
Step 4. The output bus unit 104 outputs the bit conversion result obtained by the data conversion unit 103 out of the bit conversion device 101, to be provided to a device performing subsequent processing in the neural network processor.
The bit conversion apparatus provided by the above-described embodiments of the present invention may be used, as part of a neural network processor, in various computational processes of a neural network.
For example, the bit conversion device may be used to perform bit conversion on buffered neural network data after the buffering of the data has been completed but before the convolution operation is performed. The reason is that different network layers of the neural network may place different requirements on the number of bits used by the data; to meet the required computation speed and expected energy consumption, the buffered neural network data may be bit-converted by the bit conversion device, and the converted result provided to the unit that performs the convolution operation.
As another example, after the convolution operation on the data has been completed but before the activation operation is performed, the bit conversion device may be used to perform bit conversion on the result of the convolution operation. The reason is that the accumulation performed by the convolution unit tends to increase the number of bits of the convolution result, and to meet the bit-width requirements of subsequent operations (for example, activation units implemented in hardware often use a fixed bit width), the result of the convolution must be bit-converted.
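The bit-width growth mentioned here is straightforward to quantify. The following sketch (our own arithmetic, not from the patent) shows why a convolution accumulator is wider than its operands, and hence why down-conversion helps before a fixed-width activation unit:

```python
# Accumulating K products of two n-bit values needs up to
# 2*n + ceil(log2(K)) bits, so the convolution output is wider than
# its inputs and must be narrowed before a fixed-width activation unit.
import math

def accumulator_width(operand_bits: int, num_products: int) -> int:
    return 2 * operand_bits + math.ceil(math.log2(num_products))

print(accumulator_width(8, 9))   # 20 bits for a 3x3 convolution window
```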
Based on the above embodiments, the present invention provides a bit conversion apparatus for a neural network processor, which can be used to adjust the number of bits used to express data in various calculations of the neural network. By reducing the number of bits used to represent the data, the hardware cost required for the computation may be reduced, the computation speed may be increased, the need for data storage space for the neural network processor may be reduced, and the energy consumption to perform the neural network computation may be reduced.
It should be noted that not all of the steps described in the above embodiments are necessary, and those skilled in the art may make appropriate substitutions, replacements, modifications, and the like according to actual needs.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents substituted without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (11)
1. A neural network processor, comprising bit conversion means, the bit conversion means comprising:
an input interface, a control unit, a data conversion unit, and an output interface;
wherein,
the control unit is used for generating a control signal aiming at the data conversion unit;
the input interface is used for receiving original data;
the data conversion unit is used for performing bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with fewer bits;
the output interface is used for outputting the bit conversion result out of the bit conversion device.
2. The neural network processor of claim 1, wherein the control unit is configured to determine a rule for performing bit conversion according to a set parameter or an input parameter to generate the control signal;
wherein the parameter includes information related to a number of bits of the original data and a number of bits of the bit conversion result.
3. The neural network processor of claim 2, wherein the data conversion unit is configured to determine reserved bits and truncated bits in the original data according to the control signal, and to determine the bit conversion result according to the reserved bits of the original data and the highest bit of the truncated bits.
4. The neural network processor of claim 2, wherein the data conversion unit is configured to determine reserved bits and truncated bits in the original data according to the control signal, and to use the reserved bits of the original data as the bit conversion result.
5. The neural network processor of claim 1, wherein the data conversion unit is configured to perform bit conversion on the original data according to the control signal, so that the original data is converted into a bit conversion result expressed with half the original number of bits.
6. A method of bit converting data of a neural network using a neural network processor as claimed in any one of claims 1 to 5, comprising:
1) the control unit generates a control signal for the data conversion unit;
2) the input interface receives original data needing to perform bit conversion from the outside of the bit conversion device;
3) the data conversion unit performs bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with fewer bits;
4) the output interface outputs the bit conversion result out of the bit conversion device.
7. The method of claim 6, wherein step 1) comprises:
1-1) the control unit determining a rule for performing bit conversion according to a set parameter or an input parameter;
1-2) the control unit generating a control signal corresponding to the rule;
wherein the parameter includes information related to a number of bits of the original data and a number of bits of the bit conversion result.
8. The method of claim 7, wherein step 3) comprises:
the data conversion unit determines the bit conversion result according to the control signal, based on the reserved bits of the original data and the highest bit of the truncated bits of the original data.
9. The method of claim 7, wherein step 3) comprises:
the data conversion unit uses the reserved bits of the original data as the bit conversion result according to the control signal.
10. The method according to any one of claims 6 to 9, wherein, when the buffering of the neural network data has been completed but the convolution operation has not yet been performed, the buffered neural network data is input to the bit conversion device to perform steps 1) to 4); or, when the convolution operation on the data has been completed but the activation operation has not yet been performed, the result of the convolution operation is input to the bit conversion device to perform steps 1) to 4).
11. A computer-readable storage medium, in which a computer program is stored which, when being executed, is adapted to carry out the method of any one of claims 6-10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810170612.3A CN108345938A (en) | 2018-03-01 | 2018-03-01 | A kind of neural network processor and its method including bits switch device |
PCT/CN2018/082179 WO2019165679A1 (en) | 2018-03-01 | 2018-04-08 | Neural network processor comprising bit conversion device and method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810170612.3A CN108345938A (en) | 2018-03-01 | 2018-03-01 | A kind of neural network processor and its method including bits switch device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108345938A true CN108345938A (en) | 2018-07-31 |
Family
ID=62959552
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810170612.3A Pending CN108345938A (en) | 2018-03-01 | 2018-03-01 | A kind of neural network processor and its method including bits switch device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108345938A (en) |
WO (1) | WO2019165679A1 (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107330515A (en) * | 2016-04-29 | 2017-11-07 | 北京中科寒武纪科技有限公司 | A kind of apparatus and method for performing artificial neural network forward operation |
CN106447034B (en) * | 2016-10-27 | 2019-07-30 | 中国科学院计算技术研究所 | A kind of neural network processor based on data compression, design method, chip |
CN107423816B (en) * | 2017-03-24 | 2021-10-12 | 中国科学院计算技术研究所 | Multi-calculation-precision neural network processing method and system |
CN107145939B (en) * | 2017-06-21 | 2020-11-24 | 北京图森智途科技有限公司 | Computer vision processing method and device of low-computing-capacity processing equipment |
Filing history:
- 2018-03-01: CN application CN201810170612.3A filed; published as CN108345938A (status: Pending)
- 2018-04-08: PCT application PCT/CN2018/082179 filed; published as WO2019165679A1 (Application Filing)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106796668A (en) * | 2016-03-16 | 2017-05-31 | 香港应用科技研究院有限公司 | For the method and system that bit-depth in artificial neural network is reduced |
CN107340993A (en) * | 2016-04-28 | 2017-11-10 | 北京中科寒武纪科技有限公司 | A kind of apparatus and method for the neural network computing for supporting less digit floating number |
CN107203808A (en) * | 2017-05-08 | 2017-09-26 | 中国科学院计算技术研究所 | A kind of two-value Convole Unit and corresponding two-value convolutional neural networks processor |
CN107292458A (en) * | 2017-08-07 | 2017-10-24 | 北京中星微电子有限公司 | A kind of Forecasting Methodology and prediction meanss applied to neural network chip |
Non-Patent Citations (1)
Title |
---|
Zheng Nanning (郑南宁): "A Concise Course in Digital Signal Processing" (《数字信号处理简明教程》), 30 September 2015 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021180201A1 (en) * | 2020-03-13 | 2021-09-16 | 华为技术有限公司 | Data processing method and apparatus for terminal network model, terminal and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2019165679A1 (en) | 2019-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11307864B2 (en) | Data processing apparatus and method | |
CN107862374B (en) | Neural network processing system and processing method based on assembly line | |
CN107944545B (en) | Computing method and computing device applied to neural network | |
CN109325591B (en) | Winograd convolution-oriented neural network processor | |
CN107423816B (en) | Multi-calculation-precision neural network processing method and system | |
CN107340993B (en) | Arithmetic device and method | |
CN110097172B (en) | Convolutional neural network data processing method and device based on Winograd convolutional operation | |
US20220004858A1 (en) | Method for processing artificial neural network, and electronic device therefor | |
US20190044535A1 (en) | Systems and methods for compressing parameters of learned parameter systems | |
KR102655950B1 (en) | High speed processing method of neural network and apparatus using thereof | |
CN110781686B (en) | Statement similarity calculation method and device and computer equipment | |
CN107395211B (en) | Data processing method and device based on convolutional neural network model | |
CN109508784B (en) | Design method of neural network activation function | |
CN108171328B (en) | Neural network processor and convolution operation method executed by same | |
CN108345934B (en) | Activation device and method for neural network processor | |
CN113901823B (en) | Named entity identification method, named entity identification device, storage medium and terminal equipment | |
Putra et al. | lpspikecon: Enabling low-precision spiking neural network processing for efficient unsupervised continual learning on autonomous agents | |
CN111047045B (en) | Distribution system and method for machine learning operation | |
EP3444758B1 (en) | Discrete data representation-supporting apparatus and method for back-training of artificial neural network | |
CN112561050A (en) | Neural network model training method and device | |
CN108345938A (en) | A kind of neural network processor and its method including bits switch device | |
CN111431540B (en) | Neural network model-based FPGA configuration file arithmetic compression and decompression method | |
WO2021238734A1 (en) | Method for training neural network, and related device | |
KR20190118332A (en) | Electronic apparatus and control method thereof | |
CN115496181A (en) | Chip adaptation method, device, chip and medium of deep learning model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20180731 |