CN108345938A - Neural network processor including a bit conversion device, and bit conversion method thereof - Google Patents
- Publication number: CN108345938A (application number CN201810170612.3A)
- Authority: CN (China)
- Prior art keywords: data, bit, bit conversion, neural network
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/048—Activation functions
Abstract
The invention provides a neural network processor and a method for performing bit conversion on neural network data using the processor. The neural network processor includes a bit conversion device comprising an input interface, a control unit, a data conversion unit, and an output interface. The control unit generates a control signal for the data conversion unit; the input interface receives original data; the data conversion unit performs bit conversion on the original data according to the control signal, converting the original data into a bit conversion result expressed with fewer bits; and the output interface outputs the bit conversion result out of the bit conversion device. The invention reduces the number of bits used to express data, lowering the hardware cost and energy consumption required for computation and increasing computation speed.
Description
Technical Field
The invention relates to the field of artificial intelligence, and in particular to improvements to neural network processors.
Background Art
Deep learning, a branch of artificial intelligence, has developed rapidly in recent years. It has been widely applied to high-level abstract cognitive problems such as image recognition, speech recognition, natural language understanding, weather prediction, gene expression analysis, content recommendation, and intelligent robotics, and has demonstrated excellent performance. As a result, the development and improvement of artificial intelligence technology has become a research hotspot in both academia and industry.
A deep neural network is one of the most advanced perception models in the field of artificial intelligence. Such a network simulates the neural connection structure of the human brain by building a model that describes data features hierarchically through multiple transformation stages, bringing breakthroughs to large-scale data processing tasks such as image, video, and audio processing. A deep neural network model is a computational model containing a large number of nodes connected in a mesh structure; these nodes are called the neurons of the deep neural network. The connection strength between two nodes represents the weighted value of the signal passing between them, i.e., the weight, corresponding to memory in a biological neural network.
Dedicated processors for neural network computation, i.e., neural network processors, have developed accordingly. In practical neural network computation, operations such as convolution, activation, and pooling must be performed repeatedly on large amounts of data, which consumes an enormous amount of computing time and seriously degrades the user experience. Reducing the computation time of neural networks has therefore become an important direction for improving neural network processors.
Summary of the Invention
Therefore, the object of the present invention is to overcome the above-described defects of the prior art by providing a neural network processor that includes a bit conversion device, the bit conversion device comprising:
an input interface, a control unit, a data conversion unit, and an output interface;
wherein,
the control unit is configured to generate a control signal for the data conversion unit;
the input interface is configured to receive original data;
the data conversion unit is configured to perform bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with fewer bits;
the output interface is configured to output the bit conversion result out of the bit conversion device.
Preferably, in the neural network processor, the control unit is configured to determine a rule for performing bit conversion according to set parameters or input parameters, so as to generate the control signal;
wherein the parameters include information related to the number of bits of the original data and the number of bits of the bit conversion result.
Preferably, in the neural network processor, the data conversion unit is configured to determine, according to the control signal, reserved bits and truncated bits in the original data, and to determine the bit conversion result from the reserved bits of the original data and the highest of the truncated bits of the original data.
Preferably, in the neural network processor, the data conversion unit is configured to determine, according to the control signal, reserved bits and truncated bits in the original data, and to use the reserved bits of the original data as the bit conversion result.
Preferably, in the neural network processor, the data conversion unit is configured to perform bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with half the original number of bits.
A method for performing bit conversion on neural network data using any of the neural network processors described above, comprising:
1) the control unit generates a control signal for the data conversion unit;
2) the input interface receives, from outside the bit conversion device, original data on which bit conversion is to be performed;
3) the data conversion unit performs bit conversion on the original data according to the control signal, so as to convert the original data into a bit conversion result expressed with fewer bits;
4) the output interface outputs the bit conversion result out of the bit conversion device.
Preferably, in the method, step 1) comprises:
1-1) the control unit determines a rule for performing bit conversion according to set parameters or input parameters;
1-2) the control unit generates a control signal corresponding to the rule;
wherein the parameters include information related to the number of bits of the original data and the number of bits of the bit conversion result.
Preferably, in the method, step 3) comprises:
the data conversion unit determines, according to the control signal, the bit conversion result from the reserved bits of the original data and the highest of the truncated bits of the original data.
Preferably, in the method, step 3) comprises:
the data conversion unit uses, according to the control signal, the reserved bits of the original data as the bit conversion result.
Preferably, in the method, when the neural network data has been buffered and the convolution operation has not yet been performed, the buffered neural network data is input to the bit conversion device to perform steps 1)-4); or, when the convolution operation on the data has been completed and the activation operation has not yet been performed, the result of the convolution operation is input to the bit conversion device to perform steps 1)-4).
A computer-readable storage medium storing a computer program which, when executed, implements any of the methods described above.
Compared with the prior art, the present invention has the following advantages:
The invention provides a bit conversion device for a neural network processor, which can be used to adjust the number of bits used to express data in the various computation stages of a neural network. By reducing the number of bits used to express data, it is possible to lower the hardware cost required for computation, increase computation speed, reduce the neural network processor's demand for data storage space, and lower the energy consumption of neural network computation.
Brief Description of the Drawings
Embodiments of the present invention are further described below with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of a bit conversion device according to an embodiment of the present invention;
Fig. 2 is a diagram of the connections between the units of a bit conversion device according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for performing bit conversion on neural network data using the bit conversion device shown in Fig. 1, according to an embodiment of the present invention;
Fig. 4a is a diagram of the hardware structure used in the data conversion unit of the bit conversion device to perform bit conversion in "rounding mode", according to an embodiment of the present invention;
Fig. 4b is a diagram of the hardware structure used in the data conversion unit of the bit conversion device to perform bit conversion in "direct truncation mode", according to an embodiment of the present invention.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
As mentioned above, when designing a neural network processor it is desirable to reduce the computation time of the neural network. The inventors believe this can be achieved by appropriately reducing the number of bits of the data involved in neural network computation, for example by using fewer bits to represent data that would otherwise require more, thereby reducing the amount of computation and hence the computation time. In studying the prior art, the inventors found that neural network algorithms are relatively tolerant of errors in intermediate results: although representing data with fewer bits changes the precision of the data involved in the computation and thus affects the accuracy of the intermediate results obtained, this has little impact on the final output of the neural network.
In the present invention, this way of reducing the number of bits of the data used in computation is referred to as a "clipping operation" on the data, and the process of adjusting the number of binary bits needed to express a value is referred to as "bit conversion". For example, the decimal value 0.5 is represented in Q7 fixed-point format as 01000000 (Q7 uses the leftmost of the 8 bits as the sign bit and the remaining 7 bits for the fractional part, and can thus represent fractions between -1 and 1 with 7 bits of precision). In bit conversion, the value originally represented in Q7 can instead be represented in Q3, giving the result 0100 (like Q7, Q3 uses the leftmost bit as the sign bit, but uses 3 bits for the fractional part and can represent fractions between -1 and 1 with 3 bits of precision).
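To make the Q-format arithmetic above concrete, the following sketch (hypothetical helper names, not part of the patent) encodes a decimal value as a sign-plus-fraction Q-format bit string and reproduces the 0.5 example:

```python
def q_encode(value, frac_bits):
    """Encode a decimal in (-1, 1) as a sign bit plus frac_bits fraction bits."""
    sign = '1' if value < 0 else '0'
    magnitude = round(abs(value) * (1 << frac_bits))
    return sign + format(magnitude, '0{}b'.format(frac_bits))

def q_decode(bits):
    """Decode a sign-plus-fraction Q-format bit string back to a decimal."""
    frac_bits = len(bits) - 1
    magnitude = int(bits[1:], 2) / (1 << frac_bits)
    return -magnitude if bits[0] == '1' else magnitude

print(q_encode(0.5, 7))  # Q7: 01000000, as in the patent's example
print(q_encode(0.5, 3))  # Q3: 0100
```

Converting from Q7 to Q3 reduces the precision from 1/128 to 1/8; the bit conversion device described below performs this reduction directly on the bit pattern rather than by re-encoding the decimal value.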
Based on the above analysis, the present invention proposes a bit conversion device for a neural network processor. The bit conversion device can determine the rule for performing bit conversion according to set parameters or parameters input by the user, and perform bit conversion on the data accordingly. Through such conversion, the neural network processor can process a comparatively smaller amount of data, thereby increasing processing speed and reducing energy consumption. The inventors observe that in combinational logic circuits the speed of data operations is inversely proportional to the number of bits in the numeric representation, while the energy consumption of data operations is proportional to it; performing bit conversion on the data therefore yields both faster computation and lower power consumption.
Fig. 1 shows a bit conversion device 101 according to an embodiment of the present invention, comprising: an input bus unit 102 serving as the input interface, a data conversion unit 103, an output bus unit 104 serving as the output interface, and a control unit 105.
The input bus unit 102 obtains the neural network data on which bit conversion is to be performed and provides it to the data conversion unit 103. In some embodiments, the input bus unit 102 can receive and/or transmit multiple data items to be converted in parallel.
The data conversion unit 103 performs bit conversion on the neural network data from the input bus unit 102 according to a bit conversion rule determined, for example, from set parameters or parameters input by the user.
The output bus unit 104 outputs the bit conversion results produced by the data conversion unit 103 from the bit conversion device 101, providing them to the devices that perform subsequent processing in the neural network pipeline.
The control unit 105 determines the bit conversion rule and selects the corresponding bit conversion mode to control the data conversion unit 103 in performing bit conversion. The control unit 105 can determine the rule by analyzing set parameters or parameters input by the user, selecting among various preset conversion modes. The parameters here can include the number of bits of the data to be converted and the number of bits of the converted data, or the binary representation used by the data to be converted and the binary representation desired for the converted data, e.g. Q7 or Q3. For example, according to parameters input by the user, it may be determined that neural network data represented in Q7 is to be converted to Q3. When reducing the number of bits of the representation, a "rounding" approach can be used, e.g. converting 01011000 to 0110, or a "direct truncation" approach, e.g. converting 01011000 to 0101. The conversion mode ("rounding", "direct truncation", etc.) can either be input by the user or be fixed.
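A behavioral sketch of the two preset conversion modes on unsigned bit strings (the function name is illustrative and not from the patent; overflow of the kept bits on rounding is not handled in this sketch). It reproduces both 01011000 examples above:

```python
def convert_bits(bits, out_len, mode):
    """Reduce a bit string to out_len bits by rounding or direct truncation."""
    kept, dropped = bits[:out_len], bits[out_len:]
    if mode == 'truncate':
        return kept  # keep only the reserved (high) bits
    if mode == 'round':
        value = int(kept, 2) + int(dropped[0], 2)  # add highest dropped bit
        return format(value, '0{}b'.format(out_len))
    raise ValueError('unknown mode: ' + mode)

print(convert_bits('01011000', 4, 'round'))     # 0110
print(convert_bits('01011000', 4, 'truncate'))  # 0101
```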
In some embodiments, the input bus unit 102 and/or the output bus unit 104 can receive and/or transmit multiple data items to be converted in parallel.
Fig. 2 shows the connections between the units of a bit conversion device according to an embodiment of the present invention. Here the input bus unit is 128 bits wide and the output bus is 64 bits wide. The control unit receives user-input parameters from outside the bit conversion device and uses them, according to the determined bit conversion rule, to generate a mode-switching signal for the data conversion unit, so that the data conversion unit knows which method to use for bit conversion under the current conditions. The control unit can also generate an input control signal that makes the input bus unit start or pause receiving data, and an output control signal that makes the output bus unit start or pause outputting bit conversion results.
The following embodiment describes a method of performing bit conversion on neural network data using the bit conversion device shown in Fig. 1. Referring to Fig. 3, the method comprises:
Step 1. Based on set conversion requirement parameters or parameters input by the user, the control unit 105 of the bit conversion device 101 determines the bit conversion rule to use. The set conversion requirement parameters or user-input parameters include information about the number of bits of the neural network data to be converted and the number of bits of the converted data. They may also include the truncation rule to apply during bit conversion, such as "rounding" or "direct truncation".
Based on this rule, the control unit 105 selects among preset bit conversion modes. In one embodiment of the invention, the bit conversion modes include a "rounding mode" and a "direct truncation mode"; the processing for these two modes is described in the following steps.
Step 2. The input bus unit 102 of the bit conversion device 101 provides the neural network data it has obtained, on which bit conversion is to be performed, to the data conversion unit 103.
The input bus unit 102 here may include multiple interfaces capable of receiving data in parallel, so as to receive in parallel, from outside the bit conversion device 101, the neural network data on which bit conversion is to be performed. Similarly, the input bus unit 102 may include multiple interfaces capable of outputting data in parallel, so as to provide data to the data conversion unit 103 in parallel for bit conversion.
Step 3. The data conversion unit 103 performs bit conversion on the neural network data according to the bit conversion rule determined by the control unit 105.
In this step, the data conversion unit 103 may receive a control signal from the control unit 105 and perform the bit conversion according to the rule.
The inventors found that, when reducing the number of bits of the data used in computation, keeping the reduced bit width at no less than half the original bit width achieves a good trade-off between the hardware cost, processing speed, and accuracy of the neural network processor. In the present invention, therefore, the bit width of the neural network data to be converted is preferably reduced to half of the original, e.g. using a fixed hardware structure that converts 32-bit data to 16-bit, 16-bit to 8-bit, 8-bit to 4-bit, 4-bit to 2-bit, and 2-bit to 1-bit.
During bit conversion, according to the rule, the bits of the neural network data to be converted can be divided into reserved bits and truncated bits, where the reserved bits are the one or more high-order bits of the data and the truncated bits are the remaining bits. For example, for the 8-bit data 10101111, if the bit width is reduced to half of the original, the reserved bits are 1010 and the truncated bits are 1111.
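The reserved/truncated split for the bit-halving case can be sketched as follows (illustrative helper, not from the patent):

```python
def split_bits(bits):
    """Split a bit string into reserved (high) and truncated (low) halves."""
    half = len(bits) // 2
    return bits[:half], bits[half:]

reserved, truncated = split_bits('10101111')
print(reserved, truncated)  # 1010 1111, matching the patent's example
```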
Fig. 4a shows the hardware structure used in the data conversion unit 103 to perform bit conversion in "rounding mode" according to an embodiment of the present invention. Sixteen 8-bit neural network data items requiring bit conversion are input to the data conversion unit 103 in parallel. For each 8-bit item, the bits of its 4 reserved bits other than the sign bit (e.g. a1, a2, a3) and the highest of the corresponding truncated bits (e.g. a4) are used as the two inputs of an adder; the adder's output together with the sign bit of the item forms the result of the bit conversion of the 8-bit data.
As an example with reference to Fig. 4a: in "rounding mode", suppose the neural network data input to the data conversion unit 103 is 10101111 (inverse code), representing the decimal value -0.6328125. Its truncated bits are 1111. The highest truncated bit, 1, is added to the 3 non-sign reserved bits, 010; combining the sign bit of the data with the adder's result gives the bit-converted result 1011 (inverse code), representing the decimal value -0.625.
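The rounding-mode arithmetic can be checked numerically. One caveat: the patent labels 10101111 "inverse code", but the decimal values it gives (-0.6328125 rounding to -0.625) match a two's-complement Q7/Q3 reading, which this hedged sketch therefore assumes; the rounding itself adds the highest truncated bit back before the shift, as the adder in Fig. 4a does:

```python
def round_halve(raw, in_bits=8):
    """Round an in_bits-wide two's-complement value to in_bits//2 bits,
    adding the highest dropped bit first. (Two's-complement interpretation
    is an assumption; the patent labels its example "inverse code".)"""
    shift = in_bits // 2
    value = raw - (1 << in_bits) if raw >> (in_bits - 1) else raw  # sign-extend
    rounded = (value + (1 << (shift - 1))) >> shift                # round, shift
    return rounded & ((1 << shift) - 1)                            # low half bits

print((0b10101111 - 256) / 128)                # -0.6328125 in Q7
print(format(round_halve(0b10101111), '04b'))  # 1011, i.e. -5/8 = -0.625 in Q3
```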
Fig. 4b shows the hardware structure used in the data conversion unit 103 to perform bit conversion in "direct truncation mode" according to an embodiment of the present invention. Sixteen 8-bit neural network data items requiring bit conversion are input to the data conversion unit 103 in parallel, and for each 8-bit item the 4 reserved bits (e.g. a0, a1, a2, a3) are used directly as the result of the bit conversion of the 8-bit data.
As an example with reference to Fig. 4b: in "direct truncation mode", if the neural network data input to the data conversion unit 103 is 10101111 (inverse code), the result after bit conversion is 1010.
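Direct truncation keeps only the high half of the bits; on signed data this is an arithmetic right shift. A hedged sketch under the same two's-complement assumption as above:

```python
def truncate_halve(raw, in_bits=8):
    """Keep the high in_bits//2 bits of a value (direct truncation mode)."""
    shift = in_bits // 2
    return (raw >> shift) & ((1 << shift) - 1)

print(format(truncate_halve(0b10101111), '04b'))  # 1010, the reserved bits
```

Under the two's-complement reading, 1010 in Q3 is -6/8 = -0.75, so direct truncation rounds toward negative infinity, whereas rounding mode gives the nearer value -0.625.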
Step 4. The output bus unit 104 outputs the bit conversion results produced by the data conversion unit 103 from the bit conversion device 101, providing them to the devices that perform subsequent processing in the neural network pipeline.
The bit conversion device provided by the above embodiments of the present invention can be used as part of a neural network processor in the various computation stages of a neural network.
例如,可以在已完成对神经网络数据的缓存、并且尚未完成卷积运算时,采用比特转换装置对缓存的神经网络数据进行比特转换。这样做的原因在于,神经网络的不同网络层对数据所采用的比特位数可能存在不同的要求,为了适应于所需要的计算速度、以及期望的能耗,可以由比特转换装置对缓存的神经网络数据进行比特转换,并将经过比特转换所获得的结果提供至用于执行卷积运算的单元以执行卷积运算。For example, when the neural network data has been cached and the convolution operation has not been completed, a bit conversion device may be used to perform bit conversion on the cached neural network data. The reason for this is that different network layers of the neural network may have different requirements for the number of bits used in the data. In order to adapt to the required calculation speed and expected energy consumption, the neural network of the cache can be adjusted by the bit conversion device. The network data is bit-converted, and the result obtained through the bit-conversion is provided to a unit for performing a convolution operation to perform the convolution operation.
As another example, the bit conversion device may be applied to the result of the convolution operation after the convolution has been completed but before the activation operation is performed. The reason is that the accumulation carried out by the convolution unit tends to increase the bit width of the convolution result; to satisfy the bit-width requirements of subsequent operations (for example, an activation unit implemented in hardware usually operates at a fixed bit width), the convolution result must be bit-converted.
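The bit growth mentioned here is easy to quantify: a sum of N products of two b-bit operands can require up to 2b + ceil(log2 N) bits. A small illustration (the helper functions are assumptions for this sketch, not part of the patent):

```python
import math

def accumulator_bits(operand_bits: int, n_terms: int) -> int:
    """Worst-case width of a sum of n_terms products of two operand_bits-wide values."""
    return 2 * operand_bits + math.ceil(math.log2(n_terms))

def truncate_to(value: int, in_bits: int, out_bits: int) -> int:
    """Bit-convert an accumulator result down to a fixed-width activation input."""
    return (value & ((1 << in_bits) - 1)) >> (in_bits - out_bits)

# A 3x3 kernel over 8-bit data: 9 products of 8-bit values need up to
# 16 + ceil(log2 9) = 20 bits, which must then be converted back down
# (e.g. to 8 bits) before entering a fixed-width activation unit.
print(accumulator_bits(8, 9))  # 20
```

This is why the conversion sits between the convolution unit and the activation unit: the accumulator's width grows with the kernel size, while the activation hardware's width is fixed.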
Based on the above embodiments, the present invention provides a bit conversion device for a neural network processor that can adjust the number of bits used to represent data during the various computations of a neural network. By reducing the number of bits used to represent data, it can lower the hardware cost of the computation, increase computation speed, reduce the processor's demand for data storage space, and lower the energy consumed in performing neural network computations.
It should be noted that not all of the steps described in the above embodiments are mandatory; those skilled in the art may omit, substitute, or modify them as appropriate according to actual needs.
Finally, it should be noted that the above embodiments are intended only to illustrate, and not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to these embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions of the technical solutions of the present invention that do not depart from their spirit and scope shall all fall within the scope of the claims of the present invention.
Claims (11)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810170612.3A CN108345938A (en) | 2018-03-01 | 2018-03-01 | A kind of neural network processor and its method including bits switch device |
PCT/CN2018/082179 WO2019165679A1 (en) | 2018-03-01 | 2018-04-08 | Neural network processor comprising bit conversion device and method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810170612.3A CN108345938A (en) | 2018-03-01 | 2018-03-01 | A kind of neural network processor and its method including bits switch device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108345938A true CN108345938A (en) | 2018-07-31 |
Family
ID=62959552
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810170612.3A Pending CN108345938A (en) | 2018-03-01 | 2018-03-01 | A kind of neural network processor and its method including bits switch device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108345938A (en) |
WO (1) | WO2019165679A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021180201A1 (en) * | 2020-03-13 | 2021-09-16 | 华为技术有限公司 | Data processing method and apparatus for terminal network model, terminal and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106796668A (en) * | 2016-03-16 | 2017-05-31 | 香港应用科技研究院有限公司 | Method and system for bit depth reduction in artificial neural networks |
CN107203808A (en) * | 2017-05-08 | 2017-09-26 | 中国科学院计算技术研究所 | A kind of two-value Convole Unit and corresponding two-value convolutional neural networks processor |
CN107292458A (en) * | 2017-08-07 | 2017-10-24 | 北京中星微电子有限公司 | A kind of Forecasting Methodology and prediction meanss applied to neural network chip |
CN107340993A (en) * | 2016-04-28 | 2017-11-10 | 北京中科寒武纪科技有限公司 | A kind of apparatus and method for the neural network computing for supporting less digit floating number |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934331B (en) * | 2016-04-29 | 2020-06-19 | 中科寒武纪科技股份有限公司 | Apparatus and method for performing artificial neural network forward operations |
CN106447034B (en) * | 2016-10-27 | 2019-07-30 | 中国科学院计算技术研究所 | A kind of neural network processor based on data compression, design method, chip |
CN107423816B (en) * | 2017-03-24 | 2021-10-12 | 中国科学院计算技术研究所 | Multi-calculation-precision neural network processing method and system |
CN107145939B (en) * | 2017-06-21 | 2020-11-24 | 北京图森智途科技有限公司 | A computer vision processing method and device for low computing power processing equipment |
- 2018-03-01: CN application CN201810170612.3A filed; published as CN108345938A, status Pending
- 2018-04-08: WO application PCT/CN2018/082179 filed; published as WO2019165679A1, status Application Filing
Non-Patent Citations (1)
Title |
---|
ZHENG Nanning: "A Concise Course in Digital Signal Processing (《数字信号处理简明教程》)", 30 September 2015 *
Also Published As
Publication number | Publication date |
---|---|
WO2019165679A1 (en) | 2019-09-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180731 |