CN114692862B - Method for adaptively adjusting activation quantization bit width
- Publication number: CN114692862B
- Application number: CN202011622451.0A
- Authority: CN (China)
- Prior art keywords: data, bit width, quantized, quantization, conv
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/045 — Computing arrangements based on biological models; neural networks; architecture: combinations of networks
- G06N3/08 — Computing arrangements based on biological models; neural networks: learning methods
- G06F9/28 — Arrangements for program control: enhancement of operational speed, e.g. by using several microcontrol devices operating in parallel
- G06F9/3887 — Concurrent instruction execution using a plurality of independent parallel functional units controlled by a single instruction for multiple data lanes [SIMD]
Abstract
The invention provides a method for adaptively adjusting the activation quantization bit width, which aims to overcome the defects of the prior art and solve the problem that a quantized model cannot achieve the optimal acceleration ratio and precision. The method comprises the following steps. S1, data quantization: quantize the data to be quantized to obtain low-bit data. S2, when training the low-bit model, transmit the data to the next layer; for the activation, adopt Relu, and carry out convolution after quantization. S3, during inference, with the weight channels unchanged, reducing the activation bit width reduces the cases in which the convolution accumulation result exceeds 16 bits: if conv(W_sqf, F_uqf) > 1.0, reduce the activation quantization bit width until conv(W_sqf, F_uqf) ≤ 1.0. The same operation is performed on each output channel of the layer, so that the corresponding bit width is determined according to the distribution of each channel.
Description
Technical Field
The invention relates to the technical field of convolutional neural network acceleration, and in particular to a method for adaptively adjusting the activation quantization bit width.
Background
In recent years, with the rapid development of technology, the era of big data has arrived. Deep learning, with deep neural networks (DNNs) as its models, has achieved remarkable results in many key fields of artificial intelligence, such as image recognition, reinforcement learning, and semantic analysis. The convolutional neural network (CNN), a typical DNN structure, can effectively extract the hidden-layer features of images and classify them accurately, and has been widely applied to image recognition and detection in recent years.
In existing quantization methods, either all layers of the model are quantized to the same bit width, or the quantization bit width of a particular layer is adjusted manually.
However, different layers of a neural network model incur different accuracy losses when quantized to different bit widths. When the whole model is quantized to a single bit width, either the overall bit width cannot be reduced further or the model converges poorly, so the optimal acceleration ratio cannot be achieved.
Technical terms commonly used in the prior art include:
Convolutional neural network (CNN): a type of feedforward neural network that performs convolution calculations and has a deep structure.
Quantization: the process of approximating the continuous values of a signal (or a large number of possible values) by a finite number of (or fewer) discrete values.
Low bits: data quantized to 8-bit, 4-bit, or 2-bit width.
SIMD (Single Instruction, Multiple Data): a technique for achieving spatial parallelism in which a single controller drives multiple processing units, applying the same operation to each element of a set of data (a "data vector") simultaneously. In image processing, common pixel formats such as RGB565, RGBA8888, and YUV422 represent each pixel component with 8 bits or fewer. On a conventional processor with 32-bit or 64-bit registers, computing on such data uses only the lower 8 bits of each register, which is inefficient. If a 64-bit register is instead treated as eight 8-bit lanes, eight operations complete simultaneously and efficiency improves eightfold. This is the core idea of SIMD instructions.
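To make the lane-splitting idea concrete, here is a minimal numpy sketch (not real SIMD intrinsics; the byte order shown assumes a little-endian platform) that views one 64-bit word as eight independent 8-bit lanes and updates all of them with a single vectorized operation:

```python
import numpy as np

# One 64-bit "register" value.
word = np.array([0x0102030405060708], dtype=np.uint64)

# Reinterpret the same 8 bytes as eight independent 8-bit lanes
# (little-endian byte order assumed).
lanes = word.view(np.uint8)   # array([8, 7, 6, 5, 4, 3, 2, 1], dtype=uint8)

# A single vectorized operation updates all eight lanes at once,
# mimicking how one SIMD instruction processes eight 8-bit pixels.
lanes += 1
print(lanes)                  # [9 8 7 6 5 4 3 2]
print(hex(int(word[0])))      # 0x203040506070809
```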
Disclosure of Invention
In order to solve the above problems, the present invention aims to overcome the defects of the prior art, namely that a quantized model cannot achieve the optimal acceleration ratio and precision.
Specifically, the invention provides a method for adaptively adjusting the activation quantization bit width, which comprises the following steps:
S1, data quantization: quantize the data to be quantized to obtain low-bit data;
S2, when training the low-bit model, transmit the data to the next layer; for the activation, adopt Relu, and carry out convolution after quantization, with the result as follows:
This equation explains the relationship between conv(W_sqf, F_uqf) and conv(W_q, F_q), where wb and fb are the quantization bit widths of the weights and the feature map respectively, W_sqf is the weight data quantized to low bits and normalized to [-1, 1], F_uqf is the feature-map data quantized to low bits and normalized to [0, 1], and W_q and F_q are the weight and feature-map data quantized to low bits, respectively;
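The equation referred to here is not reproduced in the text; a plausible form, assuming the normalizations W_sqf = W_q / 2^(wb-1) and F_uqf = F_q / 2^fb implied by the definitions above, is:

conv(W_q, F_q) = 2^(wb-1) · 2^fb · conv(W_sqf, F_uqf)

With wb = fb = 8 this gives conv(W_q, F_q) = 2^15 · conv(W_sqf, F_uqf), so the accumulation fits a signed 16-bit range exactly when conv(W_sqf, F_uqf) ≤ 1.0, which is the condition used in step S3.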
S3, during inference, with the weight channels unchanged, reducing the activation bit width reduces the cases in which the convolution accumulation result exceeds 16 bits; if conv(W_sqf, F_uqf) > 1.0, reduce the activation quantization bit width until conv(W_sqf, F_uqf) ≤ 1.0. The same operation is performed on each output channel of the layer, so that the corresponding bit width is determined according to the distribution of each channel.
The step S1 includes:
1) Signed data quantization:
W_f = min(max(W_f, -max_w), max_w)
W_q = clamp(-2^(b-1), 2^(b-1)-1, W_int)
2) Unsigned data quantization:
W_f = min(max(W_f, 0), max_w)
W_q = clamp(0, 2^b-1, W_int)
Description of variables: W_f is the full-precision data, W_q is the simulated quantized data, W_int is the rounded integer representation of W_f, max_w is the maximum value in the full-precision data W_f, and b is the quantization bit width.
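A minimal Python sketch of these two quantizers follows (an illustration only: the text does not specify how W_int is produced, so the scale-and-round step mapping W_f onto the integer grid via max_w is an assumption):

```python
import numpy as np

def quantize_signed(w_f: np.ndarray, b: int) -> np.ndarray:
    """Signed quantization of step S1 (e.g., for weights)."""
    max_w = np.abs(w_f).max()
    w_f = np.clip(w_f, -max_w, max_w)                   # W_f = min(max(W_f, -max_w), max_w)
    w_int = np.round(w_f / max_w * (2 ** (b - 1) - 1))  # assumed scaling for W_int
    return np.clip(w_int, -2 ** (b - 1), 2 ** (b - 1) - 1)  # W_q = clamp(...)

def quantize_unsigned(f_f: np.ndarray, b: int) -> np.ndarray:
    """Unsigned quantization of step S1 (e.g., for post-Relu activations)."""
    max_w = f_f.max()
    f_f = np.clip(f_f, 0, max_w)                  # W_f = min(max(W_f, 0), max_w)
    f_int = np.round(f_f / max_w * (2 ** b - 1))  # assumed scaling for W_int
    return np.clip(f_int, 0, 2 ** b - 1)          # W_q = clamp(0, 2^b - 1, W_int)

# Example: quantize random weights to 8-bit signed values.
w_q = quantize_signed(np.random.randn(16), b=8)
```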
In the step S2, the data transferred to the next layer is:
if the data is signed;
if the data is unsigned.
The Relu used is Relu6, whose formula is as follows:
relu6(x) = min(max(x, 0), 6) ∈ [0, 6].
In the step S3, the convolution operation is accelerated by using a SIMD acceleration method.
The operation in the step S3 can also be completed during model training: if conv(W_sqf, F_uqf) > 1.0 at training step n, then fb_{n+1} = fb_n - 1 at step n+1; if conv(W_sqf, F_uqf) ≤ 1.0 at step n, then fb_{n+1} = fb_n.
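A minimal sketch of this per-step rule (the lower bound fb_min, which keeps the bit width from collapsing to zero, is an added assumption):

```python
def adjust_fb(conv_sqf_uqf: float, fb: int, fb_min: int = 2) -> int:
    """One training-step update of the activation bit width fb:
    fb_{n+1} = fb_n - 1 if conv(W_sqf, F_uqf) > 1.0 at step n,
    fb_{n+1} = fb_n otherwise."""
    if conv_sqf_uqf > 1.0 and fb > fb_min:
        return fb - 1
    return fb
```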
The method performs full-precision model training.
Thus, the present application has the advantage that adjusting the activation bit width reduces the accumulated result of the convolution, compressing the accumulated sum to within 16 bits and improving the acceleration obtained from SIMD.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate the application and together with the description serve to explain it.
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
In order that the technical content and advantages of the present invention may be more clearly understood, the invention is described below in further detail with reference to the accompanying drawings.
As shown in FIG. 1, the present invention relates to a method for improving model accuracy during convolutional neural network quantization, and in particular to a method for adaptively adjusting the activation quantization bit width.
A method of adaptively adjusting the activation quantization bit width comprises the following steps:
S1, data quantization: quantize the data to be quantized to obtain low-bit data;
S2, when training the low-bit model, transmit the data to the next layer; for the activation, adopt Relu, and carry out convolution after quantization, with the result as follows:
This equation explains the relationship between conv(W_sqf, F_uqf) and conv(W_q, F_q), where wb and fb are the quantization bit widths of the weights and the feature map respectively, W_sqf is the weight data quantized to low bits and normalized to [-1, 1], F_uqf is the feature-map data quantized to low bits and normalized to [0, 1], and W_q and F_q are the weight and feature-map data quantized to low bits, respectively;
S3, at inference time, with the weight channels unchanged, reducing the activation bit width reduces the cases in which the convolution accumulation result exceeds 16 bits: if conv(W_sqf, F_uqf) > 1.0, the activation quantization bit width is reduced until conv(W_sqf, F_uqf) ≤ 1.0. This operation can be completed during model training: if conv(W_sqf, F_uqf) > 1.0 at training step n, then fb_{n+1} = fb_n - 1 at step n+1; if conv(W_sqf, F_uqf) ≤ 1.0 at step n, then fb_{n+1} = fb_n. The same operation is performed on each output channel of the layer, so that the corresponding bit width is determined according to the distribution of each channel. In other words, to improve the acceleration obtained from SIMD, the result of conv(W_q, F_q) at inference time should fit within 16 bits. When wb = 8 and fb = 8, it follows from the above equation that for conv(W_q, F_q) to fit within 16 bits, conv(W_sqf, F_uqf) must be less than 1.0; and conv(W_sqf, F_uqf) can be adjusted by changing fb without changing wb, thereby achieving adaptive adjustment of the activation quantization bit width.
Specifically, the method first performs full-precision model training, and then:
1. Data quantization: quantize the data to be quantized according to the formulas shown to obtain low-bit data,
as shown in equation set 1, which consists of signed and unsigned quantization:
Signed quantization:
W_f = min(max(W_f, -max_w), max_w)
W_q = clamp(-2^(b-1), 2^(b-1)-1, W_int)
Unsigned quantization:
W_f = min(max(W_f, 0), max_w)
W_q = clamp(0, 2^b-1, W_int)
Description of variables: W_f is the full-precision data, W_q is the simulated quantized data, W_int is the rounded integer representation of W_f, max_w is the maximum value in the full-precision data W_f, and b is the quantization bit width.
2. When training the low-bit model, the data passed to the next layer is as shown in equation set 2:
For the activation, Relu is adopted, and the result of the convolution after quantization is as shown in equation 3:
3. Since SIMD is used to accelerate the convolution operation at inference time, and by the characteristics of SIMD an accumulation result stored in 16 bits is processed twice as fast as one stored in 32 bits, it follows from equation set 1 and equation 3 that, with the weight channels unchanged, reducing the activation bit width reduces the cases in which the convolution accumulation result exceeds 16 bits. Therefore, according to equation 3, if conv(W_sqf, F_uqf) > 1.0, the activation quantization bit width is reduced until conv(W_sqf, F_uqf) ≤ 1.0. The same operation can be performed on each output channel of the layer, so that the corresponding bit width is determined according to the distribution of each channel.
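A numpy sketch of this per-channel procedure follows; it checks the equivalent integer condition |conv(W_q, F_q)| < 2^15 (a signed 16-bit accumulator) directly, and the shapes, names, and fb floor are illustrative assumptions:

```python
import numpy as np

def choose_fb_per_channel(w_q: np.ndarray, f_uqf: np.ndarray,
                          fb_start: int = 8, fb_min: int = 2) -> list:
    """Pick an activation bit width fb for each output channel such that
    the convolution accumulation fits a signed 16-bit accumulator.

    w_q:   (C_out, K) weights already quantized to signed 8-bit (wb = 8).
    f_uqf: (K,) activations normalized to [0, 1].
    """
    fbs = []
    for w_c in w_q.astype(np.int64):
        fb = fb_start
        while fb > fb_min:
            f_q = np.round(f_uqf * (2 ** fb - 1)).astype(np.int64)
            acc = int(np.dot(w_c, f_q))   # convolution accumulation for this channel
            if abs(acc) < 2 ** 15:        # fits the 16-bit SIMD accumulator
                break
            fb -= 1                       # reduce the activation bit width
        fbs.append(fb)
    return fbs

# Example: 4 output channels, 64 accumulated terms each.
fbs = choose_fb_per_channel(np.random.randint(-128, 128, (4, 64)),
                            np.random.rand(64))
```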
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to these embodiments. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall fall within its scope of protection.
Claims (6)
1. A method of adaptively adjusting the activation quantization bit width, the method comprising the steps of:
S1, data quantization: quantizing the data to be quantized to obtain low-bit data;
S2, when training the low-bit model, transmitting the data to the next layer; for the activation, adopting Relu, and carrying out convolution after quantization, with the result as follows:
This equation explains the relationship between conv(W_sqf, F_uqf) and conv(W_q, F_q), where wb and fb are the quantization bit widths of the weights and the feature map respectively, W_sqf is the weight data quantized to low bits and normalized to [-1, 1], F_uqf is the feature-map data quantized to low bits and normalized to [0, 1], and W_q and F_q are the weight and feature-map data quantized to low bits, respectively;
S3, during inference, with the weight channels unchanged, reducing the activation bit width reduces the cases in which the convolution accumulation result exceeds 16 bits; if conv(W_sqf, F_uqf) > 1.0, reducing the activation quantization bit width until conv(W_sqf, F_uqf) ≤ 1.0; and performing the same operation on each output channel of the layer, so that the corresponding bit width is determined according to the distribution of each channel; wherein the convolution operation is accelerated using SIMD.
2. The method according to claim 1, wherein the step S1 comprises:
1) Signed data quantization:
W_f = min(max(W_f, -max_w), max_w)
W_q = clamp(-2^(b-1), 2^(b-1)-1, W_int)
2) Unsigned data quantization:
W_f = min(max(W_f, 0), max_w)
W_q = clamp(0, 2^b-1, W_int)
Description of variables: W_f is the full-precision data, W_q is the simulated quantized data, max_w is the maximum value in the full-precision data W_f, and b is the quantization bit width.
3. The method according to claim 1, wherein in the step S2, the data transferred to the next layer is:
if the data is signed;
if the data is unsigned.
4. The method according to claim 1, wherein the Relu used is Relu6, whose formula is as follows:
relu6(x) = min(max(x, 0), 6) ∈ [0, 6].
5. The method according to claim 1, wherein the operation in the step S3 is completed during model training: if conv(W_sqf, F_uqf) > 1.0 at training step n, then fb_{n+1} = fb_n - 1 at step n+1; and if conv(W_sqf, F_uqf) ≤ 1.0 at step n, then fb_{n+1} = fb_n.
6. The method according to claim 1, wherein the method performs full-precision model training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202011622451.0A | 2020-12-31 | 2020-12-31 | Method for adaptively adjusting activation quantization bit width
Publications (2)
Publication Number | Publication Date
---|---
CN114692862A | 2022-07-01
CN114692862B | 2024-10-15
Family
ID=82133796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202011622451.0A | Method for adaptively adjusting activation quantization bit width | 2020-12-31 | 2020-12-31
Country Status (1)
Country | Link
---|---
CN | CN114692862B (en)
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107480770A (en) * | 2017-07-27 | 2017-12-15 | 中国科学院自动化研究所 | The adjustable neutral net for quantifying bit wide quantifies the method and device with compression |
CN109902745A (en) * | 2019-03-01 | 2019-06-18 | 成都康乔电子有限责任公司 | A kind of low precision training based on CNN and 8 integers quantization inference methods |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110555508B (en) * | 2018-05-31 | 2022-07-12 | 赛灵思电子科技(北京)有限公司 | Artificial neural network adjusting method and device |
US11676029B2 (en) * | 2019-06-12 | 2023-06-13 | Shanghai Cambricon Information Technology Co., Ltd | Neural network quantization parameter determination method and related products |
CN110852439B (en) * | 2019-11-20 | 2024-02-02 | 字节跳动有限公司 | Data processing method and device and storage medium |
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant