CN109902745A - CNN-based low-precision training and 8-bit integer quantization inference method - Google Patents
CNN-based low-precision training and 8-bit integer quantization inference method
- Publication number
- CN109902745A (application number CN201910154088.5A)
- Authority
- CN
- China
- Prior art keywords
- quantization
- integers
- integer
- low precision
- weight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Image Analysis (AREA)
Abstract
The invention provides a CNN-based low-precision training and 8-bit integer quantization inference method. The main steps are as follows: low-precision model training: model training is performed with a 16-bit floating-point low-precision fixed-point algorithm to obtain a model for object detection; weight quantization: an 8-bit integer quantization scheme is proposed, and the weight parameters of the convolutional neural network are quantized layer by layer from 16-bit floating point to 8-bit integers; 8-bit integer quantization inference: the activation values are quantized to 8-bit integer data, i.e. each layer of the CNN receives int8 quantized input and produces int8 quantized output. The present invention trains the model with a 16-bit floating-point low-precision fixed-point algorithm to obtain the weights, then quantizes them to 8-bit integer data for forward inference. Compared with directly performing 8-bit integer quantization inference on weights obtained by training the model with a 32-bit floating-point algorithm, the method optimizes the inference process of the convolutional layers and effectively reduces the precision loss caused by low-bit fixed-point quantization inference.
Description
Technical field
The invention belongs to the technical field of convolutional neural networks, and in particular relates to a CNN-based low-precision training and 8-bit integer quantization inference method.
Background art
Convolutional neural networks (CNNs) have achieved outstanding results in fields such as image classification, object detection, and face recognition. However, due to the complexity and computational latency of the network structures, realizing real-time forward inference of CNNs on embedded platforms with relatively limited storage and computing resources requires compressing the model size and improving the computational efficiency of the model while keeping the precision loss under control.
A commonly used approach is to quantize the CNN weights and/or activation values, converting the data from 32-bit floating point to lower-bit integers. However, current quantization methods still fall short in the trade-off between precision and computational efficiency. Many quantization methods compress the network to varying degrees and save storage resources, but cannot effectively improve computational efficiency on hardware platforms. A large body of literature quantizes the weights, which effectively alleviates the shortage of storage resources on hardware platforms, but rarely addresses computational efficiency. Binary neural networks (BNN), ternary weight networks (TWN), and XNOR-Net implement multiplication through shift operations and improve computational efficiency on hardware platforms, but quantizing both the weights and the activation values to 1-bit or 2-bit representations usually leads to a sharp drop in precision. This places very strict demands on the model's performance tolerance of the quantization scheme, and is unsuitable for lightweight models with simple network structures intended for deployment on embedded platforms.
Low-bit data representations save hardware resources and greatly simplify the design of hardware accelerators. However, most studies perform high-precision 32-bit floating-point model training with GPU acceleration and apply low-precision quantization only during forward inference to speed up CNN inference. When extremely low-bit data representations are used, the heavy loss of parameter precision causes a significant drop in the model's object detection accuracy, so training a low-precision model is particularly important.
Summary of the invention
The invention aims to overcome the above-mentioned defects in the prior art, and proposes a CNN-based low-precision training and 8-bit integer quantization inference method, which solves the problems of large precision loss and insufficient computational efficiency in existing quantization methods.
In order to solve the above technical problems, the technical solution of the invention is as follows:
A CNN-based low-precision training and 8-bit integer quantization inference method comprises the following steps:
Low-precision model training: model training is performed with a 16-bit floating-point low-precision fixed-point algorithm to obtain a model for object detection, i.e. the weights.
Weight quantization: an 8-bit integer quantization scheme is proposed, and the weight parameters of the convolutional neural network are quantized layer by layer from 16-bit floating point to 8-bit integers.
8-bit integer quantization inference: the activation values are quantized to 8-bit integer data, i.e. each layer of the CNN receives int8 quantized input and produces int8 quantized output.
Further, the low-precision model training comprises training the model with GPU acceleration on a large server, with data saved in 16-bit floating point during the computation.
Further, for the 16-bit floating-point data in the computation, 2 bits store the integer part and 14 bits store the fractional part, and 14 bits of precision of the floating-point data are retained by rounding.
Further, the weight quantization comprises the proposed quantization scheme

q = Round((x − a)/(b − a) × 255 − 127)

where x denotes the floating-point data and a, b respectively denote the minimum and maximum values of the data in the array to be quantized, i.e. a := min(xi), b := max(xi); the quantized value q is then obtained with the rounding function Round().
Further, the weights are divided into a series of arrays by layer, the extreme values of each weight array are found, and the data within the same array are proportionally scaled to 8-bit integers.
Further, the 8-bit integer quantization inference process comprises the following steps:
(a) BN algorithm pre-processing: the mean and variance of the input samples are calculated before the convolution operation and a normalization pre-processing is performed, saving the computation time of the BN algorithm;
(b) integer convolution operation: the floating-point multiplications are converted into integer multiplications with the 8-bit integer quantization scheme, fitting the result of the floating-point multiplications as closely as possible, while the integer multiplications significantly improve computational efficiency;
(c) activation function optimization: the activation region [a, b] of each convolutional layer is chosen layer by layer, and the optimized activation function maps the convolution results to the known region [a, b], saving the computation time of quantizing the activation values.
Further, full-integer data are used for the calculations in the above inference process.
The invention has the following advantages and positive effects:
The present invention first trains the model with a 16-bit floating-point low-precision fixed-point algorithm to obtain the weights, then quantizes them to 8-bit integer data for forward inference. Compared with directly performing 8-bit integer quantization inference on weights obtained by training with a 32-bit floating-point algorithm, this effectively reduces the precision loss caused by low-bit fixed-point quantization inference. In addition, since the computational bottleneck of convolutional neural networks is the convolutional layers, an 8-bit integer quantization scheme is proposed to improve computational efficiency, and the inference process of the convolutional layers is optimized with this scheme: the BN algorithm pre-processing is performed first, reducing the computation time of the BN algorithm, followed by the convolution operation; the activation region of each convolutional layer is then determined by thousands of quantization inference experiments on the validation set, saving the time overhead of having to compute the extreme values of the activation values in real time after every activation function operation in order to quantize them. This method fits the floating-point forward inference process and improves the computational efficiency of the convolutional layers while keeping the precision loss under control.
Detailed description of the invention
Fig. 1 is tiny-YOLO network structure;
Fig. 2 is the flow chart that the present invention is applied to that tiny-YOLO network carries out model training, forward inference optimization;
Fig. 3 a is not carry out pretreated 8 integers of BN to quantify reasoning flow chart;
Fig. 3 b is that the present invention carries out the pretreated 8 integers quantization reasoning flow chart of BN;
Fig. 4 is overall procedure of the present invention.
Specific embodiment
It should be noted that, in the case of no conflict, the embodiments of the invention and the features in the embodiments can be combined with each other.
A detailed description of specific embodiments of the invention is provided below.
The technical solution of the present invention is divided into two stages: in the first stage, model training is performed with the 16-bit floating-point low-precision fixed-point algorithm to obtain a model for object detection, i.e. the weights; in the second stage, the weights are quantized with the 8-bit integer quantization scheme, the activation values are quantized to 8-bit integer data, and 8-bit integer quantization inference is realized.
The specific steps are as follows:
A. The model is trained with the 16-bit floating-point low-precision fixed-point algorithm in the <2.14> format, i.e. the integer part is represented with 2 bits and the fractional part with 14 bits, and 32-bit floating-point data are converted to 16-bit floating-point data by rounding. The model training parameters contain a large amount of data close to 0; retaining 14 bits of precision prevents many parameters from being rounded to 0, which would cause the loss of gradient information. The training error of this method essentially fits that of the 32-bit floating-point algorithm and does not significantly reduce the convergence rate.
B. Weight quantization: the weights obtained from model training are converted to 8-bit integer data. The mathematical definition of the 8-bit integer quantization scheme, i.e. the correspondence between the int8 representation (the quantized value is denoted by q) and the original floating-point representation (the original 32-bit floating-point value is denoted by x), is described by the following formula:

q = Round((x − a)/(b − a) × 255 − 127)

The 16-bit floating-point data are first proportionally scaled to floating-point values in [−127, 128], where a and b respectively denote the minimum and maximum values of the data in the array to be quantized, i.e. a := min(xi), b := max(xi); the quantized value q is then obtained by rounding, i.e. the 16-bit floating-point numbers are proportionally scaled to integers in the range [−127, 128].
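A minimal Python sketch of this per-array quantization scheme and its inverse (function names are illustrative; int16 is used for storage here because the stated range [−127, 128] includes 128, which does not fit in a signed 8-bit type):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Proportionally scale a floating-point array to integers in [-127, 128]."""
    a, b = float(x.min()), float(x.max())          # extreme values of the array
    q = np.round((x - a) / (b - a) * 255.0 - 127.0)
    return q.astype(np.int16), a, b

def dequantize_int8(q: np.ndarray, a: float, b: float) -> np.ndarray:
    """Invert the mapping: x ~ (q + 127)/255 * (b - a) + a."""
    return (q.astype(np.float32) + 127.0) / 255.0 * (b - a) + a

w = np.float32([-1.0, -0.2, 0.5, 1.0])
q, a, b = quantize_int8(w)
print(q)    # [-127  -25   64  128]; e.g. x = 0.5 -> Round(191.25 - 127) = 64
```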
C. Batch normalization (BN) pre-processing is performed with the quantized weights and the input data of the convolutional layer. The BN algorithm is usually handled by a separate operation block after the convolution operation of each convolutional layer; the mathematical expression of the BN algorithm is given by the following formula:

Y = γ · (WX − μ)/√(δ² + ε) + β

where W is the weight matrix of the convolutional layer, X is the input of the convolutional layer, i.e. the feature map matrix, the scaling factor γ and the bias parameter β are the reconstruction parameters learned by the BN layer, which allow the network to recover the feature distribution learned during training, ε is a very small constant, μ is the mean of the feature map, and δ² is the variance of the feature map.
The following formula can be derived:

Y = [γ/√(δ² + ε)] · WX + [β − γμ/√(δ² + ε)]

which simplifies to:

Y = W_BN · X + β_BN

Defining W_BN = γW/√(δ² + ε) and β_BN = β − γμ/√(δ² + ε), W_BN and β_BN are first computed before each convolution operation and quantized to 8-bit integer data.
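A minimal Python sketch of this BN folding, assuming the standard per-output-channel BN formulation given above (names and shapes are illustrative; a 1x1 convolution is written as a matrix multiply for brevity):

```python
import numpy as np

def fold_bn(W, gamma, beta, mu, var, eps=1e-5):
    """Fold BN into the convolution so that
    gamma*(W@X - mu)/sqrt(var + eps) + beta == W_bn@X + beta_bn."""
    s = gamma / np.sqrt(var + eps)     # per-output-channel scale
    return W * s[:, None], beta - mu * s

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 9)).astype(np.float32)    # 4 output channels
X = rng.normal(size=(9, 16)).astype(np.float32)   # 16 spatial positions
gamma = rng.uniform(0.5, 1.5, 4).astype(np.float32)
beta = rng.normal(size=4).astype(np.float32)
mu = rng.normal(size=4).astype(np.float32)
var = rng.uniform(0.1, 2.0, 4).astype(np.float32)

W_bn, b_bn = fold_bn(W, gamma, beta, mu, var)
ref = gamma[:, None] * (W @ X - mu[:, None]) / np.sqrt(var + 1e-5)[:, None] + beta[:, None]
print(np.allclose(W_bn @ X + b_bn[:, None], ref, atol=1e-4))   # True
```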
D. The convolution operation is performed with the W_BN calculated in step C and the input (feature map) of the convolutional layer, i.e. Y = W_BN · X. According to the properties of matrix operations, the convolution operation is converted into the inner product operations of pairs of vectors, which facilitates the hardware implementation of the vector inner product with the DSP computing units of an FPGA. In the implementation, the convolution result is stored as int32, then the result is quantized to int8, and finally the β_BN calculated in step C is added.
Any output value w_i of the convolution operation is obtained by the inner product of two vectors υ_i = [x_i1, x_i2, …, x_in] and μ_i = [y_i1, y_i2, …, y_in]:

w_i = Σ_{j=1}^{n} x_ij · y_ij

where the vector length n is determined by the kernel size and c, the number of channels of the convolutional layer.
The mathematical definition of the quantization scheme in step B can be inverted as x = (q + 127)/255 · (b − a) + a. Substituting this into the inner product formula gives

w_i = Σ_{j=1}^{n} [(q_xj + 127)/255 · (b_x − a_x) + a_x] · [(q_yj + 127)/255 · (b_y − a_y) + a_y]

which can be simplified so that the accumulation Σ (q_xj + 127)(q_yj + 127) and the side sums Σ (q_xj + 127) and Σ (q_yj + 127) are computed as integer matrix multiplications, while the remaining scale and offset terms collapse into a factor B. Only B is floating-point, and since B depends only on the extreme values of the data it can be precomputed, achieving the purpose of converting the floating-point convolution operation into an integer convolution operation.
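A minimal Python sketch of this integer inner product following the expansion above (names are illustrative; the int32 accumulation is the only per-element work, and every floating-point term depends only on the precomputable extreme values):

```python
import numpy as np

def int_inner_product(qx, ax, bx, qy, ay, by):
    """Recover sum(x_j * y_j) from int8-quantized operands, where
    x = (q + 127)/255*(b - a) + a for each operand."""
    sx, sy = (bx - ax) / 255.0, (by - ay) / 255.0    # per-array scales
    px = qx.astype(np.int32) + 127                   # shifted to [0, 255]
    py = qy.astype(np.int32) + 127
    acc = np.dot(px, py)                             # integer accumulation
    # the remaining terms only need the precomputed extremes a, b
    return sx * sy * acc + sx * ay * px.sum() + sy * ax * py.sum() + qx.size * ax * ay

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 64).astype(np.float32)
y = rng.uniform(-2, 2, 64).astype(np.float32)
ax, bx, ay, by = x.min(), x.max(), y.min(), y.max()
qx = np.round((x - ax) / (bx - ax) * 255 - 127).astype(np.int16)
qy = np.round((y - ay) / (by - ay) * 255 - 127).astype(np.int16)
print(int_inner_product(qx, ax, bx, qy, ay, by), float(np.dot(x, y)))  # close
```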
E. Activation function optimization: the operation result of step D is activated to obtain the output of the convolutional layer. Thousands of quantization inference experiments are performed on the validation set, the pre-activation values in the inference process are collected starting from the first convolutional layer, a data distribution curve is fitted, and an appropriate activation range [a, b] is chosen layer by layer, ensuring that the precision loss is reduced as far as possible. Taking the Leaky activation function as an example, the Leaky_n activation function is designed so that, in addition to the Leaky response, its output is mapped into the chosen region [a, b].
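A minimal Python sketch of the Leaky and Leaky_n activations (the 0.1 negative slope is the darknet default, and the clipping form of Leaky_n is an illustrative assumption consistent with mapping the result into the known region [a, b]):

```python
import numpy as np

def leaky(x: np.ndarray) -> np.ndarray:
    """Leaky ReLU with darknet's default negative slope of 0.1."""
    return np.where(x >= 0, x, 0.1 * x)

def leaky_n(x: np.ndarray, a: float, b: float) -> np.ndarray:
    """Leaky activation whose output is confined to the chosen activation
    region [a, b], so the layer's quantization range is known in advance."""
    return np.clip(leaky(x), a, b)

y = np.float32([-3.0, -0.5, 0.2, 7.0])
print(leaky_n(y, a=-0.25, b=4.0))   # [-0.25 -0.05  0.2   4.  ]
```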
The following mainly illustrates the improvements of the present invention in model training and integer inference by taking the tiny-YOLO network performing real-time vehicle detection, an object detection application, as an example.
The tiny-YOLO network structure and the weight size and input feature map size of each convolutional layer are shown in Fig. 1. Tiny-YOLO has 15 layers in total, consisting of 9 convolutional layers and 6 pooling layers. The network structure is simple and the number of parameters is relatively small, so it is easier to deploy on an embedded platform for real-time object detection.
In this specific embodiment of the invention, the picture size is unified to 224*224 pixels. The input of the first convolutional layer is the pixel matrix of the picture to be detected in RGB format; a series of feature maps are output after the BN processing, convolution operation, and activation of the convolutional layer, and new feature maps are output through the first pooling layer. Each subsequent layer reads the output of the previous layer as its input feature maps and performs its operations. The last layer produces the object detection results, and its weight size is related to the object detection classes; this example only detects the single class "vehicle".
Before performing object detection with a CNN, the model needs to be trained with GPU acceleration on a large server, and the object detection task is then completed on the application platform with the trained weights. The present invention focuses on training a low-precision model and on accelerating the forward inference process of the CNN.
The flow chart of the present invention performing model training and forward inference optimization on the tiny-YOLO network is shown in Fig. 2. The specific implementation steps are:
1. 16-bit low-precision model training
Model training uses the darknet deep learning framework, and the input picture size is standardized to 224*224. The 16-bit low-precision model training scheme is used: the integer part of a floating-point number is represented with 2 bits, the fractional part with 14 bits, and 14 bits of precision of the floating-point data are retained by rounding.
Model training steps:
(1) First train the classification network with the ImageNet data set; the number of iterations is 1,600,000.
(2) Train the detection network with the public BDD100K driving data set; the number of iterations is 400,000. Since the tiny-YOLO model structure is simple and its generalization ability is poor, the object detection results are not ideal when the training set and test set come from different distributions; therefore the COCO data set commonly used for training detection networks is replaced by the BDD100K data set.
(3) Fine-tune the model with the customized DDS data set; the number of iterations is 400,000. The DDS data set is obtained, according to the actual application scenario, from road-condition videos in front of driving vehicles collected in domestic cities, by sampling the key frames of the videos and then annotating the classes.
2. Weight quantization: the weights W obtained by training are quantized from 16-bit floating point to 8-bit integer data with the 8-bit integer quantization scheme.
Steps 3-10 describe the process of completing one object detection task with the integer quantization inference scheme.
3. Input a validation set picture and obtain the pixel matrix of the image; the values are integers in the interval [0, 255] and are directly saved as 8-bit integer data, serving as the input sample X of the first convolutional layer.
Steps 4-8 describe the integer inference process of a convolutional layer. The process of applying the 8-bit integer quantization scheme directly to inference is shown in Fig. 3a, and the 8-bit integer quantization inference process with BN pre-processing is shown in Fig. 3b.
4. Calculate the mean and variance of the convolutional layer input sample X:

μ = (1/m) Σ_{i=1}^{m} x_i,  δ² = (1/m) Σ_{i=1}^{m} (x_i − μ)²

where m is the size of the mini-batch.
5. Calculate W_BN and β_BN and save them as 8-bit integer data, where W_BN = γW/√(δ² + ε) and β_BN = β − γμ/√(δ² + ε), completing the BN pre-processing.
6. Calculate Y = W_BN · X with the integer convolution method, save the convolution result as 32-bit integers, and add the offset parameter β_BN calculated in step 5.
7. Activate with the Leaky activation function, collect the activation values of the convolutional layer while detecting each picture, and fit the data distribution function of the activation values.
8. Find the maximum and minimum values of the activation values obtained in step 7, quantize the activation values to 8-bit integers, and obtain the output feature map of the convolutional layer.
9. Use the output feature map of step 8 as the input feature map of the pooling layer and perform the max pooling operation to generate new feature maps.
10. Repeat steps 4-9 according to the network structure of tiny-YOLO, obtain the detection results of the validation set pictures, and calculate the mean Average Precision (mAP) as the reference for the subsequent activation function optimization.
11. Choose a suitable activation region [a, b] for the first convolutional layer from the data distribution function obtained in step 7, and map the convolution results to the known region [a, b] with the activation function Leaky_n. Repeat steps 3-10 to obtain the object detection metric mAP after modifying the activation function of the first convolutional layer, controlling the mAP loss to less than 0.1%; step 8, i.e. the process of finding the extreme values of the activation values, can now be omitted.
Repeat step 11 to choose the activation regions of the 2nd to 9th convolutional layers in turn, modifying the activation function Leaky_n layer by layer, with the final mAP loss controlled to less than 1%.
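A minimal Python sketch of the per-layer range selection (the percentile-based choice of [a, b] is an illustrative assumption; the patent specifies fitting the activation distribution and accepting a region only if the mAP loss stays within the stated bound):

```python
import numpy as np

def choose_activation_range(samples: np.ndarray, coverage: float = 0.999):
    """Pick [a, b] covering most of the activations observed on the
    validation set; 'coverage' trades clipping error for resolution."""
    lo = (1.0 - coverage) / 2.0
    return float(np.quantile(samples, lo)), float(np.quantile(samples, 1.0 - lo))

# activation values collected for one convolutional layer during step 7
acts = np.random.default_rng(2).normal(0.0, 1.5, 100_000).astype(np.float32)
a, b = choose_activation_range(acts)
print(a, b)   # candidate region; keep it only if the mAP loss is < 0.1%
```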
12. Input a test set picture, complete the object detection task with the 8-bit integer quantization inference scheme with the optimized activation functions, and obtain the detection results.
Combining the above, the overall flow of the present invention is as shown in Fig. 4.
In conclusion the present invention is to obtain weight with the low precision fixed point algorithm training pattern of 16 floating types, then measure first
It turns to 8 integer datas and carries out forward inference, the weight that compared to 32 floating type algorithm training patterns obtain directly carries out 8
Position integer quantifies reasoning, effectively reduces low level fixed point quantization reasoning bring loss of significance.
In addition, to improve computational efficiency, proposing 8 integer quantities since the Calculation bottleneck of convolutional neural networks is convolutional layer
Change scheme, and using the reasoning process of 8 integer quantization scheme optimization convolutional layers, the pretreatment of BN algorithm is first carried out, BN is reduced
The calculating of algorithm is time-consuming, then carries out convolution algorithm.
The active region for testing each determining convolutional layer in the enterprising line number thousand times quantizations reasoning of verifying collection later, saves
Quantify activation value in reasoning process after each activation primitive operation to require first to seek the time overhead that activation value is most worth in real time.It should
Method is fitted floating-point arithmetic forward inference process, under the premise of controlling loss of significance, improves convolutional layer computational efficiency.
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from the spirit or essential attributes of the invention.
Therefore, in all respects, the embodiments are to be considered as illustrative and not restrictive. The scope of the invention is indicated by the appended claims rather than by the foregoing description, and it is intended that all changes falling within the meaning and scope of equivalent elements of the claims are included in the invention. Any reference sign in a claim should not be regarded as limiting the claim involved.
In addition, it should be understood that, although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution; this manner of description is merely for the sake of clarity, and those skilled in the art should consider the specification as a whole. The technical solutions in the various embodiments may also be suitably combined to form other embodiments understandable to those skilled in the art.
Claims (7)
1. A CNN-based low-precision training and 8-bit integer quantization inference method, characterized by comprising the following steps:
low-precision model training: performing model training with a 16-bit floating-point low-precision fixed-point algorithm to obtain a model for object detection;
weight quantization: proposing an 8-bit integer quantization scheme, and quantizing the weight parameters of the convolutional neural network layer by layer from 16-bit floating point to 8-bit integers;
8-bit integer quantization inference: quantizing the activation values to 8-bit integer data, i.e. each layer of the CNN receives int8 quantized input and produces int8 quantized output.
2. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 1, characterized in that the low-precision model training comprises training the model with GPU acceleration on a large server, with data saved in 16-bit floating point during the computation.
3. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 2, characterized in that, for the 16-bit floating-point data in the computation, 2 bits store the integer part and 14 bits store the fractional part, and 14 bits of precision of the floating-point data are retained by rounding.
4. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 1, characterized in that the weight quantization comprises the proposed quantization scheme

q = Round((x − a)/(b − a) × 255 − 127)

where x denotes the floating-point data and a, b respectively denote the minimum and maximum values of the data in the array to be quantized, i.e. a := min(xi), b := max(xi), and the quantized value q is then obtained with the rounding function Round().
5. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 4, characterized in that the weights are divided into a series of arrays by layer, the extreme values of each weight array are found, and the data within the same array are proportionally scaled to 8-bit integers.
6. The CNN-based low-precision training and 8-bit integer quantization inference method according to claim 1, characterized in that the 8-bit integer quantization inference process comprises the following steps:
(a) BN algorithm pre-processing: calculating the mean and variance of the input samples before the convolution operation and performing a normalization pre-processing;
(b) integer convolution operation: converting the floating-point multiplications into integer multiplications with the 8-bit integer quantization scheme;
(c) activation function optimization: choosing the activation region [a, b] of each convolutional layer layer by layer, the optimized activation function mapping the convolution results to the known region [a, b].
7. The CNN-based low-precision training and 8-bit integer quantization inference method according to any one of claims 1 to 6, characterized in that full-integer data are used for the calculations in the inference process.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910154088.5A CN109902745A (en) | 2019-03-01 | 2019-03-01 | CNN-based low-precision training and 8-bit integer quantization inference method
Publications (1)
Publication Number | Publication Date |
---|---|
CN109902745A true CN109902745A (en) | 2019-06-18 |
Family
ID=66946069
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190618 |