CN109145770B - Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model - Google Patents
Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model
- Publication number: CN109145770B (application CN201810863041.1A)
- Authority: CN (China)
- Prior art keywords: layer, wheat, area, similarity, deconvolution
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V20/10 - Scenes; scene-specific elements; terrestrial scenes
- G06F18/214 - Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/241 - Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045 - Neural networks; combinations of networks
- G06V10/56 - Extraction of image or video features relating to colour
Description
Technical Field
The invention relates to the technical field of image recognition, and in particular to a method for automatically counting wheat spiders based on the combination of a multi-scale feature fusion network and a positioning model.
Background
Wheat is one of China's main food crops. During production it is vulnerable to a variety of pests, among them the wheat spider, which sucks the sap of wheat leaves and can even cause them to wither, seriously reducing wheat yield. Monitoring pest population size is an important means of pest control and provides the theoretical basis for control decisions. The identification and counting of wheat spiders in the field is therefore crucial for improving wheat yield.
With the rapid development of computer vision and image processing technology, image-based automatic pest identification and counting has become a research hotspot in recent years. Although such methods save time and labor and offer a degree of intelligence, they are not directly applicable to identifying and counting wheat spiders in the field, for three reasons: first, an individual wheat spider is only a few millimeters in size, and such small targets are difficult to detect with traditional image recognition techniques such as SVM; second, unstable and uneven illumination during image acquisition degrades image quality; third, in practical applications the captured images are often cluttered with other debris against a complex background.
Detecting small targets such as wheat spiders in complex environments has therefore become a technical problem in urgent need of a solution.
Summary of the Invention
The purpose of the present invention is to overcome the high error rate of image detection for small targets in the prior art by providing an automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model.
To achieve the above object, the technical scheme of the present invention is as follows:
An automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model, comprising the following steps:
Establishing training samples: more than 2000 images of wheat spiders in the natural field environment are acquired as training images, the wheat spiders in the images having already been labeled, to obtain the training samples;
Constructing a wheat spider detection and counting model;
Constructing a positioning model;
Constructing a multi-scale feature fusion network and adapting its structure;
Training the multi-scale feature fusion network on the features of the candidate regions that the positioning model locates in the training samples, taking the output of each layer as a prediction result;
Acquiring the images to be counted: wheat spider images photographed in the field are acquired and preprocessed to obtain the images to be counted;
Obtaining the number of wheat spiders: the image to be counted is fed into the wheat spider detection and counting model to obtain the number of wheat spiders in the image.
Constructing the positioning model comprises the following steps:
Setting a color space conversion module, which converts the RGB color space to the YCbCr color space and segments the image into regions R = {r_1, r_2, ..., r_n};
Computing the color information similarity: a 25-bin histogram is obtained for each color channel of the image and normalized with the L1 norm, and the color similarity of two regions is computed as
f_color(r_i, r_j) = Σ_{k=1..n} min(c_i^k, c_j^k),
where f_color(r_i, r_j) is the color similarity of segmented regions r_i and r_j; c_i^k and c_j^k are the k-th elements of the L1-normalized color histograms of regions r_i and r_j (25 bins per channel over the three channels); n is the histogram length; and r_i is the i-th region of the segmentation R = {r_1, r_2, ..., r_n};
Computing the edge information similarity: Gaussian derivatives with variance σ = 1 are computed in 8 directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is computed as
f_edge(r_i, r_j) = Σ_{k=1..m} min(e_i^k, e_j^k),
where f_edge(r_i, r_j) is the edge information similarity of segmented regions r_i and r_j; e_i^k and e_j^k are the k-th elements of the L1-normalized edge histograms of regions r_i and r_j; m is the histogram length; and r_i is the i-th region of the segmentation R = {r_1, r_2, ..., r_n};
Computing the region size similarity as
f_area(r_i, r_j) = 1 - (area(r_i) + area(r_j)) / area(img),
where f_area(r_i, r_j) is the region size similarity of segmented regions r_i and r_j; area(.) is the area of a region; area(img) is the area of the whole image; and r_i and r_j are the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n};
Fusing the color information similarity, edge information similarity and region size similarity as
f(r_i, r_j) = w_1 * f_color(r_i, r_j) + w_2 * f_edge(r_i, r_j) + w_3 * f_area(r_i, r_j),
where f(r_i, r_j) is the fused similarity of segmented regions r_i and r_j, and w_1, w_2 and w_3 are the weights of the color information similarity, edge information similarity and region size similarity respectively.
Constructing the multi-scale feature fusion network comprises the following steps:
Setting up an n-layer multi-scale neural network, and performing deconvolution operations starting from the topmost layer to generate the deconvolution layers;
Setting the input of layer 1 to the training sample and its output to the layer-1 feature map; the layer-1 feature map is the input of layer 2, which outputs the layer-2 feature map; the layer-2 feature map is the input of layer 3; and so on, until the layer-(n-1) feature map is the input of layer n;
Connecting the layer-1 through layer-n feature maps with the corresponding layer-1 through layer-n deconvolution layers via 1x1 convolution kernels to generate the multi-scale feature fusion network.
Training the multi-scale feature fusion network comprises the following steps:
Inputting the training samples into the positioning model, which locates the candidate regions of the training samples;
Inputting the candidate regions of the training samples into layer 1 of the multi-scale neural network, which outputs the layer-1 feature map;
Inputting the layer-1 feature map into layer 2 of the multi-scale neural network, which outputs the layer-2 feature map, and so on, until the layer-(n-1) feature map is input into layer n;
Performing a deconvolution operation on the layer-n feature map to generate the layer-n deconvolution layer, then on the layer-(n-1) feature map to generate the layer-(n-1) deconvolution layer, and so on, down to the layer-1 deconvolution layer;
Connecting the layer-1 through layer-n feature maps with the layer-1 through layer-n deconvolution layers via 1x1 convolution kernels;
Extracting the layer-1 features after the layer-1 feature map and the layer-1 deconvolution layer are connected by a 1x1 convolution kernel, generating the layer-1 prediction result; extracting the layer-2 features after the layer-2 feature map and the layer-2 deconvolution layer are connected by a 1x1 convolution kernel, generating the layer-2 prediction result; and so on, until the layer-n features are extracted and the layer-n prediction result is generated;
Performing regression over the layer-1 through layer-n prediction results to generate the final prediction result, with the regression function
C(λ) = (1/n) Σ_{j=1..n} (y^(j) - p_λ(x^(j)))²,
where C(λ) is the final prediction result, λ denotes the training parameters, n is the number of network layers, y^(j) is the true category, p_λ(x^(j)) is the prediction result of layer j, and x^(j) is the feature vector of layer j;
Obtaining from C(λ) the final score, the predicted category and the coordinates of the category in the image.
Obtaining the number of wheat spiders comprises the following steps:
Inputting the image to be counted into the positioning model, which locates the candidate regions of the image to be counted;
Inputting the candidate regions of the image to be counted into the multi-scale neural network to obtain the predicted classification of the wheat spiders in the image, and counting the wheat spiders to obtain the number of wheat spiders in the image.
Beneficial Effects
Compared with the prior art, the automatic wheat spider counting method of the present invention, based on the combination of a multi-scale feature fusion network and a positioning model, achieves direct identification and counting of wheat spiders in the natural field environment.
The present invention eliminates the influence of illumination on detection and counting through preprocessing, simplifying the complex environment; the positioning model then locates candidate regions of suspected wheat spiders; features are extracted from the candidate regions with the multi-scale feature fusion network; and regression over the multiple prediction results finally determines the wheat spider regions. Locating the candidate regions greatly reduces feature extraction time and feature dimensionality, improving the real-time performance of counting; at the same time, the regression fusion of multiple prediction results ensures that wheat spiders at every scale are detected accurately, improving the robustness and accuracy of automatic detection and counting.
Brief Description of the Drawings
Figure 1 is a sequence diagram of the method of the present invention;
Figure 2a shows the detection results on the training samples using the traditional SVM technique of the prior art;
Figure 2b shows the detection results using the method of the present invention;
Figure 3 is a schematic diagram of the structure of the multi-scale feature fusion network of the present invention.
Detailed Description of the Embodiments
For a further understanding of the structural features and the effects achieved by the present invention, preferred embodiments are described in detail below in conjunction with the accompanying drawings:
As shown in Figure 1, the automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model according to the present invention comprises the following steps:
Step 1: establishing the training samples. More than 2000 images of wheat spiders in the natural field environment are acquired as training images, the wheat spiders in the images having already been labeled, to obtain the training samples.
Step 2: constructing the wheat spider detection and counting model. A positioning model and a multi-scale feature fusion network are constructed; the positioning model extracts the candidate regions of the training samples (locating the candidate wheat spider regions), and the multi-scale fusion network then extracts features from the candidate regions and classifies them: if a region contains a wheat spider its coordinates are output, otherwise the candidate region is discarded.
First, the positioning model is constructed. To reduce the feature extraction time, reduce the dimensionality of the feature vectors and improve the real-time performance of automatic counting, the positioning model is used first to locate candidate wheat spider regions, and feature extraction is then performed on these candidate regions.
The steps are as follows:
(1) Setting a color space conversion module, which converts the RGB color space to the YCbCr color space and segments the image into regions R = {r_1, r_2, ..., r_n}.
(2) Computing the color information similarity. A 25-bin histogram is obtained for each color channel of the image and normalized with the L1 norm, and the color similarity of two regions is computed as
f_color(r_i, r_j) = Σ_{k=1..n} min(c_i^k, c_j^k),
where f_color(r_i, r_j) is the color similarity of segmented regions r_i and r_j; c_i^k and c_j^k are the k-th elements of the L1-normalized color histograms of regions r_i and r_j (25 bins per channel over the three channels); n is the histogram length; and r_i is the i-th region of the segmentation R = {r_1, r_2, ..., r_n}.
(3) Computing the edge information similarity. Gaussian derivatives with variance σ = 1 are computed in 8 directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is computed as
f_edge(r_i, r_j) = Σ_{k=1..m} min(e_i^k, e_j^k),
where f_edge(r_i, r_j) is the edge information similarity of segmented regions r_i and r_j; e_i^k and e_j^k are the k-th elements of the L1-normalized edge histograms of regions r_i and r_j; m is the histogram length; and r_i is the i-th region of the segmentation R = {r_1, r_2, ..., r_n}.
(4) Computing the region size similarity as
f_area(r_i, r_j) = 1 - (area(r_i) + area(r_j)) / area(img),
where f_area(r_i, r_j) is the region size similarity of segmented regions r_i and r_j; area(.) is the area of a region; area(img) is the area of the whole image; and r_i and r_j are the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n}.
(5) Fusing the color information similarity, edge information similarity and region size similarity as
f(r_i, r_j) = w_1 * f_color(r_i, r_j) + w_2 * f_edge(r_i, r_j) + w_3 * f_area(r_i, r_j),
where f(r_i, r_j) is the fused similarity of segmented regions r_i and r_j, and w_1, w_2 and w_3 are the weights of the color information similarity, edge information similarity and region size similarity respectively.
Merging by color information similarity, edge information similarity and region size similarity, r_i and r_j are combined iteratively, and the n regions finally generated are the candidate wheat spider regions.
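By way of illustration, steps (2) through (5) can be sketched as follows. This is a minimal sketch assuming each region is represented by precomputed L1-normalized color and edge histograms; the dictionary layout, the function names hist_intersection and region_similarity, and the unit default weights are assumptions for the example, not part of the claimed method.

```python
import numpy as np

def hist_intersection(h1, h2):
    """Similarity of two L1-normalized histograms: the sum of bin-wise minima."""
    return np.minimum(h1, h2).sum()

def region_similarity(r_i, r_j, img_area, w=(1.0, 1.0, 1.0)):
    """Fused similarity f(r_i, r_j) of two segmented regions.

    Each region is a dict with:
      'color_hist' : L1-normalized color histogram
                     (25 bins x 3 channels, flattened)
      'edge_hist'  : L1-normalized Gaussian-derivative histogram
                     (8 directions x 10 bins x 3 channels, flattened)
      'area'       : pixel area of the region
    """
    f_color = hist_intersection(r_i['color_hist'], r_j['color_hist'])
    f_edge = hist_intersection(r_i['edge_hist'], r_j['edge_hist'])
    # Small regions merge first: similarity falls as the combined area grows.
    f_area = 1.0 - (r_i['area'] + r_j['area']) / img_area
    return w[0] * f_color + w[1] * f_edge + w[2] * f_area
```

Candidate regions would then be produced by repeatedly merging the most similar pair of adjacent regions and recomputing similarities, in the manner of selective search, until the surviving regions form the candidate set.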
Second, the multi-scale feature fusion network is constructed and its structure adapted. To better extract wheat spider features at every scale and in their various forms, a multi-scale feature fusion network is designed to pick out the true wheat spider regions from the candidate regions accurately.
As shown in Figure 3, the multi-scale feature fusion network exploits the inherent multi-scale, pyramidal hierarchy of the feature maps to construct a multi-scale feature network at marginal extra cost. A top-down architecture with lateral connections is developed to build high-level semantic feature maps at all scales: a top-down pathway and lateral connections combine low-resolution but semantically strong features with high-resolution but semantically weak features. High-resolution, semantically strong features are thus obtained, which benefits the detection of small targets such as wheat spiders. The steps are as follows:
(1) Setting up an n-layer multi-scale neural network, and performing deconvolution operations starting from the topmost layer to generate the deconvolution layers.
(2) Setting the input of layer 1 to the training sample and its output to the layer-1 feature map; the layer-1 feature map is the input of layer 2, which outputs the layer-2 feature map; and so on, until the layer-(n-1) feature map is the input of layer n.
(3) Connecting the layer-1 through layer-n feature maps with the corresponding layer-1 through layer-n deconvolution layers via 1x1 convolution kernels to generate the multi-scale feature fusion network.
Here the feature maps are generated by downsampling: each training image is taken as input, the multi-scale neural network extracts features, and each layer of the network produces one feature map through downsampling.
The last layer is then deconvolved to generate a feature map of the size of the layer above, iterating until a map of layer-2 size is generated. Because every layer of the multi-scale network downsamples, the feature maps become smaller and smaller, and a wheat spider in a feature map may shrink to only a few pixels, which severely affects detection and counting. To avoid this problem, a deconvolution operation is applied to each level of the pyramid, upsampling the feature map back to the size of the level above; this both extracts the pest features effectively and preserves the apparent size of the wheat spider in the image.
The feature maps generated by deconvolution at each layer are connected through 1x1 convolution kernels, generating the multi-scale feature fusion network.
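The following is a minimal sketch of such a top-down, laterally connected network, assuming a small four-stage backbone. The channel widths, class count and the class name MultiScaleFusionNet are illustrative assumptions rather than the patented architecture, and the lateral fusion here is simplified to element-wise addition followed by a 1x1 prediction convolution.

```python
import torch
import torch.nn as nn

class MultiScaleFusionNet(nn.Module):
    """Sketch of a top-down, laterally connected multi-scale network."""

    def __init__(self, channels=(64, 128, 256, 512), num_classes=2):
        super().__init__()
        # Bottom-up path: each stage downsamples by 2 and yields a feature map.
        self.stages = nn.ModuleList()
        in_ch = 3
        for out_ch in channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            in_ch = out_ch
        # Top-down path: deconvolution upsamples each map back to the
        # resolution (and width) of the layer above it.
        self.deconvs = nn.ModuleList([
            nn.ConvTranspose2d(channels[i + 1], channels[i],
                               kernel_size=2, stride=2)
            for i in range(len(channels) - 1)])
        # 1x1 convolutions turn each fused map into a per-layer prediction.
        self.predict = nn.ModuleList([
            nn.Conv2d(ch, num_classes, kernel_size=1)
            for ch in channels[:-1]])

    def forward(self, x):                       # x: (B, 3, H, W), H and W divisible by 16
        feats = []
        for stage in self.stages:               # bottom-up, downsampling
            x = stage(x)
            feats.append(x)
        preds = []
        top = feats[-1]
        for i in range(len(feats) - 2, -1, -1):
            up = self.deconvs[i](top)           # upsample to the size above
            top = feats[i] + up                 # lateral connection (fusion)
            preds.append(self.predict[i](top))  # per-layer prediction result
        return preds                            # one prediction map per scale
```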
Finally, the multi-scale feature fusion network is trained. The candidate regions that the positioning model locates in the training samples are used as the training features, and the output of each layer is taken as a prediction result. The specific steps are as follows:
(1) The training samples are input into the positioning model, which locates the candidate regions of the training samples.
(2) The candidate regions of the training samples are input into layer 1 of the multi-scale neural network, which outputs the layer-1 feature map.
(3) The layer-1 feature map is input into layer 2 of the multi-scale neural network, which outputs the layer-2 feature map, and so on, until the layer-(n-1) feature map is input into layer n.
(4) A deconvolution operation is performed on the layer-n feature map to generate the layer-n deconvolution layer, then on the layer-(n-1) feature map to generate the layer-(n-1) deconvolution layer, and so on, down to the layer-1 deconvolution layer.
(5) The layer-1 through layer-n feature maps are connected with the layer-1 through layer-n deconvolution layers via 1x1 convolution kernels.
(6) After the layer-1 feature map and the layer-1 deconvolution layer are connected by a 1x1 convolution kernel, the layer-1 features are extracted and the layer-1 prediction result is generated; after the layer-2 feature map and the layer-2 deconvolution layer are connected by a 1x1 convolution kernel, the layer-2 features are extracted and the layer-2 prediction result is generated; and so on, until the layer-n features are extracted and the layer-n prediction result is generated.
(7) Regression is performed over the layer-1 through layer-n prediction results to generate the final prediction result, with the regression function
C(λ) = (1/n) Σ_{j=1..n} (y^(j) - p_λ(x^(j)))²,
where C(λ) is the final prediction result, λ denotes the training parameters, n is the number of network layers, y^(j) is the true category, p_λ(x^(j)) is the prediction result of layer j, and x^(j) is the feature vector of layer j.
(8) The final score, the predicted category and the coordinates of the category in the image are obtained through C(λ); the coordinates are the position of the wheat spider in the image.
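The patent text specifies the regression only through the symbols above (the original formula image is not reproduced in this source), so the sketch below assumes a simple mean-squared-error aggregation of the per-layer predictions; the function name fuse_layer_predictions and the assumption that each layer's prediction has been pooled to a fixed-size class-score vector are illustrative.

```python
import torch

def fuse_layer_predictions(layer_preds, targets=None):
    """Fuse n per-layer predictions p_lambda(x^(j)) into a final result.

    layer_preds : list of n tensors, each already pooled to a fixed-size
                  class-score vector for one candidate region.
    targets     : optional list of true labels y^(j), used during training.
    """
    n = len(layer_preds)
    final = torch.stack(layer_preds).mean(dim=0)  # fused prediction
    if targets is None:
        return final
    # Assumed regression cost: mean squared error over the n layer outputs.
    cost = sum((y - p).pow(2).mean() for y, p in zip(targets, layer_preds)) / n
    return final, cost
```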
Step 3: acquiring the images to be counted. Wheat spider images photographed in the field are acquired and preprocessed to obtain the images to be counted.
Step 4: obtaining the number of wheat spiders. The image to be counted is fed into the wheat spider detection and counting model to obtain the number of wheat spiders in the image. The specific steps are as follows:
(1) The image to be counted is input into the positioning model, which locates the candidate regions of the image to be counted;
(2) The candidate regions of the image to be counted are input into the multi-scale neural network to obtain the predicted classification of the wheat spiders in the image, and the wheat spiders are counted to obtain the number of wheat spiders in the image.
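Putting the two models together, counting reduces to the loop sketched below; positioning_model.propose_regions, fusion_net.predict, image.crop and the score threshold are hypothetical interfaces used only to illustrate the data flow, not APIs defined by the patent.

```python
def count_wheat_spiders(image, positioning_model, fusion_net, threshold=0.5):
    """Illustrative end-to-end counting pipeline (assumed interfaces)."""
    candidates = positioning_model.propose_regions(image)  # candidate boxes
    detections = []
    for box in candidates:
        crop = image.crop(box)             # extract the candidate region
        score = fusion_net.predict(crop)   # fused multi-scale class score
        if score > threshold:              # classified as a wheat spider
            detections.append(box)
    return len(detections), detections
```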
Figure 2a shows the wheat spider detection results obtained with the SVM algorithm. As can be seen from Figure 2a, the wheat spider regions detected by the small boxes are far too large, and in particular the large box in the middle of Figure 2a wrongly groups several closely spaced wheat spiders into a single large box. This mislabeling occurs because the traditional SVM algorithm performs no prior localization; if the positioning model is used to locate the candidate regions first, this phenomenon is avoided. The oversized small-box detections in Figure 2a arise because the traditional SVM algorithm does not use regression fusion of multiple prediction results; in addition, some of the small boxes in Figure 2a are mislabeled.
As shown in Figure 2b, compared with the traditional SVM algorithm, the present invention accurately locates the number and specific positions of the wheat spiders, with higher robustness and accuracy.
The foregoing has shown and described the basic principles, main features and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited by the above embodiments, which, together with the description, illustrate only the principles of the invention; various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810863041.1A CN109145770B (en) | 2018-08-01 | 2018-08-01 | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810863041.1A CN109145770B (en) | 2018-08-01 | 2018-08-01 | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109145770A CN109145770A (en) | 2019-01-04 |
CN109145770B true CN109145770B (en) | 2022-07-15 |
Family
ID=64798885
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810863041.1A Active CN109145770B (en) | 2018-08-01 | 2018-08-01 | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109145770B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110428413B (en) * | 2019-08-02 | 2021-09-28 | 中国科学院合肥物质科学研究院 | Spodoptera frugiperda imago image detection method used under lamp-induced device |
CN110689081B (en) * | 2019-09-30 | 2020-08-21 | 中国科学院大学 | Weak supervision target classification and positioning method based on bifurcation learning |
CN112651462A (en) * | 2021-01-04 | 2021-04-13 | 楚科云(武汉)科技发展有限公司 | Spider classification method and device and classification model construction method and device |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850836A (en) * | 2015-05-15 | 2015-08-19 | 浙江大学 | Automatic insect image identification method based on depth convolutional neural network |
CN106845401A (en) * | 2017-01-20 | 2017-06-13 | 中国科学院合肥物质科学研究院 | A kind of insect image-recognizing method based on many spatial convoluted neutral nets |
CN107016680A (en) * | 2017-02-24 | 2017-08-04 | 中国科学院合肥物质科学研究院 | A kind of insect image background minimizing technology detected based on conspicuousness |
CN107133943A (en) * | 2017-04-26 | 2017-09-05 | 贵州电网有限责任公司输电运行检修分公司 | A kind of visible detection method of stockbridge damper defects detection |
CN107292314A (en) * | 2016-03-30 | 2017-10-24 | 浙江工商大学 | A kind of lepidopterous insects species automatic identification method based on CNN |
CN107346424A (en) * | 2017-06-30 | 2017-11-14 | 成都东谷利农农业科技有限公司 | Lamp lures insect identification method of counting and system |
CN107808116A (en) * | 2017-09-28 | 2018-03-16 | 中国科学院合肥物质科学研究院 | A kind of wheat spider detection method based on the fusion study of depth multilayer feature |
KR20180053003A (en) * | 2016-11-11 | 2018-05-21 | 전북대학교산학협력단 | Method and apparatus for detection and diagnosis of plant diseases and insects using deep learning |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016197303A1 (en) * | 2015-06-08 | 2016-12-15 | Microsoft Technology Licensing, Llc. | Image semantic segmentation |
US10354159B2 (en) * | 2016-09-06 | 2019-07-16 | Carnegie Mellon University | Methods and software for detecting objects in an image using a contextual multiscale fast region-based convolutional neural network |
US10262237B2 (en) * | 2016-12-08 | 2019-04-16 | Intel Corporation | Technologies for improved object detection accuracy with multi-scale representation and training |
CN107016405B (en) * | 2017-02-24 | 2019-08-30 | 中国科学院合肥物质科学研究院 | A Pest Image Classification Method Based on Hierarchical Prediction Convolutional Neural Network |
CN107368787B (en) * | 2017-06-16 | 2020-11-10 | 长安大学 | Traffic sign identification method for deep intelligent driving application |
CN108062531B (en) * | 2017-12-25 | 2021-10-19 | 南京信息工程大学 | A Video Object Detection Method Based on Cascaded Regression Convolutional Neural Networks |
CN108256481A (en) * | 2018-01-18 | 2018-07-06 | 中科视拓(北京)科技有限公司 | A kind of pedestrian head detection method using body context |
- 2018-08-01: CN application CN201810863041.1A filed; granted as patent CN109145770B (Active)
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850836A (en) * | 2015-05-15 | 2015-08-19 | 浙江大学 | Automatic insect image identification method based on depth convolutional neural network |
CN107292314A (en) * | 2016-03-30 | 2017-10-24 | 浙江工商大学 | A kind of lepidopterous insects species automatic identification method based on CNN |
KR20180053003A (en) * | 2016-11-11 | 2018-05-21 | 전북대학교산학협력단 | Method and apparatus for detection and diagnosis of plant diseases and insects using deep learning |
CN106845401A (en) * | 2017-01-20 | 2017-06-13 | 中国科学院合肥物质科学研究院 | A kind of insect image-recognizing method based on many spatial convoluted neutral nets |
CN107016680A (en) * | 2017-02-24 | 2017-08-04 | 中国科学院合肥物质科学研究院 | A kind of insect image background minimizing technology detected based on conspicuousness |
CN107133943A (en) * | 2017-04-26 | 2017-09-05 | 贵州电网有限责任公司输电运行检修分公司 | A kind of visible detection method of stockbridge damper defects detection |
CN107346424A (en) * | 2017-06-30 | 2017-11-14 | 成都东谷利农农业科技有限公司 | Lamp lures insect identification method of counting and system |
CN107808116A (en) * | 2017-09-28 | 2018-03-16 | 中国科学院合肥物质科学研究院 | A kind of wheat spider detection method based on the fusion study of depth multilayer feature |
Non-Patent Citations (5)
- Zhong W. et al., "Robust object tracking via multi-scale patch based sparse coding histogram", Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2012, pp. 1838-1845 *
- J.R.R. Uijlings et al., "Selective Search for Object Recognition", International Journal of Computer Vision, Sep. 2013, Section 3 *
- Xie Chengjun et al., "Image recognition of farmland pests based on a sparse coding pyramid model", Transactions of the Chinese Society of Agricultural Engineering, Sep. 2016, Vol. 32, No. 7, pp. 144-151 *
- Hu Yongqiang et al., "Pest image recognition with multi-feature fusion based on sparse representation", Pattern Recognition and Artificial Intelligence, Nov. 2014, Vol. 27, No. 11, pp. 985-992 *
- leo_whz, "Detection models in deep learning: FPN", https://blog.csdn.net/whz1861/article/details/79042283, Jan. 12, 2018, pp. 1-3 *
Also Published As
Publication number | Publication date |
---|---|
CN109145770A (en) | 2019-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020047738A1 (en) | Automatic pest counting method based on combination of multi-scale feature fusion network and positioning model | |
Jiao et al. | AF-RCNN: An anchor-free convolutional neural network for multi-categories agricultural pest detection | |
CN109584248B (en) | Infrared target instance segmentation method based on feature fusion and dense connection network | |
CN109344701B (en) | Kinect-based dynamic gesture recognition method | |
CN110020651B (en) | License plate detection and positioning method based on deep learning network | |
Li et al. | A coarse-to-fine network for aphid recognition and detection in the field | |
CN108154102B (en) | Road traffic sign identification method | |
CN111783523B (en) | A method for detecting rotating objects in remote sensing images | |
CN113205026B (en) | Improved vehicle type recognition method based on fast RCNN deep learning network | |
CN111144490A (en) | Fine granularity identification method based on alternative knowledge distillation strategy | |
CN109684906B (en) | Method for detecting red fat bark beetles based on deep learning | |
CN108830296A (en) | A kind of improved high score Remote Image Classification based on deep learning | |
CN112348036A (en) | Adaptive Object Detection Method Based on Lightweight Residual Learning and Deconvolution Cascade | |
CN108960404B (en) | Image-based crowd counting method and device | |
CN107203606A (en) | Text detection and recognition methods under natural scene based on convolutional neural networks | |
CN107977660A (en) | Region of interest area detecting method based on background priori and foreground node | |
Hu et al. | LE–MSFE–DDNet: a defect detection network based on low-light enhancement and multi-scale feature extraction | |
CN111178177A (en) | Cucumber disease identification method based on convolutional neural network | |
CN112861970B (en) | Fine-grained image classification method based on feature fusion | |
CN111222545B (en) | Image classification method based on linear programming incremental learning | |
CN109145770B (en) | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model | |
CN108230330B (en) | Method for quickly segmenting highway pavement and positioning camera | |
CN112464983A (en) | Small sample learning method for apple tree leaf disease image classification | |
WO2024217541A1 (en) | Remote-sensing image change detection method based on siamese network | |
CN113343989B (en) | Target detection method and system based on self-adaption of foreground selection domain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |