CN109145770B - Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model - Google Patents
Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model
- Publication number: CN109145770B
- Application number: CN201810863041A
- Authority
- CN
- China
- Prior art keywords
- layer
- wheat
- similarity
- deconvolution
- spiders
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
Abstract
The invention relates to a method for automatically counting wheat spiders based on the combination of a multi-scale feature fusion network and a positioning model, which overcomes the high error rate of prior-art image detection for small targets. The method comprises the following steps: establishing training samples; constructing a wheat spider detection counting model; acquiring an image to be counted; and obtaining the number of wheat spiders. The invention realizes direct identification and counting of wheat spiders in the natural field environment.
Description
Technical Field
The invention relates to the technical field of image recognition, and in particular to an automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model.
Background
Wheat is one of the main grain crops in China and is vulnerable to various pests during production. The wheat spider is one of the main pests: it sucks the juice of wheat leaves and can even cause the leaves to dry up, seriously reducing wheat yield. Monitoring pest population size is an important means of pest control and provides the theoretical basis for control decisions. Identifying and counting wheat spiders in the field is therefore important for improving wheat yield.
With the rapid development of computer vision and image processing technology, automatic image-based pest identification and counting has become a research focus in recent years. Although such methods are time-saving, labor-saving, and intelligent, they cannot yet be applied to identifying and counting wheat spiders in the field, for three reasons. First, an individual wheat spider is only a few millimeters long, and such small targets are difficult to detect with traditional image recognition techniques such as the support vector machine (SVM). Second, when images are collected, unstable and uneven illumination in the outdoor environment degrades image quality. Third, in practical applications the acquired images are often mixed with other debris, and the background is complex.
Therefore, how to detect small targets such as wheat spiders in a complex environment has become an urgent technical problem to be solved.
Disclosure of Invention
The invention aims to overcome the high error rate of prior-art image detection for small targets, and provides an automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model to solve this problem.
In order to achieve the above purpose, the technical solution of the invention is as follows:
a method for automatically counting wheat spiders based on combination of a multi-scale feature fusion network and a positioning model comprises the following steps:
Establishing training samples, namely acquiring more than 2000 images of wheat spiders in the natural field environment as training images, and marking the wheat spiders in the images to obtain the training samples;
constructing a wheat spider detection counting model;
constructing a positioning model;
constructing a multi-scale feature fusion network, and transforming a multi-scale feature fusion network structure;
training the multi-scale feature fusion network, namely training on the candidate regions located by the positioning model for the training samples, and taking the output result of each layer as a prediction result;
acquiring an image to be counted, namely acquiring a wheat spider image shot in the field and preprocessing it to obtain the image to be counted;
and obtaining the number of wheat spiders, namely inputting the image to be counted into the wheat spider detection counting model to obtain the number of wheat spiders in the image.
Constructing the positioning model comprises the following steps:
setting a color space conversion module, wherein the color space conversion module converts the image from the RGB color space into the YCbCr color space and divides the image into segmented regions R = {r_1, r_2, ..., r_n};
calculating the color information similarity: a 25-bin histogram is obtained for each color channel of the image and normalized with the L1 norm, and the color space similarity is calculated by the following formula:

f_{color}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{m} \min\left( h_a^k(r_i), h_a^k(r_j) \right)

wherein f_color(r_i, r_j) denotes the color space similarity of the segmented regions r_i and r_j; h_a^k(r_i) denotes the k-th histogram bin of the a-th color channel of region r_i, with a = 1, 2, 3 and k = 1, 2, ..., m; m = 25 denotes the number of histogram bins per channel; and r_i denotes the i-th region of the segmentation R = {r_1, r_2, ..., r_n};
calculating the edge information similarity: a Gaussian derivative with variance σ = 1 is computed in 8 different directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is calculated by the following formula:

f_{edge}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{q} \min\left( t_a^k(r_i), t_a^k(r_j) \right)

wherein f_edge(r_i, r_j) denotes the edge information similarity of the segmented regions r_i and r_j; t_a^k(r_i) denotes the k-th edge histogram bin of the a-th channel of region r_i, with a = 1, 2, 3 and k = 1, 2, ..., q; q = 8 × 10 = 80 denotes the number of edge histogram bins per channel; and r_i denotes the i-th region of the segmentation R = {r_1, r_2, ..., r_n};
calculating the region size similarity by the following formula:

f_{area}(r_i, r_j) = 1 - \frac{area(r_i) + area(r_j)}{area(img)}

wherein f_area(r_i, r_j) denotes the region size similarity of the segmented regions r_i and r_j; area(·) denotes the area of a region; area(img) denotes the area of the whole picture; and r_i and r_j denote the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n};
and fusing the color information similarity, the edge information similarity and the region size similarity by the following formula:

f(r_i, r_j) = w_1 f_{color}(r_i, r_j) + w_2 f_{edge}(r_i, r_j) + w_3 f_{area}(r_i, r_j)

wherein f(r_i, r_j) denotes the fused similarity of the segmented regions r_i and r_j; w_1, w_2 and w_3 denote the weights of the color information similarity, the edge information similarity and the region size similarity, respectively; and r_i and r_j denote the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n}.
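As a worked illustration with assumed weights (the invention does not fix w_1, w_2 and w_3), let w_1 = w_2 = 0.4 and w_3 = 0.2, and suppose two regions score f_color = 0.6, f_edge = 0.5 and f_area = 0.9; the fused similarity is then

f(r_i, r_j) = 0.4 × 0.6 + 0.4 × 0.5 + 0.2 × 0.9 = 0.24 + 0.20 + 0.18 = 0.62,

and region pairs with the highest fused similarity are merged first when generating the candidate regions.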
The construction of the multi-scale feature fusion network comprises the following steps:
setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer;
setting the input of layer 1 as the training sample and outputting the layer 1 feature map, taking the layer 1 feature map as the input of layer 2 and outputting the layer 2 feature map, taking the layer 2 feature map as the input of layer 3, and so on until the layer n-1 feature map is taken as the input of layer n;
and connecting the layer 1 through layer n feature maps with the corresponding layer 1 through layer n deconvolution layers through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.
The training multi-scale feature fusion network comprises the following steps:
Inputting the training sample into a positioning model, and positioning a candidate region of the training sample by the positioning model;
respectively inputting the candidate regions of the training sample into the 1 st layer of the multi-scale neural network, and outputting a 1 st layer characteristic diagram by the 1 st layer of the multi-scale neural network;
inputting the layer 1 feature map into layer 2 of the multi-scale neural network and outputting the layer 2 feature map from layer 2, and so on until the layer n-1 feature map is input into layer n of the multi-scale neural network;
performing a deconvolution operation on the layer n feature map to generate the layer n deconvolution layer, then performing a deconvolution operation on layer n-1 to generate the layer n-1 deconvolution layer, and so on down to the layer 1 deconvolution layer;
connecting the layer 1, layer 2, ..., layer n feature maps with the layer 1, layer 2, ..., layer n deconvolution layers through 1 × 1 convolution kernels;
connecting the layer 1 feature map with the layer 1 deconvolution layer through a 1 × 1 convolution kernel, extracting the layer 1 features, and generating the layer 1 prediction result; connecting the layer 2 feature map with the layer 2 deconvolution layer through a 1 × 1 convolution kernel, extracting the layer 2 features, and generating the layer 2 prediction result; and so on, until the layer n feature map is connected with the layer n deconvolution layer through a 1 × 1 convolution kernel, the layer n features are extracted, and the layer n prediction result is generated;
and performing regression processing on the layer 1 prediction result, the layer 2 prediction result, ..., through the layer n prediction result to generate the final prediction result, the regression function being:

C(\lambda) = \frac{1}{n} \sum_{j=1}^{n} \left( y^{(j)} - p_{\lambda}(x^{(j)}) \right)^2

wherein C(λ) is the regression function yielding the final prediction result, λ denotes the training parameters, n denotes the number of network layers, y^{(j)} denotes the true class, p_λ(x^{(j)}) denotes the layer j prediction result, and x^{(j)} denotes the feature vector of layer j;
and obtaining a final score through C(λ) to predict the category and the coordinates of its position in the image.
The method for acquiring the number of the wheat spiders comprises the following steps:
inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;
inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, and counting the number of the wheat spiders to obtain the number of the wheat spiders in the image.
Advantageous effects
Compared with the prior art, the automatic wheat spider counting method based on the combination of a multi-scale feature fusion network and a positioning model realizes direct identification and counting of wheat spiders in the natural field environment.
The invention eliminates the influence of illumination on detection and counting through preprocessing, simplifying the complex environment; a positioning model then locates candidate regions of suspected wheat spiders; the multi-scale feature fusion network extracts features from the candidate regions, and the wheat spider regions are finally determined by regression over multiple prediction results. Locating the candidate regions first greatly reduces feature extraction time and feature dimensionality and improves counting speed; meanwhile, the regression fusion of multiple prediction results ensures that wheat spiders of every scale can be accurately detected, improving the robustness and accuracy of automatic detection and counting.
Drawings
FIG. 1 is a flowchart of the method of the present invention;
FIG. 2a is a diagram of the detection results obtained on training samples with the conventional prior-art SVM technique;
FIG. 2b is a graph showing the results of detection by the method of the present invention;
FIG. 3 is a schematic diagram of a multi-scale feature fusion network structure according to the present invention.
Detailed Description
So that the above-recited features of the present invention can be readily understood, a more particular description of the invention, briefly summarized above, is given below by reference to embodiments, some of which are illustrated in the appended drawings:
as shown in fig. 1, the automatic wheat spider counting method based on the combination of the multi-scale feature fusion network and the positioning model includes the following steps:
In the first step, training samples are established. More than 2000 images of wheat spiders in the natural field environment are acquired as training images, and the wheat spiders in the images are marked to obtain the training samples.
In the second step, the wheat spider detection counting model is constructed. A positioning model and a multi-scale feature fusion network are built, candidate regions of the training samples are extracted with the positioning model, and the candidate regions are classified after their features are extracted by the multi-scale fusion network; a candidate region that is not a wheat spider region is discarded.
First, the positioning model is constructed. To reduce the feature extraction time, reduce the feature vector dimensionality and improve the real-time performance of automatic counting, the positioning model is first used to locate candidate regions of wheat spiders, and features are then extracted from those candidate regions.
The method comprises the following steps:
(1) Setting a color space conversion module, wherein the color space conversion module converts the image from the RGB color space into the YCbCr color space and divides the image into segmented regions R = {r_1, r_2, ..., r_n}.
(2) Calculating the color information similarity. A 25-bin histogram is obtained for each color channel of the image and normalized with the L1 norm, and the color space similarity is calculated by the following formula:

f_{color}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{m} \min\left( h_a^k(r_i), h_a^k(r_j) \right)

wherein f_color(r_i, r_j) denotes the color space similarity of the segmented regions r_i and r_j; h_a^k(r_i) denotes the k-th histogram bin of the a-th color channel of region r_i, with a = 1, 2, 3 and k = 1, 2, ..., m; m = 25 denotes the number of histogram bins per channel; and r_i denotes the i-th region of the segmentation R = {r_1, r_2, ..., r_n}.
(3) Calculating the edge information similarity. A Gaussian derivative with variance σ = 1 is computed in 8 different directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is calculated by the following formula:

f_{edge}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{q} \min\left( t_a^k(r_i), t_a^k(r_j) \right)

wherein f_edge(r_i, r_j) denotes the edge information similarity of the segmented regions r_i and r_j; t_a^k(r_i) denotes the k-th edge histogram bin of the a-th channel of region r_i, with a = 1, 2, 3 and k = 1, 2, ..., q; q = 8 × 10 = 80 denotes the number of edge histogram bins per channel; and r_i denotes the i-th region of the segmentation R = {r_1, r_2, ..., r_n}.
(4) Calculating the region size similarity by the following formula:

f_{area}(r_i, r_j) = 1 - \frac{area(r_i) + area(r_j)}{area(img)}

wherein f_area(r_i, r_j) denotes the region size similarity of the segmented regions r_i and r_j; area(·) denotes the area of a region; area(img) denotes the area of the whole picture; and r_i and r_j denote the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n}.
(5) Fusing the color information similarity, the edge information similarity and the region size similarity by the following formula:

f(r_i, r_j) = w_1 f_{color}(r_i, r_j) + w_2 f_{edge}(r_i, r_j) + w_3 f_{area}(r_i, r_j)

wherein f(r_i, r_j) denotes the fused similarity of the segmented regions r_i and r_j; w_1, w_2 and w_3 denote the weights of the color information similarity, the edge information similarity and the region size similarity, respectively; and r_i and r_j denote the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n}.
Through the combined color information similarity, edge information similarity and region size similarity, regions r_i and r_j are merged iteratively, finally generating n regions, which are the wheat spider candidate regions.
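To make the fused similarity concrete, the following Python sketch computes the three terms and their weighted fusion for a pair of regions. It is illustrative only: the histogram layout follows the formulas above, while the dictionary-based region representation and the example weights in fused_similarity are assumptions, not part of the invention.

```python
import numpy as np

def l1_hist(values, bins, value_range):
    # L1-normalized histogram of one color channel (25 bins) or one
    # edge-direction response (10 bins); used to build the stored histograms.
    h, _ = np.histogram(values, bins=bins, range=value_range)
    s = h.sum()
    return h / s if s > 0 else h.astype(float)

def hist_similarity(h1, h2):
    # Histogram intersection: the sum of bin-wise minima, as in f_color and f_edge.
    return float(np.minimum(h1, h2).sum())

def fused_similarity(r_i, r_j, img_area, w=(0.4, 0.4, 0.2)):
    # f = w1*f_color + w2*f_edge + w3*f_area; the weights here are placeholders.
    f_color = hist_similarity(r_i["color_hist"], r_j["color_hist"])  # 3 x 25 bins, flattened
    f_edge = hist_similarity(r_i["edge_hist"], r_j["edge_hist"])     # 3 x 8 x 10 bins, flattened
    f_area = 1.0 - (r_i["area"] + r_j["area"]) / img_area            # region size similarity
    return w[0] * f_color + w[1] * f_edge + w[2] * f_area
```

Candidate-region generation then repeatedly merges the most similar pair of neighboring regions and recomputes the histograms of the merged region until the desired number of candidate regions remains.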
Second, the multi-scale feature fusion network is constructed and its structure transformed. To better extract the features of wheat spiders of every scale and form, a multi-scale feature fusion network is designed to accurately pick out the exact wheat spider regions from the candidate regions.
As shown in FIG. 3, the multi-scale feature fusion network exploits the inherent multi-scale, pyramidal hierarchy of convolutional feature maps and develops a top-down architecture with lateral connections to build high-level semantic feature maps at all scales. The construction comprises the following steps:
(1) and setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer.
(2) Setting the input of layer 1 as the training sample and outputting the layer 1 feature map, taking the layer 1 feature map as the input of layer 2 and outputting the layer 2 feature map, taking the layer 2 feature map as the input of layer 3, and so on until the layer n-1 feature map is taken as the input of layer n.
(3) Connecting the layer 1 through layer n feature maps with the corresponding layer 1 through layer n deconvolution layers through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.
Feature maps are generated by downsampling: each training picture is taken as input, features are extracted with the multi-scale neural network, and each layer of the network downsamples to produce its feature map.

The topmost feature map is then deconvolved to produce a feature map of the size of the layer below, and this is iterated in turn until a feature map of the layer 2 size is generated. Because each layer of the multi-scale network downsamples, the feature maps become smaller and smaller, and the wheat spiders in them shrink to only a few pixels, which strongly affects detection and counting. To avoid this, a deconvolution operation is applied at each pyramid level, upsampling the feature map to the size of the layer above, so that the pest features can be extracted effectively and the apparent size of the wheat spiders in the image is preserved.

The feature maps and the corresponding layers generated by deconvolution are then connected through 1 × 1 convolution kernels to generate the multi-scale feature fusion network.
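The following PyTorch sketch illustrates this structure under stated assumptions (n = 4 stages, 64 channels, stride-2 convolutions for downsampling, input side lengths divisible by 2^n); the patent fixes none of these details, so treat it as an illustration rather than the claimed network.

```python
import torch
import torch.nn as nn

class MultiScaleFusionNet(nn.Module):
    """Bottom-up convolutions with downsampling, top-down deconvolutions,
    and 1x1 lateral connections (an FPN-style sketch of the description)."""
    def __init__(self, n_layers=4, ch=64):
        super().__init__()
        self.down = nn.ModuleList()              # bottom-up stages, each halves H and W
        in_ch = 3
        for _ in range(n_layers):
            self.down.append(nn.Sequential(
                nn.Conv2d(in_ch, ch, 3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            in_ch = ch
        # Deconvolutions: each upsamples a map to the size of the layer below it.
        self.up = nn.ModuleList(
            [nn.ConvTranspose2d(ch, ch, 2, stride=2) for _ in range(n_layers - 1)])
        # 1x1 lateral connections fuse the bottom-up and top-down feature maps.
        self.lateral = nn.ModuleList([nn.Conv2d(ch, ch, 1) for _ in range(n_layers)])

    def forward(self, x):
        feats = []
        for stage in self.down:                  # layer 1 ... layer n feature maps
            x = stage(x)
            feats.append(x)
        fused = [None] * len(feats)
        top = self.lateral[-1](feats[-1])        # layer n fused map
        fused[-1] = top
        for i in range(len(feats) - 2, -1, -1):
            top = self.up[i](top)                # deconvolve to the size of layer i
            fused[i] = self.lateral[i](feats[i]) + top
            top = fused[i]
        return fused                             # one fused map per scale
```

Each fused map then feeds a small per-scale prediction head (class score and box coordinates), giving the n per-layer prediction results that are later regressed into the final prediction.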
Finally, the multi-scale feature fusion network is trained. The candidate regions located by the positioning model for the training samples are used as the training features, and the output result of each layer is taken as a prediction result. The specific steps are as follows:
(1) And inputting the training samples into a positioning model, and positioning the candidate regions of the training samples by the positioning model.
(2) And respectively inputting the candidate regions of the training sample into the layer 1 of the multi-scale neural network, and outputting a layer 1 characteristic diagram by the layer 1 of the multi-scale neural network.
(3) Inputting the layer 1 feature map into layer 2 of the multi-scale neural network and outputting the layer 2 feature map from layer 2, and so on until the layer n-1 feature map is input into layer n of the multi-scale neural network.
(4) Performing a deconvolution operation on the layer n feature map to generate the layer n deconvolution layer, then performing a deconvolution operation on layer n-1 to generate the layer n-1 deconvolution layer, and so on down to the layer 1 deconvolution layer.
(5) The layer 1 signature, the layer 2 signature, … through the nth signature are connected to the layer 1 deconvolution layer, the layer 2 deconvolution layer, … through the nth deconvolution layer by a 1 x 1 convolution kernel.
(6) Connecting the layer 1 feature graph with the layer 1 deconvolution layer through a 1 x 1 convolution kernel, extracting first layer features, and generating a first layer prediction result; connecting the 2 nd layer feature graph with the 2 nd layer deconvolution layer through 1 x 1 convolution kernel, extracting second layer features, and generating a second layer prediction result; … until the n-th layer feature graph and the n-th layer deconvolution layer are connected through a 1 x 1 convolution kernel, and after the n-th layer features are extracted, the n-th layer prediction result is generated.
(7) Performing regression processing on the layer 1 prediction result, the layer 2 prediction result, ..., through the layer n prediction result to generate the final prediction result, the regression function being:

C(\lambda) = \frac{1}{n} \sum_{j=1}^{n} \left( y^{(j)} - p_{\lambda}(x^{(j)}) \right)^2

wherein C(λ) is the regression function yielding the final prediction result, λ denotes the training parameters, n denotes the number of network layers, y^{(j)} denotes the true class, p_λ(x^{(j)}) denotes the layer j prediction result, and x^{(j)} denotes the feature vector of layer j.
(8) A final score is obtained through C(λ) to predict the category and the coordinates of its position in the image, the coordinates being the position of the wheat spider in the image.
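A minimal sketch of steps (7) and (8) is given below. It assumes the mean-squared-error form of C(λ) reconstructed above, and the simple averaging used to produce the fused final score is likewise an assumption, since the fusion rule is not spelled out here.

```python
import torch

def regression_fuse(layer_preds, labels=None):
    # layer_preds: list of n tensors, one prediction score per candidate region per layer.
    preds = torch.stack(layer_preds)       # shape: (n_layers, num_candidates)
    final_score = preds.mean(dim=0)        # fused prediction per candidate (assumed rule)
    if labels is None:
        return final_score
    # C(lambda): mean squared error between each layer's prediction and the true class.
    cost = ((labels.unsqueeze(0) - preds) ** 2).mean()
    return final_score, cost
```

During training, cost is minimized over the network parameters λ; at inference, final_score is thresholded to decide whether a candidate region contains a wheat spider.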
In the third step, the image to be counted is acquired. A wheat spider image shot in the field is acquired and preprocessed to obtain the image to be counted.
And fourthly, obtaining the number of the wheat spiders. And inputting the image to be counted into the wheat spider detection counting model to obtain the number of the wheat spiders in the image. The method comprises the following specific steps:
(1) inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;
(2) inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, counting the number of the wheat spiders, and obtaining the number of the wheat spiders in the image.
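Putting the two steps together, an inference-time counting loop might look like the sketch below; positioning_model.propose, fusion_net.predict and the 0.5 threshold are hypothetical names and values chosen for illustration, not interfaces defined by the invention.

```python
def count_wheat_spiders(image, positioning_model, fusion_net, threshold=0.5):
    # image: H x W x 3 array; the two models follow the sketches above (hypothetical APIs).
    candidates = positioning_model.propose(image)   # candidate boxes (x0, y0, x1, y1)
    count, detections = 0, []
    for (x0, y0, x1, y1) in candidates:
        patch = image[y0:y1, x0:x1]                 # crop the candidate region
        score = fusion_net.predict(patch)           # fused final score
        if score >= threshold:                      # classified as a wheat spider
            count += 1
            detections.append(((x0, y0, x1, y1), score))
    return count, detections
```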
As shown in fig. 2a, which gives the wheat spider detection results of the SVM algorithm, the wheat spider regions detected by the small boxes are much too large; in particular, the large box in the middle of fig. 2a erroneously encloses several closely spaced wheat spiders within a single box. The cause of this false detection is that the traditional SVM algorithm performs no prior localization; locating candidate regions with a positioning model avoids this phenomenon. The wheat spider regions detected by the small boxes are too large because the traditional SVM algorithm does not use regression fusion of multiple prediction results; in fig. 2a, some of the small boxes are also misidentifications.
As shown in fig. 2b, compared with the traditional SVM algorithm, the method of the present invention accurately locates the number and the specific positions of the wheat spiders, with high robustness and accuracy.
The foregoing shows and describes the general principles, principal features, and advantages of the invention. It will be understood by those skilled in the art that the invention is not limited to the embodiments described above, which merely illustrate its principles; various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined by the appended claims and their equivalents.
Claims (2)
1. A wheat spider automatic counting method based on combination of a multi-scale feature fusion network and a positioning model is characterized by comprising the following steps:
11) establishing a training sample, namely acquiring more than 2000 images of the wheat spiders in the field natural environment as training images, and marking the wheat spiders in the images to obtain the training sample;
12) constructing a wheat spider detection counting model;
121) constructing a positioning model, wherein constructing the positioning model comprises the following steps:
1211) setting a color space conversion module, wherein the color space conversion module converts the image from the RGB color space into the YCbCr color space and divides the image into segmented regions R = {r_1, r_2, ..., r_n};
1212) calculating the color information similarity: a 25-bin histogram is obtained for each color channel of the image and normalized with the L1 norm, and the color space similarity is calculated by the following formula:

f_{color}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{m} \min\left( h_a^k(r_i), h_a^k(r_j) \right)

wherein f_color(r_i, r_j) denotes the color space similarity of the segmented regions r_i and r_j; h_a^k(r_i) denotes the k-th histogram bin of the a-th color channel of region r_i, with a = 1, 2, 3 and k = 1, 2, ..., m; m = 25 denotes the number of histogram bins per channel; and r_i denotes the i-th region of the segmentation R = {r_1, r_2, ..., r_n};
1213) calculating the edge information similarity: a Gaussian derivative with variance σ = 1 is computed in 8 different directions for each color channel, a 10-bin histogram is obtained for each direction of each channel and normalized with the L1 norm, and the edge information similarity is calculated by the following formula:

f_{edge}(r_i, r_j) = \sum_{a=1}^{3} \sum_{k=1}^{q} \min\left( t_a^k(r_i), t_a^k(r_j) \right)

wherein f_edge(r_i, r_j) denotes the edge information similarity of the segmented regions r_i and r_j; t_a^k(r_i) denotes the k-th edge histogram bin of the a-th channel of region r_i, with a = 1, 2, 3 and k = 1, 2, ..., q; q = 8 × 10 = 80 denotes the number of edge histogram bins per channel; and r_i denotes the i-th region of the segmentation R = {r_1, r_2, ..., r_n};
1214) calculating the region size similarity by the following formula:

f_{area}(r_i, r_j) = 1 - \frac{area(r_i) + area(r_j)}{area(img)}

wherein f_area(r_i, r_j) denotes the region size similarity of the segmented regions r_i and r_j; area(·) denotes the area of a region; area(img) denotes the area of the whole picture; and r_i and r_j denote the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n};
1215) fusing the color information similarity, the edge information similarity and the region size similarity by the following formula:

f(r_i, r_j) = w_1 f_{color}(r_i, r_j) + w_2 f_{edge}(r_i, r_j) + w_3 f_{area}(r_i, r_j)

wherein f(r_i, r_j) denotes the fused similarity of the segmented regions r_i and r_j; w_1, w_2 and w_3 denote the weights of the color information similarity, the edge information similarity and the region size similarity, respectively; and r_i and r_j denote the i-th and j-th regions of the segmentation R = {r_1, r_2, ..., r_n};
122) constructing a multi-scale feature fusion network, and transforming a multi-scale feature fusion network structure; the construction of the multi-scale feature fusion network comprises the following steps:
1221) setting n layers of multi-scale neural networks, and performing deconvolution operation from the topmost layer to generate a deconvolution layer;
1222) setting the input of layer 1 as the training sample and outputting the layer 1 feature map, taking the layer 1 feature map as the input of layer 2 and outputting the layer 2 feature map, taking the layer 2 feature map as the input of layer 3, and so on until the layer n-1 feature map is taken as the input of layer n;
1223) connecting the layer 1 through layer n feature maps with the corresponding layer 1 through layer n deconvolution layers through 1 × 1 convolution kernels to generate the multi-scale feature fusion network;
123) training the multi-scale feature fusion network, namely training with the candidate regions located by the positioning model for the training samples as features, and taking the output result of each layer as a prediction result;
the training multi-scale feature fusion network comprises the following steps:
1231) inputting the training sample into a positioning model, and positioning a candidate region of the training sample by the positioning model;
1232) respectively inputting the candidate regions of the training sample into the 1 st layer of the multi-scale neural network, and outputting a 1 st layer characteristic diagram by the 1 st layer of the multi-scale neural network;
1233) inputting the feature map of the layer 1 into the layer 2 of the multi-scale neural network, and outputting the feature map of the layer 2 by the layer 2 of the multi-scale neural network until the feature map of the layer n-1 is input into the nth layer of the multi-scale neural network;
1234) performing deconvolution operation on the n-th layer of feature map to generate an n-th deconvolution layer, and performing deconvolution operation on the n-1-th layer to generate an n-1-th deconvolution layer, so as to obtain a 1-st deconvolution layer;
1235) connecting the layer 1, layer 2, ..., layer n feature maps with the layer 1, layer 2, ..., layer n deconvolution layers through 1 × 1 convolution kernels;
1236) connecting the layer 1 feature map with the layer 1 deconvolution layer through a 1 × 1 convolution kernel, extracting the layer 1 features, and generating the layer 1 prediction result; connecting the layer 2 feature map with the layer 2 deconvolution layer through a 1 × 1 convolution kernel, extracting the layer 2 features, and generating the layer 2 prediction result; and so on, until the layer n feature map is connected with the layer n deconvolution layer through a 1 × 1 convolution kernel, the layer n features are extracted, and the layer n prediction result is generated;
1237) performing regression processing on the layer 1 prediction result, the layer 2 prediction result, ..., through the layer n prediction result to generate the final prediction result, the regression function being:

C(\lambda) = \frac{1}{n} \sum_{j=1}^{n} \left( y^{(j)} - p_{\lambda}(x^{(j)}) \right)^2

wherein C(λ) is the regression function yielding the final prediction result, λ denotes the training parameters, n denotes the number of network layers, y^{(j)} denotes the true class, p_λ(x^{(j)}) denotes the layer j prediction result, and x^{(j)} denotes the feature vector of layer j;
1238) obtaining a final score through C(λ) to predict the category and the coordinates of its position in the image;
13) acquiring an image to be counted, namely acquiring a wheat spider image shot in the field and preprocessing it to obtain the image to be counted;
14) And (3) obtaining the number of the wheat spiders, namely inputting the image to be counted into a wheat spider detection counting model to obtain the number of the wheat spiders in the image.
2. The method for automatically counting the number of the wheat spiders based on the combination of the multi-scale feature fusion network and the positioning model as claimed in claim 1, wherein the obtaining of the number of the wheat spiders comprises the following steps:
21) inputting the image to be counted into a positioning model, and positioning a candidate area of the image to be counted by the positioning model;
22) inputting the candidate area of the image to be counted into a multi-scale neural network to obtain the prediction classification of the wheat spiders in the image, and counting the number of the wheat spiders to obtain the number of the wheat spiders in the image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810863041.1A CN109145770B (en) | 2018-08-01 | 2018-08-01 | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810863041.1A CN109145770B (en) | 2018-08-01 | 2018-08-01 | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109145770A CN109145770A (en) | 2019-01-04 |
CN109145770B true CN109145770B (en) | 2022-07-15 |
Family
ID=64798885
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810863041.1A Active CN109145770B (en) | 2018-08-01 | 2018-08-01 | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109145770B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110428413B (en) * | 2019-08-02 | 2021-09-28 | 中国科学院合肥物质科学研究院 | Spodoptera frugiperda imago image detection method used under lamp-induced device |
CN110689081B (en) * | 2019-09-30 | 2020-08-21 | 中国科学院大学 | Weak supervision target classification and positioning method based on bifurcation learning |
CN112651462A (en) * | 2021-01-04 | 2021-04-13 | 楚科云(武汉)科技发展有限公司 | Spider classification method and device and classification model construction method and device |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850836A (en) * | 2015-05-15 | 2015-08-19 | 浙江大学 | Automatic insect image identification method based on depth convolutional neural network |
CN106845401A (en) * | 2017-01-20 | 2017-06-13 | 中国科学院合肥物质科学研究院 | A kind of insect image-recognizing method based on many spatial convoluted neutral nets |
CN107016680A (en) * | 2017-02-24 | 2017-08-04 | 中国科学院合肥物质科学研究院 | A kind of insect image background minimizing technology detected based on conspicuousness |
CN107133943A (en) * | 2017-04-26 | 2017-09-05 | 贵州电网有限责任公司输电运行检修分公司 | A kind of visible detection method of stockbridge damper defects detection |
CN107292314A (en) * | 2016-03-30 | 2017-10-24 | 浙江工商大学 | A kind of lepidopterous insects species automatic identification method based on CNN |
CN107346424A (en) * | 2017-06-30 | 2017-11-14 | 成都东谷利农农业科技有限公司 | Lamp lures insect identification method of counting and system |
CN107808116A (en) * | 2017-09-28 | 2018-03-16 | 中国科学院合肥物质科学研究院 | A kind of wheat spider detection method based on the fusion study of depth multilayer feature |
KR20180053003A (en) * | 2016-11-11 | 2018-05-21 | 전북대학교산학협력단 | Method and apparatus for detection and diagnosis of plant diseases and insects using deep learning |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016197303A1 (en) * | 2015-06-08 | 2016-12-15 | Microsoft Technology Licensing, Llc. | Image semantic segmentation |
US10354159B2 (en) * | 2016-09-06 | 2019-07-16 | Carnegie Mellon University | Methods and software for detecting objects in an image using a contextual multiscale fast region-based convolutional neural network |
US10262237B2 (en) * | 2016-12-08 | 2019-04-16 | Intel Corporation | Technologies for improved object detection accuracy with multi-scale representation and training |
CN107016405B (en) * | 2017-02-24 | 2019-08-30 | 中国科学院合肥物质科学研究院 | A kind of pest image classification method based on classification prediction convolutional neural networks |
CN107368787B (en) * | 2017-06-16 | 2020-11-10 | 长安大学 | Traffic sign identification method for deep intelligent driving application |
CN108062531B (en) * | 2017-12-25 | 2021-10-19 | 南京信息工程大学 | Video target detection method based on cascade regression convolutional neural network |
CN108256481A (en) * | 2018-01-18 | 2018-07-06 | 中科视拓(北京)科技有限公司 | A kind of pedestrian head detection method using body context |
- 2018-08-01: application CN201810863041.1A filed, granted as patent CN109145770B (en), status Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104850836A (en) * | 2015-05-15 | 2015-08-19 | 浙江大学 | Automatic insect image identification method based on depth convolutional neural network |
CN107292314A (en) * | 2016-03-30 | 2017-10-24 | 浙江工商大学 | A kind of lepidopterous insects species automatic identification method based on CNN |
KR20180053003A (en) * | 2016-11-11 | 2018-05-21 | 전북대학교산학협력단 | Method and apparatus for detection and diagnosis of plant diseases and insects using deep learning |
CN106845401A (en) * | 2017-01-20 | 2017-06-13 | 中国科学院合肥物质科学研究院 | A kind of insect image-recognizing method based on many spatial convoluted neutral nets |
CN107016680A (en) * | 2017-02-24 | 2017-08-04 | 中国科学院合肥物质科学研究院 | A kind of insect image background minimizing technology detected based on conspicuousness |
CN107133943A (en) * | 2017-04-26 | 2017-09-05 | 贵州电网有限责任公司输电运行检修分公司 | A kind of visible detection method of stockbridge damper defects detection |
CN107346424A (en) * | 2017-06-30 | 2017-11-14 | 成都东谷利农农业科技有限公司 | Lamp lures insect identification method of counting and system |
CN107808116A (en) * | 2017-09-28 | 2018-03-16 | 中国科学院合肥物质科学研究院 | A kind of wheat spider detection method based on the fusion study of depth multilayer feature |
Non-Patent Citations (5)
Title |
---|
Robust object tracking via multi-scale patch based sparse coding histogram; Zhong W et al.; Proc IEEE Conf Comput Vision Pattern Recognit; 2012-12-31; pp. 1838-1845 *
Selective Search for Object Recognition; J.R.R. Uijlings et al.; International Journal of Computer Vision; 2013-09-30; Section 3 *
Image recognition of farmland pests based on a sparse-coding pyramid model; Xie Chengjun et al.; Transactions of the Chinese Society of Agricultural Engineering; 2016-09-30; Vol. 32, No. 7, pp. 144-151 *
Pest image recognition via multi-feature fusion based on sparse representation; Hu Yongqiang et al.; Pattern Recognition and Artificial Intelligence; 2014-11-30; Vol. 27, No. 11, pp. 985-992 *
Detection models in deep learning: FPN; leo_whz; https://blog.csdn.net/whz1861/article/details/79042283; 2018-01-12; pp. 1-3 *
Also Published As
Publication number | Publication date |
---|---|
CN109145770A (en) | 2019-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110770752A (en) | Automatic pest counting method combining multi-scale feature fusion network with positioning model | |
Li et al. | SAR image change detection using PCANet guided by saliency detection | |
CN109344701B (en) | Kinect-based dynamic gesture recognition method | |
CN110348319B (en) | Face anti-counterfeiting method based on face depth information and edge image fusion | |
CN109154978B (en) | System and method for detecting plant diseases | |
CN108009559B (en) | Hyperspectral data classification method based on space-spectrum combined information | |
Li et al. | A coarse-to-fine network for aphid recognition and detection in the field | |
CN109684922B (en) | Multi-model finished dish identification method based on convolutional neural network | |
US8908919B2 (en) | Tactical object finder | |
CN111753828B (en) | Natural scene horizontal character detection method based on deep convolutional neural network | |
CN110766041B (en) | Deep learning-based pest detection method | |
CN114897816B (en) | Mask R-CNN mineral particle identification and particle size detection method based on improved Mask | |
WO2022028031A1 (en) | Contour shape recognition method | |
CN112733614B (en) | Pest image detection method with similar size enhanced identification | |
CN108960404B (en) | Image-based crowd counting method and device | |
CN111882554B (en) | SK-YOLOv 3-based intelligent power line fault detection method | |
CN109801305B (en) | SAR image change detection method based on deep capsule network | |
Trivedi et al. | Automatic segmentation of plant leaves disease using min-max hue histogram and k-mean clustering | |
CN109145770B (en) | Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model | |
CN112200121A (en) | Hyperspectral unknown target detection method based on EVM and deep learning | |
WO2024217541A1 (en) | Remote-sensing image change detection method based on siamese network | |
CN112464983A (en) | Small sample learning method for apple tree leaf disease image classification | |
CN116596875A (en) | Wafer defect detection method and device, electronic equipment and storage medium | |
CN113221956B (en) | Target identification method and device based on improved multi-scale depth model | |
CN111639697B (en) | Hyperspectral image classification method based on non-repeated sampling and prototype network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||