
CN114693616B - Rice disease detection method, device and medium based on improved target detection model and convolutional neural network - Google Patents


Info

Publication number: CN114693616B (application CN202210263662.2A)
Authority: CN (China)
Prior art keywords: rice, bounding box, images, network, image
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN114693616A
Inventors: 路阳, 张楠, 董宏丽, 蔡月芹, 田枫, 申雨轩, 胡仲瑞, 王鹏
Current assignee: Heilongjiang Bayi Agricultural University
Original assignee: Heilongjiang Bayi Agricultural University
Priority date / filing date: 2022-03-17
Application filed by Heilongjiang Bayi Agricultural University; priority to CN202210263662.2A
Published as CN114693616A; granted as CN114693616B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/0002: Inspection of images, e.g. flaw detection
    • G06T 7/0012: Biomedical image inspection
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 40/00: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A 40/10: Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture


Abstract

The invention provides a rice disease detection method, device, and medium based on an improved target detection model and a convolutional neural network. Images of healthy rice leaves and of three common rice diseases (rice blast, flax leaf spot, and bacterial leaf streak) are captured in actual paddy fields under a variety of complex conditions. By comparing the test results of four feature extraction networks, ResNet-101 is determined to be the optimal feature extraction network. To address the low recognition rate of small lesions in rice diseases, a feature pyramid network is fused into the model to improve recognition accuracy. Test results show that the Faster R-CNN model fused with ResNet-101 and the feature pyramid network can effectively detect the three common rice diseases against complex backgrounds; compared with the unmodified Faster R-CNN model and the YOLO and SSD algorithms, the accuracy of the algorithm is further improved while detection performance is maintained.

Description

Rice disease detection method, device and medium based on improved target detection model and convolutional neural network
Technical Field
The invention belongs to the technical field of rice disease and pest detection, and particularly relates to a rice disease detection method, device and medium based on an improved target detection model and a convolutional neural network.
Background
Rice is the grain crop with the largest sown area and the highest per-unit yield in China, and stable, high rice yields have always been a goal of Chinese agricultural production. Rice blast, flax leaf spot, and bacterial leaf streak are the most common major rice diseases; if they are not discovered and controlled in time, rice yield can be seriously affected, causing huge economic losses. Rapidly and accurately identifying the disease type during rice growth, determining the disease location and severity, and then taking appropriate control measures can effectively reduce farmers' losses and help safeguard national grain security.
To date, the most common methods of identifying rice diseases rely mainly on manual identification through several channels: farmers draw on their own planting experience, consult professional books, or make subjective judgments based on information found online; samples are sent to a laboratory for detailed analysis at the molecular level; or rice growers seek guidance from professional agricultural technicians. Manual identification suffers from a lack of professional crop diagnosis knowledge, strong subjectivity, and low efficiency; in a complex field environment one leaf may carry diseases at multiple positions, and because some diseases are extremely similar, misjudgment is easy. Bringing samples back to the laboratory for analysis, although highly accurate, can only be done in the laboratory and is inefficient and costly. Reliance on experts is high, and because plant protection stations across China are small in scale, professionals cannot inspect crops for every grower at regular intervals during the rice growth period, so disease identification is easily delayed and rice yields are reduced.
In recent years, the application of deep learning to image recognition has become a research hotspot. In the field of agricultural information, studies at home and abroad on deep-learning-based crop pest identification, plant classification, fruit classification, and the like are gradually increasing. Convolutional neural networks (CNNs) perform well in detecting and identifying rice diseases. Huang Shuangping et al. first built a backbone network by repeatedly stacking Inception basic modules and then used the deep convolutional neural network GoogLeNet to accurately detect rice panicle blast, achieving a highest panicle blast prediction accuracy of 92.0% on the validation set. Jing designed a recognition algorithm for common rice diseases using a convolutional neural network, reaching an average accuracy of 96.03%. Chowdhury R. Rahman et al. proposed a two-stage small CNN architecture and compared it with a fine-tuned VGG16 for detecting and identifying rice diseases, achieving a disease recognition accuracy of 93.3%. Rallapalli et al. proposed an improved AlexNet model, named M-Net, that maintains good classification performance with high accuracy, achieving an average recognition accuracy of 90% on rice diseases such as brown spot, hispa, and leaf blast. Although the final accuracies are high, these studies used single backgrounds and single disease types and did not consider complex paddy field environments containing the leaves of many rice plants, soil, weeds, and lesions of different sizes. Moreover, a CNN usually scans the whole image with a sliding window, which has high time complexity; as the number of layers grows, the model parameters become excessive, and with too little training data, overfitting easily leads to misjudgment; and a plain convolutional neural network model cannot locate the lesions. There is therefore an urgent need for a rice disease detection method that is fast, accurate, efficient, and applicable to complex field environments.
With the rapid development of deep learning, target detection methods based on convolutional neural networks have shown great advantages: they can not only identify targets in an image but also locate them. Li Shanjun et al. proposed an improved SSD-ResNet real-time citrus classification method that distinguishes normal citrus, skin-lesioned citrus, and mechanically damaged citrus, achieving a final mAP of 87.89% with an average detection time of 0.02 s. Fangfang Gao et al. used a SANP system based on Faster R-CNN to detect apple fruits; comparing the results of two feature extraction networks, VGG16 and ZFNet, they found that the VGG16-based Faster R-CNN achieved the higher mAP of 87.9%, with an average detection time of 0.241 s. Yunong Tian et al. proposed an improved YOLO-V3 model for detecting apples at different growth stages in a complex orchard environment, with an F1 value of 81.7%.
Various convolutional-neural-network-based target detection methods have achieved good results in fruit detection, but few have been applied to rice disease identification. Detecting rice diseases differs from detecting fruit: in a complex field environment, the lesions of different diseases are similar in color and shape and easily occlude one another, making them harder to identify. The invention therefore proposes a localization and identification method for rice disease lesions based on an improved faster region-based convolutional neural network (Faster R-CNN) and establishes a rice disease identification model. ResNet-101 is used as the feature extraction network of Faster R-CNN, and, to address the low recognition precision for small lesions, a feature pyramid network (FPN) is fused in to extract the extent and shape features of small lesions for localization and identification, effectively improving model performance.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a rice disease detection method, device and medium based on an improved target detection model and a convolutional neural network.
The invention is realized through the following technical scheme. A rice disease detection method based on an improved target detection model and a convolutional neural network comprises the following steps:
Step 1: acquiring healthy rice images and rice disease images to form a data set;
Step 2: performing data enhancement on the acquired images to expand the data set, annotating the rice disease data set with the LabelImg tool, marking the positions of rice lesions with rectangular boxes, and randomly dividing the annotated data set into a training set, a validation set, and a test set;
Step 3: training the improved Faster R-CNN model with the training set, and using the trained model with the test set to identify the types and locate the positions of rice diseases.
In step 3, a rice disease image of any size is first input into ResNet-101 for feature extraction and output as a high-dimensional feature map; the extracted high-dimensional features are then input into a feature pyramid network for feature fusion, generating multi-scale rice disease feature maps; next, the feature maps are input into the RPN network to obtain region proposals and region scores; then, the high-dimensional feature maps and region proposals are input into the ROI pooling layer, which outputs the features of the corresponding region proposals; finally, the obtained region proposal features are input into the FC fully connected layer, which outputs the category of each candidate region and its precise position in the image, yielding the category and precise position of the diseases on the rice leaves. A minimal code sketch of this pipeline follows.
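For concreteness, the following is a minimal sketch of how such a pipeline could be assembled. It uses PyTorch/torchvision (the patent's own experiments ran under TensorFlow 2.1, so the framework and API calls here are an illustrative assumption, not the authors' code); `num_classes=4` covers the three diseases plus background.

```python
# Illustrative sketch only (assumes PyTorch with torchvision >= 0.8; the
# patent's experiments used TensorFlow 2.1, so this is not the authors' code).
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet-101 backbone with a feature pyramid network fused on top,
# producing multi-scale feature maps for the RPN and ROI heads.
backbone = resnet_fpn_backbone(backbone_name='resnet101', pretrained=True)

# 3 disease classes (rice blast, flax leaf spot, bacterial leaf streak)
# plus 1 background class.
model = FasterRCNN(backbone, num_classes=4)

model.eval()
with torch.no_grad():
    image = torch.rand(3, 800, 1067)      # an RGB image tensor of arbitrary size
    output = model([image])[0]            # dict with 'boxes', 'labels', 'scores'
print(output['boxes'].shape, output['labels'], output['scores'])
```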
Further, in step 1, 800 healthy rice images, 600 rice blast images, 600 flax leaf spot images, and 600 bacterial leaf streak images are collected, for a total of 2600 images.
Further, the expanded data set contains 7800 images; the annotated data set is randomly divided in a 6:2:2 ratio into a training set of 4680 images, a validation set of 1560 images, and a test set of 1560 images, as sketched below.
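A minimal sketch of that split; the directory layout and random seed are assumptions for illustration:

```python
# Sketch of the 6:2:2 random split; file layout and seed are assumed.
import glob
import random

random.seed(0)
images = sorted(glob.glob('rice_dataset/images/*.jpg'))  # 7800 files expected
random.shuffle(images)

n = len(images)
n_train, n_val = int(0.6 * n), int(0.2 * n)
train_set = images[:n_train]                    # 4680 images
val_set = images[n_train:n_train + n_val]       # 1560 images
test_set = images[n_train + n_val:]             # 1560 images
```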
Further, inputting a rice disease image of any size into ResNet-101 for feature extraction and outputting a high-dimensional feature map, then inputting the extracted high-dimensional features into a feature pyramid network for feature fusion to generate multi-scale rice disease feature maps, specifically comprises:
ResNet-101 performs bottom-up feature extraction on the input image; a 1x1 convolution is applied to each extracted multi-level feature map to reduce its dimensionality to 256 channels; top-down 2x upsampling is performed and the result is added to the features of the corresponding lateral layer; a 3x3 convolution is applied to the fused features to reduce the aliasing effect of the 2x upsampling, yielding the final feature layers {P2, P3, P4, P5}; finally, the resulting multi-scale feature maps, rich in both semantic and positional information, together with the top-level feature map P6, are input into the RPN network to generate candidate region boxes. A sketch of this fusion appears below.
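A minimal PyTorch sketch of this top-down fusion, assuming the standard ResNet-101 stage channel counts (256, 512, 1024, 2048 for C2 to C5); it mirrors the 1x1 lateral convolutions, 2x upsampling, addition, and 3x3 smoothing described above, with P6 derived from P5 for the RPN:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Top-down FPN fusion as described above: 1x1 lateral convs to 256
    channels, 2x nearest-neighbour upsampling, element-wise addition, and
    a 3x3 smoothing conv to reduce upsampling aliasing."""
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1)
                                     for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3,
                                              padding=1) for _ in in_channels)

    def forward(self, c2, c3, c4, c5):
        p5 = self.lateral[3](c5)
        p4 = self.lateral[2](c4) + F.interpolate(p5, scale_factor=2, mode='nearest')
        p3 = self.lateral[1](c3) + F.interpolate(p4, scale_factor=2, mode='nearest')
        p2 = self.lateral[0](c2) + F.interpolate(p3, scale_factor=2, mode='nearest')
        p2, p3, p4, p5 = (s(p) for s, p in zip(self.smooth, (p2, p3, p4, p5)))
        p6 = F.max_pool2d(p5, kernel_size=1, stride=2)  # extra top level for the RPN
        return p2, p3, p4, p5, p6

# Example with ResNet-101 stage channel counts (C2..C5):
fpn = SimpleFPN()
c2, c3, c4, c5 = (torch.rand(1, c, s, s) for c, s in
                  [(256, 200), (512, 100), (1024, 50), (2048, 25)])
outs = fpn(c2, c3, c4, c5)
print([o.shape for o in outs])  # P2..P6, each with 256 channels
```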
Further, on the feature maps output by the feature pyramid network, sliding windows first generate anchors of different sizes and shapes; then, of the two following 1x1 convolution layers, one is used for classification, judging whether each anchor is foreground or background, and the other for precise localization, fine-tuning the anchors via bounding box regression so that they approach the ground truth as closely as possible.
Further, the loss function of the RPN is defined as:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \alpha \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$

where $L_{cls}$ is the object classification loss function, for which a log loss is used; $L_{reg}$ is the bounding box regression loss function, for which a Smooth L1 loss is used; $p_i$ denotes the predicted probability that anchor $i$ belongs to a class, and $p_i^*$ denotes the classification label: when the intersection over union (IoU) between the anchor and the object ground truth is greater than 0.5, $p_i^* = 1$ (foreground); when the IoU is less than 0.5, $p_i^* = 0$ (background), and background samples do not undergo bounding box regression; $N_{cls}$ denotes the mini-batch size and $N_{reg}$ the number of anchors; $\alpha$ denotes a weight balance parameter that keeps the classification loss and the bounding box loss weighted roughly equally; $t_i = (t_x, t_y, t_w, t_h)$ denotes the four coordinate values of the predicted bounding box, and $t_i^* = (t_x^*, t_y^*, t_w^*, t_h^*)$ the four coordinate values of the ground-truth bounding box.
Further:

The object classification loss function is:

$$L_{cls}(p_i, p_i^*) = -\log\left[p_i p_i^* + (1 - p_i)(1 - p_i^*)\right]$$

The bounding box regression loss function is:

$$L_{reg}(t_i, t_i^*) = \operatorname{smooth}_{L1}(t_i - t_i^*), \qquad \operatorname{smooth}_{L1}(x) = \begin{cases} 0.5\,x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

The bounding box regression uses a four-coordinate parameterization, defined as follows:

$$t_x = \frac{x - x_a}{w_a}, \quad t_y = \frac{y - y_a}{h_a}, \quad t_w = \log\frac{w}{w_a}, \quad t_h = \log\frac{h}{h_a}$$

$$t_x^* = \frac{x^* - x_a}{w_a}, \quad t_y^* = \frac{y^* - y_a}{h_a}, \quad t_w^* = \log\frac{w^*}{w_a}, \quad t_h^* = \log\frac{h^*}{h_a}$$

where x denotes the abscissa of the bounding box center, y the ordinate of the bounding box center, w the width of the bounding box, and h its height; the subscript a marks the corresponding anchor values, and the asterisk the corresponding ground-truth values. A PyTorch sketch of the composite loss follows.
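A sketch of this loss in PyTorch, under the simplifying assumption that foreground/background anchors have already been sampled and their regression targets encoded:

```python
import torch
import torch.nn.functional as F

def rpn_loss(p, p_star, t, t_star, alpha=10.0):
    """Sketch of the RPN loss above.
    p:      (N,) predicted objectness probabilities in [0, 1]
    p_star: (N,) ground-truth labels, 1 = foreground (IoU > 0.5), 0 = background
    t, t_star: (N, 4) predicted / ground-truth offsets (t_x, t_y, t_w, t_h)
    """
    # log loss averaged over the sampled mini-batch (N_cls)
    l_cls = F.binary_cross_entropy(p, p_star.float())
    # Smooth L1 loss over foreground anchors only, normalised by N_reg
    fg = p_star == 1
    l_reg = F.smooth_l1_loss(t[fg], t_star[fg], reduction='sum') / len(p)
    return l_cls + alpha * l_reg  # alpha balances the two terms
```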
Further, during model training the optimizer uses stochastic gradient descent with momentum, with a weight decay of 0.001, the default momentum factor of 0.9, and a max epoch of 50; the maximum number of Faster R-CNN iterations is 60000, the learning rate is set to 0.01, and training completes when the iteration count reaches 60000. A configuration sketch follows.
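Expressed as a PyTorch configuration (a sketch under the same framework assumption as the earlier pipeline sketch; `data_iter` is a hypothetical data iterator):

```python
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

model = FasterRCNN(resnet_fpn_backbone(backbone_name='resnet101',
                                       pretrained=True), num_classes=4)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,             # learning rate reported above
    momentum=0.9,        # default momentum factor
    weight_decay=0.001,  # weight decay reported above
)

model.train()
for step in range(60000):                 # maximum iteration number
    images, targets = next(data_iter)     # hypothetical data iterator
    loss_dict = model(images, targets)    # torchvision returns a dict of losses
    optimizer.zero_grad()
    sum(loss_dict.values()).backward()
    optimizer.step()
```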
The invention further provides an electronic device comprising a memory and a processor, the memory storing a computer program; when the processor executes the computer program, it implements the steps of the above rice disease detection method based on an improved target detection model and a convolutional neural network.
The invention further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the above rice disease detection method based on an improved target detection model and a convolutional neural network.
The beneficial effects of the invention are as follows:
To address the identification of rice diseases against complex backgrounds with lesions of different sizes and shapes, the invention provides a rice disease detection method based on an improved Faster R-CNN. First, images of healthy rice leaves and of three common rice diseases (rice blast, flax leaf spot, and bacterial leaf streak) are captured in actual paddy fields under different complex conditions. Then, by comparing the test results of four feature extraction networks, ResNet-101 is determined to be the optimal feature extraction network. To address the low recognition rate of small lesions, a feature pyramid network is fused in to improve recognition accuracy. Test results show that the Faster R-CNN model fused with ResNet-101 and the feature pyramid network can effectively detect the three common rice diseases against complex backgrounds; compared with the unmodified Faster R-CNN model and the YOLO and SSD algorithms, accuracy is further improved while detection performance is maintained, with a final mean average precision of 91.12%. The model can adapt to complex field environments, performs well in practical application, and provides an important reference for subsequent rice disease management such as pesticide spraying.
Drawings
FIG. 1 shows rice leaf disease images: (a) healthy leaf, (b) rice blast, (c) flax leaf spot, and (d) bacterial leaf streak;
FIG. 2 is a schematic diagram showing the effect of data enhancement on a rice blast image;
FIG. 3 shows annotation examples from the rice disease dataset: (a) healthy leaf, (b) rice blast, (c) flax leaf spot, and (d) bacterial leaf streak;
FIG. 4 is a schematic diagram of a modified Faster R-CNN model structure;
FIG. 5 is a feature pyramid diagram;
FIG. 6 is a diagram of a ResNet-FPN model configuration;
FIG. 7 is a graph showing a comparison of different feature extraction network loss curves;
FIG. 8 shows identification and localization of rice blast before and after model improvement: (a) Faster R-CNN prediction results; (b) improved model prediction results; red arrows indicate undetected rice blast lesions;
FIG. 9 shows identification and localization of flax leaf spot before and after model improvement: (a) Faster R-CNN prediction results; (b) improved model prediction results; red arrows indicate undetected flax leaf spot lesions;
FIG. 10 shows identification and localization of bacterial leaf streak before and after model improvement: (a) Faster R-CNN prediction results; (b) improved model prediction results.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Data acquisition
The rice disease images were collected from a university rice test field. To ensure sample diversity, three mobile phones with different camera resolutions were used: an iPhone 12 with a 12-megapixel rear camera, a Redmi K30 Pro with a 64-megapixel rear camera, and a third handset with a 40-megapixel rear camera. Photographs were taken in different field environments, under different illumination intensities (including front lighting and back lighting), and at different maturity stages. The image resolutions are 2532×1170, 2400×1080, and 2340×1080, respectively, stored in JPEG format. In total, 800 healthy rice images, 600 rice blast images, 600 flax leaf spot images, and 600 bacterial leaf streak images were collected, 2600 images in all. Rice leaf disease images are shown in FIG. 1, and the main disease features are listed in Table 1.
TABLE 1 Rice leaf disease image characteristics
Dataset construction and annotation
First, data enhancement was performed on the original images to increase sample number and diversity. Given the relatively uniform color of rice images, the data set was expanded by increasing or decreasing contrast and brightness, enhancing the color features of rice lesions; the expanded data set contains 7800 images in total. The effect of enhancement on a rice blast image is shown in FIG. 2. The rice disease data set was annotated with the LabelImg tool, marking the locations of rice lesions with rectangular boxes; annotation of the three disease images is shown in FIG. 3. The annotated data set was randomly divided in a 6:2:2 ratio into a training set of 4680 images, a validation set of 1560 images, and a test set of 1560 images, as shown in Table 2. A sketch of the enhancement step follows.
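A minimal sketch of such brightness/contrast enhancement with Pillow; the exact enhancement factors used in the patent are not stated, so the values here are assumptions:

```python
from PIL import Image, ImageEnhance

def brightness_contrast_variants(path, low=0.7, high=1.3):
    """Generate darker/brighter and lower/higher-contrast variants of one
    image; the factor values are illustrative assumptions."""
    img = Image.open(path).convert('RGB')
    return [
        ImageEnhance.Brightness(img).enhance(low),   # decreased brightness
        ImageEnhance.Brightness(img).enhance(high),  # increased brightness
        ImageEnhance.Contrast(img).enhance(low),     # decreased contrast
        ImageEnhance.Contrast(img).enhance(high),    # increased contrast
    ]

# Keeping two variants per original would expand 2600 images to 7800,
# matching the threefold expansion reported above.
```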
Table 2 number and types of datasets
Improved Faster R-CNN model
The Faster R-CNN model, proposed by Ross B. Girshick in 2015, can not only identify targets in an image but also locate them. It achieves good recognition accuracy on larger target regions, but when rice diseases are recognized in the natural environment its accuracy suffers, because the field environment is complex and some lesions are small. The invention provides an improved Faster R-CNN model. First, a rice disease image of any size is input into ResNet-101 for feature extraction and output as a high-dimensional feature map. The extracted high-dimensional features are then input into a feature pyramid network for feature fusion, generating multi-scale rice disease feature maps. Next, the feature maps are input into the RPN network to obtain region proposals and region scores. Then, the high-dimensional feature maps and region proposals are input into the ROI pooling layer, which outputs the features of the corresponding region proposals. Finally, the obtained region proposal features are input into the FC fully connected layer, which outputs the category of each candidate region and its precise position in the image, yielding the category and precise position of the diseases on the rice leaves. The structure is shown in FIG. 4.
Feature extraction network
The original Faster R-CNN model uses a pretrained VGG16 for feature extraction. To find a feature extraction network with good rice disease recognition performance and speed, four convolutional neural networks (VGG16, VGG19, ResNet-50, and ResNet-101) were each used during testing as the feature extraction network for rice disease images. Comparison of the resulting mean average precision showed that ResNet-101 had the best recognition performance, so ResNet-101 was chosen as the feature extraction network of the improved Faster R-CNN model.
Feature pyramid network
The traditional Faster R-CNN is a single-scale target detection network: only the top-level feature map is used for prediction, and the feature maps of the other layers are unused. Although the top-level feature map carries rich semantic information, its positional information and resolution are poor, and after many rounds of convolution and pooling the semantic information of a small target becomes blurred or even lost, so detection performance suffers.
The feature pyramid network, proposed by Lin et al. in 2017, aims to solve multi-scale object detection; its structure is shown in FIG. 5. It consists of three parts: a bottom-up pathway, a top-down pathway, and lateral connections. The bottom-up pathway is the forward pass of the convolutional neural network: as convolution kernels are applied, the feature maps become smaller, their semantic information richer, and their resolution lower. The network is divided into stages according to feature map size, and the last layer of each stage, which has the strongest semantic information, is output as that stage's deepest feature. The top-down pathway upsamples the small top-level feature map by a factor of 2 to the size of the previous stage's feature map. The lateral connections fuse the semantically strong feature maps from the top-down pathway with the positionally strong feature maps from the bottom-up pathway, yielding new feature maps at different scales with richer information. When these new feature maps are used to detect small lesions, their strong positional and semantic information makes small lesion features in the rice disease data set hard to overlook, so localization is accurate and detection performance improves.
In ResNet-101, the deeper the layer, the more abstract the feature map and the better the classification performance, so although the higher layers have stronger semantic information, their positional information is the weakest and their detection of small targets is poor. Conversely, during convolution and pooling, the lower-layer feature maps of ResNet-101 retain positional information well but carry weak semantic information. Because rice lesions in the real environment vary in shape and size, improving the detection of small lesions enables early prevention and control of rice disease and a better control outcome. Although the ResNet-101-based Faster R-CNN model achieves good recognition accuracy for larger lesions, its results for small lesions are not ideal; to enhance small-lesion recognition, the invention fuses the feature extraction network ResNet-101 with the FPN network. As shown in FIG. 6, ResNet-101 first performs bottom-up feature extraction on the input image; a 1x1 convolution reduces each extracted multi-level feature map to 256 channels; top-down 2x upsampling is performed and the result is added to the features of the corresponding layer; and a 3x3 convolution is applied to the fused features to reduce the aliasing introduced by the 2x upsampling. The final fused feature layers are {P2, P3, P4, P5}; these multi-scale feature maps, rich in semantic and positional information, together with the top-level feature map P6, are input into the RPN network to generate candidate region boxes.
Region Proposal Network
The core of Faster R-CNN is the Region Proposal Network (RPN) for generating candidate regions, which replaces the time-consuming Selective Search and greatly improves detection speed and recognition accuracy. The RPN's main task is to generate candidate regions using a convolutional neural network. It appends a 3x3 convolution layer to the backbone to strengthen the fusion of surrounding information. First, sliding windows generate anchors of different sizes and shapes; then, of the two following 1x1 convolution layers, one performs classification, judging whether each anchor is foreground or background, and the other performs precise localization, fine-tuning the anchors via bounding box regression to bring them as close as possible to the ground truth.
Anchors are the candidate regions ultimately generated by the RPN: as the RPN slides over the feature maps obtained from feature extraction, anchors of different sizes, aspect ratios, and shapes are generated at each sliding position to cover, as far as possible, the regions of the image that may contain targets. In the invention, the multi-scale feature maps {P2, P3, P4, P5, P6} output by the fused ResNet-101 and FPN replace the traditional single-scale feature map as input to the RPN; the anchor sizes corresponding to {P2, P3, P4, P5, P6} are {32², 64², 128², 256², 512²}, and the anchor aspect ratios are set to {1:2, 1:1, 2:1}. A configuration sketch follows.
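In torchvision, this anchor configuration could be expressed as follows (a sketch under the same framework assumption as earlier; one scale per pyramid level, three aspect ratios each):

```python
from torchvision.models.detection.rpn import AnchorGenerator

# One anchor area per pyramid level {P2..P6} and ratios {1:2, 1:1, 2:1}.
anchor_generator = AnchorGenerator(
    sizes=((32,), (64,), (128,), (256,), (512,)),
    aspect_ratios=((0.5, 1.0, 2.0),) * 5,
)
# It can be plugged into the model sketched earlier via
# FasterRCNN(backbone, num_classes=4, rpn_anchor_generator=anchor_generator).
```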
The loss function of the RPN is defined as:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \alpha \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$

where $L_{cls}$ is the object classification loss function, for which a log loss is used, and $L_{reg}$ is the bounding box regression loss function, for which a Smooth L1 loss is used. $p_i$ denotes the predicted probability that anchor $i$ belongs to a class, and $p_i^*$ denotes the classification label: when the intersection over union (IoU) between the anchor and the object ground truth is greater than 0.5, $p_i^* = 1$ (foreground); when the IoU is less than 0.5, $p_i^* = 0$ (background), and background samples do not require bounding box regression. $N_{cls}$ denotes the mini-batch size and $N_{reg}$ the number of anchors. $\alpha$ denotes a weight balance parameter, here set to 10, so that the classification loss and the bounding box loss are weighted roughly equally. $t_i = (t_x, t_y, t_w, t_h)$ denotes the four coordinate values of the predicted bounding box, and $t_i^* = (t_x^*, t_y^*, t_w^*, t_h^*)$ the four coordinate values of the ground-truth bounding box.
The object classification loss function is:

$$L_{cls}(p_i, p_i^*) = -\log\left[p_i p_i^* + (1 - p_i)(1 - p_i^*)\right]$$

The bounding box regression loss function is:

$$L_{reg}(t_i, t_i^*) = \operatorname{smooth}_{L1}(t_i - t_i^*), \qquad \operatorname{smooth}_{L1}(x) = \begin{cases} 0.5\,x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

The bounding box regression uses a four-coordinate parameterization, defined as follows:

$$t_x = \frac{x - x_a}{w_a}, \quad t_y = \frac{y - y_a}{h_a}, \quad t_w = \log\frac{w}{w_a}, \quad t_h = \log\frac{h}{h_a}$$

$$t_x^* = \frac{x^* - x_a}{w_a}, \quad t_y^* = \frac{y^* - y_a}{h_a}, \quad t_w^* = \log\frac{w^*}{w_a}, \quad t_h^* = \log\frac{h^*}{h_a}$$

where x denotes the abscissa of the bounding box center, y the ordinate of the bounding box center, w the width of the bounding box, and h its height; the subscript a marks the corresponding anchor values, and the asterisk the corresponding ground-truth values. A NumPy sketch of this encoding follows.
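The parameterization can be made concrete with a short NumPy sketch (the function names are illustrative):

```python
import numpy as np

def encode_box(box, anchor):
    """Four-coordinate parameterisation above; box and anchor are
    (x_center, y_center, width, height)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return np.array([(x - xa) / wa,    # t_x
                     (y - ya) / ha,    # t_y
                     np.log(w / wa),   # t_w
                     np.log(h / ha)])  # t_h

def decode_box(t, anchor):
    """Inverse transform: recover a box from predicted offsets."""
    tx, ty, tw, th = t
    xa, ya, wa, ha = anchor
    return np.array([xa + tx * wa, ya + ty * ha,
                     wa * np.exp(tw), ha * np.exp(th)])
```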
Test platform and training parameters
The experiments were run on a Windows 10 (64-bit) operating system with an Intel Core i7-8700K CPU @ 3.7 GHz, an NVIDIA TITAN Xp GPU with 12 GB of video memory, CUDA 10.1, and 64 GB of RAM; the runtime environment was Python 3.7, Anaconda 3.5.0, and TensorFlow 2.1.0.
A Faster R-CNN model fused with the FPN network was used to identify and locate lesions in the rice disease images. The full training process took about 6.5 hours; the optimizer was stochastic gradient descent with momentum, with a weight decay of 0.001, the default momentum factor of 0.9, and a max epoch of 50. The maximum number of Faster R-CNN iterations was 60000, the learning rate was set to 0.01, and training completed when the iteration count reached 60000.
Evaluation index
The performance of each model is measured using the average precision (AP), the mean average precision (mAP), and the average detection time.
Precision (P) indicates the ratio of the number of correctly detected positive samples to the number of all predicted positive samples:

$$P = \frac{TP}{TP + FP}$$

where TP (true positives) denotes the number of correctly predicted positive samples, i.e., actual positive samples whose prediction result is also positive, and FP (false positives) denotes the number of mispredicted positive samples, i.e., actual negative samples whose prediction result is positive.
The average precision (AP) is determined by the precision P and the recall R and equals the area under the PR curve:

$$AP = \frac{\sum \text{Precision}}{N}$$

where $\sum \text{Precision}$ denotes the sum of the precisions over all images and N denotes the number of images.

The mean average precision (mAP) is the mean of the average precisions of the individual classes:

$$mAP = \frac{\sum \text{Average Precision}}{N'}$$

where $\sum \text{Average Precision}$ denotes the sum of the per-class average precisions and N' denotes the number of classes. A computation sketch follows.
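For reference, a generic sketch of computing AP as the area under the PR curve and mAP as the per-class mean; this uses the standard monotone-envelope interpolation, not necessarily the patent's exact evaluation code:

```python
import numpy as np

def average_precision(precision, recall):
    """AP as the area under the precision-recall curve, using the standard
    monotone-envelope interpolation (a generic sketch)."""
    p = np.concatenate(([0.0], precision, [0.0]))
    r = np.concatenate(([0.0], recall, [1.0]))
    for i in range(len(p) - 2, -1, -1):    # enforce a non-increasing envelope
        p[i] = max(p[i], p[i + 1])
    idx = np.where(r[1:] != r[:-1])[0]     # recall change points
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

def mean_average_precision(ap_per_class):
    """mAP: mean of per-class APs, e.g. over the three rice diseases."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```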
Experimental results and analysis
1. Comparison of test results of different feature extraction network models
Under the chosen model training parameters, the Faster R-CNN model was run with five feature extraction networks (VGG16, VGG19, ResNet-50, ResNet-101, and ResNet-101+FPN) for target detection on the rice disease images. The test results are shown in Table 3, and the final loss curves of the five feature extraction networks are compared in FIG. 7.
Table 3 comparison of test results of different feature extraction network models
As the data in Table 3 show, although the average detection time and training time of the Faster R-CNN model with ResNet-101+FPN as the feature extraction network were not the lowest, its mean average precision reached 91.12%, higher than that of the other models, meaning the ResNet-101+FPN feature extraction structure is superior to the other feature extraction networks for diagnosing rice diseases. When VGG16 and VGG19 served as the feature extraction network of the Faster R-CNN model, the mean average precision was relatively low: individual lesions are too small for many features to survive multi-layer convolution, and in VGG19 the parameters are mostly concentrated in the fully connected layers, which is prone to overfitting. With ResNet-50 and ResNet-101 as the feature extraction networks, the residual structure alleviates vanishing gradients in deep convolutional networks, so their overall mean average precision improves greatly over the VGG16 and VGG19 networks, but the average precision on small lesions remains very low, not even reaching 70%. To improve the average detection precision on smaller lesions, the feature pyramid network was fused into the better-performing ResNet-101 network, which raised the average precision for rice blast, flax leaf spot, and bacterial leaf streak; the overall mean average precision reached 91.12%, an improvement of 10.15% over the ResNet-101 baseline. In average detection time, as the complexity of the feature extraction network increases, ResNet-101+FPN spends 0.37 s per image, slower than the other models but still meeting the requirement of real-time detection.
As the number of iterations increases, the loss values of all five feature extraction networks decrease, gradually stabilizing after about 20000 iterations. Comparing the five curves, the Faster R-CNN model based on the ResNet-101+FPN feature extraction network converges fastest over the whole training process, and its loss value stays below those of the other four networks throughout, indicating that the ResNet-101+FPN-based Faster R-CNN performs better. The comparison of loss curves after 60000 training iterations is shown in FIG. 7.
2. Comparison analysis of different model algorithm results
To demonstrate the superiority of the proposed method, the same data set was used to train and test two other target detection algorithms, YOLO and SSD, under the same test conditions, and the results were compared with the improved Faster R-CNN algorithm proposed here. To ensure a valid comparison of test results, the YOLO and SSD algorithms were likewise run in the TensorFlow environment using stochastic gradient descent with momentum, with all other parameter settings consistent with the improved Faster R-CNN model. The final test results of the three algorithms are shown in Table 4.
Table 4 comparison of test results of different algorithms
As Table 4 shows, the mean average precision of the improved Faster R-CNN model is 91.12%, higher than that of both the YOLO and SSD algorithms. Its average detection time per sample is 0.35 s; although this takes longer, the mean average precision is greatly improved. On comprehensive evaluation, the proposed Faster R-CNN algorithm improves overall rice disease detection precision while still achieving real-time detection, providing an important reference for subsequent rice disease prevention and control.
3. Improved Faster R-CNN model detection result
FIGS. 8, 9, and 10 show the detection results of the Faster R-CNN model on the three common rice diseases before and after the improvement. The results show that the improved Faster R-CNN model markedly improves the accuracy of the detection boxes; for rice blast and flax leaf spot, whose lesions are smaller, the improved model identifies small lesions that the original model missed, strengthening the recognition accuracy for small lesions. The method therefore has important application value for rice disease identification.
To improve the accuracy of rice disease identification in complex field environments, the rice disease detection method based on an improved target detection model and a convolutional neural network was designed and implemented. ResNet-101 was selected as the optimal feature extraction network of the Faster R-CNN model, and, given the relatively small area of rice lesions, a feature pyramid network was fused on top of ResNet-101 to improve the Faster R-CNN model so that rice lesions can be accurately located and identified. In testing against the classical VGG16, VGG19, ResNet-50, and ResNet-101 models, the mean average precision of the proposed improved Faster R-CNN model reached 91.12%, an improvement of 10.15% over the original ResNet-101-based Faster R-CNN model. Under the same data set and test environment settings, the method was also compared with two other common target detection algorithms, YOLO and SSD; the improved Faster R-CNN model has clear advantages for identifying rice diseases in complex field environments, with an average detection time of 0.35 s per image, and shows good performance and robustness for real-time detection of rice diseases in complex field environments.
The invention provides an electronic device comprising a memory and a processor, the memory storing a computer program; when the processor executes the computer program, it implements the steps of the above rice disease detection method based on an improved target detection model and a convolutional neural network.
The invention provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the above rice disease detection method based on an improved target detection model and a convolutional neural network.
The above applies specific examples to illustrate the principles and embodiments of the rice disease detection method, device, and medium based on an improved target detection model and a convolutional neural network; the description of the above embodiments is intended only to help understand the method and its core idea. Since those skilled in the art may vary the specific embodiments and application scope in accordance with the ideas of the invention, this description should not be construed as limiting the invention.

Claims (6)

1. A rice disease detection method based on an improved target detection model and a convolutional neural network, characterized by comprising the following steps:
Step 1: acquiring healthy rice images and rice disease images to form a data set;
Step 2: performing data enhancement on the acquired images to expand the data set, annotating the rice disease data set with the LabelImg tool, marking the positions of rice lesions with rectangular boxes, and randomly dividing the annotated data set into a training set, a validation set, and a test set;
Step 3: training the improved Faster R-CNN model with the training set, and using the trained model with the test set to identify the types and locate the positions of rice diseases;
in step 3, a rice disease image of any size is first input into ResNet-101 for feature extraction and output as a high-dimensional feature map; the extracted high-dimensional features are then input into a feature pyramid network for feature fusion, generating multi-scale rice disease feature maps; next, the feature maps are input into the RPN network to obtain region proposals and region scores; then, the high-dimensional feature maps and region proposals are input into the ROI pooling layer, which outputs the features of the corresponding region proposals; finally, the obtained region proposal features are input into the FC fully connected layer, which outputs the category of each candidate region and its precise position in the image, yielding the category and precise position of the diseases on the rice leaves;
inputting a rice disease image of any size into ResNet-101 for feature extraction and outputting a high-dimensional feature map, then inputting the extracted high-dimensional features into a feature pyramid network for feature fusion to generate multi-scale rice disease feature maps, specifically comprises:
ResNet-101 performs bottom-up feature extraction on the input image; a 1x1 convolution is applied to each extracted multi-level feature map to reduce its dimensionality to 256 channels; top-down 2x upsampling is performed and the result is added to the features of the corresponding lateral layer; a 3x3 convolution is applied to the fused features to reduce the aliasing effect of the 2x upsampling, yielding the final feature layers {P2, P3, P4, P5}; the resulting multi-scale feature maps, rich in both semantic and positional information, together with the top-level feature map P6, are input into the RPN network to generate candidate region boxes;
on the feature maps output by the feature pyramid network, sliding windows first generate anchors of different sizes and shapes; then, of the two following 1x1 convolution layers, one is used for classification, judging whether each anchor is foreground or background, and the other for precise localization, fine-tuning the anchors via bounding box regression so that they approach the ground truth as closely as possible;
the loss function of the RPN is defined as:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \alpha \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$

where $L_{cls}$ is the object classification loss function, for which a log loss is used; $L_{reg}$ is the bounding box regression loss function, for which a Smooth L1 loss is used; $p_i$ denotes the predicted probability that anchor $i$ belongs to a class, and $p_i^*$ denotes the classification label: when the intersection over union (IoU) between the anchor and the object ground truth is greater than 0.5, $p_i^* = 1$ (foreground); when the IoU is less than 0.5, $p_i^* = 0$ (background), and background samples do not undergo bounding box regression; $N_{cls}$ denotes the mini-batch size, $N_{reg}$ the number of anchors, and $\alpha$ a weight balance parameter that keeps the classification loss function and the bounding box loss function weighted substantially the same; $t_i = (t_x, t_y, t_w, t_h)$ denotes the four coordinate values of the predicted bounding box, and $t_i^* = (t_x^*, t_y^*, t_w^*, t_h^*)$ the four coordinate values of the ground-truth bounding box;

the object classification loss function is:

$$L_{cls}(p_i, p_i^*) = -\log\left[p_i p_i^* + (1 - p_i)(1 - p_i^*)\right]$$

the bounding box regression loss function is:

$$L_{reg}(t_i, t_i^*) = \operatorname{smooth}_{L1}(t_i - t_i^*), \qquad \operatorname{smooth}_{L1}(x) = \begin{cases} 0.5\,x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

the bounding box regression uses a four-coordinate parameterization, defined as follows:

$$t_x = \frac{x - x_a}{w_a}, \quad t_y = \frac{y - y_a}{h_a}, \quad t_w = \log\frac{w}{w_a}, \quad t_h = \log\frac{h}{h_a}$$

$$t_x^* = \frac{x^* - x_a}{w_a}, \quad t_y^* = \frac{y^* - y_a}{h_a}, \quad t_w^* = \log\frac{w^*}{w_a}, \quad t_h^* = \log\frac{h^*}{h_a}$$

where x denotes the abscissa of the bounding box center, y the ordinate of the bounding box center, w the width of the bounding box, and h its height; the subscript a marks the corresponding anchor values, and the asterisk the corresponding ground-truth values.
2. The method according to claim 1, wherein in step 1, 800 healthy rice images, 600 rice blast images, 600 flax leaf spot images, and 600 bacterial leaf streak images are collected, for a total of 2600 images.
3. The method according to claim 2, wherein the expanded data set contains 7800 images, and the annotated data set is randomly divided in a 6:2:2 ratio into a training set of 4680 images, a validation set of 1560 images, and a test set of 1560 images.
4. The method according to claim 3, wherein during training of the model the optimizer uses stochastic gradient descent with momentum, the weight decay is 0.001, the momentum factor defaults to 0.9, and the max epoch is 50; the maximum number of Faster R-CNN iterations is 60000, the learning rate is set to 0.01, and training completes when the iteration count reaches 60000.
5. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1-4 when the computer program is executed.
6. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of any one of claims 1-4.
CN202210263662.2A, filed 2022-03-17 (priority 2022-03-17): Rice disease detection method, device and medium based on improved target detection model and convolutional neural network. Status: Active. Granted as CN114693616B.

Priority Applications (1)

CN202210263662.2A, priority date 2022-03-17, filing date 2022-03-17: Rice disease detection method, device and medium based on improved target detection model and convolutional neural network

Applications Claiming Priority (1)

CN202210263662.2A, priority date 2022-03-17, filing date 2022-03-17: Rice disease detection method, device and medium based on improved target detection model and convolutional neural network

Publications (2)

CN114693616A (en), published 2022-07-01
CN114693616B (en), granted 2024-08-09

Family

Family ID: 82139132

Family Applications (1)

CN202210263662.2A (Active), priority 2022-03-17, filed 2022-03-17: Rice disease detection method, device and medium based on improved target detection model and convolutional neural network

Country Status (1)

CN: CN114693616B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
• CN115115887B * (priority 2022-07-07, published 2023-09-12, 中国科学院合肥物质科学研究院): Crop pest detection method based on TSD-Faster RCNN and network thereof
• CN116012721B * (priority 2023-03-28, published 2023-06-30, 浙江大学湖州研究院): Deep learning-based rice leaf spot detection method
• CN116051561A * (priority 2023-03-31, published 2023-05-02, 上海交强国通智能科技有限公司): Lightweight pavement disease inspection method based on vehicle-mounted edge equipment


Family Cites Families (1)

* Cited by examiner, † Cited by third party
• CN110622182A * (priority 2018-09-04, published 2019-12-27, 安徽中科智能感知产业技术研究院有限责任公司): Wheat severe disease prediction method based on multiple time sequence attribute element depth characteristics

Patent Citations (2)

* Cited by examiner, † Cited by third party
• CN110717903A * (priority 2019-09-30, published 2020-01-21, 天津大学): Method for detecting crop diseases by using computer vision technology
• AU2020103613A4 * (priority 2020-11-23, published 2021-02-04, Agricultural Information and Rural Economic Research Institute of Sichuan Academy of Agricultural Sciences): CNN and transfer learning based disease intelligent identification method and system

Also Published As

CN114693616A (en), published 2022-07-01

Similar Documents

Publication Publication Date Title
CN107016405B (en) A kind of pest image classification method based on classification prediction convolutional neural networks
Wu et al. Detection and counting of banana bunches by integrating deep learning and classic image-processing algorithms
CN114693616B (en) Rice disease detection method, device and medium based on improved target detection model and convolutional neural network
Koirala et al. Deep learning–Method overview and review of use for fruit detection and yield estimation
Junos et al. An optimized YOLO‐based object detection model for crop harvesting system
CN114120037B (en) Germinated potato image recognition method based on improved yolov5 model
CN110717903A (en) Method for detecting crop diseases by using computer vision technology
Mishra et al. A Deep Learning-Based Novel Approach for Weed Growth Estimation.
CN110222215B (en) Crop pest detection method based on F-SSD-IV3
Wang et al. Tea picking point detection and location based on Mask-RCNN
Wang et al. An improved Faster R-CNN model for multi-object tomato maturity detection in complex scenarios
Su et al. LodgeNet: Improved rice lodging recognition using semantic segmentation of UAV high-resolution remote sensing images
Li et al. A shallow convolutional neural network for apple classification
Wenxia et al. Identification of maize leaf diseases using improved convolutional neural network.
Yang et al. A comparative evaluation of convolutional neural networks, training image sizes, and deep learning optimizers for weed detection in alfalfa
CN116612386A (en) Pepper disease and pest identification method and system based on hierarchical detection double-task model
CN116310338A (en) Single litchi red leaf tip segmentation method based on examples and semantic segmentation
Bachhal et al. Real-time disease detection system for maize plants using deep convolutional neural networks
Miao et al. Crop weed identification system based on convolutional neural network
Kunduracıoğlu et al. Deep Learning-Based Disease Detection in Sugarcane Leaves: Evaluating EfficientNet Models
CN116310796A (en) Target image augmentation and crop disease identification method based on deep learning model
CN111723737B (en) Target detection method based on multi-scale matching strategy deep feature learning
CN115170987A (en) Method for detecting diseases of grapes based on image segmentation and registration fusion
Gao et al. Classification Method of Rape Root Swelling Disease Based on Convolution Neural Network
Kusrini et al. Automatic Mango Leaf and Trunk Detection as Supporting Tool of Mango Pest Identifier (MPI)

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant