Improved Mask R-CNN Combined with Otsu Preprocessing for Rice Panicle Detection and Segmentation
Figure 1. Distribution diagram of 12 rice experimental fields.
Figure 2. Mask R-CNN model structure.
Figure 3. Panicle-Mask rice panicle detection and segmentation algorithm flowchart.
Figure 4. Loss function curve of Panicle-Mask.
Figure 5. Precision–recall curve of Panicle-Mask.
Figure 6. Examples of detection results using different methods.
Figure 7. Results from different methods and examples of binarized images: (a) original image; (b) Otsu; (c) Mask R-CNN; (d) Panicle-Mask.
Figure 8. Example images from the test set: (a) No. 9; (b) No. 12; (c) No. 13; (d) No. 5; (e) No. 19; (f) No. 20.
Figure 9. Comparison of the results of the number of rice panicles.
Figure 10. Comparison of the results of the proportional area of rice panicles.
Figure A1. Capturing real-time images of a rice field with a web camera.
Abstract
1. Introduction
2. Dataset and Method
2.1. Dataset and Preprocessing
2.2. Method
2.2.1. Otsu Preprocessing
I(x, y) = 0, I(x, y) < threshold; I(x, y) = 255, I(x, y) ≥ threshold
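The thresholding rule above can be sketched in plain Python. This is a minimal illustration of Otsu's method (maximize between-class variance over all candidate thresholds); function names are illustrative, and in the paper's pipeline the input would be the ExG channel rather than raw grayscale.

```python
def otsu_threshold(gray):
    """Return the threshold maximizing between-class variance (Otsu's method)
    for a flat list of 8-bit pixel values."""
    hist = [0] * 256
    for v in gray:
        hist[v] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w_bg, sum_bg = 0, 0.0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_bg += hist[t]                 # pixels with value <= t (background)
        if w_bg == 0:
            continue
        w_fg = total - w_bg             # pixels with value > t (foreground)
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mu_bg = sum_bg / w_bg
        mu_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mu_bg - mu_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(gray, threshold):
    # I(x, y) = 0 below the threshold, 255 otherwise
    return [0 if v < threshold else 255 for v in gray]
```

For a clearly bimodal input, e.g. `[10] * 50 + [200] * 50`, the returned threshold separates the two clusters.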
2.2.2. Adjustment of the RPN Anchor Box
2.2.3. Bounding Box Adjustment
2.2.4. Prediction Box Deletion and Selection
Si = Si(1 − IoU(μ, Bi)), IoU(μ, Bi) − RDIoU(μ, Bi) > ε
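The score-decay rule above follows linear Soft-NMS: instead of deleting boxes that overlap the current best box μ, their confidence is multiplied by (1 − IoU). A minimal sketch, with illustrative names; the paper's variant additionally uses a distance-penalized IoU term (RDIoU) in the suppression test, which is omitted here:

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def soft_nms(boxes, scores, score_thresh=0.001):
    """Linear Soft-NMS: decay overlapping scores by (1 - IoU)
    instead of discarding the overlapping boxes outright."""
    dets = sorted(zip(boxes, scores), key=lambda d: -d[1])
    keep = []
    while dets:
        m, s = dets.pop(0)  # current highest-scoring box (the μ above)
        keep.append((m, s))
        dets = [(b, sc * (1.0 - iou(m, b))) for b, sc in dets]
        dets = [(b, sc) for b, sc in dets if sc > score_thresh]
        dets.sort(key=lambda d: -d[1])
    return keep
```

A heavily overlapping box survives with a reduced score rather than being removed, which helps in dense scenes such as overlapping panicles.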
2.3. Evaluation Indicators
2.3.1. Precision, Recall, F1-Score, and IoU
P = TP/(TP + FP)
R = TP/(TP + FN)
F1-score = 2 × P × R/(P + R)
IoU = (A ∩ B)/(A ∪ B)
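The four indicators above can be computed directly from detection counts and mask overlaps; a small self-contained sketch (illustrative function names):

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1-score from true positives,
    false positives and false negatives."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1

def mask_iou(mask_a, mask_b):
    """IoU = |A ∩ B| / |A ∪ B| for two binary masks represented
    as sets of foreground pixel coordinates."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0
```

For example, 80 true positives with 15 false positives and 20 false negatives give P ≈ 0.84 and R = 0.80.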
2.3.2. SSIM
c(x, y) = (2σxσy + C2)/(σx² + σy² + C2)
s(x, y) = (σxy + C3)/(σxσy + C3)
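The contrast and structure terms combine with the luminance term into the overall SSIM score. A pure-Python sketch over two flattened grayscale windows; the constants C1 = (0.01 × 255)², C2 = (0.03 × 255)² and C3 = C2/2 are the common default choices, assumed here rather than taken from the paper:

```python
import math

def ssim(x, y, c1=6.5025, c2=58.5225):
    """SSIM of two equal-length grayscale pixel lists, computed as
    l(x, y) * c(x, y) * s(x, y) with C3 = C2 / 2."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((v - mu_x) ** 2 for v in x) / n
    var_y = sum((v - mu_y) ** 2 for v in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    con = (2 * math.sqrt(var_x) * math.sqrt(var_y) + c2) / (var_x + var_y + c2)
    st = (cov + c2 / 2) / (math.sqrt(var_x) * math.sqrt(var_y) + c2 / 2)
    return lum * con * st
```

Identical windows yield an SSIM of 1, and the score decreases as luminance, contrast, or structure diverge.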
2.3.3. pHash
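pHash compares two images by hashing their low-frequency content: take the 2-D DCT of a small grayscale image, keep the top-left (low-frequency) coefficients, threshold them at their median to form a bit string, and measure similarity by Hamming distance. A minimal illustrative sketch with a naive DCT (names and the 32 × 32 input size are assumptions, not taken from the paper):

```python
import math

def dct_lowfreq(block, size):
    """Low-frequency size x size corner of the 2-D DCT-II
    of a square grayscale matrix (naive O(n^2 * size^2) version)."""
    n = len(block)
    out = [[0.0] * size for _ in range(size)]
    for u in range(size):
        for v in range(size):
            out[u][v] = sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n) for y in range(n))
    return out

def phash(gray, size=8):
    """64-bit perceptual hash: keep the low-frequency DCT corner
    and threshold each coefficient at the median."""
    low = [c for row in dct_lowfreq(gray, size) for c in row]
    med = sorted(low)[len(low) // 2]
    return [1 if c > med else 0 for c in low]

def hamming(h1, h2):
    """Number of differing hash bits; a small distance means similar images."""
    return sum(a != b for a, b in zip(h1, h2))
```

Because only low-frequency structure contributes, the hash is robust to small pixel-level changes between a segmentation result and its reference image.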
3. Experimental Results and Analysis
3.1. Experimental Process
3.2. Results and Analysis
3.2.1. Experimental Results
3.2.2. Error Analysis
3.3. Detection and Segmentation Results
3.4. Comparison with Rice Panicle Image Detecting and Segmenting Methods
4. Conclusions
1. The classical Mask R-CNN model was improved and optimized by combining KL divergence and the Soft-NMS algorithm with the best RPN anchor box size, making the model more accurate and efficient at detecting rice panicles and alleviating the long training time, low detection accuracy, and fuzzy boundaries of the original algorithm.
2. Before the actual images are input into the detection model, the ExG index is calculated based on the dataset features, and the traditional Otsu threshold segmentation method is then applied as preprocessing, which reduces background interference and improves detection accuracy to a certain extent.
3. The method operates well in a field environment and is of great value for monitoring rice growth and estimating yield.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
References
- Guo, T.; Liang, G.; Zhou, W.; Liu, D.; Wang, X.; Sun, J.; Li, S.; Hu, C. Effect of fertilizer management on greenhouse gas emission and nutrient status in paddy soil. J. Plant Nutr. 2016, 22, 337–345.
- Mique, E.; Palaoag, T. Rice pest and disease detection using convolutional neural network. In Proceedings of the 2018 International Conference on Information Science and Applications, Hong Kong, China, 25–27 June 2018.
- Chen, J.; Zhang, D.; Nanehkaran, Y.; Li, D. Detection of rice plant diseases based on deep transfer learning. J. Sci. Food Agric. 2020, 100, 3246–3256.
- Zhang, Y.; Liu, M.; Dannenmann, M.; Tao, Y.; Yao, Z.; Jing, R.; Zheng, X.; Klaus, B.; Lin, S. Benefit of using biodegradable film on rice grain yield and N use efficiency in ground cover rice production system. Field Crop Res. 2017, 201, 52–59.
- Bai, X.; Cao, Z.; Zhao, L.; Zhang, J.; Lv, C.; Xie, J. Rice heading stage automatic observation by multi-classifier cascade-based rice spike detection method. Agric. For. Meteorol. 2018, 259, 260–270.
- Xu, J.; Wang, J.; Xu, X.; Ju, S. Image recognition for different developmental stages of rice by RAdam deep convolutional neural networks. Trans. CSAE 2021, 37, 143–150.
- Guo, W.; Fukatsu, T.; Ninomiya, S. Automated characterization of flowering dynamics in rice using field-acquired time-series RGB images. Plant Methods 2015, 11, 7.
- Zhou, C.; Liang, D.; Yang, X.; Yang, H.; Yue, J.; Yang, G. Wheat Ears Counting in Field Conditions Based on Multi-Feature Optimization and TWSVM. Front. Plant Sci. 2018, 9, 1024–1040.
- Fernandez-Gallego, J.; Kefauver, S.; Gutiérrez, N.; Nieto-Taladriz, M.; Araus, J. Wheat ear counting in-field conditions: High throughput and low-cost approach using RGB images. Plant Methods 2018, 14, 22–34.
- Lu, H.; Cao, Z.; Xiao, Y.; Fang, Z.; Zhu, Y.; Xian, K. Fine-grained maize tassel trait characterization with multi-view representations. Comput. Electron. Agric. 2015, 118, 143–158.
- Xiong, X.; Duan, L.; Liu, L.; Tu, H.; Yang, P.; Wu, D.; Chen, G.; Xiong, L.; Yang, W.; Liu, Q. Panicle-SEG: A robust image segmentation method for rice panicles in the field based on deep learning and superpixel optimization. Plant Methods 2017, 13, 104–119.
- Fan, M.; Ma, Q.; Liu, J.; Wang, Q.; Wang, Y.; Duan, X. Counting Method of Wheatear in Field Based on Machine Vision Technology. Trans. CSAM 2015, 46 (Suppl. S1), 234–239.
- Li, H.; Li, Z.; Dong, W.; Cao, X.; Wen, Z.; Xiao, R.; Wei, Y.; Zeng, H.; Ma, X. An automatic approach for detecting seedlings per hill of machine-transplanted hybrid rice utilizing machine vision. Comput. Electron. Agric. 2021, 185, 106178–106192.
- Cao, Y.; Liu, Y.; Ma, D.; Li, A.; Xu, T. Best Subset Selection Based Rice Panicle Segmentation from UAV Image. Trans. CSAM 2020, 8, 1000–1298.
- Li, Q.; Cai, J.; Bettina, B.; Okamoto, M.; Miklavcic, S. Detecting spikes of wheat plants using neural networks with Laws texture energy. Plant Methods 2017, 13, 83–96.
- Olsen, P.; Ramamurthy, K.; Ribera, J.; Chen, Y.; Thompson, A.; Luss, R.; Tuinstra, M.; Abe, N. Detecting and counting panicles in sorghum images. In Proceedings of the 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 1–3 October 2018.
- Liu, L.; Ouyang, W.; Wang, X.; Fieguth, P.; Chen, J.; Liu, X.; Pietikäinen, M. Deep Learning for Generic Object Detection: A Survey. Int. J. Comput. Vision 2020, 128, 261–318.
- Zhao, L.; Li, S. Object detection algorithm based on improved YOLOv3. Electronics 2020, 9, 537.
- Luo, Y.; Wang, B.; Chen, X. Research progresses of target detection technology based on deep learning. Semicond. Optoelectron. 2020, 41, 1–10.
- Hu, X.; Liu, Y.; Zhao, Z.; Liu, J.; Yang, X.; Sun, C.; Chen, S.; Li, B.; Zhou, C. Real-time detection of uneaten feed pellets in underwater images for aquaculture using an improved YOLO-V4 network. Comput. Electron. Agric. 2021, 185, 106135–106146.
- Wu, D.; Wu, D.; Feng, H.; Duan, L.; Dai, G.; Liu, X.; Wang, K.; Yang, P.; Cheng, G.; Gay, A.; et al. A deep learning-integrated micro-CT image analysis pipeline for quantifying rice lodging resistance-related traits. Plant Commun. 2021, 2, 100165–100177.
- Gu, X.; Li, S.; Ren, S.; Zheng, H.; Fan, C.; Xu, H. Adaptive enhanced swin transformer with U-net for remote sensing image segmentation. Comput. Electr. Eng. 2022, 102, 108223–108234.
- Chen, L.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. 2017, 40, 834–848.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. 2017, 39, 1137–1149.
- Zhang, Y.; Xiao, D.; Chen, H.; Liu, Y. Rice Panicle Detection Method Based on Improved Faster R-CNN. Trans. CSAM 2021, 52, 231–240.
- Sun, X.; Fang, W.; Gao, C.; Fu, L.; Majeed, Y.; Liu, X.; Gao, F.; Yang, R.; Li, R. Remote estimation of grafted apple tree trunk diameter in modern orchard with RGB and point cloud based on SOLOv2. Comput. Electron. Agric. 2022, 199, 107209–107221.
- Xie, E.; Sun, P.; Song, X.; Wang, W.; Liang, D.; Shen, C.; Luo, P. PolarMask: Single shot instance segmentation with polar representation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
- Zhang, X.; Cao, J. Contour-Point Refined Mask Prediction for Single-Stage Instance Segmentation. Acad. Accel. 2020, 40, 113–121.
- Zhang, L.; Chen, Y.; Li, Y.; Mang, L.; Du, K. Detection and Counting System for Winter Wheat Ears Based on Convolutional Neural Network. Trans. CSAM 2019, 50, 144–150.
- Madec, S.; Jin, X.; Lu, H.; Solan, B.; Liu, S.; Duyme, F.; Heritier, E.; Baret, F. Ear density estimation from high resolution RGB imagery using deep learning technique. Agric. For. Meteorol. 2019, 264, 225–234.
- Yang, M.; Tseng, H.; Hsu, Y.; Tsai, H. Semantic Segmentation Using Deep Learning with Vegetation Indices for Rice Lodging Identification in Multi-date UAV Visible Images. Remote Sens. 2020, 12, 633.
- Duan, L.; Xiong, X.; Liu, Q.; Yang, W.; Huang, C. Field rice panicle segmentation based on deep full convolutional neural network. Trans. CSAE 2018, 34, 202–209.
- Kong, H.; Chen, P. Mask R-CNN-based feature extraction and three-dimensional recognition of rice panicle CT images. Plant Direct 2021, 5, e00323.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Yang, P.; Song, W.; Zhao, X.; Zhang, R. An improved Otsu threshold segmentation algorithm. Int. J. Comput. Sci. Eng. 2020, 22, 146–153.
- He, Y.; Zhu, C.; Wang, J.; Savvides, M.; Zhang, X. Bounding box regression with uncertainty for accurate object detection. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L. Soft-NMS: Improving object detection with one line of code. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Zhang, L.; Zhang, H.; Han, W.; Niu, Y.; Chávez, J.; Ma, W. The mean value of gaussian distribution of excess green index: A new crop water stress indicator. Agric. Water Manag. 2021, 251, 106866–106877.
- Chen, J.; Matzinger, H.; Zhai, H.; Zhou, M. Centroid estimation based on symmetric KL divergence for Multinomial text classification problem. In Proceedings of the 2018 IEEE International Conference on Machine Learning and Applications, Orlando, FL, USA, 17–20 December 2018.
- Huang, X.; Jiang, Z.; Lu, L.; Tan, C.; Jiao, J. The study of illumination compensation correction algorithm. In Proceedings of the 2011 IEEE International Conference on Electronics, Communications and Control (ICECC), Ningbo, China, 9–11 September 2011.
- Tang, Y.; Ren, F.; Pedrycz, W. Fuzzy C-Means clustering through SSIM and patch for image segmentation. Appl. Soft Comput. 2020, 87, 105928.
- Huang, Z.; Liu, S. Robustness and Discrimination Oriented Hashing Combining Texture and Invariant Vector Distance. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Korea, 22–26 October 2018.
Model | Anchor Size | mAP (%) | F1-Score (%) | IoU (%) |
---|---|---|---|---|
Mask R-CNN | {32,64,128,256,512} | 68.58 | 71.31 | 61.42 |
RPN-panicle1 | {8,16,32,64,128} | 64.57 | 80.49 | 60.36 |
RPN-panicle2 | {16,32,64,128,256} | 80.25 | 74.78 | 75.73 |
RPN-panicle3 | {64,128,256,512,1024} | 60.41 | 69.06 | 58.75 |
RPN-panicle4 | {128,256,512,1024,2048} | 55.42 | 67.31 | 52.73 |
Panicle-Mask | {16,32,64,128,256} | 89.10 | 81.95 | 84.42 |
No. | Otsu (%) | Mask R-CNN (%) | Panicle-Mask (%) |
---|---|---|---|
1 | 87.69 | 78.51 | 93.75 |
2 | 83.59 | 74.39 | 89.86 |
3 | 78.20 | 72.49 | 88.55 |
4 | 84.75 | 74.75 | 88.74 |
5 | 79.70 | 79.46 | 89.09 |
6 | 81.11 | 77.37 | 92.68 |
7 | 80.16 | 72.42 | 89.13 |
8 | 84.86 | 72.97 | 88.24 |
9 | 89.20 | 76.87 | 90.10 |
10 | 88.58 | 79.54 | 91.50 |
11 | 87.21 | 83.31 | 92.85 |
12 | 89.20 | 73.37 | 95.86 |
13 | 92.31 | 81.22 | 94.75 |
14 | 85.62 | 75.01 | 89.52 |
15 | 84.60 | 74.31 | 89.97 |
16 | 85.84 | 75.21 | 88.96 |
17 | 85.36 | 84.88 | 90.67 |
18 | 80.40 | 74.10 | 88.53 |
19 | 79.92 | 76.06 | 89.05 |
20 | 79.28 | 72.37 | 89.74 |
No. | Otsu | Mask R-CNN | Panicle-Mask |
---|---|---|---|
1 | 5 | 10 | 3 |
2 | 7 | 11 | 4 |
3 | 6 | 12 | 5 |
4 | 7 | 11 | 5 |
5 | 9 | 9 | 5 |
6 | 7 | 10 | 3 |
7 | 8 | 12 | 4 |
8 | 7 | 12 | 5 |
9 | 5 | 11 | 4 |
10 | 5 | 9 | 4 |
11 | 5 | 7 | 3 |
12 | 4 | 12 | 2 |
13 | 3 | 8 | 2 |
14 | 6 | 11 | 4 |
15 | 6 | 11 | 4 |
16 | 5 | 10 | 5 |
17 | 6 | 6 | 4 |
18 | 8 | 11 | 5 |
19 | 9 | 10 | 5 |
20 | 10 | 12 | 5 |
Paper | Algorithm | Evaluation Indicators | Performance |
---|---|---|---|
Xiong X. et al., 2017 [11] | SLIC, SegNet, entropy rate superpixel optimization | P = 0.82, R = 0.73, F1-score = 76.73% | Segmentation and nondestructive estimation; time-consuming processing and training |
Cao Yingli et al., 2020 [14] | Best subset selection, multiple linear regression, BP neural network | RMSE = 11.11 | Accurate extraction of rice panicle number; difficulty in dataset preparation, unstable colour features, and shooting height affects the segmentation result |
Duan Lingfeng et al., 2018 [32] | Deep full CNN, SegNet | P = 0.83, R = 0.83, F1-score = 83% | High segmentation accuracy and fast processing speed for rice panicles in the field; verbose image edge filling and manual annotation in Photoshop |
Kong Huihua et al., 2021 [33] | 3-D recognition, Mask R-CNN, Euclidean distance | Count accuracy (grain) ≥ 99% | Effectively identifies and counts individual rice panicles photographed at close range; inapplicable to actual field environments |
This article | An improved Mask R-CNN; Otsu preprocessing | P = 0.84, R = 0.80, F1-score = 81.95%, count accuracy (panicle) = 83.27%, RMSE = 11.08 | Detection and segmentation of rice panicles in an actual field environment, rice growth monitoring and yield estimation; verbose labelling process |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hong, S.; Jiang, Z.; Liu, L.; Wang, J.; Zhou, L.; Xu, J. Improved Mask R-CNN Combined with Otsu Preprocessing for Rice Panicle Detection and Segmentation. Appl. Sci. 2022, 12, 11701. https://doi.org/10.3390/app122211701