Wood Veneer Defect Detection Based on Multiscale DETR with Position Encoder Net
Figure 1. Image collection device.
Figure 2. Image augmentation results. The first column presents the original images.
Figure 3. Bounding box distribution of the dataset. The darker the color, the greater the number.
Figure 4. The architecture of the proposed detection pipeline.
Figure 5. Feature map flattening and split.
Figure 6. The positional encoding net.
Figure 7. Two designs of the positional encoding net. * means multiple.
Figure 8. Comparison of inference speed and detection accuracy.
Figure 9. Loss curves of 5 pipelines.
Figure 10. Loss curves of PEN-1 and PEN-2.
Figure 11. Confusion matrix of models with different feature maps.
Figure 12. An example of detection results of four different models. (a) MPEND, (b) MPEND-S, (c) MPEND-M, (d) MPEND-L.
Abstract
1. Introduction
2. Literature Review
2.1. Review of Classic Object Detection Methodologies
2.2. Research on Veneer Defect Detection
3. Data Preparation
4. Proposed Detection Pipeline
4.1. Backbone
4.2. Multiscale Position Encoding Net
4.3. Loss Functions
5. Experiments and Analyses
5.1. Experimental Settings
5.2. Performance Comparison
5.3. Ablation Experiments
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Nomenclature
Abbreviation | Definition |
---|---|
DETR | DEtection TRansformer |
IoU | Intersection over union |
CNN | Convolutional neural networks |
ILSVRC | ImageNet large-scale visual recognition challenge |
SOTA | State-of-the-art |
RCNN | Region-CNN |
YOLO | You only look once |
RNN | Recurrent neural network |
ViT | Vision transformer |
MPEND | Multiscale position encoding net detector |
MPEN | Multiscale position encoding net |
RES | Residual net |
PEN | Position encoding net |
References
- Funck, J.W.; Zhong, Y.; Butler, D.A.; Brunner, C.; Forrer, J. Image segmentation algorithms applied to wood defect detection. Comput. Electron. Agric. 2003, 41, 157–179. [Google Scholar] [CrossRef]
- Wyckhuyse, A.; Maldague, X. A study of wood inspection by infrared thermography, Part II: Thermography for wood defects detection. J. Res. Nondestruct. Eval. 2001, 13, 13–21. [Google Scholar] [CrossRef]
- Cavalin, P.; Oliveira, L.S.; Koerich, A.L.; Britto, A.S. Wood defect detection using grayscale images and an optimized feature set. In Proceedings of the IECON 2006-32nd Annual Conference on IEEE Industrial Electronics, Paris, France, 6–10 November 2006; pp. 3408–3412. [Google Scholar]
- Zhang, Y.; Xu, C.; Li, C.; Yu, H.; Cao, J. Wood defect detection method with PCA feature fusion and compressed sensing. J. For. Res. 2015, 26, 745–751. [Google Scholar] [CrossRef]
- Shi, J.; Li, Z.; Zhu, T.; Wang, D.; Ni, C. Defect detection of industry wood veneer based on NAS and multi-channel mask R-CNN. Sensors 2020, 20, 4398. [Google Scholar] [CrossRef] [PubMed]
- Yang, Y.; Zhou, X.; Liu, Y.; Hu, Z.; Ding, F. Wood defect detection based on depth extreme learning machine. Appl. Sci. 2020, 10, 7488. [Google Scholar] [CrossRef]
- He, T.; Liu, Y.; Xu, C.; Zhou, X.; Hu, Z.; Fan, J. A fully convolutional neural network for wood defect location and identification. IEEE Access 2019, 7, 123453–123462. [Google Scholar] [CrossRef]
- Sang, J.; Wu, Z.; Guo, P.; Hu, H.; Xiang, H.; Zhang, Q.; Cai, B. An improved YOLOv2 for vehicle detection. Sensors 2018, 18, 4272. [Google Scholar] [CrossRef]
- Chang, Y.L.; Anagaw, A.; Chang, L.; Wang, Y.C.; Hsiao, C.-Y.; Lee, W.-H. Ship detection based on YOLOv2 for SAR imagery. Remote Sens. 2019, 11, 786. [Google Scholar] [CrossRef]
- Jiang, H.; Learned-Miller, E. Face detection with the faster R-CNN. In Proceedings of the 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, USA, 30 May–3 June 2017; pp. 650–657. [Google Scholar]
- Wang, R.; Liu, L.; Xie, C.; Yang, P.; Li, R.; Zhou, M. AgriPest: A large-scale domain-specific benchmark dataset for practical agricultural pest detection in the wild. Sensors 2021, 21, 1601. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
- Tan, M.; Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6105–6114. [Google Scholar]
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 213–229. [Google Scholar]
- Zhao, Z.Q.; Zheng, P.; Xu, S.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [PubMed]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
- Herbert, B.; Tuytelaars, T.; Van Gool, L. SURF: Speeded up robust features. In Proceedings of the European Conference on Computer Vision, Graz, Austria, 7–13 May 2006; pp. 404–417. [Google Scholar]
- Zhu, Q.; Yeh, M.C.; Cheng, K.T.; Avidan, S. Fast human detection using a cascade of histograms of oriented gradients. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, USA, 17–22 June 2006; pp. 1491–1498. [Google Scholar]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37. [Google Scholar]
- Fu, C.Y.; Liu, W.; Ranga, A.; Tyagi, A.; Berg, A.C. DSSD: Deconvolutional single shot detector. arXiv 2017, arXiv:1701.06659. [Google Scholar]
- Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable transformers for end-to-end object detection. arXiv 2020, arXiv:2010.04159. [Google Scholar]
- Wang, T.; Yuan, L.; Chen, Y.; Feng, J.; Yan, S. PnP-DETR: Towards efficient visual analysis with transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 4661–4670. [Google Scholar]
- Danvind, J. Analysis of Drying Wood Based on Nondestructive Measurements and Numerical Tools. Ph.D. Thesis, Luleå University of Technology, Luleå, Sweden, 2005. [Google Scholar]
- Sarigul, E.; Abbott, A.L.; Schmoldt, D.L. Nondestructive rule-based defect detection and identification system in CT images of hardwood logs. AIP Conf. Proc. 2001, 557, 1936–1943. [Google Scholar]
- Bhandarkar, S.M.; Faust, T.D.; Tang, M. CATALOG: A system for detection and rendering of internal log defects using computer tomography. Mach. Vis. Appl. 1999, 11, 171–190. [Google Scholar] [CrossRef]
- Bhandarkar, S.M.; Faust, T.D.; Tang, M. A computer vision system for lumber production planning. In Proceedings of the Fourth IEEE Workshop on Applications of Computer Vision, WACV’98, Washington, DC, USA, 19–21 October 1998; pp. 134–139. [Google Scholar]
- Qi, D.; Yu, L. Omnidirectional morphology applied to wood defects testing by using computed tomography. In Proceedings of the 2008 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Xi’an, China, 2–5 July 2008; pp. 868–873. [Google Scholar]
- López, G.; Basterra, L.A.; Ramón-Cueto, G.; De Diego, A. Detection of singularities and subsurface defects in wood by infrared thermography. Int. J. Archit. Herit. 2014, 8, 517–536. [Google Scholar] [CrossRef]
- Ma, F.; Zhang, J.; Ji, P. Automatic end-to-end veneer grading system based on machine vision. J. Phys. Conf. Ser. 2021, 1961, 012029. [Google Scholar] [CrossRef]
- Hu, K.; Wang, B.; Shen, Y.; Guan, J.; Cai, Y. Defect identification method for poplar veneer based on progressive growing generated adversarial network and MASK R-CNN Model. BioResources 2020, 15, 3041–3052. [Google Scholar] [CrossRef]
- Fan, J.; Liu, Y.; Hu, Z.K.; Zhao, Q.; Shen, L.; Zhou, X. Solid wood panel defect detection and recognition system based on faster R-CNN. J. For. Eng. 2019, 4, 112–117. [Google Scholar]
- Gao, M.; Qi, D.; Mu, H.; Qi, D. A transfer residual neural network based on ResNet-34 for detection of wood knot defects. Forests 2021, 12, 212. [Google Scholar] [CrossRef]
- Yang, Y.; Wang, H.; Jiang, D.; Hu, Z. Surface detection of solid wood defects based on SSD improved with ResNet. Forests 2021, 12, 1419. [Google Scholar] [CrossRef]
- Xia, B.; Luo, H.; Shi, S. Improved Faster R-CNN Based Surface Defect Detection Algorithm for Plates. Comput. Intell. Neurosci. 2022, 2022, 3248722. [Google Scholar] [CrossRef] [PubMed]
- Hu, W.; Wang, T.; Wang, Y.; Chen, Z.; Huang, G. LE–MSFE–DDNet: A defect detection network based on low-light enhancement and multi-scale feature extraction. Vis. Comput. 2022, 38, 3731–3745. [Google Scholar] [CrossRef]
- Ding, F.; Zhuang, Z.; Liu, Y.; Jiang, D.; Yan, X.; Wang, Z. Detecting defects on solid wood panels based on an improved SSD algorithm. Sensors 2020, 20, 5315. [Google Scholar] [CrossRef]
- Yang, F.; Wang, Y.; Wang, S.; Cheng, Y. Wood veneer defect detection system based on machine vision. In Proceedings of the 2018 International Symposium on Communication Engineering & Computer Science (CECS 2018), Hohhot, China, 28–29 July 2018; pp. 413–418. [Google Scholar]
- Lin, T.Y.; Maire, M.; Belongie, S.; Bourdev, L.; Girshick, R.; Hays, J.; Perona, P.; Zitnick, C.L.; Dollár, P. Microsoft COCO: Common objects in context. In Proceedings of the European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 740–755. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.Y.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Choi, J.; Chun, D.; Kim, H.; Lee, H.-J. Gaussian YOLOv3: An accurate and fast object detector using localization uncertainty for autonomous driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 502–511. [Google Scholar]
- Zheng, Z.; Wang, P.; Ren, D.; Liu, W.; Ye, R.; Hu, Q.; Zuo, W. Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans. Cybern. 2021, 52, 8574–8586. [Google Scholar] [CrossRef] [PubMed]
Defect | No. of Training Image/Label | No. of Validation Image/Label | No. of Test Image/Label | Total Image/Label |
---|---|---|---|---|
Live knot | 4200/4830 | 600/690 | 1200/1380 | 6000/6900 |
Dead knot | 3612/4045 | 516/577 | 1032/1155 | 5160/5777 |
Wormhole | 3654/4498 | 522/642 | 1044/1285 | 5220/6425 |
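The per-class counts above follow a single partition rule; the short Python check below (a sketch that only reuses the image counts from the table) shows that every defect class is split 70%/10%/20% across the training, validation, and test sets.

```python
# Recompute the split proportions from the per-class image counts in the table above.
splits = {
    "Live knot": (4200, 600, 1200),
    "Dead knot": (3612, 516, 1032),
    "Wormhole": (3654, 522, 1044),
}
for defect, (train, val, test) in splits.items():
    total = train + val + test
    print(f"{defect}: train {train / total:.0%}, val {val / total:.0%}, test {test / total:.0%}")
# Every class: train 70%, val 10%, test 20%
```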
Unit | Type | No. of Conv | Size/Step | Output Size |
---|---|---|---|---|
Res1 | Conv | 512 | 3 × 3/1 | 256 × 256 |
 | Conv | 256 | 1 × 1/1 | |
 | Conv | 512 | 3 × 3/1 | |
 | Pooling | - | 2 × 2/2 | |
Res2 | Conv | 512 | 3 × 3/1 | 128 × 128 |
 | Conv | 256 | 1 × 1/1 | |
 | Conv | 512 | 3 × 3/1 | |
 | Pooling | - | 2 × 2/2 | |
Res3 | Conv | 256 | 3 × 3/1 | 64 × 64 |
 | Conv | 128 | 1 × 1/1 | |
 | Conv | 256 | 3 × 3/1 | |
 | Pooling | - | 2 × 2/2 | |
Res4 | Conv | 512 | 3 × 3/1 | 32 × 32 |
 | Conv | 256 | 1 × 1/1 | |
 | Conv | 512 | 3 × 3/1 | |
 | Pooling | - | 2 × 2/2 | |
Res5 | Conv | 256 | 3 × 3/1 | 16 × 16 |
 | Conv | 128 | 1 × 1/1 | |
 | Conv | 256 | 3 × 3/1 | |
 | Pooling | - | 2 × 2/2 | |
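To make the backbone specification above concrete, the following is a minimal PyTorch sketch of a single "Res" unit as listed in the table (three convolutions followed by 2 × 2/2 pooling). The skip-connection placement, the BatchNorm/ReLU layers, the 1 × 1 projection on the shortcut, and the 512 × 512 input resolution are assumptions for illustration; the table only fixes the kernel counts, kernel sizes, strides, and output resolutions.

```python
# A minimal sketch (PyTorch) of one "Res" unit from the table above:
# 3x3 conv -> 1x1 conv -> 3x3 conv, then 2x2/2 pooling, with a residual shortcut.
# BatchNorm/ReLU, the shortcut projection, and the 512x512 input size are assumptions.
import torch
import torch.nn as nn


class ResUnit(nn.Module):
    def __init__(self, in_ch: int, mid_ch: int = 256, out_ch: int = 512):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, mid_ch, kernel_size=1, stride=1),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 projection so the shortcut matches the body's channel count.
        self.shortcut = nn.Identity() if in_ch == out_ch else nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)  # 2 x 2 / 2, halves H and W

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.body(x) + self.shortcut(x))
        return self.pool(out)


# Res1 as listed: a 512-channel feature map at half the input resolution.
x = torch.randn(1, 3, 512, 512)   # assumed 512 x 512 RGB input
print(ResUnit(3)(x).shape)        # torch.Size([1, 512, 256, 256])
```

Stacking five such units with the channel settings given in the table reproduces the listed output resolutions, from 256 × 256 down to 16 × 16.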
Model | Live Knot | Dead Knot | Wormhole | | | |
---|---|---|---|---|---|---|
Faster-RCNN | 93.4 | 95.2 | 96.7 | 95.1 | 72.6 | 60.2 |
YOLOv4 | 91.6 | 93.7 | 95.4 | 93.6 | 68.0 | 58.4 |
DETR | 94.1 | 94.5 | 97.2 | 95.3 | 73.1 | 60.4 |
MPEND | 86.9 | 89.1 | 90.2 | 88.7 | 59.6 | 43.8 |
MPEND-R50 | 94.7 | 95.0 | 96.9 | 95.5 | 71.2 | 62.1 |
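The metric-column headings of this and the following results tables did not survive extraction. The first three numeric columns are per-class scores for live knot, dead knot, and wormhole; the fourth column closely matches their arithmetic mean (up to rounding), i.e., the class-averaged score; the labels of the last two columns are not recoverable here. Assuming the per-class metric is average precision (AP), as is standard in detection work, the relationship is:

```latex
% Class-averaged score over the three defect classes (assumed here to be mAP):
\mathrm{mAP} = \frac{1}{N_{\mathrm{cls}}} \sum_{c=1}^{N_{\mathrm{cls}}} \mathrm{AP}_c ,
\qquad N_{\mathrm{cls}} = 3
% Worked check, Faster-RCNN row: (93.4 + 95.2 + 96.7)/3 = 95.1
% Worked check, DETR row:        (94.1 + 94.5 + 97.2)/3 \approx 95.3
```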
Model | Live Knot | Dead Knot | Wormhole | | | |
---|---|---|---|---|---|---|
PEN-1 | 86.9 | 89.1 | 90.2 | 88.7 | 59.6 | 43.8 |
PEN-2 | 63.4 | 65.8 | 73.1 | 67.4 | 32.9 | 20.2 |
Model | Live Knot | Dead Knot | Wormhole | | | |
---|---|---|---|---|---|---|
MPEND | 86.9 | 89.1 | 90.2 | 88.7 | 59.6 | 43.8 |
MPEND-S | 55.3 | 57.7 | 60.6 | 57.8 | 26.1 | 17.4 |
MPEND-M | 62.6 | 68.5 | 72.4 | 67.8 | 33.4 | 19.3 |
MPEND-L | 82.0 | 81.9 | 84.7 | 82.7 | 36.8 | 23.8 |
Model | Live Knot | Dead Knot | Wormhole | | | |
---|---|---|---|---|---|---|
MPEND | 86.9 | 89.1 | 90.2 | 88.7 | 59.6 | 43.8 |
MPEND-L1 | 87.0 | 88.7 | 88.9 | 88.2 | 57.2 | 43.1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ge, Y.; Jiang, D.; Sun, L. Wood Veneer Defect Detection Based on Multiscale DETR with Position Encoder Net. Sensors 2023, 23, 4837. https://doi.org/10.3390/s23104837