CN116579992A - Small target bolt defect detection method for unmanned aerial vehicle inspection - Google Patents
- Publication number
- CN116579992A CN116579992A CN202310446386.8A CN202310446386A CN116579992A CN 116579992 A CN116579992 A CN 116579992A CN 202310446386 A CN202310446386 A CN 202310446386A CN 116579992 A CN116579992 A CN 116579992A
- Authority
- CN
- China
- Prior art keywords
- target
- data set
- small target
- aerial vehicle
- unmanned aerial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 80
- 238000007689 inspection Methods 0.000 title claims abstract description 57
- 230000007547 defect Effects 0.000 title claims abstract description 50
- 238000000605 extraction Methods 0.000 claims abstract description 47
- 238000012549 training Methods 0.000 claims abstract description 30
- 230000004927 fusion Effects 0.000 claims abstract description 6
- 230000007246 mechanism Effects 0.000 claims abstract description 5
- 238000000034 method Methods 0.000 claims description 31
- 238000010586 diagram Methods 0.000 claims description 14
- 230000008569 process Effects 0.000 claims description 11
- 230000001629 suppression Effects 0.000 claims description 4
- 230000002349 favourable effect Effects 0.000 claims description 3
- 230000002401 inhibitory effect Effects 0.000 claims description 3
- 238000002372 labelling Methods 0.000 claims description 2
- 238000005728 strengthening Methods 0.000 claims description 2
- 238000010276 construction Methods 0.000 claims 2
- 230000005540 biological transmission Effects 0.000 abstract description 25
- 238000005516 engineering process Methods 0.000 abstract description 5
- 238000012360 testing method Methods 0.000 description 9
- 230000006870 function Effects 0.000 description 6
- 238000013135 deep learning Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000011478 gradient descent method Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 238000011176 pooling Methods 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000003064 k means clustering Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
- G06V10/763—Non-hierarchical techniques, e.g. based on statistics of modelling distributions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/17—Terrestrial scenes taken from planes or by drones
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30108—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Remote Sensing (AREA)
- Quality & Reliability (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a small target bolt defect detection method for unmanned aerial vehicle inspection, which comprises the following steps: (1) constructing a backbone network suitable for small target feature extraction; (2) constructing a global-local two-stage small target bolt defect detection model; (3) constructing a small target bolt defect data set based on unmanned aerial vehicle inspection images; (4) training the global-local two-stage small target bolt defect detection model on the unmanned aerial vehicle inspection picture data set; and (5) using the trained model to intelligently identify bolt defects in unmanned aerial vehicle inspection pictures. To address the small size of bolt defect targets and the difficulty of feature extraction in power transmission line inspection pictures, the invention combines local image extraction, feature fusion, attention mechanisms, and related techniques into a small target bolt defect detection method for unmanned aerial vehicle inspection.
Description
Technical Field
The invention belongs to the technical field of digital image recognition, and particularly relates to a small target bolt defect detection method for unmanned aerial vehicle inspection.
Background
Transmission line inspection is an important means of guaranteeing the reliable operation of the power system. Traditionally, inspectors observe through a telescope from below the tower or climb the tower to inspect it; as the power grid continues to expand, this manual mode has become increasingly unable to meet transmission line inspection requirements. In recent years, unmanned aerial vehicle (UAV) inspection has been widely adopted for transmission lines and has greatly improved inspection efficiency. In UAV inspection, the drone flies to designated locations under pilot control or along a preset route and photographs the transmission equipment. A large number of pictures are generated during UAV inspection, and combining computer vision technology with UAV inspection has effectively promoted the development of automated transmission line inspection.
Bolts serve as fastening components between transmission line connecting fittings. Defects such as missing bolts, rust, and loose nuts are widespread on transmission lines, and inspecting for them is an important task. Combining image recognition technology with unmanned aerial vehicle inspection to intelligently identify bolt defects greatly improves transmission line inspection efficiency and helps ensure the safety of the transmission line.
From the perspective of image recognition, bolts are physically small and constitute typical small targets. Deep learning-based models have made great progress in power image recognition, but these methods remain poorly suited to recognizing small-sized components. Because of limits on unmanned aerial vehicle positioning accuracy and endurance, it is often difficult to photograph small components on a transmission line closely during inspection. As a result, small hardware such as bolts occupies only a tiny area of the inspection picture, while background containing no equipment information occupies most of it. Computer vision algorithms typically first downsample the high-resolution inspection image to a fixed size, which loses a large amount of information and greatly increases the difficulty of detecting small targets such as bolt defects. If the original image is analyzed directly instead, the large invalid background area consumes enormous amounts of computing resources and time.
Therefore, analyzing the characteristics of transmission line bolt defects in depth and, in combination with existing deep learning technology, proposing an intelligent bolt defect recognition method for transmission line images that is suitable for unmanned aerial vehicle inspection is of great significance for improving transmission line inspection efficiency and promoting the intelligent development of inspection.
Disclosure of Invention
To solve the above problems, the invention provides a small target bolt defect detection method for unmanned aerial vehicle inspection. By intelligently analyzing transmission line inspection images captured by unmanned aerial vehicles, it can effectively identify transmission line bolt defects, provide a reference for power inspection personnel, and help ensure the reliability of power transmission.
The technical scheme adopted by the invention is as follows:
a small target bolt defect detection method for unmanned aerial vehicle inspection comprises the following steps:
step 1: constructing a backbone network suitable for small target feature extraction;
step 2: constructing a global-local two-stage small target bolt defect detection model;
step 3: constructing a small target bolt defect data set based on the unmanned aerial vehicle inspection image;
step 4: training a global-local two-stage small target bolt defect detection model on an unmanned aerial vehicle inspection picture data set;
step 5: and using the trained model to intelligently identify the bolt defects in the unmanned aerial vehicle inspection picture.
In the step 1, the specific implementation is as follows:
step 1.1: designing a small target bolt defect feature extraction network by taking ResNet as a main network;
step 1.2: A hybrid attention module is added to the feature extraction network. The module consists mainly of a channel attention layer and a spatial attention layer: the channel attention layer strengthens the feature maps that help characterize the target and suppresses the others, while the spatial attention layer strengthens the foreground region where the target is located in the feature map and suppresses background region information. With the hybrid attention module, foreground information is preserved as much as possible during feature extraction, improving the effectiveness of feature extraction;
step 1.3: A feature pyramid network is used in the feature extraction network for multi-scale feature fusion; by fusing features of different scales, information loss during feature extraction is reduced and the bolt defect detection accuracy is improved.
In the step 2, the specific implementation is as follows:
step 2.1: The global-local ultra-small target detection module consists of two branches: a global salient region detection branch and a local target detection branch. The global salient region detection branch takes the downsampled original image as input; after the feature extraction network, a region proposal network generates salient regions, and the corresponding pixel blocks are cropped from the original image according to the salient region coordinates. The cropped picture blocks are passed to the local target detection branch, where targets are identified and mapped back into the original image, and repeated results are removed by non-maximum suppression, yielding the ultra-small target recognition result;
step 2.2: In the global salient region detection branch, candidate targets are clustered with a k-means algorithm, which avoids the need for labeled data in the salient region generation stage;
step 2.3: The local target detection branch takes the foreground picture blocks extracted by the global salient region detection branch as input data, is designed with reference to the Faster RCNN network, and introduces a self-attention mechanism to optimize the RPN network.
In the step 3, the specific implementation is as follows:
step 3.1: In the implementation process, the bolt targets in the original inspection images are first labeled, and the original data set is then divided into 2 sub-data sets, where the first sub-data set is used for training the global salient region detection branch and the second sub-data set is used for training the local target detection branch;
step 3.2: The first sub-data set merges all labeled bolt target categories in the inspection pictures into a single target type, 'foreground', forming a target detection data set;
step 3.3: The second sub-data set is formed by randomly cropping the original data set into 640×640 small pictures, with target categories consistent with the original data set.
In the step 4, the specific implementation is as follows:
step 4.1: First, the method pre-trains the feature extraction network on the ImageNet 2012 public data set;
step 4.2: The global salient region detection branch and the local target detection branch are each initialized with the trained feature extraction network, and the two modules are trained on the first and second sub-data sets respectively.
To address the small size of bolt defect targets and the difficulty of feature extraction in transmission line inspection pictures, the invention provides a small target bolt defect detection method for unmanned aerial vehicle inspection that combines local image extraction, feature fusion, attention mechanisms, and related techniques. The global salient region detection branch locates feature-dense regions in the high-resolution image to obtain fine local images; the global-local detection network then detects pictures at different scales, and the global and local recognition results are fused with an improved non-maximum suppression method, achieving end-to-end recognition of small target bolt defects in transmission line unmanned aerial vehicle inspection pictures.
Drawings
FIG. 1 is a schematic diagram of a hybrid attention module configuration of the present invention;
FIG. 2 is a schematic diagram of a backbone network employing multi-scale feature fusion in accordance with the present invention;
FIG. 3 is a diagram of the overall structure of the global-local two-stage small target bolt defect detection model of the present invention;
FIG. 4 shows examples of the bolt defect categories according to the present invention.
Detailed Description
To facilitate understanding and practice of the invention by those of ordinary skill in the art, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described herein are for illustration and explanation only and are not intended to limit the invention.
The invention provides a small target bolt defect detection method for unmanned aerial vehicle inspection, which comprises the following steps:
step 1: constructing a backbone network suitable for small target feature extraction;
step 1.1: The invention designs an ultra-small target feature extraction network based on ResNet. The deep residual network (ResNet) is one of the most representative feature extraction networks in computer vision and mainly includes structures of different depths such as ResNet-50, ResNet-101, and ResNet-152. As the number of layers increases, the ability to extract deep features is further enhanced, but texture features are lost, which is unfavorable for small target detection, and the computation cost increases greatly. This embodiment builds the backbone network with ResNet-50; it should be understood that the use of other backbone networks also falls within the scope of this patent.
Step 1.2: After the convolutional neural network extracts features from the pixel information in the picture, a feature map of a certain depth is formed and used to characterize the target. In the ultra-small target detection task for transmission lines, the pixel information available to characterize the target to be detected is scarce, so background information dominates during global convolutional feature extraction and effective information is difficult to extract. A hybrid attention module is therefore added to the feature extraction network. It consists mainly of a channel attention layer and a spatial attention layer: the channel attention layer strengthens the feature maps that help characterize the target and suppresses the others, while the spatial attention layer strengthens the foreground region where the target is located and suppresses background region information. With the hybrid attention module, foreground information is preserved as much as possible during feature extraction, improving the effectiveness of feature extraction. A block diagram of the hybrid attention module is shown in FIG. 1.
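The patent does not give source code for the hybrid attention module; the following is a minimal PyTorch-style sketch assuming a CBAM-like layout (channel attention followed by spatial attention). The class names ChannelAttention, SpatialAttention, and HybridAttention and the reduction ratio are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: strengthens channels useful for the target, suppresses the rest."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))      # global average pooling over H, W
        mx = self.mlp(x.amax(dim=(2, 3)))       # global max pooling over H, W
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    """Spatial attention: emphasises foreground locations, suppresses background."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class HybridAttention(nn.Module):
    """Channel attention followed by spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))
```

Under these assumptions, such a module would typically be inserted after a ResNet stage whose output later feeds the feature pyramid.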
Step 1.3: The convolutional neural network obtains feature maps of different scales by applying convolution and pooling operations to the original image. Experiments show that shallow feature maps have high resolution and fully retain the detail of the original picture, but represent the overall shape of objects poorly; deep feature maps contain rich semantic information after complex nonlinear transformations, but their low resolution loses the detail in the picture. For the ultra-small target detection task on transmission lines, shallow feature maps struggle to capture the overall shape of the target, while deep feature maps lose small-target pixel information through convolution, greatly reducing the usable information. A feature pyramid network is therefore adopted for multi-scale feature fusion, improving the accuracy of ultra-small target detection on transmission lines.
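As a companion sketch (an assumption, not the patent's exact design), a minimal top-down feature pyramid over ResNet-50 stage outputs might look as follows; the channel counts 512/1024/2048 correspond to ResNet-50 stages C3-C5.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramid(nn.Module):
    """Minimal top-down FPN: 1x1 lateral convs align channel depth, deeper maps are
    upsampled and added to shallower ones, then smoothed by 3x3 convs."""
    def __init__(self, in_channels=(512, 1024, 2048), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, feats):
        # feats: [C3, C4, C5] from shallow to deep
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest")
        return [sm(p) for sm, p in zip(self.smooth, laterals)]
```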
Step 2: constructing a global-local two-stage small target bolt defect detection model;
step 2.1: The global-local ultra-small target detection module consists of two branches: a global salient region detection branch and a local target detection branch. The global salient region detection branch takes the downsampled original image as input; after the feature extraction network, a region proposal network generates salient regions, and the corresponding pixel blocks are cropped from the original image according to the salient region coordinates. The cropped picture blocks are passed to the local target detection branch, where targets are identified and mapped back into the original image, and repeated results are removed by non-maximum suppression, yielding the ultra-small target recognition result. The overall structure of the global-local two-stage small target bolt defect detection model is shown in FIG. 3, and the inference flow is sketched below.
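A hedged Python sketch of this two-stage inference flow follows, assuming PyTorch tensors and torchvision's NMS. The helper names detect_small_targets, global_branch, and local_branch are hypothetical; the global branch is assumed to return salient-region boxes in original-image coordinates and the local branch to return (boxes, scores, labels) in patch coordinates.

```python
import torch
from torchvision.ops import nms

def detect_small_targets(image, global_branch, local_branch, score_thr=0.5, iou_thr=0.5):
    """image: CHW tensor of the original high-resolution inspection picture."""
    regions = global_branch(image)                         # stage 1: salient regions (N, 4)
    all_boxes, all_scores, all_labels = [], [], []
    for (x1, y1, x2, y2) in regions.tolist():
        patch = image[..., int(y1):int(y2), int(x1):int(x2)]   # crop the pixel block
        boxes, scores, labels = local_branch(patch)             # stage 2: detect inside the block
        boxes = boxes.clone()
        boxes[:, [0, 2]] += x1                                   # map back to original image
        boxes[:, [1, 3]] += y1
        all_boxes.append(boxes); all_scores.append(scores); all_labels.append(labels)
    if not all_boxes:
        return torch.empty(0, 4), torch.empty(0), torch.empty(0, dtype=torch.long)
    boxes = torch.cat(all_boxes); scores = torch.cat(all_scores); labels = torch.cat(all_labels)
    keep = scores > score_thr
    boxes, scores, labels = boxes[keep], scores[keep], labels[keep]
    keep = nms(boxes, scores, iou_thr)                      # remove duplicate detections
    return boxes[keep], scores[keep], labels[keep]
```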
Step 2.2: the global saliency region detection branch firstly downsamples an original picture to 800×800, a feature extraction network is utilized to obtain a feature map, and then an ultra-small target saliency region extraction module is designed based on a region suggestion network.
For the input feature map, taking a 25×25×256 feature map as an example, 25×25 anchor points are selected at equal intervals in the original image as the center points of candidate windows, and 9 candidate windows of different sizes {32×32, 32×64, 64×32, 64×64, 64×128, 128×64, 128×256, 256×128} are set at each center point. A binary label is assigned to each candidate window, so the label of each point can ultimately be represented by an 18-dimensional vector. In this module, if the overlap between a candidate window and any one target is larger than 70% of that target's area, the region is regarded as a salient region and the candidate box is given a positive label; if the overlap between the candidate window and every target is smaller than 30% of the target area, the region is regarded as background and the candidate box is given a negative label; the remaining candidate boxes are neither positive nor negative. Note that the feature extraction network in this model adopts a feature pyramid structure, so the 9 candidate boxes of different sizes are distributed across three feature maps of different scales for prediction.
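The 70%/30% labeling rule above measures overlap relative to each target's own area rather than IoU. A small NumPy sketch of that rule (the function name and array layout are assumptions) is:

```python
import numpy as np

def label_candidate_windows(windows, targets, pos_frac=0.7, neg_frac=0.3):
    """Assign +1 / -1 / 0 labels to candidate windows based on how much of each
    ground-truth target's area they cover. Both inputs are (N, 4) arrays of
    (x1, y1, x2, y2) boxes."""
    labels = np.zeros(len(windows), dtype=np.int64)   # 0 = neither positive nor negative
    target_area = (targets[:, 2] - targets[:, 0]) * (targets[:, 3] - targets[:, 1])
    for i, (wx1, wy1, wx2, wy2) in enumerate(windows):
        ix1 = np.maximum(wx1, targets[:, 0]); iy1 = np.maximum(wy1, targets[:, 1])
        ix2 = np.minimum(wx2, targets[:, 2]); iy2 = np.minimum(wy2, targets[:, 3])
        inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
        coverage = inter / np.clip(target_area, 1e-6, None)  # fraction of each target covered
        if (coverage > pos_frac).any():
            labels[i] = 1          # salient region: covers most of at least one target
        elif (coverage < neg_frac).all():
            labels[i] = -1         # background
    return labels
```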
The loss function of the ultra-small target significance region extraction network is as follows.
where N_cls denotes the number of categories; the ultra-small target salient region extraction module distinguishes only the 2 categories foreground and background, so N_cls = 2. L_cls(p_i, p_i*) denotes the classification loss function; here L_cls(p_i, p_i*) measures the error between the predicted and actual values using cross entropy, which can be expressed specifically as follows.
where p_i* denotes the actual category and p_i denotes the predicted category.
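The formula images referenced above are not reproduced in this text. Based on the surrounding definitions (N_cls, L_cls, cross entropy), a standard form consistent with the description would be the following; this is a reconstruction, not the patent's own equation.

```latex
L = \frac{1}{N_{cls}} \sum_{i} L_{cls}(p_i, p_i^{*}),
\qquad
L_{cls}(p_i, p_i^{*}) = -\left[\, p_i^{*} \log p_i + (1 - p_i^{*}) \log (1 - p_i) \,\right]
```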
During model training, because the number of candidate boxes is large and positive and negative samples are severely imbalanced, only 256 positive samples and 256 negative samples are selected from the feature map of each scale to participate in training. At test time, the 300 candidate boxes with the highest confidence are finally retained as foreground regions.
For each candidate box, the target position is characterized by its center point and the midpoints of its four sides, yielding 1500 anchor points in total. k-means clustering is applied to these 1500 anchor points to obtain N cluster centers; within each cluster, an anchor point is regarded as invalid if fewer than 3 anchor points lie within a distance of 64 from it. For the valid anchor points, the corresponding boundary is obtained, yielding the salient regions. The specific implementation is shown in Algorithm 1; a sketch of this step is given below.
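Algorithm 1 itself is not reproduced in this text. The following Python sketch illustrates one plausible reading of the grouping step (the value of N, here n_clusters, and the function name are assumed):

```python
import numpy as np
from sklearn.cluster import KMeans

def salient_regions_from_boxes(boxes, n_clusters=10, radius=64.0, min_neighbors=3):
    """boxes: (M, 4) array of candidate boxes (x1, y1, x2, y2). Each box contributes
    5 anchor points: its centre and the midpoints of its 4 sides."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    points = np.concatenate([
        np.stack([cx, cy], 1),    # centre
        np.stack([cx, y1], 1),    # top midpoint
        np.stack([cx, y2], 1),    # bottom midpoint
        np.stack([x1, cy], 1),    # left midpoint
        np.stack([x2, cy], 1),    # right midpoint
    ])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(points)
    regions = []
    for k in range(n_clusters):
        cluster = points[labels == k]
        # keep an anchor point only if at least `min_neighbors` others lie within `radius`
        dists = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=-1)
        valid = cluster[(dists < radius).sum(axis=1) - 1 >= min_neighbors]
        if len(valid):
            regions.append((valid[:, 0].min(), valid[:, 1].min(),
                            valid[:, 0].max(), valid[:, 1].max()))
    return regions
```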
Step 2.3: the local target detection branch takes a foreground picture block extracted by the global significance region detection branch as input data, uniformly resamples the picture block to 640 multiplied by 640, acquires a picture feature map through a feature extraction network, designs a target detection stage by referring to a fast RCNN network, and simultaneously introduces a self-attention mechanism to optimize an RPN network, wherein the overall structure of the local target detection module is shown as a local target detection branch in the figure 3.
First, self-attention semantic feature extraction branches are established on the conv_4 and conv_5 output features, shown by the blue and orange arrows in FIG. 3 respectively. Adding these semantic feature extraction branches preserves the correlation between pixels during picture downsampling. The semantic feature maps obtained from conv_4 and conv_5 are then concatenated; note that the conv_4 semantic feature map is first downsampled by average pooling to keep its dimensions consistent with the conv_5 semantic feature map. Finally, the semantic feature map is fused with the conv_proposal feature map and used for subsequent target detection.
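A hedged sketch of this semantic-branch idea is given below, using PyTorch's built-in multi-head attention as the self-attention layer. The exact layer shapes, channel counts, and fusion operator are not specified in this text, so the names SemanticBranch and fuse_semantic_features and the choice of concatenation for fusion are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticBranch(nn.Module):
    """Self-attention over a backbone feature map, treating each spatial position as a token.
    `channels` must be divisible by `heads`."""
    def __init__(self, channels: int, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)          # (b, h*w, c)
        out, _ = self.attn(seq, seq, seq)
        return out.transpose(1, 2).reshape(b, c, h, w)

def fuse_semantic_features(feat_c4, feat_c5, proposal_feat, branch_c4, branch_c5):
    """conv_4 semantics are average-pooled to conv_5 resolution (assumes conv_4 is exactly
    2x the spatial size of conv_5, as in ResNet), concatenated with the conv_5 semantics,
    and then fused here by concatenation with the proposal feature map, which is assumed
    to share the conv_5 spatial size."""
    s4 = F.avg_pool2d(branch_c4(feat_c4), kernel_size=2)
    s5 = branch_c5(feat_c5)
    semantic = torch.cat([s4, s5], dim=1)
    return torch.cat([semantic, proposal_feat], dim=1)
```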
The loss function of the local object detection module is as follows.
where N_cls denotes the number of categories; in the local target detection module the bolts are divided into 6 classes according to their state, and the background must be treated as an additional class, so N_cls = 7. L_cls(p_i, p_i*) denotes the classification loss function; here L_cls(p_i, p_i*) is computed with cross entropy and can be expressed by the following formula.
where p_i* denotes the actual category and p_i denotes the predicted category.
N_pos denotes the number of position coordinates; a rectangular box is used here to represent the target position, so N_pos = 4. L_pos denotes the position loss function; to accelerate model convergence, a CIoU (complete intersection over union) loss function is introduced to calculate the position loss, which can be expressed specifically by the following formula.
where t_i denotes the position of the predicted box and t_i* the position of the actual box; ρ denotes the Euclidean distance between the center point of the predicted box t_i and the center point of the actual box t_i*; c denotes the diagonal length of the smallest box covering both t_i and t_i*; and IoU denotes the intersection-over-union ratio of t_i and t_i*, which can be calculated by the following equation.
α is a balance coefficient and can be calculated by the following formula.
where v is a coefficient measuring the consistency between the aspect ratio of the predicted target and that of the actual target, and can be expressed by the following formula.
where w and h denote the width and height of the predicted target, and w* and h* denote the width and height of the actual target.
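The CIoU formula images are not reproduced in this text. For reference, the standard CIoU formulation that matches the symbols defined above (ρ, c, IoU, α, v, w, h) is, as a reconstruction:

```latex
L_{pos}(t_i, t_i^{*}) = 1 - \mathrm{IoU}(t_i, t_i^{*})
  + \frac{\rho^{2}(t_i, t_i^{*})}{c^{2}} + \alpha v,
\qquad
\mathrm{IoU}(t_i, t_i^{*}) = \frac{\lvert t_i \cap t_i^{*} \rvert}{\lvert t_i \cup t_i^{*} \rvert}

\alpha = \frac{v}{(1 - \mathrm{IoU}) + v},
\qquad
v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{*}}{h^{*}} - \arctan\frac{w}{h} \right)^{2}
```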
Step 3: constructing a small target bolt defect data set based on the unmanned aerial vehicle inspection image;
step 3.1: The main targets of the data set used in the invention are bolts at connecting fittings such as triangle yoke plates, adjusting plates, and wire clamps. According to their different visual forms, the bolt targets to be identified are divided into six types: (a) nut with pin, normal form; (b) double nuts, normal form; (c) nut with pin, pin missing; (d) double nuts, nut loosened; (e) double nuts, nut missing; (f) nut rusted. Examples of the various categories are shown in FIG. 4.
Step 3.2: in the embodiment, 1852 sample pictures with bolt defects are collected in total, 1482 sample pictures are randomly selected as a training set, 370 sample pictures are taken as a test set, and the sample pictures are marked according to the PASCAL VOC standard;
step 3.3: To make the data set better suited to the training tasks of the two stages of the method, the training set is further processed into two data sets;
data set A: all target categories are merged into a single target type, 'foreground'; the training and test sets contain 14635 targets in total, and this data set is used for training and testing the global salient region detection branch;
data set B: the data set is obtained by randomly clipping an original data set, the size of a clipped picture is 640 multiplied by 640, and two types of objects of A, B are considered to be far more than other four types of objects, so that if the clipped picture only comprises an A type object or a B type object, the picture is omitted. 5 pictures are randomly cut from each picture, 7410 pictures are obtained from a final training set, 1850 pictures are obtained from a test set, and a data set B is formed for training and testing of local target detection branches;
step 4: training a global-local two-stage small target bolt defect detection model on an unmanned aerial vehicle inspection picture data set;
step 4.1: In this embodiment, the feature extraction network, the global salient region detection branch, and the local target detection branch are trained separately; to help the model converge better on the data set of this embodiment, the feature extraction network is first pre-trained on the ImageNet data set to obtain initialization weights;
step 4.2: In this embodiment, the model structure designed for the local target detection branch is pre-trained on the ImageNet data set. The whole model is first randomly initialized and then trained on the pre-training data set for 40 epochs, with an initial learning rate of 0.001 that decays to 0.0001 after 32 epochs. Training uses mini-batch gradient descent with 64 pictures per batch; parameters are optimized by stochastic gradient descent, and momentum (set to 0.9) is used to accelerate convergence. After training, the trained parameters are saved to a weight file; a sketch of this schedule is given below;
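A minimal PyTorch sketch of this schedule, under the assumption that "0.001 decaying to 0.0001 after 32 epochs" refers to the learning rate, is:

```python
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

def build_pretraining_optimizer(model):
    """Mini-batch SGD with momentum 0.9; learning rate 1e-3 decayed by 10x after epoch 32."""
    optimizer = SGD(model.parameters(), lr=1e-3, momentum=0.9)
    scheduler = MultiStepLR(optimizer, milestones=[32], gamma=0.1)
    return optimizer, scheduler

# Usage sketch (batch size 64, 40 epochs):
# for epoch in range(40):
#     for images, targets in loader:       # DataLoader with batch_size=64
#         loss = model(images, targets)
#         optimizer.zero_grad(); loss.backward(); optimizer.step()
#     scheduler.step()
```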
step 4.3: When training the global salient region detection branch, the feature extraction network is first initialized with the pre-trained model and the other parameters are randomly initialized; the parameters of the first three layers of the network are frozen, and only the parameters of the last two layers and the head are trained. The model is trained on data set A for 15 epochs, with input pictures uniformly resampled to 800×800, an initial learning rate of 0.01, and a learning rate decay to 0.001 after 12 epochs. The rest of the optimization strategy is consistent with the pre-training. The local target detection branch is trained on data set B, with the same training procedure and hyper-parameter settings as the global salient region detection branch;
step 5: using a trained model to intelligently identify bolt defects in the unmanned aerial vehicle inspection picture;
step 5.1: To evaluate the effectiveness of the proposed method, comparative verification is performed against the Faster RCNN, SSD, and RetinaNet models, with the test results shown in Table 1. The results show that the method provided by the invention performs far better than the other methods on the transmission line bolt defect detection task.
Table 1 comparison of model test results
It should be understood that parts of the specification not specifically set forth herein are all prior art.
It should be understood that the foregoing description of the preferred embodiments is relatively detailed and should not be taken as limiting the scope of the invention, which is defined by the appended claims; those skilled in the art may make substitutions or modifications without departing from the scope of the invention as set forth in the appended claims.
Claims (5)
1. A small target bolt defect detection method for unmanned aerial vehicle inspection is characterized in that:
comprises the following steps:
step 1: constructing a backbone network suitable for small target feature extraction;
step 2: constructing a global-local two-stage small target bolt defect detection model;
step 3: constructing a small target bolt defect data set based on the unmanned aerial vehicle inspection image;
step 4: training a global-local two-stage small target bolt defect detection model on an unmanned aerial vehicle inspection picture data set;
step 5: and using the trained model to intelligently identify the bolt defects in the unmanned aerial vehicle inspection picture.
2. The method for detecting the defects of the small target bolts for unmanned aerial vehicle inspection according to claim 1, wherein the method comprises the following steps: the construction of the backbone network suitable for small target feature extraction in the step 1 comprises the following specific steps:
step 1.1: designing a small target bolt defect feature extraction network by taking ResNet as a main network;
step 1.2: adding a hybrid attention module to the feature extraction network, wherein the module is divided mainly into a channel attention layer and a spatial attention layer, the channel attention layer strengthens feature maps that help characterize the target and suppresses other feature maps, the spatial attention layer strengthens the foreground region where the target is located in the feature map and suppresses background region information, and by designing the hybrid attention module, foreground information can be preserved as much as possible during feature extraction and the effectiveness of feature extraction is improved;
step 1.3: using a feature pyramid network in the feature extraction network for multi-scale feature fusion, so that by fusing features of different scales, information loss during feature extraction is reduced and the bolt defect detection accuracy is improved.
3. The method for detecting the defects of the small target bolts for unmanned aerial vehicle inspection according to claim 1, wherein the method comprises the following steps: the construction of the global-local two-stage small target bolt defect detection model in the step 2 comprises the following specific steps:
step 2.1: the global-local ultra-small target detection module consists of two branches: a global salient region detection branch and a local target detection branch; the global salient region detection branch takes the downsampled original image as input, and after the feature extraction network a region proposal network generates salient regions; corresponding pixel blocks are cropped from the original image according to the salient region coordinates, the cropped image blocks are passed to the local target detection branch, targets are identified in the image blocks and mapped back into the original image, and repeated results are removed through non-maximum suppression, thereby obtaining the ultra-small target recognition result;
step 2.2: in the global salient region detection branch, candidate targets are clustered with a k-means algorithm, which avoids the need for labeled data in the salient region generation stage; the algorithm implementation flow is as follows:
step 2.3: the local target detection branch takes the foreground picture blocks extracted by the global salient region detection branch as input data, is designed with reference to the Faster RCNN network, and introduces a self-attention mechanism to optimize the RPN network.
4. The method for detecting the defects of the small target bolts for unmanned aerial vehicle inspection according to claim 1, wherein the method comprises the following steps: the method for constructing the small target bolt defect data set based on the unmanned aerial vehicle inspection image comprises the following specific steps of:
step 3.1: in the implementation process, the bolt targets in the original inspection images are first labeled, and the original data set is then divided into 2 sub-data sets, where the first sub-data set is used for training the global salient region detection branch and the second sub-data set is used for training the local target detection branch;
step 3.2: the first sub-data set merges all labeled bolt target categories in the inspection pictures into a single target type, 'foreground', forming a target detection data set;
step 3.3: the second sub-data set is formed by randomly cropping the original data set into 640×640 small pictures, with target categories consistent with the original data set.
5. The method for detecting the defects of the small target bolts for unmanned aerial vehicle inspection according to claim 1, wherein the method comprises the following steps: the global-local two-stage small target bolt defect detection model provided by the invention is trained on an unmanned aerial vehicle inspection picture data set in the step 4, and the specific steps are as follows:
step 4.1: first, the method pre-trains the feature extraction network on the ImageNet 2012 public data set;
step 4.2: the global salient region detection branch and the local target detection branch are each initialized with the trained feature extraction network, and the two modules are trained on the first and second sub-data sets respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310446386.8A CN116579992A (en) | 2023-04-23 | 2023-04-23 | Small target bolt defect detection method for unmanned aerial vehicle inspection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310446386.8A CN116579992A (en) | 2023-04-23 | 2023-04-23 | Small target bolt defect detection method for unmanned aerial vehicle inspection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116579992A true CN116579992A (en) | 2023-08-11 |
Family
ID=87544503
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310446386.8A Pending CN116579992A (en) | 2023-04-23 | 2023-04-23 | Small target bolt defect detection method for unmanned aerial vehicle inspection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116579992A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116883391A (en) * | 2023-09-05 | 2023-10-13 | 中国科学技术大学 | Two-stage distribution line defect detection method based on multi-scale sliding window |
CN116883391B (en) * | 2023-09-05 | 2023-12-19 | 中国科学技术大学 | Two-stage distribution line defect detection method based on multi-scale sliding window |
CN117237363A (en) * | 2023-11-16 | 2023-12-15 | 国网山东省电力公司曲阜市供电公司 | Method, system, medium and equipment for identifying external broken source of power transmission line |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |