CN109325947A - A kind of SAR image steel tower object detection method based on deep learning - Google Patents
A kind of SAR image steel tower object detection method based on deep learning
- Publication number: CN109325947A
- Application number: CN201811100702.1A
- Authority: CN (China)
- Prior art keywords: size, layer, sample, window, layers
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/0002: Image analysis; inspection of images, e.g. flaw detection
- G06N3/045: Neural networks; architecture; combinations of networks
- G06N3/08: Neural networks; learning methods
- G06T2207/10032: Image acquisition modality; satellite or aerial image; remote sensing
- G06T2207/10044: Image acquisition modality; radar image
- G06T2207/20081: Special algorithmic details; training; learning
Abstract
The present invention relates to a deep-learning-based SAR image steel (iron) tower target detection method, comprising: randomly selecting several SAR images from an SAR data set and segmenting them to obtain sample slices; setting a sample label for each sample slice to construct a training sample set; processing each sample slice in the training sample set to generate multiple different artificial sample slices, which are labeled and added to the training sample set to expand it; constructing an SSD model; inputting the expanded training sample set into the constructed SSD model and training the SSD model by gradient descent; and cutting the data graph to be detected into multiple slices of the same size as the sample slices and inputting them into the trained SSD model to obtain the target detection results of the data graph. The present invention has the advantages of strong robustness, fast running speed, high detection performance and easy migration; it does not require high contrast between target and background during detection, and can be used for detection in complex scenes.
Description
Technical Field
The invention relates to the technical field of image processing, and in particular to an SAR image iron tower target detection method based on deep learning.
Background
Spaceborne Synthetic Aperture Radar (SAR) is a microwave imaging radar. Because it is unaffected by weather and climate and can observe the earth around the clock, in all weather, at high resolution and over large areas, SAR is widely applied in fields such as military target detection, ocean monitoring, resource exploration, agriculture and forestry.
The Convolutional Neural Network (CNN) is a common deep feed-forward artificial neural network and one of the principal deep learning methods; it has been applied successfully in the field of computer vision. With the resurgence of deep learning, target detection methods based on deep learning have developed rapidly in recent years: the candidate-region-based R-CNN method and its improved versions Fast R-CNN and Faster R-CNN, as well as the end-to-end YOLO and SSD methods, have been proposed in turn and widely used. However, the Faster R-CNN model has poor real-time performance, and the YOLO model has poor accuracy.
Many target detection methods for SAR images have been developed. The constant false alarm rate (CFAR) detection method is widely applied to SAR image target detection because it is simple, fast and strongly real-time. Different target types present different signatures in SAR images, and correspondingly different detection methods have been proposed. However, these existing SAR image detection methods use only the statistical properties of local areas of the SAR image and can only perform pixel-level detection; they require high contrast between target and background, and while their detection performance is good in simple scenes, it is poor in complex scenes.
Disclosure of Invention
Technical problem to be solved
The invention aims to solve the technical problem that target detection methods for SAR images in the prior art require high contrast between the target and the background and perform poorly in complex scenes.
(II) technical scheme
In order to solve the technical problem, the invention provides a method for detecting an SAR image iron tower target based on deep learning, which is used for improving the detection performance under a complex scene and comprises the following steps:
s1, randomly extracting a plurality of SAR images from the SAR data set, segmenting to obtain sample slices, setting a sample label for each sample slice, and constructing a training sample set, wherein the sample label comprises coordinate information and category information of a target in the sample slice;
s2, processing each sample slice in the training sample set to generate a plurality of different artificial sample slices, which are given sample labels and added to the training sample set to expand it;
s3, constructing the SSD model, including:
15 layers of convolutional neural networks, which are used for preliminarily extracting the image characteristics of the input image;
8 layers of convolutional neural networks are used for further extracting image features of different scales;
the multi-scale detection network is used for detecting the extracted image characteristics with different scales;
s4, inputting the training sample set expanded in the step S2 as an input image into the SSD model constructed in the step S3, and training the SSD model by adopting a gradient descent method;
and S5, cutting the data graph to be detected into a plurality of slices to be detected, wherein the slices have the same size as the sample slice, and inputting the SSD model trained in the step S4 to obtain a target detection result of the data graph to be detected.
Preferably, the step S1 includes:
s1-1, randomly extracting 100 SAR images from the MiniSAR data set;
s1-2, obtaining a plurality of sample slices from each SAR image by random segmentation;
s1-3, adding a sample label to each sample slice, wherein the sample label comprises an absolute path of the sample slice, the number of targets, coordinates of a target frame and the type of the target;
and S1-4, forming the sample slices with the corresponding sample labels into a training sample set.
Preferably, in step S2, when each sample slice in the training sample set is processed to generate a plurality of different artificial sample slices, one or more of translation, flipping, rotation and noise addition are applied at random, and 100 different artificial sample slices are generated from each sample slice.
Preferably, the step S4 includes:
s4-1, inputting the expanded training sample set into an SSD model;
s4-2, calculating the cost function of the current SSD model on the input images by forward propagation;
s4-3, respectively calculating gradient values of the cost function to parameters in each convolution layer in the SSD model by using a back propagation method;
s4-4, updating the parameters in each convolution layer according to the gradient value of the cost function to the parameters in each convolution layer by using a gradient descent method;
and S4-5, performing steps S4-2 to S4-4 in a loop to update the SSD model, ending training when the number of iterations reaches a set value or the parameters of the convolutional layers in the SSD model are no longer updated, and saving the current SSD model.
Preferably, the step S5 includes:
s5-1, cutting the data graph to be detected, in a grid pattern, into a plurality of slices of a size similar to the sample slices, which serve as the slices to be detected;
s5-2, inputting the slices to be detected into the SSD model trained in the step S4, and performing target detection on each slice to be detected by using the SSD model to obtain a target detection result of each slice to be detected;
and S5-3, merging the target detection results of the slices to be detected at the corresponding positions of the original data graph to be detected to obtain the target detection results of the data graph to be detected.
Preferably, the input image size of the SSD model constructed in step S3 is 300 × 300 pixels;
the 15-layer convolutional neural network comprises:
a first group of convolutional layers: the 1st to 2nd convolutional layers, each using 64 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 64 feature maps of size 300 × 300;
a second group of convolutional layers: the 3rd to 4th convolutional layers, each using 128 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 128 feature maps of size 150 × 150;
a third group of convolutional layers: the 5th to 7th convolutional layers, each using 256 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 256 feature maps of size 75 × 75;
a fourth group of convolutional layers: the 8th to 10th convolutional layers, each using 512 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 512 feature maps of size 38 × 38;
a fifth group of convolutional layers: the 11th to 13th convolutional layers, each using 512 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 512 feature maps of size 19 × 19;
a sixth group of convolutional layers: the 14th convolutional layer, using 1024 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 1024 feature maps of size 19 × 19;
a seventh group of convolutional layers: the 15th convolutional layer, using 1024 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding, outputting 1024 feature maps of size 19 × 19;
the feature maps output by the 2nd, 4th, 7th, 10th and 13th convolutional layers are downsampled by a max-pooling layer using a pooling kernel of window size 2 × 2 with a stride of 2.
Preferably, the 8-layer convolutional neural network in step S3 comprises:
an eighth group of convolutional layers: the 16th to 17th convolutional layers; the 16th layer uses 256 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 17th layer uses 512 convolution kernels of window size 3 × 3 with a stride of 2 and 1-pixel edge padding; the group outputs 512 feature maps of size 10 × 10;
a ninth group of convolutional layers: the 18th to 19th convolutional layers; the 18th layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 19th layer uses 256 convolution kernels of window size 3 × 3 with a stride of 2 and 1-pixel edge padding; the group outputs 256 feature maps of size 5 × 5;
a tenth group of convolutional layers: the 20th to 21st convolutional layers; the 20th layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 21st layer uses 256 convolution kernels of window size 3 × 3 with a stride of 1 and no edge padding; the group outputs 256 feature maps of size 3 × 3;
an eleventh group of convolutional layers: the 22nd to 23rd convolutional layers; the 22nd layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 23rd layer uses 256 convolution kernels of window size 3 × 3 with a stride of 1 and no edge padding; the group outputs 256 feature maps of size 1 × 1.
Preferably, the multi-scale detection network in step S3 performs the following operations on the feature maps output by the 10th, 15th, 17th, 19th, 21st and 23rd convolutional layers, respectively:
normalizing the feature map;
extracting the coordinate information of the k default boxes at each position using a convolutional layer with 4k convolution kernels of window size 3 × 3, a stride of 1 and 1-pixel edge padding, wherein the index k = 1, 2, 3, 4, 5, 6 corresponds to the 10th, 15th, 17th, 19th, 21st and 23rd layers respectively;
extracting the confidence scores of the k default boxes for each class using a convolutional layer with (classes + 1) × k convolution kernels of window size 3 × 3, a stride of 1 and 1-pixel edge padding, wherein classes is the number of classes of targets to be detected;
and splicing the coordinate information extracted for each default box of each layer's feature map with the confidence scores to generate detection results.
Preferably, for the 10th, 15th, 17th, 19th, 21st and 23rd layers, the default box scale values are $s_1$ to $s_6$:

$$s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m],\ m = 6$$

where $s_k$ is the scale value of the default box, $s_{min} = 0.2$ and $s_{max} = 0.9$;

setting different aspect ratios $a_r$ for each layer's feature map and combining them with the layer's scale value $s_k$ gives the default box width $w_k^a$ and height $h_k^a$:

$$w_k^a = s_k\sqrt{a_r}, \qquad h_k^a = s_k/\sqrt{a_r}$$

wherein, for the 10th, 21st and 23rd layers, $a_r \in \{1, 2, 1/2\}$; for the 15th, 17th and 19th layers, $a_r \in \{1, 2, 3, 1/2, 1/3\}$; and for these six convolutional layers there is also a corresponding additional scale $s_k' = \sqrt{s_k s_{k+1}}$.
Preferably, when the multi-scale detection network in step S3 splices the coordinate information extracted for each default box of each layer's feature map with the confidence scores to generate detection results, all results with confidence scores below a confidence threshold are discarded first; then non-maximum suppression is used to retain the best detection results with higher confidence scores while suppressing suboptimal ones; finally, the screened coordinate information and confidence scores are spliced.
(III) advantageous effects
The technical scheme of the invention has the following advantages: the invention provides a method for detecting an iron tower target of an SAR image based on deep learning. Compared with the prior art, the invention has the advantages that:
(1) strong robustness
The invention adopts a multilayer convolutional neural network model that can fully extract high-level features of the target and exploits the translation invariance of the convolutional layers, so it is strongly robust to translations in the SAR image. Meanwhile, the invention adopts a multi-aspect-ratio detection method on multi-scale feature maps, so it is strongly robust to deformations in the SAR image.
(2) The running speed is high
The traditional CFAR detection method and the Faster R-CNN model both require two steps, generating suspicious targets and then identifying them, so their detection efficiency is low. The invention adopts an end-to-end training and detection method that integrates generation and identification, improving training and detection speed.
(3) High detection performance
The traditional CFAR detection method needs parameters set according to prior information about the image to be detected, and unreasonable parameter settings seriously degrade the detection result. The network model of the invention fully considers multi-aspect-ratio targets on all the multi-scale feature maps and avoids the unreasonable parameter settings of the CFAR detection method, so accuracy is improved.
(4) Easy migration
The lower layer parameters in the trained convolutional neural network reflect underlying features common to the images, such as edges, shapes, and the like. Thus, when it is desired to train a model that detects other targets, the training can be completed more quickly based on the transfer learning technique.
Drawings
Fig. 1 is a schematic diagram illustrating steps of a method for detecting an SAR image iron tower target based on deep learning in an embodiment of the invention;
FIG. 2 is a schematic view of an SSD model in a second embodiment of the present invention;
FIG. 3 is a diagram of data to be measured according to a third embodiment of the present invention;
fig. 4a and 4b are partial enlarged views of the detection results of fig. 3.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example one
As shown in fig. 1, an SAR image iron tower target detection method based on deep learning provided in an embodiment of the present invention includes:
s1, randomly extracting a plurality of SAR images from the SAR data set, segmenting to obtain sample slices, setting a sample label for each sample slice, and constructing a training sample set, wherein the sample label comprises coordinate information and category information of a target in the sample slice.
Preferably, step S1 includes:
s1-1, randomly extracting 100 SAR images from the MiniSAR data set;
s1-2, obtaining a plurality of sample slices from each SAR image by random segmentation;
s1-3, adding a sample label to each sample slice, wherein the sample label comprises the absolute path of the sample slice, the number of targets, the target box coordinates (i.e. coordinate information) and the target class (i.e. class information); specifically, the sample label may take the form (an illustrative parsing sketch follows step S1-4 below):

<DIR> n x_t1 y_t1 x_b1 y_b1 c_1 x_t2 y_t2 x_b2 y_b2 c_2 … x_tn y_tn x_bn y_bn c_n

where <DIR> is the absolute path of the corresponding sample slice and n is the number of iron tower targets in the slice, followed by n groups of target information; each group comprises (x_ti, y_ti), (x_bi, y_bi) and c_i, where x_ti and y_ti are the coordinates of the upper-left corner of the i-th target box, x_bi and y_bi are the coordinates of the lower-right corner of the i-th target box, and c_i is the class of the i-th target.
And S1-4, forming the sample slices with the corresponding sample labels into a training sample set.
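For illustration, the following minimal Python sketch writes and parses label lines of this form; the helper names are hypothetical, as the patent specifies only the field order:

```python
def format_label(path, targets):
    """targets: list of (xt, yt, xb, yb, c) tuples, one per iron tower box."""
    fields = [path, str(len(targets))]
    for xt, yt, xb, yb, c in targets:
        fields += [str(xt), str(yt), str(xb), str(yb), str(c)]
    return " ".join(fields)

def parse_label(line):
    # Assumes the absolute path contains no spaces.
    parts = line.split()
    path, n = parts[0], int(parts[1])
    vals = parts[2:]
    targets = [(int(vals[i]), int(vals[i + 1]), int(vals[i + 2]),
                int(vals[i + 3]), int(vals[i + 4])) for i in range(0, 5 * n, 5)]
    return path, targets
```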
And S2, processing each sample slice in the training sample set to generate a plurality of different artificial sample slices, which are given sample labels and added to the training sample set to expand it.
Preferably, in step S2, when each sample slice in the training sample set is processed to generate a plurality of different artificial sample slices, one or more of translation, flipping, rotation and noise addition are applied at random; 100 different artificial sample slices are generated by transforming each sample slice, a sample label corresponding to each artificial sample slice is derived from the specific operations applied, and each artificial sample slice with its sample label is added to the training sample set to obtain the expanded training sample set.
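A minimal sketch of this augmentation step is given below, assuming single-channel image arrays; the shift ranges and noise level are illustrative assumptions, and the box coordinates in the sample label would have to be transformed alongside the pixels (omitted here):

```python
import numpy as np

def augment(img, rng):
    """Randomly translate, flip, rotate, or add noise to one sample slice."""
    out = img.copy()
    if rng.random() < 0.5:   # random translation of up to 10 pixels
        out = np.roll(out, tuple(rng.integers(-10, 11, size=2)), axis=(0, 1))
    if rng.random() < 0.5:   # horizontal flip
        out = out[:, ::-1]
    if rng.random() < 0.5:   # rotation by a multiple of 90 degrees
        out = np.rot90(out, k=int(rng.integers(1, 4)))
    if rng.random() < 0.5:   # additive Gaussian noise
        out = out + rng.normal(0.0, 0.01, out.shape)
    return out

rng = np.random.default_rng(0)
sample = np.zeros((300, 300))                            # stand-in for one original slice
artificial = [augment(sample, rng) for _ in range(100)]  # 100 per original slice
```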
S3, constructing the SSD model, including:
15 layers of convolutional neural network, used for preliminary extraction of image features from the input image; this part can be constructed on the basis of the VGG-16 network;
the 8-layer convolutional neural network is connected behind the 15-layer convolutional neural network and is used for further extracting image features of different scales;
and the multi-scale detection network is connected behind the 8-layer convolutional neural network and is used for detecting the extracted image features of different scales.
And S4, inputting the training sample set expanded in the step S2 as an input image into the SSD model constructed in the step S3, and training the SSD model by adopting a gradient descent method.
Preferably, step S4 includes:
s4-1, inputting the expanded training sample set into an SSD model;
s4-2, calculating the cost function of the current SSD model on the input images by forward propagation;
s4-3, respectively calculating gradient values of the cost function to parameters in each convolution layer in the SSD model by using a back propagation method;
s4-4, updating the parameters in each convolution layer according to the gradient value of the cost function to the parameters in each convolution layer by using a gradient descent method;
and S4-5, performing steps S4-2 to S4-4 in a loop to update the SSD model, ending training when the number of iterations reaches a set value or the parameters of the convolutional layers in the SSD model are no longer updated, and saving the current SSD model.
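A hedged PyTorch sketch of this loop follows; `ssd_model`, `ssd_loss` and `loader` are assumed to exist, and the learning rate, momentum and step limit are illustrative rather than values given by the patent:

```python
import torch

# `ssd_model`, `ssd_loss` and `loader` are assumed; `loader` yields
# (images, targets) batches, repeating over the expanded training set.
optimizer = torch.optim.SGD(ssd_model.parameters(), lr=1e-3, momentum=0.9)
max_steps = 120_000   # the "set value" for the loop count (illustrative)

for step, (images, targets) in enumerate(loader):
    loss = ssd_loss(ssd_model(images), targets)  # S4-2: forward propagation
    optimizer.zero_grad()
    loss.backward()                              # S4-3: gradients via backpropagation
    optimizer.step()                             # S4-4: gradient-descent update
    if step + 1 >= max_steps:                    # S4-5: stop at the set loop count
        break

torch.save(ssd_model.state_dict(), "ssd_tower.pth")  # save the current SSD model
```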
And S5, cutting the data graph to be detected into a plurality of slices to be detected of the same or similar size as the sample slices, and inputting them into the SSD model trained in step S4 to obtain the target detection result of the data graph to be detected. Preferably, the sample slice should be square, or rectangular with a nearly square aspect ratio, for example an aspect ratio of no more than 3/2; the slice to be detected should be of a size similar to the sample slices used in training, for example with the ratio between the long sides of the two slices (sample slice and slice to be detected) between 2/3 and 3/2.
Preferably, step S5 includes:
s5-1, cutting the data graph to be detected, in a grid pattern, into a plurality of slices of a size similar to the sample slices, which serve as the slices to be detected;
s5-2, inputting the slices to be detected into the SSD model trained in the step S4, and performing target detection on each slice to be detected by using the SSD model to obtain a target detection result of each slice to be detected;
and S5-3, merging the target detection results of the slices to be detected at the corresponding positions of the original data graph to be detected to obtain the target detection results of the data graph to be detected.
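The following sketch illustrates S5-1 to S5-3, assuming a `detect` function that runs the trained SSD model on one slice and returns boxes in slice coordinates:

```python
def detect_scene(scene, detect, tile=300):
    """Grid-cut `scene` (a 2-D array), run per-slice detection, and map the
    boxes back to scene coordinates. `detect(patch)` is assumed to return
    tuples of (xt, yt, xb, yb, score, cls) in patch coordinates; edge
    patches smaller than `tile` may need padding to the model input size."""
    results = []
    h, w = scene.shape[:2]
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            patch = scene[y0:y0 + tile, x0:x0 + tile]
            for xt, yt, xb, yb, score, cls in detect(patch):
                # S5-3: shift each box to its position in the original graph
                results.append((xt + x0, yt + y0, xb + x0, yb + y0, score, cls))
    return results
```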
The method provided by the invention uses deep learning, extracting target features with a multilayer convolutional neural network model, to detect iron tower targets in SAR images; it does not require high contrast between target and background, and its multi-aspect-ratio detection on multi-scale feature maps extracts features at multiple scales, so it is strongly robust to deformations of the SAR image and can detect targets in complex scenes. Suspicious-target generation and identification are integrated, so training and detection are fast.
Example two
As shown in fig. 2, the second embodiment is basically the same as the first embodiment, and the same parts are not repeated herein, except that:
The input image size of the SSD model constructed in step S3 is 300 × 300 pixels.
Fig. 2 is a schematic view of the SSD model in this embodiment, in which the key convolutional layers are labeled. Convolutional layer 4_3 denotes the 3rd convolutional layer in the fourth group of convolutional layers, i.e. the 10th convolutional layer overall; the remaining labels follow the same convention.
As shown in fig. 2, the 15-layer convolutional neural network constructed in step S3 has the following specific structure:
a first group of convolutional layers: the 1st to 2nd convolutional layers, each using 64 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 64 feature maps of size 300 × 300;
a second group of convolutional layers: the 3rd to 4th convolutional layers, each using 128 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 128 feature maps of size 150 × 150;
a third group of convolutional layers: the 5th to 7th convolutional layers, each using 256 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 256 feature maps of size 75 × 75;
a fourth group of convolutional layers: the 8th to 10th convolutional layers, each using 512 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 512 feature maps of size 38 × 38;
a fifth group of convolutional layers: the 11th to 13th convolutional layers, each using 512 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 512 feature maps of size 19 × 19;
a sixth group of convolutional layers: the 14th convolutional layer, using 1024 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 1024 feature maps of size 19 × 19;
a seventh group of convolutional layers: the 15th convolutional layer, using 1024 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding, outputting 1024 feature maps of size 19 × 19;
the feature maps output by the 2nd, 4th, 7th, 10th and 13th convolutional layers are downsampled by a max-pooling layer using a pooling kernel of window size 2 × 2 with a stride of 2.
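As a concrete illustration, here is a hedged PyTorch sketch of this 15-layer backbone, assuming a single-channel SAR input. One caveat: the patent lists a 2 × 2, stride-2 pooling after layer 13, but to preserve the 19 × 19 resolution it states for layers 14 and 15, the sketch uses the 3 × 3, stride-1 pooling of the original SSD at that point:

```python
import torch.nn as nn

def conv_block(cin, cout, n):
    # n successive 3x3 convolutions (stride 1, 1-pixel padding), each followed by ReLU
    layers = []
    for i in range(n):
        layers += [nn.Conv2d(cin if i == 0 else cout, cout, 3, padding=1),
                   nn.ReLU(inplace=True)]
    return layers

backbone = nn.Sequential(
    *conv_block(1, 64, 2),    nn.MaxPool2d(2, 2),                  # layers 1-2, 300 -> 150
    *conv_block(64, 128, 2),  nn.MaxPool2d(2, 2),                  # layers 3-4, 150 -> 75
    *conv_block(128, 256, 3), nn.MaxPool2d(2, 2, ceil_mode=True),  # layers 5-7, 75 -> 38
    *conv_block(256, 512, 3),        # layers 8-10; detection taps this 38x38 output
    nn.MaxPool2d(2, 2),                                            # 38 -> 19
    *conv_block(512, 512, 3), nn.MaxPool2d(3, 1, padding=1),       # layers 11-13, stays 19
    nn.Conv2d(512, 1024, 3, padding=1), nn.ReLU(inplace=True),     # layer 14, 19x19
    nn.Conv2d(1024, 1024, 1),           nn.ReLU(inplace=True),     # layer 15, 19x19
)
```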
The 8-layer convolutional neural network in step S3 has the following specific structure:
an eighth group of convolutional layers: the 16th to 17th convolutional layers; the 16th layer uses 256 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 17th layer uses 512 convolution kernels of window size 3 × 3 with a stride of 2 and 1-pixel edge padding; the eighth group as a whole outputs 512 feature maps of size 10 × 10;
a ninth group of convolutional layers: the 18th to 19th convolutional layers; the 18th layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 19th layer uses 256 convolution kernels of window size 3 × 3 with a stride of 2 and 1-pixel edge padding; the ninth group as a whole outputs 256 feature maps of size 5 × 5;
a tenth group of convolutional layers: the 20th to 21st convolutional layers; the 20th layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 21st layer uses 256 convolution kernels of window size 3 × 3 with a stride of 1 and no edge padding; the tenth group as a whole outputs 256 feature maps of size 3 × 3;
an eleventh group of convolutional layers: the 22nd to 23rd convolutional layers; the 22nd layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 23rd layer uses 256 convolution kernels of window size 3 × 3 with a stride of 1 and no edge padding; the eleventh group as a whole outputs 256 feature maps of size 1 × 1.
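The extra 8-layer network can be sketched the same way; the spatial sizes in the comments follow the group outputs stated above:

```python
import torch.nn as nn

extras = nn.Sequential(
    nn.Conv2d(1024, 256, 1), nn.ReLU(inplace=True),                      # layer 16
    nn.Conv2d(256, 512, 3, stride=2, padding=1), nn.ReLU(inplace=True),  # layer 17, 19 -> 10
    nn.Conv2d(512, 128, 1), nn.ReLU(inplace=True),                       # layer 18
    nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),  # layer 19, 10 -> 5
    nn.Conv2d(256, 128, 1), nn.ReLU(inplace=True),                       # layer 20
    nn.Conv2d(128, 256, 3), nn.ReLU(inplace=True),                       # layer 21, 5 -> 3
    nn.Conv2d(256, 128, 1), nn.ReLU(inplace=True),                       # layer 22
    nn.Conv2d(128, 256, 3), nn.ReLU(inplace=True),                       # layer 23, 3 -> 1
)
```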
The multi-scale detection network in step S3 performs the following operations on the feature maps output by the 10th, 15th, 17th, 19th, 21st and 23rd convolutional layers, respectively:
a) normalizing the feature map.
b) extracting the coordinate information of the k default boxes at each position using a convolutional layer with 4k convolution kernels of window size 3 × 3, a stride of 1 and 1-pixel edge padding, wherein the index k = 1, 2, 3, 4, 5, 6 corresponds to the 10th, 15th, 17th, 19th, 21st and 23rd layers respectively.
c) extracting the confidence scores of the k default boxes for each class using a convolutional layer with (classes + 1) × k convolution kernels of window size 3 × 3, a stride of 1 and 1-pixel edge padding, wherein classes is the number of classes of targets to be detected, i.e. the total number of target classes in the training sample set.
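The prediction heads can be sketched as below; the per-layer default-box counts (4 or 6) follow the aspect-ratio settings described later in this embodiment, and `classes = 1` (iron tower only) is an assumption:

```python
import torch.nn as nn

classes = 1                                # iron tower only (assumption)
ks      = [4, 6, 6, 6, 4, 4]               # default boxes per position, per source layer
chans   = [512, 1024, 512, 256, 256, 256]  # channels of layers 10, 15, 17, 19, 21, 23

# One 3x3 conv per source layer for box offsets, one for class scores.
loc_heads  = nn.ModuleList(nn.Conv2d(c, 4 * k, 3, padding=1)
                           for c, k in zip(chans, ks))
conf_heads = nn.ModuleList(nn.Conv2d(c, (classes + 1) * k, 3, padding=1)
                           for c, k in zip(chans, ks))
```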
The default box scale $s_k$ and aspect ratio $a_r$ are hyper-parameters. For the 10th, 15th, 17th, 19th, 21st and 23rd layers, the default box scale values are $s_1$ to $s_6$:

$$s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m],\ m = 6$$

where $s_k$ is the scale value of the default box, $s_{min} = 0.2$ and $s_{max} = 0.9$; that is, the default boxes in the lowest-level (10th-layer) feature map have scale 0.2, those in the highest-level (23rd-layer) feature map have scale 0.9, and the scales of the default boxes in the feature maps of the intermediate layers are uniformly spaced between them.

Setting different aspect ratios $a_r$ for each layer's feature map and combining them with the layer's scale value $s_k$ gives the width $w_k^a$ and height $h_k^a$ of the default boxes:

$$w_k^a = s_k\sqrt{a_r}, \qquad h_k^a = s_k/\sqrt{a_r}$$

wherein, for the 10th, 21st and 23rd layers, $a_r \in \{1, 2, 1/2\}$; for the 15th, 17th and 19th layers, $a_r \in \{1, 2, 3, 1/2, 1/3\}$; and for each of the six convolutional layers (the 10th, 15th, 17th, 19th, 21st and 23rd) one more scale $s_k' = \sqrt{s_k s_{k+1}}$ with aspect ratio 1 is also considered. Thus, for the feature maps of the 10th, 21st and 23rd layers, up to 4 different default boxes can be generated at each position; for the feature maps of the 15th, 17th and 19th layers, up to 6 different default boxes can be generated at each position. This arrangement can substantially cover iron tower targets of various shapes and sizes in the input image.
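A small sketch of these scale and shape computations; the value used for $s_{m+1}$ in the extra scale of the last layer is an assumption, as the patent does not define it:

```python
import math

s_min, s_max, m = 0.2, 0.9, 6
scales = [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]
# scales == [0.2, 0.34, 0.48, 0.62, 0.76, 0.9]

def default_boxes(k, ratios):
    """Width/height pairs for layer index k (1-based) and its aspect ratios."""
    sk = scales[k - 1]
    wh = [(sk * math.sqrt(ar), sk / math.sqrt(ar)) for ar in ratios]
    sk_next = scales[k] if k < m else 1.0   # s_{m+1} undefined in the patent (assumption)
    wh.append((math.sqrt(sk * sk_next),) * 2)  # extra scale s'_k, aspect ratio 1
    return wh

print(default_boxes(1, [1, 2, 1/2]))         # 4 boxes for the 10th-layer feature map
print(default_boxes(2, [1, 2, 3, 1/2, 1/3])) # 6 boxes for the 15th-layer feature map
```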
d) splicing the coordinate information extracted for each default box of each layer's feature map with the confidence scores to generate detection results. Preferably, four coordinate values and one confidence score form a group representing one detection result; when splicing, the correspondence between coordinate information and confidence scores is maintained.
Preferably, all results with confidence scores below the confidence threshold (0.1 in the present invention) are discarded first; then non-maximum suppression (NMS) is used to retain the best detection results with higher confidence scores while suppressing suboptimal ones; finally, the screened coordinate information and confidence scores are spliced. Non-maximum suppression prevents the same target from being detected multiple times. Preferably, the invention judges whether the overlap between two regions A and B is too high on the basis of the intersection-over-union ratio

$$IoU(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

if $IoU(A, B)$ exceeds a set threshold, the overlap between region A and region B is considered too high.
Further preferably, calculating the cost function of the current SSD model on the input images by forward propagation in step S4-2 comprises:
matching the target boxes and the default boxes on the basis of the intersection-over-union ratio: each target box is first matched with the default box having the greatest IoU with it, and then every default box is matched with any target box whose IoU with it exceeds a threshold (0.5 in the present invention); this simplifies the learning process and better handles prediction among overlapping targets.
Let $x_{ij}^p \in \{0, 1\}$ indicate, for class p, whether the i-th default box matches the j-th target box: $x_{ij}^p = 1$ means the i-th default box matches the j-th target box of class p, and $x_{ij}^p = 0$ means it does not. From the matching strategy described above, $\sum_i x_{ij}^p \geq 1$, i.e. one target box may be matched to multiple default boxes.
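A compact sketch of this matching strategy over an IoU matrix:

```python
import numpy as np

def match(ious, thresh=0.5):
    """ious: (num_defaults, num_targets) IoU matrix. Each target box gets its
    best default box; in addition, any default box with IoU above the
    threshold is matched. Returns a boolean matrix x with x[i, j] == True
    meaning default box i matches target box j."""
    x = ious > thresh                                        # threshold matches
    x[ious.argmax(axis=0), np.arange(ious.shape[1])] = True  # best default per target
    return x
```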
Preferably, the cost function, i.e. the overall loss function, can be expressed as:

$$L(x, c, l, g) = \frac{1}{N}\big(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\big)$$

where x is the set of matching indicators $x_{ij}^p$, N is the number of matched default boxes (if N = 0, the loss is set to 0), the weight parameter α is set to 1 in the method, $L_{conf}(x, c)$ is the classification loss component of the overall loss function, and $L_{loc}(x, l, g)$ is the localization loss component.

The classification loss function $L_{conf}(x, c)$ is the softmax loss over all the classes considered, which can be expressed as:

$$L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^p \log\big(\hat{c}_i^p\big) - \sum_{i \in Neg} \log\big(\hat{c}_i^0\big), \qquad \hat{c}_i^p = \frac{\exp(c_i^p)}{\sum_p \exp(c_i^p)}$$

where $c_i^p$ is the output classification score of the i-th default box for class p, and $\hat{c}_i^p$ is the confidence probability obtained by applying softmax over the output classification scores of the i-th default box for the different classes p.
The localization loss function $L_{loc}(x, l, g)$ is the Smooth L1 loss between the predicted boxes and the encoded target boxes, which can be expressed as:

$$L_{loc}(x, l, g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^k \, \mathrm{smooth}_{L1}\big(l_i^m - \hat{g}_j^m\big)$$

where $l_i^{cx}$, $l_i^{cy}$, $l_i^w$ and $l_i^h$ are the x-coordinate, y-coordinate, width and height of the i-th predicted box, and $\hat{g}_j^{cx}$, $\hat{g}_j^{cy}$, $\hat{g}_j^w$ and $\hat{g}_j^h$ are the x-coordinate, y-coordinate, width and height of the encoded j-th target box. The Smooth L1 loss can be expressed as:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

The coordinates of the target box are encoded with the coordinates of the default box as:

$$\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^w}, \quad \hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^h}, \quad \hat{g}_j^w = \log\frac{g_j^w}{d_i^w}, \quad \hat{g}_j^h = \log\frac{g_j^h}{d_i^h}$$

where $g_j^{cx}$, $g_j^{cy}$, $g_j^w$ and $g_j^h$ are the x-coordinate, y-coordinate, width and height of the j-th target box, and $d_i^{cx}$, $d_i^{cy}$, $d_i^w$ and $d_i^h$ are the x-coordinate, y-coordinate, width and height of the i-th default box.
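The Smooth L1 term and the box encoding can be sketched as:

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1: 0.5 * x^2 for |x| < 1, |x| - 0.5 otherwise."""
    x = np.abs(x)
    return np.where(x < 1, 0.5 * x ** 2, x - 0.5)

def encode(g, d):
    """Encode a target box g against a default box d; both are (cx, cy, w, h)."""
    return np.array([(g[0] - d[0]) / d[2],
                     (g[1] - d[1]) / d[3],
                     np.log(g[2] / d[2]),
                     np.log(g[3] / d[3])])
```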
In the method provided by the invention, multi-aspect-ratio targets on each multi-scale feature map are fully considered when constructing the network model, different aspect ratios are set for the lower-level and higher-level feature maps, and the unreasonable parameter settings of the CFAR detection method are avoided, so accuracy is improved. Moreover, the lower-layer parameters of the trained convolutional neural network reflect underlying features common to images, such as edges and shapes; therefore, when a model for detecting other targets needs to be trained, training can be completed faster on the basis of transfer learning.
EXAMPLE III
As shown in fig. 3, 4a and 4b, the third embodiment is substantially the same as the second embodiment, and the description of the same parts is omitted, except that:
the data map to be measured used in the present embodiment is a scene SAR image, as shown in fig. 3. The image size of the data image to be detected is 16384 × 8192 pixels, and the data image to be detected comprises various artificial targets such as iron towers, buildings and the like, and aims to detect and position all types of iron tower targets in the data image to be detected.
In order to better observe the detection result, as shown in fig. 4a and 4b, a part of the target detection result is enlarged, and a white frame mark part in the figure is detected as a part of a steel tower.
In this embodiment, the detection result of the whole to-be-detected data graph is calculated, so that the accuracy rate of the method provided by the invention is 82.4%, the recall ratio (recall ratio) is 97.6%, and the value of the comprehensive evaluation index F1 (the harmonic mean value of the accurate value and the recall ratio) is 89.4%, so that a better test result is obtained.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A SAR image iron tower target detection method based on deep learning is characterized by comprising the following steps:
s1, randomly extracting a plurality of SAR images from the SAR data set, segmenting to obtain sample slices, setting a sample label for each sample slice, and constructing a training sample set, wherein the sample label comprises coordinate information and category information of a target in the sample slice;
s2, processing each sample slice in the training sample set to generate a plurality of different artificial sample slices, which are given sample labels and added to the training sample set to expand it;
s3, constructing the SSD model, including:
15 layers of convolutional neural networks, which are used for preliminarily extracting the image characteristics of the input image;
8 layers of convolutional neural networks are used for further extracting image features of different scales;
the multi-scale detection network is used for detecting the extracted image characteristics with different scales;
s4, inputting the training sample set expanded in the step S2 as an input image into the SSD model constructed in the step S3, and training the SSD model by adopting a gradient descent method;
and S5, cutting the data graph to be detected into a plurality of slices to be detected, wherein the slices have the same size as the sample slice, and inputting the SSD model trained in the step S4 to obtain a target detection result of the data graph to be detected.
2. The SAR image iron tower target detection method based on deep learning of claim 1, wherein the step S1 includes:
s1-1, randomly extracting 100 SAR images from the MiniSAR data set;
s1-2, obtaining a plurality of sample slices from each SAR image by random segmentation;
s1-3, adding a sample label to each sample slice, wherein the sample label comprises an absolute path of the sample slice, the number of targets, coordinates of a target frame and the type of the target;
and S1-4, forming the sample slices with the corresponding sample labels into a training sample set.
3. The SAR image iron tower target detection method based on deep learning of claim 1, characterized in that:
in step S2, when each sample slice in the training sample set is processed to generate a plurality of different artificial sample slices, one or more of translation, flipping, rotation and noise addition are applied at random, and 100 different artificial sample slices are generated by transforming each sample slice.
4. The SAR image iron tower target detection method based on deep learning of claim 1, wherein the step S4 includes:
s4-1, inputting the expanded training sample set into an SSD model;
s4-2, calculating the cost function of the current SSD model on the input images by forward propagation;
s4-3, respectively calculating gradient values of the cost function to parameters in each convolution layer in the SSD model by using a back propagation method;
s4-4, updating the parameters in each convolution layer according to the gradient value of the cost function to the parameters in each convolution layer by using a gradient descent method;
and S4-5, performing steps S4-2 to S4-4 in a loop to update the SSD model, ending training when the number of iterations reaches a set value or the parameters of the convolutional layers in the SSD model are no longer updated, and saving the current SSD model.
5. The SAR image iron tower target detection method based on deep learning of claim 1, wherein the step S5 includes:
s5-1, cutting the data graph to be detected, in a grid pattern, into a plurality of slices of a size similar to the sample slices, which serve as the slices to be detected;
s5-2, inputting the slices to be detected into the SSD model trained in the step S4, and performing target detection on each slice to be detected by using the SSD model to obtain a target detection result of each slice to be detected;
and S5-3, merging the target detection results of the slices to be detected at the corresponding positions of the original data graph to be detected to obtain the target detection results of the data graph to be detected.
6. The SAR image iron tower target detection method based on deep learning of any one of claims 1 to 5, wherein the input image size of the SSD model constructed in step S3 is 300 × 300 pixels;
the 15-layer convolutional neural network comprises:
a first group of convolutional layers: the 1st to 2nd convolutional layers, each using 64 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 64 feature maps of size 300 × 300;
a second group of convolutional layers: the 3rd to 4th convolutional layers, each using 128 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 128 feature maps of size 150 × 150;
a third group of convolutional layers: the 5th to 7th convolutional layers, each using 256 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 256 feature maps of size 75 × 75;
a fourth group of convolutional layers: the 8th to 10th convolutional layers, each using 512 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 512 feature maps of size 38 × 38;
a fifth group of convolutional layers: the 11th to 13th convolutional layers, each using 512 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 512 feature maps of size 19 × 19;
a sixth group of convolutional layers: the 14th convolutional layer, using 1024 convolution kernels of window size 3 × 3 with a stride of 1 and 1-pixel edge padding, outputting 1024 feature maps of size 19 × 19;
a seventh group of convolutional layers: the 15th convolutional layer, using 1024 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding, outputting 1024 feature maps of size 19 × 19;
the feature maps output by the 2nd, 4th, 7th, 10th and 13th convolutional layers are downsampled by a max-pooling layer using a pooling kernel of window size 2 × 2 with a stride of 2.
7. The SAR image iron tower target detection method based on deep learning of claim 6, wherein the 8-layer convolutional neural network in step S3 comprises:
an eighth group of convolutional layers: the 16th to 17th convolutional layers; the 16th layer uses 256 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 17th layer uses 512 convolution kernels of window size 3 × 3 with a stride of 2 and 1-pixel edge padding; the group outputs 512 feature maps of size 10 × 10;
a ninth group of convolutional layers: the 18th to 19th convolutional layers; the 18th layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 19th layer uses 256 convolution kernels of window size 3 × 3 with a stride of 2 and 1-pixel edge padding; the group outputs 256 feature maps of size 5 × 5;
a tenth group of convolutional layers: the 20th to 21st convolutional layers; the 20th layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 21st layer uses 256 convolution kernels of window size 3 × 3 with a stride of 1 and no edge padding; the group outputs 256 feature maps of size 3 × 3;
an eleventh group of convolutional layers: the 22nd to 23rd convolutional layers; the 22nd layer uses 128 convolution kernels of window size 1 × 1 with a stride of 1 and no edge padding; the 23rd layer uses 256 convolution kernels of window size 3 × 3 with a stride of 1 and no edge padding; the group outputs 256 feature maps of size 1 × 1.
8. The SAR image iron tower target detection method based on deep learning of claim 7, wherein the multi-scale detection network in step S3 performs the following operations on the feature maps output by the 10th, 15th, 17th, 19th, 21st and 23rd convolutional layers, respectively:
normalizing the feature map;
extracting the coordinate information of the k default boxes at each position using a convolutional layer with 4k convolution kernels of window size 3 × 3, a stride of 1 and 1-pixel edge padding, wherein the index k = 1, 2, 3, 4, 5, 6 corresponds to the 10th, 15th, 17th, 19th, 21st and 23rd layers respectively;
extracting the confidence scores of the k default boxes for each class using a convolutional layer with (classes + 1) × k convolution kernels of window size 3 × 3, a stride of 1 and 1-pixel edge padding, wherein classes is the number of classes of targets to be detected;
and splicing the coordinate information extracted for each default box of each layer's feature map with the confidence scores to generate detection results.
9. The SAR image iron tower target detection method based on deep learning of claim 8, wherein, for the 10th, 15th, 17th, 19th, 21st and 23rd layers, the default box scale values are $s_1$ to $s_6$:

$$s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m],\ m = 6$$

where $s_k$ is the scale value of the default box, $s_{min} = 0.2$ and $s_{max} = 0.9$;

setting different aspect ratios $a_r$ for each layer's feature map and combining them with the layer's scale value $s_k$ gives the default box width $w_k^a$ and height $h_k^a$:

$$w_k^a = s_k\sqrt{a_r}, \qquad h_k^a = s_k/\sqrt{a_r}$$

wherein, for the 10th, 21st and 23rd layers, $a_r \in \{1, 2, 1/2\}$; for the 15th, 17th and 19th layers, $a_r \in \{1, 2, 3, 1/2, 1/3\}$; and for these six convolutional layers there is also a corresponding additional scale $s_k' = \sqrt{s_k s_{k+1}}$.
10. The SAR image iron tower target detection method based on deep learning of claim 9, characterized in that:
when the multi-scale detection network in step S3 splices the coordinate information extracted for each default box of each layer's feature map with the confidence scores to generate detection results, all results with confidence scores below a confidence threshold are discarded first; then non-maximum suppression is used to retain the best detection results with higher confidence scores while suppressing suboptimal ones; finally, the screened coordinate information and confidence scores are spliced.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201811100702.1A | 2018-09-20 | 2018-09-20 | A kind of SAR image steel tower object detection method based on deep learning

Publications (1)
Publication Number | Publication Date
---|---
CN109325947A | 2019-02-12
Legal Events
- PB01: Publication (application publication date: 2019-02-12)
- SE01: Entry into force of request for substantive examination
- RJ01: Rejection of invention patent application after publication