Disclosure of Invention
In order to realize high-efficiency, high-precision synchronous detection and identification of impact pit targets, the invention provides an end-to-end impact pit detection and identification model based on a fully convolutional neural network structure, named CraterIDNet. The network input is a remote sensing image of any resolution, and the output is the detected position, diameter and identification result of each impact pit. The network consists of two parts, namely an impact pit detection channel and an impact pit identification channel. The invention provides a pre-training network model with strong generalization capability, on the basis of which transfer learning is carried out on CraterIDNet. The invention provides a candidate frame scale optimization and density adjustment strategy aimed at the target characteristics of impact pits, and simultaneously realizes synchronous detection of multi-scale impact pit targets by utilizing different receptive fields, thereby greatly improving the detection performance for small impact pit targets. The invention further provides a grid pattern layer which generates a grid pattern map, with rotation and scale invariance, that integrates the distribution and scale characteristics of the impact pits, so that impact pit identification is realized without constructing a matching feature database. The method has a high detection and identification rate, strong detection capability for small-scale impact pits, small model size and strong robustness, and achieves a state-of-the-art performance level in impact pit detection and identification.
The technical scheme adopted by the invention is as follows: an end-to-end impact pit detection and identification method based on a fully convolutional neural network structure synchronously realizes the detection and identification of impact pits in remote sensing images of celestial bodies. The network established by the method is named CraterIDNet and is composed of two impact pit detection channels and one impact pit identification channel; the network weight parameters consist only of convolutional layers, with no fully connected layer, wherein,
1) establishing an impact pit detection channel for detecting impact pit targets in the image and outputting the centroid coordinates and apparent radius of each impact pit target, wherein the two impact pit detection channels in CraterIDNet are connected to feature maps of different resolutions, so that impact pit targets of different scales can be detected synchronously, and the two detection channels share the first four convolutional layers of the network;
2) establishing an impact pit identification channel for identifying the impact pit targets output by the impact pit detection channels, wherein the identification channel first generates a grid pattern map corresponding to each impact pit target through a grid pattern layer, and then identifies the impact pit through a subsequent classification convolutional neural network, whose classification result corresponds to the impact pit identification result.
In the impact pit detection channel, the candidate frame optimization selection strategy comprises the following steps:
(1) optimizing the candidate frame scales, wherein the detection range of the candidate frames must cover the scale variation range of the impact pit target instances, the detection ranges of candidate frames of adjacent scales overlap each other at a certain overlap rate, and the objective of candidate frame scale optimization is to minimize the number of candidate frame scales while satisfying these conditions; the scale of each optimized candidate frame is then determined by the following quantities: T_IoU, the overlap threshold at which a candidate frame is marked as a positive sample; S_gmin, the minimum impact pit size of the sample set; S_gmax, the maximum impact pit size of the sample set; and λ, the overlap rate of the detectable impact pit scale ranges of candidate frames of two adjacent scales;
(2) optimizing the generation density of the candidate frames based on three parameters, namely the average effective candidate frame number, the average target number per training-set scene, and the density adjustment factor, so as to balance the number of candidate frames of each scale in a training scene, the optimal density adjustment factors of the candidate frames of each scale being obtained by minimizing a penalized squared loss in the following quantities:
the total average effective candidate frame number corresponding to the i-th scale candidate frame, obtained as the product of the average effective candidate frame number and the average target number per training-set scene; N_batch, the number of candidate frames per batch during training; and ω, a penalty coefficient;
(3) adjusting the density of the candidate frames of each scale according to the obtained optimal density adjustment factors: when τ_i > 0, the number of generated candidate frames of the corresponding scale is increased to 2^τ_i times the original number; when τ_i < 0, the number of generated candidate frames of the corresponding scale is reduced to 2^τ_i times (i.e., 1/2^|τ_i| of) the original number; and when τ_i = 0, the original number of candidate frames is maintained.
In the impact pit detection channel, the 3-step interactive training method comprises the following steps:
(1) initializing the network parameters of the impact pit detection channels with the pre-training model as initial values, initializing the newly added convolutional layers using the 'xavier' method, and fine-tuning the network with the training set of impact pit detection channel 2, wherein the learning rate of the convolutional layers unique to impact pit detection channel 1 (conv4_1 to conv4_3) is set to 0 so that they are not trained;
(2) secondly, taking the model trained in the first step as the initial value, fixing the convolutional layers conv5 to conv7 and the convolutional layers unique to impact pit detection channel 2 (conv7_1 to conv7_3), and fine-tuning the network with the training set of impact pit detection channel 1;
(3) thirdly, fixing the shared convolutional layers conv1 to conv4 and the convolutional layers unique to impact pit detection channel 1, and fine-tuning the model trained in the previous step with the training set of impact pit detection channel 2;
after these 3 steps of interactive training, the two impact pit detection channels of CraterIDNet share the convolutional layers.
In the impact pit identification channel, the grid pattern layer operates through the following steps:
(1) first, impact pit size screening is performed: the working orbit height of the remote sensing camera is set as H_min to H_max, and H_ref is the reference orbit height corresponding to the training samples; an impact pit can be used effectively for identification only if it remains detectable by the impact pit detection channel over the whole working orbit height range of the remote sensing camera, so the impact pits whose diameters satisfy the following formula are selected as candidate impact pits:
where D_min denotes the minimum impact pit diameter detectable by the impact pit detection channel and D_max denotes the maximum impact pit diameter detectable by the impact pit detection channel;
(2) main impact pits are selected: at least 3 impact pit targets are usually needed in the field of view to solve for the position of the spacecraft relative to the planetary surface; the 10 candidate impact pit targets closest to the center of the field of view are selected as main impact pits and a grid pattern map is constructed for each of them, and if fewer than 10 candidate impact pit targets exist in the field of view, all candidate impact pit targets are marked as main impact pits;
(3) scale normalization is performed: the distance from each main impact pit to every other candidate impact pit, called the principal distance, is calculated, and the principal distances and the diameters of all impact pits are then unified to a reference scale; with H the orbit height of the remote sensing camera at imaging time, the principal distances and the candidate impact pit diameters are multiplied by the scale transformation factor H_ref/H;
(4) a 17 × 17 grid pattern map is generated with the main impact pit at the center and the direction of the line from the main impact pit to its nearest neighboring impact pit as the positive direction, each grid element having side length L_g; the neighboring impact pits falling into the grid are determined from the normalized principal distances, and when at least one neighboring impact pit falls into a grid element (i, j), that element is in an activated state and its output amplitude equals the cumulative sum of the normalized diameters of the impact pits falling within it, the amplitude of elements containing no impact pit is 0, and finally the normalized diameter of the main impact pit is added to the amplitude of the central element.
In the impact pit identification channel, the impact pit identification training set is constructed through the following steps:
(1) firstly, selecting candidate impact pits in a data set according to an equation (3), and giving a unique number for identification to each impact pit training target;
(2) for each impact pit training target, with the grid element side length L_g, constructing 2000 grid pattern maps using the neighboring impact pit targets in the sample set, and perturbing each impact pit target in each grid pattern map with normally distributed position noise (mean 0, standard deviation 2.5 pixels) and normally distributed apparent diameter noise (mean 1.5 pixels, variance 1.5 pixels);
(3) randomly selecting 400 maps from the 2000 generated grid pattern maps and randomly removing the information of one neighboring impact pit target in each of them, to simulate the case where the impact pit detection channel misses an impact pit;
(4) randomly selecting 700 and 400 grid pattern maps and adding 1 and 2 false impact pit targets to them, respectively, to simulate the case where the impact pit detection channel reports false impact pit targets, the apparent diameters of the added false impact pits being random variables uniformly distributed over [20, 50] pixels and their positions being chosen randomly within the grid pattern map;
(5) randomly selecting 8 groups of 100 grid pattern maps each and removing the impact pit target information in the light gray areas corresponding to the 8 cases shown in fig. 6, to simulate the case where a main impact pit is close to the field-of-view boundary;
(6) randomly selecting 60% of the samples in the finally generated grid pattern map sample set to form the training set, and forming the test set with the remaining samples.
Compared with the prior art, the invention has the advantages that:
the invention provides an end-to-end full convolution neural network model-CraterIDNet for synchronously realizing collision pit detection and identification. The network model has the advantages of high detection and identification rate, strong detection capability on small-scale impact pits, small occupied space and strong robustness. A candidate frame scale optimization and density adjustment mechanism is provided for the collision pit detection channel, optimal candidate frame selection is achieved, detection performance of small collision pit targets is greatly improved, and meanwhile, synchronous detection of multi-scale collision pit targets is achieved by utilizing different receptive fields, so that the network has the capability of detecting large-scale range change collision pit targets. Aiming at the collision pit recognition channel, a grid pattern layer is provided to generate a grid pattern diagram with rotation and scale invariance to realize collision pit recognition, and a matching feature database does not need to be constructed. The grid pattern diagram integrates the distribution characteristics and the scale characteristics of the impact pits, and the identification robustness of the method is enhanced. The invention provides a solution for realizing the detection and identification of the impact pit by using an independent network model, and simultaneously achieves the advanced performance level of the detection and identification of the impact pit.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
1) Network architecture
The network system provided by the invention is called CraterIDNet and is an end-to-end fully convolutional neural network model. The whole system is a single, unified impact pit detection and identification network. The network structure is shown in fig. 1:
CraterIDNet receives an input remote sensing image of any resolution and outputs the positions and diameters of the detected impact pits together with the numbers of the identified impact pits. The network comprises two main parts, the impact pit detection channels and the impact pit identification channel. The whole system adopts a fully convolutional architecture with no fully connected layer, which greatly reduces the network size. In order to further reduce the network size while maintaining the detection and identification performance, the invention first trains a small-scale pre-training network to initialize the convolutional feature layers conv1 to conv7. The pre-training network uses remote sensing images of Mars impact pits of different scales, under different illumination conditions, of different morphologies and distributed over different areas as training samples, so as to broaden the application range and generalization capability of the network. After CraterIDNet receives the input image, the feature maps of convolutional layers conv4 and conv7 are fed into the two impact pit detection channels to achieve synchronous detection of impact pit targets on feature maps of different scales. Each impact pit detection channel is composed of 3 convolutional layers and a target generation layer; the additional convolutional layers generate candidate frames of specific shapes on the feature map, and the position and apparent diameter of each impact pit are detected by classifying the content of each candidate frame and regressing the offset and scaling of the output target relative to the candidate frame. The detection results output by the two detection channels are fused in the target generation layer and input into the impact pit identification channel, which is composed of a grid pattern layer and 4 convolutional layers. The grid pattern layer uses the information from the preceding layers to generate a grid pattern map, with rotation and scale invariance, corresponding to each impact pit to be identified. The subsequent convolutional layers are trained to classify the grid pattern map, and the classification corresponds to the impact pit identification result.
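For a concrete view of the data flow just described, a minimal PyTorch-style sketch is given below. It is an illustration only: the channel widths, the anchor counts per head, the placement of the pooling layers and the class name CraterIDNetSketch are assumptions of this sketch and are not specified values of the invention; only the overall topology (a shared conv1–conv4 stage, a conv5–conv7 stage, and two detection heads built from 3 × 3 and 1 × 1 convolutions) follows the description above.

```python
# Illustrative sketch only: layer widths and anchor counts are assumptions.
import torch
import torch.nn as nn

def conv_block(cin, cout, k=3):
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=k // 2), nn.ReLU(inplace=True))

class DetectionHead(nn.Module):
    """Candidate frame head: a 3x3 conv (conv*_1) followed by 1x1 classification
    (conv*_2) and regression (conv*_3) convolutions, as described above."""
    def __init__(self, cin, n_anchors):
        super().__init__()
        self.inter = conv_block(cin, 256, k=3)
        self.cls = nn.Conv2d(256, 2 * n_anchors, 1)   # impact pit / background per anchor
        self.reg = nn.Conv2d(256, 3 * n_anchors, 1)   # dx, dy, diameter ratio per anchor
    def forward(self, x):
        x = self.inter(x)
        return self.cls(x), self.reg(x)

class CraterIDNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared backbone; the small-scale channel taps the conv4 feature map,
        # the large-scale channel taps conv7.
        self.conv1_4 = nn.Sequential(conv_block(1, 32, 5), nn.MaxPool2d(2),
                                     conv_block(32, 64, 5), nn.MaxPool2d(2),
                                     conv_block(64, 96), conv_block(96, 96))
        self.conv5_7 = nn.Sequential(nn.MaxPool2d(2), conv_block(96, 128),
                                     conv_block(128, 128), conv_block(128, 128))
        self.det_small = DetectionHead(96, n_anchors=2)    # candidate frame scales 15, 27
        self.det_large = DetectionHead(128, n_anchors=4)   # candidate frame scales 49, 89, 143, 255
    def forward(self, img):
        f4 = self.conv1_4(img)
        f7 = self.conv5_7(f4)
        # The target generation layer (decoding + non-maximum suppression) and the grid
        # pattern layer / identification convolutions would consume these raw outputs.
        return self.det_small(f4), self.det_large(f7)
```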
The whole CraterIDNet realizes end-to-end impact pit detection and identification through fully convolutional layers and the additional functional layers, and the size of the final model is only about 4 MB.
2) Pre-training model construction
① Pre-training model structure
The pre-training network is structured as a classical binary classification convolutional neural network. The input image passes through a series of convolutional layers: the shallow convolutional layers use 5 × 5 filters, while the deep convolutional layers use 3 × 3 filters with the stride fixed at 1 pixel and a 1-pixel boundary padding so that the feature map resolution is preserved after convolution. The network uses 3 max pooling layers for spatial pooling. Two fully connected layers follow the convolutional layers; the first has 256 channels and the second has 2 channels, corresponding to the final classes of the pre-training network: impact pit target or non-impact pit target. The last layer is a softmax layer whose output is expressed as a probability distribution over the impact pit and non-impact pit classes. The structure of the pre-training network is shown in fig. 2.
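A possible PyTorch rendering of this pre-training classifier is sketched below for clarity. The filter sizes (5 × 5 shallow, 3 × 3 deep), the three pooling stages, the 256-channel and 2-channel fully connected layers and the dropout ratio follow this section and the training description below, while the number of convolutional layers per stage and the channel widths are assumptions of the sketch.

```python
# Hedged sketch of the pre-training network; channel widths are assumptions.
import torch.nn as nn

pretrain_net = nn.Sequential(
    nn.Conv2d(1, 32, 5, padding=2), nn.ReLU(inplace=True),    # shallow 5x5 convolutions
    nn.MaxPool2d(2),                                           # pooling stage 1
    nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(2),                                           # pooling stage 2
    nn.Conv2d(64, 96, 3, padding=1), nn.ReLU(inplace=True),    # deep 3x3 convolutions,
    nn.Conv2d(96, 96, 3, padding=1), nn.ReLU(inplace=True),    # stride 1, padding 1
    nn.MaxPool2d(2),                                           # pooling stage 3
    nn.Conv2d(96, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Flatten(),                                              # 125x125 input -> 15x15 map here
    nn.Linear(128 * 15 * 15, 256), nn.ReLU(inplace=True),      # first fully connected layer
    nn.Dropout(0.5),
    nn.Linear(256, 2),                                         # impact pit / non-impact pit
    # the softmax is applied inside the cross-entropy loss during training
)
```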
② Training of the pre-training model
Based on high-resolution panchromatic images acquired by the European Space Agency's Mars Express orbiter and the Robbins crater database, 1600 impact pit targets of different scales and under different illumination conditions were manually selected as the initial positive sample set. These samples were then augmented by random rotation, random translation and random scaling to increase sample diversity, finally yielding a positive sample set of 8000 samples. Meanwhile, 8000 image regions with other topographic features (such as plains, canyons and riverbeds) were randomly selected in the chosen scenes as negative samples. From these 16000 samples, 70% were randomly selected as training samples and 30% as test samples, with a positive-to-negative sample ratio of 1:1.
Before being fed to the network for training, the sample images are scaled to a fixed size of 125 × 125 pixels. The pre-training network is trained using mini-batch gradient descent with momentum, with the batch size set to 128 and the momentum factor to 0.9. The network is regularized during training using weight decay and dropout; the L2 penalty factor is set to 0.0005 and the dropout ratio to 0.5. The network weights are initialized using the 'xavier' method and the biases are initialized to a constant 0.1. The network is trained for 40 epochs starting from a base learning rate of 0.01, with the learning rate gradually reduced by a decay factor of 0.5 every 500 iterations.
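These solver settings can be expressed, for example, as follows; this is a sketch under the stated hyperparameters, and the helper name make_pretrain_solver as well as the per-iteration scheduling are choices of the illustration.

```python
import torch
import torch.nn as nn

def make_pretrain_solver(model: nn.Module):
    """SGD with momentum 0.9, L2 penalty 0.0005 and step decay x0.5 every 500 iterations."""
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=0.0005)
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=500, gamma=0.5)
    return opt, sched, nn.CrossEntropyLoss()   # softmax + log loss over the two classes

# Training skeleton (batch size 128, 40 epochs over 125x125 samples):
# opt, sched, criterion = make_pretrain_solver(pretrain_net)
# for epoch in range(40):
#     for images, labels in loader:            # labels: 1 = impact pit, 0 = background
#         opt.zero_grad()
#         criterion(pretrain_net(images), labels).backward()
#         opt.step()
#         sched.step()                         # decay counted per iteration, as above
```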
3) Impact pit detection channel construction
The invention constructs an efficient impact pit detection channel through the following three aspects: the detection channel network structure, a candidate frame optimization selection strategy, and a 3-step interactive training method.
① Impact pit detection channel network structure
CraterIDNet connects two impact pit detection channels after convolutional layers conv4 and conv7, respectively, and the two detection channels share the lower convolutional layers of the network. The convolution kernel size of the candidate frame generation convolutional layers (conv4_1, conv7_1) is 3 × 3 pixels, and n candidate frames are generated at each position of the feature map as the convolution kernel slides. The output feature maps of the candidate frame generation layers are further fed into 1 × 1 classification convolutional layers (conv4_2, conv7_2) and regression convolutional layers (conv4_3, conv7_3). Each candidate frame corresponds to 2 outputs of the classification convolutional layer, i.e. the probabilities of belonging to an impact pit and to the background, and 3 outputs of the regression convolutional layer, namely the horizontal and vertical offsets of the predicted impact pit centroid relative to the candidate frame center and the proportionality coefficient of the predicted impact pit diameter relative to the candidate frame width. The receptive fields of the output feature maps of convolutional layers conv4 and conv7 are 37 pixels and 101 pixels, respectively; detection of small-scale impact pits is realized by the detection channel connected to conv4, and detection of large-scale impact pits by the detection channel connected to conv7. This multi-scale detection structure, which detects targets of different scale ranges on feature maps of different resolutions, effectively addresses the detection of multi-scale impact pit targets. Finally, the classification and regression parameters output by the preceding layers are integrated in the target generation layer: candidate frames whose impact pit probability exceeds a given threshold are judged to be candidate impact pit targets, and the regressed positions and diameters of these candidate frames are computed at the resolution of the original image. The detected impact pit positions and diameters are output after non-maximum suppression.
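The target generation step can be illustrated with the sketch below. The exact regression parameterization is not reproduced in this description, so the decoding assumes that the offsets are expressed in units of the candidate frame side length and that the diameter equals the regressed ratio times that side length; decode_detections and its arguments are hypothetical names introduced here for illustration.

```python
import numpy as np

def decode_detections(scores, offsets, anchors, score_thresh=0.5):
    """Turn per-candidate-frame outputs into impact pit centers and diameters.

    scores  : (N,) probability that a candidate frame contains an impact pit
    offsets : (N, 3) regression outputs (dx, dy, s) for each candidate frame
    anchors : (N, 3) candidate frames (cx, cy, side) at original-image resolution
    """
    keep = scores > score_thresh                                  # threshold on the impact pit probability
    cx = anchors[keep, 0] + offsets[keep, 0] * anchors[keep, 2]   # assumed offset parameterization
    cy = anchors[keep, 1] + offsets[keep, 1] * anchors[keep, 2]
    diam = offsets[keep, 2] * anchors[keep, 2]                    # diameter ratio times frame width
    dets = np.stack([cx, cy, diam, scores[keep]], axis=1)
    # Non-maximum suppression over (cx, cy, diam) would follow before the fused
    # detections of both channels are passed to the grid pattern layer.
    return dets[np.argsort(-dets[:, 3])]
```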
② Candidate frame optimization selection strategy
Since the impact pit target is approximately circular, the present invention selects square candidate frames as the default candidate frame shape. The relation that the candidate frame size and the impact pit labeling box size should satisfy, together with the limiting impact pit detection size of the network, is expressed in terms of the following quantities: S_g, the side length of the square impact pit labeling box; S_a, the side length of the square candidate frame; T_IoU, the overlap threshold at which a candidate frame is marked as a positive sample; and d, the interval on the original image between the candidate frames corresponding to two adjacent feature points on the feature map. Let the side length of the smallest optimized candidate frame scale be S_a1; S_a1 must then satisfy this relation with respect to S_gmin, the minimum impact pit size of the sample set. Since no target smaller than S_gmin exists, the larger admissible value of S_a1 within this range is preferred. With this setting, the target scale range detectable by this candidate frame scale is [(1-λ)S_gmin, (1-λ)S_gmin/T_IoU]. In order to ensure that the candidate frames can effectively detect targets of all scales in the data set, λ is set as the overlap rate of the detectable impact pit scale ranges of candidate frames of two adjacent scales, from which the detectable target range corresponding to the n-th candidate frame scale follows. The upper bound of S_gn must be larger than the maximum target labeling box size, which yields the condition that the number n of selected candidate frame scales must satisfy, and from this the candidate frame optimization criterion is obtained.
according to the method, the optimal candidate frame sizes of 15, 27, 49, 89, 143 and 255 pixels are finally obtained according to the target characteristics of the training set. Candidate boxes with dimensions 15 and 27 are associated with the small-scale impact pit detection channel, and candidate boxes with dimensions 49, 89, 143, and 255 are associated with the large-scale impact pit detection channel.
Considering the severe imbalance of the scale distribution of the impact pit data set, and in order to ensure that candidate frames of all scales obtain training opportunities with approximately equal probability, the invention provides a candidate frame density adjustment mechanism. An optimization equation for candidate frame density adjustment is formulated based on three parameters, namely the average effective candidate frame number, the average target number per training-set scene, and the density adjustment factor, so as to balance the number of candidate frames of each scale in a training scene and allow the network to achieve a consistent training effect on impact pit targets of all scales.
As shown in FIG. 3, B_g denotes the impact pit bounding box, B_a denotes a candidate frame, and S_g and S_a denote the sizes of B_g and B_a, respectively. Let (x_a, y_a) denote the center coordinates of B_a, with the center of B_g located at the coordinate origin. When the center of the candidate frame lies within the blue effective area, the current candidate frame can be marked as a positive sample. The valid candidate frames herein satisfy:
IoU(B_g, B_a) ≥ T_IoU    (7)
The number of candidate frames satisfying equation (7), ignoring effects across the image boundary, is defined as the effective candidate frame number. In the present invention, the effective candidate frame number for the case S_g = S_a is taken as an approximation of the average effective candidate frame number over targets within the detectable scale range of that candidate frame scale. According to the definition of the overlap degree, when equation (7) is satisfied the candidate frame center coordinates must satisfy a corresponding discriminant inequality, and the average effective candidate frame number is approximated as the ratio of the effective area to the square of the candidate frame interval, where d_a denotes the interval of the candidate frames at the resolution of the original image. Equations (8) and (9) then follow.
The total average effective candidate frame number of a candidate frame scale in a training-set scene is defined as the product of its average effective candidate frame number and the average number of impact pit targets per training-set scene, the latter being obtained statistically from the training set. The purpose of candidate frame density adjustment is to balance the total average effective candidate frame numbers of the candidate frames of all scales, so that candidate frames of every scale obtain training opportunities with approximately equal probability. Therefore, the invention takes the goal of candidate frame density adjustment to be the minimization of a squared loss function of the adjusted total average effective candidate frame numbers, where τ_i denotes the density adjustment factor of the candidate frames of the i-th scale (an integer) and n is the number of candidate frame scales. Furthermore, the number of candidate frames generated by the network should be neither too large nor too small: too many candidate frames greatly increase the training time, while too few lead to insufficient training. Therefore, a penalty term based on the total effective candidate frame number is introduced, and the final optimization objective function is obtained as follows:
where N_batch is the number of candidate frames per batch during training and ω is a penalty coefficient, taken as 0.02 in the present invention. The penalty term ensures that, in addition to minimizing the empirical loss, the structural risk is also kept small. During training, a total of N_batch candidate frames are randomly selected in each iteration, half of them positive samples and half negative samples. In order to ensure that the network generates enough candidate frames for training while leaving room for random selection during training, N_batch is taken as the desired total number of valid candidate frames generated by the network. The optimized density adjustment factors τ_i are obtained from equation (13): when τ_i > 0, the number of generated candidate frames of the corresponding scale is increased to 2^τ_i times the original number; when τ_i < 0, it is reduced to 2^τ_i times (i.e. 1/2^|τ_i| of) the original number; and when τ_i = 0, the original number of candidate frames is maintained. The result of candidate frame density adjustment is shown in FIG. 4: FIG. 4(a) shows that when τ_i = -1 the candidate frame density is halved, FIG. 4(b) shows that when τ_i = 0 the density is unchanged (d_a denotes the candidate frame interval), and FIG. 4(c) shows that when τ_i = 1 the density is doubled.
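The density adjustment search can be sketched as a brute-force minimization over integer factors. The objective used below, namely the spread of the adjusted per-scale counts around their mean plus ω times the squared deviation of their total from N_batch, is a plausible reading of the description above rather than the exact equation (13), and the numbers in the example call are purely illustrative.

```python
from itertools import product

def adjust_density(n_bar, n_batch, omega=0.02, tau_values=(-2, -1, 0, 1, 2)):
    """Brute-force search for integer density adjustment factors tau_i.

    n_bar[i] is the total average effective candidate frame number of scale i;
    multiplying it by 2**tau_i models halving / doubling the candidate frame density.
    """
    best, best_cost = None, float("inf")
    for taus in product(tau_values, repeat=len(n_bar)):
        adj = [n * 2.0 ** t for n, t in zip(n_bar, taus)]
        mean = sum(adj) / len(adj)
        cost = sum((a - mean) ** 2 for a in adj) + omega * (sum(adj) - n_batch) ** 2
        if cost < best_cost:
            best_cost, best = cost, taus
    return best

# Illustrative numbers: small scales are over-represented, large scales are rare.
print(adjust_density(n_bar=[120.0, 60.0, 25.0, 10.0, 4.0, 1.5], n_batch=256))
```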
③ 3-step interactive training method
The impact pit detection channels are trained end-to-end using stochastic gradient descent with momentum. In each iteration, N_batch = 256 candidate frames are randomly selected from the image to compute the batch loss function, with a 1:1 ratio of positive to negative samples; if fewer than 128 positive samples are available, the batch is filled with negative samples. Since the two impact pit detection channels share convolutional layers conv1 to conv4, a training method is needed that allows the two detection channels to share these convolutional layers, rather than training two independent detection channels. The invention provides a 3-step interactive training method, the specific steps of which are as follows:
(1) In the first step, the network parameters of the impact pit detection channels are initialized with the pre-training model as initial values, the newly added convolutional layers are initialized using the 'xavier' method, and the biases are initialized to a constant 0. The network is fine-tuned with the training set of impact pit detection channel 2, using a base learning rate of 0.005, a momentum coefficient of 0.9 and a weight decay coefficient of 0.0005. The learning rate of the convolutional layers unique to impact pit detection channel 1 (conv4_1 to conv4_3) is set to 0, so they are not trained. 50 epochs of training are performed, with the learning rate gradually reduced by a decay factor of 0.8 every 10000 iterations.
(2) In the second step, the network takes the model trained in the first step as its initial value, fixes the convolutional layers conv5 to conv7 and the convolutional layers unique to impact pit detection channel 2 (conv7_1 to conv7_3), and is fine-tuned with the training set of impact pit detection channel 1. The base learning rate is set to 0.001, the momentum coefficient to 0.9 and the weight decay coefficient to 0.0005. The network is trained for 30 epochs, with the learning rate gradually reduced by a decay factor of 0.6 every 20000 iterations.
(3) In the third step, the shared convolutional layers conv1 to conv4 and the convolutional layers unique to impact pit detection channel 1 are fixed, and the model trained in the previous step is fine-tuned with the training set of impact pit detection channel 2. The base learning rate is set to 0.0002, the momentum coefficient to 0.9 and the weight decay coefficient to 0.0005. The network is trained for 30 epochs, with the learning rate gradually reduced by a decay factor of 0.8 every 15000 iterations.
After these 3 steps of interactive training, the two impact pit detection channels of CraterIDNet share the convolutional layers.
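In a modern training framework the three steps amount to alternately freezing groups of layers. The sketch below uses the hypothetical attribute names of the earlier CraterIDNetSketch illustration (conv1_4, conv5_7, det_small, det_large); freezing parameters is used here only to emulate the "learning rate set to 0" and "fixed layers" settings described above.

```python
import torch.nn as nn

def set_trainable(module: nn.Module, trainable: bool):
    """Freeze or unfreeze a group of layers; frozen layers keep their current weights."""
    for p in module.parameters():
        p.requires_grad = trainable

# Step 1: initialize from the pre-training model, train with the channel-2 set,
#         keep the layers unique to detection channel 1 (conv4_1-conv4_3) frozen:
#   set_trainable(net.det_small, False)
# Step 2: fix conv5-conv7 and the channel-2 head (conv7_1-conv7_3),
#         fine-tune with the channel-1 training set:
#   set_trainable(net.conv5_7, False); set_trainable(net.det_large, False)
#   set_trainable(net.det_small, True)
# Step 3: fix the shared conv1-conv4 and the channel-1 head,
#         fine-tune the channel-2 layers with its training set:
#   set_trainable(net.conv1_4, False); set_trainable(net.det_small, False)
#   set_trainable(net.conv5_7, True);  set_trainable(net.det_large, True)
```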
4) Impact pit identification channel construction
The invention constructs an efficient impact pit identification channel through the following two aspects: the grid pattern layer and the impact pit identification channel training method.
① Grid pattern layer
The grid pattern layer takes as input the impact pit feature information (positions and diameters) detected by the impact pit detection channels and outputs, for each impact pit target, a grid pattern map with rotation and scale invariance. The invention fuses the impact pit distribution and scale feature information into a grid pattern and renders this information as an image. The grid pattern map is then classified by the subsequent convolutional network to complete the identification process.
The grid pattern layer operates through the following steps:
(1) Impact pit size screening is performed first. The working orbit height of the remote sensing camera is set as H_min to H_max, and H_ref is the reference orbit height corresponding to the training samples. An impact pit can be used effectively for identification only if it remains detectable by the impact pit detection channel over the whole working orbit height range of the remote sensing camera, so the impact pits whose diameters satisfy the following formula are selected as candidate impact pits:
where D_min denotes the minimum impact pit diameter detectable by the impact pit detection channel and D_max denotes the maximum impact pit diameter detectable by the impact pit detection channel.
(2) Main impact pits are selected. Typically at least 3 impact pit targets are required in the field of view to solve for the position of the spacecraft relative to the planetary surface. In the invention, the 10 candidate impact pit targets closest to the center of the field of view are selected as main impact pits and a grid pattern map is constructed for each of them; if fewer than 10 candidate impact pit targets exist in the field of view, all candidate impact pit targets are marked as main impact pits.
(3) Scale normalization. For each main impact pit, its distance to each of the other candidate impact pits, called the principal distance, is calculated. The principal distances and the diameters of all impact pits are then unified to a reference scale: with H the orbit height of the remote sensing camera at imaging time, the principal distances and the candidate impact pit diameters are multiplied by the scale transformation factor H_ref/H.
(4) Grid pattern map generation. A 17 × 17 grid pattern map is established with the main impact pit at the center and the direction of the line from the main impact pit to its nearest neighboring impact pit as the positive direction, each grid element having side length L_g. The neighboring impact pits falling into the grid are determined from the normalized principal distances; when at least one neighboring impact pit falls into a grid element (i, j), that element is in an activated state and its output amplitude equals the sum of the normalized diameters of the impact pits falling within it, while the amplitude of elements containing no impact pit is 0. Finally, the normalized diameter of the main impact pit is added to the amplitude of the central grid element.
The generation process of the grid pattern map is shown in fig. 5, where the dark gray circle represents the main impact pit and the light gray circles represent the other candidate impact pits. Fig. 5(b) shows the result after scale normalization, with the arrow indicating the direction of the line from the main impact pit to its nearest neighboring impact pit. Fig. 5(c) shows the grid created along this positive direction, and fig. 5(d) shows the final grid pattern map. The grid pattern map output by the grid pattern layer fuses the distribution and scale characteristics of the impact pit targets and turns both characteristics into an image. The position and scale detection errors produced by the impact pit detection channel are thereby converted into position noise and gray-scale noise in this image, and the subsequent convolutional processing of the grid features further enhances the robustness of the identification result to impact pit position and scale errors.
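The grid pattern map construction for one main impact pit can be sketched as follows. The positive-direction convention is realized here by rotating the neighbor offsets so that the nearest neighbor lies on the +x axis, which is this sketch's reading of the description above; the function name and the default L_g = 24 pixels (taken from the training procedure below) are otherwise assumptions of the illustration.

```python
import numpy as np

def grid_pattern_map(main, neighbors, h, h_ref, l_g=24.0, size=17):
    """Build the 17x17 grid pattern map for one main impact pit.

    main      : (x, y, d) of the main impact pit, in pixels
    neighbors : (M, 3) array of (x, y, d) for the other candidate impact pits
    h, h_ref  : imaging orbit height and reference orbit height
    """
    s = h_ref / h                                   # scale normalization factor
    mx, my, md = main
    gpm = np.zeros((size, size))
    c = size // 2
    gpm[c, c] += md * s                             # main impact pit into the central cell
    if len(neighbors) == 0:
        return gpm
    off = (neighbors[:, :2] - np.array([mx, my])) * s       # normalized principal offsets
    diam = neighbors[:, 2] * s                              # normalized diameters
    nearest = off[np.argmin(np.linalg.norm(off, axis=1))]
    theta = np.arctan2(nearest[1], nearest[0])              # angle to the nearest neighbor
    rot = np.array([[np.cos(theta),  np.sin(theta)],        # rotation by -theta puts the
                    [-np.sin(theta), np.cos(theta)]])       # nearest neighbor on the +x axis
    off = off @ rot.T
    for (dx, dy), dd in zip(off, diam):
        i, j = c + int(round(dy / l_g)), c + int(round(dx / l_g))
        if 0 <= i < size and 0 <= j < size:
            gpm[i, j] += dd                                 # accumulate normalized diameters
    return gpm
```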
② Impact pit identification channel training method
First, candidate impact pits in the data set are selected according to equation (14), and the impact pit identification channel training set is then constructed through the following steps:
(1) each impact pit training target is given a unique number for identification.
(2) For each impact pit training target, with the grid element side length L_g taken as 24 pixels, 2000 grid pattern maps were constructed using the neighboring impact pit targets in the sample set. Each impact pit target in each grid pattern map was perturbed with normally distributed position noise (mean 0, standard deviation 2.5 pixels) and normally distributed apparent diameter noise (mean 1.5 pixels, variance 1.5 pixels).
(3) 400 maps were randomly selected from the 2000 generated grid pattern maps, and the information of one neighboring impact pit target in each of them was randomly removed to simulate the case where the impact pit detection channel misses an impact pit.
(4) 700 and 400 grid pattern maps were randomly selected and 1 and 2 false impact pit targets were added to them, respectively, to simulate the case where the impact pit detection channel reports false impact pit targets. The apparent diameters of the added false impact pits are random variables uniformly distributed over [20, 50] pixels, and their positions are chosen randomly within the grid pattern map.
(5) 8 groups of 100 grid pattern maps each were randomly selected, and the impact pit target information in the light gray areas corresponding to the 8 cases shown in fig. 6 was removed, to simulate the case where a main impact pit is close to the field-of-view boundary.
(6) 60% of the samples in the finally generated grid pattern map sample set were randomly selected to form the training set, and the remaining samples form the test set.
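For illustration, one pass of the noise and perturbation scheme above could look like the sketch below. The noise parameters follow the description; the per-sample drop and false-detection rates are simplified stand-ins for the exact 400 / 700 / 400-out-of-2000 splits, and the coordinates are assumed to be in the main-crater-centered frame used by the grid pattern layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_neighbors(neighbors, p_drop=0.2, max_false=2, l_g=24.0):
    """One augmentation pass over the (x, y, d) neighbors of a training target."""
    aug = np.asarray(neighbors, dtype=float).copy()
    aug[:, :2] += rng.normal(0.0, 2.5, size=aug[:, :2].shape)      # position noise N(0, 2.5 px)
    aug[:, 2] += rng.normal(1.5, np.sqrt(1.5), size=len(aug))      # diameter noise, mean 1.5, var 1.5
    if len(aug) > 1 and rng.random() < p_drop:                     # simulate a missed detection
        aug = np.delete(aug, rng.integers(len(aug)), axis=0)
    for _ in range(rng.integers(0, max_false + 1)):                # simulate 0-2 false detections
        x, y = rng.uniform(-8.0, 8.0, size=2) * l_g                # anywhere inside the 17x17 grid
        aug = np.vstack([aug, [x, y, rng.uniform(20.0, 50.0)]])    # diameter ~ U[20, 50] px
    return aug
```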
The impact pit identification channel is trained using mini-batch gradient descent with momentum. The batch size is set to 512, the momentum coefficient to 0.9 and the weight decay coefficient to 0.0005. The network weights of convolutional layers conv8 to conv11 are initialized using the 'xavier' method, with the biases initialized to a constant 0.1. The impact pit identification channel is trained for 30 epochs starting from a base learning rate of 0.01, with the learning rate gradually reduced by a decay factor of 0.5 every 10000 iterations. The identification accuracy of the impact pit identification channel on the test set finally reaches 99.22%, showing that it has high identification accuracy and strong generalization capability.
The invention provides an end-to-end impact pit detection and identification method based on a fully convolutional neural network structure. It should be noted that the above embodiments are only preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and scope of the present invention is included in the protection scope of the present invention.