Disclosure of Invention
In order to realize high-efficiency, high-precision synchronous detection and identification of impact pit targets, the invention provides an end-to-end impact pit detection and identification model based on a fully convolutional neural network structure, named CraterIDNet. The network input is a remote sensing image of any resolution, and the output is the detected position, diameter and identification result of each impact pit. The network consists of two parts, namely an impact pit detection channel and an impact pit identification channel. The invention provides a pre-training network model with strong generalization capability, on the basis of which transfer learning is carried out on CraterIDNet. The invention provides a candidate frame scale optimization and density adjustment strategy aimed at the target characteristics of impact pits, and simultaneously realizes synchronous detection of multi-scale impact pit targets by utilizing different receptive fields, thereby greatly improving the detection performance for small impact pit targets. The invention further provides a grid pattern layer which generates a grid pattern map, with rotation and scale invariance, that integrates the distribution and scale characteristics of the impact pits, so that impact pit identification is realized without constructing a matching feature database. The method has a high detection and identification rate, strong detection capability for small-scale impact pits, small model size and strong robustness, and achieves a state-of-the-art performance level in impact pit detection and identification.
The technical scheme adopted by the invention is as follows: an end-to-end impact pit detection and identification method based on a fully convolutional neural network structure synchronously realizes the detection and identification of impact pits in remote sensing images of celestial bodies. The network established by the method is named CraterIDNet and is composed of two impact pit detection channels and one impact pit identification channel; the network weight parameters consist only of convolutional layers, with no fully connected layer, wherein,
1) establishing an impact pit detection channel for detecting impact pit targets in the image and outputting the centroid coordinates and apparent radius of each impact pit target, wherein the two impact pit detection channels in CraterIDNet are connected to feature maps of different resolutions, so that impact pit targets of different scales can be detected synchronously, and the two detection channels share the first four convolutional layers of the network;
2) establishing an impact pit identification channel for identifying the impact pit targets output by the impact pit detection channels, wherein the identification channel first generates a grid pattern map corresponding to each impact pit target through a grid pattern layer, and then identifies the impact pit through a subsequent classification convolutional neural network, whose classification result corresponds to the impact pit identification result.
In the impact pit detection channel, the candidate frame optimization selection strategy comprises the following steps:
(1) optimizing the candidate frame scales, wherein the detection range of the candidate frames must cover the scale variation range of the impact pit target instances, the detection ranges of candidate frames of adjacent scales overlap each other at a certain overlap rate, and the objective of candidate frame scale optimization is to minimize the number of candidate frame scales while satisfying these conditions; the scale of each optimized candidate frame is then determined by the following quantities: T_IoU, the overlap threshold at which a candidate frame is marked as a positive sample; S_gmin, the minimum impact pit size of the sample set; S_gmax, the maximum impact pit size of the sample set; and λ, the overlap rate of the detectable impact pit scale ranges of candidate frames of two adjacent scales;
(2) optimizing the generation density of the candidate frames based on three parameters, namely the average effective candidate frame number, the average target number per training-set scene, and the density adjustment factor, so as to balance the number of candidate frames of each scale in a training scene, the optimal density adjustment factors of the candidate frames of each scale being obtained by minimizing a penalized squared loss in the following quantities:
the total average effective candidate frame number corresponding to the i-th scale candidate frame, obtained as the product of the average effective candidate frame number and the average target number per training-set scene; N_batch, the number of candidate frames per batch during training; and ω, a penalty coefficient;
(3) adjusting the density of the candidate frames of each scale according to the obtained optimal density adjustment factors: when τ_i > 0, the number of generated candidate frames of the corresponding scale is increased to 2^τ_i times the original number; when τ_i < 0, the number of generated candidate frames of the corresponding scale is reduced to 2^τ_i times (i.e., 1/2^|τ_i| of) the original number; and when τ_i = 0, the original number of candidate frames is maintained.
In the impact pit detection channel, the 3-step interactive training method comprises the following steps:
(1) initializing the network parameters of the impact pit detection channels with the pre-training model as initial values, initializing the newly added convolutional layers using the 'xavier' method, and fine-tuning the network with the training set of impact pit detection channel 2, wherein the learning rate of the convolutional layers unique to impact pit detection channel 1 (conv4_1 to conv4_3) is set to 0 so that they are not trained;
(2) secondly, taking the model trained in the first step as the initial value, fixing the convolutional layers conv5 to conv7 and the convolutional layers unique to impact pit detection channel 2 (conv7_1 to conv7_3), and fine-tuning the network with the training set of impact pit detection channel 1;
(3) thirdly, fixing the shared convolutional layers conv1 to conv4 and the convolutional layers unique to impact pit detection channel 1, and fine-tuning the model trained in the previous step with the training set of impact pit detection channel 2;
after these 3 steps of interactive training, the two impact pit detection channels of CraterIDNet share the convolutional layers.
In the impact pit identification channel, the grid pattern layer operates through the following steps:
(1) first, impact pit size screening is performed: the working orbit height of the remote sensing camera is set as H_min to H_max, and H_ref is the reference orbit height corresponding to the training samples; an impact pit can be used effectively for identification only if it remains detectable by the impact pit detection channel over the whole working orbit height range of the remote sensing camera, so the impact pits whose diameters satisfy the following formula are selected as candidate impact pits:
where D_min denotes the minimum impact pit diameter detectable by the impact pit detection channel and D_max denotes the maximum impact pit diameter detectable by the impact pit detection channel;
(2) main impact pits are selected: at least 3 impact pit targets are usually needed in the field of view to solve for the position of the spacecraft relative to the planetary surface; the 10 candidate impact pit targets closest to the center of the field of view are selected as main impact pits and a grid pattern map is constructed for each of them, and if fewer than 10 candidate impact pit targets exist in the field of view, all candidate impact pit targets are marked as main impact pits;
(3) scale normalization is performed: the distance from each main impact pit to every other candidate impact pit, called the principal distance, is calculated, and the principal distances and the diameters of all impact pits are then unified to a reference scale; with H the orbit height of the remote sensing camera at imaging time, the principal distances and the candidate impact pit diameters are multiplied by the scale transformation factor H_ref/H;
(4) a 17 × 17 grid pattern map is generated with the main impact pit at the center and the direction of the line from the main impact pit to its nearest neighboring impact pit as the positive direction, each grid element having side length L_g; the neighboring impact pits falling into the grid are determined from the normalized principal distances, and when at least one neighboring impact pit falls into a grid element (i, j), that element is in an activated state and its output amplitude equals the cumulative sum of the normalized diameters of the impact pits falling within it, the amplitude of elements containing no impact pit is 0, and finally the normalized diameter of the main impact pit is added to the amplitude of the central element.
In the impact pit identification channel, the impact pit identification training set is constructed through the following steps:
(1) firstly, selecting candidate impact pits in a data set according to an equation (3), and giving a unique number for identification to each impact pit training target;
(2) for each impact pit training target, with the grid element side length L_g, constructing 2000 grid pattern maps using the neighboring impact pit targets in the sample set, and perturbing each impact pit target in each grid pattern map with normally distributed position noise (mean 0, standard deviation 2.5 pixels) and normally distributed apparent diameter noise (mean 1.5 pixels, variance 1.5 pixels);
(3) randomly selecting 400 maps from the 2000 generated grid pattern maps and randomly removing the information of one neighboring impact pit target in each of them, to simulate the case where the impact pit detection channel misses an impact pit;
(4) randomly selecting 700 and 400 grid pattern maps and adding 1 and 2 false impact pit targets to them, respectively, to simulate the case where the impact pit detection channel reports false impact pit targets, the apparent diameters of the added false impact pits being random variables uniformly distributed over [20, 50] pixels and their positions being chosen randomly within the grid pattern map;
(5) randomly selecting 8 groups of 100 grid pattern maps each and removing the impact pit target information in the light gray areas corresponding to the 8 cases shown in fig. 6, to simulate the case where a main impact pit is close to the field-of-view boundary;
(6) randomly selecting 60% of the samples in the finally generated grid pattern map sample set to form the training set, and forming the test set with the remaining samples.
Compared with the prior art, the invention has the advantages that:
the invention provides an end-to-end full convolution neural network model-CraterIDNet for synchronously realizing collision pit detection and identification. The network model has the advantages of high detection and identification rate, strong detection capability on small-scale impact pits, small occupied space and strong robustness. A candidate frame scale optimization and density adjustment mechanism is provided for the collision pit detection channel, optimal candidate frame selection is achieved, detection performance of small collision pit targets is greatly improved, and meanwhile, synchronous detection of multi-scale collision pit targets is achieved by utilizing different receptive fields, so that the network has the capability of detecting large-scale range change collision pit targets. Aiming at the collision pit recognition channel, a grid pattern layer is provided to generate a grid pattern diagram with rotation and scale invariance to realize collision pit recognition, and a matching feature database does not need to be constructed. The grid pattern diagram integrates the distribution characteristics and the scale characteristics of the impact pits, and the identification robustness of the method is enhanced. The invention provides a solution for realizing the detection and identification of the impact pit by using an independent network model, and simultaneously achieves the advanced performance level of the detection and identification of the impact pit.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
1) Network architecture
The network system provided by the invention is called CraterIDNet and is an end-to-end fully convolutional neural network model. The whole system is a single, unified impact pit detection and identification network. The network structure is shown in fig. 1:
CraterIDNet receives an input remote sensing image of any resolution and outputs the positions and diameters of the detected impact pits together with the numbers of the identified impact pits. The network comprises two main parts, the impact pit detection channels and the impact pit identification channel. The whole system adopts a fully convolutional architecture with no fully connected layer, which greatly reduces the network size. In order to further reduce the network size while maintaining the detection and identification performance, the invention first trains a small-scale pre-training network to initialize the convolutional feature layers conv1 to conv7. The pre-training network uses remote sensing images of Mars impact pits of different scales, under different illumination conditions, of different morphologies and distributed over different areas as training samples, so as to broaden the application range and generalization capability of the network. After CraterIDNet receives the input image, the feature maps of convolutional layers conv4 and conv7 are fed into the two impact pit detection channels to achieve synchronous detection of impact pit targets on feature maps of different scales. Each impact pit detection channel is composed of 3 convolutional layers and a target generation layer; the additional convolutional layers generate candidate frames of specific shapes on the feature map, and the position and apparent diameter of each impact pit are detected by classifying the content of each candidate frame and regressing the offset and scaling of the output target relative to the candidate frame. The detection results output by the two detection channels are fused in the target generation layer and input into the impact pit identification channel, which is composed of a grid pattern layer and 4 convolutional layers. The grid pattern layer uses the information from the preceding layers to generate a grid pattern map, with rotation and scale invariance, corresponding to each impact pit to be identified. The subsequent convolutional layers are trained to classify the grid pattern map, and the classification corresponds to the impact pit identification result.
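For a concrete view of the data flow just described, a minimal PyTorch-style sketch is given below. It is an illustration only: the channel widths, the anchor counts per head, the placement of the pooling layers and the class name CraterIDNetSketch are assumptions of this sketch and are not specified values of the invention; only the overall topology (a shared conv1–conv4 stage, a conv5–conv7 stage, and two detection heads built from 3 × 3 and 1 × 1 convolutions) follows the description above.

```python
# Illustrative sketch only: layer widths and anchor counts are assumptions.
import torch
import torch.nn as nn

def conv_block(cin, cout, k=3):
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=k // 2), nn.ReLU(inplace=True))

class DetectionHead(nn.Module):
    """Candidate frame head: a 3x3 conv (conv*_1) followed by 1x1 classification
    (conv*_2) and regression (conv*_3) convolutions, as described above."""
    def __init__(self, cin, n_anchors):
        super().__init__()
        self.inter = conv_block(cin, 256, k=3)
        self.cls = nn.Conv2d(256, 2 * n_anchors, 1)   # impact pit / background per anchor
        self.reg = nn.Conv2d(256, 3 * n_anchors, 1)   # dx, dy, diameter ratio per anchor
    def forward(self, x):
        x = self.inter(x)
        return self.cls(x), self.reg(x)

class CraterIDNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared backbone; the small-scale channel taps the conv4 feature map,
        # the large-scale channel taps conv7.
        self.conv1_4 = nn.Sequential(conv_block(1, 32, 5), nn.MaxPool2d(2),
                                     conv_block(32, 64, 5), nn.MaxPool2d(2),
                                     conv_block(64, 96), conv_block(96, 96))
        self.conv5_7 = nn.Sequential(nn.MaxPool2d(2), conv_block(96, 128),
                                     conv_block(128, 128), conv_block(128, 128))
        self.det_small = DetectionHead(96, n_anchors=2)    # candidate frame scales 15, 27
        self.det_large = DetectionHead(128, n_anchors=4)   # candidate frame scales 49, 89, 143, 255
    def forward(self, img):
        f4 = self.conv1_4(img)
        f7 = self.conv5_7(f4)
        # The target generation layer (decoding + non-maximum suppression) and the grid
        # pattern layer / identification convolutions would consume these raw outputs.
        return self.det_small(f4), self.det_large(f7)
```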
The whole CraterIDNet realizes end-to-end impact pit detection and identification through fully convolutional layers and the additional functional layers, and the size of the final model is only about 4 MB.
2) Pre-training model construction
① Pre-training model structure
The pre-training network is structured as a classical binary classification convolutional neural network. The input image passes through a series of convolutional layers: the shallow convolutional layers use 5 × 5 filters, while the deep convolutional layers use 3 × 3 filters with the stride fixed at 1 pixel and a 1-pixel boundary padding so that the feature map resolution is preserved after convolution. The network uses 3 max pooling layers for spatial pooling. Two fully connected layers follow the convolutional layers; the first has 256 channels and the second has 2 channels, corresponding to the final classes of the pre-training network: impact pit target or non-impact pit target. The last layer is a softmax layer whose output is expressed as a probability distribution over the impact pit and non-impact pit classes. The structure of the pre-training network is shown in fig. 2.
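A possible PyTorch rendering of this pre-training classifier is sketched below for clarity. The filter sizes (5 × 5 shallow, 3 × 3 deep), the three pooling stages, the 256-channel and 2-channel fully connected layers and the dropout ratio follow this section and the training description below, while the number of convolutional layers per stage and the channel widths are assumptions of the sketch.

```python
# Hedged sketch of the pre-training network; channel widths are assumptions.
import torch.nn as nn

pretrain_net = nn.Sequential(
    nn.Conv2d(1, 32, 5, padding=2), nn.ReLU(inplace=True),    # shallow 5x5 convolutions
    nn.MaxPool2d(2),                                           # pooling stage 1
    nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(inplace=True),
    nn.MaxPool2d(2),                                           # pooling stage 2
    nn.Conv2d(64, 96, 3, padding=1), nn.ReLU(inplace=True),    # deep 3x3 convolutions,
    nn.Conv2d(96, 96, 3, padding=1), nn.ReLU(inplace=True),    # stride 1, padding 1
    nn.MaxPool2d(2),                                           # pooling stage 3
    nn.Conv2d(96, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Flatten(),                                              # 125x125 input -> 15x15 map here
    nn.Linear(128 * 15 * 15, 256), nn.ReLU(inplace=True),      # first fully connected layer
    nn.Dropout(0.5),
    nn.Linear(256, 2),                                         # impact pit / non-impact pit
    # the softmax is applied inside the cross-entropy loss during training
)
```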
② Training of the pre-training model
Based on high-resolution panchromatic images acquired by the European Space Agency's Mars Express orbiter and the Robbins crater database, 1600 impact pit targets of different scales and under different illumination conditions were manually selected as the initial positive sample set. These samples were then augmented by random rotation, random translation and random scaling to increase sample diversity, finally yielding a positive sample set of 8000 samples. Meanwhile, 8000 image regions with other topographic features (such as plains, canyons and riverbeds) were randomly selected in the chosen scenes as negative samples. From these 16000 samples, 70% were randomly selected as training samples and 30% as test samples, with a positive-to-negative sample ratio of 1:1.
Before being fed to the network for training, the sample images are scaled to a fixed size of 125 × 125 pixels. The pre-training network is trained using mini-batch gradient descent with momentum, with the batch size set to 128 and the momentum factor to 0.9. The network is regularized during training using weight decay and dropout; the L2 penalty factor is set to 0.0005 and the dropout ratio to 0.5. The network weights are initialized using the 'xavier' method and the biases are initialized to a constant 0.1. The network is trained for 40 epochs starting from a base learning rate of 0.01, with the learning rate gradually reduced by a decay factor of 0.5 every 500 iterations.
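These solver settings can be expressed, for example, as follows; this is a sketch under the stated hyperparameters, and the helper name make_pretrain_solver as well as the per-iteration scheduling are choices of the illustration.

```python
import torch
import torch.nn as nn

def make_pretrain_solver(model: nn.Module):
    """SGD with momentum 0.9, L2 penalty 0.0005 and step decay x0.5 every 500 iterations."""
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=0.0005)
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=500, gamma=0.5)
    return opt, sched, nn.CrossEntropyLoss()   # softmax + log loss over the two classes

# Training skeleton (batch size 128, 40 epochs over 125x125 samples):
# opt, sched, criterion = make_pretrain_solver(pretrain_net)
# for epoch in range(40):
#     for images, labels in loader:            # labels: 1 = impact pit, 0 = background
#         opt.zero_grad()
#         criterion(pretrain_net(images), labels).backward()
#         opt.step()
#         sched.step()                         # decay counted per iteration, as above
```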
3) Impact pit detection channel construction
The invention constructs an efficient impact pit detection channel through the following three aspects: the detection channel network structure, a candidate frame optimization selection strategy, and a 3-step interactive training method.
① Impact pit detection channel network structure
CraterIDNet connects two impact pit detection channels after convolutional layers conv4 and conv7, respectively, and the two detection channels share the lower convolutional layers of the network. The convolution kernel size of the candidate frame generation convolutional layers (conv4_1, conv7_1) is 3 × 3 pixels, and n candidate frames are generated at each position of the feature map as the convolution kernel slides. The output feature maps of the candidate frame generation layers are further fed into 1 × 1 classification convolutional layers (conv4_2, conv7_2) and regression convolutional layers (conv4_3, conv7_3). Each candidate frame corresponds to 2 outputs of the classification convolutional layer, i.e. the probabilities of belonging to an impact pit and to the background, and 3 outputs of the regression convolutional layer, namely the horizontal and vertical offsets of the predicted impact pit centroid relative to the candidate frame center and the proportionality coefficient of the predicted impact pit diameter relative to the candidate frame width. The receptive fields of the output feature maps of convolutional layers conv4 and conv7 are 37 pixels and 101 pixels, respectively; detection of small-scale impact pits is realized by the detection channel connected to conv4, and detection of large-scale impact pits by the detection channel connected to conv7. This multi-scale detection structure, which detects targets of different scale ranges on feature maps of different resolutions, effectively addresses the detection of multi-scale impact pit targets. Finally, the classification and regression parameters output by the preceding layers are integrated in the target generation layer: candidate frames whose impact pit probability exceeds a given threshold are judged to be candidate impact pit targets, and the regressed positions and diameters of these candidate frames are computed at the resolution of the original image. The detected impact pit positions and diameters are output after non-maximum suppression.
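The target generation step can be illustrated with the sketch below. The exact regression parameterization is not reproduced in this description, so the decoding assumes that the offsets are expressed in units of the candidate frame side length and that the diameter equals the regressed ratio times that side length; decode_detections and its arguments are hypothetical names introduced here for illustration.

```python
import numpy as np

def decode_detections(scores, offsets, anchors, score_thresh=0.5):
    """Turn per-candidate-frame outputs into impact pit centers and diameters.

    scores  : (N,) probability that a candidate frame contains an impact pit
    offsets : (N, 3) regression outputs (dx, dy, s) for each candidate frame
    anchors : (N, 3) candidate frames (cx, cy, side) at original-image resolution
    """
    keep = scores > score_thresh                                  # threshold on the impact pit probability
    cx = anchors[keep, 0] + offsets[keep, 0] * anchors[keep, 2]   # assumed offset parameterization
    cy = anchors[keep, 1] + offsets[keep, 1] * anchors[keep, 2]
    diam = offsets[keep, 2] * anchors[keep, 2]                    # diameter ratio times frame width
    dets = np.stack([cx, cy, diam, scores[keep]], axis=1)
    # Non-maximum suppression over (cx, cy, diam) would follow before the fused
    # detections of both channels are passed to the grid pattern layer.
    return dets[np.argsort(-dets[:, 3])]
```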
② Candidate frame optimization selection strategy
Since the impact pit target is approximately circular, the present invention selects square candidate frames as the default candidate frame shape. The relation that the candidate frame size and the impact pit labeling box size should satisfy, together with the limiting impact pit detection size of the network, is expressed in terms of the following quantities: S_g, the side length of the square impact pit labeling box; S_a, the side length of the square candidate frame; T_IoU, the overlap threshold at which a candidate frame is marked as a positive sample; and d, the interval on the original image between the candidate frames corresponding to two adjacent feature points on the feature map. Let the side length of the smallest optimized candidate frame scale be S_a1; S_a1 must then satisfy this relation with respect to S_gmin, the minimum impact pit size of the sample set. Since no target smaller than S_gmin exists, the larger admissible value of S_a1 within this range is preferred. With this setting, the target scale range detectable by this candidate frame scale is [(1-λ)S_gmin, (1-λ)S_gmin/T_IoU]. In order to ensure that the candidate frames can effectively detect targets of all scales in the data set, λ is set as the overlap rate of the detectable impact pit scale ranges of candidate frames of two adjacent scales, from which the detectable target range corresponding to the n-th candidate frame scale follows. The upper bound of S_gn must be larger than the maximum target labeling box size, which yields the condition that the number n of selected candidate frame scales must satisfy, and from this the candidate frame optimization criterion is obtained.
according to the method, the optimal candidate frame sizes of 15, 27, 49, 89, 143 and 255 pixels are finally obtained according to the target characteristics of the training set. Candidate boxes with dimensions 15 and 27 are associated with the small-scale impact pit detection channel, and candidate boxes with dimensions 49, 89, 143, and 255 are associated with the large-scale impact pit detection channel.
Considering the severe imbalance of the scale distribution of the impact pit data set, and in order to ensure that candidate frames of all scales obtain training opportunities with approximately equal probability, the invention provides a candidate frame density adjustment mechanism. An optimization equation for candidate frame density adjustment is formulated based on three parameters, namely the average effective candidate frame number, the average target number per training-set scene, and the density adjustment factor, so as to balance the number of candidate frames of each scale in a training scene and allow the network to achieve a consistent training effect on impact pit targets of all scales.
As shown in FIG. 3, B_g denotes the impact pit bounding box, B_a denotes a candidate frame, and S_g and S_a denote the sizes of B_g and B_a, respectively. Let (x_a, y_a) denote the center coordinates of B_a, with the center of B_g located at the coordinate origin. When the center of the candidate frame lies within the blue effective area, the current candidate frame can be marked as a positive sample. The valid candidate frames herein satisfy:
IoU(B_g, B_a) ≥ T_IoU    (7)
The number of candidate frames satisfying equation (7), ignoring effects across the image boundary, is defined as the effective candidate frame number. In the present invention, the effective candidate frame number for the case S_g = S_a is taken as an approximation of the average effective candidate frame number over targets within the detectable scale range of that candidate frame scale. According to the definition of the overlap degree, when equation (7) is satisfied the candidate frame center coordinates must satisfy a corresponding discriminant inequality, and the average effective candidate frame number is approximated as the ratio of the effective area to the square of the candidate frame interval, where d_a denotes the interval of the candidate frames at the resolution of the original image. Equations (8) and (9) then follow.
The total average effective candidate frame number of a candidate frame scale in a training-set scene is defined as the product of its average effective candidate frame number and the average number of impact pit targets per training-set scene, the latter being obtained statistically from the training set. The purpose of candidate frame density adjustment is to balance the total average effective candidate frame numbers of the candidate frames of all scales, so that candidate frames of every scale obtain training opportunities with approximately equal probability. Therefore, the invention takes the goal of candidate frame density adjustment to be the minimization of a squared loss function of the adjusted total average effective candidate frame numbers, where τ_i denotes the density adjustment factor of the candidate frames of the i-th scale (an integer) and n is the number of candidate frame scales. Furthermore, the number of candidate frames generated by the network should be neither too large nor too small: too many candidate frames greatly increase the training time, while too few lead to insufficient training. Therefore, a penalty term based on the total effective candidate frame number is introduced, and the final optimization objective function is obtained as follows:
where N_batch is the number of candidate frames per batch during training and ω is a penalty coefficient, taken as 0.02 in the present invention. The penalty term ensures that, in addition to minimizing the empirical loss, the structural risk is also kept small. During training, a total of N_batch candidate frames are randomly selected in each iteration, half of them positive samples and half negative samples. In order to ensure that the network generates enough candidate frames for training while leaving room for random selection during training, N_batch is taken as the desired total number of valid candidate frames generated by the network. The optimized density adjustment factors τ_i are obtained from equation (13): when τ_i > 0, the number of generated candidate frames of the corresponding scale is increased to 2^τ_i times the original number; when τ_i < 0, it is reduced to 2^τ_i times (i.e. 1/2^|τ_i| of) the original number; and when τ_i = 0, the original number of candidate frames is maintained. The result of candidate frame density adjustment is shown in FIG. 4: FIG. 4(a) shows that when τ_i = -1 the candidate frame density is halved, FIG. 4(b) shows that when τ_i = 0 the density is unchanged (d_a denotes the candidate frame interval), and FIG. 4(c) shows that when τ_i = 1 the density is doubled.
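The density adjustment search can be sketched as a brute-force minimization over integer factors. The objective used below, namely the spread of the adjusted per-scale counts around their mean plus ω times the squared deviation of their total from N_batch, is a plausible reading of the description above rather than the exact equation (13), and the numbers in the example call are purely illustrative.

```python
from itertools import product

def adjust_density(n_bar, n_batch, omega=0.02, tau_values=(-2, -1, 0, 1, 2)):
    """Brute-force search for integer density adjustment factors tau_i.

    n_bar[i] is the total average effective candidate frame number of scale i;
    multiplying it by 2**tau_i models halving / doubling the candidate frame density.
    """
    best, best_cost = None, float("inf")
    for taus in product(tau_values, repeat=len(n_bar)):
        adj = [n * 2.0 ** t for n, t in zip(n_bar, taus)]
        mean = sum(adj) / len(adj)
        cost = sum((a - mean) ** 2 for a in adj) + omega * (sum(adj) - n_batch) ** 2
        if cost < best_cost:
            best_cost, best = cost, taus
    return best

# Illustrative numbers: small scales are over-represented, large scales are rare.
print(adjust_density(n_bar=[120.0, 60.0, 25.0, 10.0, 4.0, 1.5], n_batch=256))
```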
③ 3-step interactive training method
The impact pit detection channels are trained end-to-end using stochastic gradient descent with momentum. In each iteration, N_batch = 256 candidate frames are randomly selected from the image to compute the batch loss function, with a 1:1 ratio of positive to negative samples; if fewer than 128 positive samples are available, the batch is filled with negative samples. Since the two impact pit detection channels share convolutional layers conv1 to conv4, a training method is needed that allows the two detection channels to share these convolutional layers, rather than training two independent detection channels. The invention provides a 3-step interactive training method, the specific steps of which are as follows:
(1) In the first step, the network parameters of the impact pit detection channels are initialized with the pre-training model as initial values, the newly added convolutional layers are initialized using the 'xavier' method, and the biases are initialized to a constant 0. The network is fine-tuned with the training set of impact pit detection channel 2, using a base learning rate of 0.005, a momentum coefficient of 0.9 and a weight decay coefficient of 0.0005. The learning rate of the convolutional layers unique to impact pit detection channel 1 (conv4_1 to conv4_3) is set to 0, so they are not trained. 50 epochs of training are performed, with the learning rate gradually reduced by a decay factor of 0.8 every 10000 iterations.
(2) In the second step, the network takes the model trained in the first step as its initial value, fixes the convolutional layers conv5 to conv7 and the convolutional layers unique to impact pit detection channel 2 (conv7_1 to conv7_3), and is fine-tuned with the training set of impact pit detection channel 1. The base learning rate is set to 0.001, the momentum coefficient to 0.9 and the weight decay coefficient to 0.0005. The network is trained for 30 epochs, with the learning rate gradually reduced by a decay factor of 0.6 every 20000 iterations.
(3) In the third step, the shared convolutional layers conv1 to conv4 and the convolutional layers unique to impact pit detection channel 1 are fixed, and the model trained in the previous step is fine-tuned with the training set of impact pit detection channel 2. The base learning rate is set to 0.0002, the momentum coefficient to 0.9 and the weight decay coefficient to 0.0005. The network is trained for 30 epochs, with the learning rate gradually reduced by a decay factor of 0.8 every 15000 iterations.
After these 3 steps of interactive training, the two impact pit detection channels of CraterIDNet share the convolutional layers.
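In a modern training framework the three steps amount to alternately freezing groups of layers. The sketch below uses the hypothetical attribute names of the earlier CraterIDNetSketch illustration (conv1_4, conv5_7, det_small, det_large); freezing parameters is used here only to emulate the "learning rate set to 0" and "fixed layers" settings described above.

```python
import torch.nn as nn

def set_trainable(module: nn.Module, trainable: bool):
    """Freeze or unfreeze a group of layers; frozen layers keep their current weights."""
    for p in module.parameters():
        p.requires_grad = trainable

# Step 1: initialize from the pre-training model, train with the channel-2 set,
#         keep the layers unique to detection channel 1 (conv4_1-conv4_3) frozen:
#   set_trainable(net.det_small, False)
# Step 2: fix conv5-conv7 and the channel-2 head (conv7_1-conv7_3),
#         fine-tune with the channel-1 training set:
#   set_trainable(net.conv5_7, False); set_trainable(net.det_large, False)
#   set_trainable(net.det_small, True)
# Step 3: fix the shared conv1-conv4 and the channel-1 head,
#         fine-tune the channel-2 layers with its training set:
#   set_trainable(net.conv1_4, False); set_trainable(net.det_small, False)
#   set_trainable(net.conv5_7, True);  set_trainable(net.det_large, True)
```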
4) Impact pit identification channel construction
The invention constructs an efficient impact pit identification channel through the following two aspects: the grid pattern layer and the impact pit identification channel training method.
① Grid pattern layer
The grid pattern layer takes as input the impact pit feature information (positions and diameters) detected by the impact pit detection channels and outputs, for each impact pit target, a grid pattern map with rotation and scale invariance. The invention fuses the impact pit distribution and scale feature information into a grid pattern and renders this information as an image. The grid pattern map is then classified by the subsequent convolutional network to complete the identification process.
The grid pattern layer operates through the following steps:
(1) Impact pit size screening is performed first. The working orbit height of the remote sensing camera is set as H_min to H_max, and H_ref is the reference orbit height corresponding to the training samples. An impact pit can be used effectively for identification only if it remains detectable by the impact pit detection channel over the whole working orbit height range of the remote sensing camera, so the impact pits whose diameters satisfy the following formula are selected as candidate impact pits:
where D_min denotes the minimum impact pit diameter detectable by the impact pit detection channel and D_max denotes the maximum impact pit diameter detectable by the impact pit detection channel.
(2) Main impact pits are selected. Typically at least 3 impact pit targets are required in the field of view to solve for the position of the spacecraft relative to the planetary surface. In the invention, the 10 candidate impact pit targets closest to the center of the field of view are selected as main impact pits and a grid pattern map is constructed for each of them; if fewer than 10 candidate impact pit targets exist in the field of view, all candidate impact pit targets are marked as main impact pits.
(3) Scale normalization. For each main impact pit, its distance to each of the other candidate impact pits, called the principal distance, is calculated. The principal distances and the diameters of all impact pits are then unified to a reference scale: with H the orbit height of the remote sensing camera at imaging time, the principal distances and the candidate impact pit diameters are multiplied by the scale transformation factor H_ref/H.
(4) Grid pattern map generation. A 17 × 17 grid pattern map is established with the main impact pit at the center and the direction of the line from the main impact pit to its nearest neighboring impact pit as the positive direction, each grid element having side length L_g. The neighboring impact pits falling into the grid are determined from the normalized principal distances; when at least one neighboring impact pit falls into a grid element (i, j), that element is in an activated state and its output amplitude equals the sum of the normalized diameters of the impact pits falling within it, while the amplitude of elements containing no impact pit is 0. Finally, the normalized diameter of the main impact pit is added to the amplitude of the central grid element.
The generation process of the grid pattern map is shown in fig. 5, where the dark gray circle represents the main impact pit and the light gray circles represent the other candidate impact pits. Fig. 5(b) shows the result after scale normalization, with the arrow indicating the direction of the line from the main impact pit to its nearest neighboring impact pit. Fig. 5(c) shows the grid created along this positive direction, and fig. 5(d) shows the final grid pattern map. The grid pattern map output by the grid pattern layer fuses the distribution and scale characteristics of the impact pit targets and turns both characteristics into an image. The position and scale detection errors produced by the impact pit detection channel are thereby converted into position noise and gray-scale noise in this image, and the subsequent convolutional processing of the grid features further enhances the robustness of the identification result to impact pit position and scale errors.
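The grid pattern map construction for one main impact pit can be sketched as follows. The positive-direction convention is realized here by rotating the neighbor offsets so that the nearest neighbor lies on the +x axis, which is this sketch's reading of the description above; the function name and the default L_g = 24 pixels (taken from the training procedure below) are otherwise assumptions of the illustration.

```python
import numpy as np

def grid_pattern_map(main, neighbors, h, h_ref, l_g=24.0, size=17):
    """Build the 17x17 grid pattern map for one main impact pit.

    main      : (x, y, d) of the main impact pit, in pixels
    neighbors : (M, 3) array of (x, y, d) for the other candidate impact pits
    h, h_ref  : imaging orbit height and reference orbit height
    """
    s = h_ref / h                                   # scale normalization factor
    mx, my, md = main
    gpm = np.zeros((size, size))
    c = size // 2
    gpm[c, c] += md * s                             # main impact pit into the central cell
    if len(neighbors) == 0:
        return gpm
    off = (neighbors[:, :2] - np.array([mx, my])) * s       # normalized principal offsets
    diam = neighbors[:, 2] * s                              # normalized diameters
    nearest = off[np.argmin(np.linalg.norm(off, axis=1))]
    theta = np.arctan2(nearest[1], nearest[0])              # angle to the nearest neighbor
    rot = np.array([[np.cos(theta),  np.sin(theta)],        # rotation by -theta puts the
                    [-np.sin(theta), np.cos(theta)]])       # nearest neighbor on the +x axis
    off = off @ rot.T
    for (dx, dy), dd in zip(off, diam):
        i, j = c + int(round(dy / l_g)), c + int(round(dx / l_g))
        if 0 <= i < size and 0 <= j < size:
            gpm[i, j] += dd                                 # accumulate normalized diameters
    return gpm
```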
② Impact pit identification channel training method
First, candidate impact pits in the data set are selected according to equation (14), and the impact pit identification channel training set is then constructed through the following steps:
(1) each impact pit training target is given a unique number for identification.
(2) For each impact pit training target, with the grid element side length L_g taken as 24 pixels, 2000 grid pattern maps were constructed using the neighboring impact pit targets in the sample set. Each impact pit target in each grid pattern map was perturbed with normally distributed position noise (mean 0, standard deviation 2.5 pixels) and normally distributed apparent diameter noise (mean 1.5 pixels, variance 1.5 pixels).
(3) 400 maps were randomly selected from the 2000 generated grid pattern maps, and the information of one neighboring impact pit target in each of them was randomly removed to simulate the case where the impact pit detection channel misses an impact pit.
(4) 700 and 400 grid pattern maps were randomly selected and 1 and 2 false impact pit targets were added to them, respectively, to simulate the case where the impact pit detection channel reports false impact pit targets. The apparent diameters of the added false impact pits are random variables uniformly distributed over [20, 50] pixels, and their positions are chosen randomly within the grid pattern map.
(5) 8 groups of 100 grid pattern maps each were randomly selected, and the impact pit target information in the light gray areas corresponding to the 8 cases shown in fig. 6 was removed, to simulate the case where a main impact pit is close to the field-of-view boundary.
(6) 60% of the samples in the finally generated grid pattern map sample set were randomly selected to form the training set, and the remaining samples form the test set.
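For illustration, one pass of the noise and perturbation scheme above could look like the sketch below. The noise parameters follow the description; the per-sample drop and false-detection rates are simplified stand-ins for the exact 400 / 700 / 400-out-of-2000 splits, and the coordinates are assumed to be in the main-crater-centered frame used by the grid pattern layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_neighbors(neighbors, p_drop=0.2, max_false=2, l_g=24.0):
    """One augmentation pass over the (x, y, d) neighbors of a training target."""
    aug = np.asarray(neighbors, dtype=float).copy()
    aug[:, :2] += rng.normal(0.0, 2.5, size=aug[:, :2].shape)      # position noise N(0, 2.5 px)
    aug[:, 2] += rng.normal(1.5, np.sqrt(1.5), size=len(aug))      # diameter noise, mean 1.5, var 1.5
    if len(aug) > 1 and rng.random() < p_drop:                     # simulate a missed detection
        aug = np.delete(aug, rng.integers(len(aug)), axis=0)
    for _ in range(rng.integers(0, max_false + 1)):                # simulate 0-2 false detections
        x, y = rng.uniform(-8.0, 8.0, size=2) * l_g                # anywhere inside the 17x17 grid
        aug = np.vstack([aug, [x, y, rng.uniform(20.0, 50.0)]])    # diameter ~ U[20, 50] px
    return aug
```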
The impact pit identification channel is trained using mini-batch gradient descent with momentum. The batch size is set to 512, the momentum coefficient to 0.9 and the weight decay coefficient to 0.0005. The network weights of convolutional layers conv8 to conv11 are initialized using the 'xavier' method, with the biases initialized to a constant 0.1. The impact pit identification channel is trained for 30 epochs starting from a base learning rate of 0.01, with the learning rate gradually reduced by a decay factor of 0.5 every 10000 iterations. The identification accuracy of the impact pit identification channel on the test set finally reaches 99.22%, showing that it has high identification accuracy and strong generalization capability.
The invention provides an end-to-end impact pit detection and identification method based on a fully convolutional neural network structure. It should be noted that the above embodiments are only preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and scope of the present invention is included in the protection scope of the present invention.