Port ore heap segmentation and reserve calculation method based on improved UNet network
Technical Field
The invention relates to a port ore heap segmentation and reserve calculation method based on an improved UNet network, and belongs to the field of optical remote sensing image processing and deep learning.
Background
Image semantic segmentation is an important field of computer vision: it identifies objects at the pixel level by predicting the class of every pixel in an image. A port ore stacking area is the part of a port where ore awaiting transport is piled. Ore heaps within one stacking area are arranged in order, but they grow or shrink as ore is moved in and out, so their shapes become irregular. In addition, many ores are close in color to bare ground, which makes it difficult for traditional computer vision methods to detect ore heap edges in remote sensing images. Deep learning based image semantic segmentation can handle both the irregular shape of ore heaps and their similarity in color to bare ground. Once the ore stacking area has been located by semantic segmentation, its area can be calculated and the ore reserves of the port can be estimated, which gives the method high application value. Although there is currently little research on semantic segmentation of ore heaps in remote sensing images, the development of high-spatial-resolution optical remote sensing satellites has led many researchers to apply deep learning based semantic segmentation to optical remote sensing images, producing many applied results.
UNet is an image semantic segmentation network proposed by Ronneberger et al. in 2015. The UNet network follows the encoder-decoder idea, has a simple structure, and is suitable for training on small data sets. Researchers at home and abroad have used UNet networks to obtain many research results in building extraction and classification, mining area change detection, forest type classification and the like.
Liuhao et al. proposed SE-Unet, an improved UNet network for classifying building ground structures. SE-Unet applies feature compression and activation in the encoder down-sampling process: four feature compression and activation modules are added, acting respectively on the input image, the down-sampling process, the up-sampling and merging process, and the final output image. The features obtained by each convolution are compressed and activated, which improves the use of effective features.
Sunward et al. studied change detection of mining areas in remote sensing images with a twin (Siamese) network based on an improved UNet. The pooling layers of UNet were replaced with convolutional layers with a stride of 2, and a weight-sharing dual-channel structure was used to build the twin network, so that the network receives the images of two periods simultaneously and extracts the differences between them. The pooling layers in UNet enlarge the receptive field, allowing the convolutions to receive more information, but they discard much information; replacing pooling with stride-2 convolutional layers enlarges the receptive field while reducing this loss.
Wangyou et al. used UNet to classify forest types in high-resolution multispectral remote sensing images. They first extracted the NDVI feature and the four bands of the original image, built a UNet model on these five feature channels for classification, and then processed the classification result with a conditional random field (CRF). The CRF effectively refines the edges between ground-object classes and improves classification accuracy.
Disclosure of Invention
The invention aims to provide a port ore heap segmentation and reserve calculation method based on an improved UNet network, so as to solve the problems in the prior art.
A port ore heap segmentation and reserve calculation method based on an improved UNet network comprises the following steps:
step 1, making a port ore heap semantic segmentation data set based on a high-resolution optical remote sensing image;
step 2, improving the UNet network: optimizing the downsampling process of the UNet network with dilated (hole) convolution layers;
step 3, training the ore heap segmentation data set by using an improved UNet network;
step 4, performing image semantic segmentation on image test data containing the ore heap by using the trained network;
step 5, estimating the reserve of each segmented ore heap by using an ore heap volume estimation method.
Further, in step 1, the method specifically comprises the following steps:
step 1.1, selecting the ore heap images for the segmentation data set: 40 images containing port ore stacking areas are randomly captured from the Google base map, covering port ore heaps of different shapes, colors, sizes and stacking modes; 36 images are used for network training and 4 images for the test set;
step 1.2, manually labeling the selected images with the labelme tool to generate segmentation results in json format, and converting the json files into label images with the labelme tool;
step 1.3, cutting each ore stacking area image and its corresponding label image into 256 × 256 tiles, padding tiles smaller than 256 × 256 with 0 values up to 256 × 256, then removing tiles that contain no valid label, and dividing the resulting 346 tiles into a training set (80%) and a verification set (20%).
Further, in step 2, the method specifically comprises the following steps:
step 2.1, constructing the improved UNet network: the five convolutional layers in the downsampling process of the UNet network are replaced with dilated (hole) convolutions, which, compared with a standard convolutional layer, have intervals between the kernel elements; the improved UNet network adopts 5 × 5 dilated convolution with an interval number of 1;
step 2.2, padding with 0 values in the up-sampling and down-sampling processes.
Further, in step 3, the method specifically comprises the following steps:
step 3.1, training on the data set with the following hyper-parameters: initial learning rate learning_rate = 0.001, batch size batch_size = 4, number of training epochs epochs = 100, and number of segmentation classes n_classes = 2;
step 3.2, during training, using ReduceLROnPlateau to halve the learning rate when the loss value of the verification set has not changed for two consecutive epochs;
step 3.3, saving the training result once per epoch, and taking the result with the highest verification set accuracy as the final training result.
Further, in step 4, the method specifically comprises the following steps:
step 4.1, slicing the test image into 256 × 256 tiles, and padding tiles smaller than 256 × 256 with 0 values up to 256 × 256;
step 4.2, performing semantic segmentation on each tile with the trained UNet network to obtain a 256 × 256 gray image whose gray values lie in the range [0, n_classes), where n_classes is the number of segmentation classes; here n_classes = 2, the two classes being ore heap and bare ground;
step 4.3, stitching the segmentation result tiles together in the order of the original image, and cropping the part that extends beyond the boundary of the original image.
Further, in step 5, the method specifically comprises the following steps:
step 5.1, extracting the contours of the labeled regions in the segmentation result image by using the findContours method in opencv, wherein each extracted contour is one ore heap;
step 5.2, calculating the number of pixels occupied by each ore heap in the image, and calculating the total number of pixels of the image;
step 5.3, calculating the geographical area actually occupied by the image by using the remote sensing positioning information;
step 5.4, converting the contour length unit in the ore heap image into meters from pixels according to the proportion of the total pixel number and the geographic area size;
step 5.5, calculating the bounding rectangle of the ore heap from the extracted contour by using the opencv minAreaRect method, and calculating the length $l_{box}$ and width $w_{box}$ of the bounding rectangle;
step 5.6, obtaining the radius $r$ of the cone base from the height $h$ and the stacking angle $\alpha$ of the ore heap, and estimating the width $w_{top}$ and length $l_{top}$ of the upper surface of the frustum from the length $l_{box}$ and width $w_{box}$ of the bounding rectangle and the radius $r$:
$$r = \frac{h}{\tan\alpha} \tag{1}$$
$$w_{top} = w_{box} - 2r \tag{2}$$
$$l_{top} = l_{box} - r \tag{3}$$
step 5.7, estimating the upper surface area $S_{top}$ and the lower surface area $S'_{bottom}$ of the frustum, depending on whether the ore heap is complete, as follows:
step 5.8, calculating the total base area $S_{cones}$ of the one triangular prism and two 1/4 cones on one side of the ore heap:
$$S_{cones} = \frac{\pi r^{2}}{2} + w_{top}\, r \tag{6}$$
step 5.9, comparing the total base area $S$ of the ore heap with $S_{cones}$ to divide the completeness of the ore heap into three cases, correcting the error in the lower surface area $S'_{bottom}$ of the frustum, and obtaining the true lower surface area $S_{bottom}$ of the frustum:
step 5.10, estimating the volume $V$ of the ore heap by using the frustum and cone volume formulas:
The invention has the following main advantages:
(1) The UNet network is improved with dilated convolution, which enlarges the receptive field, reduces information loss, and improves recognition accuracy;
(2) The result is optimized with a conditional random field, which refines the segmentation edges between ore heaps and bare ground and reduces the probability of misclassification during semantic segmentation;
(3) Ore heap reserves are estimated with an ore heap reserve estimation algorithm, which can provide guidance in fields such as financial futures pricing.
Drawings
FIG. 1 is a flow chart of the port ore heap segmentation and reserve calculation method based on an improved UNet network according to the invention;
FIG. 2 is a schematic structural diagram of the improved UNet using dilated convolution;
FIG. 3 is a diagram of the ore heap model, wherein FIG. 3 (a) is a three-view drawing of the complete ore heap model, and FIG. 3 (b) shows the ore heap model after excavation.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the invention relates to a port ore heap segmentation and reserve calculation method based on an improved UNet network, which comprises the following steps:
step 1, making a port ore heap semantic segmentation data set based on a high-resolution optical remote sensing image;
step 2, improving the UNet network: optimizing the downsampling process of the UNet network with dilated (hole) convolution layers;
step 3, training the ore heap segmentation data set by using an improved UNet network;
step 4, performing image semantic segmentation on image test data containing the ore heap by using the trained network;
step 5, estimating the reserve of each segmented ore heap by using an ore heap volume estimation method.
Further, in step 1, the method specifically comprises the following steps:
step 1.1, selecting the ore heap images for the segmentation data set: 40 images containing port ore stacking areas are randomly captured from the Google base map, covering port ore heaps of different shapes, colors, sizes and stacking modes; 36 images are used for network training and 4 images for the test set;
step 1.2, manually labeling the selected images with the labelme tool to generate segmentation results in json format, and converting the json files into label images with the labelme tool;
step 1.3, cutting each ore stacking area image and its corresponding label image into 256 × 256 tiles, padding tiles smaller than 256 × 256 with 0 values up to 256 × 256, then removing tiles that contain no valid label, and dividing the resulting 346 tiles into a training set (80%) and a verification set (20%).
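The tiling of step 1.3 can be sketched in Python as follows. This is a minimal illustration using NumPy, not the program of the invention: the function names are hypothetical, "no valid label" is assumed to mean a label tile that contains only 0 (background) pixels, and the random split with a fixed seed is likewise an assumption.

```python
import numpy as np

TILE = 256  # tile edge length used in step 1.3


def tile_with_zero_padding(image, tile=TILE):
    """Cut an H x W (x C) array into tile x tile pieces in row-major order,
    zero-padding the bottom and right edges when the size is not a multiple."""
    h, w = image.shape[:2]
    rows = -(-h // tile)  # ceiling division
    cols = -(-w // tile)
    pad = [(0, rows * tile - h), (0, cols * tile - w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="constant", constant_values=0)
    return [
        padded[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile]
        for r in range(rows)
        for c in range(cols)
    ]


def keep_labelled_pairs(image_tiles, label_tiles):
    """Drop (image, label) pairs whose label tile has no non-zero pixel
    (assumed meaning of 'no valid label')."""
    return [(img, lab) for img, lab in zip(image_tiles, label_tiles) if lab.any()]


def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle the items and split them into training and verification sets."""
    order = np.random.default_rng(seed).permutation(len(items))
    n_val = int(round(len(items) * val_fraction))
    val = [items[i] for i in order[:n_val]]
    train = [items[i] for i in order[n_val:]]
    return train, val
```

With the 346 tiles mentioned in step 1.3, a 20% verification fraction under this rounding gives 69 verification tiles and 277 training tiles.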
Further, in step 2, the method specifically comprises the following steps:
step 2.1, constructing the improved UNet network as shown in FIG. 2: the five convolutional layers in the downsampling process of the UNet network are replaced with dilated (hole) convolutions, which, compared with a standard convolutional layer, have intervals between the kernel elements, so that information loss is effectively reduced while the receptive field is enlarged; the invention adopts 5 × 5 dilated convolution with an interval number of 1;
step 2.2, padding with 0 values in the up-sampling and down-sampling processes.
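The relation between kernel size, interval number and receptive field in step 2.1, and the amount of 0-value padding in step 2.2, can be illustrated with a short Python sketch. It assumes that the "interval number" is the number of skipped pixels between neighbouring kernel elements, so that the dilation rate is the interval number plus 1; the function names are hypothetical. The text does not state whether "5 × 5" refers to the kernel itself or to the span it covers, so both readings are evaluated.

```python
def effective_kernel_size(kernel_size, gaps):
    """Span covered by a dilated kernel that skips `gaps` pixels between
    neighbouring kernel elements (dilation rate = gaps + 1)."""
    dilation_rate = gaps + 1
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)


def same_zero_padding(kernel_size, gaps):
    """Number of 0-value pixels added on each side so that a stride-1
    dilated convolution keeps the input size (odd effective size)."""
    return (effective_kernel_size(kernel_size, gaps) - 1) // 2


# Reading A: a 3 x 3 kernel with one gap spans 5 x 5 pixels.
span_a = effective_kernel_size(3, 1)
# Reading B: a 5 x 5 kernel with one gap spans 9 x 9 pixels.
span_b = effective_kernel_size(5, 1)
```

In Keras, a dilated convolution with zero padding is typically expressed with the `dilation_rate` and `padding="same"` arguments of `Conv2D`; which kernel size and rate the invention uses follows from whichever reading above applies.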
Further, in step 3, the method specifically comprises the following steps:
step 3.1, training on the data set with the following hyper-parameters: initial learning rate learning_rate = 0.001, batch size batch_size = 4, number of training epochs epochs = 100, and number of segmentation classes n_classes = 2;
step 3.2, during training, using ReduceLROnPlateau to halve the learning rate when the loss value of the verification set has not changed for two consecutive epochs;
step 3.3, saving the training result once per epoch, and taking the result with the highest verification set accuracy as the final training result.
Specifically, the program runs on a machine with an Intel Core i7-9700 CPU, an NVIDIA GeForce RTX 2060 GPU (Compute Capability 7.5, 1920 CUDA cores), 16 GB of memory, the Ubuntu 18.04 operating system, Python 3.5, tensorflow 1.13.1 and keras 2.2.4.
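The learning rate rule of step 3.2 and the selection rule of step 3.3 can be sketched in plain Python as follows. This is a simplified illustration of the halving behaviour, not the Keras implementation of ReduceLROnPlateau (which additionally supports a cooldown period and a minimum learning rate); the class and function names are hypothetical.

```python
class HalveOnPlateau:
    """Halve the learning rate when the verification loss has not improved
    for `patience` consecutive epochs."""

    def __init__(self, initial_lr=0.001, factor=0.5, patience=2, min_delta=0.0):
        self.lr = initial_lr
        self.factor = factor
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")  # best verification loss seen so far
        self.wait = 0             # epochs since the last improvement

    def step(self, val_loss):
        """Update the state with one epoch's loss and return the learning rate."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor
                self.wait = 0
        return self.lr


def best_epoch(val_accuracies):
    """Index of the epoch with the highest verification accuracy (step 3.3);
    the earliest epoch wins in case of a tie."""
    return max(range(len(val_accuracies)), key=val_accuracies.__getitem__)
```

In Keras, the same rule would presumably be configured as `ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2)`; the exact arguments are not given in the text and are an assumption here.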
Further, in step 4, the method specifically includes the following steps:
step 4.1, slicing the test image into 256 × 256 tiles, and padding tiles smaller than 256 × 256 with 0 values up to 256 × 256;
step 4.2, performing semantic segmentation on each tile with the trained UNet network to obtain a 256 × 256 gray image whose gray values lie in the range [0, n_classes), where n_classes is the number of segmentation classes; here n_classes = 2, the two classes being ore heap and bare ground;
step 4.3, stitching the segmentation result tiles together in the order of the original image, and cropping the part that extends beyond the boundary of the original image.
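The stitching and cropping of step 4.3 can be sketched in Python with NumPy as follows. The sketch assumes that the predicted tiles are 2-D gray images supplied in row-major order (left to right, then top to bottom), matching the order in which the test image was sliced; the function name is hypothetical.

```python
import numpy as np


def stitch_tiles(tiles, original_shape, tile=256):
    """Reassemble row-major tile predictions into one mask and crop away
    the zero-padded border that exceeds the original image."""
    h, w = original_shape[:2]
    rows = -(-h // tile)  # ceiling division
    cols = -(-w // tile)
    if len(tiles) != rows * cols:
        raise ValueError("expected %d tiles, got %d" % (rows * cols, len(tiles)))
    # Join each row of tiles side by side, then stack the rows vertically.
    strips = [np.hstack(tiles[r * cols:(r + 1) * cols]) for r in range(rows)]
    mosaic = np.vstack(strips)
    return mosaic[:h, :w]
```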
Further, in step 5, the method specifically comprises the following steps:
step 5.1, extracting the contours of the labeled regions in the segmentation result image by using the findContours method in opencv, wherein each extracted contour is one ore heap;
step 5.2, calculating the number of pixels occupied by each ore heap in the image, and calculating the total number of pixels of the image;
step 5.3, calculating the geographical area actually occupied by the image by using the remote sensing positioning information;
step 5.4, converting the contour length unit in the ore heap image from pixels into meters according to the ratio of the total number of pixels to the size of the geographic area;
step 5.5, as shown in FIG. 3, a complete ore heap is regarded as a frustum joined with two sides, and an incomplete ore heap is obtained by excavating part of one side of a complete ore heap; each side can be regarded as one triangular prism joined with two 1/4 cones, and the slope of the frustum is the stacking angle of the ore heap; the stacking angle and the heap height differ slightly with the type of mineral and the regulations of the stacking area, and can be obtained by table lookup according to actual conditions; the bounding rectangle of the ore heap is calculated from the extracted contour by using the opencv minAreaRect method, and the length $l_{box}$ and width $w_{box}$ of the bounding rectangle are calculated;
step 5.6, obtaining the radius $r$ of the cone base from the height $h$ of the ore heap and the stacking angle $\alpha$ (in rad), and estimating the width $w_{top}$ and length $l_{top}$ of the upper surface of the frustum from the length $l_{box}$ and width $w_{box}$ of the bounding rectangle and the radius $r$:
$$r = \frac{h}{\tan\alpha} \tag{1}$$
$$w_{top} = w_{box} - 2r \tag{2}$$
$$l_{top} = l_{box} - r \tag{3}$$
step 5.7, estimating the upper surface area $S_{top}$ and the lower surface area $S'_{bottom}$ of the frustum, depending on whether the ore heap is complete, as follows:
step 5.8, calculating the total base area $S_{cones}$ of the one triangular prism and two 1/4 cones on one side of the ore heap:
$$S_{cones} = \frac{\pi r^{2}}{2} + w_{top}\, r \tag{6}$$
step 5.9, comparing the total base area $S$ of the ore heap with $S_{cones}$ to divide the completeness of the ore heap into three cases, correcting the error in the lower surface area $S'_{bottom}$ of the frustum, and obtaining the true lower surface area $S_{bottom}$ of the frustum:
step 5.10, estimating the volume $V$ of the ore heap by using the frustum and cone volume formulas:
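As an illustration of the parts of step 5 whose formulas are stated explicitly, namely the unit conversion of step 5.4 and formulas (1) to (3) and (6), a minimal Python sketch is given below. It is not the volume formula of step 5.10, and it does not cover steps 5.7 and 5.9. The function names are hypothetical, square pixels are assumed for the pixel-to-meter conversion, and any input values used with it are the caller's own.

```python
import math


def meters_per_pixel(total_pixels, ground_area_m2):
    """Step 5.4: edge length of one pixel in meters, assuming square pixels,
    from the total pixel count and the ground area covered by the image."""
    return math.sqrt(ground_area_m2 / total_pixels)


def cone_base_radius(height_m, stacking_angle_rad):
    """Formula (1): r = h / tan(alpha), with alpha in radians."""
    return height_m / math.tan(stacking_angle_rad)


def top_surface_size(l_box, w_box, r):
    """Formulas (2) and (3) as written in the text: returns (l_top, w_top)."""
    w_top = w_box - 2 * r
    l_top = l_box - r
    return l_top, w_top


def side_base_area(w_top, r):
    """Formula (6): base area of one triangular prism plus two 1/4 cones."""
    return math.pi * r ** 2 / 2 + w_top * r
```

The lengths passed to `top_surface_size` are expected in meters, that is, the minAreaRect side lengths in pixels multiplied by the result of `meters_per_pixel`.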