CN114782417A - Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation
Real-time detection method for digital twin characteristics of fan based on edge-enhanced image segmentation

- Publication number: CN114782417A
- Application number: CN202210680421.8A
- Authority: CN (China)
- Prior art keywords: fan, edge, real-time, network, digital twin
- Legal status: Pending (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications

- G06T 7/0004: Image analysis; inspection of images, e.g. flaw detection; industrial image inspection
- G06N 3/045: Computing arrangements based on biological models; neural networks; combinations of networks
- G06N 3/08: Neural network learning methods
- G06T 5/94: Dynamic range modification of images based on local image properties, e.g. local contrast enhancement
- G06T 7/12: Segmentation; edge-based segmentation
- G06T 7/13: Edge detection
- G06T 2207/20081: Training; learning
- G06T 2207/20084: Artificial neural networks [ANN]
- G06T 2207/20192: Edge enhancement; edge preservation
- G06T 2207/30108, G06T 2207/30164: Industrial image inspection; workpiece; machine component
Abstract
The invention discloses a real-time detection method for the digital twin characteristics of a fan (wind turbine) based on edge-enhanced image segmentation. The method learns the probability distribution of image data in a supervised manner from a large number of labeled fan surface image samples, improving the model's ability to extract image features. At the same time, a new fan model is constructed in a virtual environment by a digital twin method, defects are added at reasonable positions on the fan, simulated shooting is carried out in the virtual environment with a virtual camera, and the generated pictures are labeled. This expands the training data set and ensures the generalization ability of the deep learning model for classification and segmentation over a wider range of complex images.
Description
Technical Field
The invention relates to the fields of computer vision and digital twins, and in particular to a real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation.
Background
Digital Twin technology makes full use of data such as physical models, sensor updates and operation history, integrates multidisciplinary, multi-physical-quantity, multi-scale, multi-probability simulation processes, and completes the mapping of the physical world into the virtual world so as to reflect the whole life cycle of the corresponding physical equipment. A digital twin can be viewed as a digital mapping system of one or more important, interdependent equipment systems, and serves as the link through which the physical world interacts and fuses with the virtual world.
The image feature classification task identifies the semantic information of image sub-regions and assigns it to corresponding feature classes; the semantic information includes fine-grained features such as color, contour and texture. Traditional methods are time-consuming and labor-intensive in expressing features and operate reliably only on identical, well-manufactured parts; as the types and numbers of defects grow, traditional algorithms become ever more complex. The emergence of deep neural network models has advanced the field of quality inspection: by learning from very large numbers of images, a deep neural network can quickly identify effective variables that traditional machine vision methods struggle to capture, such as illumination, curved surfaces or field of view, and thereby achieve reliable recognition of complex features under more complex conditions.
Edge detection is an important problem in image processing and computer vision. Its purpose is to capture important events and changes of attributes in the form of (possibly irregular) edges on the basis of image feature detection. Edge detection greatly reduces the irrelevant information in an image while retaining only its important structural attributes.
Wind energy has developed rapidly in recent years as a promising renewable energy source, and the accident frequency of wind turbines has increased year by year. Timely detection of incipient damage features on the wind turbine surface is therefore of great significance for reducing catastrophic accidents and increasing the effective working time of the turbine.
Disclosure of Invention
The invention aims to provide a real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation. The method introduces a data set expansion scheme based on digital twins and an image segmentation technique based on a deep neural network, strengthening the recognition of fan edges. While generating a large amount of effective virtual data, the method learns the probability distribution of image data in a supervised manner from a large number of labeled fan surface image samples, improving the model's ability to extract digital features from images and ensuring its classification generalization ability on a wider range of complex images. Finally, the trained model is carried by an unmanned aerial vehicle supporting remote high-precision control, achieving highly automated, low-cost, fast and accurate recognition of fan surface target features in field scenarios.
The purpose of the invention is realized by the following technical scheme:
A real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation comprises the following steps:
(1) selecting a number of pictures of a specific fan that show its surface features, modeling the fan and its surrounding scene in a virtual environment on the basis of these pictures, and adding surface features to the virtual fan;
(2) carrying out simulated shooting of the virtual fan model bearing surface features with a camera in the virtual scene to obtain digital twin virtual fan pictures with surface features;
(3) expanding and enhancing the data set of real fan pictures and digital twin virtual fan pictures, then marking the region of every surface feature shown in the pictures and the category of each surface feature, the categories comprising vortex generator panel, missing vortex generator panel, and corrosion (rust);
(4) selecting Mask RCNN as the basic image segmentation network and strengthening its edge recognition to obtain an edge-enhanced Mask RCNN; abstracting the regions of the fan surface features in all the fan pictures obtained in step (3) into digital vectors and the categories of the surface features into scalars, the digital vectors serving as the input of the edge-enhanced Mask RCNN and the scalars serving as the comparison labels for its classification; the edge-enhanced Mask RCNN is a deep-learning image segmentation neural network in which the weight of the predicted region boundary in the loss function is increased during Mask RCNN training;
(5) inputting the fan pictures expanded and enhanced in step (3), together with the digital vectors and comparison labels, into the edge-enhanced Mask RCNN for training, obtaining parameter-optimized weights of the edge-enhanced Mask RCNN;
(6) inputting a picture to be tested into the parameter-optimized edge-enhanced Mask RCNN, which outputs in real time the category of every surface feature in the picture and the region of each surface feature.
Further, the edge-enhanced Mask RCNN is formed by connecting four parts in series: a deep convolutional neural network, an RPN network, a RoIAlign network, and a parallel pair consisting of a fully connected layer and an improved FCN network. The deep convolutional neural network extracts information at different scales from the input image; the RPN network generates candidate regions of different sizes for the surface features to be detected; the RoIAlign network converts those differently sized regions into digital features of fixed length; the fully connected layer performs category detection; and the improved FCN network performs region detection of the surface features.
Further, Resnet101 is selected as the deep convolutional neural network.
Further, in step (1), the virtual scene is constructed by three-dimensional fusion of photographs of the physical entity, and the virtual fan and fan features are generated from the real fan model and existing feature images.
Further, in step (2), the virtual camera is set to rotate automatically around the fan body and shoots automatically.
Further, in step (3), the fan pictures with surface features are first expanded using the two image enhancement methods of pyramid and patch enhancement; all expanded pictures are then further augmented by left-right flipping, contrast normalization and Gaussian blur; finally, the region of each surface feature in a picture and its category are labeled.
Further, in step (3), the contour of each surface feature is labeled to represent its region, and the category of each surface feature is encoded as an integer.
Further, the feature map output by the deep convolutional neural network is used as the input of the RPN network, which generates candidate boxes with aspect ratios of 1:2, 1:1 and 2:1 and sizes of 32, 64, 128, 256 and 512; the candidate boxes containing a target object are retained after detection by the RPN network.
The invention has the following beneficial effects:
(1) Compared with traditional data set expansion means such as image scaling, flipping and rotation, the method reconstructs the fan with a digital twin. In the reconstructed model, features of different types and shapes can be added at appropriate positions, and environmental conditions such as illumination and climate can be varied, enriching the training images of the fan under diverse conditions and improving the neural network's adaptability to changes in environment and fan features during detection.
(2) Compared with existing target detection networks, the method adopts an edge-enhanced Mask RCNN that increases the weight of edges during training, so it can predict more accurate pixel-level edges, which benefits the detection of fan surface features.
(3) The method can quickly identify effective variables in images that traditional machine vision methods struggle to capture, and can reliably recognize complex features under complex working conditions. It accurately detects both the edges of wind turbine surface features and their categories, so that surface damage can be detected and repaired in time, reducing catastrophic accidents and increasing the effective working time of the wind turbine.
Drawings
FIG. 1 is a schematic diagram of the digital twin data set expansion method.
FIG. 2 is a schematic diagram of the image enhancement method.
FIG. 3 is a schematic diagram of data set labeling.
FIG. 4 is a schematic diagram of the image segmentation process implemented by the edge-enhanced Mask RCNN.
FIG. 5 is a schematic diagram of the overall structural framework of the edge-enhanced Mask RCNN.
FIG. 6 is a schematic diagram of RPN candidate sub-region generation.
FIG. 7 is a schematic diagram of detection edge generation by the FCN network.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments, so that its objects and effects become more apparent. It should be understood that the specific embodiments described here only illustrate the invention and do not limit it.
The invention discloses a real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation, which comprises the following steps.
Step one: select a number of pictures of a specific fan showing its surface features, model the fan and its surrounding scene in a virtual environment on the basis of these pictures, and add surface features to the virtual fan.
The virtual scene is constructed by three-dimensional fusion of photographs of the physical entity; the virtual fan and its features are generated from the real fan model and existing feature images.
Step two: carry out simulated shooting of the virtual fan model with surface features using a camera in the virtual scene to obtain fan pictures with surface features, as shown in FIG. 1.
Step three: select digital-twin-generated and real fan pictures with clear surface features and expand the selected pictures. FIG. 2 is a schematic diagram of the pyramid image enhancement method used in the invention. The original image is enlarged or reduced by a fixed ratio, and a sliding window of fixed size crops the original and scaled images; during cropping, each window is required to contain a target feature, and every cropped image becomes a new data sample.
In this embodiment, a ratio of 1.5 is used for enlargement and reduction. Cropping the 1161 original and scaled pictures with a 1000 pixel x 1000 pixel sliding window yields 1969 images of 1000 x 1000 pixels that each contain some target feature; all of them are used as the data set for training the network.
As FIG. 2 shows, in an original 4000 pixel x 3000 pixel image the target region occupies only a very small share of the total area, which hinders feature extraction from the target region by a neural network. In the 1000 x 1000 pixel images cropped by the sliding window, the target region's share of the image area increases greatly, which helps the neural network recognize the features of the target region. All expanded pictures are then further augmented by left-right flipping, contrast normalization and Gaussian blur.
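To make the expansion step concrete, a minimal sketch in Python with OpenCV is given below. The 1.5 scale ratio and 1000-pixel window come from the embodiment above; the function names, the stride, and the fixed contrast factor are illustrative assumptions, and the filtering of crops that contain no target feature (done via the labels) is omitted:

```python
import cv2
import numpy as np

def pyramid_crops(img, scale=1.5, win=1000, stride=500):
    # Enlarge and shrink by the given ratio, then slide a win x win window
    # over the original and both scaled versions; each crop becomes a
    # candidate training sample.
    versions = [img,
                cv2.resize(img, None, fx=scale, fy=scale),
                cv2.resize(img, None, fx=1.0 / scale, fy=1.0 / scale)]
    crops = []
    for v in versions:
        h, w = v.shape[:2]
        if h < win or w < win:
            continue  # skip versions smaller than the window
        for y in range(0, h - win + 1, stride):
            for x in range(0, w - win + 1, stride):
                crops.append(v[y:y + win, x:x + win])
    return crops

def photometric_variants(img):
    # Left-right flip, simple contrast normalization, and Gaussian blur,
    # applied to every already-cropped picture.
    mean = img.mean()
    contrast = np.clip((img.astype(np.float32) - mean) * 1.2 + mean,
                       0, 255).astype(np.uint8)
    return [cv2.flip(img, 1), contrast, cv2.GaussianBlur(img, (5, 5), 1.0)]
```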
The region of every surface feature shown in a picture is then marked, together with the category of each surface feature; the categories comprise vortex generator panel, missing vortex generator panel, and rust.
As shown in FIG. 3, this embodiment outlines the features on each picture, representing the shape of every feature by an irregular polygon, and expresses the category of each feature as an integer (in the invention: 1 = vortex generator panel, VG; 2 = vortex generator panel missing teeth, VGMT; 3 = rust, RUST). Only the polygon vertices and the category code are kept in the generated label, which is compressed into a JSON file.
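A sketch of the label format this describes follows; the file name, field names and vertex values are hypothetical, and only the polygon-plus-integer-code scheme comes from the text:

```python
import json

# One record per surface feature: the polygon outline (vertex list) and an
# integer class code (1 = VG panel, 2 = VG panel missing teeth, 3 = rust).
annotations = [
    {"class": 1, "polygon": [[120, 340], [180, 335], [185, 400], [118, 405]]},
    {"class": 3, "polygon": [[610, 90], [660, 88], [655, 150]]},
]

with open("sample_0001.json", "w") as f:
    json.dump({"image": "sample_0001.png", "features": annotations}, f)
```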
Step four: abstract the regions of the fan surface features in all fan pictures obtained in the preceding steps into digital vectors and the categories of the surface features into scalars; the digital vectors serve as the input of the edge-enhanced Mask RCNN, and the scalars serve as the comparison labels for its classification.
Step five: input the fan pictures expanded and enhanced in step three, together with the digital vectors and comparison labels, into the edge-enhanced Mask RCNN for training, obtaining parameter-optimized weights of the edge-enhanced Mask RCNN.
Step six: input the picture to be tested into the parameter-optimized edge-enhanced Mask RCNN, which outputs in real time the category of every surface feature in the picture and the region of each surface feature.
The edge-enhanced Mask RCNN is a deep-learning image segmentation neural network in which the weight of the predicted region boundary in the loss function is increased during Mask RCNN training. It is formed by connecting in series a deep convolutional neural network, an RPN (Region Proposal Network), a RoIAlign (Region of Interest Align) network, and a parallel pair consisting of a fully connected layer and an improved FCN (Fully Convolutional Network). The deep convolutional neural network extracts information at different scales from the input image; Resnet101 is selected in this embodiment. The RPN generates candidate regions of different sizes for the surface features to be detected; the RoIAlign network converts those differently sized regions into digital features of fixed length; the fully connected layer performs category detection; and the improved FCN performs region detection of the surface features.
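For orientation, a rough skeleton of such a network can be assembled from torchvision's stock Mask R-CNN. Note that this sketch uses the library's ResNet-50-FPN backbone as a stand-in for the Resnet101 backbone named in the embodiment, and the head-swapping calls follow torchvision's standard fine-tuning pattern rather than the patent's exact implementation:

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# 4 classes: background + VG panel, VG panel missing teeth, rust.
num_classes = 4
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box-classification head for our class count.
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)

# Replace the mask (FCN) head likewise.
in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes)
```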
The loss function of the edge-enhanced Mask RCNN is improved so that the boundary of the predicted region carries a larger weight during training, which ensures accurate prediction near region edges. The loss function of the edge-enhanced Mask RCNN is as follows:

$$w_i = \begin{cases} 1, & \text{for points inside the ground-truth label region,} \\ w \;(w > 1), & \text{for points on the edge of the ground-truth label region,} \end{cases}$$

where $w_i$ is the weight coefficient of pixel $i$. First, the ground-truth label picture is convolved with a 3 x 3 convolution kernel $g$, with the boundary zero-padded and the stride set to 1; the convolution yields a picture of the same size as the original, in which the elements corresponding to edge points of the label region are positive, those corresponding to interior points of the label region are zero, and those corresponding to points outside the label region are negative. Passing each element through a step function then gives the weight matrix of the loss function,

$$W = y(g * Y_{\mathrm{True}}),$$

where $y(\cdot)$ is a step function (read here as mapping positive responses to $w$ and all other values to 1, consistent with the definition of $w_i$), and $Y_{\mathrm{True}}$ is the ground-truth label, with pixels belonging to the target set to 1 and background pixels set to 0. The loss function of the edge-enhanced Mask RCNN mask branch is then the average of the Hadamard (element-wise) product of the weight matrix $W$ and the per-pixel cross entropy $L_P$:

$$L_{\mathrm{mask}} = \operatorname{mean}\!\left(W \odot L_P\right).$$

Strengthening the weight of boundary pixels during training makes the generated mask more accurate and provides better localization performance.
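A minimal PyTorch sketch of this edge-weighted mask loss follows. The specific 3 x 3 kernel values and the default weight w = 3 are assumptions (the text only states the sign pattern the kernel must produce: zero inside the mask, positive on its edge, negative just outside), so this is an illustrative reading rather than the patented implementation:

```python
import torch
import torch.nn.functional as F

def edge_weighted_mask_loss(pred, target, w=3.0):
    # pred, target: (N, 1, H, W); pred holds per-pixel probabilities,
    # target is the binary ground-truth mask as float (1 = object, 0 = bg).
    # Kernel with zero response on interior mask pixels, positive response
    # on mask-edge pixels, negative response just outside the mask.
    g = torch.tensor([[-1., -1., -1.],
                      [-1.,  8., -1.],
                      [-1., -1., -1.]], device=target.device).view(1, 1, 3, 3)
    edge = F.conv2d(target, g, stride=1, padding=1)   # zero-padded borders
    # Step function y(.): weight w on edge pixels, 1 everywhere else.
    W = torch.where(edge > 0, torch.full_like(edge, w), torch.ones_like(edge))
    L_p = F.binary_cross_entropy(pred, target, reduction="none")
    return (W * L_p).mean()                           # mean of Hadamard product
```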
FIG. 4 is a flowchart of the use of the edge-enhanced Mask RCNN network of the invention, and FIG. 5 is its structure diagram. Network training is carried out mainly on a GPU, which excels at highly parallel computation. To reduce the pressure of the data set on GPU memory, the data set images are first preprocessed and the image labels compressed: the bounding box in each label is extracted, and the label color block inside the bounding box is compressed into a small 56 pixel x 56 pixel picture, reducing the amount of data read. When the compressed label color blocks are stored, the data are not rounded but stored as floating point numbers, so the blocks can later be restored accurately.
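A sketch of this label compression, assuming OpenCV bilinear resizing as the compression step (the text does not name the resampling method):

```python
import cv2
import numpy as np

def compress_label(mask, box, size=56):
    # Cut the label color block out of its bounding box and shrink it to
    # size x size; values stay floating point so the block can later be
    # restored accurately (no rounding).
    x1, y1, x2, y2 = box
    crop = mask[y1:y2, x1:x2].astype(np.float32)
    return cv2.resize(crop, (size, size), interpolation=cv2.INTER_LINEAR)

def restore_label(small, box):
    x1, y1, x2, y2 = box
    return cv2.resize(small, (x2 - x1, y2 - y1), interpolation=cv2.INTER_LINEAR)
```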
The core network of the model then uses a Resnet101 deep convolutional neural network to extract multi-scale information from the input image and outputs feature maps containing effective image feature information, working together with a Feature Pyramid Network (FPN) to address the difficulty of detecting small objects in target detection scenes. Convolving images of different sizes yields features of objects at different scales: shallow features distinguish simple targets, while deep features distinguish complex ones. The picture is first fed into the constructed Resnet101 network and reduced stage by stage to produce feature maps of different scales; a corresponding top-down pathway is then built on this basis, and each upsampled map is fused with the feature map of matching size.
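The lateral-fusion structure described here matches what torchvision's FeaturePyramidNetwork operator provides; a small sketch, where the channel widths of the ResNet C2-C5 stages and the layer names are assumptions for illustration:

```python
from collections import OrderedDict
import torch
from torchvision.ops import FeaturePyramidNetwork

# Channel widths of the ResNet C2-C5 stages.
fpn = FeaturePyramidNetwork(in_channels_list=[256, 512, 1024, 2048],
                            out_channels=256)
feats = OrderedDict([
    ("c2", torch.randn(1, 256, 200, 200)),
    ("c3", torch.randn(1, 512, 100, 100)),
    ("c4", torch.randn(1, 1024, 50, 50)),
    ("c5", torch.randn(1, 2048, 25, 25)),
])
pyramid = fpn(feats)  # top-down upsampling fused with lateral connections
```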
The generation of candidate boxes by the RPN is shown in FIG. 6. After feature extraction is complete, the feature maps are input into the RPN to detect candidate boxes. In practical experiments, aspect ratios of 1:2, 1:1 and 2:1 and sizes of 32, 64, 128, 256 and 512 are used, so each small region generates 15 candidate boxes; among these, all rectangular boxes containing a target object are retained after detection by the RPN. The rectangular boxes are divided into three classes by their Intersection over Union (IoU) with the ground truth (GT) box, defined as

$$\mathrm{IoU}(A, B) = \frac{\operatorname{area}(A \cap B)}{\operatorname{area}(A \cup B)}.$$
A rectangular box is positive when its IoU with the GT is greater than 0.7, negative when its IoU with the GT is less than 0.3, and neutral when the IoU lies between 0.3 and 0.7. Positive and negative boxes participate in training; neutral boxes are discarded and do not participate. Where several positive rectangular boxes overlap, non-maximum suppression keeps only the box with the highest score.
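A plain-Python sketch of the IoU test and the three-way anchor labeling described above (the function names are illustrative):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def anchor_label(anchor, gt):
    v = iou(anchor, gt)
    if v > 0.7:
        return 1    # positive sample: participates in training
    if v < 0.3:
        return 0    # negative sample: participates in training
    return -1       # neutral: discarded, does not participate
```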
Using the resulting rectangular boxes and the multi-scale feature maps, the region of interest is mapped from the input image to the corresponding position on the feature map. The mapped region is divided into several small cells, the value of each cell is obtained by bilinear interpolation, and floating point coordinates are kept throughout the division (no rounding). RoIAlign guarantees that the number of cells equals the final output dimension and finally applies max pooling to each cell, producing a digital feature vector of fixed length. The extracted digital features are fed into the region-of-interest classifier, which identifies and classifies each candidate box containing a target object; the position and size of the box are further refined according to the recognized object category, and the candidate box with the highest confidence is output as the bounding box. The network structure of the edge-enhanced Mask RCNN is shown in FIG. 5.
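The fixed-length pooling step corresponds to the roi_align operator available in torchvision; a usage sketch with made-up tensor sizes:

```python
import torch
from torchvision.ops import roi_align

feature_map = torch.randn(1, 256, 64, 64)          # one backbone level
# Each RoI is (batch_index, x1, y1, x2, y2) in feature-map coordinates;
# the coordinates stay floating point, as described above.
rois = torch.tensor([[0., 10.4, 7.9, 41.6, 30.2]])
pooled = roi_align(feature_map, rois, output_size=(7, 7),
                   spatial_scale=1.0, sampling_ratio=2)
print(pooled.shape)  # torch.Size([1, 256, 7, 7])
```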
FIG. 7 illustrates the generation of target feature detection edges by the region-generation branch. This branch adopts an FCN, which replaces fully connected layers with convolutional layers so that spatial distribution information is preserved. The FCN upsamples the feature map of the last convolutional layer with a deconvolution layer, restoring it to the same size as the input image, so that a prediction is produced for every pixel while the spatial information of the original input is retained; the upsampled feature map is then classified pixel by pixel to output the edges of local features. Because the earlier stages of the network narrow segmentation down to the interior of a bounding box, the target object is guaranteed to be the main object of the recognition area; this reduces the computation required for training and recognition and greatly improves recognition accuracy.
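A sketch of such a mask branch, modeled on the conv-plus-deconvolution head commonly used in Mask R-CNN implementations (the layer counts and widths are assumptions, not taken from the patent):

```python
import torch.nn as nn

def make_mask_head(in_channels=256, num_classes=4):
    # Four 3x3 conv layers, one 2x deconvolution to upsample, then a 1x1
    # conv producing one score map per class for every pixel.
    layers = []
    for _ in range(4):
        layers += [nn.Conv2d(in_channels, 256, 3, padding=1),
                   nn.ReLU(inplace=True)]
        in_channels = 256
    layers += [nn.ConvTranspose2d(256, 256, 2, stride=2),
               nn.ReLU(inplace=True),
               nn.Conv2d(256, num_classes, 1)]
    return nn.Sequential(*layers)
```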
Taking the accuracy for each feature category and the overall accuracy as indices, Table 1 compares the method with the mainstream target detection networks RCNN, Faster RCNN and Mask RCNN. The edge-enhanced Mask RCNN is superior to Mask RCNN both in the detection of individual features and in overall accuracy.
TABLE 1 accuracy of multiple target detection networks
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the invention and is not intended to limit the invention to the particular forms disclosed, and that modifications may be made, or equivalents may be substituted for elements thereof, while remaining within the scope of the claims that follow. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.
Claims (8)
1. A real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation, characterized by comprising the following steps:
(1) selecting a number of pictures of a specific fan that show its surface features, modeling the fan and its surrounding scene in a virtual environment on the basis of these pictures, and adding surface features to the virtual fan;
(2) carrying out simulated shooting of the virtual fan model bearing surface features with a camera in the virtual scene to obtain digital twin virtual fan pictures with surface features;
(3) expanding and enhancing the data set of real fan pictures and digital twin virtual fan pictures, then marking the region of every surface feature shown in the pictures and the category of each surface feature, the categories comprising vortex generator panel, missing vortex generator panel, and corrosion;
(4) selecting Mask RCNN as the basic image segmentation network and strengthening its edge recognition to obtain an edge-enhanced Mask RCNN; abstracting the regions of the fan surface features in all the fan pictures obtained in step (3) into digital vectors and the categories of the surface features into scalars, the digital vectors serving as the input of the edge-enhanced Mask RCNN and the scalars serving as the comparison labels for its classification, the edge-enhanced Mask RCNN being a deep-learning image segmentation neural network in which the weight of the predicted region boundary in the loss function is increased during Mask RCNN training;
(5) inputting the fan pictures expanded and enhanced in step (3), together with the digital vectors and comparison labels, into the edge-enhanced Mask RCNN for training, obtaining parameter-optimized weights of the edge-enhanced Mask RCNN;
(6) inputting a picture to be tested into the parameter-optimized edge-enhanced Mask RCNN, which outputs in real time the category of every surface feature in the picture and the region of each surface feature.
2. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation according to claim 1, characterized in that the edge-enhanced Mask RCNN is formed by connecting four parts in series: a deep convolutional neural network, an RPN network, a RoIAlign network, and a parallel pair consisting of a fully connected layer and an improved FCN network; the deep convolutional neural network extracts information at different scales from the input image, the RPN network generates candidate regions of different sizes for the surface features to be detected, the RoIAlign network converts those differently sized regions into digital features of fixed length, the fully connected layer performs category detection, and the improved FCN network performs region detection of the surface features.
3. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation according to claim 2, characterized in that Resnet101 is selected as the deep convolutional neural network.
4. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation according to claim 1, characterized in that in step (1) the virtual scene is constructed by three-dimensional fusion of photographs of the physical entity, and the virtual fan and fan features are generated from the real fan model and existing feature images.
5. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation according to claim 1, characterized in that in step (2) the virtual camera is set to rotate automatically around the fan body and shoots automatically.
6. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation according to claim 1, characterized in that in step (3) the fan pictures with surface features are first expanded using the two image enhancement methods of pyramid and patch enhancement, all expanded pictures are then further augmented by left-right flipping, contrast normalization and Gaussian blur, and the region of each surface feature in a picture and its category are then labeled.
7. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation according to claim 1, characterized in that in step (3) the contour of each surface feature is labeled to represent its region, and the category of each surface feature is defined as an integer.
8. The real-time detection method for the digital twin characteristics of a fan based on edge-enhanced image segmentation, characterized in that the feature map output by the deep convolutional neural network is used as the input of the RPN network, which generates candidate boxes with aspect ratios of 1:2, 1:1 and 2:1 and sizes of 32, 64, 128, 256 and 512, the candidate boxes containing a target object being retained after detection by the RPN network.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210680421.8A (published as CN114782417A) | 2022-06-16 | 2022-06-16 | Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation |
Publications (1)

Publication Number | Publication Date |
---|---|
CN114782417A | 2022-07-22 |
Family

ID=82421342

Family Applications (1)

Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210680421.8A (CN114782417A, pending) | Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation | 2022-06-16 | 2022-06-16 |

Country Status (1)

Country | Link |
---|---|
CN | CN114782417A |
Cited By (3)

Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115780306A * | 2022-12-20 | 2023-03-14 | 广东工业大学 | High-speed multi-scale MLCC detection device and method |
CN115841557A * | 2023-02-23 | 2023-03-24 | 河南核工旭东电气有限公司 | Intelligent crane operation environment construction method based on digital twinning technology |
CN116030050A * | 2023-03-27 | 2023-04-28 | 浙江大学 | On-line detection and segmentation method for surface defects of fan based on unmanned aerial vehicle and deep learning |
Patent Citations (12)

Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021067665A2 * | 2019-10-03 | 2021-04-08 | Photon-X, Inc. | Enhancing artificial intelligence routines using 3d data |
US20210110262A1 * | 2019-10-14 | 2021-04-15 | Honda Research Institute Europe Gmbh | Method and system for semi-supervised deep anomaly detection for large-scale industrial monitoring systems based on time-series data utilizing digital twin simulation data |
CN112712098A * | 2019-10-25 | 2021-04-27 | 北京四维图新科技股份有限公司 | Image data processing method and device |
WO2022044297A1 * | 2020-08-28 | 2022-03-03 | 日本電信電話株式会社 | Information processing method, information processing device, and information processing program |
CN112298287A * | 2020-11-13 | 2021-02-02 | 重庆交通大学 | Method for preventing collision of high-speed train groups in cooperative formation under communication interruption condition |
CN113011288A * | 2021-03-02 | 2021-06-22 | 中北大学 | Mask RCNN algorithm-based remote sensing building detection method |
CN113505655A * | 2021-06-17 | 2021-10-15 | 电子科技大学 | Bearing fault intelligent diagnosis method for digital twin system |
CN114048600A * | 2021-11-09 | 2022-02-15 | 苏州纽克斯电源技术股份有限公司 | Digital twin-driven multi-model fusion industrial system anomaly detection method |
CN114359658A * | 2021-12-15 | 2022-04-15 | 深圳市优必选科技股份有限公司 | Training data generation method and device, terminal equipment and storage medium |
CN114332006A * | 2021-12-28 | 2022-04-12 | 武汉大学 | Automatic quantitative assessment method for urban battlement loss |
CN114511582A * | 2021-12-28 | 2022-05-17 | 武汉大学 | Automatic ancient city battlement extraction method |
CN114492511A * | 2021-12-31 | 2022-05-13 | 中云开源数据技术(上海)有限公司 | Fault diagnosis method based on digital twinning |
Non-Patent Citations (1)

Title |
---|
金侠挺 et al., "基于贝叶斯CNN和注意力网络的钢轨表面缺陷检测系统" [Rail surface defect detection system based on Bayesian CNN and attention network], 《自动化学报》 (Acta Automatica Sinica) * |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20220722 |