
CN117456356A - A video recognition and early warning method for urban waterlogging based on deep learning - Google Patents

A video recognition and early warning method for urban waterlogging based on deep learning

Info

Publication number
CN117456356A
CN117456356A (application CN202311356080.XA)
Authority
CN
China
Prior art keywords
image
ponding
urban waterlogging
original
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311356080.XA
Other languages
Chinese (zh)
Inventor
薛丰昌
陈笑娟
吕鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Meteorological Disaster Prevention And Environmental Meteorological Center Hebei Early Warning Information Release Center
Nanjing University of Information Science and Technology
Original Assignee
Hebei Meteorological Disaster Prevention And Environmental Meteorological Center Hebei Early Warning Information Release Center
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Meteorological Disaster Prevention And Environmental Meteorological Center Hebei Early Warning Information Release Center, Nanjing University of Information Science and Technology filed Critical Hebei Meteorological Disaster Prevention And Environmental Meteorological Center Hebei Early Warning Information Release Center
Priority to CN202311356080.XA priority Critical patent/CN117456356A/en
Publication of CN117456356A publication Critical patent/CN117456356A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/176Urban or other man-made structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

本发明公开了一种基于深度学习的城市内涝视频识别预警方法,包括:收集城市内涝图片数据,生成json格式的标签数据;将json格式数据处理成二值化图像格式;对积水图像数据进行包括旋转、缩放、色域变换、高斯模糊等预处理工作;将预处理后的图像数据与对应的标签数据输入DeepLabV3+深度学习模型中进行训练,获得最佳模型训练权重文件;输入城市内涝视频数据利用模型逐帧进行识别,将模型识别后的视频数据逐帧计算积水像素占整幅图像像素的比例,用来表征长时间序列城市内涝的动态范围变化过程。本发明能够将城市内广泛分布的监控设施作为城市内涝的监测媒介对积水状况进行实时地识别,并能反应城市内涝的动态变化。

The invention discloses a deep learning-based urban waterlogging video recognition and early warning method, which includes: collecting urban waterlogging image data and generating label data in json format; converting the json data into a binarized image format; preprocessing the ponding image data, including rotation, scaling, color gamut transformation and Gaussian blur; inputting the preprocessed image data and the corresponding label data into the DeepLabV3+ deep learning model for training to obtain the best model weight file; and inputting urban waterlogging video data into the model for frame-by-frame recognition, then computing, frame by frame, the proportion of ponding pixels in the whole image, which characterizes the dynamic range change of urban waterlogging over a long time series. The invention can use the monitoring facilities widely distributed in a city as monitoring media for urban waterlogging, identifying ponding conditions in real time and reflecting the dynamic changes of urban waterlogging.

Description

一种基于深度学习的城市内涝视频识别预警方法A video recognition and early warning method for urban waterlogging based on deep learning

技术领域Technical field

本发明涉及一种基于深度学习的城市内涝视频识别预警方法,属于城市内涝监测预警技术领域。The invention relates to a deep learning-based urban waterlogging video recognition and early warning method, and belongs to the technical field of urban waterlogging monitoring and early warning.

背景技术Background technique

目前，传统人工监测方法效率低、危险性高；自动站监测方法成本高、维修困难、且自动站分布少难以兼顾全局；遥感监测法受卫星重返周期的影响难以对内涝进行实时监测，并且光学卫星难以穿透云层，SAR卫星受到城市复杂环境的影响不能有效地提取内涝区域。因此，高效、安全地对城市内涝进行监测对降低内涝造成的风险、减少内涝引起的生命财产损失和经济损失等具有重要意义。At present, traditional manual monitoring is inefficient and dangerous; automatic monitoring stations are costly, difficult to maintain, and too sparsely distributed to cover an entire city; remote sensing monitoring is limited by satellite revisit cycles and cannot monitor waterlogging in real time, optical satellites have difficulty penetrating clouds, and SAR satellites, affected by the complex urban environment, cannot effectively extract waterlogged areas. Therefore, efficient and safe monitoring of urban waterlogging is of great significance for reducing the risks posed by waterlogging and the resulting loss of life, property and economic value.

发明内容Contents of the invention

本发明所要解决的技术问题是：提供一种基于深度学习的城市内涝视频识别预警方法，利用图像分割模型预测积水区域并根据积水区域的动态变化进行预警，解决了现有内涝监测成本高、监测站点少难以兼顾全局的问题。The technical problem to be solved by the invention is to provide a deep learning-based urban waterlogging video recognition and early warning method that uses an image segmentation model to predict ponding areas and issues warnings based on their dynamic changes, thereby solving the problems that existing waterlogging monitoring is costly and that monitoring sites are too few to cover the whole city.

本发明为解决上述技术问题采用以下技术方案:The present invention adopts the following technical solutions to solve the above technical problems:

一种基于深度学习的城市内涝视频识别预警方法,包括如下步骤:A deep learning-based urban waterlogging video recognition and early warning method includes the following steps:

步骤1,获取城市内涝积水图像数据集,对数据集中每个原始积水图像中的积水区域进行标注,将标注后的积水图像保存为json格式;Step 1: Obtain the urban waterlogging image data set, label the water accumulation area in each original water accumulation image in the data set, and save the labeled water accumulation image in json format;

步骤2,对json格式的标注后的积水图像进行二值化,得到与原始积水图像一一对应的二值化图像;Step 2: Binarize the annotated water accumulation image in json format to obtain a binary image that corresponds one-to-one to the original water accumulation image;

步骤3，对步骤1获取的城市内涝积水图像数据集中的原始积水图像进行数据增强，得到增强后的数据集，将增强后的数据集及对应的二值化图像划分为训练集、验证集和测试集；Step 3: Perform data augmentation on the original ponding images in the urban waterlogging image data set obtained in step 1 to obtain an augmented data set, and divide the augmented data set and the corresponding binarized images into a training set, a verification set and a test set;

步骤4,构建DeepLabV3+图像分割模型,利用训练集和验证集对DeepLabV3+图像分割模型进行训练和验证,得到训练好的DeepLabV3+图像分割模型;Step 4: Construct the DeepLabV3+ image segmentation model, use the training set and verification set to train and verify the DeepLabV3+ image segmentation model, and obtain the trained DeepLabV3+ image segmentation model;

步骤5，利用训练好的DeepLabV3+图像分割模型对测试集中图像的积水区域进行识别，计算图像中积水区域的像素数量占整个图像像素数量的比例，即积水像素比，根据积水像素比表征积水区域的动态变化情况，并在积水像素比超出预设阈值时，进行预警。Step 5: Use the trained DeepLabV3+ image segmentation model to identify the ponding areas in the test-set images and calculate the ratio of ponding-area pixels to the total number of pixels in the image, i.e., the water pixel ratio; characterize the dynamic change of the ponding area by this ratio, and issue an early warning when the water pixel ratio exceeds a preset threshold.

作为本发明的一种优选方案,所述步骤1中,利用目视解译的方法对原始积水图像中的积水区域使用不规则多边形进行框选,得到标注后的积水图像。As a preferred solution of the present invention, in step 1, the water accumulation area in the original water accumulation image is framed using irregular polygons using a visual interpretation method to obtain an annotated water accumulation image.

作为本发明的一种优选方案,所述步骤2中,将标注后的积水图像中积水区域的像素值赋值1,其余部分的像素值赋值0,得到二值化图像。As a preferred solution of the present invention, in step 2, the pixel value of the water accumulation area in the marked water accumulation image is assigned a value of 1, and the pixel values of the remaining parts are assigned a value of 0, thereby obtaining a binary image.

作为本发明的一种优选方案,所述步骤3中,数据增强的具体操作如下:As a preferred solution of the present invention, in step 3, the specific operations of data enhancement are as follows:

1)在0到1内随机选取一个数a,若a在0到0.5之间,则进行数据增强操作,否则不进行数据增强操作;1) Randomly select a number a between 0 and 1. If a is between 0 and 0.5, perform data enhancement operation, otherwise no data enhancement operation will be performed;

2)在0到1内随机选取一个数b，若b在0到0.25之间，则对原始积水图像的长和宽进行随机缩放，缩放倍率在0.25到2之间随机选取，将原始积水图像对应的二值化图像进行相同倍率的缩放操作；2) Randomly select a number b between 0 and 1. If b is between 0 and 0.25, randomly scale the length and width of the original ponding image, with the scaling factor randomly selected between 0.25 and 2, and apply the same scaling to the binarized image corresponding to the original ponding image;

3)在0到1内随机选取一个数c，若c在0.25到0.5之间，则对原始积水图像进行随机翻转，翻转角度在0到360度间随机选取，将原始积水图像对应的二值化图像进行相同角度的翻转操作；3) Randomly select a number c between 0 and 1. If c is between 0.25 and 0.5, randomly rotate the original ponding image, with the angle randomly selected between 0 and 360 degrees, and apply the same rotation to the corresponding binarized image;

4)在0到1内随机选取一个数d,若d在0.5到0.75之间,则对原始积水图像进行高斯模糊,模糊核大小设为5×5;4) Randomly select a number d between 0 and 1. If d is between 0.5 and 0.75, perform Gaussian blur on the original water image, and set the blur kernel size to 5×5;

5)在0到1内随机选取一个数e，若e在0.75到1之间，则对原始积水图像进行色域变换，将原始积水图像转换到HSV色彩空间，进行色相、饱和度、亮度的随机变换。5) Randomly select a number e between 0 and 1. If e is between 0.75 and 1, perform a color gamut transformation on the original ponding image: convert it to the HSV color space and randomly perturb its hue, saturation and value.

作为本发明的一种优选方案，所述步骤4中，DeepLabV3+图像分割模型包括编码和解码两部分；其中，编码部分包括四次下采样的主干特征提取网络以及由空洞空间卷积模块和一个1×1的卷积层构成的加强特征提取网络；主干特征提取网络包括依次连接的第一至第四下采样模块以及一个1×1的卷积层，第一下采样模块包括依次连接的一个3×3的卷积层和两个3×3的瓶颈结构，第二下采样模块包括依次连接的两个3×3的瓶颈结构，第三下采样模块包括依次连接的三个3×3的瓶颈结构，第四下采样模块包括依次连接的七个3×3的瓶颈结构；空洞空间卷积模块包括五个并行的层，分别为一个1×1的卷积层、三个3×3的膨胀率分别为6、12和18的并行空洞卷积层以及一个全池化层；解码部分包括一个1×1的卷积层、一个3×3的卷积层以及第一和第二四倍上采样模块；As a preferred solution of the present invention, in step 4 the DeepLabV3+ image segmentation model consists of an encoding part and a decoding part. The encoding part comprises a backbone feature extraction network with four downsampling stages and an enhanced feature extraction network formed by an atrous spatial convolution module and a 1×1 convolutional layer. The backbone feature extraction network comprises first to fourth downsampling modules connected in sequence and a 1×1 convolutional layer: the first downsampling module comprises a 3×3 convolutional layer followed by two 3×3 bottleneck structures, the second comprises two 3×3 bottleneck structures connected in sequence, the third comprises three 3×3 bottleneck structures connected in sequence, and the fourth comprises seven 3×3 bottleneck structures connected in sequence. The atrous spatial convolution module comprises five parallel layers: a 1×1 convolutional layer, three parallel 3×3 atrous convolutional layers with dilation rates of 6, 12 and 18, and a global pooling layer. The decoding part comprises a 1×1 convolutional layer, a 3×3 convolutional layer, and first and second 4× upsampling modules;

将增强后的积水图像与对应的二值化图像作为DeepLabV3+图像分割模型的输入，经主干特征提取网络生成大小为原图像1/16的特征张量和大小为原图像1/4的卷积特征图层；特征张量传入空洞空间卷积模块中，由五个并行的层对特征张量进行并行处理后拼接，对拼接得到的图层进行1×1的卷积处理，得到加强特征提取网络的输出，由解码部分的第一四倍上采样模块对加强特征提取网络的输出进行四倍上采样；主干特征提取网络生成的卷积特征图层经1×1的卷积处理后与第一四倍上采样模块输出的图层进行通道拼接，拼接结果依次经3×3的卷积处理和第二四倍上采样模块后得到DeepLabV3+图像分割模型的输出。The augmented ponding image and the corresponding binarized image are used as input to the DeepLabV3+ image segmentation model. The backbone feature extraction network generates a feature tensor 1/16 the size of the original image and a convolutional feature layer 1/4 the size of the original image. The feature tensor is passed into the atrous spatial convolution module, where the five parallel layers process it in parallel before concatenation; the concatenated layer then undergoes a 1×1 convolution to give the output of the enhanced feature extraction network, which the first 4× upsampling module of the decoding part upsamples by a factor of four. The convolutional feature layer generated by the backbone, after a 1×1 convolution, is channel-concatenated with the layer output by the first 4× upsampling module; the concatenated result then passes through a 3×3 convolution and the second 4× upsampling module to give the output of the DeepLabV3+ image segmentation model.

一种计算机设备，包括存储器、处理器，以及存储在所述存储器中并能够在所述处理器上运行的计算机程序，所述处理器执行所述计算机程序时实现如上所述的基于深度学习的城市内涝视频识别预警方法的步骤。A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the deep learning-based urban waterlogging video recognition and early warning method described above.

一种计算机可读存储介质，所述计算机可读存储介质存储有计算机程序，所述计算机程序被处理器执行时实现如上所述的基于深度学习的城市内涝视频识别预警方法的步骤。A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the deep learning-based urban waterlogging video recognition and early warning method described above.

本发明采用以上技术方案与现有技术相比,具有以下技术效果:Compared with the existing technology, the present invention adopts the above technical solution and has the following technical effects:

1、本发明提出的方法能够脱离传统的监测站点，对内涝进行监测，城市中广泛分布的监控设施都可以作为城市内涝监测的媒介，并且能够为模型提供海量训练数据；可以对模型不断更新优化，以提升对内涝的识别精度；能够通过计算积水像素比的方式对长时间序列视频中的内涝动态范围变化进行表征计算，并以此作为参照发布城市内涝预警信息。1. The method proposed by the present invention can monitor waterlogging without relying on traditional monitoring stations: monitoring facilities widely distributed throughout a city can all serve as media for urban waterlogging monitoring and can provide massive training data for the model; the model can be continuously updated and optimized to improve recognition accuracy; and by calculating the water pixel ratio, the method can characterize the dynamic range change of waterlogging in long video sequences and use it as a reference for issuing urban waterlogging early warning information.

2、本发明突破了传统积水监测站点分布少、成本高的局限性，能够将城市内广泛分布的监控设施作为城市内涝的监测媒介对积水状况进行实时地识别，并能反应城市内涝的动态变化，为能更加高效、安全、及时地对内涝进行监测和预警提供了技术支持。2. The present invention overcomes the limitations of traditional ponding monitoring stations, which are sparsely distributed and costly. It can use the monitoring facilities widely distributed in a city as monitoring media for urban waterlogging to identify ponding conditions in real time and reflect the dynamic changes of urban waterlogging, providing technical support for more efficient, safe and timely waterlogging monitoring and early warning.

附图说明Description of the drawings

图1是本发明一种基于深度学习的城市内涝视频识别预警方法的流程图;Figure 1 is a flow chart of a deep learning-based urban waterlogging video recognition and early warning method of the present invention;

图2是积水预期二值化示意图，其中，(a)是原始图像，(b)是二值化图像；Figure 2 is a schematic diagram of ponding-region binarization, where (a) is the original image and (b) is the binarized image;

图3是图像分割模型中主干特征提取网络的结构图;Figure 3 is a structural diagram of the backbone feature extraction network in the image segmentation model;

图4是图像分割模型的结构图;Figure 4 is a structural diagram of the image segmentation model;

图5是利用图像分割模型识别出的不同阶段视频内涝的效果图;Figure 5 is a rendering of video waterlogging at different stages identified using the image segmentation model;

图6是内涝视频中积水像素比变化曲线图。Figure 6 is a graph showing the change in water pixel ratio in the waterlogging video.

具体实施方式Detailed ways

下面详细描述本发明的实施方式,所述实施方式的示例在附图中示出。下面通过参考附图描述的实施方式是示例性的,仅用于解释本发明,而不能解释为对本发明的限制。Embodiments of the invention are described in detail below, examples of which are illustrated in the accompanying drawings. The embodiments described below with reference to the drawings are exemplary and are only used to explain the present invention and cannot be construed as limitations of the present invention.

如图1所示,为本发明提出的一种基于深度学习的城市内涝视频识别预警方法的流程图,包括如下步骤:As shown in Figure 1, it is a flow chart of a deep learning-based urban waterlogging video identification and early warning method proposed by the present invention, which includes the following steps:

步骤1:制作城市内涝积水图像数据集,生成json格式文件。Step 1: Create an urban waterlogging image data set and generate a json format file.

收集能够反应复杂场景下城市内涝状况的图片数据，利用图像标注工具绘制多边形。图像左下角顶点坐标为(0,0)，绘制的多边形各个顶点坐标为{(x1,y1),(x2,y2)……(xk,yk)}，k表示某个多边形共有k个顶点，由这k个顶点连线包围的区域为积水区域，利用json文件保存每个多边形的顶点坐标信息。Collect image data reflecting urban waterlogging conditions in complex scenes, and draw polygons with an image annotation tool. The coordinate of the image's lower-left corner is (0,0), and the vertices of each drawn polygon are {(x1,y1), (x2,y2), ..., (xk,yk)}, where k is the number of vertices of that polygon. The area enclosed by the lines connecting these k vertices is the ponding area; a json file stores the vertex coordinates of each polygon.

步骤2：将json文件转换为二值化图像；json文件中多边形包围部分的积水区域图像像素赋予值1，其余部分图像像素赋予值0，每张二值化图像对应一个原始图像。积水预期二值化示意图如图2的(a)和(b)所示。Step 2: Convert the json file into a binarized image; image pixels in the ponding area enclosed by the polygons in the json file are assigned the value 1, and the remaining pixels the value 0, with each binarized image corresponding to one original image. A schematic of ponding binarization is shown in Figure 2 (a) and (b).
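The json-label-to-binarized-image conversion of steps 1 and 2 can be sketched as follows. This is an illustrative rasterization using the even-odd (ray casting) rule, not code from the patent; `polygon_to_mask` and its signature are hypothetical names introduced here.

```python
import numpy as np

def polygon_to_mask(vertices, height, width):
    """Rasterize one labeled polygon into a binary mask (1 = ponding, 0 = other).

    `vertices` is the list [(x1, y1), ..., (xk, yk)] read from the json label
    file; the even-odd (ray casting) rule decides which pixel centers fall
    inside the polygon.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    xs = np.arange(width) + 0.5                # sample at pixel centers
    k = len(vertices)
    for row in range(height):
        y = row + 0.5
        inside = np.zeros(width, dtype=bool)
        for i in range(k):
            x1, y1 = vertices[i]
            x2, y2 = vertices[(i + 1) % k]
            if (y1 <= y) != (y2 <= y):         # edge crosses this scanline
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                inside ^= xs < x_cross         # parity of crossings to the right
        mask[row] = inside
    return mask
```

In practice an annotation tool's own json-to-dataset utility (or a library polygon-fill routine) would perform this step; the explicit loop only makes the geometry of the claim visible.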

步骤3:对原始的城市内涝图像数据进行预处理,并将数据拆分为训练集、验证集和测试集。Step 3: Preprocess the original urban waterlogging image data and split the data into a training set, a verification set and a test set.

步骤3.1:对原始图像数据进行数据增强,包括随机缩放、随机翻转、高斯模糊、色域变换。随机缩放和随机翻转使图像上内涝的位置和范围发生变化,因此需要对缩放和翻转的原始图像对应的二值化图像进行相同的缩放和翻转操作。高斯模糊和色域变换并未改变原始图像中内涝的位置和范围信息,该部分原始图像对应的二值化图像不需要进行任何变换操作。增强后的图像与二值化图像一一对应。具体数据增强操作如下:Step 3.1: Perform data enhancement on the original image data, including random scaling, random flipping, Gaussian blur, and color gamut transformation. Random scaling and random flipping change the position and range of waterlogging on the image, so the same scaling and flipping operations need to be performed on the binary image corresponding to the scaled and flipped original image. Gaussian blur and color gamut transformation do not change the location and range information of waterlogging in the original image, and the binary image corresponding to this part of the original image does not require any transformation operation. The enhanced image has a one-to-one correspondence with the binary image. The specific data enhancement operations are as follows:

(1)在0到1内随机选取一个数,若其在0到0.5之间,则进行增强操作,否则不进行增强操作。(1) Randomly select a number between 0 and 1. If it is between 0 and 0.5, the enhancement operation will be performed. Otherwise, the enhancement operation will not be performed.

(2)在0到1内随机选取一个数，若其在0到0.25之间，则进行图像长和宽的缩放，缩放倍率在0.25到2之间随机选取，将其对应的二值化图像进行相同倍率的缩放操作。(2) Randomly select a number between 0 and 1. If it is between 0 and 0.25, scale the length and width of the image, with the scaling factor randomly selected between 0.25 and 2, and apply the same scaling to the corresponding binarized image.

(3)在0到1内随机选取一个数，若其在0.25到0.5之间，则进行图像旋转，旋转角度在0到360度间随机选取，将其对应的二值化图像进行相同角度的旋转操作。(3) Randomly select a number between 0 and 1. If it is between 0.25 and 0.5, rotate the image, with the rotation angle randomly selected between 0 and 360 degrees, and apply the same rotation to the corresponding binarized image.

(4)在0到1内随机选取一个数，若其在0.5到0.75之间，则进行图像高斯模糊，模糊核大小设为5×5，即将一个25个像素组成的正方形中心区域的像素值设置为周围24个像素值的平均值，其对应的二值化图像不进行操作。(4) Randomly select a number between 0 and 1. If it is between 0.5 and 0.75, apply Gaussian blur to the image with a 5×5 kernel, i.e., each pixel is replaced by a (Gaussian-weighted) average of the 25-pixel square window centered on it; the corresponding binarized image is left unchanged.

(5)在0到1内随机选取一个数，若其在0.75到1之间，则进行图像色域变换，将图像转换到HSV色彩空间，进行色相、饱和度、亮度的随机变换，其对应的二值化图像不进行操作。(5) Randomly select a number between 0 and 1. If it is between 0.75 and 1, perform a color gamut transformation: convert the image to the HSV color space and randomly perturb its hue, saturation and value; the corresponding binarized image is left unchanged.
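The five branches of step 3.1 amount to a probability dispatch. The sketch below is illustrative only: horizontal flip and naive 2× downsampling stand in for the arbitrary-angle rotation and 0.25–2 scaling, the photometric branches are placeholders (a faithful version needs an image library), and `augment` is a hypothetical helper name. The point it demonstrates is the one the text stresses: geometric branches transform image and mask together, photometric branches touch the image only.

```python
import random
import numpy as np

def augment(image, mask, rng=None):
    """Dispatch one augmentation branch per the probabilities in step 3.1."""
    rng = rng or random.Random()
    if rng.random() >= 0.5:                 # (1): skip augmentation entirely
        return image, mask
    b = rng.random()
    if b < 0.25:                            # (2): random scale -- geometric,
        image, mask = image[::2, ::2], mask[::2, ::2]   # crude 2x stand-in
    elif b < 0.5:                           # (3): rotation -- geometric;
        image, mask = image[:, ::-1], mask[:, ::-1]     # flip as stand-in
    elif b < 0.75:                          # (4): 5x5 Gaussian blur, image only
        pass                                # placeholder
    else:                                   # (5): HSV jitter, image only
        pass                                # placeholder
    return image, mask
```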

步骤3.2：将上述增强后的图像与其对应的二值化图像按照8.5:1:0.5的比例随机划分训练集、验证集和测试集。Step 3.2: Randomly divide the augmented images and their corresponding binarized images into a training set, a verification set and a test set in a ratio of 8.5:1:0.5.
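The 8.5 : 1 : 0.5 split of step 3.2 corresponds to 85% / 10% / 5% of the image–mask pairs. A minimal sketch; the fixed seed and the helper name `split_dataset` are assumptions for reproducibility, not part of the patent:

```python
import random

def split_dataset(pairs, seed=0):
    """Randomly split (image, mask) pairs 8.5 : 1 : 0.5 into train/verify/test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n = len(pairs)
    n_train = round(n * 0.85)               # 8.5 parts of 10
    n_val = round(n * 0.10)                 # 1 part of 10
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])        # remaining 0.5 part
```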

步骤4：基于DeeplabV3+图像分割算法对训练集的图像数据进行训练，验证集的图像数据对模型参数进行调试。具体按以下步骤实施：Step 4: Train on the training-set images using the DeepLabV3+ image segmentation algorithm, and use the verification-set images to tune the model parameters. Proceed as follows:

步骤4.1：搭建DeepLabV3+图像分割模型，整体模型由编码和解码两部分组成，其中编码部分由四次下采样的主干特征提取网络(DCNN)和由空洞空间卷积模块(ASPP)构成的加强特征提取网络构成；解码部分为主干特征提取网络提取的浅层特征和加强特征提取网络提取特征的拼接和上采样处理。主干特征提取网络的结构如图3所示。模型的结构如图4所示。Step 4.1: Build the DeepLabV3+ image segmentation model. The overall model consists of an encoding part and a decoding part: the encoding part comprises a backbone feature extraction network (DCNN) with four downsampling stages and an enhanced feature extraction network formed by an atrous spatial pyramid pooling (ASPP) module; the decoding part concatenates and upsamples the shallow features extracted by the backbone and the features extracted by the enhanced feature extraction network. The structure of the backbone feature extraction network is shown in Figure 3, and the structure of the model in Figure 4.

从GitHub网站上下载DeepLabV3+对于VOC数据集的权重文件作为训练积水图像的预训练权重。Download the weight file of DeepLabV3+ for the VOC data set from the GitHub website as the pre-training weight for training water images.

步骤4.2：调模型训练时所需的下采样倍数、批量大小(batch_size)、迭代次数(epoch)等参数，进行模型的训练，训练的步骤具体为：Step 4.2: Adjust the parameters required for training, such as the downsampling factor, batch size (batch_size) and number of iterations (epoch), and train the model. The training steps are as follows:

(1)将训练集的图像输入模型并进行编码处理,经过一系列的卷积运算,生成大小为原始图像1/16的特征张量。(1) Input the images of the training set into the model and perform encoding processing. After a series of convolution operations, a feature tensor with a size of 1/16 of the original image is generated.

(2)将这些特征张量传入ASPP结构中，该结构由一个1×1的卷积层、三个3×3的膨胀率分别为6、12和18的并行空洞卷积层以及一个全池化层对上述特征张量进行处理。将这些层进行通道拼接处理，之后将拼接处理后的图层进行1×1卷积处理。(2) Pass these feature tensors into the ASPP structure, which processes them with a 1×1 convolutional layer, three parallel 3×3 atrous convolutional layers with dilation rates of 6, 12 and 18, and a global pooling layer. The outputs of these layers are channel-concatenated, and the concatenated layer then undergoes a 1×1 convolution.

(3)提取出主干网络生成的大小为原图像1/4的卷积特征图层，将ASPP结果图层经过四倍上采样，形成大小相同的图层进行通道拼接，再将拼接后的图层进行3×3卷积处理和四倍上采样处理，形成与原始图像大小相同的城市内涝预测效果图。(3) Extract the convolutional feature layer, 1/4 the size of the original image, generated by the backbone network; upsample the ASPP result layer by a factor of four to obtain a layer of the same size and channel-concatenate the two; then apply a 3×3 convolution and another 4× upsampling to the concatenated layer to produce an urban waterlogging prediction map the same size as the original image.
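The spatial-size bookkeeping implied by sub-steps (1)–(3) can be checked mechanically: the ASPP branch works at 1/16 of the input resolution, the shallow branch at 1/4, and two 4× upsamplings restore the original size. A sketch under the assumption that input height and width are divisible by 16 (the function name is illustrative):

```python
def deeplabv3plus_sizes(h, w):
    """Trace the spatial sizes along the encoder/decoder path described above."""
    deep = (h // 16, w // 16)          # backbone output fed to ASPP
    low = (h // 4, w // 4)             # shallow feature from the backbone
    aspp = deep                        # ASPP preserves spatial size
    up1 = (aspp[0] * 4, aspp[1] * 4)   # first 4x upsample
    assert up1 == low                  # channel concat needs matching sizes
    out = (up1[0] * 4, up1[1] * 4)     # 3x3 conv (size-preserving) + second 4x
    return {"deep": deep, "low": low, "up1": up1, "out": out}
```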

步骤5:将测试集输入到步骤4训练完成的模型中,使模型逐帧对图像中的积水区域进行识别。构建积水像素比参数,计算识别出的积水像素数量占整个图像像素数量的比例用来表征监控区域内积水范围的动态变化情况,若该比例大于一定阈值,则进行预警。具体按以下步骤实施:Step 5: Input the test set into the model trained in step 4, so that the model can identify the water areas in the image frame by frame. Construct the water pixel ratio parameter and calculate the ratio of the number of identified water pixels to the number of pixels in the entire image to represent the dynamic changes in the water range in the monitored area. If the ratio is greater than a certain threshold, an early warning will be issued. Specifically follow the following steps to implement:

步骤5.1:基于步骤4.2训练完成后生成的最佳模型权值文件对测试集进行测试,对识别后的积涝区域与原始图像进行叠加,并呈现。图5是不同阶段视频内涝的效果图。Step 5.1: Test the test set based on the best model weight file generated after the training in step 4.2, superimpose the identified waterlogged area and the original image, and present it. Figure 5 is a rendering of video waterlogging at different stages.

步骤5.2：计算步骤5.1中识别出的图像中积水像素PixelsFlooded占总体图像像素PixelsTotal的比例，构建积水像素比参数(Water Pixel Ratio,WPR)，其表达式为：Step 5.2: Calculate the proportion of the ponding pixels Pixels_Flooded identified in step 5.1 to the total number of image pixels Pixels_Total, and construct the water pixel ratio parameter (Water Pixel Ratio, WPR), expressed as: WPR = Pixels_Flooded / Pixels_Total × 100%.

该参数可以理解为静止的监控设施拍摄到的可见区域中积水的范围，它的值在0%到100%之间变化。当积水区域的视野被物体或人遮挡的情况下，连续时刻记录的WPR的值会产生明显的波动，但是这种波动不会影响WPR的变化趋势，即当长时间序列连续对WPR进行计算时，可以表征城市内涝积水范围的变化趋势。当该参数大于一定阈值时，则说明内涝较为严重，应当进行预警。This parameter can be understood as the extent of ponding in the visible area captured by a stationary monitoring camera; its value varies between 0% and 100%. When the view of the ponding area is blocked by objects or people, the WPR values recorded at consecutive moments fluctuate noticeably, but such fluctuations do not affect the overall trend: when WPR is computed continuously over a long time series, it characterizes the trend of the urban ponding extent. When the parameter exceeds a certain threshold, the waterlogging is considered serious and an early warning should be issued.
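The water pixel ratio of step 5.2 and the threshold test of step 5 reduce to a few lines. In this sketch the 30% threshold and the 3-frame moving average are illustrative assumptions (the patent only speaks of "a certain threshold"); averaging is one possible way to damp the occlusion-induced fluctuations described above, and both function names are hypothetical.

```python
import numpy as np

def water_pixel_ratio(pred_mask):
    """WPR = ponding pixels / total pixels, in percent (0-100)."""
    return 100.0 * float(np.count_nonzero(pred_mask)) / pred_mask.size

def should_warn(wpr_series, threshold=30.0, window=3):
    """True once the moving average of WPR over `window` frames exceeds threshold."""
    s = list(wpr_series)
    for i in range(len(s)):
        lo = max(0, i - window + 1)
        if sum(s[lo:i + 1]) / (i + 1 - lo) > threshold:
            return True
    return False
```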

本发明以一段积水视频为例，对视频中的积涝区域逐帧进行了识别，并且通过等时间间隔取帧法选取了视频263帧的图像，绘制出了真实积水像素比变化曲线与实际模型预测出的积水像素比变化曲线进行拟合，验证模型的可行性，如图6所示。Taking a ponding video as an example, the present invention identifies the waterlogged area frame by frame, selects 263 frames of the video at equal time intervals, and fits the curve of the true water pixel ratio against the curve predicted by the model to verify the feasibility of the model, as shown in Figure 6.

基于同一发明构思，本申请实施例提供一种计算机设备，包括存储器、处理器，以及存储在存储器中并可在处理器上运行的计算机程序，处理器执行计算机程序时实现前述的基于深度学习的城市内涝视频识别预警方法的步骤。Based on the same inventive concept, an embodiment of the present application provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the steps of the aforementioned deep learning-based urban waterlogging video recognition and early warning method are implemented.

基于同一发明构思，本申请实施例提供一种计算机可读存储介质，该计算机可读存储介质存储有计算机程序，该计算机程序被处理器执行时实现前述的基于深度学习的城市内涝视频识别预警方法的步骤。Based on the same inventive concept, an embodiment of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the aforementioned deep learning-based urban waterlogging video recognition and early warning method are implemented.

本领域内的技术人员应明白,本发明的实施例可提供为方法、系统、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art will appreciate that embodiments of the present invention may be provided as methods, systems, or computer program products. Thus, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

本发明是参照根据本发明实施例的方法、设备（系统）、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block in the flowchart illustrations and/or block diagrams, and combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.

The above embodiments merely illustrate the technical idea of the present invention and do not limit its scope of protection; any modification made to the technical solutions in accordance with the technical idea proposed by the present invention falls within the protection scope of the present invention.

Claims (7)

1. An urban waterlogging video recognition and early warning method based on deep learning, characterized by comprising the following steps:
step 1, acquiring an urban waterlogging ponding image dataset, marking the ponding area in each original ponding image in the dataset, and storing the marked ponding images in json format;
step 2, binarizing the ponding images marked in json format to obtain binarized images in one-to-one correspondence with the original ponding images;
step 3, performing data enhancement on the original ponding images in the urban waterlogging ponding image dataset obtained in step 1 to obtain an enhanced dataset, and dividing the enhanced dataset and the corresponding binarized images into a training set, a validation set and a test set;
step 4, constructing a DeepLabV3+ image segmentation model, and training and validating it with the training set and the validation set to obtain a trained DeepLabV3+ image segmentation model;
step 5, identifying the ponding area in each test-set image with the trained DeepLabV3+ image segmentation model, calculating the proportion of ponding-area pixels to the pixels of the whole image, i.e. the ponding pixel ratio, characterizing the dynamic change of the ponding area by the ponding pixel ratio, and issuing an early warning when the ponding pixel ratio exceeds a preset threshold.
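The ponding pixel ratio of step 5 reduces to simple pixel counting. A minimal sketch follows; the mask values and the 0.3 warning threshold are illustrative assumptions, since the patent leaves the preset threshold unspecified:

```python
# Ponding pixel ratio (claim 1, step 5): ponding pixels over total pixels.
# The mask below and the 0.3 threshold are illustrative assumptions.

def ponding_pixel_ratio(mask):
    """Fraction of pixels labelled 1 (ponding) in a binary mask."""
    total = sum(len(row) for row in mask)
    ponding = sum(sum(row) for row in mask)
    return ponding / total

def should_warn(mask, threshold=0.3):
    """Early warning fires when the ponding ratio exceeds the threshold."""
    return ponding_pixel_ratio(mask) > threshold

mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
ratio = ponding_pixel_ratio(mask)  # 6 / 16 = 0.375, above the 0.3 threshold
```

Tracking this ratio frame by frame over the video feed gives the dynamic-change signal the claim describes.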
2. The deep learning-based urban waterlogging video recognition and early warning method according to claim 1, wherein in step 1 the ponding area in each original ponding image is outlined with an irregular polygon by visual interpretation, obtaining the marked ponding image.
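The polygon annotations stored in json (claims 1 and 2) can be pictured as a labelme-style record; the field names, file name and coordinates below are illustrative assumptions, not taken from the patent:

```python
# Hypothetical labelme-style json record for one marked ponding image.
import json

annotation = {
    "imagePath": "ponding_0001.jpg",   # hypothetical file name
    "imageHeight": 512,
    "imageWidth": 512,
    "shapes": [
        {
            "label": "ponding",
            "shape_type": "polygon",
            # irregular-polygon vertices outlining the ponding area, as (x, y)
            "points": [[120, 300], [260, 280], [330, 360], [180, 400]],
        }
    ],
}

stored = json.dumps(annotation)    # the string written to disk
restored = json.loads(stored)      # round-trips without loss
```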
3. The deep learning-based urban waterlogging video recognition and early warning method according to claim 1, wherein in step 2 the pixel values of the ponding area in the marked ponding image are set to 1 and the pixel values of the remaining part are set to 0, obtaining the binarized image.
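The binarization of claim 3 assigns 1 to ponding pixels and 0 elsewhere. A minimal sketch, assuming the annotation has already been rasterized to a list of (row, col) ponding pixels; a real pipeline would fill the labelled json polygon instead (e.g. with cv2.fillPoly):

```python
# Binarization (claim 3): ponding pixels -> 1, everything else -> 0.

def binarize(height, width, ponding_pixels):
    """Build a height x width mask with 1 inside the ponding area."""
    mask = [[0] * width for _ in range(height)]
    for row, col in ponding_pixels:
        mask[row][col] = 1
    return mask

mask = binarize(3, 3, [(0, 0), (1, 1)])  # two ponding pixels, seven background
```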
4. The deep learning-based urban waterlogging video recognition and early warning method according to claim 1, wherein in step 3 the data enhancement is performed as follows:
1) randomly selecting a number a from 0 to 1; if a is between 0 and 0.5, performing the data enhancement operations, otherwise performing none of them;
2) randomly selecting a number b from 0 to 1; if b is between 0 and 0.25, randomly scaling the length and width of the original ponding image, with the scaling factor randomly selected from 0.25 to 2, and applying the same scaling to the binarized image corresponding to the original ponding image;
3) randomly selecting a number c from 0 to 1; if c is between 0.25 and 0.5, randomly flipping the original ponding image, with the flip angle randomly selected from 0 to 360 degrees, and applying the same flip to the binarized image corresponding to the original ponding image;
4) randomly selecting a number d from 0 to 1; if d is between 0.5 and 0.75, applying Gaussian blur to the original ponding image, with the blur kernel size set to 5×5;
5) randomly selecting a number e from 0 to 1; if e is between 0.75 and 1, performing color gamut conversion on the original ponding image: converting it into the HSV color space and randomly transforming its hue, saturation and value.
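The five random draws above can be separated from the image transforms themselves. The sketch below maps the draws a–e to the list of transforms to apply; the transform labels are illustrative names, not the patent's wording:

```python
# Branch selection for the claim-4 data enhancement: draw `a` gates the whole
# augmentation, and each of the four transforms is gated by its own draw.

def augmentation_plan(a, b, c, d, e):
    """Map five uniform(0, 1) draws to the list of transforms to apply."""
    if not (0 <= a < 0.5):          # step 1): skip augmentation half the time
        return []
    ops = []
    if 0 <= b < 0.25:               # step 2): scale length/width by a random
        ops.append("scale")         #   factor in [0.25, 2], image and mask alike
    if 0.25 <= c < 0.5:             # step 3): flip by a random angle in
        ops.append("flip")          #   [0, 360] degrees, image and mask alike
    if 0.5 <= d < 0.75:             # step 4): Gaussian blur with a 5x5 kernel
        ops.append("gaussian_blur")
    if 0.75 <= e < 1:               # step 5): random hue/saturation/value
        ops.append("hsv_jitter")    #   shifts in HSV color space
    return ops

plan = augmentation_plan(0.1, 0.1, 0.3, 0.6, 0.9)
# -> ["scale", "flip", "gaussian_blur", "hsv_jitter"]
```

Note that the geometric transforms (scale, flip) must also be applied to the binarized mask, while the photometric ones (blur, HSV jitter) touch only the original image.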
5. The deep learning-based urban waterlogging video recognition and early warning method according to claim 1, wherein in step 4 the DeepLabV3+ image segmentation model comprises an encoding part and a decoding part; the encoding part comprises a backbone feature extraction network performing four downsamplings and an enhanced feature extraction network formed by an atrous spatial convolution module and a 1×1 convolution layer; the backbone feature extraction network comprises first to fourth downsampling modules and a 1×1 convolution layer connected in sequence, wherein the first downsampling module comprises a 3×3 convolution layer and two 3×3 bottleneck structures connected in sequence, the second downsampling module comprises two 3×3 bottleneck structures connected in sequence, the third downsampling module comprises three 3×3 bottleneck structures connected in sequence, and the fourth downsampling module comprises seven 3×3 bottleneck structures connected in sequence; the atrous spatial convolution module comprises five parallel layers, namely a 1×1 convolution layer, three parallel 3×3 atrous convolution layers with dilation rates of 6, 12 and 18 respectively, and a global pooling layer; the decoding part comprises a 1×1 convolution layer, a 3×3 convolution layer, and first and second fourfold upsampling modules;
the enhanced ponding images and the corresponding binarized images are taken as input to the DeepLabV3+ image segmentation model, and the backbone feature extraction network generates a feature tensor at 1/16 of the original image size and a convolutional feature layer at 1/4 of the original image size; the feature tensor is input into the atrous spatial convolution module, where the five parallel layers process it in parallel and the results are concatenated; a 1×1 convolution is applied to the concatenated layer to obtain the output of the enhanced feature extraction network, which the first fourfold upsampling module of the decoding part upsamples by a factor of four; the convolutional feature layer generated by the backbone feature extraction network is processed by a 1×1 convolution and channel-concatenated with the layer output by the first fourfold upsampling module, and the concatenation result passes in turn through a 3×3 convolution and the second fourfold upsampling module to obtain the output of the DeepLabV3+ image segmentation model.
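The spatial bookkeeping of claim 5 can be checked with simple arithmetic: four stride-2 downsamplings give the 1/16-scale tensor fed to the atrous spatial convolution module, the second downsampling gives the 1/4-scale skip layer, and two fourfold upsamplings restore the input size. A sketch assuming a 512×512 input (an illustrative size, not from the claim):

```python
# Spatial bookkeeping for the claim-5 encoder/decoder. Pure shape
# arithmetic, not a runnable network; the 512x512 input is an assumption.

def encoder_sizes(height, width, n_downsamples=4):
    """Spatial size after each stride-2 downsampling module."""
    sizes = []
    for _ in range(n_downsamples):
        height, width = height // 2, width // 2
        sizes.append((height, width))
    return sizes

sizes = encoder_sizes(512, 512)           # [(256, 256), ..., (32, 32)]
deep = sizes[-1]                          # 1/16 scale: input to the atrous module
skip = sizes[1]                           # 1/4 scale: decoder skip connection
out = (deep[0] * 4 * 4, deep[1] * 4 * 4)  # two 4x upsamplings -> (512, 512)
```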
6. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the deep learning-based urban waterlogging video recognition and early warning method of any one of claims 1 to 5.
7. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the deep learning-based urban waterlogging video recognition and early warning method of any one of claims 1 to 5.
CN202311356080.XA 2023-10-19 2023-10-19 A video recognition and early warning method for urban waterlogging based on deep learning Pending CN117456356A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311356080.XA CN117456356A (en) 2023-10-19 2023-10-19 A video recognition and early warning method for urban waterlogging based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311356080.XA CN117456356A (en) 2023-10-19 2023-10-19 A video recognition and early warning method for urban waterlogging based on deep learning

Publications (1)

Publication Number Publication Date
CN117456356A true CN117456356A (en) 2024-01-26

Family

ID=89593953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311356080.XA Pending CN117456356A (en) 2023-10-19 2023-10-19 A video recognition and early warning method for urban waterlogging based on deep learning

Country Status (1)

Country Link
CN (1) CN117456356A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117746342A * 2024-02-19 2024-03-22 广州市突发事件预警信息发布中心(广州市气象探测数据中心) Method for identifying road ponding by utilizing public video
CN117746342B * 2024-02-19 2024-05-17 广州市突发事件预警信息发布中心(广州市气象探测数据中心) Method for identifying road ponding by utilizing public video
CN118470659A * 2024-07-15 2024-08-09 南昌航空大学 Waterlogging detection method and device based on denoising diffusion model from the perspective of urban monitoring

Similar Documents

Publication Publication Date Title
CN114841972B (en) Transmission line defect recognition method based on saliency map and semantic embedding feature pyramid
CN108564097B (en) Multi-scale target detection method based on deep convolutional neural network
CN111681273B (en) Image segmentation method and device, electronic equipment and readable storage medium
CN117456356A (en) A video recognition and early warning method for urban waterlogging based on deep learning
CN112084923B (en) Remote sensing image semantic segmentation method, storage medium and computing device
CN113139543B (en) Training method of target object detection model, target object detection method and equipment
CN111524135A (en) Image enhancement-based method and system for detecting defects of small hardware fittings of power transmission line
CN110070091B (en) Semantic segmentation method and system based on dynamic interpolation reconstruction and used for street view understanding
CN113642585B (en) Image processing method, apparatus, device, storage medium, and computer program product
CN110751154B (en) Complex environment multi-shape text detection method based on pixel-level segmentation
CN116645369B (en) Anomaly detection method based on twin self-encoder and two-way information depth supervision
CN112989995B (en) Text detection method and device and electronic equipment
CN111310508A (en) A two-dimensional code identification method
Chen et al. RBPNET: An asymptotic Residual Back-Projection Network for super-resolution of very low-resolution face image
CN113762265A (en) Pneumonia classification and segmentation method and system
Xian et al. Fast generation of high-fidelity RGB-D images by deep learning with adaptive convolution
CN113538615B (en) Remote sensing image coloring method based on double-flow generator depth convolution countermeasure generation network
US20250037255A1 (en) Method for training defective-spot detection model, method for detecting defective-spot, and method for restoring defective-spot
CN114155560B (en) Light weight method of high-resolution human body posture estimation model based on space dimension reduction
CN115131244A (en) A method and system for removing rain from a single image based on adversarial learning
CN113012132A (en) Image similarity determining method and device, computing equipment and storage medium
Eken et al. Vectorization and spatial query architecture on island satellite images
CN118736240B (en) Building vector outline extraction method, device, equipment and storage medium
CN119251497A (en) A remote sensing image semantic segmentation method, system, device and medium
CN118470007A (en) Insulation clothing defect detection method and system based on RCSCA-YOLOv8

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination