CN117456356A - Urban waterlogging video recognition early warning method based on deep learning - Google Patents
- Publication number
- CN117456356A (application CN202311356080.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- ponding
- convolution
- original
- urban waterlogging
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
- G06V20/176—Urban or other man-made structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Abstract
The invention discloses a deep learning-based urban waterlogging video identification and early warning method, which comprises the following steps: collecting urban waterlogging picture data and generating label data in json format; processing the json-format data into a binarized image format; preprocessing the ponding image data by rotation, scaling, color gamut conversion, Gaussian blur and the like; inputting the preprocessed image data and the corresponding label data into a DeepLabV3+ deep learning model for training to obtain an optimal model training weight file; identifying the input urban waterlogging video data frame by frame with the model, and calculating, frame by frame, the proportion of ponding pixels to the pixels of the whole image in the identified video data, so as to characterize the long-time-sequence dynamic range change of the urban waterlogging. By taking the monitoring facilities widely distributed in a city as the monitoring media of urban waterlogging, the invention can identify ponding conditions in real time and reflect the dynamic change of urban waterlogging.
Description
Technical Field
The invention relates to a deep learning-based urban waterlogging video identification and early warning method, and belongs to the technical field of urban waterlogging monitoring and early warning.
Background
At present, the traditional manual monitoring method has low efficiency and high risk; the automatic station monitoring method has high cost and difficult maintenance, automatic stations are sparsely distributed, and overall coverage is difficult; the remote sensing monitoring method is constrained by the satellite revisit period and cannot monitor waterlogging in real time, optical satellites cannot easily penetrate cloud layers, and SAR satellites cannot effectively extract waterlogging areas in complex urban environments. Therefore, a method for monitoring urban waterlogging efficiently and safely has important significance for reducing the risk caused by waterlogging and reducing the loss of life, property and economic value that it causes.
Disclosure of Invention
The technical problems to be solved by the invention are as follows: the urban waterlogging video identification early warning method based on deep learning uses an image segmentation model to predict the ponding area and issues early warnings according to the dynamic change of the ponding area, thereby solving the problems that existing waterlogging monitoring is costly, monitoring sites are few, and the overall situation is difficult to cover.
The invention adopts the following technical scheme for solving the technical problems:
a city waterlogging video recognition early warning method based on deep learning comprises the following steps:
step 1, acquiring a city waterlogged ponding image dataset, marking ponding areas in each original ponding image in the dataset, and storing the marked ponding images in json format;
step 2, binarizing the water accumulation image marked in json format to obtain a binarized image corresponding to the original water accumulation image one by one;
step 3, carrying out data enhancement on the original ponding image in the urban waterlogging ponding image data set obtained in the step 1 to obtain an enhanced data set, and dividing the enhanced data set and the corresponding binarized image into a training set, a verification set and a test set;
step 4, constructing a DeepLabV3+ image segmentation model, and training and verifying the DeepLabV3+ image segmentation model by using the training set and the verification set to obtain a trained DeepLabV3+ image segmentation model;
and step 5, identifying the ponding area of each image in the test set by using the trained DeepLabV3+ image segmentation model, calculating the proportion of the number of pixels in the ponding area to the number of pixels in the whole image, namely the ponding pixel ratio, representing the dynamic change of the ponding area according to the ponding pixel ratio, and issuing an early warning when the ponding pixel ratio exceeds a preset threshold value.
In the step 1, the method of visual interpretation is used for selecting the ponding region in the original ponding image by using an irregular polygon to obtain the marked ponding image.
In the step 2, the pixel value of the ponding area in the marked ponding image is assigned 1, and the pixel value of the rest part is assigned 0, so as to obtain a binarized image.
As a preferred embodiment of the present invention, in the step 3, the specific operation of data enhancement is as follows:
1) Randomly selecting a number a from 0 to 1, if a is between 0 and 0.5, performing data enhancement operation, otherwise, not performing data enhancement operation;
2) Randomly selecting a number b from 0 to 1, randomly scaling the length and the width of the original ponding image if b is between 0 and 0.25, randomly selecting the scaling factor from 0.25 to 2, and performing scaling operation of the same scaling factor on the binarized image corresponding to the original ponding image;
3) Randomly selecting a number c from 0 to 1; if c is between 0.25 and 0.5, randomly rotating the original ponding image, with the rotation angle randomly selected from 0 to 360 degrees, and rotating the binarized image corresponding to the original ponding image by the same angle;
4) Randomly selecting a number d from 0 to 1, and if d is between 0.5 and 0.75, carrying out Gaussian blur on an original ponding image, wherein the size of a blur kernel is set to be 5 multiplied by 5;
5) And randomly selecting a number e from 0 to 1, if the number e is between 0.75 and 1, performing color gamut conversion on the original ponding image, converting the original ponding image into an HSV color space, and performing random conversion on hue, saturation and brightness.
As a preferred embodiment of the present invention, in the step 4, the DeepLabV3+ image segmentation model comprises an encoding part and a decoding part; the encoding part comprises a trunk feature extraction network performing four downsampling operations and an enhanced feature extraction network formed by an atrous spatial pyramid pooling (ASPP) module and a 1×1 convolution layer; the trunk feature extraction network comprises first to fourth downsampling modules and a 1×1 convolution layer which are sequentially connected, wherein the first downsampling module comprises a 3×3 convolution layer and two 3×3 bottleneck structures which are sequentially connected, the second downsampling module comprises two sequentially connected 3×3 bottleneck structures, the third downsampling module comprises three sequentially connected 3×3 bottleneck structures, and the fourth downsampling module comprises seven sequentially connected 3×3 bottleneck structures; the ASPP module comprises five parallel layers, namely a 1×1 convolution layer, three parallel 3×3 atrous convolution layers with dilation rates of 6, 12 and 18 respectively, and a global pooling layer; the decoding part comprises a 1×1 convolution layer, a 3×3 convolution layer, and first and second quadruple upsampling modules;
taking the enhanced ponding image and the corresponding binarized image as input to the DeepLabV3+ image segmentation model, the trunk feature extraction network generates a feature tensor 1/16 the size of the original image and a convolution feature layer 1/4 the size of the original image; the feature tensor is fed into the ASPP module, where the five parallel layers process it in parallel before their outputs are concatenated, and a 1×1 convolution is applied to the concatenated layer to obtain the output of the enhanced feature extraction network, which the first quadruple upsampling module of the decoding part then upsamples by a factor of four; the convolution feature layer generated by the trunk feature extraction network is processed by a 1×1 convolution and channel-concatenated with the layer output by the first quadruple upsampling module, and the concatenated result passes through a 3×3 convolution and the second quadruple upsampling module in turn to obtain the output of the DeepLabV3+ image segmentation model.
A computer device comprising a memory, a processor, and a computer program stored in the memory and capable of running on the processor, the processor implementing the steps of the deep learning based urban waterlogging video recognition early warning method as described above when the computer program is executed.
A computer readable storage medium storing a computer program which when executed by a processor implements the steps of a deep learning based urban waterlogging video recognition early warning method as described above.
Compared with the prior art, the technical scheme provided by the invention has the following technical effects:
1. The method provided by the invention can monitor waterlogging without relying on traditional monitoring stations: the monitoring facilities widely distributed in cities serve as the media for urban waterlogging monitoring and can provide massive training data for the model; the model can be continuously updated and optimized to improve waterlogging recognition accuracy; and by calculating the ponding pixel ratio, the dynamic range change of waterlogging in a long-time-sequence video can be characterized and used as a reference for issuing urban waterlogging early warning information.
2. The invention breaks through the limitations of few distribution and high cost of the traditional ponding monitoring stations, can identify ponding conditions in real time by taking widely distributed monitoring facilities in cities as monitoring media of urban waterlogging, can reflect dynamic changes of the urban waterlogging, and provides technical support for monitoring and early warning the waterlogging more efficiently, safely and timely.
Drawings
FIG. 1 is a flow chart of an urban waterlogging video recognition early warning method based on deep learning;
FIG. 2 is an example of ponding image binarization, wherein (a) is the original image and (b) is the binarized image;
FIG. 3 is a block diagram of a backbone feature extraction network in an image segmentation model;
FIG. 4 is a block diagram of an image segmentation model;
FIG. 5 is an effect graph of waterlogging at different stages of a video, identified using the image segmentation model;
fig. 6 is a graph of the change in the ponding pixel ratio in a waterlogged video.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings. The embodiments described below by referring to the drawings are exemplary only for explaining the present invention and are not to be construed as limiting the present invention.
As shown in fig. 1, a flowchart of a method for identifying and pre-warning urban waterlogging video based on deep learning provided by the invention comprises the following steps:
step 1: and (5) manufacturing an urban waterlogging ponding image data set and generating a json format file.
Picture data capable of reflecting urban waterlogging conditions in complex scenes is collected, and polygons are drawn with an image marking tool. The origin (0, 0) is at the lower-left corner of the image; the vertex coordinates of a drawn polygon are {(x1, y1), (x2, y2), ..., (xk, yk)}, where k is the number of vertices of the polygon. The area enclosed by the lines connecting the k vertices is a ponding area, and the vertex coordinate information of each polygon is saved in a json file.
Step 2: converting the json file into a binarized image; the image pixels of the ponding area of the polygonal surrounding part in the json file are endowed with a value of 1, the image pixels of the rest part are endowed with a value of 0, and each binarized image corresponds to one original image. The expected binarization of the water is shown in fig. 2 (a) and (b).
Step 3: preprocessing the original urban waterlogging image data, and splitting the data into a training set, a verification set and a test set.
Step 3.1: the original image data is subjected to data enhancement, including random scaling, random flipping, gaussian blur and color gamut transformation. Random scaling and random flipping change the position and range of the waterlogging on the image, so that the same scaling and flipping operation needs to be performed on the binary image corresponding to the scaled and flipped original image. The Gaussian blur and the color gamut transformation do not change the position and range information of the waterlogging in the original image, and the binarized image corresponding to the part of the original image does not need any transformation operation. The enhanced image corresponds to the binarized image one by one. The specific data enhancement operation is as follows:
(1) And randomly selecting a number from 0 to 1, if the number is between 0 and 0.5, performing enhancement operation, otherwise, not performing enhancement operation.
(2) And randomly selecting one number from 0 to 1, if the number is between 0 and 0.25, scaling the length and the width of the image, randomly selecting the scaling factor from 0.25 to 2, and performing scaling operation of the same scaling factor on the corresponding binary image.
(3) And randomly selecting one number from 0 to 1, if the number is between 0.25 and 0.5, rotating the image, randomly selecting the rotation angle from 0 to 360 degrees, and rotating the corresponding binary image by the same angle.
(4) And randomly selecting one number from 0 to 1; if the number is between 0.5 and 0.75, Gaussian blur is applied to the image with a 5 × 5 blur kernel, that is, each pixel value is replaced with a Gaussian-weighted average of the 5 × 5 neighborhood centered on it, and no operation is performed on the corresponding binarized image.
(5) And randomly selecting one number from 0 to 1, if the number is between 0.75 and 1, performing image color gamut conversion, converting the image into HSV color space, performing random conversion of hue, saturation and brightness, and performing no operation on the corresponding binary image.
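The branching scheme of operations (1)-(5) can be sketched as below. This is a hedged simplification: only the random gates and the image/mask pairing follow the text; the rotation is rounded to a multiple of 90 degrees, the blur is a crude two-tap average standing in for a 5×5 Gaussian, and the HSV jitter is replaced by a brightness shift, all assumptions made to keep the sketch NumPy-only.

```python
import numpy as np

def scale_nn(a, fy, fx):
    """Nearest-neighbour rescale of a 2-D array (stand-in for real resampling)."""
    h, w = a.shape
    nh, nw = max(1, int(round(h * fy))), max(1, int(round(w * fx)))
    ys = np.arange(nh) * h // nh
    xs = np.arange(nw) * w // nw
    return a[np.ix_(ys, xs)]

def augment(img, mask, rng):
    """One draw of the augmentation scheme; geometric ops hit image AND mask,
    photometric ops hit the image only."""
    if rng.random() >= 0.5:                 # (1) gate: enhance only half the time
        return img, mask
    b, c, d, e = rng.random(4)
    if b < 0.25:                            # (2) random scale, same factor for mask
        fy, fx = rng.uniform(0.25, 2.0, 2)
        img, mask = scale_nn(img, fy, fx), scale_nn(mask, fy, fx)
    if 0.25 <= c < 0.5:                     # (3) random rotation, same angle for mask
        k = int(rng.uniform(0, 360)) // 90  # simplified to 90-degree steps
        img, mask = np.rot90(img, k), np.rot90(mask, k)
    if 0.5 <= d < 0.75:                     # (4) blur: image only, mask untouched
        img = img // 2 + np.roll(img, 1, axis=0) // 2  # crude blur stand-in
    if e >= 0.75:                           # (5) photometric jitter: image only
        img = np.clip(img.astype(int) + rng.integers(-30, 31), 0, 255).astype(np.uint8)
    return img, mask
```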
Step 3.2: and randomly dividing the enhanced image and the corresponding binarized image into a training set, a verification set and a test set according to the proportion of 8.5:1:0.5.
Step 4: training the image data of the training set based on the deep V < 3+ > image segmentation algorithm, and debugging the model parameters by the image data of the verification set. The method is implemented by the following steps:
step 4.1: constructing a deep LabV3+ image segmentation model, wherein the whole model consists of an encoding part and a decoding part, and the encoding part consists of a trunk feature extraction network (DCNN) for four times of downsampling and a reinforcement feature extraction network consisting of a cavity space convolution module (ASPP); the decoding part is used for splicing and upsampling the shallow layer features extracted by the trunk feature extraction network and the reinforced feature extraction network extraction features. The structure of the backbone feature extraction network is shown in fig. 3. The structure of the model is shown in fig. 4.
The weight file of DeepLabV3+ trained on the VOC dataset is downloaded from GitHub and used as the pre-training weights for training on the ponding images.
Step 4.2: parameters such as downsampling multiple, batch size (batch_size), iteration number (epoch) and the like required during model training are adjusted to train the model, and the training steps are as follows:
(1) The images of the training set are input into a model and are subjected to coding processing, and a series of convolution operations are carried out to generate a characteristic tensor with the size of 1/16 of the original image.
(2) These feature tensors are passed into the ASPP structure and processed in parallel by a 1×1 convolution layer, three 3×3 parallel atrous convolution layers with dilation rates of 6, 12 and 18 respectively, and a global pooling layer. The resulting layers are channel-concatenated, and a 1×1 convolution is then applied to the concatenated layer.
(3) The convolution feature layer 1/4 the size of the original image generated by the backbone network is extracted, the ASPP result layer is upsampled by a factor of four to the same size, and the two are channel-concatenated; the concatenated layer then undergoes a 3×3 convolution and another quadruple upsampling to form an urban waterlogging prediction effect map the same size as the original image.
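The spatial bookkeeping of steps (1)-(3) can be checked with a NumPy sketch. Convolutions are omitted and upsampling is nearest-neighbour via `repeat`; the channel counts (256 ASPP channels, 48 low-level channels) are placeholder assumptions, not values from the text — only the 1/16 → ×4 → 1/4 → concatenate → ×4 → full-size arithmetic is demonstrated.

```python
import numpy as np

def upsample4(x):
    """Quadruple nearest-neighbour upsampling of a (C, H, W) tensor."""
    return x.repeat(4, axis=1).repeat(4, axis=2)

def decoder_shapes(h, w, aspp_channels=256, lowlevel_channels=48):
    """Trace spatial sizes through the DeepLabV3+ decoder of steps (1)-(3)."""
    deep = np.zeros((aspp_channels, h // 16, w // 16))   # ASPP output, 1/16 size
    low = np.zeros((lowlevel_channels, h // 4, w // 4))  # backbone layer, 1/4 size
    up1 = upsample4(deep)                                # first x4: now 1/4 size
    fused = np.concatenate([up1, low], axis=0)           # channel concatenation
    out = upsample4(fused)                               # second x4: full size
    return out.shape
```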
Step 5: inputting the test set into the model trained in the step 4, and enabling the model to identify the ponding area in the image frame by frame. And constructing a ponding pixel ratio parameter, calculating the proportion of the number of the identified ponding pixels to the number of the whole image pixels, and representing the dynamic change condition of the ponding range in the monitoring area, and if the proportion is larger than a certain threshold value, carrying out early warning. The method is implemented by the following steps:
Step 5.1: Testing the test set based on the optimal model weight file generated by the training in step 4.2, and overlaying the identified waterlogging area on the original image. Fig. 5 is an effect diagram of video waterlogging at different stages.
Step 5.2: calculating the ponding Pixels Pixels in the image identified in step 5.1 Flooded Occupy the overall image pixel Pixels Total Is used for constructing a ponding pixel ratio parameter (Water Pixel Ratio, WPR) with the expression:
the parameter is understood to be the range of water accumulation in the visible region captured by a stationary monitoring facility, the value of which varies between 0% and 100%. When the view of the water accumulation area is blocked by an object or a person, the value of the WPR recorded at continuous moments can obviously fluctuate, but the fluctuation does not influence the change trend of the WPR, namely, when the WPR is continuously calculated in a long-time sequence, the change trend of the urban waterlogging water accumulation range can be represented. When the parameter is larger than a certain threshold, the waterlogging is severe, and early warning is needed.
Taking a section of ponding video as an example, the invention identifies the ponding area in the video frame by frame, selects 263 frames from the video by sampling at fixed time intervals, and plots the ground-truth ponding pixel ratio curve against the ponding pixel ratio curve predicted by the model; the fit between the two verifies the feasibility of the model, as shown in fig. 6.
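The fixed-interval frame selection can be sketched as pure index arithmetic (actual frame decoding would use e.g. OpenCV's `cv2.VideoCapture`, which is outside this sketch; the specific step rule is an assumption):

```python
def interval_frame_indices(total_frames, n_frames):
    """Pick n_frames frame indices at a fixed interval from a video of
    total_frames, mirroring the time-interval frame-taking used for fig. 6."""
    if n_frames <= 0 or total_frames <= 0:
        return []
    step = max(1, total_frames // n_frames)
    return list(range(0, total_frames, step))[:n_frames]
```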
Based on the same inventive concept, the embodiment of the application provides a computer device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the urban waterlogging video identification early warning method based on deep learning when executing the computer program.
Based on the same inventive concept, the embodiments of the present application provide a computer readable storage medium storing a computer program, which when executed by a processor, implements the steps of the foregoing deep learning-based urban waterlogging video recognition early warning method.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereto, and any modification made on the basis of the technical scheme according to the technical idea of the present invention falls within the protection scope of the present invention.
Claims (7)
1. A city waterlogging video recognition early warning method based on deep learning is characterized by comprising the following steps:
step 1, acquiring a city waterlogged ponding image dataset, marking ponding areas in each original ponding image in the dataset, and storing the marked ponding images in json format;
step 2, binarizing the water accumulation image marked in json format to obtain a binarized image corresponding to the original water accumulation image one by one;
step 3, carrying out data enhancement on the original ponding image in the urban waterlogging ponding image data set obtained in the step 1 to obtain an enhanced data set, and dividing the enhanced data set and the corresponding binarized image into a training set, a verification set and a test set;
step 4, constructing a DeepLabV3+ image segmentation model, and training and verifying the DeepLabV3+ image segmentation model by using the training set and the verification set to obtain a trained DeepLabV3+ image segmentation model;
and step 5, identifying the ponding area of each image in the test set by using the trained DeepLabV3+ image segmentation model, calculating the proportion of the number of pixels in the ponding area to the number of pixels in the whole image, namely the ponding pixel ratio, representing the dynamic change of the ponding area according to the ponding pixel ratio, and issuing an early warning when the ponding pixel ratio exceeds a preset threshold value.
2. The method for identifying and pre-warning urban waterlogging video based on deep learning according to claim 1, wherein in the step 1, a method of visual interpretation is utilized to frame and select a ponding area in an original ponding image by using an irregular polygon, so as to obtain a marked ponding image.
3. The urban waterlogging video recognition and early warning method based on deep learning according to claim 1, wherein in the step 2, the pixel value of the ponding area in the marked ponding image is assigned 1, and the pixel value of the rest part is assigned 0, so as to obtain a binarized image.
4. The urban waterlogging video recognition early warning method based on deep learning according to claim 1, wherein in the step 3, the specific operation of data enhancement is as follows:
1) Randomly selecting a number a from 0 to 1, if a is between 0 and 0.5, performing data enhancement operation, otherwise, not performing data enhancement operation;
2) Randomly selecting a number b from 0 to 1, randomly scaling the length and the width of the original ponding image if b is between 0 and 0.25, randomly selecting the scaling factor from 0.25 to 2, and performing scaling operation of the same scaling factor on the binarized image corresponding to the original ponding image;
3) Randomly selecting a number c from 0 to 1; if c is between 0.25 and 0.5, randomly rotating the original ponding image, with the rotation angle randomly selected from 0 to 360 degrees, and rotating the binarized image corresponding to the original ponding image by the same angle;
4) Randomly selecting a number d from 0 to 1, and if d is between 0.5 and 0.75, carrying out Gaussian blur on an original ponding image, wherein the size of a blur kernel is set to be 5 multiplied by 5;
5) And randomly selecting a number e from 0 to 1, if the number e is between 0.75 and 1, performing color gamut conversion on the original ponding image, converting the original ponding image into an HSV color space, and performing random conversion on hue, saturation and brightness.
5. The urban waterlogging video recognition and early-warning method based on deep learning according to claim 1, wherein in step 4 the DeepLabV3+ image segmentation model comprises an encoding part and a decoding part; the encoding part comprises a backbone feature extraction network performing four downsampling stages and an enhanced feature extraction network formed by an atrous spatial pyramid pooling (ASPP) module and a 1 × 1 convolution layer; the backbone feature extraction network comprises first to fourth downsampling modules and a 1 × 1 convolution layer connected in sequence, wherein the first downsampling module comprises a 3 × 3 convolution layer and two 3 × 3 bottleneck structures connected in sequence, the second downsampling module comprises two 3 × 3 bottleneck structures connected in sequence, the third downsampling module comprises three 3 × 3 bottleneck structures connected in sequence, and the fourth downsampling module comprises seven 3 × 3 bottleneck structures connected in sequence; the ASPP module comprises five parallel branches, namely a 1 × 1 convolution layer, three parallel 3 × 3 atrous convolution layers with dilation rates of 6, 12 and 18 respectively, and a global pooling layer; the decoding part comprises a 1 × 1 convolution layer, a 3 × 3 convolution layer, and first and second 4× upsampling modules;
the enhanced ponding images and their corresponding binarized images are taken as input to the DeepLabV3+ image segmentation model; the backbone feature extraction network produces a feature tensor at 1/16 of the original image size and a convolutional feature layer at 1/4 of the original image size; the feature tensor is fed into the ASPP module, where the five parallel branches process it in parallel and their outputs are concatenated, and a 1 × 1 convolution is applied to the concatenated layer to obtain the output of the enhanced feature extraction network; the first 4× upsampling module of the decoding part upsamples this output by a factor of four; the convolutional feature layer produced by the backbone is passed through a 1 × 1 convolution and channel-concatenated with the layer output by the first 4× upsampling module; the concatenated result then passes through a 3 × 3 convolution and the second 4× upsampling module to obtain the output of the DeepLabV3+ image segmentation model.
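The spatial-size bookkeeping implied by claim 5 (1/16 encoder tensor, 1/4 low-level layer, two 4× upsamplings) can be traced with a short sketch. This only checks shapes, not the convolutions themselves; it assumes the input height and width are divisible by 16, and all names are illustrative.

```python
def deeplabv3plus_shapes(h, w):
    """Trace spatial sizes through the claim-5 encoder/decoder path.
    Returns a dict of stage name -> (height, width); channels omitted."""
    s = {}
    s["input"] = (h, w)
    s["low_level"] = (h // 4, w // 4)       # backbone 1/4 feature layer
    s["encoder_out"] = (h // 16, w // 16)   # backbone 1/16 tensor; ASPP keeps size
    # First 4x upsample of the enhanced-feature output:
    s["after_up1"] = (s["encoder_out"][0] * 4, s["encoder_out"][1] * 4)
    # Sizes must match for the channel concatenation with the 1/4 layer:
    assert s["after_up1"] == s["low_level"]
    # 3x3 convolution keeps size; second 4x upsample restores full resolution:
    s["output"] = (s["after_up1"][0] * 4, s["after_up1"][1] * 4)
    return s
```

For a 512 × 512 input this gives a 32 × 32 encoder tensor, a 128 × 128 concatenation stage, and a full-resolution 512 × 512 segmentation output.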
6. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the deep-learning-based urban waterlogging video recognition and early-warning method of any one of claims 1 to 5.
7. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the deep-learning-based urban waterlogging video recognition and early-warning method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311356080.XA CN117456356A (en) | 2023-10-19 | 2023-10-19 | Urban waterlogging video recognition early warning method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311356080.XA CN117456356A (en) | 2023-10-19 | 2023-10-19 | Urban waterlogging video recognition early warning method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117456356A true CN117456356A (en) | 2024-01-26 |
Family
ID=89593953
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311356080.XA Pending CN117456356A (en) | 2023-10-19 | 2023-10-19 | Urban waterlogging video recognition early warning method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117456356A (en) |
2023-10-19: CN application CN202311356080.XA, patent/CN117456356A/en, active, Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117746342A (en) * | 2024-02-19 | 2024-03-22 | 广州市突发事件预警信息发布中心(广州市气象探测数据中心) | Method for identifying road ponding by utilizing public video |
CN117746342B (en) * | 2024-02-19 | 2024-05-17 | 广州市突发事件预警信息发布中心(广州市气象探测数据中心) | Method for identifying road ponding by utilizing public video |
CN118470659A (en) * | 2024-07-15 | 2024-08-09 | 南昌航空大学 | Waterlogging detection method and device based on denoising diffusion model under urban monitoring view angle |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110084817B (en) | Digital elevation model production method based on deep learning | |
CN110059698B (en) | Semantic segmentation method and system based on edge dense reconstruction for street view understanding | |
CN117456356A (en) | Urban waterlogging video recognition early warning method based on deep learning | |
CN113139543B (en) | Training method of target object detection model, target object detection method and equipment | |
CN109840483B (en) | Landslide crack detection and identification method and device | |
CN113610822A (en) | Surface defect detection method based on multi-scale information fusion | |
RU2008129793A (en) | METHOD FOR IMPROVING FURTHER PROCESSING OF IMAGES USING DEFORMABLE NETS | |
CN114742799B (en) | Industrial scene unknown type defect segmentation method based on self-supervision heterogeneous network | |
CN116777898B (en) | Method for realizing crack measurement in 3D printing retaining wall construction process based on AFFormer | |
CN112906794A (en) | Target detection method, device, storage medium and terminal | |
CN110059769A (en) | The semantic segmentation method and system rebuild are reset based on pixel for what streetscape understood | |
CN116110036B (en) | Electric power nameplate information defect level judging method and device based on machine vision | |
CN117557775B (en) | Substation power equipment detection method and system based on infrared and visible light fusion | |
CN110555424A (en) | port container layout detection method, device, equipment and readable storage medium | |
CN112989995A (en) | Text detection method and device and electronic equipment | |
CN115410081A (en) | Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium | |
CN116958827A (en) | Deep learning-based abandoned land area extraction method | |
CN113762265A (en) | Pneumonia classification and segmentation method and system | |
CN112614094B (en) | Insulator string abnormity positioning and identifying method based on sequence state coding | |
CN116485802A (en) | Insulator flashover defect detection method, device, equipment and storage medium | |
CN116109518A (en) | Data enhancement and segmentation method and device for metal rust image | |
CN114494236A (en) | Fabric defect detection method and system based on over-complete convolutional neural network | |
CN116563538B (en) | Image segmentation method and system | |
CN117152621B (en) | Building change detection method, device, electronic equipment and storage medium | |
CN118053150B (en) | Supervision method based on text detail graph as end-to-end text detection and recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||