WO2022067668A1 - Fire detection method, system, terminal and storage medium based on video image target detection
- Publication number
- WO2022067668A1 (application PCT/CN2020/119413)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- image
- fire
- lfnet
- feature extraction
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Definitions
- the present application belongs to the technical field of fire detection, and in particular relates to a fire detection method, system, terminal and storage medium based on video image target detection.
- Fire detection plays a vital role in security monitoring.
- the traditional fire detection method relies on image priors, detecting fire from the color and shape of image regions.
- the robustness and error rate of color and motion features are often limited by preset parameters, so such methods cannot be applied in complex environments, and their positioning accuracy is easily affected by the region.
- CNN: convolutional neural network
- Methods based on deep learning require a large number of remote sensing images as training data. Because real remote sensing images are scarce, training the model is very challenging.
- the anti-interference ability of such methods is also weak, and they are easily affected by harsh monitoring environments such as haze and dust.
- the present application provides a fire detection method, system, terminal and storage medium based on video image target detection, aiming to solve one of the above technical problems in the prior art at least to a certain extent.
- a fire detection method based on video image target detection comprising:
- Construct a convolutional neural network model LFNet, input the data set into the LFNet model for iterative training, and obtain optimal model parameters;
- the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model.
- the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model is used for further feature extraction on the main features to generate three sets of feature maps;
- the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps, and outputs detection results;
- the technical solution adopted in the embodiment of the present application further includes: before the data enhancement algorithm based on the atmospheric scattering model is used to convert the original natural image into the haze image and the sand-dust image, the method includes:
- An original natural image is obtained; the original natural image includes a non-alarm image without a fire alarm area and a real fire alarm image.
- the technical solution adopted in the embodiment of the present application also includes: the use of the data enhancement algorithm based on the atmospheric scattering model to convert the original natural image into a haze image includes:
- the atmospheric scattering model adopts at least two transmission rates to simulate and generate haze images with different concentrations; the haze image imaging formula is:
$$I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$$
- I(x) is the simulated haze image
- J(x) is the input haze-free image
- $\alpha$ is the atmospheric light value
- t(x) is the scene transmission rate.
- the technical solution adopted in the embodiment of the present application further includes: the conversion of the original natural image into the sand and dust image by the data enhancement algorithm based on the atmospheric scattering model includes:
- the atmospheric scattering model adopts a fixed transmittance and atmospheric light value, and combines three colors to simulate and generate sand and dust images with different concentrations; the sand and dust image simulation formula is:
$$D(x) = J(x)\,t(x) + \alpha\,\bigl(C(x)\cdot(1 - t(x))\bigr)$$
- D(x) is the simulated dust image
- J(x) is the input haze-free image
- C(x) is the color value
- the technical solution adopted in the embodiment of the present application further includes: the inputting the data set into the LFNet model for iterative training includes:
- the skeleton feature extraction model adopts convolutions with scales of $3*3$, $5*5$ and $7*7$ to extract the features of the input image, obtaining feature maps with sizes of $13*13$, $26*26$ and $52*52$, respectively.
- the main feature extraction model performs further feature extraction on the main features, generating three sets of feature maps with sizes of $52*52$, $26*26$ and $13*13$.
- the variable-scale feature fusion model convolves the three sets of feature maps with different convolution kernels and strides, and splices all convolutions of the same size to obtain three sets of feature maps. It then applies the channel-based attention mechanism to the three sets of feature maps to obtain feature maps with sizes of $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively.
- the inputting the data set into the LFNet model for iterative training further includes:
- the mean square error and cross entropy are respectively selected as loss functions for model optimization.
- the loss function is specifically:
- R() represents the R channel of the image
- SCP(x) is the difference between the image brightness and the dark channel:
$$\mathrm{SCP}(x) = \lVert v(x) - \mathrm{DCP}(x) \rVert$$
- v(x) is the brightness of the image
- DCP(x) is the value of the dark channel of the image
- the combustion histogram prior loss is:
$$L_{CHP} = \lVert \mathrm{CHP}(I) - \mathrm{CHP}(R) \rVert_2$$
- CHP represents the combustion histogram prior
- CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and the marked area, respectively
- the loss function is a weighted summation of three different loss functions:
$$L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$$
- $L$ is the final loss function
- $L_{CE}$ is the cross-entropy loss function
- $L_{MSE}$ is the mean square error loss function
- $L_{CHP}$ is the combustion histogram prior loss.
- a fire detection system based on video image target detection comprising:
- Data set building module: used to convert the original natural image into a haze image and a sand-dust image by using the data enhancement algorithm based on the atmospheric scattering model, and generate a data set for training the model;
- LFNet model training module: used to construct a convolutional neural network model LFNet, and input the data set into the LFNet model for iterative training to obtain optimal model parameters;
- the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model;
- the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales;
- the main feature extraction model is used for further feature extraction on the main features to generate three sets of feature maps;
- the variable-scale feature fusion model performs adaptive fusion on the three groups of feature maps, and outputs detection results;
- the detection results include the fire location area of the fire image and the fire type.
- a terminal includes a processor and a memory coupled to the processor, wherein,
- the memory stores program instructions for implementing the video image target detection-based fire detection method
- the processor is configured to execute the program instructions stored in the memory to control fire detection based on video image object detection.
- a storage medium storing program instructions executable by a processor, where the program instructions are used to execute the fire detection method based on video image target detection.
- the beneficial effects of the embodiments of the present application are: the fire detection method, system, terminal and storage medium based on video image target detection according to the embodiments of the present application convert the original images into images subject to different degrees of haze or sand dust by using the data enhancement algorithm based on the atmospheric scattering model, generate a data set for training the model, and build a convolutional neural network model LFNet suitable for fire and smoke detection in uncertain environments. This improves the robustness of the model under abnormal weather such as sand dust and haze, so that the model obtains better detection results.
- the size of the LFNet model in the embodiment of the present application is small, the computation cost can be reduced, and the LFNet model can be applied to a resource-constrained device.
- FIG. 1 is a flowchart of a fire detection method based on video image target detection according to an embodiment of the present application
- FIG. 2 is a schematic diagram of the simulation effect of haze and sand dust images based on an atmospheric scattering model according to an embodiment of the present application
- FIG. 3 is a frame diagram of a convolutional neural network model according to an embodiment of the present application.
- FIG. 4 is a structural diagram of a variable-scale feature fusion model according to an embodiment of the present application.
- FIG. 5 is a structural diagram of a channel-based attention mechanism according to an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of a fire detection system based on video image target detection according to an embodiment of the application
- FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.
- FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
- FIG. 1 is a flowchart of a fire detection method based on video image target detection according to an embodiment of the present application.
- the fire detection method based on video image target detection according to the embodiment of the present application includes the following steps:
- the acquired original natural images include 293 non-alarm images without fire alarm areas and 5073 real fire alarm images.
- non-alarm images can improve the robustness of the training algorithm to non-alarm targets and reduce the error rate of the detector.
- real fire alarm images can improve the detection ability of the target detection model.
- the present invention considers the influence of abnormal weather on the fire detection algorithm and simulates haze images and sand-dust images of different levels through a data enhancement method based on an atmospheric scattering model. The original natural images are thereby converted into new synthetic images affected by different degrees of dust and haze weather, and large-scale benchmark data sets are built for training and testing fire detection models, so as to improve the robustness of object detection models under abnormal weather conditions such as dust and haze.
- FIG. 2 is a schematic diagram of the simulation effect of haze and sand dust images based on the atmospheric scattering model according to the embodiment of the present application, wherein (a) is the original image, (b), (c) and (d) are the haze images synthesized by atmospheric scattering models with different transmission rates, and (e), (f) and (g) are sand and dust images simulated with three different colors using fixed transmittance and atmospheric light values.
- the imaging formula of the haze image is:
$$I(x) = J(x)\,t(x) + \alpha\,(1 - t(x))$$
- I(x) is the simulated haze image
- J(x) is the input haze-free image
- $\alpha$ is the atmospheric light value
- t(x) is the scene transmission rate, which describes the portion of light that is not scattered and reaches the camera sensor.
- the atmospheric light value $\alpha$ is set to 0.8 in the embodiment of the present application
- the transmittance is set to 0.8, 0.6 and 0.4, respectively.
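The haze synthesis can be sketched in a few lines. This is a minimal illustration, assuming the formula I(x) = J(x)t(x) + α(1 − t(x)) given in claim 3, pixel values normalized to [0, 1], and a transmission rate that is constant over the image; the function name `add_haze` and the toy image are hypothetical and not part of the application.

```python
def add_haze(image, transmission, airlight=0.8):
    """Apply I = J*t + A*(1 - t) to every channel value of an image.

    `image` is a nested list [row][pixel][channel] with values in [0, 1].
    A spatially constant transmission is an assumption of this sketch.
    """
    return [
        [
            [c * transmission + airlight * (1.0 - transmission) for c in pixel]
            for pixel in row
        ]
        for row in image
    ]


# Toy 1x2 RGB image standing in for an original natural image.
clear = [[[0.0, 0.5, 1.0], [0.2, 0.2, 0.2]]]

# One synthetic haze image per transmission value named in the embodiment.
hazy_levels = {t: add_haze(clear, t) for t in (0.8, 0.6, 0.4)}
```

A lower transmission pulls every pixel closer to the atmospheric light value, which is what produces the denser haze.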
- the embodiment of the present application selects three colors suitable for simulating sand and dust images, and the sand and dust image simulation formula is:
$$D(x) = J(x)\,t(x) + \alpha\,\bigl(C(x)\cdot(1 - t(x))\bigr)$$
- D(x) is the simulated dust image
- J(x) is the input haze-free image
- C(x) is the selected color value.
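A sand-dust image can be sketched the same way, assuming the formula D(x) = J(x)t(x) + α(C(x)(1 − t(x))) from claim 4. The application does not list the three colors or the fixed transmittance used for dust, so the `SAND_COLORS` values and the default `transmission=0.6` below are placeholders chosen only for illustration.

```python
# Hypothetical dust tints (RGB in [0, 1]); the application does not give them.
SAND_COLORS = {
    "yellow": (0.93, 0.79, 0.45),
    "orange": (0.90, 0.60, 0.30),
    "brown": (0.65, 0.50, 0.35),
}


def add_dust(image, color, transmission=0.6, airlight=0.8):
    """Apply D = J*t + A*(C*(1 - t)) per channel.

    `image` is a nested list [row][pixel][channel] with values in [0, 1];
    `color` is the per-channel tint C. The transmission default is assumed.
    """
    return [
        [
            [
                j * transmission + airlight * (c * (1.0 - transmission))
                for j, c in zip(pixel, color)
            ]
            for pixel in row
        ]
        for row in image
    ]


clear = [[[0.5, 0.5, 0.5]]]
dusty = {name: add_dust(clear, rgb) for name, rgb in SAND_COLORS.items()}
```

Because the tint multiplies the scattered term per channel, a color with a weak blue component leaves the blue channel darker, giving the yellowish cast typical of dust.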
- LFNet consists of common convolutional layers, bottleneck building blocks, parametric rectified linear units (PReLU), group normalization, etc., and includes: a skeleton feature extraction model, a main feature extraction model, and a variable-scale feature fusion model.
- the functions of each model are as follows:
- Skeleton feature extraction model: used to extract the main features of the input image. To extract richer image features and expand the receptive field, convolutions with scales of $3*3$, $5*5$ and $7*7$ are first applied to the input image. After the three convolutions of different scales, feature maps with sizes of $13*13$, $26*26$ and $52*52$ are obtained, respectively. Multi-scale convolution extracts feature information of different sizes around each pixel, which is particularly important for fire images.
- Main feature extraction model: used for further feature extraction on the main features extracted by the skeleton feature extraction model, generating three sets of feature maps with sizes of $52*52$, $26*26$ and $13*13$. Each smaller feature map is extracted from the larger feature map of the preceding layer, and each convolution block consists of a one-layer convolutional structure and a five-layer residual structure.
- Variable-scale feature fusion model: uses variable-scale feature fusion (VSFF) to concatenate the features extracted by the main feature extraction model, and then uses convolution to extract features and fuse them adaptively.
- the structure of the variable-scale feature fusion model is shown in Figure 4.
- three sets of feature maps are fused, and the features of $13*13$ and $26*26$ are extended to $52*52$.
- the three inputs are feature maps with sizes of $13*13$, $26*26$, and $52*52$, respectively.
- the three feature maps of different sizes are convolved with different convolution kernels and strides, so that each is upsampled or downsampled to the other two sizes.
- concatenate all convolutions of the same size to obtain three sets of feature maps. Since the feature map obtained by splicing contains richer image features, it can make the model localization more accurate.
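The size bookkeeping behind this rescaling can be checked with the standard output-size formulas for strided and transposed convolutions. The application does not state which kernels, strides or paddings VSFF uses, so the choices below (kernel 3 with padding 1 for downsampling, kernel equal to stride for upsampling) are assumptions made only to show that every size can reach the other two.

```python
SIZES = (13, 26, 52)


def conv_out(n, kernel, stride, padding):
    """Output width of a strided convolution on an n-wide input."""
    return (n + 2 * padding - kernel) // stride + 1


def deconv_out(n, kernel, stride, padding=0):
    """Output width of a transposed convolution on an n-wide input."""
    return (n - 1) * stride - 2 * padding + kernel


def rescaled_size(src, dst):
    """Width after mapping a src-wide map toward dst with the assumed layers."""
    if dst == src:
        return src
    if dst < src:  # downsample with a strided 3x3 convolution
        return conv_out(src, kernel=3, stride=src // dst, padding=1)
    factor = dst // src  # upsample with a transposed convolution
    return deconv_out(src, kernel=factor, stride=factor)


# For each target size, the three inputs end up with matching widths,
# so they can be concatenated along the channel axis.
plan = {dst: [rescaled_size(src, dst) for src in SIZES] for dst in SIZES}
```

With these assumed settings, `plan[26]` lists 26 three times, which is the precondition for splicing maps of the same size.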
- the embodiment of the present application utilizes a channel-based attention mechanism to operate three sets of feature maps extracted from the VSFF.
- the channel-based attention mechanism can be viewed as a process of weighting feature maps according to their importance. For example, in a set of 24×13×13 feature maps, the channel-based attention mechanism determines which of the feature maps has a more significant impact on the prediction results, and then increases the weight of that part. With the help of the attention mechanism, three fusions are performed to obtain feature maps with sizes of $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively.
- the detailed structure of the channel-based attention mechanism is shown in Figure 5.
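The idea of weighting feature maps by importance can be illustrated with a toy channel gate. This is only a sketch of the general technique: the structure in Figure 5 is not described in the text, and a real channel attention layer learns its weights, whereas the function below (a hypothetical `channel_attention`) derives them from the per-channel mean through a sigmoid as a stand-in.

```python
import math


def channel_attention(feature_maps):
    """Scale each channel by a weight derived from its global average.

    `feature_maps` is a list of 2-D lists, one per channel. The sigmoid of
    the centered channel mean replaces the learned layers of a real network.
    """
    means = [
        sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
        for fm in feature_maps
    ]
    center = sum(means) / len(means)
    weights = [1.0 / (1.0 + math.exp(-(m - center))) for m in means]
    scaled = [
        [[w * v for v in row] for row in fm]
        for fm, w in zip(feature_maps, weights)
    ]
    return weights, scaled


# Two 2x2 channels: the second responds more strongly and gets a larger weight.
weights, scaled = channel_attention([[[1, 1], [1, 1]], [[3, 3], [3, 3]]])
```

The output keeps the input shape, so the gate can sit between the concatenation step and the detection heads without changing any sizes.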
- the LFNet model of the embodiment of the present application is very small (22.5M), yet it leads in both quantitative and qualitative evaluation, which reduces the computational cost and favors the application of LFNet to resource-constrained devices.
- the LFNet model has two tasks: one is to accurately locate the warning area in the image; the other is to classify the disaster types in the warning area.
- MSE: mean square error
- CE: cross entropy
- the loss function is based on a large number of statistics on different fire images or videos, which can help LFNet detect fire areas effectively.
- the embodiments of the present application regard these statistical data as the combustion histogram prior (CHP) and, according to these statistical data, write it as the formula of CHP:
- R() represents the R channel of the image
- SCP(x) is the difference between the image brightness and the dark channel, which can also be written as:
$$\mathrm{SCP}(x) = \lVert v(x) - \mathrm{DCP}(x) \rVert$$
- v(x) is the brightness of the image
- DCP(x) is the value of the dark channel of the image.
- the combustion histogram prior loss is:
$$L_{CHP} = \lVert \mathrm{CHP}(I) - \mathrm{CHP}(R) \rVert_2$$
- CHP represents the combustion histogram prior
- CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and the area marked in the ground truth, respectively.
- the final loss function is the weighted summation of three different loss functions: cross entropy loss function, mean square error loss function and combustion histogram prior loss function.
- the formula is:
$$L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$$
- $L$ is the final loss function
- $L_{CE}$ is the cross-entropy loss function
- $L_{MSE}$ is the mean square error loss function
- $L_{CHP}$ is the combustion histogram prior loss
- $\beta$, $\gamma$ and $\delta$ are set to 0.25, 0.25 and 0.5, respectively.
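Two pieces of the loss can be sketched directly from the claims: SCP(x) = ||v(x) − DCP(x)|| and the weighted sum of the cross-entropy, mean square error and CHP terms with weights 0.25, 0.25 and 0.5. The sketch assumes that brightness is the maximum RGB channel and that the dark channel is the per-pixel minimum channel (without a neighborhood minimum); neither definition is spelled out in the text, and the function names are hypothetical. The CHP formula itself is not reproduced here.

```python
def scp(pixel):
    """Difference between brightness and dark channel for one RGB pixel.

    Assumptions: v(x) is the maximum channel (HSV value) and DCP(x) is the
    per-pixel minimum channel, with no patch-wise minimum.
    """
    brightness = max(pixel)
    dark_channel = min(pixel)
    return abs(brightness - dark_channel)


def total_loss(l_ce, l_mse, l_chp, beta=0.25, gamma=0.25, delta=0.5):
    """Weighted sum of the three loss terms with the embodiment's weights."""
    return beta * l_ce + gamma * l_mse + delta * l_chp


# A saturated flame-like pixel has a large brightness/dark-channel gap.
flame_gap = scp((0.9, 0.4, 0.1))
gray_gap = scp((0.5, 0.5, 0.5))
example_loss = total_loss(1.0, 2.0, 4.0)
```

Giving the CHP term half of the total weight means a detection whose region statistics differ from those of the labeled fire area is penalized as much as the classification and localization errors combined.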
- S50 Input the fire image to be detected into the trained LFNet model, and output the fire location area and fire type of the fire image to be detected through the LFNet model.
- FIG. 6 is a schematic structural diagram of a fire detection system based on video image target detection according to an embodiment of the present application.
- the fire detection system 40 based on video image target detection according to the embodiment of the present application includes:
- Data set building module 41: used to convert the original natural image into a haze image and a dust image by using a data enhancement algorithm based on the atmospheric scattering model, and generate a data set for training the model;
- LFNet model training module 42: used to construct a convolutional neural network model LFNet, and input the data set into the LFNet model for iterative training to obtain optimal model parameters;
- the convolutional neural network model LFNet includes a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model;
- the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales;
- the main feature extraction model is used for further feature extraction on the main features, generating three sets of feature maps;
- the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps, and outputs detection results;
- Model optimization module 43: used to select mean square error and cross entropy respectively as loss functions for model optimization.
- FIG. 7 is a schematic structural diagram of a terminal according to an embodiment of the present application.
- the terminal 50 includes a processor 51 and a memory 52 coupled to the processor 51.
- the memory 52 stores program instructions for implementing the above-mentioned fire detection method based on video image object detection.
- the processor 51 is configured to execute program instructions stored in the memory 52 to control fire detection based on video image object detection.
- the processor 51 may also be referred to as a CPU (central processing unit).
- the processor 51 may be an integrated circuit chip with signal processing capability.
- the processor 51 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
- FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
- the storage medium of this embodiment of the present application stores a program file 61 capable of implementing all the above methods, wherein the program file 61 may be stored in the above-mentioned storage medium in the form of a software product, and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor execute all or part of the steps of the methods of the various embodiments of the present invention.
- the aforementioned storage medium includes: a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk or other media that can store program code, or terminal devices such as computers, servers, mobile phones, and tablets.
- the fire detection method, system, terminal, and storage medium based on video image target detection convert the original image into an image affected by different degrees of haze or sand by using a data enhancement algorithm based on an atmospheric scattering model, generate the data set for training the model, and construct a convolutional neural network model LFNet suitable for fire and smoke detection in uncertain environments, which improves the robustness of the model under abnormal weather such as sand, dust and haze and enables the model to obtain better detection results.
- the size of the LFNet model in the embodiment of the present application is small, the computation cost can be reduced, and the LFNet model can be applied to a resource-constrained device.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Fire-Detection Mechanisms (AREA)
Claims (10)
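- 1. A fire detection method based on video image target detection, characterized by comprising: converting original natural images into haze images and sand-dust images by using a data enhancement algorithm based on an atmospheric scattering model, and generating a data set for training a model; constructing a convolutional neural network model LFNet, and inputting the data set into the LFNet model for iterative training to obtain optimal model parameters; the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model is used for further feature extraction on the main features to generate three sets of feature maps; the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results; inputting a fire image to be detected into the trained LFNet model, and outputting, through the LFNet model, the fire location area and the fire type of the fire image to be detected.
- 2. The fire detection method based on video image target detection according to claim 1, characterized in that, before the converting of original natural images into haze images and sand-dust images by using the data enhancement algorithm based on the atmospheric scattering model, the method comprises: acquiring original natural images; the original natural images include non-alarm images without a fire alarm area and real fire alarm images.
- 3. The fire detection method based on video image target detection according to claim 1 or 2, characterized in that the converting of original natural images into haze images by using the data enhancement algorithm based on the atmospheric scattering model comprises: the atmospheric scattering model uses at least two transmission rates to simulate and generate haze images of different concentrations; the haze image imaging formula is: $I(x) = J(x)t(x) + \alpha(1 - t(x))$; in the above formula, I(x) is the simulated haze image, J(x) is the input haze-free image, $\alpha$ is the atmospheric light value, and t(x) is the scene transmission rate.
- 4. The fire detection method based on video image target detection according to claim 3, characterized in that the converting of original natural images into sand-dust images by using the data enhancement algorithm based on the atmospheric scattering model comprises: the atmospheric scattering model uses a fixed transmittance and atmospheric light value, combined with three colors, to simulate and generate sand-dust images of different concentrations; the sand-dust image simulation formula is: $D(x) = J(x)t(x) + \alpha(C(x) \cdot (1 - t(x)))$; in the above formula, D(x) is the simulated sand-dust image, J(x) is the input haze-free image, and C(x) is the color value.
- 5. The fire detection method based on video image target detection according to claim 1, characterized in that the inputting of the data set into the LFNet model for iterative training comprises: the skeleton feature extraction model uses convolutions with scales of $3*3$, $5*5$ and $7*7$ to extract features of the input image, obtaining feature maps with sizes of $13*13$, $26*26$ and $52*52$, respectively; the main feature extraction model performs further feature extraction on the main features, generating three sets of feature maps with sizes of $52*52$, $26*26$ and $13*13$, respectively; the variable-scale feature fusion model maps the three sets of feature maps to different convolution kernels and strides for convolution, and splices all convolutions of the same size to obtain three sets of feature maps, and uses a channel-based attention mechanism to operate on the three sets of feature maps to obtain feature maps with sizes of $13*13$, $26*26$ and $52*52$, which are used to detect small, medium and large objects, respectively.
- 6. The fire detection method based on video image target detection according to claim 5, characterized in that the inputting of the data set into the LFNet model for iterative training further comprises: selecting mean square error and cross entropy respectively as loss functions for model optimization.
- 7. The fire detection method based on video image target detection according to claim 6, characterized in that the loss function is specifically: statistics are collected on the brightness, dark channel values and R-channel data of the paths of the fire area, and the statistical data are regarded as the combustion histogram prior and written as the CHP formula: in the above formula, R() represents the R channel of the image, and SCP(x) is the difference between the image brightness and the dark channel; $\mathrm{SCP}(x) = \lVert v(x) - \mathrm{DCP}(x) \rVert$; in the above formula, v(x) is the brightness of the image, and DCP(x) is the value of the dark channel of the image; $L_{CHP} = \lVert \mathrm{CHP}(I) - \mathrm{CHP}(R) \rVert_2$; in the above formula, CHP represents the combustion histogram prior, and CHP(I) and CHP(R) represent the CHP values of the area selected by the target detection algorithm and the marked area, respectively; the loss function is a weighted summation of three different loss functions: $L = \beta L_{CE} + \gamma L_{MSE} + \delta L_{CHP}$; in the above formula, $L$ is the final loss function, $L_{CE}$ is the cross-entropy loss function, $L_{MSE}$ is the mean square error loss function, and $L_{CHP}$ is the combustion histogram prior loss.
- 8. A fire detection system based on video image target detection, characterized by comprising: a data set building module, used to convert original natural images into haze images and sand-dust images by using a data enhancement algorithm based on an atmospheric scattering model, and generate a data set for training a model; an LFNet model training module, used to construct a convolutional neural network model LFNet, and input the data set into the LFNet model for iterative training to obtain optimal model parameters; the convolutional neural network model LFNet comprises a skeleton feature extraction model, a main feature extraction model and a variable-scale feature fusion model; the skeleton feature extraction model extracts the main features of the input image through convolutions of three different scales; the main feature extraction model is used for further feature extraction on the main features to generate three sets of feature maps; the variable-scale feature fusion model performs adaptive fusion on the three sets of feature maps and outputs detection results; the detection results include the fire location area of the fire image and the fire type.
- 9. A terminal, characterized in that the terminal comprises a processor and a memory coupled to the processor, wherein the memory stores program instructions for implementing the fire detection method based on video image target detection according to any one of claims 1 to 7; the processor is configured to execute the program instructions stored in the memory to control fire detection based on video image target detection.
- 10. A storage medium, characterized by storing program instructions executable by a processor, the program instructions being used to execute the fire detection method based on video image target detection according to any one of claims 1 to 7.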
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/119413 WO2022067668A1 (zh) | 2020-09-30 | 2020-09-30 | Fire detection method, system, terminal and storage medium based on video image target detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/119413 WO2022067668A1 (zh) | 2020-09-30 | 2020-09-30 | Fire detection method, system, terminal and storage medium based on video image target detection |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022067668A1 (zh) | 2022-04-07 |
Family
ID=80949324
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/119413 WO2022067668A1 (zh) | Fire detection method, system, terminal and storage medium based on video image target detection | 2020-09-30 | 2020-09-30 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022067668A1 (zh) |
-
2020
- 2020-09-30 WO PCT/CN2020/119413 patent/WO2022067668A1/zh active Application Filing
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20955693; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20955693; Country of ref document: EP; Kind code of ref document: A1 |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20955693; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14/12/2023) |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20955693; Country of ref document: EP; Kind code of ref document: A1 |