CN113269067B - Periodic industrial video clip key frame two-stage extraction method based on deep learning - Google Patents
Periodic industrial video clip key frame two-stage extraction method based on deep learning
- Publication number
- CN113269067B (application CN202110532120.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- key frame
- sequence
- image sequence
- images
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000605 extraction Methods 0.000 title claims abstract description 25
- 238000013135 deep learning Methods 0.000 title claims abstract description 24
- 230000000737 periodic effect Effects 0.000 title claims abstract description 19
- 238000000034 method Methods 0.000 claims abstract description 55
- 238000004519 manufacturing process Methods 0.000 claims abstract description 41
- 230000008569 process Effects 0.000 claims abstract description 26
- 239000011159 matrix material Substances 0.000 claims abstract description 25
- 230000011218 segmentation Effects 0.000 claims abstract description 25
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 12
- 238000007781 pre-processing Methods 0.000 claims abstract description 8
- 238000012216 screening Methods 0.000 claims abstract description 6
- 238000012360 testing method Methods 0.000 claims description 23
- 238000012549 training Methods 0.000 claims description 22
- 238000012545 processing Methods 0.000 claims description 6
- 238000012937 correction Methods 0.000 claims description 3
- 230000009466 transformation Effects 0.000 claims description 3
- 238000013519 translation Methods 0.000 claims description 3
- 230000009191 jumping Effects 0.000 claims 4
- 238000005516 engineering process Methods 0.000 abstract description 3
- 230000004927 fusion Effects 0.000 abstract 1
- 238000009776 industrial production Methods 0.000 description 6
- 238000012544 monitoring process Methods 0.000 description 5
- 238000005245 sintering Methods 0.000 description 5
- 238000010586 diagram Methods 0.000 description 4
- 238000004422 calculation algorithm Methods 0.000 description 3
- 230000008859 change Effects 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 238000010606 normalization Methods 0.000 description 3
- 238000012935 Averaging Methods 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 238000002485 combustion reaction Methods 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 238000004088 simulation Methods 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 238000003860 storage Methods 0.000 description 2
- 230000001360 synchronised effect Effects 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- 229910000831 Steel Inorganic materials 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 239000000428 dust Substances 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000011049 filling Methods 0.000 description 1
- 239000000446 fuel Substances 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000001746 injection moulding Methods 0.000 description 1
- 238000003909 pattern recognition Methods 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 230000003595 spectral effect Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 239000010959 steel Substances 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 238000013526 transfer learning Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06316—Sequencing of tasks or work
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Human Resources & Organizations (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Educational Administration (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Development Economics (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Game Theory and Decision Science (AREA)
- Software Systems (AREA)
- Marketing (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a two-stage method, based on deep learning, for extracting key frames from periodic industrial video segments. The method comprises: acquiring industrial video images, extracting a region of interest, and preprocessing it to obtain a preprocessed image sequence; constructing a deep-learning-based semantic segmentation network model and extracting the target region of each preprocessed image; in the first stage, constructing a convolutional neural network to classify the preprocessed images and segmenting their time series to obtain a set of candidate key frame sequences; and in the second stage, constructing a similarity matrix over the target regions and clustering, screening, and fusing the candidate key frame sequences to obtain the key frames. Addressing the complexity of industrial video features and the lack of both global and local perspectives in current methods, the invention introduces deep learning and a two-stage "global first, then local" strategy to extract industrial video key frames faster and more accurately, providing guidance for optimizing production and improving quality and yield.
Description
Technical Field
The present invention relates to the fields of machine vision, image processing, and pattern recognition, and in particular to a two-stage method, based on deep learning, for extracting key frames from periodic industrial video clips.
Background Art
The periodic production process is a common type of industrial production process in which a fixed series of procedures is executed repeatedly. For example, the steel sintering process cycles through "charging → ignition → pallet travel → discharging"; likewise, the injection molding process cycles through "mold closing → filling → pressure holding → cooling → mold opening → demolding".
Industrial video is a direct representation and indirect reflection of the operating-condition information of an industrial production process. For a given production procedure, the key frame is the image in its monitoring video segment that best reflects the current operating conditions of the production process, and it is one of the important characteristic parameters for evaluating those conditions. Owing to the complexity of industrial processes, however, current key frame extraction suffers from the following problems.
(1) Dynamics of the production cycle
In theory, for a periodic production process with a constant production rate, the time interval between key frames can be determined: once the first key frame is fixed manually, subsequent key frames follow from the production rate. In practice, however, fluctuations in materials, fuel, operations, environment, and other factors cause the production cycle to vary, so the interval between key frames cannot be fixed in advance.
(2) Similarity between procedures
In actual production, different procedures are often carried out at the same location, so their monitoring videos share many similar scenes. For example, in video of the sinter machine tail cross-section, the "pallet travel" and "discharging" procedures share the common "sinter bed" scene, while the "combustion zone" image unique to discharging occupies only a small portion of the frame. In terms of image features, this makes images from the two procedures highly similar. Traditional hand-crafted feature extraction cannot effectively overcome this similarity, which makes it difficult to segment the video into per-procedure clips.
(3) Similarity within a procedure
In actual production, equipment motions and the physical and chemical changes of materials and products are usually continuous, so differences between successive video frames are small and appear mainly in spatial position and texture, which traditional hand-crafted features cannot express effectively. For example, in the machine-tail cross-section video of the sintering process, the differences between cross-section images of the "discharging" procedure lie mainly in the spatial distribution and texture of the combustion zone; simple hand-crafted features such as brightness or histograms cannot describe these changes precisely. This makes key frame extraction within a procedure's video clip difficult.
Therefore, how to overcome the above problems, accurately extract industrial video image features, and rapidly extract key frames from periodic industrial video clips is an urgent issue in industrial process condition assessment.
Summary of the Invention
In view of the above technical problems, the present invention proposes a deep-learning-based key frame extraction method. Its purpose is to solve the problems of existing key frame extraction, namely that the intervals between key frames cannot be determined, inter-procedure similarity cannot be effectively overcome, and features cannot be described precisely, and to provide a method that accurately extracts industrial video image features and rapidly extracts key frames from periodic industrial video clips.
The present invention provides a two-stage, deep-learning-based method for extracting key frames from periodic production segments of industrial video, which specifically comprises:
S1: acquiring industrial video images, extracting region-of-interest images, and preprocessing them to obtain a preprocessed image sequence;
S2: constructing a deep-learning-based semantic segmentation network model and extracting the image target regions from the preprocessed image sequence;
S3: taking the output features of an intermediate layer of the semantic segmentation network model from step S2, constructing a convolutional neural network model, and performing binary classification on the preprocessed image sequence to obtain image category features;
S4: segmenting the preprocessed image sequence according to the image category features to obtain a set of candidate key frame sequences;
S5: computing the similarity between the target regions of the images in the candidate key frame sequence set, constructing a similarity matrix, and, with the similarity matrix as input, clustering the candidate key frame sequences to obtain a multi-category image set;
S6: constructing a key frame selection index and a weight matrix according to the actual needs of the industrial process, screening the multi-category image set with the selection index to obtain a key frame sequence, and computing the weighted average of the key frame sequence with the weight matrix to obtain the key frame.
Further, the preprocessing in step S1 includes denoising, color correction, and dehazing.
Further, step S2 specifically includes:
randomly selecting a number of first typical images from the preprocessed image sequence, screening out first mask images, and constructing a first training set and a first test set;
applying translation, scale, brightness, and rotation transformations to the first training set and first test set to obtain an augmented training set and test set;
constructing a deep semantic segmentation network model, training it on the augmented training set, and testing it on the augmented test set to obtain the trained deep semantic segmentation network model;
using the trained deep semantic segmentation network model to extract the target region of each preprocessed image.
Further, step S3 specifically includes:
randomly selecting a number of second typical images from the preprocessed image sequence, classifying them according to the actual needs of the industrial process, and constructing a second training set and a second test set;
feeding the second training set and second test set into the deep semantic segmentation model from step S2 and taking the output of an intermediate layer of the model as the image depth features;
constructing a convolutional neural network model with the image depth features as input and the classification as output, then training and testing the network to obtain the trained convolutional neural network model;
performing feature extraction on the preprocessed images with the trained convolutional neural network model to obtain the image category features.
Further, step S4 specifically includes:
constructing a temporary image sequence and setting a minimum image sequence length; traversing the preprocessed image sequence, extracting the category features of the current image, and determining whether the image is a target image;
if the current image is a target image, adding it to the temporary image sequence and incrementing the target image count by 1; when the length of the temporary image sequence exceeds the minimum image sequence length, adding all images in the temporary image sequence except the last image to the current target image sequence; the set of current target image sequences is the candidate key frame sequence set.
Further, constructing the similarity matrix specifically includes:
taking any two images I_n and I_m in a candidate key frame sequence, extracting the corresponding target regions Mask_n and Mask_m with the deep semantic segmentation network, and computing the similarity between Mask_n and Mask_m,
where M_nm denotes the number of matching feature descriptors between Mask_n and Mask_m; W and H are the width and height of the image, respectively; ΣΣMask_n and ΣΣMask_m denote the areas of the target regions Mask_n and Mask_m; and K_n and K_m denote the numbers of feature descriptors of Mask_n and Mask_m, respectively;
computing the similarity between all pairs of images in the candidate key frame sequence to obtain the similarity matrix.
Further, the clustering processing specifically includes:
setting the number of categories D according to actual industrial needs, and, with the similarity matrix as input, clustering the corresponding candidate key frame sequence to obtain the multi-category image set.
Further, step S6 specifically includes:
targeting the actual needs of the industrial process, constructing a key frame selection objective based on the image target regions and selecting images from the category image sets to obtain the key frame sequence;
targeting the actual needs of the industrial process, constructing a weight matrix based on the image target regions and computing the weighted average of the images in the key frame sequence to obtain the key frame.
Beneficial effects:
By introducing deep learning, the two-stage key frame extraction method described in the above embodiments compensates for the limited ability of traditional hand-crafted feature methods to extract industrial image features and extracts image features completely and accurately. Generating candidate key frame sequences provides a coarse pre-screening of the massive set of industrial images, which reduces the computational cost of the second-stage clustering and improves its accuracy. Clustering the candidate key frame sequences increases the similarity between images within each key frame sequence, reduces the computational cost of key frame synthesis, and avoids interference from noisy images. Synthesizing the key frame as a weighted average of multiple images minimizes the loss of features during image change and reflects the visual information of the industrial production process more completely.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of the deep-learning-based two-stage method for extracting key frames from periodic industrial video clips according to the present invention;
FIG. 2 shows a typical original ROI image and its preprocessed counterpart, as provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of the deep semantic segmentation network structure provided by an embodiment of the present invention;
FIG. 4 shows a typical preprocessed image and its image target region, as provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of the clustering results provided by an embodiment of the present invention;
FIG. 6 shows a typical key frame provided by an embodiment of the present invention;
FIG. 7 compares the key frame extraction results of the various methods, as provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention and not to limit it.
As shown in FIG. 1, an embodiment of the present invention provides a deep-learning-based two-stage method for extracting key frames from periodic industrial video clips, which specifically comprises the following steps.
Step S1: acquire industrial video images, extract region-of-interest images, and preprocess them to obtain the preprocessed image sequence.
In this embodiment, the acquired industrial video images are cropped to a fixed width and height to remove useless background and extract the region of interest (ROI) image. The ROI image then undergoes preprocessing operations such as denoising, color correction, and dehazing to reduce defects such as noise, uneven illumination, and fogging caused by varying lighting, high temperature, and dust, yielding the preprocessed image sequence. FIG. 2 shows an ROI image and its preprocessed counterpart.
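As an illustration, a minimal preprocessing sketch in Python with OpenCV is given below. The ROI coordinates, the non-local-means denoiser, the gray-world color correction, and the CLAHE-based dehazing are all assumptions, since the embodiment does not name specific algorithms or parameters:

```python
import cv2
import numpy as np

def preprocess_frame(frame, roi=(0, 0, 1024, 128)):
    """Crop a fixed-size ROI from one video frame, then denoise,
    color-correct, and de-haze it (all parameter values are assumptions)."""
    x, y, w, h = roi                                   # hypothetical ROI coordinates
    img = frame[y:y + h, x:x + w]

    # Denoising: non-local means is one common choice; the patent names none.
    img = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)

    # Gray-world color correction: scale each channel toward the global mean.
    means = img.reshape(-1, 3).mean(axis=0)
    img = np.clip(img * (means.mean() / means), 0, 255).astype(np.uint8)

    # Approximate de-hazing: contrast-limited histogram equalization (CLAHE)
    # on the luminance channel.
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    lab[..., 0] = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(lab[..., 0])
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
```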
Step S2: construct the deep-learning-based semantic segmentation network model and extract the image target regions from the preprocessed image sequence.
In this embodiment, a number of first typical images are first randomly selected from the preprocessed images and first mask images are screened out, forming a first training set and a first test set. Random translation, scaling, brightness, and rotation transformations are applied to these sets as data augmentation, yielding the augmented training and test sets. A deep semantic segmentation network is then constructed, as shown in FIG. 3: its input is a 1024×128×3 sintering cross-section, and the overall structure comprises four encoder layers and four corresponding decoder layers. Each encoder layer contains two 3×3 convolution layers, a batch normalization layer, and a max pooling layer; each decoder layer contains an upsampling layer, a 3×3 convolution layer, a concatenate layer, two 3×3 convolution layers, and a batch normalization layer. Finally, after two further 3×3 convolution layers and a sigmoid activation, the network outputs the combustion zone morphology at size 1024×128×1. The network is trained and tested on the augmented training and test sets, using cross-entropy as the loss function and the Adam optimizer with a learning rate of 3×10⁻⁴. The trained deep semantic segmentation network is then used to extract the target regions of the preprocessed images; FIG. 4 shows an example extraction result.
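A sketch of this encoder-decoder network in Keras follows, matching the layer counts and loss/optimizer settings stated above; the filter counts (base_filters) and the ReLU activations inside the blocks are assumptions not specified in the embodiment:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_segmentation_net(input_shape=(1024, 128, 3), base_filters=16):
    """Encoder-decoder segmentation net: 4 encoder and 4 decoder layers,
    ending in a 1024x128x1 sigmoid mask of the combustion zone."""
    inputs = keras.Input(shape=input_shape)
    x, skips = inputs, []
    for i in range(4):                         # encoder: conv, conv, BN, max-pool
        f = base_filters * 2 ** i
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
        skips.append(x)
        x = layers.MaxPooling2D()(x)
    for i in reversed(range(4)):               # decoder: upsample, conv, concat, conv, conv, BN
        f = base_filters * 2 ** i
        x = layers.UpSampling2D()(x)
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.concatenate([x, skips[i]])
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.BatchNormalization()(x)
    x = layers.Conv2D(base_filters, 3, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(3e-4),   # Adam at 3e-4, as stated
                  loss="binary_crossentropy")              # cross-entropy loss
    return model
```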
Step S3: take the output features of an intermediate layer of the semantic segmentation network model from step S2, construct a convolutional neural network model, and perform binary classification on the preprocessed image sequence to obtain the image category features.
In this embodiment, the idea of transfer learning is introduced. A number of second typical images are randomly selected from the preprocessed image sequence and classified according to the actual needs of the industrial process, forming a second training set and a second test set. These sets are passed through the deep semantic segmentation model from step S2, and the output of an intermediate layer of that model is taken as the image depth features. A convolutional neural network model is then constructed, consisting mainly of a flatten layer, a 128-dimensional fully connected layer, a batch normalization layer, a 2-dimensional fully connected layer, and a sigmoid activation layer. With the image depth features as input and the manual classification results as output, the network is trained using cross-entropy as the loss function and the Adam optimizer with a learning rate of 3×10⁻⁴. The trained convolutional neural network model then performs feature extraction on the preprocessed images to obtain the image category features.
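The classification head described above admits a direct Keras sketch; the shape of the tapped intermediate feature map (feature_shape) and the layer chosen as the tap are assumptions, since the embodiment does not state which layer is used:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_classifier(feature_shape=(64, 8, 256)):
    """Binary classification head over intermediate segmentation features:
    Flatten -> Dense(128) -> BatchNorm -> Dense(2) -> sigmoid."""
    inputs = keras.Input(shape=feature_shape)   # assumed shape of the tapped feature map
    x = layers.Flatten()(inputs)
    x = layers.Dense(128)(x)                    # activation unspecified in the embodiment
    x = layers.BatchNormalization()(x)
    outputs = layers.Dense(2, activation="sigmoid")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(3e-4),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

# The depth features come from the trained segmentation model, for example:
# feature_extractor = keras.Model(seg_model.input,
#                                 seg_model.get_layer(index=-4).output)
```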
Step S4: segment the preprocessed image sequence according to the image category features to obtain the set of candidate key frame sequences.
In this embodiment, the segmentation processing specifically comprises:
Step S41: input the preprocessed image sequence S_input and the minimum image sequence length δ;
Step S42: define the current target image sequence and a temporary image sequence T, and initialize the target image count C_g = 0 and the non-target image count C_ng = 0;
Step S43: traverse the image sequence S_input, extracting the category features of the current image I;
Step S44: determine whether image I is a target image; if so, go to step S45; otherwise, go to step S47;
Step S45: add image I to the temporary image sequence T and increment the target image count C_g by 1;
Step S46: if the target image count C_g is greater than or equal to the minimum image sequence length δ, set the non-target image count C_ng = 0;
Step S47: increment the non-target image count C_ng by 1; if the target image count C_g is greater than or equal to the minimum image sequence length δ, add image I to the temporary image sequence T;
Step S48: if the non-target image count C_ng is greater than or equal to the minimum image sequence length δ, go to step S49; otherwise, go to step S412;
Step S49: if the target image count C_g is greater than or equal to the minimum image sequence length δ, go to step S410; otherwise, go to step S411;
Step S410: add all images in the temporary image sequence T except the last δ images to the current target image sequence;
Step S411: reset the target image count C_g and the non-target image count C_ng to zero and clear the temporary image sequence T;
Step S412: repeat steps S43 to S411 until the image sequence S_input is exhausted;
Step S413: obtain the set of candidate key frame sequences.
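A direct Python transcription of steps S41 to S413 is sketched below, under the reading that a run of at least δ consecutive non-target images closes the current candidate sequence; is_target stands in for the binary classification of step S3, and the flushing of a segment still open at the end of the sequence is an assumption:

```python
def extract_candidate_sequences(images, is_target, delta):
    """Steps S41-S413: split the preprocessed image sequence into candidate
    key frame sequences using each image's binary category feature."""
    sequences = []          # candidate key frame sequence set (S413)
    current, temp = [], []  # current target image sequence and temporary sequence T (S42)
    c_g = c_ng = 0          # target / non-target image counts (S42)
    for img in images:                      # S43: traverse S_input
        if is_target(img):                  # S44: target image?
            temp.append(img)                # S45: add to T, increment C_g
            c_g += 1
            if c_g >= delta:                # S46: an established segment resets C_ng
                c_ng = 0
        else:
            c_ng += 1                       # S47: increment C_ng ...
            if c_g >= delta:
                temp.append(img)            # ... and keep the image if inside a segment
            if c_ng >= delta:               # S48: a long non-target run ends the segment
                if c_g >= delta:            # S49 -> S410: drop the trailing delta images
                    current.extend(temp[:-delta])
                    sequences.append(current)
                    current = []
                c_g = c_ng = 0              # S411: reset counts, clear T
                temp = []
    if c_g >= delta:                        # assumed: flush a segment still open at the end
        current.extend(temp)
        sequences.append(current)
    return sequences
```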
Step S5: compute the similarity between the target regions of the images in the candidate key frame sequence set, construct the similarity matrix, and, with the similarity matrix as input, cluster the candidate key frame sequences to obtain the multi-category image set.
In this embodiment, for any two images I_n and I_m in a candidate key frame sequence, the deep semantic segmentation network extracts the corresponding target regions Mask_n and Mask_m. First, the SIFT algorithm extracts the feature descriptor sets F_n = {f_1^n, …, f_Kn^n} of Mask_n and F_m = {f_1^m, …, f_Km^m} of Mask_m, where each f_i^n and f_j^m is a 128-dimensional feature descriptor. Then the Euclidean distance between each descriptor f_i^n in F_n and each descriptor f_j^m in F_m is computed,
d(f_i^n, f_j^m) = ‖f_i^n - f_j^m‖₂,
and the descriptor with the smallest distance is selected as the matching feature descriptor of f_i^n in F_m:
f_j*^m = argmin_{1 ≤ j ≤ Km} d(f_i^n, f_j^m).
Similarly, the matching feature descriptor of f_j*^m in F_n can be obtained. If that descriptor is f_i^n itself, then f_i^n and f_j*^m are called a matching feature descriptor pair between Mask_n and Mask_m.
Considering the temporal regularity of the industrial process and the similarity between the images in a candidate key frame sequence, the similarity of Mask_n and Mask_m is defined in terms of M_nm, the number of matching feature descriptor pairs between Mask_n and Mask_m; W and H, the width and height of the image; ΣΣMask_n and ΣΣMask_m, the areas of Mask_n and Mask_m; and K_n and K_m, the numbers of feature descriptors of Mask_n and Mask_m.
The similarity between all pairs of images in the candidate key frame sequence is computed to obtain the similarity matrix A.
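The mutual-nearest-neighbor SIFT matching admits the OpenCV sketch below. Because the closed-form similarity expression is not reproduced in this text, the final score (descriptor match rate scaled by relative area agreement) is only an illustrative stand-in built from the quantities named above, not the patent's formula:

```python
import cv2
import numpy as np

sift = cv2.SIFT_create()

def mask_similarity(img_n, mask_n, img_m, mask_m):
    """Count mutual-nearest-neighbor SIFT matches between two target regions
    and combine them with the mask areas into a similarity score."""
    gray_n = cv2.cvtColor(img_n, cv2.COLOR_BGR2GRAY)
    gray_m = cv2.cvtColor(img_m, cv2.COLOR_BGR2GRAY)
    _, des_n = sift.detectAndCompute(gray_n, (mask_n > 0).astype(np.uint8))
    _, des_m = sift.detectAndCompute(gray_m, (mask_m > 0).astype(np.uint8))
    if des_n is None or des_m is None:
        return 0.0
    # Pairwise Euclidean distances d(f_i^n, f_j^m) between 128-d descriptors.
    d = np.linalg.norm(des_n[:, None, :] - des_m[None, :, :], axis=2)
    nn_nm = d.argmin(axis=1)        # nearest neighbor of each f_i^n in F_m
    nn_mn = d.argmin(axis=0)        # nearest neighbor of each f_j^m in F_n
    m_nm = int(sum(nn_mn[j] == i for i, j in enumerate(nn_nm)))  # mutual matches M_nm
    k_n, k_m = len(des_n), len(des_m)
    area_n, area_m = (mask_n > 0).sum(), (mask_m > 0).sum()
    # Illustrative stand-in score: descriptor match rate times area agreement.
    return (2.0 * m_nm / (k_n + k_m)) * (min(area_n, area_m) / max(area_n, area_m))
```

The full matrix A is then obtained by evaluating this function over all image pairs of the candidate sequence.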
In this embodiment, the clustering processing specifically comprises: in line with actual industrial production, the production process is divided into early, middle, and late stages, so the number of categories is set to D = 3. A spectral clustering algorithm takes the similarity matrix A_i as input and clusters the corresponding candidate key frame sequence, yielding the multi-category image set C = (c_1, c_2, …, c_d, …, c_D); FIG. 5 shows a schematic of the clustering result.
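With the similarity matrix precomputed, this step maps directly onto scikit-learn's spectral clustering, sketched below under the assumption that the matrix is symmetric and non-negative:

```python
from sklearn.cluster import SpectralClustering

def cluster_candidates(similarity_matrix, n_categories=3):
    """Spectral clustering over a precomputed similarity (affinity) matrix;
    returns one category label per image in the candidate sequence."""
    model = SpectralClustering(n_clusters=n_categories, affinity="precomputed")
    return model.fit_predict(similarity_matrix)
```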
Step S6: construct the key frame selection index and the weight matrix according to the actual needs of the industrial process, screen the multi-category image set with the selection index to obtain the key frame sequence, and compute the weighted average of the key frame sequence with the weight matrix to obtain the key frame.
In this embodiment, targeting the actual needs of the industrial process, the key frame sequence is required to maximize the total area of the target regions and to lie in the middle of the production cycle. On this basis, a key frame selection index is constructed over the target regions,
where N is the number of images in the candidate key frame sequence.
The best image set is selected from the multi-category image set C = (c_1, c_2, …, c_d, …, c_D), giving the key frame sequence.
Targeting the actual needs of the industrial process, a weight matrix W = [w_1, w_2, …, w_K] is constructed from the image target regions, with each target region's area as its weight. The weighted average of all images in the key frame sequence, computed with W as the weights, gives the key frame I_key; the result is shown in FIG. 6.
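A sketch of this selection and synthesis step follows. The concrete selection index used here (mean mask area of a cluster, discounted by the cluster's distance from the sequence midpoint) is an assumed form built from the two stated requirements, since the exact index expression is not reproduced in this text; the sketch also assumes every cluster is non-empty:

```python
import numpy as np

def synthesize_key_frame(images, masks, labels, n_categories=3):
    """Pick the cluster that best satisfies the selection index, then fuse its
    images into one key frame by area-weighted averaging."""
    labels = np.asarray(labels)
    n = len(images)
    areas = np.array([(m > 0).sum() for m in masks], dtype=float)
    center = (n - 1) / 2.0
    best_d, best_score = 0, -np.inf
    for d in range(n_categories):
        idx = np.flatnonzero(labels == d)
        # Assumed index: large mean mask area, positions near the cycle middle.
        score = areas[idx].mean() * (1.0 - np.abs(idx - center).mean() / n)
        if score > best_score:
            best_d, best_score = d, score
    idx = np.flatnonzero(labels == best_d)       # the selected key frame sequence
    w = areas[idx] / areas[idx].sum()            # weight matrix W from target areas
    stack = np.stack([images[i].astype(float) for i in idx])
    return np.tensordot(w, stack, axes=1).astype(np.uint8)  # weighted-average key frame I_key
```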
In this embodiment, FIG. 7 compares key frame extraction results on industrial video. Panels A, B, and C show, respectively, the key frame selected by a production expert, the key frame extracted by the image-feature-curve peak method (using the target area as the feature), and the key frame extracted by the proposed method, each with its corresponding image target region. To evaluate the two automatic methods, the similarity between each method's key frame and the expert's key frame is computed with five measures: average hash distance, difference hash distance, perceptual hash distance, cosine distance, and SIFT feature point match rate (the percentage of the expert key frame's SIFT feature points that are matched). Smaller average, difference, and perceptual hash distances indicate higher similarity between two images; a larger cosine distance and a higher SIFT match rate likewise indicate higher similarity. Table 1 presents the evaluation results for the two algorithms and shows that the proposed method extracts key frames more accurately.
Table 1. Similarity between each method's key frames and the expert-selected key frames
According to both the production expert and the proposed method, the images inside box 1 of FIG. 7 belong to the same production cycle, whereas the image-feature-curve peak method split them across three cycles; the proposed method therefore extracts key frames with higher accuracy.
By introducing deep learning, the two-stage key frame extraction method of the above embodiments compensates for the limited ability of traditional hand-crafted feature methods to extract industrial image features and extracts image features completely and accurately. Generating candidate key frame sequences provides a coarse pre-screening of the massive set of industrial images, reducing the computational cost of the second-stage clustering and improving its accuracy. Using the clustering operation to further segment the candidate key frame sequences increases the similarity between images within each key frame sequence, reduces the computational cost of key frame calculation, and avoids interference from noisy images. Synthesizing the key frame as a weighted average of multiple images minimizes feature loss during image change and reflects the visual information of the industrial production process more completely.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the disclosure indicated by the claims.
It should be understood that, although the steps in the flow charts of the embodiments are displayed in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, their execution is not strictly ordered, and they may be executed in other orders. Moreover, at least some of the steps may comprise multiple sub-steps or stages that are not necessarily completed at the same moment but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Those of ordinary skill in the art will understand that all or part of the processes in the above embodiment methods can be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination involves no contradiction, it should be considered within the scope of this specification.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110532120.6A CN113269067B (en) | 2021-05-17 | 2021-05-17 | Periodic industrial video clip key frame two-stage extraction method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110532120.6A CN113269067B (en) | 2021-05-17 | 2021-05-17 | Periodic industrial video clip key frame two-stage extraction method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113269067A CN113269067A (en) | 2021-08-17 |
CN113269067B true CN113269067B (en) | 2023-04-07 |
Family
ID=77231053
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110532120.6A Active CN113269067B (en) | 2021-05-17 | 2021-05-17 | Periodic industrial video clip key frame two-stage extraction method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113269067B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115967823A (en) * | 2021-10-09 | 2023-04-14 | 北京字节跳动网络技术有限公司 | Video cover generation method and device, electronic equipment and readable medium |
CN118840662A (en) * | 2024-08-20 | 2024-10-25 | 四川省农业科学院植物保护研究所 | Citrus leaf detection method for inoculation and preservation of citrus yellow dragon disease pathogenic bacteria |
CN118840699B (en) * | 2024-09-20 | 2025-02-14 | 南京信息工程大学 | Key frame extraction method, device and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590442A (en) * | 2017-08-22 | 2018-01-16 | 华中科技大学 | A kind of video semanteme Scene Segmentation based on convolutional neural networks |
CN107943837B (en) * | 2017-10-27 | 2022-09-30 | 江苏理工学院 | Key-framed video abstract generation method for foreground target |
CN107784118B (en) * | 2017-11-14 | 2020-08-28 | 北京林业大学 | A video key information extraction system for user interest semantics |
CN109377494B (en) * | 2018-09-14 | 2022-06-28 | 创新先进技术有限公司 | Semantic segmentation method and device for image |
CN110267041B (en) * | 2019-06-28 | 2021-11-09 | Oppo广东移动通信有限公司 | Image encoding method, image encoding device, electronic device, and computer-readable storage medium |
- 2021-05-17: Application CN202110532120.6A filed in China; granted as CN113269067B (status: active)
Also Published As
Publication number | Publication date |
---|---|
CN113269067A (en) | 2021-08-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113269067B (en) | Periodic industrial video clip key frame two-stage extraction method based on deep learning | |
CN107527337B (en) | A deep learning-based video object removal and tampering detection method | |
CN112150450B (en) | Image tampering detection method and device based on dual-channel U-Net model | |
JP7058941B2 (en) | Dictionary generator, dictionary generation method, and program | |
CN112132145B (en) | An image classification method and system based on a model-extended convolutional neural network | |
Marzan et al. | Automated tobacco grading using image processing techniques and a convolutional neural network | |
CN113449672B (en) | Remote sensing scene classification method and device based on bilinear twin framework | |
CN117173172B (en) | Machine vision-based silica gel molding effect detection method and system | |
Ben-Ahmed et al. | Deep multimodal features for movie genre and interestingness prediction | |
Zhang et al. | Automatic head overcoat thickness measure with NASNet-large-decoder net | |
CN112949634B (en) | A method for detecting bird nests in railway contact network | |
El-Gayar et al. | A novel approach for detecting deep fake videos using graph neural network | |
CN114724218A (en) | Video detection method, device, equipment and medium | |
Huang et al. | A method for identifying origin of digital images using a convolutional neural network | |
CN116030056A (en) | Detection method and system for steel surface cracks | |
Lu et al. | Source Camera Identification Algorithm Based on Multi-Scale Feature Fusion. | |
CN113496221B (en) | Point supervision remote sensing image semantic segmentation method and system based on depth bilateral filtering | |
KR101313285B1 (en) | Method and Device for Authoring Information File of Hyper Video and Computer-readable Recording Medium for the same | |
CN112070116A (en) | An automatic classification system and method for art paintings based on support vector machine | |
CN114708457B (en) | Hyperspectral deep learning identification method for anti-purple fringing identification | |
Chady et al. | The application of rough sets theory to design of weld defect classifiers | |
He et al. | A high-quality sample generation method for improving steel surface defect inspection | |
CN118501159B (en) | Automobile part defect detection method and system based on machine vision | |
Nair et al. | Image forgery and image tampering detection techniques: A review | |
Piccoli | Visual Anomaly Detection For Automatic Quality Control |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |