CN114282052A - Video image localization method and system based on frame feature - Google Patents
- Publication number
- CN114282052A (application CN202111599413.2A)
- Authority
- CN
- China
- Prior art keywords
- image
- target
- video
- features
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The present invention relates to the technical field of image data processing, and specifically to a frame-feature-based video image localization method and system. The method comprises: a video analysis step: splitting the video to be matched into video frames, extracting image features from the video frames, and storing the image features; an image processing step: acquiring a target image, segmenting the target image to generate a target subject image, and extracting target features from the target subject image; an image matching step: matching the target features against the image features in turn and computing a similarity; and an image screening and pushing step: screening push images out of the video frames according to the similarity, and acquiring and pushing image information according to the push images. This scheme solves the technical problem of low image-matching accuracy in the prior art.
Description
Technical Field
The present invention relates to the technical field of image data processing, and specifically to a frame-feature-based video image localization method and system.
Background Art
With the rapid rise of multimedia, all kinds of videos appear before users. While watching a video, users often take screenshots of scenes they like or find interesting in order to save and share them. When reviewing such an image later, they frequently want to watch the full video or learn where the image came from. In the prior art, the usual approach is to match the screenshot against the video frames of a video to determine its source. However, a screenshot contains considerable noise, which acts as interference; this interference degrades the accuracy of the matching process, so the video frame corresponding to the screenshot cannot be matched accurately and the video source cannot be reliably identified.
Summary of the Invention
One object of the present invention is to provide a frame-feature-based video image localization method, so as to solve the technical problem of low image-matching accuracy in the prior art.
Basic scheme 1 provided by the present invention: a frame-feature-based video image localization method, comprising the following steps:
Video analysis step: split the video to be matched into video frames, extract image features from the video frames, and store the image features;
Image processing step: acquire a target image, segment the target image to generate a target subject image, and extract target features from the target subject image;
Image matching step: match the target features against the image features in turn and compute a similarity;
Image screening and pushing step: screen push images out of the video frames according to the similarity, and acquire and push image information according to the push images.
Beneficial effects of basic scheme 1:
The target image is the image whose source the user wants to identify. The image processing step segments the target image to obtain the target subject image, which contains the key information of the target image; segmenting the target image removes noise from it and reduces interference during matching. The video analysis step and the image processing step extract the image features and the target features respectively, and the image matching step matches the target features against the image features; matching on extracted features rather than whole images makes matching faster.
The image screening and pushing step screens the video frames according to the similarity to obtain push images similar to the target image, then acquires and pushes the image information of those push images, from which the position of each push image within its video can be determined. Compared with the prior art, this scheme reduces interference in the matching process by segmenting the target image, improving the accuracy of image matching and thereby achieving precise localization of the target image within the video.
Further, segmenting the target image to generate the target subject image specifically comprises:
performing contour recognition on the target image; and screening a target contour out of the recognized contours as the target subject image according to preset screening conditions.
Beneficial effect: in most images, the effective information is concentrated in the individual shapes within the image. When segmenting the target image, the shapes in the target image are obtained through contour recognition, and the recognized contours are filtered by the screening conditions to obtain the target subject image, i.e. the effective information in the image.
Further, extracting image features from the video frames specifically comprises:
preprocessing the video frames;
acquiring the low-frequency signal of each pixel in the preprocessed video frame, and assigning a value to each pixel in the video frame according to the low-frequency signal;
generating image features from the assignment results.
Beneficial effect: when extracting image features, each pixel is assigned a value according to the low-frequency signal of the video frame, and the image features are obtained from these values.
Further, matching the target features against the image features in turn and computing the similarity specifically comprises:
computing, in turn, the Hamming distance between the target features and each set of image features; the Hamming distance is the similarity.
Beneficial effect: computing the Hamming distance gives the distance between the target features and the image features, and thereby the similarity between the target image and the video frame.
Further, the image information includes a time point and a similarity, and the image information is in JSON format.
Beneficial effect: the time point is the position of the video frame within its video, and the similarity is the similarity between the video frame and the target image. The image information lets the user quickly locate the push image within the corresponding video, so as to view it and confirm whether the push image and the target image are the same image. The image information is in JSON format, which is fast to read and write, easy to use, and widely compatible.
Another object of the present invention is to provide a frame-feature-based video image localization system.
The present invention provides basic scheme 2: a frame-feature-based video image localization system, comprising:
a video splitting module, configured to split the video to be matched into video frames;
and further comprising:
a target segmentation module, configured to segment the acquired target image to generate a target subject image;
a feature extraction module, configured to extract image features from the video frames and to extract target features from the target subject image;
a similarity calculation module, configured to match the target features against the image features in turn and compute a similarity;
an information screening module, configured to screen push images out of the video frames according to the similarity and to acquire image information according to the push images.
Beneficial effects of basic scheme 2:
The target image is the image whose source the user wants to identify. The target segmentation module segments the target image to obtain the target subject image, which contains the key information of the target image; segmenting the target image removes noise from it and reduces interference during matching.
The feature extraction module extracts target features from the target image and image features from the split video frames. The similarity calculation module matches the target features against the image features; matching on extracted features makes matching faster.
The information screening module screens the video frames according to the similarity to obtain push images similar to the target image, then acquires and pushes the image information of those push images, from which the position of each push image within its video can be determined. Compared with the prior art, this scheme reduces interference in the matching process by segmenting the target image, improving the accuracy of image matching and thereby achieving precise localization of the target image within the video.
Further, the target segmentation module is configured to perform contour recognition on the acquired target image and to screen a target contour out of the recognized contours as the target subject image according to preset screening conditions.
Beneficial effect: in most images, the effective information is concentrated in the individual shapes within the image. When segmenting the target image, the target segmentation module obtains the shapes in the target image through contour recognition and filters the recognized contours by the screening conditions to obtain the target subject image, i.e. the effective information in the image.
Further, the feature extraction module is configured to preprocess the video frames, to acquire the low-frequency signal of each pixel in the preprocessed video frame, to assign a value to each pixel in the video frame according to the low-frequency signal, and to generate image features from the assignment results.
Beneficial effect: when extracting image features, the feature extraction module reassigns each pixel a value based on its low-frequency signal, and the image features are obtained from these values.
Further, the similarity calculation module is configured to compute, in turn, the Hamming distance between the target features and each set of image features; the Hamming distance is the similarity.
Beneficial effect: by computing the Hamming distance, the similarity calculation module obtains the distance between the target features and the image features, and thereby the similarity between the target image and the video frame.
Further, the image information includes a time point and a similarity, and the image information is in JSON format.
Beneficial effect: the time point is the position of the video frame within its video, and the similarity is the similarity between the video frame and the target image. The image information lets the user quickly locate the push image within the corresponding video, so as to view it and confirm whether the push image and the target image are the same image. The image information is in JSON format, which is fast to read and write, easy to use, and widely compatible.
Brief Description of the Drawings
FIG. 1 is a logical block diagram of an embodiment of the frame-feature-based video image localization system of the present invention.
Detailed Description of the Embodiments
The invention is described in further detail below through a specific embodiment:
Embodiment
A frame-feature-based video image localization method comprises the following steps:
Video analysis step: split the video to be matched into video frames, extract image features from the video frames, and store the image features. The videos to be matched are all videos stored in the system; in other embodiments, when the image to be localized is a screenshot saved by the user, the videos to be matched are the videos the user has browsed.
Extracting image features from a video frame specifically comprises: preprocessing the video frame; acquiring the low-frequency signal of each pixel in the preprocessed frame and assigning a value to each pixel according to it; and generating the image features from the assignment results. Specifically, an image size is preset and the video frame is scaled to that size; the scaled frame is converted to a grayscale image; a preset discrete cosine transform is applied to the grayscale image to produce a DCT matrix containing the DCT value of each pixel; the mean DCT value of the grayscale image is computed from the DCT matrix; and each pixel is assigned a hash value according to its DCT value and the mean: when a pixel's DCT value is greater than or equal to the mean, its hash value is set to 1, and when it is less than the mean, its hash value is set to 0. A hash code is then generated from the per-pixel assignments. In this embodiment, the hash values are combined, left to right and top to bottom, into a single integer, which is the hash code; the hash code is stored in an npy file and constitutes the image features. An npy file is the binary file format used by numpy, a Python package for data processing.
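The extraction just described is essentially a perceptual hash. A minimal Python sketch with numpy, assuming an 8×8 hash size, nearest-neighbour scaling as the preprocessing, and a hand-built DCT-II matrix (the function names and these parameter choices are illustrative, not prescribed by the patent):

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    # Orthonormal DCT-II basis matrix: entry [k, m] weights sample m of
    # frequency component k.
    c = np.zeros((n, n))
    for k in range(n):
        scale = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
        for m in range(n):
            c[k, m] = scale * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    return c

def frame_hash(gray: np.ndarray, size: int = 8) -> int:
    # Scale the grayscale frame to size x size (nearest-neighbour sampling
    # stands in for the preset image-size preprocessing).
    h, w = gray.shape
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    small = gray[np.ix_(ys, xs)].astype(np.float64)
    c = dct_matrix(size)
    dct = c @ small @ c.T                 # 2-D DCT of the scaled frame
    bits = (dct >= dct.mean()).astype(int).ravel()  # 1 if >= mean, else 0
    code = 0
    for b in bits:                        # left to right, top to bottom
        code = (code << 1) | int(b)
    return code
```

Scaling every frame to the same small size before hashing is what makes screenshots at differing resolutions comparable; the resulting integer could then be saved with `np.save` as the npy file the embodiment mentions.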
Image processing step: acquire the target image, segment it to generate the target subject image, and extract the target features from the target subject image.
Segmenting the target image to generate the target subject image specifically comprises: a screening condition is preset, namely the contour with the largest area, i.e. the contour containing the most pixels. Contour recognition is performed on the target image, and the target contour is screened out of the recognized contours as the target subject image according to the preset screening condition: the pixels of each recognized contour are counted, and the contour with the most pixels is selected as the target contour.
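As a sketch of the "largest area = most pixels" screening condition, the following stand-in applies a connected-component flood fill over a binary foreground mask instead of a full contour detector (such as OpenCV's findContours, which the patent does not name); all function names here are illustrative:

```python
from collections import deque

def largest_region_mask(mask):
    # mask: 2-D list of 0/1 values. Returns a same-shaped mask keeping only
    # the connected foreground region with the most pixels, i.e. the preset
    # screening condition of the embodiment.
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # BFS flood fill over 4-connected foreground pixels.
                region, q = [], deque([(y, x)])
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    region.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx),
                                   (cy, cx-1), (cy, cx+1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                if len(region) > len(best):
                    best = region
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out
```

Masking the image with the retained region discards the smaller shapes, which play the role of the noise the segmentation is meant to remove.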
Extracting the target features from the target subject image is done in the same way as extracting image features from a video frame, and specifically comprises: preprocessing the target subject image; acquiring the low-frequency signal of each pixel in the preprocessed image and assigning a value to each pixel according to it; and generating the target features from the assignment results. Specifically, the target subject image is scaled to the preset image size, converted to a grayscale image, and transformed with the preset discrete cosine transform to produce a DCT matrix containing the DCT value of each pixel; the mean DCT value is computed from the DCT matrix; when a pixel's DCT value is greater than or equal to the mean, its hash value is set to 1, and when it is less than the mean, its hash value is set to 0. In this embodiment, the hash values are combined, left to right and top to bottom, into a single integer stored in binary form; this is the hash code, which is stored in an npy file and constitutes the target features.
Image matching step: match the target features against the image features in turn and compute the similarity. This specifically comprises: computing, in turn, the Hamming distance between the target features and each set of image features, i.e. the Hamming distance between the hash code of the target features and the hash code of the image features. The Hamming distance characterizes how much the two hash codes differ and serves as the similarity between the target features and the image features.
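For integer hash codes like those above, this distance is simply the bit count of their XOR, e.g. (function name illustrative):

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    # Number of bit positions at which the two hash codes differ;
    # 0 means the features are identical, larger means less similar.
    return bin(code_a ^ code_b).count("1")
```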
Image screening and pushing step: screen push images out of the video frames according to the similarity, and acquire and push image information according to the push images. The image information includes a time point and the similarity, and is in JSON format. Specifically, the video frames are sorted by similarity from high to low, and a number of frames determined by a preset push threshold are selected as push images; for each push image, the time point of its video frame within the video is obtained, image information is generated from the time point and the similarity, and the information is assembled into JSON format. In this embodiment, the push threshold is 4, i.e. the top four video frames in the ranking are selected as push images.
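The ranking and JSON assembly might look as follows; the key names `time_point` and `similarity` and the dict-based frame records are assumptions for illustration. Since the similarity here is a Hamming distance, ranking "by similarity from high to low" corresponds to sorting by ascending distance:

```python
import json

def push_top_frames(frames, push_threshold=4):
    # frames: list of dicts, each with "time_point" (seconds within the
    # source video) and "distance" (Hamming distance to the target features).
    # Smaller distance = more similar, so ascending order ranks best first.
    ranked = sorted(frames, key=lambda f: f["distance"])[:push_threshold]
    payload = [{"time_point": f["time_point"], "similarity": f["distance"]}
               for f in ranked]
    return json.dumps(payload)
```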
A frame-feature-based video image localization system using the above method, as shown in FIG. 1, comprises a video splitting module, a target segmentation module, a feature extraction module, a similarity calculation module, and an information screening module.
The video splitting module splits the video to be matched into video frames, extracts image features from the video frames, and stores the image features. The videos to be matched are all videos stored in the system; in other embodiments, when the image to be localized is a screenshot saved by the user, the videos to be matched are the videos the user has browsed. Specifically, the video splitting module preprocesses the split video frames: each frame is scaled to a preset image size, and the scaled frame is converted to a grayscale image. The module then acquires the low-frequency signal of each pixel in the preprocessed frame, assigns a value to each pixel according to it, and generates the image features from the assignment results: a preset discrete cosine transform is applied to the grayscale image to produce a DCT matrix containing the DCT value of each pixel; the mean DCT value is computed from the DCT matrix; when a pixel's DCT value is greater than or equal to the mean, its hash value is set to 1, and when it is less than the mean, its hash value is set to 0; and a hash code is generated from the per-pixel assignments. In this embodiment, the hash values are combined, left to right and top to bottom, into a single integer, which is the hash code; the hash code is stored in an npy file and constitutes the image features. An npy file is the binary file format used by numpy, a Python package for data processing.
The target segmentation module segments the acquired target image to generate the target subject image. Specifically, a screening condition is preset in the module, namely the contour with the largest area, i.e. the contour containing the most pixels. The module performs contour recognition on the acquired target image and screens the target contour out of the recognized contours as the target subject image according to the screening condition: the pixels of each recognized contour are counted, and the contour with the most pixels is selected as the target contour.
The feature extraction module extracts the target features from the target subject image. Specifically, it preprocesses the target subject image: the image is scaled to the preset image size, and the scaled image is converted to a grayscale image. The module then acquires the low-frequency signal of each pixel in the preprocessed image, assigns a value to each pixel according to it, and generates the target features from the assignment results: the preset discrete cosine transform is applied to the grayscale image to produce a DCT matrix containing the DCT value of each pixel; the mean DCT value is computed from the DCT matrix; when a pixel's DCT value is greater than or equal to the mean, its hash value is set to 1, and when it is less than the mean, its hash value is set to 0. In this embodiment, the hash values are combined, left to right and top to bottom, into a single integer stored in binary form; this is the hash code, which is stored in an npy file and constitutes the target features.
The similarity calculation module matches the target features against the image features in turn and computes the similarity. Specifically, it computes, in turn, the Hamming distance between the target features and each set of image features, i.e. the Hamming distance between the hash code of the target features and the hash code of the image features; the Hamming distance characterizes how much the two hash codes differ and serves as the similarity between the target features and the image features.
The information screening module screens push images out of the video frames according to the similarity and acquires and pushes image information according to the push images. The image information includes a time point and the similarity, and is in JSON format. Specifically, a push threshold is preset in the module; in this embodiment, the push threshold is 4, i.e. the top four video frames in the ranking are selected as push images. The module sorts the video frames by similarity from high to low, selects a number of frames determined by the push threshold as push images, obtains for each push image the time point of its video frame within the video, generates image information from the time point and the similarity, and assembles the information into JSON format.
The above is merely an embodiment of the present invention; common knowledge such as well-known specific structures and characteristics is not described in detail here. A person of ordinary skill in the art is aware of all ordinary technical knowledge in the field to which the invention belongs as of the filing date or priority date, has access to all prior art in the field, and is capable of applying the routine experimental means available before that date; guided by the disclosure of this application, such a person can refine and implement this scheme with their own abilities, and typical well-known structures or methods should not be an obstacle to their practicing this application. It should be noted that those skilled in the art may make modifications and improvements without departing from the structure of the present invention; these should also be regarded as falling within the protection scope of the present invention and do not affect the effect of implementing the invention or the practicability of the patent. The scope of protection claimed by this application shall be defined by the claims, and the specific embodiments and other descriptions in the specification may be used to interpret the content of the claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111599413.2A CN114282052A (en) | 2021-12-24 | 2021-12-24 | Video image localization method and system based on frame feature |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114282052A (en) | 2022-04-05 |
Family
ID=80875052
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111599413.2A Pending CN114282052A (en) | 2021-12-24 | 2021-12-24 | Video image localization method and system based on frame feature |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114282052A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117001715A (en) * | 2023-08-30 | 2023-11-07 | Harbin Institute of Technology | An intelligent assistance system and method for people with visual impairments |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107507239A (en) * | 2017-08-23 | 2017-12-22 | Vivo Mobile Communication Co., Ltd. | Image segmentation method and mobile terminal |
CN109508408A (en) * | 2018-10-25 | 2019-03-22 | Beijing Moshanghua Technology Co., Ltd. | Video retrieval method based on frame density and computer-readable storage medium |
CN109947991A (en) * | 2017-10-31 | 2019-06-28 | Tencent Technology (Shenzhen) Co., Ltd. | Key frame extraction method, device and storage medium |
CN110084233A (en) * | 2018-01-25 | 2019-08-02 | Huazhong University of Science and Technology | Method and system for fast target acquisition in production-line video sequences |
CN112257595A (en) * | 2020-10-22 | 2021-01-22 | Guangzhou Baiguoyuan Network Technology Co., Ltd. | Video matching method, device, equipment and storage medium |
CN112507842A (en) * | 2020-12-01 | 2021-03-16 | Ningbo Duoniu Big Data Network Technology Co., Ltd. | Video character recognition method and device based on key frame extraction |
CN113762040A (en) * | 2021-04-29 | 2021-12-07 | Tencent Technology (Shenzhen) Co., Ltd. | Video identification method and device, storage medium and computer equipment |
Similar Documents
Publication | Title |
---|---|
CN105574063B | Image retrieval method based on visual saliency |
CN104463195A | Printing style digital recognition method based on template matching |
CN108564092A | Sunflower disease recognition method based on SIFT feature extraction algorithm |
CN112257595A | Video matching method, device, equipment and storage medium |
EP2259207B1 | Method of detection and recognition of logos in a video data stream |
CN103198110A | Method and system for rapid video data characteristic retrieval |
Modha et al. | Image inpainting-automatic detection and removal of text from images |
Su et al. | Robust video fingerprinting based on visual attention regions |
Wagh et al. | Text detection and removal from image using inpainting with smoothing |
Antony et al. | Implementation of image/video copy-move forgery detection using brute-force matching |
CN104951440B | Image processing method and electronic equipment |
CN114282052A | Video image localization method and system based on frame feature |
CN113420767B | Feature extraction method, system and device for font classification |
CN111931689B | Method for extracting video satellite data identification features on line |
CN104915955A | Image-segmentation-based picture searching method |
Mandle et al. | An advanced technique of image matching using SIFT and SURF |
Huang | A novel video text extraction approach based on Log-Gabor filters |
Rajithkumar et al. | Template matching method for recognition of stone inscripted Kannada characters of different time frames based on correlation analysis |
CN111382703B | A Finger Vein Recognition Method Based on Secondary Screening and Score Fusion |
CN111079792A | Power equipment identification method and device |
CN106682627A | Identifying method and device of palm print data |
Oyiza et al. | An improved discrete cosine transformation block based scheme for copy-move image forgery detection |
Nayak et al. | Text extraction from natural scene images using adaptive methods |
Abd Almisreb et al. | Kernel graph cut for robust ear segmentation in various illuminations conditions |
Priya et al. | A wavelet based method for text segmentation in color images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||