
WO2023019510A1 - Data indexing method, apparatus and device, and storage medium - Google Patents


Info

Publication number
WO2023019510A1
WO2023019510A1 (PCT/CN2021/113518)
Authority
WO
WIPO (PCT)
Prior art keywords
scene
data
segment
video data
information
Prior art date
Application number
PCT/CN2021/113518
Other languages
French (fr)
Chinese (zh)
Inventor
金晨 (Jin Chen)
卢红喜 (Lu Hongxi)
夏欢 (Xia Huan)
周俊杰 (Zhou Junjie)
李国庆 (Li Guoqing)
Original Assignee
浙江吉利控股集团有限公司 (Zhejiang Geely Holding Group Co., Ltd.)
宁波吉利汽车研究开发有限公司 (Ningbo Geely Automobile Research & Development Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Geely Holding Group Co., Ltd. (浙江吉利控股集团有限公司) and Ningbo Geely Automobile Research & Development Co., Ltd. (宁波吉利汽车研究开发有限公司)
Priority to PCT/CN2021/113518 priority Critical patent/WO2023019510A1/en
Priority to CN202180099925.4A priority patent/CN117597680A/en
Publication of WO2023019510A1 publication Critical patent/WO2023019510A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71Indexing; Data structures therefor; Storage structures

Definitions

  • The present application relates to the technical field of data processing, and in particular to a data indexing method, apparatus, device, and storage medium.
  • The main purpose of this application is to propose a data indexing method, apparatus, device, and storage medium, aiming at solving the technical problem in the prior art that separately intercepting, naming, and storing scenes in video data takes a long time and consumes substantial manpower and material resources.
  • the present application provides a data indexing method, the data indexing method includes the following steps:
  • a video data index corresponding to each scene segment in the video data to be processed is constructed according to the data inter-frame tag and the scene classification information.
  • the generation of data inter-frame labels and scene classification information corresponding to each scene segment according to the segment information includes:
  • the generating the data inter-frame tags corresponding to each scene segment according to the data frame information includes:
  • Model training is performed according to the target scene segment.
  • the extracting the target scene segment from the video data to be processed according to the target scene information and the video data index includes:
  • the extracting target scene segments from the video data to be processed according to the target scene classification information and the target data inter-frame tags includes:
  • the performing model training according to the target scene segment includes:
  • the target scene segments are sorted according to the corresponding start time and end time of each target scene segment, and the sorted target scene segments are obtained;
  • Model training is performed sequentially according to the sorted target scene segments.
  • the present application also proposes a data indexing device, the data indexing device includes:
  • a scene classification module, configured to perform scene classification on the video data to be processed to obtain a plurality of scene segments;
  • An information acquisition module configured to acquire segment information corresponding to each scene segment
  • a data generation module configured to generate data inter-frame labels and scene classification information corresponding to each scene segment according to the segment information; and
  • a data index module configured to construct a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information.
  • In addition, the present application also proposes a data indexing device, which includes: a memory, a processor, and a data indexing program stored in the memory and operable on the processor; when the data indexing program is executed by the processor, the above-mentioned data indexing method is implemented.
  • the present application also proposes a storage medium, on which a data index program is stored, and when the data index program is executed by a processor, the above data index method is implemented.
  • The data indexing method proposed in this application performs scene classification on the video data to be processed to obtain multiple scene segments; obtains segment information corresponding to each scene segment; generates data inter-frame labels and scene classification information corresponding to each scene segment according to the segment information; and constructs a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame labels and the scene classification information.
  • FIG. 1 is a schematic structural diagram of a data indexing device in a hardware operating environment involved in the solution of the embodiment of the present application;
  • FIG. 2 is a schematic flow chart of the first embodiment of the data indexing method of the present application
  • FIG. 3 is a schematic diagram of a driving scene segment of an embodiment of the data indexing method of the present application.
  • FIG. 4 is a schematic diagram of weather scene fragments in an embodiment of the data indexing method of the present application.
  • FIG. 5 is a schematic diagram of a combination of a driving scene and a weather scene according to an embodiment of the data indexing method of the present application;
  • FIG. 6 is a schematic flow chart of the second embodiment of the data indexing method of the present application.
  • FIG. 7 is a schematic diagram of a data frame of a scene segment in an embodiment of the data indexing method of the present application.
  • FIG. 8 is a schematic flowchart of a third embodiment of the data indexing method of the present application.
  • FIG. 9 is a schematic diagram of functional modules of the first embodiment of the data indexing device of the present application.
  • FIG. 1 is a schematic structural diagram of a data indexing device in a hardware operating environment involved in an embodiment of the present application.
  • the data indexing device may include: a processor 1001 , such as a central processing unit (Central Processing Unit, CPU), a communication bus 1002 , a user interface 1003 , a network interface 1004 , and a memory 1005 .
  • the communication bus 1002 is used to realize connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a button, and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
  • the memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory, such as disk storage.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • The structure shown in FIG. 1 does not constitute a limitation on the data indexing device, which may include more or fewer components than those shown in the figure, combine some components, or use a different arrangement of components.
  • memory 1005 as a storage medium may include an operating system, a network communication module, a user interface module, and a data index program.
  • the network interface 1004 is mainly used to connect to the external network and perform data communication with other network devices;
  • the user interface 1003 is mainly used to connect to user equipment and perform data communication with the user equipment.
  • The data indexing device of this application invokes the data indexing program stored in the memory 1005 through the processor 1001, and executes the data indexing method provided in the embodiments of the present application.
  • FIG. 2 is a schematic flowchart of the first embodiment of the data indexing method of the present application.
  • the data indexing method includes the following steps:
  • Step S10 performing scene classification on the video data to be processed to obtain a plurality of scene fragments.
  • The execution subject of this embodiment may be a data indexing device, such as a computer device with data processing functions, or another device capable of realizing the same or similar functions, which is not limited in this embodiment.
  • The video data to be processed in this embodiment may be video data obtained by collecting environmental data while a test vehicle is driving. The data may be collected automatically by vehicle-mounted camera equipment on the test vehicle, by external camera equipment mounted on the test vehicle, or in other ways, which is not limited in this embodiment.
  • For example, suppose the test vehicle has been driven for 6 hours and environmental data was collected automatically during the drive, yielding 6 hours of video data. After collection, this video data needs to be processed and can therefore be taken as the video data to be processed.
  • After the video data to be processed is obtained, scene classification may be performed on it to obtain multiple scene segments. It should be noted that this step does not intercept the scene segments from the video data to be processed; it only identifies them within the video data. Specifically, scene classification may be performed according to a preset classification type, or in other ways, which is not limited in this embodiment.
  • The preset classification type can be set in advance according to actual use requirements. For example, if scenes are distinguished by weather, they can be divided into sunny days, rainy days, foggy days, etc.; further, sunny days can be subdivided into clear, cloudy, few clouds, etc., rainy days into heavy rain, moderate rain, light rain, etc., and foggy days into dense fog, medium fog, light fog, etc. If scenes are distinguished by driving behavior, they can be divided into cruising, lane changing, braking, etc.; further, cruising can be subdivided into high-speed cruising, medium-speed cruising, low-speed cruising, etc.
  • Lane changing can be subdivided into left lane change, right lane change, continuous lane change, etc.
  • Braking can be subdivided into emergency braking, slow braking, point braking, etc.
  • Other scene classification types may also be included, such as time scenes and terrain scenes, which are not limited in this embodiment.
  • In this embodiment, the preset classification types are illustrated with the weather scene and the driving scene as examples.
  • image analysis can be performed on the video data to be processed, and multiple scene segments in the video data to be processed can be obtained according to the image analysis results.
  • Figure 3 is a schematic diagram of driving scene segments; the horizontal line in Figure 3 is the time axis of the video data to be processed. The driving scene segments identified in the video data include lane change scene segment 1, cruising scene segment 1, and lane change scene segment 2.
  • Figure 4 is a schematic diagram of weather scene segments; the horizontal line in Figure 4 is the time axis of the video data to be processed. The weather scene segments identified in the video data include sunny scene segment 1 and rainy scene segment 1.
  • Figure 5 is a schematic diagram of the combination of the driving scene and the weather scene; the horizontal line in Figure 5 is the time axis of the video data to be processed. The driving scene and the weather scene can be combined to obtain the four scene segments A1, A2, A3, and A4 in Figure 5, where A1 is lane change scene segment 1 on a sunny day, A2 is cruising scene segment 1 on a sunny day, A3 is cruising scene segment 1 on a rainy day, and A4 is lane change scene segment 1 on a rainy day.
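The combination described above can be sketched as an interval intersection on the shared time axis. The following is a minimal illustration, not the patent's implementation; the `SceneSegment` type, the label strings, and the timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class SceneSegment:
    label: str      # e.g. "lane_change_1" or "sunny_1"
    start: float    # start time on the video time axis, in seconds
    end: float      # end time, in seconds

def combine(driving, weather):
    """Intersect driving-scene and weather-scene segments on the shared
    time axis, yielding combined segments such as A1..A4 in Figure 5."""
    combined = []
    for d in driving:
        for w in weather:
            start, end = max(d.start, w.start), min(d.end, w.end)
            if start < end:  # the two segments overlap in time
                combined.append(SceneSegment(f"{w.label}+{d.label}", start, end))
    return combined

driving = [SceneSegment("lane_change_1", 0, 10),
           SceneSegment("cruise_1", 10, 40),
           SceneSegment("lane_change_2", 40, 50)]
weather = [SceneSegment("sunny_1", 0, 25),
           SceneSegment("rainy_1", 25, 50)]
for seg in combine(driving, weather):
    print(seg.label, seg.start, seg.end)
```

With these assumed timestamps the intersection produces four combined segments, analogous to A1 through A4 in Figure 5.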
  • Step S20 acquiring segment information corresponding to each scene segment.
  • Scene segments contain many types of information. Therefore, after determining the scene segments contained in the video data to be processed, the segment information corresponding to these scene segments can be obtained. The segment information may include data frame information and segment attribute information, and may also include other types of information, which is not limited in this embodiment.
  • Step S30 generating data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information.
  • inter-frame labeling can be performed on each scene segment according to the segment information, so as to determine the start frame and the end frame corresponding to each scene segment, and then generate data inter-frame tags.
  • Scene classification information may also be generated according to attribute information such as scene category, duration, and storage location included in the fragment information.
  • Step S40 constructing a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information.
  • a video data index corresponding to each scene segment in the video data to be processed may be constructed according to the data inter-frame tag and scene classification information.
  • These video data indexes can be stored, and the complete video data to be processed can be stored directly, without time-consuming operations such as separately intercepting, naming, and storing parts of the video data. When a scene segment in the video data to be processed is needed, its relevant information can be determined directly from the video data index, and the segment can then be extracted from the video data to be processed for use.
  • The video data index can be stored in the form of a database or a table, or in other formats.
  • For example, these video data indexes can be stored in an Excel table.
  • The video data index may also be stored in other ways, which is not limited in this embodiment.
  • In this embodiment, storing the video data index in a table is taken as an example. Compared with segmented video data, the video data index requires less storage space, is easy to store, and does not require complicated management. The information can be clearly recorded in the table, which reduces data processing time and avoids wasting manpower and material resources.
  • the video data indexes corresponding to all scene segments in the video data to be processed can be recorded in the same table for storage, and the video data indexes corresponding to different types of scene segments can also be recorded in different tables for storage.
  • the video data index corresponding to the scene segment corresponding to the weather scene is recorded in a table for storage
  • the video data index corresponding to the scene segment corresponding to the driving scene is recorded in another table for storage.
  • This can also be further refined: the video data indexes corresponding to sunny scene segments are recorded in one table, and those corresponding to rainy scene segments are recorded in another table for storage.
  • other storage methods may also be used for storage according to actual usage requirements, which is not limited in this embodiment.
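As a rough sketch of the table-based storage described above, the index can be serialized as ordinary rows, one per scene segment. The column names and values below are illustrative assumptions, not fields defined by the patent.

```python
import csv
import io

# Hypothetical index rows: one per scene segment, recording the data
# inter-frame label (start/end frame) and scene classification attributes.
index_rows = [
    {"segment": "lane_change_1", "scene_class": "driving/lane_change",
     "start_frame": "O1", "end_frame": "O2", "duration_s": 10,
     "source": "drive_0815.mp4"},
    {"segment": "cruise_1", "scene_class": "driving/cruise",
     "start_frame": "O2", "end_frame": "O3", "duration_s": 30,
     "source": "drive_0815.mp4"},
]

# Write the index as a CSV table (a spreadsheet-friendly stand-in for the
# Excel table mentioned in the text).
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=index_rows[0].keys())
writer.writeheader()
writer.writerows(index_rows)
print(buf.getvalue())
```

Splitting weather-scene rows and driving-scene rows into separate files, as the text suggests, would just mean writing two such tables filtered by `scene_class`.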
  • In this embodiment, scene classification is performed on the video data to be processed to obtain a plurality of scene segments; segment information corresponding to each scene segment is obtained; data inter-frame tags and scene classification information corresponding to each scene segment are generated according to the segment information; and a video data index corresponding to each scene segment in the video data to be processed is constructed according to the data inter-frame tags and the scene classification information.
  • the step S30 includes:
  • Step S301 Determine the data frame information and segment attribute information corresponding to each scene segment according to the segment information.
  • video data may consist of multiple frames of image data
  • the data frame information in this embodiment refers to video frame information related to each scene segment.
  • The frame header information and frame tail information corresponding to each scene segment can be determined according to the data frame information, where the frame header information refers to the video frame information at the beginning of the scene segment, and the frame tail information refers to the video frame information at the end of the scene segment.
  • data frame information corresponding to the scene segment can be generated according to the frame header information and frame trailer information.
  • As shown in Figure 7, a schematic diagram of the data frames of the scene segments, the frame header of lane change scene segment 1 is O1 and its frame tail is O2; that is, the video data between the beginning of frame O1 and the end of frame O2 in the video to be processed is the video data corresponding to lane change scene segment 1.
  • The frame header of cruising scene segment 1 is O2 and its frame tail is O3; that is, the video data between the beginning of frame O2 and the end of frame O3 is the video data corresponding to cruising scene segment 1.
  • The frame header of lane change scene segment 2 is O3 and its frame tail is O4; that is, the video data between the beginning of frame O3 and the end of frame O4 is the video data corresponding to lane change scene segment 2.
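The boundary frames O1..O4 partition the video, so each segment's inter-frame label is simply a pair of consecutive boundaries. A minimal sketch, assuming concrete frame indices for O1..O4 (the numbers are invented for illustration):

```python
def inter_frame_labels(boundaries, names):
    """Turn ordered boundary frames (O1..O4) into per-segment
    (frame_header, frame_tail) labels; segment i spans
    boundaries[i] .. boundaries[i + 1]."""
    assert len(names) == len(boundaries) - 1
    return {name: (boundaries[i], boundaries[i + 1])
            for i, name in enumerate(names)}

# Assumed frame indices standing in for O1, O2, O3, O4.
labels = inter_frame_labels(
    [120, 370, 1270, 1520],
    ["lane_change_1", "cruise_1", "lane_change_2"])
print(labels["cruise_1"])   # (370, 1270)
```

Note that adjacent segments share a boundary frame (O2 ends segment 1 and starts segment 2), matching the description of Figure 7.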
  • Step S302 generating data inter-frame tags corresponding to each scene segment according to the data frame information, and generating scene classification information corresponding to each scene segment according to the segment attribute information.
  • Inter-frame labeling can be performed on each scene segment in the video data to be processed according to the data frame information, generating a data inter-frame tag for each scene segment; the position of a scene segment in the video data to be processed can then be determined from its data inter-frame tag.
  • the segment attribute information includes attributes such as scene category, duration, and storage location
  • The scene classification information corresponding to each scene segment can be generated according to the segment attribute information, and attributes such as the classification of the scene segment can be determined from it.
  • Both the data inter-frame label and the scene classification information contain partial information corresponding to each scene segment in the video data to be processed; the former is used for positioning and the latter for classification. Therefore, when the data corresponding to a scene segment is needed, it can be indexed and located through the inter-frame labels and scene classification information, facilitating data extraction.
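The classification-then-positioning lookup described above can be sketched against a toy in-memory index. The row fields and the substring-matching rule are assumptions for the example, not the patent's matching method.

```python
# A toy video data index; field names are illustrative assumptions.
index_rows = [
    {"segment": "lane_change_1", "scene_class": "driving/lane_change",
     "start_frame": 120, "end_frame": 370},
    {"segment": "cruise_1", "scene_class": "driving/cruise",
     "start_frame": 370, "end_frame": 1270},
    {"segment": "sunny_1", "scene_class": "weather/sunny",
     "start_frame": 120, "end_frame": 700},
]

def find_target_segments(index_rows, target_scene):
    """Classification step: match the target scene information against the
    scene classification information; positioning step: return each match's
    data inter-frame label (start/end frame) so the segment can be located
    in the video data to be processed."""
    return [(row["segment"], row["start_frame"], row["end_frame"])
            for row in index_rows
            if target_scene in row["scene_class"]]

print(find_target_segments(index_rows, "lane_change"))
```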
  • the data frame information and segment attribute information corresponding to each scene segment are determined according to the segment information; the data inter-frame tags corresponding to each scene segment are generated according to the data frame information, and generated according to the segment attribute information Scene classification information corresponding to each scene segment.
  • Positioning and classification can be carried out through the data inter-frame tags and scene classification information corresponding to each scene segment in the video data to be processed; when a scene segment is needed, the corresponding data can be conveniently extracted from the video data to be processed.
  • the third embodiment of the data indexing method of this application is proposed based on the first embodiment or the second embodiment.
  • For ease of description, this embodiment is described based on the first embodiment.
  • After step S40, the method further includes:
  • Step S50 when receiving a model training instruction, determine target scene information according to the model training instruction.
  • A deep learning model can serve multiple purposes, each with a corresponding training type. For example, a deep learning model for weather scenes has the corresponding training type of weather scene training and needs weather-scene-related data for training; a deep learning model for driving scenes has the corresponding training type of driving scene training and needs driving-scene-related data for training.
  • When the computer device receives the model training instruction, it can determine the target scene information corresponding to the currently required data according to the instruction. For example, when data related to a weather scene is needed, the corresponding target scene information is a weather scene; when data related to a driving scene is needed, the corresponding target scene information is a driving scene.
  • Step S60 extracting target scene segments from the video data to be processed according to the target scene information and the video data index.
  • The target scene segment corresponding to the target scene information can be quickly found according to the stored video data index, and the data corresponding to the target scene segment can be extracted from the video data to be processed for model training.
  • Specifically, the target scene information can be matched against the scene classification information in the video data index, and the target scene classification information determined according to the matching result. Then, according to the video data index, the scene segment corresponding to the target scene classification information is taken as the target scene segment, and the data inter-frame label corresponding to the target scene segment is taken as the target data inter-frame label; finally, the data corresponding to the target scene segment is extracted from the video data to be processed according to the target scene classification information and the target data inter-frame label.
  • For example, if lane-change data is needed, the target scene classification information related to the lane-changing scene can be matched in the video data index; the corresponding target scene segments are then determined to be lane change scene segment 1 and lane change scene segment 2, whose data inter-frame labels are frames O1–O2 and frames O3–O4 respectively. The data corresponding to these two lane-changing scene segments can therefore be extracted from the video data to be processed for model training.
  • the storage location of the video data to be processed may be first determined according to the target scene classification information.
  • After that, the data can be positioned according to the target scene classification information and the target data inter-frame labels to determine the start time and end time of the target scene segment in the video data to be processed, where the start time corresponds to the frame header of the target scene segment and the end time corresponds to its frame tail. The data corresponding to the target scene segment may then be extracted from the video data to be processed according to its start time and end time.
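The frame-to-time positioning and extraction step above can be sketched as follows. The frame rate, the frame indices, and the use of a plain list as a stand-in for decoded video frames are all assumptions for illustration; a real pipeline would read frames from the video file instead.

```python
FPS = 25  # assumed frame rate of the video data to be processed

def frame_to_time(frame_idx):
    """Map a frame index to its time on the video time axis, in seconds."""
    return frame_idx / FPS

def extract_segment(frames, frame_header, frame_tail):
    """Determine the target segment's start/end time from its frame header
    and frame tail, then slice out the frames belonging to the segment."""
    start_t = frame_to_time(frame_header)
    end_t = frame_to_time(frame_tail)
    print(f"segment spans {start_t:.2f}s .. {end_t:.2f}s")
    return frames[frame_header:frame_tail]

frames = list(range(1520))            # stand-in for decoded video frames
clip = extract_segment(frames, 370, 1270)
print(len(clip))                      # 900 frames
```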
  • Step S70 perform model training according to the target scene segment.
  • The data corresponding to the target scene segment is extracted directly from the video data to be processed for model training. If there are multiple target scene segments, they can be sorted according to their start times and end times to obtain the sorted target scene segments, and model training is then performed on the sorted segments in turn.
  • For example, after sorting lane change scene segment 1 and lane change scene segment 2, it can be seen that segment 1 precedes segment 2. Therefore, model training is first performed with the data corresponding to lane change scene segment 1; after that training completes, the process automatically jumps to the data corresponding to lane change scene segment 2 and continues training. If there are more target scene segments, the process keeps jumping according to the sorting results until all target scene segments have been used for model training.
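The sort-then-train loop above can be sketched as follows; `train_step` is a hypothetical placeholder for whatever training routine consumes one segment's data, and the segment timestamps are invented for the example.

```python
from collections import namedtuple

Segment = namedtuple("Segment", ["label", "start", "end"])

def train_on_segments(segments, train_step):
    """Sort the target scene segments by start and end time, then run
    model training on each segment's data in order, automatically moving
    on to the next segment when one finishes."""
    for seg in sorted(segments, key=lambda s: (s.start, s.end)):
        train_step(seg)

trained = []
train_on_segments(
    [Segment("lane_change_2", 40, 50),
     Segment("lane_change_1", 0, 10)],
    lambda seg: trained.append(seg.label))
print(trained)  # ['lane_change_1', 'lane_change_2']
```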
  • In this way, the data corresponding to a target scene segment can be automatically indexed through the video data index during model training and quickly linked into the scene data channel. When training on one target scene segment finishes, the process automatically switches to the data corresponding to the next target scene segment and continues training, achieving automatic training of the model.
  • target scene information is determined according to the model training instruction; target scene segments are extracted from the video data to be processed according to the target scene information and the video data index; Model training is performed according to the target scene segment.
  • The target scene information is used to perform scene matching, the currently required target scene segment is located according to the video data index, and the corresponding data is extracted from the original video data to be processed for model training. Through indexing and fast positioning, the efficiency of model training is improved and labor costs are saved.
  • the embodiment of the present application also proposes a storage medium, on which a data index program is stored, and when the data index program is executed by a processor, the steps of the above-mentioned data index method are implemented.
  • the storage medium adopts all the technical solutions of all the above-mentioned embodiments, it at least has all the beneficial effects brought by the technical solutions of the above-mentioned embodiments, which will not be repeated here.
  • the embodiment of the present application also proposes a data indexing device, the data indexing device includes:
  • the scene classification module 10 is configured to perform scene classification on the video data to be processed to obtain a plurality of scene fragments.
  • The video data to be processed in this embodiment may be video data obtained by collecting environmental data while a test vehicle is driving. The data may be collected automatically by vehicle-mounted camera equipment on the test vehicle, by external camera equipment mounted on the test vehicle, or in other ways, which is not limited in this embodiment.
  • For example, suppose the test vehicle has been driven for 6 hours and environmental data was collected automatically during the drive, yielding 6 hours of video data. After collection, this video data needs to be processed and can therefore be taken as the video data to be processed.
  • After the video data to be processed is obtained, scene classification may be performed on it to obtain multiple scene segments. It should be noted that this step does not intercept the scene segments from the video data to be processed; it only identifies them within the video data. Specifically, scene classification may be performed according to a preset classification type, or in other ways, which is not limited in this embodiment.
  • The preset classification type can be set in advance according to actual use requirements. For example, if scenes are distinguished by weather, they can be divided into sunny days, rainy days, foggy days, etc.; further, sunny days can be subdivided into clear, cloudy, few clouds, etc., rainy days into heavy rain, moderate rain, light rain, etc., and foggy days into dense fog, medium fog, light fog, etc. If scenes are distinguished by driving behavior, they can be divided into cruising, lane changing, braking, etc.; further, cruising can be subdivided into high-speed cruising, medium-speed cruising, low-speed cruising, etc.
  • Lane changing can be subdivided into left lane change, right lane change, continuous lane change, etc.
  • Braking can be subdivided into emergency braking, slow braking, point braking, etc.
  • Other scene classification types may also be included, such as time scenes and terrain scenes, which are not limited in this embodiment.
  • In this embodiment, the preset classification types are illustrated with the weather scene and the driving scene as examples.
  • image analysis can be performed on the video data to be processed, and multiple scene segments in the video data to be processed can be obtained according to the image analysis results.
  • Figure 3 is a schematic diagram of driving scene segments; the horizontal line in Figure 3 is the time axis of the video data to be processed. The driving scene segments identified in the video data include lane change scene segment 1, cruising scene segment 1, and lane change scene segment 2.
  • Figure 4 is a schematic diagram of weather scene segments; the horizontal line in Figure 4 is the time axis of the video data to be processed. The weather scene segments identified in the video data include sunny scene segment 1 and rainy scene segment 1.
  • Figure 5 is a schematic diagram of the combination of the driving scene and the weather scene; the horizontal line in Figure 5 is the time axis of the video data to be processed. The driving scene and the weather scene can be combined to obtain the four scene segments A1, A2, A3, and A4 in Figure 5, where A1 is lane change scene segment 1 on a sunny day, A2 is cruising scene segment 1 on a sunny day, A3 is cruising scene segment 1 on a rainy day, and A4 is lane change scene segment 1 on a rainy day.
  • The information acquisition module 20 is configured to acquire the segment information corresponding to each scene segment.
  • Scene segments contain many types of information. Therefore, after the multiple scene segments contained in the video data to be processed are determined, the segment information corresponding to these scene segments can be obtained. The segment information may include data frame information and segment attribute information; it may also include other types of information, which is not limited in this embodiment.
  • The data generation module 30 is configured to generate the data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information.
  • Inter-frame labeling can be performed on each scene segment according to the segment information, so as to determine the start frame and end frame corresponding to each scene segment and generate the data inter-frame tags.
  • Scene classification information can also be generated according to attribute information such as the scene category, duration, and storage location contained in the segment information.
  • The data index module 40 is configured to construct a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information.
  • A video data index corresponding to each scene segment in the video data to be processed can be constructed according to the data inter-frame tags and the scene classification information.
  • These video data indexes can be stored, and the complete video data to be processed can be stored directly, without time-consuming operations such as separately intercepting, naming, and storing pieces of the video data. When scene segments in the video data to be processed need to be used, the relevant information of those segments can be determined directly from the video data index, and the segments can then be extracted from the video data to be processed for use.
  • The video data index can be stored in the form of a database or a table, or in other forms.
  • For example, these video data indexes can be stored in an Excel table.
  • The video data index may also be stored in other ways, which is not limited in this embodiment.
  • In this embodiment, storage of the video data index in a table is taken as an example for illustration. Compared with segmented video data, a video data index requires less storage space, is easy to store, and does not require complicated management. The information can be clearly recorded in the table, which reduces data processing time and avoids wasting a lot of manpower and material resources.
  • The video data indexes corresponding to all scene segments in the video data to be processed can be recorded in the same table for storage, and the video data indexes corresponding to different types of scene segments can also be recorded in different tables for storage.
  • For example, the video data indexes corresponding to the scene segments of the weather scene can be recorded in one table for storage,
  • and the video data indexes corresponding to the scene segments of the driving scene can be recorded in another table for storage.
  • This can be further refined: the video data indexes corresponding to the scene segments of the sunny scene are recorded in one table for storage, and the video data indexes corresponding to the scene segments of the rainy scene are recorded in another table for storage.
  • Other storage methods may also be used according to actual usage requirements, which is not limited in this embodiment.
  • In summary, scene classification is performed on the video data to be processed to obtain multiple scene segments; segment information corresponding to each scene segment is obtained; data inter-frame tags and scene classification information corresponding to each scene segment are generated according to the segment information; and a video data index corresponding to each scene segment in the video data to be processed is constructed according to the data inter-frame tags and the scene classification information.
  • The data generation module 30 is further configured to determine the data frame information and segment attribute information corresponding to each scene segment according to the segment information; generate the data inter-frame tags corresponding to each scene segment according to the data frame information; and generate the scene classification information corresponding to each scene segment according to the segment attribute information.
  • The data generation module 30 is further configured to determine the frame header information and frame trailer information corresponding to each scene segment according to the data frame information, and generate the data inter-frame tag corresponding to each scene segment according to the frame header information and the frame trailer information.
  • The data indexing device further includes a model training module, configured to: when a model training command is received, determine target scene information according to the model training command; extract a target scene segment from the video data to be processed according to the target scene information and the video data index; and perform model training according to the target scene segment.
  • The model training module is further configured to match the target scene information with the scene classification information in the video data index to determine target scene classification information; take the scene segment corresponding to the target scene classification information as the target scene segment, and take the data inter-frame tag corresponding to the target scene segment as the target data inter-frame tag; and extract the target scene segment from the video data to be processed according to the target scene classification information and the target data inter-frame tag.
  • The model training module is further configured to perform data location positioning according to the target scene classification information and the target data inter-frame tag, so as to determine the start time and end time of the target scene segment, and to extract the target scene segment from the video data to be processed according to the start time and the end time.
  • The model training module is further configured to: when there are multiple target scene segments, sort the target scene segments according to the start time and end time corresponding to each target scene segment to obtain sorted target scene segments, and perform model training according to the sorted target scene segments in turn.
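The index structure implied by the modules above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record fields (scene category, start/end frame, source file) and the substring-matching rule are assumptions chosen to mirror the data inter-frame tags and scene classification information described in the text.

```python
from dataclasses import dataclass

# Hypothetical index record; the field names are illustrative only.
@dataclass
class VideoDataIndex:
    scene_category: str   # scene classification info, e.g. "sunny/lane_change"
    start_frame: int      # data inter-frame tag: first frame of the segment
    end_frame: int        # data inter-frame tag: last frame of the segment
    source_file: str      # storage location of the complete video

def match_target_segments(index, target_scene):
    """Match target scene info against the scene classification info."""
    return [rec for rec in index if target_scene in rec.scene_category]

index = [
    VideoDataIndex("sunny/lane_change", 0, 299, "drive_2021.mp4"),
    VideoDataIndex("sunny/cruising", 300, 1799, "drive_2021.mp4"),
    VideoDataIndex("rainy/cruising", 1800, 2399, "drive_2021.mp4"),
]
hits = match_target_segments(index, "cruising")
print([h.scene_category for h in hits])  # ['sunny/cruising', 'rainy/cruising']
```

Because only these small records are stored, the complete video file never has to be cut apart; the records point back into it.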

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present application relates to the technical field of data processing. Disclosed are a data indexing method, apparatus and device, and a storage medium. The method comprises: performing scene classification on video data to be processed, so as to obtain a plurality of scene segments; acquiring segment information corresponding to the scene segments; and according to the segment information, generating data inter-frame labels and scene classification information, which correspond to the scene segments, and constructing video data indexes corresponding to the scene segments in said video data. Therefore, there is no need to separately capture scene segments from video data to be processed and to name and store same; it is only necessary to generate corresponding video data indexes according to data inter-frame labels and scene classification information, which correspond to the scene segments, and to store the video data indexes; and the position, in a video to be processed, of a scene segment needing to be used can be found according to the video data indexes, such that the time of video data processing is reduced, and a large amount of manpower and material resources are prevented from being consumed, thereby improving the efficiency of video data processing.

Description

Data indexing method, device, equipment and storage medium

Technical Field

The present application relates to the technical field of data processing, and in particular to a data indexing method, device, equipment and storage medium.

Background Art

With the development and technical iteration of autonomous driving, mass production of high-level autonomous vehicles is getting closer, and the introduction of deep learning has greatly accelerated the implementation and application of autonomous driving technology. However, training a deep learning model requires a large amount of data. There are public autonomous driving data sets such as coco and kitti, but public data sets have certain limitations, so most current OEMs and autonomous driving system solution providers build their own databases based on their own algorithm models, or set up data acquisition systems to collect and update data, and then perform algorithm training.

At present, data processing is generally required after data collection and before algorithm training. The traditional processing method for a scene-based database is to intercept each scene separately, name it, and store it. This method takes a long time and consumes a lot of manpower and material resources.

The above content is only used to assist in understanding the technical solution of the present application, and does not constitute an admission that the above content is prior art.

Technical Problem

The main purpose of the present application is to propose a data indexing method, device, equipment, and storage medium, aiming to solve the technical problem in the prior art that scenes in video data are separately intercepted, named, and stored, which takes a long time and consumes a lot of manpower and material resources.

Technical Solution
In order to achieve the above purpose, the present application provides a data indexing method, which includes the following steps:
performing scene classification on video data to be processed to obtain multiple scene segments;
acquiring segment information corresponding to each scene segment;
generating data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information;
constructing a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information.
In an embodiment, generating the data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information includes:
determining data frame information and segment attribute information corresponding to each scene segment according to the segment information;
generating the data inter-frame tags corresponding to each scene segment according to the data frame information, and generating the scene classification information corresponding to each scene segment according to the segment attribute information.
In an embodiment, generating the data inter-frame tags corresponding to each scene segment according to the data frame information includes:
determining frame header information and frame trailer information corresponding to each scene segment according to the data frame information;
generating the data inter-frame tags corresponding to each scene segment according to the frame header information and the frame trailer information.
In an embodiment, after constructing the video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information, the method further includes:
when a model training instruction is received, determining target scene information according to the model training instruction;
extracting a target scene segment from the video data to be processed according to the target scene information and the video data index;
performing model training according to the target scene segment.
In an embodiment, extracting the target scene segment from the video data to be processed according to the target scene information and the video data index includes:
matching the target scene information with the scene classification information in the video data index to determine target scene classification information;
taking the scene segment corresponding to the target scene classification information as the target scene segment, and taking the data inter-frame tag corresponding to the target scene segment as the target data inter-frame tag;
extracting the target scene segment from the video data to be processed according to the target scene classification information and the target data inter-frame tag.
In an embodiment, extracting the target scene segment from the video data to be processed according to the target scene classification information and the target data inter-frame tag includes:
performing data location positioning according to the target scene classification information and the target data inter-frame tag to determine the start time and end time of the target scene segment;
extracting the target scene segment from the video data to be processed according to the start time and the end time.
In an embodiment, performing model training according to the target scene segment includes:
when there are multiple target scene segments, sorting the target scene segments according to the start time and end time corresponding to each target scene segment to obtain sorted target scene segments;
performing model training according to the sorted target scene segments in turn.
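As a rough illustration of the data location positioning and sorting steps described above, the sketch below converts hypothetical data inter-frame tags into start/end times at an assumed frame rate and orders multiple target segments before training. The frame rate, tag values, and function names are invented for the example and are not part of the patent.

```python
FPS = 25  # assumed frame rate of the recorded video

def frames_to_time(start_frame, end_frame, fps=FPS):
    """Locate a segment in time: convert a data inter-frame tag
    (start_frame, end_frame) into start/end times in seconds."""
    return start_frame / fps, (end_frame + 1) / fps

def sort_segments(tags):
    """Order multiple target segments by (start, end) before training."""
    return sorted(tags, key=lambda t: (t[0], t[1]))

# Inter-frame tags for three target segments: (start_frame, end_frame).
tags = [(1800, 2399), (0, 299), (300, 1799)]
times = [frames_to_time(s, e) for s, e in sort_segments(tags)]
print(times)  # [(0.0, 12.0), (12.0, 72.0), (72.0, 96.0)]
```

The resulting time windows could then be handed to a video reader to pull each target segment out of the complete file in order.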
In addition, in order to achieve the above purpose, the present application also proposes a data indexing device, which includes:
a scene classification module, configured to perform scene classification on video data to be processed to obtain multiple scene segments;
an information acquisition module, configured to acquire segment information corresponding to each scene segment;
a data generation module, configured to generate data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information;
a data index module, configured to construct a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information.
In addition, in order to achieve the above purpose, the present application also proposes a data indexing equipment, which includes: a memory, a processor, and a data indexing program stored in the memory and operable on the processor, wherein the data indexing program, when executed by the processor, implements the data indexing method described above.
In addition, in order to achieve the above purpose, the present application also proposes a storage medium on which a data indexing program is stored, wherein the data indexing program, when executed by a processor, implements the data indexing method described above.

Beneficial Effects

The data indexing method proposed in the present application performs scene classification on the video data to be processed to obtain multiple scene segments; obtains segment information corresponding to each scene segment; generates data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information; and constructs a video data index corresponding to each scene segment in the video data to be processed according to the data inter-frame tags and the scene classification information. Therefore, the scene segments in the video data to be processed do not need to be separately intercepted, named, and stored; it is only necessary to generate the corresponding video data index according to the data inter-frame tags and scene classification information corresponding to each scene segment, and to store the video data index. According to the video data index, the position in the video to be processed of a scene segment that needs to be used can be found, which saves video data processing time, avoids consuming a lot of manpower and material resources, and thereby improves the efficiency of video data processing.
Brief Description of the Drawings

Figure 1 is a schematic structural diagram of the data indexing equipment in the hardware operating environment involved in the solutions of the embodiments of the present application;
Figure 2 is a schematic flowchart of the first embodiment of the data indexing method of the present application;
Figure 3 is a schematic diagram of driving scene segments in an embodiment of the data indexing method of the present application;
Figure 4 is a schematic diagram of weather scene segments in an embodiment of the data indexing method of the present application;
Figure 5 is a schematic diagram of the combination of a driving scene and a weather scene in an embodiment of the data indexing method of the present application;
Figure 6 is a schematic flowchart of the second embodiment of the data indexing method of the present application;
Figure 7 is a schematic diagram of the data frames of scene segments in an embodiment of the data indexing method of the present application;
Figure 8 is a schematic flowchart of the third embodiment of the data indexing method of the present application;
Figure 9 is a schematic diagram of the functional modules of the first embodiment of the data indexing device of the present application.
The realization of the purpose, functional features, and advantages of the present application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Embodiments of the Present Application

It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit the present application.
Referring to Figure 1, Figure 1 is a schematic structural diagram of the data indexing equipment in the hardware operating environment involved in the solutions of the embodiments of the present application.
As shown in Figure 1, the data indexing equipment may include: a processor 1001, such as a central processing unit (CPU); a communication bus 1002; a user interface 1003; a network interface 1004; and a memory 1005. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as buttons; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a non-volatile memory, such as a disk memory. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art can understand that the device structure shown in Figure 1 does not constitute a limitation on the data indexing equipment, which may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
As shown in Figure 1, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and a data indexing program.
In the data indexing equipment shown in Figure 1, the network interface 1004 is mainly used to connect to an external network and communicate data with other network devices, and the user interface 1003 is mainly used to connect to user equipment and communicate data with the user equipment. The equipment of the present application invokes the data indexing program stored in the memory 1005 through the processor 1001 and executes the data indexing method provided by the embodiments of the present application.
Based on the above hardware structure, embodiments of the data indexing method of the present application are proposed.
Referring to Figure 2, Figure 2 is a schematic flowchart of the first embodiment of the data indexing method of the present application.
In the first embodiment, the data indexing method includes the following steps:
Step S10: performing scene classification on the video data to be processed to obtain multiple scene segments.
It should be noted that the execution subject of this embodiment may be a data indexing equipment, such as a computer device with data processing functions, or another device that can realize the same or similar functions, which is not limited in this embodiment.
It should be noted that the video data to be processed in this embodiment may be video data obtained by collecting environmental data while a test vehicle is driving. The environmental data may be collected automatically by a vehicle-mounted camera on the test vehicle, or by an external camera installed on the test vehicle, or in other ways, which is not limited in this embodiment.
In a specific implementation, for example, if the test vehicle drives for 6 hours and environmental data is collected automatically during the drive, 6 hours of video data are obtained. After the data collection ends, the collected video data needs to be processed, so this video data can be taken as the video data to be processed.
It should be understood that after the video data to be processed is acquired, scene classification can be performed on it to obtain multiple scene segments. It should be noted that in this step of the solution, these scene segments do not need to be cut out of the video data to be processed; they only need to be identified in the video data to be processed. Specifically, scene classification is performed on the video data to be processed according to preset classification types, or in other ways, which is not limited in this embodiment.
In a specific implementation, the preset classification types can be set in advance according to actual usage requirements. For example, if scenes are distinguished by weather, they can be divided into sunny, rainy, foggy, etc.; further, sunny can be subdivided into clear, cloudy, partly cloudy, etc., rainy can be subdivided into heavy rain, moderate rain, light rain, etc., and foggy can be subdivided into dense fog, moderate fog, light fog, etc. If scenes are distinguished by driving behavior, they can be divided into cruising, lane changing, braking, etc.; further, cruising can be subdivided into high-speed cruising, medium-speed cruising, low-speed cruising, etc., lane changing can be subdivided into left lane change, right lane change, continuous lane change, etc., and braking can be subdivided into emergency braking, slow braking, tap braking, etc. It should be understood that, in addition to the above classification types, other scene classification types may also be included, such as time scenes and terrain scenes, which are not limited in this embodiment. In this embodiment, preset classification types consisting of the weather scene and the driving scene are taken as examples for illustration.
It can be understood that image analysis can be performed on the video data to be processed, and multiple scene segments in the video data can be obtained according to the image analysis results.
In a specific implementation, for example, if classification is based on the driving scene, as shown in Figure 3, which is a schematic diagram of driving scene segments, the horizontal line in Figure 3 is the time axis of the video data to be processed, and the scene segments identified in the video data include lane change scene segment 1, cruising scene segment 1, and lane change scene segment 2.
For another example, if classification is based on the weather scene, as shown in Figure 4, which is a schematic diagram of weather scene segments, the horizontal line in Figure 4 is the time axis of the video data to be processed, and the scene segments identified include sunny scene segment 1 and rainy scene segment 1.
For another example, if the driving scene and the weather scene are combined for classification, as shown in Figure 5, which is a schematic diagram of the combination of the driving scene and the weather scene, the horizontal line in Figure 5 is the time axis of the video data to be processed. In addition to identifying the above driving scenes and weather scenes, the driving scene and the weather scene can be combined to obtain the four scene segments A1, A2, A3, and A4 in Figure 5, where A1 is sunny lane change scene segment 1, A2 is sunny cruising scene segment 1, A3 is rainy cruising scene segment 1, and A4 is rainy lane change scene segment 1.
It can be understood that, in addition to the independent scene classification and combined scene classification illustrated above, classification can also be performed on other independent scenes, or more types of scenes can be combined, which can be set according to actual usage requirements and is not limited in this embodiment.
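The combination of driving and weather scenes shown in Figure 5 can be sketched as an intersection of labeled time intervals. This is an illustrative sketch only; the interval representation, the numeric time values, and the labels are assumptions chosen to reproduce the A1–A4 structure, not the patent's algorithm.

```python
def combine_scenes(driving, weather):
    """Intersect labeled time intervals from two scene dimensions.

    Each input is a list of (start, end, label) tuples covering the
    timeline; every non-empty overlap gets both scene labels.
    """
    combined = []
    for d_start, d_end, d_label in driving:
        for w_start, w_end, w_label in weather:
            start, end = max(d_start, w_start), min(d_end, w_end)
            if start < end:  # the two intervals actually overlap
                combined.append((start, end, f"{w_label} {d_label}"))
    return sorted(combined)

# Invented timeline (in minutes) mimicking Figures 3-5.
driving = [(0, 10, "lane change 1"), (10, 40, "cruising 1"),
           (40, 50, "lane change 2")]
weather = [(0, 30, "sunny"), (30, 50, "rainy")]
print(combine_scenes(driving, weather))
```

With these inputs the result is four combined segments analogous to A1–A4: sunny lane change, sunny cruising, rainy cruising, and rainy lane change.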
Step S20: acquiring segment information corresponding to each scene segment.
It should be understood that since video data contains many types of information and a scene segment is a part of the video data, a scene segment also contains many types of information. Therefore, after the multiple scene segments contained in the video data to be processed are determined, the segment information corresponding to these scene segments can be obtained. The segment information may include data frame information and segment attribute information; it may also include other types of information, which is not limited in this embodiment.
步骤S30,根据所述片段信息生成各场景片段对应的数据帧间标签和场景分类信息。Step S30, generating data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information.
可以理解的是,可根据片段信息对各场景片段分别进行帧间标注,以确定各场景片段对应的开始帧和结束帧,进而生成数据帧间标签。还可根据片段信息中包含的场景类别、时长、存储位置等属性信息,根据这些属性信息生成场景分类信息。It can be understood that the inter-frame labeling can be performed on each scene segment according to the segment information, so as to determine the start frame and the end frame corresponding to each scene segment, and then generate data inter-frame tags. Scene classification information may also be generated according to attribute information such as scene category, duration, and storage location included in the fragment information.
步骤S40,根据所述数据帧间标签和所述场景分类信息构建所述待处理视频数据中各场景片段对应的视频数据索引。Step S40, constructing a video data index corresponding to each scene segment in the video data to be processed according to the inter-frame tag of the data and the scene classification information.
应当理解的是,在生成各场景片段对应的数据帧间标签和场景分类信息后,可根据数据帧间标签和场景分类信息构建待处理视频数据中各场景片段对应的视频数据索引。可将这些视频数据索引进行存储,并且直接将完整的待处理视频数据进行存储,不需要对待处理视频数据进行单独截取、命名存储等耗时的操作,在需要使用待处理视频数据中的场景片段时,直接根据视频数据索引便可确定需要使用的场景片段的相关信息,进而从待处理视频数据中提取出这些场景片段进行使用。It should be understood that, after the data inter-frame tags and scene classification information corresponding to each scene segment are generated, a video data index corresponding to each scene segment in the video data to be processed may be constructed according to the data inter-frame tag and scene classification information. These video data indexes can be stored, and the complete video data to be processed can be stored directly, without time-consuming operations such as separate interception, naming and storage of the video data to be processed, and scene fragments in the video data to be processed can be used , the relevant information of the scene segments to be used can be determined directly according to the video data index, and then these scene segments are extracted from the video data to be processed for use.
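One index entry therefore pairs a segment's data inter-frame tag with its scene classification information. A minimal sketch of such an entry; the field names are illustrative assumptions, not prescribed by this application:

```python
from dataclasses import dataclass

@dataclass
class VideoDataIndex:
    """One video data index entry for a single scene segment."""
    segment_name: str    # e.g. "lane-change scene segment 1"
    start_frame: int     # data inter-frame tag: frame header
    end_frame: int       # data inter-frame tag: frame trailer
    scene_category: str  # scene classification information: category
    duration_s: float    # scene classification information: duration
    storage_path: str    # where the complete source video is stored

idx = VideoDataIndex("lane-change scene segment 1", 100, 250,
                     "driving/lane-change", 6.0, "/data/raw/drive_001.mp4")
print(idx.segment_name, idx.start_frame, idx.end_frame)
```

Only these small entries are stored per segment; the source video itself is kept whole, which is the storage saving the application describes.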
It can be understood that the video data indexes can be stored in a database or in tables, or stored by means of other programming approaches; for example, they can be stored in an Excel spreadsheet. Besides the above, the video data indexes can also be stored in other ways, which this embodiment does not limit; in this embodiment, storing the video data indexes in a table is taken as an example. Compared with segmented video data, a video data index requires less storage space, is easy to store, and does not require complicated management; the information can be recorded clearly in a table, which reduces data processing time and avoids consuming a large amount of manpower and material resources.
In a specific implementation, the video data indexes corresponding to all scene segments in the video data to be processed can be recorded and stored in the same table, or the video data indexes corresponding to different types of scene segments can be recorded in different tables. For example, the video data indexes of the scene segments corresponding to weather scenes can be recorded in one table, and those of the scene segments corresponding to driving scenes in another. As a further refinement, the video data indexes of the scene segments corresponding to sunny scenes can be recorded in one table, and those corresponding to rainy scenes in another. Besides the above, other storage methods can be used according to actual usage requirements, which this embodiment does not limit.
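The "one table per scene type" option can be sketched with the standard `csv` module; the column names, file names, and rows below are illustrative assumptions:

```python
import csv
from collections import defaultdict

# Illustrative index entries for two different scene types.
rows = [
    {"segment": "sunny scene segment 1", "type": "weather",
     "start": 0, "end": 600},
    {"segment": "lane-change scene segment 1", "type": "driving",
     "start": 100, "end": 250},
]

# Group entries by scene type, then write one table (CSV file) per type.
by_type = defaultdict(list)
for row in rows:
    by_type[row["type"]].append(row)

for scene_type, entries in by_type.items():
    with open(f"index_{scene_type}.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["segment", "type", "start", "end"])
        writer.writeheader()
        writer.writerows(entries)
```

Writing all rows to a single file instead would give the "one table for everything" option mentioned above.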
In this embodiment, scene classification is performed on the video data to be processed to obtain multiple scene segments; the segment information corresponding to each scene segment is acquired; the data inter-frame tag and scene classification information corresponding to each scene segment are generated according to the segment information; and a video data index corresponding to each scene segment in the video data to be processed is constructed according to the data inter-frame tags and the scene classification information. Thus, the scene segments in the video data to be processed do not need to be separately intercepted, named, and stored; only the corresponding video data index needs to be generated from each segment's data inter-frame tag and scene classification information, and the index stored. From the video data index, the position of any needed scene segment within the video to be processed can be found, which saves video data processing time, avoids consuming a large amount of manpower and material resources, and thereby improves the efficiency of video data processing.
In one embodiment, as shown in Fig. 6, a second embodiment of the data indexing method of the present application is proposed based on the first embodiment. Step S30 includes:
Step S301: determine, according to the segment information, the data frame information and segment attribute information corresponding to each scene segment.
It should be understood that video data may consist of multiple frames of image data; the data frame information in this embodiment refers to the video frame information related to each scene segment. The frame header information and frame trailer information corresponding to each scene segment can then be determined from the data frame information, where the frame header information refers to the video frame information at the beginning of the scene segment, and the frame trailer information refers to the video frame information at the end of the scene segment. After the frame header and frame trailer information corresponding to each scene segment have been determined, the data frame information corresponding to the scene segment can be generated from them.
In a specific implementation, this can be explained on the basis of the driving scene classification, as shown in Fig. 7, which is a schematic diagram of the data frames of scene segments. In Fig. 7, the frame header of lane-change scene segment 1 is O1 and its frame trailer is O2, i.e., the video data from the start of frame O1 to the end of frame O2 in the video to be processed is the video data corresponding to lane-change scene segment 1; the frame header of cruising scene segment 1 is O2 and its frame trailer is O3, i.e., the video data from the start of frame O2 to the end of frame O3 is the video data corresponding to cruising scene segment 1; and the frame header of lane-change scene segment 2 is O3 and its frame trailer is O4, i.e., the video data from the start of frame O3 to the end of frame O4 is the video data corresponding to lane-change scene segment 2.
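In the O1..O4 example the segments are contiguous, so segment k runs from boundary frame k to boundary frame k+1. A minimal sketch of deriving each segment's frame header and trailer from such a boundary list (the frame numbers below are illustrative stand-ins for O1..O4):

```python
def frame_ranges(boundaries, labels):
    """Pair consecutive boundary frames into (header, trailer) per segment."""
    assert len(boundaries) == len(labels) + 1
    return {label: (boundaries[i], boundaries[i + 1])
            for i, label in enumerate(labels)}

ranges = frame_ranges(
    [0, 120, 480, 600],  # stand-ins for O1, O2, O3, O4
    ["lane-change scene segment 1", "cruising scene segment 1",
     "lane-change scene segment 2"])
print(ranges["cruising scene segment 1"])  # (120, 480)
```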
Step S302: generate the data inter-frame tag corresponding to each scene segment according to the data frame information, and generate the scene classification information corresponding to each scene segment according to the segment attribute information.
It should be understood that inter-frame labeling can be performed on each scene segment in the video data to be processed according to the data frame information, generating the data inter-frame tag corresponding to each scene segment; from a segment's data inter-frame tag, the position of that segment within the video data to be processed can be determined.
It should be understood that, since the segment attribute information contains attributes such as the scene category, duration, and storage location, the scene classification information corresponding to each scene segment can be generated from the segment attribute information; from a segment's scene classification information, attributes such as the classification of that segment can be determined.
It can be understood that both the data inter-frame tags and the scene classification information contain partial information corresponding to each scene segment in the video data to be processed: the former is used for locating, and the latter for classification. Therefore, when the data of a scene segment needs to be used, indexing and locating can be performed via the data inter-frame tags and the scene classification information, facilitating data extraction.
In this embodiment, the data frame information and segment attribute information corresponding to each scene segment are determined according to the segment information; the data inter-frame tag corresponding to each scene segment is generated according to the data frame information, and the scene classification information corresponding to each scene segment is generated according to the segment attribute information. Each scene segment in the video data to be processed can thus be located and classified through its data inter-frame tag and scene classification information respectively, so that when a scene segment needs to be used, its data can be conveniently extracted from the video data to be processed.
In one embodiment, as shown in Fig. 8, a third embodiment of the data indexing method of the present application is proposed based on the first or the second embodiment; in this embodiment, the description is based on the first embodiment. After step S40, the method further includes:
Step S50: upon receiving a model training instruction, determine target scene information according to the model training instruction.
It should be understood that, when the data in the video data to be processed needs to be used to train a deep learning model, a model training instruction can be sent to the computer device. Deep learning models can serve multiple purposes with corresponding training types; for example, a deep learning model for weather scenes corresponds to weather scene training and needs to be trained with data related to weather scenes, while a deep learning model for driving scenes corresponds to driving scene training and needs to be trained with data related to driving scenes.
It can be understood that, upon receiving the model training instruction, the computer device can determine, according to the instruction, the target scene information corresponding to the currently required data. For example, when data related to weather scenes is currently required, the corresponding target scene information is the weather scene; when data related to driving scenes is currently required, the corresponding target scene information is the driving scene.
Step S60: extract target scene segments from the video data to be processed according to the target scene information and the video data index.
It should be understood that, since the video data index corresponding to each scene segment in the video data to be processed has already been stored, the target scene segments corresponding to the target scene information can be conveniently found from the stored video data index, and the data corresponding to the target scene segments can be extracted from the video data to be processed for model training.
It can be understood that the target scene information can be matched against the scene classification information in the video data index, and the target scene classification information is determined according to the matching result. According to the video data index, the scene segments corresponding to the target scene classification information can then be taken as the target scene segments, and the data inter-frame tags corresponding to the target scene segments taken as the target data inter-frame tags; the data corresponding to the target scene segments can then be extracted from the video data to be processed according to the target scene classification information and the target data inter-frame tags.
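The matching step can be sketched as a filter over the stored index entries that collects the matching segments' inter-frame tags. The entry layout and category strings below are illustrative assumptions:

```python
# Illustrative stored index (one dict per scene segment).
index = [
    {"segment": "lane-change scene segment 1",
     "category": "driving/lane-change", "start_frame": 100, "end_frame": 250},
    {"segment": "cruising scene segment 1",
     "category": "driving/cruising", "start_frame": 250, "end_frame": 480},
    {"segment": "lane-change scene segment 2",
     "category": "driving/lane-change", "start_frame": 480, "end_frame": 600},
]

def find_target_segments(index, target_scene):
    """Return (segment name, inter-frame tag) pairs for every entry whose
    scene classification information matches the target scene information."""
    return [(e["segment"], (e["start_frame"], e["end_frame"]))
            for e in index if target_scene in e["category"]]

targets = find_target_segments(index, "lane-change")
print(targets)
```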
In a specific implementation, for example, if the target scene information is the lane-change scene, i.e., the model currently needs to be trained with lane-change data, the target scene classification information related to the lane-change scene can be matched in the video data index, and the corresponding target scene segments are determined to be lane-change scene segment 1 and lane-change scene segment 2. It can further be determined that the data inter-frame tag corresponding to lane-change scene segment 1 is frames O1 to O2, and that corresponding to lane-change scene segment 2 is frames O3 to O4; therefore, the data corresponding to these two lane-change scene segments can be extracted from the video data to be processed for model training.
It should be understood that, since the scene classification information contains attributes such as duration and storage location, the storage location of the video data to be processed can first be determined from the target scene classification information. To facilitate data extraction, data position locating can be performed according to the target scene classification information and the target data inter-frame tags, so as to determine the start time and end time of a target scene segment within the video data to be processed, where the start time corresponds to the frame header of the target scene segment and the end time corresponds to its frame trailer. The data corresponding to the target scene segment can then be extracted from the video data to be processed according to its start time and end time.
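Mapping a frame header and trailer to a start and end time only requires the video's frame rate. A minimal sketch under that assumption (the frame rate and frame numbers are illustrative; the application does not specify them):

```python
FPS = 25.0  # assumed frame rate of the stored source video

def segment_times(start_frame, end_frame, fps=FPS):
    """Map an inter-frame tag (frame header, frame trailer) to
    (start_time, end_time) in seconds within the source video."""
    return start_frame / fps, end_frame / fps

start_s, end_s = segment_times(100, 250)
print(start_s, end_s)  # 4.0 10.0
```

With these times, the segment can be read out of the complete source video (for instance by seeking a video reader to `start_s`) without the video ever having been physically split.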
Step S70: perform model training according to the target scene segments.
It should be understood that, if there is only one target scene segment, the data corresponding to that segment is extracted directly from the video data to be processed for model training. If there are multiple target scene segments, they can be sorted according to their start and end times to obtain the sorted target scene segments, and model training is performed on the sorted target scene segments in turn.
In a specific implementation, for example, given the two target scene segments lane-change scene segment 1 and lane-change scene segment 2, the sorting result shows that lane-change scene segment 1 precedes lane-change scene segment 2; therefore, model training can first be performed with the data corresponding to lane-change scene segment 1, and after that training is completed, a jump is performed automatically and model training continues with the data corresponding to lane-change scene segment 2. It can be understood that, if there are more target scene segments, jumping continues according to the sorting result in the above manner and model training proceeds until the data of all target scene segments have been used for training.
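The sort-then-iterate flow above can be sketched as follows; `train_on()` is a hypothetical stand-in for feeding a segment's frames to the model, and the segment times are illustrative:

```python
def train_on(segment_name):
    # Placeholder for the real training step on this segment's data.
    return f"trained on {segment_name}"

# Target segments as (name, start_time, end_time), deliberately unordered.
targets = [("lane-change scene segment 2", 480, 600),
           ("lane-change scene segment 1", 100, 250)]

log = []
for name, start, end in sorted(targets, key=lambda t: t[1]):
    # Train on one segment, then "jump" automatically to the next.
    log.append(train_on(name))

print(log)
```

Sorting by start time ensures segment 1 is consumed before segment 2, matching the jump order described in the example.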
It can be understood that, through the solution of this embodiment, the data corresponding to target scene segments can be indexed automatically via the video data index during model training and quickly linked into the scene data channel; when training on the data of one target scene segment ends, the method can automatically switch to the data of the next target scene segment and continue training, achieving a better effect of automatic model training.
In this embodiment, upon receiving a model training instruction, target scene information is determined according to the model training instruction; target scene segments are extracted from the video data to be processed according to the target scene information and the video data index; and model training is performed according to the target scene segments. In this solution, the original video data does not need to be split and stored, nor do such split files need to be looked up at use time; instead, scene matching is performed via the target scene information, the currently required target scene segments are located via the video data index, and the data corresponding to the target scene segments is extracted from the original video data to be processed for model training. Through indexing and fast locating, the efficiency of model training is improved and labor costs are saved.
In addition, an embodiment of the present application further proposes a storage medium on which a data indexing program is stored; when the data indexing program is executed by a processor, the steps of the data indexing method described above are implemented.
Since this storage medium adopts all the technical solutions of all the above embodiments, it has at least all the beneficial effects brought by those technical solutions, which are not repeated here.
In addition, referring to Fig. 9, an embodiment of the present application further proposes a data indexing apparatus, which includes:
A scene classification module 10, configured to perform scene classification on the video data to be processed to obtain multiple scene segments.
It should be noted that the video data to be processed in this embodiment may be video data obtained by collecting environmental data while a test vehicle is driving. It may be collected automatically by on-board camera equipment of the test vehicle, by external camera equipment mounted on the test vehicle, or by other means, so as to obtain the video data to be processed; this embodiment does not limit this.
In a specific implementation, for example, a test vehicle drives for 6 hours, environmental data is collected automatically during the drive, and 6 hours of video data are obtained; if the collected video data needs to be processed after collection ends, that video data can be taken as the video data to be processed.
It should be understood that, after the video data to be processed is acquired, scene classification can be performed on it to obtain multiple scene segments. It should be noted that, in this step of the solution, these scene segments do not need to be intercepted from the video data to be processed; they only need to be identified within it. Specifically, scene classification can be performed on the video data to be processed according to preset classification types, or in other ways, which this embodiment does not limit.
In a specific implementation, the preset classification types can be set in advance according to actual usage requirements. For example, if distinguished by weather scene, the data can be divided into sunny, rainy, foggy, and so on; further, sunny can be subdivided into bright sun, cloudy, partly cloudy, etc., rainy into heavy rain, moderate rain, light rain, etc., and foggy into dense fog, moderate fog, light fog, etc. If distinguished by driving scene, the data can be divided into cruising, lane changing, braking, and so on; further, cruising can be subdivided into high-speed, medium-speed, and low-speed cruising, lane changing into changing to the left, changing to the right, continuous lane changes, etc., and braking into emergency braking, slow braking, tap braking, etc. It should be understood that, besides the above classification types, other scene classification types can be included, such as time scenes and terrain scenes, which this embodiment does not limit; in this embodiment, the preset classification types including weather scenes and driving scenes are taken as an example.
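The two-level hierarchy described above can be represented as a nested mapping from scene type to category to sub-categories. A minimal sketch; the data structure itself is an illustrative assumption, while the category names follow the example hierarchy:

```python
PRESET_CLASSIFICATION = {
    "weather": {
        "sunny": ["bright sun", "cloudy", "partly cloudy"],
        "rainy": ["heavy rain", "moderate rain", "light rain"],
        "foggy": ["dense fog", "moderate fog", "light fog"],
    },
    "driving": {
        "cruising": ["high-speed", "medium-speed", "low-speed"],
        "lane change": ["left", "right", "continuous"],
        "braking": ["emergency", "slow", "tap"],
    },
}

# The classifier can be configured from this mapping; adding a new scene
# type (e.g. time or terrain scenes) only requires adding another key.
print(sorted(PRESET_CLASSIFICATION))
```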
It can be understood that image analysis can be performed on the video data to be processed, and multiple scene segments in the video data to be processed can be obtained according to the image analysis results.
In a specific implementation, for example, if classified by driving scene, as shown in Fig. 3, which is a schematic diagram of driving scene segments, the horizontal line in Fig. 3 is the time axis of the video data to be processed, and the scene segments identified in the video data to be processed are lane-change scene segment 1, cruising scene segment 1, and lane-change scene segment 2.
As another example, if classified by weather scene, as shown in Fig. 4, which is a schematic diagram of weather scene segments, the horizontal line in Fig. 4 is the time axis of the video data to be processed, and the scene segments identified in the video data to be processed are sunny scene segment 1 and rainy scene segment 1.
As a further example, driving scenes and weather scenes can be combined for classification, as shown in Fig. 5, which is a schematic diagram of combining driving scenes with weather scenes; the horizontal line in Fig. 5 is the time axis of the video data to be processed. In addition to identifying the driving scenes and weather scenes described above, the two can be combined to obtain the four scene segments A1, A2, A3, and A4 in Fig. 5, where A1 is sunny lane-change scene segment 1, A2 is sunny cruising scene segment 1, A3 is rainy cruising scene segment 1, and A4 is rainy lane-change scene segment 1.
It can be understood that, in addition to the independent scene classification and the combined scene classification illustrated above, classification can also be performed on other independent scenes, or by combining more types of scenes, which can be configured according to actual usage requirements; this embodiment does not limit this.
信息获取模块20,用于获取各场景片段对应的片段信息。The information acquisition module 20 is configured to acquire segment information corresponding to each scene segment.
应当理解的是,由于视频数据中包含很多种类的信息,而场景片段是视频数据中的一部分数据,因此,场景片段中也同样包含很多种类的信息。所以,在确定待处理视频数据中包含的多个场景片段后,可获取这些场景片段对应的片段信息,其中,片段信息可包括数据帧信息和片段属性信息,除此之外,还可包括其他类型的信息,本实施例对此不作限制。It should be understood that since video data contains many types of information, and scene segments are part of the video data, scene segments also contain many types of information. Therefore, after determining a plurality of scene fragments contained in the video data to be processed, the fragment information corresponding to these scene fragments can be obtained, wherein the fragment information can include data frame information and fragment attribute information, besides, it can also include other Type information, which is not limited in this embodiment.
数据生成模块30,用于根据所述片段信息生成各场景片段对应的数据帧间标签和场景分类信息。The data generation module 30 is configured to generate data inter-frame tags and scene classification information corresponding to each scene segment according to the segment information.
可以理解的是,可根据片段信息对各场景片段分别进行帧间标注,以确定各场景片段对应的开始帧和结束帧,进而生成数据帧间标签。还可根据片段信息中包含的场景类别、时长、存储位置等属性信息,根据这些属性信息生成场景分类信息。It can be understood that the inter-frame labeling can be performed on each scene segment according to the segment information, so as to determine the start frame and the end frame corresponding to each scene segment, and then generate data inter-frame tags. Scene classification information may also be generated according to attribute information such as scene category, duration, and storage location included in the fragment information.
数据索引模块40,用于根据所述数据帧间标签和所述场景分类信息构建所述待处理视频数据中各场景片段对应的视频数据索引。The data index module 40 is configured to construct a video data index corresponding to each scene segment in the video data to be processed according to the inter-frame tag of the data and the scene classification information.
应当理解的是,在生成各场景片段对应的数据帧间标签和场景分类信息后,可根据数据帧间标签和场景分类信息构建待处理视频数据中各场景片段对应的视频数据索引。可将这些视频数据索引进行存储,并且直接将完整的待处理视频数据进行存储,不需要对待处理视频数据进行单独截取、命名存储等耗时的操作,在需要使用待处理视频数据中的场景片段时,直接根据视频数据索引便可确定需要使用的场景片段的相关信息,进而从待处理视频数据中提取出这些场景片段进行使用。It should be understood that, after the data inter-frame tags and scene classification information corresponding to each scene segment are generated, a video data index corresponding to each scene segment in the video data to be processed may be constructed according to the data inter-frame tag and scene classification information. These video data indexes can be stored, and the complete video data to be processed can be stored directly, without time-consuming operations such as separate interception, naming and storage of the video data to be processed, and scene fragments in the video data to be processed can be used , the relevant information of the scene segments to be used can be determined directly according to the video data index, and then these scene segments are extracted from the video data to be processed for use.
可以理解的是,可通过数据库或者表格的方式对视频数据索引进行存储,或以其它编程语言方式实现对视频数据索引进行存储,例如,可将这些视频数据索引存储在EXCLE表格中,除了上述方式外,还可通过其他方式对视频数据索引进行存储,本实施例对此不作限制,在本实施例中,以通过表格对视频数据索引进行存储为例进行说明。由于视频数据索引相较于分割的视频数据而言,需要的存储空间较小,便于存储,而且不需要进行复杂的管理,通过表格就可以很清晰的对这些信息进行记录,可减少数据处理的时间,并且避免耗费大量的人力物力。It can be understood that the video data index can be stored in the form of a database or a table, or can be stored in other programming languages. For example, these video data indexes can be stored in an EXCLE table. In addition, the video data index may also be stored in other ways, which is not limited in this embodiment. In this embodiment, storage of the video data index in a table is taken as an example for illustration. Compared with segmented video data, the video data index requires less storage space, is easy to store, and does not require complicated management. The information can be clearly recorded through the table, which can reduce the cost of data processing. time and avoid wasting a lot of manpower and material resources.
In a specific implementation, the video data indexes corresponding to all scene segments in the video data to be processed can be recorded and stored in a single table, or the video data indexes corresponding to different types of scene segments can be recorded in different tables. For example, the video data indexes of scene segments corresponding to weather scenes can be recorded in one table, and those corresponding to driving scenes in another. As a further refinement, the video data indexes of segments corresponding to sunny scenes can be recorded in one table, and those corresponding to rainy scenes in another. Besides the above, other storage schemes can be used according to actual needs, which is not limited in this embodiment.
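A minimal sketch of the two storage layouts described above, assuming hypothetical column names (`scene`, `start`, `end`); whether one combined table or several per-type tables is used is a deployment choice, not something the embodiment fixes:

```python
import csv
import io

index_rows = [
    {"scene": "sunny", "start": 0, "end": 1499},
    {"scene": "rainy", "start": 1500, "end": 2999},
    {"scene": "highway", "start": 3000, "end": 4499},
]

def write_table(rows):
    """Serialize index rows to one CSV table (an Excel-compatible format)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["scene", "start", "end"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Layout 1: all scene segments recorded in a single table.
combined = write_table(index_rows)

# Layout 2: one table per scene type (e.g. sunny vs. rainy vs. highway).
per_type = {scene: write_table([r for r in index_rows if r["scene"] == scene])
            for scene in {r["scene"] for r in index_rows}}

print(sorted(per_type))  # ['highway', 'rainy', 'sunny']
```

Either layout stores only a few bytes per segment, which is the storage saving the paragraph above describes.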
In this embodiment, scene classification is performed on the video data to be processed to obtain a plurality of scene segments; segment information corresponding to each scene segment is obtained; inter-frame data tags and scene classification information corresponding to each scene segment are generated according to the segment information; and a video data index corresponding to each scene segment in the video data to be processed is constructed according to the inter-frame data tags and the scene classification information. Therefore, it is not necessary to separately intercept, name, and store the scene segments in the video data to be processed; only the corresponding video data index needs to be generated from each segment's inter-frame data tag and scene classification information and stored. The position in the video of any scene segment to be used can then be found from the video data index, which saves video data processing time, avoids consuming a great deal of manpower and material resources, and thereby improves the efficiency of video data processing.
In an embodiment, the data generation module 30 is further configured to determine data frame information and segment attribute information corresponding to each scene segment according to the segment information, generate an inter-frame data tag corresponding to each scene segment according to the data frame information, and generate scene classification information corresponding to each scene segment according to the segment attribute information.
In an embodiment, the data generation module 30 is further configured to determine frame header information and frame trailer information corresponding to each scene segment according to the data frame information, and generate an inter-frame data tag corresponding to each scene segment according to the frame header information and the frame trailer information.
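As one hedged reading of this step (the `frame_id` field and dict layout are assumptions for illustration), the inter-frame data tag can be derived from the first and last data frames of a segment:

```python
def make_interframe_tag(frames):
    """Build an inter-frame tag for one scene segment.

    `frames` is the segment's ordered list of data frames; the tag's header
    comes from the first frame and its trailer from the last frame.
    """
    header = frames[0]["frame_id"]
    trailer = frames[-1]["frame_id"]
    return {"header": header, "trailer": trailer}

# A segment spanning data frames 100..159 of the source video.
segment_frames = [{"frame_id": i} for i in range(100, 160)]
tag = make_interframe_tag(segment_frames)
print(tag)  # {'header': 100, 'trailer': 159}
```

The tag marks the segment's boundaries inside the unsplit video, which is what later allows extraction without stored copies.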
In an embodiment, the data indexing apparatus further includes a model training module, configured to: when a model training instruction is received, determine target scene information according to the model training instruction; extract a target scene segment from the video data to be processed according to the target scene information and the video data index; and perform model training according to the target scene segment.
In an embodiment, the model training module is further configured to: match the target scene information with the scene classification information in the video data index to determine target scene classification information; take the scene segment corresponding to the target scene classification information as the target scene segment, and take the inter-frame data tag corresponding to the target scene segment as a target inter-frame data tag; and extract the target scene segment from the video data to be processed according to the target scene classification information and the target inter-frame data tag.
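A sketch of this matching step, under the assumption that scene classification information is a plain string and that matching is exact (the embodiment does not fix the matching rule, so fuzzier matching would also fit):

```python
def match_targets(video_index, target_scene):
    """Return (classification, inter-frame tag) pairs matching the target scene."""
    return [(entry["scene"], entry["tag"])
            for entry in video_index
            if entry["scene"] == target_scene]

video_index = [
    {"scene": "rainy", "tag": {"header": 0, "trailer": 1499}},
    {"scene": "sunny", "tag": {"header": 1500, "trailer": 2999}},
    {"scene": "rainy", "tag": {"header": 3000, "trailer": 4499}},
]

targets = match_targets(video_index, "rainy")
print(len(targets))  # 2: two rainy segments matched
```

Each returned tag then drives the extraction of one target scene segment from the unsplit video.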
In an embodiment, the model training module is further configured to perform data location positioning according to the target scene classification information and the target inter-frame data tag, so as to determine the start time and end time of the target scene segment, and to extract the target scene segment from the video data to be processed according to the start time and the end time.
In an embodiment, the model training module is further configured to, when there are multiple target scene segments, sort the target scene segments according to the start time and end time corresponding to each target scene segment to obtain sorted target scene segments, and perform model training sequentially according to the sorted target scene segments.
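The sorting step can be sketched as follows; `train_on` stands in for whatever per-segment training routine consumes the data (an assumption for illustration, not part of the embodiment):

```python
# Target scene segments as (start, end) times, in arbitrary discovery order.
segments = [
    {"start": 3000, "end": 4499},
    {"start": 0, "end": 1499},
    {"start": 1500, "end": 2999},
]

# Order segments by start time (end time breaks ties) before training.
ordered = sorted(segments, key=lambda s: (s["start"], s["end"]))

def train_on(segment):
    """Placeholder for the per-segment model training call."""
    return (segment["start"], segment["end"])

# Train sequentially on the sorted segments.
schedule = [train_on(s) for s in ordered]
print(schedule)  # [(0, 1499), (1500, 2999), (3000, 4499)]
```

Sorting keeps the training order consistent with the segments' chronological order in the source video.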
For other embodiments or specific implementations of the data indexing apparatus described in this application, reference may be made to the above method embodiments, which will not be repeated here.
It should be noted that, in this document, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus including a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent in such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not represent the relative merits of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the essence of the technical solution of this application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a computer-readable storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disk), and includes several instructions to cause a smart device (which may be a mobile phone, a computer, a data indexing device, a network data indexing device, or the like) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of the present application and are not intended to limit its patent scope. Any equivalent structural or process transformation made using the contents of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of this application.

Claims (10)

  1. A data indexing method, characterized in that the data indexing method comprises the following steps:
    performing scene classification on video data to be processed to obtain a plurality of scene segments;
    obtaining segment information corresponding to each scene segment;
    generating inter-frame data tags and scene classification information corresponding to each scene segment according to the segment information; and
    constructing a video data index corresponding to each scene segment in the video data to be processed according to the inter-frame data tags and the scene classification information.
  2. The data indexing method according to claim 1, characterized in that generating the inter-frame data tags and scene classification information corresponding to each scene segment according to the segment information comprises:
    determining data frame information and segment attribute information corresponding to each scene segment according to the segment information; and
    generating an inter-frame data tag corresponding to each scene segment according to the data frame information, and generating scene classification information corresponding to each scene segment according to the segment attribute information.
  3. The data indexing method according to claim 2, characterized in that generating the inter-frame data tag corresponding to each scene segment according to the data frame information comprises:
    determining frame header information and frame trailer information corresponding to each scene segment according to the data frame information; and
    generating the inter-frame data tag corresponding to each scene segment according to the frame header information and the frame trailer information.
  4. The data indexing method according to any one of claims 1 to 3, characterized in that, after constructing the video data index corresponding to each scene segment in the video data to be processed according to the inter-frame data tags and the scene classification information, the method further comprises:
    when a model training instruction is received, determining target scene information according to the model training instruction;
    extracting a target scene segment from the video data to be processed according to the target scene information and the video data index; and
    performing model training according to the target scene segment.
  5. The data indexing method according to claim 4, characterized in that extracting the target scene segment from the video data to be processed according to the target scene information and the video data index comprises:
    matching the target scene information with the scene classification information in the video data index to determine target scene classification information;
    taking the scene segment corresponding to the target scene classification information as the target scene segment, and taking the inter-frame data tag corresponding to the target scene segment as a target inter-frame data tag; and
    extracting the target scene segment from the video data to be processed according to the target scene classification information and the target inter-frame data tag.
  6. The data indexing method according to claim 5, characterized in that extracting the target scene segment from the video data to be processed according to the target scene classification information and the target inter-frame data tag comprises:
    performing data location positioning according to the target scene classification information and the target inter-frame data tag to determine a start time and an end time of the target scene segment; and
    extracting the target scene segment from the video data to be processed according to the start time and the end time.
  7. The data indexing method according to claim 6, characterized in that performing model training according to the target scene segment comprises:
    when there are multiple target scene segments, sorting the target scene segments according to the start time and end time corresponding to each target scene segment to obtain sorted target scene segments; and
    performing model training sequentially according to the sorted target scene segments.
  8. A data indexing apparatus, characterized in that the data indexing apparatus comprises:
    a scene classification module, configured to perform scene classification on video data to be processed to obtain a plurality of scene segments;
    an information acquisition module, configured to obtain segment information corresponding to each scene segment;
    a data generation module, configured to generate inter-frame data tags and scene classification information corresponding to each scene segment according to the segment information; and
    a data index module, configured to construct a video data index corresponding to each scene segment in the video data to be processed according to the inter-frame data tags and the scene classification information.
  9. A data indexing device, characterized in that the data indexing device comprises: a memory, a processor, and a data indexing program stored in the memory and executable on the processor, wherein the data indexing program, when executed by the processor, implements the data indexing method according to any one of claims 1 to 7.
  10. A storage medium, characterized in that a data indexing program is stored on the storage medium, and the data indexing program, when executed by a processor, implements the data indexing method according to any one of claims 1 to 7.
PCT/CN2021/113518 2021-08-19 2021-08-19 Data indexing method, apparatus and device, and storage medium WO2023019510A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/113518 WO2023019510A1 (en) 2021-08-19 2021-08-19 Data indexing method, apparatus and device, and storage medium
CN202180099925.4A CN117597680A (en) 2021-08-19 2021-08-19 Data indexing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/113518 WO2023019510A1 (en) 2021-08-19 2021-08-19 Data indexing method, apparatus and device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023019510A1 true WO2023019510A1 (en) 2023-02-23

Family

ID=85239392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/113518 WO2023019510A1 (en) 2021-08-19 2021-08-19 Data indexing method, apparatus and device, and storage medium

Country Status (2)

Country Link
CN (1) CN117597680A (en)
WO (1) WO2023019510A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1993989A (en) * 2005-05-26 2007-07-04 索尼株式会社 Contents processing device, contents processing method, and computer program
US20100246965A1 (en) * 2009-03-31 2010-09-30 Microsoft Corporation Tagging video using character recognition and propagation
CN111209431A (en) * 2020-01-13 2020-05-29 上海极链网络科技有限公司 Video searching method, device, equipment and medium
CN111797801A (en) * 2020-07-14 2020-10-20 北京百度网讯科技有限公司 Method and apparatus for video scene analysis

Also Published As

Publication number Publication date
CN117597680A (en) 2024-02-23

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202180099925.4

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21953756

Country of ref document: EP

Kind code of ref document: A1