CN109977822A

CN109977822A - Data supply method, model training method, device, system, equipment and medium

Info

Publication number: CN109977822A
Application number: CN201910197522.8A
Authority: CN
Inventors: 梁柱锦; 刘运; 蒋德为
Original assignee: Guangzhou Netstar Information Technology Co Ltd
Current assignee: Guangzhou Netstar Information Technology Co Ltd
Priority date: 2019-03-15
Filing date: 2019-03-15
Publication date: 2019-07-05
Anticipated expiration: 2039-03-15
Also published as: CN109977822B

Abstract

The invention discloses a data supply method, a model training method, a device, a system, equipment and a medium. The data supply method includes: obtaining a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to the current training; obtaining matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of various types; and processing the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model. The technical solution provided by the embodiments of the present invention trains on video data directly, so less storage space is required and less time is spent reading video data, which improves the training efficiency of the video model.

Description

Data supply method, model training method, device, system, equipment and medium

Technical field

Embodiments of the present invention relate to the video field, and in particular to a data supply method, a model training method, a device, a system, equipment and a medium.

Background technique

At present, to train a video model, video data is generally obtained first and parsed into the corresponding pictures (i.e. video frames) and audio information, and the pictures and the audio information are stored as separate data files. During training of the video model, at least one of the following is done: pictures are read from the picture data file for training, and audio information is read from the audio data file for training.

The existing way of supplying training data for a video model has the following problems. Storing video data directly already requires considerable storage space, and the picture data files and audio data files occupy even more storage space than the corresponding video data, so more storage space on the training equipment is needed. In addition, because the picture data files and audio data files are very large, more time must be spent during training reading at least one of the pictures and the audio information, which lowers training efficiency.

Summary of the invention

Embodiments of the present invention provide a data supply method, a model training method, a device, a system, equipment and a medium, which improve the training efficiency of a video model.

In a first aspect, an embodiment of the present invention provides a data supply method, the method including:

obtaining a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to the current training;

obtaining matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of various types;

processing the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

In a second aspect, an embodiment of the present invention provides a model training method, the method including:

obtaining the training data corresponding to a video model according to the data supply method of the first aspect;

inputting the training data into the video model to obtain the trained video model.

In a third aspect, an embodiment of the present invention provides a data supply device, the device including:

a training request obtaining module, configured to obtain a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to the current training;

a target data obtaining module, configured to obtain matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of various types;

a training data determining module, configured to process the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

In a fourth aspect, an embodiment of the present invention provides a model training device, the device including:

a training data obtaining module, configured to obtain the training data corresponding to a video model according to the data supply method of the first aspect;

a video model training module, configured to input the training data into the video model to obtain the trained video model.

In a fifth aspect, an embodiment of the present invention provides a data supply system, the system including: a distributed data storage end, a batch loading end, and a data supply end connected to both the distributed data storage end and the batch loading end; the distributed data storage end stores the data set in a distributed manner; the batch loading end stores the batch processing mechanism and generates the training request; the data supply end is provided with the data supply device of the third aspect.

In a sixth aspect, an embodiment of the present invention provides a model training system, the system including: a distributed data storage end, a batch loading end, and a model training end connected to both the distributed data storage end and the batch loading end; the distributed data storage end stores the data set in a distributed manner; the batch loading end stores the batch processing mechanism and generates the training request; the model training end is provided with the model training device of the fourth aspect.

In a seventh aspect, an embodiment of the present invention provides equipment, the equipment including:

one or more processors;

a storage device for storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the data supply method of the first aspect of the present invention, or implement the model training method of the second aspect of the present invention.

In an eighth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the data supply method of the first aspect of the present invention, or implements the model training method of the second aspect of the present invention.

Embodiments of the present invention provide a data supply method, a model training method, a device, a system, equipment and a medium. Matching target video data is obtained from a distributed storage data set using the data identifier in the training request, and the target video data is processed according to a preset batch processing mechanism, so the training data for the video model to be trained is obtained without spending a large amount of time writing data processing functions. Compared with prior-art video model training methods that train on pictures or audio information, the technical solution of the embodiments trains on video data directly, so less storage space is required and less time is spent reading video data, which improves the training efficiency of the video model.

Brief description of the drawings

Other features, objects and advantages of the present invention will become more apparent upon reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:

Fig. 1A is a flowchart of a data supply method provided by Embodiment One of the present invention;

Fig. 1B is a schematic block diagram of the data supply process provided by Embodiment One of the present invention;

Fig. 2A is a flowchart of a data supply method provided by Embodiment Two of the present invention;

Fig. 2B is a schematic diagram of a data supply process provided by Embodiment Two of the present invention;

Fig. 3A is a flowchart of a model training method provided by Embodiment Three of the present invention;

Fig. 3B is a schematic diagram of the model training process provided by Embodiment Three of the present invention;

Fig. 4 is a structural diagram of a data supply device provided by Embodiment Four of the present invention;

Fig. 5 is a structural diagram of a model training device provided by Embodiment Five of the present invention;

Fig. 6 is a schematic diagram of a data supply system provided by Embodiment Six of the present invention;

Fig. 7 is a schematic diagram of a model training system provided by Embodiment Seven of the present invention;

Fig. 8 is a structural diagram of equipment provided by Embodiment Eight of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure. In addition, where no conflict arises, the embodiments of the present invention and the features in them may be combined with one another.

Embodiment one

Fig. 1A is a flowchart of a data supply method provided by Embodiment One of the present invention. The data supply method of this embodiment can be executed by the data supply device provided by the embodiments of the present invention. The device can be implemented in software and/or hardware and integrated into the equipment that executes the method, which can be any intelligent terminal with the corresponding data processing capability.

Specifically, with reference to Fig. 1A, the method may include the following steps:

S110: Obtain a training request for a video model.

The training request includes a preset batch processing mechanism and the data identifier corresponding to the current training. Specifically, as neural network models have come into wide use in data processing, deep learning techniques can simulate the behavior of neural networks to process information and thereby achieve a video processing purpose, yielding video models with self-learning, adaptive video processing functions of various kinds. The video model in this embodiment may be any neural network model that, through its constructed network parameters and neuron structure, performs a corresponding recognition or classification function on a video, for example a video review model that judges whether a video contains non-compliant content. Building a video model requires a large amount of training data to iteratively train the initially configured neural network model, so that the trained video model can accurately achieve the video processing purpose for any video. The training request in this embodiment therefore indicates that, for the video model to be trained, the training data used in its subsequent training needs to be obtained in advance.

Optionally, because different video models have different training needs, the training request for a video model includes a batch processing mechanism, preset for the particular training task, that the training data in the batch of each training iteration should satisfy, as well as the data identifiers of the training data for the current training. The batch processing mechanism contains the data composition requirements for the training data in each iteration's batch, which are set according to the training task of the video model to be trained. A data identifier is a mark that uniquely identifies training data required by the current training. In this embodiment the data identifier may be the uniform resource locator (Uniform Resource Locator, URL) of the video data; the URL indicates the address of the file in which the video data is stored, locally or on a network, and also includes information such as the protocol to be used when obtaining the video data and the path.
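Because a URL-form data identifier carries both the access protocol and the path, that information can be recovered with ordinary URL parsing. The following is only an illustrative sketch: the function name `describe_identifier` and the example address are hypothetical and do not come from the patent.

```python
from urllib.parse import urlsplit


def describe_identifier(url: str) -> dict:
    """Split a URL-form data identifier into protocol, host and path."""
    parts = urlsplit(url)
    return {"protocol": parts.scheme, "host": parts.netloc, "path": parts.path}


# Hypothetical identifier of a single video stored on an HDFS cluster.
info = describe_identifier("hdfs://namenode.example:9000/datasets/videos/clip_0001.mp4")
print(info["protocol"], info["path"])
```

The protocol part is what lets a supplier decide which storage back end (local disk, HDFS, HTTP and so on) should serve the request.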

Specifically, when a video model needs to be trained so that it acquires the corresponding video processing function, the user can generate a training request by performing a training operation. The training operation may consist of selecting the data identifiers of the training data that take part in the current training to generate an identifier list, and setting the batch processing mechanism that the current training should satisfy. The training request for the video model to be trained is then generated from the identifier list and the batch processing mechanism, so that the training data taking part in the current training can be obtained subsequently.
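A training request of this kind can be pictured as a small record holding the identifier list and the chosen batch processing mechanism. The sketch below is an assumption for illustration only: `TrainRequest`, `build_train_request` and the mechanism name `"balanced"` are hypothetical, and the patent does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainRequest:
    """Hypothetical container for one training request."""
    batch_mechanism: str                      # name of the preset batch processing mechanism
    data_identifiers: List[str] = field(default_factory=list)  # identifier list


def build_train_request(selected_identifiers, batch_mechanism):
    """Combine the identifiers chosen by the user with the batch mechanism."""
    return TrainRequest(batch_mechanism=batch_mechanism,
                        data_identifiers=list(selected_identifiers))


request = build_train_request(
    ["file:///data/videos/a.mp4", "http://cdn.example/videos/b.mp4"],
    "balanced",
)
print(request)
```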

Illustratively, the "batch" in the batch processing mechanism of this embodiment is the batch used in machine learning training, that is, all the training data that takes part in one training iteration. Different training tasks place different requirements on the composition of the training data in a batch. For example, common video classification training requires the numbers of video data items carrying each label in a batch to be as balanced as possible; pair-wise video training requires the video data in a batch to appear in pairs; training with a triplet loss function (triplet-loss) requires the video data in a batch to appear in groups of three. Some training tasks also impose requirements on the order in which training data is loaded, and others, such as tasks that need hard example mining, dynamically adjust the composition of the training set according to training results. All of these training needs can be configured in the batch processing mechanism of this embodiment.
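The batch composition requirements above can be sketched as small batch-building functions. This is a minimal illustration under stated assumptions, not the patent's batch generation interface: `balanced_batch` and `grouped_batch` are hypothetical names, samples are represented as `(identifier, label)` tuples, and pairs or triplets are assumed to be grouped already.

```python
import random
from collections import defaultdict


def balanced_batch(samples, batch_size, rng):
    """Draw about the same number of samples per label (classification)."""
    by_label = defaultdict(list)
    for identifier, label in samples:
        by_label[label].append(identifier)
    labels = sorted(by_label)
    per_label = batch_size // len(labels)
    batch = []
    for label in labels:
        count = min(per_label, len(by_label[label]))
        for identifier in rng.sample(by_label[label], count):
            batch.append((identifier, label))
    return batch


def grouped_batch(groups, groups_per_batch, group_size):
    """Keep pairs (group_size=2) or triplets (group_size=3) together."""
    for group in groups:
        if len(group) != group_size:
            raise ValueError("every group must contain group_size items")
    return [item for group in groups[:groups_per_batch] for item in group]


rng = random.Random(0)
samples = [("a1", "cat"), ("a2", "cat"), ("a3", "cat"), ("b1", "dog"), ("b2", "dog")]
print(balanced_batch(samples, 4, rng))
print(grouped_batch([("anchor", "positive", "negative")], 1, 3))
```

Ordering constraints or hard example mining would need further mechanisms that this sketch does not cover.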

In addition, the schematic block diagram of data supply in this embodiment is shown in Fig. 1B. The user can perform the trigger operation for model training on the batch loading end: the user selects, on the batch loading end, the data identifiers of the video data taking part in the current training, which generates the identifier list; the batch processing mechanism chosen for the current training is obtained at the same time and, together with the identifier list, is used to generate the training request, so that the data supply end can obtain the training request generated for the video model to be trained.

S120: Obtain matching target video data from the distributed storage data set according to the data identifier.

The distributed storage data set includes video data of various types. Specifically, to make data supply more flexible, the distributed storage data set supports multiple ways of storing and reading video data. In this embodiment the distributed storage data set may include: video data stored on a local disk as individual files; video data stored on a local disk after the video data allowed to take part in training has been gathered into a training data package by packing; video data stored across multiple ends using a distributed data storage protocol (such as single video data or video data packages stored on the Hadoop Distributed File System (HDFS)); and video data stored at any network address via a network data protocol (such as video data at any network address accessed by URL, which may include video data published on the Internet, video data cached on a content delivery network (Content Distribution Network, CDN), video data uploaded to the open-source distributed file system FastDFS (Fast Distributed File System), and video data shared by servers in the same local area network that run a Hypertext Transfer Protocol (HTTP) service).

Optionally, because different video models have different training needs and require training data in different formats, the distributed storage data set of this embodiment can provide two data storage strategies. One stores and accesses individual video samples (including single video data stored on a local disk, URL access over a network, and so on). The other packs multiple video samples and accesses them as a package (including video data packages stored on a local disk or in the HDFS/FastDFS distributed file systems). Further, for single video data, the distributed storage data set can store associated video content such as the URL, file name, data label and other additional information of the single video data, so that all the information related to the video data taking part in training is available when training data is later supplied to the video model to be trained. This embodiment also provides a packing program for video data: if packing is needed when storing video data, the packing program can pack the content information associated with the video data, which is then stored as a video data package at the corresponding location in the distributed storage data set. The two strategies have different advantages. Accessing individual video samples gives random access to the video data and suits cases where data is generated dynamically or where a highly random data order is required. Packed access reads data quickly and avoids the data input/output latency caused by random access to the distributed storage data set. In this embodiment, when the user performs the trigger operation on the batch loading end to generate a training request, video data under either storage strategy can be selected according to the training task of the video model to be trained, which makes obtaining training data more flexible. Accordingly, the data identifier in this embodiment may be the identifier of a single video data item or the package identifier of a packed video data package.

Specifically, when this embodiment obtains the training request for the video model to be trained, it can parse the request to obtain the preset batch processing mechanism suited to the current training task and the corresponding data identifiers. Then, according to the data identifiers of all the video data taking part in the current training that are included in the training request, it obtains the matching target video data from the distributed storage data set. The target video data is the video data taking part in the current training, together with the associated video content stored with it, such as the file name, label and other additional information.

Illustratively, as shown in Fig. 1B, the storage locations in the distributed storage data set of this embodiment may include a local file system, a CDN cluster, an HDFS cluster and a FastDFS cluster. The CDN cluster, FastDFS cluster and HDFS cluster are deployed on the same machines as the training server cluster that hosts the video models to be trained. Each server has hundreds of GB of memory, several graphics cards that support model training, a disk array (Redundant Arrays of Independent Disks, RAID) with tens of TB of capacity and a central processing unit (Central Processing Unit, CPU) with tens of cores, and the servers are connected by 10-gigabit networking. Frequently accessed data can be cached in memory, so that hard disk input/output during training is reduced as far as possible and video data is read faster, making effective use of the servers' memory and hard disk resources. Storing the video data centrally in a distributed manner also means that distributed training does not need to copy the training data in advance, which speeds up the data preparation stage of model training.

S130: Process the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

Specifically, once the matching target video data has been obtained from the distributed storage data set according to the data identifiers of the current training, the target video data can be batch processed according to the preset batch processing mechanism carried in the training request. In this embodiment, an open Transmission Control Protocol (TCP) port can be used to receive the associated video content of the target video data for each batch; the target video data in the batch is grouped in the way the training task requires and loaded into memory, yielding the training data for the video model to be trained. When the video model is subsequently trained, the training data can be decoded and preprocessed to convert it into the format specified by the video model to be trained.

When the scheme of this embodiment is used to supply data for training a video model, the target video data can be obtained simply from the data identifiers of the current training. The video data does not have to be downloaded and packed in advance, and in distributed training the video data taking part in training does not have to be transferred to every participating machine, which greatly shortens the time spent preparing training data for model training. Moreover, the video data stored in the distributed storage data set is itself compressed, so downloading the target video data from the distributed storage data set into memory according to the data identifiers does not consume much of the training equipment's bandwidth and does not consume its disk input/output resources. The subsequent decoding and preprocessing of the training data mainly use the equipment's CPU and can run in parallel with video model training, which uses the graphics processing unit (GPU); no extra time is spent waiting for data preprocessing, so the time taken by video model training is reduced and the hardware utilization of the training equipment is greatly improved. In addition, the time-consuming steps of data preparation and data preprocessing are standardized, and a flexible, customizable batch generation interface is provided, so that algorithm engineers can concentrate on improving the video model or the training method without spending a great deal of time on data processing. End-to-end training also lets the video model fit the characteristics of business data more closely, so better models can be trained and video model training becomes more flexible.
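The overlap between CPU-side data preparation and GPU-side training described above can be pictured as a bounded prefetch queue filled by a background thread. The sketch below is an assumption for illustration, not the patent's implementation: `prefetch`, `prepare` and `depth` are hypothetical names, and `prepare` stands in for whatever decoding and preprocessing the training task needs.

```python
import queue
import threading


def prefetch(batches, prepare, depth=2):
    """Yield prepared batches while a background thread prepares the next ones."""
    buffer = queue.Queue(maxsize=depth)   # bounded, so preparation cannot run far ahead
    done = object()                       # sentinel marking the end of the stream

    def worker():
        try:
            for batch in batches:
                buffer.put(prepare(batch))
        finally:
            # Always signal the end, even if prepare() raised, so the consumer
            # never blocks forever. A failure therefore ends the stream early.
            buffer.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


# The training loop consumes batches while later ones are prepared in parallel.
for ready in prefetch([[1, 2], [3, 4]], lambda b: [x * 2 for x in b]):
    print(ready)
```

The bounded queue limits how many prepared batches are held in memory at once, which matters when each batch contains decoded video frames.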

In the technical solution of this embodiment, matching target video data is obtained from the distributed storage data set using the data identifier in the training request, and the target video data is processed according to the preset batch processing mechanism, so the training data for the video model to be trained is obtained without spending a large amount of time writing data processing functions. Training is performed on the video data directly, so less storage space is required and less time is spent reading video data, which improves the training efficiency of the video model.

Embodiment two

Fig. 2A is a flowchart of a data supply method provided by Embodiment Two of the present invention, and Fig. 2B is a schematic diagram of a data supply process provided by Embodiment Two. This embodiment is an optimization based on the technical solution of the above embodiment. Specifically, this embodiment explains in detail how target video data is obtained from the distributed storage data set.

Optionally, as shown in Fig. 2A, this embodiment may include the following steps:

S210: Obtain a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to the current training.

S220: Determine the type of the data identifier. If the data identifier is a single-video identifier, obtain the matching single video data from the distributed storage data set according to the single-video identifier; if the data identifier is a packed-video identifier, obtain the matching packed video data from the distributed storage data set according to the packed-video identifier.

Specifically, because the distributed storage data set offers two data storage strategies, single-sample access and packed access, the data identifiers of video data stored under either strategy can be chosen according to the training task. When the training request is parsed to obtain the data identifiers of the video data taking part in the current training, the type of each data identifier therefore has to be determined first. If the data identifier is a single-video identifier, the matching single video data, including single video data stored on a local disk or on any network, is obtained from the distributed storage data set directly according to the single-video identifier. If the data identifier is a packed-video identifier, the matching packed video data, including video data packages stored on a local disk or in the HDFS/FastDFS distributed file systems, is obtained from the distributed storage data set directly according to the packed-video identifier. The matching target video data taking part in the current training is thereby obtained.
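Step S220 amounts to a dispatch on the identifier type. The patent does not say how the two kinds of identifier are told apart, so the sketch below simply assumes that packed-video identifiers end with a hypothetical `.pack` suffix; `identifier_type`, `fetch_target_videos` and the in-memory dictionaries standing in for the distributed storage data set are likewise illustrative assumptions.

```python
PACK_SUFFIX = ".pack"  # assumed naming convention, not specified by the patent


def identifier_type(identifier: str) -> str:
    """Classify a data identifier as 'packed' or 'single'."""
    return "packed" if identifier.endswith(PACK_SUFFIX) else "single"


def fetch_target_videos(identifier, single_store, pack_store):
    """Return the list of videos matched by one identifier."""
    if identifier_type(identifier) == "packed":
        return list(pack_store[identifier])   # a package holds several videos
    return [single_store[identifier]]         # a single identifier holds one video


# In-memory stand-ins for the two storage strategies.
single_store = {"file:///data/a.mp4": b"video-a"}
pack_store = {"hdfs://cluster/train_set.pack": [b"video-b", b"video-c"]}

print(fetch_target_videos("file:///data/a.mp4", single_store, pack_store))
print(fetch_target_videos("hdfs://cluster/train_set.pack", single_store, pack_store))
```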

Optionally, the video data stored centrally in the distributed storage data set includes internal video resources stored on a local disk or in a distributed file system, and also external video resources stored on any network. Obtaining matching target video data from the distributed storage data set according to the data identifier therefore includes: if the target video data is internal video data, obtaining the matching target video data from the distributed storage data set according to the data identifier; if the target video data is external video data, obtaining the matching target video data from the distributed storage data set after the distributed storage data set has fetched it from the external network according to the data identifier.

Specifically, when target video data is obtained, it is first determined from the data identifier whether the matching target video data is internal video data, which can be done by querying the local file system, CDN cluster, HDFS cluster and FastDFS cluster under the distributed storage data set for the target video data. If it is found, the target video data is internal video data and is obtained directly from the distributed storage data set. If it is not found, the target video data is external video data. To avoid the network traffic costs of frequently accessing external videos and to speed up access, as shown in Fig. 2B, the distributed storage data set then fetches the matching target video data from the external network according to the data identifier, stores it in the distributed storage data set, and sends it to the data supply end for download. In this way, after the distributed storage data set has fetched the target video data from the external network, the data supply end obtains the matching target video data from the distributed storage data set according to the data identifier. Specifically, in this embodiment the external video data is stored in the cache of the distributed storage data set.

Illustratively, when the target video data to be accessed is video data on the public network, the first access to a given external video goes through the distributed storage data set, which fetches the target video data from its source address and stores it in its CDN cache. When the target video data is needed again, it can be read directly from the CDN cache of the distributed storage data set, which saves download traffic and greatly increases the download speed of the target video data. In addition, video data downloaded to a local disk in this embodiment can be uploaded to a dedicated FastDFS in the distributed storage data set, so that during distributed training every training machine obtains the target video data from the same FastDFS, avoiding the cost of copying large amounts of video data to each training machine.
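The internal-versus-external lookup with caching on first access can be sketched as follows. This is a simplified, single-process illustration under stated assumptions: `CachingDataset` and `fetch_external` are hypothetical names, a dictionary stands in for the CDN cache, and a real deployment would use the storage clusters described above.

```python
class CachingDataset:
    """Serve internal videos directly; fetch external ones once, then cache them."""

    def __init__(self, internal, fetch_external):
        self.internal = dict(internal)        # videos already in the data set
        self.cache = {}                       # stand-in for the CDN cache
        self.fetch_external = fetch_external  # callable that downloads from the source

    def get(self, identifier):
        if identifier in self.internal:
            return self.internal[identifier]
        if identifier not in self.cache:
            # First access: download from the source address and keep a copy.
            self.cache[identifier] = self.fetch_external(identifier)
        return self.cache[identifier]


downloads = []


def fake_download(identifier):
    """Hypothetical downloader that records each external fetch."""
    downloads.append(identifier)
    return b"bytes-of-" + identifier.encode()


dataset = CachingDataset({"file:///data/a.mp4": b"video-a"}, fake_download)
dataset.get("http://public.example/v.mp4")
dataset.get("http://public.example/v.mp4")
print(len(downloads))
```

Only the first request for an external video reaches the source address; later requests are answered from the cache.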

Optionally, the data storage in this embodiment is not limited to HDFS, FastDFS and NFS; any storage that satisfies a distributed data storage protocol or a network data protocol can be used. Likewise, the data caching network is not limited to a CDN; any protocol that provides data caching and load balancing can be used.

S230: Process the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

The technical solution of this embodiment obtains matching target video data from the distributed storage data set using two kinds of identifier, single-video identifiers and packed-video identifiers, which suits different training tasks and makes obtaining training data more flexible. External video data is also cached in the distributed storage data set, which increases the download speed of the target video data and improves the training efficiency of the video model.

Embodiment three

Fig. 3 A is a kind of flow chart for model training method that the embodiment of the present invention three provides, and the present embodiment can be applied to appoint In the case where a kind of pair of Video Model is trained.A kind of model training method provided in this embodiment can be implemented by the present invention The model training apparatus that example provides executes, which can be realized by way of software and/or hardware, and be integrated in and hold In the equipment of row this method, which can be any intelligent terminal for carrying corresponding data-handling capacity.

Optionally, this embodiment may include the following steps:

S310: obtain the training data corresponding to the video model according to the above data supply method.

Specifically, the above data supply method is the data supply method provided in any other embodiment of the present invention. By using that data supply method, this embodiment can obtain the training data corresponding to the video model to be trained, and has the same beneficial effects as the data supply method in the above embodiments.

S320: input the training data into the video model to obtain the trained video model.

Optionally, after the training data corresponding to the video model is obtained, it can be input directly into the video model to be trained, and the video model is trained using an existing neural network training method to obtain the trained video model, so that the trained video model can accurately achieve the corresponding video processing purpose for any video data.

Illustratively, as shown in Fig. 3B, inputting the training data into the video model in this embodiment may specifically include: decoding the training data using multiple threads; preprocessing the decoded training data; and inputting the preprocessed training data into the video model.
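The three steps above (multithreaded decoding, preprocessing, input to the model) can be sketched as follows. This is an illustrative sketch only: `decode`, `preprocess` and `load_batch` are hypothetical placeholders, each byte simply stands for one pixel value, and no real video decoder is involved.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def decode(sample: bytes) -> List[int]:
    """Hypothetical decoder: each byte stands for one decoded pixel value."""
    return list(sample)


def preprocess(pixels: List[int]) -> List[float]:
    """Hypothetical preprocessing: scale 8-bit values to the range [0, 1]."""
    return [p / 255.0 for p in pixels]


def load_batch(samples: List[bytes], workers: int = 4) -> List[List[float]]:
    # Step 1: decode on several threads; Executor.map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        decoded = list(pool.map(decode, samples))
    # Step 2: preprocess each decoded sample.
    # Step 3 (not shown) would pass the returned batch to the video model.
    return [preprocess(d) for d in decoded]


batch = load_batch([bytes([0, 255]), bytes([51])])
```

A thread pool is used here because the text names multithreading; whether threads, processes or hardware decoders are the better fit depends on the decoder actually used.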

Specifically, after obtaining the training data corresponding to the video model, this embodiment can load the training data into the memory of the training machine and use multiple threads to decode it into the specified format that matches the video model, ready for subsequent training. The decoded training data is also preprocessed: the preprocessing may include applying data augmentation to the training data and converting it into the format required by the video model. The preprocessed training data is then input into the video model to be trained, to obtain the trained video model.

The video data types that can be decoded in this embodiment include all common video and audio file formats. Decoding can be performed on either the CPU or the GPU. Decoding can start from any specified position in the video data, and decoded frames can be output at a specified number of frames per second (Frames Per Second, FPS). Decoding can also output an aligned video stream and audio stream. Besides the video frames and the audio stream, the decoder additionally provides information such as the original FPS, frame width, frame height, playback duration and bit rate of the video data, whether it contains an audio stream, and the presentation time stamp (Presentation Time Stamp, PTS) of each frame.

The purpose of preprocessing the decoded training data is to apply data augmentation operations to the RGB images and the pulse code modulation (Pulse Code Modulation, PCM) audio stream contained in the decoded training data, and then to convert them into the format required by the video model. Preprocessing of video frames may include common operations such as random cropping, random brightness, random contrast and random scaling. Preprocessing of the audio stream may include random gain transformation, conversion to a log spectrum, conversion to a Mel spectrum, random segment cropping, superimposing two audio segments at a specified energy ratio, and so on. Decoding supports random FPS conversion, and after data augmentation the data can be transposed according to a specified dimension order before output, yielding training data that meets the training requirements of the video model.

Further, after decoding and preprocessing the training data, this embodiment can output the processed training data as Numpy arrays, which meets the input requirements of mainstream video models. In addition, for some video model frameworks that expose a GPU operation interface (such as mxnet), this embodiment also supports loading the processed training data directly into the GPU in the format the framework requires. This operation runs in parallel with training, which saves the time the video model would otherwise spend waiting for training data to be loaded into the GPU. All computation and input/output in this embodiment run in parallel, which makes full use of the resources of the training machine and maximizes processing speed.
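Two of the audio augmentations named above, random segment cropping and superimposing two audio segments at a specified energy ratio, can be sketched in plain Python. This is a sketch under assumptions: "energy" is taken to be the sum of squared samples, the ratio is defined as energy of the first segment divided by energy of the scaled second segment, and the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def energy(x: Sequence[float]) -> float:
    """Energy of a PCM segment, taken here as the sum of squared samples."""
    return sum(v * v for v in x)


def mix_with_energy_ratio(a: Sequence[float], b: Sequence[float], ratio: float) -> List[float]:
    """Scale b so that energy(a) / energy(scaled b) == ratio, then add sample by sample."""
    scale = math.sqrt(energy(a) / (ratio * energy(b)))
    return [x + scale * y for x, y in zip(a, b)]


def random_crop(x: Sequence[float], length: int, rng: random.Random) -> List[float]:
    """Cut a random contiguous segment of the given length out of x."""
    start = rng.randint(0, len(x) - length)  # randint is inclusive on both ends
    return list(x[start:start + length])


a = [1.0, -1.0, 1.0, -1.0]   # energy 4
b = [2.0, 2.0, 2.0, 2.0]     # energy 16, scaled by 0.25 to reach energy 1
mixed = mix_with_energy_ratio(a, b, ratio=4.0)
segment = random_crop(mixed, 2, random.Random(0))
```

With these inputs the second segment is scaled by 0.25, so the first segment carries four times the energy of the second in the mix.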

This embodiment can process the video frames and the audio stream contained in the video data at the same time, so a video model with better metrics and better performance can be trained on synchronized video frames and audio information, which simplifies the training of multi-modal video models. The audio data processing in this embodiment also supports many common audio preprocessing methods and is compatible with most publicly available audio processing approaches. By specifying just a few parameters, the data supply module can output data that meets the input requirements of an open-source model, which greatly simplifies the steps needed to verify the performance of open-source models.

The technical solution provided by this embodiment obtains the training data corresponding to the video model through the data supply method provided by the above embodiments, and inputs that training data into the video model to be trained, which ensures the training efficiency of the video model and improves its performance.

Embodiment four

Fig. 4 is a schematic structural diagram of a data supply device provided by Embodiment four of the present invention. Specifically, as shown in Fig. 4, the device may include:

A training request acquisition module 410, configured to obtain a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to this training;

A target data acquisition module 420, configured to obtain matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of all types;

A training data determination module 430, configured to process the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.
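The division of work among the three modules can be sketched as a small class. This is a hypothetical illustration, not the device itself: `TrainRequest`, `DataSupplier`, the dictionary standing in for the distributed storage data set, and the use of a plain batch size as the "batch processing mechanism" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrainRequest:
    data_ids: List[str]  # data identifiers for this training run
    batch_size: int      # hypothetical stand-in for the batch processing mechanism


class DataSupplier:
    def __init__(self, dataset: Dict[str, bytes]) -> None:
        self._dataset = dataset  # stands in for the distributed storage data set

    def fetch(self, request: TrainRequest) -> List[bytes]:
        """Role of module 420: look up the target video data by identifier."""
        return [self._dataset[i] for i in request.data_ids]

    def make_batches(self, videos: List[bytes], batch_size: int) -> List[List[bytes]]:
        """Role of module 430: group the target video data into batches."""
        return [videos[i:i + batch_size] for i in range(0, len(videos), batch_size)]

    def supply(self, request: TrainRequest) -> List[List[bytes]]:
        """Role of module 410 plus the rest: accept a request, return training batches."""
        return self.make_batches(self.fetch(request), request.batch_size)


supplier = DataSupplier({"v1": b"a", "v2": b"b", "v3": b"c"})
batches = supplier.supply(TrainRequest(data_ids=["v1", "v2", "v3"], batch_size=2))
```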

Further, the above target data acquisition module 420 may be specifically configured to:

If the data identifier is a single-video identifier, obtain the matching single video data from the distributed storage data set according to the single-video identifier;

If the data identifier is a packed-video identifier, obtain the matching packed video data from the distributed storage data set according to the packed-video identifier.
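The two lookup modes above can be sketched as a simple dispatch on the identifier. This is a sketch under assumptions: the text does not say how the two identifier kinds are told apart, so this example simply keeps them in two hypothetical in-memory tables, `singles` and `packs`.

```python
from typing import Dict, List


def resolve_videos(
    data_id: str,
    singles: Dict[str, bytes],
    packs: Dict[str, List[bytes]],
) -> List[bytes]:
    """Return the video(s) matching data_id from a toy in-memory data set."""
    if data_id in packs:      # packed-video identifier: several videos at once
        return list(packs[data_id])
    if data_id in singles:    # single-video identifier: exactly one video
        return [singles[data_id]]
    raise KeyError(f"no video data matches identifier {data_id!r}")


singles = {"video-001": b"clip-1"}
packs = {"pack-A": [b"clip-2", b"clip-3"]}
one = resolve_videos("video-001", singles, packs)
many = resolve_videos("pack-A", singles, packs)
```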

Further, the above target data acquisition module 420 may also be specifically configured to:

If the target video data is internal video data, obtain the matching target video data from the distributed storage data set according to the data identifier;

If the target video data is external video data, after the distributed storage data set has obtained the matching target video data from the external network, obtain the matching target video data from the distributed storage data set according to the data identifier.

Further, the above external video data is stored in the cache of the distributed storage data set.

Further, the above batch processing mechanism includes the grouping mode for the target video data.

The data supply device provided by this embodiment is applicable to the data supply method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.

Embodiment five

Fig. 5 is a schematic structural diagram of a model training apparatus provided by Embodiment five of the present invention. Specifically, as shown in Fig. 5, the apparatus may include:

A training data acquisition module 510, configured to obtain the training data corresponding to the video model according to the data supply method in any embodiment of the present invention;

A video model training module 520, configured to input the training data into the video model to obtain the trained video model.

Further, the above video model training module 520 may be specifically configured to:

Decode the training data using multiple threads;

Preprocess the decoded training data;

Input the preprocessed training data into the video model.

The model training apparatus provided by this embodiment is applicable to the model training method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.

Embodiment six

Fig. 6 is a schematic diagram of a data supply system provided by Embodiment six of the present invention. This embodiment mainly describes the training data supply process of the video model in detail. Referring to Fig. 6, the data supply system 60 of this embodiment may include a distributed data storage end 610, a batch loading end 620, and a data supply end 630 connected to both the distributed data storage end 610 and the batch loading end 620.

The distributed data storage end 610 stores the data set in a distributed manner; the batch loading end 620 stores the batch processing mechanism and generates the training request; the data supply end 630 is provided with the data supply device provided by any embodiment of the present invention.

Specifically, for the construction principles of the distributed data storage end 610, the batch loading end 620 and the data supply end 630 included in the data supply system 60, refer to the description of the data supply method provided by the embodiments of the present invention; they are not described in detail here.

Embodiment seven

Fig. 7 is a schematic diagram of a model training system provided by Embodiment seven of the present invention. This embodiment mainly describes the training data supply process of the video model in detail. Referring to Fig. 7, the model training system 70 of this embodiment may include a distributed data storage end 710, a batch loading end 720, and a model training end 730 connected to both the distributed data storage end 710 and the batch loading end 720.

The distributed data storage end 710 stores the data set in a distributed manner; the batch loading end 720 stores the batch processing mechanism and generates the training request; the model training end 730 is provided with the model training apparatus provided by any embodiment of the present invention.

Specifically, for the construction principles of the distributed data storage end 710, the batch loading end 720 and the model training end 730 included in the model training system 70, refer to the description of the model training method provided by the embodiments of the present invention; they are not described in detail here.

Embodiment eight

Fig. 8 is a schematic structural diagram of a device provided by Embodiment eight of the present invention. As shown in Fig. 8, the device includes a processor 80, a storage device 81 and a communication device 82. The device may have one or more processors 80; one processor 80 is taken as an example in Fig. 8. The processor 80, the storage device 81 and the communication device 82 in the device may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 8.

As a computer-readable storage medium, the storage device 81 can be used to store software programs, computer-executable programs and modules, such as the program instructions/modules corresponding to the data supply method or the model training method provided in the embodiments of the present invention. By running the software programs, instructions and modules stored in the storage device 81, the processor 80 performs the various functional applications and data processing of the device, that is, implements the above data supply method or model training method.

The storage device 81 may mainly include a program storage area and a data storage area. The program storage area can store an operating system and the application program required by at least one function; the data storage area can store data created according to the use of the terminal, and so on. In addition, the storage device 81 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some examples, the storage device 81 may further include memory located remotely from the processor 80, and such remote memory can be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The communication device 82 can be used to implement network connections or mobile data connections between devices.

The device provided by this embodiment can be used to perform the data supply method or the model training method provided by any of the above embodiments, and has the corresponding functions and beneficial effects.

Embodiment nine

Embodiment nine of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program can implement the data supply method in any of the above embodiments. The method may specifically include:

Obtaining a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to this training;

Obtaining matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of all types;

Processing the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

Alternatively, the program implements the model training method in any of the above embodiments. The method may specifically include:

Obtaining the training data corresponding to the video model according to the data supply method in any embodiment of the present invention;

Inputting the training data into the video model to obtain the trained video model.

Of course, in the storage medium containing computer-executable instructions provided by the embodiments of the present invention, the computer-executable instructions are not limited to the method operations described above, and can also perform related operations in the data supply method or the model training method provided by any embodiment of the present invention.

From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware, and of course also by hardware alone, but in many cases the former is the better implementation. Based on this understanding, the part of the technical solution of the present invention that is essential, or that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), flash memory (FLASH), hard disk or optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.

It is worth noting that, in the above embodiments of the data supply device or the model training apparatus, the units and modules included are divided only according to functional logic; the division is not limited to the one described above, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A data supply method, characterized by comprising:

Obtaining a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to this training;

Obtaining matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of all types;

Processing the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

2. The method according to claim 1, characterized in that obtaining matching target video data from the distributed storage data set according to the data identifier comprises:

If the data identifier is a single-video identifier, obtaining the matching single video data from the distributed storage data set according to the single-video identifier;

If the data identifier is a packed-video identifier, obtaining the matching packed video data from the distributed storage data set according to the packed-video identifier.

3. The method according to claim 1 or 2, characterized in that obtaining matching target video data from the distributed storage data set according to the data identifier comprises:

If the target video data is internal video data, obtaining the matching target video data from the distributed storage data set according to the data identifier;

If the target video data is external video data, after the distributed storage data set has obtained the matching target video data from the external network, obtaining the matching target video data from the distributed storage data set according to the data identifier.

4. The method according to claim 3, characterized in that the external video data is stored in the cache of the distributed storage data set.

5. The method according to claim 1 or 2, characterized in that the batch processing mechanism includes the grouping mode for the target video data.

6. A model training method, characterized by comprising:

Obtaining the training data corresponding to a video model according to the data supply method of any one of claims 1 to 5;

7. The method according to claim 6, characterized in that inputting the training data into the video model comprises:

Decoding the training data using multiple threads;

Preprocessing the decoded training data;

Inputting the preprocessed training data into the video model.

8. A data supply device, characterized by comprising:

A training request acquisition module, configured to obtain a training request for a video model, the training request including a preset batch processing mechanism and the data identifier corresponding to this training;

A target data acquisition module, configured to obtain matching target video data from a distributed storage data set according to the data identifier, the distributed storage data set including video data of all types;

A training data determination module, configured to process the target video data according to the batch processing mechanism to obtain the training data corresponding to the video model.

9. A model training apparatus, characterized by comprising:

A training data acquisition module, configured to obtain the training data corresponding to a video model according to the data supply method of any one of claims 1 to 5;

A video model training module, configured to input the training data into the video model to obtain the trained video model.

10. A data supply system, characterized by comprising: a distributed data storage end, a batch loading end, and a data supply end connected to both the distributed data storage end and the batch loading end;

The distributed data storage end stores the data set in a distributed manner; the batch loading end stores the batch processing mechanism and generates the training request; the data supply end is provided with the data supply device according to claim 8.

11. A model training system, characterized by comprising: a distributed data storage end, a batch loading end, and a model training end connected to both the distributed data storage end and the batch loading end;

The distributed data storage end stores the data set in a distributed manner; the batch loading end stores the batch processing mechanism and generates the training request; the model training end is provided with the model training apparatus according to claim 9.

12. A device, characterized in that the device comprises:

One or more processors;

A storage device for storing one or more programs;

When the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the data supply method of any one of claims 1 to 5, or to implement the model training method of claim 6 or 7.

13. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the data supply method of any one of claims 1 to 5, or implements the model training method of claim 6 or 7.