CN106503693A

CN106503693A - Method and device for providing a video cover

Info

Publication number: CN106503693A
Application number: CN201611059438.2A
Authority: CN
Inventors: 赵彦宾; 姜东�; 洪定坤; 夏绪宏
Original assignee: Beijing ByteDance Technology Co Ltd
Current assignee: Beijing Douyin Information Service Co Ltd
Priority date: 2016-11-28
Filing date: 2016-11-28
Publication date: 2017-03-15
Anticipated expiration: 2036-11-28
Also published as: CN106503693B

Abstract

Embodiments of the present application disclose a method and device for providing a video cover. The method includes: receiving a video file uploaded by a user, determining scene-change key frames according to changes in the content of adjacent frames in the video file, and capturing the pictures corresponding to the scene-change key frames; scoring and ranking the captured pictures with a pre-trained machine learning model for picture classification; and, according to the ranking, providing a preset number of the highest-scoring pictures to the user as candidate pictures for the video cover, so that the user can select the video cover from the candidate pictures. In this way, it can be ensured that no important scene in the video file is omitted, repetition among the candidate cover pictures can be reduced, the quality of the candidate pictures is improved, and the user can more easily choose a more suitable video cover.

Description

Method and device for providing a video cover

Technical field

The present application relates to the field of computer technology, and in particular to a method and device for providing a video cover.

Background art

When we watch videos on a video website, each video on a page has a video cover. The quality of the picture used as the video cover is a key factor in attracting users to click on the video, and it is especially important for the short videos that are currently popular.

Existing schemes for selecting a video cover typically capture pictures from the video at fixed time points (for example, the video is divided evenly by duration into several sub-videos, and the time point at which each sub-video starts playing is used as a fixed time point) and offer them to the user as candidate pictures for the video cover. Cover pictures obtained this way are often blurred or out of focus, or are too plain and contain no salient person or object.

With the rapid development of deep machine learning and the great progress it has brought to image and speech recognition, YouTube proposed a scheme for automatically generating video thumbnails based on deep machine learning in order to solve the problems of the above cover-selection scheme. The scheme can use a deep neural network (DNN): pictures uploaded by users as video covers serve as the "high-quality" training set, pictures captured at random from video files serve as the "low-quality" training set, and a DNN-based machine learning model is trained in advance on the two sets. When a video thumbnail is generated, pictures are first captured at random from the video file (for example, one frame per second), the captured pictures are scored by the pre-trained DNN machine learning model, and the best picture is then chosen from the highest-scoring pictures (possibly several) as the video cover. In a manual evaluation, in which evaluators compared the video covers produced by the DNN machine learning model with those produced by the fixed-time-point scheme, 65% of the evaluators judged the covers produced by the DNN machine learning model to be better.

However, this scheme also has the following shortcomings:

First, directly using the pictures uploaded by users as the "high-quality" training set and the pictures captured from videos at set time points as the "low-quality" training set introduces a large amount of "dirty data": the pictures uploaded by users may include many poor-quality pictures, and the pictures captured from videos at set time points may include many good-quality ones. A training set containing such "dirty data" directly prevents the trained machine learning model from achieving a good classification effect.

Second, when the video file is long, this capture method makes the captured pictures highly repetitive, so the cover pictures finally provided to the user are likely to be highly repetitive as well.

Summary of the invention

The present application provides a method and device for providing a video cover, which can ensure that no important scene in the video file is omitted while reducing repetition among the candidate cover pictures, improving the quality of the candidate pictures and making it easier for the user to choose a more suitable video cover.

The present application provides the following scheme:

A method for providing a video cover, including:

receiving a video file uploaded by a user, determining scene-change key frames according to changes in the content of adjacent frames in the video file, and capturing the pictures corresponding to the scene-change key frames;

scoring and ranking the captured pictures with a pre-trained machine learning model for picture classification;

according to the ranking, providing a preset number of the highest-scoring pictures to the user as candidate pictures for the video cover, so that the user can select the video cover from the candidate pictures.

Optionally, the method further includes:

receiving the user's instruction selecting any picture among the candidate pictures;

determining the picture selected by the user as the video cover.

Optionally, determining scene-change key frames according to changes in the content of adjacent frames in the video file and capturing the pictures corresponding to the scene-change key frames includes:

judging whether the content change between two adjacent frames in the video file exceeds a preset change threshold;

determining a frame whose change exceeds the preset change threshold as a scene-change key frame;

capturing the pictures corresponding to the scene-change key frames, and forming the captured pictures into a scene-change key frame picture set.

Optionally, training the machine learning model for picture classification includes:

determining the picture data used for training the machine learning model;

iteratively training a convolutional neural network (CNN) machine learning model on the picture data, and adjusting the weights of the convolutional neural network during the iterative training, so as to obtain, on the basis of the CNN machine learning model, a CNN machine learning model for picture classification;

evaluating the CNN machine learning model for picture classification;

if the evaluation passes, ending the training and taking that model as the trained CNN machine learning model for picture classification.

Optionally, the method further includes:

if the evaluation does not pass, adjusting the parameters of the algorithm used in the CNN machine learning model for picture classification, continuing the iterative training of the model on the picture data after the parameter adjustment, and adjusting the weights of the convolutional neural network during the iterative training, until the resulting CNN machine learning model for picture classification passes the evaluation.

Optionally, determining the picture data used for training the machine learning model includes:

obtaining a base picture data set;

obtaining the color feature parameter values of the pictures in the base picture data set;

removing, according to the color feature parameter values, the pictures in the base picture data set that do not meet a preset condition, so as to obtain the picture data used for training the machine learning model.

Optionally, the base picture data set includes: a first data set containing pictures uploaded by users, and a second data set containing pictures captured at random at a preset time interval;

the color feature parameter values include a hue value, a saturation value, and a brightness value;

removing, according to the color feature parameter values, the pictures in the base picture data set that do not meet the preset condition, so as to obtain the picture data used for training the machine learning model, includes:

computing, according to preset color feature weights, a weighted sum of the color feature parameter values of each picture, so as to obtain the color feature value of each picture;

removing the pictures in the first data set whose color feature value is lower than a first preset score and the pictures in the second data set whose color feature value is higher than a second preset score, so as to obtain a first-type data set and a second-type data set, respectively, as the picture data used for training the machine learning model.

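Optionally, the color feature parameter values include a hue value, a saturation value, and an RGB value, in which case removing the pictures that do not meet the preset condition includes: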

removing the pictures in the first data set whose hue value is lower than a first preset hue threshold and the pictures in the second data set whose hue value is higher than a second preset hue threshold;

removing the pictures in the first data set whose saturation value is lower than a first preset saturation threshold and the pictures in the second data set whose saturation value is higher than a second preset saturation threshold;

removing the black-and-white pictures in the first data set according to the RGB value;

taking the pictures remaining in the first data set and the second data set as the first-type data set and the second-type data set, respectively, as the picture data used for training the machine learning model.

Optionally, after removing the pictures in the first data set whose color feature value is lower than the first preset score and the pictures in the second data set whose color feature value is higher than the second preset score, the method further includes:

judging the similarity between the remaining pictures within the first data set and within the second data set, respectively, and, according to the result, retaining only one picture from among the pictures whose similarity reaches a preset similarity threshold, so that the pictures remaining in the first data set and the second data set serve as the first-type data set and the second-type data set, respectively.

A device for providing a video cover, including:

a capture unit, configured to receive a video file uploaded by a user, determine scene-change key frames according to changes in the content of adjacent frames in the video file, and capture the pictures corresponding to the scene-change key frames;

a scoring unit, configured to score and rank the captured pictures with a pre-trained machine learning model for picture classification;

a candidate picture providing unit, configured to provide, according to the ranking, a preset number of the highest-scoring pictures to the user as candidate pictures for the video cover, so that the user can select the video cover from the candidate pictures.

Optionally, the device further includes:

an instruction receiving unit, configured to receive the user's instruction selecting any picture among the candidate pictures;

a video cover determining unit, configured to determine the picture selected by the user as the video cover.

Optionally, the capture unit is specifically configured to:

determine a frame whose change exceeds the preset change threshold as a scene-change key frame;

According to the specific embodiments provided by the present application, the application discloses the following technical effects:

With the embodiments of the present application, after a video file uploaded by a user is received, scene-change frames can be determined according to changes in the content of adjacent frames in the video file and the corresponding pictures captured. The captured pictures can then be scored and ranked by a pre-trained machine learning model for picture classification, and, according to the ranking, a preset number of the highest-scoring pictures are provided to the user as candidate pictures for the video cover, so that the user can select the video cover from them. In this way, it can be ensured that no important scene in the video file is omitted, repetition among the candidate cover pictures can be reduced, the quality of the candidate pictures is improved, and the user can more easily choose a more suitable video cover.

Of course, a product implementing the present application does not necessarily need to achieve all of the above advantages at the same time.

Description of the drawings

To explain the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of the method provided by an embodiment of the present application;

Fig. 2 is a flowchart of the training of the machine learning model for picture classification in the method provided by an embodiment of the present application;

Fig. 3-1 to Fig. 3-3 are schematic diagrams of experimental data for the method provided by an embodiment of the present application;

Fig. 4 is a schematic diagram of the device provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application fall within the scope of protection of the present application.

Referring to Fig. 1, an embodiment of the present application first provides a method for providing a video cover, which may include the following steps:

S101: receive a video file uploaded by a user, determine scene-change key frames according to changes in the content of adjacent frames in the video file, and capture the pictures corresponding to the scene-change key frames.

In general, a video website can not only offer users the video files built into its servers, but can also offer any user-uploaded video file to other users once it has been received. In this embodiment, after a video file uploaded by a user is received, the scene changes in the video file (which can also be understood as shot switches) can be determined first. For example, the content change between adjacent frames in the video file can be obtained, and it can be judged whether the change between two adjacent frames exceeds a preset change threshold; a frame whose change exceeds the threshold is determined to be a scene-change key frame. The pictures corresponding to the determined scene-change key frames can then be captured, and all captured pictures can further be formed into a scene-change key frame picture set for use in subsequent steps. This ensures that the scenes at which the video changes (which can be regarded as the important scenes) are not omitted, while also reducing repetition among the captured pictures.
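The thresholding described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: it assumes each frame has already been decoded into a flat list of grayscale pixel values, and it uses the mean absolute pixel difference as the measure of content change. The names `frame_change` and `scene_change_key_frames` and the threshold value are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flattened grayscale pixels, 0-255 (assumed representation)


def frame_change(prev: Frame, curr: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(prev) != len(curr) or not prev:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)


def scene_change_key_frames(frames: List[Frame], change_threshold: float) -> List[int]:
    """Indices of frames whose change from the previous frame exceeds the threshold."""
    key_frames = []
    for i in range(1, len(frames)):
        if frame_change(frames[i - 1], frames[i]) > change_threshold:
            key_frames.append(i)
    return key_frames


# Three dark frames followed by two bright frames: the only scene change is at index 3.
frames = [[10, 10, 10, 10]] * 3 + [[200, 200, 200, 200]] * 2
print(scene_change_key_frames(frames, change_threshold=30.0))
```

A real system would decode frames with a video library and might compare histograms rather than raw pixels; the thresholding logic stays the same.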

In practice, the scene changes in the video file can also be judged from changes in the bit rate of the video file; pictures are then captured from the video file according to the scene changes, yielding the screenshots corresponding to each scene change. This capture method can ensure as far as possible that no important scene in the video file is omitted, while also reducing repetition among the captured pictures.

In addition, the scene changes in the video file can be determined in other ways. For example, the similarity of pictures in the video file can be judged from features such as the gray-level histogram of a picture or scale-invariant feature transform (SIFT) features. Pictures can first be captured at a preset frequency (for example, one frame every 2 seconds); the similarity between the captured pictures is then judged using existing picture-similarity techniques, and, according to the result, only one of the pictures with high similarity (for example, similarity reaching a preset similarity value) is retained. This also achieves the purpose of determining the scene changes in the video file.

S102: score and rank the captured pictures with a pre-trained machine learning model for picture classification.
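As a rough sketch of the scoring and ranking in S102, the function below takes any scoring function and returns the preset number of highest-scoring pictures. The scoring function here is a stand-in lookup table, not a trained model, and the name `top_candidates` is hypothetical.

```python
from typing import Callable, List, Tuple, TypeVar

Picture = TypeVar("Picture")


def top_candidates(
    pictures: List[Picture],
    score_fn: Callable[[Picture], float],
    preset_count: int,
) -> List[Tuple[Picture, float]]:
    """Score every picture, sort by score (highest first), keep the first preset_count."""
    scored = [(picture, score_fn(picture)) for picture in pictures]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:preset_count]


# Stand-in scores; in the described method these would come from the trained model.
fake_scores = {"a.jpg": 0.2, "b.jpg": 0.9, "c.jpg": 0.5}
print(top_candidates(list(fake_scores), fake_scores.get, preset_count=2))
```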

Referring to Fig. 2, in this embodiment the training of the machine learning model for picture classification may include the following steps:

Step 1: determine the picture data used for training the machine learning model.

In this embodiment, a deep machine learning model can be used (deep learning includes supervised and unsupervised learning, and the models built under different learning frameworks differ). For example, a deep convolutional neural network (CNN), which is a deep supervised learning model, can be used; of course, other suitable deep machine learning models may also be used as needed.

In general, the data used for training a machine learning model can be divided into three parts: a training data set, a test data set, and a validation data set, whose proportions may be set to 80%, 10%, and 10%. For a supervised machine learning model, obtaining the training data is one of the most important steps, and high-quality data is the key to training the model.
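The 80%/10%/10% split can be illustrated with a short sketch. The function name `split_dataset`, the fixed seed, and the choice to shuffle before splitting are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    items: Sequence[T],
    train_ratio: float = 0.8,
    test_ratio: float = 0.1,
    seed: int = 0,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the items and split them into training, test and validation sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_test = int(len(shuffled) * test_ratio)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains (10% by default)
    return train, test, validation


train, test, validation = split_dataset(range(100))
print(len(train), len(test), len(validation))
```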

On this basis, in a specific implementation, to determine the picture data used for training the machine learning model, a base picture data set can be obtained first. The base picture data set may include a first data set containing pictures uploaded by users and a second data set containing pictures captured at random at a preset time interval.

On existing video websites, video covers come from two main sources: either the user who uploads the video also uploads a picture as the cover, or, as described above, the system captures pictures at random at a preset time interval, selects several of them and offers them to the user, who chooses one as the video cover. On the one hand, the pictures uploaded by users themselves are generally carefully chosen and of good quality, although some poor-looking pictures cannot be ruled out; such pictures can serve as the first data set (which can also be understood as the relatively high-quality data set). On the other hand, because the system captures pictures at random, the quality of the pictures offered to the user varies widely, although some good-quality pictures cannot be ruled out; such pictures can serve as the second data set (which can also be understood as the relatively low-quality data set). In this embodiment, the first data set and the second data set together can be taken as the base picture data for training the machine learning model.

After the base picture data is obtained, the color feature parameter values of the pictures in the base picture data set (which may include, for example, hue, saturation, brightness, and RGB values) can be obtained, and the pictures that do not meet the preset condition can then be removed from the base picture data set according to these values, so as to obtain the picture data used for training the machine learning model.

Selecting a video cover is a highly subjective task with no objective criterion. The quality of a picture depends largely on subjective human factors, and different people have different views and preferences; rich colors, eye-catching people or objects, sharpness, contrast, saturation and so on are all factors that affect the quality of a picture.

Therefore, in one implementation, the color feature parameter values of the pictures in the base picture data set can be obtained first. These may include HSV (hue, saturation, value (brightness)) values and the like; color feature values of a picture, such as its color saturation, lightness, and contrast, can then be computed from the obtained hue, saturation, and brightness values. Of course, HSL (hue, saturation, lightness) values and the like may be obtained in place of the HSV values for the subsequent steps, as needed.

Color feature weights can be set for the above color feature parameter values in advance on the basis of past experience, for example: a color saturation weight of 0.7, a brightness weight of 1, a hue weight of 0.8, and so on. Then, according to the preset color feature weights, a weighted calculation is performed on the color feature parameter values of each picture to obtain the color feature value of each picture.
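A minimal sketch of the weighted sum, using the example weights given above (hue 0.8, saturation 0.7, brightness 1). The text does not say how per-pixel values are aggregated into one value per picture, so averaging each HSV channel over all pixels is an assumption here (and a crude one for hue, which is circular). The names `mean_hsv` and `color_feature_score` are hypothetical.

```python
import colorsys
from typing import Dict, Iterable, Tuple

RGB = Tuple[int, int, int]

# Example weights from the text; real values would be tuned from experience.
WEIGHTS: Dict[str, float] = {"hue": 0.8, "saturation": 0.7, "brightness": 1.0}


def mean_hsv(pixels: Iterable[RGB]) -> Tuple[float, float, float]:
    """Average hue, saturation and value over all pixels, each in [0, 1]."""
    h_sum = s_sum = v_sum = 0.0
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        h_sum += h
        s_sum += s
        v_sum += v
        count += 1
    if count == 0:
        raise ValueError("picture has no pixels")
    return h_sum / count, s_sum / count, v_sum / count


def color_feature_score(pixels: Iterable[RGB], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the picture's mean hue, saturation and brightness."""
    h, s, v = mean_hsv(pixels)
    return weights["hue"] * h + weights["saturation"] * s + weights["brightness"] * v


red_picture = [(255, 0, 0)] * 4        # fully saturated, fully bright
gray_picture = [(128, 128, 128)] * 4   # no saturation, medium brightness
print(color_feature_score(red_picture), color_feature_score(gray_picture))
```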

Next, according to the color feature value of each picture, the pictures in the first data set whose color feature value is lower than the first preset score (pictures with a low color feature value and poor quality) can be removed to obtain the first-type data set (for example, the high-quality data), and the pictures in the second data set whose color feature value is higher than the second preset score (pictures with a high color feature value and good quality) can be removed to obtain the second-type data set (for example, the low-quality data). The first-type and second-type data sets can then be used as the picture data for training the machine learning model.
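The two-sided cleaning step might look like the sketch below. It assumes each data set is given as a mapping from picture name to an already computed color feature value; the function name and the two score values are illustrative only.

```python
from typing import Dict, List, Tuple


def clean_training_sets(
    first_scores: Dict[str, float],
    second_scores: Dict[str, float],
    first_preset_score: float,
    second_preset_score: float,
) -> Tuple[List[str], List[str]]:
    """Drop low scorers from the first set and high scorers from the second set."""
    first_type = [name for name, score in first_scores.items() if score >= first_preset_score]
    second_type = [name for name, score in second_scores.items() if score <= second_preset_score]
    return first_type, second_type


user_uploaded = {"u1.jpg": 1.9, "u2.jpg": 0.4, "u3.jpg": 1.5}     # first data set
randomly_captured = {"r1.jpg": 0.3, "r2.jpg": 1.8, "r3.jpg": 0.6}  # second data set
high_quality, low_quality = clean_training_sets(user_uploaded, randomly_captured, 1.0, 1.0)
print(high_quality, low_quality)
```

Note that the two sets are filtered in opposite directions: the aim is to make the first set more reliably "good" and the second more reliably "bad", not to raise the quality of both.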

In another implementation, the color feature parameter values of the pictures in the base picture data set can be obtained first; these may include hue, saturation, and RGB (red, green, blue) values. The pictures that do not meet the preset condition can then be removed according to the obtained hue, saturation, and RGB values, respectively.

In a specific implementation, the pictures in the first data set whose hue value is lower than a first preset hue threshold and the pictures in the second data set whose hue value is higher than a second preset hue threshold can be removed; that is, pictures with relatively poor hue are removed from the first data set and pictures with relatively good hue are removed from the second data set. This reduces the number of pictures in both data sets, and thus the computation and time needed to train the machine learning model, while also improving the quality of the two data sets.

Then, the pictures in the first data set whose saturation value is lower than a first preset saturation threshold and the pictures in the second data set whose saturation value is higher than a second preset saturation threshold can also be removed; that is, pictures with relatively poor color saturation are removed from the first data set and pictures with relatively good color saturation are removed from the second data set. This likewise reduces the number of pictures in both data sets, and thus the computation and time needed to train the machine learning model, while also improving the quality of the two data sets.

In addition, to further improve picture quality in the first data set, black-and-white pictures (which can also be regarded as pure grayscale pictures) are not pictures we want to keep; that is, they are not pictures we want to offer to users as video covers. The black-and-white pictures in the first data set are therefore removed according to the RGB value; that is, pictures in the first data set that contain no chroma information (for example, all three RGB components are 0, or all three are 255) are removed. This reduces the number of pictures in the first data set, and thus the computation and time needed to train the machine learning model, while also improving the quality of the pictures in the first data set.
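One simple way to detect a picture without chroma information is to check that the red, green and blue components of every pixel are equal, which covers the all-0 and all-255 cases mentioned above as well as intermediate grays. The sketch below makes that assumption; the `tolerance` parameter is an addition for pictures with slight compression noise, and the function name is hypothetical.

```python
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def is_black_and_white(pixels: Iterable[RGB], tolerance: int = 0) -> bool:
    """True if every pixel has (nearly) equal R, G and B, i.e. carries no chroma."""
    return all(max(pixel) - min(pixel) <= tolerance for pixel in pixels)


grayscale_picture = [(0, 0, 0), (128, 128, 128), (255, 255, 255)]
color_picture = [(0, 0, 0), (255, 0, 0)]
print(is_black_and_white(grayscale_picture), is_black_and_white(color_picture))
```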

Then, the pictures remaining in the first data set and the second data set are taken as the first-type data set (the high-quality data set) and the second-type data set (the low-quality data set), respectively, as the picture data for training the machine learning model.

In this way, the "dirty data" in the prior-art training sets (the pictures that do not meet the preset condition) can be "cleaned" away, including the poor-quality pictures among those uploaded by users and the good-quality pictures among those captured at random by the system. This solves the problem that the trained machine learning model cannot achieve the desired classification effect because of "dirty data".

In practice, to further reduce the computation needed to train the machine learning model, each picture can also be resized to a preset size before the weighted sum of its color feature parameter values is computed according to the preset color feature weights.

Because the pictures captured by the system may be fairly large, a resize operation can be applied before the weighted sum is computed, scaling each picture uniformly while keeping its aspect ratio, to meet the requirements of the machine learning model. For example, a picture whose original size is 1000*2000 can be resized to 100*200. This operation changes only the size of the picture and does not distort it, which effectively reduces the computation needed to train the machine learning model and increases speed.
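A small sketch of aspect-preserving resizing: `scaled_size` computes the target dimensions from a target width, and `resize_nearest` downsamples a 2D grid of pixel values by nearest-neighbor sampling. Both names are hypothetical, and nearest-neighbor sampling is only one of several possible interpolation choices; an image library would normally do this step.

```python
from typing import List, Tuple


def scaled_size(width: int, height: int, target_width: int) -> Tuple[int, int]:
    """Target (width, height) that keeps the original aspect ratio."""
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("sizes must be positive")
    return target_width, max(1, round(height * target_width / width))


def resize_nearest(pixels: List[List[int]], new_width: int, new_height: int) -> List[List[int]]:
    """Nearest-neighbor resize of a row-major 2D pixel grid."""
    old_height, old_width = len(pixels), len(pixels[0])
    return [
        [pixels[y * old_height // new_height][x * old_width // new_width] for x in range(new_width)]
        for y in range(new_height)
    ]


print(scaled_size(1000, 2000, 100))                      # the 1000*2000 example from the text
print(resize_nearest([[1, 2, 3, 4], [5, 6, 7, 8]], 2, 1))
```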

In practice, because the pictures captured at random by the system may include some that are highly similar, only one of the highly similar pictures may be retained, to improve the quality of the pictures in the training data set and reduce the number of pictures in it.

In this embodiment, after the pictures in the first data set whose color feature value is lower than the first preset score and the pictures in the second data set whose color feature value is higher than the second preset score have been removed, the similarity between the remaining pictures in the first data set and in the second data set can be judged, for example from the gray-level histogram features of the pictures. Specifically, the pixel data of each picture can be obtained and a histogram generated for each picture; the histograms are then normalized, and the Bhattacharyya coefficient is computed on the histogram data to give the similarity value between pictures. The value lies in [0, 1], where 0 means completely different and 1 means extremely similar (or identical). The similarity judgment can then be made from the similarity values obtained.
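The histogram comparison can be sketched as follows, assuming each picture is available as a flat list of gray levels in 0-255. For two normalized histograms $p$ and $q$, the Bhattacharyya coefficient is $\sum_i \sqrt{p_i q_i}$, which is 1 for identical histograms and 0 for histograms with no overlap. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 256) -> List[float]:
    """Normalized gray-level histogram: entries are fractions that sum to 1."""
    if not pixels:
        raise ValueError("picture has no pixels")
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // 256] += 1
    return [count / len(pixels) for count in counts]


def bhattacharyya_coefficient(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(hist_a, hist_b))


high_contrast = gray_histogram([0, 0, 255, 255])
mid_gray = gray_histogram([128, 128, 128, 128])
print(bhattacharyya_coefficient(high_contrast, high_contrast))
print(bhattacharyya_coefficient(high_contrast, mid_gray))
```

A histogram ignores where pixels are located, so two different pictures with the same gray-level distribution score as identical; this is a known limitation of the feature rather than of the sketch.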

Then, according to the result, one picture can be chosen and retained from among the pictures whose similarity reaches the preset similarity threshold (for example, a similarity value of at least 0.8); that is, only one of the highly similar pictures is kept and the others are removed. The pictures remaining in the first data set and the second data set then serve as the first-type data set and the second-type data set, respectively. This further reduces the number of pictures in the first-type and second-type data sets while keeping the coverage of picture features in them comprehensive, which further improves the quality of the training data, reduces the number of pictures, and thus reduces the computation needed to train the machine learning model and increases speed.
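The text does not specify which of the similar pictures is kept. One possible strategy, sketched here as an assumption, is a greedy pass that keeps a picture only if it is not too similar to any picture already kept. The similarity function is passed in as a parameter; the numeric "pictures" and the toy similarity below are purely for demonstration.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def drop_near_duplicates(
    pictures: List[T],
    similarity: Callable[[T, T], float],
    threshold: float = 0.8,
) -> List[T]:
    """Greedily keep pictures whose similarity to every kept picture is below the threshold."""
    kept: List[T] = []
    for picture in pictures:
        if all(similarity(picture, other) < threshold for other in kept):
            kept.append(picture)
    return kept


def toy_similarity(a: float, b: float) -> float:
    """Stand-in similarity for the demo: 1 for equal numbers, falling with distance."""
    return 1.0 - abs(a - b)


print(drop_near_duplicates([0.0, 0.1, 0.5, 0.55, 1.0], toy_similarity))
```

The greedy pass compares each picture against every kept picture, so its cost grows quadratically with the number of distinct pictures.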

Step 2: iteratively train a pre-trained convolutional neural network (CNN) machine learning model on the picture data, adjusting the weights of the convolutional neural network during the iterative training, so as to obtain, on the basis of the pre-trained CNN machine learning model, a CNN machine learning model for picture classification.

Training a machine learning model on a large data set usually takes a long time, so the idea of transfer learning can be introduced: transfer learning can be carried out using the convolutional neural network (CNN) defined by Inception-v3. Inception-v3 was trained on the data set of the 2012 ImageNet Large Visual Recognition Challenge, a standard task in computer vision in which the whole image set is classified into 1000 classes; the top-5 error rate of Inception-v3 is 3.46%.

In a specific implementation, starting from the already trained CNN machine learning model defined by Inception-v3, repeated iterative training and adjustment of the neural network weights can yield a CNN machine learning model for picture classification that meets the requirements, which increases the extensibility and flexibility of the model.

Step 3: evaluate the CNN machine learning model.

In this embodiment, the model can first be evaluated on the 10% validation data set mentioned above. However, this evaluation alone may not reveal whether the CNN machine learning model is over-fitted: the accuracy on the validation data set may be very high while the results in actual use are unsatisfactory, which would ultimately impair the accuracy of the CNN machine learning model in classifying pictures.

Therefore, a manual evaluation can also be carried out. For example, a video file can be selected at random and a number of pictures (for example, 100) captured from it at random; the CNN machine learning model scores and ranks these 100 pictures; the several highest-scoring pictures (for example, the first 8 in the ranking) are then compared with the several lowest-scoring pictures (for example, the last 8 in the ranking), and the CNN machine learning model is evaluated on the basis of the comparison.

On the basis of the above manual evaluation, a second manual evaluation can also be carried out. For example, a video file can be chosen arbitrarily and a few pictures (for example, 8) captured from it at a preset time interval (for example, once every 2 seconds); these 8 captured pictures are compared with the 8 highest-scoring pictures chosen in the first manual evaluation, and the machine learning model is evaluated again on the basis of the comparison.

In this way, by evaluating first on the validation data set and then through two rounds of manual evaluation, over-fitting of the machine learning model can be avoided, the CNN machine learning model can be evaluated more effectively, and its accuracy in picture classification can thereby be ensured.

Step 4: if the evaluation passes, training ends and the CNN machine learning model is taken as the trained CNN machine learning model for picture classification. For example, the evaluation may be deemed to pass when the accuracy measured on the validation data set reaches a first preset percentage (for example, 85%) and the proportion of high-scoring pictures, obtained from scoring by the CNN machine learning model, that manual evaluation judges to be more suitable as a video cover reaches a second preset percentage (for example, 90%).

Step 5: if the evaluation does not pass, for example when the accuracy measured on the validation data set does not reach the first preset percentage and the proportion of high-scoring pictures, obtained from scoring by the CNN machine learning model, that manual evaluation judges to be more suitable as a video cover does not reach the second preset percentage, the evaluation is deemed not to have passed.
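The pass condition of Step 4 can be written as a simple check, sketched below. The function name and argument names are hypothetical, and both quantities are expressed as fractions between 0 and 1. The text does not say how a mixed result (one criterion met, the other not) is to be handled; treating it as not passing is an assumption of this sketch.

```python
def evaluation_passes(validation_accuracy, manual_suitable_ratio,
                      first_preset=0.85, second_preset=0.90):
    """Return True when both example criteria of Step 4 are met.

    validation_accuracy: accuracy measured on the validation data set.
    manual_suitable_ratio: proportion of high-scoring pictures judged by
        manual evaluation to be more suitable as a video cover.
    Any result that does not meet both criteria is treated as a failure
    (an assumption of this sketch).
    """
    return (validation_accuracy >= first_preset
            and manual_suitable_ratio >= second_preset)
```

With the figures reported for the inventors' experiments (89.9% and 93.3%), `evaluation_passes(0.899, 0.933)` returns `True`.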

In this case, the parameters of the algorithm used by the CNN machine learning model can be adjusted, specifically according to the degree of convergence of the training process and the accuracy achieved in training. For example, Google's TensorBoard can be used to see intuitively whether the neural network is converging. TensorBoard is the graphical visualization tool for TensorFlow; it can display the static graph composed of tensors and flows in TensorFlow, as well as dynamic analysis charts of quantities such as accuracy and deviation during training.

The adjustment of the algorithm parameters mainly concerns parameters such as the learning rate, the batch size and the number of iterations (steps). For example, during parameter tuning, if the learning rate is too large, the convolutional neural network may fail to converge and remain in an oscillating state, in which case the learning rate needs to be reduced; if the learning rate is too small, convergence is slow and more iterations are needed for the convolutional neural network to reach a local extremum, in which case a larger number of iterations can be set or the learning rate increased. The batch size also affects convergence, so convergence can also be tuned by adjusting the batch size. In other words, the details of learning can be inspected through TensorBoard, unreasonable parameter settings of the algorithm used in the machine learning model can be identified and adjusted accordingly, and through this parameter adjustment process the machine learning model eventually converges and the training accuracy improves.
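The effect of the learning rate described above can be observed on a toy problem. The following sketch applies plain gradient descent to f(x) = x², which stands in for the network's loss purely for illustration; it is not the training procedure of the embodiment, and the specific learning rates are arbitrary example values.

```python
def gradient_descent(learning_rate, steps, x0=1.0):
    """Minimize f(x) = x**2 by gradient descent and return every iterate."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        gradient = 2.0 * x              # derivative of x**2
        x = x - learning_rate * gradient
        trajectory.append(x)
    return trajectory

too_large = gradient_descent(learning_rate=1.1, steps=50)    # oscillates and grows
too_small = gradient_descent(learning_rate=0.001, steps=50)  # barely moves
moderate = gradient_descent(learning_rate=0.1, steps=50)     # approaches 0
```

With a learning rate of 1.1 each step multiplies x by -1.2, so the iterates alternate in sign and grow; with 0.001 each step multiplies x by 0.998, so many more steps would be needed to approach the minimum at 0.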

After the parameters have been adjusted, iterative training with the picture data continues on the CNN machine learning model with the adjusted algorithm parameters, and the weights of the convolutional neural network are adjusted during the iterative training, until the resulting CNN machine learning model for picture classification passes the evaluation.

S103: according to the ranking, a preset number of the highest-scoring pictures are provided to the user as candidate pictures for the video cover, so that the user can select the video cover from the candidate pictures.

The ranking can be in ascending order (scores from low to high) or descending order (scores from high to low). In this embodiment, descending order can be used, and a preset number of the highest-scoring pictures (for example, the first 8 in the ranking) can be taken from the top of the ranking and provided to the user as candidate pictures for the video cover, so that the user can choose one of these 8 pictures as the video cover.
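The descending ranking and the selection of the top candidates can be sketched as follows. The input is assumed to be a list of (picture identifier, score) pairs produced by the scoring model; the function name and the data layout are illustrative assumptions.

```python
def top_candidates(scored_pictures, preset_count=8):
    """Rank pictures by score in descending order and return the first few.

    scored_pictures: list of (picture_id, score) pairs.
    preset_count: number of candidates to offer (8 in the example above).
    """
    ranked = sorted(scored_pictures, key=lambda item: item[1], reverse=True)
    return [picture_id for picture_id, _ in ranked[:preset_count]]
```

If fewer pictures than `preset_count` were captured, the slice simply returns all of them in ranked order.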

In a specific implementation, when the user clicks on any one of the above candidate pictures (that is, any of the above 8 pictures), a selection instruction from the user for that picture is received, and the picture selected by the user can be determined as the video cover of the video file according to the selection instruction.

In the course of research and development, the inventors carried out a large number of experiments and, following the above method of iteratively training the machine learning model, obtained 6 versions of the CNN machine learning model for picture scoring. The accuracy measured on the validation data set reached 89.9%, and the proportion of high-scoring pictures, obtained from scoring by the CNN machine learning model, that manual evaluation judged to be more suitable as a video cover reached 93.3%. The pictures scored and provided by the CNN machine learning model are characterized by high definition, good contrast, bright and rich colors, and the presence of a salient subject (a person, an object, etc.), making the approach higher in quality and more efficient than traditional methods of choosing a video cover.

Figs. 3-1 to 3-3 show some of the comparison images from the inventors' tests (color is not shown). In each of Figs. 3-1 to 3-3, the 8 pictures at the top are the 8 highest-scoring pictures, and the 8 pictures at the bottom are the 8 lowest-scoring pictures from the same video.

According to the embodiments of the present application, after the video file uploaded by a user is received, scene-change key frames can be determined according to the changes in content between adjacent frames in the video file, and the pictures corresponding to the scene-change key frames captured; the captured pictures can then be scored and ranked by the pre-trained machine learning model for picture classification; and, according to the ranking, a preset number of the highest-scoring pictures are provided to the user as candidate pictures for the video cover, so that the user can select the video cover from the candidate pictures. In this way, no important scene in the video file is omitted, the repetition among the candidate pictures provided for the video cover is reduced, the quality of the candidate pictures is improved, and it is more convenient for the user to choose a more suitable video cover from them.
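The first stage summarized above, determining scene-change key frames from the change between adjacent frames, can be sketched as follows. The change measure used here (mean absolute pixel difference between consecutive frames) and the threshold value are assumptions made for illustration; this passage does not specify how the change is quantified. Frames are assumed to be NumPy arrays of equal shape.

```python
import numpy as np

def scene_change_keyframes(frames, change_threshold=30.0):
    """Return indices of frames whose change from the previous frame
    exceeds the preset change threshold.

    The change measure (mean absolute pixel difference) is an assumed
    stand-in for whatever measure an implementation actually uses.
    """
    keyframes = []
    for index in range(1, len(frames)):
        previous = frames[index - 1].astype(np.float64)
        current = frames[index].astype(np.float64)
        change = float(np.mean(np.abs(current - previous)))
        if change > change_threshold:
            keyframes.append(index)
    return keyframes
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise produce.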

Corresponding to the method for providing a video cover provided in the foregoing embodiment, the embodiments of the present application further provide a device for providing a video cover. Referring to Fig. 4, the device may include:

A screenshot unit 41, configured to receive the video file uploaded by a user, determine scene-change key frames according to the changes in content between adjacent frames in the video file, and capture the pictures corresponding to the scene-change key frames.

In a specific implementation, the screenshot unit 41 may be specifically configured to:

Determine a frame that exceeds a preset change threshold as a scene-change key frame;

A scoring unit 42, configured to score and rank the captured pictures by means of the pre-trained machine learning model for picture classification.

A candidate picture providing unit 43, configured to provide, according to the ranking, a preset number of the highest-scoring pictures to the user as candidate pictures for the video cover, so that the user can select the video cover from the candidate pictures.

In addition, the device may further include:

A video cover determining unit, configured to determine the picture selected by the user as the video cover of the video file.

In this embodiment, the training process of the machine learning model for picture classification used in the scoring unit 42 may include the following steps:

Step 1: determine the picture data for machine learning model training.

In a specific implementation, a basic picture data set can first be obtained. The basic picture data set includes a first data set containing pictures uploaded by users and a second data set containing pictures captured at random at a preset time interval.

Then, the color feature parameter values of the pictures in the basic picture data set can be obtained, for example the hue value, saturation value, brightness value and RGB values, and the pictures in the basic picture data set that do not meet the preset conditions are removed according to the color feature parameter values, so as to obtain the picture data for machine learning model training.

In one implementation, after the color feature parameter values of the pictures in the basic picture data set are obtained, where the color feature parameter values may include HSV values (Hue, Saturation and Luminance (brightness)), a weighted sum of the color feature parameter values of each picture is calculated according to preset color feature weights to obtain the color feature value corresponding to each picture. The pictures in the first data set whose color feature value is below a first preset score and the pictures in the second data set whose color feature value is above a second preset score are then removed, yielding the first-type data set and the second-type data set respectively, which serve as the picture data for machine learning model training.
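The weighted-sum scoring and the two-sided filtering of this implementation can be sketched as below. Several details are assumptions made only for illustration: the per-picture feature is taken as the average over all pixels of the weighted HSV values, the weights (0.2, 0.4, 0.4) are arbitrary, and the function names are hypothetical. Averaging hue arithmetically ignores the fact that hue is a circular quantity, which is tolerable only in a sketch.

```python
import colorsys

def color_feature_value(pixels, weights=(0.2, 0.4, 0.4)):
    """Weighted sum of hue, saturation and brightness, averaged over pixels.

    pixels: iterable of (r, g, b) tuples with components in 0..255.
    weights: assumed example weights for (hue, saturation, brightness).
    """
    pixels = list(pixels)
    if not pixels:
        return 0.0
    w_h, w_s, w_v = weights
    total = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += w_h * h + w_s * s + w_v * v
    return total / len(pixels)

def filter_data_sets(first_scores, second_scores, first_preset, second_preset):
    """Drop low scorers from the first set and high scorers from the second.

    first_scores, second_scores: dicts mapping picture id to color feature value.
    """
    first_kept = {k: v for k, v in first_scores.items() if v >= first_preset}
    second_kept = {k: v for k, v in second_scores.items() if v <= second_preset}
    return first_kept, second_kept
```

The two returned dictionaries correspond to the first-type and second-type data sets respectively.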

In another implementation, after the color feature parameter values of the pictures in the basic picture data set are obtained, where the color feature parameter values may include a Hue value, a Saturation value and RGB (Red, Green, Blue) values, the pictures in the first data set whose hue value is below a first preset hue threshold and the pictures in the second data set whose hue value is above a second preset hue threshold are removed; next, the pictures in the first data set whose saturation value is below a first preset saturation threshold and the pictures in the second data set whose saturation value is above a second preset saturation threshold are removed.

Then, the black-and-white pictures in the first data set can also be removed according to the RGB values; that is, pictures in the first data set that contain no chrominance information (for example, the three RGB component values are all 0 or all 255) are removed, which further improves the quality of the pictures in the data, reduces the computation required for model training, shortens computation time and increases computation speed.

Finally, the pictures remaining in the first data set and the second data set are determined as the first-type data set and the second-type data set respectively, to serve as the picture data for machine learning model training.
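The black-and-white check used in this implementation can be sketched as follows. The text gives pure black (all components 0) and pure white (all components 255) as examples of pixels without chrominance; the sketch generalizes this to any pixel whose three components are equal, which is an assumption, and it treats a picture as black-and-white only when every pixel is of that kind. Pictures are assumed to be NumPy arrays of shape (height, width, 3).

```python
import numpy as np

def is_black_and_white(rgb):
    """Return True when no pixel of an H x W x 3 picture carries chroma.

    A pixel is treated as achromatic when R == G == B, which covers the
    all-0 and all-255 examples and, by assumption, every gray in between.
    """
    red, green, blue = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return bool(np.all(red == green) and np.all(green == blue))

def remove_black_and_white(pictures):
    """Keep only the pictures that contain some chrominance information."""
    return [picture for picture in pictures if not is_black_and_white(picture)]
```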

In addition, to further reduce the computation required for model training and increase computation speed, the size of each picture can be adjusted to a preset size, namely the size required by the model, before the weighted sum of the color feature parameter values of each picture is calculated according to the preset color feature weights.

Since the first data set and the second data set may contain some highly similar pictures, in order to improve the quality of the data in the data sets, reduce the number of pictures, reduce the computation required for model training and thereby increase computation speed, the similarity between the pictures remaining in the first data set and the second data set can also be judged after the pictures in the first data set whose color score is below the first preset score and the pictures in the second data set whose color score is above the second preset score have been removed. According to the judgment result, one picture is selected and retained from among the pictures whose similarity reaches the preset similarity threshold, and the pictures remaining in the first data set and the second data set are taken as the first-type data set and the second-type data set respectively, thereby obtaining data sets with less repetition and higher quality.

The convolutional neural network may be the convolutional neural network defined by Inception-v3.

Step 3: evaluate the CNN machine learning model.

Step 4: if the evaluation passes, training ends and the CNN machine learning model for picture classification is taken as the trained CNN machine learning model for picture classification;

Step 5: if the evaluation does not pass, the parameters of the algorithm used in the CNN machine learning model for picture classification are adjusted, so that iterative training with the picture data continues on the CNN machine learning model for picture classification after the parameter adjustment, with the weights of the convolutional neural network adjusted during the iterative training, until the resulting CNN machine learning model for picture classification passes the evaluation.
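The control flow of Steps 3 to 5 (train, evaluate, adjust the parameters and train again until the evaluation passes) can be sketched generically. The three callables `train`, `evaluate` and `adjust` are hypothetical placeholders for the training, evaluation and parameter-adjustment procedures; the `max_rounds` limit is an added safeguard and is not part of the described steps.

```python
def train_until_pass(train, evaluate, adjust, params, max_rounds=10):
    """Repeat training and parameter adjustment until evaluation passes.

    train(params) -> model       hypothetical training procedure
    evaluate(model) -> bool      hypothetical evaluation (Step 3)
    adjust(params) -> params     hypothetical parameter adjustment (Step 5)
    Returns (model, params, rounds_used).
    """
    for round_index in range(1, max_rounds + 1):
        model = train(params)
        if evaluate(model):                 # Step 4: evaluation passed
            return model, params, round_index
        params = adjust(params)             # Step 5: adjust and retrain
    raise RuntimeError("evaluation did not pass within max_rounds")
```

As a usage illustration with stand-in callables, a "model" that is simply its learning rate, an evaluation that passes once the rate is at most 0.01, and an adjustment that halves the rate will terminate on the third round when started from 0.04.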

As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.

The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the device or system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts may be found in the description of the method embodiments. The device and system embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

The method and device for providing a video cover provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. At the same time, for those of ordinary skill in the art, changes may be made to the specific implementations and the scope of application in accordance with the idea of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. A method for providing a video cover, characterized by comprising:

receiving a video file uploaded by a user, determining scene-change key frames according to changes in content between adjacent frames in the video file, and capturing the pictures corresponding to the scene-change key frames;

scoring and ranking the captured pictures by means of a pre-trained machine learning model for picture classification;

providing, according to the ranking, a preset number of the highest-scoring pictures to the user as candidate pictures for the video cover, so that the user selects the video cover from the candidate pictures.

2. The method according to claim 1, characterized by further comprising:

determining the picture selected by the user as the video cover.

3. The method according to claim 1, characterized in that determining scene-change key frames according to changes in content between adjacent frames in the video file and capturing the pictures corresponding to the scene-change key frames comprises:

determining a frame that exceeds a preset change threshold as a scene-change key frame;

4. The method according to claim 1, characterized in that training the machine learning model for picture classification comprises:

determining picture data for machine learning model training;

performing iterative training with the picture data on a convolutional neural network (CNN) machine learning model, and adjusting the weights of the convolutional neural network during the iterative training, so as to obtain a CNN machine learning model for picture classification on the basis of the CNN machine learning model;

evaluating the CNN machine learning model for picture classification;

if the evaluation passes, ending the training and taking the CNN machine learning model for picture classification as the trained CNN machine learning model for picture classification.

5. The method according to claim 4, characterized by further comprising:

if the evaluation does not pass, adjusting the parameters of the algorithm used in the CNN machine learning model for picture classification, so that iterative training with the picture data continues on the CNN machine learning model for picture classification after the parameter adjustment, and adjusting the weights of the convolutional neural network during the iterative training, until the resulting CNN machine learning model for picture classification passes the evaluation.

6. The method according to claim 4, characterized in that determining the picture data for machine learning model training comprises:

obtaining a basic picture data set;

removing, according to the color feature parameter values, the pictures in the basic picture data set that do not meet preset conditions, so as to obtain the picture data for machine learning model training.

7. The method according to claim 6, characterized in that the basic picture data set includes: a first data set containing pictures uploaded by users and a second data set containing pictures captured at random at a preset time interval;

removing, according to the color feature parameter values, the pictures in the basic picture data set that do not meet preset conditions, so as to obtain the picture data for machine learning model training, comprises:

calculating, according to preset color feature weights, a weighted sum of the color feature parameter values of each picture to obtain the color feature value corresponding to each picture;

removing the pictures in the first data set whose color feature value is below a first preset score and the pictures in the second data set whose color feature value is above a second preset score, to obtain a first-type data set and a second-type data set respectively as the picture data for machine learning model training.

8. The method according to claim 6, characterized in that the basic picture data set includes: a first data set containing pictures uploaded by users and a second data set containing pictures captured at random at a preset time interval;

removing the pictures in the first data set whose hue value is below a first preset hue threshold and the pictures in the second data set whose hue value is above a second preset hue threshold;

removing the pictures in the first data set whose saturation value is below a first preset saturation threshold and the pictures in the second data set whose saturation value is above a second preset saturation threshold;

determining the pictures remaining in the first data set and the second data set as a first-type data set and a second-type data set respectively, as the picture data for machine learning model training.

9. The method according to claim 7, characterized in that, after removing the pictures in the first data set whose color feature value is below the first preset score and the pictures in the second data set whose color feature value is above the second preset score, the method further comprises:

judging the similarity between the pictures remaining in the first data set and in the second data set respectively, and, according to the judgment result, selecting and retaining one picture from among the pictures whose similarity reaches a preset similarity threshold, so that the pictures remaining in the first data set and the second data set are taken as the first-type data set and the second-type data set respectively.

10. A device for providing a video cover, characterized by comprising:

a screenshot unit, configured to receive a video file uploaded by a user, determine scene-change key frames according to changes in content between adjacent frames in the video file, and capture the pictures corresponding to the scene-change key frames;

a scoring unit, configured to score and rank the captured pictures by means of a pre-trained machine learning model for picture classification;

a candidate picture providing unit, configured to provide, according to the ranking, a preset number of the highest-scoring pictures to the user as candidate pictures for the video cover, so that the user selects the video cover from the candidate pictures.

11. The device according to claim 10, characterized by further comprising:

12. The device according to claim 10, characterized in that the screenshot unit is specifically configured to:

determine a frame that exceeds a preset change threshold as a scene-change key frame;