CN115905617A - Video scoring prediction method based on deep neural network and double regularization

Info

Publication number: CN115905617A
Application number: CN202310187456.2A
Authority: CN
Inventors: 赵学健; 张晶晶; 孙知信; 孙哲; 曹亚东; 宫婧; 汪胡青; 胡冰; 徐玉华
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2023-03-02
Filing date: 2023-03-02
Publication date: 2023-04-04
Anticipated expiration: 2043-03-02
Also published as: WO2024179089A1; CN115905617B

Abstract

The invention relates to a video scoring prediction method based on a deep neural network and double regularization. The method reconstructs the user-video scoring matrix; introduces a video association regularization term that incorporates user activity, together with a reliable nearest neighbor regularization term; builds a matrix decomposition recommendation model that includes both regularization terms; feeds the resulting latent features into a deep neural network; and combines the output of the deep neural network model with the matrix decomposition result to obtain the final predicted score, which improves the precision of score prediction. In addition, an LDA (Latent Dirichlet Allocation) model mines related information from user video comments to generate a user type latent feature matrix and a video type latent feature matrix. These two matrices are combined into a hidden information matrix, which is in turn combined with the original user-video scoring matrix to generate a new user-video scoring matrix, addressing the cold start and data sparsity problems.

Description

Video scoring prediction method based on deep neural network and double regularization

Technical Field

The invention relates to a video scoring prediction method based on a deep neural network and double regularization, and belongs to the field of scoring prediction.

Background

With the rapid development of internet technology, the volume of video resources on network platforms keeps growing. This abundance gives users more choices, but it also makes finding the videos they like harder and more time-consuming. Personalized recommendation systems have become an effective tool for addressing this information overload problem, and score prediction is an important component of recommendation algorithms. Existing recommendation algorithms fall into three main categories: collaborative filtering based algorithms, content based algorithms, and hybrid algorithms. Collaborative filtering based algorithms are currently the most widely used, and among them model-based collaborative filtering is the most common; typical model-based approaches include matrix decomposition models, singular value decomposition, and cluster analysis. However, existing collaborative filtering algorithms suffer from problems such as data sparsity and cold start, which lead to inaccurate score prediction for recommended video resources and therefore degrade personalized recommendation results. Improving the accuracy of score prediction for video resources, and thereby the precision of recommendation, has become one of the hot spots of current research.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a video scoring prediction method based on a deep neural network and double regularization. The user-video scoring matrix is reconstructed; during matrix decomposition, a video association regularization term that incorporates user activity and a reliable nearest neighbor regularization term are introduced to constrain the learning of the latent feature matrices; a deep neural network is introduced, whose nonlinearity relieves the limitation of the linear dot product in matrix decomposition; and the result of the deep neural network model is combined with the result of the double regularized matrix decomposition, which improves the precision of video score prediction.

The technical scheme adopted by the invention is as follows: a video scoring prediction method based on a deep neural network and double regularization is used for improving the precision of scoring prediction of recommended videos, and specifically comprises the following steps:

step S1: processing the video comments to mine hidden information, combining the hidden information matrix with the original user-video scoring matrix to generate a new user-video scoring matrix, and entering step S2;

step S2: when decomposing the user-video scoring matrix, adding double regularization terms to constrain the learning of the latent feature matrices. Each user's ratings contribute to video similarity, but the contributions are not all the same, so users are divided into active users and inactive users according to their activity: an active user has rating records for a large number of videos, while an inactive user has rating records for only a small number of videos. The contributions of active and inactive users are therefore distinguished when video similarity is calculated. The activity of a user is defined as:

Equation 1

Equation 1 is expressed in terms of the total number of ratings given by user u. Combining the user activity coefficient with the modified cosine similarity gives the video similarity:

Equation 2

The quantities in Equation 2 are, in order: the rating of user u on video i; the rating of user u on video j; the rating score of user u; and the set of users who have rated both video i and video j. During matrix decomposition, a video association regularization term that incorporates user activity is introduced to constrain the learning of the item latent feature matrix. The corresponding constraint function is:

Equation 3

In Equation 3, V denotes the video feature matrix, V_j is the latent feature vector of video j, and V_i is the latent feature vector of video i; then entering step S3;

step S3: taking the latent feature vectors obtained from the matrix decomposition as the input of the multilayer perceptron, processing them with the multilayer perceptron to obtain the prediction of the multilayer perceptron model, and entering step S4;

step S4: combining, in the merging layer, the prediction of the multilayer perceptron model with the result of the matrix decomposition, and optimizing the model with a normalized cross entropy method to obtain the final predicted score.
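The activity-weighted video similarity of step S2 can be illustrated with a minimal sketch. Equations 1 and 2 are not reproduced in this text, so the sketch uses the textbook adjusted cosine similarity and a hypothetical activity coefficient (`activity_weight`, the user's rating count divided by the largest rating count); both the weight and the way it enters the sum are assumptions, not the patented formulas.

```python
from math import sqrt

def user_means(ratings):
    """ratings: {user: {video: score}} -> {user: mean score of that user}."""
    return {u: sum(r.values()) / len(r) for u, r in ratings.items() if r}

def activity_weight(n_ratings, n_max):
    """Hypothetical activity coefficient in (0, 1]; NOT the patent's Equation 1."""
    return n_ratings / n_max

def video_similarity(ratings, i, j):
    """Adjusted cosine similarity of videos i and j over their co-raters,
    with each co-rater's terms scaled by the assumed activity weight."""
    means = user_means(ratings)
    n_max = max(len(r) for r in ratings.values())
    num = den_i = den_j = 0.0
    for u, r in ratings.items():
        if i in r and j in r:  # only users who rated both videos contribute
            w = activity_weight(len(r), n_max)
            di, dj = r[i] - means[u], r[j] - means[u]
            num += w * di * dj
            den_i += w * di * di
            den_j += w * dj * dj
    if den_i == 0.0 or den_j == 0.0:
        return 0.0  # no co-raters, or no variation around the user means
    return num / (sqrt(den_i) * sqrt(den_j))
```

Whether active users should count for more or for less than inactive users is a design choice that the missing Equation 1 settles; swapping the body of `activity_weight` is enough to try either.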

As a preferred technical scheme of the invention: in step S1, an LDA model is first used to mine type-related hidden information from the users' video comments, generating a user type latent feature matrix LU and a video type latent feature matrix LV. The two matrices are combined to obtain the hidden information matrix L, calculated as:

Equation 4

The hidden information matrix L is then combined with the original user-video scoring matrix R to generate a new user-video scoring matrix, calculated as:

Equation 5
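A minimal sketch of how step S1 might be wired together follows. Equations 4 and 5 are not reproduced in this text, so both combination rules are assumptions: `hidden_information` takes L as the product of LU and the transpose of LV, and `reconstruct` keeps observed ratings and fills unrated entries (encoded as 0) with `r_max` times the corresponding entry of L. The function names and the `r_max` parameter are hypothetical, and the LDA step that produces LU and LV is not shown.

```python
def hidden_information(LU, LV):
    """Assumed form of Equation 4: L[u][i] = sum_k LU[u][k] * LV[i][k].

    LU: one row of type (topic) weights per user.
    LV: one row of type (topic) weights per video.
    """
    return [[sum(a * b for a, b in zip(lu, lv)) for lv in LV] for lu in LU]

def reconstruct(R, L, r_max=5.0):
    """Assumed form of Equation 5: keep observed ratings and fill the
    unrated entries (encoded as 0) with r_max * L[u][i]."""
    return [[r if r > 0 else r_max * l for r, l in zip(r_row, l_row)]
            for r_row, l_row in zip(R, L)]

# Usage: two users, three videos, two latent types.
LU = [[0.9, 0.1], [0.2, 0.8]]
LV = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
R = [[5, 0, 0], [0, 4, 0]]
R_new = reconstruct(R, hidden_information(LU, LV))
```

Filling only the unrated entries leaves the observed ratings untouched while giving new or sparsely rated videos non-empty rows, which is the cold start and sparsity effect the text describes.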

As a preferred technical scheme of the invention: in step S2, users with the same interests influence each other, and user similarity is calculated with a weighted Pearson correlation coefficient:

Equation 6

The quantities in Equation 6 are, in order: the mean ratings of users u and v; the rating of user u on video i; the rating of user v on video i; the set of videos commented on by user u; and the set of videos commented on by user v. The weight is the Jaccard correlation coefficient of the items, which influences the user similarity calculation, and is calculated as:

Equation 7

The quantities in Equation 7 are the set of videos commented on by user u and the set of videos commented on by user v;
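The Jaccard-weighted Pearson similarity described above can be sketched as follows. The Jaccard coefficient is the standard one (size of the intersection over size of the union of the two video sets). How the weight enters Equation 6 is not visible in this text, so multiplying the Pearson correlation by the Jaccard coefficient, and taking each user's mean over all of that user's ratings, are assumptions.

```python
from math import sqrt

def jaccard(items_u, items_v):
    """Jaccard coefficient of two sets: |intersection| / |union|."""
    union = items_u | items_v
    return len(items_u & items_v) / len(union) if union else 0.0

def weighted_pearson(ru, rv):
    """ru, rv: {video: score} for users u and v.

    Pearson correlation over the co-rated videos, scaled by the Jaccard
    coefficient of the two users' video sets (assumed weighting).
    """
    common = ru.keys() & rv.keys()
    if not common:
        return 0.0
    mu = sum(ru.values()) / len(ru)  # mean rating of user u
    mv = sum(rv.values()) / len(rv)  # mean rating of user v
    num = sum((ru[i] - mu) * (rv[i] - mv) for i in common)
    du = sqrt(sum((ru[i] - mu) ** 2 for i in common))
    dv = sqrt(sum((rv[i] - mv) ** 2 for i in common))
    if du == 0.0 or dv == 0.0:
        return 0.0
    return jaccard(set(ru), set(rv)) * num / (du * dv)
```

The weight discounts correlations computed from a small overlap: two users who agree on their only shared video but otherwise rated disjoint sets end up with a low similarity.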

A user's rating of an item is influenced by neighboring users, and possibly also by the neighbors of those neighbors. However, neighboring users beyond a certain distance no longer influence the user, that is, they become unreliable. A reliability value is therefore introduced: only neighboring users whose reliability value exceeds a certain value influence the user's item ratings. The reliability value is calculated as:

Equation 8

The quantities in Equation 8 are, in order: the rating of user u on video i; the rating of user v on video i; the set of videos commented on by user u; the set of videos commented on by user v; the maximum rating value; the trust distance, i.e. the number of users lying between user u and user v; the maximum distance allowed between two users; and a correction parameter, which is a number greater than 0 and less than 1. The reliable nearest neighbor users are given by:

Equation 9

During matrix decomposition, a reliable nearest neighbor regularization term is introduced to constrain the learning of the user latent feature matrix. The reliable nearest neighbor regularization constraint function is:

Equation 10

The quantities in Equation 10 are the latent feature vector of user u and the latent feature vector of user v.
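To make the role of the reliable nearest neighbor term concrete, the sketch below computes a neighbor regularizer of the form commonly used in socially regularized matrix factorization: a similarity-weighted sum of squared distances between a user's latent vector and those of its reliable neighbors. Equation 10 is not reproduced in this text, so this exact form, the coefficient `beta`, and the data layout (`neighbors`, `sim`) are assumptions.

```python
def neighbor_regularizer(U, neighbors, sim, beta=0.1):
    """Assumed form: beta/2 * sum_u sum_{v in N(u)} sim[(u, v)] * ||U_u - U_v||^2.

    U:         {user: latent feature vector (list of floats)}
    neighbors: {user: list of that user's reliable nearest neighbors}
    sim:       {(u, v): similarity between users u and v}
    """
    total = 0.0
    for u, nbrs in neighbors.items():
        for v in nbrs:
            dist2 = sum((a - b) ** 2 for a, b in zip(U[u], U[v]))
            total += sim[(u, v)] * dist2
    return 0.5 * beta * total

# Usage: user "u" has two reliable neighbors, one with an identical latent vector.
U = {"u": [1.0, 0.0], "v": [0.0, 1.0], "w": [1.0, 0.0]}
penalty = neighbor_regularizer(U, {"u": ["v", "w"]},
                               {("u", "v"): 0.5, ("u", "w"): 0.9})
```

A term of this shape pulls the latent vectors of similar, reliable neighbors toward each other; the video association term of Equation 3 plays the analogous role for the video latent vectors.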

As a preferred technical scheme of the invention: in step S3, the user latent feature vector and the video latent feature vector are used as the input of the multilayer perceptron. The deep neural network is composed of the multilayer perceptron and a single-layer perceptron; the multilayer perceptron includes an input layer, a plurality of hidden layers that provide the nonlinearity of the neural structure, and an output layer. The result of the multilayer perceptron model is obtained by passing the input through the nonlinear hidden layers.

As a preferred technical scheme of the invention: in step S4, the single-layer perceptron in the deep neural network structure is the merging layer, in which the prediction of the multilayer perceptron model is combined with the result of the double regularized matrix decomposition model. The calculation formula is:

Equation 11

The quantities in Equation 11 are, in order: the activation function; the set of weight matrices between the output layer and the merging layer; the result of the output layer; the user latent vector; the video latent vector; and the bias term of the merging layer. The model is optimized with a normalized cross entropy method to finally obtain the predicted score.
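One plausible reading of the merging layer is sketched below. Equation 11 is not reproduced in this text, so the specific combination is an assumption: the weighted MLP output and the dot product of the user and video latent vectors are summed with the bias and passed through a sigmoid, and the resulting value in (0, 1) is scaled by a hypothetical maximum rating `r_max`.

```python
from math import exp

def sigmoid(x):
    """Logistic activation; an assumed choice of activation function."""
    return 1.0 / (1.0 + exp(-x))

def merge_layer(mlp_out, w, user_vec, video_vec, bias, r_max=5.0):
    """Single-layer perceptron that merges the two branches (assumed form).

    mlp_out:   output vector of the multilayer perceptron
    w:         weights between the MLP output layer and the merging layer
    user_vec:  user latent vector from the matrix decomposition
    video_vec: video latent vector from the matrix decomposition
    Returns (p, r_max * p), with p in (0, 1).
    """
    mlp_term = sum(wi * oi for wi, oi in zip(w, mlp_out))
    mf_term = sum(a * b for a, b in zip(user_vec, video_vec))
    p = sigmoid(mlp_term + mf_term + bias)
    return p, r_max * p
```

Keeping the output in (0, 1) matches a loss that normalizes the actual score by the maximum rating, as the normalized cross entropy method does.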

Beneficial effects:

1. The method introduces a video association regularization term incorporating user activity and a reliable nearest neighbor regularization term to constrain the learning of the latent feature matrices, uses the nonlinear structure of the deep neural network to relieve the limitation of the linear dot product in matrix decomposition, and combines the result of the deep neural network model with the result of the double regularized matrix decomposition, improving the precision of video score prediction.

2. The method uses an LDA model to mine related information from user video comments and generates a user type latent feature matrix and a video type latent feature matrix. The two matrices are combined into a hidden information matrix, which is combined with the original user-video scoring matrix to generate a new user-video scoring matrix, addressing the cold start and data sparsity problems.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a structural diagram of the multilayer perceptron of the present invention;

FIG. 3 is a diagram of the deep neural network architecture of the present invention.

Detailed Description

The following description will explain embodiments of the present invention in further detail with reference to the accompanying drawings.

Building on the traditional matrix decomposition model, the method reconstructs the user-video scoring matrix and introduces a video association regularization term incorporating user activity and a reliable nearest neighbor regularization term to constrain the learning of the latent feature matrices. The nonlinear structure of a deep neural network relieves the limitation of the linear dot product in matrix decomposition: the latent feature vectors from the matrix decomposition are used as the input of the deep neural network, the multilayer perceptron (MLP) produces the MLP model result, and that result is combined with the result of the double regularized matrix decomposition model in a single-layer perceptron, namely the merging layer. The model is optimized with a normalized cross entropy method, which improves score prediction precision.

As shown in FIG. 1, the invention provides a video scoring prediction method based on a deep neural network and double regularization, which is used for improving the precision of score prediction for recommended videos and comprises the following steps:

step S1: processing the video comments to mine hidden information, combining the hidden information matrix with the original user-video scoring matrix to generate a new user-video scoring matrix, and entering step S2;

step S2: adding double regularization terms to constrain the learning of the latent feature matrices when decomposing the user-video scoring matrix, and entering step S3;

step S3: taking the latent feature vectors obtained from the matrix decomposition as the input of the multilayer perceptron, processing them with the multilayer perceptron to obtain the prediction of the multilayer perceptron model, and entering step S4;

step S4: combining, in the merging layer, the prediction of the multilayer perceptron model with the result of the matrix decomposition, and optimizing the model with a normalized cross entropy method to obtain the final predicted score.

The method comprises the following specific steps:

Step S1 comprises the following: first, an LDA (Latent Dirichlet Allocation) model mines type-related hidden information from the users' video comments to generate a user type latent feature matrix LU and a video type latent feature matrix LV, and the two matrices are combined to obtain the hidden information matrix L according to Equation 4.

The hidden information matrix L is then combined with the original user-video scoring matrix R to reconstruct the user-video scoring matrix according to Equation 5.

Step S2 comprises the following: the user-video scoring matrix is decomposed, factoring the high-dimensional user-video scoring matrix into a low-dimensional user feature matrix and a low-dimensional video feature matrix. Here U denotes the user feature matrix, U_i the latent feature vector of user i, V the video feature matrix, and V_j the latent feature vector of video j. Low-dimensional matrix decomposition approximates the scoring matrix R by the product of rank-d factors, and the predicted score of user i for video j is expressed in terms of U_i and V_j.

The squared error between the predicted score and the actual score is taken as the loss function, which is minimized to approximate the scoring matrix R. In the loss function, an indicator function equals 1 if user i has rated item j and 0 otherwise, and two regularization terms prevent over-fitting.

Each user's ratings contribute to video similarity, but the contributions are not all the same. Taking user activity into account, users are divided into active users, who have rating records for a large number of videos, and inactive users, who have rating records for only a small number of videos, so the contributions of the two groups are distinguished when video similarity is calculated. The activity of a user is defined by Equation 1, which is expressed in terms of the total number of ratings given by user u.

Combining the user activity coefficient with the modified cosine similarity gives the video similarity of Equation 2, whose quantities are the rating of user u on video i, the rating of user u on video j, and the rating score of user u. During matrix decomposition, the video association regularization term incorporating user activity is introduced to constrain the learning of the item latent feature matrix; the constraint function is that of Equation 3, where V denotes the video feature matrix, V_j is the latent feature vector of video j, and V_i is the latent feature vector of video i.

Users with the same interests influence each other, and user similarity can be calculated with the weighted Pearson correlation coefficient of Equation 6.

In Equation 6, the quantities include the mean ratings of users u and v. The weight is the Jaccard correlation coefficient of Equation 7, whose quantities are the set of videos commented on by user u and the set of videos commented on by user v.

The reliability value is calculated according to Equation 8, whose quantities are, in order: the rating of user u on video i; the rating of user v on video i; the set of videos commented on by user u; the set of videos commented on by user v; the maximum rating value; the trust distance, i.e. the number of users lying between user u and user v; the maximum distance allowed between two users; and a correction parameter, which is a number greater than 0 and less than 1. The reliable nearest neighbor users are given by Equation 9.

During matrix decomposition, a reliable nearest neighbor regularization term is introduced to constrain the learning of the user latent feature matrix. The reliable nearest neighbor regularization constraint function is that of Equation 10, whose quantities are the latent feature vector of user u and the latent feature vector of user v.

The final optimization loss function is obtained by adding the video association regularization term incorporating user activity and the reliable nearest neighbor regularization term to the loss function. A stochastic gradient descent method is used to search for the optimal solution and find the optimal latent feature matrices.
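The stochastic gradient descent search can be illustrated with a minimal sketch of plain regularized matrix decomposition. It implements only the squared-error loss with the two over-fitting regularization terms; the video association term and the reliable nearest neighbor term would each add further gradient contributions and are omitted. The function names, the learning rate `lr`, the coefficient `lam`, and the toy data are assumptions.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def squared_error(ratings, U, V):
    """Sum of squared errors over the observed (user, video, score) triples."""
    return sum((r - dot(U[u], V[i])) ** 2 for u, i, r in ratings)

def train_mf(ratings, n_users, n_items, d=2, lr=0.02, lam=0.05,
             epochs=500, seed=0):
    """SGD on sum (r - U_u . V_i)^2 + lam * (||U||^2 + ||V||^2), up to constants."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_items)]
    data = list(ratings)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            err = r - dot(U[u], V[i])
            for k in range(d):
                uk, vk = U[u][k], V[i][k]  # use the old values for both updates
                U[u][k] += lr * (err * vk - lam * uk)
                V[i][k] += lr * (err * uk - lam * vk)
    return U, V

# Usage: three users, three videos, six observed ratings.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
U, V = train_mf(ratings, n_users=3, n_items=3)
```

Only observed ratings are visited, which plays the role of the indicator function in the loss.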

As shown in FIG. 2, step S3 comprises the following: the user latent feature vector U_u and the video latent feature vector V_i are used as the input of a multilayer perceptron, which comprises an input layer L_in, a plurality of hidden layers that provide the nonlinearity of the neural structure, and an output layer L_out. The input layer L_in produces an output vector, which the first hidden layer L_1 processes using a set of weights contained in the matrix between the input layer and L_1, a bias of layer L_1, and an activation function. In general, the output vector of hidden layer L_k is computed with the activation function of the neurons, a weight matrix, and a bias. The output layer L_out of the multilayer perceptron then produces the final output vector.
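A forward pass through such a multilayer perceptron can be sketched as follows. The layer formulas are not reproduced in this text, so two choices here are assumptions: the input layer concatenates the user and video latent vectors, and every layer uses ReLU as its activation function. The helper names are hypothetical.

```python
def relu(x):
    """Element-wise rectified linear unit (assumed activation function)."""
    return [max(0.0, v) for v in x]

def dense(x, W, b):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def mlp_forward(user_vec, video_vec, layers):
    """layers: list of (W, b) pairs, one per hidden or output layer."""
    x = list(user_vec) + list(video_vec)  # input layer: concatenation (assumed)
    for W, b in layers:
        x = relu(dense(x, W, b))
    return x

# Usage: 2-dimensional latent vectors, one hidden layer of 3 units, 1 output unit.
layers = [
    ([[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 1]], [0.0, 0.0, 0.0]),
    ([[1, 1, 1]], [-1.0]),
]
out = mlp_forward([1.0, -2.0], [0.5, 3.0], layers)  # -> [2.5]
```

The nonlinearity between layers is what lets this branch model user-video interactions that a plain dot product cannot express.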

As shown in FIG. 3, step S4 comprises the following: the single-layer perceptron in the deep neural network structure is the merging layer, in which the prediction of the multilayer perceptron model is combined with the result of the double regularized matrix decomposition model according to Equation 11. The quantities in Equation 11 are, in order: the activation function; the set of weight matrices between the output layer and the merging layer; the result of the output layer; the user latent vector; the video latent vector; and the bias term of the merging layer.

The proposed model is continuously optimized with a normalized cross entropy method through a cost function whose quantities are, in order: the number of neurons in the merging layer; the predicted score; the actual score of the training instance; and the maximum rating value. The cost function is minimized by gradient descent to obtain the final predicted score.
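The idea behind a normalized cross entropy cost can be sketched as follows. The cost function itself is not reproduced in this text, so the form below is an assumption: the actual score is divided by the maximum rating to obtain a target in [0, 1], a binary cross entropy is taken against the predicted value in (0, 1), and the result is averaged over the training instances. The names and the clipping constant `eps` are hypothetical.

```python
from math import log

def normalized_cross_entropy(y_true, y_pred, r_max=5.0, eps=1e-12):
    """Mean cross entropy between y_true / r_max and predictions in (0, 1).

    y_true: actual scores on the rating scale (0 .. r_max)
    y_pred: predicted values in (0, 1), e.g. the merging layer output
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        t = y / r_max                      # normalize the target by the maximum score
        p = min(max(p, eps), 1.0 - eps)    # clip to avoid log(0)
        total -= t * log(p) + (1.0 - t) * log(1.0 - p)
    return total / len(y_true)
```

For example, a prediction of 0.9 for an actual score of 5 out of 5 yields a lower cost than a prediction of 0.5, so gradient descent pushes predictions toward the normalized actual scores.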

The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.

Claims

1. A video scoring prediction method based on a deep neural network and double regularization, characterized by comprising the following steps:

step S1: processing the video comments to mine hidden information, combining the hidden information matrix with the original user-video scoring matrix to generate a new user-video scoring matrix, and entering step S2;

step S2: when decomposing the user-video scoring matrix, adding double regularization terms to constrain the learning of the latent feature matrices, wherein each user's ratings contribute to video similarity but the contributions are not all the same, so that users are divided into active users and inactive users according to their activity, an active user being a user with rating records for a large number of videos and an inactive user being a user with rating records for only a small number of videos, the contributions of active and inactive users being distinguished when video similarity is calculated, and the activity of a user being defined as:

Equation 1

Equation 1 is expressed in terms of the total number of ratings given by user u; combining the user activity coefficient with the modified cosine similarity gives the video similarity:

Equation 2

the quantities in Equation 2 being, in order: the rating of user u on video i; the rating of user u on video j; the rating score of user u; and the set of users who have rated both video i and video j; during matrix decomposition, a video association regularization term incorporating user activity is introduced to constrain the learning of the item latent feature matrix, the corresponding constraint function being:

Equation 3

In Equation 3, V denotes the video feature matrix, V_j is the latent feature vector of video j, and V_i is the latent feature vector of video i; then entering step S3;

step S3: taking the latent feature vectors obtained from the matrix decomposition as the input of the multilayer perceptron, processing them with the multilayer perceptron to obtain the prediction of the multilayer perceptron model, and entering step S4;

step S4: combining, in the merging layer, the prediction of the multilayer perceptron model with the result of the matrix decomposition, and optimizing the model with a normalized cross entropy method to obtain the final predicted score.

2. The video scoring prediction method based on a deep neural network and double regularization according to claim 1, characterized in that: in step S1, an LDA model is first used to mine type-related hidden information from the users' video comments, generating a user type latent feature matrix LU and a video type latent feature matrix LV, and the two matrices are combined to obtain the hidden information matrix L, calculated as:

Equation 4

The hidden information matrix L is combined with the original user-video scoring matrix R to generate a new user-video scoring matrix, calculated as:

Equation 5.

3. The video scoring prediction method based on a deep neural network and double regularization according to claim 1, characterized in that: in step S2, users with the same interests influence each other, and user similarity is calculated with a weighted Pearson correlation coefficient:

Equation 6

The quantities in Equation 6 are, in order: the mean ratings of users u and v; the rating of user u on video i; the rating of user v on video i; the set of videos commented on by user u; and the set of videos commented on by user v. The weight is the Jaccard correlation coefficient of the items, which influences the user similarity calculation, and is calculated as:

Equation 7

The quantities in Equation 7 are the set of videos commented on by user u and the set of videos commented on by user v;

a user's rating of an item is influenced by neighboring users and also by the neighbors of those neighbors, but neighboring users beyond a certain distance no longer influence the user, that is, they become unreliable, so a reliability value is introduced, and only neighboring users whose reliability value exceeds a certain value influence the user's item ratings, the reliability value being calculated as:

Equation 8

The quantities in Equation 8 are, in order: the rating of user u on video i; the rating of user v on video i; the set of videos commented on by user u; the set of videos commented on by user v; the maximum rating value; the trust distance, i.e. the number of users lying between user u and user v; the maximum distance allowed between two users; and a correction parameter, which is a number greater than 0 and less than 1. The reliable nearest neighbor users are given by:

Equation 9

During matrix decomposition, a reliable nearest neighbor regularization term is introduced to constrain the learning of the user latent feature matrix, the reliable nearest neighbor regularization constraint function being:

Equation 10

The quantities in Equation 10 are the latent feature vector of user u and the latent feature vector of user v.

4. The video scoring prediction method based on a deep neural network and double regularization according to claim 1, characterized in that: in step S3, the user latent feature vector and the video latent feature vector are used as the input of the multilayer perceptron, wherein the deep neural network is composed of the multilayer perceptron and a single-layer perceptron, the multilayer perceptron includes an input layer, a plurality of hidden layers that provide the nonlinearity of the neural structure, and an output layer, and the result of the multilayer perceptron model is obtained by passing the input through the nonlinear hidden layers.

5. The video scoring prediction method based on a deep neural network and double regularization according to claim 1, characterized in that: in step S4, the single-layer perceptron in the deep neural network structure is the merging layer, in which the prediction of the multilayer perceptron model is combined with the result of the double regularized matrix decomposition model, the calculation formula being:

Equation 11

The quantities in Equation 11 are, in order: the activation function; the set of weight matrices between the output layer and the merging layer; the result of the output layer; the user latent vector; the video latent vector; and the bias term of the merging layer. The model is optimized with a normalized cross entropy method to finally obtain the predicted score.