CN111859165A

CN111859165A - Real-time personalized information flow recommendation method based on user behaviors

Info

Publication number: CN111859165A
Application number: CN202010558811.9A
Authority: CN
Inventors: 柳凯; 陈运文; 于敬; 刘文海; 陈雨; 赵圆方; 纪达麒
Original assignee: Datagrand Tech Inc
Current assignee: Datagrand Tech Inc
Priority date: 2020-06-18
Filing date: 2020-06-18
Publication date: 2020-10-30

Abstract

The invention discloses a real-time personalized information flow recommendation method based on user behaviors, comprising the following steps: collecting user behaviors toward content; classifying the user behaviors into positive behaviors and negative behaviors; acquiring content similar to the content clicked by the user and ranking it by similarity to obtain a preliminary recommendation result; and weighting the content ranking in the preliminary recommendation result based on the positive behaviors and the negative behaviors to obtain a final recommendation result. The method and the device can improve the recommendation effect and the user experience.

Description

Real-time personalized information flow recommendation method based on user behaviors

Technical Field

The invention belongs to the field of intelligent recommendation, and particularly relates to a real-time personalized information flow recommendation method based on user behaviors.

Background

With the development of the internet and the popularization of smartphones, people have moved from paper reading into the era of electronic reading. People increasingly use smartphones to obtain information over the internet from the major information flow applications, where they can click, subscribe, share, comment, and so on, generating a large volume of user behaviors. These applications in turn use the behaviors to make personalized recommendations to the user.

When users interact with information flow software, they generate many behaviors, such as praising, sharing, liking, and commenting. From the standpoint of the user's sentiment, these behaviors can be classified into positive behaviors and negative behaviors. Behaviors that reflect that the user likes the content are positive behaviors, while other behaviors, once analyzed, reflect that the user does not like the content. For example, a user's comment expresses the user's subjective feeling: a positive comment is a positive behavior, while a negative comment is a negative behavior and indicates that the user dislikes or is even averse to the content. Such content should therefore no longer be recommended to the user.

However, most software and mainstream information flow recommendation algorithms either use only a small subset of these behaviors or treat all user behaviors as positive behaviors. As a result, much of the recommended content is content the user does not like, which degrades the user experience.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a real-time personalized information flow recommendation method based on user behaviors.

In order to achieve the purpose, the invention adopts the following technical scheme:

A real-time personalized information flow recommendation method based on user behaviors comprises the following steps: collecting user behaviors toward content; classifying the user behaviors into positive behaviors and negative behaviors; acquiring content similar to the content clicked by the user and ranking it by similarity to obtain a preliminary recommendation result; and weighting the content ranking in the preliminary recommendation result based on the positive behaviors and the negative behaviors to obtain a final recommendation result.

Preferably, the user behaviors include comments, and the comments are classified into positive behaviors and negative behaviors by a comment emotion classification model built from Word2vec word vectors and a long short-term memory (LSTM) model.

Preferably, the method for building the comment emotion classification model comprises the following steps: preprocessing each comment into an ordered combination of words with the word segmentation tool jieba; feeding the ordered combination into a Word2vec model and training word vectors with the skip-gram method; summing the word vectors of the words contained in a comment to obtain the sentence vector of the comment; and feeding the vectorized comments and the pre-labeled results into an LSTM model for supervised training.

Preferably, the user behaviors include clicks, and a click is classified as a positive or negative behavior based on the time the user stays on the clicked content page after clicking: if the time is greater than or equal to a predetermined value, the click is classified as a positive behavior; if the time is less than the predetermined value, the click is classified as a negative behavior.

Preferably, acquiring content similar to the content clicked by the user includes: calculating the similarity between contents with an item-based collaborative filtering algorithm.

Preferably, the similarity of the contents is judged by the Jaccard coefficient, where a larger coefficient value indicates greater similarity. The formula of the Jaccard coefficient is:

$$J(i, j) = \frac{|N(i) \cap N(j)|}{|N(i) \cup N(j)|}$$

where $N(i)$ denotes the set of users who like content $i$, and $N(j)$ denotes the set of users who like content $j$.

An electronic device, comprising: a processor; and a memory storing executable instructions that, when executed by the processor, implement the recommendation method.

A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the recommendation method.

A real-time personalized information flow recommendation system based on user behaviors, the recommendation system comprising: a collection module that collects user behaviors toward content; a classification module that classifies the user behaviors into positive behaviors and negative behaviors; a recall module that acquires content similar to the content clicked by the user and ranks it by similarity to obtain a preliminary recommendation result; and a weighting module that weights the content ranking in the preliminary recommendation result based on the positive behaviors and the negative behaviors to obtain a final recommendation result.

Compared with the prior art, the invention has the beneficial effects that:

1. The information hidden behind user behaviors can be fully mined, and the recommendation quality is significantly improved;

2. The recommendation results are more intelligent, and the user experience is significantly improved;

3. Some low-quality content can be identified from negative behaviors, which helps the platform control content quality.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of an embodiment of the present invention.

FIG. 2 is a schematic diagram of an LSTM model according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.

As shown in FIGS. 1 and 2, the present embodiment provides the following technical solution:

1. Collecting the behaviors of the user:

A data collection module collects and stores the behaviors that users generate in the software, including clicks, comments, praise, likes, subscriptions, and the like. The time at which the user clicks on each content item and the text of each comment must also be recorded. A portion of the user comments is then taken out and manually labeled by sentiment as positive comments or negative comments.
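The collected data described above could be represented by a record such as the following. This is only a minimal sketch: the class name `BehaviorEvent` and all of its field names are hypothetical and do not come from the source, which specifies what is recorded but not how it is stored.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class BehaviorEvent:
    """One user behavior toward one content item (hypothetical layout)."""
    user_id: str
    content_id: str
    behavior: str                 # e.g. "click", "comment", "praise", "like", "subscribe"
    occurred_at: datetime         # time of the behavior, needed later for dwell time
    comment_text: Optional[str] = None      # only set for comments
    sentiment_label: Optional[str] = None   # manual label, only set for the labeled subset


event = BehaviorEvent("u1", "c42", "comment", datetime(2020, 6, 18, 12, 0),
                      comment_text="very helpful article")
```

Keeping the manual sentiment label on the same record makes it easy to select the labeled subset later as training data for the comment classifier.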

2. Classifying the user's behavior:

Most user behaviors can be classified directly as positive behaviors, such as praise, likes, and subscriptions. Some user behaviors can also be classified directly as negative behaviors, such as dislikes and deletions. Click and comment behaviors, however, require some analysis before they can be classified.
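The direct classification described above amounts to a lookup table, with clicks and comments deferred to later analysis. The sketch below assumes hypothetical behavior identifiers (`"praise"`, `"like"`, and so on); the source names the behaviors but does not define identifiers for them.

```python
from enum import Enum


class Polarity(Enum):
    POSITIVE = 1
    NEGATIVE = -1
    NEEDS_ANALYSIS = 0


# Hypothetical identifiers for the behaviors the text classifies directly.
DIRECT_POLARITY = {
    "praise": Polarity.POSITIVE,
    "like": Polarity.POSITIVE,
    "subscribe": Polarity.POSITIVE,
    "dislike": Polarity.NEGATIVE,
    "delete": Polarity.NEGATIVE,
}


def classify_direct(behavior: str) -> Polarity:
    """Return the polarity of a behavior; anything not in the table
    (including clicks and comments) is deferred to further analysis."""
    return DIRECT_POLARITY.get(behavior, Polarity.NEEDS_ANALYSIS)
```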

3. Classifying click behaviors:

The length of time the user stays on the content after clicking is calculated from the recorded click times. If the dwell time is short, the user was probably just attracted by the title and is not interested in the content, so the click is regarded as a negative behavior. If the dwell time is long, the user has probably read the entire content and is interested in it, so the click is regarded as a positive behavior.
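The dwell-time rule can be sketched as a single threshold comparison. The threshold of 10 seconds and the parameter `left_at` (taken here to be the time of the user's next recorded action) are assumptions for illustration; the source only specifies "a predetermined value" and does not say how the end of a visit is detected.

```python
from datetime import datetime

# Assumed value; the source only refers to "a predetermined value".
DWELL_THRESHOLD_SECONDS = 10.0


def classify_click(clicked_at: datetime, left_at: datetime,
                   threshold: float = DWELL_THRESHOLD_SECONDS) -> str:
    """Classify a click as "positive" when the dwell time is greater than
    or equal to the threshold, and as "negative" otherwise."""
    dwell_seconds = (left_at - clicked_at).total_seconds()
    return "positive" if dwell_seconds >= threshold else "negative"
```

In practice the threshold would likely depend on the content type, since a short news flash and a long article take very different times to read.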

4. Classifying the comment behavior:

For comment behavior, a comment emotion classification model is built from Word2vec word vectors and a long short-term memory (LSTM) model. The model classifies user comments into positive comments and negative comments, where positive comments correspond to positive behaviors and negative comments correspond to negative behaviors.

All of the user comments are preprocessed: using the open-source word segmentation tool jieba, each comment is segmented and meaningless single-character words (such as particles) are removed, so that each comment becomes an ordered combination of words. The processed comments are fed into a Word2vec model, and word vectors are trained with the skip-gram method. Here $W$ denotes the whole vocabulary, $w_i$ is a given word, and $w_j$ is a context word. $u_{w_i}$ and $v_{w_j}$ are the latent vector representations corresponding to the target word $w_i \in W$ and its surrounding context words.

Skip-gram formula in Word2vec: training maximizes $\sum \log p(w_j \mid w_i)$ over the (target word, context word) pairs observed in the comments, wherein

$$p(w_j \mid w_i) = \frac{\exp(u_{w_i}^\top v_{w_j})}{\sum_{w \in W} \exp(u_{w_i}^\top v_{w})}$$

After training, each word can be represented by a high-dimensional vector. The word vectors of the words contained in a comment are summed to obtain the sentence vector of that comment.
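Two parts of this step lend themselves to a short illustration: building the (target word, context word) pairs that skip-gram trains on, and summing trained word vectors into a sentence vector. The sketch below is self-contained and does not call jieba or a Word2vec library; the window size, the toy vectors, and the choice to skip words without a vector are all assumptions.

```python
from typing import Dict, List, Tuple


def skipgram_pairs(words: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for every word and each neighbor
    within `window` positions on either side."""
    pairs = []
    for i, target in enumerate(words):
        start = max(0, i - window)
        end = min(len(words), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((target, words[j]))
    return pairs


def sentence_vector(words: List[str], vectors: Dict[str, List[float]],
                    dim: int) -> List[float]:
    """Sum the word vectors of a segmented comment element by element.
    Words without a trained vector are skipped (an assumption)."""
    total = [0.0] * dim
    for word in words:
        vec = vectors.get(word)
        if vec is None:
            continue
        for k in range(dim):
            total[k] += vec[k]
    return total


# Toy two-dimensional vectors standing in for trained Word2vec output.
toy_vectors = {"good": [1.0, 2.0], "story": [0.5, -1.0]}
comment_vector = sentence_vector(["good", "story"], toy_vectors, dim=2)
```

Note that a plain sum discards word order, so two comments containing the same words in a different order receive the same sentence vector.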

The vectorized comments and their labeled results are fed into an LSTM model for supervised training. This yields an emotion classification model suited to the user comments in this scenario: given a comment as input, the model outputs whether it is a positive comment or a negative comment.

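LSTM model formula:

$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$$

$$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$$

$$\tilde{h}_t = \tanh(W \cdot [r_t * h_{t-1}, x_t])$$

$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$$

wherein $x_t$ is the current input, $h_{t-1}$ is the state passed on from the previous node, and $W_z$, $W_r$, and $W$ are weights.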
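The two gate equations given above have the form of the update gate and reset gate of a gated recurrent unit. As a rough sketch of one recurrent step under that reading, the code below completes them with the standard candidate-state and interpolation steps, which is an assumption. Bias terms are omitted, and the weights are plain nested lists rather than trained parameters.

```python
import math
from typing import List

Matrix = List[List[float]]


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


def matvec(matrix: Matrix, vector: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def gated_step(h_prev: List[float], x_t: List[float],
               W_z: Matrix, W_r: Matrix, W: Matrix) -> List[float]:
    """Compute the next hidden state from h_{t-1} and x_t."""
    concat = h_prev + x_t                                  # [h_{t-1}, x_t]
    z = [sigmoid(v) for v in matvec(W_z, concat)]          # z_t
    r = [sigmoid(v) for v in matvec(W_r, concat)]          # r_t
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x_t   # [r_t * h_{t-1}, x_t]
    candidate = [math.tanh(v) for v in matvec(W, gated)]
    # Interpolate between the previous state and the candidate state.
    return [(1.0 - zi) * hp + zi * hc
            for zi, hp, hc in zip(z, h_prev, candidate)]
```

Each weight matrix has one row per hidden unit and one column per element of the concatenated vector, so a hidden size of 2 with an input size of 1 needs 2 x 3 matrices.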

5. Recalling content

Recall finds the content most similar to the content clicked by the user and ranks it by similarity: the more similar the content, the higher it ranks. An item-based collaborative filtering algorithm is used here to calculate the similarity between content items. Similarity is judged by the Jaccard coefficient, where a larger coefficient value indicates greater similarity.

The formula for the Jaccard coefficient is:

$$J(i, j) = \frac{|N(i) \cap N(j)|}{|N(i) \cup N(j)|}$$

where $N(i)$ denotes the set of users who like content $i$, and $N(j)$ denotes the set of users who like content $j$.
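As a minimal sketch of this recall step, the following computes the Jaccard coefficient as the size of the intersection over the size of the union of the sets of users who like each content item, then ranks the other items against a clicked item. The function names, the `top_k` parameter, and the treatment of two items with no likes as having similarity 0 are assumptions.

```python
from typing import Dict, List, Set, Tuple


def jaccard(users_i: Set[str], users_j: Set[str]) -> float:
    """Jaccard coefficient of two user sets."""
    union = users_i | users_j
    if not union:
        return 0.0  # assumption: no likes on either item means no evidence
    return len(users_i & users_j) / len(union)


def recall_similar(clicked: str, likes: Dict[str, Set[str]],
                   top_k: int = 10) -> List[Tuple[str, float]]:
    """Rank all other content items by similarity to the clicked item,
    most similar first, and keep the top_k."""
    scored = [(item, jaccard(likes[clicked], users))
              for item, users in likes.items() if item != clicked]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: content id -> set of users who like it.
likes = {
    "a": {"u1", "u2", "u3"},
    "b": {"u2", "u3"},
    "c": {"u3", "u4", "u5"},
}
preliminary = recall_similar("a", likes, top_k=2)
```

Here item "b" shares two of three users with "a" and item "c" shares one of five, so "b" ranks first in the preliminary result.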

6. Reordering to generate recommended results

Once the user's behaviors have been classified, the recall results are reordered. For content toward which the user has shown positive behavior, the recall results related to that content are located and weighted up so that they rank higher. For content toward which the user has shown negative behavior, the related recall results are located and either weighted down or deleted from the recall results. After the user's behaviors have been analyzed, the final recommendation result is generated in real time.
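The reordering step can be sketched as a multiplicative adjustment of each recalled item's similarity score, depending on the polarity of the user's behavior toward the content it was recalled from. The factors `BOOST` and `PENALTY`, the tuple layout, and the `drop_negative` switch are assumptions; the source states only that results are weighted up, weighted down, or deleted.

```python
from typing import Dict, List, Tuple

BOOST = 1.5    # assumed up-weighting factor
PENALTY = 0.5  # assumed down-weighting factor


def rerank(recalled: List[Tuple[str, str, float]],
           polarity: Dict[str, str],
           drop_negative: bool = False) -> List[Tuple[str, float]]:
    """Reweight recalled items and sort them by adjusted score.

    recalled: (candidate_id, source_content_id, similarity) triples.
    polarity: source_content_id -> "positive" or "negative".
    """
    final = []
    for candidate, source, similarity in recalled:
        label = polarity.get(source)
        if label == "positive":
            final.append((candidate, similarity * BOOST))
        elif label == "negative":
            if drop_negative:
                continue  # delete the result instead of down-weighting it
            final.append((candidate, similarity * PENALTY))
        else:
            final.append((candidate, similarity))  # no classified behavior
    final.sort(key=lambda pair: pair[1], reverse=True)
    return final


recalled = [("x", "a", 0.5), ("y", "b", 0.6), ("z", "c", 0.4)]
polarity = {"a": "positive", "b": "negative"}
final_ranking = rerank(recalled, polarity)
```

With these toy values, "y" starts with the highest similarity but falls to last place because it was recalled from content the user reacted to negatively.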

Although the present invention has been described in detail with respect to the above embodiments, it will be understood by those skilled in the art that modifications or improvements based on the disclosure of the present invention may be made without departing from the spirit and scope of the invention, and these modifications and improvements are within the spirit and scope of the invention.

Claims

1. A real-time personalized information flow recommendation method based on user behaviors is characterized by comprising the following steps:

collecting user behaviors toward content;

classifying the user behaviors into positive behaviors and negative behaviors;

acquiring content similar to the content clicked by the user, and ranking it by similarity to obtain a preliminary recommendation result;

and weighting the content ranking in the preliminary recommendation result based on the positive behaviors and the negative behaviors to obtain a final recommendation result.

2. The real-time personalized information flow recommendation method based on user behaviors as claimed in claim 1, wherein the user behaviors include comments, and the comments are classified into positive behaviors and negative behaviors by a comment emotion classification model built from Word2vec word vectors and a long short-term memory (LSTM) model.

3. The real-time personalized information flow recommendation method based on user behaviors as claimed in claim 2, wherein the construction method of the comment emotion classification model comprises the following steps:

preprocessing each comment into an ordered combination of words with the word segmentation tool jieba;

feeding the ordered combination into a Word2vec model, and training word vectors with the skip-gram method;

summing the word vectors of the words contained in a comment to obtain the sentence vector of the comment;

and feeding the vectorized comments and the pre-labeled results into an LSTM model for supervised training.

4. The method of claim 1, wherein the user behaviors include clicks, and classifying the clicks into positive behaviors and negative behaviors comprises:

based on the time that the user stays on the clicked content page after clicking, classifying the click as a positive behavior when the time is greater than or equal to a predetermined value, and classifying the click as a negative behavior when the time is less than the predetermined value.

5. The method according to claim 1, wherein acquiring content similar to the content clicked by the user comprises:

calculating the similarity between contents with an item-based collaborative filtering algorithm.

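6. The method of claim 5, wherein the similarity of the contents is judged by the Jaccard coefficient, a larger coefficient value indicating greater similarity, and the formula of the Jaccard coefficient is:

$$J(i, j) = \frac{|N(i) \cap N(j)|}{|N(i) \cup N(j)|}$$

where $N(i)$ denotes the set of users who like content $i$, and $N(j)$ denotes the set of users who like content $j$.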

7. An electronic device, comprising:

a processor; and

a memory storing executable instructions that, when executed by the processor, implement the recommendation method of any one of claims 1-6.

8. A computer-readable storage medium storing computer instructions for implementing the steps of the recommendation method according to any one of claims 1-6 when executed by a processor.

9. A real-time personalized information flow recommendation system based on user behavior, the recommendation system comprising:

a collection module that collects user behaviors toward content;

a classification module that classifies the user behaviors into positive behaviors and negative behaviors;

a recall module that acquires content similar to the content clicked by the user and ranks it by similarity to obtain a preliminary recommendation result; and