CN114564594A - Knowledge graph user preference entity recall method based on double-tower model - Google Patents
- Publication number
- CN114564594A (application CN202210169936.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- embedding
- item
- representing
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Evolutionary Biology (AREA)
- Animal Behavior & Ethology (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a knowledge-graph user-preference entity recall method based on a double-tower model. An optimization method is added to the traditional double-tower model to better learn the interactions between users and items, and the trained model is used to recall entities related to user preferences on the knowledge graph. First, the entities corresponding to the items in the user's history are taken as starting points, and all neighboring entities are retrieved along the edges. The recalled entities are then screened by the trained, optimized double-tower model. Finally, the recalled entities serve as new starting points and the above operations are repeated, ultimately forming a knowledge graph that represents the user's preferences and potential preferences.
Description
Technical Field
The invention relates to the technical field of knowledge graphs and deep learning, and in particular to a knowledge-graph user-preference entity recall method based on a double-tower model.
Background
The knowledge graph is a concept proposed by Google in 2012: a knowledge base Google uses to enhance its search engine. Essentially, a knowledge graph describes the various entities or concepts of the real world and the relationships among them, forming a huge semantic network in which nodes represent entities or concepts and edges are formed by attributes or relationships. Each piece of knowledge is represented as a triple (h, r, t), where h is the head entity, t is the tail entity, and r is the relationship between them. With powerful semantic processing and open organization capabilities, knowledge graphs play an important role in recommender systems, intelligent question answering, and information retrieval, laying the foundation for knowledge organization and intelligent applications in the Internet era.
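To make the triple representation concrete, the following is a minimal Python sketch of a knowledge graph stored as (h, r, t) triples with one-hop neighbor retrieval, the basic operation used by the recall procedure described later; all entity and relation names here are illustrative and not taken from the patent.

```python
from collections import defaultdict

# Minimal sketch: a knowledge graph stored as (h, r, t) triples with
# one-hop neighbor retrieval along outgoing edges.
class KnowledgeGraph:
    def __init__(self):
        self.out_edges = defaultdict(set)  # h -> {(r, t)}

    def add(self, h, r, t):
        self.out_edges[h].add((r, t))

    def neighbors(self, entity):
        """All triples whose head entity is `entity` (one hop)."""
        return {(entity, r, t) for r, t in self.out_edges[entity]}

kg = KnowledgeGraph()
kg.add("Inception", "directed_by", "Christopher Nolan")   # illustrative
kg.add("Christopher Nolan", "directed", "Interstellar")
print(kg.neighbors("Inception"))
```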
Traditional recommender systems use explicit or implicit information as input for prediction and suffer from two main problems. The first is sparsity: in real scenarios, the interaction information between users and items is often very sparse, and using such scarce observations to predict a large amount of unknown information greatly increases the risk of overfitting. The second is cold start: newly added users or items have no corresponding historical information, so they are difficult to model and recommend accurately.
A knowledge graph contains rich semantic associations between entities and provides a potential source of auxiliary information for recommender systems. It introduces additional semantic relationships for items, allowing user interests to be discovered at a deeper level; the diverse relation links in the graph help diversify the recommendation results; and the graph can connect a user's history with the recommendation results, improving the user's satisfaction with and acceptance of the results and strengthening trust in the system.
Existing methods for applying knowledge graphs to recommendation fall into two main categories. Embedding-based methods use a knowledge-graph embedding algorithm to learn the entities and relations in the graph, obtain their vector representations, and introduce these vectors into the recommendation framework; examples include the DKN framework (Deep Knowledge-aware Network) based on convolutional neural networks and the CKE framework (Collaborative Knowledge base Embedding) based on collaborative knowledge-base embedding. Although embedding-based methods are highly flexible, they are usually suited to in-graph link prediction, whereas recommendation scenarios need to mine users' latent interests. Path-based methods explore the various connections between entities in the knowledge graph to provide additional guidance for the recommender system; examples include Personalized Entity Recommendation and Meta-Graph Based Recommendation. Although path-based methods use the knowledge graph in a more natural and intuitive way, they rely heavily on manually designed meta-paths, which are difficult to optimize in practice.
Summary of the Invention
In view of the problems in the prior art, the present invention provides a knowledge-graph user-preference entity recall method based on a double-tower model. An optimization method is added to the traditional double-tower model to better learn the interactions between users and items; the trained double-tower model is used to recall entities related to user preferences on the knowledge graph. First, the entities corresponding to the items in the user's history are taken as starting points, and all neighboring entities are retrieved along the edges; the recalled entities are then screened by the trained, optimized double-tower model; finally, the recalled entities serve as new starting points and the above operations are repeated, ultimately forming a knowledge graph that represents the user's preferences and potential preferences.
The technical scheme adopted by the present invention is as follows:
A knowledge-graph user-preference entity recall method based on a double-tower model comprises the following steps:
1. Define the user feature vector and the item feature vector as the inputs of the double-tower model;
2. Train the double-tower model, optimizing it by combining the in-batch softmax loss function with a hash-sequence-based frequency estimation method;
3. Define the entity mapping between the user's historical interaction matrix and the knowledge graph;
4. Through preference-entity propagation, feed the entities recalled in each round together with the user features into the optimized double-tower model to obtain predicted probabilities, screen the entities with high probability, and finally obtain a knowledge graph representing the user's preferences and potential preferences.
The process of step 1 is as follows:
1.1. User features refer to the user's interactions with items, including click records, search records, social data, personal data, and sample age. The user feature vector is obtained by converting these interaction data into vectors and concatenating them. Converting raw data into vectors is called vector embedding, a method commonly used in machine learning to represent data features; the goal is to extract features from the raw data, i.e., a low-dimensional vector obtained through a neural-network mapping.
Further, the process of 1.1 is as follows:
1.1.1. The embedding of the user's click record is the weighted average of the id embeddings of all clicked items, where an id embedding maps the item's unique identifier to a vector of a fixed dimension and the weight is proportional to the time the item was browsed. The click-record embedding is computed as

$$v_{click} = \sum_{i=1}^{n} w_i \, v_{click,i}$$

where $v_{click}$ is the embedding of the user's click record, $w_i$ is the $i$-th weight, $v_{click,i}$ is the id embedding of the $i$-th item in the click record, and $n$ is the number of embeddings. The weight $w_i$ is computed from $t_i$, the time the user spent browsing item $i$, where $N$ denotes the total number of samples and $k$ the total number of positive examples;
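The following is a hedged sketch of this weighted click-record embedding; since the exact normalization of the browsing-time weights is not specified above, the sketch simply normalizes the weights to sum to one.

```python
import numpy as np

# Sketch of the click-record embedding: a weighted average of the id
# embeddings of clicked items, weights proportional to browsing time
# (normalizing them to sum to 1 is an assumption).
def click_embedding(id_embeddings, browse_times):
    """id_embeddings: (n, d) array; browse_times: (n,) browsing seconds."""
    w = np.asarray(browse_times, dtype=float)
    w = w / w.sum()                        # weights proportional to time
    return w @ np.asarray(id_embeddings)   # v_click, shape (d,)

v_click = click_embedding([[0.1, 0.3], [0.5, 0.2], [0.4, 0.9]], [30, 120, 60])
print(v_click)
```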
1.1.2. The embedding of the user's search record is obtained by segmenting the keywords of historical searches into terms, obtaining the embedding of each term through the Word2vec model, and taking a weighted average of the term embeddings.
Word segmentation is the technique by which a search engine splits the keyword string submitted by the user into individual term tokens.
The Word2vec model, proposed by Mikolov et al. in 2013, converts the content words of a text into space vectors; the value of a word vector is influenced by its context and encodes the associations between words.
其用户搜索记录的embedding的计算公式如下:The calculation formula of the embedding of its user search records is as follows:
其中vsearch表示用户搜索记录的embedding,表示第i个权重,vsearch,i表示搜索记录中第i个词条的embedding,n表示embedding的个数;where v search represents the embedding of the user search record, Represents the ith weight, v search, i represents the embedding of the ith entry in the search record, and n represents the number of embeddings;
搜索记录的embedding的权重计算:The weight calculation of the embedding of the search record:
其中搜索的有效性判断为用户是否在搜索后点击物品;The validity of the search is judged as whether the user clicks the item after the search;
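The following sketch assumes the gensim library for the Word2vec step and plain whitespace tokenization; in the patent the segmentation is performed by the search engine's own tokenizer, and the weights come from whether a click followed each search.

```python
import numpy as np
from gensim.models import Word2Vec

# Sketch of the search-record embedding: segment queries into terms,
# embed terms with Word2vec, then take a weighted average where the
# weight reflects whether a click followed the search.
queries = ["science fiction movie", "space movie soundtrack"]  # illustrative
tokens = [q.split() for q in queries]
w2v = Word2Vec(sentences=tokens, vector_size=16, min_count=1, seed=1)

term_vecs = np.array([w2v.wv[t] for q in tokens for t in q])
clicked = np.array([1, 1, 0, 1, 0, 1], dtype=float)  # click after search?
weights = clicked / clicked.sum()
v_search = weights @ term_vecs     # weighted average of term embeddings
print(v_search.shape)              # (16,)
```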
1.1.3. The user's social data comprise the weighted average of the embeddings corresponding to favorites, likes, and subscriptions. The embeddings for favorite and like data are the id embeddings of the items the user has favorited or liked; the embedding for subscription data is the id embedding of the person in charge of the subscribed item.
The social-data embedding is computed as

$$v_{social} = \sum_{i=1}^{n} w_i \, v_{social,i}$$

where $v_{social}$ is the embedding of the user's social data, $w_i$ is the $i$-th weight, and $v_{social,i}$ is the embedding of the $i$-th social-data item. For favorites and likes, the weight is computed from $t_i$, the time the user spent browsing item $i$; for subscriptions, it is computed from the browsing time of the subscribed party's $i$-th item. In both cases $N$ denotes the total number of samples and $k$ the total number of positive examples;
1.1.4. The user's personal data include gender, age, and region. Gender is a simple binary feature; age and region are continuous features that can be normalized to real values on the [0,1] interval. The embedding of the user's personal data is the vector obtained by concatenating the processed gender, age, and region values;
Further, the process of 1.1.4 is as follows:
1.1.4.1. Compute the binary representation of the user's gender, i.e., encode gender as a binary value $gender \in \{0, 1\}$;
1.1.4.2. Compute the normalized real values of the user's age and region using z-score normalization:

$$z = \frac{X - \mu}{\sigma}$$

where $X$ is the sample value, $\mu$ is the mean of all sample data, and $\sigma$ is the standard deviation of all sample data;
1.1.4.3. Concatenate the gender binary value and the normalized age and region values from 1.1.4 into one vector:

$$v_{personal} = [gender, z_{age}, z_{region}]$$

where $v_{personal}$ is the personal-data vector, $gender$ is the user's gender, and $z_{age}$ and $z_{region}$ are the normalized values of the user's age and region, respectively;
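The following is a minimal sketch of the personal-data vector, assuming the region has already been encoded as a number (the patent treats region as a continuous feature but does not give the encoding) and that gender is encoded as 1 or 0.

```python
import numpy as np

# Sketch of the personal-data vector: binary gender plus z-score
# normalized age and region code.
def z_score(x, population):
    return (x - np.mean(population)) / np.std(population)

ages = np.array([18, 25, 32, 41, 60])        # illustrative population
region_codes = np.array([3, 1, 4, 2, 5])     # region already label-encoded

v_personal = np.array([
    1.0,                       # gender, assumed encoding: 1 male, 0 female
    z_score(25, ages),         # z_age
    z_score(1, region_codes),  # z_region
])
print(v_personal)
```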
1.1.5. Concatenate the click-record embedding, search-record embedding, social-data embedding, and personal-data embedding described in 1.1 to obtain the user feature vector:

$$v_{user} = \text{concatenate}(v_{click}, v_{search}, v_{social}, v_{personal}) = [v_{click}[1], v_{click}[2], \ldots, v_{search}[1], v_{search}[2], \ldots, v_{social}[1], v_{social}[2], \ldots, v_{personal}[1], v_{personal}[2], \ldots]$$

where $v_{user}$ is the user feature vector and $v_{click}[i]$, $v_{search}[i]$, $v_{social}[i]$, and $v_{personal}[i]$ are the $i$-th components of the click, search, social, and personal embeddings, respectively;
1.2. Item features include the item's id and its context information; the item feature vector is formed by concatenating the item's id embedding with the embedding of its context information.
Further, the process of 1.2 is as follows:
1.2.1. The item's id embedding maps the item's unique identifier to a vector of the same dimension;
1.2.2. The item's context embedding is the vector obtained by passing the context information through Word2vec;
1.2.3. Concatenate the id embedding and the context embedding described in 1.2 to obtain the item feature vector:

$$v_{item} = \text{concatenate}(v_{id}, v_{context}) = [v_{id}[1], v_{id}[2], \ldots, v_{context}[1], v_{context}[2], \ldots]$$

where $v_{item}$ is the item feature vector, $v_{id}$ is the item's id embedding, $v_{context}$ is the embedding of the item's context information, and $v_{id}[i]$ and $v_{context}[i]$ are their $i$-th components;
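A short sketch of assembling the two tower inputs by concatenation follows; the component vectors are random stand-ins for the embeddings defined above, and the dimensions are illustrative.

```python
import numpy as np

# Sketch of assembling both tower inputs by concatenation.
rng = np.random.default_rng(0)
v_click, v_search = rng.random(16), rng.random(16)
v_social, v_personal = rng.random(16), rng.random(3)
v_user = np.concatenate([v_click, v_search, v_social, v_personal])

v_id, v_context = rng.random(16), rng.random(16)
v_item = np.concatenate([v_id, v_context])
print(v_user.shape, v_item.shape)  # (51,) (32,)
```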
In step 2, the double-tower model is derived from DSSM (Deep Structured Semantic Models). DSSM is a deep structured semantic model commonly used in natural language processing to solve semantic-similarity problems. From top to bottom, the double-tower model consists of an input layer, a representation layer, and a matching layer; viewed horizontally, it has two input layers, two representation layers, and one matching layer. The outputs of the two input layers feed the two representation layers, whose outputs converge at the matching layer, giving the whole the form of "twin towers". In the present invention, the two inputs of the input layer are the user feature vector and the item feature vector; the two representation layers share the same neural-network structure, and each feature vector is mapped by its network to a vector of the same dimension. Finally, the two vectors are L2-normalized and their inner product is taken. Furthermore, the optimized double-tower model of the present invention applies a hash-sequence-based frequency estimation method, which reduces the sampling bias that negative sampling may introduce in each batch, thereby optimizing the loss function.
The process of step 2 is as follows:
2.1. Define two parameterized embedding functions $u(x; \theta) \in \mathbb{R}^d$ and $v(y; \theta) \in \mathbb{R}^d$, where $\mathbb{R}^d$ denotes a $d$-dimensional real vector. $u(x; \theta)$ and $v(y; \theta)$ are the user feature vector and the item feature vector after extraction by the deep neural networks, where $x$ and $y$ are the input values required by the double-tower model.
Further, the process of 2.1 is as follows:
2.1.1. The double-tower model contains two deep neural network models. The basic unit of a neural network is the perceptron, which has several inputs and one output; the output is a learned linear combination of the inputs, passed through a nonlinear activation function to produce the final result;
2.1.2. A deep neural network extends the basic neural-network model with one input layer, one output layer, and several hidden layers. The present invention uses three hidden layers, with full connections between layers, meaning every node is connected to all nodes of the previous layer;
2.1.3. Input the user feature vector and the item feature vector into their corresponding networks. Since the model must be trained, the outputs are expressed as functions of the parameters $\theta$ to be trained: $u(x; \theta)$ and $v(y; \theta)$, respectively;
2.1.4. Apply L2 normalization to the outputs:

$$\hat{\gamma}_i = \frac{\gamma_i}{\sqrt{\sum_{j} \gamma_j^2}}$$

where $\gamma_i$ is the $i$-th component of the output vector;
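The following is a minimal PyTorch sketch of the representation towers described in 2.1: two MLPs with three hidden layers whose L2-normalized outputs are matched by an inner product. All layer sizes are illustrative assumptions; the patent does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of the double-tower representation layers: an MLP with three
# hidden layers per tower, L2-normalized outputs, inner-product matching.
class Tower(nn.Module):
    def __init__(self, in_dim, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),    # third hidden layer
            nn.Linear(64, out_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), p=2, dim=-1)   # L2 normalization

user_tower, item_tower = Tower(in_dim=51), Tower(in_dim=32)
x, y = torch.randn(4, 51), torch.randn(4, 32)          # feature batches
score = (user_tower(x) * item_tower(y)).sum(dim=-1)    # s(x, y) = <u, v>
print(score.shape)   # torch.Size([4])
```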
2.2. Apply the hash-sequence-based frequency estimation method, i.e., use hash-indexed arrays to record the steps at which items are sampled; this reduces the sampling bias that negative sampling may introduce in each batch and thereby optimizes the loss function. Loop over the following steps;
Further, the process of 2.2 is as follows:
2.2.1. Randomly draw $T$ samples from the sample set:

$$\mathcal{T} = \{(x_i, y_i, r_i)\}_{i=1}^{T}$$

where $x_i$ is the $i$-th user feature vector in the sample, $y_i$ is the $i$-th item feature vector, and $r_i \in [0, 1]$ is the feedback degree of the $i$-th user.
2.2.2. Use the hash-sequence-based frequency estimation method to compute the sampling probability $p_i$ of $y_i$ in each sample;
The process of 2.2.2 is as follows:
2.2.2.1. Set a learning rate $\alpha$, arrays $A$ and $D$ of size $H$, and a hash function $h$ that can map each $y_i$ to an integer in the range $[0, H]$.
2.2.2.2. For each step $t = 1, 2, \ldots$ and for each item $y_i$ in the set of all items in one batch of training samples:
2.2.2.2.1. $D[h(y_i)] \leftarrow (1 - \alpha) \cdot D[h(y_i)] + \alpha \cdot (t - A[h(y_i)])$
where $\leftarrow$ denotes assignment, $A[h(y_i)]$ is the step at which $y_i$ was last sampled, and $D[h(y_i)]$ is the estimated number of steps between consecutive samplings of $y_i$;
2.2.2.2.2. $A[h(y_i)] \leftarrow t$
2.2.2.3. For each item, the sampling probability is $p_i = 1 / D[h(y_i)]$.
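The following sketch implements the streaming frequency estimation of 2.2.2 as reconstructed above, using Python's built-in hash as the hash function h; the array size, learning rate, and initialization are assumptions.

```python
import numpy as np

# Sketch of the hash-based streaming frequency estimation: per hash
# bucket, A records the last step an item was sampled and D the running
# estimate of steps between samplings; p = 1/D.
H, alpha = 1024, 0.05          # assumed array size and learning rate
A = np.zeros(H)
D = np.full(H, 1e6)            # large init: unseen items look rare

def update(batch_item_ids, t):
    for item_id in batch_item_ids:
        b = hash(item_id) % H
        D[b] = (1 - alpha) * D[b] + alpha * (t - A[b])
        A[b] = t

def sampling_prob(item_id):
    return 1.0 / D[hash(item_id) % H]

for t in range(1, 501):                       # simulated training steps
    update(np.random.zipf(2.0, size=8), t)    # skewed item popularity
print(sampling_prob(1), sampling_prob(987654))
```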
2.2.3. Compute the optimized loss function and give its derivation;
The process of 2.2.3 is as follows:
2.2.3.1. The vector inner product gives the matching score: $s(x, y) = \langle u(x; \theta), v(y; \theta) \rangle$;
2.2.3.2. Given a vector $x$, the probability of obtaining one item $y$ out of $M$ items can be computed with the softmax function:

$$\mathcal{P}(y \mid x; \theta) = \frac{e^{s(x, y)}}{\sum_{j=1}^{M} e^{s(x, y_j)}}$$

where the $\theta$ after the semicolon denotes the parameters and $e$ is the natural constant;
2.2.3.3. The weighted log-likelihood loss function is

$$L_{\mathcal{T}}(\theta) = -\frac{1}{T} \sum_{i \in [T]} r_i \cdot \log \mathcal{P}(y_i \mid x_i; \theta)$$

where $T$ denotes the $T$ samples drawn randomly in 2.2.1;
2.2.3.4. The present invention uses a negative-sampling algorithm, which applies a smoothing strategy that raises the sampling probability of low-frequency samples; negative samples are drawn within the same batch. Given a mini-batch $\{(x_i, y_i, r_i)\}_{i=1}^{B}$, the batch-softmax function is

$$\mathcal{P}_B(y_i \mid x_i; \theta) = \frac{e^{s(x_i, y_i)}}{\sum_{j \in [B]} e^{s(x_i, y_j)}}$$
2.2.3.5. In each batch, item popularity follows a power-law distribution, so randomly sampling negatives within a batch makes popular items easy to sample, and the loss function then over-penalizes these popular items. The score is therefore corrected by frequency:

$$s_c(x_i, y_i) = s(x_i, y_i) - \log(p_i)$$

where $s_c(x_i, y_i)$ is the corrected value of $s(x_i, y_i)$ and $p_i$ comes from the hash-sequence-based frequency estimation algorithm of 2.2.2;
2.2.3.6. The corrected conditional probability function is therefore

$$\mathcal{P}_B^c(y_i \mid x_i; \theta) = \frac{e^{s_c(x_i, y_i)}}{e^{s_c(x_i, y_i)} + \sum_{j \in [B], j \neq i} e^{s_c(x_i, y_j)}}$$
2.2.3.7. The final loss function is

$$L_B(\theta) = -\frac{1}{B} \sum_{i \in [B]} r_i \cdot \log \mathcal{P}_B^c(y_i \mid x_i; \theta)$$
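The following is a sketch of the corrected in-batch softmax loss of 2.2.3.4 through 2.2.3.7, assuming PyTorch tensors for the tower outputs; each in-batch candidate's score is shifted by the log of its estimated sampling probability before the softmax.

```python
import torch
import torch.nn.functional as F

# Sketch of the bias-corrected in-batch softmax loss: score every
# user-item pair in the batch, subtract log(p_j) for each candidate
# item, then take softmax over the batch.
def corrected_in_batch_softmax_loss(u, v, r, p):
    """u, v: (B, d) normalized tower outputs; r: (B,) feedback degrees;
    p: (B,) estimated sampling probabilities of the in-batch items."""
    scores = u @ v.t()                 # scores[i, j] = s(x_i, y_j)
    scores = scores - torch.log(p)     # s_c(x_i, y_j) = s - log(p_j)
    log_probs = F.log_softmax(scores, dim=1)
    idx = torch.arange(u.size(0))
    return -(r * log_probs[idx, idx]).mean()

B, d = 4, 32
u = F.normalize(torch.randn(B, d), dim=-1)
v = F.normalize(torch.randn(B, d), dim=-1)
loss = corrected_in_batch_softmax_loss(u, v, torch.ones(B),
                                       torch.full((B,), 0.01))
print(loss.item())
```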
2.2.4. Use the gradient descent algorithm to update the parameters $\theta$ toward their optimal values. Gradient descent is an algorithm commonly used in machine learning to solve for model parameters;
The process of step 3 is as follows:
3.1. Given a user interaction matrix $Y \in \{0, 1\}^{|U| \times |V|}$, whose entry $y_{uv}$ indicates the interaction between the $u$-th user and the $v$-th item, where $U$ and $V$ denote the user set and the item set, respectively. The user interaction matrix $Y$ is expressed as

$$y_{uv} = \begin{cases} 1, & \text{if an interaction between user } u \text{ and item } v \text{ is observed} \\ 0, & \text{otherwise} \end{cases}$$
3.2. Define $O_{i,j}$ as the interaction of user $i$ with item $j$, and map the items user $i$ has interacted with to entities of the knowledge graph;
Further, the process of 3.2 is as follows:
3.2.1. $O_{i,j}$ is an interaction matrix composed of the users' row vectors; $O_{i,j}$ denotes the interaction of user $i$ with item $j$, where $j$ is the item index;
3.2.2. Define a HashMap to store all items, stored as key-value pairs in which the key holds the item's index and the value holds the entity corresponding to the item;
3.2.3. A row vector of the user interaction matrix $O$ represents one user's interactions; store this row vector in an array, defined as $E$;
3.2.4. Define a temporary set temp_set for storing entities. Traverse the array $E$; whenever an element's value is 1, use its index to access the item HashMap, obtain the corresponding entity, and store it in temp_set;
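A minimal sketch of this mapping follows, using a Python dict in place of the HashMap; the item names are illustrative.

```python
# Sketch of step 3: map one row of the interaction matrix O to
# knowledge-graph entities through the item HashMap (a dict here).
item_map = {0: "Inception", 1: "Interstellar", 2: "Dunkirk"}  # illustrative
E = [1, 0, 1]         # one user's interaction row

temp_set = {item_map[j] for j, flag in enumerate(E) if flag == 1}
print(temp_set)       # {'Inception', 'Dunkirk'}
```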
The process of step 4 is as follows:
4.1. For a given user, provide the user feature vector $v_{user}$ described in step 1;
4.2. Initialize the user's temporary set temp_set as in step 3;
4.3. Define a HashMap of the user's preferences, named user_map, whose key stores the loop count and whose value stores a set of triples;
4.4. Define a set used_set for storing entities that have already been recalled;
4.5. Let the loop count be $k = 1, 2, 3, \ldots$ and fix a maximum value $K$; when the number of triples in user_map exceeds $K$, exit the loop. Each iteration executes the following steps (a code sketch of the full loop is given after 4.6):
4.5.1. Traverse the entities in temp_set and remove those already in used_set:

temp_set ← temp_set − used_set

where ← denotes assignment and − denotes the set-difference operation;
4.5.2. Find the triple set $\Lambda_k$ in the knowledge graph and store it in user_map, defined as

$$\Lambda_k = \{(h', r', t') \mid h' \in \text{temp\_set},\ (h', r', t') \in \mathcal{G}\}$$

where $(h', r', t')$ is a triple with head entity $h'$, relation $r'$, and tail entity $t'$, and $\mathcal{G}$ denotes the knowledge graph;
4.5.3. Since the knowledge graph contains cycles, to avoid collecting entities that have already been recalled, the entities in temp_set must be added to used_set;
4.5.4. Take the tail entities of the triples in $\Lambda_k$ and store them in a set, denoted here as $E_k$;
4.5.5. Traverse the entities in $E_k$ and take out the corresponding item feature vectors $v_{item}$;
4.5.6. Input the parameters $v_{user}$ and $v_{item}$ described in step 2 into the double-tower model to obtain the probability of the user for each corresponding item in $E_k$, and sort the entities by that probability;
4.5.7. Given a value $\tau$, $0 < \tau \leq 1$, used to determine the number of entities retained by screening:

$$E_k' = \text{newSet}\big(\text{sortByProb}(E_k).\text{getSubVec}(1, \lceil \tau \cdot |E_k| \rceil)\big)$$

where sortByProb denotes the operation that sorts the elements of the set by probability and returns an array, getSubVec($i$, $j$) extracts the sub-array from the $i$-th to the $j$-th element of the original array, newSet() converts an array into a set, and $\lceil \cdot \rceil$ denotes rounding up;
4.5.8. From $\Lambda_k$, filter out the triples whose tail entity belongs to $E_k'$:

$$\Lambda = \{(h', r', t') \in \Lambda_k \mid t' \in E_k'\}$$
4.5.9. Store the triple set $\Lambda$ into user_map via the call user_map($k$, $\Lambda$);
4.5.10. Overwrite temp_set with the set $E_k'$;
4.6. After 4.5 completes, the knowledge graph of the user's preferences is finally obtained.
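As referenced in 4.5, the following sketch ties the propagation loop together; kg.neighbors, the score function, and the stopping parameters are the assumed interfaces of the components sketched earlier, not APIs fixed by the patent.

```python
import math

# Sketch of the preference-entity propagation loop of step 4.
# kg.neighbors(e) -> {(h, r, t)} and score(v_user, entity) -> probability
# are the assumed interfaces of the components sketched earlier.
def propagate(kg, v_user, seed_entities, score, K=100, tau=0.5):
    temp_set, used_set, user_map, total = set(seed_entities), set(), {}, 0
    k = 1
    while total <= K and temp_set:
        temp_set -= used_set                                      # 4.5.1
        lam_k = {tr for h in temp_set for tr in kg.neighbors(h)}  # 4.5.2
        used_set |= temp_set                                      # 4.5.3
        tails = sorted({t for (_, _, t) in lam_k},                # 4.5.4-4.5.6
                       key=lambda e: score(v_user, e), reverse=True)
        keep = set(tails[:math.ceil(tau * len(tails))])           # 4.5.7
        lam = {tr for tr in lam_k if tr[2] in keep}               # 4.5.8
        user_map[k] = lam                                         # 4.5.9
        total += len(lam)
        temp_set = keep                                           # 4.5.10
        k += 1
    return user_map

# e.g. propagate(kg, v_user, {"Inception"}, score=lambda u, e: 0.5)
```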
The beneficial effects of the present invention are as follows: the entities preferred by the user are screened from the knowledge graph by the optimized double-tower model. The optimization uses the hash-sequence-based frequency estimation method, allowing items to adapt better to a variety of data distributions. Screening the knowledge-graph entities not only yields higher-quality data, with the recalled entities genuinely close to the user's preferences, but also benefits deep recall on the knowledge graph: the number of entities recalled in each round can grow explosively with the cardinality of the previous round, and without screening this would harm computational efficiency and the exploration of the user's potential preferences.
Detailed Description
The present invention is further described below with reference to an embodiment.
实施例:Example:
第一步:定义用户特征向量和物品特征向量,作为双塔模型的输入;The first step: define the user feature vector and item feature vector as the input of the twin tower model;
所述第一步的过程如下:The process of the first step is as follows:
1.1、用户特征指的是用户对物品的交互行为,包括点击记录,搜索记录,社交数据,个人数据和样本年龄,用户特征向量是将上述交互数据转化为向量并做拼接(concatenate)。其中将原始数据转化为向量的方式称为向量嵌入(embedding),向量嵌入(embedding)是机器学习中常用的表示数据特征的方法,目的是将原始数据提取出特征,也就是通过神经网络映射之后的低维向量;1.1. User feature refers to the user's interaction behavior with items, including click records, search records, social data, personal data and sample age. User feature vector is to convert the above interaction data into vectors and concatenate them. The method of converting raw data into vectors is called vector embedding. Vector embedding is a method commonly used in machine learning to represent data features. The purpose is to extract features from raw data, that is, after mapping through neural networks. The low-dimensional vector of ;
更进一步,所述1.1的流程如下Further, the process of 1.1 is as follows
1.1.1、用户点击记录的embedding,是所有点击物品的id类embedding的加权平均,其中id类embedding是将物品唯一标识符映射到同一维度的向量,其权重与浏览物品时间成正比。其用户点击记录的embedding计算公式如下:1.1.1. The embedding of the user click record is the weighted average of the id-type embeddings of all clicked items. The id-type embedding is a vector that maps the unique identifier of the item to the same dimension, and its weight is proportional to the browsing time of the item. The embedding calculation formula of its user click record is as follows:
其中vclick表示用户点击记录的embedding,表示第i个权重,vclick,i表示点击记录中第i个物品的id类embedding,n表示embedding的个数;其中,可通过如下公式计算:Where v click represents the embedding of the user click record, represents the ith weight, v click, i represents the id class embedding of the ith item in the click record, and n represents the number of embeddings; among them, It can be calculated by the following formula:
其中表示用户对物品i浏览的时间,N表示样本总数,k表示正例总数;in Represents the time that the user browses the item i, N represents the total number of samples, and k represents the total number of positive examples;
1.1.2、用户搜索记录的embedding是历史搜索的关键词进行分词得到词条。分词的过程是通过Word2vec模型得到对应词条的embedding,然后将用户搜索记录的embedding进行加权平均。1.1.2. The embedding of the user's search record is the keyword of the historical search for word segmentation to obtain the entry. The process of word segmentation is to obtain the embedding of the corresponding entry through the Word2vec model, and then perform a weighted average of the embeddings of the user search records.
其中分词是搜索引擎针对用户提交搜索的关键词串进行切分成不同词条token的技术。The word segmentation is a technology in which the search engine divides the keyword string submitted by the user into different entry tokens.
Word2vec模型由Mikolov等人于2013年提出,该模型将文本中的内容词汇通过转换处理,化简为空间向量,词向量的数值受上下文的影响,蕴含了词与词之间相互的关联性。The Word2vec model was proposed by Mikolov et al. in 2013. This model converts the content words in the text and simplifies them into space vectors. The value of the word vector is affected by the context, which implies the correlation between words.
其用户搜索记录的embedding的计算公式如下:The calculation formula of the embedding of its user search records is as follows:
其中vsearch表示用户搜索记录的embedding,表示第i个权重,vsearch,i表示搜索记录中第i个词条的embedding,n表示embedding的个数;where v search represents the embedding of the user search record, Represents the ith weight, v search, i represents the embedding of the ith entry in the search record, and n represents the number of embeddings;
搜索记录的embedding的权重计算:The weight calculation of the embedding of the search record:
其中搜索的有效性判断为用户是否在搜索后点击物品;The validity of the search is judged as whether the user clicks the item after the search;
1.1.3、用户的社交数据包括收藏、点赞和订阅数据对应的embedding加权平均。其中收藏和点赞数据对应的embedding指的是用户收藏和点赞的物品id类的embedding;订阅数据对应的embedding指的是用户订阅物品对应的负责人的id类的embedding。1.1.3. The user's social data includes the embedding weighted average of favorites, likes, and subscription data. The embedding corresponding to the collection and like data refers to the embedding of the id category of the items that the user has favorited and liked; the embedding corresponding to the subscription data refers to the embedding of the id category of the person in charge corresponding to the user subscribed items.
其用户社交数据的embedding的计算公式如下:The calculation formula of the embedding of its user social data is as follows:
其中vsocial表示用户搜索记录的embedding,表示第i个权重,vsocial,i表示搜索记录中第i个社交数据的embedding;where v social represents the embedding of user search records, represents the ith weight, v social,i represents the embedding of the ith social data in the search record;
对于收藏和点赞的embedding的权重计算:For the weight calculation of the favorite and liked embeddings:
其中表示用户对物品i浏览的时间,N表示样本总数,k表示正例总数;in Represents the time that the user browses the item i, N represents the total number of samples, and k represents the total number of positive examples;
对于订阅的embedding权重计算:For subscription embedding weight calculation:
其中示被订阅者第i个物品的浏览时间,N表示样本总数,k表示正例总数;in Indicates the browsing time of the ith item of the subscriber, N represents the total number of samples, and k represents the total number of positive examples;
1.1.4、用户的个人数据包括用户的性别、年龄和地域。其中性别是简单的二元特征,年龄和地域属于连续型特征,可以将其归一化为[0,1]区间上的实数值。用户个人数据的embedding,是将处理过的性别、年龄和地域的值做拼接操作后得到的向量;1.1.4. The user's personal data includes the user's gender, age and region. Among them, gender is a simple binary feature, and age and region are continuous features, which can be normalized to real values in the [0,1] interval. The embedding of the user's personal data is a vector obtained by splicing the processed values of gender, age and region;
更进一步,1.1.4的流程如下:Going a step further, the process for 1.1.4 is as follows:
1.1.4.1、计算用户性别的二元表示,其公式如下:1.1.4.1. Calculate the binary representation of user gender, the formula is as follows:
1.1.4.2、计算用户的年龄和地域的归一化实数值,其归一化公式如下:1.1.4.2. Calculate the normalized real value of the user's age and region. The normalization formula is as follows:
其中X表示样本数值,μ为所有样本数据的均值,σ为所有样本数据的标准差;where X represents the sample value, μ is the mean of all sample data, and σ is the standard deviation of all sample data;
1.1.4.3、将1.1.4所述的性别二元值,年龄和地域归一化实数值做拼接操作得到一个向量,这个向量拼接操作公式如下:1.1.4.3. Perform a splicing operation on the gender binary value, age and region normalized real value described in 1.1.4 to obtain a vector. The vector splicing operation formula is as follows:
vpersonal=[gender,zage,zregion]v personal = [gender, z age , z region ]
其中vpersonal表示用户特征向量,gender表示用户性别,zage和zregion分别表示用户的年龄和地域的归一化值;where v personal represents the user feature vector, gender represents the user’s gender, and z age and z region represent the normalized value of the user’s age and region, respectively;
1.1.5、将1.1流程所述的用户点击记录的embedding,用户搜索记录的embedding,用户交互数据的embedding,用户个人数据的embedding做concatenate连接操作得到用户特征向量,其公式如下:1.1.5. The user feature vector is obtained by performing the concatenate connection operation on the embedding of the user click record, the embedding of the user search record, the embedding of the user interaction data, and the embedding of the user's personal data described in the process of 1.1. The formula is as follows:
vuser=concatenate(vclick,vsearch,vsocial,vpersonal)v user =concatenate(v click ,v search ,v social ,v personal )
=[vclick[1],vclick[2],…,vsearch[1],vsearch[2],…,vsocial[1],vsocial[2],…,vpersonal[1],vpersonal[2],…]=[v click [1],v click [2],…,v search [1],v search [2],…,v social [1],v social [2],…,v personal [1], v personal [2],…]
其中vuser表示用户特征向量,vclick[i]表示用户点击embedding的第i个分量,vsearch[i]表示用户搜索记录embedding的第i个分量,vsocial[i]表示用户社交数据embedding的第i个分量,vpersonal[i]表示用户个人数据embedding的第i个分量;where v user represents the user feature vector, v click [i] represents the ith component of the user click embedding, v search [i] represents the ith component of the user search record embedding, and v social [i] represents the user's social data embedding The i-th component, v personal [i] represents the i-th component of the user's personal data embedding;
1.2、物品特征包括物品的id及其上下文信息,物品特征向量由物品的id类embedding于其上下文信息的embedding拼接而成;1.2. Item features include the id of the item and its context information, and the item feature vector is formed by splicing the id class embedding of the item and the embedding of its context information;
更进一步,1.2的流程如下:Going a step further, the process for 1.2 is as follows:
1.2.1、给出物品的id类embedding,是将物品唯一标识符映射到同一维度的向量;1.2.1. The id class embedding of the item is given, which is a vector that maps the unique identifier of the item to the same dimension;
1.2.2、给出物品的上下文信息embedding,是将上下文信息通过Word2vec得到的向量;1.2.2. The context information embedding of the item is given, which is a vector obtained by passing the context information through Word2vec;
1.2.3、将1.2所述的id类embedding和上下文信息embedding做concatenate连接操作得到物品特征向量,其公式如下:1.2.3. Perform the concatenate connection operation of the id class embedding and the context information embedding described in 1.2 to obtain the item feature vector. The formula is as follows:
vitem=concatenate(vid,vcontext)=[vid[1],vid[2],…,vcontext[1],vcontext[2],…]v item =concatenate(v id ,v context )=[v id [1],v id [2],…,v context [1],v context [2],…]
其中vitem表示物品特征向量,vid表示物品的id类embedding,vcontext表示物品上下文信息的embedding,vid[i]表示物品的id类embedding的第i个分量,vcontext[i]表示物品上下文信息的embedding的第i个分量;Where v item represents the feature vector of the item, v id represents the id class embedding of the item, v context represents the embedding of the item context information, v id [i] represents the i-th component of the id class embedding of the item, and v context [i] represents the item The ith component of the embedding of context information;
第二步:训练双塔模型,结合in-batch softmax损失函数与基于哈希序列的频率估计方法对双塔模型进行优化;The second step: train the double-tower model, and optimize the double-tower model by combining the in-batch softmax loss function and the frequency estimation method based on the hash sequence;
所述第二步中,双塔模型衍生于DSSM(Deep Structured Semantic Models)。DSSM是深度结构化语义模型,在自然语言处理领域中,DSSM常用于解决语义相似度的问题。双塔模型从上往下可以分为输入层、表示层、匹配层,从水平方向看分为两个输入层,两个表示层和一个匹配层。其中两个输入层的输出分别是两个表示层的输入,两个表示层输出汇总到匹配层,整体呈现“双塔”的形式。本发明中,双塔模型的输入层的两个输入分别是用户特征向量和物品特征向量,两个表示层是相同的神经网络结构,特征向量通过神经网络计算后会得到相同维度的向量。最后将这两个向量经过L2归一化后进行内积。进一步的,本发明优化的双塔模型,是采用了基于哈希序列的频率估计方法对双塔模型进行优化。该方法减少了负采样在每个batch中可能会出现的采样偏差问题,从而优化了损失函数。In the second step, the twin tower model is derived from DSSM (Deep Structured Semantic Models). DSSM is a deep structured semantic model. In the field of natural language processing, DSSM is often used to solve the problem of semantic similarity. The dual-tower model can be divided into input layer, presentation layer and matching layer from top to bottom, and from the horizontal direction, it is divided into two input layers, two presentation layers and one matching layer. The outputs of the two input layers are the inputs of the two presentation layers, respectively, and the outputs of the two presentation layers are aggregated to the matching layer, and the whole is in the form of a "twin tower". In the present invention, the two inputs of the input layer of the twin-tower model are the user feature vector and the item feature vector respectively, the two representation layers have the same neural network structure, and the feature vector will obtain a vector of the same dimension after being calculated by the neural network. Finally, the two vectors are L2 normalized and then inner product is performed. Further, the optimized twin-tower model of the present invention adopts a frequency estimation method based on a hash sequence to optimize the twin-tower model. This method reduces the possible sampling bias problem of negative sampling in each batch, thereby optimizing the loss function.
所述第二步的流程如下:The process of the second step is as follows:
2.1、给出两个带参数的embedding函数:其中表示d维实数向量。2.1. Give two embedding functions with parameters: in Represents a d-dimensional real vector.
是通过深度神经网络提取后的用户特征向量和物品特征向量,其中和分别是双塔模型需要的输入值。 are the user feature vector and item feature vector extracted by the deep neural network, where and are the input values required by the twin-tower model, respectively.
更进一步,2.1的流程如下Further, the process of 2.1 is as follows
2.1.1、双塔模型含有两个深度神经网络模型。神经网络模型的组成单元是感知机,感知机有一个若干个输入和一个输出,其中输出和输入是训练到的一个线性关系,输出的值会通过一个非线性激活函数得到最终结果;2.1.1. The twin tower model contains two deep neural network models. The component unit of the neural network model is the perceptron. The perceptron has several inputs and one output. The output and the input are a linear relationship trained, and the output value will get the final result through a nonlinear activation function;
2.1.2、深度神经网络在神经网络模型上做了扩展,深度神经网络有一个输入层,一个输出层和若干个隐藏层,本发明采用了3层隐藏层,并且层与层之间采用全连接,全连接是指的每一个结点都与上一层的所有结点相连;2.1.2. The deep neural network is extended on the neural network model. The deep neural network has one input layer, one output layer and several hidden layers. Connection, full connection means that each node is connected to all nodes in the previous layer;
2.1.3、将用户特征向量和物品特征向量输入到对应的神经网络,由于模型需要训练,所以输出结果用含有待训练参数θ的函数来表示,输出结果分别为:其中 2.1.3. Input the user feature vector and item feature vector into the corresponding neural network. Since the model needs to be trained, the output result is represented by a function containing the parameter θ to be trained. The output results are: in
2.1.4、将输出结果做L2归一化处理,L2归一化公式如下:2.1.4, will output the result To do L2 normalization, the L2 normalization formula is as follows:
其中γi表示[Y]向量上第i个分量; in γ i represents the i-th component on the [Y] vector;
2.2、通过基于哈希序列的频率估计方法,即利用哈希序列来记录采样的序列号,减少了负采样在每个batch中可能会出现的采样偏差问题,从而优化了损失函数,循环以下步骤;2.2. Through the frequency estimation method based on the hash sequence, that is, using the hash sequence to record the serial number of the sampling, the sampling deviation problem that may occur in each batch of negative sampling is reduced, thereby optimizing the loss function, and looping the following steps ;
更进一步,2.2的流程如下:Going a step further, the process for 2.2 is as follows:
2.2.1、从样本集中随机采取T个样本,表示如下:2.2.1. Take T samples randomly from the sample set, which are expressed as follows:
其中xi表示样本T中的第i个用户特征向量, where x i represents the ith user feature vector in sample T,
yi表示样本T中的第i个物品特征向量,y i represents the i-th item feature vector in the sample T,
ri表示样本T中第i个用户反馈程度,且ri ∈[0,1]。ri represents the feedback degree of the ith user in the sample T, and ri ∈ [0,1].
2.2.2、利用基于哈希序列的频率估计方法计算每个样本中yi的概率pi;2.2.2. Calculate the probability p i of y i in each sample by using the frequency estimation method based on the hash sequence;
所述2.2.2流程如下:The process described in 2.2.2 is as follows:
2.2.2.1、设学习率α,大小为H的数组A和D,哈希函数h;2.2.2.1. Let the learning rate α, the arrays A and D of size H, and the hash function h;
其中哈希函数h可以把每个yi映射到范围在[0,H]的整数。where the hash function h can map each yi to an integer in the range [0, H].
2.2.2.2、对于每一步t=1,2,…,在训练样本数量的一个批次(batch)中的所有物品集合对于每一物品有:2.2.2.2. For each step t = 1, 2, ..., the set of all items in a batch of the number of training samples for each item Have:
2.2.2.2.1、 2.2.2.2.1,
其中←表示赋值,表示yi上次被采样到的时刻,表示yi被采样一次经过的步数;where ← means assignment, represents the last time yi was sampled, Indicates the number of steps that yi is sampled once;
2.2.2.2.2、 2.2.2.2.2,
2.2.2.3、对于每一个采样概率 2.2.2.3, for each sampling probability
2.2.3、计算优化后的损失函数并给出推导过程;2.2.3. Calculate the optimized loss function And give the derivation process;
所述2.2.3流程如下:The process described in 2.2.3 is as follows:
2.2.3.1、给出向量内积公式:2.2.3.1. Give the vector inner product formula:
2.2.3.2、给定一个向量x,从M个物品中得到其中一个的概率可以利用softmax函数计算,其公式如下:2.2.3.2. Given a vector x, get one of the M items The probability of can be calculated using the softmax function, the formula is as follows:
其中;符号后面的θ表示所带的参数,e为自然常数;Among them; θ after the symbol means With the parameters, e is a natural constant;
2.2.3.3、加权对数似然损失函数为:2.2.3.3. The weighted log-likelihood loss function is:
其中T表示2.2.1所述随机采取的T个样本;where T represents T samples randomly taken as described in 2.2.1;
2.2.3.4、本发明利用负采样算法进行计算,负采样算法是采用了一种平滑策略,能够提高低频样本的采样几率。我们在同一batch中采样负样本。给出一个最小batch,为其中batch-softmax函数为:2.2.3.4. The present invention uses a negative sampling algorithm for calculation, and the negative sampling algorithm adopts a smoothing strategy, which can improve the sampling probability of low-frequency samples. We sample negative samples in the same batch. gives a minimum batch, which is The batch-softmax function is:
2.2.3.5、在每一batch中,由于存在幂分布现象,即在每个batch中随机采样负样本,会使热门物品容易被采样到,在损失函数中就过度惩罚了这些热门商品,因此考虑用频率对采样进行修正,其公式如下:2.2.3.5. In each batch, due to the power distribution phenomenon, that is, random sampling of negative samples in each batch will make popular items easy to be sampled, and these popular items will be excessively punished in the loss function, so consider The sampling is corrected by frequency, and the formula is as follows:
sc(xi,yi)=s(xi,yi)-log(pi)s c (x i , y i )=s(x i , y i )-log( pi )
其中sc(xi,yi)表示s(xi,yi)的修正值,pi来自2.2.2所述的基于哈希序列的频率估计算法;where s c ( xi , y i ) represents the correction value of s ( xi , y i ), and pi comes from the frequency estimation algorithm based on hash sequence described in 2.2.2;
2.2.3.6、因此修正后的条件概率函数为:2.2.3.6. Therefore, the modified conditional probability function is:
2.2.3.7、最后损失函数为:2.2.3.7. The final loss function is:
2.2.4、采用梯度下降算法更新参数θ,使其接近最优值。其中梯度下降算法2.2.4. Use the gradient descent algorithm to update the parameter θ to make it close to the optimal value. The gradient descent algorithm
机器学习常用的算法,用于求解模型参数;Algorithms commonly used in machine learning to solve model parameters;
第三步:定义用户历史交互矩阵与知识图谱的实体映射关系;Step 3: Define the entity mapping relationship between the user history interaction matrix and the knowledge graph;
所述第三步流程如下:The third step process is as follows:
3.1、给定一个用户交互矩阵 表示第个用户与第个物品的交互情况,U和V分别表示用户集合和物品集合;3.1. Given a user interaction matrix means the first users and The interaction of each item, U and V represent the user set and the item set, respectively;
用户交互矩阵Y的表达公式如下:The expression formula of the user interaction matrix Y is as follows:
3.2、定义Oi,j表示用户i对物品j的交互情况,将用户i存在交互的物品映射到知识图谱的实体;3.2. Define O i,j to represent the interaction of user i with item j, and map the items that user i interacts with to the knowledge graph entity;
更进一步,所述3.2流程如下:Further, the 3.2 process is as follows:
3.2.1、Oi,j是一个交互矩阵,由用户的行向量组成。Oi,j表示用户i对物品j的交互情况,其中j为物品的索引;3.2.1. O i,j is an interaction matrix consisting of row vectors of users. O i,j represents the interaction of user i with item j, where j is the index of the item;
3.2.2、定义一个HashMap用于存放全部物品,HashMap以key-value键值对的方式存放,这里的key存放物品的索引,value存放物体对应的实体;3.2.2. Define a HashMap to store all items. HashMap is stored in the form of key-value key-value pairs. The key here stores the index of the item, and the value stores the entity corresponding to the object;
3.2.3,从用户交互矩阵O的行向量表示一个用户的交互情况,把这个行向量存放于一个数组,这个数组定义为Ε;3.2.3, from the row vector of the user interaction matrix O to represent the interaction situation of a user, this row vector is stored in an array, and this array is defined as E;
3.2.3、定义一个临时集合temp_set用户存放实体,遍历数组Ε,如果数组里元素的值为1,则根据这个元素的索引访问物品HashMap,获取对应的实体,存放于临时集合temp_set中;3.2.3. Define a temporary set temp_set user to store entities, traverse the array E, if the value of the element in the array is 1, access the item HashMap according to the index of this element, obtain the corresponding entity, and store it in the temporary set temp_set;
第四步:通过偏好实体传播的方式,将每次传播召回到的实体与用户特征输入到优化的双塔模型户偏得出预测概率,筛选概率高的实体,最终得到表示用户偏好和潜在偏好的知识图谱;Step 4: Through the method of preference entity propagation, input the entities and user characteristics recalled by each propagation into the optimized twin-tower model to obtain the predicted probability, screen the entities with high probability, and finally obtain the representation of user preference and potential preference. knowledge graph;
The flow of the fourth step is as follows:
4.1. For a given user, provide the user feature vector v_user described in the first step;
4.2. Initialize the user's temporary set temp_set as described in the third step;
4.3. Define a HashMap of the user's preferences, named user_map, whose key stores the loop count and whose value stores a set of triples;
4.4. Define a set used_set for storing the entities that have already been recalled;
4.5. Let the loop count k = 1, 2, 3, …. Given a maximum value K, exit the loop once the number of triples in user_map exceeds K; each iteration executes the following steps (a consolidated sketch of this loop is given after 4.6):
4.5.1. Traverse the entities in temp_set and remove those already in used_set:

temp_set ← temp_set - used_set

where ← denotes assignment and - denotes the set difference operation;
4.5.2. Find from the knowledge graph G the set S_k of triples whose head entity lies in temp_set, and store it in user_map; S_k is defined as:

S_k = {(h', r', t') | (h', r', t') ∈ G, h' ∈ temp_set}
where (h', r', t') denotes a triple, h' the head entity, r' the relation, and t' the tail entity;
4.5.3. Since the knowledge graph contains cycles, the entities in temp_set must be added to used_set to prevent entities that have already been recalled from being collected again;
4.5.4. Take out the tail entities of the triples in S_k and store them in a set T_k;
4.5.5. Traverse the entities in T_k and take out the corresponding item feature vectors v_item;
4.5.6. Input the parameters v_user and v_item described in the second step into the double-tower model to obtain the probability of the user for each corresponding item in T_k, and sort the entities by this probability;
4.5.7. Given a value τ, 0 < τ ≤ 1, which determines the number of entities kept by the screening:

T'_k = newSet(sortByProb(T_k).getSubVec(1, ⌈τ · |T_k|⌉))

where sortByProb(·) sorts the elements of the set by probability and returns an array, getSubVec(i, j) takes out the sub-array from the i-th to the j-th element of the original array, newSet(·) converts an array into a set, and ⌈·⌉ denotes rounding up;
4.5.8. From S_k, filter out the triples whose tail entity belongs to T'_k:

Λ = {(h', r', t') ∈ S_k | t' ∈ T'_k}
4.5.9. Store the triple set Λ into user_map by calling user_map(k, Λ);
4.5.10. Overwrite temp_set with the set T'_k;
4.6. After 4.5 finishes, the knowledge graph of the user's preferences is finally obtained.
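The loop of 4.5 can be condensed into a minimal Python sketch. It assumes the knowledge graph is given as a list of (head, relation, tail) triples, that every tail entity has an item feature vector, and that two_tower_prob wraps the trained double-tower model of the second step; all names are hypothetical:

```python
import math

def recall_preference_entities(kg_triples, v_user, temp_set,
                               item_vec, two_tower_prob, K, tau):
    """Preference-entity propagation (sketch of steps 4.3-4.6).

    kg_triples: list of (head, relation, tail) knowledge-graph triples.
    v_user: user feature vector from the first step.
    item_vec: dict mapping an entity to its item feature vector v_item.
    two_tower_prob: callable (v_user, v_item) -> predicted probability.
    """
    user_map = {}              # 4.3: loop count k -> set of recalled triples
    used_set = set()           # 4.4: entities already recalled (KG has cycles)
    k, n_triples = 0, 0
    while n_triples <= K and temp_set:
        k += 1
        temp_set -= used_set                        # 4.5.1: drop seen entities
        # 4.5.2: S_k, the triples whose head entity lies in temp_set
        S_k = {t for t in kg_triples if t[0] in temp_set}
        used_set |= temp_set                        # 4.5.3: mark as recalled
        T_k = {t[2] for t in S_k}                   # 4.5.4: tail entities
        # 4.5.5-4.5.6: score tails with the double-tower model, sort descending
        ranked = sorted(T_k, reverse=True,
                        key=lambda e: two_tower_prob(v_user, item_vec[e]))
        keep = math.ceil(tau * len(T_k))            # 4.5.7: keep top tau share
        T_k_top = set(ranked[:keep])
        # 4.5.8: Lambda, the triples whose tail survived the screening
        Lam = {t for t in S_k if t[2] in T_k_top}
        user_map[k] = Lam                           # 4.5.9: store under key k
        n_triples += len(Lam)
        temp_set = T_k_top                          # 4.5.10: seed the next hop
    return user_map                                 # 4.6: the preference KG
```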
Priority Applications (1)

| Application Number | Priority Date | Title |
|---|---|---|
| CN202210169936.1A | 2022-02-23 | A knowledge graph user preference entity recall method based on double tower model |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| CN114564594A | 2022-05-31 |
| CN114564594B | 2025-02-21 |
Similar Documents

| Publication | Title |
|---|---|
| CN112214685B | Knowledge graph-based personalized recommendation method |
| CN108920641B | Information fusion personalized recommendation method |
| CN110910218B | Multi-behavior migration recommendation method based on deep learning |
| CN111310063B | Neural network-based article recommendation method for memory perception gated factorization machine |
| CN110532471B | Active learning collaborative filtering method based on gated cyclic unit neural network |
| CN114693397B | Attention neural network-based multi-view multi-mode commodity recommendation method |
| CN115062237A | Cultural resource recommendation method based on the combination of graph neural network and knowledge graph |
| CN112488791A | Individualized recommendation method based on knowledge graph convolution algorithm |
| CN113918834B | Graph convolution collaborative filtering recommendation method fusing social relations |
| CN113918833A | Product recommendation method realized through graph convolution collaborative filtering of social network relationship |
| CN113704438B | Conversation recommendation method of abnormal picture based on layered attention mechanism |
| CN111985623A | Attribute graph group discovery method based on maximized mutual information and graph neural network |
| CN117112920A | A collaborative filtering method based on interactive graph convolutional network |
| CN117634600A | A graph self-supervised learning recommendation method based on knowledge awareness |
| CN112818256A | Recommendation method based on neural collaborative filtering |
| CN104063555B | The user model modeling method intelligently distributed towards remote sensing information |
| CN116662564A | A service recommendation method based on deep matrix factorization and knowledge graph |
| CN116069921A | News recommendation method combining activation diffusion theory and Ebbinghaus forgetting theory |
| CN111814048A | Method and device for recommending information |
| CN114817581A | Cross-modal hash retrieval method based on fusion attention mechanism and DenseNet network |
| CN114564594B | A knowledge graph user preference entity recall method based on double tower model |
| CN114564594A | Knowledge graph user preference entity recall method based on double-tower model |
| CN115114528A | A knowledge graph recommendation method integrating GNN and ResNet |
| CN116821519A | Intelligent recommendation method for system filtering and noise reduction based on graph structure |
| CN116308618A | A method for training a recommendation model, a recommendation model, and a product recommendation method |
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |