CN116662564A

CN116662564A - A service recommendation method based on deep matrix factorization and knowledge graph

Info

Publication number: CN116662564A
Application number: CN202310584709.XA
Authority: CN
Inventors: 付春雷; 吴冕; 唐鹏辉; 李成高; 洪伟; 赵义伟; 鄢萌
Original assignee: Chongqing University
Current assignee: Chongqing University
Priority date: 2023-05-23
Filing date: 2023-05-23
Publication date: 2023-08-29

Abstract

The invention relates to a service recommendation method based on deep matrix factorization and a knowledge graph. The method uses government service item data to construct a government service knowledge graph, models the entity context and entity description text of the knowledge graph through a knowledge representation method, and combines knowledge representation learning with personalized recommendation through joint learning to obtain an optimal GKGR model. Finally, it predicts a score for each user and service item pair and recommends the higher-scoring service items to the user as a recommendation list. The method uses a neural network to extract features of users and service items, which makes full use of user behavior data and effectively alleviates data sparsity. By modeling the entity context and entity description text and jointly learning the knowledge representation task and the personalized recommendation task, it improves the accuracy and interpretability of the recommendation results and effectively alleviates the cold-start problem in government service recommendation.

Description

A service recommendation method based on deep matrix factorization and knowledge graph

Technical field

The invention relates to the field of government service recommendation, and in particular to a service recommendation method based on deep matrix factorization and a knowledge graph.

Background

"Internet + government services" combines traditional government service delivery with modern Internet technology. It enables convenient interaction and information exchange between government and citizens, improves the efficiency, transparency and fairness of government services, and promotes digital transformation and smart city construction. However, as city-level one-stop service platforms are built, government service resources have become large and scattered, with many types and complex hierarchies, while the citizens they serve often need personalized information services. Filtering the service items a user needs out of a massive number of city government services and recommending them to that user is a pain point for one-stop city service platforms. Among personalized recommendation techniques, traditional collaborative filtering is widely used and mature, but it struggles with the data sparsity found in personalized government service recommendation.

Summary of the invention

The purpose of the invention is to provide a service recommendation method based on deep matrix factorization and a knowledge graph, which aims to solve the problems that user data is sparse in government service recommendation and that large amounts of heterogeneous, multi-source, loosely organized data are not fully used.

To solve the above technical problems, the invention adopts the following technical solution: a service recommendation method based on deep matrix factorization and a knowledge graph, comprising the following steps:

S1: Obtain an initial user vector from the behavior data generated while users interact with service items, and feed the initial user vector into a fully connected layer; the output is the user vector u_i:

According to the behavior data generated while users interact with service items, quantify the user behavior data according to a measurement rule and construct a user-service item behavior matrix, in which each row is an initial user vector and the value R_ij is the number of times user i clicked service item j.
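The construction of the behavior matrix can be illustrated with a minimal sketch. It assumes the measurement rule is a plain click count, as the definition of R_ij suggests; the click log, the user and item identifiers, and the function name `build_behavior_matrix` are hypothetical.

```python
from collections import Counter


def build_behavior_matrix(click_log, users, items):
    """Return R where R[i][j] is how often users[i] clicked items[j]."""
    counts = Counter(click_log)  # (user, item) -> number of clicks
    return [[counts[(u, s)] for s in items] for u in users]


# Hypothetical click log: one (user, service item) pair per click event.
click_log = [("u1", "s1"), ("u1", "s1"), ("u1", "s3"), ("u2", "s2")]
users = ["u1", "u2"]
items = ["s1", "s2", "s3"]

R = build_behavior_matrix(click_log, users, items)
# Each row of R is the initial user vector of one user.
print(R)
```

Each row would then be passed through the fully connected layer of S1 to obtain u_i; that layer is not sketched here.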

Construct a government service knowledge graph G, in which service item entities and relations are represented with a graph structure: each service item entity is a node in the graph and each relation is an edge;

S2: Construct and train a GKGR model, the GKGR model comprising:

S2-1: Obtain a service item entity vector e_s from the entity context information defined on the government service knowledge graph;

S2-2: Obtain a second service item entity vector e_d from the entity description text defined on the government service knowledge graph;

S2-3: Obtain the final service item vector e from the service item entity vector e_s and the service item entity vector e_d;

S2-4: Given a user i, a service item entity j and the user-service item behavior matrix R_ij, construct a user-service item preference pair <i,j,j′>, which indicates that user i has interacted with service item entity j and has not interacted with service item j′, that is, user i has a demand for service item j. Find the triples and entity description texts related to j and j′ in G. Learn e_s through the entity-context knowledge representation method, and learn the service item entity vector e_d with a Bi-LSTM. Fuse the two entity vectors through a gating mechanism, and feed the user vector u_i and the service item vector into a personalized ranking model.

Training ends when the objective function reaches its maximum and no longer changes, at which point the optimal GKGR model is obtained;

S3: For a given user, obtain the user vector as in S1 and feed it into the optimal GKGR model. The optimal GKGR model computes the degree of association between the user and every service item, sorts the service items in descending order of that value, and outputs the resulting sequence of service items.
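Step S3 amounts to scoring every service item for one user and sorting. The sketch below assumes the degree of association is the inner product of the user vector and the service item vector, which is a common choice but is not stated in the text; the vectors and the function name `recommend` are illustrative.

```python
def recommend(user_vec, item_vecs, k):
    """Rank service items by inner-product score, highest first, and keep k."""
    scores = {
        item: sum(u * v for u, v in zip(user_vec, vec))
        for item, vec in item_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical trained vectors.
user_vec = [1.0, 0.5]
item_vecs = {"s1": [0.2, 0.1], "s2": [0.9, 0.4], "s3": [0.5, 0.5]}

top2 = recommend(user_vec, item_vecs, k=2)
print(top2)
```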

Preferably, the process of obtaining the service item entity vector e_s in S2-1 is as follows:

The entity context information C(h,r,t) comprises the neighbor context C_n(h) and the path context C_p(h,t).

The neighbor context C_n(h) is the set of other nodes directly connected to a given node.

The path context C_p(h,t) is the context information formed by all paths connected to a given node, that is, C(h,r,t)=C_n(h)∪C_p(h,t).

The neighbor context of a service item entity h is defined by the following formula, where G denotes the government service knowledge graph.

where h and t denote different service item entities and r denotes a relation;
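As an illustration of the neighbor context, the sketch below collects everything directly connected to an entity in a list of triples. It assumes that both outgoing and incoming edges count as neighbors and that each neighbor is kept together with its relation; the defining formula is not reproduced in this text, so this is one plausible reading. The entity and relation names are hypothetical.

```python
def neighbor_context(triples, h):
    """Collect (relation, entity) pairs directly connected to entity h."""
    context = set()
    for head, rel, tail in triples:
        if head == h:
            context.add((rel, tail))
        elif tail == h:
            context.add((rel, head))
    return context


# Hypothetical fragment of a government service knowledge graph.
G = [
    ("social_insurance_registration", "service_target", "enterprise"),
    ("social_insurance_registration", "exercise_level", "city"),
    ("company_establishment_registration", "service_target", "enterprise"),
]

ctx = neighbor_context(G, "social_insurance_registration")
print(sorted(ctx))
```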

The path context of service item entities h and t is defined by the following formula:

where p_i is a relation sequence leading from h to entity t, L is the maximum length over all relation paths, r_1, … denote the other relations traversed on the way from h to entity t, e_1, … denote the other entities traversed on the way from h to entity t, and l_i denotes the i-th relation.
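The path context can be pictured as the set of relation sequences that lead from h to t within a length bound L. The following sketch enumerates them with a depth-first walk; it assumes edges are followed in the head-to-tail direction only and that paths do not revisit an entity. The function name `path_context` and the toy graph are hypothetical.

```python
def path_context(triples, h, t, max_len):
    """Enumerate relation sequences of length <= max_len leading from h to t."""
    out_edges = {}
    for head, rel, tail in triples:
        out_edges.setdefault(head, []).append((rel, tail))

    paths = []

    def walk(node, relations, visited):
        if node == t and relations:
            paths.append(tuple(relations))
            return
        if len(relations) == max_len:
            return
        for rel, nxt in out_edges.get(node, []):
            if nxt not in visited:  # avoid cycles
                walk(nxt, relations + [rel], visited | {nxt})

    walk(h, [], {h})
    return paths


# Toy graph with two routes from "a" to "c".
toy = [("a", "r1", "b"), ("b", "r2", "c"), ("a", "r3", "c"), ("c", "r4", "d")]
paths_len2 = path_context(toy, "a", "c", max_len=2)
paths_len1 = path_context(toy, "a", "c", max_len=1)
print(paths_len2, paths_len1)
```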

The probability that a triple (h,r,t) holds is given by the following formula.

f(h,r,t)=P((h,r,t)|C(h,r,t);θ) (3)

where θ denotes the model parameters; the higher the value of the scoring function f(·), the more likely the triple holds.

A TransE model is pre-trained: the triple (h,r,t) is fed into TransE, and when f(h,r,t) reaches its maximum, the output of TransE is the service item entity vector e_s.
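For orientation, the standard TransE model treats a relation as a translation in vector space, so a triple is plausible when h + r lies close to t. The sketch below shows that standard score (negative L2 distance); it is not the probabilistic scoring function f of formula (3), and the vectors are made up for illustration.

```python
import math


def transe_score(h, r, t):
    """Standard TransE plausibility: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hypothetical 2-dimensional embeddings.
h = [0.1, 0.2]
r = [0.3, 0.1]
t_good = [0.4, 0.3]   # close to h + r
t_bad = [0.9, -0.5]   # far from h + r

print(transe_score(h, r, t_good), transe_score(h, r, t_bad))
```

A higher (less negative) score means the triple fits the translation assumption better.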

Preferably, the scoring function f(·) is optimized as follows:

Decomposing f(h,r,t) by conditional probability gives:

f(h,r,t)=P(h|C(h,r,t);θ)·P(t|C(h,r,t),h;θ)·P(r|C(h,r,t),h,t;θ)

where P(h|C(h,r,t);θ) is the conditional probability that h occurs. Because entity h is related to its neighbor context, P(h|C(h,r,t);θ) can be approximated directly by P(h|C_n(h);θ), defined by the following formula.

where the scoring term in the formula denotes the degree of association between an arbitrary entity and the neighbor context of entity h.

h′ denotes the head entity in an incorrect triple;

P(t|C(h,r,t),h;θ) is the probability of entity t; the path context is used to measure the degree of association between the head entity and the tail entity.

P(t|C(h,r,t),h;θ) is approximated by P(t|C_p(h,t),h;θ), defined by the following formula.

where ε denotes the set of tail entities and t′ denotes the tail entity in an incorrect triple (a negative example);
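The conditional probabilities above compare a correct entity with entities from incorrect triples. A minimal sketch of how such a probability could be computed, assuming a softmax over the association scores of all candidate entities (the exact formulas are not reproduced in this text, so the softmax form is an assumption), is:

```python
import math


def softmax_probability(scores, target):
    """P(target) = exp(score[target]) / sum of exp(score) over all candidates."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    return exps[target] / sum(exps.values())


# Hypothetical association scores between candidate tail entities and the
# path context of (h, t); "t" is the correct tail, the others are negatives.
scores = {"t": 2.0, "t_neg1": 0.5, "t_neg2": -1.0}
p_t = softmax_probability(scores, "t")
print(round(p_t, 4))
```

Raising the score of the correct entity relative to the negatives raises its probability, which is what maximizing f encourages.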

P(r|C(h,r,t),h,t;θ) is the conditional probability that relation r occurs. Since entities h and t are already determined and the entity context has already been introduced, the entity context C(h,r,t) in P(r|C(h,r,t),h,t;θ) is omitted, as shown in the following formula.

f(h,r,t)≈P(h|C_n(h);θ)·P(t|C_p(h,t),h;θ)·P(r|h,t;θ) (8)

The vectors of the entity context information are optimized by maximizing the scoring function f(h,r,t)=P((h,r,t)|C(h,r,t);θ).

Preferably, the process of obtaining the service item entity vector e_d from the entity description text of the government service knowledge graph in S2-2 is as follows:

Based on the constructed government service knowledge graph G, an entity description text is defined for each service item entity; it comprises the entity name, the names of the relations associated with the entity, and the names of the tail entities;

The weight of the i-th position of the entity description text with respect to a given relation r is defined as α_i(r), as given by the following formula.

The formula uses the relation vector obtained through representation learning, the output at the i-th position, the parameter matrices W_a and U_a, and a parameter vector. e_i(r) is the relevance of z_i to relation r, and n is the length of the entity description text.
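The attention weights can be sketched as follows. The formula itself is not reproduced in this text, so the sketch assumes the usual additive attention form, e_i(r) = v_a · tanh(W_a z_i + U_a r) followed by a softmax over the n positions; the parameter vector name `v_a`, the helper `matvec` and all numeric values are assumptions for illustration.

```python
import math


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def attention_weights(Z, r, W_a, U_a, v_a):
    """Return alpha_i(r) for every position output z_i in Z (assumed additive attention)."""
    Ur = matvec(U_a, r)
    e = []
    for z in Z:
        Wz = matvec(W_a, z)
        hidden = [math.tanh(a + b) for a, b in zip(Wz, Ur)]
        e.append(sum(v * x for v, x in zip(v_a, hidden)))  # relevance e_i(r)
    m = max(e)
    exps = [math.exp(x - m) for x in e]
    total = sum(exps)
    return [x / total for x in exps]


# Hypothetical Bi-LSTM outputs for a description text of length n = 3.
Z = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
r = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
alpha = attention_weights(Z, r, W_a=identity, U_a=identity, v_a=[1.0, 1.0])
print([round(a, 3) for a in alpha])
```

The weights sum to one, so positions of the description text that are more relevant to relation r contribute more to the text-based entity vector.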

The service item entity vector e_d is defined by the following formula.

x_1 and x_n denote the first and the n-th positions of the entity description text, respectively.

Preferably, the process of obtaining the final service item vector e from the service item entity vectors e_s and e_d in S2-3 is as follows:

e_s and e_d are fused through a gating mechanism to obtain e, as defined by the following formula.

e=β⊙e_s+(1-β)⊙e_d (12)

where β∈[0,1] is the gate that balances the weights of the two representations.
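Formula (12) is an element-wise blend of the two vectors. A minimal sketch, assuming β is a vector with one gate value per dimension (which the ⊙ operator suggests) and using made-up numbers:

```python
def gate_fuse(e_s, e_d, beta):
    """e = beta ⊙ e_s + (1 - beta) ⊙ e_d, computed element by element."""
    return [b * s + (1.0 - b) * d for b, s, d in zip(beta, e_s, e_d)]


e_s = [1.0, 0.0, 2.0]   # entity-context vector
e_d = [0.0, 1.0, 4.0]   # description-text vector
beta = [1.0, 0.0, 0.5]  # gate: 1 keeps e_s, 0 keeps e_d, 0.5 averages

e = gate_fuse(e_s, e_d, beta)
print(e)
```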

Preferably, the objective function L is:

v_j=β⊙e_sj+(1-β)⊙e_dj (15)

where u_i is the vector representation of user i, v_j and v_j′ are the vector representations of government service item entities j and j′ respectively, and z denotes the regularization term.

where f(h,r,t;d_h,d_t) is the scoring function of the knowledge representation learning part, g_h and g_t are the gate values of the head and tail entities respectively, h_s and t_s are the entity context vectors of h and t, and h_d and t_d are the entity description text vectors of h and t.

Compared with the prior art, the invention has at least the following advantages:

S1 analyzes the characteristics of user behavior, constructs the user-service item behavior matrix, and uses a neural network to extract features of users and service items. This makes full use of user behavior data and effectively alleviates data sparsity. It addresses the insufficient personalization of government service recommendation methods and the difficulty traditional collaborative filtering has with a sparse user-service item matrix.

S2 uses government resource data to construct the government service knowledge graph, models the entity context and entity description text through a knowledge representation method, and jointly learns the knowledge representation task and the personalized recommendation task. This improves the accuracy and interpretability of the recommendation results and effectively alleviates the cold-start problem in government service recommendation. It also addresses the problem that government information resources are heterogeneous, multi-source and loosely organized and therefore not fully used.

Description of drawings

Figure 1 is an example of knowledge-based government service recommendation.

Figure 2 shows the steps for constructing the government ontology.

Figure 3 shows the structure of the government ontology.

Figure 4 shows the government ontology constructed with Protégé.

Figure 5 shows some of the RDF triples.

Figure 6 is a structural diagram of an entity description text.

Figure 7 is an architecture diagram of the method of the invention.

Figure 8 is an example of triples.

Detailed description

The invention is described in further detail below.

By modeling user behavior, the invention proposes a service recommendation method based on deep matrix factorization and a knowledge graph (BDMF). A neural network extracts features of users and service items, and deep collaborative filtering predicts how much a user needs service items the user has not yet interacted with. BDMF itself does not consider how information about the service items affects user needs. In government service scenarios, each service item carries several kinds of information, such as acceptance conditions, handling body, exercise level and service target. User needs are closely tied to this information, and using it properly can effectively improve the recommendation accuracy of BDMF. As shown in Figure 1, when a user clicks the service item "Enterprise social insurance registration", whose acceptance condition is "registration with the market supervision department is required", the recommendation algorithm should take this information into account and recommend enterprise-related service items whose implementing body is the Market Supervision Administration, such as "Establishment registration of domestic-funded enterprises and branches (company establishment registration)" and "Establishment registration of foreign-funded enterprises and branches (foreign-invested enterprise establishment registration)". In government service recommendation, the more similar the information of two service items, the higher its reference value for user needs. Making proper use of this similarity can improve the accuracy of the algorithm and effectively alleviate the cold-start problem in government service recommendation.

In government service recommendation, a user's latent needs are related to service item information. However, government service resources are large and scattered, with many types and complex hierarchies, so government service recommendation faces heterogeneous, multi-source, loosely organized and underused government information resources. A knowledge graph is a heterogeneous network rich in semantic information. Knowledge representation methods can represent its entities and relations effectively in a low-dimensional continuous vector space, which makes it possible to use government resource data properly, gives the knowledge graph the capacity for fusion, reasoning and application, and improves the accuracy of government service recommendation.

Government service knowledge graph construction

A knowledge graph can be logically divided into a schema layer and a data layer. The schema layer is the core of the knowledge graph: it stores refined knowledge and defines and standardizes the data hierarchy and categories of the domain. An ontology library is usually used to manage the schema layer, and its rules, axioms and constraints regulate the associations among entities, relations, and entity types and attributes in the graph.

The data layer stores the concrete triples of the knowledge graph. It sits structurally below the schema layer and is the actual form the knowledge graph takes. In the data layer, triples are stored in a graph database in two forms, <entity, relation, entity> and <entity, attribute, value>, as shown in Figure 8.

Knowledge graph construction starts from acquiring raw knowledge data; knowledge processing techniques (automatic or semi-automatic) extract the required knowledge elements from the raw data, which are then stored according to the definitions of the schema layer and the data layer. Through the structural definitions of the schema layer and the processing of the data layer, massive heterogeneous knowledge forms a large entity-relation network, which constitutes the knowledge graph. The knowledge graph is built on the prior knowledge of domain experts.

A service recommendation method based on deep matrix factorization and a knowledge graph comprises the following steps:

User vectors are obtained based on user behavior and deep matrix factorization. According to the behavior data generated while users interact with service items, the user behavior data is quantified according to a measurement rule and a user-service item behavior matrix is constructed, in which each row is an initial user vector and the value R_ij is the number of times user i clicked service item j.

S2-4: Joint training is used to learn vector representations of users, service item entities and relations by combining user-service item behavior data with government service knowledge graph data. Given a user i, a service item entity j and the user-service item behavior matrix R_ij, a user-service item preference pair <i,j,j′> is constructed, which indicates that user i has interacted with service item entity j and has not interacted with service item j′, that is, user i has a demand for service item j. The triples and entity description texts related to j and j′ are found in G. e_s is learned through the entity-context knowledge representation method, and the service item entity vector e_d is learned with a Bi-LSTM. The two entity vectors are fused through a gating mechanism, and the user vector and the service item vector are fed into a personalized ranking model. The personalized ranking model uses Bayesian personalized ranking, which turns the recommendation problem into a ranking problem: for each user, all candidate items in the system are ranked so that the items the user prefers appear as near the top as possible, and the Top-K items of the ranking are recommended to the user.
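The ranking part of S2-4 can be illustrated with the standard Bayesian personalized ranking criterion, which rewards scoring the interacted item j above the non-interacted item j′ for each preference pair <i,j,j′>. The sketch below shows only that per-pair term, ln σ(u_i·v_j − u_i·v_j′); the full objective of the method, including the knowledge representation term and the regularization term, is not reproduced here, and the vectors are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_sigmoid(x):
    """Numerically stable ln(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_pair_term(u_i, v_j, v_j_neg):
    """ln sigma(u_i . v_j - u_i . v_j'): larger when j is ranked above j'."""
    return log_sigmoid(dot(u_i, v_j) - dot(u_i, v_j_neg))


u_i = [1.0, 0.5]
v_j = [0.9, 0.4]      # item the user interacted with
v_j_neg = [0.1, 0.2]  # item the user did not interact with

right_order = bpr_pair_term(u_i, v_j, v_j_neg)
wrong_order = bpr_pair_term(u_i, v_j_neg, v_j)
print(round(right_order, 4), round(wrong_order, 4))
```

Summing this term over all preference pairs and maximizing it pushes interacted items toward the top of each user's ranking.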

Specifically, the process of obtaining the service item entity vector e_s in S2-1 is as follows:

Entity context information comprises neighbor context and path context. In the government service knowledge graph, the neighbor context nodes of a given service item entity include the service item's type, implementing body, exercise level and so on. The entity-context knowledge representation method encodes the semantic information associated with an entity as a high-dimensional vector, which improves the representational capacity of the model. It also captures semantic relations between entities, such as similarity and hierarchy, more faithfully, which improves the interpretability of the recommender system.

The neighbor context of a service item entity h is defined by the following formula, where G denotes the government service knowledge graph.

where p_i is a relation sequence leading from h to entity t, L is the maximum length over all relation paths, r_1, … denote the other relations traversed on the way from h to entity t, e_1, … denote the other entities traversed on the way from h to entity t, and l_i denotes the i-th relation. The probability that a triple (h,r,t) holds is given by the following formula.

f(h,r,t)=P((h,r,t)|C(h,r,t);θ) (3)

Specifically, the scoring function f(·) is optimized as follows:

h′ denotes the head entity in an incorrect triple (a negative example);

where ε denotes the set of tail entities and t′ denotes the tail entity in an incorrect triple (a negative example).

The vectors of the entity context information are optimized by maximizing the scoring function f(h,r,t)=P((h,r,t)|C(h,r,t);θ).

Specifically, the process of obtaining the service item entity vector e_d from the entity description text of the government service knowledge graph in S2-2 is as follows:

The formula uses the relation vector obtained through representation learning, the output at the i-th position, the parameter matrices W_a and U_a, and a parameter vector. e_i(r) is the relevance of z_i to relation r, and n is the length of the entity description text.

Specifically, the process of obtaining the final service item vector e from the service item entity vectors e_s and e_d in S2-3 is as follows:

e=β⊙e_s+(1-β)⊙e_d (12)

Specifically, the objective function L is:

v_j=β⊙e_sj+(1-β)⊙e_dj (15)

where f(h,r,t;d_h,d_t) is the scoring function of the knowledge representation learning part, g_h and g_t are the gate values of the head and tail entities respectively, h_s and t_s are the entity context vectors of h and t, and h_d and t_d are the entity description text vectors of h and t. h_s, r and t_s are obtained by pre-training with the TransE-based representation learning method, and h_d and t_d are obtained by representation learning on the entity description texts.

Table 1. GKGR training process

Experimental design and analysis

The invention takes the service recommendation method based on user behavior and deep matrix factorization as the experimental baseline, analyzes how adding the knowledge representation module affects the recommendation results, and validates the recommendation method that incorporates the government service knowledge graph.

1. Dataset

Negative triples are constructed as follows: a service item the user has never interacted with is picked at random from the user-service item behavior data as a negative example; the triples that contain this service item entity are looked up in the government service knowledge graph and negatively sampled. Negative sampling replaces the head or tail entity of a triple, which yields triples and entity description texts that do not exist in the knowledge graph. Increasing the diversity of the negative samples lets the algorithm learn the relations among positive triples better and improves recommendation accuracy. Finally, the sampled negatives are used together with the positives to train and test the algorithm, with a positive-to-negative ratio of 1:3.
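The head-or-tail replacement described above can be sketched as follows. The sketch assumes the replacement entity is drawn uniformly at random, that head and tail are corrupted with equal probability, and that three negatives are drawn per positive to match the 1:3 ratio; the function names and the toy triples are hypothetical.

```python
import random


def corrupt(triple, entities, known, rng):
    """Replace the head or the tail with a random entity, rejecting known triples."""
    h, r, t = triple
    while True:
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate not in known:
            return candidate


def sample_negatives(positives, entities, per_positive=3, seed=0):
    """Draw `per_positive` corrupted triples for every positive triple."""
    rng = random.Random(seed)
    known = set(positives)
    return [
        corrupt(p, entities, known, rng)
        for p in positives
        for _ in range(per_positive)
    ]


positives = [
    ("s1", "service_target", "enterprise"),
    ("s2", "service_target", "enterprise"),
    ("s1", "exercise_level", "city"),
]
entities = ["s1", "s2", "s3", "enterprise", "individual", "city", "district"]

negatives = sample_negatives(positives, entities)
print(len(positives), len(negatives))
```

The rejection step guarantees that every negative is absent from the knowledge graph, so the model never receives a true triple labeled as false.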

2. Evaluation metrics

To keep the experiments consistent, the evaluation metrics are the same as those used in the experiments on the service recommendation method based on user behavior and deep matrix factorization, namely HR, Precision, Recall and F1.
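For reference, the four metrics can be computed for a single user's Top-K list as sketched below. The definitions are the common ones and are an assumption here, since the text does not spell them out; in particular, HR is taken as 1 when at least one relevant item appears in the list, and the per-user values would be averaged over all test users.

```python
def topk_metrics(recommended, relevant):
    """Per-user HR, Precision, Recall and F1 for one Top-K recommendation list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    hit_ratio = 1.0 if hits else 0.0
    return {"HR": hit_ratio, "Precision": precision, "Recall": recall, "F1": f1}


# Hypothetical Top-4 list and held-out items for one user.
metrics = topk_metrics(recommended=["s2", "s3", "s5", "s7"], relevant=["s3", "s9"])
print(metrics)
```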

In the experiments, the vector dimension of each model is set to 64. Network parameters are initialized from a Gaussian distribution with mean 0 and variance 0.001. The batch size is 64, the positive-to-negative sample ratio is 1:3, the learning rate is 0.01, all regularization coefficients are 0.001, and the Adam optimizer is used for parameter optimization.
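The settings above can be gathered into a small configuration, sketched here with hypothetical key names. One detail worth making explicit is that most random number routines take a standard deviation rather than a variance, so a variance of 0.001 corresponds to a standard deviation of about 0.0316.

```python
import math
import random

# Hyperparameters as stated in the text; the key names are illustrative.
CONFIG = {
    "embedding_dim": 64,
    "init_mean": 0.0,
    "init_variance": 0.001,
    "batch_size": 64,
    "negatives_per_positive": 3,
    "learning_rate": 0.01,
    "regularization": 0.001,
    "optimizer": "Adam",
}


def init_vector(dim, mean, variance, rng):
    """Draw one parameter vector from N(mean, variance)."""
    std = math.sqrt(variance)  # random.gauss expects a standard deviation
    return [rng.gauss(mean, std) for _ in range(dim)]


rng = random.Random(0)
vec = init_vector(
    CONFIG["embedding_dim"], CONFIG["init_mean"], CONFIG["init_variance"], rng
)
print(len(vec), round(math.sqrt(CONFIG["init_variance"]), 4))
```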

BDMF：本发明提出的方法。BDMF: The method proposed by the present invention.

BDMF+TransE：使用TransE模型对实体进行表示。BDMF+TransE: Use the TransE model to represent entities.

CKE：基于协同知识库嵌入的推荐系统方法。CKE: An Approach to Recommender Systems Based on Collaborative Knowledge Base Embedding.

①比较本发明方法与对比方法在推荐效果上的差异① Compare the difference in recommendation effect between the method of the present invention and the comparison method

各模型在HR、Precision、Recall和F1指标下的实验结果分别如表2所示。The experimental results of each model under the indicators of HR, Precision, Recall and F1 are shown in Table 2.

表2对比实验结果Table 2 Comparative Experimental Results

根据实验结果分析，GKGR、CKE、BDMF+TransE模型在融合政务服务知识图谱后，在四个指标上均高于BDMF方法，其中本发明提出的GKGR模型在Precision、Recall、F1、HR上分别提高了5.08％，17.53％，10.58％，22.87％。在融合政务服务知识图谱的模型中，BDMF+TransE方法由于没有对实体描述文本建模，在相关指标上均低于GKGR模型和CKE模型。CKE模型通过TransE表征政务服务知识图谱的基础上引入实体描述文本，推荐效果有一定提升。本发明提出的GKGR模型在CKE的基础上通过实体上下文的知识表示，实现对实体路径上下文和邻居上下文的表征，对比CKE，Precision、Recall、F1和HR分别提高了5.76％，3.77％，5.6％，9.95％，提高了政务服务推荐的效果，表明融合政务服务知识图谱实体上下文可以进一步提高性能。According to the analysis of the experimental results, the GKGR, CKE, and BDMF+TransE models are higher than the BDMF method in four indicators after integrating the government service knowledge map, and the GKGR model proposed by the present invention improves Precision, Recall, F1, and HR respectively. 5.08%, 17.53%, 10.58%, 22.87%. In the model that integrates the knowledge map of government services, the BDMF+TransE method is lower than the GKGR model and the CKE model in terms of related indicators because it does not model the entity description text. The CKE model introduces entity description text on the basis of TransE to represent the knowledge map of government services, and the recommendation effect has been improved to a certain extent. The GKGR model proposed by the present invention realizes the representation of entity path context and neighbor context through the knowledge representation of entity context on the basis of CKE. Compared with CKE, Precision, Recall, F1 and HR are respectively increased by 5.76%, 3.77%, and 5.6%. , 9.95%, improving the effect of government service recommendation, indicating that the fusion of government service knowledge graph entity context can further improve the performance.

In the government service recommendation scenario, if the path contexts and neighbor contexts of two government service item entities are similar in the knowledge graph, their entity vector representations are correspondingly closer. When handling business on a government service platform, users usually locate the prerequisite services of a service item from information such as its acceptance conditions and the administrative level at which it is exercised. Representing the entity description text of service items enhances the semantic representation capability of the knowledge graph. The recommendation model proposed by the present invention, which incorporates the government service knowledge graph, makes full use of the relations between government service item entities, introduces both the entity context and the entity description text of the knowledge graph, and balances the weights of the two through a gating mechanism, thereby achieving better recommendation performance.

② Comparison of the effect of different knowledge representation methods on recommendation performance

To verify the effect of the knowledge representation method on the recommendation results, four common TransX-series models were selected: TransE, TransH, TransR, and TransD. Precision, Recall, F1, and HR were used as evaluation metrics. Table 3 shows the recommendation performance obtained with each knowledge representation method. The results show that, in the government service recommendation task, TransH, TransR, and TransD all perform better than TransE. TransD performs best on precision and recall, with an F1 value of 0.2897 and an HR value of 0.7661. However, the gap between TransE and TransD is small, and TransE has a simple structure and is easy to train, so the experiments here are mainly based on the TransE model.
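For reference, TransE treats a relation as a translation in the embedding space, so a plausible triple satisfies h + r ≈ t. The following minimal NumPy sketch shows the scoring function with made-up 3-dimensional vectors; the variable names and values are illustrative and do not come from the patent.

```python
import numpy as np


def transe_score(h, r, t, norm=1):
    """TransE plausibility score: the negative distance between h + r and t.

    A score closer to zero means the triple (h, r, t) is more plausible.
    """
    return -np.linalg.norm(h + r - t, ord=norm)


# Hypothetical embeddings for one head entity, one relation, two candidate tails.
h = np.array([0.2, 0.1, 0.0])
r = np.array([0.3, -0.1, 0.5])
t_true = np.array([0.5, 0.0, 0.5])    # equals h + r, so the distance is zero
t_false = np.array([-0.4, 0.9, 0.1])  # far from h + r

good = transe_score(h, r, t_true)
bad = transe_score(h, r, t_false)
```

TransH, TransR, and TransD keep the same translation idea but first project the entities into a relation-specific hyperplane or space, which adds parameters and training cost.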

Table 3. Effect of different knowledge representation methods on each metric

③ Comparison of recommendation performance with different models for the entity description text

To verify the effect of the entity description text on the recommendation results, four common models were selected to represent the entity description text: Word2Vec, CNN, RNN, and Bi-LSTM. Precision, Recall, F1, and HR were used as evaluation metrics. Table 4 shows the recommendation performance obtained with each representation method.

Table 4. Effect of different text representation methods on each metric

As shown in Table 4, for modeling the entity description text, RNN and Bi-LSTM, which are suited to processing text sequences, achieve better results and extract features of the entity description text more effectively. Bi-LSTM adds a gating mechanism on top of the RNN and models the entity description text more strongly; on the F1 and HR metrics it is 6.5% and 9.25% higher than RNN, respectively, which improves the accuracy of government service recommendation.

Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution may be modified or equivalently replaced without departing from its spirit and scope, and all such modifications and replacements shall fall within the scope of the claims of the present invention.

Claims

1. A service recommendation method based on deep matrix factorization and a knowledge graph, characterized by comprising the following steps:

S1: obtaining an initial user vector from the behavior data generated when users interact with service items, and feeding the initial user vector into a fully connected layer, whose output is the user vector u_i:

quantifying the behavior data generated when users interact with service items according to measurement rules, and constructing a user-service item behavior matrix R, wherein each row of the matrix represents an initial user vector and the value R_ij in the matrix represents the number of clicks by user i on service item j;

constructing a government service knowledge graph G, wherein the service item entities and relations are represented by a graph structure, each service item entity is regarded as a node in the graph, and each relation is regarded as an edge;

S2: constructing and training a GKGR model, said GKGR model comprising:

S2-1: obtaining a service item entity vector e_s from the entity context information defined on the government service knowledge graph;

S2-2: obtaining a second service item entity vector e_d from the entity description text defined on the government service knowledge graph;

S2-3: obtaining a final service item vector e from the service item entity vector e_s and the service item entity vector e_d;

S2-4: given a user i, a service item entity j, and the user-service item behavior matrix R_ij, constructing a user-service item preference triple <i, j, j′>, indicating that user i has interacted with service item entity j and has not interacted with service item j′, i.e., that user i has a demand for service item j; finding the triples and entity description texts related to j and j′ in G; learning e_s by the entity-context knowledge representation method and learning the service item entity vector e_d with Bi-LSTM; fusing the two entity vectors through a gating mechanism; and inputting the user vector u_i and the service item vectors into the personalized ranking model;

training ends when the objective function reaches its maximum and no longer changes, at which point the optimal GKGR model is obtained;

S3: for a given user, obtaining the user vector as in S1 and inputting it into the optimal GKGR model; the optimal GKGR model computes the degree of association between the user and every service item, sorts the service items in descending order of that value, and outputs the resulting service item sequence.
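Steps S1 and S3 of claim 1 can be illustrated with a small NumPy sketch. It assumes that the measurement rule is simply a click count and that the degree of association is a dot product between a user vector and each service item vector; the vectors below are hand-picked stand-ins for the trained GKGR outputs u_i and v_j, and the click log, dimensions, and function names are all hypothetical.

```python
import numpy as np


def build_behavior_matrix(click_log, n_users, n_items):
    """Build R, where R[i, j] is the number of clicks by user i on item j."""
    R = np.zeros((n_users, n_items), dtype=np.float32)
    for user, item in click_log:
        R[user, item] += 1
    return R


def recommend(user_vec, item_vecs, top_n):
    """Score every item for one user and return item indices, best first."""
    scores = item_vecs @ user_vec               # one score per service item
    order = np.argsort(-scores, kind="stable")  # descending order of score
    return order[:top_n].tolist()


# Hypothetical click log of (user index, service item index) pairs.
click_log = [(0, 1), (0, 1), (0, 2), (1, 0)]
R = build_behavior_matrix(click_log, n_users=2, n_items=3)
# Row i of R is the initial vector of user i (input to the fully connected layer).

# Stand-ins for learned vectors (assumption: 2-dimensional embeddings).
user_vec = np.array([1.0, 0.5])
item_vecs = np.array([[0.1, 0.2], [0.9, 0.4], [0.3, 0.8]])
top = recommend(user_vec, item_vecs, top_n=2)
```

In a real system the items a user has already handled would usually be filtered out before the list is returned; the claim does not specify this, so the sketch leaves it out.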

2. The service recommendation method based on deep matrix factorization and a knowledge graph as claimed in claim 1, wherein the process by which S2-1 obtains the service item entity vector e_s is as follows:

the entity context information C(h, r, t) includes a neighbor context C_n(h) and a path context C_p(h, t);

the neighbor context C_n(h) refers to the set of other nodes directly connected to a given node;

the path context C_p(h, t) refers to the context information composed of all paths connected to a given node, i.e., C(h, r, t) = C_n(h) ∪ C_p(h, t);

the neighbor context of the service item entity h is defined by the following formula, where G denotes the government service knowledge graph;

wherein h and t denote different service item entities and r denotes a relation;

the path context of service item entities h and t is defined as follows:

wherein p_i is a relation sequence from h to entity t, L is the maximum length among all relation paths, r_1, … denote the other relations along the path from h to entity t, e_1, … denote the other entities along the path from h to entity t, and l_i denotes the i-th relationship;

the probability that the triple (h, r, t) holds is given by the following formula;

f(h, r, t) = P((h, r, t) | C(h, r, t); θ) (3)

wherein θ denotes the parameters of the model; the higher the value of the scoring function f(·), the greater the probability that the triple holds;

the pre-trained model is TransE; the triple (h, r, t) is input into TransE, and when the value of f(h, r, t) is maximal, the output of TransE is the service item entity vector e_s.
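The neighbor and path contexts of claim 2 can be sketched on a toy graph stored as (head, relation, tail) triples. This is one illustrative reading of the definitions above, assuming that neighbors are collected in both edge directions and that paths follow the edge direction up to a maximum length; the entity and relation names are invented.

```python
from collections import defaultdict

# Hypothetical government service knowledge graph as (head, relation, tail).
triples = [
    ("apply_license", "requires", "id_check"),
    ("id_check", "requires", "residence_cert"),
    ("apply_license", "handled_by", "bureau_a"),
    ("bureau_a", "issues", "residence_cert"),
]


def neighbor_context(h, triples):
    """C_n(h): entities directly connected to h (either edge direction)."""
    ctx = set()
    for head, _, tail in triples:
        if head == h:
            ctx.add(tail)
        elif tail == h:
            ctx.add(head)
    return ctx


def path_context(h, t, triples, max_len):
    """C_p(h, t): relation sequences of length <= max_len leading from h to t."""
    outgoing = defaultdict(list)
    for head, rel, tail in triples:
        outgoing[head].append((rel, tail))

    paths = []

    def walk(node, rels, visited):
        if node == t and rels:
            paths.append(tuple(rels))
            return
        if len(rels) == max_len:
            return
        for rel, nxt in outgoing[node]:
            if nxt not in visited:  # avoid cycles
                walk(nxt, rels + [rel], visited | {nxt})

    walk(h, [], {h})
    return paths


neighbors = neighbor_context("id_check", triples)
paths = path_context("apply_license", "residence_cert", triples, max_len=2)
```

Here `max_len` plays the role of L in the claim; limiting it keeps the number of enumerated paths manageable on larger graphs.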

3. The service recommendation method based on deep matrix factorization and a knowledge graph as claimed in claim 2, wherein the scoring function f(·) is optimized as follows:

decomposing f(h, r, t) by conditional probability gives:

f(h, r, t) = P(h | C(h, r, t); θ) · P(t | C(h, r, t), h; θ) · P(r | C(h, r, t), h, t; θ) (4)

wherein P(h | C(h, r, t); θ) denotes the conditional probability that h occurs, defined by the following formula;

wherein the term in the formula measures the degree of association between an arbitrary entity and the neighbor context of entity h;

h′ denotes the head entity in a corrupted triple (negative example);

P(t | C(h, r, t), h; θ) denotes the probability of entity t; the degree of association between the head and tail entities is measured by the path context, so P(t | C(h, r, t), h; θ) is approximated as P(t | C_p(h, t), h; θ), defined by the following formula;

where ε denotes the set of tail entities and t′ denotes the tail entity in a corrupted triple (negative example);

P(r | C(h, r, t), h, t; θ) denotes the conditional probability that relation r occurs; since entities h and t are already determined and the entity context has already been introduced, the entity context C(h, r, t) is omitted from P(r | C(h, r, t), h, t; θ), as shown in the following formula;

f(h, r, t) ≈ P(h | C_n(h); θ) · P(t | C_p(h, t), h; θ) · P(r | h, t; θ) (8)

the vector of entity context information is optimized by maximizing the scoring function f(h, r, t) = P((h, r, t) | C(h, r, t); θ).
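Because formula (8) is a product of three probabilities and the logarithm is monotonic, maximizing f is equivalent to maximizing the sum of the three log-probabilities, a form that is often more convenient numerically. Written in the notation of claim 3 (this restatement is a standard identity, not an additional formula from the patent):

```latex
\begin{aligned}
\log f(h, r, t) \approx{} & \log P\big(h \mid C_n(h); \theta\big) \\
 & + \log P\big(t \mid C_p(h, t), h; \theta\big) \\
 & + \log P\big(r \mid h, t; \theta\big)
\end{aligned}
```

Each term can then be estimated separately against its own negative examples (h′ for the first term, t′ for the second).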

4. The service recommendation method based on deep matrix factorization and a knowledge graph as claimed in claim 3, wherein the process by which S2-2 obtains the service item entity vector e_d from the entity description text of the government service knowledge graph is as follows:

based on the constructed government service knowledge graph G, an entity description text is defined for each service item entity, comprising the entity name, the names of the entity's relations, and the tail entity names;

the weight of the i-th position of the entity description text with respect to a given relation r is defined as α_i(r), as shown in the following formula;

wherein the relation vector is obtained by representation learning, z_i is the output at the i-th position, W_a and U_a are parameter matrices, and the remaining parameter is a parameter vector; e_i(r) is the relevance of z_i to r, and n denotes the length of the entity description text;

the service item entity vector e_d is defined by the following formula;

x_1 and x_n denote the 1st and n-th positions of the entity description text, respectively.
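The symbols named in claim 4 (W_a, U_a, a parameter vector, and the position outputs z_i) match the usual additive attention form, so one plausible reading is e_i(r) = v_a^T tanh(W_a z_i + U_a r), with α_i(r) obtained by a softmax over the n positions and e_d as the α-weighted sum of the z_i. The formulas themselves are not reproduced in this text, so the NumPy sketch below is an assumption; the name `v_a`, the dimensions, and the random stand-ins for the Bi-LSTM outputs are hypothetical.

```python
import numpy as np


def description_vector(Z, r, W_a, U_a, v_a):
    """Relation-aware attention over position outputs (assumed additive form).

    Z   : (n, d) outputs of the text encoder, one row per position
    r   : (k,)   relation vector
    W_a : (a, d) and U_a : (a, k) parameter matrices
    v_a : (a,)   parameter vector
    Returns the attention weights alpha (n,) and the description vector e_d (d,).
    """
    e = np.tanh(Z @ W_a.T + U_a @ r) @ v_a  # relevance of each position to r
    e = e - e.max()                          # shift for numerical stability
    alpha = np.exp(e) / np.exp(e).sum()      # softmax over the n positions
    e_d = alpha @ Z                          # weighted sum of position outputs
    return alpha, e_d


# Random stand-ins for a description of length n = 4.
rng = np.random.default_rng(0)
n, d, k, a = 4, 6, 5, 3
Z = rng.normal(size=(n, d))
r = rng.normal(size=k)
W_a = rng.normal(size=(a, d))
U_a = rng.normal(size=(a, k))
v_a = rng.normal(size=a)

alpha, e_d = description_vector(Z, r, W_a, U_a, v_a)
```

Conditioning the weights on r lets the same description text yield different vectors for different relations, which is presumably why the claim defines α_i as a function of r.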

5. The service recommendation method based on deep matrix factorization and a knowledge graph as claimed in claim 4, wherein the process by which S2-3 obtains the final service item vector e from the service item entity vectors e_s and e_d is as follows:

e_s and e_d are fused through a gating mechanism to obtain e, as defined by the following formula;

e = β ⊙ e_s + (1 − β) ⊙ e_d (12)

wherein β ∈ [0, 1] denotes the gate that balances the weights of the two representations.
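Formula (12) is an element-wise convex combination of the two entity vectors. The following minimal NumPy sketch assumes that the gate β is produced by passing a learnable vector through a sigmoid, so that every component lies between 0 and 1; the claim does not say how β is parameterized, and `gate_logits` is a hypothetical name.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fuse(e_s, e_d, gate_logits):
    """Compute e = beta * e_s + (1 - beta) * e_d element-wise, as in formula (12)."""
    beta = sigmoid(gate_logits)  # assumption: gate values come from a sigmoid
    return beta * e_s + (1.0 - beta) * e_d


e_s = np.array([1.0, 0.0, 2.0])          # entity-context vector
e_d = np.array([0.0, 4.0, 2.0])          # description-text vector
gate_logits = np.array([0.0, 0.0, 3.0])  # a logit of 0 gives beta = 0.5

e = fuse(e_s, e_d, gate_logits)
```

Where the two input vectors agree (the third component here), the gate has no effect; elsewhere it decides how much of each source is kept.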

6. The service recommendation method based on deep matrix factorization and a knowledge graph as claimed in claim 5, wherein the objective function L is:

v_j = β ⊙ e_sj + (1 − β) ⊙ e_dj (15)

wherein u_i is the vector representation of user i, v_j and v_j′ are the vector representations of government service item entities j and j′, respectively, and z denotes a regularization term;

wherein f(h, r, t, d_h, d_t) denotes the scoring function of the knowledge representation learning part, g_h and g_t are the gate values of the head and tail entities, respectively, h_s and t_s are the entity-context vectors of h and t, respectively, and h_d and t_d are the entity-description-text vectors of h and t, respectively.
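The preference triples <i, j, j′> of claim 1 and the vectors u_i, v_j, and v_j′ of claim 6 suggest a pairwise ranking term in the style of Bayesian Personalized Ranking. The formula for L is not reproduced in this text, so the sketch below is only an assumption about the recommendation part: it scores a pair by a dot product, rewards ln σ(u_i·v_j − u_i·v_j′), and subtracts an L2 penalty standing in for the regularization term. The knowledge representation term f(h, r, t, d_h, d_t) is left out, and `lam` and the sample vectors are hypothetical.

```python
import numpy as np


def log_sigmoid(x):
    """Numerically stable ln(sigmoid(x))."""
    return -np.logaddexp(0.0, -x)


def pairwise_objective(u_i, v_j, v_jneg, lam=0.01):
    """Assumed BPR-style objective for one triple <i, j, j'> (to be maximized)."""
    margin = u_i @ v_j - u_i @ v_jneg  # preferred item should score higher
    reg = lam * (u_i @ u_i + v_j @ v_j + v_jneg @ v_jneg)
    return log_sigmoid(margin) - reg


u = np.array([1.0, 0.5])      # user vector u_i
v_pos = np.array([0.9, 0.4])  # item the user interacted with (j)
v_neg = np.array([0.1, 0.2])  # item the user did not interact with (j')

right_order = pairwise_objective(u, v_pos, v_neg)
wrong_order = pairwise_objective(u, v_neg, v_pos)
```

The objective is larger when the interacted item outranks the non-interacted one, which matches the claim's statement that training ends once the objective reaches its maximum.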