
CN117150041A - Small sample knowledge graph completion method based on reinforcement learning - Google Patents

Small sample knowledge graph completion method based on reinforcement learning

Info

Publication number
CN117150041A
CN117150041A
Authority
CN
China
Prior art keywords
entity
knowledge graph
training
representing
triplet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311112645.XA
Other languages
Chinese (zh)
Inventor
解修蕊
孙国林
段贵多
罗光春
刘文泊
田茂君
曾诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202311112645.XA priority Critical patent/CN117150041A/en
Publication of CN117150041A publication Critical patent/CN117150041A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/092Reinforcement learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to knowledge graph reasoning technology and discloses a small sample knowledge graph completion method based on reinforcement learning, which solves the interpretability problem and the sparsity problem of knowledge graph data in the prior art. The method comprises the following steps: constructing an action space according to the input head entity, which serves as the agent of the current step, with the agent and the relation as inputs of a decision network, where the input head entity is the tail entity obtained by reasoning in the previous step, and the action space comprises the existing action space of triplets in the knowledge graph with the same head entity and relation as well as an additional action space obtained by meta-learning network reasoning based on the head entity and the target relation; then, determining the action of the current step from the action space based on the policy function of the decision network, thereby obtaining the reasoning result of the current step; finally, judging whether the reasoning result meets the requirement, and completing the completion if it does; if not, continuing the iteration and reasoning again based on the current reasoning result.

Description

Small sample knowledge graph completion method based on reinforcement learning
Technical Field
The invention relates to a knowledge graph reasoning technology, in particular to a small sample knowledge graph completion method based on reinforcement learning.
Background
The data in a knowledge graph usually exists in the form of triplets, specifically expressed as (head entity, relation, tail entity), and a large number of triplets form a huge network graph. Any thing or concept in nature can be represented as an entity node in the knowledge graph, and any relation can be represented as an edge in the knowledge graph. For example, if the knowledge graph contains a triplet (Scotlag, born in, Jacobian), its semantics mean that "Scotlag was born in Jacobian", where "Scotlag" and "Jacobian" are two entity nodes and "born in" is a relation, which together form an edge in the graph structure.
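As an illustrative sketch (the entity and relation names below are placeholders, not taken from this disclosure), a triplet store and a (head, relation) index of the kind described above can be represented as follows:

```python
from collections import defaultdict

# A knowledge graph as a set of (head entity, relation, tail entity) triplets.
triples = {
    ("EntityA", "born_in", "PlaceB"),
    ("EntityA", "works_for", "OrgC"),
    ("PlaceB", "located_in", "CountryD"),
}

# Index the triplets by (head, relation) so that, given a head entity and a
# relation, all known tail entities can be looked up directly.
by_head_rel = defaultdict(set)
for h, r, t in triples:
    by_head_rel[(h, r)].add(t)

print(by_head_rel[("EntityA", "born_in")])  # {'PlaceB'}
```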
Currently, knowledge graphs have become one of the main data sources of many scientific research and application fields; for example, tasks such as information retrieval, data mining and data analysis in many existing natural language processing (NLP) areas all need the data support of knowledge graphs. Related applications of knowledge graphs are also popular in various industries, such as social networks, intelligent dialogue, personalized intelligent recommendation, intelligent question answering and intelligent search, and are further applied in vertical fields such as finance, social security, medical services and public opinion analysis.
However, in practical applications, knowledge graph data obtained automatically through a model or collected manually usually contains redundant and missing information, and may even contain abnormal or wrong knowledge, which greatly reduces the usability of the knowledge graph data in downstream tasks; the quality of the data even directly affects the development and progress of the related technologies. Some representation learning methods have already been applied to knowledge graph completion: they map complex objects and the corresponding relation information into a low-dimensional continuous vector space and measure the similarity between objects and relations based on these vectors, thereby realizing effective knowledge graph completion. Existing deep learning networks contain hundreds of thousands of neurons or more, and such models achieve strong results in terms of the speed, stability and accuracy of the knowledge graph completion task.
However, some special application fields such as intelligent medical treatment, military affairs, finance and traffic safety are very sensitive to risk; any decision made by the inference model can affect the life and property safety of the user, and if the user has no intuitive understanding of the completion process and its result, trust between the user and the machine model cannot be established.
Secondly, the sample data in a small sample knowledge graph is sparse: entities and relations in the knowledge graph lack corresponding triplet information, or the path information between a source entity and a target entity that is sufficient for reasoning is also lacking. Because the relations are sparse, the number of entity pairs available for knowledge graph reasoning is insufficient, so that enough prediction information cannot be obtained during reasoning and the path becomes ambiguous.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to provide a small sample knowledge graph completion method based on reinforcement learning that solves the interpretability problem and the sparsity problem of knowledge graph data in the existing knowledge graph completion technology, so as to better provide data support for downstream tasks.
The technical scheme adopted for solving the technical problems is as follows:
a small sample knowledge graph completion method based on reinforcement learning comprises the following steps:
S1, inputting a triplet to be complemented (h_0, r, ?), where h_0 denotes the head entity of the triplet to be complemented, r denotes the target relation, and ? denotes the tail entity to be complemented; the head entity h_0 and the target relation r serve as the initial input of the decision network at time t_0;
S2, taking the input head entity as the agent of the decision network at the current time step t; based on the input head entity and the target relation, updating the state S_t of the decision network at the current time step t:
S_t = (h_t, r, E_t)
where h_t denotes the head entity input at the current time step t, and E_t denotes the history path representation of the agent, whose initial value is empty;
based on the input head entity and the target relation, constructing the action space A_t of the decision network at the current time step t, where the action space A_t comprises the following two parts:
a stored action space constructed from the knowledge graph to be complemented, whose elements are the tail entities e_k of the k-th triplets in the triplet set T of the knowledge graph to be complemented whose head entity is h_t and whose relation is r;
an additional action space obtained by inference: the meta learning network of the knowledge graph to be complemented is used to extract the meta representation of the target relation r, and reasoning based on the head entity h_t and the meta representation of the target relation r yields the additional action space, whose elements are the tail entities e_l of the l-th triplets obtained by meta learning network reasoning together with the relations r_l in the corresponding triplets of the knowledge graph to be complemented;
S3, based on the policy function of the decision network, determining the action a_t of the current time step from the action space A_t, and constructing the inference result (h_0, r_t′, e_t) based on the tail entity e_t obtained by taking the action a_t, where r_t′ denotes the Hadamard product of the relations traversed along the history path, expressed as r_t′ = r_1 ⊙ r_2 ⊙ … ⊙ r_t, ⊙ denotes the Hadamard product, and r_t denotes the relation in the triplet of the knowledge graph to be complemented corresponding to the tail entity e_t;
S4, based on the inference result (h_0, r_t′, e_t) of the current time step, judging the similarity between r_t′ and r and between e_t and (h_0 + r); if a preset similarity threshold is met, the completion is finished; otherwise, constructing the head entity h_{t+1} of the decision network at the next time step based on the tail entity e_t of the current inference result, updating the history path representation E_t of the agent based on the action a_t of the current time step, and then returning to step S2;
Before the small sample knowledge graph completion, the decision network and the meta learning network are trained according to the following steps:
B1, constructing a support set and a query set by using the knowledge graph to be complemented, and training the meta learning network based on the support set and the query set to obtain the trained meta learning network of the knowledge graph to be complemented;
B2, embedding the trained meta learning network into the policy network as the inference module of the additional action space, and then training the policy network by using the triplets in the knowledge graph to be complemented as training triplets.
Further, the policy network is an LSTM network, and the policy function is:
π_θ(a_t ∣ S_t) = σ(A_t ⊗ W_2 ReLU(W_1 S_t))
where σ denotes the softmax operator; W_1 and W_2 are parameters of the policy network; ⊗ denotes the tensor product; π_θ(a_t ∣ S_t) denotes the probability distribution over the elements of the action space A_t at the current time step, and the element with the highest probability in the action space A_t is determined as the action a_t of the current time step.
Further, training the policy network by using the triplets in the knowledge graph to be complemented as training triplets specifically comprises the following steps:
B21, randomly selecting one triplet from the knowledge graph to be complemented as a training triplet, designating its head entity and relation as the head entity h_0 and the target relation r, and designating its tail entity as the tail entity ground truth e_0;
B22, according to steps S2-S3, obtaining the inference result (h_0, r_t′, e_t) of the current time step;
B23, based on the triplet (h_0, r_t′, e_t), judging whether it equals (h_0, r, e_0); if yes, the completion is finished, the reward R_b = 1 is obtained, the reward value R(a_t) is calculated, and then step B24 is performed;
otherwise, judging whether the similarity between (h_0, r_t′, e_t) and (h_0, r, e_0) satisfies a preset similarity threshold; if so, the completion is finished, the reward R_b = 1 is obtained, the reward value R(a_t) is calculated, and then step B24 is performed;
otherwise, the reward R_b = 0 is obtained and the reward value R(a_t) is calculated; the head entity h_{t+1} of the decision network at the next time step is constructed based on the tail entity e_t of the current inference result; the history path representation E_t of the agent is updated based on the action a_t of the current time step; then, return to step B22;
the reward value R(a_t) is calculated by the following reward function:
R(a_t) = R_b + (1 − R_b) f(h_0, r_t′, e_t)
where f(·) denotes the cosine similarity function;
B24, the training loss is calculated according to the following loss function, and the parameters of the policy network are updated by back-propagating the gradient:
L(θ_C) = − Σ_{t=1}^{T} R(a_t) log π_θ(a_t ∣ S_t)
where T denotes the total number of steps required to reach the final inference result, π_θ(a_t ∣ S_t) denotes the probability of taking action a_t in state S_t, and R(a_t) denotes the reward value obtained by taking that action;
the optimization process of the policy network parameters θ_C is as follows:
θ_C ← θ_C − β ∇_{θ_C} L(θ_C)
where β is a preset learning rate parameter and ∇_{θ_C} L(θ_C) denotes the gradient of the loss;
B25, judging whether the training has converged or a preset number of training rounds has been reached; if so, the training is finished; otherwise, return to step B21.
Further, whether the similarity between r_t′ and r and between e_t and (h_0 + r) meets the preset similarity threshold λ is judged according to the following formula:
f(r_t′, r) ≥ λ and f(e_t, h_0 + r) ≥ λ
where f(·) denotes the cosine similarity function.
Further, extracting the meta representation of the target relation r by using the meta learning network of the knowledge graph to be complemented, and inferring the additional action space based on the head entity h_t and the meta representation of the target relation r, comprises the following steps:
A1, extracting a support set from the knowledge graph to be complemented based on the relation r, where a triplet in the support set is expressed as (h_i, r, t_i) and i is the sequence number of the triplet in the support set;
A2, constructing a neighbor set for each head entity h_i and each tail entity t_i of the support set, and initializing, through a pre-training model, the feature vectors of each head entity h_i, each tail entity t_i, and the entities in their neighbor sets;
A3, aggregating neighbor information to update the feature vectors of each head entity h_i and tail entity t_i;
A4, based on each updated head entity h_i and tail entity t_i, constructing the embedding vector of the head-tail entity pair of each triplet in the support set;
A5, extracting the meta representation of the target relation r based on the embedding vectors of the head-tail entity pairs of the triplets in the support set;
A6, using the head entity h_t and the meta representation of the target relation r, constructing a query set based on the knowledge graph to be complemented, screening the triplets in the query set that meet preset screening conditions as expansion results, and constructing the additional action space based on the expansion results.
Further, in step A3, the feature vector of each head entity h_i and tail entity t_i is updated by aggregating neighbor information with an attention mechanism, where σ denotes a sigmoid activation function, e_n is the initialized feature vector of the n-th neighbor entity of the current entity e obtained through the pre-training model, r_n is the initialized feature vector of the relation between the current entity e and its neighbor entity e_n obtained through pre-training, α_n is the attention weight of the n-th neighbor entity, ⊕ denotes the join operator, u_rt, W_rt and b_rt are learnable parameters, and T denotes matrix transpose.
Further, in step A4, the embedding vector of the head-tail entity pair of each triplet in the support set is constructed from the updated head entities h_i and tail entities t_i as:
f_θ(h_i) ⊕ f_θ(t_i)
where ⊕ denotes the join operator, and f_θ(h_i) and f_θ(t_i) respectively denote the updated feature vectors of the head entity and the tail entity of the i-th triplet in the support set.
Further, in step A5, extracting the meta representation of the target relation r based on the embedding vectors of the head-tail entity pairs of the triplets in the support set comprises:
A51, sequentially inputting the embedding vectors of the head-tail entity pairs of the triplets into a recurrent neural network for encoding, where the number of layers of the recurrent neural network is consistent with the number of triplets in the support set;
A52, updating the hidden features output by the recurrent neural network by adding a residual connection:
m_i′ = m_i + f_θ(h_i) ⊕ f_θ(t_i)
where m_i denotes the hidden feature of the i-th entity pair output by the recurrent neural network, and m_i′ denotes the updated hidden feature of the i-th entity pair;
A53, aggregating the hidden features of the entity pairs with attention to obtain the meta representation f(R_r) of the target relation r, where u_R, W_R and b_R are learnable parameters and T denotes matrix transpose.
Further, in step A6, using the head entity h_t and the meta representation of the target relation r, constructing a query set based on the knowledge graph to be complemented, screening the triplets in the query set that meet preset screening conditions as expansion results, and constructing the additional action space based on the expansion results comprises the following steps:
A61, using the head entity h_t and the meta representation of the target relation r, extracting entity data from the knowledge graph to be complemented to construct a query set;
A62, using the meta representation of the target relation r, obtaining the similarity score between the head-tail entity pair of each triplet in the query set and the relation r, where the similarity score function involves P_r, the normal vector of the hyperplane of the relation r in the knowledge graph, the j-th tail entity of the support set of the relation r, and f(R_r), the meta representation of the target relation r;
A63, based on the similarity scores, screening the triplets in the query set that meet the preset screening conditions as expansion results, and constructing the additional action space based on the expansion results.
Further, in step B1, a support set and a query set are constructed by using the knowledge graph to be complemented, and the meta learning network is trained based on the support set and the query set, including:
B11, randomly designating a relation as the training relation, and extracting the triplets whose relation is the training relation from the knowledge graph to be complemented as the training set;
B12, dividing the training set into two parts to form a support set S_r and a positive query set, and constructing a negative query set by replacing the tail entities of the triplets in the positive query set;
B13, based on the support set S_r, obtaining the meta representation of the training relation r according to steps A2-A5;
B14, based on the meta representation of the training relation r, respectively calculating the matching scores of the triplets in the positive query set and the negative query set, and calculating the loss L(S_r) with a margin-based loss function of the matching scores of the i-th positive query triplet and the i-th negative query triplet, where ξ is the safety margin distance and γ is the discount factor;
then, the calculated loss L(S_r) is used to update the hyperplane normal vector P_r of the training relation in the knowledge graph:
P_r ← P_r − l_p ∇_{P_r} L(S_r)
where l_p denotes the learning rate used by the current model when updating the hyperplane normal vector P_r on the support set data;
B15, judging whether the training has converged or a preset number of training rounds has been reached; if so, the training is finished; otherwise, return to step B11.
The beneficial effects of the invention are as follows:
(1) Knowledge is captured from the knowledge graph by a small sample method, and entity pairs associated with each relation are matched for that relation, so that the data available in the knowledge graph during reasoning is expanded to form a potential action space, providing data support for the inference of the meta learning network of the knowledge graph. This solves the problems that, due to the sparsity of the knowledge graph, the number of entity pairs available for knowledge graph reasoning is insufficient and enough prediction information cannot be obtained during reasoning.
(2) Reinforcement learning is adopted as the basic model of knowledge graph completion: first, the state of the agent at the current time step is determined; second, the actions the agent can take are obtained, which consist of two parts, the original action space and an additional action space obtained through meta-learning. The completion process is thus converted into a sequential decision problem in which a reinforcement learning decision network explores paths in the knowledge graph, and the path information traversed by the agent is retained, which improves the interpretability of the completion method and builds a bridge of trust between the user and the model.
Drawings
FIG. 1 is a schematic flow diagram of a small sample knowledge graph completion method based on reinforcement learning in an embodiment of the invention;
FIG. 2 is a flow chart of constructing additional action spaces in an embodiment of the invention.
Detailed Description
The invention aims to provide a small sample knowledge graph completion method based on reinforcement learning, which solves the interpretability problem and the sparsity problem of knowledge graph data in the existing knowledge graph completion technology, so as to better provide data support for downstream tasks. In the invention, a support set and a query set are first constructed by using the knowledge graph to be complemented, and the meta learning network is trained based on the support set and the query set to obtain the trained meta learning network of the knowledge graph to be complemented; the trained meta learning network is then embedded into the policy network as the inference module of the additional action space, and the policy network is trained by using the triplets in the knowledge graph to be complemented as training triplets. After the trained policy network integrated with the meta learning network is obtained, it can be used to execute the knowledge graph completion task.
When the knowledge graph completion task is executed, at the current time step an action space is constructed according to the input head entity, which serves as the agent, and the target relation, which serves as the input of the decision network; the input head entity is constructed from the tail entity inferred at the previous time step. The action space comprises the stored action space of triplets in the knowledge graph with the same head entity and relation, and in addition the meta learning network is used to extract the features of the target relation and to reason over the head entity and the target relation features to obtain an additional action space that does not exist in the original knowledge graph. The policy function of the decision network then constructs an inference result from the tail entity corresponding to the chosen action in the action space, and whether the complemented tail entity meets the requirement is judged by calculating the similarity between the inference result and the query triplet. If the requirement is not met, a new head entity is constructed from the inferred tail entity, and the new head entity and the target relation are input into the decision network to continue the iteration.
This scheme captures knowledge from the knowledge graph and matches the entity pairs associated with each relation, so that the data available in the knowledge graph during reasoning is expanded to form a potential action space, which provides data support for the inference of the meta learning network of the knowledge graph and solves the problems that, due to the sparsity of the knowledge graph, the number of entity pairs available for knowledge graph reasoning is insufficient and enough prediction information cannot be obtained during reasoning. In addition, during the path exploration of the agent, the path information traversed by the agent is retained and continuously updated with the time steps, which improves the interpretability of the completion method.
Examples:
For a small sample knowledge graph, let ε denote the set of entities, R the set of relations, and T the set of triplets in the knowledge graph.
Given the small sample knowledge graph data and a source query triplet (h_0, r, ?), where h_0 is the head entity of the query and r is the target relation of the query, this embodiment explores the graph structure formed by the knowledge graph data through a reinforcement learning strategy and predicts the tail entity, thereby completing the knowledge graph. The flow of the small sample knowledge graph completion method based on reinforcement learning provided in this embodiment is shown in FIG. 1, and the method includes the following implementation steps:
S1, inputting a query triplet (h_t, r, ?)
In this step, if the current time step is the initial time t_0 of the query, the input query triplet is (h_0, r, ?); if the current time step is not t_0, the head entity h_t in the input query triplet is constructed from the tail entity e_{t-1} predicted at the previous time step.
S2, updating the state S_t of the decision network at the current time step t based on the head entity and target relation input at the current time step t, and constructing the action space A_t of the decision network at the current time step t
In this step, the head entity h_t input at the current time step t serves as the agent of the decision network at the current time step t; based on the input head entity and target relation, the state S_t of the decision network at the current time step t is updated:
S_t = (h_t, r, E_t)
where h_t denotes the head entity input at the current time step t, and E_t denotes the history path representation of the agent, i.e. the actions a_t taken at each time step, which form the path traversed up to time t. By way of example: when t = 0, E_t is empty; at t = 1, the action a_1 taken is saved into E_t; at t = 2, the action a_2 taken is saved into E_t; and so on. The historical path of the agent is thereby preserved, which provides interpretability of the completion process.
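As an illustrative sketch (the variable names below are assumed for illustration and are not part of the method's notation), this state and history-path bookkeeping can be expressed as follows:

```python
# State of the agent at step t: current head entity, target relation, and the
# history path E_t, which accumulates the actions taken so far and makes the
# final answer traceable to an explicit path.
state = {"h_t": "h0", "r": "target_relation", "E_t": []}

def take_step(state, action):
    """Apply an action (relation, tail entity): log the hop and move the agent."""
    rel, tail = action
    state["E_t"].append((state["h_t"], rel, tail))  # keep the traversed edge
    state["h_t"] = tail                             # the tail becomes the next head
    return state

state = take_step(state, ("r1", "e1"))
state = take_step(state, ("r2", "e2"))
print(state["E_t"])  # [('h0', 'r1', 'e1'), ('e1', 'r2', 'e2')]
```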
The action space A_t of the decision network at the current time step t is composed of two parts: a stored action space based on the knowledge graph to be complemented, whose elements are the tail entities e_k of the k-th triplets in the triplet set T of the knowledge graph to be complemented whose head entity is h_t and whose relation is r; and an additional action space that does not exist in the original knowledge graph, obtained by reasoning based on the head entity and the target relation features of the current time step, whose elements are the tail entities e_l of the l-th triplets obtained by meta learning network reasoning together with the relations r_l in the corresponding triplets of the knowledge graph to be complemented.
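A minimal sketch of assembling this two-part action space is given below; `extra_candidates` stands in for the output of the meta learning module described next, and the names are illustrative only:

```python
from collections import defaultdict

def build_action_space(by_head_rel, h_t, r, extra_candidates):
    """Existing actions: (relation, tail) pairs already stored in the graph for (h_t, r).
    Extra actions: (relation, tail) pairs inferred by the meta learning network."""
    existing = {(r, tail) for tail in by_head_rel.get((h_t, r), set())}
    extra = set(extra_candidates)   # assumed output of the meta learning module
    return existing | extra

# Toy index: the graph already stores two tails for (h0, r).
by_head_rel = defaultdict(set, {("h0", "r"): {"e1", "e2"}})
A_t = build_action_space(by_head_rel, "h0", "r", [("r_l", "e9")])
print(sorted(A_t))  # [('r', 'e1'), ('r', 'e2'), ('r_l', 'e9')]
```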
The reasoning process for the additional action space, as shown in FIG. 2, specifically includes:
A1, extracting a support set from the knowledge graph to be complemented based on the target relation of the query
In this step, based on the target relation r of the query, a support set is extracted from the knowledge graph to be complemented, and a triplet in the support set is expressed as (h_i, r, t_i), where i is the sequence number of the triplet in the support set.
A2, initializing feature vectors of head and tail entities and neighbor entities in the support set
In this step, a neighbor set is constructed for each head entity h_i and each tail entity t_i of the support set, and the feature vectors of each head entity h_i, each tail entity t_i, and the entities in their neighbor sets are initialized by a pre-training model such as a TransE model.
A3, aggregating neighbor information and updating the head and tail entity feature vectors
In this step, the feature vector of each head entity h_i and tail entity t_i is updated by aggregating neighbor information with an attention mechanism, where σ denotes a sigmoid activation function, e_n is the initialized feature vector of the n-th neighbor entity of the current entity e obtained through the pre-training model, r_n is the initialized feature vector of the relation between the current entity e and its neighbor entity e_n obtained through pre-training, α_n is the attention weight of the n-th neighbor entity, ⊕ denotes the join operator, u_rt, W_rt and b_rt are learnable parameters, and T denotes matrix transpose.
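The exact update formula is not reproduced here; the following sketch shows one plausible attention-weighted neighbor aggregation under the assumptions stated in the comments (the shapes, the tanh scoring form, and the final sigmoid combination are assumptions made for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aggregate_neighbors(e, neighbor_rels, neighbor_ents, u_rt, W_rt, b_rt):
    """Attention-weighted neighbor aggregation (one plausible form, not the
    patent's exact formula). e: (d,) entity vector;
    neighbor_rels / neighbor_ents: (n, d) arrays of relation / entity vectors."""
    feats = np.concatenate([neighbor_rels, neighbor_ents], axis=1)   # (n, 2d)
    scores = feats @ W_rt.T + b_rt                                   # (n, k)
    logits = np.tanh(scores) @ u_rt                                  # one score per neighbor
    alpha = np.exp(logits) / np.exp(logits).sum()                    # attention weights
    context = (alpha[:, None] * neighbor_ents).sum(axis=0)           # aggregated neighbors
    return sigmoid(e + context)                                      # updated entity vector

d, n, k = 4, 3, 4
rng = np.random.default_rng(0)
print(aggregate_neighbors(rng.normal(size=d), rng.normal(size=(n, d)),
                          rng.normal(size=(n, d)), rng.normal(size=k),
                          rng.normal(size=(k, 2 * d)), np.zeros(k)))
```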
A4, constructing embedded vectors of head and tail entity pairs of each triplet in the support set based on the updated feature vectors of the head and tail entities
In this step, based on each updated head entity h_i and tail entity t_i, the embedding vector of the head-tail entity pair of each triplet in the support set is constructed as:
f_θ(h_i) ⊕ f_θ(t_i)
where ⊕ denotes the join operator, and f_θ(h_i) and f_θ(t_i) respectively denote the updated feature vectors of the head entity and the tail entity of the i-th triplet in the support set.
A5, extracting meta-representation of target relation based on embedded vector of each triplet head and tail entity pair in support set
In this step, the meta representation of the target relation r is extracted based on the embedding vectors of the head-tail entity pairs of the triplets in the support set, which specifically comprises the following steps:
A51, sequentially inputting the embedding vectors of the head-tail entity pairs of the triplets into a recurrent neural network for encoding, where the number of layers of the recurrent neural network is consistent with the number of triplets in the support set; the recurrent neural network adopted in this embodiment is an RNN network or a network obtained by improving on the basis of the RNN.
A52, updating the hidden features output by the recurrent neural network by adding a residual connection:
m_i′ = m_i + f_θ(h_i) ⊕ f_θ(t_i)
where m_i denotes the hidden feature of the i-th entity pair output by the recurrent neural network, and m_i′ denotes the updated hidden feature of the i-th entity pair;
A53, aggregating the hidden features of the entity pairs with attention to obtain the meta representation f(R_r) of the target relation r, where u_R, W_R and b_R are learnable parameters and T denotes matrix transpose.
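A compact sketch of steps A51-A53 is given below; the use of an LSTM cell, the hidden size, and the two-layer attention scorer are assumptions made for illustration rather than the exact parameterization of this embodiment:

```python
import torch
import torch.nn as nn

class RelationMetaEncoder(nn.Module):
    """Encode support-set entity-pair embeddings with a recurrent network, add a
    residual connection, and attention-pool them into one relation meta-representation."""
    def __init__(self, dim):
        super().__init__()
        self.rnn = nn.LSTM(input_size=dim, hidden_size=dim, batch_first=True)
        self.att = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, pair_emb):                     # pair_emb: (num_support, dim)
        m, _ = self.rnn(pair_emb.unsqueeze(0))       # encode the support sequence
        m = m.squeeze(0) + pair_emb                  # residual connection (A52)
        alpha = torch.softmax(self.att(m), dim=0)    # attention over support pairs (A53)
        return (alpha * m).sum(dim=0)                # meta representation of relation r

encoder = RelationMetaEncoder(dim=8)
meta_r = encoder(torch.randn(5, 8))                  # 5 support pairs, dimension 8
print(meta_r.shape)                                  # torch.Size([8])
```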
A6, constructing a query set based on the knowledge graph to be complemented according to the head entity and the meta representation of the target relation, screening out the triplets that meet the conditions, and constructing the additional action space
In this step, using the head entity h_t and the meta representation of the target relation r, a query set is constructed based on the knowledge graph to be complemented, the triplets in the query set that meet preset screening conditions are screened as expansion results, and the additional action space is constructed based on the expansion results. This specifically comprises the following steps:
A61, using the head entity h_t and the meta representation of the target relation r, extracting entity data from the knowledge graph to be complemented to construct a query set. The entity data can be extracted according to the entity type corresponding to the relation type; for example, if the relation is an interpersonal relation, the entity type is one that represents a person, such as a name or a job title. In this embodiment, the query set is constructed by extracting all tail entity data from the knowledge graph to be complemented.
A62, using the meta representation of the target relation r, obtaining the similarity score between the head-tail entity pair of each triplet in the query set and the relation r, where the similarity score function involves P_r, the normal vector of the hyperplane of the relation r in the knowledge graph to be complemented, the j-th tail entity of the support set of the relation r, and f(R_r), the meta representation of the target relation r;
A63, based on the similarity scores, screening the triplets in the query set that meet the preset screening conditions as expansion results, and constructing the additional action space based on the expansion results, whose elements are the tail entities e_l of the l-th triplets obtained by meta learning network reasoning together with the relations r_l in the corresponding triplets of the knowledge graph to be complemented.
In this embodiment, the construction of the additional action space further enriches the action space information on the basis of the stored action space of the knowledge graph, so that more prediction information can be obtained, the sparsity problem in the reasoning process is solved, and the reasoning accuracy is improved.
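The following sketch illustrates the screening idea of steps A61-A63 under simplifying assumptions: the hyperplane-projection score described above is replaced here by a plain cosine score against the relation meta-representation, and all names and the top-k screening rule are illustrative only:

```python
import numpy as np

def extra_action_space(h_t, candidates, rel_meta, ent_emb, top_k=5):
    """Score candidate tail entities against the relation meta-representation and
    keep the best ones as the extra action space (simplified illustrative form)."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    scored = []
    for tail in candidates:
        pair = ent_emb[tail] - ent_emb[h_t]      # pair vector standing in for the relation
        scored.append((cos(pair, rel_meta), tail))
    scored.sort(reverse=True)                    # best-scoring candidates first
    return [tail for _, tail in scored[:top_k]]

ent_emb = {name: np.random.default_rng(i).normal(size=8)
           for i, name in enumerate(["h0", "e1", "e2", "e3"])}
print(extra_action_space("h0", ["e1", "e2", "e3"], np.ones(8), ent_emb, top_k=2))
```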
S3, constructing an inference result by a policy function based on the decision network according to the tail entity corresponding to the action in the action space
In this step, based on the policy function of the decision network, the action a_t of the current time step is determined from the action space A_t, and the inference result (h_0, r_t′, e_t) is constructed based on the tail entity e_t obtained by taking the action a_t, where r_t′ denotes the Hadamard product of the relations traversed along the history path, expressed as r_t′ = r_1 ⊙ r_2 ⊙ … ⊙ r_t, ⊙ denotes the Hadamard product, and r_t denotes the relation in the triplet of the knowledge graph to be complemented corresponding to the tail entity e_t.
As an exemplary option, in this embodiment, the policy network may employ an LSTM network, where the policy function is:
π_θ(a_t ∣ S_t) = σ(A_t ⊗ W_2 ReLU(W_1 S_t))
where σ denotes the softmax operator; W_1 and W_2 are parameters of the policy network; ⊗ denotes the tensor product; π_θ(a_t ∣ S_t) denotes the probability distribution over the elements of the action space A_t at the current time step, and the element with the highest probability in the action space A_t is determined as the action a_t of the current time step.
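As an illustrative sketch of this scoring-and-softmax step (the state encoding and all dimensions below are assumed, and the LSTM history encoder is reduced to a fixed state vector for brevity):

```python
import numpy as np

def policy_distribution(action_emb, state_vec, W1, W2):
    """Score each candidate action against the encoded state and normalize with
    softmax, mirroring pi_theta(a_t | S_t) = softmax(A_t . W2 ReLU(W1 s_t))."""
    hidden = np.maximum(0.0, W1 @ state_vec)     # ReLU(W1 s_t)
    query = W2 @ hidden                          # W2 ReLU(W1 s_t)
    logits = action_emb @ query                  # one score per candidate action
    logits -= logits.max()                       # numerical stability
    return np.exp(logits) / np.exp(logits).sum()

rng = np.random.default_rng(1)
A_t = rng.normal(size=(4, 8))                    # embeddings of 4 candidate actions
s_t = rng.normal(size=12)                        # assumed encoding of (h_t, r, E_t)
probs = policy_distribution(A_t, s_t, rng.normal(size=(16, 12)), rng.normal(size=(8, 16)))
print(probs, probs.argmax())                     # pick the highest-probability action
```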
S4, judging whether the completed tail entity meets the requirement or not by calculating the similarity between the reasoning result and the source query triplet
In this step, based on the inference result (h_0, r_t′, e_t), the similarity between the inference result and the source query triplet (h_0, r, ?) is judged; the similarity between r_t′ and r and between e_t and (h_0 + r) is evaluated by the formula:
f(r_t′, r) ≥ λ and f(e_t, h_0 + r) ≥ λ
where f(·) denotes the cosine similarity function and λ is the similarity threshold.
That is, whether the preset similarity threshold is met is judged by checking whether the above formula holds.
If the preset similarity threshold is satisfied, the completion result is obtained, and the tail entity e_t in the inference result (h_0, r_t′, e_t) is the complemented tail entity of the source query triplet (h_0, r, ?).
If the similarity threshold cannot be met, the search needs to continue: the head entity h_{t+1} of the decision network at the next time step is constructed based on the tail entity e_t of the current inference result; the history path representation E_t of the agent is updated based on the action a_t of the current time step; then, the process returns to step S1.
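A minimal sketch of this acceptance test is shown below, assuming vector embeddings for the relations and entities; the threshold value used here is illustrative:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def completion_accepted(r_path, r_query, e_tail, h0, lam=0.8):
    """Accept the inferred tail entity when both the traversed relation product and
    the tail embedding are close enough (cosine) to the query; lam is the threshold."""
    return cosine(r_path, r_query) >= lam and cosine(e_tail, h0 + r_query) >= lam

rng = np.random.default_rng(2)
r_q = rng.normal(size=8)
h0 = rng.normal(size=8)
print(completion_accepted(r_q, r_q, h0 + r_q, h0))   # True: a perfect match
print(completion_accepted(rng.normal(size=8), r_q, rng.normal(size=8), h0))
```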
The embodiment also provides a training mode of the decision network and the meta learning network, which specifically comprises the following steps:
b1, constructing a support set and a query set by using the knowledge graph to be complemented, and training a meta learning network based on the support set and the query set to obtain the meta learning network of the knowledge graph to be complemented after training;
the method specifically comprises the following steps:
B11, randomly designating a relation as the training relation, and extracting the triplets whose relation is the training relation from the knowledge graph to be complemented as the training set;
B12, dividing the training set into two parts to form a support set S_r and a positive query set, and constructing a negative query set by replacing the tail entities of the triplets in the positive query set;
B13, based on the support set S_r, obtaining the meta representation of the training relation r according to steps A2-A5;
B14, based on the meta representation of the training relation r, respectively calculating the matching scores of the triplets in the positive query set and the negative query set, and calculating the loss L(S_r) with a margin-based loss function of the matching scores of the i-th positive query triplet and the i-th negative query triplet, where ξ is the safety margin distance and γ is the discount factor;
then, the calculated loss L(S_r) is used to update the hyperplane normal vector P_r of the training relation in the knowledge graph:
P_r ← P_r − l_p ∇_{P_r} L(S_r)
where l_p denotes the learning rate used by the current model when updating the hyperplane normal vector P_r on the support set data;
B15, judging whether the training has converged or a preset number of training rounds has been reached; if so, the training is finished; otherwise, return to step B11.
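The margin-based matching loss and the update of P_r are sketched below; since the exact placement of the discount factor γ is not spelled out in the text above, the form used here is one plausible reading, and the values are illustrative:

```python
import numpy as np

def margin_loss(pos_scores, neg_scores, xi=1.0, gamma=1.0):
    """Margin-based loss over positive/negative query triplets.
    One plausible reading of step B14: gamma discounts the per-pair hinge term."""
    pos = np.asarray(pos_scores)
    neg = np.asarray(neg_scores)
    return float(np.sum(gamma * np.maximum(0.0, xi + neg - pos)))

def update_hyperplane(P_r, grad_P_r, lr=0.01):
    """Gradient step on the hyperplane normal vector P_r using the support-set loss."""
    return P_r - lr * grad_P_r

print(margin_loss([2.0, 1.5], [0.5, 1.4], xi=1.0))   # 0.9: hinge fires on the second pair
```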
B2, embedding the trained meta learning network into the policy network as the inference module of the additional action space, and training the policy network by using the triplets in the knowledge graph to be complemented as training triplets, which specifically comprises the following steps:
B21, randomly selecting one triplet from the knowledge graph to be complemented as a training triplet, designating its head entity and relation as the head entity h_0 and the target relation r, and designating its tail entity as the tail entity ground truth e_0;
B22, according to steps S2-S3, obtaining the inference result (h_0, r_t′, e_t) of the current time step;
B23, based on the triplet (h_0, r_t′, e_t), judging whether it equals (h_0, r, e_0); if yes, the completion is finished, the reward R_b = 1 is obtained, the reward value R(a_t) is calculated, and then step B24 is performed;
otherwise, judging whether the similarity between (h_0, r_t′, e_t) and (h_0, r, e_0) satisfies a preset similarity threshold; if so, the completion is finished, the reward R_b = 1 is obtained, the reward value R(a_t) is calculated, and then step B24 is performed;
otherwise, the reward R_b = 0 is obtained and the reward value R(a_t) is calculated; the head entity h_{t+1} of the decision network at the next time step is constructed based on the tail entity e_t of the current inference result; the history path representation E_t of the agent is updated based on the action a_t of the current time step; then, return to step B22;
the reward value R(a_t) is calculated by the following reward function:
R(a_t) = R_b + (1 − R_b) f(h_0, r_t′, e_t)
where f(·) denotes the cosine similarity function;
B24, the training loss is calculated according to the following loss function, and the parameters of the policy network are updated by back-propagating the gradient:
L(θ_C) = − Σ_{t=1}^{T} R(a_t) log π_θ(a_t ∣ S_t)
where T denotes the total number of steps required to reach the final inference result, π_θ(a_t ∣ S_t) denotes the probability of taking action a_t in state S_t, and R(a_t) denotes the reward value obtained by taking that action;
the optimization process of the policy network parameters θ_C is as follows:
θ_C ← θ_C − β ∇_{θ_C} L(θ_C)
where β is a preset learning rate parameter and ∇_{θ_C} L(θ_C) denotes the gradient of the loss;
B25, judging whether the training has converged or a preset number of training rounds has been reached; if so, the training is finished; otherwise, return to step B21.
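A sketch of the policy-gradient update of steps B24-B25 is given below; the episode, the reward placement, and all sizes are illustrative, and the loss form is the standard REINFORCE objective consistent with the description above:

```python
import torch

def reinforce_loss(log_probs, rewards):
    """REINFORCE-style loss for the policy network: negative sum over the episode
    of reward-weighted log action probabilities, consistent with step B24."""
    log_probs = torch.stack(log_probs)          # (T,)
    rewards = torch.tensor(rewards)             # (T,)
    return -(rewards * log_probs).sum()

# Toy episode: three steps, only the terminal step is rewarded.
logits = torch.randn(3, 5, requires_grad=True)          # per-step action scores
dist = torch.distributions.Categorical(logits=logits)
actions = dist.sample()                                  # one action per step
loss = reinforce_loss(list(dist.log_prob(actions)), [0.0, 0.0, 1.0])
loss.backward()                                          # gradients for the update step
print(float(loss), logits.grad.shape)
```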
Finally, it should be noted that the above examples are only preferred embodiments and are not intended to limit the invention. It should be noted that modifications, equivalents, improvements and others may be made by those skilled in the art without departing from the spirit of the invention and the scope of the claims, and are intended to be included within the scope of the invention.

Claims (10)

1. The small sample knowledge graph completion method based on reinforcement learning is characterized by comprising the following steps of:
S1, inputting a triplet to be complemented (h_0, r, ?), where h_0 denotes the head entity of the triplet to be complemented, r denotes the target relation, and ? denotes the tail entity to be complemented; the head entity h_0 and the target relation r serve as the initial input of the decision network at time t_0;
S2, taking the input head entity as the agent of the decision network at the current time step t; based on the input head entity and the target relation, updating the state S_t of the decision network at the current time step t:
S_t = (h_t, r, E_t)
where h_t denotes the head entity input at the current time step t, and E_t denotes the history path representation of the agent, whose initial value is empty;
based on the input head entity and the target relation, constructing the action space A_t of the decision network at the current time step t, where the action space A_t comprises the following two parts:
a stored action space constructed from the knowledge graph to be complemented, whose elements are the tail entities e_k of the k-th triplets in the triplet set T of the knowledge graph to be complemented whose head entity is h_t and whose relation is r;
an additional action space obtained by inference: the meta learning network of the knowledge graph to be complemented is used to extract the meta representation of the target relation r, and reasoning based on the head entity h_t and the meta representation of the target relation r yields the additional action space, whose elements are the tail entities e_l of the l-th triplets obtained by meta learning network reasoning together with the relations r_l in the corresponding triplets of the knowledge graph to be complemented;
S3, based on the policy function of the decision network, determining the action a_t of the current time step from the action space A_t, and constructing the inference result (h_0, r_t′, e_t) based on the tail entity e_t obtained by taking the action a_t, where r_t′ denotes the Hadamard product of the relations traversed along the history path, expressed as r_t′ = r_1 ⊙ r_2 ⊙ … ⊙ r_t, ⊙ denotes the Hadamard product, and r_t denotes the relation in the triplet of the knowledge graph to be complemented corresponding to the tail entity e_t;
S4, based on the inference result (h_0, r_t′, e_t) of the current time step, judging the similarity between r_t′ and r and between e_t and (h_0 + r); if a preset similarity threshold is met, the completion is finished; otherwise, constructing the head entity h_{t+1} of the decision network at the next time step based on the tail entity e_t of the current inference result, updating the history path representation E_t of the agent based on the action a_t of the current time step, and then returning to step S2;
before the small sample knowledge graph completion, the decision network and the meta learning network are trained according to the following steps:
B1, constructing a support set and a query set by using the knowledge graph to be complemented, and training the meta learning network based on the support set and the query set to obtain the trained meta learning network of the knowledge graph to be complemented;
B2, embedding the trained meta learning network into the policy network as the inference module of the additional action space, and then training the policy network by using the triplets in the knowledge graph to be complemented as training triplets.
2. The reinforcement learning-based small sample knowledge graph completion method of claim 1, wherein the policy network is an LSTM network, and the policy function is:
π_θ(a_t ∣ S_t) = σ(A_t ⊗ W_2 ReLU(W_1 S_t))
where σ denotes the softmax operator; W_1 and W_2 are parameters of the policy network; ⊗ denotes the tensor product; π_θ(a_t ∣ S_t) denotes the probability distribution over the elements of the action space A_t at the current time step, and the element with the highest probability in the action space A_t is determined as the action a_t of the current time step.
3. The reinforcement learning-based small sample knowledge graph completion method of claim 1 or 2, wherein training the strategy network by using the triplet in the knowledge graph to be completed as a training triplet comprises:
B21, randomly selecting one triplet from the knowledge graph to be complemented as a training triplet, designating its head entity and relation as the head entity h_0 and the target relation r, and designating its tail entity as the tail entity ground truth e_0;
B22, according to steps S2-S3, obtaining the inference result (h_0, r_t′, e_t) of the current time step;
B23, based on the triplet (h_0, r_t′, e_t), judging whether it equals (h_0, r, e_0); if yes, the completion is finished, the reward R_b = 1 is obtained, the reward value R(a_t) is calculated, and then step B24 is performed;
otherwise, judging whether the similarity between (h_0, r_t′, e_t) and (h_0, r, e_0) satisfies a preset similarity threshold; if so, the completion is finished, the reward R_b = 1 is obtained, the reward value R(a_t) is calculated, and then step B24 is performed;
otherwise, the reward R_b = 0 is obtained and the reward value R(a_t) is calculated; the head entity h_{t+1} of the decision network at the next time step is constructed based on the tail entity e_t of the current inference result; the history path representation E_t of the agent is updated based on the action a_t of the current time step; then, return to step B22;
the reward value R(a_t) is calculated by the following reward function:
R(a_t) = R_b + (1 − R_b) f(h_0, r_t′, e_t)
where f(·) denotes the cosine similarity function;
B24, the training loss is calculated according to the following loss function, and the parameters of the policy network are updated by back-propagating the gradient:
L(θ_C) = − Σ_{t=1}^{T} R(a_t) log π_θ(a_t ∣ S_t)
where T denotes the total number of steps required to reach the final inference result, π_θ(a_t ∣ S_t) denotes the probability of taking action a_t in state S_t, and R(a_t) denotes the reward value obtained by taking that action;
the optimization process of the policy network parameters θ_C is as follows:
θ_C ← θ_C − β ∇_{θ_C} L(θ_C)
where β is a preset learning rate parameter and ∇_{θ_C} L(θ_C) denotes the gradient of the loss;
B25, judging whether the training has converged or a preset number of training rounds has been reached; if so, the training is finished; otherwise, return to step B21.
4. The reinforcement learning-based small sample knowledge graph completion method as claimed in claim 1 or 2, wherein whether the similarity between r_t′ and r and between e_t and (h_0 + r) meets the preset similarity threshold λ is judged according to the following formula:
f(r_t′, r) ≥ λ and f(e_t, h_0 + r) ≥ λ
where f(·) denotes the cosine similarity function.
5. The reinforcement learning-based small sample knowledge graph completion method as claimed in claim 1 or 2, wherein extracting the meta representation of the target relation r by using the meta learning network of the knowledge graph to be complemented, and inferring the additional action space based on the head entity h_t and the meta representation of the target relation r, comprises the following steps:
A1, extracting a support set from the knowledge graph to be complemented based on the relation r, where a triplet in the support set is expressed as (h_i, r, t_i) and i is the sequence number of the triplet in the support set;
A2, constructing a neighbor set for each head entity h_i and each tail entity t_i of the support set, and initializing, through a pre-training model, the feature vectors of each head entity h_i, each tail entity t_i, and the entities in their neighbor sets;
A3, aggregating neighbor information to update the feature vectors of each head entity h_i and tail entity t_i;
A4, based on each updated head entity h_i and tail entity t_i, constructing the embedding vector of the head-tail entity pair of each triplet in the support set;
A5, extracting the meta representation of the target relation r based on the embedding vectors of the head-tail entity pairs of the triplets in the support set;
A6, using the head entity h_t and the meta representation of the target relation r, constructing a query set based on the knowledge graph to be complemented, screening the triplets in the query set that meet preset screening conditions as expansion results, and constructing the additional action space based on the expansion results.
6. The reinforcement learning-based small sample knowledge graph completion method of claim 5, wherein in step A3 the feature vector of each head entity h_i and tail entity t_i is updated by aggregating neighbor information with an attention mechanism, where σ denotes a sigmoid activation function, e_n is the initialized feature vector of the n-th neighbor entity of the current entity e obtained through the pre-training model, r_n is the initialized feature vector of the relation between the current entity e and its neighbor entity e_n obtained through pre-training, α_n is the attention weight of the n-th neighbor entity, ⊕ denotes the join operator, u_rt, W_rt and b_rt are learnable parameters, and T denotes matrix transpose.
7. The reinforcement learning-based small sample knowledge graph completion method of claim 5, wherein in step A4 the embedding vector of the head-tail entity pair of each triplet in the support set is constructed from the updated head entities h_i and tail entities t_i as:
f_θ(h_i) ⊕ f_θ(t_i)
where ⊕ denotes the join operator, and f_θ(h_i) and f_θ(t_i) respectively denote the updated feature vectors of the head entity and the tail entity of the i-th triplet in the support set.
8. The reinforcement learning-based small sample knowledge graph completion method of claim 5, wherein in step A5 extracting the meta representation of the target relation r based on the embedding vectors of the head-tail entity pairs of the triplets in the support set comprises:
A51, sequentially inputting the embedding vectors of the head-tail entity pairs of the triplets into a recurrent neural network for encoding, where the number of layers of the recurrent neural network is consistent with the number of triplets in the support set;
A52, updating the hidden features output by the recurrent neural network by adding a residual connection:
m_i′ = m_i + f_θ(h_i) ⊕ f_θ(t_i)
where m_i denotes the hidden feature of the i-th entity pair output by the recurrent neural network, and m_i′ denotes the updated hidden feature of the i-th entity pair;
A53, aggregating the hidden features of the entity pairs with attention to obtain the meta representation f(R_r) of the target relation r, where u_R, W_R and b_R are learnable parameters and T denotes matrix transpose.
9. The reinforcement learning-based small sample knowledge graph completion method of claim 5, wherein in step A6 using the head entity h_t and the meta representation of the target relation r to construct a query set based on the knowledge graph to be complemented, screening the triplets in the query set that meet preset screening conditions as expansion results, and constructing the additional action space based on the expansion results comprises:
A61, using the head entity h_t and the meta representation of the target relation r, extracting entity data from the knowledge graph to be complemented to construct a query set;
A62, using the meta representation of the target relation r, obtaining the similarity score between the head-tail entity pair of each triplet in the query set and the relation r, where the similarity score function involves P_r, the normal vector of the hyperplane of the relation r in the knowledge graph to be complemented, the j-th tail entity of the support set of the relation r, and f(R_r), the meta representation of the target relation r;
A63, based on the similarity scores, screening the triplets in the query set that meet the preset screening conditions as expansion results, and constructing the additional action space based on the expansion results.
10. The reinforcement learning-based small sample knowledge graph completion method of claim 9, wherein the step B1 of constructing a support set and a query set using the knowledge graph to be completed and training the meta learning network based on the support set and the query set comprises:
B11, randomly designating a relation as the training relation, and extracting the triplets whose relation is the training relation from the knowledge graph to be complemented as the training set;
B12, dividing the training set into two parts to form a support set S_r and a positive query set, and constructing a negative query set by replacing the tail entities of the triplets in the positive query set;
B13, based on the support set S_r, obtaining the meta representation of the training relation r according to steps A2-A5;
B14, based on the meta representation of the training relation r, respectively calculating the matching scores of the triplets in the positive query set and the negative query set, and calculating the loss L(S_r) with a margin-based loss function of the matching scores of the i-th positive query triplet and the i-th negative query triplet, where ξ is the safety margin distance and γ is the discount factor;
then, the calculated loss L(S_r) is used to update the hyperplane normal vector P_r of the training relation in the knowledge graph:
P_r ← P_r − l_p ∇_{P_r} L(S_r)
where l_p denotes the learning rate used by the current model when updating the hyperplane normal vector P_r on the support set data;
B15, judging whether the training has converged or a preset number of training rounds has been reached; if so, the training is finished; otherwise, return to step B11.
CN202311112645.XA 2023-08-31 2023-08-31 Small sample knowledge graph completion method based on reinforcement learning Pending CN117150041A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311112645.XA CN117150041A (en) 2023-08-31 2023-08-31 Small sample knowledge graph completion method based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311112645.XA CN117150041A (en) 2023-08-31 2023-08-31 Small sample knowledge graph completion method based on reinforcement learning

Publications (1)

Publication Number Publication Date
CN117150041A true CN117150041A (en) 2023-12-01

Family

ID=88909445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311112645.XA Pending CN117150041A (en) 2023-08-31 2023-08-31 Small sample knowledge graph completion method based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN117150041A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117993497A (en) * 2024-03-15 2024-05-07 广州大学 Knowledge graph completion method based on meta-relation learning

Similar Documents

Publication Publication Date Title
WO2023000574A1 (en) Model training method, apparatus and device, and readable storage medium
Shi et al. Transductive semi-supervised deep learning using min-max features
Diallo et al. Deep embedding clustering based on contractive autoencoder
Malik et al. Multilayered echo state machine: A novel architecture and algorithm
Wang et al. Relational deep learning: A deep latent variable model for link prediction
Law et al. Multi-label classification using a cascade of stacked autoencoder and extreme learning machines
CN109389151B (en) Knowledge graph processing method and device based on semi-supervised embedded representation model
CN113066526B (en) Hypergraph-based drug-target-disease interaction prediction method
Qamar et al. Artificial neural networks: An overview
CN112528873B (en) Signal semantic recognition method based on multi-stage semantic representation and semantic calculation
Kim et al. Building deep random ferns without backpropagation
CN116015967B (en) Industrial Internet intrusion detection method based on improved whale algorithm optimization DELM
CN115422369B (en) Knowledge graph completion method and device based on improved TextRank
CN117150041A (en) Small sample knowledge graph completion method based on reinforcement learning
Junliang CNN or RNN: Review and Experimental Comparison on Image Classification
CN113408721A (en) Neural network structure searching method, apparatus, computer device and storage medium
CN114741460B (en) Knowledge graph data expansion method and system based on association between rules
CN117436451A (en) Agricultural pest and disease damage named entity identification method based on IDCNN-Attention
Liu et al. Tinygraph: joint feature and node condensation for graph neural networks
Dai et al. Two novel hybrid Self-Organizing Map based emotional learning algorithms
Zhang Deep loopy neural network model for graph structured data representation learning
CN115481256A (en) Inverse relation rotation embedding knowledge representation method and system based on convolution quaternion
CN114722212A (en) Automatic meta-path mining method oriented to character relation network
CN114722273A (en) Network alignment method, device and equipment based on local structural feature enhancement
JP6993250B2 (en) Content feature extractor, method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination