Large-scale knowledge graphs are crucial for structuring human knowledge; however, they often remain incomplete. This paper tackles the challenge of completing missing factual triples in knowledge graphs through rule-based reasoning. Current rule learning methods tend to allocate a significant portion of triples to constructing the graph during training, while neglecting multi-target reasoning scenarios. Furthermore, these methods typically depend on qualitative assessments of mined rules and lack a quantitative way to evaluate rule quality. We propose a model that optimizes training data usage and supports multi-target reasoning. To overcome the limitations in evaluating model performance and rule quality, we also propose two novel metrics. Experimental results show that our model outperforms baseline methods on five benchmark datasets and validate the effectiveness of these metrics.
The experimental data that support the findings of this study are available on GitHub at the following URL: https://github.com/lirt1231/MPLR
In the Neural LP framework, the tail is treated as the question to query and a single head as the answer to that query. A confidence \(\alpha _i\) is then assigned to each particular path \(p_i\).
To be more precise, our model simulates the removal of the edges associated with the input query, as previously detailed in Section 4.2. It is worth noting that in Neural LP [43], DRUM [32], and similar models, a reversed query (?, q, t) with answer h is added for each triple. However, for a fair comparison, we only use the query (h, q, ?).
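The difference between the two query conventions, and the idea of hiding query edges, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `build_queries` and `mask_query_edges` are hypothetical, and "edges associated with the input query" is read here as every triple sharing the query's head and relation, which is an assumption; Section 4.2 remains the authoritative definition.

```python
def build_queries(triples, add_reversed=False):
    """Turn (h, q, t) triples into (query, answer) pairs.

    A forward query is (h, q, None) with answer t. With add_reversed=True,
    the reversed query (None, q, t) with answer h is added as well, which
    mirrors the convention the text attributes to Neural LP and DRUM.
    """
    pairs = []
    for head, relation, tail in triples:
        pairs.append(((head, relation, None), tail))
        if add_reversed:
            pairs.append(((None, relation, tail), head))
    return pairs


def mask_query_edges(triples, head, relation):
    """Return the graph without the edges tied to the query (head, relation, ?).

    Assumption: all triples with the same head and relation are hidden,
    so the answers cannot be read directly off the graph.
    """
    return [t for t in triples if not (t[0] == head and t[1] == relation)]


# Hypothetical toy triples, used only for illustration.
TOY_TRIPLES = [("a", "r", "b"), ("a", "r", "c"), ("b", "s", "c")]

forward_only = build_queries(TOY_TRIPLES)
with_reversed = build_queries(TOY_TRIPLES, add_reversed=True)
visible_graph = mask_query_edges(TOY_TRIPLES, "a", "r")
print(len(forward_only), len(with_reversed), visible_graph)
```

Using only forward queries halves the number of training and evaluation queries relative to the reversed-query convention, which is why the two settings are not directly comparable.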
The URLs we use to implement these models are listed in Appendix B.2.
This work is supported by the "National Key R&D Program of China" (2021YFB2012400) and the "National Natural Science Foundation of China" (Grant No. 62272129).
Appendix A An example of TensorLog for KG reasoning
Each KG entity \(e \in \mathcal {E}\) depicted in Fig. 1 is encoded into a binary vector of length \(|\mathcal {E}| = 6\). For each predicate \(p \in \mathcal {P}\) and each pair of entities \(e_i, e_j \in \mathcal {E}\), the TensorLog operator associated with p is defined as a matrix \(\textrm{M}_p \in \{0,1\}^{|\mathcal {E}| \times |\mathcal {E}|}\). The (i, j)-th element of this matrix is set to 1 (highlighted in the matrices) if the triple \((e_i, p, e_j)\) exists in \(\mathcal {G}\). Taking the KG illustrated in Fig. 1 as an example, for the predicate \(p = \texttt {daughterOf}\), we have:
The rule sisterOf(X, Z) \(\wedge \) sonOf(Z, Y) \(\Rightarrow \) daughterOf(X, Y) can be simulated by performing the following sparse matrix multiplication:
By setting \(\textrm{v}_{z_1} = [0, 0, 1, 0, 0, 0]^{\top }\) as the one-hot vector of \(z_1\) and performing left multiplication with \(\textrm{v}_{z_1}^{\top }\), we compute \(s^{\top } = \textrm{v}_{z_1}^{\top } \cdot \mathrm {M_{p'}}\). This operation effectively selects the row in \(\mathrm {M_{p'}}\) corresponding to \(z_1\). Subsequent right-hand side multiplication by \(\textrm{v}_{x_1}\) yields the number of unique paths following the pattern \(\texttt {sisterOf} \wedge \texttt {sonOf}\) from \(z_1\) to \(x_1\): \(s^{\top } \cdot \textrm{v}_{x_1} = 1\).
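The operator construction and the path-counting product described above can be reproduced with a small dependency-free sketch. The six entity names and the triples below are hypothetical and are not the graph of Fig. 1; the helper names (`operator`, `matmul`, `one_hot`, `count_paths`) are likewise introduced only for illustration. Dense lists are used for readability, whereas a practical implementation would rely on sparse matrices.

```python
# Hypothetical 6-entity family graph; NOT the graph shown in Fig. 1.
ENTITIES = ["alice", "bob", "carol", "dave", "erin", "frank"]
INDEX = {name: i for i, name in enumerate(ENTITIES)}
TRIPLES = [
    ("alice", "sisterOf", "bob"),
    ("alice", "sisterOf", "dave"),
    ("bob", "sonOf", "carol"),
    ("dave", "sonOf", "carol"),
    ("bob", "sonOf", "erin"),
]


def operator(predicate):
    """Return M_p with M_p[i][j] = 1 iff (e_i, predicate, e_j) is a triple."""
    n = len(ENTITIES)
    matrix = [[0] * n for _ in range(n)]
    for head, pred, tail in TRIPLES:
        if pred == predicate:
            matrix[INDEX[head]][INDEX[tail]] = 1
    return matrix


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def one_hot(name):
    """Binary indicator vector of length |E| for a single entity."""
    vec = [0] * len(ENTITIES)
    vec[INDEX[name]] = 1
    return vec


def count_paths(start, end, path_matrix):
    """Compute v_start^T . M . v_end, the number of rule-body paths."""
    v_start, v_end = one_hot(start), one_hot(end)
    n = len(ENTITIES)
    # Left multiplication by the one-hot row vector selects the row of `start`.
    s = [sum(v_start[i] * path_matrix[i][j] for i in range(n)) for j in range(n)]
    # Right multiplication by the one-hot column vector reads off one entry.
    return sum(s[j] * v_end[j] for j in range(n))


# Body of the rule sisterOf(X, Z) AND sonOf(Z, Y) => daughterOf(X, Y).
M_BODY = matmul(operator("sisterOf"), operator("sonOf"))
print(count_paths("alice", "carol", M_BODY))  # prints 2
```

In this toy graph there are two sisterOf-then-sonOf paths from alice to carol (via bob and via dave), so the product returns 2; a pair with no such path yields 0.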
Appendix B Experiment
B.1 Extension to Table 3: saturations of UMLS
B.2 Implementation details
The models we use are available at the following URLs:
TransE, DistMult and ComplEx: https://github.com/Accenture/AmpliGraph (Accessed June 2023)
TuckER: https://github.com/ibalazevic/TuckER (Accessed June 2023)
RotatE: https://github.com/DeepGraphLearning/KnowledgeGraphEmbedding (Accessed June 2023)
ConvE: https://github.com/TimDettmers/ConvE (Accessed June 2023)
QuatDE: https://github.com/hopkin-ghp/QuatDE (Accessed June 2023)
KNN-KG: https://github.com/zjunlp/KNN-KG (Accessed June 2023)
RNNLogic: https://github.com/DeepGraphLearning/RNNLogic (Accessed June 2023)
Neural LP: https://github.com/fanyangxyz/Neural-LP (Accessed June 2023)
DRUM: https://github.com/alisadeghian/DRUM (Accessed June 2023)
MPLR: https://github.com/lirt1231/MPLR (Accessed June 2023)
B.3 Extension to Table 4: bifurcation of all datasets
B.4 Extension to Table 2: results on FB15K-237 and WN18
B.5 Extension to Table 7: more mined rules from the Family dataset
