CN111143691A

CN111143691A - Joint information extraction method and device

Info

Publication number: CN111143691A
Application number: CN201911416984.0A
Authority: CN
Inventors: 周兴发; 孙锐
Original assignee: Sichuan Changhong Electric Co Ltd
Current assignee: Sichuan Changhong Electric Co Ltd
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2020-05-12
Anticipated expiration: 2039-12-31
Also published as: CN111143691B

Abstract

The invention belongs to the field of data mining and information extraction and discloses a joint information extraction method and device. It addresses two problems of traditional information extraction: dependence on expert prior knowledge and on the performance of feature extraction tools, and the way errors in extracting information targets degrade the judgment of semantic relationships between targets. The device includes: a code initialization module for initializing semantic codes for the data, the predefined relationship types and the target types; a coding interaction module for mutual coding interaction between the data semantic code sequence and the relationship type semantic code sequence, and between the data semantic code sequence and the target type semantic code sequence; a coding fusion module for fusing the data semantic code sequence with the relationship type code sequence, and the data semantic code sequence with the target type code sequence; a relationship prediction module for predicting, from the obtained data semantic code, the relationship type contained in the data; and a target prediction module for predicting the first target and the second target corresponding to the relationship type contained in the data.

Description

Joint information extraction method and device

Technical Field

The invention belongs to the field of data mining and information extraction, and particularly relates to a joint information extraction method and device.

Background

With the rapid development of the internet and the accumulation of large amounts of information, there is an increasingly urgent need to extract the knowledge implicit in that information quickly and automatically, in particular to extract structured information from text in natural language processing. Information extraction lets people quickly grasp the local information structure hidden in a data source, and integrate and link information extracted from multiple data sources to understand the overall information structure. It can therefore provide input data or auxiliary information for other tasks such as public opinion early warning, intelligent chat systems, knowledge graph construction and knowledge reasoning.

The purpose of information extraction is to discover the objects of interest in a data source (the information targets) and, at the same time, to identify which of them have semantic relationships between them. Currently there are three mainstream approaches: template-based methods, statistics-based methods and representation-based methods.

In template-based methods, experts write rules and templates according to the characteristics of the data source, and information is then extracted by pattern matching. These methods achieve high accuracy but low recall: they can only extract and identify information structures covered by the prepared templates. They also depend heavily on expert prior knowledge and are time-consuming and labor-intensive.
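As a hedged illustration of the template-based approach, the sketch below uses a single hand-written regular expression as the template. The pattern, the relation name `spouse` and the function `extract_spouse` are hypothetical and are not taken from the patent.

```python
import re

# Hypothetical hand-written template for one relation type.
SPOUSE_TEMPLATE = re.compile(r"(?P<first>\w+)'s wife is (?P<second>\w+)")

def extract_spouse(sentence):
    """Return (first target, relation, second target) if the template matches, else None."""
    match = SPOUSE_TEMPLATE.search(sentence)
    if match is None:
        return None
    return (match.group("first"), "spouse", match.group("second"))

covered = extract_spouse("Yao's wife is Ye")        # phrasing covered by the template
missed = extract_spouse("Ye is married to Yao")     # same fact, different phrasing
print(covered)
print(missed)
```

The second sentence states the same fact but yields nothing because no template covers its phrasing, which is the kind of recall loss such methods suffer from.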

Statistics-based methods first extract features according to the characteristics of the data source (in text processing, for example, part of speech, dependency parses and n-gram features) and then extract information structures with statistical learning methods. Their advantage is that features are extracted automatically and the information structure is then learned automatically, which avoids the need for experts to write rules and templates. However, the extraction quality depends heavily on the performance of the feature extraction tools.

Representation-based methods directly initialize codes for the segmented units of a data source and then further encode the data semantically with neural networks such as convolutional neural networks and long short-term memory (LSTM) networks. Using the learned semantic codes, they extract information in a pipeline: information targets are extracted first, and the semantic relationships between the targets are judged afterwards. This avoids both the need for expert-written rule templates and the drawbacks of feature selection and dependence on feature extraction tools. However, because extraction is pipelined, the connection between target extraction and relationship judgment is ignored, and errors in target extraction propagate to the later stage, which degrades the judgment of semantic relationships between targets.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a joint information extraction method and device that solve the problems that traditional information extraction depends on expert prior knowledge and on the performance of feature extraction tools, and that the judgment of semantic relationships between targets is affected by errors in extracting the information targets.

The technical scheme adopted by the invention for solving the technical problems is as follows:

In one aspect, the present invention provides a joint information extraction method, including the following steps:

A. initializing semantic code sequences for the data, the predefined relationship types and the target types;

B. performing mutual interaction between the data semantic code sequence and the initialized semantic code sequence of the predefined relationship types to obtain a data semantic code sequence with relationship type information and a relationship type semantic code sequence with data information;

C. performing forward propagation of information using the data semantic code sequence with relationship type information, and predicting the relationship type contained in the data;

D. obtaining a relationship-type-enhanced data semantic code sequence from the predicted relationship type, the relationship type semantic code sequence with data information and the data semantic code sequence with relationship type information;

E. performing mutual interaction between the relationship-type-enhanced data semantic code sequence and the initialized semantic code sequence of the predefined target types to obtain a data semantic code sequence with target type information and a target type semantic code sequence with data information;

F. performing forward propagation using the data semantic code sequence with target type information, and predicting a first target corresponding to the relationship type contained in the data;

G. predicting, from the predicted first target, the target type semantic code sequence with data information and the data semantic code sequence with target type information, a second target corresponding to the relationship type contained in the data and to the first target.

As a further optimization, step A specifically includes:

A1. segmenting the data both into minimum units and into semantic combinations of minimum units, so that the data is converted into sequence data;

A2. initializing semantic codes for each of the two kinds of segmented data obtained in step A1, and then, taking the minimum-unit sequence as the basis, fusing the two initialized semantic code sequences to obtain a fused data semantic code sequence;

A3. initializing semantic codes for the predefined relationship types and the target types to obtain an initialized semantic code sequence of the predefined relationship types and an initialized semantic code sequence of the predefined target types.

As a further optimization, step B specifically includes:

B1. based on the fused data semantic code sequence obtained in step A2 and the initialized code sequence of the predefined relationship types obtained in step A3, aligning each sequence unit of the data with the initialized code sequence of the predefined relationship types to obtain alignment weights, and then fusing in the weighted sum of the predefined relationship type semantic codes to obtain a data semantic code sequence with relationship type information;

B2. based on the fused data semantic code sequence obtained in step A2 and the initialized code sequence of the predefined relationship types obtained in step A3, aligning each predefined relationship type initialization code with the data semantic code sequence to obtain alignment weights, and then fusing in the weighted sum of the data semantic codes to obtain a relationship type semantic code sequence with data information.
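The two-way alignment of step B can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: alignment weights are taken to be a softmax over dot products, and the weighted sum is fused back by element-wise addition. The names `attend` and `interact` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(queries, keys):
    """For each query, return its alignment weights over keys and the weighted sum of keys."""
    weights, contexts = [], []
    for q in queries:
        w = softmax([dot(q, k) for k in keys])
        ctx = [sum(wi * k[j] for wi, k in zip(w, keys)) for j in range(len(keys[0]))]
        weights.append(w)
        contexts.append(ctx)
    return weights, contexts

def interact(data_seq, type_seq):
    """Two-way interaction: each side is fused (here: added) with its context over the other side."""
    _, type_ctx = attend(data_seq, type_seq)   # B1: data units aligned with types
    _, data_ctx = attend(type_seq, data_seq)   # B2: types aligned with data units
    data_with_type = [[x + c for x, c in zip(v, ctx)] for v, ctx in zip(data_seq, type_ctx)]
    type_with_data = [[x + c for x, c in zip(v, ctx)] for v, ctx in zip(type_seq, data_ctx)]
    return data_with_type, type_with_data

V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three data units (toy values)
V_r = [[1.0, 0.0], [0.0, 1.0]]             # two relationship types (toy values)
data_with_rel, rel_with_data = interact(V, V_r)
print(len(data_with_rel), len(rel_with_data))  # 3 2
```

Both output sequences keep the length of their own side, so the data sequence can still be processed unit by unit and the type sequence can still be indexed by type.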

As a further optimization, in step C, performing forward propagation of information using the data semantic code sequence with relationship type information and predicting the relationship type contained in the data specifically includes: taking the data semantic code sequence with relationship type information obtained in step B1 as input, performing forward propagation of information through an encoding network such as a convolutional neural network or a long short-term memory network to obtain the semantic code of the data, and then predicting the relationship type in the data from the obtained semantic code.

As a further optimization, step D specifically includes:

D1. querying the relationship type semantic code sequence with data information according to the predicted relationship type to obtain the corresponding relationship type semantic code;

D2. fusing the relationship type semantic code obtained in step D1 with each sequence unit of the data semantic code sequence with relationship type information to obtain the relationship-type-enhanced data semantic code sequence.

As a further optimization, step E specifically includes:

E1. based on the relationship-type-enhanced data semantic code sequence obtained in step D2 and the initialized semantic code sequence of the predefined target types obtained in step A3, aligning the semantic code of each enhanced data sequence unit with the initialized semantic code sequence of the predefined target types to obtain alignment weights, and then fusing in the weighted sum of the predefined target type semantic codes to obtain a data semantic code sequence with target type information;

E2. based on the relationship-type-enhanced data semantic code sequence obtained in step D2 and the initialized semantic code sequence of the predefined target types obtained in step A3, aligning each predefined target type initialization code with the enhanced data semantic code sequence to obtain alignment weights, and then fusing in the weighted sum of the enhanced data semantic codes to obtain a target type semantic code sequence with data information.

As a further optimization, in step F, performing forward propagation using the data semantic code sequence with target type information and predicting the first target corresponding to the relationship type contained in the data specifically includes: taking the data semantic code sequence with target type information obtained in step E1 as input, performing forward propagation of information through an encoding network such as a convolutional neural network or a long short-term memory network to obtain the semantic code sequence of the data, and predicting the target type of each unit of the sequence from the obtained semantic code sequence to obtain the first target corresponding to the relationship type contained in the data.

As a further optimization, step G specifically includes:

G1. querying the target type semantic code sequence with data information according to the predicted first target to obtain the semantic code vector corresponding to the first target;

G2. fusing the first-target semantic code vector obtained in step G1 with each unit of the data semantic code sequence with target type information obtained in step E2 to obtain a target-type-enhanced data semantic code sequence;

G3. taking the target-type-enhanced data semantic code sequence obtained in step G2 as input, performing forward propagation of information through an encoding network such as a convolutional neural network or a long short-term memory network to obtain the semantic code sequence of the data; and predicting the target type of each unit of the sequence from the obtained semantic code sequence to obtain the second target corresponding to the relationship type contained in the data and to the first target.

In another aspect, the present invention further provides a joint information extraction device, which includes:

a code initialization module for initializing semantic codes for the data, the predefined relationship types and the target types;

a coding interaction module for mutual coding interaction between the data semantic code sequence and the relationship type semantic code sequence, and between the data semantic code sequence and the target type semantic code sequence;

a coding fusion module for fusing the data semantic code sequence with the relationship type code sequence, and the data semantic code sequence with the target type code sequence;

a relationship prediction module for predicting, from the obtained data semantic code, the relationship type contained in the data;

and a target prediction module for predicting the first target and the second target corresponding to the relationship type contained in the data.

The invention has the beneficial effects that:

The method automatically extracts the information structure contained in a data source without depending on expert prior knowledge or feature extraction tools, and it avoids the error propagation caused by representation-based pipeline extraction.

Drawings

FIG. 1 is a block diagram of a joint information extraction device according to the present invention;

FIG. 2 is a flowchart of the joint information extraction method in an embodiment.

Detailed Description

The invention aims to provide a joint information extraction method and device that solve the problems that traditional information extraction depends on expert prior knowledge and on the performance of feature extraction tools, and that the judgment of semantic relationships between targets is affected by errors in extracting the information targets. As shown in FIG. 1, the joint information extraction device of the present invention includes a code initialization module, a coding interaction module, a coding fusion module, a relationship prediction module and a target prediction module.

Example:

This embodiment is an information extraction method implemented with the joint information extraction device described above, and includes the following steps:

Step 1: segment the data both into minimum units and into semantic combinations of minimum units, so that the data is converted into sequence data. For a text data source, character-based segmentation and word-based segmentation are used. For example, for the original text "姚明的妻子是叶莉" ("Yao Ming's wife is Ye Li"), the character-based minimum-unit segmentation gives $T_c$ = [姚, 明, 的, 妻, 子, 是, 叶, 莉], and the word-based segmentation gives $T_w$ = [姚明, 的, 妻子, 是, 叶莉].
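The two segmentations of step 1 can be sketched as follows, assuming the example sentence is the Chinese text 姚明的妻子是叶莉 ("Yao Ming's wife is Ye Li"). The patent does not name a word segmenter; the toy vocabulary `VOCAB` and the forward maximum matching in `segment_words` are hypothetical stand-ins for one.

```python
# Hypothetical toy vocabulary; a real system would use a trained word segmenter.
VOCAB = {"姚明", "妻子", "叶莉"}
MAX_WORD_LEN = 2

def segment_chars(text):
    """Minimum-unit segmentation: one token per character."""
    return list(text)

def segment_words(text, vocab=VOCAB, max_len=MAX_WORD_LEN):
    """Forward maximum matching: take the longest vocabulary word at each position,
    falling back to a single character when no word matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

text = "姚明的妻子是叶莉"
T_c = segment_chars(text)
T_w = segment_words(text)
print(T_c)
print(T_w)
```

Both sequences cover the same characters, which is what later allows the word-level codes to be aligned with the character-level codes before fusion.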

Step 2: initialize semantic codes for the two kinds of segmented data from step 1 and for the predefined relationship types and target types. For example, initializing the semantic codes of $T_w$ from step 1 gives $V_w = [v_0, v_1, v_2, v_3, v_4]$, where each $v_i$ is an initialization vector and $d$ is the dimension of the initialization vectors. $V_w$ denotes the word-based initialized semantic code sequence; likewise, $V_c$ denotes the character-based initialized semantic code sequence, $V_r$ the initialized semantic code sequence of the predefined relationship types, and $V_e$ the initialized semantic code sequence of the predefined target types.

Step 3: fuse $V_w$ and $V_c$ from step 2 to obtain the fused data semantic code sequence $V$, where $V = f_m(f_w(V_w), f_c(V_c))$. Here $f_m$ is the fusion function, which may be, for example, a maximum, mean or weighted-sum function; $f_w$ and $f_c$ are dimension normalization functions, which may be, for example, a convolutional neural network or a long short-term memory network.
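A minimal sketch of the fusion $V = f_m(f_w(V_w), f_c(V_c))$ follows. It makes several assumptions that the text does not fix: $f_w$ and $f_c$ are replaced by plain linear projections to a common dimension (instead of a CNN or LSTM), each word vector is repeated once per character so that both sequences have the character-level length, and all vectors are random toy values. The helper names are hypothetical.

```python
import random

def linear(vectors, weight):
    """Stand-in for f_w / f_c: project each vector with a weight matrix (rows = output dims)."""
    return [[sum(w * x for w, x in zip(row, v)) for row in weight] for v in vectors]

def expand_to_chars(word_vectors, words):
    """Repeat each word vector once per character so both sequences align unit by unit."""
    return [vec for vec, word in zip(word_vectors, words) for _ in word]

def fuse(a, b, mode="max", alpha=0.5):
    """f_m: element-wise fusion of two equally shaped sequences."""
    ops = {
        "max": max,
        "mean": lambda x, y: (x + y) / 2,
        "weighted": lambda x, y: alpha * x + (1 - alpha) * y,
    }
    op = ops[mode]
    return [[op(x, y) for x, y in zip(u, v)] for u, v in zip(a, b)]

random.seed(0)
words = ["姚明", "的", "妻子", "是", "叶莉"]
chars = [c for w in words for c in w]
d_w, d_c, d = 4, 3, 5  # toy dimensions
V_w = [[random.uniform(-1, 1) for _ in range(d_w)] for _ in words]
V_c = [[random.uniform(-1, 1) for _ in range(d_c)] for _ in chars]
W_w = [[random.uniform(-1, 1) for _ in range(d_w)] for _ in range(d)]
W_c = [[random.uniform(-1, 1) for _ in range(d_c)] for _ in range(d)]

word_side = expand_to_chars(linear(V_w, W_w), words)
char_side = linear(V_c, W_c)
V = fuse(word_side, char_side, mode="max")
print(len(V), len(V[0]))  # 8 5
```

Switching `mode` to `"mean"` or `"weighted"` gives the other fusion functions mentioned in the text without changing the shape of $V$.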

Step 4: based on the fused data semantic code sequence $V$ obtained in step 3 and the initialized code sequence $V_r$ of the predefined relationship types obtained in step 2, align each sequence unit of the data with the initialized code sequence of the predefined relationship types to obtain alignment weights, and then fuse in the weighted sum of the predefined relationship type semantic codes to obtain the data semantic code sequence $V$ with relationship type information.

Step 5: based on the fused data semantic code sequence $V$ obtained in step 3 and the initialized code sequence $V_r$ of the predefined relationship types obtained in step 2, align each predefined relationship type initialization code with the data semantic code sequence to obtain alignment weights, and then fuse in the weighted sum of the data semantic codes to obtain the relationship type semantic code sequence $V_r$ with data information.

Step 6: perform forward propagation of information using the data semantic code sequence with relationship type information, and predict the relationship type contained in the data. For $V$ obtained in step 4, an encoding function $f_r$ propagates the information forward, further encoding $V$ along the way, and yields the probability of each semantic relationship, $p_r = f_r(V)$. The relationship contained in the data is $r = \arg\max(p_r)$.
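The prediction $p_r = f_r(V)$, $r = \arg\max(p_r)$ can be sketched as below. As an assumption for brevity, $f_r$ is replaced by mean pooling followed by a linear layer and a softmax rather than a CNN or LSTM; the relation names, weights and input values are hand-written toy values, not data from the patent.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean_pool(seq):
    """Average the unit vectors of a sequence into one vector."""
    n = len(seq)
    return [sum(v[j] for v in seq) / n for j in range(len(seq[0]))]

def predict_relation(seq, weight, bias):
    """Stand-in for f_r: pool the sequence, score each relation type, normalise, take argmax."""
    pooled = mean_pool(seq)
    logits = [sum(w * x for w, x in zip(row, pooled)) + b for row, b in zip(weight, bias)]
    p_r = softmax(logits)
    r = max(range(len(p_r)), key=p_r.__getitem__)
    return r, p_r

RELATIONS = ["spouse", "parent", "employer"]      # hypothetical predefined relation types
V = [[0.2, 0.9], [0.4, 0.7], [0.1, 0.8]]          # toy data semantic code sequence
W = [[0.0, 2.0], [1.0, 0.0], [-1.0, -1.0]]        # toy scoring weights, one row per relation
b = [0.0, 0.0, 0.0]
r, p_r = predict_relation(V, W, b)
print(RELATIONS[r])  # spouse
```

The index `r` returned here is what a later lookup into the relationship type semantic code sequence would use.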

Step 7: from the predicted relationship type $r$, the relationship type semantic code sequence $V_r$ with data information from step 5, and the data semantic code sequence $V$ with relationship type information from step 6, obtain the relationship-type-enhanced data semantic code sequence $V$. Specifically, $v_r = f(V_r, r)$ denotes the semantic code vector corresponding to relationship $r$, and $v_i = f[V, i]$ denotes the semantic code of the $i$-th unit in the data; the relationship-enhanced semantic code of that unit is $[v_i, v_r]$.
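A minimal sketch of this enhancement step, assuming that the lookup function $f$ is plain indexing and that $[v_i, v_r]$ denotes vector concatenation. The function names and the numeric values are hypothetical.

```python
def lookup(sequence, index):
    """f(V_r, r) or f[V, i]: select one vector from a sequence by position."""
    return sequence[index]

def enhance_with_relation(V, V_r, r):
    """Append the code of relation r to every data unit, giving [v_i, v_r] for each i."""
    v_r = lookup(V_r, r)
    return [lookup(V, i) + v_r for i in range(len(V))]  # list '+' concatenates

V = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]      # three data units, dimension 2 (toy values)
V_r = [[1.0, 0.0], [0.0, 1.0]]                # two relation types, dimension 2 (toy values)
enhanced = enhance_with_relation(V, V_r, r=1)
print(enhanced[0])  # [0.1, 0.2, 0.0, 1.0]
```

Concatenation doubles the unit dimension here, so any network that consumes the enhanced sequence has to expect the larger input size.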

Step 8: based on the enhanced data semantic code sequence $V$ obtained in step 7 and the initialized semantic code sequence $V_e$ of the predefined target types obtained in step 2, align the semantic code of each enhanced data sequence unit with the initialized semantic code sequence of the predefined target types to obtain alignment weights, and then fuse in the weighted sum of the predefined target type semantic codes to obtain the data semantic code sequence $V$ with target type information.

Step 9: based on the enhanced data semantic code sequence $V$ obtained in step 7 and the initialized semantic code sequence $V_e$ of the predefined target types obtained in step 2, align each predefined target type initialization code with the enhanced data semantic code sequence to obtain alignment weights, and then fuse in the weighted sum of the enhanced data semantic codes to obtain the target type semantic code sequence $V_e$ with data information.

Step 10: perform forward propagation using $V_e$ from step 9, and predict the first target corresponding to the relationship type contained in the data. Specifically, $V_e$ is taken as input and propagated forward through an encoding network such as a convolutional neural network or a long short-term memory network to obtain the semantic code sequence of the data; the target type of each unit of the sequence is then predicted from the obtained semantic code sequence to obtain the first target $e_1$ corresponding to the relationship type contained in the data.
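How per-unit target type predictions turn into a target can be sketched as follows. The label set `TARGET_TYPES`, the per-unit scores and the grouping rule in `decode_targets` are hypothetical; the scores are hand-written for illustration and do not come from a trained network or from the patent.

```python
TARGET_TYPES = ["O", "PERSON"]  # hypothetical label set; "O" marks units outside any target

def tag_units(unit_scores, types=TARGET_TYPES):
    """Pick the highest-scoring target type for each sequence unit."""
    return [types[max(range(len(s)), key=s.__getitem__)] for s in unit_scores]

def decode_targets(units, labels, outside="O"):
    """Group consecutive units that share a non-outside label into (text, label) spans."""
    spans, current, current_label = [], [], None
    for unit, label in zip(units, labels):
        if label != current_label and current:
            spans.append(("".join(current), current_label))
            current = []
        if label != outside:
            current.append(unit)
            current_label = label
        else:
            current_label = None
    if current:
        spans.append(("".join(current), current_label))
    return spans

units = list("姚明的妻子是叶莉")
# Illustrative scores per unit, ordered as in TARGET_TYPES.
scores = [[0.1, 0.9], [0.2, 0.8]] + [[0.9, 0.1]] * 6
labels = tag_units(scores)
first_targets = decode_targets(units, labels)
print(first_targets)  # [('姚明', 'PERSON')]
```

The same tagging and decoding would apply to the second target, with the input sequence first enhanced by the code of the first target.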

Step 11: from the first target $e_1$ predicted in step 10, the target type semantic code sequence $V_e$ with data information from step 9, and the data semantic code sequence $V$ with target type information from step 8, predict the second target $e_2$ corresponding to the relationship type $r$ contained in the data and to $e_1$.

Claims

1. A joint information extraction method is characterized by comprising the following steps:

2. The joint information extraction method according to claim 1,

the step A specifically comprises the following steps:

3. The joint information extraction method according to claim 2,

the step B specifically comprises the following steps:

4. The joint information extraction method according to claim 3,

in step C, performing forward propagation of information using the data semantic code sequence with relationship type information and predicting the relationship type contained in the data specifically includes: taking the data semantic code sequence with relationship type information obtained in step B1 as input, performing forward propagation of information through an encoding network such as a convolutional neural network or a long short-term memory network to obtain the semantic code of the data, and then predicting the relationship type in the data from the obtained semantic code.

5. The joint information extraction method according to claim 4,

the step D specifically comprises the following steps:

6. The joint information extraction method according to claim 5,

the step E specifically comprises the following steps:

7. The joint information extraction method according to claim 6,

in step F, performing forward propagation using the data semantic code sequence with target type information and predicting the first target corresponding to the relationship type contained in the data specifically includes: taking the data semantic code sequence with target type information obtained in step E1 as input, performing forward propagation of information through an encoding network such as a convolutional neural network or a long short-term memory network to obtain the semantic code sequence of the data, and predicting the target type of each unit of the sequence from the obtained semantic code sequence to obtain the first target corresponding to the relationship type contained in the data.

8. The joint information extraction method according to claim 7,

the step G specifically comprises the following steps:

9. A joint information extraction device, comprising: