
CN115730058A - Reasoning question-answering method based on knowledge fusion - Google Patents

Reasoning question-answering method based on knowledge fusion

Info

Publication number
CN115730058A
CN115730058A (application CN202211559297.6A)
Authority
CN
China
Prior art keywords
entity
vector
key
answer
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211559297.6A
Other languages
Chinese (zh)
Inventor
秦科
段贵多
罗光春
许毅
董谦
董悦洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a reasoning question-answering method based on knowledge fusion. The method comprises the following steps: acquiring a key entity and a non-key entity in a question text to be processed, and determining first explanatory text information corresponding to the key entity and second explanatory text information corresponding to the non-key entity; determining a background knowledge vector to be used corresponding to the question text to be processed based on the key entity, the non-key entity, the first explanatory text information and the second explanatory text information; determining a question vector to be used corresponding to the question text to be processed, and obtaining a target question vector based on the question vector to be used and the background knowledge vector to be used; and determining at least one candidate answer entity corresponding to the target question vector, obtaining a target answer entity according to the answer evaluation attribute corresponding to each candidate answer entity, and determining a target answer corresponding to the question text to be processed based on the target answer entity. The method achieves the effect of more accurately determining the answer information corresponding to the input question.

Description

Reasoning question-answering method based on knowledge fusion
Technical Field
The invention relates to the technical field of machine reading comprehension, and in particular to a reasoning question-answering method based on knowledge fusion.
Background
With the rapid development of artificial intelligence, the automatic question-answering system has become one of the visions of strong artificial intelligence.
Current automatic question-answering systems can generally be divided into two stages. The first stage matches questions and answers against a pre-constructed knowledge base; the second stage focuses on enabling a computer to truly understand the meaning of a question and on enhancing question-answering reasoning ability by adding open knowledge. However, question-answering systems of the first stage have a narrow application range and place high demands on the precision and breadth of the question-answering database, while question-answering systems of the second stage cannot understand questions accurately enough, so the obtained answers are not accurate enough.
In order to solve the above problems, an improvement of the inference question-answering method is required.
Disclosure of Invention
The invention provides a reasoning question-answering method based on knowledge fusion, which aims to solve the problem that the obtained answer is not accurate enough because background knowledge associated with the input question is ignored when a question-answering system analyzes the input question.
In a first aspect, an embodiment of the present invention provides a reasoning question-answering method based on knowledge fusion, including:
acquiring a key entity and a non-key entity in a question text to be processed, and determining first explanatory text information corresponding to the key entity and second explanatory text information corresponding to the non-key entity, wherein the key entity matches an entity in a preset entity database;
determining a background knowledge vector to be used corresponding to the question text to be processed based on the key entity, the non-key entity, the first explanatory text information and the second explanatory text information;
determining a question vector to be used corresponding to the question text to be processed, and obtaining a target question vector based on the question vector to be used and the background knowledge vector to be used;
and determining at least one candidate answer entity corresponding to the target question vector, obtaining a target answer entity according to the answer evaluation attribute corresponding to each candidate answer entity, and determining a target answer corresponding to the question text to be processed based on the target answer entity.
According to the technical scheme, a key entity and a non-key entity in the question text to be processed are obtained, and first explanatory text information corresponding to the key entity and second explanatory text information corresponding to the non-key entity are determined. To construct background knowledge information for the question text, entity extraction is performed on the question text based on an entity recognition technology to obtain the key entity and the non-key entity, and at the same time the first explanatory text information corresponding to the key entity and the second explanatory text information corresponding to the non-key entity are obtained from an explanatory text database. Further, the background knowledge vector to be used corresponding to the question text is determined based on the key entity, the non-key entity, the first explanatory text information and the second explanatory text information: triples to be matched are constructed from the key entity and the non-key entity, a target triple is determined from the triples to be matched, the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information are each vectorized to obtain corresponding vectors, and the background knowledge vector to be used is obtained from each vector and its corresponding weight.
In addition, a question vector to be used corresponding to the question text is determined, and a target question vector is obtained based on the question vector to be used and the background knowledge vector to be used: the question text is vector-encoded based on a vector processing model to obtain the question vector to be used, and the question vector to be used and the background knowledge vector to be used are spliced to obtain the target question vector. At least one candidate answer entity corresponding to the target question vector is then determined, and a target answer entity is obtained according to the answer evaluation attribute corresponding to each candidate answer entity, so that a target answer corresponding to the question text is determined based on the target answer entity. Specifically, the target question vector is analyzed by a pre-trained model, at least one candidate answer entity corresponding to the target question vector is determined from an answer entity set, each candidate answer entity is evaluated by an answer evaluation model to obtain a corresponding answer evaluation attribute, and the candidate answer entity with the highest answer evaluation attribute is taken as the target answer entity. This solves the problem that the obtained answer is not accurate enough because background knowledge associated with the input question is ignored when the question-answering system analyzes the input question, and achieves the effect of more accurately determining the answer information corresponding to the input question.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art based on these drawings without creative effort.
Fig. 1 is a flowchart of a reasoning question-answering method based on knowledge fusion according to an embodiment of the present invention;
FIG. 2 is a flowchart of a reasoning question-answering method based on knowledge fusion according to a second embodiment of the present invention;
fig. 3 is a schematic diagram for determining a background knowledge vector to be used based on the ALBERT model according to the second embodiment of the present invention;
fig. 4 is a schematic diagram of determining a problem vector to be used by a GPT-2 model according to the second embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein.
Before the technical scheme is explained, an application scenario of the technical scheme is briefly introduced to more clearly understand the technical scheme.
The automatic question-answering system is very important for the development of artificial intelligence, and current automatic question-answering systems are generally divided into two types. The first is the knowledge-base-based question-answering system, which generally converts questions input by users into database query instructions by using an entity recognition technology combined with grammatical analysis, and then queries a corresponding database to output answers. However, this question-answering method depends heavily on the construction of the question-answering database: a complete question-answering database must be built manually for a limited domain and application direction, so the precision and breadth of the question-answering database directly affect the performance of the question-answering system. Such systems are usually applied in specialized, limited domains, their application range is not wide, and if the question input by a user exceeds the domain limits, the system cannot give an accurate answer. The second is the commonsense reasoning question-answering system, which generally understands the question semantically and adds open knowledge to enhance its reasoning question-answering ability, but such a system generally lacks consideration of background knowledge.
Based on the above, the technical scheme provides a reasoning question-answering method based on knowledge fusion, which can be applied to a question-answering system based on open knowledge; such a question-answering system focuses mainly on understanding the question and is not restricted to a limited domain. The question-answering system mainly comprises three stages: question analysis, information retrieval and answer extraction. In the question analysis stage, the questions are classified and the key information in the question is extracted; this key information is generally the entity information in the question, and the extraction is usually completed with an entity recognition technology. The information retrieval stage aims to construct corresponding evidence for the question: as in a machine reading comprehension question-answering task, a piece of background knowledge evidence is given, so that when the question is understood, it can be better understood through the background knowledge associated with it, and the best-matching answer is obtained from the answer knowledge base in the answer extraction stage.
Embodiment One
Fig. 1 is a flowchart of a reasoning question-answering method based on knowledge fusion according to an embodiment of the present invention. The method is applicable to the case where, in a question-answering system, background knowledge associated with an input question text is added on the basis of analyzing the input question text, in order to determine the answer that best matches the input question text.
As shown in fig. 1, the method includes:
s110, acquiring a key entity and a non-key entity in the problem text to be processed, and determining first interpretation text information corresponding to the key entity and second interpretation text information corresponding to the non-key entity.
The question text to be processed may be a question entered by a user in an open-knowledge question-answering system; for example, it may be "which movies has user A made". In the technical scheme, the information in the question text to be processed is divided into key entities and non-key entities: a key entity matches an entity in a preset entity database, while a non-key entity does not. Illustratively, if the preset entity database includes entity 1, entity 2 and entity 3, and the question text to be processed contains entity 1 and entity 4, then entity 1 is a key entity and entity 4 is a non-key entity. That is, if an entity in the question text to be processed can be found in the preset entity database, it is treated as a key entity; otherwise it is treated as a non-key entity.
Wherein the first explanatory text information may be understood as background knowledge information associated with the key entity and the second explanatory text information may be understood as background knowledge information associated with the non-key entity. For example, taking the first interpretation text information corresponding to the key entity as an example, the key entity is "chrysanthemum", and the corresponding first interpretation text information may include the plant attribute, the category, the color feature, the shape feature, the odor feature, the related efficacy of the chrysanthemum, and the like corresponding to the "chrysanthemum".
Specifically, the technical scheme can be applied to a question-answering system based on open knowledge, and in practical application, a question text to be processed can be input based on an editing control provided by the question-answering system, entities in the question text to be processed are extracted, and the entities are classified to obtain key entities and non-key entities in the question text to be processed. Meanwhile, in order to enable the question answering system to better understand the text of the question to be processed, first explanatory text information corresponding to the key entity and second explanatory text information corresponding to the non-key entity need to be acquired.
Optionally, the obtaining of the key entity and the non-key entity in the text of the problem to be processed includes: extracting at least one entity to be determined from the problem text to be processed based on an entity identification technology; for each entity to be determined, determining whether the current entity is matched with an entity in a preset entity database; if yes, determining the current entity as a key entity; if not, determining that the current entity is a non-key entity.
The entity recognition technology is an information extraction technology that can obtain entity data such as person names and place names from text data. For example, if the question text to be processed is "user A likes to watch movie A", it is recognized and extracted based on the entity recognition technology, and the entities to be determined are "user A" and "movie A". The entity database can be a custom-built entity database or an existing one; for example, it can be a database provided by ConceptNet.
In practical application, at least one entity to be determined is extracted from the question text to be processed based on the entity recognition technology. Taking one entity to be determined as the current entity, the preset entity database is queried for the current entity; if it exists there, the current entity is determined to be a key entity, otherwise it is determined to be a non-key entity.
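The classification of extracted entities into key and non-key entities described above can be sketched as follows. This is a minimal illustration, not the patented implementation: `extract_entities` is a hypothetical stand-in for whatever entity recognition component is used, and the entity database is a plain set.

```python
# Sketch of step S110: classify extracted entities as key or non-key.
# `extract_entities` is a toy substring matcher standing in for a real
# entity recognition (NER) component, which the filing does not specify.

def extract_entities(question_text, known_phrases):
    """Toy entity extractor: keep known phrases that occur in the question."""
    return [p for p in known_phrases if p in question_text]

def classify_entities(entities, entity_database):
    """An entity found in the preset entity database is a key entity;
    any other extracted entity is a non-key entity."""
    key, non_key = [], []
    for entity in entities:
        (key if entity in entity_database else non_key).append(entity)
    return key, non_key

entity_db = {"entity 1", "entity 2", "entity 3"}
entities = extract_entities("entity 1 relates to entity 4",
                            ["entity 1", "entity 4"])
key_entities, non_key_entities = classify_entities(entities, entity_db)
# key_entities == ["entity 1"], non_key_entities == ["entity 4"]
```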
Optionally, determining first explanatory text information associated with the key entity and second explanatory text information corresponding to the non-key entity includes: first explanatory text information corresponding to the key entities and second explanatory text information corresponding to the non-key entities are determined from an explanatory text database according to a keyword detection technique.
The keyword detection technology may be understood as a technology for detecting keywords, and the explanatory text database may be understood as a database storing the background knowledge corresponding to each entity; the explanatory text database includes explanatory text information associated with at least one entity.
Specifically, after the key entities and the non-key entities are determined, in order to obtain first explanatory text information corresponding to the key entities and second explanatory text information corresponding to the non-key entities, the key entities and the non-key entities are used as key words, and whether explanatory text information associated with the key words exists in an explanatory text database or not is queried based on a key word detection technology. That is to say, in the technical solution, both the key entities and the non-key entities in the problem text to be processed can be detected as keywords.
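The keyword lookup against the explanatory text database can be sketched in the same spirit; the database here is a plain dictionary standing in for the explanatory text database, and the sample entry is illustrative.

```python
# Sketch of the explanatory-text lookup: both key and non-key entities are
# used as keywords against an explanatory text database (a dict here).

def lookup_explanations(entities, explanatory_db):
    """Return {entity: explanatory text} for entities present in the database."""
    return {e: explanatory_db[e] for e in entities if e in explanatory_db}

explanatory_db = {
    "chrysanthemum": "A flowering plant; categories, colour, shape, scent, uses.",
}
first_info = lookup_explanations(["chrysanthemum", "unknown entity"],
                                 explanatory_db)
# Entities with no entry (here "unknown entity") are simply absent from the result.
```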
And S120, determining a background knowledge vector to be used corresponding to the question text to be processed based on the key entity, the non-key entity, the first explanatory text information and the second explanatory text information.
Wherein the background knowledge vector to be used can be used to characterize the background knowledge associated with the question text to be processed.
In the technical scheme, in order to enable the question-answering system to better understand the meaning of the question text to be processed, the first explanatory text information corresponding to the key entity and the second explanatory text information corresponding to the non-key entity are obtained. On this basis, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information are vectorized to obtain the corresponding background knowledge vector to be used, so that the question-answering system can supplement the background knowledge of the question text based on this vector.
In practical application, determining the background knowledge vector to be used corresponding to the question text to be processed based on the key entity, the non-key entity, the first explanatory text information and the second explanatory text information comprises: obtaining at least one triple to be matched based on at least one non-key entity and the key entity; matching each triple to be matched against a triple database, and determining a successfully matched triple to be matched as a target triple; and determining the background knowledge vector to be used corresponding to the question text to be processed based on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information.
In the technical solution, the to-be-matched triples may be understood as triples formed based on key entities and non-key entities, and used for characterizing logic facts between the key entities and the non-key entities. Illustratively, the key entity is "user a", the non-key entities are "enterprise a" and "position a", and corresponding triples to be matched, that is, "user a-employee-enterprise a" and "user a-job-position a", are respectively available based on each non-key entity and the key entity. A triple database is understood to be a pre-constructed database for storing triples, in which a large number of pre-arranged triples of entities are included. A target triplet may be understood as the triplet that best characterizes the logical relationship between the critical entity and the non-critical entity.
Specifically, at least one key entity and at least one non-key entity are extracted from the problem text to be processed, for each key entity, taking one key entity as the current key entity as an example, at least one non-key entity associated with the current key entity is selected, and a corresponding triple to be matched is formed based on each non-key entity and the current key entity. Further, whether a triple consistent with the triple to be matched exists or not is inquired in a pre-constructed triple database, and if the triple consistent with the triple to be matched exists, the triple to be matched is determined to be the target triple.
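The triple construction and matching step can be sketched as follows; the relation labels and database contents are illustrative assumptions, since the filing does not fix a concrete relation vocabulary.

```python
# Sketch of triple construction and matching: each non-key entity is paired
# with the key entity to form a candidate (head, relation, tail) triple,
# which is then looked up verbatim in a pre-built triple database.

def build_candidate_triples(key_entity, non_key_entities, relations):
    """Pair the key entity with each non-key entity under an assumed relation."""
    return [(key_entity, rel, nk) for nk, rel in zip(non_key_entities, relations)]

def match_triples(candidates, triple_db):
    """Keep only candidate triples that exist in the triple database."""
    return [t for t in candidates if t in triple_db]

triple_db = {("user A", "employee", "enterprise A")}
candidates = build_candidate_triples(
    "user A", ["enterprise A", "position A"], ["employee", "job"])
targets = match_triples(candidates, triple_db)
# targets == [("user A", "employee", "enterprise A")]
```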
Further, after the target triple is obtained, a background knowledge vector to be used corresponding to the problem text to be processed is obtained based on the target triple, the key entity, the non-key entity, the first interpretation text information and the second interpretation text information.
It should be noted that, when matching the triples to be matched, the triple database may not contain a corresponding triple; that is, a target triple that best characterizes the logical relationship between the current key entity and the current non-key entity cannot be found by exact matching. In this case, if none of the triples to be matched exists in the triple database, a triple evaluation attribute is determined for each triple to be matched; the triple evaluation attributes are compared, and the triple to be matched corresponding to the highest triple evaluation attribute is determined as the target triple; and the background knowledge vector to be used corresponding to the question text to be processed is determined based on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information.
In order to determine a target triple which can most represent the logical relationship between the current key entity and the non-key entity from the triples to be matched, each triplet to be matched is evaluated, and the obtained evaluation value is used as a corresponding triplet evaluation attribute.
Specifically, the number of triples in the triple database that contain the current key entity and the total number of triples in the triple database are determined, and a first weight for the triple to be matched is obtained from the ratio of the total number of triples to the number of triples containing the key entity. Meanwhile, the number of triples in the triple database whose relation category is the same as that of the triple to be matched is counted, a second weight is obtained from the ratio of this number to the total number of triples, and the triple evaluation attribute is obtained from the product of the two weights. It can be understood that the triple evaluation attribute corresponding to each triple to be matched can be determined by a similar method.
After the triple evaluation attribute corresponding to each triple to be matched is obtained, the triple evaluation attributes are compared to obtain the highest one, and the triple to be matched corresponding to the highest triple evaluation attribute is taken as the target triple. Further, the background knowledge vector to be used corresponding to the question text to be processed is determined based on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information.
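One hedged reading of the evaluation attribute described above can be sketched as follows: one weight is the ratio of the total triple count to the count of triples containing the key entity, the other is the ratio of triples sharing the candidate's relation to the total, and the attribute is their product. The exact formula in the original filing may differ; this follows the translated wording.

```python
# Assumed reading of the triple-evaluation attribute (not verbatim from
# the filing): score = (total / count_with_key_entity) * (count_same_relation / total).

def triple_score(candidate, triple_db):
    head, relation, _tail = candidate
    total = len(triple_db)
    with_key = sum(1 for h, _, t in triple_db if head in (h, t))
    same_rel = sum(1 for _, r, _ in triple_db if r == relation)
    if with_key == 0:
        return 0.0
    return (total / with_key) * (same_rel / total)

def best_triple(candidates, triple_db):
    """Fallback when no candidate matches the database exactly:
    pick the candidate with the highest evaluation attribute."""
    return max(candidates, key=lambda c: triple_score(c, triple_db))

triple_db = [("user A", "employee", "enterprise A"),
             ("user A", "employee", "enterprise B"),
             ("user B", "job", "position A")]
candidates = [("user A", "employee", "enterprise C"),
              ("user A", "job", "position B")]
chosen = best_triple(candidates, triple_db)
```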
Optionally, determining the background knowledge vector to be used corresponding to the question text to be processed based on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information includes: performing vector encoding on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information respectively to obtain a corresponding first vector, second vector, third vector, fourth vector and fifth vector; determining corresponding vectors to be fused based on each vector and its corresponding weight; and fusing the vectors to be fused to obtain the background knowledge vector to be used.
In the technical solution, when vector encoding is performed on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information, byte pair encoding (BPE) may be adopted.
It can be understood that, in BPE embedding, all words are represented by fixed vectors. Since background knowledge of various sources and formats is introduced and different background knowledge influences the question-answering system to different degrees, a corresponding embedding offset can be added to each kind of background knowledge to distinguish the source of the extended knowledge; after vector encoding in the BPE manner, the corresponding vectors are obtained. Based on the BPE encoding, a first vector corresponding to the target triple, a second vector corresponding to the key entity, a third vector corresponding to the non-key entity, a fourth vector corresponding to the first explanatory text information and a fifth vector corresponding to the second explanatory text information are obtained. Further, since different vectors influence the question-answering system to different degrees, the weight corresponding to each vector can be determined, so as to obtain the corresponding vector to be fused based on each vector and its corresponding weight.
Specifically, a first vector to be fused is obtained based on the product of the first vector and its first weight; a second vector to be fused based on the product of the second vector and its second weight; a third vector to be fused based on the product of the third vector and its third weight; a fourth vector to be fused based on the product of the fourth vector and its fourth weight; and a fifth vector to be fused based on the product of the fifth vector and its fifth weight. Further, the vectors to be fused are spliced to obtain the background knowledge vector to be used.
It should be noted that, in the present technical solution, the splicing order of the vectors to be fused is: {the first vector to be fused; the second vector to be fused; the fourth vector to be fused; the third vector to be fused; the fifth vector to be fused}. In other words, the vectors to be fused are spliced in the order {target triple; key entity; first explanatory text information; non-key entity; second explanatory text information}.
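The weighted fusion and splicing can be sketched with NumPy; the vector dimension and the weights are illustrative hyperparameters, not values from the filing.

```python
# Sketch of the fusion step: each encoded vector is scaled by its own weight
# and the weighted vectors are concatenated in the stated splicing order
# (target triple, key entity, first explanatory text, non-key entity,
# second explanatory text). Dimension and weights are made up for illustration.
import numpy as np

def fuse_background_vectors(vectors, weights):
    """vectors/weights listed in splicing order; returns the concatenated
    background knowledge vector to be used."""
    assert len(vectors) == len(weights)
    return np.concatenate([w * v for v, w in zip(vectors, weights)])

dim = 4
v_triple, v_key, v_expl1, v_nonkey, v_expl2 = (np.ones(dim) * i
                                               for i in range(1, 6))
background = fuse_background_vectors(
    [v_triple, v_key, v_expl1, v_nonkey, v_expl2],  # splicing order
    [0.3, 0.25, 0.2, 0.15, 0.1],                    # illustrative weights
)
# background has length 5 * dim == 20
```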
S130, determining a question vector to be used corresponding to the question text to be processed, and obtaining a target question vector based on the question vector to be used and the background knowledge vector to be used.
The problem vector to be used can be understood as a vector obtained by directly vectorizing the problem text to be processed. The target question vector may be understood as a vector corresponding to the question text to be processed and associated background knowledge information.
Optionally, determining a to-be-used problem vector corresponding to the to-be-processed problem text, and obtaining a target problem vector based on the to-be-used problem vector and the to-be-used background knowledge vector, where the method includes: based on the vector processing model, performing vectorization processing on the problem text to be processed to obtain a problem vector to be used; and splicing the problem vector to be used and the background knowledge vector to be used to obtain a target problem vector.
The vector processing model can be a GPT-2 pre-training model.
In practical application, the question text to be processed can be vector-encoded based on a GPT-2 pre-training model to obtain the question vector to be used. Further, in order to enrich the information available to the question-answering system for analysis, the background knowledge vector to be used is spliced onto the question vector to be used to obtain the target question vector.
The advantage of this arrangement is that the target question vector includes not only the question semantic information corresponding to the question text to be processed, but also the background knowledge information associated with the question text to be processed. On the basis, when the question-answering system analyzes and processes the target question vector, the actual meaning expressed by the text of the question to be processed can be more accurately understood so as to determine the best answer corresponding to the text of the question to be processed.
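The splicing step described above can be sketched as follows; the function name and vector dimensions are illustrative and not taken from the original:

```python
import numpy as np

def build_target_question_vector(question_vec, background_vec):
    """Splice the to-be-used question vector with the to-be-used
    background knowledge vector so that downstream models see both
    the question semantics and the associated background knowledge."""
    return np.concatenate([question_vec, background_vec])

q = np.random.rand(768)  # question vector from GPT-2 (dimension illustrative)
g = np.random.rand(768)  # fused background knowledge vector
target = build_target_question_vector(q, g)
```

The target question vector simply carries both pieces of information side by side; the downstream model learns how to combine them.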
S140, at least one candidate answer entity corresponding to the target question vector is determined, the target answer entity is obtained according to the answer evaluation attribute corresponding to each candidate answer entity, and the target answer corresponding to the to-be-processed question text is determined based on the target answer entity.
In the present technical solution, at least one corresponding candidate answer entity may be determined according to the target question vector, where the candidate answer entity may be an entity determined according to a keyword in a candidate answer. The answer evaluation attribute may be used to characterize the matching degree between the candidate answer entity and the target question vector, where a higher answer evaluation attribute indicates a higher matching degree between the candidate answer entity and the target question vector, and a lower answer evaluation attribute indicates a lower matching degree between the candidate answer entity and the target question vector. The target answer entity may be understood as a candidate answer entity that matches the target question to the highest degree. The target answer may be understood as answer text information determined based on the target answer entity.
In practical applications, the target question vector may be input into the ALBERT model, so as to perform vector analysis on the target question vector based on the model, and obtain at least one candidate answer entity corresponding to the target question vector. Further, a candidate answer entity with the highest matching degree with the target question vector is found from the candidate answer entities, the candidate answer entities can be evaluated to obtain corresponding answer evaluation attributes, the target answer entity is determined based on the answer evaluation attributes, and a target answer corresponding to the to-be-processed question text is determined based on the target answer entity.
Optionally, determining at least one candidate answer entity corresponding to the target question vector, and obtaining the target answer entity according to the answer evaluation attribute corresponding to each candidate answer entity, including: determining at least one candidate answer entity corresponding to the target question vector based on a preset answer entity set; based on the answer evaluation model, respectively performing entity evaluation on each candidate answer entity to obtain corresponding answer evaluation attributes; and determining the candidate answer entity corresponding to the highest answer evaluation attribute as the target answer entity based on each answer evaluation attribute.
The answer entity set includes at least one answer entity, for example, a corresponding answer entity set may be constructed according to the answer entities in the concept network. The answer evaluation model may be understood as a model that evaluates a degree of matching between each candidate answer entity and the target question vector, and an answer evaluation attribute corresponding to each candidate answer entity may be obtained based on the answer evaluation model.
Specifically, the target question vector may be input into the ALBERT model, so as to perform vector analysis on the target question vector based on the model, and further obtain at least one candidate answer entity with a higher matching degree with the target question vector in the answer entity set. For example, the number of candidate answer entities may be preset, or a corresponding number of candidate answer entities determined from a large number of candidate answer entities may be screened according to a preset ratio to serve as candidate answer entities corresponding to the target question vector.
Further, based on the answer evaluation model, each candidate answer entity is evaluated to obtain a corresponding answer evaluation attribute, and the candidate answer entity corresponding to the highest answer evaluation attribute is used as the target answer entity.
According to the technical scheme, a key entity and a non-key entity in the to-be-processed question text are obtained, and first interpretation text information corresponding to the key entity and second interpretation text information corresponding to the non-key entity are determined. In order to construct the background knowledge information corresponding to the to-be-processed question text, entity extraction is performed on the to-be-processed question text based on an entity recognition technology to obtain the key entity and the non-key entity; meanwhile, the first interpretation text information corresponding to the key entity and the second interpretation text information corresponding to the non-key entity are obtained from the interpretation text database. Further, the to-be-used background knowledge vector corresponding to the to-be-processed question text is determined based on the key entity, the non-key entity, the first interpretation text information and the second interpretation text information: triples to be matched are constructed based on the key entity and the non-key entity, a target triple is determined from the triples to be matched, the target triple, the key entity, the non-key entity, the first interpretation text information and the second interpretation text information are vectorized to obtain corresponding vectors, and the to-be-used background knowledge vector corresponding to the to-be-processed question text is obtained from each vector and its corresponding weight.
In addition, a problem vector to be used corresponding to the problem text to be processed is determined, a target problem vector is obtained based on the problem vector to be used and the background knowledge vector to be used, the problem text to be processed is subjected to vector coding based on a vector processing model, the problem vector to be used is obtained, and the problem vector to be used and the background knowledge vector to be used are spliced to obtain the target problem vector. And determining at least one candidate answer entity corresponding to the target question vector, and obtaining a target answer entity according to the answer evaluation attribute corresponding to each candidate answer entity so as to determine a target answer corresponding to the to-be-processed question text based on the target answer entity. Analyzing and processing the target question vector through a pre-trained model, determining at least one candidate answer entity corresponding to the target question vector from an answer entity set, evaluating each candidate answer entity based on an answer evaluation model to obtain corresponding answer evaluation attributes, and taking the candidate answer entity corresponding to the highest answer evaluation attribute as the target answer entity to determine a target answer corresponding to the to-be-processed question text based on the target answer entity. The problem that the obtained answer is not accurate enough due to neglecting background knowledge associated with the input question when the question-answering system analyzes the input question is solved, and the effect of more accurately determining the answer information corresponding to the input question is achieved.
Example two
In a specific example, as shown in fig. 2, a question text (i.e., a to-be-processed question text) is input in an editing control provided by an open knowledge-based question-answering system. In order to make the to-be-processed question text more standard, it is preprocessed before entity recognition, for example, hyphens, quotation marks, or various special characters in the to-be-processed question text are removed. Next, key entities and non-key entities are extracted from the to-be-processed question text based on an entity recognition technology. Specifically, a framework of the pre-training model BERT plus a conditional random field CRF is adopted for sequence labeling of the question. In order to make the recognition result more accurate, an entity dictionary is constructed by extracting all entities in the knowledge-graph concept network (namely, the entity database); the entities in the sequence labeling result are semantically matched with the entities in the entity dictionary, the successfully matched entities are marked as key entities, and the other recognized entities are marked as common entities (namely, non-key entities). For example, for the to-be-processed question text "A revolving door is convenient for two direction travel, but it also serves as a security measure at a what?", the recognized entities include "revolving door" and "security measure"; "revolving door" can be successfully matched in the concept network, so it is determined as a key entity, while "security measure" is not successfully matched, so it is marked as a common entity (i.e., a non-key entity).
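The dictionary-matching split into key and common entities can be sketched as below; exact string matching stands in for the semantic matching step described in the text, and all names are illustrative:

```python
def split_entities(recognized_entities, entity_dictionary):
    """Entities that appear in the concept-network entity dictionary
    become key entities; the rest are marked as common (non-key)
    entities. Exact membership stands in for semantic matching."""
    key_entities, common_entities = [], []
    for entity in recognized_entities:
        if entity in entity_dictionary:
            key_entities.append(entity)
        else:
            common_entities.append(entity)
    return key_entities, common_entities

# The example from the text: "revolving door" exists in the concept
# network, "security measure" does not.
keys, commons = split_entities(["revolving door", "security measure"],
                               {"revolving door", "bank", "door"})
```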
Further, background knowledge is automatically constructed in the open corpus for the problem text to be processed according to the key entities, and in order to enable the background knowledge to meet the requirement of commonsense, a manually constructed structured knowledge map concept network and a wiki dictionary containing extensive explanatory knowledge can be selected as the extended corpus.
Specifically, a triple to be matched corresponding to the key entity is constructed, and starting from the key entity, the triple most associated with the to-be-processed question text is selected according to a relation-weight-based triple scoring method. The triple set in the concept network is denoted C, e<sub>q</sub> represents a key entity in the question, and e<sub>c</sub> represents a common entity in the question. The specific process comprises the following steps:
1) If there is an edge relation r such that (e<sub>q</sub>, r, e<sub>c</sub>) ∈ C, the triple (e<sub>q</sub>, r, e<sub>c</sub>) is directly selected as the result. That is to say, the triple to be matched is matched in the preset triple database, and if the triple database contains the triple to be matched, the triple to be matched is determined as the target triple.
2) If the triple database does not contain the triple to be matched, all triples containing e<sub>q</sub> are selected instead; the total number of these triples is denoted N, and a score $s_j$ is calculated for the j-th triple. This score is the weight $w_j$ of the triple in the concept network (i.e., the weight to be used) multiplied by a weight $w_{r_j}$ based on the current relation (i.e., the weight to be determined). The relation weight $w_{r_j}$ is calculated as follows:

$$w_{r_j} = 1 - \frac{n_{r_j}}{N}$$

wherein $r_j$ represents the relation type of the j-th triple, $n_{r_j}$ represents the number of occurrences of $r_j$ among the N triples containing e<sub>q</sub> (i.e., the number of triples to be determined), and N represents the number of triples containing e<sub>q</sub> in the triple database.
It will be appreciated that the smaller $n_{r_j}$ is and the higher $w_j$ is, the more common-sense the background knowledge contained in the triple is and the lower the noise. Here, $w_j$ characterizes the confidence corresponding to the triple to be matched.
The triple evaluation attribute corresponding to the triple to be matched is determined based on the following formula:

$$s_j = w_j \cdot w_{r_j}$$

wherein $s_j$ represents the triple evaluation attribute, $w_j$ represents the weight to be used, and $w_{r_j}$ represents the weight to be determined.
Based on a similar method, the triple evaluation attribute corresponding to each triple to be matched can be obtained, and the triple to be matched corresponding to the highest triple evaluation attribute is used as a target triple.
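A minimal sketch of this scoring step is given below; it assumes each candidate triple carries its concept-network confidence weight, and the function and field names are illustrative:

```python
from collections import Counter

def select_target_triple(candidates):
    """candidates: list of (head, relation, tail, w) tuples, where w is
    the concept-network confidence weight w_j.  Each triple is scored
    as s_j = w_j * (1 - n_{r_j} / N), so triples with rarer relations
    and higher confidence score higher; the best one is returned."""
    n = len(candidates)
    relation_counts = Counter(rel for _, rel, _, _ in candidates)

    def score(triple):
        _, rel, _, w = triple
        return w * (1 - relation_counts[rel] / n)

    return max(candidates, key=score)

# Illustrative candidate triples containing the key entity.
triples = [("revolving door", "AtLocation", "bank", 2.0),
           ("revolving door", "AtLocation", "mall", 1.0),
           ("revolving door", "UsedFor", "security", 1.5)]
best = select_target_triple(triples)
```

Here "AtLocation" occurs twice out of three triples, so its relation weight is 1/3, while the single "UsedFor" triple gets 2/3 and wins despite its lower confidence weight.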
Meanwhile, for the unstructured dictionary, a request header is directly constructed for each key entity or non-key entity to call the API of the wiki dictionary, with the access rate limited to 5 requests per second; the explanatory text information corresponding to each entity is obtained and then stored as a json file in key-value-pair form. That is, based on the keyword detection technique, the first explanatory text information corresponding to the key entity and the second explanatory text information corresponding to the non-key entity are determined.
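The throttled lookup-and-store step can be sketched as below; a stub fetcher stands in for the real HTTP call to the dictionary API, and the function names and file path are illustrative:

```python
import json
import time

def fetch_definitions(entities, fetch_one, max_per_second=5,
                      path="definitions.json"):
    """Look up explanatory text for each entity via `fetch_one`
    (a stand-in for the real dictionary API request), throttled to the
    stated rate limit, and store the results as a key-value json file."""
    results = {}
    for i, entity in enumerate(entities):
        if i and i % max_per_second == 0:
            time.sleep(1.0)  # stay within 5 requests per second
        results[entity] = fetch_one(entity)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, ensure_ascii=False)
    return results

# Stub fetcher in place of the real API call.
defs = fetch_definitions(["revolving door"], lambda e: f"definition of {e}")
```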
Further, in order to add corresponding background knowledge information on the basis of the problem text to be processed, a text sequence is constructed on the basis of the target triple, the key entity, the first interpretation text information, the non-key entity and the second interpretation text information, and vectorization processing is performed on the text sequence to obtain a corresponding vector.
Specifically, a [CLS] marker is inserted at the head according to the standard input format of the pre-training model; in the pre-training model this marker represents the semantic information of the whole sequence. Then the triple extracted from the concept network is inserted, followed by a [SEP] separator; then the key entity, the first interpretation text information, the non-key entity and the second interpretation text information are inserted in sequence, segmented by [SEP] markers. After the text sequence is constructed, vector coding is performed on each text subsequence based on the BPE coding method in the ALBERT model, obtaining a first vector corresponding to the target triple, a second vector corresponding to the key entity, a third vector corresponding to the non-key entity, a fourth vector corresponding to the first interpretation text information and a fifth vector corresponding to the second interpretation text information. Further, a segment embedding offset vector is added to each vector, so that the ALBERT model obtains a first vector to be fused based on the product of the first vector and the corresponding first weight; a second vector to be fused based on the product of the second vector and the corresponding second weight; a third vector to be fused based on the product of the third vector and the corresponding third weight; a fourth vector to be fused based on the product of the fourth vector and the corresponding fourth weight; and a fifth vector to be fused based on the product of the fifth vector and the corresponding fifth weight. All the vectors to be fused are then spliced to obtain the background knowledge vector to be used.
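The construction of the marked-up input sequence can be sketched as follows; the field contents are illustrative placeholders, not actual dictionary entries:

```python
def build_knowledge_sequence(triple, key_entity, key_text,
                             common_entity, common_text):
    """Assemble the pre-training-model input: [CLS], then the target
    triple, then the key entity, its interpretation text, the common
    entity and its interpretation text, each field followed by [SEP]."""
    fields = [" ".join(triple), key_entity, key_text,
              common_entity, common_text]
    return "[CLS] " + " [SEP] ".join(fields) + " [SEP]"

seq = build_knowledge_sequence(
    ("revolving door", "AtLocation", "bank"),
    "revolving door", "a door with several wings that rotates",
    "security measure", "a precaution taken against theft or attack")
```

The resulting string is then tokenized (e.g. by the ALBERT BPE tokenizer) before vector coding.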
Specifically, as shown in FIG. 3, a large amount of background knowledge features would be lost if only the semantic information of the [CLS] marker were used. For this reason, feature fusion needs to be performed on the feature vector matrix so as to simultaneously retain the knowledge-graph triple information, the key-entity dictionary interpretation information and the common-entity dictionary interpretation information. Suppose the text embedding of the final hidden layer of the ALBERT model is represented as $X = (x_0, x_1, \ldots, x_m)$, where $x_i \in \mathbb{R}^d$ and d is the dimension of a word embedding vector. An attention mechanism, which fits the characteristics of human thinking and can analyze the emphasis of a text vector, is added as one layer between the output vectors of ALBERT and the linear transformation to perform weighted summation over the different knowledge features, which can effectively improve the utilization efficiency of the knowledge.
First, a randomly initialized parameter vector u is introduced, where $u \in \mathbb{R}^d$. Each embedded vector $x_i$ is multiplied by u, the product is linearly transformed and input into the normalization function softmax, and the resulting probability serves as the attention weight $\alpha_i$ corresponding to each embedded vector. Taking the key entity as an example:

$$\alpha_i = \mathrm{softmax}(u^T x_i)$$

wherein $\alpha_i$ represents the second weight corresponding to the key entity, u represents the initialization vector, $x_i$ represents the second vector corresponding to the key entity, and softmax represents the normalization function.
Then, feature fusion is performed using the attention weight of each vector; in form, this is a single weighted summation over the embedded feature vector matrix:

$$g = \sum_{i=0}^{m} \alpha_i x_i$$

wherein the vector g represents the final fused background knowledge vector to be used, $\alpha_i$ represents the attention weight corresponding to the i-th vector, $x_i$ represents the i-th vector, and m bounds the number of vectors.
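The attention fusion above can be sketched with NumPy; the array sizes are illustrative:

```python
import numpy as np

def attention_fuse(X, u):
    """X: (m, d) matrix whose rows are the embedded feature vectors;
    u: (d,) randomly initialized parameter vector.
    Computes alpha_i = softmax(u . x_i) over the rows and returns the
    weighted sum g = sum_i alpha_i * x_i."""
    logits = X @ u                          # u^T x_i for every row
    alpha = np.exp(logits - logits.max())   # numerically stable softmax
    alpha = alpha / alpha.sum()
    return alpha @ X                        # weighted sum -> fused vector g

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))  # 5 feature vectors of dimension 8
u = rng.standard_normal(8)
g = attention_fuse(X, u)
```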
Meanwhile, vector coding is performed on the to-be-processed question text based on the GPT-2 pre-training model to obtain the to-be-used question vector. Further, in order to enrich the background knowledge carried in the to-be-used question vector for the question-answering system's analysis, the to-be-used background knowledge vector is spliced onto the to-be-used question vector to obtain the target question vector. Specifically, as shown in fig. 4, [START] and [END] markers are first added at the beginning and the end of the question; the unidirectional GPT-2 fits human reading and understanding habits and generates the answer to the question using the vector q corresponding to the [END] marker, where q represents the to-be-used question vector. Blending in the background knowledge at the same time effectively improves the question-answering capability of the model and improves the quality and precision of the answers.
Further, the target question vector may be input into the ALBERT model, so as to perform vector analysis on the target question vector based on the model, and further obtain at least one candidate answer entity with a higher matching degree with the target question vector in the answer entity set. For example, the number of candidate answer entities may be preset, or a corresponding number of candidate answer entities determined from a large number of candidate answer entities may be screened according to a preset ratio as candidate answer entities corresponding to the target question vector.
Further, based on the answer evaluation model, each candidate answer entity is evaluated to obtain a corresponding answer evaluation attribute, and the candidate answer entity corresponding to the highest answer evaluation attribute is used as the target answer entity.
Specifically, the target question vector v is mapped into the answer entity set. Let w be a randomly initialized linear transformation parameter matrix, $w \in \mathbb{R}^{d \times k}$, where d is the dimension of the language model feature vector (which can be set to 768 in the ALBERT model), k is the number of candidate answers, and b is a bias vector:

$$s = v^T w + b$$

wherein s denotes the score vector over the answer entities, v denotes the target question vector, w denotes the linear transformation parameter matrix, and b denotes the bias vector.
In order to effectively evaluate the answer entities, the scores need to be normalized so that each answer entity's score is compressed into (0,1) and the score values of all answer entities sum to 1, representing the probability that each answer entity is the correct answer. A commonly used normalization function is softmax, an exponential normalization function. For each score output over the entities for the question, the final probability $y_i$ of the answer is calculated as shown in the following formula:

$$y_i = \frac{e^{s_i}}{\sum_{j=1}^{n} e^{s_j}}$$

wherein $y_i$ indicates the probability that the i-th candidate answer is the correct answer, $e^{s_i}$ represents the exponentiated score corresponding to the i-th candidate answer entity, and $\sum_{j=1}^{n} e^{s_j}$ represents the sum of the exponentiated scores corresponding to the n candidate answer entities.
Let the final evaluation probabilities of the answers be $Y = \{y_1, y_2, \ldots, y_n\}$, and generate the label vector $\hat{Y} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\}$ of each sample according to the answers provided by the training set and the verification set in the common-sense reasoning data set. If the answer to the question is the second entity in the answer entity set, the generated label vector is $\hat{Y} = \{0, 1, 0, \ldots, 0\}$. In order to train the model, the technical solution adopts a cross entropy loss function to evaluate the difference between the label sample and the prediction result; the cross entropy loss function is defined as shown in the following formula:

$$L = -\sum_{j=1}^{n} \hat{y}_j \log y_j$$

wherein $y_j$ is the probability that the j-th answer is the correct answer, and $\hat{y}_j$ is 1 or 0 depending on whether the j-th answer is the correct answer.
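The cross-entropy computation can be sketched as below; the small epsilon guards against log(0) and is an implementation detail, not from the original:

```python
import numpy as np

def cross_entropy(y_pred, y_true):
    """y_pred: predicted probabilities over the n answers;
    y_true: the one-hot label vector.
    Returns L = -sum_j y_hat_j * log(y_j)."""
    return -float(np.sum(y_true * np.log(y_pred + 1e-12)))

y_pred = np.array([0.1, 0.7, 0.2])
y_true = np.array([0.0, 1.0, 0.0])  # the second entity is the correct answer
loss = cross_entropy(y_pred, y_true)
```

Minimizing this loss pushes the predicted probability of the labeled answer toward 1.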
Minimizing the cross entropy loss function between the label samples and the prediction results is the target of model training. Finally, the parameters in the pre-training models GPT-2 and ALBERT, the feature fusion module and the answer reasoning model are adjusted through back-propagation of the error, completing the training of the whole reasoning question-answer model. In the final answer prediction, the answer corresponding to the answer entity with the maximum probability (namely, the target answer entity) is directly output as the answer to the question (namely, the target answer).
According to the technical scheme, a key entity and a non-key entity in the to-be-processed question text are obtained, and first explanatory text information corresponding to the key entity and second explanatory text information corresponding to the non-key entity are determined. In order to construct the background knowledge information corresponding to the to-be-processed question text, entity extraction is performed on the to-be-processed question text based on an entity recognition technology to obtain the key entity and the non-key entity; meanwhile, the first explanatory text information corresponding to the key entity and the second explanatory text information corresponding to the non-key entity are obtained from the explanatory text database. Further, the to-be-used background knowledge vector corresponding to the to-be-processed question text is determined based on the key entity, the non-key entity, the first explanatory text information and the second explanatory text information: triples to be matched are constructed based on the key entity and the non-key entity, a target triple is determined from the triples to be matched, the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information are vectorized to obtain corresponding vectors, and the to-be-used background knowledge vector corresponding to the to-be-processed question text is obtained from each vector and its corresponding weight.
In addition, a problem vector to be used corresponding to the problem text to be processed is determined, a target problem vector is obtained based on the problem vector to be used and the background knowledge vector to be used, the problem text to be processed is subjected to vector coding based on a vector processing model, the problem vector to be used is obtained, and the problem vector to be used and the background knowledge vector to be used are spliced to obtain the target problem vector. And determining at least one candidate answer entity corresponding to the target question vector, obtaining a target answer entity according to the answer evaluation attribute corresponding to each candidate answer entity, and determining a target answer corresponding to the text of the question to be processed based on the target answer entity. Analyzing and processing the target question vector through a pre-trained model, determining at least one candidate answer entity corresponding to the target question vector from an answer entity set, evaluating each candidate answer entity based on an answer evaluation model to obtain corresponding answer evaluation attributes, and taking the candidate answer entity corresponding to the highest answer evaluation attribute as the target answer entity to determine a target answer corresponding to the to-be-processed question text based on the target answer entity. The problem that the obtained answer is not accurate enough due to neglecting background knowledge associated with the input question when the question-answering system analyzes the input question is solved, and the effect of more accurately determining the answer information corresponding to the input question is achieved.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A reasoning question-answering method based on knowledge fusion is characterized by comprising the following steps:
acquiring a key entity and a non-key entity in a problem text to be processed, and determining first interpretation text information corresponding to the key entity and second interpretation text information corresponding to the non-key entity; the key entities are matched with entities in a preset entity database;
determining a background knowledge vector to be used corresponding to the text of the problem to be processed based on the key entity, the non-key entity, the first interpretation text information and the second interpretation text information;
determining a problem vector to be used corresponding to the problem text to be processed, and obtaining a target problem vector based on the problem vector to be used and the background knowledge vector to be used;
determining at least one candidate answer entity corresponding to the target question vector, obtaining a target answer entity according to answer evaluation attributes corresponding to the candidate answer entities, and determining a target answer corresponding to the to-be-processed question text based on the target answer entity.
2. The method according to claim 1, wherein the obtaining key entities and non-key entities in the text of the question to be processed comprises:
extracting at least one entity to be determined from the problem text to be processed based on an entity identification technology;
for each entity to be determined, determining whether the current entity is matched with an entity in a preset entity database;
if yes, determining the current entity as a key entity;
if not, determining that the current entity is a non-key entity.
3. The method of claim 1, wherein the determining first explanatory text information associated with the key entity and second explanatory text information corresponding to the non-key entity comprises:
determining first explanatory text information corresponding to the key entity and second explanatory text information corresponding to the non-key entity from an explanatory text database according to a keyword detection technology; wherein the explanatory text database includes explanatory text information associated with at least one entity.
4. The method of claim 1, wherein determining the to-be-used background knowledge vector corresponding to the to-be-processed question text based on the key entity, the non-key entity, the first explanatory text information, and the second explanatory text information comprises:
obtaining at least one triple to be matched based on at least one non-key entity and the key entity;
in the triple database, respectively carrying out triple matching on each triple to be matched, and determining the successfully matched triple to be matched as a target triple;
determining a background knowledge vector to be used corresponding to the question text to be processed based on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information.
5. The method of claim 4, further comprising:
if each triplet to be matched does not exist in the triplet database, determining a triplet evaluation attribute corresponding to each triplet to be matched;
comparing the triple evaluation attributes, and determining the triple to be matched corresponding to the highest triple evaluation attribute as a target triple;
determining a background knowledge vector to be used corresponding to the question text to be processed based on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information.
6. The method according to claim 4 or 5, wherein the determining a to-be-used background knowledge vector corresponding to the to-be-processed question text based on the target triple, the key entity, the non-key entity, the first explanatory text information, and the second explanatory text information comprises:
vector coding is respectively carried out on the target triple, the key entity, the non-key entity, the first explanatory text information and the second explanatory text information to obtain a corresponding first vector, second vector, third vector, fourth vector and fifth vector;
determining corresponding vectors to be fused based on the vectors and the corresponding weights;
and carrying out fusion processing on each vector to be fused to obtain a background knowledge vector to be used.
7. The method according to claim 1, wherein the determining a to-be-used question vector corresponding to the to-be-processed question text and obtaining a target question vector based on the to-be-used question vector and the to-be-used background knowledge vector comprises:
vectorizing the problem text to be processed based on a vector processing model to obtain a problem vector to be used;
and splicing the problem vector to be used and the background knowledge vector to be used to obtain a target problem vector.
8. The method of claim 1, wherein the determining at least one candidate answer entity corresponding to the target question vector and obtaining a target answer entity according to the answer evaluation attribute corresponding to each candidate answer entity comprises:
determining at least one candidate answer entity corresponding to the target question vector based on a preset answer entity set, wherein the answer entity set comprises at least one answer entity;
performing entity evaluation on each candidate answer entity based on an answer evaluation model to obtain a corresponding answer evaluation attribute;
and determining, based on the answer evaluation attributes, the candidate answer entity with the highest answer evaluation attribute as the target answer entity.
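The selection step of claim 8 reduces to scoring every candidate and taking the argmax. The sketch below uses a hypothetical score table in place of the answer evaluation model; the entity names and scores are invented for illustration.

```python
def select_target_answer(candidate_entities, evaluate):
    """Evaluate each candidate answer entity and return the one with
    the highest answer evaluation attribute (score)."""
    attrs = {entity: evaluate(entity) for entity in candidate_entities}
    return max(attrs, key=attrs.get)

# Hypothetical scores standing in for the answer evaluation model.
scores = {"entity_a": 0.42, "entity_b": 0.87, "entity_c": 0.15}
best = select_target_answer(scores.keys(), scores.get)
```

A real implementation would replace `scores.get` with a learned model scoring each candidate against the target question vector.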
CN202211559297.6A 2022-12-06 2022-12-06 Reasoning question-answering method based on knowledge fusion Pending CN115730058A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211559297.6A CN115730058A (en) 2022-12-06 2022-12-06 Reasoning question-answering method based on knowledge fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211559297.6A CN115730058A (en) 2022-12-06 2022-12-06 Reasoning question-answering method based on knowledge fusion

Publications (1)

Publication Number Publication Date
CN115730058A true CN115730058A (en) 2023-03-03

Family

ID=85300345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211559297.6A Pending CN115730058A (en) 2022-12-06 2022-12-06 Reasoning question-answering method based on knowledge fusion

Country Status (1)

Country Link
CN (1) CN115730058A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117743601A (en) * 2024-02-05 2024-03-22 中南大学 Natural resource knowledge graph completion method, device, equipment and medium
CN117743601B (en) * 2024-02-05 2024-05-17 中南大学 Natural resource knowledge graph completion method, device, equipment and medium
CN118689997A (en) * 2024-08-22 2024-09-24 腾讯科技(深圳)有限公司 Multi-mode question-answering interpretation method and related device

Similar Documents

Publication Publication Date Title
CN117033608B (en) Knowledge graph generation type question-answering method and system based on large language model
CN111767368B (en) Question-answer knowledge graph construction method based on entity link and storage medium
CN111563384B (en) Evaluation object identification method and device for E-commerce products and storage medium
CN113627447A (en) Label identification method, label identification device, computer equipment, storage medium and program product
CN116628173B (en) Intelligent customer service information generation system and method based on keyword extraction
CN115599899B (en) Intelligent question-answering method, system, equipment and medium based on aircraft knowledge graph
CN113821605B (en) Event extraction method
CN115730058A (en) Reasoning question-answering method based on knowledge fusion
CN111368096A (en) Knowledge graph-based information analysis method, device, equipment and storage medium
CN112149386A (en) Event extraction method, storage medium and server
CN115982338A (en) Query path ordering-based domain knowledge graph question-answering method and system
CN117807232A (en) Commodity classification method, commodity classification model construction method and device
CN115577080A (en) Question reply matching method, system, server and storage medium
CN115905187B (en) Intelligent proposition system oriented to cloud computing engineering technician authentication
CN117828024A (en) Plug-in retrieval method, device, storage medium and equipment
CN114254622B (en) Intention recognition method and device
CN111858860A (en) Search information processing method and system, server, and computer readable medium
CN113157892B (en) User intention processing method, device, computer equipment and storage medium
CN115203206A (en) Data content searching method and device, computer equipment and readable storage medium
CN113704422A (en) Text recommendation method and device, computer equipment and storage medium
CN114328797B (en) Content search method, device, electronic apparatus, storage medium, and program product
CN118013017B (en) Intelligent text automatic generation method based on AI large language model
CN113158082B (en) Artificial intelligence-based media content reality degree analysis method
US11934794B1 (en) Systems and methods for algorithmically orchestrating conversational dialogue transitions within an automated conversational system
CN118069852B (en) Multi-model fusion data classification prediction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination