
CN112035627A - Automatic question answering method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112035627A
CN112035627A (application CN202010731550.6A)
Authority
CN
China
Prior art keywords
question
natural language
knowledge
answer
automatic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010731550.6A
Other languages
Chinese (zh)
Other versions
CN112035627B (en)
Inventor
傅向华
杨静莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Technology University
Original Assignee
Shenzhen Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Technology University filed Critical Shenzhen Technology University
Priority to CN202010731550.6A priority Critical patent/CN112035627B/en
Publication of CN112035627A publication Critical patent/CN112035627A/en
Application granted granted Critical
Publication of CN112035627B publication Critical patent/CN112035627B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention is applicable to the field of computer technology and provides an automatic question answering method, device, equipment and storage medium. The method comprises the following steps: obtaining a natural language question; searching for knowledge information about the question in a preset knowledge-graph-based question-answer library; inputting the question and the knowledge information into a pre-trained automatic question-answering model; and encoding, fusing and decoding them through the model to obtain the answer output by the model, thereby effectively improving the accuracy of automatic question answering.

Description

Automatic question answering method, device, equipment and storage medium
Technical Field
The invention belongs to the fields of natural language processing, machine learning and artificial intelligence in the technical field of computers, and particularly relates to an automatic question answering method, an automatic question answering device, automatic question answering equipment and a storage medium.
Background
Automatic question-answering technology draws on natural language processing, machine learning, artificial intelligence and related fields, and can automatically analyze questions posed by a user in natural language and return corresponding answers to the user.
Researchers have proposed various automatic question-answering models, such as those based on Sequence-to-Sequence networks (seq2seq). However, most existing models face a lack of knowledge, so they cannot accurately understand the question and the accuracy of the obtained answers is low.
Disclosure of Invention
The invention aims to provide an automatic question answering method, device, equipment and storage medium, so as to solve the problem that answer accuracy is low because the prior art does not provide an effective automatic question answering method.
In one aspect, the present invention provides an automatic question answering method, comprising the steps of:
acquiring a natural language question;
searching knowledge information of the natural language question in a preset question-answer library based on a knowledge graph;
and inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model to obtain an answer of the natural language question output by the automatic question-answering model.
In another aspect, the present invention provides an automatic question answering apparatus, including:
a question acquiring unit for acquiring a natural language question;
a knowledge searching unit for searching knowledge information of the natural language question in a preset knowledge-graph-based question-answer library; and
an answer generating unit for inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model to obtain the answer of the natural language question output by the model.
In another aspect, the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the automatic question answering method when executing the computer program.
In another aspect, the present invention further provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of the automatic question answering method.
By obtaining a natural language question, searching for its knowledge information in a preset knowledge-graph-based question-answer library, and inputting the question together with that knowledge information into a pre-trained automatic question-answering model to obtain the answer the model outputs, the natural language question and the knowledge information are fused during question answering. This effectively alleviates the lack of knowledge in automatic question answering, improves the efficiency of obtaining knowledge information, and thereby improves both the accuracy and the efficiency of automatic question answering.
Drawings
Fig. 1 is a flowchart illustrating an implementation of an automatic question answering method according to an embodiment of the present invention;
fig. 2 is a flowchart of an implementation of an automatic question answering method according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of an automatic question answering device according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following detailed description of specific implementations of the present invention is provided in conjunction with specific embodiments:
the first embodiment is as follows:
fig. 1 shows an implementation flow of an automatic question answering method according to an embodiment of the present invention, and for convenience of description, only the relevant parts related to the embodiment of the present invention are shown, which are detailed as follows:
in step S101, a natural language question is acquired.
The embodiment of the invention is suitable for electronic equipment such as computers, servers, tablet computers, smart phones and the like.
In the embodiment of the invention, a natural language question input by a user, sent by a terminal device, or collected in advance can be acquired so that the acquired question can be answered. Natural language refers to a language such as Chinese, English or Japanese, and a natural language question is a question expressed in natural language, for example, "what drug is used to treat a child's cold?". One or more natural language questions may be acquired; when there are multiple questions, each can subsequently be answered automatically.
In step S102, knowledge information of natural language question sentences is searched in a preset knowledge-graph-based question-answer library.
In the embodiment of the present invention, the knowledge graph includes pieces of knowledge, and each piece of knowledge can be represented as a Subject-Predicate-Object (SPO) triple, where the subject and object represent entities and the predicate represents the relationship between the two entities, so each piece of knowledge can be understood as: entity-entity relationship-entity. The knowledge-graph-based question-answer library is built from a plurality of preset entity types, a plurality of preset entity-relationship types and a plurality of question types, and therefore contains a plurality of pieces of knowledge. The knowledge information of the natural language question is the knowledge found in the knowledge graph that corresponds to the question input into the knowledge-graph-based question-answer library, which effectively improves the efficiency of acquiring knowledge information.
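As a concrete illustration of the SPO representation described above, pieces of knowledge can be stored as simple triples and queried by subject and predicate. The entity and relation names below are hypothetical examples, not taken from the patent:

```python
# Pieces of knowledge as Subject-Predicate-Object (SPO) triples.
# Entity and relation names are illustrative, not taken from the patent.
triples = [
    ("common cold", "common_drug_for_disease", "ibuprofen"),
    ("common cold", "symptom_of_disease", "fever"),
    ("common cold", "symptom_of_disease", "cough"),
]

def find_objects(triples, subject, predicate):
    """Return every object entity linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

For example, `find_objects(triples, "common cold", "symptom_of_disease")` returns the symptom entities linked to the disease entity.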
By way of example, in a knowledge-graph-based medical question-answer library, the entity types may include "diagnostic examination items" (e.g., bronchography, arthroscopy), "medical departments" (e.g., internal medicine, cosmetic surgery), "diseases", "drugs", etc.; the entity-relationship types may include "belongs to" (e.g., symptom XX belongs to department XX), "common drug for disease" (e.g., drug XX treats disease XX), and "recommended food for disease" (e.g., food XX is suitable for disease XX); and the question types may include "asking for disease symptoms" (e.g., "what are the symptoms of disease XX") and "asking for disease causes" (e.g., "why does disease XX occur").
In the embodiment of the invention, knowledge information related to the natural language question can be searched for in the knowledge-graph-based question-answer library through a database query statement. A question may have one or more pieces of knowledge information, and each piece may be a sentence or a paragraph. For example, for the natural language question "what drug treats a child's cold and fever", the knowledge information found in the library related to "what drug treats a child's cold" might be "drug XX is recommended when the child's cold symptom is XX; drug XXX is recommended when the symptom is XXX", and the knowledge information related to "what drug treats a child's fever" might be "drug XX is recommended for treating a child's fever".
In step S103, the natural language question and the knowledge information are input into the pre-trained automatic question-answering model, and the natural language question and the knowledge information are encoded, fused, and decoded by the automatic question-answering model, so as to obtain an answer of the natural language question output by the automatic question-answering model.
In the embodiment of the invention, after the knowledge information of the natural language question is obtained, it can be used as external extended knowledge to supplement the question and address the lack of knowledge in the automatic question-answering process.
In the embodiment of the invention, the natural language question and the knowledge information of the natural language question can be input into the automatic question-answering model. In the automatic question-answering model, a natural language question and knowledge information are respectively coded to obtain a question representation of the natural language question and a knowledge representation of the knowledge information, the question representation and the knowledge representation are fused to obtain a question representation fused with the knowledge representation, and the question representation fused with the knowledge representation is decoded to obtain an answer of the natural language question. The questions of the natural language questions are expressed as word vector sequences of the natural language questions, the knowledge of the knowledge information is expressed as word vector sequences of the knowledge information, and the automatic question-answering model is a deep neural network model.
In the embodiment of the invention, the knowledge information of the natural language question is found in the knowledge-graph-based question-answer library, and the question and the knowledge information are encoded, fused and decoded by the pre-trained automatic question-answering model to obtain the answer the model outputs. Obtaining knowledge information through the knowledge-graph-based library improves retrieval efficiency, and processing the question together with that knowledge in the model addresses the lack of knowledge faced by automatic question answering, further improving its accuracy and efficiency.
Example two:
fig. 2 shows an implementation flow of an automatic question answering method provided by the second embodiment of the present invention, and for convenience of description, only the relevant parts of the second embodiment of the present invention are shown, which are detailed as follows:
in step S201, a natural language question is acquired.
In the embodiment of the present invention, step S201 may refer to the detailed description of step S101, and is not described again.
In step S202, knowledge information of natural language question sentences is searched in a preset knowledge-graph-based question-answer library.
In the embodiment of the present invention, the knowledge graph includes pieces of knowledge, each of which can be represented as a Subject-Predicate-Object triple and understood as: entity-entity relationship-entity. The knowledge-graph-based question-answer library is built from a plurality of preset entity types, a plurality of preset entity-relationship types and a plurality of question types, and therefore contains a plurality of pieces of knowledge.
In the embodiment of the invention, knowledge information related to the natural language question can be searched in the question-answer library based on the knowledge graph through the database query sentence. The number of the knowledge information of the natural language question is one or more, and each piece of knowledge information can be a sentence or a paragraph.
In the embodiment of the invention, when searching the knowledge-graph-based question-answer library for knowledge information related to the natural language question through a database query statement, the category words preset in the library can first be obtained and matched against the question to determine the question category (or categories) to which it belongs. A question may belong to more than one category; for example, "how to determine whether I have a cold" may belong both to the category "asking for disease symptoms" and to the category "asking what examinations a disease requires". Keywords and topic words are then extracted from the question according to those preset in the library, for example with an Aho-Corasick (AC) automaton. Both keywords and topic words are important words in the sentence, but topic words carry more weight, and a question usually contains a small number of topic words and several keywords. For example, in the question "what drug treats a child's cough and fever", "cough" and "fever" are topic words, while "child", "drug" and "treat" are keywords.
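A minimal sketch of the topic-word and keyword extraction step, using plain dictionary matching. A production system would use an Aho-Corasick automaton for efficient multi-pattern matching; the word lists here are hypothetical examples, not taken from the patent:

```python
# Minimal sketch of topic-word and keyword extraction by dictionary matching.
# A production system would use an Aho-Corasick automaton for efficiency;
# the word lists are hypothetical examples, not taken from the patent.
TOPIC_WORDS = {"cough", "fever"}
KEYWORDS = {"child", "drug", "treat"}

def extract_terms(question):
    """Split the question into tokens and match them against the preset lists."""
    tokens = question.lower().split()
    topic_words = [t for t in tokens if t in TOPIC_WORDS]
    keywords = [t for t in tokens if t in KEYWORDS]
    return topic_words, keywords
```

On "what drug to treat child cough and fever", this yields the topic words "cough" and "fever" and the keywords "drug", "treat" and "child", mirroring the example in the text.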
After the question category, topic words and keywords of the natural language question are obtained, they can be assembled into a database query statement used to search the knowledge-graph-based question-answer library for the question's knowledge information. Querying by the question category, topic words and keywords obtained through this semantic analysis reduces the complexity of the knowledge lookup and improves the efficiency of obtaining knowledge information.
In a feasible implementation, the knowledge-graph-based question-answer library is a graph database, which improves the storage and lookup efficiency of the knowledge graph. For example, the library may be the graph database Neo4j, and the database query statement may be a structured Cypher query. As another example, an existing knowledge-graph-based medical question-answer library may be used.
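A sketch of assembling a parameterized Cypher query from the question category and topic word, as described above. The node labels, relationship types and category names are hypothetical; a real library would define its own graph schema:

```python
# Sketch: assemble a parameterized Cypher query from the question category
# and topic word. Node labels, relationship types and category names are
# hypothetical, not taken from the patent.
def build_cypher(question_category, topic_word):
    templates = {
        "ask_disease_symptoms":
            "MATCH (d:Disease {name: $name})-[:HAS_SYMPTOM]->(s:Symptom) "
            "RETURN s.name",
        "ask_disease_drugs":
            "MATCH (d:Disease {name: $name})-[:COMMON_DRUG]->(m:Drug) "
            "RETURN m.name",
    }
    # Return the query text and its parameter map; a Neo4j driver session
    # would execute these with session.run(query, params).
    return templates[question_category], {"name": topic_word}
```

Using query parameters (rather than string interpolation) is the idiomatic way to pass the extracted topic word into Cypher.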
In step S203, a natural language question is encoded by an encoder in the automatic question-answering model, so as to obtain a question representation.
In the embodiment of the invention, the natural language question can be vectorized, and the vectorized question is input into the encoder of the automatic question-answering model and encoded to obtain the question representation. The question representation is a sequence of word vectors, one for each word in the question, and the encoder is a trained neural network model.
In a feasible implementation, the encoder comprises a preset number of network layers, and the vectorized natural language question is encoded sequentially through each network layer to obtain the question representation. Each network layer comprises a multi-head attention layer and a feedforward neural network layer; the output of the previous network layer serves as the input of the current layer, where it is processed first by the multi-head attention layer and then by the feedforward neural network layer, improving the encoding of the question.
In one possible embodiment, when the vectorized natural language question is processed sequentially through the network layers, the output of the Nth network layer of the encoder is input into the multi-head attention layer of the (N+1)th layer to obtain the attention output; the attention output undergoes residual learning and batch normalization and is then fed into the feedforward neural network of the (N+1)th layer to produce that layer's output. That output in turn becomes the input of the (N+2)th layer, realizing the encoding process of each network layer. Here N ≥ 1, and the input of the first network layer is the vectorized natural language question. Combining multi-head attention, residual learning, batch normalization and a feedforward network in this way improves the encoding of the question.
In a possible embodiment, taking the first network layer as an example, encoding the vectorized natural language question through the first network layer includes the following steps:
(1) In the multi-head attention layer of the first network layer, the vectorized natural language question is processed with preset weight parameters. The processing formula may be:
Qi = F·Wi^Q, Ki = F·Wi^K, Vi = F·Wi^V (1 ≤ i ≤ h),
where Wi^Q, Wi^K and Wi^V are weight parameters of the multi-head attention layer, h is a preset parameter (the number of attention heads), F is the vectorized natural language question, and Wi^Q, Wi^K and Wi^V are obtained by training the automatic question-answering model.
(2) In the multi-head attention layer of the first network layer, the h groups Qi, Ki, Vi are processed through a preset self-attention formula to obtain h weighted feature matrices Hi, and the final feature matrix H is obtained from the h matrices Hi.
The feature matrix Hi may be calculated as:
Hi = Attention(Qi, Ki, Vi) = softmax(Qi·Ki^T / √dk)·Vi,
where dk is a preset constant parameter.
The feature matrix H may be calculated as:
H = Concat(H1, H2, …, Hh)·W0,
where Concat() is a function for concatenating vectors or arrays, here used to concatenate the feature matrices Hi, and W0 is a preset weight parameter obtained by training the automatic question-answering model.
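Steps (1) and (2) can be sketched together as follows; the shapes are illustrative and the implementation is a minimal sketch, not the patent's exact code:

```python
import numpy as np

def multi_head_attention(F, Wq, Wk, Wv, W0, d_k):
    """Sketch of steps (1)-(2): per-head projections, scaled dot-product
    self-attention, and concatenation. F is the vectorized question
    (seq_len x d_model); Wq, Wk, Wv are lists of h projection matrices;
    W0 is the output projection."""
    heads = []
    for Wi_q, Wi_k, Wi_v in zip(Wq, Wk, Wv):
        Q, K, V = F @ Wi_q, F @ Wi_k, F @ Wi_v          # Qi = F*Wi^Q, etc.
        scores = Q @ K.T / np.sqrt(d_k)                 # scaled dot product
        scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        heads.append(weights @ V)                       # Hi = Attention(Qi, Ki, Vi)
    return np.concatenate(heads, axis=-1) @ W0          # H = Concat(H1..Hh)*W0
```

With h heads of width d_model/h, the concatenated output has the same width as the input F, which is what allows the residual connection in step (3).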
(3) Residual learning and batch normalization are performed on the feature matrix H.
The residual learning formula may be:
H' = F + H,
where H' is the feature matrix after residual learning.
The batch normalization of the residual-learned feature matrix H' may be:
L = γ·(H' − μ)/√(σ² + ε) + β, with μ = (1/n)·Σi H'i and σ² = (1/n)·Σi (H'i − μ)²,
where γ and β are preset weight parameters obtained by training the automatic question-answering model, n is the number of nodes of the multi-head attention layer and is a preset constant parameter, ε is a small constant for numerical stability, and L is the feature matrix after batch normalization.
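A minimal sketch of step (3), residual learning followed by normalization; the ε stability constant is an assumed detail not spelled out in the text:

```python
import numpy as np

def residual_norm(F, H, gamma, beta, eps=1e-6):
    """Sketch of step (3): residual learning H' = F + H, then normalization
    over the feature axis with learned scale gamma and shift beta. The small
    constant eps is an assumed numerical-stability term."""
    H_prime = F + H                            # H' = F + H (residual learning)
    mu = H_prime.mean(axis=-1, keepdims=True)  # per-row mean
    var = H_prime.var(axis=-1, keepdims=True)  # per-row variance
    return gamma * (H_prime - mu) / np.sqrt(var + eps) + beta
```

Normalizing after the residual sum keeps each layer's output at a stable scale before it enters the feedforward network in step (4).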
(4) The batch-normalized feature matrix is processed in the feedforward network layer of the first network layer to obtain the output of the feedforward network layer.
In the feedforward network layer of the first network layer, the processing formula may be:
FFN(L) = max(0, L·W1 + b1)·W2 + b2,
where W1 and W2 are preset weight parameters, b1 and b2 are preset bias parameters, W1, W2, b1 and b2 are obtained by training the automatic question-answering model, and max(0, L·W1 + b1) takes the maximum of 0 and L·W1 + b1.
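The feedforward formula above translates directly into code, a ReLU-activated linear layer followed by a second linear projection:

```python
import numpy as np

def feed_forward(L, W1, b1, W2, b2):
    """Sketch of step (4): FFN(L) = max(0, L*W1 + b1)*W2 + b2."""
    return np.maximum(0.0, L @ W1 + b1) @ W2 + b2
```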
(5) Residual learning and batch normalization are performed on the output of the feedforward network layer of the first network layer to obtain the output of the first network layer; the residual learning and batch normalization steps are as described above and are not repeated.
The processing of the vectorized natural language question by the first network layer is thus completed through steps (1), (2), (3), (4) and (5). The second, third and subsequent network layers process their inputs in the same way, with the output of each layer serving as the input of the next, so the description of the first network layer is not repeated. The output of the last network layer is determined as the question representation of the natural language question.
In step S204, the knowledge information is encoded by the encoder to obtain a knowledge representation.
In the embodiment of the invention, the knowledge information of the natural language question is encoded by the encoder to obtain the knowledge representation of the knowledge information; the process is the same as encoding the natural language question to obtain the question representation and is not repeated.
In step S205, a fusion parameter for fusing the question representation and the knowledge representation is determined, and the two representations are fused based on this parameter to obtain a question representation fused with the knowledge representation.
In the embodiment of the invention, after the question expression of the natural language question and the knowledge expression of the knowledge information of the natural language question are obtained, the fusion parameter for fusing the question expression and the knowledge expression can be determined, and the question expression and the knowledge expression are fused according to the fusion parameter to obtain the question expression fused with the knowledge expression, so that the question expression used for decoding subsequently contains sufficient knowledge information, and the answer accuracy is improved.
In a possible embodiment, in the process of determining the fusion parameters for fusing the question representation and the knowledge representation, a matching degree between the question representation and the knowledge representation may be determined first, and the fusion parameters for fusing the question representation and the knowledge representation may be determined according to the matching degree, so as to improve the accuracy of the fusion parameters for fusing the question representation and the knowledge representation.
In one possible embodiment, the matching degree between the question representation and the knowledge representation may be calculated as:
f(rq, rk) = Wm·[rq; rk] + bm,
where rq is the question representation, rk is the knowledge representation, f(rq, rk) is the matching degree of the two, [rq; rk] denotes their concatenation, Wm is a preset weight parameter, bm is a preset bias parameter, and Wm and bm are obtained by training the automatic question-answering model.
Further, the fusion parameter may be determined from the matching degree as:
α = relu(Wα·f(rq, rk) + bα),
where α is the fusion parameter, relu() is the activation function, Wα is a preset weight parameter, bα is a preset bias parameter, and Wα and bα are obtained by training the automatic question-answering model. α ∈ R^n, where n is the number of pieces of knowledge information and also the number of knowledge representations.
In a feasible implementation, when fusing the question representation and the knowledge representation according to the fusion parameter, the fusion parameter acts as a soft switch: it is used as the extraction probability for information from the knowledge representation, while 1 minus the fusion parameter is used as the extraction probability for information from the question representation. This improves the flexibility of the fusion and reduces its complexity. The fusion may be calculated as:
r'k = α·rk, r'q = (1 − α)·rq, Rq = r'k + r'q,
where Rq is the question representation fused with the knowledge information, r'k is the information extracted from the knowledge representation with the fusion parameter α as the extraction probability, and r'q is the information extracted from the question representation with 1 − α as the extraction probability.
Further, when no knowledge information for the natural language question is found in the knowledge-graph-based question-answer library, the value of the fusion parameter is set to 0, so that the extraction probability of the knowledge representation is 0 and the extraction probability of the question representation is 1. Before the knowledge representation and the question representation are fused, whether the natural language question contains effective information may also be detected. If the natural language question is detected to contain no effective information (for example, the question "what is making me uncomfortable today" carries no effective information), the value of the fusion parameter is set to 1; in this case the extraction probability of the knowledge representation is 1 and the extraction probability of the question representation is 0, which effectively improves the flexibility of fusing the knowledge representation and the question representation.
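The soft-switch fusion and its two edge cases (no knowledge found, no effective information in the question) can be sketched as follows; the toy two-dimensional representations and the α value are illustrative assumptions:

```python
def fuse(r_q, r_k, alpha):
    """Soft-switch fusion: alpha extracts from the knowledge representation,
    (1 - alpha) extracts from the question representation, then the two
    extracted parts are summed to give the fused question representation."""
    r_k_part = [alpha * v for v in r_k]        # r'_k = alpha * r_k
    r_q_part = [(1 - alpha) * v for v in r_q]  # r'_q = (1 - alpha) * r_q
    return [a + b for a, b in zip(r_k_part, r_q_part)]  # R_q = r'_k + r'_q

r_q = [0.2, 0.5]
r_k = [0.1, 0.9]

R = fuse(r_q, r_k, alpha=0.47)
# Edge cases described above:
R_no_knowledge = fuse(r_q, r_k, alpha=0.0)  # no knowledge found -> pure question
R_no_info      = fuse(r_q, r_k, alpha=1.0)  # no effective info -> pure knowledge
```

With α = 0 the fused representation reduces to the question representation, and with α = 1 it reduces to the knowledge representation, matching the two special cases in the text.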
In step S206, the question expression fused with the knowledge expression is decoded by the decoder in the automatic question-answering model to obtain the answer.
In the embodiment of the invention, after the question expression fused with the knowledge expression is obtained, it can be decoded by the decoder in the automatic question-answering model to obtain the answer to the natural language question, thereby realizing automatic question answering for the natural language question.
In the embodiment of the invention, the decoder comprises a plurality of network layers, and each network layer sequentially comprises a multi-head attention layer and a feedforward neural network layer. The output of each multi-head attention layer undergoes residual learning and batch standardization, and the output of each feedforward neural network layer likewise undergoes residual learning and batch standardization; the multi-head attention layer, the feedforward neural network layer, residual learning and batch standardization are described in detail for the encoding layers and are not repeated here. The question expression fused with the knowledge expression is processed by the plurality of network layers of the decoder to obtain the predicted words output by the last network layer. Each predicted word is input into a linear network layer and a classification layer connected to the last network layer of the decoder to obtain the probability of each predicted word under the probability distribution corresponding to a preset vocabulary table; vocabularies are then selected from all the predicted words according to the probability of each predicted word, and the selected vocabularies are combined to obtain the answer to the natural language question.
In a possible embodiment, when selecting vocabularies from all the predicted words according to the probability of each predicted word, the predicted word with the highest probability may be selected at each step until the selected predicted word is the preset terminator. All the selected vocabularies are then combined in the order of selection to obtain the answer to the natural language question, thereby improving the accuracy of automatic question answering.
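A minimal sketch of this greedy selection, assuming a hypothetical per-step probability table and an `<eos>` terminator token; both the vocabulary and the probabilities are invented for illustration:

```python
def greedy_decode(step_probs, vocab, end_token="<eos>", max_len=10):
    """At each decoding step pick the highest-probability word;
    stop as soon as the terminator is selected."""
    answer = []
    for probs in step_probs[:max_len]:
        word = vocab[max(range(len(probs)), key=probs.__getitem__)]
        if word == end_token:
            break
        answer.append(word)
    return " ".join(answer)

vocab = ["drink", "water", "rest", "<eos>"]
step_probs = [
    [0.7, 0.1, 0.1, 0.1],    # step 1: "drink" has the highest probability
    [0.1, 0.6, 0.2, 0.1],    # step 2: "water"
    [0.05, 0.05, 0.1, 0.8],  # step 3: terminator -> stop
]
print(greedy_decode(step_probs, vocab))  # drink water
```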
In one possible implementation, the linear network layer and the classification layer can be expressed as the following formula:
P_kg = softmax(V·y + b), where P_kg is the probability corresponding to the predicted vocabulary y, V is a preset weight parameter, b is a preset bias parameter, and both V and b can be obtained by training the automatic question-answering model.
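A plain-Python sketch of the linear network layer followed by the softmax classification layer; the toy vocabulary size, weights and decoder output are assumptions for illustration:

```python
import math

def linear_softmax(y, V, b):
    """Probability over the vocabulary: P = softmax(V @ y + b)."""
    logits = [sum(row[i] * y[i] for i in range(len(y))) + bi
              for row, bi in zip(V, b)]
    m = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

y = [0.5, -0.2]                            # decoder output for one position
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # linear layer for a 3-word vocabulary
b = [0.0, 0.0, 0.0]
P = linear_softmax(y, V, b)                # probability of each vocabulary word
```

The resulting P sums to one, and the highest-probability entry is the word the greedy selection step would pick.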
In one possible implementation, the encoder and the decoder in the automatic question-answering model may respectively use a Transformer encoder and a Transformer decoder to improve the encoding effect and the decoding effect of the automatic question-answering model.
In one possible embodiment, when training the automatic question-answering model, a supervised training mode may be adopted to improve the training effect of the automatic question-answering model.
In the embodiment of the invention, the knowledge information of the natural language question is searched in the knowledge-graph-based question-answer library, and the natural language question and the knowledge information are encoded, fused and decoded by the pre-trained automatic question-answering model to obtain the answer to the natural language question output by the model. Obtaining the knowledge information through the knowledge-graph-based question-answer library improves the efficiency of acquiring knowledge information, and processing the natural language question together with the knowledge information through the automatic question-answering model solves the lack-of-knowledge problem faced by automatic question answering, thereby further improving the accuracy and efficiency of automatic question answering.
Example three:
fig. 3 shows a structure of an automatic question answering apparatus according to a third embodiment of the present invention, and for convenience of explanation, only the parts related to the third embodiment of the present invention are shown, including:
a question acquisition unit 31 for acquiring a natural language question;
a knowledge search unit 32, configured to search knowledge information of natural language question sentences in a preset knowledge-graph-based question-answer library; and
the answer generating unit 33 is configured to input the natural language question and the knowledge information into a pre-trained automatic question-and-answer model, and encode, merge, and decode the natural language question and the knowledge information through the automatic question-and-answer model to obtain an answer of the natural language question output by the automatic question-and-answer model.
In a possible embodiment, the knowledge search unit 32 is specifically configured to:
determining question categories to which natural language questions belong according to category words preset in a knowledge graph-based question-answer library; extracting keywords and subject words from natural language question sentences according to keywords and subject words preset in a knowledge-graph-based question-answer library; and searching knowledge information of the natural language question in a question-answer base based on the knowledge graph according to the question category to which the natural language question belongs, the key words in the natural language question and the subject words in the natural language question.
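A toy sketch of this lookup, assuming a hypothetical in-memory question-answer library keyed by (category, keyword, subject word); the word lists and the stored entry are invented purely for illustration:

```python
# Hypothetical knowledge-graph-backed QA library and its preset word lists.
QA_LIBRARY = {
    ("symptom", "headache", "cold"): "A cold often causes headache; rest and fluids help.",
}
CATEGORY_WORDS = {"why": "symptom", "what": "definition"}
KEYWORDS = {"headache"}
SUBJECT_WORDS = {"cold"}

def search_knowledge(question):
    """Determine category, keyword and subject word from the question,
    then look up the knowledge information under that triple."""
    tokens = question.lower().replace("?", "").split()
    category = next((CATEGORY_WORDS[t] for t in tokens if t in CATEGORY_WORDS), None)
    keyword = next((t for t in tokens if t in KEYWORDS), None)
    subject = next((t for t in tokens if t in SUBJECT_WORDS), None)
    return QA_LIBRARY.get((category, keyword, subject))

print(search_knowledge("why does a cold cause headache?"))
```

A real implementation would back this lookup with a knowledge graph store rather than a dictionary, but the category/keyword/subject indexing pattern is the same.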
In one possible implementation, the answer generating unit 33 is specifically configured to:
coding the natural language question by a coder in the automatic question-answering model to obtain a question representation; determining fusion parameters for fusing question expression and knowledge expression; fusing the question expression and the knowledge expression according to the fusion parameters to obtain the question expression represented by the fusion knowledge; and decoding the question expression expressed by the fusion knowledge through a decoder in the automatic question-answering model to obtain an answer.
In one possible embodiment, the encoder includes a preset number of network layers; the answer generating unit 33 is specifically configured to:
vectorizing the natural language question; inputting the vectorized natural language question into the encoder; and sequentially encoding the vectorized natural language question through each network layer to obtain the question representation, wherein each network layer comprises a multi-head attention layer and a feedforward neural network layer.
In one possible implementation, the answer generating unit 33 is specifically configured to:
inputting the output of the Nth network layer in the encoder into a multi-head attention layer in the (N + 1) th network layer in the encoder, wherein N is more than or equal to 1; residual error learning and batch standardization processing are carried out on the output of the multi-head attention layer in the (N + 1) th network layer, and the input of a feedforward neural network layer in the (N + 1) th network layer is obtained; and performing residual error learning and batch standardization processing on the output of the feedforward neural network layer in the (N + 1) th network layer to obtain the output of the (N + 1) th network layer.
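The residual-plus-normalization sublayer pattern described above can be sketched as below. Note the patent says batch standardization; a layer-norm style normalization over a single vector is used here purely as a stand-in, and the attention and feedforward functions are toy placeholders, not real multi-head attention:

```python
import math

def layer_norm(x, eps=1e-6):
    # Normalize a single vector to zero mean and unit variance.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def sublayer(x, fn):
    """Residual learning followed by normalization: norm(x + fn(x))."""
    y = fn(x)
    return layer_norm([a + b for a, b in zip(x, y)])

# One encoder layer = attention sublayer, then feedforward sublayer.
attention    = lambda x: [0.5 * v for v in x]      # stand-in for multi-head attention
feed_forward = lambda x: [max(0.0, v) for v in x]  # stand-in FFN with relu

def encoder_layer(x):
    x = sublayer(x, attention)        # attention output + residual, then normalize
    return sublayer(x, feed_forward)  # FFN output + residual, then normalize

out = encoder_layer([0.3, -0.1, 0.8])
```

Stacking N such layers, each consuming the previous layer's output, gives the flow of steps described for the Nth and (N+1)th network layers above.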
In one possible implementation, the answer generating unit 33 is specifically configured to:
determining the matching degree between the question expression and the knowledge expression; and determining fusion parameters according to the matching degree.
In one possible implementation, the answer generating unit 33 is specifically configured to:
inputting the question expression fused with the knowledge expression into the decoder to obtain the probability distribution over a preset vocabulary table; and sequentially selecting predicted words from the vocabulary table according to the probability distribution, and obtaining the answer according to the selected predicted words.
In the embodiment of the present invention, specific implementation and achieved technical effects of the automatic question answering device may refer to specific descriptions of corresponding method embodiments, and are not described in detail.
In the embodiment of the present invention, each unit of the automatic question answering device may be implemented by a corresponding hardware or software unit, and each unit may be an independent software or hardware unit, or may be integrated into a software or hardware unit, which is not limited herein.
Example four:
fig. 4 shows a structure of an electronic device according to a fourth embodiment of the present invention, and only a part related to the fourth embodiment of the present invention is shown for convenience of description.
The electronic device 4 of an embodiment of the invention comprises a processor 40, a memory 41 and a computer program 42 stored in the memory 41 and executable on the processor 40. The processor 40, when executing the computer program 42, implements the steps in the various method embodiments described above, such as steps S101 to S103 shown in fig. 1 or steps S201 to S206 shown in fig. 2. Alternatively, the processor 40, when executing the computer program 42, implements the functions of the units in the above-described device embodiments, such as the functions of the units 31 to 33 shown in fig. 3.
In the embodiment of the invention, the natural language question is obtained, knowledge information of the natural language question is searched in a preset knowledge-graph-based question-answer library, the natural language question and the knowledge information are input into a pre-trained automatic question-answering model, and the natural language question and the knowledge information are encoded, fused and decoded by the automatic question-answering model to obtain the answer to the natural language question output by the model, so that the lack-of-knowledge problem of automatic question answering is effectively solved and the accuracy of automatic question answering is improved.
The electronic device of the embodiment of the invention can be a computer, a server, a tablet computer and the like. For the steps implemented when the processor 40 in the electronic device 4 executes the computer program 42 to implement the automatic question answering method, reference may be made to the description of the foregoing method embodiments, which is not repeated here.
Example five:
in an embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, implements the steps in the above-described method embodiments, for example, steps S101 to S103 shown in fig. 1 or steps S201 to S206 shown in fig. 2. Alternatively, the computer program may be adapted to perform the functions of the units of the above-described device embodiments, such as the functions of the units 31 to 33 shown in fig. 3, when executed by the processor.
In the embodiment of the invention, the natural language question is obtained, knowledge information of the natural language question is searched in a preset knowledge-graph-based question-answer library, the natural language question and the knowledge information are input into a pre-trained automatic question-answering model, and the natural language question and the knowledge information are encoded, fused and decoded by the automatic question-answering model to obtain the answer to the natural language question output by the model, so that the lack-of-knowledge problem of automatic question answering is effectively solved and the accuracy of automatic question answering is improved.
The computer readable storage medium of the embodiments of the present invention may include any entity or device capable of carrying computer program code, a recording medium, such as a ROM/RAM, a magnetic disk, an optical disk, a flash memory, or the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. An automatic question-answering method, characterized in that it comprises the following steps:
acquiring a natural language question;
searching knowledge information of the natural language question in a preset question-answer library based on a knowledge graph;
inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model, and coding, fusing and decoding the natural language question and the knowledge information through the automatic question-answering model to obtain an answer of the natural language question output by the automatic question-answering model.
2. The method according to claim 1, wherein the searching for the knowledge information of the natural language question in the preset knowledge-graph-based question-answer library comprises:
determining question categories to which the natural language question sentences belong according to category words preset in the knowledge-graph-based question-answer library;
extracting keywords and subject words from the natural language question sentences according to the keywords and the subject words preset in the knowledge-graph-based question-answer library;
and searching knowledge information of the natural language question in a knowledge-graph-based question-answer library according to the question category to which the natural language question belongs, the keywords in the natural language question and the subject words in the natural language question.
3. The method according to claim 1, wherein the inputting the natural language question and the knowledge information into a pre-trained auto-question-answer model, and encoding, fusing and decoding the natural language question and the knowledge information through the auto-question-answer model to obtain an answer of the natural language question output by the auto-question-answer model comprises:
encoding the natural language question by an encoder in the automatic question-answering model to obtain a question representation;
coding the knowledge information through the coder to obtain knowledge representation;
determining fusion parameters for fusing the question representation and the knowledge representation;
fusing the question expression and the knowledge expression according to the fusion parameters to obtain the question expression fused with the knowledge expression;
and decoding the question expression fused with the knowledge expression by a decoder in the automatic question-answering model to obtain the answer.
4. The method of claim 3, wherein the encoder comprises a predetermined number of network layers, and wherein the step of encoding the natural language question by the encoder in the automatic question-and-answer model to obtain a question representation comprises:
vectorizing the natural language question;
inputting the vectorized natural language question into the encoder;
and sequentially encoding the vectorized natural language question through each of the network layers to obtain the question representation, wherein each of the network layers comprises a multi-head attention layer and a feedforward neural network layer.
5. The method of claim 4, wherein the sequentially encoding the vectorized natural language question through each of the network layers comprises:
inputting the output of the Nth network layer in the encoder into a multi-head attention layer in the (N + 1) th network layer in the encoder, wherein N is more than or equal to 1;
residual error learning and batch standardization processing are carried out on the output of the multi-head attention layer in the (N + 1) th network layer, and the input of a feedforward neural network layer in the (N + 1) th network layer is obtained;
inputting the input of a feedforward neural network layer in the (N + 1) th network layer into a feedforward neural network in the (N + 1) th network layer;
and residual error learning and batch standardization processing are carried out on the output of the feedforward neural network layer in the (N + 1) th network layer to obtain the output of the (N + 1) th network layer.
6. The method of claim 3, wherein determining fusion parameters for fusing the question representation and the knowledge representation comprises:
determining a degree of match between the question representation and the knowledge representation;
and determining the fusion parameters according to the matching degree.
7. The method of claim 3, wherein decoding, by a decoder, the question expression fused with the knowledge expression to obtain the answer comprises:
inputting the question expression fused with the knowledge expression into the decoder to obtain probabilities of a plurality of predicted words according to a probability distribution over a preset vocabulary table;
selecting words according to the probability of the plurality of predicted words;
and obtaining the answer according to the selected vocabulary.
8. An automatic question answering device, characterized in that the device comprises:
a question acquiring unit for acquiring a natural language question;
the knowledge searching unit is used for searching knowledge information of the natural language question in a preset knowledge-graph-based question-answer library; and
and the answer generating unit is used for inputting the natural language question and the knowledge information into a pre-trained automatic question-answering model, and coding, fusing and decoding the natural language question and the knowledge information through the automatic question-answering model to obtain the answer of the natural language question output by the automatic question-answering model.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the steps of the method according to any of claims 1 to 7 are implemented when the computer program is executed by the processor.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202010731550.6A 2020-07-27 2020-07-27 Automatic question and answer method, device, equipment and storage medium Active CN112035627B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010731550.6A CN112035627B (en) 2020-07-27 2020-07-27 Automatic question and answer method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112035627A true CN112035627A (en) 2020-12-04
CN112035627B CN112035627B (en) 2023-11-17

Family

ID=73583231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010731550.6A Active CN112035627B (en) 2020-07-27 2020-07-27 Automatic question and answer method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112035627B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559707A (en) * 2020-12-16 2021-03-26 四川智仟科技有限公司 Knowledge-driven customer service question and answer method
CN112650768A (en) * 2020-12-22 2021-04-13 网易(杭州)网络有限公司 Dialog information generation method and device and electronic equipment
CN114265925A (en) * 2021-12-24 2022-04-01 科大讯飞(苏州)科技有限公司 Question answering method and device, electronic equipment and storage medium
CN114385887A (en) * 2021-12-16 2022-04-22 阿里健康科技(杭州)有限公司 Data searching method, medicine searching method and device

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070276723A1 (en) * 2006-04-28 2007-11-29 Gideon Samid BiPSA: an inferential methodology and a computational tool
US20110307435A1 (en) * 2010-05-14 2011-12-15 True Knowledge Ltd Extracting structured knowledge from unstructured text
CN102663129A (en) * 2012-04-25 2012-09-12 中国科学院计算技术研究所 Medical field deep question and answer method and medical retrieval system
US20130262125A1 (en) * 2005-08-01 2013-10-03 Evi Technologies Limited Knowledge repository
US9336259B1 (en) * 2013-08-08 2016-05-10 Ca, Inc. Method and apparatus for historical analysis analytics
CN106710596A (en) * 2016-12-15 2017-05-24 腾讯科技(上海)有限公司 Answer statement determination method and device
CN107729493A (en) * 2017-09-29 2018-02-23 北京创鑫旅程网络技术有限公司 Travel the construction method of knowledge mapping, device and travelling answering method, device
CN107748757A (en) * 2017-09-21 2018-03-02 北京航空航天大学 A kind of answering method of knowledge based collection of illustrative plates
CN108073600A (en) * 2016-11-11 2018-05-25 阿里巴巴集团控股有限公司 A kind of intelligent answer exchange method, device and electronic equipment
CN108647233A (en) * 2018-04-02 2018-10-12 北京大学深圳研究生院 A kind of answer sort method for question answering system
CN108984778A (en) * 2018-07-25 2018-12-11 南京瓦尔基里网络科技有限公司 A kind of intelligent interaction automatically request-answering system and self-teaching method
CN109271524A (en) * 2018-08-02 2019-01-25 中国科学院计算技术研究所 Entity link method in knowledge base question answering system
CN109918489A (en) * 2019-02-28 2019-06-21 上海乐言信息科技有限公司 A kind of knowledge question answering method and system of more strategy fusions
CN110188182A (en) * 2019-05-31 2019-08-30 中国科学院深圳先进技术研究院 Model training method, dialogue generation method, device, equipment and medium
CN111125333A (en) * 2019-06-06 2020-05-08 北京理工大学 Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN111274373A (en) * 2020-01-16 2020-06-12 山东大学 Electronic medical record question-answering method and system based on knowledge graph


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ASHISH VASWANI: "Attention is all you need", Proceedings of the 31st International Conference on Neural Information Processing Systems, pages 6000-6010 *
SHUMAN LIU: "Knowledge Diffusion for Neural Dialogue Generation", Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, vol. 1, page 1489 *
XU CONG: "Research on Dialogue Models Based on Deep Learning and Reinforcement Learning", China Doctoral Dissertations Full-text Database, Information Science and Technology, pages 138-261 *


Also Published As

Publication number Publication date
CN112035627B (en) 2023-11-17

Similar Documents

Publication Publication Date Title
Liu et al. Probabilistic reasoning via deep learning: Neural association models
CN109471895B (en) Electronic medical record phenotype extraction and phenotype name normalization method and system
CN112035627B (en) Automatic question and answer method, device, equipment and storage medium
CN111209384B (en) Question-answer data processing method and device based on artificial intelligence and electronic equipment
CN112667799B (en) Medical question-answering system construction method based on language model and entity matching
CN113672708B (en) Language model training method, question-answer pair generation method, device and equipment
WO2023029506A1 (en) Illness state analysis method and apparatus, electronic device, and storage medium
CN107516110A (en) A kind of medical question and answer Semantic Clustering method based on integrated convolutional encoding
CN113724882B (en) Method, device, equipment and medium for constructing user portrait based on inquiry session
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN111444715B (en) Entity relationship identification method and device, computer equipment and storage medium
CN115048447B (en) Database natural language interface system based on intelligent semantic completion
CN113223735B (en) Diagnosis method, device, equipment and storage medium based on dialogue characterization
CN113408430B (en) Image Chinese description system and method based on multi-level strategy and deep reinforcement learning framework
CN111145914B (en) Method and device for determining text entity of lung cancer clinical disease seed bank
CN112632250A (en) Question and answer method and system under multi-document scene
CN116662502A (en) Method, equipment and storage medium for generating financial question-answer text based on retrieval enhancement
CN111540470A (en) Social network depression tendency detection model based on BERT transfer learning and training method thereof
CN116737911A (en) Deep learning-based hypertension question-answering method and system
CN115455985A (en) Natural language system processing method based on machine reading understanding
CN108509539B (en) Information processing method and electronic device
CN116975212A (en) Answer searching method and device for question text, computer equipment and storage medium
CN112131363B (en) Automatic question and answer method, device, equipment and storage medium
CN112668481A (en) Semantic extraction method for remote sensing image
CN117056475A (en) Knowledge graph-based intelligent manufacturing question-answering method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant