
CN117874202A - Intelligent question-answering method and system based on large model - Google Patents

Intelligent question-answering method and system based on large model

Info

Publication number
CN117874202A
Authority
CN
China
Prior art keywords
text
vector
context
word
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410057741.7A
Other languages
Chinese (zh)
Other versions
CN117874202B (en)
Inventor
邢练军
李克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Aicarer Technology Co ltd
Original Assignee
Shenzhen Aicarer Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Aicarer Technology Co ltd filed Critical Shenzhen Aicarer Technology Co ltd
Priority to CN202410057741.7A
Publication of CN117874202A
Application granted
Publication of CN117874202B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of intelligent question answering, and discloses an intelligent question-answering method and system based on a large model. The method comprises the following steps: searching the word segmentation contexts corresponding to the text word segmentations of the question text, and calculating an embedded vector for each word segmentation context; calculating a word context vector for each word segmentation context according to the embedded vectors; constructing a text vector of the question text from the word context vectors, and carrying out anisotropic disambiguation and text reconstruction on the text vector to obtain a target reconstructed text of the question text; acquiring a pre-constructed answer corpus, constructing the word embedding matrices corresponding to the answer corpus and the target reconstructed text, and performing attention interaction on the word embedding matrices to obtain an interaction matrix; and calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix, and determining the target answer corpus corresponding to the question text according to the similarity. The invention can improve the accuracy of intelligent question answering.

Description

Intelligent question-answering method and system based on large model
Technical Field
The invention relates to the technical field of intelligent question and answer, in particular to an intelligent question and answer method and system based on a large model.
Background
With the rapid development of information technology in recent years, including the continued advancement of Artificial Intelligence (AI), Big Data Technology (BDT) and the "Internet+" practice, the volume of data in various modalities has grown exponentially. In particular, with the popularization of networked applications, users can obtain different types of data such as text, pictures and video from different channels. The need for rapid and accurate information acquisition is becoming ever more urgent, and question-answering systems, which can answer from unstructured or structured data, have become an important branch and an emerging research hotspot in the fields of artificial intelligence and information retrieval.
Existing intelligent question-answering methods include question-answering systems based on knowledge graph embedding. Their core idea is to express each predicate and each entity as a low-dimensional vector, jointly recover the core entity and predicate of a question in the KG embedding space, and return as the answer the fact in the KG whose vector is closest under a joint distance measure; this entails a huge amount of computation and low accuracy. Also included are ConvNet-based convolutional network variants that infer sentence similarity by integrating the differences between multiple convolutions at different scales; however, the limitations of the labeled data limit the performance of such models. Traditional knowledge graph question answering therefore suffers from two kinds of failure: the entity relationship cannot be extracted correctly, or it cannot be matched in the knowledge graph, so the question-answering system cannot effectively feed back the answer to the question, and the accuracy of question answering is low.
Disclosure of Invention
The invention provides an intelligent question-answering method and system based on a large model, and mainly aims to solve the problem of low accuracy of intelligent question-answering.
In order to achieve the above purpose, the invention provides an intelligent question-answering method based on a large model, which comprises the following steps:
obtaining a problem text, segmenting the problem text to obtain text segmentation, searching a segmentation context set corresponding to each text segmentation in a pre-constructed corpus, and calculating an embedded vector of each segmentation context in the segmentation context set;
word embedding and clustering are carried out on the word segmentation context set according to the embedding vector, a context cluster is obtained, probability distribution of each word segmentation context in the context cluster in the word segmentation context set is calculated, and word context vectors of each word segmentation context are generated according to the probability distribution;
the probability distribution of each word segmentation context in the context cluster is calculated by using the following formula:

$$p_{i,k}=\frac{1}{\sum_{j=1}^{C}\left(\frac{\|x_i-c_k\|}{\|x_i-c_j\|}\right)^{\frac{2}{m-1}}}$$

wherein $p_{i,k}$ represents the probability distribution of the i-th word segmentation context in the k-th context cluster, $C$ represents the total number of context clusters, $c_k$ and $c_j$ respectively represent the cluster centers of the k-th and j-th context clusters, $x_i$ represents the clustering vector corresponding to the i-th word segmentation context, and $m$ represents a preset clustering parameter;
Constructing a text vector of the problem text according to the word context vector, carrying out anisotropic disambiguation on the text vector to obtain a target vector of the problem text, and carrying out text reconstruction on the problem text according to the target vector to obtain a target reconstructed text of the problem text;
acquiring pre-constructed answer corpus, respectively constructing word embedding matrixes corresponding to the answer corpus and the target reconstruction text, and performing attention interaction on the word embedding matrixes to obtain an interaction matrix of the answer corpus and the target reconstruction text;
and calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix, and determining the target answer corpus corresponding to the question text according to the similarity.
Optionally, the calculating an embedded vector of each word context in the word context set includes:
converting each word-segmentation context in the word-segmentation context set into a context vector;
and performing embedded coding on the context vector by using the pre-trained coding language model to obtain an embedded vector of each word segmentation context in the word segmentation context set.
Optionally, the constructing a text vector of the question text according to the word context vector includes:
Calculating the inverse document frequency of the word segmentation context corresponding to each word context vector in the word segmentation context set;
multiplying the inverse document frequency with the word context vector to obtain a frequency vector;
and carrying out vector concatenation on the frequency vectors to obtain the text vector of the problem text.
Optionally, the carrying out anisotropic disambiguation on the text vector to obtain a target vector of the question text includes:
decomposing the text vector into a word segmentation vector set, and calculating a covariance matrix corresponding to the word segmentation vector set;
converting the covariance matrix into an identity matrix to obtain an identity covariance matrix;
singular value decomposition is carried out on the unit covariance matrix to obtain a decomposition matrix;
performing linear transformation on the word segmentation vector set according to the decomposition matrix to obtain a transformation vector;
and linearly transforming the word segmentation vector set by using the following formula to obtain a transformation vector:

$$\tilde{y}_l=(y_l-\mu)\,U\Lambda^{-\frac{1}{2}}$$

wherein $\tilde{y}_l$ represents the transformation vector corresponding to the l-th word segmentation vector in the word segmentation vector set, $y_l$ represents the l-th word segmentation vector in the word segmentation vector set, $\mu$ represents the expectation of the word segmentation vector set, $\Lambda$ represents the diagonal matrix in the decomposition matrix, and $U$ represents the orthogonal matrix in the decomposition matrix;
And constructing a target vector of the problem text according to the transformation vector.
Optionally, the text reconstruction is performed on the question text according to the target vector to obtain a target reconstructed text of the question text, which includes:
carrying out semantic coding on the problem text by utilizing a pre-constructed semantic coder to obtain coding vectors of different coding layers;
performing nonlinear mapping on the coding vector to obtain a mapping coding vector;
vector superposition is carried out on the mapping coding vector to obtain a fusion semantic vector of the problem text;
and carrying out full-connection activation calculation on the fusion semantic vector to obtain the target reconstructed text of the question text.
Optionally, the respectively constructing a word embedding matrix corresponding to the answer corpus and the target reconstructed text includes:
performing text preprocessing on the answer corpus and the target reconstruction text to obtain a clean text;
performing character segmentation and character filling on the clean text to obtain text characters;
and mapping the text characters into character vectors, and generating the answer corpus and a word embedding matrix corresponding to the target reconstructed text according to the character vectors.
Optionally, the performing attention interaction on the word embedding matrix to obtain an interaction matrix of the answer corpus and the target reconstructed text includes:
calculating a similarity matrix between the answer corpus and the target reconstruction text according to the word embedding matrix;
performing row activation operation and column activation operation on the similarity matrix respectively to obtain an activation matrix;
and multiplying the activation matrix by the word embedding matrix to obtain an interaction matrix of the answer corpus and the target reconstruction text.
Optionally, the calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix includes:
residual connection is carried out on the interaction matrix by utilizing a pre-constructed feedforward neural network, so as to obtain residual characteristics;
calculating fusion characteristics of the interaction matrix according to the residual characteristics;
calculating the fusion characteristics of the interaction matrix by using the following formula:

$$N=\left[u_a;\;u_b;\;u_a-u_b;\;u_a*u_b\right]$$

wherein $N$ represents the fusion feature, $u_a$ represents the residual feature corresponding to the answer corpus, $u_b$ represents the residual feature corresponding to the target reconstructed text, and $*$ denotes the element-wise product;
And fully connecting the fusion features to obtain the similarity between the target reconstructed text and the answer corpus.
Optionally, the performing full connection on the fusion feature to obtain a similarity between the target reconstructed text and the answer corpus includes:
performing first full connection on the fusion characteristics to obtain a first full connection vector;
performing activation calculation on the first full connection vector to obtain an activation vector;
performing second full connection on the activation vector to obtain a second full connection vector;
normalizing the second full-connection vector to obtain the similarity between the target reconstructed text and the answer corpus.
In order to solve the above problems, the present invention further provides an intelligent question-answering system based on a large model, the system comprising:
the embedded vector calculation module is used for obtaining a problem text, segmenting the problem text to obtain text segmentation, searching a segmentation context set corresponding to each text segmentation in a pre-constructed corpus, and calculating an embedded vector of each segmentation context in the segmentation context set;
the word context vector generation module is used for carrying out word embedding clustering on the word segmentation context set according to the embedding vector to obtain a context cluster, calculating probability distribution of each word segmentation context in the context cluster, and generating a word context vector of each word segmentation context according to the probability distribution;
The probability distribution of each word segmentation context in the context cluster is calculated by using the following formula:

$$p_{i,k}=\frac{1}{\sum_{j=1}^{C}\left(\frac{\|x_i-c_k\|}{\|x_i-c_j\|}\right)^{\frac{2}{m-1}}}$$

wherein $p_{i,k}$ represents the probability distribution of the i-th word segmentation context in the k-th context cluster, $C$ represents the total number of context clusters, $c_k$ and $c_j$ respectively represent the cluster centers of the k-th and j-th context clusters, $x_i$ represents the clustering vector corresponding to the i-th word segmentation context, and $m$ represents a preset clustering parameter;
the text reconstruction module is used for constructing a text vector of the problem text according to the word context vector, carrying out anisotropic disambiguation on the text vector to obtain a target vector of the problem text, and carrying out text reconstruction on the problem text according to the target vector to obtain a target reconstructed text of the problem text;
the word embedding matrix attention interaction module is used for acquiring pre-constructed answer corpus, respectively constructing word embedding matrixes corresponding to the answer corpus and the target reconstructed text, and carrying out attention interaction on the word embedding matrixes to obtain an interaction matrix of the answer corpus and the target reconstructed text;
and the target answer corpus determining module is used for calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix and determining the target answer corpus corresponding to the question text according to the similarity.
According to the embodiment of the invention, the word context vector of each word context can be calculated by searching the word context set corresponding to the text word of the problem text, so that the word context vector of each text word under different contexts can be calculated, the text word is fused with different contexts, and the accuracy of generating the text vector is improved; constructing a text vector according to the word context vector, carrying out anisotropic disambiguation on the text vector, and carrying out text reconstruction on the problem text to obtain a target reconstructed text, so that the anisotropy of the text vector can be eliminated, and the target reconstructed text with more accurate context and semantics can be reconstructed; constructing a word embedding matrix corresponding to the answer corpus and the target reconstruction text, performing attention interaction on the word embedding matrix, and comprehensively considering related information between the answer corpus and the target reconstruction text, so that characteristic information in the interaction matrix is more comprehensive; according to the similarity between the target reconstruction text and the answer corpus calculated by the interaction matrix, the answer corpus and the related information between the target reconstruction text can be comprehensively considered, and the more accurate target answer corpus is obtained. Therefore, the intelligent question-answering method and system based on the large model can solve the problem of low accuracy of intelligent question-answering.
Drawings
FIG. 1 is a flow chart of a large model-based intelligent question-answering method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a text vector for constructing a question text based on word context vectors according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of constructing a word embedding matrix corresponding to an answer corpus and a target reconstructed text according to an embodiment of the present invention;
fig. 4 is a functional block diagram of a large-model-based intelligent question-answering system according to an embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The embodiment of the application provides an intelligent question-answering method based on a large model. The execution subject of the intelligent question-answering method based on the large model comprises, but is not limited to, at least one of a server, a terminal and the like which can be configured to execute the method provided by the embodiment of the application. In other words, the intelligent question-answering method based on the big model can be executed by software or hardware installed on a terminal device or a server device, and the software can be a blockchain platform. The service end includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Referring to fig. 1, a flow chart of a large model-based intelligent question-answering method according to an embodiment of the present invention is shown. In this embodiment, the intelligent question-answering method based on the big model includes:
s1, acquiring a problem text, segmenting the problem text to obtain text segmentation, searching a segmentation context set corresponding to each text segmentation in a pre-constructed corpus, and calculating an embedded vector of each segmentation context in the segmentation context set.
In the embodiment of the invention, the question text is the query text input by a user. Each question text can be divided into a plurality of text word segmentations, but a text word segmentation may have different contexts in different texts. For example, the Chinese word "包袱" refers to a comedic punchline in "the 包袱 in the crosstalk was not delivered well" but to a burden in "the 包袱 on the body is too heavy"; the same word has different contexts and correspondingly different meanings. Therefore, the word segmentation contexts that each text word segmentation may have can be searched in a pre-constructed corpus. The corpus comprises a plurality of word segmentation contexts corresponding to a preset corpus.
Specifically, the embodiment of the invention can segment the question text by using text segmentation methods such as jieba segmentation, shortest-path segmentation, recurrent-neural-network segmentation and n-gram segmentation to obtain text word segmentations, perform consistency matching of the text word segmentations in the corpus to obtain the corpus entries matched with each text word segmentation, and extract the context sets corresponding to those entries as the word segmentation context set of each text word segmentation.
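As a minimal sketch of this step, the snippet below segments a question with the jieba tokenizer and looks each token up in a toy corpus; the corpus dictionary, its example entries and the function name are illustrative assumptions, not part of the patent.

```python
import jieba

# Toy stand-in for the pre-constructed corpus: word -> list of word segmentation contexts.
corpus = {
    "包袱": ["相声里的包袱没抖好", "身上的包袱太重"],
}

def find_contexts(question: str) -> dict:
    """Segment the question text and collect the context set of each text word segmentation."""
    tokens = jieba.lcut(question)                  # text word segmentation
    return {t: corpus.get(t, []) for t in tokens}  # consistency matching in the corpus
```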
In one embodiment, each word segmentation context may be encoded into an embedded vector using a pre-trained language model (Bidirectional Encoder Representations from Transformers, BERT).
In the embodiment of the present invention, the calculating the embedded vector of each word segmentation context in the word segmentation context set includes:
converting each word-segmentation context in the word-segmentation context set into a context vector;
and performing embedded coding on the context vector by using the pre-trained coding language model to obtain an embedded vector of each word segmentation context in the word segmentation context set.
In detail, by calculating the embedded vector of each word segmentation context, the vector representation of each word segmentation context can be obtained, and further the text vector of the question text can be constructed, so that semantic disambiguation can be carried out on the question text, and the accuracy of intelligent question and answer can be improved.
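A hedged sketch of this embedding step, assuming the HuggingFace transformers library and a Chinese BERT checkpoint; mean pooling over the last hidden states is our choice here, since the patent does not fix a pooling scheme:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.eval()

def embed_context(context: str) -> torch.Tensor:
    """Embedded coding of one word segmentation context with the pre-trained coding language model."""
    inputs = tokenizer(context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # one embedded vector per context
```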
S2, carrying out word embedding clustering on the word segmentation context set according to the embedding vector to obtain a context cluster, calculating probability distribution of each word segmentation context in the context cluster, and generating word context vectors of each word segmentation context according to the probability distribution.
In the embodiment of the invention, word embedding clustering is to obtain a preset number of context clustering clusters after fuzzy clustering is carried out on word segmentation context sets, and the clustering center of each context clustering cluster is used as multi-context representation of each text word segmentation.
It should be noted that, fuzzy clustering belongs to soft clustering, word embedding clustering is performed on word segmentation context sets through soft clustering, so that each word segmentation context can be ensured to belong to each context cluster with a certain probability, and each word segmentation context can have different probability distribution in the context clusters.
In the embodiment of the invention, the probability distribution of each word segmentation context in the context cluster is calculated by using the following formula:

$$p_{i,k}=\frac{1}{\sum_{j=1}^{C}\left(\frac{\|x_i-c_k\|}{\|x_i-c_j\|}\right)^{\frac{2}{m-1}}}$$

wherein $p_{i,k}$ represents the probability distribution of the i-th word segmentation context in the k-th context cluster, $C$ represents the total number of context clusters, $c_k$ and $c_j$ respectively represent the cluster centers of the k-th and j-th context clusters, $x_i$ represents the clustering vector corresponding to the i-th word segmentation context, and $m$ represents a preset clustering parameter.
According to the embodiment of the invention, the probability that each word segmentation context belongs to different context clusters can be calculated through probability distribution, and word context vectors under a plurality of different contexts are generated through probability distribution.
Specifically, the embodiment of the invention multiplies the embedded vector corresponding to each word segmentation context with the corresponding probability distribution to obtain the word context vector of each word segmentation context.
In detail, the word context vector of each word segmentation context can be generated according to the probability distribution, so that the word context vector representation of each text word segmentation under different contexts can be obtained, the text word segmentation can be fused with different contexts, and the accuracy of generating the text vector is improved.
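The formula above is the standard fuzzy c-means membership, so a sketch of S2 can be written directly against it; numpy, the function names and the fuzzifier default m = 2 are assumptions:

```python
import numpy as np

def membership(x: np.ndarray, centers: np.ndarray, m: float = 2.0) -> np.ndarray:
    """x: (n, d) clustering vectors; centers: (C, d) cluster centers; returns p[i, k]."""
    d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)  # distances (n, C)
    d = np.maximum(d, 1e-12)                                          # guard against /0
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))      # (n, C, C)
    return 1.0 / ratio.sum(axis=2)                                    # membership p[i, k]

def word_context_vectors(emb: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Multiply each embedded vector by its probability distribution (one vector per cluster)."""
    return p[:, :, None] * emb[:, None, :]                            # (n, C, d)
```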
S3, constructing a text vector of the problem text according to the word context vector, carrying out anisotropic disambiguation on the text vector to obtain a target vector of the problem text, and carrying out text reconstruction on the problem text according to the target vector to obtain a target reconstructed text of the problem text.
In the embodiment of the invention, the text vector is a vector representing the whole question text; different contexts can be fused through the word context vectors, ambiguity of the context is avoided, and the calculation accuracy of the text vector is improved.
In an embodiment of the present invention, referring to fig. 2, the constructing a text vector of the question text according to the word context vector includes:
s21, calculating the inverse document frequency of the word segmentation context corresponding to each word segmentation context vector in the word segmentation context set;
S22, multiplying the inverse document frequency by the word context vector to obtain a frequency vector;
s23, carrying out vector series connection on the frequency vectors to obtain text vectors of the problem text.
In the embodiment of the invention, the inverse document frequency is the reciprocal of the frequency with which each word segmentation context occurs in the word segmentation context set. A context that occurs frequently is not necessarily an important one, so the inverse document frequency is used to reflect the importance of each word segmentation context in the question text, which in turn yields a more accurate text vector.
Specifically, the importance degree of each text word in different word segmentation contexts is obtained by multiplying the inverse document frequency and the word context vector, so that different word segmentation contexts can be comprehensively considered, and a text vector with more accurate context is obtained.
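A small sketch of S21-S23 under stated assumptions: the IDF is taken literally as the reciprocal of the occurrence frequency, and the helper name and argument layout are illustrative:

```python
import numpy as np

def build_text_vector(ctx_vectors: list, ctx_counts: list, total: int) -> np.ndarray:
    """IDF-weight each word context vector and concatenate into the text vector."""
    freq_vectors = []
    for vec, count in zip(ctx_vectors, ctx_counts):
        idf = total / max(count, 1)         # reciprocal of the occurrence frequency
        freq_vectors.append(idf * vec)      # frequency vector (S22)
    return np.concatenate(freq_vectors)     # vector concatenation (S23)
```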
In the embodiment of the invention, the text vector is calculated from the embedded vectors generated by the pre-trained language model; however, because the distribution of the embedded vectors is not Gaussian, the generated text vector suffers from anisotropy, so the text vector can be anisotropically disambiguated to obtain a more accurate target vector.
In the embodiment of the present invention, the anisotropic disambiguation of the text vector to obtain the target vector of the question text includes:
decomposing the text vector into a word segmentation vector set, and calculating a covariance matrix corresponding to the word segmentation vector set;
converting the covariance matrix into an identity matrix to obtain an identity covariance matrix;
singular value decomposition is carried out on the unit covariance matrix to obtain a decomposition matrix;
performing linear transformation on the word segmentation vector set according to the decomposition matrix to obtain a transformation vector;
and constructing a target vector of the problem text according to the transformation vector.
In detail, the word segmentation vector set is composed of the frequency vectors that make up the text vector, and the covariance matrix is computed from the word segmentation vector set. The covariance matrix $\Sigma$ can then be transformed so that $W^{T}\Sigma W=I$, that is, converted into an identity matrix to obtain the unit covariance matrix. Since the unit covariance matrix is a positive definite symmetric matrix, singular value decomposition yields a decomposition matrix $W$ composed of an orthogonal matrix and a diagonal matrix; the word segmentation vectors are then converted by the decomposition matrix into target vectors that conform to a Gaussian distribution, thereby eliminating the anisotropy of the text vector.
In one embodiment, the word segmentation vector set is linearly transformed using the following formula to obtain a transformation vector:

$$\tilde{y}_l=(y_l-\mu)\,U\Lambda^{-\frac{1}{2}}$$

wherein $\tilde{y}_l$ represents the transformation vector corresponding to the l-th word segmentation vector in the word segmentation vector set, $y_l$ represents the l-th word segmentation vector in the word segmentation vector set, $\mu$ represents the expectation of the word segmentation vector set, $\Lambda$ represents the diagonal matrix in the decomposition matrix, and $U$ represents the orthogonal matrix in the decomposition matrix.
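This matches the published BERT-whitening recipe, so a compact sketch follows; numpy and the function name are assumptions:

```python
import numpy as np

def whiten(Y: np.ndarray) -> np.ndarray:
    """Y: (n, d) word segmentation vectors; returns vectors with ~identity covariance."""
    mu = Y.mean(axis=0, keepdims=True)           # expectation μ
    cov = np.cov(Y, rowvar=False)                # covariance matrix of the set
    U, S, _ = np.linalg.svd(cov)                 # U orthogonal, S the diagonal Λ
    W = U @ np.diag(1.0 / np.sqrt(S + 1e-12))    # decomposition matrix W = U Λ^(-1/2)
    return (Y - mu) @ W                          # ỹ_l = (y_l - μ) U Λ^(-1/2)
```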
In the embodiment of the invention, the text reconstruction is to reconstruct the target reconstructed text with more accurate context and semantics according to the target vector so as to improve the accuracy of the subsequent target answer corpus determination.
In the embodiment of the present invention, the text reconstruction is performed on the question text according to the target vector to obtain a target reconstructed text of the question text, including:
carrying out semantic coding on the problem text by utilizing a pre-constructed semantic coder to obtain coding vectors of different coding layers;
performing nonlinear mapping on the coding vector to obtain a mapping coding vector;
vector superposition is carried out on the mapping coding vector to obtain a fusion semantic vector of the problem text;
and carrying out full-connection activation calculation on the fusion semantic vector to obtain the target reconstructed text of the question text.
The semantic encoder may be the BERT model preset in step S1 above; the semantic encoder contains multiple embedded coding layers of BERT, so coding vectors of the different embedded coding layers can be obtained. The coding vectors of different dimensions are then mapped into a vector space of the same dimension to further extract the semantic features in the question text; specifically, the coding vectors can be mapped nonlinearly using a nonlinear function.
In the embodiment of the invention, the vector superposition is to superpose elements of the same channel of the mapping coding vector to obtain the fusion semantic vector fusing a plurality of hierarchical coding features, so that full-connection activation is performed through a full-connection layer and an activation function, and a text corresponding to the fusion semantic vector is output to obtain the target reconstruction text.
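A hedged sketch of this reconstruction head, assuming PyTorch; the hidden size, layer count, vocabulary size and the tanh/softmax choices are illustrative, since the patent fixes none of them:

```python
import torch
import torch.nn as nn

class LayerFusionDecoder(nn.Module):
    def __init__(self, hidden: int = 768, layers: int = 12, vocab: int = 21128):
        super().__init__()
        self.maps = nn.ModuleList([                  # nonlinear mapping per coding layer
            nn.Sequential(nn.Linear(hidden, hidden), nn.Tanh()) for _ in range(layers)])
        self.head = nn.Linear(hidden, vocab)         # full-connection layer

    def forward(self, layer_vecs: list) -> torch.Tensor:
        # vector superposition of the mapped coding vectors -> fusion semantic vector
        fused = sum(m(v) for m, v in zip(self.maps, layer_vecs))
        return torch.softmax(self.head(fused), dim=-1)  # activation over output tokens
```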
S4, acquiring pre-constructed answer corpus, respectively constructing word embedding matrixes corresponding to the answer corpus and the target reconstructed text, and performing attention interaction on the word embedding matrixes to obtain an interaction matrix of the answer corpus and the target reconstructed text.
In the embodiment of the present invention, the answer corpus is the answer text corresponding to preset question texts. It should be noted that the answer corpus may consist of answers preset for a plurality of existing questions, for example a corpus corresponding to a plurality of preset question texts such as domain-expertise questions, questions collected from the web, hot questions and historical questions.
Specifically, referring to fig. 3, the respectively constructing a word embedding matrix corresponding to the answer corpus and the target reconstructed text includes:
s31, performing text preprocessing on the answer corpus and the target reconstruction text to obtain a clean text;
s32, carrying out character segmentation and character filling on the clean text to obtain text characters;
and S33, mapping the text characters into character vectors, and generating the answer corpus and a word embedding matrix corresponding to the target reconstructed text according to the character vectors.
The text preprocessing removes punctuation marks and meaningless characters to obtain a clean text in which every character carries meaning. Character segmentation is then performed on the clean text with a preset character segmentation tool, for example the pynlpir tool; character filling pads the segmented characters to the same fixed length with a preset special symbol so as to obtain the character embedding matrix and facilitate subsequent calculation, wherein the character embedding matrix may be a 1×n row matrix.
Specifically, each text character can be mapped into a character vector with a fixed length by using models such as word2vec which is pre-trained, and an answer corpus and a word embedding matrix of the target reconstructed text are generated through the character vector.
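A sketch of S31-S33 under stated assumptions: the regular-expression cleanup, pad token, length limit and the character-vector lookup (standing in for a pre-trained word2vec-style model) are all illustrative:

```python
import re
import numpy as np

def char_embed_matrix(text: str, char_vecs: dict, dim: int = 64,
                      max_len: int = 32, pad: str = "<PAD>") -> np.ndarray:
    clean = re.sub(r"[^\w]", "", text)              # text preprocessing: strip punctuation
    chars = list(clean)[:max_len]                   # character segmentation
    chars += [pad] * (max_len - len(chars))         # character filling to fixed length
    rows = [char_vecs.get(c, np.zeros(dim)) for c in chars]  # map characters to vectors
    return np.stack(rows)                           # word embedding matrix (max_len, dim)
```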
In another embodiment of the invention, attention interaction is to comprehensively consider the answer corpus, the target reconstruction text and semantic relativity among the answer corpus and the target reconstruction text, so as to improve the comprehensiveness of information in the word embedding matrix.
In the embodiment of the present invention, the performing attention interaction on the word embedding matrix to obtain an interaction matrix of the answer corpus and the target reconstructed text includes:
calculating a similarity matrix between the answer corpus and the target reconstruction text according to the word embedding matrix;
performing row activation operation and column activation operation on the similarity matrix respectively to obtain an activation matrix;
and multiplying the activation matrix by the word embedding matrix to obtain an interaction matrix of the answer corpus and the target reconstruction text.
In the embodiment of the invention, the similarity matrix stores the cosine similarity between the character vectors of the two word embedding matrices; for example, the matrix element in row a, column b of the similarity matrix is the similarity between the a-th character vector of the answer corpus and the b-th character vector of the target reconstructed text.
It should be noted that, in the matrix multiplication of the activation matrix and the word embedding matrix, the word embedding matrix of the answer corpus and the word embedding matrix of the target reconstructed text are each multiplied by one activation matrix; for example, the word embedding matrix of the answer corpus is multiplied by the activation matrix obtained from the row activation operation, and the word embedding matrix of the target reconstructed text is multiplied by the activation matrix obtained from the column activation operation.
In the embodiment of the invention, the relevant information between the answer corpus and the target reconstruction text can be comprehensively considered through the attention interaction, so that the characteristic information in the interaction matrix is more comprehensive, and the accuracy of the subsequent similarity is improved.
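A minimal sketch of this interaction, assuming numpy and softmax as the row/column activation (the patent does not name the activation function):

```python
import numpy as np

def softmax(x: np.ndarray, axis: int) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_interaction(A: np.ndarray, B: np.ndarray):
    """A: answer-corpus embedding matrix (la, d); B: reconstructed-text matrix (lb, d)."""
    An = A / np.linalg.norm(A, axis=1, keepdims=True)
    Bn = B / np.linalg.norm(B, axis=1, keepdims=True)
    S = An @ Bn.T                        # cosine similarity matrix (la, lb)
    inter_A = softmax(S, axis=1) @ B     # row activation, applied to the answer side
    inter_B = softmax(S, axis=0).T @ A   # column activation, applied to the text side
    return inter_A, inter_B              # interaction matrices
```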
S5, calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix, and determining the target answer corpus corresponding to the question text according to the similarity.
In the embodiment of the invention, the similarity is the feature similarity between the target reconstructed text and each answer corpus calculated according to the interaction matrix, and the larger the feature similarity is, the greater the possibility that the answer corpus is the target answer corpus is indicated, so that the target answer corpus corresponding to the question text can be determined according to the similarity.
In the embodiment of the present invention, the calculating, according to the interaction matrix, the similarity between the target reconstructed text and the answer corpus includes:
residual connection is carried out on the interaction matrix by utilizing a pre-constructed feedforward neural network, so as to obtain residual characteristics;
calculating fusion characteristics of the interaction matrix according to the residual characteristics;
and fully connecting the fusion features to obtain the similarity between the target reconstructed text and the answer corpus.
In the embodiment of the invention, the feedforward neural network comprises a two-layer feedforward network and a normalization layer; the feedforward neural network carries out linear transformation and activation on the interaction matrix twice, and finally residual connection and layer normalization are applied to the results of the linear transformations and the first layer normalization to obtain the residual features.
It should be noted that the residual features include the residual feature corresponding to the interaction matrix of the answer corpus and the residual feature corresponding to the interaction matrix of the target reconstructed text.
Specifically, the fusion feature is obtained by feature fusion of the residual features of the answer corpus and of the target reconstructed text.
In detail, the fusion characteristics of the interaction matrix are calculated using the following formula:

$$N=\left[u_a;\;u_b;\;u_a-u_b;\;u_a*u_b\right]$$

wherein $N$ represents the fusion feature, $u_a$ represents the residual feature corresponding to the answer corpus, $u_b$ represents the residual feature corresponding to the target reconstructed text, and $*$ denotes the element-wise product.
Preferably, before calculating the fusion feature, the residual features may first be flattened to one dimension so that they can be concatenated into the fusion feature.
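A sketch of the fusion step, reading the formula with ";" as concatenation and "*" as the element-wise product, both assumptions stated above:

```python
import numpy as np

def fuse(u_a: np.ndarray, u_b: np.ndarray) -> np.ndarray:
    """Fusion feature N = [u_a; u_b; u_a - u_b; u_a * u_b]."""
    u_a, u_b = u_a.ravel(), u_b.ravel()                      # flatten to one dimension
    return np.concatenate([u_a, u_b, u_a - u_b, u_a * u_b])  # fusion feature N
```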
In the embodiment of the present invention, the performing full connection on the fusion feature to obtain the similarity between the target reconstructed text and the answer corpus includes:
performing first full connection on the fusion characteristics to obtain a first full connection vector;
performing activation calculation on the first full connection vector to obtain an activation vector;
performing second full connection on the activation vector to obtain a second full connection vector;
normalizing the second full-connection vector to obtain the similarity between the target reconstructed text and the answer corpus.
In the embodiment of the invention, the fusion feature is fully connected through a two-layer fully connected network in which the dimensions of the two layers differ, which enriches the dimensional scale of the fully connected network.
Specifically, the activation calculation on the first fully connected vector and the normalization of the second fully connected vector may use different activation functions, finally outputting the similarity between the target reconstructed text and the answer corpus.
In the embodiment of the invention, the similarity between the target reconstructed text and the answer corpus is calculated through the interaction matrix, so that the interaction information and the semantic information between the target reconstructed text and the answer corpus can be comprehensively considered, and the similarity between the target reconstructed text and the answer corpus is calculated more accurately.
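A hedged sketch of the two-layer similarity head, assuming PyTorch; the hidden width and the ReLU/sigmoid choices are ours, since the patent only says the two layers use different activations:

```python
import torch
import torch.nn as nn

class SimilarityHead(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 256):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)   # first full connection
        self.fc2 = nn.Linear(hidden, 1)        # second full connection
        self.act = nn.ReLU()

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        v = self.act(self.fc1(fused))          # activation calculation
        return torch.sigmoid(self.fc2(v))      # normalization -> similarity score
```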
In the embodiment of the invention, the answer corpus with the maximum similarity with the target reconstructed text is selected as the target answer corpus corresponding to the question text, and the target answer corpus is used as the answer text of the question text.
FIG. 4 is a functional block diagram of a large model-based intelligent question-answering system according to one embodiment of the present invention.
The intelligent question-answering system 400 based on the large model can be installed in electronic equipment. Depending on the implementation, the large model-based intelligent question-answering system 400 may include an embedded vector calculation module 401, a word context vector generation module 402, a text reconstruction module 403, a word embedded matrix attention interaction module 404, and a target answer corpus determination module 405. The module of the invention, which may also be referred to as a unit, refers to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the embedded vector calculation module 401 is configured to obtain a problem text, segment the problem text to obtain text segments, search a word segmentation context set corresponding to each text segment in a pre-constructed corpus, and calculate an embedded vector of each word segmentation context in the word segmentation context set;
The word context vector generation module 402 is configured to perform word embedding clustering on the word segmentation context set according to the embedding vector to obtain a context cluster, calculate probability distribution of each word segmentation context in the context cluster, and generate a word context vector of each word segmentation context according to the probability distribution;
the probability distribution of each word segmentation context in the context cluster is calculated by using the following formula:
wherein p is i,k Representing the probability distribution of the ith word context in the kth context cluster, C representing the total number of the context clusters, C k 、c j The cluster centers of the kth and jth context clusters are respectively represented, and x i Representing a clustering vector corresponding to the ith word segmentation context, wherein m represents a preset clustering parameter;
the text reconstruction module 403 is configured to construct a text vector of the question text according to the word context vector, anisotropically disambiguate the text vector to obtain a target vector of the question text, and perform text reconstruction on the question text according to the target vector to obtain a target reconstructed text of the question text;
the word embedding matrix attention interaction module 404 is configured to obtain a pre-constructed answer corpus, respectively construct word embedding matrices corresponding to the answer corpus and the target reconstructed text, and perform attention interaction on the word embedding matrices to obtain an interaction matrix of the answer corpus and the target reconstructed text;
The target answer corpus determining module 405 is configured to calculate a similarity between the target reconstructed text and the answer corpus according to the interaction matrix, and determine a target answer corpus corresponding to the question text according to the similarity.
In detail, each module in the large model-based intelligent question-answering system 400 in the embodiment of the present invention adopts the same technical means as the large model-based intelligent question-answering method described in fig. 1 to 3, and can produce the same technical effects, which are not described herein.
The invention also provides an electronic device which may include a processor, a memory, a communication bus, and a communication interface, and may further include a computer program stored in the memory and executable on the processor, such as a large model-based intelligent question-answering method program.
The processor may be formed by an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be formed by a plurality of integrated circuits packaged with the same function or different functions, including one or more central processing units (Central Processing unit, CPU), a microprocessor, a digital processing chip, a graphics processor, a combination of various control chips, and the like. The processor is a Control Unit (Control Unit) of the electronic device, connects various components of the entire electronic device using various interfaces and lines, and executes various functions of the electronic device and processes data by running or executing programs or modules stored in the memory (for example, executing a large model-based intelligent question-answering method program, etc.), and calling data stored in the memory.
The memory includes at least one type of readable storage medium including flash memory, removable hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory may in some embodiments be an internal storage unit of the electronic device, such as a mobile hard disk of the electronic device. The memory may in other embodiments also be an external storage device of the electronic device, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device. Further, the memory may also include both internal storage units and external storage devices of the electronic device. The memory may be used not only for storing application software installed in an electronic device and various types of data, such as codes of a large model-based intelligent question-answering method program, but also for temporarily storing data that has been output or is to be output.
The communication bus may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory and at least one processor or the like.
The communication interface is used for communication between the electronic equipment and other equipment, and comprises a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), or alternatively a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface.
Only an electronic device having components is shown, and it will be understood by those skilled in the art that the structures shown in the figures do not limit the electronic device, and may include fewer or more components than shown, or may combine certain components, or a different arrangement of components.
For example, although not shown, the electronic device may further include a power source (such as a battery) for powering the respective components, and preferably, the power source may be logically connected to the at least one processor through a power management system, so as to perform functions of charge management, discharge management, and power consumption management through the power management system. The power supply may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like. The electronic device may further include various sensors, bluetooth modules, wi-Fi modules, etc., which are not described herein.
Specifically, the specific implementation method of the above instruction by the processor may refer to descriptions of related steps in the corresponding embodiment of the drawings, which are not repeated herein.
Further, the electronic device integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. The computer readable storage medium may be volatile or nonvolatile. For example, the computer readable medium may include: any entity or system capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus, system and method may be implemented in other manners. For example, the system embodiments described above are merely illustrative, e.g., the division of the modules is merely a logical function division, and other manners of division may be implemented in practice.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in a form of hardware or a form of hardware and a form of software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The embodiment of the application can acquire and process the related data based on the artificial intelligence technology. Among these, artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. Multiple units or systems as set forth in the system claims may also be implemented by means of one unit or system in software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. An intelligent question-answering method based on a large model, which is characterized by comprising the following steps:
obtaining a problem text, segmenting the problem text to obtain text segmentation, searching a segmentation context set corresponding to each text segmentation in a pre-constructed corpus, and calculating an embedded vector of each segmentation context in the segmentation context set;
word embedding and clustering are carried out on the word segmentation context set according to the embedding vector, a context cluster is obtained, probability distribution of each word segmentation context in the context cluster in the word segmentation context set is calculated, and word context vectors of each word segmentation context are generated according to the probability distribution;
the probability distribution of each word segmentation context in the context cluster is calculated by using the following formula:

$$p_{i,k}=\frac{1}{\sum_{j=1}^{C}\left(\frac{\|x_i-c_k\|}{\|x_i-c_j\|}\right)^{\frac{2}{m-1}}}$$

wherein $p_{i,k}$ represents the probability distribution of the i-th word segmentation context in the k-th context cluster, $C$ represents the total number of context clusters, $c_k$ and $c_j$ respectively represent the cluster centers of the k-th and j-th context clusters, $x_i$ represents the clustering vector corresponding to the i-th word segmentation context, and $m$ represents a preset clustering parameter;
constructing a text vector of the problem text according to the word context vector, carrying out anisotropic disambiguation on the text vector to obtain a target vector of the problem text, and carrying out text reconstruction on the problem text according to the target vector to obtain a target reconstructed text of the problem text;
Acquiring pre-constructed answer corpus, respectively constructing word embedding matrixes corresponding to the answer corpus and the target reconstruction text, and performing attention interaction on the word embedding matrixes to obtain an interaction matrix of the answer corpus and the target reconstruction text;
and calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix, and determining the target answer corpus corresponding to the question text according to the similarity.
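The symbols in the membership formula of claim 1 (cluster centers, a clustering vector, and a fuzziness parameter m) match the standard fuzzy C-means membership function. Below is a minimal Python sketch of that computation; the function name, the Euclidean distance, and the example data are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def fcm_membership(x_i, centers, m=2.0, eps=1e-12):
    # Fuzzy C-means membership:
    # p_{i,k} = 1 / sum_j (||x_i - c_k|| / ||x_i - c_j||)^(2/(m-1))
    # m is the preset clustering parameter (fuzzifier); eps guards against
    # division by zero when x_i coincides with a cluster center.
    d = np.linalg.norm(centers - x_i, axis=1) + eps
    ratios = (d[:, None] / d[None, :]) ** (2.0 / (m - 1.0))  # ratios[k, j] = (d_k/d_j)^p
    return 1.0 / ratios.sum(axis=1)  # one membership per cluster

# Illustrative data: three cluster centers in 2-D, one clustering vector.
centers = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
p = fcm_membership(np.array([0.9, 0.8]), centers, m=2.0)
print(p, p.sum())  # memberships over the three context clusters; they sum to 1
```

A useful property of this form is that the memberships over all clusters always sum to one, which is what lets the claim treat them directly as a probability distribution.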
2. The intelligent question-answering method based on a large model according to claim 1, wherein calculating the embedded vector of each word segmentation context in the word segmentation context set comprises:
converting each word segmentation context in the word segmentation context set into a context vector;
and performing embedded coding on the context vectors by using a pre-trained coding language model to obtain the embedded vector of each word segmentation context in the word segmentation context set.
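Claim 2's "pre-trained coding language model" is not named; a common realization would be mean-pooled BERT hidden states. A sketch under that assumption follows (the model choice "bert-base-chinese" and the pooling strategy are ours, not the patent's):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed encoder; the patent does not specify a model.
tok = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")
model.eval()

def embed_contexts(contexts):
    # Tokenize the word-segmentation contexts, then mean-pool the last hidden
    # states over non-padding tokens to get one embedded vector per context.
    batch = tok(contexts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state         # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

vecs = embed_contexts(["大模型问答", "智能问答系统"])
print(vecs.shape)  # (2, 768) for a BERT-base encoder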
3. The intelligent question-answering method based on a large model according to claim 1, wherein constructing the text vector of the question text according to the word context vectors comprises:
calculating, in the word segmentation context set, the inverse document frequency of the word segmentation context corresponding to each word context vector;
multiplying each inverse document frequency by the corresponding word context vector to obtain frequency vectors;
and concatenating the frequency vectors to obtain the text vector of the question text.
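A minimal sketch of claim 3's weighting-and-concatenation step. The smoothed IDF variant and the fixed ordering of the vectors are assumptions of ours; the patent only requires that each word context vector be scaled by its inverse document frequency before splicing.

```python
import math
import numpy as np

def inverse_document_frequency(context, context_set):
    # Smoothed IDF of one word-segmentation context within the context set
    # (the exact IDF variant is not specified in the patent).
    hits = sum(1 for doc in context_set if context in doc)
    return math.log((1 + len(context_set)) / (1 + hits)) + 1.0

def build_text_vector(word_context_vectors, idfs):
    # Scale each word context vector by its IDF (a "frequency vector"),
    # then concatenate the frequency vectors into the question text's vector.
    return np.concatenate([w * v for w, v in zip(idfs, word_context_vectors)])

vectors = [np.ones(4), 2 * np.ones(4)]
print(build_text_vector(vectors, idfs=[0.5, 1.5]).shape)  # (8,)
```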
4. The intelligent question-answering method based on a large model according to claim 1, wherein performing anisotropy elimination on the text vector to obtain the target vector of the question text comprises:
decomposing the text vector into a word segmentation vector set, and calculating the covariance matrix corresponding to the word segmentation vector set;
converting the covariance matrix into an identity matrix to obtain a unit covariance matrix;
performing singular value decomposition on the unit covariance matrix to obtain a decomposition matrix;
performing linear transformation on the word segmentation vector set according to the decomposition matrix to obtain transformation vectors;
wherein the word segmentation vector set is linearly transformed by the following formula to obtain the transformation vectors:
\tilde{y}_l = (y_l - \mu) U \Lambda^{-1/2}
wherein \tilde{y}_l represents the transformation vector corresponding to the l-th word segmentation vector in the word segmentation vector set, y_l represents the l-th word segmentation vector in the word segmentation vector set, \mu represents the expectation of the word segmentation vector set, \Lambda represents the diagonal matrix in the decomposition matrix, and U represents the orthogonal matrix in the decomposition matrix;
and constructing the target vector of the question text according to the transformation vectors.
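Claim 4's symbols (mean \mu, orthogonal U, and diagonal \Lambda from a decomposition of the covariance matrix) match a whitening transform, which maps the word segmentation vectors so that their covariance becomes the identity matrix. A sketch under that reading; the helper name and the sample data are illustrative:

```python
import numpy as np

def whiten(Y):
    # Whitening / anisotropy elimination: y_tilde = (y - mu) @ U @ Lambda^(-1/2),
    # where cov(Y) = U @ diag(lam) @ U^T via singular value decomposition.
    mu = Y.mean(axis=0)
    U, lam, _ = np.linalg.svd(np.cov(Y - mu, rowvar=False))
    return (Y - mu) @ U @ np.diag(1.0 / np.sqrt(lam))

Y = np.random.default_rng(0).normal(size=(200, 8))  # illustrative word vectors
Yw = whiten(Y)
# The transformed set now has (near-)identity covariance:
print(np.allclose(np.cov(Yw, rowvar=False), np.eye(8), atol=1e-8))
```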
5. The intelligent question-answering method based on a large model according to claim 1, wherein performing text reconstruction on the question text according to the target vector to obtain the target reconstructed text of the question text comprises:
performing semantic coding on the question text by using a pre-constructed semantic coder to obtain coding vectors of different coding layers;
performing nonlinear mapping on the coding vectors to obtain mapped coding vectors;
performing vector superposition on the mapped coding vectors to obtain a fused semantic vector of the question text;
and performing full-connection activation calculation on the fused semantic vector to obtain the target reconstructed text of the question text.
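Claim 5 leaves the encoder and the activations open. The sketch below assumes tanh for the nonlinear mapping, summation for the vector superposition, and a softmax-activated full connection producing a token distribution; all three are illustrative choices, and the weight matrices are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

def fuse_and_decode(layer_vectors, W_map, W_out):
    # Nonlinearly map each coding layer's vector, superpose (sum) the mapped
    # vectors into a fused semantic vector, then apply a fully connected
    # layer with a numerically stable softmax activation.
    fused = np.sum([np.tanh(h @ W_map) for h in layer_vectors], axis=0)
    logits = fused @ W_out
    logits -= logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()

layers = [rng.normal(size=16) for _ in range(4)]    # 4 coding layers, dim 16
W_map = rng.normal(size=(16, 16)) * 0.1             # mapping weights (stand-ins)
W_out = rng.normal(size=(16, 100)) * 0.1            # output over a 100-token vocab
print(fuse_and_decode(layers, W_map, W_out).shape)  # (100,)
```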
6. The intelligent question-answering method based on a large model according to claim 1, wherein respectively constructing the word embedding matrices corresponding to the answer corpus and the target reconstructed text comprises:
performing text preprocessing on the answer corpus and the target reconstructed text to obtain clean texts;
performing character segmentation and character filling on the clean texts to obtain text characters;
and mapping the text characters into character vectors, and generating the word embedding matrices corresponding to the answer corpus and the target reconstructed text according to the character vectors.
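A minimal sketch of claim 6's character segmentation, filling, and mapping; the vocabulary, the padding token, and the fixed length are hypothetical values chosen for illustration.

```python
import numpy as np

PAD = "<pad>"

def word_embedding_matrix(text, vocab, char_vectors, max_len=16):
    # Character segmentation, then character filling up to max_len, then
    # mapping every character to its vector; the stacked vectors form the
    # word embedding matrix for this text.
    chars = list(text)[:max_len]
    chars += [PAD] * (max_len - len(chars))
    ids = [vocab.get(c, vocab[PAD]) for c in chars]  # unknown chars fall back to PAD
    return char_vectors[ids]                         # shape: (max_len, dim)

vocab = {PAD: 0, "大": 1, "模": 2, "型": 3}
char_vectors = np.random.default_rng(2).normal(size=(len(vocab), 8))
print(word_embedding_matrix("大模型", vocab, char_vectors).shape)  # (16, 8)
```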
7. The intelligent question-answering method based on a large model according to claim 1, wherein performing attention interaction on the word embedding matrices to obtain the interaction matrix of the answer corpus and the target reconstructed text comprises:
calculating a similarity matrix between the answer corpus and the target reconstructed text according to the word embedding matrices;
performing a row activation operation and a column activation operation on the similarity matrix respectively to obtain activation matrices;
and multiplying the activation matrices by the word embedding matrices to obtain the interaction matrix of the answer corpus and the target reconstructed text.
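Claim 7 reads like standard cross-attention between the two word embedding matrices: a token-level similarity matrix, softmax taken once along rows and once along columns as the two activation operations, then a multiplication back onto the embeddings. A sketch under that interpretation; the function names and the split into two attended outputs are our framing.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_interaction(E_answer, E_text):
    # Similarity matrix between every answer token and every text token.
    S = E_answer @ E_text.T
    # Row activation: each answer token attends over the text tokens.
    answer_side = softmax(S, axis=1) @ E_text
    # Column activation: each text token attends over the answer tokens.
    text_side = softmax(S, axis=0).T @ E_answer
    return answer_side, text_side  # the two halves of the interaction matrix

rng = np.random.default_rng(3)
Ua, Ub = attention_interaction(rng.normal(size=(5, 8)), rng.normal(size=(7, 8)))
print(Ua.shape, Ub.shape)  # (5, 8) (7, 8)
```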
8. The intelligent question-answering method based on a large model according to claim 1, wherein calculating the similarity between the target reconstructed text and the answer corpus according to the interaction matrix comprises:
performing residual connection on the interaction matrix by using a pre-constructed feedforward neural network to obtain residual features;
calculating a fusion feature of the interaction matrix according to the residual features;
wherein the fusion feature of the interaction matrix is calculated by the following formula:
N = [u_a; u_b; u_a - u_b; u_a * u_b]
wherein N represents the fusion feature, u_a represents the residual feature corresponding to the answer corpus among the residual features, u_b represents the residual feature corresponding to the target reconstructed text among the residual features, and * represents the vector multiplication symbol;
and fully connecting the fusion feature to obtain the similarity between the target reconstructed text and the answer corpus.
9. The intelligent question-answering method based on a large model according to claim 8, wherein fully connecting the fusion feature to obtain the similarity between the target reconstructed text and the answer corpus comprises:
performing a first full connection on the fusion feature to obtain a first fully connected vector;
performing activation calculation on the first fully connected vector to obtain an activation vector;
performing a second full connection on the activation vector to obtain a second fully connected vector;
and normalizing the second fully connected vector to obtain the similarity between the target reconstructed text and the answer corpus.
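Claims 8 and 9 together describe an InferSent-style matching head: concatenate the two residual features with their difference and product, then apply two full connections with an activation in between and a final normalization. In the sketch, * is taken as the element-wise product and sigmoid as the normalization; both are assumptions, as are the random weight matrices.

```python
import numpy as np

def similarity_head(u_a, u_b, W1, b1, W2, b2):
    # Fusion feature N = [u_a; u_b; u_a - u_b; u_a * u_b] (assuming * is the
    # element-wise product), followed by FC, ReLU, FC, sigmoid normalization.
    N = np.concatenate([u_a, u_b, u_a - u_b, u_a * u_b])
    h = np.maximum(0.0, N @ W1 + b1)               # first full connection + activation
    score = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # second full connection, normalized
    return float(score[0])

rng = np.random.default_rng(5)
d = 8
u_a, u_b = rng.normal(size=d), rng.normal(size=d)
W1, b1 = rng.normal(size=(4 * d, 16)) * 0.1, np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)) * 0.1, np.zeros(1)
print(similarity_head(u_a, u_b, W1, b1, W2, b2))  # a similarity in (0, 1)
```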
10. An intelligent question-answering system based on a large model, the system comprising:
an embedded vector calculation module, configured to obtain a question text, perform word segmentation on the question text to obtain text segments, search a pre-constructed corpus for the word segmentation context set corresponding to each text segment, and calculate an embedded vector of each word segmentation context in the word segmentation context set;
a word context vector generation module, configured to perform word-embedding clustering on the word segmentation context set according to the embedded vectors to obtain context clusters, calculate the probability distribution of each word segmentation context over the context clusters, and generate a word context vector for each word segmentation context according to the probability distribution;
wherein the probability distribution of each word segmentation context over the context clusters is calculated by the following formula:
p_{i,k} = 1 / \sum_{j=1}^{C} ( \|x_i - c_k\| / \|x_i - c_j\| )^{2/(m-1)}
wherein p_{i,k} represents the probability that the i-th word segmentation context belongs to the k-th context cluster, C represents the total number of context clusters, c_k and c_j represent the cluster centers of the k-th and j-th context clusters respectively, x_i represents the clustering vector corresponding to the i-th word segmentation context, and m represents a preset clustering parameter;
a text reconstruction module, configured to construct a text vector of the question text according to the word context vectors, perform anisotropy elimination on the text vector to obtain a target vector of the question text, and perform text reconstruction on the question text according to the target vector to obtain a target reconstructed text of the question text;
a word embedding matrix attention interaction module, configured to acquire a pre-constructed answer corpus, respectively construct word embedding matrices corresponding to the answer corpus and the target reconstructed text, and perform attention interaction on the word embedding matrices to obtain an interaction matrix of the answer corpus and the target reconstructed text;
and a target answer corpus determination module, configured to calculate the similarity between the target reconstructed text and the answer corpus according to the interaction matrix, and determine the target answer corpus corresponding to the question text according to the similarity.
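To show how the five modules of claim 10 would compose at run time, here is a hypothetical glue function; every attribute on `pipeline` is a stand-in for the corresponding module, not an API defined by the patent.

```python
def answer_question(question, answer_corpus, pipeline):
    # Hypothetical end-to-end flow mirroring the five system modules.
    segments = pipeline.segment(question)                  # embedded vector calculation module
    word_vectors = [pipeline.word_context_vector(s)        # word context vector generation module
                    for s in segments]
    target_text = pipeline.reconstruct(                    # text reconstruction module
        pipeline.eliminate_anisotropy(pipeline.text_vector(word_vectors)))
    scores = [(pipeline.similarity(target_text, a), a)     # attention interaction and
              for a in answer_corpus]                      # similarity calculation modules
    return max(scores, key=lambda pair: pair[0])[1]        # target answer corpus
```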
CN202410057741.7A 2024-01-12 2024-01-12 Intelligent question-answering method and system based on large model Active CN117874202B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410057741.7A CN117874202B (en) 2024-01-12 2024-01-12 Intelligent question-answering method and system based on large model

Publications (2)

Publication Number Publication Date
CN117874202A true CN117874202A (en) 2024-04-12
CN117874202B (en) 2024-08-30

Family

ID=90580823

Country Status (1)

Country Link
CN (1) CN117874202B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033318A (en) * 2018-07-18 2018-12-18 北京市农林科学院 Intelligent answer method and device
CN110309283A (en) * 2019-06-28 2019-10-08 阿里巴巴集团控股有限公司 A kind of answer of intelligent answer determines method and device
CN110390001A (en) * 2019-06-04 2019-10-29 深思考人工智能机器人科技(北京)有限公司 A kind of viewpoint type machine reads the implementation method understood, device
US20220121824A1 (en) * 2019-11-25 2022-04-21 Boe Technology Group Co., Ltd. Method for determining text similarity, method for obtaining semantic answer text, and question answering method
CN115204139A (en) * 2022-06-23 2022-10-18 工银科技有限公司 Text matching processing method and device

Also Published As

Publication number Publication date
CN117874202B (en) 2024-08-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant