
CN113688231B - Abstract extraction method and device of answer text, electronic equipment and medium

Info

Publication number
CN113688231B
Authority
CN
China
Prior art keywords: text, answer, model, answer text, type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110881696.3A
Other languages
Chinese (zh)
Other versions
CN113688231A (en)
Inventor
花新宇
代文
陈帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd and Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202110881696.3A
Publication of CN113688231A
Application granted
Publication of CN113688231B
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34Browsing; Visualisation therefor
    • G06F16/345Summarisation for human users
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3334Selection or weighting of terms from queries, including natural language queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present application provide a method, an apparatus, an electronic device, and a medium for extracting a summary of answer text. The method includes: acquiring a first answer text with a first text length; determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text; and processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length and the second answer text expresses the same gist as the first answer text.

Description

Abstract extraction method and device of answer text, electronic equipment and medium
Technical Field
The present application relates to the field of natural language processing technologies, and in particular, to a method and apparatus for extracting a summary of answer text, an electronic device, and a medium.
Background
With the popularization of the internet, people increasingly search it for answers to their questions. However, a search often returns a large amount of matching answer information, so a user needs to spend considerable time browsing it to identify the effective answer, which makes for a poor user experience.
Extracting a summary of the answer information therefore becomes particularly important.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a method and apparatus for extracting a summary of answer text, an electronic device, and a storage medium.
According to a first aspect of an embodiment of the present disclosure, there is provided a method for extracting a summary of answer text, including:
acquiring a first answer text with a first text length;
determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
and processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length and the second answer text expresses the same gist as the first answer text.
In one embodiment, the candidate models include at least one of:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model for extracting at least one keyword and/or key sentence already present in the first answer text to form the second answer text;
a comprehensive model comprising the extraction model and the generative model arranged in sequence, for processing the first answer text through the extraction model and then the generative model to form the second answer text.
In one embodiment, the determining the target model from the plurality of candidate models according to the text type and/or the first text length of the first answer text includes at least one of:
if the text type of the first answer text is a first text type and the first text length is in a first interval range, determining that the target model is the generative model;
if the first answer text is of the first text type and the first text length is in a second interval range, determining that the target model is the extraction model;
if the first answer text is of the first text type and the first text length is in a third interval range, determining that the target model is the comprehensive model;
wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range.
In one embodiment, the determining the target model from the multiple candidate models according to the text type and/or the first text length of the first answer text further includes:
if the first answer text is of a second text type or a third text type and the first text length is in a fourth interval range, determining that the target model is the extraction model.
In one embodiment, generating the second answer text by the generative model based on the content of the first answer text includes:
performing word segmentation processing on the first answer text, and determining word segmentation positions in the first answer text;
inserting a predetermined separator at a word segmentation location in the first answer text;
and inputting the first answer text with the separators inserted into the generative model to obtain the second answer text.
In one embodiment, the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In one embodiment, extracting, by the extraction model, at least one keyword and/or key sentence already present in the first answer text to form the second answer text includes:
splitting the first answer text into N sentences, and selecting M sentences from the N sentences, wherein M is not greater than N;
determining candidate keywords of the M sentences;
ranking the importance of the candidate keywords;
selecting, as keywords, a preset number of the candidate keywords with the highest importance;
and forming the second answer text based on the keywords.
According to a second aspect of the embodiments of the present disclosure, there is provided an apparatus for extracting a summary of answer text, including:
the acquisition module is used for acquiring a first answer text with a first text length;
the model determining module is used for determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text;
and the processing module is used for processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length and the second answer text expresses the same gist as the first answer text.
In one embodiment, the candidate models include at least one of:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model for extracting at least one keyword and/or key sentence already present in the first answer text to form the second answer text;
a comprehensive model comprising the extraction model and the generative model arranged in sequence, for processing the first answer text through the extraction model and then the generative model to form the second answer text.
In one embodiment, the generative model comprises:
the processing unit is used for carrying out word segmentation processing on the first answer text and determining word segmentation positions in the first answer text;
a separator insertion unit for inserting a predetermined separator at each word segmentation position in the first answer text;
and a generating unit for inputting the first answer text with the separators inserted into the generative model to obtain the second answer text.
In one embodiment, the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In one embodiment, the extraction model comprises:
A selecting unit, configured to split the first answer text into N sentences, and select M sentences from the N sentences, where M is not greater than N;
a candidate keyword determining unit configured to determine candidate keywords of the M sentences;
a ranking unit, configured to rank the importance of the candidate keywords;
a keyword selection unit, configured to select, as keywords, a preset number of the candidate keywords with the highest importance;
and a combining unit, configured to form the second answer text based on the keywords.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device comprising the apparatus of the second aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium storing executable instructions for causing a processor to perform the method of the first aspect.
According to the method and apparatus for extracting a summary of answer text, the electronic device, and the storage medium described above, first answer texts of different text types and/or text lengths are processed with different candidate models selected according to the text type and text length of the answer text, and the resulting second answer text serves as the summary. This improves the accuracy of summary extraction from answer text and improves the efficiency with which users obtain information as well as the user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of embodiments of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the embodiments of the disclosure.
FIG. 1 is a flowchart illustrating a first method for extracting a summary of answer text according to an exemplary embodiment;
FIG. 2 is a flowchart illustrating a second method for extracting a summary of answer text according to an exemplary embodiment;
FIG. 3 is a flowchart illustrating a third method for extracting a summary of answer text according to an exemplary embodiment;
FIG. 4 is a flowchart illustrating a fourth method for extracting a summary of answer text according to an exemplary embodiment;
FIG. 5 is a flowchart illustrating a fifth method for extracting a summary of answer text according to an exemplary embodiment;
FIG. 6 is a flowchart illustrating a sixth method for extracting a summary of answer text according to an exemplary embodiment;
FIG. 7 is a schematic diagram of an extraction model according to an exemplary embodiment;
FIG. 8 is a schematic diagram of a generative model according to an exemplary embodiment;
FIG. 9 is a schematic diagram of a comprehensive model according to an exemplary embodiment;
FIG. 10 is a schematic diagram of another generative model according to an exemplary embodiment;
FIG. 11 is a schematic structural diagram of a first apparatus for extracting a summary of answer text according to an exemplary embodiment;
FIG. 12 is a schematic structural diagram of a second apparatus for extracting a summary of answer text according to an exemplary embodiment;
FIG. 13 is a schematic structural diagram of a third apparatus for extracting a summary of answer text according to an exemplary embodiment;
FIG. 14 is a schematic structural diagram of a fourth apparatus for extracting a summary of answer text according to an exemplary embodiment;
FIG. 15 is a block diagram of an apparatus for extracting a summary of answer text according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the embodiments of the present disclosure. Rather, they are merely examples of apparatus and methods consistent with aspects of embodiments of the present disclosure as detailed in the accompanying claims.
The terminology used in the embodiments of the disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the embodiments of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in embodiments of the present disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information and, similarly, second information may also be referred to as first information without departing from the scope of the embodiments of the present disclosure. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
As shown in fig. 1, the present exemplary embodiment provides a method for extracting a summary of answer text, including:
Step S101: acquiring a first answer text with a first text length;
Step S103: determining a target model from a plurality of alternative models according to the text type and/or the first text length of the first answer text;
Step S104: processing the first answer text by using the target model to obtain a second answer text with a second text length, wherein the second text length is shorter than the first text length and the second answer text expresses the same gist as the first answer text.
In this embodiment, the method for extracting the abstract of the answer text may be performed by an apparatus for extracting the abstract of the answer text, which may be integrated in an electronic device (e.g., a server), and may be implemented in the form of hardware and/or software.
In this embodiment, the first answer text may be answer information provided by an interaction system in response to question information. The interaction system may be an intelligent terminal, platform, application, or client capable of providing intelligent interaction services, such as a smart speaker, a smart display speaker, a smart story machine, an intelligent interaction platform, an intelligent interaction application, a search engine, or a question answering system. For example, the interaction system matches the input question information against a preset question-answer library and takes the answer with the highest matching degree as the answer information corresponding to the question information.
As shown in fig. 2, before step S103, the method further includes:
Step S102: a text type of the first answer text is determined.
Here, the text type may be determined according to the first answer text and/or question information corresponding to the first answer text, and the text type includes a fact type text, a cause type text, a method type text, and the like.
In some possible embodiments, a question library may be established in advance, sentence similarity may be computed between the question information corresponding to the first answer text and all question sentences in the question library, and the question type corresponding to the most similar question sentence may be taken as the text type of the first answer text.
For example, different types of question sentences are predefined in the question library, and may include fact-type, cause-type, and method-type question sentences. The fact-type question sentences may include words such as "what", "who", "where", "which", and "including"; the cause-type question sentences may include words such as "why"; and the method-type question sentences may include words such as "how", "what method", and "in what way".
It will be appreciated that the contents given above for the various types of question sentences are only examples of building a question library; in a specific application they may be defined according to the actual application field and scenario.
In other possible embodiments, an answer library may be established in advance, sentence similarity may be computed between the first answer text and all answer sentences in the answer library, and the answer type corresponding to the most similar answer sentence may be taken as the text type of the first answer text.
For example, different types of answer sentences are predefined in the answer library, and may include fact-type, cause-type, and method-type answer sentences. The fact-type answer sentences may include patterns such as "(person, place, or entity) is" and "(person, place, or entity) includes"; the cause-type answer sentences may include words expressing causality such as "cause", "because", and "due to"; and the method-type answer sentences may include patterns such as "the method is", "the way is", and "the steps are".
It will be appreciated that the contents given above for the various types of answer sentences are only examples of building an answer library; in a specific application they may be defined according to the actual application field and scenario.
It will be appreciated that, according to the needs of the practical application, the text type of the first answer text may also be determined in other manners, which is not limited in this embodiment.
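To make the similarity-based type determination above more concrete, the following is a minimal Python sketch. The question bank entries, the word-overlap similarity measure, and all function names are illustrative assumptions; the embodiment does not prescribe any particular similarity algorithm.

```python
# A minimal sketch of text-type determination by sentence similarity
# against a predefined question bank. The bank entries, the similarity
# measure (word-level Jaccard overlap), and the names are assumptions.

QUESTION_BANK = {
    "fact":   ["what is", "who is", "where is", "which", "what does it include"],
    "cause":  ["why", "what is the reason", "what causes"],
    "method": ["how to", "what method", "in what way"],
}

def similarity(a: str, b: str) -> float:
    """Toy similarity: word-level Jaccard overlap."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def determine_text_type(question: str) -> str:
    """Return the type of the bank question most similar to the input."""
    best_type, best_score = "fact", -1.0
    for text_type, patterns in QUESTION_BANK.items():
        for pattern in patterns:
            score = similarity(question, pattern)
            if score > best_score:
                best_type, best_score = text_type, score
    return best_type

print(determine_text_type("why does a typhoon form"))  # -> "cause" with this toy bank
```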
After determining the text type of the first answer text, in step S103, a target model is determined from a plurality of candidate models according to the text type and/or the first text length of the first answer text.
Here, the text type reflects the compositional characteristics of the text to a certain extent, and, for a given target model, the text length directly determines the complexity of the model's processing. The target model for extracting the summary of the answer text is therefore selected along at least one of the two dimensions of text type and text length.
In this embodiment, a candidate model may be any model implementing a text summarization algorithm, for example an extraction model implementing the Lead-3, TextRank, LDA, CNN, and/or BERT algorithms, and/or a generative model implementing the LSTM, ConvS2S, and/or UniLM algorithms.
In some possible embodiments, the target model may be determined from a plurality of candidate models directly according to the text type of the first answer text.
For example, for a fact-type text or a cause-type text, the gist of the fact or cause to be stated is mostly concentrated in the first few or last few sentences of the first answer text. When determining the target model from the candidate models, a model that mainly takes the first few or last few sentences of the first answer text as the second answer text is therefore determined. If the determined target model is an extraction model, the first answer text is processed by the extraction model, and the obtained second answer text consists of the first few or last few sentences of the first answer text.
For another example, for a method-type text, the gist of the method to be stated is dispersed throughout the first answer text. When determining the target model from the candidate models, a model that summarizes the keywords/sentences of the first answer text into the second answer text is therefore determined. If the determined target model is a generative model, the first answer text is processed by the generative model, and the obtained second answer text is a summary of the gist of the first answer text.
In other possible embodiments, the target model may be determined from a plurality of candidate models directly according to the first text length of the first answer text.
For example, if the first text length is less than a first preset threshold, a model that summarizes the keywords/sentences of the first answer text into the second answer text is determined when selecting the target model from the candidate models. If the determined target model is a generative model, the first answer text is processed by the generative model, and the obtained second answer text is a summary of the gist of the first answer text.
For another example, if the first text length is greater than the first preset threshold, a model that mainly extracts keywords/sentences from the first answer text to form the second answer text is determined. If the determined target model is an extraction model, the first answer text is processed by the extraction model, and the obtained second answer text consists of the extracted keywords/sentences.
In still other possible embodiments, the target model may be determined from a plurality of candidate models in combination with the text type and the text length of the first answer text.
For example, for a method-type text whose first text length is greater than the first preset threshold, when determining the target model from the candidate models, a first model that extracts keywords/sentences from the first answer text is determined first, an intermediate answer text is produced by the first model, and a second model is then determined for summarizing the intermediate answer text into the second answer text. The final target model thus comprises the first model and the second model arranged in sequence.
It will be appreciated that the above are merely some examples of determining a target model from a plurality of candidate models based on text type and/or text length; in a specific application, a suitable target model may be selected according to the actual requirements.
After the target model is determined, the target model is used for processing the first answer text, and a second answer text with a second text length is obtained.
That is, the first answer text is used as the input of the target model, and the target model outputs the second answer text after processing. The second answer text is a summary of the first answer text, so the second text length of the second answer text is shorter than the first text length of the first answer text, and the second answer text expresses the same gist as the first answer text.
According to this method for extracting a summary of answer text, first answer texts of different text types and/or text lengths are processed with different candidate models selected according to the text type and text length of the answer text, and the resulting second answer text serves as the summary. This improves the accuracy of summary extraction from answer text and improves the efficiency with which users obtain information as well as the user experience.
In some possible embodiments, before the processing of the first answer text using the target model in step S103, the method further includes:
and filtering the first answer text.
Specifically, noise information contained in the first answer text is filtered out. The noise information may include hypertext markup language tags, garbled characters, redundant punctuation, and/or colloquial words and sentences. Filtering the first answer text can effectively improve the accuracy of the extracted summary.
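A rough Python sketch of such a pre-processing filter is given below. The regular expressions and the list of colloquial fillers are illustrative assumptions only.

```python
import re

# A sketch of the pre-processing filter: strips HTML tags, control or
# garbled characters, runs of punctuation, and spoken-language fillers.
# The exact patterns and the filler list are assumptions.

FILLER_WORDS = ("um", "uh", "you know")  # illustrative spoken-language noise

def filter_answer_text(text: str) -> str:
    text = re.sub(r"<[^>]+>", "", text)               # HTML markup
    text = re.sub(r"[\x00-\x08\x0b-\x1f]", "", text)  # control characters
    text = re.sub(r"([!?.,;])\1+", r"\1", text)       # repeated punctuation
    for filler in FILLER_WORDS:
        text = re.sub(rf"\b{re.escape(filler)}\b", "", text, flags=re.I)
    return re.sub(r"\s{2,}", " ", text).strip()       # collapse extra whitespace

print(filter_answer_text("<p>Well, um, typhoons are tropical cyclones!!!</p>"))
```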
In some possible implementations, the candidate models include at least one of:
a generative model for generating the second answer text based on the content of the first answer text; for example, the generative model rewrites and reorganizes the first answer text based on an understanding of its semantics, generating a more concise and more general second answer text;
an extraction model for extracting at least one keyword and/or key sentence already present in the first answer text to form the second answer text; for example, the extraction model directly picks important phrases or sentences (i.e., keywords or key sentences) from the first answer text and combines them to form the second answer text;
a comprehensive model comprising the extraction model and the generative model arranged in sequence, for processing the first answer text through the extraction model and then the generative model to form the second answer text; for example, the first answer text is first processed by the extraction model to obtain an intermediate answer text, and the intermediate answer text is then processed by the generative model to obtain the second answer text.
In this embodiment, the third text length of the intermediate answer text is not less than the second preset threshold, where the third text length is K times the second text length, and K is greater than 1.
In some possible implementations, if the determined target model is a generative model and the second answer text is generated based on the generative model, the method further includes:
judging, by using a semantic matching model, whether the semantics of the second answer text are consistent with the semantics of the first answer text;
and if they are inconsistent, processing the first answer text by using the extraction model to obtain a third answer text, and outputting the third answer text.
In some possible implementations, when determining the target model from the plurality of candidate models, at least one of the following may be employed:
if the text type of the first answer text is a first text type and the first text length is in a first interval range, determining that the target model is the generative model;
if the first answer text is of the first text type and the first text length is in a second interval range, determining that the target model is the extraction model;
if the first answer text is of the first text type and the first text length is in a third interval range, determining that the target model is the comprehensive model;
wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range.
In this embodiment, when the text type of the first answer text is the first text type, a suitable candidate model is adopted as the target model according to the first text length of the first answer text.
Here, the first section range, the second section range, and the third section range are divided based on the size of the text length, and the object model is determined according to the divided section ranges. For example, the first interval range is not more than 300 words, the second interval range is more than 300 words and less than 1000 words, and the third interval range is not less than 1000 words.
The first text type may be a fact-type text or a cause-type text, preferably a fact-type text.
Specifically, for example, when the first answer text is a fact-type text and the first text length is not more than 300 words, the target model is determined to be the generative model; the generative model rewrites and reorganizes the first answer text based on an understanding of its semantics, generating a more concise and more general second answer text.
For another example, when the first answer text is a fact-type text and the first text length is not less than 1000 words, the target model is determined to be the comprehensive model: the first answer text is processed by the extraction model to obtain an intermediate answer text, and the intermediate answer text is then processed by the generative model to obtain the second answer text.
In some possible embodiments, if the first answer text is of the second text type or the third text type and the first text length is within the fourth interval range, the target model is determined to be the extraction model.
In this embodiment, when the text type of the first answer text is the second text type or the third text type, a suitable candidate model is adopted as the target model according to the first text length of the first answer text.
Here, the fourth section range, the fifth section range, and the sixth section range are divided based on the size of the text length, and the object model is determined according to the divided section ranges. And the minimum value of the fourth interval range is larger than or equal to the maximum value of the fifth interval range, and the minimum value of the sixth interval range is larger than or equal to the maximum value of the fourth interval range. For example, the fifth section range is not more than 500 words, the fourth section range is more than 500 words and less than 1000 words, and the sixth section range is not less than 1000 words.
It is understood that the fourth, fifth, and sixth section ranges divided herein are not directly associated with the first, second, and third section ranges divided for the first text type.
The second text type and the third text type may be causal text or methodological text.
Specifically, for example, when the first answer text is a method-type text and the first text length is greater than 500 words and less than 1000 words, the target model is determined to be the extraction model; the extraction model directly picks important phrases or sentences (i.e., keywords or key sentences) from the first answer text and combines them to form the second answer text.
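Putting the interval ranges of the examples above together, a model-selection routine might look like the following Python sketch. The thresholds (300/1000 words for the first text type, 500/1000 words otherwise) are taken from the examples above, the model names are placeholders, and the fallback branch is an assumption.

```python
# A sketch of target-model selection using the example interval ranges
# above. The concrete thresholds are only examples in the embodiment.

def select_target_model(text_type: str, text_length: int) -> str:
    if text_type == "fact":                      # first text type
        if text_length <= 300:                   # first interval range
            return "generative"
        if text_length < 1000:                   # second interval range
            return "extraction"
        return "comprehensive"                   # third interval range
    # cause-type or method-type text (second / third text type)
    if 500 < text_length < 1000:                 # fourth interval range
        return "extraction"
    # other lengths are not specified here; fall back to extraction (assumption)
    return "extraction"

print(select_target_model("fact", 1500))   # -> "comprehensive"
print(select_target_model("method", 800))  # -> "extraction"
```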
In some possible embodiments, as shown in fig. 3, generating the second answer text by the generative model based on the first answer text includes:
step S201: performing word segmentation processing on the first answer text, and determining word segmentation positions in the first answer text;
step S202: inserting a predetermined separator at a word segmentation location in the first answer text;
Step S203: inputting the first answer text with the separators inserted into the generative model to obtain the second answer text.
In step S201 above, word segmentation is performed on the first answer text and the word segmentation positions in the first answer text are recorded, with one word lying between two adjacent segmentation positions. Meanwhile, following the BERT convention, a [CLS] symbol may be placed at the beginning of a sentence of the first answer text, a [SEP] symbol may be used to separate two adjacent sentences, and a [SEP] symbol may be placed at the end of the first answer text.
For example, suppose the first answer text is "A typhoon belongs to one kind of tropical cyclone, which is a low-pressure vortex occurring over tropical or subtropical ocean surfaces and is a powerful and deep 'tropical weather system'. … Those in which the wind reaches level 12 or above are collectively called typhoons." Through word segmentation, the first answer text can be segmented into "[CLS] typhoon / belongs to / tropical / cyclone / one of / tropical / cyclone / is / occurring / over / tropical / or / subtropical / ocean surface / low-pressure / vortex / is / a / powerful / and / deep / tropical / weather / system / … / wherein / wind / reaches / level 12 / or above / collectively / called / typhoon [SEP]", where "/" marks a word segmentation position: one word lies between the [CLS] or [SEP] symbol and "/", and one word also lies between two adjacent "/".
In step S202, the separator is a symbol that divides two adjacent words in a sentence. The specific form of the separator can vary and may be set according to the actual situation; for example, it may be "/" or "[SEW]". One word lies between two adjacent separators; each such word carries one unit of semantic information, which further indicates that the characters within a word are strongly associated with one another while the association between different words is weaker.
If the separator is "[SEW]", then after inserting the predetermined separator at each word segmentation position, the example sentence becomes "[CLS] typhoon [SEW] belongs to [SEW] tropical [SEW] cyclone [SEW] one of [SEW] … [SEW] collectively [SEW] called [SEW] typhoon [SEP]"; that is, every pair of adjacent words is separated by "[SEW]".
In step S203, the first answer text with the separator inserted is input into the generative model, and the second answer text is obtained.
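The following Python sketch illustrates steps S201 to S203. Whitespace tokenization stands in for a real word segmenter, and generative_model is a placeholder callable; both are assumptions for illustration.

```python
# A minimal sketch of steps S201-S203: segment the first answer text,
# insert the separator "[SEW]" between adjacent words, and wrap the
# result with [CLS]/[SEP]. Whitespace splitting stands in for a real
# Chinese word segmenter; generative_model is a placeholder.

SEPARATOR = "[SEW]"

def insert_separators(first_answer_text: str) -> str:
    words = first_answer_text.split()          # S201: word segmentation (toy)
    body = f" {SEPARATOR} ".join(words)        # S202: separator at each boundary
    return f"[CLS] {body} [SEP]"

def summarize(first_answer_text, generative_model):
    model_input = insert_separators(first_answer_text)
    return generative_model(model_input)       # S203: feed into generative model

print(insert_separators("typhoon belongs to tropical cyclone"))
# -> "[CLS] typhoon [SEW] belongs [SEW] to [SEW] tropical [SEW] cyclone [SEP]"
```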
In some possible implementations, the generative model is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In this embodiment, the BERT-based language model models the first answer text entirely with the attention mechanism: it computes the correlation of each word with all the other words in the first answer text, treats these word-to-word correlations as reflecting, to a certain extent, the relatedness and importance of the different words, adjusts the importance (weight) of each word accordingly, and finally obtains a new expression of each word.
Finally, based on the new expression of each word, the second answer text corresponding to the first answer text is determined, which improves the accuracy of the second answer text.
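As an illustration of the attention mechanism described above (not of BERT itself), the following toy sketch re-expresses each word vector as a correlation-weighted mixture of all word vectors.

```python
import numpy as np

# A toy self-attention step: each word's vector is re-expressed as a
# weighted mixture of all word vectors, with weights given by pairwise
# correlations (scaled dot products). Illustrative only.

def self_attention(word_vectors: np.ndarray) -> np.ndarray:
    d = word_vectors.shape[-1]
    scores = word_vectors @ word_vectors.T / np.sqrt(d)   # pairwise correlations
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # softmax per word
    return weights @ word_vectors                         # new expression of each word

x = np.random.randn(5, 8)          # 5 words, 8-dimensional embeddings
print(self_attention(x).shape)     # (5, 8)
```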
In some possible embodiments, as shown in fig. 4, extracting, by the extraction model, at least one keyword and/or key sentence already present in the first answer text to form the second answer text includes:
Step S301: splitting the first answer text into N sentences, and selecting M sentences from the N sentences, wherein M is not more than N;
step S302: determining candidate keywords of the M sentences;
Step S303: ranking the importance of the candidate keywords;
Step S304: selecting a preset number of candidate keywords with high importance as keywords;
Step S305: and forming the second answer text based on the keywords.
In this embodiment, by directly extracting a part of the sentences (M sentences) from the N sentences of the first answer text for further processing, the computational complexity of the whole model can be effectively reduced. Here, the extraction may take the first M sentences of the first answer text, the last M sentences of the first answer text, and so on.
In step S302, candidate keywords may be determined by a method of extracting keywords in the related art.
After the candidate keywords are determined, an importance value is determined for each candidate keyword to represent its importance. Here, the importance value of a candidate keyword may be determined from its frequency of occurrence in the first answer text, where a higher frequency gives a larger importance value and thus a higher importance; or it may be determined from a preset correspondence between keywords and importance values.
Finally, the candidate keywords are ranked by importance value, a preset number of the most important candidate keywords are selected as keywords, and the second answer text is formed based on the selected keywords.
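The following Python sketch illustrates steps S301 to S305 with the frequency-based importance described above. The sentence splitter, the stopword list, the values of M and k, and the way the selected keywords are turned into the second answer text (keeping the sentences that contain them) are illustrative assumptions.

```python
import re
from collections import Counter

# A sketch of steps S301-S305: split into N sentences, keep the first M,
# score candidate keywords by frequency in the full text, keep the top-k,
# and join the selected sentences that contain them.

STOPWORDS = {"the", "a", "is", "of", "and", "to", "in", "on"}

def extractive_summary(first_answer_text: str, m: int = 3, top_k: int = 5) -> str:
    sentences = [s.strip() for s in re.split(r"[.;!?]", first_answer_text) if s.strip()]
    selected = sentences[:m]                               # S301: first M of N sentences
    candidates = [w.lower() for s in selected for w in re.findall(r"\w+", s)
                  if w.lower() not in STOPWORDS]           # S302: candidate keywords
    freq = Counter(w.lower() for w in re.findall(r"\w+", first_answer_text))
    ranked = sorted(set(candidates), key=lambda w: freq[w], reverse=True)  # S303: rank
    keywords = set(ranked[:top_k])                         # S304: top-k keywords
    kept = [s for s in selected
            if keywords & {w.lower() for w in re.findall(r"\w+", s)}]
    return ". ".join(kept)                                 # S305: second answer text
```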
A specific example is provided below in connection with any of the embodiments described above:
the embodiment of the disclosure is applied to an open domain answer scene in intelligent question answering, and the basic flow is as shown in fig. 5 and 6:
a1: after the question-answering system obtains the question information (query) of the user, retrieving the final answer (i.e., the first answer text) from the library;
a2: for answers with longer length, judging the text type of the answer through a text analysis module (i.e. a model determining module) and judging which abstract module (i.e. a target model) the answer is applicable to;
a3: issuing the query and the first answer text to a summary module, and processing to obtain a summary result (namely, a second answer text);
a4: if the abstract module is a generated model, a verification module (such as a semantic matching model) is further required to judge whether the semantics of the second answer text and the semantics of the first answer text are consistent, if not, a third answer text is obtained by using an extraction model for processing, and the third answer text is used as the abstract of the first answer text.
Here, the text analysis module is used for analyzing the text type and the text length of the first answer text: by analyzing the query and the first answer text, information such as the type and length of the answer text is obtained. The analysis determines the main question type, i.e., whether it asks "what" (fact-type text), "why" (cause-type text), or "how" (method-type text), so that first answer texts of different text types and text lengths pass through different summary modules. Experimental comparison suggests the following scheme: for fact-type text between 200 and 500 words, the extraction model is used; for fact-type text between 100 and 300 words, the generative model is used; and for fact-type text longer than 1000 words, the comprehensive model is used. Cause-type text and method-type text are mainly paragraph text without line breaks; for these two types, with text lengths between 300 and 1000 words, the extraction model is usually used.
The summary module is used for extracting the second answer text based on the first answer text and includes the generative model, the extraction model, and the comprehensive model.
For the extraction model, the first problem is the choice of sentence granularity. In this embodiment, commas, periods, semicolons, and the like are selected as sentence breaks, so that texts that are semantically coherent and as short as possible can be extracted.
For fact-type text, in order to guarantee semantic coherence, the extraction model fuses the Lead-3 algorithm with the TextRank algorithm, using BM25 scores as weights and word vectors as weights, as shown in fig. 7. Meanwhile, to ensure that key core sentences are not lost, the longest common subsequence matching algorithm is used to match similar spans between the query and the first answer text, and the positions of the core sentences are determined from the matched spans.
Here, the related algorithms are described as follows:
Lead-3 algorithm: after dividing the text at clause granularity, the first three clauses are selected. Fact-type text has a higher information density, its key information is generally concentrated in the first few segments of the text, and the later parts provide supplementary information;
TextRank algorithm: a graph-based ranking algorithm for text. The text is divided into several sentences, a graph model is built, and the important components of the text are scored by a voting mechanism. The extraction results of TextRank are more scattered and can pick up sentences distributed throughout the answer text;
Longest common subsequence matching algorithm: ensures that the extracted key core sentences are not lost.
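The following Python sketch illustrates two of these pieces: Lead-3 clause selection and locating a core sentence by longest-common-subsequence overlap with the query. The clause splitter and the character-level LCS are illustrative assumptions.

```python
import re

# A sketch of Lead-3 clause selection and of locating the core sentence
# by longest-common-subsequence (LCS) overlap with the query.

def split_clauses(text: str):
    # commas, periods and semicolons are used as clause breaks (see above)
    return [c.strip() for c in re.split(r"[,.;]", text) if c.strip()]

def lead_3(text: str):
    return split_clauses(text)[:3]

def lcs_length(a: str, b: str) -> int:
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if ca == cb
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[len(a)][len(b)]

def core_sentence(query: str, text: str) -> str:
    """Clause of the answer text sharing the longest common subsequence with the query."""
    return max(split_clauses(text), key=lambda c: lcs_length(query, c))
```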
For the generative model, the main problems are: training data is scarce, out-of-vocabulary words appear during generation, and the output may not be faithful to the original text. Meanwhile, the first answer text covers many vertical categories, such as daily life, science and technology, and so on, and the writing styles of these categories differ.
Therefore, to handle the subdivided categories more effectively, a pre-trained BERT model is used as the initialization model. Meanwhile, to cope with the fact that the BERT model is not directly suitable for generative tasks, its attention matrix is modified to support generation. In this way, by labeling only a small amount of data per category, a good summary can be generated in each category.
Here, BERT is a pre-trained bidirectional encoder representation model based on Transformers that uses large corpora for self-supervised pre-training on upstream tasks, so that natural language processing tasks can leverage this prior corpus to improve downstream performance. The original BERT cannot be used for generative tasks; UniLM modifies the attention mask matrix so that a single BERT model can perform the Seq2Seq task.
As shown in fig. 8, typical bidirectional attention uses the arrangement shown on the left side of fig. 8; when used for a generation task, fully bidirectional attention leaks information. In the "today's weather" / "sunny" example, only the part to be generated actually needs to be predicted, so its information is masked from the tokens that must not see it, yielding the attention arrangement shown on the right side of fig. 8. The attention within the input part is thus bidirectional while the attention of the output part is unidirectional, which meets the requirement of Seq2Seq. This is the idea proposed in UniLM: a single BERT model can complete the Seq2Seq task, and the generation task can be trained with the pre-trained weights simply by adding a mask of this shape, without modifying the model architecture.
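The following sketch builds such a Seq2Seq attention mask with NumPy: the input part attends bidirectionally, while each output token attends only to the input and to earlier output tokens. The matrix layout (1 = attend, 0 = masked) is an assumed convention for illustration.

```python
import numpy as np

# A sketch of the Seq2Seq attention mask described above. Rows are query
# tokens, columns are key tokens; 1 = attention allowed, 0 = masked.

def seq2seq_attention_mask(src_len: int, tgt_len: int) -> np.ndarray:
    total = src_len + tgt_len
    mask = np.zeros((total, total), dtype=np.int32)
    mask[:, :src_len] = 1                          # every token sees the input part
    tgt = np.tril(np.ones((tgt_len, tgt_len), dtype=np.int32))
    mask[src_len:, src_len:] = tgt                 # output sees only earlier output
    return mask

print(seq2seq_attention_mask(src_len=3, tgt_len=2))
# [[1 1 1 0 0]
#  [1 1 1 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]
```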
Meanwhile, to address out-of-vocabulary words and semantic inconsistency, a sequence labeling model is used to assist generation: some fragments are copied from the original text into the generated summary (i.e., the second answer text).
When labeling data, the longest common subsequence is used to compute the common part of the summary and the original text: any span where the summary and the article overlap is a segment to be kept, the beginning of such a segment is labeled B, the remaining words in the segment are labeled I, and words that do not appear in the summary are labeled O. For example, if the article is "intelligent question answering summary" and the summary is "question answering summary", the article is labeled O O B I I, with the tokens not appearing in the summary labeled O.
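The following Python sketch illustrates this labeling scheme: article tokens lying on the longest common subsequence with the summary are tagged B (segment start) or I (inside), all other tokens O. Word-level tokens and the example strings are assumptions for illustration.

```python
# A sketch of the BIO labelling scheme above, based on the longest
# common subsequence (LCS) between the article and the summary.

def lcs_bio_labels(article_tokens, summary_tokens):
    n, m = len(article_tokens), len(summary_tokens)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if article_tokens[i] == summary_tokens[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    keep = [False] * n                       # which article tokens lie on the LCS
    i, j = n, m
    while i > 0 and j > 0:
        if article_tokens[i - 1] == summary_tokens[j - 1]:
            keep[i - 1] = True
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    labels = []
    for idx, kept in enumerate(keep):
        if not kept:
            labels.append("O")
        elif idx > 0 and keep[idx - 1]:
            labels.append("I")               # continuation of a kept segment
        else:
            labels.append("B")               # start of a kept segment
    return labels

article = ["intelligent", "question", "answering", "summary"]
summary = ["question", "answering", "summary"]
print(lcs_bio_labels(article, summary))   # -> ['O', 'B', 'I', 'I']
```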
In the training stage, a sequence prediction task is added; this copy mechanism based on sequence labeling ensures the fidelity of the summary to the original text and avoids errors in domain-specific terms.
The comprehensive model, as shown in fig. 9, can be applied to fact-type text whose length is greater than 1000 words. Such answer texts are long and their key information is scattered across the article, so using the extraction model or the generative model alone does not work well.
For this problem, the extraction model is first used to obtain a short text (the intermediate answer text), for example within 500 words, and the generative model then processes the short text to obtain the second answer text. The model can ultimately give one of two results: the second answer text finally produced by the generative model is matched against the first answer text by a verification model (a semantic matching model); if the matching degree of the second answer text and the first answer text does not meet a preset condition, the intermediate answer text obtained from the extraction model is used as the final result; if it does, the second answer text is used as the final result.
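The following Python sketch illustrates this extract-then-generate flow with the verification fallback. The three models are placeholder callables, and the matching threshold is an assumed example value.

```python
# A sketch of the comprehensive-model flow with verification: extract an
# intermediate short text, generate a candidate summary from it, then keep
# the candidate only if a semantic-matching model judges it consistent
# with the original answer; otherwise fall back to the extractive result.

def comprehensive_summary(first_answer_text: str,
                          extract_model,      # text -> short text (e.g. within 500 words)
                          generate_model,     # short text -> candidate summary
                          match_model,        # (summary, original) -> score in [0, 1]
                          threshold: float = 0.5) -> str:
    intermediate = extract_model(first_answer_text)
    candidate = generate_model(intermediate)
    if match_model(candidate, first_answer_text) >= threshold:
        return candidate            # semantics judged consistent: use generated summary
    return intermediate             # otherwise use the extractive result
```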
The verification module is used for verifying the degree of matching between the second answer text and the first answer text. To ensure that the generated second answer text is faithful to the original first answer text, the second answer text is checked by the verification module so that the finally generated summary matches the original. Here, the verification module may be a semantic matching model.
As shown in fig. 10, the embodiment of the application uses the pre-trained BERT model, which achieves good results on the downstream task with only a small amount of labeled data. The generated summary (the second answer text) and the original article (the first answer text) are spliced together as a natural language inference task so that context information is obtained, and the pre-trained BERT model applies attention over the original article and the generated sentence to obtain a representation of the context.
As shown in fig. 11, the present exemplary embodiment also provides an apparatus 10 for extracting a summary of answer text, including:
an obtaining module 110, configured to obtain a first answer text with a first text length;
a model determining module 120, configured to determine a target model from a plurality of candidate models 140 according to the text type and/or the first text length of the first answer text;
and a processing module 130, configured to process the first answer text by using the target model to obtain a second answer text with a second text length, where the second text length is shorter than the first text length and the second answer text expresses the same gist as the first answer text.
In some possible embodiments, as shown in fig. 12, the candidate models 140 include at least one of:
A generating model 1401 for generating the second answer text based on the content of the first answer text;
an extraction model 1402 for extracting at least one keyword and/or key sentence already present in the first answer text to form the second answer text;
a comprehensive model 1403, which includes the extraction model and the generative model arranged in sequence and is configured to process the first answer text through the extraction model and then the generative model to form the second answer text.
In some possible embodiments, as shown in fig. 13, the generative model 1401 includes:
a processing unit 14011, configured to perform word segmentation processing on the first answer text, and determine a word segmentation position in the first answer text;
a separator insertion unit 14012, configured to insert a predetermined separator at each word segmentation position in the first answer text;
and a generating unit 14013, configured to input the first answer text with the separators inserted into the generative model to obtain the second answer text.
In some possible implementations, the generative model 1401 is a language model based on Bidirectional Encoder Representations from Transformers (BERT).
In some possible embodiments, as shown in fig. 14, the extraction model 1402 includes:
A selecting unit 14021, configured to split the first answer text into N sentences, and select M sentences from the N sentences, where M is not greater than N;
A candidate keyword determination unit 14022 for determining candidate keywords of the M sentences;
A ranking unit 14023, configured to rank importance of the candidate keywords;
a keyword selection unit 14024, configured to select, as keywords, a preset number of the candidate keywords with the highest importance;
and a combining unit 14025, configured to form the second answer text based on the keywords.
The present exemplary embodiment also provides an electronic device, which includes the apparatus of any one of the above embodiments.
The present exemplary embodiment also provides a computer readable storage medium storing executable instructions for causing a processor to perform the method of any one of the above embodiments.
The computer-readable storage medium may be a storage medium capable of storing program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc; a non-transitory storage medium may be selected.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method and will not be described again here.
In an exemplary embodiment, the modules/units in the apparatus may be implemented by one or more central processing units (CPUs), graphics processing units (GPUs), baseband processors (BPs), application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers (MCUs), microprocessors, or other electronic components for performing the aforementioned methods.
Fig. 15 is a block diagram of an electronic device apparatus 800, according to an example embodiment. For example, apparatus 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 15, apparatus 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the apparatus 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the apparatus 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 806 provides power to the various components of the device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.
The multimedia component 808 includes a screen between the device 800 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the apparatus 800 is in an operational mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor assembly 814 may detect an on/off state of the apparatus 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor assembly 814 may also detect a change in position of the apparatus 800 or a component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the apparatus 800 and other devices. The apparatus 800 may access a wireless network based on a communication standard, such as WiFi, 4G, or 5G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital Signal Processors (DSPs), digital Signal Processing Devices (DSPDs), programmable Logic Devices (PLDs), field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided, such as the memory 804 including instructions executable by the processor 820 of the apparatus 800 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other implementations of the disclosed embodiments will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosed embodiments following, in general, the principles of the disclosed embodiments and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosed embodiments pertain. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosed embodiments being indicated by the following claims.
It is to be understood that the disclosed embodiments are not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the embodiments of the present disclosure is limited only by the appended claims.

Claims (13)

1. A method for extracting an abstract of an answer text, comprising:
acquiring a first answer text from a question-answer library based on input question information;
determining a target model from a plurality of candidate models according to a text type and a first text length of the first answer text, wherein determining the target model comprises at least one of the following: if the text type of the first answer text is a first text type and the first text length is in a first interval range, determining that the target model is a generative model; if the first answer text is of the first text type and the first text length is in a second interval range, determining that the target model is an extraction model; if the first answer text is of the first text type and the first text length is in a third interval range, determining that the target model is a comprehensive model; wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range; and the first text type is a fact-type text or a reason-type text;
when the target model is the comprehensive model, extracting text from the first answer text by using an extraction model in the comprehensive model to obtain an intermediate answer text, and processing the intermediate answer text by using a generative model in the comprehensive model to obtain a second answer text; when a matching degree between the second answer text and the first answer text meets a preset condition, determining that the second answer text is the abstract of the first answer text; otherwise, determining that the intermediate answer text is the abstract of the first answer text; wherein a second text length of the second answer text is shorter than the first text length of the first answer text, and the second answer text has the same topic meaning as the first answer text.
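For readability, the routing rule and the two-stage comprehensive pipeline of claim 1 can be sketched in Python as follows. This is a minimal illustration only: the interval boundaries, the matching-degree metric and its threshold, and the extract/generate callables are assumptions introduced for the example and are not fixed by the claim.

```python
# Minimal sketch of the model routing and the comprehensive pipeline in claim 1.
# Interval boundaries, the matching metric, and the extract/generate callables
# are illustrative assumptions, not values specified by the patent.

FIRST_INTERVAL = (0, 128)              # assumed length ranges (characters)
SECOND_INTERVAL = (128, 512)
THIRD_INTERVAL = (512, float("inf"))   # intervals are ordered, as required by the claim

def select_target_model(text_type: str, text_length: int) -> str:
    """Route a first answer text to a candidate model by type and length."""
    if text_type in ("fact", "reason"):            # the "first text type"
        if FIRST_INTERVAL[0] <= text_length < FIRST_INTERVAL[1]:
            return "generative"
        if SECOND_INTERVAL[0] <= text_length < SECOND_INTERVAL[1]:
            return "extractive"
        return "comprehensive"
    return "extractive"                            # fallback for other types, cf. claim 3

def comprehensive_summarize(first_answer, extract, generate, matching_degree,
                            threshold=0.5):
    """Extract first, then rewrite; keep the rewrite only if it still matches."""
    intermediate = extract(first_answer)           # extraction stage
    second_answer = generate(intermediate)         # generation stage
    if matching_degree(second_answer, first_answer) >= threshold:
        return second_answer                       # rewrite kept as the abstract
    return intermediate                            # fall back to the extractive summary
```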
2. The method of claim 1, wherein the candidate models comprise one of the following:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model for extracting at least one keyword and/or key sentence already present in the first answer text to form the second answer text;
and a comprehensive model comprising the extraction model and the generative model.
3. The method for extracting an abstract of an answer text according to claim 2, wherein said determining a target model from a plurality of candidate models according to the text type and/or the first text length of the first answer text further comprises:
if the first answer text is of a second text type or a third text type and the first text length is in a fourth interval range, determining that the target model is the extraction model.
4. The method for extracting an abstract of an answer text according to claim 2, comprising:
when the target model is the generative model, performing word segmentation on the first answer text and determining word segmentation positions in the first answer text;
inserting a predetermined separator at the word segmentation positions in the first answer text;
and inputting the first answer text with the inserted separators into the generative model to obtain the second answer text.
5. The method of claim 4, wherein the generative model is a language model based on the Bidirectional Encoder Representations from Transformers (BERT).
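Claims 4 and 5 describe segmenting the first answer text, inserting a predetermined separator at the segmentation positions, and feeding the result to a BERT-based generative model. A minimal sketch of the preprocessing step is given below; the use of the jieba segmenter and the "[unused1]" token as the separator are illustrative assumptions, not requirements of the claims.

```python
# Possible realization of the word-segmentation/separator preprocessing
# in claims 4-5. The segmenter (jieba) and the separator token are assumptions.
import jieba  # third-party Chinese word segmentation library

SEPARATOR = "[unused1]"   # assumed BERT vocabulary placeholder used as the separator

def insert_separators(first_answer: str) -> str:
    """Segment the text and insert a predetermined separator at each boundary."""
    tokens = jieba.lcut(first_answer)      # determine word segmentation positions
    return SEPARATOR.join(tokens)          # insert the separator between words

# The separated text would then be fed to a BERT-based generative
# (sequence-to-sequence) model to produce the second answer text, e.g.:
# second_answer = generative_model.generate(insert_separators(first_answer))
```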
6. The method for extracting an abstract of an answer text according to claim 2, comprising:
when the target model is the extraction model, splitting the first answer text into N sentences and selecting M sentences from the N sentences, wherein M is not greater than N;
determining candidate keywords of the M sentences;
ranking the candidate keywords by importance;
selecting, as keywords, a preset number of the candidate keywords with the highest importance;
and forming the second answer text based on the keywords.
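Claim 6 outlines the extractive path: split the first answer text into N sentences, select M of them, rank candidate keywords by importance, and compose the second answer text from the top-ranked keywords. The sketch below assumes jieba for word segmentation and plain term frequency as the importance score; neither choice is specified by the claim.

```python
# Simplified sketch of the extractive path in claim 6. Leading-sentence
# selection, jieba segmentation, and frequency-based importance are
# illustrative assumptions only.
import re
from collections import Counter

import jieba  # third-party Chinese word segmentation library

def extract_summary(first_answer: str, m: int = 3, top_k: int = 5) -> str:
    # Split the first answer text into N sentences.
    sentences = [s for s in re.split(r"[。！？!?.]", first_answer) if s.strip()]
    # Select M sentences from the N sentences (here simply the leading M, M <= N).
    selected = sentences[:m]
    # Determine candidate keywords of the M sentences.
    candidates = Counter()
    for sentence in selected:
        candidates.update(w for w in jieba.lcut(sentence) if len(w) > 1)
    # Rank the candidate keywords by importance (term frequency) and select
    # a preset number of the highest-ranked ones as keywords.
    keywords = [word for word, _ in candidates.most_common(top_k)]
    # Form the second answer text based on the keywords.
    return "，".join(keywords)
```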
7. An apparatus for extracting an abstract of an answer text, comprising:
an acquisition module, configured to acquire a first answer text from a question-answer library based on input question information;
a model determination module, configured to determine a target model from a plurality of candidate models according to a text type and a first text length of the first answer text; the model determination module is specifically configured to determine that the target model is a generative model if the text type of the first answer text is a first text type and the first text length is in a first interval range; determine that the target model is an extraction model if the first answer text is of the first text type and the first text length is in a second interval range; and determine that the target model is a comprehensive model if the first answer text is of the first text type and the first text length is in a third interval range; wherein the minimum value of the second interval range is greater than or equal to the maximum value of the first interval range, and the minimum value of the third interval range is greater than or equal to the maximum value of the second interval range; and the first text type is a fact-type text or a reason-type text;
a processing module, configured to, when the target model is the comprehensive model, extract text from the first answer text by using an extraction model in the comprehensive model to obtain an intermediate answer text, and process the intermediate answer text by using a generative model in the comprehensive model to obtain a second answer text; and to determine that the second answer text is the abstract of the first answer text when a matching degree between the second answer text and the first answer text meets a preset condition, and otherwise determine that the intermediate answer text is the abstract of the first answer text; wherein a second text length of the second answer text is shorter than the first text length of the first answer text, and the second answer text has the same topic meaning as the first answer text.
8. The apparatus for extracting an abstract of an answer text according to claim 7, wherein the candidate models comprise one of the following:
a generative model for generating the second answer text based on the content of the first answer text;
an extraction model for extracting at least one keyword and/or key sentence already present in the first answer text to form the second answer text;
and a comprehensive model comprising the extraction model and the generative model.
9. The apparatus for extracting an abstract of an answer text according to claim 8, wherein the generative model comprises:
a processing unit, configured to perform word segmentation on the first answer text and determine word segmentation positions in the first answer text;
a separator insertion unit, configured to insert a predetermined separator at the word segmentation positions in the first answer text;
and a generation unit, configured to input the first answer text with the inserted separators into the generative model to obtain the second answer text.
10. The apparatus for extracting an abstract of an answer text according to claim 8, wherein the generative model is a language model based on the Bidirectional Encoder Representations from Transformers (BERT).
11. The apparatus for extracting an abstract of an answer text according to claim 9, wherein the extraction model comprises:
a sentence selection unit, configured to split the first answer text into N sentences and select M sentences from the N sentences, wherein M is not greater than N;
a candidate keyword determination unit, configured to determine candidate keywords of the M sentences;
a ranking unit, configured to rank the candidate keywords by importance;
a keyword selection unit, configured to select, as keywords, a preset number of the candidate keywords with the highest importance;
and a composition unit, configured to form the second answer text based on the keywords.
12. An electronic device, characterized in that it comprises the apparatus of any one of claims 7 to 11.
13. A computer readable storage medium storing executable instructions for causing a processor to perform the method of any one of claims 1 to 6.
CN202110881696.3A 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium Active CN113688231B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110881696.3A CN113688231B (en) 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110881696.3A CN113688231B (en) 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN113688231A CN113688231A (en) 2021-11-23
CN113688231B (en) 2024-11-08

Family

ID=78578579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110881696.3A Active CN113688231B (en) 2021-08-02 2021-08-02 Abstract extraction method and device of answer text, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN113688231B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114691858B (en) * 2022-03-15 2023-10-03 电子科技大学 Improved UNILM digest generation method
CN117669512B (en) * 2024-02-01 2024-05-14 腾讯科技(深圳)有限公司 Answer generation method, device, equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026861A (en) * 2019-12-10 2020-04-17 腾讯科技(深圳)有限公司 Text abstract generation method, text abstract training method, text abstract generation device, text abstract training device, text abstract equipment and text abstract training medium
CN112541109A (en) * 2020-12-22 2021-03-23 北京百度网讯科技有限公司 Answer abstract extraction method and device, electronic equipment, readable medium and product

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280112B (en) * 2017-06-22 2021-05-28 腾讯科技(深圳)有限公司 Abstract generation method and device and computer equipment
CN110069623B (en) * 2017-12-06 2022-09-23 腾讯科技(深圳)有限公司 Abstract text generation method and device, storage medium and computer equipment
CN109960790B (en) * 2017-12-25 2023-05-23 北京国双科技有限公司 Abstract generation method and device
CN108681574B (en) * 2018-05-07 2021-11-05 中国科学院合肥物质科学研究院 Text abstract-based non-fact question-answer selection method and system
CN111814466B (en) * 2020-06-24 2024-09-13 平安科技(深圳)有限公司 Information extraction method based on machine reading understanding and related equipment thereof
CN111783428B (en) * 2020-07-07 2024-01-23 杭州叙简科技股份有限公司 Emergency management objective question automatic generation system based on deep learning
CN111858913A (en) * 2020-07-08 2020-10-30 北京嘀嘀无限科技发展有限公司 Method and system for automatically generating text abstract
CN112732899A (en) * 2020-12-31 2021-04-30 平安科技(深圳)有限公司 Abstract statement extraction method, device, server and computer readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026861A (en) * 2019-12-10 2020-04-17 腾讯科技(深圳)有限公司 Text abstract generation method, text abstract training method, text abstract generation device, text abstract training device, text abstract equipment and text abstract training medium
CN112541109A (en) * 2020-12-22 2021-03-23 北京百度网讯科技有限公司 Answer abstract extraction method and device, electronic equipment, readable medium and product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Automatic Text Summarization Technology Based on Deep Learning; Liu Guojing; China Master's Theses Full-text Database (Information Science and Technology); 2021-02-15; pp. I138-2817 *

Also Published As

Publication number Publication date
CN113688231A (en) 2021-11-23

Similar Documents

Publication Publication Date Title
CN110325982B (en) Providing a summary of a multimedia document in a session
CN107832286B (en) Intelligent interaction method, equipment and storage medium
RU2699399C2 (en) System and method for detecting orphan utterances
US11347801B2 (en) Multi-modal interaction between users, automated assistants, and other computing services
US10831796B2 (en) Tone optimization for digital content
US20160247068A1 (en) System and method for automatic question answering
US10192544B2 (en) Method and system for constructing a language model
CN111651996B (en) Digest generation method, digest generation device, electronic equipment and storage medium
CN109710732B (en) Information query method, device, storage medium and electronic equipment
US11361759B2 (en) Methods and systems for automatic generation and convergence of keywords and/or keyphrases from a media
CN109313650A (en) Response is generated in automatic chatting
CN113035199B (en) Audio processing method, device, equipment and readable storage medium
CN113688231B (en) Abstract extraction method and device of answer text, electronic equipment and medium
CN111062221A (en) Data processing method, data processing device, electronic equipment and storage medium
CN108345612A (en) A kind of question processing method and device, a kind of device for issue handling
CN112328793A (en) Comment text data processing method and device and storage medium
CN116127062A (en) Training method of pre-training language model, text emotion classification method and device
CN107424612B (en) Processing method, apparatus and machine-readable medium
CN113673261A (en) Data generation method and device and readable storage medium
CN108197105B (en) Natural language processing method, device, storage medium and electronic equipment
CN117253478A (en) Voice interaction method and related device
CN114417827A (en) Text context processing method and device, electronic equipment and storage medium
CN116913278B (en) Voice processing method, device, equipment and storage medium
US20210374198A1 (en) Interactive suggestions for directed content copy using a deep learning model
CN111368553A (en) Intelligent word cloud picture data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant