CN110765763A - Error correction method and device for speech recognition text, computer equipment and storage medium - Google Patents
- Publication number
- CN110765763A (application CN201910903618.1A)
- Authority
- CN
- China
- Prior art keywords
- word
- corrected
- corpus
- error correction
- fluency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The application relates to a method and a device for correcting errors of a speech recognition text, a computer device and a storage medium. The method comprises the following steps: acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene; if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text; and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word. The method and the device improve the accuracy of user intention identification.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for correcting a speech recognition text, a computer device, and a storage medium.
Background
For enterprise applications, correctly understanding the user's intent is key to improving user satisfaction. In a voice interaction system, user concept representation is performed on the speech recognition result to obtain the user's intent, where user concept representation means expressing the essential characteristics of perceived things by processing the input information.
However, traditional speech recognition technology models speech only from the perspective of pronunciation and grammar, so the speech recognition result can be inaccurate, which in turn lowers the accuracy of user intent recognition.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a method and an apparatus for correcting a speech recognition text, a computer device, and a storage medium, which can improve the accuracy of user intent recognition.
A method of error correction for speech recognized text, the method comprising:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
In one embodiment, the error correction database is constructed in a manner including:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
In one embodiment, the method further comprises:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
In one embodiment, the obtaining of the word to be corrected in the speech recognition text includes:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In one embodiment, the determining, from the error correction database, a correction word corresponding to the word to be corrected includes:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
determining the corrected word among the error corrected candidate words.
In one embodiment, the determining, from the error correction database, an error correction candidate word corresponding to the word to be corrected includes:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In one embodiment, the obtaining the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database includes:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In one embodiment, the determining the corrected word from the error correction candidate words includes:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
In one embodiment, the preset language model is a binary language model and a ternary language model;
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
An apparatus for error correction of speech recognized text, the apparatus comprising:
an obtaining module, configured to acquire fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
the obtaining module is further configured to obtain a word to be corrected in the speech recognition text if the fluency of the speech recognition text is less than a fluency threshold;
and the determining module is used for determining a corrected word corresponding to the word to be corrected from an error correction database and obtaining a corrected voice recognition text according to the corrected word.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
According to the method, the device, the computer equipment and the storage medium for correcting the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model, if the fluency of the voice recognition text is smaller than a fluency threshold value, words to be corrected in the voice recognition text are obtained, the correction words corresponding to the words to be corrected are determined from the correction database, and the corrected voice recognition text is obtained according to the correction words.
Drawings
FIG. 1 is a diagram illustrating an exemplary embodiment of a method for error correction of speech recognition text;
FIG. 2 is a flowchart illustrating a method for error correction of speech recognition text according to one embodiment;
FIG. 3 is a diagram illustrating the operation of a method for error correction of speech recognition text according to one embodiment;
FIG. 4 is a schematic diagram of an error correction database in one embodiment;
FIG. 5 is a flowchart illustrating a method for correcting errors in speech recognition text according to another embodiment;
FIG. 6 is a block diagram showing the structure of an apparatus for correcting a speech-recognized text in one embodiment;
FIG. 7 is a block diagram showing the construction of an apparatus for correcting a speech recognition text in another embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The method for correcting a speech recognition text provided by the present application can be applied to the application environment shown in fig. 1. The terminal 102 or the server 104 obtains the fluency of a speech recognition text by using a preset language model, wherein the preset language model is obtained through corpus training of a first corpus and a second corpus, the first corpus comprises corpora of a general scene, and the second corpus comprises corpora of a preset scene; if the fluency of the speech recognition text is smaller than a fluency threshold, words to be corrected in the speech recognition text are acquired; and a correction word corresponding to the word to be corrected is determined from an error correction database, and a corrected speech recognition text is obtained according to the correction word.
The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, a method for correcting a speech recognition text is provided, which is described by taking the method as an example applied to the terminal in fig. 1, and includes the following steps:
In one embodiment, the first corpus may be a Wikipedia dataset, and the Wikipedia dataset includes 50 million correct expressions conforming to the general scene.
The corpus of the preset scene refers to a corpus applied to a specific scene in each field. The specific scene can be a working scene, such as finance (financial index query, expense reimbursement, enterprise operation data query), approval (business trip approval, leave approval), purchasing (commodity purchase), management (human resource management) and the like. The second corpus includes corpora of preset scenes; in one embodiment, the second corpus can be built from interactive corpora of working scenes in various fields. Since each field has its own professional knowledge, which plays an important role in characterizing the user concept, the second corpus can strengthen the interpretation of the user concept.
The speech recognition text is text data recognized based on input speech. Due to the diversity, complexity, and dialect habits of natural language, different users may express the same thing differently, and thus the text data obtained by recognition may also be different. For example, the input speech may be "how much stock of the warehouse is left", the speech recognition text may be "how much stock of the warehouse is left", and may also be "how much stock of the warehouse is saved".
The preset language model is a mathematical model of the contextual relationship between the words of a sentence. It considers the context spanning at least two words, that is, the occurrence of the next word depends only on the one or more words preceding it. The preset language model includes at least one of a binary (2-gram) language model, a ternary (3-gram) language model, …, and an n-gram language model.
As shown in fig. 3, the preset language model is obtained by training on the corpora in the first corpus and the second corpus. Specifically, a language model training tool is used to train on the corpora in the first corpus and the second corpus to obtain the preset language model. The language model training tool can be SRILM, IRSTLM, BerkeleyLM, KenLM and the like.
Taking the training of a binary language model as an example, the probability that two adjacent words occur together in the first corpus and the second corpus is counted, and the statistical result is stored. To simplify the calculation, the probability may be stored as a base-10 logarithm; e.g., the entry for "our company" may be stored as "our company -1.25". To improve storage efficiency, the storage file may be converted into a binary file.
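The counting step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the stored probability is the maximum-likelihood conditional probability P(w2 | w1) with no smoothing, and the function name `train_bigram` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Return {(w1, w2): log10 P(w2 | w1)} from pre-segmented sentences.

    Assumption: unsmoothed maximum-likelihood estimate; real toolkits such
    as those named in the text apply smoothing and back-off.
    """
    history_counts = Counter()   # how often w1 appears as the first word of a pair
    bigram_counts = Counter()    # how often the pair (w1, w2) appears
    for words in sentences:
        for w1, w2 in zip(words, words[1:]):
            history_counts[w1] += 1
            bigram_counts[(w1, w2)] += 1
    return {
        (w1, w2): math.log10(count / history_counts[w1])
        for (w1, w2), count in bigram_counts.items()
    }

# Toy, already-segmented corpus standing in for the first and second corpora.
corpus = [
    ["our", "company", "account"],
    ["our", "company", "cash"],
    ["our", "team", "account"],
    ["company", "account", "cash"],
]
model = train_bigram(corpus)
print(model[("our", "company")])  # log10(2/3), since 2 of 3 pairs starting with "our" continue with "company"
```

Storing logarithms lets a sentence score be computed by addition instead of multiplying many small probabilities.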
Specifically, the preset language model is first used to detect errors in the speech recognition text. The speech recognition text is input into the preset language model to obtain its fluency, and the fluency is used to judge whether the text contains errors: if the fluency is smaller than a fluency threshold, the speech recognition text is judged to contain errors and is corrected.
In one embodiment, the preset language models are a binary language model and a ternary language model, both obtained by training on the corpora in the first corpus and the second corpus. The speech recognition text is input into both models to obtain two fluency values, and if the maximum of the two fluency values is smaller than the fluency threshold, the speech recognition text is judged to contain errors.
Step 204: if the fluency of the speech recognition text is smaller than a fluency threshold, acquire the words to be corrected in the speech recognition text.
The fluency threshold is used for judging whether errors exist in the speech recognition text, and can be set according to practical application. If the fluency of the voice recognition text is greater than or equal to the fluency threshold, judging that the voice recognition text is correct; and if the fluency is smaller than the fluency threshold value, judging that the voice recognition text has errors, and correcting the voice recognition text.
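The threshold check can be sketched as below. The text does not define how fluency is computed from the model, so this sketch assumes fluency is the average base-10 log probability over adjacent word pairs, with a fixed penalty for unseen pairs; `sentence_fluency`, `needs_correction`, the penalty value and the toy scores are all hypothetical.

```python
def sentence_fluency(words, bigram_logprob, unseen=-5.0):
    """Average log10 bigram score of a segmented sentence (assumed fluency measure)."""
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return unseen
    return sum(bigram_logprob.get(pair, unseen) for pair in pairs) / len(pairs)

def needs_correction(words, bigram_logprob, fluency_threshold):
    """True when fluency is below the threshold, i.e. the text is judged erroneous."""
    return sentence_fluency(words, bigram_logprob) < fluency_threshold

# Toy log10 scores; in practice these come from the trained language model.
bigram_logprob = {("our", "company"): -0.2, ("company", "account"): -0.3}
good = ["our", "company", "account"]
bad = ["our", "company", "a", "count"]

print(needs_correction(good, bigram_logprob, fluency_threshold=-2.0))  # False
print(needs_correction(bad, bigram_logprob, fluency_threshold=-2.0))   # True
```

Averaging rather than summing keeps the score comparable across sentences of different lengths, which makes a single threshold easier to set.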
The words to be corrected are the erroneous text words in the speech recognition text. In one embodiment, the speech recognition text is segmented (using a word segmentation tool, such as the jieba word segmentation tool) to obtain text words. The average absolute deviation value of each text word is calculated; if the average absolute deviation of a text word is greater than a deviation threshold, the text word is judged to be wrong and is taken as a word to be corrected; if the average absolute deviation of a text word is less than or equal to the deviation threshold, the text word is judged to be correct.
The error correction database is used for replacing words to be corrected in the voice recognition text.
In one embodiment, as shown in fig. 3, the error correction database may be constructed from the second corpus. The corpus of the second corpus is acquired and segmented by using a word segmentation dictionary to obtain candidate words, and the error correction database is constructed from the candidate words and the pinyin of the candidate words.
In another embodiment, the error correction database may be constructed from the first corpus and the second corpus. The corpora of the first corpus and the second corpus are acquired and segmented by using a word segmentation dictionary to obtain candidate words, and the error correction database is constructed from the candidate words and the pinyin of the candidate words.
The word segmentation dictionary stores a large number of words and is used for the word segmentation operation. When a candidate word is obtained, the confusion words corresponding to the candidate word are also obtained and added to the word segmentation dictionary to enrich it.
Specifically, error correction candidate words corresponding to the words to be corrected are determined from the error correction database, and further, correction words are determined from the error correction candidate words.
The error correction candidate words corresponding to the word to be corrected may be determined from the error correction database as follows: the pinyin of the word to be corrected is obtained, the similarity between the pinyin of the word to be corrected and the pinyin of each candidate word in the error correction database is obtained, and the candidate words whose similarity is larger than a similarity threshold are taken as error correction candidate words.
The correction word may be determined among the error correction candidate words as follows: the word to be corrected in the speech recognition text is replaced by each error correction candidate word, the fluency of the replaced speech recognition text is calculated by using the preset language model, and the error correction candidate word whose fluency meets a preset condition is taken as the correction word. In one embodiment, the error correction candidate word corresponding to the maximum fluency is taken as the correction word.
Specifically, the words to be corrected in the speech recognition text are replaced by the correction words, so that the corrected speech recognition text is obtained.
According to the method for correcting the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model, if the fluency of the voice recognition text is smaller than the fluency threshold, words to be corrected in the voice recognition text are obtained, correction words corresponding to the words to be corrected are determined from the correction database, and the corrected voice recognition text is obtained according to the correction words.
In one embodiment, the error correction database is constructed in a manner including: obtaining the corpus of the second corpus; performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words; and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
The candidate words refer to words included in the corpus of the second corpus.
Specifically, the corpus of the second corpus is segmented by using a word segmentation tool (such as the jieba word segmentation tool) and the word segmentation dictionary to obtain candidate words. For example, the sentence "how much cash can be withdrawn from the account of our company" is segmented by the word segmentation tool into "we", "company", "account", "can", "withdraw", "how much", "cash".
The pinyin of each candidate word is then acquired, and the candidate words and their pinyin are stored in association. In one embodiment, as shown in FIG. 4, words and pinyin are stored as key-value pairs.
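The key-value storage described above can be sketched as follows. The Chinese words and the small `PINYIN` lookup table are illustrative assumptions added here; a real system would obtain pinyin from a pinyin conversion library and the words from a segmenter, neither of which is specified beyond the text above. The function name `build_error_correction_db` is hypothetical.

```python
# Hypothetical pinyin lookup standing in for a real pinyin converter.
PINYIN = {
    "我们": "women",      # we
    "公司": "gongsi",     # company
    "账户": "zhanghu",    # account
    "现金": "xianjin",    # cash
}

def build_error_correction_db(segmented_corpus, pinyin_of):
    """Map each candidate word to its pinyin (word -> pinyin key-value pairs)."""
    db = {}
    for words in segmented_corpus:
        for word in words:
            if word in pinyin_of:          # skip words with no known pinyin
                db[word] = pinyin_of[word]
    return db

# Already-segmented sentences standing in for the second corpus.
segmented_corpus = [
    ["我们", "公司", "账户"],
    ["公司", "现金", "多少"],   # "多少" is not in the toy lookup and is skipped
]
error_correction_db = build_error_correction_db(segmented_corpus, PINYIN)
print(error_correction_db)
```

Keying by word keeps each candidate unique; a lookup from pinyin to words could be built from the same data if candidates need to be retrieved by pronunciation.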
In the error correction method for the speech recognition text, the error correction database is constructed from the second corpus, which makes the user concept characterization more distinctive and strengthens it.
In one embodiment, the method further comprises: obtaining confusion words corresponding to the candidate words; and adding the confusion word into the word segmentation dictionary.
The confusing word refers to a word with a pronunciation close to or the same as that of the candidate word.
A segmentation dictionary stores a large number of words, which are used for the segmentation operation. When the candidate word is obtained, the confusion word corresponding to the candidate word is obtained, and the confusion word is added into the word segmentation dictionary, so that the resources of the word segmentation dictionary are enriched, and the accuracy of word segmentation of the voice recognition text is improved.
Specifically, the characters in each candidate word are replaced to obtain the confusion words corresponding to the candidate word. In one embodiment, each character in the candidate word is replaced by using a character-level confusion set. For example, the confusion words corresponding to "cash" may be "advanced", "line-in", "current time", etc.
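A minimal sketch of generating confusion words from a character-level confusion set is given below. The Chinese characters, the contents of `char_confusions`, and the choice to combine replacements across all positions are assumptions made for illustration; the text only states that characters are replaced using a confusion set.

```python
from itertools import product

def confusion_words(word, char_confusions):
    """Return every variant of `word` obtained by swapping characters for
    similar-sounding ones from the confusion set (the original word excluded)."""
    # For each position: the original character plus its confusable alternatives.
    options = [[ch] + list(char_confusions.get(ch, [])) for ch in word]
    return {"".join(combo) for combo in product(*options)} - {word}

# Hypothetical confusion set: characters that sound like 现 (xian) and 金 (jin).
char_confusions = {"现": ["先", "线"], "金": ["进", "今"]}

variants = confusion_words("现金", char_confusions)   # 现金 = "cash"
print(sorted(variants))

# Add the confusion words to the word segmentation dictionary (modelled as a set).
segmentation_dictionary = {"现金", "公司"}
segmentation_dictionary |= variants
```

With these entries in the dictionary, a misrecognized word such as 先进 ("advanced") is segmented as one unit, so it can later be flagged and replaced as a whole.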
In the error correction method for the voice recognition text, the confusion words are added into the word segmentation dictionary, so that the resources of the word segmentation dictionary are enriched, and the accuracy of the word segmentation of the voice recognition text is improved.
In one embodiment, the obtaining of the word to be corrected in the speech recognition text includes: performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words; calculating the average absolute deviation value of each text word; and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
Wherein, the word to be corrected refers to the wrong word in the speech recognition text; text words refer to words in speech recognition text; the deviation threshold is used for judging whether the text word is wrong or not, and can be set according to practical application.
Specifically, the speech recognition text is segmented by using a segmentation tool (such as a Chinese word segmentation tool) and a segmentation dictionary to obtain text words. Calculating the average absolute deviation value of each text word, if the average absolute deviation of one text word is greater than a deviation threshold value, judging that the text word is wrong, and taking the text word as a word to be corrected; and if the average absolute deviation of one text word is less than or equal to the deviation threshold value, judging that the text word is correct.
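The deviation test can be sketched as follows. The text does not say which quantity the average absolute deviation is computed over, so this sketch assumes each text word already has a language-model score, and flags a word when its absolute deviation from the sentence mean, divided by the mean absolute deviation, exceeds the threshold. Restricting flags to below-average scores is a further assumption, and `flag_words` and the scores are hypothetical.

```python
def flag_words(words, scores, deviation_threshold):
    """Return the words whose score deviates downward from the sentence mean
    by more than `deviation_threshold` times the mean absolute deviation."""
    mean = sum(scores) / len(scores)
    abs_dev = [abs(s - mean) for s in scores]
    mad = sum(abs_dev) / len(abs_dev)      # mean absolute deviation of the sentence
    if mad == 0:                           # all scores identical: nothing stands out
        return []
    return [
        word
        for word, score, dev in zip(words, scores, abs_dev)
        if score < mean and dev / mad > deviation_threshold
    ]

# Toy per-word log10 scores; "line-in" is far less likely than its neighbours.
text_words = ["our", "company", "line-in", "account"]
word_scores = [-1.0, -1.0, -5.0, -1.0]

print(flag_words(text_words, word_scores, deviation_threshold=1.5))  # ['line-in']
```

Dividing by the mean absolute deviation makes the threshold relative to how much the scores vary within the sentence, so one setting can serve sentences with different overall score levels.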
In the error correction method for the voice recognition text, whether the text word has errors or not is judged according to the average absolute deviation value of the text word, so that the accuracy of error correction is improved.
In one embodiment, the determining, from the error correction database, a correction word corresponding to the word to be corrected includes: determining an error correction candidate word corresponding to the word to be corrected from the error correction database; determining the corrected word among the error corrected candidate words.
The error correction candidate words are a set of words that may be used to replace the word to be corrected. For example, if the speech recognition text is "how many lines can be drawn by our company account", and "line in" is the word to be corrected, then the error correction candidate words may be "advanced", "cash", and so on.
Specifically, the error correction candidate words corresponding to the word to be corrected may be determined from the error correction database through pinyin similarity. The pinyin similarity can be determined by the edit distance of the pinyin.
Specifically, the correction word may be determined among the error correction candidate words as follows: the word to be corrected in the speech recognition text is replaced by each error correction candidate word, the fluency of the replaced speech recognition text is calculated by using the preset language model, and the error correction candidate word whose fluency meets the preset condition is taken as the correction word.
In the method for correcting the voice recognition text, the voice recognition text can be corrected by combining the editing distance of the pinyin and the preset language model, so that the accuracy of selecting the corrected words is further improved.
In one embodiment, the determining, from the error correction database, an error correction candidate word corresponding to the word to be corrected includes: obtaining the pinyin of the word to be corrected; acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database; and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
Because the candidate word and the pinyin of the candidate word are stored in the error correction database, the error correction candidate word can be determined by comparing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database.
Specifically, the similarity between the pinyin of the word to be corrected and the pinyin of a candidate word in the error correction database may be determined by calculating the edit distance between the two, where the edit distance is an index for measuring the similarity between two sequences. For pinyin, the edit distance between two pinyin strings is the minimum number of character edit operations required to convert one into the other. The smaller the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word, the greater the similarity between the word to be corrected and the candidate word; the candidate words whose similarity is greater than the similarity threshold are therefore taken as error correction candidate words.
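The candidate selection can be sketched with the standard Levenshtein edit distance. The normalization used here to turn the distance into a similarity (one minus the distance divided by the longer pinyin length) is an assumption and is not the patent's own formula; the function names and database contents are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))         # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                         # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]

def pinyin_similarity(p0, pi):
    """Assumed similarity in [0, 1]: 1 - distance / length of the longer string."""
    longest = max(len(p0), len(pi))
    return 1.0 if longest == 0 else 1.0 - edit_distance(p0, pi) / longest

def error_correction_candidates(pinyin_to_fix, db, similarity_threshold):
    """Candidate words whose pinyin similarity exceeds the threshold."""
    return [word for word, pinyin in db.items()
            if pinyin_similarity(pinyin_to_fix, pinyin) > similarity_threshold]

# Hypothetical error correction database (word -> pinyin).
db = {"现金": "xianjin", "先进": "xianjin", "公司": "gongsi", "献计": "xianji"}
candidates = error_correction_candidates("xianjin", db, similarity_threshold=0.9)
print(candidates)  # ['现金', '先进']
```

Homophones get similarity 1.0 and always pass; lowering the threshold admits near-homophones such as "xianji", which trades precision for recall.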
In one embodiment, the edit distance of a pinyin is calculated as follows:
where t_0 and t_i are respectively the word to be corrected and a candidate word in the error correction database, len(x) is the number of characters contained in the word x, and len_p(x) is the number of characters contained in the pinyin of the word x.
In the error correction method for the voice recognition text, the error correction candidate words are determined by using the editing distance of the pinyin, so that the accuracy rate of selecting the error correction candidate words is improved.
In one embodiment, the determining the corrected word among the error correction candidate words includes: replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model; and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
The preset language model is used for calculating the probability that a sentence is fluent. The preset language model may be an N-gram language model, where N may be one, two, three, four, etc. An N-gram language model calculates, for a position in a sentence, the probability that the sentence is fluent when each candidate word occupies that position, based on the N-1 words preceding the position. The preset language model may also be a combination of at least two N-gram language models, for example, a combination of a binary (2-gram) language model and a ternary (3-gram) language model.
Specifically, each error correction candidate word is used to replace the word to be corrected in the voice recognition text, the fluency of the replaced voice recognition text is calculated by using the preset language model, and the correction word is determined according to the fluency. Calculating the fluency of the replaced voice recognition text by using the preset language model includes: inputting the replaced voice recognition text into the preset language model to obtain the fluency of the replaced voice recognition text.
The preset conditions are used for screening the correction words from the error correction candidate words and can be set according to practical application. In one embodiment, the error correction candidate word corresponding to the maximum fluency in fluency of the replaced speech recognition text output by the preset language model is used as the correction word.
In the error correction method for the voice recognition text, the correction words are determined through the preset language model, and the accuracy rate of selecting the correction words is improved.
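As a non-limiting sketch, the replace-and-score selection described above may look as follows when the preset condition is "maximum fluency". The `fluency` argument stands in for the preset language model; `toy_fluency` and `KNOWN_BIGRAMS` are hypothetical placeholders used only so that the example runs, and are not part of the described method.

```python
from typing import Callable, List, Sequence


def choose_correction(words: Sequence[str], index: int,
                      candidates: Sequence[str],
                      fluency: Callable[[List[str]], float]) -> str:
    """Replace words[index] with each error correction candidate word and
    return the candidate whose replaced text has the highest fluency.
    Raises ValueError if `candidates` is empty."""
    def score(candidate: str) -> float:
        replaced = list(words[:index]) + [candidate] + list(words[index + 1:])
        return fluency(replaced)
    return max(candidates, key=score)


# Hypothetical stand-in for the preset language model: the share of
# adjacent word pairs that appear in a small hand-written set.
KNOWN_BIGRAMS = {("我", "要"), ("要", "办"), ("办", "信用卡")}


def toy_fluency(words: List[str]) -> float:
    pairs = list(zip(words, words[1:]))
    return sum(pair in KNOWN_BIGRAMS for pair in pairs) / max(len(pairs), 1)
```

Passing the language model in as a callable keeps the selection step independent of how the fluency is actually computed, so the same function can be used with a single N-gram model or with a combination of models.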
In one embodiment, the preset language model is a binary language model and a ternary language model; the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps: respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model; and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
The binary language model and the ternary language model are obtained by training on the corpora of the first corpus and the second corpus.
Specifically, the replaced speech recognition text is input into a preset language model, the fluency output by the binary language model and the fluency output by the ternary language model are obtained, and the maximum value of the two fluency is used as the fluency of the speech recognition text.
In the error correction method for the voice recognition text, the correction words are determined through the binary language model and the ternary language model, and the accuracy rate of selecting the correction words is improved.
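The combination of the two models can be illustrated with the following sketch. `NGramModel` is a toy add-one-smoothed model written for this example; the smoothing, the average log-probability used as the fluency score, and the `<s>` padding token are all assumptions, since the embodiment does not specify them. Only `combined_fluency`, which takes the maximum of the two outputs, corresponds directly to the step described above.

```python
import math
from collections import Counter


class NGramModel:
    """Toy add-one-smoothed N-gram model trained on pre-segmented sentences."""

    def __init__(self, n, sentences):
        self.n = n
        self.ngrams = Counter()    # (context, word) -> count
        self.contexts = Counter()  # context -> count
        vocab = set()
        for words in sentences:
            vocab.update(words)
            for context, word in self._events(words):
                self.ngrams[(context, word)] += 1
                self.contexts[context] += 1
        self.vocab_size = max(len(vocab), 1)

    def _events(self, words):
        """Yield (previous n-1 words, current word) pairs for one sentence."""
        padded = ["<s>"] * (self.n - 1) + list(words)
        for i in range(self.n - 1, len(padded)):
            yield tuple(padded[i - self.n + 1:i]), padded[i]

    def fluency(self, words):
        """Average log-probability per word; higher means more fluent."""
        total = 0.0
        for context, word in self._events(words):
            num = self.ngrams[(context, word)] + 1
            den = self.contexts[context] + self.vocab_size
            total += math.log(num / den)
        return total / max(len(words), 1)


def combined_fluency(words, binary_model, ternary_model):
    """Fluency of the text: the larger of the two model outputs."""
    return max(binary_model.fluency(words), ternary_model.fluency(words))
```

In use, both models would be trained on the same segmented corpora, for example `NGramModel(2, corpus)` and `NGramModel(3, corpus)`, and `combined_fluency` would be called on each replaced text.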
As shown in fig. 5, the method for correcting the error of the speech recognition text in one embodiment is described in detail:
502, acquiring fluency of a voice recognition text by using a preset language model;
and 518, taking the error correction candidate word with the fluency meeting the preset condition as a correction word of the word to be corrected.
According to the method for correcting the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model, if the fluency of the voice recognition text is smaller than the fluency threshold, words to be corrected in the voice recognition text are obtained, correction words corresponding to the words to be corrected are determined from the correction database, and the corrected voice recognition text is obtained according to the correction words.
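The overall flow summarised above can be sketched as follows, purely as an illustration. The three callables (`fluency`, `find_words_to_correct`, `find_correction`) are hypothetical hooks standing in for the preset language model, the detection of words to be corrected, and the lookup in the error correction database; their names and signatures are assumptions of this example.

```python
from typing import Callable, List, Optional, Sequence


def correct_text(words: Sequence[str],
                 fluency: Callable[[Sequence[str]], float],
                 fluency_threshold: float,
                 find_words_to_correct: Callable[[Sequence[str]], List[int]],
                 find_correction: Callable[[List[str], int], Optional[str]]
                 ) -> List[str]:
    """Return the corrected text as a list of words.

    The text is left unchanged when its fluency reaches the threshold;
    otherwise each word flagged for correction is replaced by its
    correction word, if one is found."""
    if fluency(words) >= fluency_threshold:
        return list(words)
    corrected = list(words)
    for index in find_words_to_correct(words):
        correction = find_correction(corrected, index)
        if correction is not None:
            corrected[index] = correction
    return corrected
```

Checking the fluency first means that texts which are already fluent skip detection and database lookup entirely.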
It should be understood that although the steps in the flowcharts of fig. 2 and 5 are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2 and 5 may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided an apparatus 600 for correcting a speech recognition text, including: an obtaining module 602 and a determining module 604, wherein:
an obtaining module 602, configured to obtain fluency of a speech recognition text by using a preset language model, where the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus includes corpora of a general scene, and the second corpus includes corpora of a preset scene;
the obtaining module 602 is further configured to obtain a word to be corrected in the speech recognition text if the fluency of the speech recognition text is less than a fluency threshold;
a determining module 604, configured to determine a correction word corresponding to the word to be corrected from an error correction database, and obtain a corrected speech recognition text according to the correction word.
The error correction device 600 for the voice recognition text obtains the fluency of the voice recognition text by using the preset language model; if the fluency of the voice recognition text is smaller than the fluency threshold, it obtains the words to be corrected in the voice recognition text, determines the correction words corresponding to the words to be corrected from the error correction database, and obtains the corrected voice recognition text according to the correction words. Wrong words in the voice recognition text are thereby detected and corrected, which improves the accuracy of the voice recognition text. In addition, the preset language model is trained, and the error correction database is built, by using the second corpus, so that the concept representations of users are distinguished and enhanced and the accuracy of recognizing the intentions of the users is improved.
In an embodiment, as shown in fig. 7, the apparatus 600 for correcting a speech recognition text further includes a word segmentation module 606 and a construction module 608, where the obtaining module 602 is further configured to obtain corpora of the second corpus; the word segmentation module 606 is configured to perform word segmentation on the corpora of the second corpus by using a word segmentation dictionary to obtain candidate words; and the construction module 608 is configured to construct the error correction database according to the candidate words and the pinyin of the candidate words.
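A minimal sketch of how such a database might be built is given below. Forward maximum matching is used as the segmentation strategy only because it is simple to show; the embodiment does not state which segmentation algorithm is used. `TOY_PINYIN` and `to_pinyin` are hypothetical placeholders for a real pinyin conversion component, and the dictionary-based layout of the database is an assumption.

```python
def segment(text, dictionary, max_len=4):
    """Forward maximum matching: at each position take the longest piece
    found in the word segmentation dictionary, falling back to one character."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


# Hypothetical per-character pinyin table, for illustration only.
TOY_PINYIN = {"办": "ban", "信": "xin", "用": "yong", "卡": "ka"}


def to_pinyin(word):
    """Concatenate the pinyin of each character (toy lookup)."""
    return "".join(TOY_PINYIN[ch] for ch in word)


def build_error_correction_db(corpus, dictionary, pinyin_of=to_pinyin):
    """Map every candidate word obtained by segmenting the corpus
    to its pinyin string."""
    database = {}
    for sentence in corpus:
        for word in segment(sentence, dictionary):
            database[word] = pinyin_of(word)
    return database
```

With the dictionary `{"信用卡"}`, the sentence 办信用卡 is segmented into 办 and 信用卡, and the resulting database stores "ban" and "xinyongka" for the two candidate words.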
In an embodiment, the apparatus 600 for correcting a speech recognition text further includes an adding module, and the obtaining module 602 is further configured to obtain confusion words corresponding to the candidate words; the adding module is configured to add the confusion words into the word segmentation dictionary.
In an embodiment, the obtaining module 602 is further configured to perform word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words; calculating the average absolute deviation value of each text word; and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
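This passage does not spell out how the average absolute deviation value is computed, so the following is only one possible reading, offered as a sketch. It assumes that each text word already has a numeric score (for example, a language-model score for that word), measures how far each score lies from the mean score, and normalises that distance by the mean absolute deviation of all scores. The function name, the `scores` input and the normalisation are all assumptions of this example.

```python
def words_to_correct(words, scores, deviation_threshold):
    """Flag words whose score deviates strongly from the rest.

    Assumed reading (not taken from the source): for each word, compute
    |score - mean| divided by the mean absolute deviation of all scores,
    and flag the word when this ratio exceeds the deviation threshold."""
    if not scores:
        return []
    mean = sum(scores) / len(scores)
    deviations = [abs(s - mean) for s in scores]
    mad = sum(deviations) / len(deviations)
    if mad == 0:
        # All scores are identical, so no word stands out.
        return []
    return [word for word, dev in zip(words, deviations)
            if dev / mad > deviation_threshold]
```

For scores of -1.0, -1.0, -1.0 and -5.0, the mean is -2.0 and the mean absolute deviation is 1.5, so only the last word has a ratio (2.0) above a threshold of 1.5.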
In an embodiment, the determining module 604 is further configured to determine, from the error correction database, an error correction candidate word corresponding to the word to be corrected; and determine the correction word among the error correction candidate words.
In an embodiment, the determining module 604 is further configured to obtain a pinyin of the word to be corrected; acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database; and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In an embodiment, the determining module 604 is further configured to obtain an edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and represent the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In an embodiment, the determining module 604 is further configured to replace the word to be corrected in the speech recognition text with the error correction candidate word, and calculate the fluency of the replaced speech recognition text by using the preset language model; and take the error correction candidate word whose fluency meets the preset condition as the correction word.
In an embodiment, the determining module 604 is further configured to input the replaced speech recognition text into the binary language model and the ternary language model respectively, so as to obtain the fluency output by the binary language model and the fluency output by the ternary language model; and take the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
For the specific limitations of the error correction device for the speech recognition text, reference may be made to the above limitations of the error correction method for the speech recognition text, and details are not described here again. The modules in the device for correcting the speech recognition text can be wholly or partially realized by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, and can also be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server or a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing error correction data for speech recognition text. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of error correction for speech recognition text.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
and determining the correction word among the error correction candidate words.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
In one embodiment, the computer program when executed by the processor further performs the steps of:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In one embodiment, the computer program when executed by the processor further performs the steps of:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
and determining the correction word among the error correction candidate words.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In one embodiment, the computer program when executed by the processor further performs the steps of:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
In one embodiment, the computer program when executed by the processor further performs the steps of:
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of the technical features, they should be considered to be within the scope of the present specification.
The above-mentioned embodiments only express several embodiments of the present application, and their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (12)
1. A method of error correction for speech recognized text, the method comprising:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
2. The method of claim 1, wherein the error correction database is constructed in a manner comprising:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
3. The method of claim 2, further comprising:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
4. The method according to claim 3, wherein the obtaining of the word to be corrected in the speech recognition text comprises:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
5. The method according to claim 2, wherein the determining, from the error correction database, a correction word corresponding to the word to be corrected comprises:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
and determining the correction word among the error correction candidate words.
6. The method according to claim 5, wherein the determining, from the error correction database, the error correction candidate word corresponding to the word to be corrected comprises:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
7. The method according to claim 6, wherein the obtaining the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database comprises:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
8. The method of claim 5, wherein the determining the correction word among the error correction candidate words comprises:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
9. The method according to claim 8, wherein the preset language model is a binary language model and a ternary language model;
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
10. An apparatus for correcting a speech-recognized text, the apparatus comprising:
an obtaining module, configured to obtain fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises corpus of a general scene, and the second corpus comprises corpus of a preset scene;
the obtaining module is further configured to obtain a word to be corrected in the speech recognition text if the fluency of the speech recognition text is less than a fluency threshold;
and a determining module, configured to determine a correction word corresponding to the word to be corrected from an error correction database, and obtain a corrected voice recognition text according to the correction word.
11. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 9 when executing the computer program.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910903618.1A CN110765763B (en) | 2019-09-24 | 2019-09-24 | Error correction method and device for voice recognition text, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910903618.1A CN110765763B (en) | 2019-09-24 | 2019-09-24 | Error correction method and device for voice recognition text, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110765763A (en) | 2020-02-07 |
CN110765763B (en) | 2023-12-12 |
Family
ID=69330240
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910903618.1A Active CN110765763B (en) | 2019-09-24 | 2019-09-24 | Error correction method and device for voice recognition text, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110765763B (en) |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111444318A (en) * | 2020-04-08 | 2020-07-24 | 厦门快商通科技股份有限公司 | Text error correction method |
CN111476641A (en) * | 2020-04-13 | 2020-07-31 | 南京掌控网络科技有限公司 | Method, system and storage medium for automatically placing order on mobile device by voice |
CN111613214A (en) * | 2020-05-21 | 2020-09-01 | 重庆农村商业银行股份有限公司 | Language model error correction method for improving voice recognition capability |
CN111651599A (en) * | 2020-05-29 | 2020-09-11 | 北京搜狗科技发展有限公司 | Method and device for sorting candidate voice recognition results |
CN111680476A (en) * | 2020-05-26 | 2020-09-18 | 广州多益网络股份有限公司 | Method for intelligently generating business hot word recognition conversion of similar text |
CN111783471A (en) * | 2020-06-29 | 2020-10-16 | 中国平安财产保险股份有限公司 | Semantic recognition method, device, equipment and storage medium of natural language |
CN111881675A (en) * | 2020-06-30 | 2020-11-03 | 北京百度网讯科技有限公司 | Text error correction method and device, electronic equipment and storage medium |
CN111985213A (en) * | 2020-09-07 | 2020-11-24 | 科大讯飞华南人工智能研究院(广州)有限公司 | Method and device for correcting voice customer service text |
CN112151021A (en) * | 2020-09-27 | 2020-12-29 | 北京达佳互联信息技术有限公司 | Language model training method, speech recognition device and electronic equipment |
CN112257437A (en) * | 2020-10-20 | 2021-01-22 | 科大讯飞股份有限公司 | Voice recognition error correction method and device, electronic equipment and storage medium |
CN112509581A (en) * | 2020-11-20 | 2021-03-16 | 北京有竹居网络技术有限公司 | Method and device for correcting text after speech recognition, readable medium and electronic equipment |
CN112580324A (en) * | 2020-12-24 | 2021-03-30 | 北京百度网讯科技有限公司 | Text error correction method and device, electronic equipment and storage medium |
CN112767924A (en) * | 2021-02-26 | 2021-05-07 | 北京百度网讯科技有限公司 | Voice recognition method and device, electronic equipment and storage medium |
CN112905775A (en) * | 2021-02-24 | 2021-06-04 | 北京三快在线科技有限公司 | Text processing method and device, electronic equipment and readable storage medium |
CN113012705A (en) * | 2021-02-24 | 2021-06-22 | 海信视像科技股份有限公司 | Error correction method and device for voice text |
CN113111660A (en) * | 2021-04-22 | 2021-07-13 | 脉景(杭州)健康管理有限公司 | Data processing method, device, equipment and storage medium |
CN113129894A (en) * | 2021-04-12 | 2021-07-16 | 阿波罗智联(北京)科技有限公司 | Speech recognition method, speech recognition device, electronic device and storage medium |
CN113157852A (en) * | 2021-04-26 | 2021-07-23 | 深圳市优必选科技股份有限公司 | Voice processing method, system, electronic equipment and storage medium |
CN113270088A (en) * | 2020-02-14 | 2021-08-17 | 阿里巴巴集团控股有限公司 | Text processing method, data processing method, voice processing method, data processing device, voice processing device and electronic equipment |
CN113326702A (en) * | 2021-06-11 | 2021-08-31 | 北京猎户星空科技有限公司 | Semantic recognition method and device, electronic equipment and storage medium |
CN113449090A (en) * | 2021-06-23 | 2021-09-28 | 山东新一代信息产业技术研究院有限公司 | Error correction method, device and medium for intelligent question answering |
CN113506586A (en) * | 2021-06-18 | 2021-10-15 | 杭州摸象大数据科技有限公司 | Method and system for recognizing emotion of user |
CN113744718A (en) * | 2020-05-27 | 2021-12-03 | 海尔优家智能科技(北京)有限公司 | Voice text output method and device, storage medium and electronic device |
CN113763961A (en) * | 2020-06-02 | 2021-12-07 | 阿里巴巴集团控股有限公司 | Text processing method and device |
CN113781998A (en) * | 2021-09-10 | 2021-12-10 | 未鲲(上海)科技服务有限公司 | Dialect correction model-based voice recognition method, device, equipment and medium |
CN113807080A (en) * | 2020-06-15 | 2021-12-17 | 科沃斯商用机器人有限公司 | Text correction method, text correction device and storage medium |
CN114048321A (en) * | 2021-08-12 | 2022-02-15 | 湖南达德曼宁信息技术有限公司 | Multi-granularity text error correction data set generation method, device and equipment |
CN114120972A (en) * | 2022-01-28 | 2022-03-01 | 科大讯飞华南有限公司 | Intelligent voice recognition method and system based on scene |
CN114530145A (en) * | 2020-11-23 | 2022-05-24 | 中移互联网有限公司 | Speech recognition result error correction method and device, and computer readable storage medium |
CN114896965A (en) * | 2022-05-17 | 2022-08-12 | 马上消费金融股份有限公司 | Text correction model training method and device and text correction method and device |
CN116129906A (en) * | 2023-02-14 | 2023-05-16 | 新声科技(深圳)有限公司 | Speech recognition text revising method, device, computer equipment and storage medium |
CN117194818A (en) * | 2023-11-08 | 2023-12-08 | 北京信立方科技发展股份有限公司 | Image-text webpage generation method and device based on video |
CN117807990A (en) * | 2023-12-27 | 2024-04-02 | 北京海泰方圆科技股份有限公司 | Text processing method, device, equipment and medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107729321A (en) * | 2017-10-23 | 2018-02-23 | 上海百芝龙网络科技有限公司 | A kind of method for correcting error of voice identification result |
CN107977356A (en) * | 2017-11-21 | 2018-05-01 | 新疆科大讯飞信息科技有限责任公司 | Method and device for correcting recognized text |
CN108304385A (en) * | 2018-02-09 | 2018-07-20 | 叶伟 | A kind of speech recognition text error correction method and device |
CN109522419A (en) * | 2018-11-15 | 2019-03-26 | 北京搜狗科技发展有限公司 | Session information complementing method and device |
CN110210029A (en) * | 2019-05-30 | 2019-09-06 | 浙江远传信息技术股份有限公司 | Speech text error correction method, system, equipment and medium based on vertical field |
2019-09-24: application CN201910903618.1A (CN), patent CN110765763B, status Active
Cited By (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113270088B (en) * | 2020-02-14 | 2022-04-29 | 阿里巴巴集团控股有限公司 | Text processing method, data processing method, voice processing method, data processing device, voice processing device and electronic equipment |
CN113270088A (en) * | 2020-02-14 | 2021-08-17 | 阿里巴巴集团控股有限公司 | Text processing method, data processing method, voice processing method, data processing device, voice processing device and electronic equipment |
CN111444318A (en) * | 2020-04-08 | 2020-07-24 | 厦门快商通科技股份有限公司 | Text error correction method |
CN111476641A (en) * | 2020-04-13 | 2020-07-31 | 南京掌控网络科技有限公司 | Method, system and storage medium for automatically placing order on mobile device by voice |
CN111613214A (en) * | 2020-05-21 | 2020-09-01 | 重庆农村商业银行股份有限公司 | Language model error correction method for improving voice recognition capability |
CN111680476A (en) * | 2020-05-26 | 2020-09-18 | 广州多益网络股份有限公司 | Method for intelligently generating business hot word recognition conversion of similar text |
CN111680476B (en) * | 2020-05-26 | 2024-01-30 | 广州多益网络股份有限公司 | Method for intelligently generating service hotword recognition conversion of class text |
CN113744718A (en) * | 2020-05-27 | 2021-12-03 | 海尔优家智能科技(北京)有限公司 | Voice text output method and device, storage medium and electronic device |
CN111651599B (en) * | 2020-05-29 | 2023-05-26 | 北京搜狗科技发展有限公司 | Method and device for ordering voice recognition candidate results |
CN111651599A (en) * | 2020-05-29 | 2020-09-11 | 北京搜狗科技发展有限公司 | Method and device for sorting candidate voice recognition results |
CN113763961B (en) * | 2020-06-02 | 2024-04-09 | 阿里巴巴集团控股有限公司 | Text processing method and device |
CN113763961A (en) * | 2020-06-02 | 2021-12-07 | 阿里巴巴集团控股有限公司 | Text processing method and device |
CN113807080A (en) * | 2020-06-15 | 2021-12-17 | 科沃斯商用机器人有限公司 | Text correction method, text correction device and storage medium |
CN111783471A (en) * | 2020-06-29 | 2020-10-16 | 中国平安财产保险股份有限公司 | Semantic recognition method, device, equipment and storage medium of natural language |
CN111783471B (en) * | 2020-06-29 | 2024-05-31 | 中国平安财产保险股份有限公司 | Semantic recognition method, device, equipment and storage medium for natural language |
CN111881675A (en) * | 2020-06-30 | 2020-11-03 | 北京百度网讯科技有限公司 | Text error correction method and device, electronic equipment and storage medium |
CN111985213B (en) * | 2020-09-07 | 2024-05-28 | 科大讯飞华南人工智能研究院(广州)有限公司 | Voice customer service text error correction method and device |
CN111985213A (en) * | 2020-09-07 | 2020-11-24 | 科大讯飞华南人工智能研究院(广州)有限公司 | Method and device for correcting voice customer service text |
CN112151021A (en) * | 2020-09-27 | 2020-12-29 | 北京达佳互联信息技术有限公司 | Language model training method, speech recognition device and electronic equipment |
CN112257437B (en) * | 2020-10-20 | 2024-02-13 | 中国科学技术大学 | Speech recognition error correction method, device, electronic equipment and storage medium |
CN112257437A (en) * | 2020-10-20 | 2021-01-22 | 科大讯飞股份有限公司 | Voice recognition error correction method and device, electronic equipment and storage medium |
CN112509581B (en) * | 2020-11-20 | 2024-03-01 | 北京有竹居网络技术有限公司 | Error correction method and device for text after voice recognition, readable medium and electronic equipment |
CN112509581A (en) * | 2020-11-20 | 2021-03-16 | 北京有竹居网络技术有限公司 | Method and device for correcting text after speech recognition, readable medium and electronic equipment |
CN114530145A (en) * | 2020-11-23 | 2022-05-24 | 中移互联网有限公司 | Speech recognition result error correction method and device, and computer readable storage medium |
CN114530145B (en) * | 2020-11-23 | 2023-08-15 | 中移互联网有限公司 | Speech recognition result error correction method and device and computer readable storage medium |
CN112580324B (en) * | 2020-12-24 | 2023-07-25 | 北京百度网讯科技有限公司 | Text error correction method, device, electronic equipment and storage medium |
CN112580324A (en) * | 2020-12-24 | 2021-03-30 | 北京百度网讯科技有限公司 | Text error correction method and device, electronic equipment and storage medium |
CN113012705A (en) * | 2021-02-24 | 2021-06-22 | 海信视像科技股份有限公司 | Error correction method and device for voice text |
CN113012705B (en) * | 2021-02-24 | 2022-12-09 | 海信视像科技股份有限公司 | Error correction method and device for voice text |
CN112905775A (en) * | 2021-02-24 | 2021-06-04 | 北京三快在线科技有限公司 | Text processing method and device, electronic equipment and readable storage medium |
CN112767924A (en) * | 2021-02-26 | 2021-05-07 | 北京百度网讯科技有限公司 | Voice recognition method and device, electronic equipment and storage medium |
US11842726B2 (en) | 2021-02-26 | 2023-12-12 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method, apparatus, electronic device and storage medium for speech recognition |
JP2022088586A (en) * | 2021-04-12 | 2022-06-14 | 阿波羅智聯(北京)科技有限公司 | Voice recognition method, voice recognition device, electronic apparatus, storage medium, computer program product and computer program |
JP7349523B2 (en) | 2021-04-12 | 2023-09-22 | 阿波羅智聯(北京)科技有限公司 | Speech recognition method, speech recognition device, electronic device, storage medium, computer program product and computer program |
CN113129894A (en) * | 2021-04-12 | 2021-07-16 | 阿波罗智联(北京)科技有限公司 | Speech recognition method, speech recognition device, electronic device and storage medium |
CN113111660A (en) * | 2021-04-22 | 2021-07-13 | 脉景(杭州)健康管理有限公司 | Data processing method, device, equipment and storage medium |
CN113157852A (en) * | 2021-04-26 | 2021-07-23 | 深圳市优必选科技股份有限公司 | Voice processing method, system, electronic equipment and storage medium |
CN113326702B (en) * | 2021-06-11 | 2024-02-20 | 北京猎户星空科技有限公司 | Semantic recognition method, semantic recognition device, electronic equipment and storage medium |
CN113326702A (en) * | 2021-06-11 | 2021-08-31 | 北京猎户星空科技有限公司 | Semantic recognition method and device, electronic equipment and storage medium |
CN113506586A (en) * | 2021-06-18 | 2021-10-15 | 杭州摸象大数据科技有限公司 | Method and system for recognizing emotion of user |
CN113449090A (en) * | 2021-06-23 | 2021-09-28 | 山东新一代信息产业技术研究院有限公司 | Error correction method, device and medium for intelligent question answering |
CN114048321A (en) * | 2021-08-12 | 2022-02-15 | 湖南达德曼宁信息技术有限公司 | Multi-granularity text error correction data set generation method, device and equipment |
CN113781998A (en) * | 2021-09-10 | 2021-12-10 | 未鲲(上海)科技服务有限公司 | Dialect correction model-based voice recognition method, device, equipment and medium |
CN113781998B (en) * | 2021-09-10 | 2024-06-07 | 河南松音科技有限公司 | Speech recognition method, device, equipment and medium based on dialect correction model |
CN114120972B (en) * | 2022-01-28 | 2022-04-12 | 科大讯飞华南有限公司 | Intelligent voice recognition method and system based on scene |
CN114120972A (en) * | 2022-01-28 | 2022-03-01 | 科大讯飞华南有限公司 | Intelligent voice recognition method and system based on scene |
CN114896965B (en) * | 2022-05-17 | 2023-09-12 | 马上消费金融股份有限公司 | Text correction model training method and device, text correction method and device |
CN114896965A (en) * | 2022-05-17 | 2022-08-12 | 马上消费金融股份有限公司 | Text correction model training method and device and text correction method and device |
CN116129906A (en) * | 2023-02-14 | 2023-05-16 | 新声科技(深圳)有限公司 | Speech recognition text revising method, device, computer equipment and storage medium |
CN117194818B (en) * | 2023-11-08 | 2024-01-16 | 北京信立方科技发展股份有限公司 | Image-text webpage generation method and device based on video |
CN117194818A (en) * | 2023-11-08 | 2023-12-08 | 北京信立方科技发展股份有限公司 | Image-text webpage generation method and device based on video |
CN117807990A (en) * | 2023-12-27 | 2024-04-02 | 北京海泰方圆科技股份有限公司 | Text processing method, device, equipment and medium |
CN117807990B (en) * | 2023-12-27 | 2024-07-19 | 北京海泰方圆科技股份有限公司 | Text processing method, device, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN110765763B (en) | 2023-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110765763B (en) | | Error correction method and device for voice recognition text, computer equipment and storage medium |
CN110457431B (en) | | Knowledge graph-based question and answer method and device, computer equipment and storage medium |
CN111666401B (en) | | Document recommendation method, device, computer equipment and medium based on graph structure |
CN110334179B (en) | | Question-answer processing method, device, computer equipment and storage medium |
CN109376222B (en) | | Question-answer matching degree calculation method, question-answer automatic matching method and device |
CN110598206A (en) | | Text semantic recognition method and device, computer equipment and storage medium |
CN110688853B (en) | | Sequence labeling method and device, computer equipment and storage medium |
CN108664595B (en) | | Domain knowledge base construction method and device, computer equipment and storage medium |
CN110413961B (en) | | Method and device for text scoring based on classification model and computer equipment |
CN112651238A (en) | | Training corpus expansion method and device and intention recognition model training method and device |
US11157686B2 (en) | | Text sequence segmentation method, apparatus and device, and storage medium thereof |
CN110674131A (en) | | Financial statement data processing method and device, computer equipment and storage medium |
CN110162681B (en) | | Text recognition method, text processing method, text recognition device, text processing device, computer equipment and storage medium |
CN110362798B (en) | | Method, apparatus, computer device and storage medium for judging information retrieval analysis |
CN111177307A (en) | | Test scheme and system based on semantic understanding similarity threshold configuration |
CN112766319A (en) | | Dialogue intention recognition model training method and device, computer equipment and medium |
US12124487B2 (en) | | Search platform for unstructured interaction summaries |
CN113159013A (en) | | Paragraph identification method and device based on machine learning, computer equipment and medium |
CN111400340A (en) | | Natural language processing method and device, computer equipment and storage medium |
CN109885695B (en) | | Asset suggestion generation method, device, computer equipment and storage medium |
CN114896382A (en) | | Artificial intelligent question-answering model generation method, question-answering method, device and storage medium |
CN114547087A (en) | | Method, device, equipment and medium for automatically identifying proposal and generating report |
WO2021217619A1 (en) | | Label smoothing-based speech recognition method, terminal, and medium |
CN117422064A (en) | | Search text error correction method, apparatus, computer device and storage medium |
CN116186223A (en) | | Financial text processing method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TG01 | Patent term adjustment | ||