CN110765763A - Error correction method and device for speech recognition text, computer equipment and storage medium - Google Patents

Info

Publication number
CN110765763A
Authority
CN
China
Prior art keywords
word
corrected
corpus
error correction
fluency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910903618.1A
Other languages
Chinese (zh)
Other versions
CN110765763B (en
Inventor
宁义双
张良杰
闵刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kingdee Software China Co Ltd
Original Assignee
Kingdee Software China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kingdee Software China Co Ltd
Priority to CN201910903618.1A
Publication of CN110765763A
Application granted
Publication of CN110765763B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a method and a device for correcting errors of a speech recognition text, a computer device and a storage medium. The method comprises the following steps: acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene; if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text; and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word. The method and the device improve the accuracy of user intention identification.

Description

Error correction method and device for speech recognition text, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for correcting a speech recognition text, a computer device, and a storage medium.
Background
For enterprise applications, correctly understanding the user's intent is key to improving user satisfaction. In a voice interaction system, the user's intent is obtained by performing user concept characterization on the speech recognition result, where user concept characterization means expressing the essential features of a perceived thing by processing the input information.
However, the traditional speech recognition technology only models from the perspective of pronunciation and grammar, so that the speech recognition result has the problem of inaccuracy, thereby influencing the accuracy rate of the recognition of the user intention.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a method and an apparatus for correcting errors in speech recognition text, a computer device, and a storage medium that can improve the accuracy of user intention recognition.
A method of error correction for speech recognized text, the method comprising:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
In one embodiment, the error correction database is constructed in a manner including:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
In one embodiment, the method further comprises:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
In one embodiment, the obtaining of the word to be corrected in the speech recognition text includes:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In one embodiment, the determining, from the error correction database, a correction word corresponding to the word to be corrected includes:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
determining the corrected word among the error corrected candidate words.
In one embodiment, the determining, from the error correction database, an error correction candidate word corresponding to the word to be corrected includes:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In one embodiment, the obtaining the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database includes:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In one embodiment, the determining the corrected word from the error correction candidate words includes:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
In one embodiment, the preset language model is a binary language model and a ternary language model;
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
An apparatus for error correction of speech recognized text, the apparatus comprising:
an obtaining module, configured to obtain the fluency of a speech recognition text by using a preset language model, wherein the preset language model is obtained by training on the corpora of a first corpus and a second corpus, the first corpus comprises corpora of a general scene, and the second corpus comprises corpora of a preset scene;
the obtaining module is further configured to obtain a word to be corrected in the speech recognition text if the fluency of the speech recognition text is less than a fluency threshold;
and the determining module is used for determining a corrected word corresponding to the word to be corrected from an error correction database and obtaining a corrected voice recognition text according to the corrected word.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
According to the method, the device, the computer equipment and the storage medium for correcting the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model, if the fluency of the voice recognition text is smaller than a fluency threshold value, words to be corrected in the voice recognition text are obtained, the correction words corresponding to the words to be corrected are determined from the correction database, and the corrected voice recognition text is obtained according to the correction words.
Drawings
FIG. 1 is a diagram of an application environment of a method for error correction of speech recognition text in one embodiment;
FIG. 2 is a flowchart illustrating a method for error correction of speech recognition text according to one embodiment;
FIG. 3 is a diagram illustrating the operation of a method for error correction of speech recognition text according to one embodiment;
FIG. 4 is a schematic diagram of an error correction database in one embodiment;
FIG. 5 is a flowchart illustrating a method for correcting errors in speech recognition text according to another embodiment;
FIG. 6 is a block diagram showing the structure of an apparatus for correcting a speech-recognized text in one embodiment;
FIG. 7 is a block diagram showing the construction of an apparatus for correcting a speech recognition text in another embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The method for correcting errors in speech recognition text provided by the present application can be applied to the application environment shown in fig. 1. The terminal 102 or the server 104 obtains the fluency of a speech recognition text by using a preset language model, wherein the preset language model is obtained by training on the corpora of a first corpus and a second corpus, the first corpus comprises corpora of a general scene, and the second corpus comprises corpora of a preset scene; if the fluency of the speech recognition text is smaller than a fluency threshold, the terminal or server obtains the words to be corrected in the speech recognition text; it then determines a correction word corresponding to each word to be corrected from an error correction database and obtains a corrected speech recognition text according to the correction word.
The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, a method for correcting a speech recognition text is provided, which is described by taking the method as an example applied to the terminal in fig. 1, and includes the following steps:
step 202, obtaining fluency of a speech recognition text by using a preset language model, wherein the preset language model is obtained by utilizing corpus training of a first corpus and a second corpus, the first corpus comprises corpus of a general scene, and the second corpus comprises corpus of a preset scene.
In one embodiment, the first corpus may be a Wikipedia dataset that includes 50 million correct expressions conforming to the general scene.
The corpus of the preset scene refers to a corpus applied to a specific scene in each field, and the specific scene can be a working scene, such as finance (financial index query, expense reimbursement, enterprise operation data query), approval (business trip approval, leave approval), purchase (commodity purchase), management (human resource management) and the like. The second corpus includes corpora of predetermined scenes, and in one embodiment, the second corpus can select interactive corpora of working scenes in various fields. Since each domain corresponds to professional knowledge which plays an important role in characterizing the user concept, the interpretation of the user concept can be enhanced by the second corpus.
The speech recognition text is text data recognized based on input speech. Due to the diversity, complexity, and dialect habits of natural language, different users may express the same thing differently, and thus the text data obtained by recognition may also be different. For example, the input speech may be "how much stock of the warehouse is left", the speech recognition text may be "how much stock of the warehouse is left", and may also be "how much stock of the warehouse is saved".
The preset language model is a mathematical model of the contextual relationship between the words in a sentence. It takes into account the relationship between at least two words, i.e. the occurrence of the next word depends only on the one or more words in front of it. The preset language model includes at least one of a binary (bigram) language model, a ternary (trigram) language model, …, and an n-gram language model.
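For reference, the standard n-gram approximation that this paragraph describes can be written as below. The symbols are notation introduced here for illustration and do not appear in the patent: w_i is the i-th word, m is the sentence length, and n is the model order (n = 2 for the binary model, n = 3 for the ternary model).

```latex
P(w_1, \dots, w_m) \approx \prod_{i=1}^{m} P\left(w_i \mid w_{i-n+1}, \dots, w_{i-1}\right)
```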
As shown in fig. 3, the predetermined language model is obtained by training corpora in the first corpus and the second corpus. Specifically, a language model training tool is used to train the corpora in the first corpus and the second corpus to obtain a preset language model. The language model training tool can be SRILM, IRSTLM, BerkeleyLM, KenLM and the like.
Taking the training of a binary language model as an example, the probability that two adjacent words occur together in the first corpus and the second corpus is counted, and the statistical result is stored. To simplify calculation, the base-10 logarithm of the probability may be stored; e.g., the word pair "our company" may be stored together with its log probability as "our company -1.25". To improve storage efficiency, the storage file may be converted into a binary file.
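The counting step can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: it assumes the stored value is the unsmoothed conditional probability P(word | previous word), and the function name `train_bigram_log10` and the toy corpus are hypothetical. Real toolkits additionally apply smoothing and back-off.

```python
import math
from collections import Counter


def train_bigram_log10(sentences):
    """Return {(prev, word): log10 P(word | prev)} from tokenized sentences.

    Assumption: plain maximum-likelihood estimates with no smoothing.
    """
    context_counts = Counter()  # how often each word appears as a left context
    bigram_counts = Counter()   # how often each adjacent pair appears
    for tokens in sentences:
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return {
        pair: math.log10(count / context_counts[pair[0]])
        for pair, count in bigram_counts.items()
    }


# Toy, already-segmented corpus standing in for the first and second corpora.
corpus = [
    ["our", "company", "account"],
    ["our", "company", "stock"],
    ["our", "warehouse", "stock"],
    ["our", "company", "cash"],
]
model = train_bigram_log10(corpus)
print(round(model[("our", "company")], 4))
```

In this toy corpus "our" is followed by "company" in three of four cases, so the stored value is log10(0.75).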
Specifically, the preset language model is first used to detect errors in the speech recognition text. The speech recognition text is input into the preset language model to obtain its fluency, and the fluency is used to judge whether the text contains errors: if the fluency is smaller than the fluency threshold, the speech recognition text is judged to contain errors and is corrected.
In one embodiment, the preset language models are a binary language model and a ternary language model, and the binary language model and the ternary language model are both obtained by corpus training in the first corpus and the second corpus. And inputting the voice recognition text into a preset language model to obtain two fluency, and judging that the voice recognition text has errors if the maximum value of the two fluency is smaller than a fluency threshold value.
Step 204, if the fluency of the speech recognition text is smaller than a fluency threshold, obtaining the words to be corrected in the speech recognition text.
The fluency threshold is used for judging whether errors exist in the speech recognition text, and can be set according to practical application. If the fluency of the voice recognition text is greater than or equal to the fluency threshold, judging that the voice recognition text is correct; and if the fluency is smaller than the fluency threshold value, judging that the voice recognition text has errors, and correcting the voice recognition text.
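The detection rule (take the larger of the bigram and trigram fluency and compare it with the threshold) can be sketched as follows. The patent does not state how fluency is computed from the model, so this sketch assumes it is the average log10 score of the sentence's n-grams; `FLOOR`, `make_ngram_scorer`, `detect_error` and the toy scores are all hypothetical.

```python
FLOOR = -5.0  # assumed log10 score for n-grams missing from the model


def make_ngram_scorer(log_probs, n):
    """Build a scorer returning the average log10 score of a sentence's n-grams."""
    def score(tokens):
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        if not grams:
            return FLOOR
        return sum(log_probs.get(g, FLOOR) for g in grams) / len(grams)
    return score


def detect_error(tokens, scorers, fluency_threshold):
    """Return (has_error, fluency), where fluency is the maximum over all scorers."""
    fluency = max(score(tokens) for score in scorers)
    return fluency < fluency_threshold, fluency


# Toy models; a real system would load trained bigram and trigram models.
bigram = make_ngram_scorer(
    {("our", "company"): -0.5, ("company", "account"): -1.0}, 2)
trigram = make_ngram_scorer({("our", "company", "account"): -1.2}, 3)

good = ["our", "company", "account"]
bad = ["our", "company", "a", "count"]
print(detect_error(good, [bigram, trigram], -2.0))
print(detect_error(bad, [bigram, trigram], -2.0))
```

Only the second sentence falls below the threshold and would be passed on to the correction steps.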
The words to be corrected are the erroneous text words in the speech recognition text. In one embodiment, the speech recognition text is segmented (using a word segmentation tool, such as the jieba word segmentation tool) to obtain text words. The average absolute deviation value of each text word is calculated; if the average absolute deviation of a text word is greater than a deviation threshold, the text word is judged to be wrong and is taken as a word to be corrected; if the average absolute deviation of a text word is less than or equal to the deviation threshold, the text word is judged to be correct.
Step 206, determining a correction word corresponding to the word to be corrected from the error correction database, and obtaining a corrected voice recognition text according to the correction word.
The error correction database is used for replacing words to be corrected in the voice recognition text.
In one embodiment, as shown in fig. 3, the error correction database may be constructed from the second corpus. And acquiring the corpus of the second corpus, segmenting the corpus of the second corpus by using a segmentation dictionary to obtain candidate words, and constructing an error correction database according to the candidate words and pinyin of the candidate words.
In another embodiment, the error correction database may be constructed from the first corpus and the second corpus. Obtaining the linguistic data of the first corpus and the second corpus, utilizing a word segmentation dictionary to segment the linguistic data of the first corpus and the second corpus to obtain candidate words, and constructing an error correction database according to the candidate words and pinyin of the candidate words.
Wherein, a large number of words are stored in the word segmentation dictionary and are used for word segmentation operation. And when the candidate word is obtained, obtaining the confusion word corresponding to the candidate word, and adding the confusion word into the word segmentation dictionary to enrich the resources of the word segmentation dictionary.
Specifically, error correction candidate words corresponding to the words to be corrected are determined from the error correction database, and further, correction words are determined from the error correction candidate words.
The error correction candidate words corresponding to the word to be corrected may be determined from the error correction database as follows: the pinyin of the word to be corrected is obtained, the similarity between the pinyin of the word to be corrected and the pinyin of each candidate word in the error correction database is obtained, and the candidate words whose similarity is greater than a similarity threshold are taken as error correction candidate words.
The manner of determining the corrected word among the error correction candidate words may be: and replacing the words to be corrected in the voice recognition text by the error correction candidate words, calculating the fluency of the replaced voice recognition text by using a preset language model, and taking the error correction candidate words with the fluency meeting the preset conditions as correction words. In one embodiment, the error correction candidate word corresponding to the maximum value in fluency is taken as the correction word.
Specifically, the words to be corrected in the speech recognition text are replaced by the correction words, so that the corrected speech recognition text is obtained.
According to the method for correcting the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model, if the fluency of the voice recognition text is smaller than the fluency threshold, words to be corrected in the voice recognition text are obtained, correction words corresponding to the words to be corrected are determined from the correction database, and the corrected voice recognition text is obtained according to the correction words.
In one embodiment, the error correction database is constructed in a manner including: obtaining the corpus of the second corpus; performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words; and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
The candidate words refer to words included in the corpus of the second corpus.
Specifically, the corpus of the second corpus is segmented by using a word segmentation tool (such as the jieba word segmentation tool) and a word segmentation dictionary to obtain candidate words. For example, the sentence "how much cash can be withdrawn from the account of our company" is segmented by the word segmentation tool into "we", "company", "account", "can", "withdraw", "how much", "cash".
And acquiring the pinyin of each candidate word, and storing the candidate words and the pinyin in an associated manner. In one embodiment, as shown in FIG. 4, words and pinyins are stored as key-value pairs.
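The key-value layout can be sketched as a plain dictionary. This is an illustration under stated assumptions: the word is taken as the key and its toneless pinyin as the value (the text does not say which side is the key), and `PINYIN` is a hypothetical hand-written lookup standing in for a pinyin conversion library.

```python
# Hypothetical lookup table; a real system would derive pinyin with a library.
PINYIN = {
    "现金": "xianjin",   # "cash"
    "公司": "gongsi",    # "company"
    "账户": "zhanghu",   # "account"
}


def build_error_correction_db(candidate_words, pinyin_of):
    """Store each candidate word together with its pinyin (word -> pinyin)."""
    return {word: pinyin_of[word] for word in candidate_words if word in pinyin_of}


# Candidate words as they would come out of segmenting the second corpus.
candidates = ["公司", "账户", "现金"]
db = build_error_correction_db(candidates, PINYIN)
print(db)
```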
In the above error correction method for speech recognition text, the error correction database is constructed from the second corpus, which strengthens the characterization of user concepts for the preset scene.
In one embodiment, the method further comprises: obtaining confusion words corresponding to the candidate words; and adding the confusion word into the word segmentation dictionary.
The confusing word refers to a word with a pronunciation close to or the same as that of the candidate word.
A segmentation dictionary stores a large number of words, which are used for the segmentation operation. When the candidate word is obtained, the confusion word corresponding to the candidate word is obtained, and the confusion word is added into the word segmentation dictionary, so that the resources of the word segmentation dictionary are enriched, and the accuracy of word segmentation of the voice recognition text is improved.
Specifically, each character in each candidate word is replaced to obtain the confusion words corresponding to the candidate word. In one embodiment, each character in the candidate word is replaced using a character-level confusion set. For example, the confusion words corresponding to "cash" may be "advanced", "line-in", "current time", etc., which in the original Chinese are pronounced the same as or similarly to the word for "cash".
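Character-by-character replacement can be sketched as below. The confusion set shown is a small hypothetical example, and the sketch assumes that every combination of replacements counts as a confusion word; the text does not say whether characters are replaced one at a time or in combination.

```python
from itertools import product


def confusion_words(word, char_confusions):
    """Generate confusion words by swapping characters for similar-sounding ones.

    Assumption: all combinations of per-character replacements are produced,
    and the original word itself is excluded.
    """
    options = [[ch] + char_confusions.get(ch, []) for ch in word]
    variants = ("".join(chars) for chars in product(*options))
    return [variant for variant in variants if variant != word]


# Hypothetical character-level confusion set for the word "cash" (xianjin).
CONFUSIONS = {"现": ["先"], "金": ["进", "今"]}
result = confusion_words("现金", CONFUSIONS)
print(result)
```

The result contains, among others, the words for "advanced" (先进) and "current time" (现今), which can then be added to the word segmentation dictionary.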
In the error correction method for the voice recognition text, the confusion words are added into the word segmentation dictionary, so that the resources of the word segmentation dictionary are enriched, and the accuracy of the word segmentation of the voice recognition text is improved.
In one embodiment, the obtaining of the word to be corrected in the speech recognition text includes: performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words; calculating the average absolute deviation value of each text word; and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
Wherein, the word to be corrected refers to the wrong word in the speech recognition text; text words refer to words in speech recognition text; the deviation threshold is used for judging whether the text word is wrong or not, and can be set according to practical application.
Specifically, the speech recognition text is segmented by using a word segmentation tool (such as the jieba word segmentation tool) and the word segmentation dictionary to obtain text words. The average absolute deviation value of each text word is calculated; if the average absolute deviation of a text word is greater than the deviation threshold, the text word is judged to be wrong and is taken as a word to be corrected; if the average absolute deviation of a text word is less than or equal to the deviation threshold, the text word is judged to be correct.
In the error correction method for the voice recognition text, whether the text word has errors or not is judged according to the average absolute deviation value of the text word, so that the accuracy of error correction is improved.
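The text does not say which quantity the deviation is measured on, so the following is only a sketch under explicit assumptions: each text word already has a language model score, a word's deviation is its absolute distance from the mean score of the sentence, and only unusually low scores are treated as errors. The function name and the scores are hypothetical.

```python
from statistics import mean


def words_to_correct(words, scores, deviation_threshold):
    """Flag words whose score deviates from the sentence mean by more than the threshold."""
    center = mean(scores)
    flagged = []
    for word, score in zip(words, scores):
        deviation = abs(score - center)
        # Assumption: only scores below the mean indicate a recognition error.
        if deviation > deviation_threshold and score < center:
            flagged.append(word)
    return flagged


words = ["our", "company", "account", "line-in"]
scores = [-1.0, -1.2, -1.1, -4.7]  # hypothetical per-word log10 scores
print(words_to_correct(words, scores, 2.0))
```

With these scores the mean is about -2.0, so only "line-in" deviates by more than the threshold and is flagged.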
In one embodiment, the determining, from the error correction database, a correction word corresponding to the word to be corrected includes: determining an error correction candidate word corresponding to the word to be corrected from the error correction database; determining the corrected word among the error corrected candidate words.
The error correction candidate words are a set of words that may be used to correct the word to be corrected. For example, if the speech recognition text is "how much line-in can be withdrawn from the account of our company" and "line-in" is the word to be corrected, then the error correction candidate words may be "advanced", "cash", and so on.
Specifically, the manner of determining the error correction candidate word corresponding to the word to be corrected from the error correction database may be: and determining error correction candidate words from an error correction database through the pinyin similarity. The pinyin similarity can be determined by the edit distance of the pinyin.
Specifically, the manner of determining the corrected word in the corrected candidate words may be: and replacing the words to be corrected in the voice recognition text by the error correction candidate words, calculating the fluency of the replaced voice recognition text by using a preset language model, and taking the error correction candidate words with the fluency meeting the preset conditions as correction words.
In the method for correcting the voice recognition text, the voice recognition text can be corrected by combining the editing distance of the pinyin and the preset language model, so that the accuracy of selecting the corrected words is further improved.
In one embodiment, the determining, from the error correction database, an error correction candidate word corresponding to the word to be corrected includes: obtaining the pinyin of the word to be corrected; acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database; and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
Because the candidate word and the pinyin of the candidate word are stored in the error correction database, the error correction candidate word can be determined by comparing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database.
Specifically, the similarity between the pinyin of the word to be corrected and the pinyin of a candidate word in the error correction database may be determined by calculating the edit distance between the two, where the edit distance is an index for measuring the similarity between two sequences. For pinyin, the edit distance between two pinyin strings is the minimum number of character edit operations required to convert one into the other. The smaller the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word, the greater the similarity between the word to be corrected and the candidate word; the candidate words whose similarity is greater than the similarity threshold are therefore taken as error correction candidate words.
In one embodiment, the edit distance of a pinyin is calculated as follows:
[Formula image BDA0002212598920000101 is not reproduced in this text.]
where t_0 and t_i are, respectively, the word to be corrected and a candidate word in the error correction database, len(x) is the number of characters contained in the word x, and len_p(x) is the number of characters contained in the pinyin of the word x.
In the error correction method for the voice recognition text, the error correction candidate words are determined by using the editing distance of the pinyin, so that the accuracy rate of selecting the error correction candidate words is improved.
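The candidate selection described above can be sketched as follows. This is a minimal illustration and not the formula of the embodiment, which is given only as a figure: it uses the standard Levenshtein edit distance and, as an assumption, turns it into a similarity by normalizing with the length of the longer pinyin. The function names, the threshold of 0.7 and the sample database entries are hypothetical.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def pinyin_similarity(p0: str, pi: str) -> float:
    """Assumed normalization (not taken from the patent): 1 minus the
    edit distance divided by the length of the longer pinyin."""
    longest = max(len(p0), len(pi))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(p0, pi) / longest


def select_candidates(pinyin_to_correct: str,
                      database: Dict[str, str],
                      threshold: float = 0.7) -> List[str]:
    """Keep the candidate words whose pinyin similarity exceeds the threshold."""
    return [word for word, py in database.items()
            if pinyin_similarity(pinyin_to_correct, py) > threshold]


# Hypothetical error correction database: candidate word -> toneless pinyin.
TOY_DATABASE = {"金蝶": "jindie", "云星空": "yunxingkong", "报销": "baoxiao"}

candidates = select_candidates("jingdie", TOY_DATABASE)
```

A smaller edit distance yields a larger similarity, so the threshold test on similarity is equivalent to an upper bound on the normalized edit distance.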
In one embodiment, the determining the corrected word among the error correction candidate words includes: replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model; and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
The preset language model is used for calculating the fluency of a sentence, expressed as a probability. The preset language model may be an N-gram language model, where N may be one, two, three, four, etc. An N-gram language model means that, for a position in a sentence, the probability that the sentence is fluent with each candidate word at that position is calculated from the N-1 words preceding the position. The preset language model may also be a combination of at least two N-gram language models, for example, a binary (bigram) language model together with a ternary (trigram) language model.
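As an illustration of how an N-gram language model can assign a fluency score, the following minimal sketch trains a count-based model on word-segmented sentences. The specification does not state how the probabilities are smoothed or normalized, so the add-one smoothing, the sentence boundary markers and the per-word geometric mean used here are assumptions, and the class and method names are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


class NGramModel:
    """Count-based N-gram model with add-one smoothing (an assumption)."""

    def __init__(self, n: int, sentences: Iterable[Sequence[str]]):
        self.n = n
        self.context_counts = Counter()
        self.ngram_counts = Counter()
        self.vocab = set()
        for words in sentences:
            padded = ["<s>"] * (n - 1) + list(words) + ["</s>"]
            self.vocab.update(padded)
            for i in range(n - 1, len(padded)):
                context = tuple(padded[i - n + 1:i])  # the N-1 preceding words
                self.context_counts[context] += 1
                self.ngram_counts[context + (padded[i],)] += 1

    def word_prob(self, context: Sequence[str], word: str) -> float:
        """P(word | context), smoothed so unseen N-grams keep a small probability."""
        context = tuple(context)
        numerator = self.ngram_counts[context + (word,)] + 1
        denominator = self.context_counts[context] + len(self.vocab)
        return numerator / denominator

    def fluency(self, words: Sequence[str]) -> float:
        """Geometric mean of the per-word probabilities, a value in (0, 1]."""
        padded = ["<s>"] * (self.n - 1) + list(words) + ["</s>"]
        log_p = 0.0
        for i in range(self.n - 1, len(padded)):
            log_p += math.log(self.word_prob(padded[i - self.n + 1:i], padded[i]))
        return math.exp(log_p / (len(padded) - (self.n - 1)))


# Hypothetical training sentences, already segmented into words.
corpus = [["open", "the", "report"],
          ["open", "the", "invoice"],
          ["submit", "the", "report"]]
bigram = NGramModel(2, corpus)
fluent = bigram.fluency(["open", "the", "report"])
scrambled = bigram.fluency(["report", "the", "open"])
```

Normalizing by sentence length keeps scores of sentences with different numbers of words comparable, which matters when a single fluency threshold is applied to all recognized texts.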
Specifically, the error correction candidate words are used to replace the word to be corrected in the voice recognition text, the fluency of the replaced voice recognition text is calculated by using the preset language model, and the correction word is determined according to the fluency. The fluency of the replaced voice recognition text is calculated by inputting the replaced voice recognition text into the preset language model, which outputs the fluency of that text.
The preset condition is used for screening the correction word from the error correction candidate words and can be set according to the practical application. In one embodiment, among the fluency values output by the preset language model for the replaced voice recognition texts, the error correction candidate word corresponding to the maximum fluency is taken as the correction word.
In the error correction method for the voice recognition text, the correction words are determined through the preset language model, and the accuracy rate of selecting the correction words is improved.
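The replace-and-score procedure can be sketched as below, using the maximum-fluency embodiment as the preset condition. The fluency function is passed in as a parameter; the toy scorer, the word list and the candidate list are hypothetical and only stand in for a trained language model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Scorer = Callable[[Sequence[str]], float]


def choose_correction(words: Sequence[str],
                      index: int,
                      candidates: Sequence[str],
                      fluency: Scorer) -> Tuple[Optional[str], float]:
    """Substitute each candidate at `index`, score the replaced text,
    and return the candidate with the maximum fluency."""
    best_word, best_score = None, float("-inf")
    for cand in candidates:
        replaced: List[str] = list(words)
        replaced[index] = cand
        score = fluency(replaced)
        if score > best_score:
            best_word, best_score = cand, score
    return best_word, best_score


# Hypothetical stand-in for a language model: the fraction of adjacent
# word pairs that appear in a small set of known pairs.
KNOWN_PAIRS = {("check", "the"), ("the", "invoice"), ("invoice", "status")}


def toy_fluency(words: Sequence[str]) -> float:
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(pair in KNOWN_PAIRS for pair in pairs) / len(pairs)


recognized = ["check", "the", "in-voice", "status"]
best = choose_correction(recognized, 2, ["envoy", "invoice"], toy_fluency)
```

Other preset conditions, such as requiring the best fluency to exceed the fluency of the unreplaced text, could be applied to the returned score.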
In one embodiment, the preset language model is a binary language model and a ternary language model; the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps: respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model; and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
The binary language model and the ternary language model are both trained on corpora from the first corpus and the second corpus.
Specifically, the replaced speech recognition text is input into a preset language model, the fluency output by the binary language model and the fluency output by the ternary language model are obtained, and the maximum value of the two fluency is used as the fluency of the speech recognition text.
In the error correction method for the voice recognition text, the correction words are determined through the binary language model and the ternary language model, and the accuracy rate of selecting the correction words is improved.
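Taking the maximum of the two model outputs can be expressed in a few lines. In this sketch the two models are represented by hypothetical lookup tables rather than trained models, and all names and scores are illustrative assumptions.

```python
from typing import Callable, Sequence

Scorer = Callable[[Sequence[str]], float]


def combined_fluency(words: Sequence[str],
                     bigram_fluency: Scorer,
                     trigram_fluency: Scorer) -> float:
    """Fluency of the text: the larger of the two model outputs."""
    return max(bigram_fluency(words), trigram_fluency(words))


# Hypothetical scores standing in for the outputs of trained models.
BIGRAM_SCORES = {("check", "the", "invoice"): 0.42}
TRIGRAM_SCORES = {("check", "the", "invoice"): 0.57}


def bigram_stub(words: Sequence[str]) -> float:
    return BIGRAM_SCORES.get(tuple(words), 0.0)


def trigram_stub(words: Sequence[str]) -> float:
    return TRIGRAM_SCORES.get(tuple(words), 0.0)


score = combined_fluency(["check", "the", "invoice"], bigram_stub, trigram_stub)
```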
As shown in fig. 5, the method for correcting the error of the speech recognition text in one embodiment is described in detail:
step 502, acquiring fluency of a voice recognition text by using a preset language model;
step 504, if the fluency of the voice recognition text is smaller than a fluency threshold, performing word segmentation on the voice recognition text by using a word segmentation dictionary to obtain text words;
step 506, calculating the average absolute deviation value of each text word;
step 508, if the average absolute deviation value of the text word is greater than the deviation threshold, determining that the text word is a word to be corrected;
step 510, obtaining the pinyin of the word to be corrected;
step 512, obtaining the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
step 514, using the candidate word with the similarity larger than the similarity threshold as the error correction candidate word;
step 516, replacing the words to be corrected in the voice recognition text with the error correction candidate words, and calculating the fluency of the replaced voice recognition text by using a preset language model;
step 518, taking the error correction candidate word with the fluency meeting the preset condition as a correction word of the word to be corrected.
According to the method for correcting the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model, if the fluency of the voice recognition text is smaller than the fluency threshold, words to be corrected in the voice recognition text are obtained, correction words corresponding to the words to be corrected are determined from the correction database, and the corrected voice recognition text is obtained according to the correction words.
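The control flow of steps 502 to 518 can be sketched as one function. Every component (fluency scoring, word segmentation, the average absolute deviation value, pinyin conversion and pinyin similarity) is injected as a callable, because this sketch only illustrates the order of the steps; all parameter names are hypothetical, and the maximum-fluency rule is used as the preset condition.

```python
from typing import Callable, Dict, List


def correct_text(text: str, *,
                 fluency: Callable[[str], float],
                 segment: Callable[[str], List[str]],
                 deviation: Callable[[List[str], int], float],
                 to_pinyin: Callable[[str], str],
                 similarity: Callable[[str, str], float],
                 database: Dict[str, str],
                 fluency_threshold: float,
                 deviation_threshold: float,
                 similarity_threshold: float,
                 sep: str = "") -> str:
    # Step 502: a sufficiently fluent text is returned unchanged.
    if fluency(text) >= fluency_threshold:
        return text
    # Step 504: segment the text into text words.
    words = segment(text)
    # Steps 506-508: words whose deviation exceeds the threshold need correction.
    suspects = [i for i in range(len(words))
                if deviation(words, i) > deviation_threshold]
    for i in suspects:
        # Steps 510-514: candidates whose pinyin is similar enough.
        py = to_pinyin(words[i])
        candidates = [word for word, cand_py in database.items()
                      if similarity(py, cand_py) > similarity_threshold]
        # Steps 516-518: keep the candidate giving the most fluent text.
        best_word, best_score = words[i], float("-inf")
        for cand in candidates:
            trial = words[:i] + [cand] + words[i + 1:]
            score = fluency(sep.join(trial))
            if score > best_score:
                best_word, best_score = cand, score
        words[i] = best_word
    return sep.join(words)


# Hypothetical stubs, used only to exercise the control flow.
stubs = dict(
    fluency=lambda t: 0.9 if t == "check the invoice status" else 0.2,
    segment=str.split,
    deviation=lambda words, i: 1.0 if words[i] == "in-voice" else 0.0,
    to_pinyin=lambda w: w,
    similarity=lambda a, b: 1.0 if a.replace("-", "") == b else 0.0,
    database={"invoice": "invoice", "status": "status"},
    fluency_threshold=0.5,
    deviation_threshold=0.5,
    similarity_threshold=0.8,
    sep=" ",
)
corrected = correct_text("check the in-voice status", **stubs)
```

If no candidate passes the similarity threshold, the word is left as recognized. The `sep` parameter exists only so that the space-separated toy example can be rejoined; Chinese text would use the default empty separator.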
It should be understood that although the steps in the flowcharts of fig. 2 and 5 are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise, the execution of the steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 2 and 5 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially either, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 6, there is provided an apparatus 600 for correcting a speech recognition text, including: an obtaining module 602 and a determining module 604, wherein:
an obtaining module 602, configured to obtain fluency of a speech recognition text by using a preset language model, where the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus includes corpora of a general scene, and the second corpus includes corpora of a preset scene;
the obtaining module 602 is further configured to obtain a word to be corrected in the speech recognition text if the fluency of the speech recognition text is less than a fluency threshold;
a determining module 604, configured to determine a correction word corresponding to the word to be corrected from an error correction database, and obtain a corrected speech recognition text according to the correction word.
According to the error correction device 600 for the voice recognition text, the fluency of the voice recognition text is obtained by using the preset language model; if the fluency of the voice recognition text is smaller than the fluency threshold, the words to be corrected in the voice recognition text are obtained, the correction words corresponding to the words to be corrected are determined from the error correction database, and the corrected voice recognition text is obtained according to the correction words. Wrong words in the voice recognition text are thereby detected and corrected, which improves the accuracy of the voice recognition text. In addition, training the preset language model with the second corpus and building the error correction database distinguish and strengthen the concept expressions of users, which improves the accuracy of recognizing user intentions.
In an embodiment, as shown in fig. 7, the apparatus 600 for correcting a speech recognition text further includes a word segmentation module 606 and a construction module 608, where the obtaining module 602 is further configured to obtain corpora of the second corpus; the word segmentation module 606 is configured to perform word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words; the construction module 608 is configured to construct the error correction database according to the candidate words and the pinyin of the candidate words.
In an embodiment, the apparatus 600 for correcting a speech recognition text further includes an adding module, and the obtaining module 602 is further configured to obtain a confusion word corresponding to the candidate word; the adding module is configured to add the confusion word into the word segmentation dictionary.
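Building the error correction database from the second corpus can be sketched as follows. The specification does not name a segmentation algorithm, so forward maximum matching against the word segmentation dictionary is used here as an assumption, and pinyin conversion is replaced by a hypothetical lookup table; a real system would use a pinyin conversion library instead.

```python
from typing import Callable, Dict, Iterable, List, Set


def forward_max_match(text: str, dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Greedy segmentation: at each position take the longest dictionary
    word, falling back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def build_error_correction_db(corpus: Iterable[str],
                              segment: Callable[[str], List[str]],
                              to_pinyin: Callable[[str], str]) -> Dict[str, str]:
    """Map every candidate word found in the corpus to its pinyin."""
    database: Dict[str, str] = {}
    for sentence in corpus:
        for word in segment(sentence):
            if word not in database:
                database[word] = to_pinyin(word)
    return database


# Hypothetical segmentation dictionary and pinyin table for the example.
SEG_DICT = {"金蝶", "报销"}
TOY_PINYIN = {"金蝶": "jindie", "报销": "baoxiao", "单": "dan"}

db = build_error_correction_db(
    ["金蝶报销单", "报销"],
    segment=lambda s: forward_max_match(s, SEG_DICT),
    to_pinyin=lambda w: TOY_PINYIN[w],
)
```

Storing the pinyin next to each candidate word means that it is computed once at construction time rather than on every lookup.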
In an embodiment, the obtaining module 602 is further configured to perform word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words; calculating the average absolute deviation value of each text word; and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In an embodiment, the determining module 604 is further configured to determine, from the error correction database, an error correction candidate word corresponding to the word to be corrected, and to determine the correction word among the error correction candidate words.
In an embodiment, the determining module 604 is further configured to obtain a pinyin of the word to be corrected; acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database; and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In an embodiment, the determining module 604 is further configured to obtain an edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and represent the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In an embodiment, the determining module 604 is further configured to replace a word to be corrected in the speech recognition text with the error correction candidate word, and calculate fluency of the replaced speech recognition text by using the preset language model; and take the error correction candidate word with the fluency meeting the preset condition as the correction word.
In an embodiment, the determining module 604 is further configured to input the replaced speech recognition text into the binary language model and the ternary language model respectively, so as to obtain fluency output by the binary language model and fluency output by the ternary language model; and take the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
For the specific limitations of the error correction device for the speech recognition text, reference may be made to the above limitations of the error correction method for the speech recognition text, and details are not repeated here. The modules in the device for correcting the speech recognition text can be wholly or partially realized by software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, and can also be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server or a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing error correction data for speech recognition text. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of error correction for speech recognition text.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
and determining the correction word among the error correction candidate words.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
In one embodiment, the computer program when executed by the processor further performs the steps of:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
In one embodiment, the computer program when executed by the processor further performs the steps of:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
and determining the correction word among the error correction candidate words.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
In one embodiment, the computer program when executed by the processor further performs the steps of:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
In one embodiment, the computer program when executed by the processor further performs the steps of:
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of the present specification.
The above-mentioned embodiments only express several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (12)

1. A method of error correction for speech recognized text, the method comprising:
acquiring fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises a corpus of a general scene, and the second corpus comprises a corpus of a preset scene;
if the fluency of the voice recognition text is smaller than a fluency threshold value, acquiring words to be corrected in the voice recognition text;
and determining a correction word corresponding to the word to be corrected from an error correction database, and obtaining a corrected voice recognition text according to the correction word.
2. The method of claim 1, wherein the error correction database is constructed in a manner comprising:
obtaining the corpus of the second corpus;
performing word segmentation on the corpus of the second corpus by using a word segmentation dictionary to obtain candidate words;
and constructing the error correction database according to the candidate words and the pinyin of the candidate words.
3. The method of claim 2, further comprising:
obtaining confusion words corresponding to the candidate words;
and adding the confusion word into the word segmentation dictionary.
4. The method according to claim 3, wherein the obtaining of the word to be corrected in the speech recognition text comprises:
performing word segmentation on the voice recognition text by using the word segmentation dictionary to obtain text words;
calculating the average absolute deviation value of each text word;
and if the average absolute deviation value of the text word is greater than the deviation threshold value, determining that the text word is the word to be corrected.
5. The method according to claim 2, wherein the determining, from the error correction database, a correction word corresponding to the word to be corrected comprises:
determining an error correction candidate word corresponding to the word to be corrected from the error correction database;
and determining the correction word among the error correction candidate words.
6. The method according to claim 5, wherein the determining, from the error correction database, the error correction candidate word corresponding to the word to be corrected comprises:
obtaining the pinyin of the word to be corrected;
acquiring the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database;
and taking the candidate word with the similarity larger than a similarity threshold as the error correction candidate word.
7. The method according to claim 6, wherein the obtaining the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database comprises:
and acquiring the edit distance between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database, and representing the similarity between the pinyin of the word to be corrected and the pinyin of the candidate word in the error correction database by using the edit distance.
8. The method of claim 5, wherein the determining the correction word among the error correction candidate words comprises:
replacing words to be corrected in the voice recognition text by the error correction candidate words, and calculating fluency of the replaced voice recognition text by the preset language model;
and taking the error correction candidate word with the fluency meeting the preset condition as the correction word.
9. The method according to claim 8, wherein the preset language model is a binary language model and a ternary language model;
the calculating the fluency of the replaced voice recognition text by using the preset language model comprises the following steps:
respectively inputting the replaced voice recognition texts into the binary language model and the ternary language model to obtain fluency output by the binary language model and fluency output by the ternary language model;
and taking the maximum value of the fluency output by the binary language model and the fluency output by the ternary language model as the fluency of the voice recognition text.
10. An apparatus for correcting a speech-recognized text, the apparatus comprising:
an obtaining module, configured to obtain fluency of a voice recognition text by using a preset language model, wherein the preset language model is obtained by using corpus training of a first corpus and a second corpus, the first corpus comprises corpus of a general scene, and the second corpus comprises corpus of a preset scene;
the obtaining module is further configured to obtain a word to be corrected in the speech recognition text if the fluency of the speech recognition text is less than a fluency threshold;
and a determining module, configured to determine a correction word corresponding to the word to be corrected from an error correction database and obtain a corrected voice recognition text according to the correction word.
11. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 9 when executing the computer program.
12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
CN201910903618.1A 2019-09-24 2019-09-24 Error correction method and device for voice recognition text, computer equipment and storage medium Active CN110765763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910903618.1A CN110765763B (en) 2019-09-24 2019-09-24 Error correction method and device for voice recognition text, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910903618.1A CN110765763B (en) 2019-09-24 2019-09-24 Error correction method and device for voice recognition text, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110765763A true CN110765763A (en) 2020-02-07
CN110765763B CN110765763B (en) 2023-12-12

Family

ID=69330240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910903618.1A Active CN110765763B (en) 2019-09-24 2019-09-24 Error correction method and device for voice recognition text, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110765763B (en)

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444318A (en) * 2020-04-08 2020-07-24 厦门快商通科技股份有限公司 Text error correction method
CN111476641A (en) * 2020-04-13 2020-07-31 南京掌控网络科技有限公司 Method, system and storage medium for automatically placing order on mobile device by voice
CN111613214A (en) * 2020-05-21 2020-09-01 重庆农村商业银行股份有限公司 Language model error correction method for improving voice recognition capability
CN111651599A (en) * 2020-05-29 2020-09-11 北京搜狗科技发展有限公司 Method and device for sorting candidate voice recognition results
CN111680476A (en) * 2020-05-26 2020-09-18 广州多益网络股份有限公司 Method for intelligently generating business hot word recognition conversion of similar text
CN111783471A (en) * 2020-06-29 2020-10-16 中国平安财产保险股份有限公司 Semantic recognition method, device, equipment and storage medium of natural language
CN111881675A (en) * 2020-06-30 2020-11-03 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and storage medium
CN111985213A (en) * 2020-09-07 2020-11-24 科大讯飞华南人工智能研究院(广州)有限公司 Method and device for correcting voice customer service text
CN112151021A (en) * 2020-09-27 2020-12-29 北京达佳互联信息技术有限公司 Language model training method, speech recognition device and electronic equipment
CN112257437A (en) * 2020-10-20 2021-01-22 科大讯飞股份有限公司 Voice recognition error correction method and device, electronic equipment and storage medium
CN112509581A (en) * 2020-11-20 2021-03-16 北京有竹居网络技术有限公司 Method and device for correcting text after speech recognition, readable medium and electronic equipment
CN112580324A (en) * 2020-12-24 2021-03-30 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and storage medium
CN112767924A (en) * 2021-02-26 2021-05-07 北京百度网讯科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN112905775A (en) * 2021-02-24 2021-06-04 北京三快在线科技有限公司 Text processing method and device, electronic equipment and readable storage medium
CN113012705A (en) * 2021-02-24 2021-06-22 海信视像科技股份有限公司 Error correction method and device for voice text
CN113111660A (en) * 2021-04-22 2021-07-13 脉景(杭州)健康管理有限公司 Data processing method, device, equipment and storage medium
CN113129894A (en) * 2021-04-12 2021-07-16 阿波罗智联(北京)科技有限公司 Speech recognition method, speech recognition device, electronic device and storage medium
CN113157852A (en) * 2021-04-26 2021-07-23 深圳市优必选科技股份有限公司 Voice processing method, system, electronic equipment and storage medium
CN113270088A (en) * 2020-02-14 2021-08-17 阿里巴巴集团控股有限公司 Text processing method, data processing method, voice processing method, data processing device, voice processing device and electronic equipment
CN113326702A (en) * 2021-06-11 2021-08-31 北京猎户星空科技有限公司 Semantic recognition method and device, electronic equipment and storage medium
CN113449090A (en) * 2021-06-23 2021-09-28 山东新一代信息产业技术研究院有限公司 Error correction method, device and medium for intelligent question answering
CN113506586A (en) * 2021-06-18 2021-10-15 杭州摸象大数据科技有限公司 Method and system for recognizing emotion of user
CN113744718A (en) * 2020-05-27 2021-12-03 海尔优家智能科技(北京)有限公司 Voice text output method and device, storage medium and electronic device
CN113763961A (en) * 2020-06-02 2021-12-07 阿里巴巴集团控股有限公司 Text processing method and device
CN113781998A (en) * 2021-09-10 2021-12-10 未鲲(上海)科技服务有限公司 Dialect correction model-based voice recognition method, device, equipment and medium
CN113807080A (en) * 2020-06-15 2021-12-17 科沃斯商用机器人有限公司 Text correction method, text correction device and storage medium
CN114048321A (en) * 2021-08-12 2022-02-15 湖南达德曼宁信息技术有限公司 Multi-granularity text error correction data set generation method, device and equipment
CN114120972A (en) * 2022-01-28 2022-03-01 科大讯飞华南有限公司 Intelligent voice recognition method and system based on scene
CN114530145A (en) * 2020-11-23 2022-05-24 中移互联网有限公司 Speech recognition result error correction method and device, and computer readable storage medium
CN114896965A (en) * 2022-05-17 2022-08-12 马上消费金融股份有限公司 Text correction model training method and device and text correction method and device
CN116129906A (en) * 2023-02-14 2023-05-16 新声科技(深圳)有限公司 Speech recognition text revising method, device, computer equipment and storage medium
CN117194818A (en) * 2023-11-08 2023-12-08 北京信立方科技发展股份有限公司 Image-text webpage generation method and device based on video
CN117807990A (en) * 2023-12-27 2024-04-02 北京海泰方圆科技股份有限公司 Text processing method, device, equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107729321A (en) * 2017-10-23 2018-02-23 上海百芝龙网络科技有限公司 A kind of method for correcting error of voice identification result
CN107977356A (en) * 2017-11-21 2018-05-01 新疆科大讯飞信息科技有限责任公司 Method and device for correcting recognized text
CN108304385A (en) * 2018-02-09 2018-07-20 叶伟 A kind of speech recognition text error correction method and device
CN109522419A (en) * 2018-11-15 2019-03-26 北京搜狗科技发展有限公司 Session information complementing method and device
CN110210029A (en) * 2019-05-30 2019-09-06 浙江远传信息技术股份有限公司 Speech text error correction method, system, equipment and medium based on vertical field

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113270088B (en) * 2020-02-14 2022-04-29 阿里巴巴集团控股有限公司 Text processing method, data processing method, voice processing method, data processing device, voice processing device and electronic equipment
CN113270088A (en) * 2020-02-14 2021-08-17 阿里巴巴集团控股有限公司 Text processing method, data processing method, voice processing method, data processing device, voice processing device and electronic equipment
CN111444318A (en) * 2020-04-08 2020-07-24 厦门快商通科技股份有限公司 Text error correction method
CN111476641A (en) * 2020-04-13 2020-07-31 南京掌控网络科技有限公司 Method, system and storage medium for automatically placing order on mobile device by voice
CN111613214A (en) * 2020-05-21 2020-09-01 重庆农村商业银行股份有限公司 Language model error correction method for improving voice recognition capability
CN111680476A (en) * 2020-05-26 2020-09-18 广州多益网络股份有限公司 Method for intelligently generating business hot word recognition conversion of similar text
CN111680476B (en) * 2020-05-26 2024-01-30 广州多益网络股份有限公司 Method for intelligently generating service hotword recognition conversion of class text
CN113744718A (en) * 2020-05-27 2021-12-03 海尔优家智能科技(北京)有限公司 Voice text output method and device, storage medium and electronic device
CN111651599B (en) * 2020-05-29 2023-05-26 北京搜狗科技发展有限公司 Method and device for ordering voice recognition candidate results
CN111651599A (en) * 2020-05-29 2020-09-11 北京搜狗科技发展有限公司 Method and device for sorting candidate voice recognition results
CN113763961B (en) * 2020-06-02 2024-04-09 阿里巴巴集团控股有限公司 Text processing method and device
CN113763961A (en) * 2020-06-02 2021-12-07 阿里巴巴集团控股有限公司 Text processing method and device
CN113807080A (en) * 2020-06-15 2021-12-17 科沃斯商用机器人有限公司 Text correction method, text correction device and storage medium
CN111783471A (en) * 2020-06-29 2020-10-16 中国平安财产保险股份有限公司 Semantic recognition method, device, equipment and storage medium of natural language
CN111783471B (en) * 2020-06-29 2024-05-31 中国平安财产保险股份有限公司 Semantic recognition method, device, equipment and storage medium for natural language
CN111881675A (en) * 2020-06-30 2020-11-03 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and storage medium
CN111985213B (en) * 2020-09-07 2024-05-28 科大讯飞华南人工智能研究院(广州)有限公司 Voice customer service text error correction method and device
CN111985213A (en) * 2020-09-07 2020-11-24 科大讯飞华南人工智能研究院(广州)有限公司 Method and device for correcting voice customer service text
CN112151021A (en) * 2020-09-27 2020-12-29 北京达佳互联信息技术有限公司 Language model training method, speech recognition device and electronic equipment
CN112257437B (en) * 2020-10-20 2024-02-13 中国科学技术大学 Speech recognition error correction method, device, electronic equipment and storage medium
CN112257437A (en) * 2020-10-20 2021-01-22 科大讯飞股份有限公司 Voice recognition error correction method and device, electronic equipment and storage medium
CN112509581B (en) * 2020-11-20 2024-03-01 北京有竹居网络技术有限公司 Error correction method and device for text after voice recognition, readable medium and electronic equipment
CN112509581A (en) * 2020-11-20 2021-03-16 北京有竹居网络技术有限公司 Method and device for correcting text after speech recognition, readable medium and electronic equipment
CN114530145A (en) * 2020-11-23 2022-05-24 中移互联网有限公司 Speech recognition result error correction method and device, and computer readable storage medium
CN114530145B (en) * 2020-11-23 2023-08-15 中移互联网有限公司 Speech recognition result error correction method and device and computer readable storage medium
CN112580324B (en) * 2020-12-24 2023-07-25 北京百度网讯科技有限公司 Text error correction method, device, electronic equipment and storage medium
CN112580324A (en) * 2020-12-24 2021-03-30 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and storage medium
CN113012705A (en) * 2021-02-24 2021-06-22 海信视像科技股份有限公司 Error correction method and device for voice text
CN113012705B (en) * 2021-02-24 2022-12-09 海信视像科技股份有限公司 Error correction method and device for voice text
CN112905775A (en) * 2021-02-24 2021-06-04 北京三快在线科技有限公司 Text processing method and device, electronic equipment and readable storage medium
CN112767924A (en) * 2021-02-26 2021-05-07 北京百度网讯科技有限公司 Voice recognition method and device, electronic equipment and storage medium
US11842726B2 (en) 2021-02-26 2023-12-12 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus, electronic device and storage medium for speech recognition
JP2022088586A (en) * 2021-04-12 2022-06-14 阿波▲羅▼智▲聯▼(北京)科技有限公司 Voice recognition method, voice recognition device, electronic apparatus, storage medium computer program product and computer program
JP7349523B2 (en) 2021-04-12 2023-09-22 阿波▲羅▼智▲聯▼(北京)科技有限公司 Speech recognition method, speech recognition device, electronic device, storage medium computer program product and computer program
CN113129894A (en) * 2021-04-12 2021-07-16 阿波罗智联(北京)科技有限公司 Speech recognition method, speech recognition device, electronic device and storage medium
CN113111660A (en) * 2021-04-22 2021-07-13 脉景(杭州)健康管理有限公司 Data processing method, device, equipment and storage medium
CN113157852A (en) * 2021-04-26 2021-07-23 深圳市优必选科技股份有限公司 Voice processing method, system, electronic equipment and storage medium
CN113326702B (en) * 2021-06-11 2024-02-20 北京猎户星空科技有限公司 Semantic recognition method, semantic recognition device, electronic equipment and storage medium
CN113326702A (en) * 2021-06-11 2021-08-31 北京猎户星空科技有限公司 Semantic recognition method and device, electronic equipment and storage medium
CN113506586A (en) * 2021-06-18 2021-10-15 杭州摸象大数据科技有限公司 Method and system for recognizing emotion of user
CN113449090A (en) * 2021-06-23 2021-09-28 山东新一代信息产业技术研究院有限公司 Error correction method, device and medium for intelligent question answering
CN114048321A (en) * 2021-08-12 2022-02-15 湖南达德曼宁信息技术有限公司 Multi-granularity text error correction data set generation method, device and equipment
CN113781998A (en) * 2021-09-10 2021-12-10 未鲲(上海)科技服务有限公司 Dialect correction model-based voice recognition method, device, equipment and medium
CN113781998B (en) * 2021-09-10 2024-06-07 河南松音科技有限公司 Speech recognition method, device, equipment and medium based on dialect correction model
CN114120972B (en) * 2022-01-28 2022-04-12 科大讯飞华南有限公司 Intelligent voice recognition method and system based on scene
CN114120972A (en) * 2022-01-28 2022-03-01 科大讯飞华南有限公司 Intelligent voice recognition method and system based on scene
CN114896965B (en) * 2022-05-17 2023-09-12 马上消费金融股份有限公司 Text correction model training method and device, text correction method and device
CN114896965A (en) * 2022-05-17 2022-08-12 马上消费金融股份有限公司 Text correction model training method and device and text correction method and device
CN116129906A (en) * 2023-02-14 2023-05-16 新声科技(深圳)有限公司 Speech recognition text revising method, device, computer equipment and storage medium
CN117194818B (en) * 2023-11-08 2024-01-16 北京信立方科技发展股份有限公司 Image-text webpage generation method and device based on video
CN117194818A (en) * 2023-11-08 2023-12-08 北京信立方科技发展股份有限公司 Image-text webpage generation method and device based on video
CN117807990A (en) * 2023-12-27 2024-04-02 北京海泰方圆科技股份有限公司 Text processing method, device, equipment and medium
CN117807990B (en) * 2023-12-27 2024-07-19 北京海泰方圆科技股份有限公司 Text processing method, device, equipment and medium

Also Published As

Publication number Publication date
CN110765763B (en) 2023-12-12

Similar Documents

Publication Publication Date Title
CN110765763B (en) Error correction method and device for voice recognition text, computer equipment and storage medium
CN110457431B (en) Knowledge graph-based question and answer method and device, computer equipment and storage medium
CN111666401B (en) Document recommendation method, device, computer equipment and medium based on graph structure
CN110334179B (en) Question-answer processing method, device, computer equipment and storage medium
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN110598206A (en) Text semantic recognition method and device, computer equipment and storage medium
CN110688853B (en) Sequence labeling method and device, computer equipment and storage medium
CN108664595B (en) Domain knowledge base construction method and device, computer equipment and storage medium
CN110413961B (en) Method and device for text scoring based on classification model and computer equipment
CN112651238A (en) Training corpus expansion method and device and intention recognition model training method and device
US11157686B2 (en) Text sequence segmentation method, apparatus and device, and storage medium thereof
CN110674131A (en) Financial statement data processing method and device, computer equipment and storage medium
CN110162681B (en) Text recognition method, text processing method, text recognition device, text processing device, computer equipment and storage medium
CN110362798B (en) Method, apparatus, computer device and storage medium for judging information retrieval analysis
CN111177307A (en) Test scheme and system based on semantic understanding similarity threshold configuration
CN112766319A (en) Dialogue intention recognition model training method and device, computer equipment and medium
US12124487B2 (en) Search platform for unstructured interaction summaries
CN113159013A (en) Paragraph identification method and device based on machine learning, computer equipment and medium
CN111400340A (en) Natural language processing method and device, computer equipment and storage medium
CN109885695B (en) Asset suggestion generation method, device, computer equipment and storage medium
CN114896382A (en) Artificial intelligent question-answering model generation method, question-answering method, device and storage medium
CN114547087A (en) Method, device, equipment and medium for automatically identifying proposal and generating report
WO2021217619A1 (en) Label smoothing-based speech recognition method, terminal, and medium
CN117422064A (en) Search text error correction method, apparatus, computer device and storage medium
CN116186223A (en) Financial text processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment