CN117875313B - Chinese grammar error correction method and system - Google Patents
- Publication number
- CN117875313B, CN202410279802.4A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
Abstract
The application relates to a Chinese grammar error correction method and system, wherein the method comprises the following steps: acquiring an original text containing grammar errors; inputting the original text into a pretrained Bert model, and outputting a semantic representation vector; passing the semantic representation vector through two different feedforward neural networks with normalization to obtain a replication probability and an error type probability respectively, returning the index value of the maximum value of the error type probability, and determining a replication distribution vector based on that index value; calculating an editing tag probability based on the semantic representation vector; fusing the editing tag probability, the replication distribution vector and the replication probability to obtain a final editing tag probability, and determining a final editing tag based on the final editing tag probability; modifying the original text based on the final editing tag; and repeating until the original text is free of errors, thereby obtaining the correct text.
Description
Technical Field
The application relates to the technical field of Chinese grammar error correction, in particular to a Chinese grammar error correction method and a Chinese grammar error correction system.
Background
Chinese grammar error correction methods mainly follow two schemes: a seq2seq scheme based on machine translation, and a seq2edit scheme based on editing tag prediction. The seq2seq architecture based on machine translation is an autoregressive language model, so its inference is slow and it requires a large amount of training data; it also has poor interpretability, cannot distinguish the specific grammar error type of a sentence, and its speed cannot meet the requirements of an actual production environment. The current seq2edit architecture also has several problems. First, the Bert pre-trained language model is pre-trained on two tasks, masked language modeling (MLM, Masked Language Modeling) and next sentence prediction (NSP, Next Sentence Prediction); it lacks pre-training tasks related to character insertion and deletion, whereas grammar error correction tasks involve many redundancy and missing errors. Second, the requirements on the editing tags are high, and the prediction space of the editing tags is too large.
Disclosure of Invention
The application aims to provide a Chinese grammar error correction method and a Chinese grammar error correction system, which aim to improve the accuracy of Chinese grammar error correction.
The embodiment of the application provides a Chinese grammar error correction method, which comprises the following steps:
S1: acquiring an original text containing grammar errors;
S2: inputting the original text into a pretrained Bert model, and outputting a semantic representation vector;
S3: passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a replication probability and an error type probability respectively, returning the index value of the maximum value of the error type probability, and determining a replication distribution vector based on that index value;
calculating an editing tag probability based on the semantic representation vector;
S4: fusing the editing tag probability, the replication distribution vector and the replication probability to obtain a final editing tag probability, and determining a final editing tag based on the final editing tag probability;
S5: modifying the original text based on the final editing tag;
S6: repeating S2-S5 until the original text is free of errors, thereby obtaining the correct text.
The embodiment of the application also provides a Chinese grammar error correction system, which comprises:
the acquisition module is used for acquiring the original text containing grammar errors;
The characterization module is used for inputting the original text into the pretrained Bert model and outputting a semantic characterization vector;
The replication module is used for passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a replication probability and an error type probability respectively, returning the index value of the maximum value of the error type probability, and determining a replication distribution vector based on that index value;
The editing tag probability calculation module is used for calculating the editing tag probability based on the semantic characterization vector;
The editing tag prediction module is used for fusing the editing tag probability, the replication distribution vector and the replication probability to obtain a final editing tag probability, and determining a final editing tag based on the final editing tag probability;
The modification module is used for modifying the original text based on the final editing tag;
The iteration module is used for treating one sequential run of the characterization module, the replication module, the editing tag probability calculation module, the editing tag prediction module and the modification module as one cycle, and repeating the cycle until the original text is free of errors, thereby obtaining the correct text.
The beneficial effects of the application are as follows. The original text is input into the pretrained Bert model, which outputs a semantic representation vector; the replication probability and the error type probability are each obtained from the semantic representation vector, the index value of the maximum value of the error type probability is returned, and the replication distribution vector is determined based on that index value; the editing tag probability is calculated from the semantic representation vector; the editing tag probability, the replication distribution vector and the replication probability are fused to obtain the final editing tag probability, from which the final editing tag is determined; the original text is modified according to the final editing tag; and these steps are executed cyclically until the original text is free of errors, yielding the correct text. This improves the adaptation of the pretrained Bert model to the grammar error correction task, solves the sparse-label problem of the downstream grammar error correction task, improves overall performance, and improves the accuracy of Chinese grammar error correction.
Drawings
Fig. 1 is a flowchart of a Chinese grammar error correction method provided by an embodiment of the present application.
Fig. 2 is a flowchart of pretraining a Bert model provided in an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
As shown in Fig. 1, the present embodiment provides a Chinese grammar error correction method, which includes:
S1: acquiring an original text containing grammar errors.
S2: inputting the original text to a pretrained Bert model, and outputting a semantic representation vector.
Specifically, as shown in fig. 2, the pretraining process of the Bert model includes:
S2.1: obtaining an error-free text, and performing a substitution operation and an insertion operation on the error-free text to obtain a lost text, the lost text comprising a text with missing errors and a text with redundancy errors;
further, the process of obtaining the lost text includes:
randomly extracting 20% of the characters from the error-free text, selecting 50% of the extracted characters for the substitution operation to obtain the text with missing errors, and selecting the remaining 50% for the insertion operation to obtain the text with redundancy errors;
the substitution operation comprises: for any one of the selected characters, replacing it with a "[mask]" mark with 50% probability, replacing it with a random character with 25% probability, and replacing it with a character ranked in the top 10 by prediction probability among all characters with 25% probability;
the insertion operation comprises: randomly selecting an insertion position among the selected characters, and, at that position, inserting a "[mask]" mark with 50% probability, inserting a character randomly selected from the error-free text with 15% probability, and inserting a character randomly selected from the top 10 characters by prediction probability in the MLM pre-training task of the Bert model with 35% probability.
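The sampling scheme of S2.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `corrupt`, `substitute`, `insert`, `vocab`, `top10` and the `"[D]"` spelling of the delete marker are hypothetical, `top10` is a stand-in for the top-10 MLM predictions of the Bert model, and placing an inserted character immediately before the selected position is an arbitrary choice.

```python
import random

MASK = "[mask]"    # mask mark, written as in the text
DELETE = "[D]"     # hypothetical spelling of the delete marker

def substitute(chars, positions, vocab, top10, rng):
    """Build the text with missing errors; labels hold the real character."""
    out, labels = list(chars), [None] * len(chars)
    for i in positions:
        u = rng.random()
        if u < 0.50:                       # 50%: "[mask]" mark
            out[i] = MASK
        elif u < 0.75:                     # 25%: random character
            out[i] = rng.choice(vocab)
        else:                              # 25%: one of the top-10 predictions
            out[i] = rng.choice(top10(chars, i))
        labels[i] = chars[i]
    return out, labels

def insert(chars, positions, top10, rng):
    """Build the text with redundancy errors; inserted slots are labelled DELETE."""
    out, labels, chosen = [], [], set(positions)
    for i, ch in enumerate(chars):
        if i in chosen:
            u = rng.random()
            if u < 0.50:                   # 50%: "[mask]" mark
                new = MASK
            elif u < 0.65:                 # 15%: character from the error-free text
                new = rng.choice(chars)
            else:                          # 35%: one of the top-10 predictions
                new = rng.choice(top10(chars, i))
            out.append(new)
            labels.append(DELETE)
        out.append(ch)
        labels.append(None)
    return out, labels

def corrupt(chars, vocab, top10, rng, rate=0.2):
    """Pick about 20% of positions; half are substituted, the rest get insertions."""
    k = min(len(chars), max(2, round(rate * len(chars))))
    picked = rng.sample(range(len(chars)), k)
    half = k // 2
    return (substitute(chars, picked[:half], vocab, top10, rng),
            insert(chars, picked[half:], top10, rng))
```

The two returned pairs correspond to the text with missing errors and the text with redundancy errors; the label lists mark which positions would contribute to a pre-training loss.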
S2.2: inputting the lost text into a Bert model, and outputting semantic feature representations;
S2.3: passing the semantic feature representation through a Softmax fully connected layer to obtain a prediction probability, wherein the calculation formula is:
$$P = \mathrm{Softmax}(W h + b);$$
wherein $P$ represents the prediction probability; $\mathrm{Softmax}(\cdot)$ represents the Softmax fully connected layer; $h$ represents the semantic feature representation; $W$ represents the weights of the Softmax fully connected layer; and $b$ represents the bias of the Softmax fully connected layer;
S2.4: returning the index of the maximum value of the prediction probability to obtain a predicted character result, wherein the calculation formula is:
$$\hat{y} = \mathrm{Argmax}(P);$$
wherein $\hat{y}$ represents the predicted character result; and $\mathrm{Argmax}(\cdot)$ represents the Argmax function;
S2.5: calculating a pre-training loss function based on the real characters in the error-free text and the predicted character results, wherein the calculation formula is:
$$\mathcal{L} = -\sum_{i \in M_r} \log p(\hat{y}_i = y_i \mid h_i) - \sum_{i \in M_a} \log p(\hat{y}_i = [D] \mid h_i);$$
wherein $\mathcal{L}$ represents the pre-training loss function; $i$ is an ordinal number; $M_r$ represents the set of substitution position markers in $X_r$, the text with missing errors; $M_a$ represents the set of insertion position markers in $X_a$, the text with redundancy errors; $\hat{y}_i$ represents the i-th predicted character result; $y_i$ represents the i-th real character in the error-free text; $h_i$ represents the semantic feature representation of the i-th character in the lost text; $[D]$ represents the delete marker; and $p(\cdot \mid h_i)$ represents the probability, given $h_i$, that $\hat{y}_i$ is the real character or the delete marker, the target being the real character $y_i$ in the case of a substitution error and the delete marker $[D]$ in the case of a redundancy error;
S2.6: training the Bert model according to the pre-training loss function until convergence or until a maximum number of iterations is reached, thereby obtaining the pretrained Bert model.
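Steps S2.3 to S2.5 can be illustrated with a small pure-Python sketch. It assumes the pre-training loss is a negative log-likelihood summed over the marked positions and that the delete marker simply has its own index in the output vocabulary; the function names `predict` and `pretrain_loss` are hypothetical and the weights are toy values.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict(h, W, b):
    """Softmax fully connected layer (S2.3): probabilities over the vocabulary."""
    logits = [sum(w * x for w, x in zip(row, h)) + c for row, c in zip(W, b)]
    return softmax(logits)

def argmax(p):
    """Index of the maximum value (S2.4)."""
    return max(range(len(p)), key=p.__getitem__)

def pretrain_loss(probs_per_position, targets):
    """Assumed loss (S2.5): sum of -log p(target) over marked positions.

    targets[i] is the vocabulary index of the real character (substitution)
    or of the delete marker (insertion), and None for untouched positions.
    """
    return -sum(math.log(p[t])
                for p, t in zip(probs_per_position, targets)
                if t is not None)

# Toy usage: one position, a two-entry vocabulary, identity weights.
probs = predict([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
predicted = argmax(probs)
loss = pretrain_loss([probs], [0])
```

In practice the same computation would run batched on tensors; the list-based form is only meant to make each term of the formulas visible.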
S3: passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain the replication probability and the error type probability respectively, returning the index value of the maximum value of the error type probability, and determining the replication distribution vector based on that index value.
Specifically, the calculation formulas for obtaining the replication probability and the error type probability are:
$$o_i^{c} = W_1^{c} h_i + b_1^{c};$$
$$\tilde{o}_i^{c} = \mathrm{Norm}(\mathrm{Act}(o_i^{c}));$$
$$P^{copy} = \mathrm{Sigmoid}^{c}(W_2^{c}\,\tilde{o}_i^{c});$$
$$o_i^{t} = W_1^{t} h_i + b_1^{t};$$
$$\tilde{o}_i^{t} = \mathrm{Norm}(\mathrm{Act}(o_i^{t}));$$
$$P^{type} = \mathrm{Sigmoid}^{t}(W_2^{t}\,\tilde{o}_i^{t});$$
wherein $W_1^{c}$ represents the weights of the first layer of the first two-layer feedforward neural network with normalization; $b_1^{c}$ represents the bias of that first layer; $\mathrm{Norm}(\cdot)$ represents any normalization function; $h_i$ represents the i-th value in the semantic representation vector; $\mathrm{Act}(\cdot)$ represents any activation function; $P^{copy}$ represents the replication probability; $\mathrm{Sigmoid}^{c}(\cdot)$ represents the Sigmoid fully connected layer in the first two-layer feedforward neural network with normalization; $W_2^{c}$ represents the weights of $\mathrm{Sigmoid}^{c}$; $W_1^{t}$ represents the weights of the first layer of the second two-layer feedforward neural network with normalization; $b_1^{t}$ represents the bias of that first layer; $P^{type}$ represents the error type probability; $\mathrm{Sigmoid}^{t}(\cdot)$ represents the Sigmoid fully connected layer in the second two-layer feedforward neural network with normalization; $W_2^{t}$ represents the weights of $\mathrm{Sigmoid}^{t}$; $o_i^{c}$ represents the intermediate output of $h_i$ through the first fully connected layer of the first two-layer feedforward neural network with normalization; $\tilde{o}_i^{c}$ represents the output of $o_i^{c}$ after any activation function and normalization; $o_i^{t}$ represents the intermediate output of $h_i$ through the first fully connected layer of the second two-layer feedforward neural network with normalization; and $\tilde{o}_i^{t}$ represents the output of $o_i^{t}$ after any activation function and normalization.
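The two heads of S3 can be sketched as below. The text allows any activation and any normalization function; ReLU and layer normalization are chosen here purely as assumptions, as are the hidden size, the random toy weights, and the helper names `head` and `rand_matrix`.

```python
import math
import random

def linear(x, W, b=None):
    """Fully connected layer: one output per row of W, plus an optional bias."""
    out = [sum(w * v for w, v in zip(row, x)) for row in W]
    return out if b is None else [o + c for o, c in zip(out, b)]

def relu(v):
    return [max(0.0, x) for x in v]

def layer_norm(v, eps=1e-5):
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def head(h, W1, b1, W2):
    """Two-layer feedforward network with normalization and a Sigmoid output layer."""
    o = linear(h, W1, b1)              # first fully connected layer
    o_tilde = layer_norm(relu(o))      # activation, then normalization
    return [sigmoid(z) for z in linear(o_tilde, W2)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
h = [rng.uniform(-1.0, 1.0) for _ in range(8)]   # toy semantic representation
# First network: a single output, the replication probability.
copy_prob = head(h, rand_matrix(6, 8, rng), [0.0] * 6, rand_matrix(1, 6, rng))[0]
# Second network: four outputs, one per error type (0 to 3).
type_probs = head(h, rand_matrix(6, 8, rng), [0.0] * 6, rand_matrix(4, 6, rng))
```

The two heads share the input but not their weights, which matches the description of "two different" networks.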
Further, returning the index value of the maximum value of the error type probability and determining the replication distribution vector based on that index value includes:
returning the index value of the maximum value of the error type probability through the Argmax function, wherein the calculation formula is:
$$type = \mathrm{Argmax}(P^{type});$$
wherein $type$ represents the index value of the maximum value of the error type probability, 0 indicating no error, 1 indicating a redundancy error, 2 indicating a substitution error, and 3 indicating a missing error; $\mathrm{Argmax}(\cdot)$ represents the Argmax function; and $P^{type}$ represents the error type probability;
determining the values in the replication distribution vector using the index value of the maximum value of the error type probability, wherein the replication distribution vector is expressed as:
$$CopyDistribute = [k, d, r_1, r_2, \ldots, r_n, a_1, a_2, \ldots, a_n];$$
wherein $CopyDistribute$ represents the replication distribution vector; $k$ represents the element for no error; $d$ represents the element for a redundancy error; $r_1$ represents the 1st element for a substitution error; $r_2$ represents the 2nd element for a substitution error; $r_n$ represents the $n$-th element for a substitution error; $a_1$ represents the 1st element for a missing error; $a_2$ represents the 2nd element for a missing error; $a_n$ represents the $n$-th element for a missing error; and $n$ represents the number of editing tags;
in the replication distribution vector,
when $type = 0$, $k = 1$ and the remaining elements are all 0;
when $type = 1$, $d = 1$ and the remaining elements are all 0;
when $type = 2$, $r_1, r_2, \ldots, r_n$ are all 1 and the remaining elements are all 0;
when $type = 3$, $a_1, a_2, \ldots, a_n$ are all 1 and the remaining elements are all 0.
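The construction of the replication distribution vector can be written out directly. The sketch below assumes a layout of one no-error element, one redundancy element, then n substitution elements followed by n missing-error elements; the function name `copy_distribution` is hypothetical.

```python
def copy_distribution(type_probs, n):
    """Return (type, vector) for the error type with the highest probability.

    Assumed layout of the vector: [k, d, r_1..r_n, a_1..a_n].
    """
    t = max(range(len(type_probs)), key=type_probs.__getitem__)  # Argmax
    vec = [0] * (2 + 2 * n)
    if t == 0:                       # no error: only k is 1
        vec[0] = 1
    elif t == 1:                     # redundancy error: only d is 1
        vec[1] = 1
    elif t == 2:                     # substitution error: all r elements are 1
        vec[2:2 + n] = [1] * n
    else:                            # missing error: all a elements are 1
        vec[2 + n:] = [1] * n
    return t, vec
```

Because the vector is a 0/1 mask over groups of editing tags rather than a normalized distribution, it narrows which tags are candidates without ranking them.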
The editing tag probability is calculated based on the semantic representation vector.
Specifically, the calculation formulas are:
$$o_i^{e} = W_1^{e} h_i + b_1^{e};$$
$$\tilde{o}_i^{e} = \mathrm{Norm}(\mathrm{Act}(o_i^{e}));$$
$$P_i^{edit} = W_2^{e}\,\tilde{o}_i^{e};$$
wherein $h_i$ represents the i-th value in the semantic representation vector; $\mathrm{Norm}(\cdot)$ represents any normalization function; $\mathrm{Act}(\cdot)$ represents any activation function; $P_i^{edit}$ represents the editing tag probability of the i-th character; $o_i^{e}$ represents the intermediate output of $h_i$ through the first fully connected layer of the third two-layer feedforward neural network with normalization; $\tilde{o}_i^{e}$ represents the output of $o_i^{e}$ after any activation function and normalization; $W_1^{e}$ represents the weights of the first layer of the third two-layer feedforward neural network with normalization; $b_1^{e}$ represents the bias of that first layer; and $W_2^{e}$ represents the weights of the second layer of the third two-layer feedforward neural network with normalization.
S4: fusing the editing tag probability, the replication distribution vector and the replication probability to obtain the final editing tag probability, and returning the index of the maximum value of the final editing tag probability through the Argmax function to obtain the final editing tag.
Further, the calculation formula of the final editing tag probability is:
;
wherein the formula involves the final editing tag probability, the replication probability, the replication distribution vector CopyDistribute, and the editing tag probability.
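One plausible way to realize the fusion of S4 is a convex mixture weighted by the replication probability, in the style of a soft copy mechanism. This mixing rule is an assumption made for illustration only; the text here does not confirm the exact formula, and the function name `fuse` is hypothetical.

```python
def fuse(edit_probs, copy_vec, p_copy):
    """Assumed fusion: p_copy * CopyDistribute + (1 - p_copy) * edit probability.

    Returns the fused scores and the index of the final editing tag (Argmax).
    """
    final = [p_copy * c + (1.0 - p_copy) * e
             for c, e in zip(copy_vec, edit_probs)]
    tag = max(range(len(final)), key=final.__getitem__)
    return final, tag

# Toy usage with four tags laid out as [k, d, r_1, a_1].
edit_probs = [0.5, 0.1, 0.3, 0.1]
copy_vec = [0, 0, 1, 0]            # the error type head voted "substitution"
_, confident = fuse(edit_probs, copy_vec, p_copy=0.8)
_, doubtful = fuse(edit_probs, copy_vec, p_copy=0.1)
```

Under this assumed rule, a high replication probability lets the error type judgment override the tag head (here selecting the substitution tag), while a low one falls back to the tag head's own preference, which is one way to read the "buffer" between the two tasks.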
S5: and modifying the original text based on the final editing tag.
S6: and repeating S2-S5 until the original text is free of errors, and obtaining the correct text.
In this embodiment, the editing tags include keep, delete, replace, and insert.
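The iteration of S2 to S5 can be sketched as a loop that applies per-character editing tags until every tag is "keep". The tagger is passed in as `predict_tags`, a hypothetical stand-in for the full model pipeline; the `(operation, argument)` tag format, the choice to insert after the current character, and the `max_rounds` safety cap are all assumptions not stated in the text.

```python
def apply_tags(chars, tags):
    """Apply one (operation, argument) editing tag per character (S5)."""
    out = []
    for ch, (op, arg) in zip(chars, tags):
        if op == "keep":
            out.append(ch)
        elif op == "delete":
            continue
        elif op == "replace":
            out.append(arg)
        elif op == "insert":          # assumption: insert after the character
            out.extend([ch, arg])
    return out

def correct(chars, predict_tags, max_rounds=5):
    """Repeat tag prediction and modification until the text is error-free (S6)."""
    for _ in range(max_rounds):
        tags = predict_tags(chars)
        if all(op == "keep" for op, _ in tags):
            break
        chars = apply_tags(chars, tags)
    return chars

def toy_tagger(chars):
    """Stand-in tagger: marks a character repeated from its left neighbour."""
    return [("delete", None) if i > 0 and ch == chars[i - 1] else ("keep", None)
            for i, ch in enumerate(chars)]

fixed = "".join(correct(list("我我爱你"), toy_tagger))
```

A real system would replace `toy_tagger` with the Bert encoder, the three heads and the fusion step; the loop structure stays the same.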
According to the Chinese grammar error correction method provided by this embodiment, the original text is input into the pretrained Bert model, which outputs a semantic representation vector; the replication probability and the error type probability are each obtained from the semantic representation vector, the index value of the maximum value of the error type probability is returned, and the replication distribution vector is determined based on that index value; the editing tag probability is calculated from the semantic representation vector; the editing tag probability, the replication distribution vector and the replication probability are fused to obtain the final editing tag probability, from which the final editing tag is determined; the original text is modified according to the final editing tag; and these steps are executed cyclically until the original text is free of errors, yielding the correct text. This improves the adaptation of the pretrained Bert model to the grammar error correction task, solves the sparse-label problem of the downstream grammar error correction task, improves overall performance, and improves the accuracy of Chinese grammar error correction.
The embodiment also provides a Chinese grammar error correction system, which comprises:
the device comprises an acquisition module, a pre-training module and a correction module;
the acquisition module is used for acquiring the original text containing grammar errors;
The pretraining module is used for pretraining the Bert model;
The correction module includes: the system comprises a characterization module, a replication module, an editing tag probability calculation module, an editing tag prediction module, a modification module and an iteration module;
The characterization module is used for inputting the original text into the pretrained Bert model and outputting a semantic characterization vector;
The replication module is used for passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a replication probability and an error type probability respectively, returning the index value of the maximum value of the error type probability, and determining a replication distribution vector based on that index value;
the replication module includes: a duplication probability calculation sub-module and an error type judgment sub-module;
the replication probability calculation sub-module is used for calculating the replication probability based on the semantic representation vector;
The error type judging sub-module is used for calculating error type probability based on the semantic characterization vector, returning an index value of the maximum value of the error type probability and determining a copy distribution vector based on the index value of the maximum value;
The editing tag probability calculation module is used for calculating the editing tag probability based on the semantic characterization vector;
The editing tag prediction module is used for fusing the editing tag probability, the replication distribution vector and the replication probability to obtain a final editing tag probability, and determining a final editing tag based on the final editing tag probability;
The modification module is used for modifying the original text based on the final editing tag;
The iteration module is used for treating one sequential run of the characterization module, the replication module, the editing tag probability calculation module, the editing tag prediction module and the modification module as one cycle, and repeating the cycle until the original text is free of errors, thereby obtaining the correct text.
In the Chinese grammar error correction system provided by this embodiment, the confidence of the [PAD] label at grammar error positions is judged through the MLM pre-training task of the Bert model and used to calculate the loss function of the pre-training task, so that after the model converges the Bert model is able to represent missing-error and redundancy-error information, providing reliable and comprehensive information for the downstream grammar error correction task. In addition, a replication module is introduced into the correction module, and the error type judgment task in the replication module (embodied by the replication distribution vector) serves as an auxiliary task of the editing tag prediction task, which narrows the search range of the editing tags and improves the accuracy of editing tag prediction. Meanwhile, the replication probability is introduced, and a method is designed for fusing the information of the error type judgment task with the information of the editing tag prediction task: it can shield erroneous judgments of the auxiliary task, enhance the information when the auxiliary task judges correctly, and, through the soft replication mode of the replication probability, add a buffer between the two tasks.
Exemplary embodiments of the present disclosure are specifically illustrated and described above. It is to be understood that this disclosure is not limited to the particular arrangements, instrumentalities and methods of implementation described herein; on the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (8)
1. A Chinese grammar error correction method, comprising:
S1: acquiring an original text containing grammar errors;
S2: inputting the original text to a pretrained Bert model, and outputting a semantic representation vector;
The pretraining process of the Bert model comprises the following steps:
S2.1: obtaining an error-free text, and performing a substitution operation and an insertion operation on the error-free text to obtain a lost text, the lost text comprising a text with missing errors and a text with redundancy errors;
S2.2: inputting the lost text into a Bert model, and outputting semantic feature representations;
S2.3: passing the semantic feature representation through a Softmax fully connected layer to obtain a prediction probability, wherein the calculation formula is:
$$P = \mathrm{Softmax}(W h + b);$$
wherein $P$ represents the prediction probability; $\mathrm{Softmax}(\cdot)$ represents the Softmax fully connected layer; $h$ represents the semantic feature representation; $W$ represents the weights of the Softmax fully connected layer; and $b$ represents the bias of the Softmax fully connected layer;
S2.4: returning the index of the maximum value of the prediction probability to obtain a predicted character result, wherein the calculation formula is:
$$\hat{y} = \mathrm{Argmax}(P);$$
wherein $\hat{y}$ represents the predicted character result; and $\mathrm{Argmax}(\cdot)$ represents the Argmax function;
S2.5: calculating a pre-training loss function based on the real characters in the error-free text and the predicted character results, wherein the calculation formula is:
$$\mathcal{L} = -\sum_{i \in M_r} \log p(\hat{y}_i = y_i \mid h_i) - \sum_{i \in M_a} \log p(\hat{y}_i = [D] \mid h_i);$$
wherein $\mathcal{L}$ represents the pre-training loss function; $i$ is an ordinal number; $M_r$ represents the set of substitution position markers in $X_r$, the text with missing errors; $M_a$ represents the set of insertion position markers in $X_a$, the text with redundancy errors; $\hat{y}_i$ represents the i-th predicted character result; $y_i$ represents the i-th real character in the error-free text; $h_i$ represents the semantic feature representation of the i-th character in the lost text; $[D]$ represents the delete marker; and $p(\cdot \mid h_i)$ represents the probability, given $h_i$, that $\hat{y}_i$ is the real character or the delete marker, the target being the real character $y_i$ in the case of a substitution error and the delete marker $[D]$ in the case of a redundancy error;
S2.6: training the Bert model according to the pre-training loss function until convergence or until a maximum number of iterations is reached, thereby obtaining the pretrained Bert model;
S3: passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a replication probability and an error type probability respectively, returning the index value of the maximum value of the error type probability, and determining a replication distribution vector based on that index value;
wherein returning the index value of the maximum value of the error type probability and determining the replication distribution vector based on that index value includes:
returning the index value of the maximum value of the error type probability through the Argmax function, wherein the calculation formula is:
$$type = \mathrm{Argmax}(P^{type});$$
wherein $type$ represents the index value of the maximum value of the error type probability, 0 indicating no error, 1 indicating a redundancy error, 2 indicating a substitution error, and 3 indicating a missing error; $\mathrm{Argmax}(\cdot)$ represents the Argmax function; and $P^{type}$ represents the error type probability;
determining the values in the replication distribution vector using the index value of the maximum value of the error type probability, wherein the replication distribution vector is expressed as:
$$CopyDistribute = [k, d, r_1, r_2, \ldots, r_n, a_1, a_2, \ldots, a_n];$$
wherein $CopyDistribute$ represents the replication distribution vector; $k$ represents the element for no error; $d$ represents the element for a redundancy error; $r_1$ represents the 1st element for a substitution error; $r_2$ represents the 2nd element for a substitution error; $r_n$ represents the $n$-th element for a substitution error; $a_1$ represents the 1st element for a missing error; $a_2$ represents the 2nd element for a missing error; $a_n$ represents the $n$-th element for a missing error; and $n$ represents the number of editing tags;
in the replication distribution vector,
when $type = 0$, $k = 1$ and the remaining elements are all 0;
when $type = 1$, $d = 1$ and the remaining elements are all 0;
when $type = 2$, $r_1, r_2, \ldots, r_n$ are all 1 and the remaining elements are all 0;
when $type = 3$, $a_1, a_2, \ldots, a_n$ are all 1 and the remaining elements are all 0;
calculating an edit tag probability based on the semantic characterization vector;
S4: fusing the edit tag probability, the replication distribution vector, and the replication probability to obtain a final edit tag probability, and determining a final edit tag based on the final edit tag probability;
S5: modifying the original text based on the final edit tag;
S6: repeating S2 to S5 until the original text is free of errors, thereby obtaining the correct text.
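As an illustration of how S3 maps the error type to the replication distribution vector, a minimal Python sketch follows. The function name `copy_distribution`, the parameter `n` (taken here as the count of substitution elements and, equally, of missing-error elements), and the resulting vector length of 2 + 2n are assumptions made for the example, not details fixed by the claim.

```python
def copy_distribution(p_type, n):
    """Build [k, d, r_1..r_n, a_1..a_n] from the four error type probabilities."""
    # Argmax: index of the largest error type probability.
    error_type = max(range(len(p_type)), key=lambda idx: p_type[idx])
    vec = [0] * (2 + 2 * n)
    if error_type == 0:        # no error: only k is set
        vec[0] = 1
    elif error_type == 1:      # redundancy error: only d is set
        vec[1] = 1
    elif error_type == 2:      # substitution error: all r elements are set
        vec[2:2 + n] = [1] * n
    elif error_type == 3:      # missing error: all a elements are set
        vec[2 + n:] = [1] * n
    else:
        raise ValueError("expected exactly four error types")
    return error_type, vec


print(copy_distribution([0.1, 0.2, 0.6, 0.1], n=3))
# (2, [0, 0, 1, 1, 1, 0, 0, 0])
```

Setting every substitution (or missing-error) element to 1 leaves the choice of the concrete character to the edit tag probability computed in the next step.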
2. The method of claim 1, wherein in S2.1, performing the replacement operation and the insertion operation on the error-free text to obtain the lost text comprises:
randomly extracting 20% of the characters from the error-free text, selecting 50% of the extracted characters for the replacement operation to obtain the text with missing errors, and selecting the remaining 50% for the insertion operation to obtain the text with redundancy errors;
the replacement operation comprises, for any one of the selected characters, replacing it with a "[mask]" marker with 50% probability, replacing it with a random character with 25% probability, and replacing it with a character ranked in the top 10 by prediction probability among all characters with 25% probability;
the insertion operation comprises randomly selecting an insertion position among the selected characters and, at that position, inserting a "[mask]" marker with 50% probability, inserting a character randomly selected from the error-free text with 15% probability, and inserting a character randomly selected from the top 10 characters ranked by prediction probability with 35% probability.
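The corruption procedure of this claim can be sketched as follows. This is only an illustration under stated assumptions: `vocab` (the pool for random replacement) and `top10` (a stand-in for the model's ten most probable characters) are hypothetical inputs, an inserted character is placed before the selected character, and the `ops` list is added purely to make the result inspectable.

```python
import random

MASK = "[mask]"


def corrupt(text, vocab, top10, rng):
    """Corrupt error-free text with replacements and insertions."""
    chars = list(text)
    # Draw 20% of the positions, then split them half and half.
    picked = rng.sample(range(len(chars)), round(0.2 * len(chars)))
    half = len(picked) // 2
    replace_at, insert_at = set(picked[:half]), set(picked[half:])

    out, ops = [], []
    for i, ch in enumerate(chars):
        r = rng.random()
        if i in replace_at:
            if r < 0.50:                 # 50%: mask marker
                new = MASK
            elif r < 0.75:               # 25%: random character
                new = rng.choice(vocab)
            else:                        # 25%: one of the top-10 predictions
                new = rng.choice(top10)
            out.append(new)
            ops.append("replace")
        elif i in insert_at:
            if r < 0.50:                 # 50%: mask marker
                new = MASK
            elif r < 0.65:               # 15%: character from the clean text
                new = rng.choice(chars)
            else:                        # 35%: one of the top-10 predictions
                new = rng.choice(top10)
            out.extend([new, ch])        # assumed: insert before the character
            ops.extend(["insert", "keep"])
        else:
            out.append(ch)
            ops.append("keep")
    return out, ops


rng = random.Random(0)
noisy, ops = corrupt("abcdefghijklmnopqrst", list("xyz"), list("0123456789"), rng)
```

For a 20-character input this yields two replaced positions and two inserted characters, so the corrupted sequence has 22 elements.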
3. The Chinese grammar error correction method of claim 1, wherein in S3, the calculation formulas for obtaining the replication probability and the error type probability are:
;
;
;
;
;
;
wherein the symbols in the formulas denote: the weight of the first layer of the first two-layer feedforward neural network with normalization; the bias of that first layer; any normalization function; the i-th value in the semantic characterization vector; any activation function; the replication probability; the Sigmoid fully connected layer in the first two-layer feedforward neural network with normalization, and the weight of that layer; the weight of the first layer of the second two-layer feedforward neural network with normalization; the bias of that first layer; the error type probability; the Sigmoid fully connected layer in the second two-layer feedforward neural network with normalization, and the weight of that layer; the intermediate output of the i-th value through the first fully connected layer of the first network; that intermediate output after the activation function and normalization; the intermediate output of the i-th value through the first fully connected layer of the second network; and that intermediate output after the activation function and normalization.
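A rough sketch of the two heads described in this claim is given below in plain Python. Several points are assumptions rather than statements of the claim: ReLU stands in for "any activation function", layer normalization for "any normalization function", the activation is applied before the normalization, the layer sizes are arbitrary, and the weights are random placeholders instead of trained parameters.

```python
import math
import random


def matvec(w, v):
    # Multiply matrix w (a list of rows) by vector v.
    return [sum(a * b for a, b in zip(row, v)) for row in w]


def relu(v):
    return [max(0.0, x) for x in v]


def layer_norm(v, eps=1e-5):
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def head(h_i, w1, b1, w2):
    """Two-layer feedforward head: linear, activation + norm, Sigmoid layer."""
    z = [s + b for s, b in zip(matvec(w1, h_i), b1)]  # first fully connected layer
    a = layer_norm(relu(z))                           # assumed order: act, then norm
    return sigmoid(matvec(w2, a))                     # Sigmoid fully connected layer


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
hidden, inner = 8, 6                                  # illustrative sizes only
h_i = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]

# First head: a single replication probability.
p_copy = head(h_i, rand_matrix(inner, hidden, rng), [0.0] * inner,
              rand_matrix(1, inner, rng))[0]
# Second head: one score per error type (none, redundancy, substitution, missing).
p_type = head(h_i, rand_matrix(inner, hidden, rng), [0.0] * inner,
              rand_matrix(4, inner, rng))
error_type = max(range(4), key=lambda k: p_type[k])
```

The two heads share the same input but have separate parameters, matching the claim's "two different" networks.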
4. The method of claim 1, wherein in S3, calculating the edit tag probability based on the semantic characterization vector comprises:
;
;
;
wherein the symbols in the formulas denote: the i-th value in the semantic characterization vector; any normalization function; any activation function; the edit tag probability of the i-th character; the intermediate output of the i-th value through the first fully connected layer of the third two-layer feedforward neural network with normalization; that intermediate output after the activation function and normalization; the weight of the first layer of the third two-layer feedforward neural network with normalization; the bias of that first layer; and the weight of the second layer of the third two-layer feedforward neural network with normalization.
5. The method of claim 1, wherein in S4, the calculation formula of the final edit tag probability is:
;
wherein the terms of the formula denote the final edit tag probability, the replication probability, the replication distribution vector CopyDistribute, and the edit tag probability.
6. The method of claim 1, wherein in S4, determining the final edit tag based on the final edit tag probability comprises returning the index of the maximum of the final edit tag probability through the Argmax function to obtain the final edit tag.
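The fusion formula of claim 5 is not legible in this text, so the sketch below assumes a form that is common for copy mechanisms: a mix weighted by the replication probability, `p_copy * CopyDistribute + (1 - p_copy) * p_edit`. That form, the function names, and the sample numbers are assumptions for illustration; only the Argmax step of claim 6 is taken directly from the claims.

```python
def fuse(p_copy, copy_dist, p_edit):
    """Assumed fusion: interpolate between the copy vector and the tag probabilities."""
    return [p_copy * c + (1.0 - p_copy) * e for c, e in zip(copy_dist, p_edit)]


def final_tag(p_final):
    # Argmax over the final edit tag probability.
    return max(range(len(p_final)), key=lambda idx: p_final[idx])


p_edit = [0.5, 0.1, 0.3, 0.1]     # edit tag probability (illustrative values)
copy_dist = [0, 0, 1, 0]          # replication distribution vector
p_final = fuse(0.8, copy_dist, p_edit)
print(final_tag(p_final))         # prints 2
```

With a high replication probability the vector dominates and tag 2 wins; with `p_copy = 0.0` the result falls back to the plain edit tag probability and tag 0 would be chosen.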
7. The method of claim 1, wherein the edit tags include keep, delete, replace, and insert.
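To show how the four edit tags could drive the modification in S5 and the repetition in S6, a small sketch follows. The tag strings (`KEEP`, `DELETE`, `REPLACE_x`, `INSERT_x`), the choice to insert after the tagged character, the `max_rounds` safety limit, and the rule-based `toy_tagger` are all hypothetical stand-ins; in the claimed method the tags come from the model.

```python
def apply_tags(chars, tags):
    """Apply one edit tag per character and return the edited characters."""
    out = []
    for ch, tag in zip(chars, tags):
        if tag == "KEEP":
            out.append(ch)
        elif tag == "DELETE":
            continue                                  # drop a redundant character
        elif tag.startswith("REPLACE_"):
            out.append(tag[len("REPLACE_"):])         # substitute the character
        elif tag.startswith("INSERT_"):
            out.extend([ch, tag[len("INSERT_"):]])    # assumed: insert after it
        else:
            raise ValueError("unknown tag: " + tag)
    return out


def correct(text, predict_tags, max_rounds=5):
    """Repeat tagging and editing until every tag is KEEP (or the limit is hit)."""
    chars = list(text)
    for _ in range(max_rounds):
        tags = predict_tags(chars)
        if all(tag == "KEEP" for tag in tags):
            break
        chars = apply_tags(chars, tags)
    return "".join(chars)


def toy_tagger(chars):
    # Stand-in for the model: delete a character that is immediately repeated.
    return ["DELETE" if i + 1 < len(chars) and chars[i] == chars[i + 1] else "KEEP"
            for i in range(len(chars))]


result = correct("我我爱你", toy_tagger)  # the doubled character is removed
```

Iterating lets a later round fix errors that only become detectable after an earlier edit has been applied.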
8. A Chinese grammar error correction system, comprising:
an acquisition module for acquiring an original text containing grammar errors;
a characterization module for inputting the original text into a pretrained BERT model and outputting a semantic characterization vector;
wherein the pretraining process of the BERT model comprises:
S2.1: obtaining an error-free text, and performing a replacement operation and an insertion operation on the error-free text to obtain a lost text, the lost text comprising text with missing errors and text with redundancy errors;
S2.2: inputting the lost text into the BERT model and outputting a semantic feature representation;
S2.3: passing the semantic feature representation through a Softmax fully connected layer to obtain a prediction probability, with the calculation formula:
$$P = \mathrm{Softmax}(W H + b)$$;
wherein $P$ denotes the prediction probability; Softmax(·) denotes the Softmax fully connected layer; $H$ denotes the semantic feature representation; $W$ denotes the weight of the Softmax fully connected layer; and $b$ denotes the bias of the Softmax fully connected layer;
S2.4: returning the index of the maximum of the prediction probability to obtain a predicted character result, with the calculation formula:
$$\hat{y} = \mathrm{Argmax}(P)$$;
wherein $\hat{y}$ denotes the predicted character result, $P$ denotes the prediction probability, and Argmax(·) denotes the Argmax function;
S2.5: calculating a pre-training loss function based on the real characters in the error-free text and the predicted character results, with the calculation formula:
;
wherein the symbols in the formula denote: the pre-training loss function; the ordinal number i; the set of replacement position markers; the text with missing errors; the set of insertion position markers; the text with redundancy errors; the i-th predicted character result; the i-th real character in the error-free text; the semantic feature representation of the i-th character in the lost text; and the delete marker; and p(·) denotes the probability, given the semantic feature representation of the i-th character, that the i-th predicted character result is the real character or the delete marker, namely the real character in the case of a substitution error and the delete marker in the case of a redundancy error;
S2.6: training the BERT model according to the pre-training loss function until convergence or until a maximum number of iterations is reached, to obtain the pretrained BERT model;
a replication module for passing the semantic characterization vector through two different two-layer feedforward neural networks with normalization to obtain a replication probability and an error type probability, respectively, returning the index value of the maximum of the error type probability, and determining a replication distribution vector based on that index value;
wherein returning the index value of the maximum of the error type probability and determining the replication distribution vector based on that index value comprises:
returning the index value of the maximum of the error type probability through the Argmax function, with the calculation formula:
$$type = \mathrm{Argmax}(P_{type})$$;
wherein type denotes the index value of the maximum of the error type probability, with 0 indicating no error, 1 indicating a redundancy error, 2 indicating a substitution error, and 3 indicating a missing error; Argmax(·) denotes the Argmax function; and $P_{type}$ denotes the error type probability;
determining the values in the replication distribution vector from the index value of the maximum of the error type probability, the replication distribution vector being expressed as:
$$CopyDistribute = [k, d, r_1, r_2, \ldots, r_n, a_1, a_2, \ldots, a_n]$$;
wherein CopyDistribute denotes the replication distribution vector; k denotes the element for no error; d denotes the element for a redundancy error; $r_1$ denotes the 1st element for a substitution error; $r_2$ denotes the 2nd element for a substitution error; $r_n$ denotes the n-th element for a substitution error; $a_1$ denotes the 1st element for a missing error; $a_2$ denotes the 2nd element for a missing error; $a_n$ denotes the n-th element for a missing error; and n denotes the number of edit tags;
in the replication distribution vector,
when type = 0, k = 1 and the remaining elements are all 0;
when type = 1, d = 1 and the remaining elements are all 0;
when type = 2, $r_1, r_2, \ldots, r_n$ are all 1 and the remaining elements are all 0;
when type = 3, $a_1, a_2, \ldots, a_n$ are all 1 and the remaining elements are all 0;
an edit tag probability calculation module for calculating an edit tag probability based on the semantic characterization vector;
an edit tag prediction module for fusing the edit tag probability, the replication distribution vector, and the replication probability to obtain a final edit tag probability, and determining a final edit tag based on the final edit tag probability;
a modification module for modifying the original text based on the final edit tag;
an iteration module for running the characterization module, the replication module, the edit tag probability calculation module, the edit tag prediction module, and the modification module in sequence as one cycle, and repeating the cycle until the original text is free of errors, thereby obtaining the correct text.
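For the pretraining steps S2.3 to S2.5 recited in this claim, a minimal Python sketch is given below. The Softmax and Argmax steps follow the claim; the loss is an assumption, since its formula is not legible here: it is taken to be a negative log-likelihood summed over the replaced positions (target: the real character) and the inserted positions (target: the delete marker). The `[DEL]` string, the tiny vocabulary, and the logits, which stand in for the output of the fully connected layer, are hypothetical.

```python
import math

DEL = "[DEL]"  # hypothetical delete marker


def softmax(logits):
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(probs, vocab):
    # Argmax: the vocabulary entry with the highest prediction probability.
    return vocab[max(range(len(probs)), key=lambda k: probs[k])]


def pretrain_loss(probs_per_pos, vocab, targets):
    """Assumed loss: negative log-likelihood over corrupted positions.

    targets maps a position to the real character (replaced position)
    or to DEL (inserted position).
    """
    return -sum(math.log(probs_per_pos[i][vocab.index(t)])
                for i, t in targets.items())


vocab = ["a", "b", DEL]
logits = [[0.1, 2.0, 0.3],   # position 0 was replaced; the real character is "b"
          [0.2, 0.1, 1.5]]   # position 1 was inserted; the target is DEL
probs = [softmax(row) for row in logits]
predictions = [predict(p, vocab) for p in probs]
loss = pretrain_loss(probs, vocab, {0: "b", 1: DEL})
```

Training on both targets teaches the model to restore replaced characters and to flag inserted ones for deletion.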
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410279802.4A CN117875313B (en) | 2024-03-12 | 2024-03-12 | Chinese grammar error correction method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117875313A CN117875313A (en) | 2024-04-12 |
CN117875313B CN117875313B (en) | 2024-07-02 |
Family
ID=90588765
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410279802.4A Active CN117875313B (en) | 2024-03-12 | 2024-03-12 | Chinese grammar error correction method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117875313B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021212612A1 (en) * | 2020-04-23 | 2021-10-28 | 平安科技(深圳)有限公司 | Intelligent text error correction method and apparatus, electronic device and readable storage medium |
CN115146621A (en) * | 2022-05-09 | 2022-10-04 | 腾讯科技(深圳)有限公司 | Training method, application method, device and equipment of text error correction model |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8583416B2 (en) * | 2007-12-27 | 2013-11-12 | Fluential, Llc | Robust information extraction from utterances |
US11494647B2 (en) * | 2019-12-06 | 2022-11-08 | Adobe Inc. | Slot filling with contextual information |
CN112528671A (en) * | 2020-12-02 | 2021-03-19 | 北京小米松果电子有限公司 | Semantic analysis method, semantic analysis device and storage medium |
KR102552811B1 (en) * | 2020-12-14 | 2023-07-06 | 박지우 | System for providing cloud based grammar checker service |
CN112836496B (en) * | 2021-01-25 | 2024-02-13 | 之江实验室 | Text error correction method based on BERT and feedforward neural network |
CN115034218A (en) * | 2022-06-10 | 2022-09-09 | 哈尔滨福涛科技有限责任公司 | Chinese grammar error diagnosis method based on multi-stage training and editing level voting |
KR102542220B1 (en) * | 2022-09-19 | 2023-06-13 | 아주대학교 산학협력단 | Method of semantic segmentation based on self-knowledge distillation and semantic segmentation device based on self-knowledge distillation |
Also Published As
Publication number | Publication date |
---|---|
CN117875313A (en) | 2024-04-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |