
CN117875313B - Chinese grammar error correction method and system - Google Patents


Info

Publication number
CN117875313B
CN117875313B (application number CN202410279802.4A)
Authority
CN
China
Prior art keywords
probability
representing
error
text
errors
Prior art date
Legal status
Active
Application number
CN202410279802.4A
Other languages
Chinese (zh)
Other versions
CN117875313A (en)
Inventor
康占英
黄惟
王青
肖峰
徐伯辰
刘优
彭卓
汤达夫
李芳芳
Current Assignee
Changsha Zhiwei Information Technology Co ltd
Original Assignee
Changsha Zhiwei Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Changsha Zhiwei Information Technology Co ltd
Priority to CN202410279802.4A
Publication of CN117875313A
Application granted
Publication of CN117875313B
Status: Active
Anticipated expiration

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/253: Grammatical analysis; Style critique
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/0455: Auto-encoder networks; Encoder-decoder networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/09: Supervised learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a Chinese grammar error correction method and system. The method comprises the following steps: acquiring an original text containing grammar errors; inputting the original text into a pretrained Bert model and outputting a semantic representation vector; passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a copy probability and an error type probability respectively, returning the index of the maximum value of the error type probability, and determining a copy distribution vector based on that index; calculating an edit label probability based on the semantic representation vector; fusing the edit label probability, the copy distribution vector and the copy probability to obtain a final edit label probability, and determining the final edit label based on the final edit label probability; modifying the original text based on the final edit label; and repeating until the original text is free of errors, obtaining the correct text.

Description

Chinese grammar error correction method and system
Technical Field
The application relates to the technical field of Chinese grammar error correction, and in particular to a Chinese grammar error correction method and system.
Background
Existing Chinese grammar error correction methods mainly follow two schemes: a seq2seq approach based on machine translation, and a seq2edit approach based on edit label prediction. Because the machine-translation-based seq2seq architecture is an autoregressive language model, it suffers from slow inference and requires large amounts of training data; it also has poor interpretability, cannot distinguish the specific grammar error type in a sentence, and its speed cannot meet the requirements of an actual production environment. The current seq2edit architecture also has several problems. First, the Bert pre-trained language model is built from two pre-training tasks, Masked Language Modeling (MLM) and Next Sentence Prediction (NSP); it lacks pre-training tasks related to word insertion and deletion, while the grammar error correction task contains many redundancy and missing errors. Second, it places high demands on the edit labels, and the prediction space of the edit labels is too large.
Disclosure of Invention
The application aims to provide a Chinese grammar error correction method and system that improve the accuracy of Chinese grammar error correction.
The embodiment of the application provides a Chinese grammar error correction method, which comprises the following steps:
S1: acquiring an original text containing grammar errors;
S2: inputting the original text into a pretrained Bert model, and outputting a semantic representation vector;
S3: passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a copy probability and an error type probability respectively, returning the index of the maximum value of the error type probability, and determining a copy distribution vector based on that index;
calculating an edit label probability based on the semantic representation vector;
S4: fusing the edit label probability, the copy distribution vector and the copy probability to obtain a final edit label probability, and determining the final edit label based on the final edit label probability;
S5: modifying the original text based on the final edit label;
S6: repeating S2-S5 until the original text is free of errors, obtaining the correct text.
The embodiment of the application also provides a Chinese grammar error correction system, which comprises:
an acquisition module, used for acquiring an original text containing grammar errors;
a representation module, used for inputting the original text into the pretrained Bert model and outputting a semantic representation vector;
a copy module, used for passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a copy probability and an error type probability respectively, returning the index of the maximum value of the error type probability, and determining a copy distribution vector based on that index;
an edit label probability calculation module, used for calculating an edit label probability based on the semantic representation vector;
an edit label prediction module, used for fusing the edit label probability, the copy distribution vector and the copy probability to obtain a final edit label probability, and determining the final edit label based on the final edit label probability;
a modification module, used for modifying the original text based on the final edit label;
an iteration module, used for running the representation module, the copy module, the edit label probability calculation module, the edit label prediction module and the modification module once in sequence as one cycle, and looping until the original text is free of errors, obtaining the correct text.
The application has the following beneficial effects: the original text is input into the pretrained Bert model to output a semantic representation vector; a copy probability and an error type probability are obtained from the semantic representation vector, the index of the maximum value of the error type probability is returned, and a copy distribution vector is determined based on that index; an edit label probability is calculated based on the semantic representation vector; the edit label probability, the copy distribution vector and the copy probability are fused to obtain a final edit label probability, from which the final edit label is determined; the original text is modified according to the final edit label; and this process is executed in a loop until the original text is free of errors, yielding the correct text. This improves how well the pretrained Bert model fits the grammar error correction task, alleviates the label sparsity problem of the downstream grammar error correction task, improves overall performance, and improves the accuracy of Chinese grammar error correction.
Drawings
Fig. 1 is a flowchart of the Chinese grammar error correction method provided by an embodiment of the present application.
Fig. 2 is a flowchart of pretraining a Bert model provided in an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
As shown in Fig. 1, the present embodiment provides a Chinese grammar error correction method, which includes:
S1: acquiring an original text containing grammar errors.
S2: inputting the original text into a pretrained Bert model, and outputting a semantic representation vector.
Specifically, as shown in Fig. 2, the pretraining process of the Bert model includes:
S2.1: obtaining an error-free text, and performing replacement and insertion operations on the error-free text to obtain a corrupted text; the corrupted text comprises text with missing errors and text with redundant errors.
Further, the process of obtaining the corrupted text includes:
randomly extracting 20% of the characters from the error-free text, selecting 50% of the extracted characters for the replacement operation to obtain the text with missing errors, and selecting the remaining 50% for the insertion operation to obtain the text with redundant errors;
the replacement operation: each selected character is replaced with the "[mask]" token with 50% probability, replaced with a random character with 25% probability, and replaced with a character drawn from the 10 characters ranked highest by prediction probability with 25% probability;
the insertion operation: an insertion position is chosen at random among the selected characters; with 50% probability a "[mask]" token is inserted, with 15% probability a character randomly chosen from the error-free text is inserted, and with 35% probability a character randomly chosen from the 10 characters ranked highest by prediction probability in the MLM pre-training task of the Bert model is inserted.
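A minimal Python sketch of this corruption procedure follows; the `top10_mlm_chars` helper (which would return the ten characters ranked highest by the Bert MLM head at a given position) and the `vocab` argument are hypothetical stand-ins, not names from the patent:

```python
import random

MASK = "[mask]"

def corrupt(text, top10_mlm_chars, vocab):
    """Sketch of S2.1: build one corrupted sample from error-free text."""
    chars = list(text)
    n = max(1, int(len(chars) * 0.20))                 # extract 20% of characters
    picked = random.sample(range(len(chars)), n)
    to_replace, to_insert = picked[: n // 2], picked[n // 2 :]

    for i in to_replace:                               # replacement -> missing errors
        r = random.random()
        if r < 0.50:
            chars[i] = MASK                            # 50%: "[mask]" token
        elif r < 0.75:
            chars[i] = random.choice(vocab)            # 25%: random character
        else:
            chars[i] = random.choice(top10_mlm_chars(text, i))  # 25%: top-10 MLM char

    inserts = []                                       # insertion -> redundant errors
    for i in to_insert:
        r = random.random()
        if r < 0.50:
            inserts.append((i, MASK))                  # 50%: insert "[mask]"
        elif r < 0.65:
            inserts.append((i, random.choice(text)))   # 15%: char from the clean text
        else:
            inserts.append((i, random.choice(top10_mlm_chars(text, i))))  # 35%: top-10
    for i, c in sorted(inserts, reverse=True):         # insert back-to-front
        chars.insert(i, c)
    return "".join(chars)
```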
S2.2: inputting the corrupted text into the Bert model, and outputting semantic feature representations.
S2.3: passing the semantic feature representation through a Softmax fully connected layer to obtain a prediction probability, where the calculation formula is:

$\hat{P} = \mathrm{Softmax}(W_p H + b_p)$

where $\hat{P}$ represents the prediction probability; $\mathrm{Softmax}(\cdot)$ represents the Softmax fully connected layer; $H$ represents the semantic feature representation; $W_p$ represents the weight of the Softmax fully connected layer; $b_p$ represents the bias of the Softmax fully connected layer.
S2.4: returning the index of the maximum value of the prediction probability to obtain the predicted character result, where the calculation formula is:

$\hat{Y} = \mathrm{Argmax}(\hat{P})$

where $\hat{Y}$ represents the predicted character result; $\mathrm{Argmax}(\cdot)$ represents the Argmax function.
S2.5: calculating the pre-training loss function based on the real characters in the error-free text and the predicted character results, where the calculation formula is:

$L_{pre} = -\sum_{i \in R} \log P(\hat{y}_i = y_i \mid h_i) - \sum_{i \in I} \log P(\hat{y}_i = \mathrm{[DELETE]} \mid h_i)$

where $L_{pre}$ represents the pre-training loss function; $i$ is an ordinal number; $R$ represents the set of replacement position markers in the text with missing errors; $I$ represents the set of insertion position markers in the text with redundant errors; $\hat{y}_i$ represents the $i$-th predicted character result; $y_i$ represents the $i$-th real character in the error-free text; $h_i$ represents the semantic feature representation of the $i$-th character in the corrupted text; $\mathrm{[DELETE]}$ represents the delete marker; $P(\cdot)$ represents the probability that, given $h_i$, $\hat{y}_i$ is the real character or the delete marker: in the case of a substitution error $\hat{y}_i$ should be the real character $y_i$, and in the case of a redundant error it should be the delete marker.
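A PyTorch sketch of this loss under the reconstruction above; `logits` (raw scores from the Softmax head) and `delete_id` (the vocabulary index of the delete marker) are assumed names:

```python
import torch
import torch.nn.functional as F

def pretrain_loss(logits, orig_ids, replaced_pos, inserted_pos, delete_id):
    """Sketch of S2.5: cross-entropy where the target at a replaced position is
    the real character and the target at an inserted position is [DELETE].
    logits: (seq_len, vocab_size); orig_ids: original character ids."""
    positions = list(replaced_pos) + list(inserted_pos)
    targets = [orig_ids[i] for i in replaced_pos] + [delete_id] * len(inserted_pos)
    positions = torch.tensor(positions, dtype=torch.long)
    targets = torch.tensor(targets, dtype=torch.long)
    # negative log-likelihood of the correct label at each corrupted position
    return F.cross_entropy(logits[positions], targets)
```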
S2.6: training the Bert model according to the pre-training loss function until convergence or the maximum number of iterations is reached, obtaining the pretrained Bert model.
S3: passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain the copy probability and the error type probability respectively, returning the index of the maximum value of the error type probability, and determining the copy distribution vector based on that index.
Specifically, the calculation formulas for the copy probability and the error type probability are:

$m_i = W_1 h_i + b_1$, $\hat{m}_i = \mathrm{LN}(\sigma(m_i))$, $p^{copy}_i = \mathrm{Sigmoid}(W_2 \hat{m}_i)$

$n_i = W'_1 h_i + b'_1$, $\hat{n}_i = \mathrm{LN}(\sigma(n_i))$, $p^{type}_i = \mathrm{Sigmoid}(W'_2 \hat{n}_i)$

where $W_1$ represents the weight of the first layer of the first two-layer feedforward neural network with normalization and $b_1$ its bias; $\mathrm{LN}(\cdot)$ represents any normalization function; $h_i$ represents the $i$-th vector in the semantic representation vector; $\sigma(\cdot)$ represents any activation function; $p^{copy}_i$ represents the copy probability; $W_2$ represents the weight of the Sigmoid fully connected layer in the first network; $W'_1$ represents the weight of the first layer of the second two-layer feedforward neural network with normalization and $b'_1$ its bias; $p^{type}_i$ represents the error type probability; $W'_2$ represents the weight of the Sigmoid fully connected layer in the second network; $m_i$ and $n_i$ represent the intermediate outputs of the first fully connected layers of the two networks; $\hat{m}_i$ and $\hat{n}_i$ represent those outputs after activation and normalization.
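A PyTorch sketch of the two heads, taking ReLU for the unspecified activation, LayerNorm for the unspecified normalization, and arbitrary layer sizes (all assumptions):

```python
import torch
import torch.nn as nn

class DetectionHeads(nn.Module):
    """Sketch of S3: two separate two-layer feedforward networks with
    normalization, applied to each token's semantic vector h_i."""
    def __init__(self, hidden=768, mid=256, n_types=4):
        super().__init__()
        self.copy_fc1 = nn.Linear(hidden, mid)   # first FFN -> copy probability
        self.copy_norm = nn.LayerNorm(mid)
        self.copy_fc2 = nn.Linear(mid, 1)
        self.type_fc1 = nn.Linear(hidden, mid)   # second FFN -> error type probability
        self.type_norm = nn.LayerNorm(mid)
        self.type_fc2 = nn.Linear(mid, n_types)  # {none, redundant, substitution, missing}

    def forward(self, h):                        # h: (batch, seq, hidden)
        m = self.copy_norm(torch.relu(self.copy_fc1(h)))
        p_copy = torch.sigmoid(self.copy_fc2(m))           # (batch, seq, 1)
        n = self.type_norm(torch.relu(self.type_fc1(h)))
        p_type = torch.sigmoid(self.type_fc2(n))           # (batch, seq, 4)
        error_type = p_type.argmax(dim=-1)                 # index of the maximum
        return p_copy, p_type, error_type
```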
Further, returning the index of the maximum value of the error type probability and determining the copy distribution vector based on that index includes:
returning the index of the maximum value of the error type probability through the Argmax function, where the calculation formula is:

$type = \mathrm{Argmax}(p^{type})$

where $type$ represents the index of the maximum value of the error type probability: 0 indicates no error, 1 indicates a redundant error, 2 indicates a substitution error, and 3 indicates a missing error; $\mathrm{Argmax}(\cdot)$ represents the Argmax function; $p^{type}$ represents the error type probability;
determining the values in the copy distribution vector from the index of the maximum value of the error type probability, where the copy distribution vector is expressed as:

$CopyDistribute = (k, d, r_1, r_2, \ldots, r_{N_r}, a_1, a_2, \ldots, a_{N_a}) \in \mathbb{R}^N$

where $CopyDistribute$ denotes the copy distribution vector; $k$ denotes the element for no error; $d$ denotes the element for a redundant error; $r_1$ and $r_2$ denote the 1st and 2nd elements for substitution errors and $r_{N_r}$ the $N_r$-th; $a_1$ and $a_2$ denote the 1st and 2nd elements for missing errors and $a_{N_a}$ the $N_a$-th; $N$ denotes the number of edit labels.
In the copy distribution vector:
when $type = 0$, $k = 1$ and all remaining elements are 0;
when $type = 1$, $d = 1$ and all remaining elements are 0;
when $type = 2$, $r_1, r_2, \ldots, r_{N_r}$ are all 1 and all other elements are 0;
when $type = 3$, $a_1, a_2, \ldots, a_{N_a}$ are all 1 and all other elements are 0.
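A sketch of this mask construction; the tag-space layout (keep, delete, then the replace tags, then the append tags) and the size arguments mirror the expression above but are assumptions:

```python
import torch

def copy_distribution(error_type, n_labels, n_replace, n_append):
    """Sketch: build the 0/1 copy distribution vector over the edit-label space
    from the predicted error type (0 none, 1 redundant, 2 substitution, 3 missing)."""
    v = torch.zeros(n_labels)
    if error_type == 0:
        v[0] = 1.0                                         # k = 1: only KEEP
    elif error_type == 1:
        v[1] = 1.0                                         # d = 1: only DELETE
    elif error_type == 2:
        v[2 : 2 + n_replace] = 1.0                         # all replace tags
    else:
        v[2 + n_replace : 2 + n_replace + n_append] = 1.0  # all append tags
    return v
```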
An edit label probability is then calculated based on the semantic representation vector.
Specifically, the calculation formula is:

$q_i = W''_1 h_i + b''_1$, $\hat{q}_i = \mathrm{LN}(\sigma(q_i))$, $P^{edit}_i = \mathrm{Softmax}(W''_2 \hat{q}_i)$

where $h_i$ represents the $i$-th vector in the semantic representation vector; $\mathrm{LN}(\cdot)$ represents any normalization function; $\sigma(\cdot)$ represents any activation function; $P^{edit}_i$ represents the edit label probability of the $i$-th character; $q_i$ represents the intermediate output of the first fully connected layer in the third two-layer feedforward neural network with normalization and $\hat{q}_i$ its output after activation and normalization; $W''_1$ represents the weight of the first layer of the third two-layer feedforward neural network with normalization and $b''_1$ its bias; $W''_2$ represents the weight of the second layer of the third two-layer feedforward neural network with normalization.
S4: fusing the edit label probability, the copy distribution vector and the copy probability to obtain the final edit label probability, and returning the index of the maximum value of the final edit label probability through the Argmax function to obtain the final edit label.
Further, the calculation formula of the final edit label probability is:

$P^{final}_i = p^{copy}_i \cdot (CopyDistribute \odot P^{edit}_i) + (1 - p^{copy}_i) \cdot P^{edit}_i$

where $P^{final}_i$ represents the final edit label probability; $p^{copy}_i$ represents the copy probability; $CopyDistribute$ denotes the copy distribution vector; $P^{edit}_i$ represents the edit label probability.
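A minimal sketch of this soft-copy fusion as reconstructed above, operating on torch tensors; the final renormalization is an assumption:

```python
def fuse(p_copy, copy_dist, p_edit):
    """Sketch of S4: gate the edit-label distribution with the copy distribution
    mask, weighted by the copy probability; a low p_copy shields a wrong
    auxiliary judgment, a high p_copy reinforces a correct one."""
    gated = p_copy * (copy_dist * p_edit) + (1.0 - p_copy) * p_edit
    final = gated / gated.sum(dim=-1, keepdim=True)   # renormalize (assumption)
    return final, final.argmax(dim=-1)                # final label via Argmax
```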
S5: modifying the original text based on the final edit label.
S6: repeating S2-S5 until the original text is free of errors, obtaining the correct text.
In this embodiment, the edit labels include keep, delete, replace, and insert.
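A sketch of how S5 and S6 might be realized; the tag encoding ('KEEP', 'DELETE', ('REPLACE', c), ('APPEND', c)) and the `predict_tags` callable standing in for the full model are hypothetical:

```python
def apply_edits(text, tags):
    """Sketch of S5: rewrite the text according to per-character edit labels."""
    out = []
    for ch, tag in zip(text, tags):
        if tag == "KEEP":
            out.append(ch)
        elif tag == "DELETE":
            continue                          # redundant character: drop it
        elif tag[0] == "REPLACE":
            out.append(tag[1])                # substitution: swap in new character
        elif tag[0] == "APPEND":
            out.extend([ch, tag[1]])          # missing character: insert after ch
    return "".join(out)

def correct(text, predict_tags, max_iters=5):
    """Sketch of S6: iterate S2-S5 until only KEEP labels remain."""
    for _ in range(max_iters):
        tags = predict_tags(text)             # runs the full pipeline once
        if all(t == "KEEP" for t in tags):
            break
        text = apply_edits(text, tags)
    return text
```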
In the Chinese grammar error correction method provided by this embodiment, the original text is input into the pretrained Bert model to output a semantic representation vector; a copy probability and an error type probability are obtained from the semantic representation vector, the index of the maximum value of the error type probability is returned, and a copy distribution vector is determined based on that index; an edit label probability is calculated based on the semantic representation vector; the edit label probability, the copy distribution vector and the copy probability are fused to obtain a final edit label probability, from which the final edit label is determined; the original text is modified according to the final edit label; and this process runs in a loop until the original text is free of errors, yielding the correct text. This improves how well the pretrained Bert model fits the grammar error correction task, alleviates the label sparsity problem of the downstream grammar error correction task, improves overall performance, and improves the accuracy of Chinese grammar error correction.
The embodiment also provides a Chinese grammar error correction system, which comprises:
an acquisition module, a pre-training module and a correction module;
the acquisition module is used for acquiring an original text containing grammar errors;
the pre-training module is used for pretraining the Bert model;
the correction module includes: a representation module, a copy module, an edit label probability calculation module, an edit label prediction module, a modification module and an iteration module;
the representation module is used for inputting the original text into the pretrained Bert model and outputting a semantic representation vector;
the copy module is used for passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a copy probability and an error type probability respectively, returning the index of the maximum value of the error type probability, and determining a copy distribution vector based on that index;
the copy module includes a copy probability calculation sub-module and an error type judgment sub-module;
the copy probability calculation sub-module is used for calculating the copy probability based on the semantic representation vector;
the error type judgment sub-module is used for calculating the error type probability based on the semantic representation vector, returning the index of the maximum value of the error type probability, and determining the copy distribution vector based on that index;
the edit label probability calculation module is used for calculating the edit label probability based on the semantic representation vector;
the edit label prediction module is used for fusing the edit label probability, the copy distribution vector and the copy probability to obtain the final edit label probability, and determining the final edit label based on the final edit label probability;
the modification module is used for modifying the original text based on the final edit label;
the iteration module is used for running the representation module, the copy module, the edit label probability calculation module, the edit label prediction module and the modification module once in sequence as one cycle, and looping until the original text is free of errors, obtaining the correct text.
In the Chinese grammar error correction system provided by this embodiment, the confidence of the [PAD] label at grammar error positions is judged through the MLM pre-training task of the Bert model in order to compute the pre-training loss function, so that after convergence the Bert model can express information about missing and redundant errors, providing reliable and comprehensive information for the downstream grammar correction task. In addition, a copy module is introduced into the correction module, and its error type judgment task (embodied by the copy distribution vector) serves as an auxiliary task for the edit label prediction task, which narrows the search range of the edit labels and improves the accuracy of edit label prediction. Meanwhile, the copy probability is introduced and a method for fusing the information of the error type judgment task with that of the edit label prediction task is designed: wrong judgments by the auxiliary task can be shielded, correct judgments can be reinforced, and the soft-copy mechanism of the copy probability adds a buffer between the two tasks.
Exemplary embodiments of the present disclosure are specifically illustrated and described above. It is to be understood that this disclosure is not limited to the particular arrangements, instrumentalities and methods of implementation described herein; on the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (8)

1. A Chinese grammar error correction method, comprising:
S1: acquiring an original text containing grammar errors;
S2: inputting the original text into a pretrained Bert model, and outputting a semantic representation vector;
wherein the pretraining process of the Bert model comprises the following steps:
S2.1: obtaining an error-free text, and performing replacement and insertion operations on the error-free text to obtain a corrupted text; the corrupted text comprises text with missing errors and text with redundant errors;
S2.2: inputting the corrupted text into the Bert model, and outputting semantic feature representations;
S2.3: passing the semantic feature representation through a Softmax fully connected layer to obtain a prediction probability, where the calculation formula is:

$\hat{P} = \mathrm{Softmax}(W_p H + b_p)$

where $\hat{P}$ represents the prediction probability; $\mathrm{Softmax}(\cdot)$ represents the Softmax fully connected layer; $H$ represents the semantic feature representation; $W_p$ represents the weight of the Softmax fully connected layer; $b_p$ represents the bias of the Softmax fully connected layer;
S2.4: returning the index of the maximum value of the prediction probability to obtain the predicted character result, where the calculation formula is:

$\hat{Y} = \mathrm{Argmax}(\hat{P})$

where $\hat{Y}$ represents the predicted character result; $\mathrm{Argmax}(\cdot)$ represents the Argmax function;
S2.5: calculating the pre-training loss function based on the real characters in the error-free text and the predicted character results, where the calculation formula is:

$L_{pre} = -\sum_{i \in R} \log P(\hat{y}_i = y_i \mid h_i) - \sum_{i \in I} \log P(\hat{y}_i = \mathrm{[DELETE]} \mid h_i)$

where $L_{pre}$ represents the pre-training loss function; $i$ is an ordinal number; $R$ represents the set of replacement position markers in the text with missing errors; $I$ represents the set of insertion position markers in the text with redundant errors; $\hat{y}_i$ represents the $i$-th predicted character result; $y_i$ represents the $i$-th real character in the error-free text; $h_i$ represents the semantic feature representation of the $i$-th character in the corrupted text; $\mathrm{[DELETE]}$ represents the delete marker; $P(\cdot)$ represents the probability that, given $h_i$, $\hat{y}_i$ is the real character or the delete marker: in the case of a substitution error $\hat{y}_i$ should be the real character $y_i$, and in the case of a redundant error it should be the delete marker;
S2.6: training the Bert model according to the pre-training loss function until convergence or the maximum number of iterations is reached, obtaining the pretrained Bert model;
S3: passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a copy probability and an error type probability respectively, returning the index of the maximum value of the error type probability, and determining a copy distribution vector based on that index;
wherein returning the index of the maximum value of the error type probability and determining the copy distribution vector based on that index comprises:
returning the index of the maximum value of the error type probability through the Argmax function, where the calculation formula is:

$type = \mathrm{Argmax}(p^{type})$

where $type$ represents the index of the maximum value of the error type probability: 0 indicates no error, 1 indicates a redundant error, 2 indicates a substitution error, and 3 indicates a missing error; $\mathrm{Argmax}(\cdot)$ represents the Argmax function; $p^{type}$ represents the error type probability;
determining the values in the copy distribution vector from the index of the maximum value of the error type probability, where the copy distribution vector is expressed as:

$CopyDistribute = (k, d, r_1, r_2, \ldots, r_{N_r}, a_1, a_2, \ldots, a_{N_a}) \in \mathbb{R}^N$

where $CopyDistribute$ denotes the copy distribution vector; $k$ denotes the element for no error; $d$ denotes the element for a redundant error; $r_1$ and $r_2$ denote the 1st and 2nd elements for substitution errors and $r_{N_r}$ the $N_r$-th; $a_1$ and $a_2$ denote the 1st and 2nd elements for missing errors and $a_{N_a}$ the $N_a$-th; $N$ denotes the number of edit labels;
in the copy distribution vector:
when $type = 0$, $k = 1$ and all remaining elements are 0;
when $type = 1$, $d = 1$ and all remaining elements are 0;
when $type = 2$, $r_1, r_2, \ldots, r_{N_r}$ are all 1 and all other elements are 0;
when $type = 3$, $a_1, a_2, \ldots, a_{N_a}$ are all 1 and all other elements are 0;
calculating an edit label probability based on the semantic representation vector;
S4: fusing the edit label probability, the copy distribution vector and the copy probability to obtain a final edit label probability, and determining the final edit label based on the final edit label probability;
S5: modifying the original text based on the final edit label;
S6: repeating S2-S5 until the original text is free of errors, obtaining the correct text.
2. The method according to claim 1, wherein in S2.1, performing the replacement and insertion operations on the error-free text to obtain the corrupted text comprises:
randomly extracting 20% of the characters from the error-free text, selecting 50% of the extracted characters for the replacement operation to obtain the text with missing errors, and selecting the remaining 50% for the insertion operation to obtain the text with redundant errors;
the replacement operation comprises: each selected character is replaced with the "[mask]" token with 50% probability, replaced with a random character with 25% probability, and replaced with a character drawn from the 10 characters ranked highest by prediction probability with 25% probability;
the insertion operation comprises: an insertion position is chosen at random among the selected characters; with 50% probability a "[mask]" token is inserted, with 15% probability a character randomly chosen from the error-free text is inserted, and with 35% probability a character randomly chosen from the 10 characters ranked highest by prediction probability is inserted.
3. The method according to claim 1, wherein in S3, the calculation formulas for the copy probability and the error type probability are:

$m_i = W_1 h_i + b_1$, $\hat{m}_i = \mathrm{LN}(\sigma(m_i))$, $p^{copy}_i = \mathrm{Sigmoid}(W_2 \hat{m}_i)$

$n_i = W'_1 h_i + b'_1$, $\hat{n}_i = \mathrm{LN}(\sigma(n_i))$, $p^{type}_i = \mathrm{Sigmoid}(W'_2 \hat{n}_i)$

where $W_1$ represents the weight of the first layer of the first two-layer feedforward neural network with normalization and $b_1$ its bias; $\mathrm{LN}(\cdot)$ represents any normalization function; $h_i$ represents the $i$-th vector in the semantic representation vector; $\sigma(\cdot)$ represents any activation function; $p^{copy}_i$ represents the copy probability; $W_2$ represents the weight of the Sigmoid fully connected layer in the first network; $W'_1$ represents the weight of the first layer of the second two-layer feedforward neural network with normalization and $b'_1$ its bias; $p^{type}_i$ represents the error type probability; $W'_2$ represents the weight of the Sigmoid fully connected layer in the second network; $m_i$ and $n_i$ represent the intermediate outputs of the first fully connected layers of the two networks; $\hat{m}_i$ and $\hat{n}_i$ represent those outputs after activation and normalization.
4. The method according to claim 1, wherein in S3, calculating the edit label probability based on the semantic representation vector comprises:

$q_i = W''_1 h_i + b''_1$, $\hat{q}_i = \mathrm{LN}(\sigma(q_i))$, $P^{edit}_i = \mathrm{Softmax}(W''_2 \hat{q}_i)$

where $h_i$ represents the $i$-th vector in the semantic representation vector; $\mathrm{LN}(\cdot)$ represents any normalization function; $\sigma(\cdot)$ represents any activation function; $P^{edit}_i$ represents the edit label probability of the $i$-th character; $q_i$ represents the intermediate output of the first fully connected layer in the third two-layer feedforward neural network with normalization and $\hat{q}_i$ its output after activation and normalization; $W''_1$ represents the weight of the first layer of the third two-layer feedforward neural network with normalization and $b''_1$ its bias; $W''_2$ represents the weight of the second layer of the third two-layer feedforward neural network with normalization.
5. The method according to claim 1, wherein in S4, the calculation formula of the final edit label probability is:

$P^{final}_i = p^{copy}_i \cdot (CopyDistribute \odot P^{edit}_i) + (1 - p^{copy}_i) \cdot P^{edit}_i$

where $P^{final}_i$ represents the final edit label probability; $p^{copy}_i$ represents the copy probability; $CopyDistribute$ denotes the copy distribution vector; $P^{edit}_i$ represents the edit label probability.
6. The method according to claim 1, wherein in S4, determining the final edit label based on the final edit label probability comprises returning the index of the maximum value of the final edit label probability through the Argmax function to obtain the final edit label.
7. The method according to claim 1, wherein the edit labels include keep, delete, replace, and insert.
8. A Chinese grammar error correction system, comprising:
an acquisition module, used for acquiring an original text containing grammar errors;
a representation module, used for inputting the original text into a pretrained Bert model and outputting a semantic representation vector;
wherein the pretraining process of the Bert model comprises the following steps:
S2.1: obtaining an error-free text, and performing replacement and insertion operations on the error-free text to obtain a corrupted text; the corrupted text comprises text with missing errors and text with redundant errors;
S2.2: inputting the corrupted text into the Bert model, and outputting semantic feature representations;
S2.3: passing the semantic feature representation through a Softmax fully connected layer to obtain a prediction probability, where the calculation formula is:

$\hat{P} = \mathrm{Softmax}(W_p H + b_p)$

where $\hat{P}$ represents the prediction probability; $\mathrm{Softmax}(\cdot)$ represents the Softmax fully connected layer; $H$ represents the semantic feature representation; $W_p$ represents the weight of the Softmax fully connected layer; $b_p$ represents the bias of the Softmax fully connected layer;
S2.4: returning the index of the maximum value of the prediction probability to obtain the predicted character result, where the calculation formula is:

$\hat{Y} = \mathrm{Argmax}(\hat{P})$

where $\hat{Y}$ represents the predicted character result; $\mathrm{Argmax}(\cdot)$ represents the Argmax function;
S2.5: calculating the pre-training loss function based on the real characters in the error-free text and the predicted character results, where the calculation formula is:

$L_{pre} = -\sum_{i \in R} \log P(\hat{y}_i = y_i \mid h_i) - \sum_{i \in I} \log P(\hat{y}_i = \mathrm{[DELETE]} \mid h_i)$

where $L_{pre}$ represents the pre-training loss function; $i$ is an ordinal number; $R$ represents the set of replacement position markers in the text with missing errors; $I$ represents the set of insertion position markers in the text with redundant errors; $\hat{y}_i$ represents the $i$-th predicted character result; $y_i$ represents the $i$-th real character in the error-free text; $h_i$ represents the semantic feature representation of the $i$-th character in the corrupted text; $\mathrm{[DELETE]}$ represents the delete marker; $P(\cdot)$ represents the probability that, given $h_i$, $\hat{y}_i$ is the real character or the delete marker: in the case of a substitution error $\hat{y}_i$ should be the real character $y_i$, and in the case of a redundant error it should be the delete marker;
S2.6: training the Bert model according to the pre-training loss function until convergence or the maximum number of iterations is reached, obtaining the pretrained Bert model;
a copy module, used for passing the semantic representation vector through two different two-layer feedforward neural networks with normalization to obtain a copy probability and an error type probability respectively, returning the index of the maximum value of the error type probability, and determining a copy distribution vector based on that index;
wherein returning the index of the maximum value of the error type probability and determining the copy distribution vector based on that index comprises:
returning the index of the maximum value of the error type probability through the Argmax function, where the calculation formula is:

$type = \mathrm{Argmax}(p^{type})$

where $type$ represents the index of the maximum value of the error type probability: 0 indicates no error, 1 indicates a redundant error, 2 indicates a substitution error, and 3 indicates a missing error; $\mathrm{Argmax}(\cdot)$ represents the Argmax function; $p^{type}$ represents the error type probability;
determining the values in the copy distribution vector from the index of the maximum value of the error type probability, where the copy distribution vector is expressed as:

$CopyDistribute = (k, d, r_1, r_2, \ldots, r_{N_r}, a_1, a_2, \ldots, a_{N_a}) \in \mathbb{R}^N$

where $CopyDistribute$ denotes the copy distribution vector; $k$ denotes the element for no error; $d$ denotes the element for a redundant error; $r_1$ and $r_2$ denote the 1st and 2nd elements for substitution errors and $r_{N_r}$ the $N_r$-th; $a_1$ and $a_2$ denote the 1st and 2nd elements for missing errors and $a_{N_a}$ the $N_a$-th; $N$ denotes the number of edit labels;
in the copy distribution vector:
when $type = 0$, $k = 1$ and all remaining elements are 0;
when $type = 1$, $d = 1$ and all remaining elements are 0;
when $type = 2$, $r_1, r_2, \ldots, r_{N_r}$ are all 1 and all other elements are 0;
when $type = 3$, $a_1, a_2, \ldots, a_{N_a}$ are all 1 and all other elements are 0;
an edit label probability calculation module, used for calculating an edit label probability based on the semantic representation vector;
an edit label prediction module, used for fusing the edit label probability, the copy distribution vector and the copy probability to obtain a final edit label probability, and determining the final edit label based on the final edit label probability;
a modification module, used for modifying the original text based on the final edit label;
an iteration module, used for running the representation module, the copy module, the edit label probability calculation module, the edit label prediction module and the modification module once in sequence as one cycle, and looping until the original text is free of errors, obtaining the correct text.
CN202410279802.4A 2024-03-12 2024-03-12 Chinese grammar error correction method and system Active CN117875313B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410279802.4A CN117875313B (en) 2024-03-12 2024-03-12 Chinese grammar error correction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410279802.4A CN117875313B (en) 2024-03-12 2024-03-12 Chinese grammar error correction method and system

Publications (2)

Publication Number Publication Date
CN117875313A (en) 2024-04-12
CN117875313B (en) 2024-07-02

Family

ID=90588765

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410279802.4A Active CN117875313B (en) 2024-03-12 2024-03-12 Chinese grammar error correction method and system

Country Status (1)

Country Link
CN (1) CN117875313B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021212612A1 (en) * 2020-04-23 2021-10-28 平安科技(深圳)有限公司 Intelligent text error correction method and apparatus, electronic device and readable storage medium
CN115146621A (en) * 2022-05-09 2022-10-04 腾讯科技(深圳)有限公司 Training method, application method, device and equipment of text error correction model

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8583416B2 (en) * 2007-12-27 2013-11-12 Fluential, Llc Robust information extraction from utterances
US11494647B2 (en) * 2019-12-06 2022-11-08 Adobe Inc. Slot filling with contextual information
CN112528671A (en) * 2020-12-02 2021-03-19 北京小米松果电子有限公司 Semantic analysis method, semantic analysis device and storage medium
KR102552811B1 (en) * 2020-12-14 2023-07-06 박지우 System for providing cloud based grammar checker service
CN112836496B (en) * 2021-01-25 2024-02-13 之江实验室 Text error correction method based on BERT and feedforward neural network
CN115034218A (en) * 2022-06-10 2022-09-09 哈尔滨福涛科技有限责任公司 Chinese grammar error diagnosis method based on multi-stage training and editing level voting
KR102542220B1 (en) * 2022-09-19 2023-06-13 아주대학교 산학협력단 Method of semantic segmentation based on self-knowledge distillation and semantic segmentation device based on self-knowledge distillation

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021212612A1 (en) * 2020-04-23 2021-10-28 平安科技(深圳)有限公司 Intelligent text error correction method and apparatus, electronic device and readable storage medium
CN115146621A (en) * 2022-05-09 2022-10-04 腾讯科技(深圳)有限公司 Training method, application method, device and equipment of text error correction model

Also Published As

Publication number Publication date
CN117875313A (en) 2024-04-12

Similar Documents

Publication Publication Date Title
CN110489760B (en) Text automatic correction method and device based on deep neural network
CN108416058B (en) Bi-LSTM input information enhancement-based relation extraction method
CN111708882B (en) Transformer-based Chinese text information missing completion method
CN111985239A (en) Entity identification method and device, electronic equipment and storage medium
CN110276069B (en) Method, system and storage medium for automatically detecting Chinese braille error
CN114818668B (en) Name correction method and device for voice transcription text and computer equipment
CN112215013B (en) Clone code semantic detection method based on deep learning
CN111767718B (en) Chinese grammar error correction method based on weakened grammar error feature representation
CN110209822A (en) Sphere of learning data dependence prediction technique based on deep learning, computer
CN116306600B (en) MacBert-based Chinese text error correction method
CN113657098B (en) Text error correction method, device, equipment and storage medium
CN114781651A (en) Small sample learning robustness improving method based on contrast learning
CN114818669B (en) Method for constructing name error correction model and computer equipment
CN114330350B (en) Named entity recognition method and device, electronic equipment and storage medium
CN114586038B (en) Method and device for event extraction and extraction model training, equipment and medium
CN114239584A (en) Named entity identification method based on self-supervision learning
CN114330349A (en) Specific field named entity recognition method
CN117875313B (en) Chinese grammar error correction method and system
CN112488111A (en) Instruction expression understanding method based on multi-level expression guide attention network
CN117744658A (en) Ship naming entity identification method based on BERT-BiLSTM-CRF
CN115437511B (en) Pinyin Chinese character conversion method, conversion model training method and storage medium
CN114925170B (en) Text proofreading model training method and device and computing equipment
CN116611428A (en) Non-autoregressive decoding Vietnam text regularization method based on editing alignment algorithm
CN115952284A (en) Medical text relation extraction method fusing density clustering and ERNIE
CN115964486A (en) Small sample intention recognition method based on data enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant