WO2019232869A1 - Handwriting model training method, text recognition method and apparatus, device, and medium - Google Patents
- Publication number
- WO2019232869A1 (application PCT/CN2018/094344)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- chinese
- training
- chinese text
- recognition model
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/32—Digital ink
- G06V30/333—Preprocessing; Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/32—Digital ink
- G06V30/36—Matching; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/287—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- The present application relates to the field of Chinese text recognition, and in particular to a handwriting model training method, a text recognition method, an apparatus, a device, and a medium.
- The embodiments of the present application provide a handwriting model training method, an apparatus, a device, and a medium to solve the problem that the accuracy of current handwritten Chinese text recognition is low.
- a handwriting model training method includes:
- Obtain standard Chinese text training samples, divide them into preset batches, input the batched samples into a recurrent neural network, train based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network using backpropagation through time (BPTT) to obtain a standard Chinese text recognition model;
- Obtain non-standard Chinese text training samples, divide them into preset batches, input the batched samples into the standard Chinese text recognition model, train based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model using BPTT to obtain an adjusted Chinese handwritten text recognition model;
- Obtain Chinese text samples to be tested, recognize them with the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and use all the error texts as error text training samples;
- Input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using BPTT with batch gradient descent to obtain a target Chinese handwritten text recognition model.
- a handwriting model training device includes:
- a standard Chinese text recognition model acquisition module, configured to obtain standard Chinese text training samples, divide them into preset batches, input the batched samples into a recurrent neural network, train based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network using backpropagation through time (BPTT) to obtain a standard Chinese text recognition model;
- an adjusted Chinese handwritten text recognition model acquisition module, configured to obtain non-standard Chinese text training samples, divide them into preset batches, input the batched samples into the standard Chinese text recognition model, train based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model using BPTT to obtain an adjusted Chinese handwritten text recognition model;
- an error text training sample acquisition module, configured to obtain Chinese text samples to be tested, recognize them with the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and use all the error texts as error text training samples;
- a target Chinese handwritten text recognition model acquisition module, configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using BPTT with batch gradient descent to obtain the target Chinese handwritten text recognition model.
- a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
- When the processor executes the computer-readable instructions, the following steps are implemented:
- Obtain standard Chinese text training samples, divide them into preset batches, input the batched samples into a recurrent neural network, train based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network using backpropagation through time (BPTT) to obtain a standard Chinese text recognition model;
- Obtain non-standard Chinese text training samples, divide them into preset batches, input the batched samples into the standard Chinese text recognition model, train based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model using BPTT to obtain an adjusted Chinese handwritten text recognition model;
- Obtain Chinese text samples to be tested, recognize them with the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and use all the error texts as error text training samples;
- Input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using BPTT with batch gradient descent to obtain a target Chinese handwritten text recognition model.
- One or more non-volatile readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:
- Obtain standard Chinese text training samples, divide them into preset batches, input the batched samples into a recurrent neural network, train based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network using backpropagation through time (BPTT) to obtain a standard Chinese text recognition model;
- Obtain non-standard Chinese text training samples, divide them into preset batches, input the batched samples into the standard Chinese text recognition model, train based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model using BPTT to obtain an adjusted Chinese handwritten text recognition model;
- Obtain Chinese text samples to be tested, recognize them with the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and use all the error texts as error text training samples;
- Input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using BPTT with batch gradient descent to obtain a target Chinese handwritten text recognition model.
- The embodiments of the present application further provide a text recognition method, an apparatus, a device, and a medium to solve the problem that the accuracy of current handwritten text recognition is low.
- a text recognition method includes:
- a text recognition device includes:
- An output value acquisition module configured to acquire Chinese text to be recognized, identify the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtain an output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model;
- the target Chinese handwritten text recognition model is obtained by using the handwriting model training method;
- the recognition result obtaining module is configured to select a maximum output value among output values corresponding to the Chinese text to be recognized, and obtain a recognition result of the Chinese text to be recognized according to the maximum output value.
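The selection step described above (take the largest output value and read off the corresponding text) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `pick_recognition_result` and the dictionary of candidate texts with scores are assumptions made for the example.

```python
def pick_recognition_result(output_values):
    """Return the candidate text whose output value is largest.

    `output_values` is a hypothetical mapping from candidate text to the
    output value the model assigned to it.
    """
    if not output_values:
        raise ValueError("no output values to choose from")
    return max(output_values, key=output_values.get)


# Made-up output values for one handwritten sample (illustrative only).
scores = {"你好": 0.91, "再见": 0.06, "您好": 0.03}
best = pick_recognition_result(scores)
```

In a real system the output values would come from the final layer of the target Chinese handwritten text recognition model; only the "choose the maximum" step is shown here.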
- a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
- When the processor executes the computer-readable instructions, the following steps are implemented:
- One or more non-volatile readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:
- FIG. 1 is an application environment diagram of a handwriting model training method according to an embodiment of the present application
- FIG. 2 is a flowchart of a handwriting model training method according to an embodiment of the present application
- FIG. 3 is a specific flowchart of step S10 in FIG. 2;
- FIG. 4 is another specific flowchart of step S10 in FIG. 2;
- FIG. 5 is a specific flowchart of step S30 in FIG. 2;
- FIG. 6 is a schematic diagram of a handwriting model training device according to an embodiment of the present application.
- FIG. 7 is a flowchart of a text recognition method according to an embodiment of the present application.
- FIG. 8 is a schematic diagram of a text recognition device according to an embodiment of the present application.
- FIG. 9 is a schematic diagram of a computer device in an embodiment of the present application.
- FIG. 1 illustrates an application environment of a handwriting model training method provided by an embodiment of the present application.
- the application environment of the handwriting model training method includes a server and a client, wherein the server and the client are connected through a network, and the client is a device that can interact with the user, including, but not limited to, a computer and a smart phone.
- the server can be implemented with an independent server or a server cluster consisting of multiple servers.
- the handwriting model training method provided in the embodiment of the present application is applied to a server.
- FIG. 2 shows a flowchart of a handwriting model training method in an embodiment of the present application. As shown in FIG. 2, the handwriting model training method includes the following steps:
- S10: Obtain standard Chinese text training samples, divide them into preset batches, input the batched samples into a recurrent neural network, and train based on the connectionist temporal classification (CTC) algorithm.
- Backpropagation through time (BPTT) is used to update the network parameters of the recurrent neural network to obtain a standard Chinese text recognition model.
- The standard Chinese text training samples are training samples obtained from standard text (for example, text set in a regular Chinese typeface such as Kai, Song, or Lishu; Kai or Song is generally chosen).
- Recurrent neural networks (RNNs) are neural networks that model sequence data. Chinese text is composed of several characters in order, so an RNN can better learn the deep features of Chinese text as a sequence.
- The connectionist temporal classification (CTC) algorithm is a fully end-to-end training algorithm, widely used for acoustic models. It does not require the training samples to be aligned in advance; an input sequence and an output sequence are all it needs for training.
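One way to see why CTC needs no pre-aligned samples is its collapse rule: many per-time-step label paths map to the same output sequence once consecutive repeats are merged and blank symbols are dropped. The sketch below illustrates only that rule; the blank symbol `"-"` and the function name `ctc_collapse` are assumptions for the example, and the CTC loss itself is not shown.

```python
from itertools import groupby

BLANK = "-"  # assumed blank symbol; real systems reserve a dedicated class index


def ctc_collapse(path):
    """Map a per-time-step label path to its output sequence.

    Step 1: merge consecutive repeated labels.
    Step 2: remove blank symbols.
    """
    merged = [label for label, _ in groupby(path)]
    return "".join(label for label in merged if label != BLANK)


# Two different alignments of the same two-character text.
a = ctc_collapse("--你你-好好-")
b = ctc_collapse("你---好")
# A blank between repeats keeps a genuinely doubled character.
c = ctc_collapse("好-好")
```

Because both alignments collapse to the same text, training can sum over all such paths instead of requiring one hand-made alignment per sample.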
- The batched standard Chinese text training samples are input into a recurrent neural network for training, and the weights and biases of the recurrent neural network are updated by backpropagation using mini-batch gradient descent.
- Mini-batch gradient descent (MBGD) updates the network parameters by accumulating the errors generated during training over each preset batch, so that each update uses the cumulative error of one batch.
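A minimal sketch of mini-batch gradient descent, assuming a toy one-parameter model `y = w * x` with squared error rather than the recurrent network of the embodiment. The helper names `batches` and `train_minibatch`, the batch size, and the learning rate are all illustrative choices.

```python
def batches(samples, batch_size):
    """Split samples into consecutive batches of a preset size."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def train_minibatch(samples, batch_size=2, lr=0.05, epochs=200):
    """Fit y = w * x by mini-batch gradient descent on squared error."""
    w = 0.0
    for _ in range(epochs):
        for batch in batches(samples, batch_size):
            # Accumulate the error gradient over the whole batch ...
            grad = sum((w * x - y) * x for x, y in batch) / len(batch)
            # ... then update the parameter once per batch.
            w -= lr * grad
    return w


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = train_minibatch(data)
```

Setting `batch_size` to `len(data)` turns the same loop into batch gradient descent, where every update uses the errors of all samples at once.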
- Backpropagation through time (BPTT) is a training method in neural network learning that is used to update and adjust the network parameters between nodes in the neural network.
- Updating the network parameters requires finding the minimum value of the error function.
- In this embodiment, the minimum value of the error function is found using mini-batch gradient descent.
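To make backpropagation through time concrete, the sketch below applies it to a deliberately tiny recurrent model with a single scalar hidden state, `h_t = tanh(w * h_{t-1} + u * x_t)`, and a squared error on the final state. The model, the loss, and the names `forward`, `loss`, and `bptt` are assumptions for illustration; the embodiment's network and CTC loss are far larger.

```python
import math


def forward(w, u, xs):
    """Run h_t = tanh(w * h_{t-1} + u * x_t) and return all hidden states."""
    hs = [0.0]  # h_0
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def loss(w, u, xs, target):
    """Squared error between the final hidden state and the target."""
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2


def bptt(w, u, xs, target):
    """Gradients of the loss w.r.t. w and u, propagated backwards in time."""
    hs = forward(w, u, xs)
    dh = hs[-1] - target                # dL/dh_T
    dw = du = 0.0
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)    # back through tanh at step t
        dw += da * hs[t - 1]            # shared weight collects every step
        du += da * xs[t - 1]
        dh = da * w                     # pass the error on to h_{t-1}
    return dw, du


xs, w, u, target = [0.5, -1.0, 0.8], 0.7, -0.3, 0.2
dw, du = bptt(w, u, xs, target)
```

The gradients can then be fed to a gradient descent update such as `w -= lr * dw`; comparing them with finite differences of `loss` is a quick sanity check.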
- a training sample of standard Chinese text is obtained, and the training sample of standard Chinese characters is batched according to a preset batch.
- the fonts used in the standard Chinese text training samples are the same (multiple fonts are not mixed).
- The standard Chinese text training samples used for model training are all in the Song typeface.
- This embodiment uses the Song typeface as an example.
- The Chinese fonts in the standard text here are the mainstream Chinese fonts in current use, such as Song, the default font of input methods on computer devices, and Kai (regular script), a mainstream font commonly used for copying practice. Chinese fonts that are less commonly used in daily life, such as cursive script (Caoshu) and YouYuan, are not among the fonts that make up the standard text.
- The batched standard Chinese text training samples are input into a recurrent neural network and trained based on the connectionist temporal classification (CTC) algorithm.
- Backpropagation through time (BPTT, based on mini-batch gradients) updates the network parameters of the recurrent neural network to obtain a standard Chinese text recognition model.
- During training, the standard Chinese text recognition model learns the deep features of the standard Chinese text training samples, which enables it to recognize standard text accurately. Training the model does not require manual labeling and data alignment of the standard Chinese text training samples; end-to-end training can be performed directly.
- The trained standard Chinese text recognition model can accurately recognize standard text in typefaces such as Kai, Song, or Lishu.
- S20: Obtain non-standard Chinese text training samples, divide them into preset batches, input the batched samples into the standard Chinese text recognition model, train based on the connectionist temporal classification (CTC) algorithm, and use backpropagation through time (BPTT) to update the network parameters of the standard Chinese text recognition model to obtain the adjusted Chinese handwritten text recognition model.
- the non-standard Chinese text training sample refers to a training sample obtained based on handwritten Chinese text.
- The handwritten Chinese text may specifically be text written by hand in the style of mainstream fonts such as Kai, Song, or Lishu. Understandably, the non-standard Chinese text training samples differ from the standard Chinese text training samples in that they are obtained from handwritten Chinese text. Because the text is handwritten, it naturally contains a variety of different character forms.
- the server obtains non-standard Chinese text training samples, and batches the non-standard Chinese text training samples into preset batches.
- the training samples include the characteristics of handwritten Chinese text.
- The batched non-standard Chinese text training samples are input into the standard Chinese text recognition model and trained based on the connectionist temporal classification (CTC) algorithm, and backpropagation through time (BPTT, based on mini-batch gradients) is used to update the network parameters of the standard Chinese text recognition model to obtain the adjusted Chinese handwritten text recognition model. Understandably, the standard Chinese text recognition model can recognize standard Chinese text, but its accuracy is not high when recognizing handwritten Chinese text.
- This embodiment therefore trains with non-standard Chinese text training samples, so that the standard Chinese text recognition model adjusts its network parameters on the basis of its existing ability to recognize standard text, yielding the adjusted Chinese handwritten text recognition model.
- The adjusted Chinese handwritten text recognition model learns the deep features of handwritten Chinese text on top of its original ability to recognize standard text. It thus combines the deep features of standard text and of handwritten Chinese text, can effectively recognize both, and produces recognition results of high accuracy.
- the recurrent neural network makes judgments based on the pixel distribution and sequence of the text.
- Handwritten Chinese text differs from the corresponding standard text, but this difference is much smaller than its difference from standard text that does not correspond to it. For example, the handwritten text "hello" differs in pixel distribution from the standard text "hello", but this difference is significantly smaller than the difference between the handwritten "hello" and the standard text "goodbye". In other words, even though handwritten Chinese text differs somewhat from its corresponding standard text, that difference is much smaller than the difference from non-corresponding standard text, so the recognition result can be determined by the principle of greatest similarity (that is, smallest difference).
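The "most similar, smallest difference" principle can be illustrated with a toy comparison of binary pixel grids. This is only a sketch of the principle: the embodiment uses a recurrent neural network, not template matching, and the 3x3 grids, the labels, and the functions `pixel_difference` and `most_similar` are invented for the example.

```python
def pixel_difference(a, b):
    """Count the positions at which two equally sized binary images differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def most_similar(sample, templates):
    """Return the label of the template with the smallest pixel difference."""
    return min(templates, key=lambda label: pixel_difference(sample, templates[label]))


# Hypothetical "standard" glyphs and one slightly distorted handwritten sample.
templates = {
    "A": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "B": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
handwritten = [[1, 1, 1], [1, 0, 1], [1, 1, 0]]
label = most_similar(handwritten, templates)
```

The handwritten sample differs from its own template in one pixel but from the other template in four, so the smallest-difference rule still picks the right label.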
- The adjusted Chinese handwritten text recognition model is obtained by training a recurrent neural network. The model combines the deep features of standard text and of handwritten Chinese text, and can effectively recognize handwritten Chinese text based on these deep features.
- The order of step S10 and step S20 in this embodiment is not interchangeable: step S10 must be executed before step S20.
- Training the recurrent neural network with the standard Chinese text training samples first gives the resulting standard Chinese text recognition model good recognition ability and accurate results on standard text.
- The fine-tuning of step S20 is then performed on this basis, so that the adjusted Chinese handwritten text recognition model can recognize handwritten Chinese text effectively, based on the deep features it has learned from handwritten Chinese text, and with more accurate results.
- If step S20 is performed first, or only step S20 is performed, then because handwritten Chinese text takes many forms, the features learned by training directly on it cannot properly reflect the characteristics of handwritten Chinese text. The model learns "badly" from the start, which makes accurate recognition of handwritten Chinese text difficult. Although everyone's handwriting differs, most of it resembles standard text (handwritten Chinese text imitates standard text). Training first on standard text therefore better matches the actual situation and is more effective than training directly on handwritten Chinese text: the "good" model can then be adjusted to obtain an adjusted Chinese handwritten text recognition model with a high recognition rate for handwritten Chinese text.
- S30: Obtain Chinese text samples to be tested, recognize them with the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and use all the error texts as error text training samples.
- the Chinese text sample to be tested refers to the training sample obtained for testing according to the standard text and the handwritten Chinese text.
- The standard text used in this step is the same as the standard text used for training in step S10 (because each character in a typeface such as Kai or Song is uniquely determined). The handwritten Chinese text used may differ from the handwritten Chinese text used for training in step S20 (Chinese text handwritten by different people is not identical, and each character of handwritten Chinese text can take multiple forms. To distinguish the test samples from the non-standard Chinese text training samples used in step S20, and to avoid overfitting, this step generally uses handwritten Chinese text different from that of step S20).
- the trained adjusted Chinese handwritten text recognition model is used to identify the Chinese text sample to be tested.
- Standard training text and handwritten Chinese text can be input to the adjusted Chinese handwritten text recognition model in a mixed manner during training.
- the adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested, the corresponding recognition results will be obtained, and all error texts whose recognition results do not match the label value (true result) will be used as the error text training samples.
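Collecting the error text training samples amounts to keeping every test sample whose recognition result differs from its label value. The sketch below assumes test samples are `(image, label)` pairs and that `recognize` is any callable standing in for the adjusted model; both, and the stub recognizer, are hypothetical.

```python
def collect_error_samples(samples, recognize):
    """Return the (image, label) pairs whose recognition result differs from the label."""
    return [(image, label) for image, label in samples if recognize(image) != label]


# Stub recognizer standing in for the adjusted model (illustrative only):
# it maps hypothetical image ids to the text it "recognizes".
fake_predictions = {"img1": "你好", "img2": "再贝", "img3": "谢谢"}
test_samples = [("img1", "你好"), ("img2", "再见"), ("img3", "谢谢")]

errors = collect_error_samples(test_samples, fake_predictions.get)
```

Only the misrecognized pair is kept, and it keeps its true label so that it can be used directly as a training sample in the next step.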
- The error text training samples reflect the remaining lack of recognition accuracy in the adjusted Chinese handwritten text recognition model, so that the model can be further updated and optimized based on them.
- Because the network parameters were first updated with the standard Chinese text training samples and then with the non-standard Chinese text training samples, the adjusted Chinese handwritten text recognition model tends to over-learn the characteristics of the non-standard Chinese text training samples (including their handwritten Chinese text), which reduces its accuracy on handwritten Chinese text outside those training samples.
- Using the adjusted Chinese handwritten text recognition model to recognize the Chinese text samples to be tested in step S30 can largely eliminate this over-learning. That is, recognizing the Chinese text samples to be tested exposes the errors caused by over-learning; these errors are reflected in the error texts, so the network parameters of the Chinese handwritten text recognition model can be further updated and optimized based on the error texts.
- S40: Input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using backpropagation through time (BPTT) with batch gradient descent to obtain the target Chinese handwritten text recognition model.
- an error text training sample is input into the adjusted Chinese handwritten text recognition model, and training is performed based on a continuous time classification algorithm.
- The error text training samples reflect the fact that, while the adjusted Chinese handwritten text recognition model was being trained, over-learning the characteristics of the non-standard Chinese text training samples made the model inaccurate when recognizing handwritten Chinese text outside those training samples.
- In addition, training the model first with standard Chinese text training samples and then with non-standard ones can overly weaken the previously learned features of standard text, which affects the "framework" for recognizing standard text that the model initially established.
- Error text training samples address both over-learning and over-weakening: based on the recognition accuracy problems they reveal, the adverse effects of over-learning and over-weakening from the original training process can be largely eliminated.
- Training with the error text training samples uses backpropagation through time (BPTT) with batch gradient descent, and the network parameters of the adjusted Chinese handwritten text recognition model are updated accordingly to obtain the target Chinese handwritten text recognition model.
- The target Chinese handwritten text recognition model is the finally trained model that can be used to recognize Chinese handwritten text.
- the training uses a recurrent neural network, which can combine the sequence characteristics of Chinese text to learn the deep features of Chinese text and improve the recognition rate of the target Chinese handwritten text recognition model.
- The training algorithm is the connectionist temporal classification (CTC) algorithm. Training with this algorithm does not require manual labeling and data alignment of the training samples, which reduces model complexity and allows direct training on unaligned, variable-length sequences.
- The sample size of the error text training samples is small (there are few error texts).
- With batch gradient descent, all errors produced by the error text training samples during training of the recurrent neural network are back-propagated, so that every error contributes to adjusting the network. This trains the recurrent neural network fully and improves the recognition accuracy of the target Chinese handwritten text recognition model.
- Standard Chinese text training samples are first used to train a standard Chinese text recognition model, which is then adjusted with the batched non-standard Chinese text training samples. The resulting adjusted Chinese handwritten text recognition model, which already recognizes standard text, learns the deep features of handwritten Chinese text through this training and can therefore recognize handwritten Chinese text better.
- The adjusted model is then trained on the error text training samples, based on the connectionist temporal classification (CTC) algorithm, and updated to obtain the target Chinese handwritten text recognition model.
- The error text training samples can largely eliminate the adverse effects of over-learning and over-weakening from the original training process and further improve recognition accuracy.
- Training the standard Chinese text recognition model and the adjusted Chinese handwritten text recognition model uses backpropagation through time (BPTT, based on mini-batch gradients). This keeps training efficient and effective even with a large number of training samples and, compared with updating on a single training sample, makes the error representative of a wider range of samples, so that the minimum of the error function is easier to find.
- Training the target Chinese handwritten text recognition model uses BPTT with batch gradient descent. Batch gradient descent ensures that the parameters of the model are fully updated: all errors produced by the training samples during training are back-propagated, the parameters are updated comprehensively according to those errors, and the recognition accuracy of the resulting model improves.
- Each model is trained using a recurrent neural network, which can combine the sequence characteristics of Chinese text to learn the deep features of Chinese text and realize the function of recognizing different handwritten Chinese text.
- The algorithm used to train each model is the connectionist temporal classification (CTC) algorithm. Training with this algorithm does not require manual labeling and data alignment of the training samples, which reduces model complexity and allows direct training on unaligned, variable-length sequences.
- In step S10, obtaining standard Chinese text training samples and dividing them into preset batches specifically includes the following steps:
- S101: Obtain the pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalize each pixel value in that matrix to obtain the normalized pixel value feature matrix of each Chinese text. The normalization formula is $y = \frac{x - MinValue}{MaxValue - MinValue}$, where MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization.
- the training text samples to be processed refers to the training samples that are initially acquired and not processed.
- a mature, open-source convolutional neural network may be used to extract the features of the Chinese text training samples to be processed, and obtain the pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed.
- the pixel value feature matrix of each Chinese text represents the features of the corresponding text.
- the pixel values represent the features of the text. Since the text is represented two-dimensionally by the image, the pixel values can be represented by a matrix, that is, the pixel value feature matrix.
- The computer device can recognize the form of the pixel value feature matrix and read the values in it.
- After the server obtains the pixel value feature matrix of each Chinese text, it applies the normalization formula to each pixel value in the matrix to obtain the normalized pixel value feature matrix of each Chinese text.
- Normalization compresses the pixel value feature matrix of every Chinese text into the same range, which speeds up the calculations involving the matrix and helps improve the efficiency of training the standard Chinese text recognition model.
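The min-max normalization of step S101 can be illustrated with a minimal sketch. The function name `normalize_pixels` and the handling of a constant image (all pixels equal, where MaxValue equals MinValue) are assumptions made for illustration; the source does not specify them.

```python
def normalize_pixels(matrix):
    """Min-max normalize a 2-D list of pixel values into the range [0, 1]."""
    flat = [value for row in matrix for value in row]
    max_value, min_value = max(flat), min(flat)
    span = max_value - min_value
    if span == 0:
        # Assumption: a constant image is mapped to all zeros
        # so that the division below is never by zero.
        return [[0.0 for _ in row] for row in matrix]
    # y = (x - MinValue) / (MaxValue - MinValue)
    return [[(x - min_value) / span for x in row] for row in matrix]


pixels = [[0, 51, 102], [153, 204, 255]]
normalized = normalize_pixels(pixels)
```

Every value of `normalized` lies between 0 and 1, so matrices from different images share one range.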
- S102: Divide the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes, establish the binarized pixel value feature matrix of each Chinese text based on the two classes, combine the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batch the standard Chinese text training samples according to a preset batch size.
- The pixel values in the normalized pixel value feature matrix of each Chinese text are divided into two classes.
- Two classes of pixel values means that every pixel value is either pixel value A or pixel value B.
- Specifically, a pixel value greater than or equal to 0.5 in the normalized pixel value feature matrix can be set to 1 and a pixel value less than 0.5 can be set to 0, which establishes the binarized pixel value feature matrix of each Chinese text.
- The elements of the binarized pixel value feature matrix of each Chinese text are only 0 or 1.
- The Chinese texts corresponding to the binarized pixel value feature matrices are combined as the standard Chinese text training samples, and these samples are batched according to a preset batch size.
- A "1" in the binarized pixel value feature matrix represents a pixel that belongs to the text, and a "0" represents a blank pixel in the image.
- Establishing the binarized pixel value feature matrix further simplifies the feature representation of the text. A matrix of 0s and 1s is enough to represent and distinguish each text, which speeds up the computer's processing of the text feature matrices and further improves the efficiency of training the standard Chinese text recognition model.
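As a small illustration of the thresholding described in step S102, the sketch below maps normalized values of at least 0.5 to 1 and all others to 0. The function name `binarize` and the `threshold` parameter are hypothetical; only the 0.5 threshold comes from the text.

```python
def binarize(normalized, threshold=0.5):
    """Turn a normalized pixel matrix into a matrix of 0s and 1s."""
    # Values >= threshold become 1 (text pixel), the rest become 0 (blank pixel).
    return [[1 if y >= threshold else 0 for y in row] for row in normalized]


normalized = [[0.0, 0.2, 0.5], [0.49, 0.8, 1.0]]
binary = binarize(normalized)
```

Note that the boundary value 0.5 is classified as 1, matching the "greater than or equal to" wording.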
- In steps S101-S102, the Chinese text training samples to be processed are normalized and their values divided into two classes to obtain the binarized pixel value feature matrix of each Chinese text, and the texts corresponding to these matrices are used as the standard Chinese text training samples, which can significantly shorten the time needed to train the standard Chinese text recognition model.
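Batching according to a preset batch size can be sketched as follows. The helper `make_batches` and the choice to keep a smaller final batch are assumptions; the source only states that the samples are batched according to a preset batch.

```python
def make_batches(samples, batch_size):
    """Split a list of training samples into consecutive mini-batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Assumption: the last batch may be smaller than batch_size.
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


samples = [f"sample_{i}" for i in range(7)]
batches = make_batches(samples, batch_size=3)
```

With seven samples and a batch size of three, the sketch yields two full batches and one batch holding the remaining sample.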
- In step S10, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a continuous time classification algorithm, and updating the network parameters of the recurrent neural network with a time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model specifically includes the following steps:
- S111: Input the batched standard Chinese text training samples into the recurrent neural network and train based on the continuous time classification algorithm to obtain the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network.
- The forward propagation output is expressed as $\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$, where t is the sequence step, u is the index of the output label value corresponding to t, and $y^{t}_{l'_u}$ is the probability that the output at step t is the label value $l'_u$.
- The backward propagation output is expressed as $\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$, where t is the sequence step, u is the index of the output label value corresponding to t, and $y^{t+1}_{l'_i}$ is the probability that the output at step t+1 is the label value $l'_i$. The index functions f(u) and g(u) select the label positions that are reachable at the previous step and at the next step, respectively.
- a batch of standardized Chinese text training samples is input into a recurrent neural network, and training is performed based on a continuous time classification (CTC) algorithm.
- The CTC algorithm is essentially an algorithm for computing an error function: it measures the error between the output produced when the input sequence data passes through the neural network and the real result (the objective fact, also called the label value). The corresponding error function can therefore be constructed by obtaining the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network and describing the error in terms of these two outputs.
- In CTC (continuous time classification, known in the literature as connectionist temporal classification), an output path assigns one label value, possibly a blank, to each sequence step. For example, if the output sequence is (a-ab-), where "-" denotes a blank, the letter a is the label value corresponding to the third step.
- $p(\pi|x)$ represents the probability that, given the input sequence x, the output path is $\pi$. Since the probabilities of the label values output at the individual sequence steps are independent of each other, $p(\pi|x) = \prod_{t=1}^{T} y^{t}_{\pi_t}$, where $y^{t}_{\pi_t}$ is the probability that the output at step t is the label value $\pi_t$.
- An output path is converted into a label sequence by a mapping transformation F, which may be the process of merging adjacent repeated characters and then removing blanks; applied to the example above, F maps (a-ab-) to aab.
- $p(l|x)$ represents the probability that, given the input sequence x (such as one sample of the standard Chinese text training samples), the output is the sequence l.
- This probability is the sum of the probabilities of all output paths $\pi$ whose mapped sequence is l, expressed by the formula $p(l|x) = \sum_{\pi \in F^{-1}(l)} p(\pi|x)$. As the length of the sequence l increases, the number of corresponding paths grows exponentially, so an iterative approach is adopted: the path probability corresponding to the sequence l is computed from step t together with steps t-1 and t+1, from the viewpoints of forward propagation and backward propagation, which improves the efficiency of the calculation.
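The mapping F and the path-sum definition of the sequence probability can be checked on a toy case by brute force. The sketch below is illustrative only: the three-symbol alphabet, the probability table `y`, and all function names are hypothetical, and enumerating every path is feasible only for very short sequences, which is exactly why the text moves to forward and backward recursions.

```python
from itertools import product

BLANK = "-"


def collapse(path):
    """Mapping F: merge adjacent repeated symbols, then drop blanks."""
    out, prev = [], None
    for symbol in path:
        if symbol != prev and symbol != BLANK:
            out.append(symbol)
        prev = symbol
    return "".join(out)


def path_probability(path, y, alphabet):
    """p(path | x): product of the per-step probabilities along the path."""
    prob = 1.0
    for t, symbol in enumerate(path):
        prob *= y[t][alphabet.index(symbol)]
    return prob


def sequence_probability(label, y, alphabet):
    """p(label | x): sum over every path of length T that F maps to label."""
    steps = len(y)
    return sum(
        path_probability(path, y, alphabet)
        for path in product(alphabet, repeat=steps)
        if collapse(path) == label
    )


alphabet = [BLANK, "a", "b"]
# y[t][k]: assumed probability of symbol alphabet[k] at step t (rows sum to 1).
y = [[0.2, 0.7, 0.1], [0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
p_ab = sequence_probability("ab", y, alphabet)
```

For this table, five paths (aab, abb, a-b, ab-, -ab) map to "ab", and their probabilities add up to 0.546.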
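- The forward variable $\alpha(t,u)$ is the sum of the probabilities of the set of paths of length t that F maps to the first $\lfloor u/2 \rfloor$ labels of l and whose output at step t is $l'_u$; here u/2 is used as an index, so it needs to be rounded down. $l'$ denotes the sequence l with a blank added at the beginning, at the end, and between every two adjacent labels, and $U'$ is the length of $l'$.
- $p(l|x)$ can be represented by the forward variable, that is, $p(l|x) = \alpha(T,U') + \alpha(T,U'-1)$, where $\alpha(T,U')$ can be understood as the total probability of all paths of length T that are mapped by F to the sequence l and whose output at time T is $l'_{U'}$, and $\alpha(T,U'-1)$ is the same quantity for $l'_{U'-1}$; the two terms cover whether or not the path ends with a blank.
- The forward variable can therefore be computed recursively over time: $\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$, where f(u) lists all positions the path may have occupied at the previous moment: $f(u) = \begin{cases} u-1, & \text{if } l'_u \text{ is a blank or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise} \end{cases}$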
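The forward recursion can be sketched in a few lines, assuming the usual CTC form in which position u of the blank-extended label sequence can be reached from positions u, u-1 and, when the two labels differ, u-2. Indices are 0-based in the code, whereas the text counts from 1; the names `ctc_forward` and `extend_with_blanks` and the probability table `y` are hypothetical.

```python
BLANK = "-"


def extend_with_blanks(label):
    """Build l': the label with blanks at both ends and between symbols."""
    extended = [BLANK]
    for symbol in label:
        extended += [symbol, BLANK]
    return extended


def ctc_forward(label, y, alphabet):
    """Return the forward table alpha and p(label | x)."""
    ext = extend_with_blanks(label)
    steps, size = len(y), len(ext)
    idx = [alphabet.index(s) for s in ext]
    alpha = [[0.0] * size for _ in range(steps)]
    # A path may start with the leading blank or with the first label.
    alpha[0][0] = y[0][idx[0]]
    if size > 1:
        alpha[0][1] = y[0][idx[1]]
    for t in range(1, steps):
        for u in range(size):
            # f(u): skipping over position u-1 is only allowed when l'_u is
            # not a blank and differs from l'_{u-2}.
            if ext[u] == BLANK or (u >= 2 and ext[u - 2] == ext[u]):
                start = u - 1
            else:
                start = u - 2
            start = max(start, 0)
            alpha[t][u] = y[t][idx[u]] * sum(alpha[t - 1][start:u + 1])
    # Valid paths end on the last label or on the trailing blank.
    total = alpha[-1][-1] + (alpha[-1][-2] if size > 1 else 0.0)
    return alpha, total


alphabet = [BLANK, "a", "b"]
y = [[0.2, 0.7, 0.1], [0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
alpha, p_forward = ctc_forward("ab", y, alphabet)
```

The table has T rows and U' columns, so the cost grows linearly with sequence length instead of exponentially with the number of paths.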
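- Similarly, a backward variable $\beta(t,u)$ can be defined: starting from time t+1, a path $\pi'$ is appended to the paths counted in the forward variable $\alpha(t,u)$, and $\beta(t,u)$ is the sum of the probabilities of all such $\pi'$ for which the complete path is finally mapped by F to the sequence l.
- The backward variable is computed recursively by the formula $\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$.
- g(u) represents the possible path selection function at time t+1, which is expressed as $g(u) = \begin{cases} u+1, & \text{if } l'_u \text{ is a blank or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise} \end{cases}$. According to the recursive expression of the forward variable and the recursive expression of the backward variable, the process of forward propagation and the process of backward propagation can be described, and the corresponding forward propagation output and backward propagation output can be obtained.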
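A matching sketch of the backward recursion is given below, again assuming the usual CTC form: the backward value at the final step is 1 for the last label and for the trailing blank, and the emission probability of step t itself is not included in the backward value at step t. Function names and the probability table are hypothetical, and indices are 0-based.

```python
BLANK = "-"


def extend_with_blanks(label):
    """Build l': the label with blanks at both ends and between symbols."""
    extended = [BLANK]
    for symbol in label:
        extended += [symbol, BLANK]
    return extended


def ctc_backward(label, y, alphabet):
    """Return the backward table beta and p(label | x) computed from it."""
    ext = extend_with_blanks(label)
    steps, size = len(y), len(ext)
    idx = [alphabet.index(s) for s in ext]
    beta = [[0.0] * size for _ in range(steps)]
    # At the final step only the last label and the trailing blank are valid.
    beta[-1][-1] = 1.0
    if size > 1:
        beta[-1][-2] = 1.0
    for t in range(steps - 2, -1, -1):
        for u in range(size):
            # g(u): jumping over position u+1 is only allowed when l'_u is
            # not a blank and differs from l'_{u+2}.
            if ext[u] == BLANK or (u + 2 < size and ext[u + 2] == ext[u]):
                end = u + 1
            else:
                end = u + 2
            end = min(end, size - 1)
            beta[t][u] = sum(
                beta[t + 1][i] * y[t + 1][idx[i]] for i in range(u, end + 1)
            )
    # Combine with the first-step emissions of the two valid start positions.
    total = beta[0][0] * y[0][idx[0]]
    if size > 1:
        total += beta[0][1] * y[0][idx[1]]
    return beta, total


alphabet = [BLANK, "a", "b"]
y = [[0.2, 0.7, 0.1], [0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
beta, p_backward = ctc_backward("ab", y, alphabet)
```

On the same toy table, the backward pass gives the same sequence probability as the forward pass, which is a useful consistency check when implementing both.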
- S112: Construct an error function according to the forward propagation output and the backward propagation output.
- S113: According to the error function, update the network parameters of the recurrent neural network with a time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model.
- A time-dependent back-propagation algorithm (based on mini-batch gradients) can be used to update the network parameters. Specifically, the partial derivative of the error function with respect to the network output before the softmax layer (that is, the gradient) is computed, and the network parameters are updated by subtracting the product of the gradient and the learning rate from the original network parameters.
- In steps S111-S113, an error function is constructed from the forward propagation output and the backward propagation output that the standard Chinese text training samples produce in the recurrent neural network; the error is back-propagated according to this error function and the network parameters are updated, which yields the standard Chinese text recognition model.
- The model learns the deep features of the standard Chinese text training samples and can accurately recognize standard text.
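The update rule described above (new parameter = old parameter minus learning rate times gradient) can be sketched as follows. Averaging the per-sample gradients of a mini-batch is an assumption made for illustration; the source does not state how the gradients within a batch are combined, and the gradient values are made up.

```python
def mean_gradient(per_sample_grads):
    """Average the gradients of the samples in one mini-batch (assumed)."""
    count = len(per_sample_grads)
    return [sum(column) / count for column in zip(*per_sample_grads)]


def gradient_step(params, grads, learning_rate):
    """Subtract the product of learning rate and gradient from each parameter."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


params = [0.5, -0.2]
batch_grads = [[0.1, -0.4], [0.3, 0.0]]  # hypothetical per-sample gradients
updated = gradient_step(params, mean_gradient(batch_grads), learning_rate=0.1)
```

With batch gradient descent, as used for the target model, the same step would be taken once over the gradients of all training samples rather than once per mini-batch.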
- In step S30, using the adjusted Chinese handwritten text recognition model to recognize the Chinese text samples to be tested, obtaining the error texts whose recognition results do not match the real results, and taking all the error texts as the error text training samples specifically includes the following steps:
- S31: Input the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtain the output values of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model.
- The adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested, which include several Chinese texts.
- Each text consists of characters, and the output value of each text mentioned in this embodiment specifically refers to the output values corresponding to each character in the text.
- The Chinese character library contains more than 3,000 commonly used Chinese characters (including blanks and various Chinese punctuation marks).
- In the adjusted Chinese handwritten text recognition model, a probability value should be set for the degree of similarity between each character in the Chinese character library and a character in the input Chinese text sample to be tested; this can be implemented with the softmax function.
- For each output sequence number, the model thus gives the probability values of similarity between the corresponding character and each character in the Chinese character library. These probability values are the output values of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model.
- The recognition result of each text can be determined according to the probability values.
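The softmax function mentioned above turns the raw scores of one output step into probability values that sum to 1. A minimal sketch, with made-up scores for a three-character library:

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
```

The largest score receives the largest probability, which is what the selection of the maximum output value in the next step relies on.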
- S32 Select the maximum output value among the output values corresponding to each text, and obtain the recognition result of each text according to the maximum output value.
- a maximum output value among all output values corresponding to each text is selected, and a recognition result of the text can be obtained according to the maximum output value.
- the output value directly reflects the similarity between the words in the input Chinese text sample and each character in the Chinese character library, and the maximum output value indicates that the word in the text sample to be tested is closest to a certain character in the Chinese character library.
- The actual output can be determined from the characters corresponding to the maximum output values. According to the definition of the continuous time classification algorithm, this actual output generally still contains repeated characters and blanks (written "_"), for example "_你你_们_好_" rather than "你们好".
- The actual output therefore needs further processing: adjacent repeated characters are merged so that only one remains, and the blanks are removed, which gives the recognition result. For example, the recognition result in this embodiment is "你们好" ("hello").
- The maximum output value determines which character is output at each step, and the removal of repeated characters and blanks then effectively yields the recognition result of each text.
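Selecting the maximum output value at each step and then removing repeated characters and blanks can be sketched as a greedy decoder. The four-entry character set, the per-step probabilities, and the function name `greedy_decode` are hypothetical and stand in for the real character library of more than 3,000 entries.

```python
BLANK = "_"


def greedy_decode(step_probs, charset):
    """Pick the most probable character per step, then collapse the result."""
    raw = [charset[max(range(len(p)), key=p.__getitem__)] for p in step_probs]
    decoded, prev = [], None
    for ch in raw:
        # Keep a character only if it is not a blank and not a repeat
        # of the immediately preceding output.
        if ch != prev and ch != BLANK:
            decoded.append(ch)
        prev = ch
    return "".join(raw), "".join(decoded)


charset = [BLANK, "你", "们", "好"]
best = [0, 1, 1, 0, 2, 0, 3, 0]  # assumed index of the winning entry per step
step_probs = [[0.7 if k == b else 0.1 for k in range(4)] for b in best]
raw_output, result = greedy_decode(step_probs, charset)
```

In this toy run, the raw output still contains blanks and a repeated character, while the decoded result is the clean three-character text.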
- In step S33, the obtained recognition result is compared with the real result (the objective fact), and each text whose recognition result does not match the real result is taken as an error text; all error texts form the error text training samples.
- The recognition result is only what the adjusted Chinese handwritten text recognition model recognizes for the Chinese text samples to be tested. It may differ from the real result, which shows that the model still has shortcomings in recognition accuracy; these shortcomings can be reduced by training on the error text training samples to achieve more accurate recognition.
- In steps S31-S33, the maximum output value is selected from the output values that the adjusted Chinese handwritten text recognition model produces for each text in the Chinese text samples to be tested, where the output values reflect the degree of similarity between texts (in fact, between characters). The recognition result is then obtained from the maximum output value, and the error text training samples are obtained according to the recognition results, which provides an important technical basis for subsequently using the error text training samples to further optimize recognition accuracy.
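Collecting the error text training samples amounts to keeping every test sample whose recognition result differs from its real result. In the sketch below, the `recognize` callable stands in for the adjusted model, and the sample identifiers and the lookup table of predictions are invented purely for illustration.

```python
def collect_error_samples(samples, recognize):
    """Keep the (image, truth) pairs whose recognition result is wrong."""
    return [(image, truth) for image, truth in samples if recognize(image) != truth]


# Hypothetical stand-in for the adjusted model: a lookup of fixed predictions.
predictions = {"img_1": "你好", "img_2": "你们奸", "img_3": "中文"}
test_samples = [("img_1", "你好"), ("img_2", "你们好"), ("img_3", "中文")]
error_samples = collect_error_samples(test_samples, predictions.get)
```

Only the mismatched pair is retained, and it keeps its real result as the label for the subsequent training round.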
- Before step S10, that is, before the step of obtaining the standard Chinese text training samples, the handwriting model training method further includes the following step: initializing a recurrent neural network.
- the initialization of the recurrent neural network is to initialize the network parameters of the network and assign initial values to the network parameters. If the initial weight is in a relatively flat area of the error surface, the convergence speed of the RNN model training may be abnormally slow.
- The network parameters can be initialized to be uniformly distributed in a relatively small zero-mean interval, such as [-0.30, +0.30].
- Reasonable initialization of the recurrent neural network gives the network more flexibility in the initial stage, allows it to be adjusted effectively during training, and helps find the minimum of the error function quickly and effectively. This benefits the updating and adjustment of the recurrent neural network, so that the model obtained by training on the recurrent neural network recognizes Chinese handwriting accurately.
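Uniform initialization in a small zero-mean interval can be sketched with the standard library alone. The function name `init_uniform`, the matrix shape, and the `seed` parameter are assumptions; only the interval [-0.30, +0.30] comes from the text.

```python
import random


def init_uniform(rows, cols, limit=0.30, seed=None):
    """Create a rows x cols weight matrix drawn uniformly from [-limit, +limit]."""
    rng = random.Random(seed)  # a seed makes the initialization reproducible
    return [[rng.uniform(-limit, limit) for _ in range(cols)] for _ in range(rows)]


weights = init_uniform(4, 3, seed=0)
```

Because the interval is symmetric around zero, the expected value of every weight is zero.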
- The network parameters of the recurrent neural network are initialized to be uniformly distributed in a relatively small zero-mean interval, such as [-0.30, +0.30]. This helps find the minimum of the error function quickly and efficiently, which benefits the updating and adjustment of the recurrent neural network.
- The Chinese text training samples to be processed are normalized and divided into two classes of values to obtain the binarized pixel value feature matrix of each Chinese text, and the texts corresponding to these matrices are used as the standard Chinese text training samples, which can significantly shorten the time for training the standard Chinese text recognition model.
- Based on the error function, the error is back-propagated and the network parameters are updated to obtain the standard Chinese text recognition model.
- The model learns the deep features of the standard Chinese text training samples and can accurately recognize standard text.
- The standard Chinese text recognition model is then adjusted and updated with the batched non-standard Chinese text, so that the adjusted Chinese handwritten text recognition model obtained after the update, while retaining the ability to recognize standard text, learns the deep features of non-standard Chinese text through training and can better recognize non-standard Chinese handwritten text.
- The adjusted Chinese handwritten text recognition model is then used to recognize the Chinese text samples to be tested: the maximum output value, which reflects the degree of similarity between texts, is selected from the output values and used to obtain the recognition result.
- The error text training samples are obtained according to the recognition results, and all error texts are input as error text training samples into the adjusted Chinese handwritten text recognition model, which is trained and updated based on the continuous time classification algorithm to obtain the target Chinese handwritten text recognition model.
- Using the error text training samples can largely eliminate the adverse effects of over-learning and over-weakening produced during the original training process, and can further optimize recognition accuracy.
- The standard Chinese text recognition model and the adjusted Chinese handwritten text recognition model are trained with a back-propagation algorithm based on mini-batch gradients (the training samples are batched according to a preset batch size), which retains good training efficiency and training effect even with a large number of training samples.
- The target Chinese handwritten text recognition model is trained with a time-dependent back-propagation algorithm using batch gradient descent, which ensures that the parameters of the model are fully updated: the errors produced by the training samples during training are back-propagated, and the parameters are updated comprehensively according to these errors to improve the recognition accuracy of the resulting model.
- FIG. 6 shows a principle block diagram of a handwriting model training device corresponding to the handwriting model training method in the embodiment.
- the handwriting model training device includes a standard Chinese text recognition model acquisition module 10, an adjusted Chinese handwriting text recognition model acquisition module 20, an error text training sample acquisition module 30, and a target Chinese handwriting text recognition model acquisition module 40.
- The functions implemented by the standard Chinese text recognition model acquisition module 10, the adjusted Chinese handwritten text recognition model acquisition module 20, the error text training sample acquisition module 30, and the target Chinese handwritten text recognition model acquisition module 40 correspond one-to-one to the steps of the handwriting model training method in the embodiment. To avoid redundancy, they are not detailed one by one in this embodiment.
- The standard Chinese text recognition model acquisition module 10 is configured to obtain standard Chinese text training samples, batch them according to a preset batch size, input the batched standard Chinese text training samples into a recurrent neural network, train based on a continuous time classification algorithm, and update the network parameters of the recurrent neural network with a time-dependent back-propagation algorithm to obtain the standard Chinese text recognition model.
- The adjusted Chinese handwritten text recognition model acquisition module 20 is configured to obtain non-standard Chinese text training samples, batch them according to a preset batch size, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, train based on a continuous time classification algorithm, and update the network parameters of the standard Chinese text recognition model with a time-dependent back-propagation algorithm to obtain the adjusted Chinese handwritten text recognition model.
- The error text training sample acquisition module 30 is configured to obtain Chinese text samples to be tested, use the adjusted Chinese handwritten text recognition model to recognize them, obtain the error texts whose recognition results do not match the real results, and take all the error texts as the error text training samples.
- The target Chinese handwritten text recognition model acquisition module 40 is configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on a continuous time classification algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model with a time-dependent back-propagation algorithm using batch gradient descent to obtain the target Chinese handwritten text recognition model.
- The standard Chinese text recognition model acquisition module 10 includes a normalized pixel value feature matrix acquisition unit 101, a standard Chinese text training sample acquisition unit 102, a propagation output acquisition unit 111, an error function construction unit 112, and a standard Chinese text recognition model acquisition unit 113.
- The normalized pixel value feature matrix acquisition unit 101 is configured to obtain the pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and normalize each pixel value in that matrix to obtain the normalized pixel value feature matrix of each Chinese text, where the normalization formula is $y = \frac{x - MinValue}{MaxValue - MinValue}$, MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization.
- The standard Chinese text training sample acquisition unit 102 is configured to divide the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes, establish the binarized pixel value feature matrix of each Chinese text based on the two classes, combine the Chinese texts corresponding to the binarized pixel value feature matrices as the standard Chinese text training samples, and batch the standard Chinese text training samples according to a preset batch size.
- The propagation output acquisition unit 111 is configured to input the batched standard Chinese text training samples into a recurrent neural network, train based on a continuous time classification algorithm, and obtain the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network.
- The forward propagation output is expressed as $\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$, where t is the sequence step, u is the index of the output label value corresponding to t, and $y^{t}_{l'_u}$ is the probability that the output at step t is the label value $l'_u$. The backward propagation output is expressed as $\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$, where $y^{t+1}_{l'_i}$ is the probability that the output at step t+1 is the label value $l'_i$.
- the error function constructing unit 112 is configured to construct an error function according to the forward propagation output and the backward propagation output.
- the standard Chinese text recognition model acquisition unit 113 is configured to update the network parameters of the recurrent neural network by using a time-dependent back-propagation algorithm according to an error function to obtain a standard Chinese text recognition model.
- the error text training sample acquisition module 30 includes a model output value acquisition unit 31, a model recognition result acquisition unit 32, and an error text training sample acquisition unit 33.
- the model output value acquiring unit 31 is configured to input a Chinese text sample to be tested into the adjusted Chinese handwritten text recognition model, and obtain an output value of each text in the Chinese text sample to be tested in the adjusted Chinese handwritten text recognition model.
- the model recognition result obtaining unit 32 is configured to select a maximum output value among output values corresponding to each text, and obtain a recognition result of each text according to the maximum output value.
- the error text training sample acquisition unit 33 is configured to obtain error texts whose recognition results do not match the real results according to the recognition results, and use all the error texts as the error text training samples.
- the handwriting model training device further includes an initialization module 50 for initializing a recurrent neural network.
- FIG. 7 shows a flowchart of the text recognition method in this embodiment.
- the text recognition method can be applied to computer equipment configured by banks, investment, and insurance institutions to recognize handwritten Chinese text and achieve artificial intelligence purposes. As shown in FIG. 7, the text recognition method includes the following steps:
- S50 Obtain the Chinese text to be recognized, use the target Chinese handwritten text recognition model to identify the Chinese text to be recognized, and obtain the output value of the Chinese text to be recognized in the target Chinese handwritten text recognition model.
- The target Chinese handwritten text recognition model is obtained by training with the above handwriting model training method.
- The Chinese text to be recognized refers to the Chinese text on which recognition is to be performed.
- The Chinese text to be recognized is obtained and input into the target Chinese handwritten text recognition model for recognition, and the probability values of similarity between the character corresponding to each output sequence number of the model and each character in the Chinese character library are obtained.
- These probability values are the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model.
- the recognition result of the Chinese text to be recognized can be determined based on the output value.
- S60 Select the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.
- The maximum output value among all output values corresponding to the Chinese text to be recognized is selected, and the corresponding actual output is determined according to the maximum output value, for example "_你你_们_好_", where "_" denotes a blank. The actual output is then further processed: adjacent repeated characters are merged so that only one remains, and the blanks are removed, which gives the recognition result of the Chinese text to be recognized.
- The maximum output value determines which character is output at each step of the actual output, and the removal of repeated characters and blanks then effectively yields the recognition result of each text and improves recognition accuracy.
- The target Chinese handwritten text recognition model is used to recognize the Chinese text to be recognized, and the recognition result is obtained from the maximum output value together with the removal of repeated characters and blanks.
- The target Chinese handwritten text recognition model itself has high recognition accuracy, and combining it with a Chinese semantic thesaurus further improves the accuracy of Chinese handwriting recognition.
- The Chinese text to be recognized is input into the target Chinese handwritten text recognition model for recognition, and the recognition result is obtained in combination with a preset Chinese semantic thesaurus.
- When the target Chinese handwritten text recognition model is used to recognize Chinese handwritten text, accurate recognition results can be obtained.
- FIG. 8 shows a principle block diagram of a text recognition device that corresponds one-to-one to the text recognition method in the embodiment.
- the text recognition device includes an output value acquisition module 60 and a recognition result acquisition module 70.
- The functions implemented by the output value acquisition module 60 and the recognition result acquisition module 70 correspond one-to-one to the steps of the text recognition method in the embodiment. To avoid redundancy, they are not detailed one by one in this embodiment.
- The output value acquisition module 60 is configured to obtain the Chinese text to be recognized, use the target Chinese handwritten text recognition model to recognize it, and obtain the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model; the target Chinese handwritten text recognition model is obtained with the handwriting model training method.
- the recognition result acquisition module 70 is configured to select a maximum output value among output values corresponding to the Chinese text to be recognized, and obtain a recognition result of the Chinese text to be recognized according to the maximum output value.
- This embodiment provides one or more non-volatile readable storage media storing computer-readable instructions.
- When the computer-readable instructions are executed by one or more processors, the one or more processors implement the handwriting model training method in the embodiment. To avoid repetition, details are not repeated here.
- Alternatively, when the computer-readable instructions are executed by the one or more processors, the functions of each module/unit of the handwriting model training device in the embodiment are implemented. To avoid repetition, details are not repeated here.
- Alternatively, when the computer-readable instructions are executed by the one or more processors, the functions of each step of the text recognition method in the embodiment are implemented. To avoid repetition, details are not repeated here.
- Alternatively, when the computer-readable instructions are executed by the one or more processors, the functions of each module/unit of the text recognition device in the embodiment are implemented. To avoid repetition, details are not repeated here.
- FIG. 9 is a schematic diagram of a computer device according to an embodiment of the present application.
- the computer device 80 of this embodiment includes a processor 81, a memory 82, and computer-readable instructions 83 stored in the memory 82 and executable on the processor 81.
- When the computer-readable instructions 83 are executed by the processor 81, the handwriting model training method in the embodiment is implemented. To avoid repetition, details are not described here one by one.
- Alternatively, when the computer-readable instructions 83 are executed by the processor 81, the functions of each module/unit of the handwriting model training device in the embodiment are implemented. To avoid repetition, details are not described here one by one.
- Alternatively, when the computer-readable instructions 83 are executed by the processor 81, the functions of the steps of the text recognition method in the embodiment are implemented. To avoid repetition, details are not described here one by one.
- Alternatively, when the computer-readable instructions 83 are executed by the processor 81, the functions of each module/unit of the text recognition device in the embodiment are implemented. To avoid repetition, details are not described here one by one.
- the computer device 80 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
- the computer device 80 may include, but is not limited to, a processor 81 and a memory 82.
- FIG. 9 is only an example of the computer device 80 and does not constitute a limitation on the computer device 80. It may include more or fewer components than shown in the figure, combine certain components, or use different components.
- for example, the computer device may also include input and output devices, network access devices, and buses.
- the processor 81 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
- a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
- the memory 82 may be an internal storage unit of the computer device 80, such as a hard disk or a memory of the computer device 80.
- the memory 82 may also be an external storage device of the computer device 80, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device 80.
- the memory 82 may also include both an internal storage unit of the computer device 80 and an external storage device.
- the memory 82 is used to store computer-readable instructions 83 and other programs and data required by the computer device.
- the memory 82 may also be used to temporarily store data that has been or will be output.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
- the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Character Discrimination (AREA)
Abstract
Disclosed are a handwriting model training method, a text recognition method and apparatus, a device, and a medium. The handwriting model training method comprises: obtaining a standard Chinese text training sample, performing batch classification on the standard Chinese text training sample according to a preset batch, inputting the classified standard Chinese text training sample into a recurrent neural network, performing training on the basis of a continuous time classification algorithm, updating a network parameter by using a time-related back propagation algorithm, and obtaining a standard Chinese text recognition model; obtaining and using a non-standard Chinese text training sample, and training and obtaining an adjusted Chinese handwriting text recognition model; obtaining and using a Chinese text sample to be tested to obtain an error text training sample; and using the error text training sample to update a network parameter of the Chinese handwriting text recognition model, and obtaining a target Chinese handwriting text recognition model. By using the handwriting model training method, a target Chinese handwriting text recognition model having a high recognition rate in handwriting text recognition can be obtained.
Description
This application is based on, and claims priority to, Chinese patent application No. 201810564063.8, filed on June 4, 2018 and entitled "Handwriting Model Training Method, Text Recognition Method, Apparatus, Device, and Medium".
The present application relates to the field of Chinese text recognition, and in particular to a handwriting model training method, a text recognition method, an apparatus, a device, and a medium.
When traditional text recognition methods are used to recognize sloppily written non-standard text (handwritten Chinese text), recognition accuracy is low and the results are unsatisfactory. Traditional text recognition methods can, to a large extent, recognize only standard text, and their accuracy is low when they are applied to the wide variety of handwritten text found in real life.
Summary of the Invention
The embodiments of the present application provide a handwriting model training method, apparatus, device, and medium to solve the problem that the accuracy of current handwritten Chinese text recognition is low.
A handwriting model training method includes:
obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, performing training based on the connectionist temporal classification (CTC) algorithm, updating the network parameters of the recurrent neural network by using the backpropagation through time (BPTT) algorithm, and obtaining a standard Chinese text recognition model;
obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to a preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, performing training based on the CTC algorithm, updating the network parameters of the standard Chinese text recognition model by using the BPTT algorithm, and obtaining an adjusted Chinese handwritten text recognition model;
obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtaining the error texts whose recognition results do not match the true results, and taking all the error texts as error text training samples;
inputting the error text training samples into the adjusted Chinese handwritten text recognition model, performing training based on the CTC algorithm, updating the network parameters of the adjusted Chinese handwritten text recognition model by using the BPTT algorithm with batch gradient descent, and obtaining a target Chinese handwritten text recognition model.
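As a minimal sketch of the third step above (collecting the error texts), the Python snippet below keeps every test sample whose recognition result differs from its true result. The names `collect_error_samples` and `toy_recognizer`, and the (features, label) sample layout, are hypothetical and used only for illustration; the toy recognizer merely stands in for the adjusted Chinese handwritten text recognition model.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical sample layout: (feature vector, ground-truth text).
Sample = Tuple[Sequence[float], str]


def collect_error_samples(
    recognize: Callable[[Sequence[float]], str],
    test_samples: List[Sample],
) -> List[Sample]:
    """Return every test sample whose recognized text differs from its label."""
    return [(x, y) for x, y in test_samples if recognize(x) != y]


def toy_recognizer(features: Sequence[float]) -> str:
    """Stand-in for the adjusted model: decides by the sign of the feature sum."""
    return "你好" if sum(features) > 0 else "再见"


test_samples = [
    ([1.0, 2.0], "你好"),    # recognized correctly
    ([-1.0, -2.0], "你好"),  # misrecognized as "再见", so it is kept
    ([-3.0], "再见"),        # recognized correctly
]
error_samples = collect_error_samples(toy_recognizer, test_samples)
```

The resulting `error_samples` list is what the fourth step would feed back into the adjusted model for further training.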
A handwriting model training apparatus includes:
a standard Chinese text recognition model acquisition module, configured to obtain standard Chinese text training samples, divide the standard Chinese text training samples into batches according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, perform training based on the connectionist temporal classification (CTC) algorithm, update the network parameters of the recurrent neural network by using the backpropagation through time (BPTT) algorithm, and obtain a standard Chinese text recognition model;
an adjusted Chinese handwritten text recognition model acquisition module, configured to obtain non-standard Chinese text training samples, divide the non-standard Chinese text training samples into batches according to a preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, perform training based on the CTC algorithm, update the network parameters of the standard Chinese text recognition model by using the BPTT algorithm, and obtain an adjusted Chinese handwritten text recognition model;
an error text training sample acquisition module, configured to obtain Chinese text samples to be tested, recognize the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and take all the error texts as error text training samples;
a target Chinese handwritten text recognition model acquisition module, configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, perform training based on the CTC algorithm, update the network parameters of the adjusted Chinese handwritten text recognition model by using the BPTT algorithm with batch gradient descent, and obtain a target Chinese handwritten text recognition model.
A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented:
obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, performing training based on the connectionist temporal classification (CTC) algorithm, updating the network parameters of the recurrent neural network by using the backpropagation through time (BPTT) algorithm, and obtaining a standard Chinese text recognition model;
obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to a preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, performing training based on the CTC algorithm, updating the network parameters of the standard Chinese text recognition model by using the BPTT algorithm, and obtaining an adjusted Chinese handwritten text recognition model;
obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtaining the error texts whose recognition results do not match the true results, and taking all the error texts as error text training samples;
inputting the error text training samples into the adjusted Chinese handwritten text recognition model, performing training based on the CTC algorithm, updating the network parameters of the adjusted Chinese handwritten text recognition model by using the BPTT algorithm with batch gradient descent, and obtaining a target Chinese handwritten text recognition model.
One or more non-volatile readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, performing training based on the connectionist temporal classification (CTC) algorithm, updating the network parameters of the recurrent neural network by using the backpropagation through time (BPTT) algorithm, and obtaining a standard Chinese text recognition model;
obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to a preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, performing training based on the CTC algorithm, updating the network parameters of the standard Chinese text recognition model by using the BPTT algorithm, and obtaining an adjusted Chinese handwritten text recognition model;
obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested by using the adjusted Chinese handwritten text recognition model, obtaining the error texts whose recognition results do not match the true results, and taking all the error texts as error text training samples;
inputting the error text training samples into the adjusted Chinese handwritten text recognition model, performing training based on the CTC algorithm, updating the network parameters of the adjusted Chinese handwritten text recognition model by using the BPTT algorithm with batch gradient descent, and obtaining a target Chinese handwritten text recognition model.
The embodiments of the present application further provide a text recognition method, apparatus, device, and medium to solve the problem that the accuracy of current handwritten text recognition is low.
A text recognition method includes:
obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained by using the handwriting model training method;
selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
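The second step amounts to an arg-max over the model's output values. The following is a minimal Python sketch, assuming (hypothetically) that the output values are available as a mapping from candidate text to score; the function name and the scores are illustrative and do not come from the source.

```python
from typing import Dict


def pick_recognition_result(output_values: Dict[str, float]) -> str:
    """Return the candidate text whose output value is the largest."""
    if not output_values:
        raise ValueError("the model produced no output values")
    return max(output_values, key=output_values.get)


# Illustrative scores only; they are not taken from the source.
scores = {"你好": 0.91, "您好": 0.06, "再见": 0.03}
result = pick_recognition_result(scores)
```

Here `result` is the candidate with the highest score, which is taken as the recognition result.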
A text recognition apparatus includes:
an output value acquisition module, configured to obtain Chinese text to be recognized, recognize the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtain the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained by using the handwriting model training method;
a recognition result acquisition module, configured to select the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.
A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented:
obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained by using the above handwriting model training method;
selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
One or more non-volatile readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized by using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained by using the above handwriting model training method;
selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the description, the drawings, and the claims.
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a diagram of an application environment of the handwriting model training method in an embodiment of the present application;
FIG. 2 is a flowchart of the handwriting model training method in an embodiment of the present application;
FIG. 3 is a detailed flowchart of step S10 in FIG. 2;
FIG. 4 is another detailed flowchart of step S10 in FIG. 2;
FIG. 5 is a detailed flowchart of step S30 in FIG. 2;
FIG. 6 is a schematic diagram of the handwriting model training apparatus in an embodiment of the present application;
FIG. 7 is a flowchart of the text recognition method in an embodiment of the present application;
FIG. 8 is a schematic diagram of the text recognition apparatus in an embodiment of the present application;
FIG. 9 is a schematic diagram of the computer device in an embodiment of the present application.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
FIG. 1 shows the application environment of the handwriting model training method provided by an embodiment of the present application. The application environment includes a server and a client connected through a network. The client is a device that can interact with a user, including but not limited to a computer, a smartphone, or a tablet. The server may be implemented as an independent server or as a server cluster consisting of multiple servers. The handwriting model training method provided in the embodiment of the present application is applied to the server.
FIG. 2 shows a flowchart of the handwriting model training method in an embodiment of the present application. As shown in FIG. 2, the handwriting model training method includes the following steps:
S10: Obtain standard Chinese text training samples, divide the standard Chinese text training samples into batches according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, perform training based on the connectionist temporal classification (CTC) algorithm, update the network parameters of the recurrent neural network by using the backpropagation through time (BPTT) algorithm, and obtain a standard Chinese text recognition model.
Here, standard Chinese text training samples are training samples obtained from standard text (for example, text composed of an ordered sequence of characters in a Chinese typeface such as Kai (regular script), Song, or Li (clerical script); Kai or Song is generally chosen). A recurrent neural network (RNN) is a neural network that models sequence data. Chinese text is an ordered sequence of characters, so an RNN can learn the deep sequential features of Chinese text well. The connectionist temporal classification (CTC) algorithm is a fully end-to-end acoustic model training algorithm: it does not require the training samples to be aligned in advance, and needs only an input sequence and an output sequence for training. In one embodiment, the batched standard Chinese text training samples are input into the recurrent neural network for training, and mini-batch gradient descent is used when the backpropagation algorithm updates the weights and biases of the neural network. Mini-batch gradient descent (MBGD) is a method that, when updating the network parameters, accumulates the errors produced during training batch by batch according to the preset batches, obtains the accumulated error of each batch, and uses these accumulated errors to update the parameters. The backpropagation through time (BPTT) algorithm is a training and learning method for neural networks that updates and adjusts the network parameters between the nodes of the network. Adjusting the network parameters with BPTT requires finding a minimum of the error function; in this embodiment, that minimum is found by mini-batch gradient descent.
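To illustrate why CTC needs no pre-aligned samples, the sketch below computes the probability that CTC assigns to a label sequence by summing over all frame-level alignments with the standard forward recursion. The source does not spell out this recursion; the function name, the blank index `BLANK = 0`, and the toy probabilities are assumptions made for illustration. The CTC training loss is the negative logarithm of this probability.

```python
import math
from typing import List, Sequence

BLANK = 0  # index of the CTC blank symbol (an assumed convention)


def ctc_label_probability(probs: Sequence[Sequence[float]], label: List[int]) -> float:
    """Sum the probabilities of all frame alignments that collapse to `label`.

    probs[t][k] is the network's probability of class k at time step t.
    """
    # Interleave blanks around the labels: [a, b] -> [blank, a, blank, b, blank].
    ext = [BLANK]
    for c in label:
        ext += [c, BLANK]
    S, T = len(ext), len(probs)

    # Initialization: an alignment may start with a blank or the first label.
    alpha = [0.0] * S
    alpha[0] = probs[0][BLANK]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        prev, alpha = alpha, [0.0] * S
        for s in range(S):
            total = prev[s]                      # stay on the same symbol
            if s >= 1:
                total += prev[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # A valid alignment ends on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)


# Two time steps, two classes (blank and one character), uniform outputs.
frame_probs = [[0.5, 0.5], [0.5, 0.5]]
p = ctc_label_probability(frame_probs, [1])  # alignments "a-", "-a", "aa"
loss = -math.log(p)
```

With these toy inputs, three alignments of probability 0.25 each collapse to the single-character label, so `p` is 0.75.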
In this embodiment, standard Chinese text training samples are obtained and divided into batches according to a preset batch. The standard Chinese text training samples all use the same typeface (typefaces are not mixed); for example, all the standard Chinese text training samples used for model training are in the Song typeface, and this embodiment uses Song as the example. It can be understood that the Chinese typefaces making up standard text are the mainstream typefaces currently in use, such as Song, the default typeface in the input methods of computer devices, and Kai, a mainstream typeface commonly used for copying practice. Typefaces rarely used in daily life, such as cursive script (Caoshu) and Youyuan, are not among the typefaces that make up standard text. After the standard Chinese text training samples are obtained and batched, they are input into the recurrent neural network and trained based on the connectionist temporal classification (CTC) algorithm; the backpropagation through time (BPTT) algorithm, based on mini-batch gradients, updates the network parameters of the recurrent neural network to obtain the standard Chinese text recognition model. During training, this model learns the deep features of the standard Chinese text training samples, so it can recognize standard text accurately. Moreover, training the standard Chinese text recognition model requires no manual labeling or data alignment of the training samples, so end-to-end training can be performed directly. It should be noted that whichever typeface the training samples use, whether Kai, Song, Li, or another, standard texts in these typefaces differ little at the level of character recognition. The trained standard Chinese text recognition model can therefore accurately recognize standard text in Kai, Song, Li, and similar typefaces, and obtain fairly accurate recognition results.
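The batching and mini-batch update described above can be sketched as follows. This is only an illustration under stated assumptions: a one-parameter linear model with squared error stands in for the recurrent neural network with CTC loss and BPTT gradients, the "preset batch" is interpreted as a batch size, and all names and numbers are hypothetical.

```python
from typing import List, Tuple


def split_into_batches(samples: list, batch_size: int) -> List[list]:
    """Divide the samples into consecutive batches of a preset size."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def minibatch_gradient_descent(
    samples: List[Tuple[float, float]],
    w: float,
    lr: float,
    batch_size: int,
    epochs: int,
) -> float:
    """Fit y = w * x, updating w once per batch from that batch's accumulated error."""
    for _ in range(epochs):
        for batch in split_into_batches(samples, batch_size):
            # Accumulate the squared-error gradient over the batch, then average.
            grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
            w -= lr * grad  # one parameter update per batch
    return w


# Toy data generated by y = 3x (illustrative only).
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
batches = split_into_batches(data, 2)
w_trained = minibatch_gradient_descent(data, w=0.0, lr=0.05, batch_size=2, epochs=200)
```

Updating once per batch, rather than once per sample or once per full pass, is what distinguishes mini-batch gradient descent from its stochastic and full-batch variants.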
S20: Obtain non-standard Chinese text training samples, divide the non-standard Chinese text training samples into batches according to a preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, perform training based on the connectionist temporal classification (CTC) algorithm, update the network parameters of the standard Chinese text recognition model by using the backpropagation through time (BPTT) algorithm, and obtain an adjusted Chinese handwritten text recognition model.
Here, non-standard Chinese text training samples are training samples obtained from handwritten Chinese text, which may specifically be text written by hand in the style of a mainstream typeface such as Kai, Song, or Li. It can be understood that non-standard Chinese text training samples differ from standard Chinese text training samples in that they are obtained from handwritten Chinese text; being handwritten, they naturally contain a wide variety of character forms.
In this embodiment, the server obtains non-standard Chinese text training samples, which contain the features of handwritten Chinese text, and divides them into batches according to a preset batch. The batched non-standard Chinese text training samples are input into the standard Chinese text recognition model, which is trained and adjusted based on the connectionist temporal classification (CTC) algorithm; the backpropagation through time (BPTT) algorithm, based on mini-batch gradients, updates the network parameters of the standard Chinese text recognition model to obtain the adjusted Chinese handwritten text recognition model. It can be understood that the standard Chinese text recognition model can recognize standard Chinese text but does not achieve high accuracy on handwritten Chinese text. This embodiment therefore trains with non-standard Chinese text training samples, so that the standard Chinese text recognition model, starting from its existing ability to recognize standard text, adjusts its network parameters to yield the adjusted Chinese handwritten text recognition model. The adjusted model learns the deep features of handwritten Chinese text on top of its original ability to recognize standard text. It thus combines the deep features of standard text and handwritten Chinese text, can effectively recognize both, and obtains recognition results with high accuracy.
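The key point of this step is that training continues from the parameters of the standard model rather than from a fresh initialization. The sketch below shows that idea only; a one-parameter linear model again stands in for the recurrent network, and `fine_tune`, the parameter dictionary, and the toy data are hypothetical.

```python
import copy
from typing import Dict, List, Tuple


def fine_tune(
    pretrained: Dict[str, float],
    samples: List[Tuple[float, float]],
    lr: float = 0.05,
    batch_size: int = 2,
    epochs: int = 100,
) -> Dict[str, float]:
    """Continue mini-batch training from pretrained parameters; the input is not modified."""
    params = copy.deepcopy(pretrained)  # start from the stage-one parameters
    for _ in range(epochs):
        for i in range(0, len(samples), batch_size):
            batch = samples[i:i + batch_size]
            grad = sum(2.0 * (params["w"] * x - y) * x for x, y in batch) / len(batch)
            params["w"] -= lr * grad
    return params


# Stage one (assumed already done): a model fitted to "standard" data, y = 3x.
standard_model = {"w": 3.0}
# Stage two: "handwritten" data that is close to, but not the same as, the standard data.
handwritten = [(1.0, 3.2), (2.0, 6.4), (1.5, 4.8), (0.5, 1.6)]
adjusted_model = fine_tune(standard_model, handwritten)
```

Because the handwritten data lies close to the standard data, the adjusted parameters end up near the starting point, which is the intuition behind adjusting an already capable model.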
When a recurrent neural network performs text recognition, it judges according to the pixel distribution and sequence of the text. Handwritten Chinese text in real life differs from the corresponding standard text, but this difference is much smaller than its difference from standard text that does not correspond to it. For example, handwritten "你好" ("hello") and standard "你好" differ in pixel distribution, but far less than handwritten "你好" differs from standard "再见" ("goodbye"). In other words, even though handwritten Chinese text differs somewhat from its corresponding standard text, that difference is much smaller than the difference from non-corresponding standard text, so the recognition result can be determined by the principle of greatest similarity (that is, smallest difference). The adjusted Chinese handwritten text recognition model is trained from a recurrent neural network; it combines the deep features of standard text and handwritten Chinese text, and can effectively recognize handwritten Chinese text based on those features.
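The "greatest similarity" principle can be illustrated with a toy nearest-template comparison. This is not how the trained model works internally; it only shows the idea of choosing the candidate with the smallest difference. The four-pixel "images", the template dictionary, and the function names are all hypothetical.

```python
from typing import Dict, Sequence


def pixel_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two equally sized pixel vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def most_similar(sample: Sequence[float], templates: Dict[str, Sequence[float]]) -> str:
    """Return the text whose standard template differs least from the sample."""
    return min(templates, key=lambda text: pixel_difference(sample, templates[text]))


# Hypothetical 4-pixel templates for two standard texts.
standard_templates = {
    "你好": [1.0, 1.0, 0.0, 0.0],
    "再见": [0.0, 0.0, 1.0, 1.0],
}
# A handwritten sample: close to the "你好" template, but not identical.
handwritten_sample = [1.0, 0.8, 0.1, 0.0]
best_match = most_similar(handwritten_sample, standard_templates)
```

The handwritten sample differs from the "你好" template by 0.05 and from the "再见" template by 3.45, so the smaller difference selects "你好".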
It should be noted that the order of steps S10 and S20 in this embodiment cannot be swapped: step S10 must be executed before step S20. Training the recurrent neural network first with the standard Chinese text training samples gives the resulting standard Chinese text recognition model good recognition ability and accurate results on standard text. The fine-tuning of step S20 is then performed on top of that ability, so that the adjusted Chinese handwritten text recognition model can recognize handwritten Chinese text effectively, and with fairly accurate results, based on the deep features of handwritten text it has learned. If step S20 were executed first, or executed alone, the many different forms of handwriting would mean that features learned directly from handwritten Chinese text do not reflect handwritten text well. The model would learn "badly" from the start, and later adjustment could hardly make its recognition of handwritten Chinese text accurate. Although everyone's handwriting is different, the great majority of it resembles standard text (handwriting imitates standard text). Training first on standard text therefore matches the objective situation better and works better than training directly on handwritten Chinese text: the adjustments are made on a "good" model, and the resulting adjusted Chinese handwritten text recognition model achieves a high recognition rate on handwritten Chinese text.
S30: Obtain the Chinese text samples to be tested, recognize them with the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and use all the error texts as the error text training samples.
The Chinese text samples to be tested are samples used for testing, obtained from standard text and handwritten Chinese text. The standard text used in this step is the same as the standard text used for training in step S10, because every character in a font such as KaiTi or SongTi has a uniquely determined form. The handwritten Chinese text may differ from that used for training in step S20: text handwritten by different people is not identical, and each handwritten text can take many glyph forms. To keep the test samples distinct from the non-standard Chinese text training samples of step S20 and to avoid overfitting, this step generally uses handwritten Chinese text different from that of step S20.
In this embodiment, the trained adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested. During training, standard text and handwritten Chinese text may be input to the adjusted Chinese handwritten text recognition model in mixed order. When the adjusted model recognizes the samples to be tested, it produces a recognition result for each one, and every text whose recognition result does not match its label value (the true result) is taken as an error text training sample. The error text training samples show where the adjusted Chinese handwritten text recognition model still lacks accuracy, so that the model can subsequently be updated and optimized further on these samples.
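The selection of error texts described in step S30 can be sketched as a simple filter. This is a minimal illustration, not the patent's implementation: `recognize` stands for any callable that wraps the adjusted model and returns the predicted text, and both it and the `(image, label)` sample layout are assumptions made for the example.

```python
def collect_error_samples(recognize, test_samples):
    """Return the test samples whose prediction differs from the label.

    recognize: hypothetical callable, image -> predicted text.
    test_samples: iterable of (image, label) pairs.
    """
    return [(image, label)
            for image, label in test_samples
            if recognize(image) != label]


# Usage with a stand-in recognizer backed by a lookup table.
fake_predictions = {"img1": "你好", "img2": "再贝", "img3": "谢谢"}
samples = [("img1", "你好"), ("img2", "再见"), ("img3", "谢谢")]
errors = collect_error_samples(fake_predictions.get, samples)
```

Only the mismatched pair `("img2", "再见")` is kept, and it would then serve as an error text training sample in step S40.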
The recognition accuracy of the adjusted Chinese handwritten text recognition model is affected jointly by the standard and the non-standard Chinese text training samples. Because the network parameters are updated first with the standard samples and then with the non-standard samples, the adjusted model over-learns the features of the non-standard Chinese text training samples. It reaches very high accuracy on those training samples (including their handwritten Chinese text), but the over-learning reduces its accuracy on handwritten Chinese text outside the non-standard training samples. Step S30 therefore has the adjusted model recognize the Chinese text samples to be tested, which makes it possible to largely eliminate the over-learning of the non-standard training samples. In other words, recognizing the samples to be tested exposes the errors caused by over-learning. These errors show up as error texts, which can then be used to further update and optimize the network parameters of the adjusted Chinese handwritten text recognition model.
S40: Input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model by backpropagation through time with batch gradient descent, obtaining the target Chinese handwritten text recognition model.
In this embodiment, the error text training samples are input into the adjusted Chinese handwritten text recognition model and trained based on the CTC algorithm. The error text training samples reflect a problem that arose when the adjusted model was trained: it over-learned the features of the non-standard Chinese text training samples, so it recognizes handwritten Chinese text outside those samples inaccurately. In addition, because the model was trained first on standard samples and then on non-standard samples, the features of standard text learned earlier are weakened too much, which damages the "framework" for recognizing standard text that the model built initially. The error text training samples address both over-learning and over-weakening: by exposing the resulting accuracy problems, they make it possible to largely eliminate the adverse effects that over-learning and over-weakening introduced during the original training.
Specifically, training on the error text training samples uses backpropagation through time with batch gradient descent. The network parameters of the adjusted Chinese handwritten text recognition model are updated with this algorithm to obtain the target Chinese handwritten text recognition model, which is the final trained model used to recognize Chinese handwritten text. Training uses a recurrent neural network, which can exploit the sequential nature of Chinese text to learn its deep features and raise the recognition rate of the target model. The training algorithm is CTC. With CTC, the training samples need no manual labeling or data alignment, which reduces model complexity and allows direct training on unaligned, variable-length sequences. The error text training samples are few (there are few error texts), so when the network parameters are updated, batch gradient descent can back-propagate the errors produced by all error text training samples. Every error contributes to adjusting the network, the recurrent neural network is trained comprehensively, and the recognition accuracy of the target Chinese handwritten text recognition model improves.
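The difference between the mini-batch updates used in steps S10 and S20 and the full-batch update used in step S40 can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in: a one-parameter model y = w·x with squared error replaces the recurrent network, and `grad_fn` replaces the gradients that backpropagation through time would supply. Only the update schedule is the point.

```python
def gradient_descent_epoch(params, samples, grad_fn, learning_rate, batch_size=None):
    """Run one pass over the samples and return (new_params, number_of_updates).

    batch_size=None means batch gradient descent: the gradients of all
    samples are averaged and a single update is applied. Otherwise one
    update is applied per mini-batch of `batch_size` samples.
    """
    if not samples:
        return list(params), 0
    size = len(samples) if batch_size is None else batch_size
    params = list(params)
    updates = 0
    for start in range(0, len(samples), size):
        batch = samples[start:start + size]
        grads = [0.0] * len(params)
        for sample in batch:
            for i, g in enumerate(grad_fn(params, sample)):
                grads[i] += g / len(batch)
        # Update rule: subtract learning rate times gradient.
        params = [p - learning_rate * g for p, g in zip(params, grads)]
        updates += 1
    return params, updates


def squared_error_grad(params, sample):
    """Gradient of (w*x - y)^2 with respect to w (toy model, not the RNN)."""
    (w,), (x, y) = params, sample
    return [2.0 * (w * x - y) * x]


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
full_params, full_updates = gradient_descent_epoch([0.0], data, squared_error_grad, 0.01)
mini_params, mini_updates = gradient_descent_epoch([0.0], data, squared_error_grad, 0.01, batch_size=2)
```

With the full batch, every sample's error enters the single update, which suits a small error-text set. With mini-batches, the parameters are updated more often per pass, which suits the larger training sets of steps S10 and S20.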
In steps S10 to S40, the batched standard Chinese text training samples are used to train and obtain the standard Chinese text recognition model, and the batched non-standard Chinese text samples are then used to update that model. The resulting adjusted Chinese handwritten text recognition model keeps its ability to recognize standard text while learning the deep features of handwritten Chinese text through the update, so that it recognizes handwritten Chinese text well. The adjusted model then recognizes the Chinese text samples to be tested, the error texts whose recognition results do not match the true results are collected, and all of them are input into the adjusted model as error text training samples. Training based on the CTC algorithm updates the model and yields the target Chinese handwritten text recognition model. The error text training samples largely eliminate the adverse effects of the over-learning and over-weakening produced in the original training and further improve recognition accuracy. The standard Chinese text recognition model and the adjusted Chinese handwritten text recognition model are trained by backpropagation through time with mini-batch gradient descent, which keeps training efficient and effective even with many training samples. It also ensures that the error has a global character within a certain range, compared with a single training sample, which makes a minimum of the error function easier to find. The target Chinese handwritten text recognition model is trained by backpropagation through time with batch gradient descent. Batch gradient descent ensures that the model parameters are updated sufficiently: the errors produced by all training samples are back-propagated, the parameters are updated comprehensively from those errors, and the recognition accuracy of the resulting model improves. Each model is trained with a recurrent neural network, which can exploit the sequential nature of Chinese text to learn its deep features and recognize different handwritten Chinese texts. Each model is trained with the CTC algorithm, which requires no manual labeling or data alignment of the training samples, reduces model complexity, and allows direct training on unaligned, variable-length sequences.
In an embodiment, as shown in FIG. 3, obtaining the standard Chinese text training samples in step S10 and dividing them into batches of a preset size specifically includes the following steps:
S101: Obtain the pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, normalize each pixel value in that matrix, and obtain the normalized pixel value feature matrix of each Chinese text. The normalization formula is

$$y = \frac{x - MinValue}{MaxValue - MinValue}$$

where MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization.
The Chinese text training samples to be processed are the training samples as initially obtained, before any processing.
In this embodiment, a mature, open-source convolutional neural network may be used to extract the features of the Chinese text training samples to be processed and obtain the pixel value feature matrix of each Chinese text. The pixel value feature matrix of a Chinese text represents the features of that text; here the features are the pixel values. Because text is represented as a two-dimensional image, the pixel values can be arranged as a matrix, the pixel value feature matrix. A computer device can recognize this matrix form and read the values in it. After the server obtains the pixel value feature matrix of each Chinese text, it applies the normalization formula to every pixel value in the matrix and obtains the normalized pixel value features of each Chinese text. Normalization compresses the pixel value feature matrix of every Chinese text into the same range, which speeds up computations on the matrix and helps improve the efficiency of training the standard Chinese text recognition model.
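The normalization of step S101 can be sketched as follows. The function name `normalize` and the nested-list matrix layout are choices made for this example. Mapping a constant matrix (MaxValue equal to MinValue) to zeros is an assumption, since the text does not cover that case.

```python
def normalize(matrix):
    """Min-max normalize a pixel value matrix into the range [0, 1].

    Implements y = (x - MinValue) / (MaxValue - MinValue) per matrix.
    """
    values = [v for row in matrix for v in row]
    min_value, max_value = min(values), max(values)
    if max_value == min_value:
        # Assumption: a constant image has no contrast, so map it to zeros.
        return [[0.0 for _ in row] for row in matrix]
    span = max_value - min_value
    return [[(v - min_value) / span for v in row] for row in matrix]


normalized = normalize([[0, 128], [255, 64]])
```

After this step, every matrix lies in the same range regardless of the original brightness scale of its image.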
S102: Divide the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes, build the binarized pixel value feature matrix of each Chinese text from the two classes, combine the Chinese texts corresponding to the binarized pixel value feature matrices into the standard Chinese text training samples, and divide the standard Chinese text training samples into batches of a preset size.
In this embodiment, the pixel values in the normalized pixel value feature matrix of each Chinese text are divided into two classes, meaning that every pixel value is either value A or value B. Specifically, pixel values greater than or equal to 0.5 in the normalized matrix can be set to 1 and pixel values less than 0.5 set to 0, giving the binarized pixel value feature matrix of each Chinese text, whose elements are only 0 or 1. After the binarized matrices are built, the Chinese texts corresponding to them are combined into the standard Chinese text training samples, which are divided into batches of a preset size. For example, an image containing text has text pixels and blank pixels. Text pixels are generally darker; a "1" in the binarized pixel value feature matrix represents a text pixel, and a "0" represents a blank pixel. Building the binarized matrix further simplifies the feature representation of text: a matrix of 0s and 1s alone is enough to represent and distinguish each text. This speeds up computer processing of the text feature matrices and further improves the efficiency of training the standard Chinese text recognition model.
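The thresholding in step S102 can be sketched in a few lines. The function name `binarize` is illustrative. Note one assumption: for a "1" to mark a text pixel, as the description states, the normalized values must be larger for darker (ink) pixels. If an image stores dark ink as low values, it would have to be inverted first.

```python
def binarize(normalized_matrix, threshold=0.5):
    """Map values >= threshold to 1 and values < threshold to 0."""
    return [[1 if v >= threshold else 0 for v in row]
            for row in normalized_matrix]


binary = binarize([[0.0, 0.5], [0.49, 1.0]])
```

A value exactly at the threshold is mapped to 1, matching the "greater than or equal to 0.5" rule in the text.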
Steps S101 and S102 normalize the Chinese text training samples to be processed and divide the values into two classes, obtain the binarized pixel value feature matrix of each Chinese text, and use the texts corresponding to those matrices as the standard Chinese text training samples, which significantly shortens the time needed to train the standard Chinese text recognition model.
In an embodiment, as shown in FIG. 4, inputting the batched standard Chinese text training samples into the recurrent neural network in step S10, training based on the CTC algorithm, and updating the network parameters of the recurrent neural network by backpropagation through time to obtain the standard Chinese text recognition model specifically includes the following steps:
S111: Input the batched standard Chinese text training samples into the recurrent neural network and train based on the CTC algorithm to obtain the forward propagation output and the backward propagation output of the samples in the recurrent neural network. The forward propagation output is expressed as

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where $t$ is the sequence step, $u$ indexes the output label corresponding to $t$, $y^{t}_{l'_u}$ is the probability that the output at step $t$ is the label value $l'_u$, and

$$f(u) = \begin{cases} u-1, & \text{if } l'_u = \text{blank or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise.} \end{cases}$$

The backward propagation output is expressed as

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where $t$ is the sequence step, $u$ indexes the output label corresponding to $t$, $y^{t+1}_{l'_i}$ is the probability that the output at step $t+1$ is the label value $l'_i$, and

$$g(u) = \begin{cases} u+1, & \text{if } l'_u = \text{blank or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise.} \end{cases}$$
In this embodiment, the batched standard Chinese text training samples are input into the recurrent neural network and trained based on the connectionist temporal classification (CTC) algorithm. CTC is essentially a way of computing an error function: it measures the error between the network's output for an input sequence and the true result (the ground truth, also called the label value). The forward propagation output and backward propagation output of the batched standard Chinese text training samples in the recurrent neural network can therefore be obtained and used to describe and build the corresponding error function. A few basic CTC definitions are introduced first to make the procedure easier to follow.

$y^{t}_{k}$: the probability that the output at step $t$ is the label value $k$. For example, when the output sequence is (a-ab-), $y^{3}_{a}$ is the probability that the letter output at step 3 is a, where a is the label value for the third step.

$p(\pi|x)$: the probability that the output path is $\pi$ given the input sequence $x$. Because the label probabilities output at different sequence steps are assumed to be mutually independent, $p(\pi|x)$ is

$$p(\pi|x) = \prod_{t=1}^{T} y^{t}_{\pi_t}$$

that is, the product over all sequence steps of the probability of the label that path $\pi$ outputs at that step.

$F$: a many-to-one mapping that transforms an output path $\pi$ into a label sequence $l$, for example F(a-ab-) = F(-aa-abb) = aab (where - denotes the blank). In this embodiment the mapping removes repeated characters and then removes blanks, as in the example.

$p(l|x)$: the probability that the output is the sequence $l$ given the input sequence $x$ (for example, one sample of the standard Chinese text training samples). It is the sum of the probabilities of all output paths $\pi$ that map to $l$:

$$p(l|x) = \sum_{\pi \in F^{-1}(l)} p(\pi|x)$$

As the length of the sequence $l$ grows, the number of corresponding paths grows exponentially. An iterative approach is therefore used: the path probability of $l$ is computed from the forward and backward relations between step $t$ and steps $t-1$ and $t+1$, which makes the computation efficient.
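The mapping F described above (merge repeated characters, then drop blanks) can be sketched directly, together with a brute-force enumeration that shows why the number of paths grows quickly. The names `collapse` and `paths_for` and the use of "-" as the blank are choices made for this illustration.

```python
from itertools import groupby, product

BLANK = "-"  # blank symbol, written as in the example F(a-ab-) = aab


def collapse(path, blank=BLANK):
    """Apply F: merge consecutive repeats, then remove blanks."""
    merged = (symbol for symbol, _ in groupby(path))
    return "".join(s for s in merged if s != blank)


def paths_for(label, length, alphabet, blank=BLANK):
    """Enumerate every path of the given length that F maps to `label`."""
    symbols = alphabet + blank
    candidates = ("".join(p) for p in product(symbols, repeat=length))
    return [path for path in candidates if collapse(path, blank) == label]


example_a = collapse("a-ab-")
example_b = collapse("-aa-abb")
three_step_paths = paths_for("ab", 3, "ab")
```

Both example paths from the text collapse to "aab". Already at length 3, five different paths map to the two-letter label "ab", and the count rises rapidly with the path length, which is why the forward and backward recursions are used instead of enumeration.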
Specifically, before the computation the sequence $l$ is preprocessed: a blank is added at its beginning and at its end, and a blank is inserted between every two adjacent letters. If the original sequence $l$ has length $U$, the preprocessed sequence $l'$ has length $U' = 2U+1$. For a sequence $l$, the forward variable $\alpha(t,u)$ is defined as the sum of the probabilities of the paths of length $t$ that map to the sequence $l$ under $F$:

$$\alpha(t,u) = \sum_{\pi \in V(t,u)} \prod_{i=1}^{t} y^{i}_{\pi_i}$$

where $V(t,u) = \{\pi \in A'^{t} : F(\pi) = l_{1:u/2},\ \pi_t = l'_u\}$ is the set of all paths of length $t$ that map to the sequence $l$ under $F$ and whose output at step $t$ is $l'_u$. Here $u/2$ is an index and is therefore rounded down. Every correct path must start with a blank or with $l_1$ (the first letter of the sequence $l$), which gives the initialization constraints

$$\alpha(1,1) = y^{1}_{b}, \qquad \alpha(1,2) = y^{1}_{l_1}, \qquad \alpha(1,u) = 0 \quad \forall u > 2$$

(where $b$ denotes the blank). Then $p(l|x)$ can be expressed with the forward variable as $p(l|x) = \alpha(T,U') + \alpha(T,U'-1)$. The two terms cover all paths of length $T$ that map to the sequence $l$ under $F$ and whose output label at time $T$ is $l'_{U'}$ or $l'_{U'-1}$, that is, paths that do or do not end with a blank. The forward variable can then be computed recursively over time:

$$\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$$

where $f(u)$ enumerates the possible positions at the previous time step and is given by

$$f(u) = \begin{cases} u-1, & \text{if } l'_u = \text{blank or } l'_{u-2} = l'_u \\ u-2, & \text{otherwise.} \end{cases}$$
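The forward recursion can be sketched in plain Python and checked against brute-force enumeration of all paths. This is an illustrative sketch under stated assumptions: labels are integers, index 0 is reserved for the blank, `probs[t][k]` holds the softmax output for label k at step t, and indices are 0-based (the text uses 1-based t and u). The function names are hypothetical.

```python
from itertools import groupby, product

BLANK = 0  # assumption: label index 0 is the blank


def extend_with_blanks(label):
    """Build l' of length 2U+1: blank, l_1, blank, l_2, ..., blank."""
    extended = [BLANK]
    for k in label:
        extended += [k, BLANK]
    return extended


def ctc_forward(probs, label):
    """Return (alpha, p) with alpha[t][u] the forward variable and p = p(l|x)."""
    ext = extend_with_blanks(label)
    T, U = len(probs), len(ext)
    alpha = [[0.0] * U for _ in range(T)]
    # Initialization: a path starts with a blank or with the first label.
    alpha[0][0] = probs[0][BLANK]
    if U > 1:
        alpha[0][1] = probs[0][ext[1]]
    for t in range(1, T):
        for u in range(U):
            # f(u): position u-2 is reachable only when l'_u is a non-blank
            # label that differs from l'_{u-2}.
            repeated = u >= 2 and ext[u - 2] == ext[u]
            start = u - 1 if (ext[u] == BLANK or repeated) else u - 2
            total = sum(alpha[t - 1][i] for i in range(max(start, 0), u + 1))
            alpha[t][u] = probs[t][ext[u]] * total
    # p(l|x) = alpha(T, U') + alpha(T, U'-1)
    p = alpha[T - 1][U - 1] + (alpha[T - 1][U - 2] if U > 1 else 0.0)
    return alpha, p


def ctc_brute_force(probs, label):
    """Sum p(path|x) over every path that collapses to `label` (exponential)."""
    T, K = len(probs), len(probs[0])
    total = 0.0
    for path in product(range(K), repeat=T):
        collapsed = [k for k, _ in groupby(path) if k != BLANK]
        if collapsed == list(label):
            prob = 1.0
            for t, k in enumerate(path):
                prob *= probs[t][k]
            total += prob
    return total


probs = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4]]
_, p_forward = ctc_forward(probs, [1, 2])
p_enumerated = ctc_brute_force(probs, [1, 2])
```

The recursion visits T times (2U+1) cells, whereas the enumeration visits every one of the K to the power T paths. The two should agree up to floating-point rounding.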
Similarly to forward propagation, a backward variable $\beta(t,u)$ can be defined. It is the sum of the probabilities of the paths $\pi'$ that, appended from time $t+1$ onward to the paths counted in the forward variable $\alpha(t,u)$, give the sequence $l$ under the mapping $F$:

$$\beta(t,u) = \sum_{\pi' \in W(t,u)} \prod_{i=1}^{T-t} y^{t+i}_{\pi'_i}$$

where $W(t,u) = \{\pi' \in A'^{T-t} : F(\pi + \pi') = l \ \ \forall \pi \in V(t,u)\}$. Backward propagation has corresponding initialization conditions:

$$\beta(T,U') = \beta(T,U'-1) = 1, \qquad \beta(T,u) = 0 \quad \forall u < U'-1$$

The backward variable can likewise be computed recursively:

$$\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$$

where $g(u)$ selects the possible positions at time $t+1$ and is given by

$$g(u) = \begin{cases} u+1, & \text{if } l'_u = \text{blank or } l'_{u+2} = l'_u \\ u+2, & \text{otherwise.} \end{cases}$$

The recursive expressions of the forward and backward variables thus describe the forward and backward propagation processes and give the corresponding outputs: the recursive expression of the forward variable is the forward propagation output, and the recursive expression of the backward variable is the backward propagation output.
S112: Build the error function from the forward propagation output and the backward propagation output.
In an embodiment, the error function can be built from the forward propagation output and the backward propagation output. Specifically, the negative logarithm of the probability can be used as the error function. Let $l = z$; the error function can then be expressed as

$$E_{loss} = -\ln \prod_{(x,z) \in S} p(z|x) = -\sum_{(x,z) \in S} \ln p(z|x)$$

where $S$ denotes the standard Chinese text training samples. The term $p(z|x)$ can be computed from the forward and backward propagation outputs. First define a set $X$ containing all correct paths that are at position $u$ at time $t$: $X(t,u) = \{\pi \in A'^{T} : F(\pi) = z,\ \pi_t = z'_u\}$. The product of the forward and backward variables at any time is then the sum of the probabilities of all such paths:

$$\alpha(t,u)\,\beta(t,u) = \sum_{\pi \in X(t,u)} p(\pi|x)$$

This is the probability sum of all correct paths that are exactly at position $u$ at time $t$. In general, for any time $t$, summing over all positions gives the total probability:

$$p(z|x) = \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

By the definition of the error function, it follows that

$$E_{loss} = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

Once the error function is obtained, the network parameters can be updated according to it to obtain the standard Chinese text recognition model.
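The relation between the forward variable, the backward variable and the error function can be checked numerically. The sketch below is self-contained and rests on the same assumptions as any small CTC illustration: integer labels with index 0 as the blank, 0-based indices, and `probs[t][k]` as the softmax output at step t. The function names are hypothetical, and the backward variable follows the convention in which its initial value is 1, so that the sum over u of alpha times beta equals p(z|x) at every step t.

```python
import math

BLANK = 0  # assumption: label index 0 is the blank


def forward_backward(probs, label):
    """Return (alpha, beta) over the blank-extended label sequence z'."""
    ext = [BLANK]
    for k in label:
        ext += [k, BLANK]
    T, U = len(probs), len(ext)
    alpha = [[0.0] * U for _ in range(T)]
    beta = [[0.0] * U for _ in range(T)]
    alpha[0][0] = probs[0][BLANK]
    beta[T - 1][U - 1] = 1.0
    if U > 1:
        alpha[0][1] = probs[0][ext[1]]
        beta[T - 1][U - 2] = 1.0
    for t in range(1, T):  # forward recursion
        for u in range(U):
            repeated = u >= 2 and ext[u - 2] == ext[u]
            lo = u - 1 if (ext[u] == BLANK or repeated) else u - 2
            alpha[t][u] = probs[t][ext[u]] * sum(alpha[t - 1][max(lo, 0):u + 1])
    for t in range(T - 2, -1, -1):  # backward recursion
        for u in range(U):
            repeated = u + 2 < U and ext[u + 2] == ext[u]
            hi = u + 1 if (ext[u] == BLANK or repeated) else u + 2
            beta[t][u] = sum(beta[t + 1][i] * probs[t + 1][ext[i]]
                             for i in range(u, min(hi, U - 1) + 1))
    return alpha, beta


def sequence_probability(alpha, beta, t):
    """p(z|x) as the sum over u of alpha(t,u) * beta(t,u)."""
    return sum(a * b for a, b in zip(alpha[t], beta[t]))


def ctc_loss(samples):
    """Negative log-likelihood over (probs, label) pairs; p(z|x) must be > 0."""
    loss = 0.0
    for probs, label in samples:
        alpha, beta = forward_backward(probs, label)
        loss -= math.log(sequence_probability(alpha, beta, 0))
    return loss


probs = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4]]
alpha, beta = forward_backward(probs, [1, 2])
per_step = [sequence_probability(alpha, beta, t) for t in range(len(probs))]
p_from_alpha = alpha[-1][-1] + alpha[-1][-2]
loss = ctc_loss([(probs, [1, 2])])
```

The value in `per_step` should be the same for every t, and equal to the value obtained from the last row of the forward variable alone. This is the property that lets the error function be evaluated at any time step.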
S113: According to the error function, update the network parameters of the recurrent neural network by backpropagation through time to obtain the standard Chinese text recognition model.
In an embodiment, with the obtained error function

$$E_{loss} = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

the network parameters can be updated by backpropagation through time (with mini-batch gradient descent). Specifically, the partial derivative (the gradient) of the error function with respect to the network outputs before the softmax layer is computed and multiplied by the learning rate, and the product is subtracted from the original network parameters to update them.
Steps S111 to S113 build the error function

$$E_{loss} = -\sum_{(x,z) \in S} \ln \sum_{u=1}^{|z'|} \alpha(t,u)\,\beta(t,u)$$

from the forward propagation output and backward propagation output that the standard Chinese text training samples produce in the recurrent neural network, back-propagate the error according to this function, and update the network parameters to obtain the standard Chinese text recognition model. The model learns the deep features of the standard Chinese text training samples and can recognize standard text accurately.
In an embodiment, as shown in FIG. 5, step S30, in which the adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested, the error texts whose recognition results do not match the true results are obtained, and all error texts are taken as the error text training samples, specifically includes the following steps:
S31: Input the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtain the output values of each text in the samples from the model.
In this embodiment, the adjusted Chinese handwritten text recognition model is used to recognize the Chinese text samples to be tested, which contain a number of Chinese texts. A text consists of characters, and the output values of each text referred to in this embodiment are the output values corresponding to each glyph of each character. A Chinese character library contains roughly three thousand commonly used characters (including the blank and various Chinese punctuation marks). The output layer of the adjusted Chinese handwritten text recognition model should produce, for each character in the library, a probability value expressing how similar that character is to the character in the input sample; this can be implemented with a softmax function. For example, suppose one text sample is an image with a resolution of 8*8 showing the three characters "你们好". During recognition the image is cut vertically into 8 columns, giving eight 3-dimensional vectors that serve as the 8 inputs of the model. The model produces as many outputs as it has inputs, whereas the text sample actually contains only 3 characters rather than 8, so the raw output contains repeated characters, for example "你你们们好___", "你_们_们_好_" or "_你你_们_好_". For each of these 8 outputs, a probability value is computed against every character in the Chinese character library. These probability values are the output values of each text in the sample: there are many of them, and each expresses how similar the character at that output position is to one character in the library. The recognition result of each text can be determined from these probability values.
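As a minimal sketch of the softmax output layer mentioned above, the following turns the raw scores of each input column into a probability distribution over the character library. The function names and the plain-list representation are assumptions for illustration; a real library would have roughly three thousand entries rather than three.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def frame_probabilities(scores_per_frame):
    """Apply softmax independently to the scores of every input column (frame)."""
    return [softmax(scores) for scores in scores_per_frame]
```

Each of the 8 frames then carries one probability per library character, which is what the text calls the output values.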
S32: Select the maximum output value among the output values corresponding to each text, and obtain the recognition result of each text according to the maximum output value.
In this embodiment, the maximum among all output values corresponding to each text is selected, and the recognition result of the text is obtained from it. The output values directly reflect how similar the character in the input sample is to each character in the Chinese character library, and the maximum output value indicates which library character it is closest to, so the raw output can be determined from the characters corresponding to the maximum output values. For example, the raw output would be "你你们们好___", "你_们_们_好_" or "_你你_们_好_", rather than outputs such as "妳妳扪扪好___", "你_们_们_好_" or "_你妳_扪_好_". According to the definition of the connectionist temporal classification (CTC) algorithm, the raw output must then be processed further: repeated characters are merged so that only one is kept, and the blanks are removed, which yields the recognition result, "你们好" in this embodiment. Determining the characters of the raw output from the maximum output values and then removing repeats and blanks effectively yields the recognition result of each text.
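The selection of the maximum output value followed by the removal of repeats and blanks can be sketched as a greedy decoder. The names `best_path`, `collapse` and `greedy_decode`, the use of "_" as the blank symbol, and the tiny character set in the usage note are assumptions made for illustration.

```python
BLANK = "_"  # assumed symbol for the blank output


def best_path(frame_probs, charset):
    """Pick the most probable character for every frame (the raw output)."""
    return "".join(
        charset[max(range(len(probs)), key=probs.__getitem__)]
        for probs in frame_probs
    )


def collapse(raw):
    """Merge consecutive repeats into one character, then drop the blanks."""
    out = []
    prev = None
    for ch in raw:
        if ch != prev and ch != BLANK:
            out.append(ch)
        prev = ch
    return "".join(out)


def greedy_decode(frame_probs, charset):
    """Recognition result for one text: best path, then collapse."""
    return collapse(best_path(frame_probs, charset))
```

With `charset = ["_", "你", "们", "好"]`, a raw output of "你你们们好___" or "_你你_们_好_" collapses to "你们好". Under this rule only consecutive repeats are merged, so a character repeated across a blank, as in "好_好", is kept twice.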
S33: According to the recognition results, obtain the error texts whose recognition results do not match the true results, and take all error texts as the error text training samples.
In this embodiment, each recognition result is compared with the true result (the ground truth), and the texts whose recognition results do not match the true results are taken as the error text training samples. The recognition result is only what the adjusted Chinese handwritten text recognition model produces for the samples to be tested, and it may differ from the true result. Such differences show that the recognition accuracy of the model still falls short, and these shortcomings can be reduced by training on the error text training samples to achieve more accurate recognition.
Steps S31-S33 take the output values of each text in the Chinese text samples to be tested from the adjusted Chinese handwritten text recognition model, select the maximum output value, which reflects the degree of similarity between texts (in fact, between characters), obtain the recognition result from the maximum output value, and derive the error text training samples from the recognition results. This provides an important technical basis for subsequently using the error text training samples to further improve recognition accuracy.
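Step S33 amounts to filtering the test set by comparing each recognition result with its ground truth. A minimal sketch follows, in which `collect_error_samples`, the `(image, truth)` pair layout, and the `recognize` callable standing in for the adjusted model are all hypothetical names introduced for the example.

```python
def collect_error_samples(test_samples, recognize):
    """Keep every (image, truth) pair whose recognition result differs from the truth."""
    errors = []
    for image, truth in test_samples:
        if recognize(image) != truth:
            errors.append((image, truth))
    return errors
```

The returned pairs keep their ground-truth labels, so they can be fed back into training as the error text training samples.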
In an embodiment, before step S10, that is, before the step of obtaining the standard Chinese text training samples, the handwriting model training method further includes the following step: initializing the recurrent neural network.
In an embodiment, initializing the recurrent neural network means initializing its network parameters by assigning them initial values. If the initial weights lie in a relatively flat region of the error surface, training of the recurrent neural network model may converge abnormally slowly. The network parameters may be initialized from a uniform distribution over a relatively small zero-mean interval, such as [-0.30, +0.30]. Reasonable initialization gives the network flexible adjustment capability in the early stage, allows it to be adjusted effectively during training, and helps the minimum of the error function to be found quickly. This benefits the updating and adjustment of the recurrent neural network, so that the model obtained by training has accurate recognition performance on Chinese handwriting.
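A minimal sketch of this initialization, drawing every weight uniformly from [-0.30, +0.30]. The function name `init_uniform`, the row-by-column matrix layout, and the optional `seed` parameter are assumptions added for illustration.

```python
import random


def init_uniform(rows, cols, limit=0.30, seed=None):
    """Return a rows x cols weight matrix with entries drawn uniformly from [-limit, +limit]."""
    rng = random.Random(seed)  # a local generator keeps the example reproducible
    return [[rng.uniform(-limit, limit) for _ in range(cols)] for _ in range(rows)]
```

Because the interval is symmetric around zero, the expected value of every weight is zero, matching the zero-mean requirement stated above.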
In the handwriting model training method provided in this embodiment, the network parameters of the recurrent neural network are initialized from a uniform distribution over a relatively small zero-mean interval, such as [-0.30, +0.30]. This initialization helps the minimum of the error function to be found quickly and benefits the updating and adjustment of the recurrent neural network. The Chinese text training samples to be processed are normalized and their pixel values divided into two classes to obtain a binarized pixel value feature matrix for each Chinese text, and the texts corresponding to these matrices are used as the standard Chinese text training samples, which can significantly shorten the time needed to train the standard Chinese text recognition model. The error function $E = -\sum_{(x,z) \in S} \ln p(z \mid x)$ is constructed from the forward propagation output and the backward propagation output obtained by passing the batched standard Chinese text training samples through the recurrent neural network, and the network parameters are updated by back-propagating this error. This yields the standard Chinese text recognition model, which learns the deep features of the standard Chinese text training samples and can accurately recognize standard text. The standard Chinese text recognition model is then updated by adjustment using the batched non-standard Chinese text, so that the resulting adjusted Chinese handwritten text recognition model, while retaining the ability to recognize standard Chinese handwritten text, learns the deep features of non-standard Chinese text through training and can recognize non-standard Chinese handwritten text well. Next, from the output values of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model, the maximum output value, which reflects the degree of similarity between texts, is selected and used to obtain the recognition result. The error texts are derived from the recognition results, and all of them are input to the adjusted Chinese handwritten text recognition model as error text training samples for training and updating based on the connectionist temporal classification (CTC) algorithm, yielding the target Chinese handwritten text recognition model. Using the error text training samples can largely eliminate the adverse effects of the over-learning and over-weakening that arise in the earlier training, and further improves recognition accuracy. In addition, the standard Chinese text recognition model and the adjusted Chinese handwritten text recognition model are trained with a back-propagation algorithm based on mini-batch gradients (the standard Chinese text training samples are divided into preset batches), which retains good training efficiency and training effect even with a large number of training samples. The target Chinese handwritten text recognition model is trained with the backpropagation through time (BPTT) algorithm using batch gradient descent, which ensures that the model parameters are fully updated: the errors produced by all training samples during training are back-propagated, so the parameters are updated comprehensively and the recognition accuracy of the resulting model is improved.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.
FIG. 6 shows a schematic block diagram of a handwriting model training apparatus corresponding one-to-one to the handwriting model training method in the embodiment. As shown in FIG. 6, the handwriting model training apparatus includes a standard Chinese text recognition model acquisition module 10, an adjusted Chinese handwritten text recognition model acquisition module 20, an error text training sample acquisition module 30, and a target Chinese handwritten text recognition model acquisition module 40. The functions implemented by these modules correspond one-to-one to the steps of the handwriting model training method in the embodiment and, to avoid repetition, are not described in detail here.
The standard Chinese text recognition model acquisition module 10 is configured to obtain standard Chinese text training samples, divide them into preset batches, input the batched standard Chinese text training samples into a recurrent neural network, perform training based on the connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network using the backpropagation through time (BPTT) algorithm to obtain the standard Chinese text recognition model.
The adjusted Chinese handwritten text recognition model acquisition module 20 is configured to obtain non-standard Chinese text training samples, divide them into preset batches, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, perform training based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model using the BPTT algorithm to obtain the adjusted Chinese handwritten text recognition model.
The error text training sample acquisition module 30 is configured to obtain Chinese text samples to be tested, recognize them using the adjusted Chinese handwritten text recognition model, obtain the error texts whose recognition results do not match the true results, and take all error texts as the error text training samples.
The target Chinese handwritten text recognition model acquisition module 40 is configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, perform training based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using the BPTT algorithm with batch gradient descent to obtain the target Chinese handwritten text recognition model.
Preferably, the standard Chinese text recognition model acquisition module 10 includes a normalized pixel value feature matrix acquisition unit 101, a standard Chinese text training sample acquisition unit 102, a propagation output acquisition unit 111, an error function construction unit 112, and a standard Chinese text recognition model acquisition unit 113.
The normalized pixel value feature matrix acquisition unit 101 is configured to obtain the pixel value feature matrix of each Chinese text in the Chinese text training samples to be processed, and to normalize each pixel value in the pixel value feature matrix of each Chinese text to obtain the normalized pixel value feature matrix of each Chinese text. The normalization formula is $y = \frac{x - MinValue}{MaxValue - MinValue}$, where MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization.
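The normalization performed by unit 101 is ordinary min-max scaling, sketched below on a nested-list matrix. The function name `normalize` and the handling of a constant matrix (all zeros returned when MaxValue equals MinValue) are assumptions for illustration; the application does not say how that case is treated.

```python
def normalize(matrix):
    """Scale every pixel to [0, 1] using y = (x - MinValue) / (MaxValue - MinValue)."""
    flat = [v for row in matrix for v in row]
    min_value, max_value = min(flat), max(flat)
    if max_value == min_value:
        # Assumed fallback: a constant image carries no contrast, so map it to zeros.
        return [[0.0 for _ in row] for row in matrix]
    span = max_value - min_value
    return [[(v - min_value) / span for v in row] for row in matrix]
```

For an 8-bit grayscale image whose values span 0 to 255, this maps 0 to 0.0, 255 to 1.0, and 51 to 0.2.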
The standard Chinese text training sample acquisition unit 102 is configured to divide the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes, build a binarized pixel value feature matrix for each Chinese text from the two classes of pixel values, combine the Chinese texts corresponding to the binarized pixel value feature matrices into the standard Chinese text training samples, and divide the standard Chinese text training samples into preset batches.
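The two operations of unit 102 can be sketched as follows. The function names, the fixed threshold of 0.5 used to split the normalized pixel values into two classes, and the fixed `batch_size` are assumptions for illustration only; the passage above does not state how the two classes or the preset batches are chosen.

```python
def binarize(matrix, threshold=0.5):
    """Map each normalized pixel to 1 if it exceeds the (assumed) threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in matrix]


def make_batches(samples, batch_size):
    """Split the training samples into consecutive batches of at most batch_size items."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
```

The last batch may be smaller than the others when the number of samples is not a multiple of `batch_size`.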
The propagation output acquisition unit 111 is configured to input the batched standard Chinese text training samples into the recurrent neural network, perform training based on the connectionist temporal classification (CTC) algorithm, and obtain the forward propagation output and the backward propagation output of the batched standard Chinese text training samples in the recurrent neural network. The forward propagation output is expressed as $\alpha(t,u) = y^{t}_{l'_u} \sum_{i=f(u)}^{u} \alpha(t-1,i)$, where t denotes the sequence step, u denotes the label value of the output corresponding to t, and $y^{t}_{l'_u}$ denotes the probability that the output at step t is the label value $l'_u$. The backward propagation output is expressed as $\beta(t,u) = \sum_{i=u}^{g(u)} \beta(t+1,i)\, y^{t+1}_{l'_i}$, where t denotes the sequence step, u denotes the label value of the output corresponding to t, and $y^{t+1}_{l'_i}$ denotes the probability that the output at step t+1 is the label value $l'_i$. Here f(u) and g(u) denote the lower and upper bounds of the respective summations.
The error function construction unit 112 is configured to construct the error function from the forward propagation output and the backward propagation output.
The standard Chinese text recognition model acquisition unit 113 is configured to update the network parameters of the recurrent neural network according to the error function using the backpropagation through time (BPTT) algorithm to obtain the standard Chinese text recognition model.
Preferably, the error text training sample acquisition module 30 includes a model output value acquisition unit 31, a model recognition result acquisition unit 32, and an error text training sample acquisition unit 33.
The model output value acquisition unit 31 is configured to input the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model and obtain the output values of each text in the samples from the model.
The model recognition result acquisition unit 32 is configured to select the maximum output value among the output values corresponding to each text and obtain the recognition result of each text according to the maximum output value.
The error text training sample acquisition unit 33 is configured to obtain, according to the recognition results, the error texts whose recognition results do not match the true results, and take all error texts as the error text training samples.
Preferably, the handwriting model training apparatus further includes an initialization module 50 configured to initialize the recurrent neural network.
FIG. 7 shows a flowchart of the text recognition method in this embodiment. The text recognition method can be applied to computer devices deployed by institutions such as banks, investment firms, and insurers to recognize handwritten Chinese text for artificial intelligence purposes. As shown in FIG. 7, the text recognition method includes the following steps:
S50: Obtain the Chinese text to be recognized, recognize it using the target Chinese handwritten text recognition model, and obtain the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained using the handwriting model training method described above.
Here, the Chinese text to be recognized refers to the Chinese text on which recognition is to be performed.
In this embodiment, the Chinese text to be recognized is obtained and input into the target Chinese handwritten text recognition model for recognition. For each output of the model, a probability value is obtained expressing how similar the corresponding character is to each character in the Chinese character library. These probability values are the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, and the recognition result of the Chinese text to be recognized can be determined from them.
S60: Select the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.
In this embodiment, the maximum among all output values corresponding to the Chinese text to be recognized is selected, and the corresponding raw output is determined from it, for example "你_们_们_好_". The raw output is then processed further: repeated characters are merged so that only one is kept, and the blanks are removed, which yields the recognition result of the Chinese text to be recognized. Determining the characters of the raw output from the maximum output values and then removing repeats and blanks effectively yields the recognition result of each text and improves recognition accuracy.
In steps S50-S60, the target Chinese handwritten text recognition model is used to recognize the Chinese text to be recognized, and the recognition result is obtained from the maximum output values followed by the removal of repeats and blanks. The target Chinese handwritten text recognition model itself has high recognition accuracy, and combining it with a Chinese semantic lexicon further improves the accuracy of Chinese handwriting recognition.
In the text recognition method provided in the embodiments of this application, the Chinese text to be recognized is input into the target Chinese handwritten text recognition model for recognition, and the recognition result is obtained in combination with a preset Chinese semantic lexicon. When the target Chinese handwritten text recognition model is used to recognize Chinese handwritten text, accurate recognition results can be obtained.
FIG. 8 shows a schematic block diagram of a text recognition apparatus corresponding one-to-one to the text recognition method in the embodiment. As shown in FIG. 8, the text recognition apparatus includes an output value acquisition module 60 and a recognition result acquisition module 70. The functions implemented by these modules correspond one-to-one to the steps of the text recognition method in the embodiment and, to avoid repetition, are not described in detail here.
The output value acquisition module 60 is configured to obtain the Chinese text to be recognized, recognize it using the target Chinese handwritten text recognition model, and obtain the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, where the target Chinese handwritten text recognition model is obtained using the handwriting model training method.
The recognition result acquisition module 70 is configured to select the maximum output value among the output values corresponding to the Chinese text to be recognized and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.
This embodiment provides one or more non-volatile readable storage media storing computer-readable instructions. When executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the handwriting model training method in the embodiment. Alternatively, when executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the functions of the modules/units of the handwriting model training apparatus in the embodiment. Alternatively, when executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the functions of the steps of the text recognition method in the embodiment. Alternatively, when executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the functions of the modules/units of the text recognition apparatus in the embodiment. To avoid repetition, details are not repeated here.
FIG. 9 is a schematic diagram of a computer device provided in an embodiment of this application. As shown in FIG. 9, the computer device 80 of this embodiment includes a processor 81, a memory 82, and computer-readable instructions 83 stored in the memory 82 and executable on the processor 81. When executed by the processor 81, the computer-readable instructions 83 implement the handwriting model training method in the embodiment. Alternatively, when executed by the processor 81, the computer-readable instructions 83 implement the functions of the modules/units of the handwriting model training apparatus in the embodiment. Alternatively, when executed by the processor 81, the computer-readable instructions 83 implement the functions of the steps of the text recognition method in the embodiment. Alternatively, when executed by the processor 81, the computer-readable instructions 83 implement the functions of the modules/units of the text recognition apparatus in the embodiment. To avoid repetition, details are not repeated here.
计算机设备80可以是桌上型计算机、笔记本、掌上电脑及云端服务器等计算设备。计算机设备可包括,但不仅限于,处理器81、存储器82。本领域技术人员可以理解,图9仅仅是计算机设备80的示例,并不构成对计算机设备80的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件,例如计算机设备还可以包括输入输出设备、网络接入设备、总线等。The computer device 80 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server. The computer equipment may include, but is not limited to, a processor 81 and a memory 82. Those skilled in the art can understand that FIG. 9 is only an example of the computer device 80 and does not constitute a limitation on the computer device 80. It may include more or fewer components than shown in the figure, or combine some components or different components. For example, computer equipment may also include input and output equipment, network access equipment, and buses.
The processor 81 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 82 may be an internal storage unit of the computer device 80, such as a hard disk or internal memory of the computer device 80. The memory 82 may also be an external storage device of the computer device 80, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device 80. Further, the memory 82 may include both an internal storage unit and an external storage device of the computer device 80. The memory 82 is used to store the computer-readable instructions 83 and other programs and data required by the computer device. The memory 82 may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the above division into functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units or modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist separately as a physical unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the scope of protection of the present application.
Claims (20)
- A handwriting model training method, comprising:
  obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model;
  obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to the preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, training based on the CTC algorithm, and updating the network parameters of the standard Chinese text recognition model using the BPTT algorithm to obtain an adjusted Chinese handwritten text recognition model;
  obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples; and
  inputting the error text training samples into the adjusted Chinese handwritten text recognition model, training based on the CTC algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model using the BPTT algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
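The four training stages recited in this claim can be sketched as a small driver. This is only an illustrative outline, not the applicant's implementation: the callables `train_on_batches` and `recognize`, the `(features, label)` sample layout, and the reading of "batch gradient descent" as one batch holding every error sample are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[list, str]  # hypothetical layout: (feature sequence, true text)


def split_into_batches(samples: Sequence, batch_size: int) -> List[list]:
    """Divide samples into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [list(samples[i:i + batch_size])
            for i in range(0, len(samples), batch_size)]


def train_in_stages(model,
                    standard: Sequence[Sample],
                    non_standard: Sequence[Sample],
                    to_test: Sequence[Sample],
                    batch_size: int,
                    train_on_batches: Callable,  # hypothetical: (model, batches) -> model
                    recognize: Callable):        # hypothetical: (model, features) -> text
    # Stage 1: train on standard text to get the standard recognition model.
    model = train_on_batches(model, split_into_batches(standard, batch_size))
    # Stage 2: continue training on non-standard (handwritten) text.
    model = train_on_batches(model, split_into_batches(non_standard, batch_size))
    # Stage 3: keep every test sample whose recognition result is wrong.
    errors = [(x, y) for x, y in to_test if recognize(model, x) != y]
    # Stage 4: retrain on all error samples at once (assumed meaning of
    # "batch gradient descent": a single batch containing every error sample).
    if errors:
        model = train_on_batches(model, [errors])
    return model, errors
```

Passing the training and recognition routines in as parameters keeps the sketch independent of any particular neural network library.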
- The handwriting model training method according to claim 1, wherein obtaining the standard Chinese text training samples and dividing the standard Chinese text training samples into batches according to the preset batch comprises:
  obtaining a pixel value feature matrix of each Chinese text in Chinese text training samples to be processed, and normalizing each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, wherein the normalization formula is $y = \frac{x - \mathit{MinValue}}{\mathit{MaxValue} - \mathit{MinValue}}$, MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization; and
  dividing the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes of pixel values, establishing a binarized pixel value feature matrix of each Chinese text based on the two classes of pixel values, combining the Chinese texts corresponding to the binarized pixel value feature matrices into the standard Chinese text training samples, and dividing the standard Chinese text training samples into batches according to the preset batch.
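The preprocessing in this claim can be illustrated with a short sketch. It assumes min-max normalization over the whole pixel matrix and, purely for illustration, a fixed threshold of 0.5 to split the normalized values into two classes; the claim itself does not say how the two classes are chosen, and the handling of a constant image is likewise an assumption.

```python
from typing import List

Matrix = List[List[float]]


def min_max_normalize(pixels: Matrix) -> Matrix:
    """Scale every pixel value into [0, 1] using the matrix minimum and maximum."""
    flat = [v for row in pixels for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumption: a constant image maps to all zeros to avoid dividing by zero.
        return [[0.0 for _ in row] for row in pixels]
    return [[(v - lo) / (hi - lo) for v in row] for row in pixels]


def binarize(normalized: Matrix, threshold: float = 0.5) -> List[List[int]]:
    """Split normalized values into two classes (0 or 1) at an assumed threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in normalized]
```

For example, `binarize(min_max_normalize([[0, 128], [255, 51]]))` separates the two brighter pixels from the two darker ones.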
- The handwriting model training method according to claim 1, wherein inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network using the backpropagation through time (BPTT) algorithm to obtain the standard Chinese text recognition model comprises:
  inputting the batched standard Chinese text training samples into the recurrent neural network and training based on the CTC algorithm to obtain a forward propagation output and a backward propagation output of the batched standard Chinese text training samples in the recurrent neural network, wherein the forward propagation output is expressed as [formula not reproduced in this text], where t denotes the sequence step, u denotes the label value of the output corresponding to t, and a further term of the formula denotes the probability that the output at step t is the label value $l'_u$, and the backward propagation output is expressed as [formula not reproduced in this text], where t denotes the sequence step, u denotes the label value of the output corresponding to t, and a further term of the formula denotes the probability that the output at step t+1 is the label value $l'_u$;
  constructing an error function according to the forward propagation output and the backward propagation output; and
  updating the network parameters of the recurrent neural network using the BPTT algorithm according to the error function to obtain the standard Chinese text recognition model.
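Because the formulas of this claim are not reproduced in the text, the sketch below falls back on the forward recursion of connectionist temporal classification as commonly given in the literature; whether it matches the claimed expressions term by term is an assumption. The blank index `BLANK = 0` and the `probs[t][k]` layout (probability of symbol `k` at step `t`) are also choices made for this example. Only the forward variable is shown; the backward variable follows the mirrored recursion, and an error function is typically the negative logarithm of the returned probability.

```python
from typing import Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_forward_probability(probs: Sequence[Sequence[float]],
                            labels: Sequence[int]) -> float:
    """Return the probability of `labels` given per-step symbol probabilities."""
    # Extended sequence l': a blank before, between, and after the labels.
    ext = [BLANK]
    for k in labels:
        ext += [k, BLANK]
    steps, width = len(probs), len(ext)
    alpha = [[0.0] * width for _ in range(steps)]
    # A valid path starts with the leading blank or the first label.
    alpha[0][0] = probs[0][ext[0]]
    if width > 1:
        alpha[0][1] = probs[0][ext[1]]
    for t in range(1, steps):
        for u in range(width):
            total = alpha[t - 1][u]            # stay on the same position
            if u >= 1:
                total += alpha[t - 1][u - 1]   # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if u >= 2 and ext[u] != BLANK and ext[u] != ext[u - 2]:
                total += alpha[t - 1][u - 2]
            alpha[t][u] = probs[t][ext[u]] * total
    # A valid path ends on the final blank or the final label.
    result = alpha[steps - 1][width - 1]
    if width > 1:
        result += alpha[steps - 1][width - 2]
    return result
```

With two steps, a uniform two-symbol distribution, and the single label `1`, the three valid paths (label-blank, blank-label, label-label) each have probability 0.25, so the function returns 0.75.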
- The handwriting model training method according to claim 1, wherein recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining the error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples comprises:
  inputting the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtaining the output values of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model;
  selecting the maximum output value among the output values corresponding to each text, and obtaining the recognition result of each text according to the maximum output value; and
  obtaining, according to the recognition results, the error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples.
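As a rough illustration of the selection step in this claim, the sketch below takes the candidate with the largest output value as the recognition result and keeps the samples whose result differs from the true label. The data layout (a sample identifier, a dictionary mapping candidate characters to output values, and the true label) is hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Sequence, Tuple

TestItem = Tuple[str, Dict[str, float], str]  # (sample id, output values, true label)


def recognize_by_max_output(outputs: Dict[str, float]) -> str:
    """Return the candidate whose output value is largest."""
    return max(outputs, key=outputs.get)


def collect_error_texts(items: Sequence[TestItem]) -> List[Tuple[str, str]]:
    """Return (sample id, true label) for every misrecognized sample."""
    errors = []
    for sample_id, outputs, truth in items:
        if recognize_by_max_output(outputs) != truth:
            errors.append((sample_id, truth))
    return errors
```

The returned pairs would then serve as the error text training samples for the final training stage.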
- The handwriting model training method according to claim 1, wherein before the step of obtaining the standard Chinese text training samples, the handwriting model training method further comprises:
  initializing the recurrent neural network.
- A text recognition method, comprising:
  obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5; and
  selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
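For a model trained with a CTC-style objective, one common way to turn maximum output values into a text result is best-path decoding: take the most probable symbol at each step, merge consecutive repeats, and drop blanks. The claim only states that the maximum output value is selected, so the decoding rule below, the blank index, and the `alphabet` list are assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_ctc_decode(probs: Sequence[Sequence[float]],
                      alphabet: List[str],
                      blank: int = 0) -> str:
    """Decode by per-step argmax, merging repeats and removing blanks."""
    # Index of the largest output value at every sequence step.
    best = [max(range(len(step)), key=step.__getitem__) for step in probs]
    chars = []
    previous = None
    for k in best:
        # Emit a symbol only when it changes and is not the blank.
        if k != previous and k != blank:
            chars.append(alphabet[k])
        previous = k
    return "".join(chars)
```

Here `alphabet[blank]` is a placeholder entry that is never emitted; a blank between two identical symbols is what allows a doubled character to be decoded.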
- A handwriting model training apparatus, comprising:
  a standard Chinese text recognition model obtaining module, configured to obtain standard Chinese text training samples, divide the standard Chinese text training samples into batches according to a preset batch, input the batched standard Chinese text training samples into a recurrent neural network, train based on a connectionist temporal classification (CTC) algorithm, and update the network parameters of the recurrent neural network using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model;
  an adjusted Chinese handwritten text recognition model obtaining module, configured to obtain non-standard Chinese text training samples, divide the non-standard Chinese text training samples into batches according to the preset batch, input the batched non-standard Chinese text training samples into the standard Chinese text recognition model, train based on the CTC algorithm, and update the network parameters of the standard Chinese text recognition model using the BPTT algorithm to obtain an adjusted Chinese handwritten text recognition model;
  an error text training sample obtaining module, configured to obtain Chinese text samples to be tested, recognize the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtain error texts whose recognition results do not match the true results, and use all the error texts as error text training samples; and
  a target Chinese handwritten text recognition model obtaining module, configured to input the error text training samples into the adjusted Chinese handwritten text recognition model, train based on the CTC algorithm, and update the network parameters of the adjusted Chinese handwritten text recognition model using the BPTT algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
- A text recognition apparatus, comprising:
  an output value obtaining module, configured to obtain Chinese text to be recognized, recognize the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtain the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5; and
  a recognition result obtaining module, configured to select the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtain the recognition result of the Chinese text to be recognized according to the maximum output value.
- A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
  obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model;
  obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to the preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, training based on the CTC algorithm, and updating the network parameters of the standard Chinese text recognition model using the BPTT algorithm to obtain an adjusted Chinese handwritten text recognition model;
  obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples; and
  inputting the error text training samples into the adjusted Chinese handwritten text recognition model, training based on the CTC algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model using the BPTT algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
- The computer device according to claim 9, wherein obtaining the standard Chinese text training samples and dividing the standard Chinese text training samples into batches according to the preset batch comprises:
  obtaining a pixel value feature matrix of each Chinese text in Chinese text training samples to be processed, and normalizing each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, wherein the normalization formula is $y = \frac{x - \mathit{MinValue}}{\mathit{MaxValue} - \mathit{MinValue}}$, MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization; and
  dividing the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes of pixel values, establishing a binarized pixel value feature matrix of each Chinese text based on the two classes of pixel values, combining the Chinese texts corresponding to the binarized pixel value feature matrices into the standard Chinese text training samples, and dividing the standard Chinese text training samples into batches according to the preset batch.
- The computer device according to claim 9, wherein inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network using the backpropagation through time (BPTT) algorithm to obtain the standard Chinese text recognition model comprises:
  inputting the batched standard Chinese text training samples into the recurrent neural network and training based on the CTC algorithm to obtain a forward propagation output and a backward propagation output of the batched standard Chinese text training samples in the recurrent neural network, wherein the forward propagation output is expressed as [formula not reproduced in this text], where t denotes the sequence step, u denotes the label value of the output corresponding to t, and a further term of the formula denotes the probability that the output at step t is the label value $l'_u$, and the backward propagation output is expressed as [formula not reproduced in this text], where t denotes the sequence step, u denotes the label value of the output corresponding to t, and a further term of the formula denotes the probability that the output at step t+1 is the label value $l'_u$;
  constructing an error function according to the forward propagation output and the backward propagation output; and
  updating the network parameters of the recurrent neural network using the BPTT algorithm according to the error function to obtain the standard Chinese text recognition model.
- The computer device according to claim 9, wherein recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining the error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples comprises:
  inputting the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtaining the output values of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model;
  selecting the maximum output value among the output values corresponding to each text, and obtaining the recognition result of each text according to the maximum output value; and
  obtaining, according to the recognition results, the error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples.
- The computer device according to claim 9, wherein before the step of obtaining the standard Chinese text training samples, the processor, when executing the computer-readable instructions, further implements the following step:
  initializing the recurrent neural network.
- A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
  obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5; and
  selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
- One or more non-volatile readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
  obtaining standard Chinese text training samples, dividing the standard Chinese text training samples into batches according to a preset batch, inputting the batched standard Chinese text training samples into a recurrent neural network, training based on a connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network using a backpropagation through time (BPTT) algorithm to obtain a standard Chinese text recognition model;
  obtaining non-standard Chinese text training samples, dividing the non-standard Chinese text training samples into batches according to the preset batch, inputting the batched non-standard Chinese text training samples into the standard Chinese text recognition model, training based on the CTC algorithm, and updating the network parameters of the standard Chinese text recognition model using the BPTT algorithm to obtain an adjusted Chinese handwritten text recognition model;
  obtaining Chinese text samples to be tested, recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining error texts whose recognition results do not match the true results, and using all the error texts as error text training samples; and
  inputting the error text training samples into the adjusted Chinese handwritten text recognition model, training based on the CTC algorithm, and updating the network parameters of the adjusted Chinese handwritten text recognition model using the BPTT algorithm with batch gradient descent to obtain a target Chinese handwritten text recognition model.
- The non-volatile readable storage medium according to claim 15, wherein obtaining the standard Chinese text training samples and dividing the standard Chinese text training samples into batches according to the preset batch comprises:
  obtaining a pixel value feature matrix of each Chinese text in Chinese text training samples to be processed, and normalizing each pixel value in the pixel value feature matrix of each Chinese text to obtain a normalized pixel value feature matrix of each Chinese text, wherein the normalization formula is $y = \frac{x - \mathit{MinValue}}{\mathit{MaxValue} - \mathit{MinValue}}$, MaxValue is the maximum pixel value in the pixel value feature matrix, MinValue is the minimum pixel value in the pixel value feature matrix, x is the pixel value before normalization, and y is the pixel value after normalization; and
  dividing the pixel values in the normalized pixel value feature matrix of each Chinese text into two classes of pixel values, establishing a binarized pixel value feature matrix of each Chinese text based on the two classes of pixel values, combining the Chinese texts corresponding to the binarized pixel value feature matrices into the standard Chinese text training samples, and dividing the standard Chinese text training samples into batches according to the preset batch.
- The non-volatile readable storage medium according to claim 15, wherein inputting the batched standard Chinese text training samples into the recurrent neural network, training based on the connectionist temporal classification (CTC) algorithm, and updating the network parameters of the recurrent neural network using the backpropagation through time (BPTT) algorithm to obtain the standard Chinese text recognition model comprises:
  inputting the batched standard Chinese text training samples into the recurrent neural network and training based on the CTC algorithm to obtain a forward propagation output and a backward propagation output of the batched standard Chinese text training samples in the recurrent neural network, wherein the forward propagation output is expressed as [formula not reproduced in this text], where t denotes the sequence step, u denotes the label value of the output corresponding to t, and a further term of the formula denotes the probability that the output at step t is the label value $l'_u$, and the backward propagation output is expressed as [formula not reproduced in this text], where t denotes the sequence step, u denotes the label value of the output corresponding to t, and a further term of the formula denotes the probability that the output at step t+1 is the label value $l'_u$;
  constructing an error function according to the forward propagation output and the backward propagation output; and
  updating the network parameters of the recurrent neural network using the BPTT algorithm according to the error function to obtain the standard Chinese text recognition model.
- The non-volatile readable storage medium according to claim 15, wherein recognizing the Chinese text samples to be tested using the adjusted Chinese handwritten text recognition model, obtaining the error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples comprises:
  inputting the Chinese text samples to be tested into the adjusted Chinese handwritten text recognition model, and obtaining the output values of each text in the Chinese text samples to be tested in the adjusted Chinese handwritten text recognition model;
  selecting the maximum output value among the output values corresponding to each text, and obtaining the recognition result of each text according to the maximum output value; and
  obtaining, according to the recognition results, the error texts whose recognition results do not match the true results, and using all the error texts as the error text training samples.
- The non-volatile readable storage medium according to claim 15, wherein before the step of obtaining the standard Chinese text training samples, the computer-readable instructions, when executed by one or more processors, further cause the one or more processors to perform the following step:
  initializing the recurrent neural network.
- One or more non-volatile readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
  obtaining Chinese text to be recognized, recognizing the Chinese text to be recognized using a target Chinese handwritten text recognition model, and obtaining the output values of the Chinese text to be recognized in the target Chinese handwritten text recognition model, wherein the target Chinese handwritten text recognition model is obtained using the handwriting model training method according to any one of claims 1-5; and
  selecting the maximum output value among the output values corresponding to the Chinese text to be recognized, and obtaining the recognition result of the Chinese text to be recognized according to the maximum output value.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810564063.8A CN109086654B (en) | 2018-06-04 | 2018-06-04 | Handwriting model training method, text recognition method, device, equipment and medium |
CN201810564063.8 | 2018-06-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019232869A1 (en) | 2019-12-12 |
Family
ID=64839332
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/094344 WO2019232869A1 (en) | 2018-06-04 | 2018-07-03 | Handwriting model training method, text recognition method and apparatus, device, and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109086654B (en) |
WO (1) | WO2019232869A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111461403A (en) * | 2020-03-06 | 2020-07-28 | 上海汽车集团股份有限公司 | Vehicle path planning method and device, computer readable storage medium and terminal |
CN111783427A (en) * | 2020-06-30 | 2020-10-16 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for training model and outputting information |
CN112115268A (en) * | 2020-09-28 | 2020-12-22 | 支付宝(杭州)信息技术有限公司 | Training method and device and classification method and device based on feature encoder |
CN112784845A (en) * | 2021-01-12 | 2021-05-11 | 安徽淘云科技有限公司 | Handwritten character detection method, electronic equipment and storage device |
CN113111234A (en) * | 2020-02-13 | 2021-07-13 | 北京明亿科技有限公司 | Regular expression-based alarm condition category determination method and device |
CN114973267A (en) * | 2022-05-31 | 2022-08-30 | 北京智通东方软件科技有限公司 | Model training method, text recognition method, device, medium and equipment |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111626065A (en) * | 2019-02-26 | 2020-09-04 | 株式会社理光 | Training method and device of neural machine translation model and storage medium |
CN110750640B (en) * | 2019-09-17 | 2022-11-04 | 平安科技(深圳)有限公司 | Text data classification method and device based on neural network model and storage medium |
CN110866499B (en) * | 2019-11-15 | 2022-12-13 | 爱驰汽车有限公司 | Handwritten text recognition method, system, device and medium |
CN113128296B (en) * | 2019-12-31 | 2023-05-09 | 重庆傲雄在线信息技术有限公司 | Electronic handwriting signature fuzzy label recognition system |
CN113408373B (en) * | 2021-06-02 | 2024-06-07 | 中金金融认证中心有限公司 | Handwriting recognition method, handwriting recognition system, client and server |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105184226A (en) * | 2015-08-11 | 2015-12-23 | 北京新晨阳光科技有限公司 | Digital identification method, digital identification device, neural network training method and neural network training device |
CN106650933A (en) * | 2017-01-12 | 2017-05-10 | 西安电子科技大学 | Deep neural network optimizing method based on coevolution and back propagation |
CN107818289A (en) * | 2016-09-13 | 2018-03-20 | 北京搜狗科技发展有限公司 | A kind of prescription recognition methods and device, a kind of device for prescription identification |
CN107909101A (en) * | 2017-11-10 | 2018-04-13 | 清华大学 | Semi-supervised transfer learning character identifying method and system based on convolutional neural networks |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7983478B2 (en) * | 2007-08-10 | 2011-07-19 | Microsoft Corporation | Hidden markov model based handwriting/calligraphy generation |
US20150317336A1 (en) * | 2014-04-30 | 2015-11-05 | Hewlett-Packard Development Company, L.P. | Data reconstruction |
CN107316054A (en) * | 2017-05-26 | 2017-11-03 | 昆山遥矽微电子科技有限公司 | Non-standard character recognition methods based on convolutional neural networks and SVMs |
2018
- 2018-06-04: CN application CN201810564063.8A, publication CN109086654B (en), status: Active
- 2018-07-03: WO application PCT/CN2018/094344, publication WO2019232869A1 (en), status: Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105184226A (en) * | 2015-08-11 | 2015-12-23 | 北京新晨阳光科技有限公司 | Digital identification method, digital identification device, neural network training method and neural network training device |
CN107818289A (en) * | 2016-09-13 | 2018-03-20 | 北京搜狗科技发展有限公司 | A kind of prescription recognition methods and device, a kind of device for prescription identification |
CN106650933A (en) * | 2017-01-12 | 2017-05-10 | 西安电子科技大学 | Deep neural network optimizing method based on coevolution and back propagation |
CN107909101A (en) * | 2017-11-10 | 2018-04-13 | 清华大学 | Semi-supervised transfer learning character identifying method and system based on convolutional neural networks |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113111234A (en) * | 2020-02-13 | 2021-07-13 | 北京明亿科技有限公司 | Regular expression-based alarm condition category determination method and device |
CN111461403A (en) * | 2020-03-06 | 2020-07-28 | 上海汽车集团股份有限公司 | Vehicle path planning method and device, computer readable storage medium and terminal |
CN111461403B (en) * | 2020-03-06 | 2023-09-29 | 上海汽车集团股份有限公司 | Vehicle path planning method and device, computer readable storage medium and terminal |
CN111783427A (en) * | 2020-06-30 | 2020-10-16 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for training model and outputting information |
CN111783427B (en) * | 2020-06-30 | 2024-04-02 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for training model and outputting information |
CN112115268A (en) * | 2020-09-28 | 2020-12-22 | 支付宝(杭州)信息技术有限公司 | Training method and device and classification method and device based on feature encoder |
CN112115268B (en) * | 2020-09-28 | 2024-04-09 | 支付宝(杭州)信息技术有限公司 | Training method and device based on feature encoder, and classifying method and device |
CN112784845A (en) * | 2021-01-12 | 2021-05-11 | 安徽淘云科技有限公司 | Handwritten character detection method, electronic equipment and storage device |
CN114973267A (en) * | 2022-05-31 | 2022-08-30 | 北京智通东方软件科技有限公司 | Model training method, text recognition method, device, medium and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109086654A (en) | 2018-12-25 |
CN109086654B (en) | 2023-04-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019232869A1 (en) | Handwriting model training method, text recognition method and apparatus, device, and medium | |
CN108764195B (en) | Handwriting model training method, handwritten character recognition method, device, equipment and medium | |
WO2019232861A1 (en) | Handwriting model training method and apparatus, text recognition method and apparatus, and device and medium | |
US12079696B2 (en) | Machine learning model training method and device, and expression image classification method and device | |
WO2019232854A1 (en) | Handwritten model training method and apparatus, handwritten character recognition method and apparatus, and device and medium | |
WO2019232853A1 (en) | Chinese model training method, chinese image recognition method, device, apparatus and medium | |
WO2018108129A1 (en) | Method and apparatus for use in identifying object type, and electronic device | |
WO2019232843A1 (en) | Handwritten model training method and apparatus, handwritten image recognition method and apparatus, and device and medium | |
WO2019232872A1 (en) | Handwritten character model training method, chinese character recognition method, apparatus, device, and medium | |
WO2019232855A1 (en) | Handwriting model training method, handwritten character recognition method and device, apparatus, and medium | |
CN109359214A (en) | Video presentation generation method, storage medium and terminal device neural network based | |
WO2019232857A1 (en) | Handwritten character model training method, handwritten character recognition method, apparatus, device, and medium | |
WO2019232849A1 (en) | Chinese character model training method, handwritten character recognition method, apparatuses, device and medium | |
WO2019232850A1 (en) | Method and apparatus for recognizing handwritten chinese character image, computer device, and storage medium | |
CN111783767B (en) | Character recognition method, character recognition device, electronic equipment and storage medium | |
CN112862093A (en) | Graph neural network training method and device | |
CN111260032A (en) | Neural network training method, image processing method and device | |
WO2021227333A1 (en) | Face key point detection method and apparatus, and electronic device | |
CN106530341A (en) | Point registration algorithm capable of keeping local topology invariance | |
WO2023134402A1 (en) | Calligraphy character recognition method based on siamese convolutional neural network | |
CN113987236B (en) | Unsupervised training method and unsupervised training device for visual retrieval model based on graph convolution network | |
CN110414541A (en) | The method, equipment and computer readable storage medium of object for identification | |
WO2019232859A1 (en) | Handwriting model training method and apparatus, handwritten character recognition method and apparatus, device, and medium | |
WO2019232844A1 (en) | Handwriting model training method and apparatus, handwritten character recognition method and apparatus, and device and medium | |
WO2022062403A1 (en) | Expression recognition model training method and apparatus, terminal device and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 18921835; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
32PN | EP: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 11/03/2021) |
122 | EP: PCT application non-entry in European phase | Ref document number: 18921835; Country of ref document: EP; Kind code of ref document: A1 |